Resources for Finding Datasets for Education
While it is important to teach statistics and data science with real data, finding real data that demonstrates a specific concept is not easy. Below is a list of resources that I have used, or that have been recommended to me, for finding real data; hopefully someone else will find it useful too!
If you have any feedback to help improve this list, or resources to add, please let me know by filling out this brief Google Form.
Repositories From Professional Organizations for Teaching
- JEDI: JEDI-informed teaching of statistics
- SSDSE Dataset Repository
- CAUSE Resource Library
- Teaching of Statistics in the Health Sciences (TSHS) Resources Portal, a source for well-documented health-related datasets and teaching materials.
- WISE (Web Interface for Statistics Education) has a collection of demonstrations and tutorials.
Published Research
- Study Finds: more accessible topics from a wide range of fields
- Google News Health and Sciences Section
- ICPSR: international social research
- OSF open data repository: a general platform for sharing research, data, and materials
- Journals such as PLOS ONE
Other Repositories or Data Portals
- UCI Machine Learning Repository: machine learning datasets
- Nationmaster: a portal to international economic, demographic, and social data
- Awesome Public Data Sets by Xiaming Chen and other contributors
- Data and Story Library (DASL)
- Awesome public datasets
- General Social Survey: American demographics, behaviors, and opinions (good for two categorical variables)
- Bikeshare data portal
- Data.gov
- Data is Plural
- Edinburgh Open Data
- CORGIS: The Collection of Really Great, Interesting, Situated Datasets
- Google Dataset Search
- Harvard Dataverse
- NHS Scotland Open Data
- IPUMS: survey data from around the world
- Los Angeles Open Data
- NYC OpenData
- Open access to Scotland’s official statistics
- PRISM Data Archive Project
- UCI Machine Learning Repository
- UN data
- UK Government Data
- US Government Data
- Youth Risk Behavior Surveillance System (YRBSS)
- Rebecca Barter’s List of Public Databases
Open Access Textbooks
- Peter K. Dunn (2024). Scientific Research and Methodology: An introduction to quantitative research in science and health. https://bookdown.org/pkaldunn/SRM-Textbook
- Bayes Rules! An Introduction to Applied Bayesian Modeling
- OpenIntro Statistics
Other Ideas of Places to Look
- Singer 1990 suggests looking at published papers from local dissertations