University of the Cumberlands Grover M. Hermann Library

Quantitative Research: Data & Statistics

Statistical Databases

Note: You can search for statistics and data using the special search options in databases such as APA PsycInfo and MEDLINE.

Government and Health Datasets

Secondary Data (Open Access Data/Open Born Datasets)

What is secondary data?

Secondary data is information not originally collected by the current researcher.

Notes about Open Access Data Sets/Open Born Datasets

Open data can be freely accessed, used, edited, and shared by anyone for any purpose.

Statista is an example of an open-born data collection.

Other examples include: AWS Open Data Registry, Data.gov, European Data Portal, Figshare, GitHub, Google Dataset Search, Kaggle, Microsoft Azure Open Datasets, OpenDatasoft, Open Dataset, Quandl, Stanford Large Network Dataset Collection, UCI Machine Learning Repository, United Nations Development Programme (UNDP), World Bank Open Data, Zenodo

As the researcher, you must evaluate the quality of the data for your study. As part of the IRB approval process, you must determine whether the data you want to use was collected ethically and can be used ethically. Consider the list of factors below. Consult your dissertation chair, your methodologist, or the IRB office with any questions.

What factors should I consider when evaluating open-access data?

  • Audience: Is the audience specialized or general? Is it appropriate for your study?
  • Bias: Because third parties often curate open-born datasets, their representativeness may be limited.
  • Credentials: Attempt to ascertain the credentials of the authors/source of the information.
  • Confidentiality: Confirm that information gathered from or about research participants in the course of a study is kept private and is revealed to third parties only with the explicit consent of the individuals from whom it was obtained.
  • Data Collection: How was the data collected?
  • Data Interoperability: What is the type and quality of the data? Is it aggregated or disaggregated?
  • Timeliness: When was the data published?
  • Transparency: Determine whether all relevant information, such as methodology and financial interests, is readily disclosed and communicated.
  • Verification: Can the numbers be verified? Do other sources provide similar numbers?
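
The Verification factor above can be partly checked in code. The following is a minimal Python sketch, not a prescribed method: it compares figures from an open dataset against a second source and flags values that differ by more than a chosen relative tolerance. The function name `flag_discrepancies`, the 5% tolerance, and all figures are hypothetical and exist only for illustration.

```python
def flag_discrepancies(primary, reference, tolerance=0.05):
    """Return labels whose values differ between two sources by more than
    `tolerance` (relative difference), plus labels missing from the reference."""
    flagged = {}
    for label, value in primary.items():
        if label not in reference:
            flagged[label] = "not in reference source"
            continue
        ref_value = reference[label]
        if ref_value == 0:
            # Avoid division by zero: equal zeros match, anything else differs.
            rel_diff = 0.0 if value == 0 else float("inf")
        else:
            rel_diff = abs(value - ref_value) / abs(ref_value)
        if rel_diff > tolerance:
            flagged[label] = f"differs by {rel_diff:.1%}"
    return flagged


# Hypothetical figures, invented for illustration only.
open_dataset = {"region_a": 1000, "region_b": 2500, "region_c": 400}
second_source = {"region_a": 1020, "region_b": 3000}

result = flag_discrepancies(open_dataset, second_source)
print(result)
```

A flagged value is not necessarily wrong. It is a prompt to check how each source defined and collected the figure, and the appropriate tolerance depends on your study.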

Common Challenges with Open Access/Open Born Datasets

  1. Appropriateness (Appropriate use of data)
  2. Bias and representativeness
  3. Confidentiality of data (data privacy and security)
  4. Data interoperability
  5. Data ownership and attribution
  6. Lack of transparency
  7. Quality control (you must ensure the data meets ethical standards)

Ultimately, as the researcher, you must be able to discuss how the dataset was created and determine if the data was curated ethically. In addition, you must identify limitations, biases, privacy concerns, and questions/assumptions of the data collectors.

You are encouraged to reach out to the dataset creator for information.

Data Sets