
Data Services

A Guide to Library Data Services

Why Share Your Data?

Data sharing has multiple benefits, including increasing the visibility of your research, facilitating new discoveries by other researchers, and meeting funding requirements.

Increase your visibility by including your data in open access repositories. Raise your prominence by contributing to the field and keeping your research relevant.

Facilitate new discoveries by allowing other researchers to use your data sets. Data sets can be re-analyzed to answer new research questions, which prevents duplication of effort.

Meet funding requirements, as many federal grants require data sharing around the time of publication. Individual journals and publishers are also requiring or encouraging data sharing in their editorial policies.

Make public assets available to the public by publishing data papers and research articles with your conclusions, and by submitting datasets to open repositories.

Other Incentives:

  • Data repositories create DOIs allowing for citations of the data
  • Data journals allow for data publications and full text articles
  • NIH Biosketch allows for the citing of non-traditional research outputs such as datasets
  • Data papers allow researchers to receive citations for a dataset like a publication
  • Sharing data can result in increased citations for published studies
  • Maximize transparency, accountability, and scrutiny of research findings

Not all data can be shared freely.

Restrictions may apply to:

  • Threatened or endangered species
  • National security and classified research
  • Export controls (technologies)
  • Personal Health Information
  • Data involving children or prisoners

Best Practices

Ensure you can share your data easily:

  • Use open, non-proprietary file formats (e.g., plain ASCII text)
  • Provide software information in metadata (software name, version, operating system)
  • List the number of files in the file structure in the metadata
  • Select a consistent file format
  • Select files that can be edited regardless of application
  • Files should follow a documented standard
  • Files should stay unencrypted and uncompressed
  • Upload datasets to open repositories for easy access
  • Keep the data clean by avoiding missing values and unclear column headers
  • Understand funder/institution policies on data sharing
  • Remove personal data
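Several of these practices can be checked or automated with a short script. The following is a minimal Python sketch, not a prescribed workflow: it flags blank column headers and missing values in a CSV file and writes a metadata file recording the software name, version, operating system, and number of files. The function names, the metadata field names, and the metadata.json file name are illustrative assumptions; a repository may require a different metadata schema.

```python
import csv
import json
import platform
from pathlib import Path


def find_data_problems(csv_path):
    """Return a list of readable problems (blank headers, missing values)."""
    problems = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # first row holds the column headers
        for position, name in enumerate(header, start=1):
            if not name.strip():
                problems.append(f"column {position} has no header")
        # Data rows start on line 2 of the file.
        for row_number, row in enumerate(reader, start=2):
            for name, value in zip(header, row):
                if not value.strip():
                    problems.append(f"row {row_number}: missing value in '{name}'")
    return problems


def build_metadata(data_dir, software_name, software_version):
    """Describe the CSV files in data_dir and the software that produced them."""
    files = sorted(p.name for p in Path(data_dir).iterdir() if p.suffix == ".csv")
    return {
        # Field names below are assumptions, not a repository standard.
        "software_name": software_name,
        "software_version": software_version,
        "operating_system": platform.system(),
        "file_format": "CSV (plain text, UTF-8)",
        "number_of_files": len(files),
        "files": files,
    }


def write_metadata(data_dir, metadata):
    """Save the metadata next to the data as a plain-text JSON file."""
    path = Path(data_dir) / "metadata.json"
    path.write_text(json.dumps(metadata, indent=2), encoding="utf-8")
    return path
```

Running find_data_problems on each file before upload gives a quick list of cells to fill in or document, and write_metadata keeps the software and file-count details alongside the data in an open text format.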

Funder Requirements

Funder | Policy Effective | Publication Requirements | DMP Required? | Data Requirements
AHRQ | October 1, 2015 | Provide full public access to publications no later than 12 months after the official date of publication. AHRQ will archive its publications in PMC. | Yes | Small data sets are to be released quickly; larger data sets are to be released in waves as they become available or as main findings are published.
CDC | January 2015 | Data available coincident with publication of paper | Yes | Data sets released within 30 months after the end of data collection or generation, except surveillance data, which should be made available within a year
FDA | December 29, 2015 | Submit final peer-reviewed manuscript to NIHMS upon publication; made available in PubMed Central within 12 months | Yes | Share data underlying research papers when the paper is published
IES | 2012 | Publications submitted to ERIC within 12 months of publisher's official date of final publication | Yes | Final research data, free of personal identifiers; data available no later than time of publication
NIAID | July 2017 | Submit final peer-reviewed manuscript to NIHMS upon publication; made available in PubMed Central within 12 months | Yes | A data sharing plan is required only for applications requesting $500,000 or more.
NIH | April 7, 2008 | Submit final peer-reviewed manuscript to NIHMS upon publication; made available in PubMed Central within 12 months | Yes | For grants over $500,000, a data sharing plan must be included in the application.
NSF | January 18, 2011 | Submit final peer-reviewed manuscript to NSF Public Access Repository upon publication; made available within 12 months in repository and/or on participating publishers’ sites | Yes | Investigators are expected to share with other researchers the primary data, samples, physical collections and other supporting materials created or gathered in work under NSF grants

Find Additional Federal Grants

Journal and Publisher Policies

Journal or Publisher Name | Policy Summary
Annals of Internal Medicine | Manuscripts submitted to Annals that report the results of clinical trials must contain a data sharing statement that meets the ICMJE recommendations.
BMJ | BMJ requires a data sharing statement for all research papers. For papers that do not report a trial, we do not require that the authors agree to share the data, just that they say whether they will.
ICMJE | As of 1 July 2018 manuscripts submitted to ICMJE journals that report the results of clinical trials must contain a data sharing statement.
JAMA | For reports of randomized clinical trials, authors are required to provide a Data Sharing Statement to indicate if data will be shared or not.
The Lancet | From July 1, 2018, all submitted reports of clinical trials must contain a data sharing statement, to be included at the end of the manuscript.
New England Journal of Medicine | The ICMJE and, therefore, NEJM require investigators to submit a data-sharing statement (2018) and register a data-sharing plan when registering a trial (2019).
PLOS | The data underlying the findings of research published in PLOS journals must be made publicly available.
PNAS | To allow others to replicate and build on work published in PNAS, authors must make materials, data, and associated protocols, including code and scripts, available to readers.
Science | The Science Journals support the efforts of databases that aggregate published data for the use of the scientific community. Therefore, before publication, large data sets must be deposited in an approved database and an accession number or a specific access address must be included in the published paper.
Springer Nature | At Springer Nature we want to enable all of our authors and journals to publish the best research, which includes achieving community best practices in the sharing and archiving of research data.
Sherpa Juliet | Sherpa Juliet enables researchers and librarians to see funders' conditions for open access publication.

FAIR

Findable - Data and metadata are easy to find by both humans and computers.
Accessible - Data use is open to the greatest extent possible, allowing others to query or copy data for their own use.
Interoperable - Data can be interpreted easily by a computer.
Reusable - Data and metadata are described well enough for both humans and computers to replicate the data or combine it with other datasets.
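One way to make these principles concrete is to keep a machine-readable metadata record with each dataset and check it for gaps. The Python sketch below is only an illustration: the grouping of fields under each principle and the field names themselves are assumptions made for this example, not an official FAIR checklist.

```python
# Hypothetical mapping from each FAIR principle to metadata fields that
# support it. The field names are illustrative assumptions.
FAIR_FIELDS = {
    "Findable": ["identifier", "title", "keywords"],
    "Accessible": ["access_url"],
    "Interoperable": ["file_format", "variable_definitions"],
    "Reusable": ["description", "methods", "license"],
}


def missing_fair_fields(record):
    """Return {principle: [fields]} for fields that are absent or empty."""
    gaps = {}
    for principle, fields in FAIR_FIELDS.items():
        absent = [field for field in fields if not record.get(field)]
        if absent:
            gaps[principle] = absent
    return gaps


# Example record with placeholder values (not a real dataset or DOI).
example_record = {
    "identifier": "doi:10.0000/placeholder",
    "title": "Example survey responses",
    "keywords": ["survey", "example"],
    "access_url": "https://repository.example.org/dataset/1",
    "file_format": "CSV",
    "variable_definitions": {"age_years": "Age of participant in years"},
    "description": "Illustrative record for a data sharing guide.",
    "license": "CC BY 4.0",
}

print(missing_fair_fields(example_record))
```

In this example the record has no "methods" entry, so the check reports a gap under Reusable.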

More Info

FAIR Data Principles explained

Common Data Elements

Common Data Element (CDE) - A data element that is common to multiple data sets across different studies.

Certain types of CDEs are sometimes described:

  • Universal - CDEs that may be used in studies, regardless of the specific disease or condition of interest, e.g., demographic information of study subjects or medical history
  • Domain-specific - CDEs that are designed and intended for use in studies of a particular topic, disease or condition, body system, or other classification, e.g., Parkinson's disease, Alzheimer's disease, diabetes, ophthalmology. Some domains are broadly applicable to a wide range of studies, while others are more useful in specific fields of clinical research.
  • Required - CDEs that are required or expected, as a matter of institutional policy (e.g., research funder or performer), to be collected for all subjects in studies of a particular type, e.g., NIH-funded studies of neurological disease
  • Core - CDEs that are required or expected to be collected in particular classes of studies, e.g., any study of neurological disease or cancer, any genome-wide association study. Other, domain-specific common data elements may be suggested, expected, or required for collection, depending on the more specific focus of the study (e.g., Alzheimer's disease or ovarian cancer)
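The practical value of CDEs is that data sets from different studies can be combined once their variables are mapped to the same elements. The Python sketch below assumes two hypothetical studies that record the same information under different local column names; the CDE names and mappings are invented for illustration and are not taken from the NIH CDE Repository.

```python
# Hypothetical mappings from each study's local variable names to shared
# CDE names. All names here are illustrative assumptions.
STUDY_A_TO_CDE = {"pt_age": "age_years", "gender": "sex", "dx": "diagnosis"}
STUDY_B_TO_CDE = {"AgeAtEnrollment": "age_years", "Sex": "sex", "Diagnosis": "diagnosis"}


def to_common_elements(rows, mapping):
    """Rename local variables to CDE names; unmapped variables are dropped."""
    return [
        {mapping[name]: value for name, value in row.items() if name in mapping}
        for row in rows
    ]


def combine_studies(*harmonized_studies):
    """Concatenate rows from studies that already share CDE names."""
    combined = []
    for rows in harmonized_studies:
        combined.extend(rows)
    return combined


study_a = [{"pt_age": 61, "gender": "F", "dx": "diabetes", "site_code": "A1"}]
study_b = [{"AgeAtEnrollment": 58, "Sex": "M", "Diagnosis": "diabetes"}]

combined = combine_studies(
    to_common_elements(study_a, STUDY_A_TO_CDE),
    to_common_elements(study_b, STUDY_B_TO_CDE),
)
print(combined)
```

After harmonization every row uses the same three variable names, so the combined list can be analyzed as a single data set. Study-specific variables such as site_code are dropped in this sketch; a real project might keep them in a separate table.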

NIH Common Data Element Portal

NIH CDE Repository

Draft of U.S. Core Data for Interoperability of Clinical Research

Data Journals

Data papers are an easy way to share data.  They are not traditional research articles, as their purpose is to simply describe your dataset.  

There are over 200 data journals, with 120 of those in biological, medical or health care fields.  They are mostly open access, indexed in widely used databases, peer reviewed, and some have impact factors.

One benefit of submitting a data paper is that it can double the publication output from a single research project. Publishers consider data papers complementary to research articles, not a prior publication.

The paper is easy to write as it is usually brief and you are provided an article template to follow.

Since data papers can be cited, it is easy to track reuse of your data, and those citations can contribute to your h-index.