Frequently Asked Questions
No, the DMS plan replaces the GDS plan in terms of NIH requirements for proposals submitted on or after January 25, 2023.
See https://sharing.nih.gov/genomic-data-sharing-policy/developing-genomic-data-sharing-plans for more information.
For prior GDS policy information, see https://sharing.nih.gov/genomic-data-sharing-policy/developing-genomic-data-sharing-plans#before.
IU REDCap can be part of your DMS plan for collecting, storing, and managing data. Storage for conducting research differs in many ways from storage that enables data sharing. You will need to select a repository which meets the NIH desirable characteristics for data repositories and which makes the data more Findable, Accessible, Interoperable, and Reusable (FAIR). Repository appropriateness is determined by how the features of your data match up with available repositories. Librarians and other research support professionals can help you identify, evaluate, and select the best available repository(ies).
For support in evaluating or choosing data repositories, contact your local data librarian:
- IU School of Medicine: Levi Dolan, ldolan@iu.edu
- IU Indianapolis: Heather Coates, hcoates@iu.edu
- IU Bloomington Wells Library: Ethan Fridmanski, ejfridma@iu.edu
See https://sharing.nih.gov/data-management-and-sharing-policy/sharing-scientific-data/selecting-a-data-repository for more information.
Yes. There are repositories which accept large data sets. However, large data are expensive to store and manage, so you will probably need to discuss the specific needs for storing and describing your data with the repository personnel. Some repositories such as the Inter-university Consortium for Political and Social Research (ICPSR) provide detailed quotes that you can use to develop your DMS budget. Plan ahead so you have time to engage with potential repository(ies).
The NNLM Data Repository Finder may help you narrow the options to those most relevant to your data: https://www.nnlm.gov/finder.
For support in evaluating or choosing data repositories, contact your local data librarian:
- IU School of Medicine: Levi Dolan, ldolan@iu.edu
- IU Indianapolis: Heather Coates, hcoates@iu.edu
- IU Bloomington Wells Library: Ethan Fridmanski, ejfridma@iu.edu
See https://sharing.nih.gov/data-management-and-sharing-policy/planning-and-budgeting-for-data-management-and-sharing/budgeting-for-data-management-sharing#after for more information.
Supplementary data can be a good option in the short term, but relying on journals to provide access to your data is not a viable long-term option. Several of the desirable characteristics of data repositories enable long-term access. Publishers do not typically guarantee access for any specific period of time and frequently discard those files without providing advance notice. The links to supplemental materials often do not work after a few years. If a specialized repository is not available, a good option is to deposit supplemental data in a general repository such as DataWorks or Zenodo.
See https://sharing.nih.gov/data-management-and-sharing-policy/sharing-scientific-data/repositories-for-sharing-scientific-data for more information.
The DMS plan is a baseline requirement for all NIH institutes, centers, and offices (ICOs) for grant proposals that will generate scientific data. Some ICOs may ask you to provide additional details or to use specific repositories, data standards, etc. Certain types of studies, such as clinical trials, may have more detailed expectations. Check the Funding Opportunity Announcement (FOA) for additional guidance.
See https://sharing.nih.gov/other-sharing-policies/nih-institute-and-center-data-sharing-policies for more information.
The NIH DMS policy expects scientific data sharing to be maximized. Data sharing encompasses a wide range of options, including the scope of data to be shared and the access options for individuals interested in reusing it. Data sharing does not necessarily mean that all data must be shared openly. There are several strategies for sharing protected data in ways which do not violate relevant laws and regulations. In some cases, a data set can be de-identified and shared openly. Please note that labeling a data set as de-identified may be based on different thresholds depending on the regulatory context (e.g., HIPAA as compared to the Common Rule). A data set containing identifiable or other protected elements can be shared via an appropriate controlled access repository. In some cases, creating a data set of aggregated data, or of identifiable elements that have been statistically masked or disturbed to prevent re-identification, may be feasible. When protected data elements are included in a data set, some sort of Data Use Agreement, Data Sharing Agreement, Deposit Agreement, or other Research Agreement is necessary to meet our obligations to protect the data.
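As a rough illustration of the aggregation strategy mentioned above, the following Python sketch counts records per group and withholds any count below a minimum cell size. The function name, the field name, and the threshold of 5 are assumptions made for this example only; appropriate thresholds and masking methods depend on the regulatory context and should be decided with statistical or data protection experts.

```python
from collections import Counter

def aggregate_with_suppression(records, key, min_cell_size=5):
    """Count records per group and suppress small cells.

    min_cell_size=5 is an illustrative assumption, not a regulatory
    threshold. Suppressed cells are returned as None.
    """
    counts = Counter(record[key] for record in records)
    return {
        group: (n if n >= min_cell_size else None)
        for group, n in counts.items()
    }

if __name__ == "__main__":
    # Hypothetical participant-level records with one grouping field.
    records = [{"age_group": "18-29"}] * 6 + [{"age_group": "80+"}] * 2
    print(aggregate_with_suppression(records, "age_group"))
```

This sketch does not handle complementary suppression (hiding additional cells so that a suppressed value cannot be recovered from totals), which real disclosure review often requires.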
No single research support group can answer all your questions. It truly takes a village of support to navigate the administrative, regulatory, financial, and data management and sharing requirements associated with federal funding.
For questions about Data Use Agreements, Data Sharing Agreements, Repository Deposit Agreements, and other Research Agreements, contact Research Contracting at oraresco@iu.edu.
For support in evaluating or choosing data repositories, contact your local data librarian:
- IU School of Medicine: Levi Dolan, ldolan@iu.edu
- IU Indianapolis: Heather Coates, hcoates@iu.edu
- IU Bloomington Wells Library: Ethan Fridmanski, ejfridma@iu.edu
See https://sharing.nih.gov/data-management-and-sharing-policy/sharing-scientific-data/selecting-a-data-repository#additional-considerations-for-human-data for more information.
Principal Investigators are best positioned to fully understand the context of a research project, so a collaborative approach with statistical or data protection experts is the recommended strategy for minimizing disclosure risk and satisfying relevant requirements for stating that a data set is de-identified. Currently, there are several services that can support generation of a de-identified data set at various phases of the research process. There may be costs associated with use of these services.
- IU School of Medicine Biostatistics & Health Data Science, https://medicine.iu.edu/biostatistics/services
- School of Public Health-Bloomington Biostatistics Consulting Center, https://biostats.indiana.edu/index.html
- Indiana Statistical Consulting Center, https://iscc.indiana.edu/
- Social Science Research Commons, https://ssrc.indiana.edu/services/consulting.html
*External Resources which may be of support:
- The National Institute of Standards & Technology has assembled a list of tools that can be used for de-identifying data. While these tools support users in removing identifiable information, the user is responsible for making decisions related to disclosure risk. See https://www.nist.gov/itl/applied-cybersecurity/privacy-engineering/collaboration-space/focus-areas/de-id/tools
- The Qualitative Data Sharing Toolkit was developed by the Institute for Informatics, Bioethics Research Center at Washington University in St. Louis, and ICPSR with funding from the National Human Genome Research Institute (NHGRI). It is available at https://qdstoolkit.org/
- Some repositories, such as the Inter-university Consortium for Political and Social Research (ICPSR), have curators with expertise in estimating disclosure risk and de-identification. This level of curation is typically associated with a cost, though some programs or funders may directly fund curation.
*Inclusion on this list does not constitute endorsement by Indiana University or this Working Group.
It is allowable to budget for fees from IU services as long as they are from a University Controller Office (UCO) approved recharge center or Revenue Producing Activity. If not, then only direct charges related to the data management can be charged to the grant.
For questions related to how this impacts the proposal or proposal budget, contact iuprop@iu.edu.
See https://sharing.nih.gov/data-management-and-sharing-policy/planning-and-budgeting-for-data-management-and-sharing/budgeting-for-data-management-sharing#after for more information.
Yes, it may be appropriate to deposit different subsets or data sets resulting from a project to different repositories. Most research projects generate several different types of data, each of which may have distinct characteristics that inform how and where the data are shared. Since many repositories ask for non-exclusive permission to distribute the data deposited, this does not limit you from sharing the data in another repository. However, depositing the same version of a data set in multiple repositories can cause confusion about which version to cite and may pose additional challenges for potential users. When specialist data repositories exist, it is recommended practice to deposit the data which are best supported (e.g., genomic sequences, protein structures, neuroimaging, etc.) by those repositories. When a specialist data repository does not exist, you have more flexibility in choosing how to bundle or package data sets for deposit.
The NNLM Data Repository Finder may help you narrow the options to those most relevant to your data.
See https://sharing.nih.gov/data-management-and-sharing-policy/sharing-scientific-data/selecting-a-data-repository for more information.
Open (access) data or sharing data openly usually refers to data that is accessible to anyone with an internet connection. Even though a data set might be open for access, there may be restrictions on what you can do with it. For example, a data set with a CC-BY license requires a user to give attribution when they use the data. There are many types of licenses and agreements which specify the rights of the creator and the user.
Unless a data set is fully de-identified per relevant study regulations (direct identifiers removed), sharing data about human participants beyond the project team generally requires that you obtain consent for such sharing. In some cases, the inclusion of indirect identifiers or a pre-existing data agreement may require controlled access sharing and/or an agreement.
At IU, only data that is classified as public (see https://datamanagement.iu.edu/data-classifications/index.html) can be shared openly. Data sets containing any protected data elements cannot be shared openly.
There are some data elements that are considered inherently identifiable. The presence of any of these data elements in a data set shared openly may pose a risk to patients or human participants.
- List of 18 identifiers regulated under HIPAA, https://kb.iu.edu/d/bdtx
- Student PII, https://datamanagement.iu.edu/tools/dsh.html#Student%20PII%20-%20Personally%20Identifiable%20Information
- Finger or voice-print, audio recordings, and other biometric identifiers
- Photographic image(s), not limited to image(s) of the face, that could uniquely identify the individual research participant
- Any other characteristic, or combination of data elements, that could uniquely identify the individual research participant
De-identifying a data set is a process that is often unique to the research project. If you have questions about whether a data set can be shared openly, contact the relevant data steward(s).
See https://sharing.nih.gov/data-management-and-sharing-policy/protecting-participant-privacy-when-sharing-scientific-data/principles-and-best-practices-for-protecting-participant-privacy for more information.
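As a first-pass aid before consulting a data steward, a short script can flag column names that suggest inherently identifiable elements like those listed above. The Python sketch below is a minimal illustration: the pattern list and function name are hypothetical, the list is deliberately incomplete, and it is not the HIPAA identifier list or an IU standard. It inspects column names only, not the values, so a clean result does not mean a data set is de-identified.

```python
import re

# Hypothetical, incomplete patterns for column names that often hold
# direct identifiers. Illustration only; not the HIPAA list.
IDENTIFIER_PATTERNS = [
    r"name", r"address", r"zip", r"phone", r"email", r"ssn",
    r"birth|dob", r"mrn|record", r"photo|image", r"voice|audio",
]

def flag_identifier_columns(columns):
    """Return the column names that match any identifier pattern."""
    compiled = [re.compile(p, re.IGNORECASE) for p in IDENTIFIER_PATTERNS]
    return [c for c in columns if any(rx.search(c) for rx in compiled)]

if __name__ == "__main__":
    # Hypothetical header row from a study data file.
    header = ["participant_id", "Email", "date_of_birth",
              "visit_number", "systolic_bp"]
    print(flag_identifier_columns(header))
```

Flagged columns are candidates for review, not automatic removals; combinations of unflagged columns can still uniquely identify a participant.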
Unlike open (access) data, data shared in controlled ways requires users to request access and often involves a review process. Controlled data sharing describes a broad range of practices that includes everything but open data sharing. In most cases, data sharing should be conducted under agreements to protect both parties. Some data agreements may be stand-alone, while others may be addressed in funding or other agreements. While the data may be publicly listed in a catalog or index, users are required to request and obtain approval before access to the data is given. For example, the Data Access Committee (DAC) reviews requests to reuse data in dbGaP.
Sharing data about human participants requires that you obtain consent for such sharing.
Under HIPAA, the standard term for sharing of a limited data set is Data Use Agreement, whereas in other situations it may be called a Data Sharing Agreement or a Data Transfer Agreement.
See https://sharing.nih.gov/data-management-and-sharing-policy/sharing-scientific-data/data-sharing-approaches#after and https://sharing.nih.gov/data-management-and-sharing-policy/sharing-scientific-data/selecting-a-data-repository for more information.
At IU, the PI is responsible for monitoring and ensuring compliance with data management and sharing practices as set forth in the DMS Plan.
Please see https://research.iu.edu/policies/nih-dms-plan-guidance.html
- SecureMyResearch, smr@iu.edu
- Ruth Lilly Medical Library, medlref@iu.edu
- IU Indianapolis University Library: Heather Coates, hcoates@iu.edu
- IUB Wells Library: Ethan Fridmanski, ejfridma@iu.edu