Choosing the Best Data Repository for You

What is data storage and why should you care?

Data storage occurs throughout the research process. Short-term storage of data during the active data collection phase of a research project or ephemeral storage of backup copies of data during sequential steps of data analysis may be accomplished on local computer drives or in a networked environment. Long-term storage or archiving of final data products is more likely to occur in a networked environment or in off-site repositories. All of these storage decisions and steps are an important part of the research process and the data management lifecycle.

For more information, please visit the UNLV Libraries' Data Management Guide.

Choosing a repository

There are many discipline-specific archives and repositories available to researchers, and certain publishers or funding agencies may require you to deposit your research data in one of them. When choosing a repository for your data, keep the following questions in mind.

How large are your data files?

Some repositories can only accommodate projects up to a certain size, so find out how large your files are before you commit to a repository.
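As a rough way to answer this question, the sketch below adds up the size of every file in a project folder and compares it with a size limit. The function names and the limit you pass in are illustrative assumptions, not values taken from any particular repository; check the repository's own documentation for its actual limits.

```python
from pathlib import Path


def total_size_bytes(folder):
    """Return the combined size, in bytes, of every file under `folder`."""
    return sum(p.stat().st_size for p in Path(folder).rglob("*") if p.is_file())


def fits_limit(folder, limit_gb):
    """Return True if the folder's total size is at or below `limit_gb`.

    `limit_gb` is a hypothetical limit in decimal gigabytes (1 GB = 1000**3 bytes).
    """
    return total_size_bytes(folder) <= limit_gb * 1000**3
```

For example, fits_limit("my_project_data", 50) would report whether a folder named my_project_data (a placeholder name) stays within an assumed 50 GB limit.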

Are your data files in an open format (e.g. .csv instead of .xlsx, .pdf instead of .docx)?

File formats evolve over time. While some formats are free to use, others, such as Microsoft Word or Excel, require accounts or subscriptions. Distributing your data in open formats ensures that more people will be able to access your research.
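One way to check a project before deposit is to scan the file names for proprietary extensions. The sketch below is a minimal illustration: the mapping contains only the two examples mentioned above, and both the mapping and the function name are assumptions to extend for your own project.

```python
from pathlib import Path

# Hypothetical mapping built from the examples in this guide; extend as needed.
OPEN_ALTERNATIVES = {".xlsx": ".csv", ".docx": ".pdf"}


def flag_proprietary(filenames):
    """Return (filename, suggested open extension) pairs for flagged files."""
    flagged = []
    for name in filenames:
        ext = Path(name).suffix.lower()  # compare extensions case-insensitively
        if ext in OPEN_ALTERNATIVES:
            flagged.append((name, OPEN_ALTERNATIVES[ext]))
    return flagged
```

The function only reports which files to look at. Converting them is a separate step, usually done by exporting from the original application.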

What information do you need to provide for others so your data is reusable?

Data repositories will ask for some form of documentation, such as README.txt files, which detail information and processes that enable researchers to replicate and reuse data.
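To make this concrete, the sketch below assembles the text of a simple README.txt. The fields shown (title, creator, file descriptions, methods) are an assumed minimal layout, not a template required by any repository; follow your repository's own guidance where it exists.

```python
def build_readme(title, creator, files, methods):
    """Return README text from a few basic fields.

    `files` maps each file name to a one-line description.
    All field names here are illustrative assumptions.
    """
    lines = [f"Title: {title}", f"Creator: {creator}", "", "Files:"]
    for name, description in files.items():
        lines.append(f"  {name}: {description}")
    lines += ["", "Methods:", f"  {methods}"]
    return "\n".join(lines) + "\n"
```

The returned text can then be saved with something like Path("README.txt").write_text(text) and deposited alongside the data files.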

Has your data been de-identified enough to allow for sharing?

Are you working with sensitive data or does your data need restricted access? Repositories will have different capabilities in terms of data sharing. Repositories such as the social science repository ICPSR offer different options regarding data confidentiality and restricted use.
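A first step toward de-identification is often removing columns that identify people directly. The sketch below drops named columns from CSV text; the column names in the usage note are hypothetical. Dropping direct identifiers alone may not be enough, since combinations of other fields can still identify individuals, so treat this as a starting point rather than a complete procedure.

```python
import csv
import io


def drop_columns(csv_text, identifiers):
    """Return CSV text with the columns named in `identifiers` removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [c for c in reader.fieldnames if c not in identifiers]
    out = io.StringIO()
    # extrasaction="ignore" skips the dropped columns when writing each row.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

For example, drop_columns(text, {"name", "email"}) keeps every column except the assumed name and email columns.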

When should your data be released?

Does your data need to be embargoed (withheld from publication until a specified date), or can it be released to the public immediately?

How would you like to license your data?

While the facts in a dataset generally cannot be copyrighted, datasets can be licensed, so it's important to know how you would like your data to be reused. CC0 allows data to be reused in the most open fashion, but some researchers prefer CC BY, which requires attribution to the creator.

General repositories: a comparison

The Registry of Research Data Repositories is a tool for helping people identify and locate online repositories of research data.

For a comparison of popular data repositories, please view our Data Repositories spreadsheet.

Need help?

If you have any questions about depositing your data in external repositories, please reach out to the Data Librarian, Halle Burns, at halle.burns@unlv.edu or set up a consultation.
