Developing Your Data Management Plan

Introduction

Today, most federally and privately funded grant applications require submission of a data management plan, detailing decisions around description, format, storage, ethics, and more. In response to researcher needs in these areas, UNLV Libraries Data Services faculty created Developing Your Data Management Plan. This resource introduces issues and questions for you to consider as you begin developing your data management plan.

Data Services is happy to help you work through your data management needs. If you would like to consult with a librarian, please review the questions and considerations below prior to your meeting.

Data Description

A data description defines how data is collected and documented so that others can eventually replicate and reuse it.

  1. What type(s) of data will be produced?
  2. How much data do you anticipate collecting and generating by the end of your project? (This can be a rough estimate, for example, 3 gigabytes.)
  3. How will data be collected?
  4. What software, tools, or programming languages will you use for the creation, analysis, or visualization of the data?
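
A rough volume estimate for question 2 can come from simple arithmetic. The sketch below is only an illustration: the function name, the file counts, and the file sizes are hypothetical, and it assumes decimal units (1 GB = 1000 MB).

```python
def estimate_total_gb(n_files: int, avg_file_mb: float, copies: int = 1) -> float:
    """Return a rough storage estimate in gigabytes (assumes 1 GB = 1000 MB)."""
    if n_files < 0 or avg_file_mb < 0 or copies < 1:
        raise ValueError("counts and sizes must be non-negative; copies must be >= 1")
    return n_files * avg_file_mb * copies / 1000


# Hypothetical project: 1,200 survey exports of about 2.5 MB each.
raw_gb = estimate_total_gb(1200, 2.5)

# The same data kept as one working copy plus two backups.
stored_gb = estimate_total_gb(1200, 2.5, copies=3)

print(f"Raw data: {raw_gb:.1f} GB; with backups: {stored_gb:.1f} GB")
```

An estimate like this is usually enough for a plan; it can be revised once real file sizes are known.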

Format

File formats refer to ways information is stored in a computer file. Formats change over time, so it is important to consider the potential for format obsolescence when choosing a format for long-term storage.

  1. What file formats will be used in your project?
  2. Are your file formats open or proprietary?
  3. Are your file formats a community standard?
  4. If your data is to be documented using an uncommon or proprietary file format, will it be converted to an open format for public distribution?
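
One way to start answering these questions is to inventory the formats already in a project folder. The sketch below counts files by extension and lists those a project has decided to convert before sharing. The file names and the set of flagged extensions are hypothetical choices for illustration, not an authoritative list of proprietary formats.

```python
from collections import Counter
from pathlib import Path

# Assumption: extensions this hypothetical project plans to convert
# to an open format before public distribution.
CONVERT_BEFORE_SHARING = {".sav", ".xls", ".sas7bdat"}


def inventory_formats(filenames):
    """Count files per lowercase extension; files without one go under '(none)'."""
    return Counter(Path(name).suffix.lower() or "(none)" for name in filenames)


def needs_conversion(filenames, flagged=CONVERT_BEFORE_SHARING):
    """Return the sorted file names whose extension is in the flagged set."""
    return sorted(name for name in filenames if Path(name).suffix.lower() in flagged)


# Hypothetical project files.
files = ["survey_2023.sav", "codebook.pdf", "results.csv", "Interviews.XLS", "README"]

print(dict(inventory_formats(files)))
print(needs_conversion(files))
```

For a real folder, the list of names could be gathered with `[p.name for p in Path("data").rglob("*") if p.is_file()]`, where `data` stands in for your own directory.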

Metadata

Metadata is essentially data about data. Metadata helps you document the contents or the characteristics of your data files. What is documented, however, changes depending on the metadata standard you use.

  1. Does your research field possess disciplinary standards required for data documentation?
  2. What details are necessary for others to use your data and/or code?
  3. What directory and file naming conventions will you use?
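
A file naming convention is easier to follow when names are generated rather than typed by hand. The sketch below assumes one possible pattern, project_date_description_version.extension; the pattern, the function name, and the example values are illustrative assumptions rather than a required standard.

```python
import re
from datetime import date


def slug(text: str) -> str:
    """Lowercase the text and replace runs of other characters with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def build_filename(project: str, description: str, collected: date,
                   version: int, extension: str) -> str:
    """Build a name like project_YYYYMMDD_description_v01.ext (assumed pattern)."""
    ext = extension.lstrip(".").lower()
    return f"{slug(project)}_{collected:%Y%m%d}_{slug(description)}_v{version:02d}.{ext}"


# Hypothetical dataset collected on 5 March 2024, second version.
print(build_filename("Water Quality", "Site A readings", date(2024, 3, 5), 2, ".CSV"))
# prints water-quality_20240305_site-a-readings_v02.csv
```

Putting the date in year-month-day order and zero-padding the version number keeps files in chronological order when sorted by name.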

Ethics and Privacy

Not all data can be shared openly. Particularly when working with human subjects or sensitive data, ethical and privacy considerations must be taken into account prior to sharing your data.

  1. Does your data require security considerations (e.g. sensitive data)?
  2. Do you have any datasets or code that cannot be shared? Examples:
    1. Data withheld to protect intellectual property
    2. Data owned by another party, or regulated by policy or law
  3. If you possess information that cannot be shared, will a portion of the data be transformed to allow public dissemination? Examples:
    1. Identifiers removed from human subject data
    2. Only aggregate data disseminated
    3. Use of a restricted data repository (e.g. ICPSR)
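
The first two transformations above can be sketched in a few lines. The example drops direct identifier columns and then publishes only group averages, suppressing small groups. The column names, the records, and the minimum group size are hypothetical, and removing direct identifiers alone does not guarantee that individuals cannot be re-identified, so treat this as an illustration rather than a complete de-identification procedure.

```python
from collections import defaultdict
from statistics import mean

# Assumption: column names that directly identify a person in this made-up dataset.
DIRECT_IDENTIFIERS = {"name", "email"}


def drop_identifiers(rows, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the rows without the identifier columns."""
    return [{k: v for k, v in row.items() if k not in identifiers} for row in rows]


def aggregate_mean(rows, group_key, value_key, min_group_size=2):
    """Average value_key per group, omitting groups smaller than min_group_size."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {g: mean(v) for g, v in groups.items() if len(v) >= min_group_size}


# Hypothetical records.
records = [
    {"name": "A. Example", "email": "a@example.org", "department": "Biology", "score": 80},
    {"name": "B. Example", "email": "b@example.org", "department": "Biology", "score": 90},
    {"name": "C. Example", "email": "c@example.org", "department": "History", "score": 70},
]

shareable = drop_identifiers(records)
print(shareable[0])
print(aggregate_mean(shareable, "department", "score"))
```

Here the History group is left out of the aggregate because it contains a single person, whose score would otherwise be disclosed.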

Storage and Backup

Properly storing and backing up your data is essential to protecting it from corruption or loss. Knowing how you plan to store data, particularly sensitive data such as data on human subjects, ultimately protects your subjects as well as your data.

  1. How will you track versions of your datasets and/or code?
  2. How and where will your data be stored?
  3. What measures will you take to protect your data?
    1. Who has access to your data?
    2. Is your data password protected?
    3. Is your data encrypted?
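
One lightweight way to track versions and detect corruption is to record a checksum for every file and compare the records over time. The sketch below uses SHA-256 from the Python standard library; the function names and the manifest layout are assumptions for illustration, and it complements rather than replaces backups or a version control system.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to directory) to its checksum."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(old, new):
    """List files that were added, removed, or modified between two manifests."""
    return sorted(name for name in old.keys() | new.keys() if old.get(name) != new.get(name))


# Hypothetical manifests with shortened placeholder checksums.
before = {"raw/site_a.csv": "aaa1", "raw/site_b.csv": "bbb2"}
after = {"raw/site_a.csv": "aaa1", "raw/site_b.csv": "ccc3", "raw/site_c.csv": "ddd4"}
print(changed_files(before, after))
```

Saving the output of `build_manifest` alongside each backup makes it possible to confirm later that a restored copy matches the original.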

Intellectual Property Rights

Intellectual property rights are the rights given to a person over something of their own creation. Most commonly, these rights take the form of copyrights, patents, or licenses. Most data in the United States is not copyrightable; however, you can license your data.

  1. How would you like your data to be cited by others?
  2. Are there any restrictions or conditions for re-use of which others should be aware?
    1. Do you want to license your data? (e.g. Creative Commons)
    2. Is your data collected from other sources? If yes, are there any restrictions or conditions for that data?

Access and Sharing

Access and sharing of data may be done in conjunction with archiving and preservation (below). You may provide access to your dataset at the request of a funder or publisher, or out of your own interest in seeing others reuse your data.

  1. When will you share your data? 
  2. Which of the datasets or software used or generated during the project will you share? 
  3. What tools will be required to use your datasets (e.g., software, instruments)?
  4. How will your datasets or code be shared with others (e.g., made publicly accessible or shared with researchers by direct request only)?
    1. View the re3data.org registry of research data repositories.
  5. Who is expected to use the shared datasets and/or software?

Archiving and Preservation

This section suggests considerations for long-term data storage and preservation. This effort may be done in conjunction with access and sharing (above).

  1. Which datasets and/or code will be preserved after the project (i.e., stored with sufficient organization and documentation for future reference)? 
  2. Who is the audience for your preserved data?
  3. How long will data and/or code be preserved after the project?
  4. Who will manage and administer the preserved data?
  5. If you are using a repository or service to share your data, is there a formal archiving or data-sharing agreement required for depositing to the archive?
