Before Creating a New Dataverse Dataset
- Decide which data files you need to share and will include in the dataset. Keep in mind that Dataverse is for data that is ready to be shared and made available; it is not working storage. If you need working storage, see ITS's recommendations or the Office of Research Computing for storage options for researchers, or use the Open Science Framework to manage your project.
- To prepare your data files for uploading, consider converting your files to a non-proprietary or an open format, if possible. You may make your files available in both proprietary and non-proprietary formats; it does not need to be one or the other. Depending on the file format, Dataverse will convert the file, but it cannot do this for all formats.
- Include documentation for your dataset to help others understand the contents of your dataset and the contents of specific files. Documentation can be in the form of a readme file, data dictionary, or codebook. Upload the documentation file(s) along with your data files. Check out these templates from University of Virginia and Cornell University.
- If your data files are already shared through another open repository, you do not have to also upload the files to Dataverse. You can create a metadata-only dataset (no files attached) and include, as part of the dataset metadata, the URL and name of the repository where your files are.
- If you need assistance, contact datahelp@gmu.edu.
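As a starting point for the data dictionary mentioned above, the column names of a CSV file can be turned into a skeleton that you then fill in by hand. This is only a minimal sketch: the function name, the output keys ("variable", "example", "description"), and the sample data are illustrative assumptions, not a format that Dataverse or the linked templates require.

```python
import csv
import io


def data_dictionary_skeleton(csv_text):
    """Return one entry per CSV column: its name, an example value, and a blank description."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    first_row = next(reader, [])  # empty list if the file has only a header
    entries = []
    for i, name in enumerate(header):
        example = first_row[i] if i < len(first_row) else ""
        entries.append({"variable": name, "example": example, "description": ""})
    return entries


# Hypothetical sample data, used only to show the shape of the result.
sample = "site_id,temp_c,collected_on\nA1,21.5,2023-04-01\n"
for entry in data_dictionary_skeleton(sample):
    print(entry)
```

The blank "description" values are where you would record units, allowed values, and how each variable was collected before uploading the documentation file with your data.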
Creating a New Dataset
- Log in to Dataverse at https://dataverse.orc.gmu.edu with your Mason NetID: click the Log In link at the top right of the page and select George Mason University Federated Login.
- Click on the "Add Data" button.
- If you would like to create a sub-dataverse for your department, school, or research center, please contact us at datahelp@gmu.edu and we will set one up for you.
- Fill in all the required fields. You will see them marked with a red asterisk on the metadata form.
- Title
- Author (can add more than one author)
- Contact (can add more than one contact)
- Dataset Description (abstract)
- Subject (can add more than one subject)
- Data Creation Date (the date when the data collection or other materials were produced or created, not when they were distributed, published, or deposited)
- Fill in any additional metadata fields your dataset needs.
- NOTE: You can come back and add more metadata after the initial dataset creation, at any time before the dataset is published. (See the Additional Metadata section below).
- If you are creating a metadata-only dataset because your data files are already available through another open repository, fill out the fields in the “Other Location for Dataset” section: Repository Name and URL of your dataset/files. You do not have to upload any files for metadata-only datasets. Skip to step 9.
- To add your files, scroll down to the "Files" section and click on "Select Files to Add."
- Tip: you can drag and drop or select multiple files at a time from your computer, directly into the upload area. Your files will appear below the "Select Files to Add" button.
- NOTE: You can come back and add more files after the initial dataset creation. (See the Upload Additional Files section below).
- The file upload limit is 3 GB. If you need to upload a larger file, contact datahelp@gmu.edu.
- For each file, you can change the filename and add a description and tags (via the "Edit Tags" button).
- Click the "Save Dataset" button when you are done. Your unpublished dataset is now created.
- Review your draft/unpublished dataset. The "Files" tab displays the files you have uploaded. If you need to add more files, delete files or modify file metadata, see the Dataverse User Guide for Editing Files. The "Metadata" tab displays the metadata. (See the Additional Metadata section below).
- If you have uploaded all the files and filled out all the metadata you need for your dataset, you can publish it (see the Submit a Dataset for Review and Publication section below). If you haven't finished adding files or completing your metadata, you can save your work and return later to publish your dataset.
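Before starting the form, it can help to check your prepared metadata and files against the two constraints described in the steps above: the required fields and the 3 GB upload limit. The sketch below is a hedged illustration only. The dictionary keys are hypothetical names chosen for this example (they are not Dataverse's internal field names), and it assumes "3 GB" means decimal gigabytes.

```python
# Hypothetical keys mirroring the required fields listed in this guide.
REQUIRED_FIELDS = [
    "title",
    "author",
    "contact",
    "description",
    "subject",
    "data_creation_date",
]

# Assumption: the 3 GB limit is interpreted as 3 * 1000^3 bytes.
MAX_FILE_BYTES = 3 * 1000**3


def missing_required(metadata):
    """Return the required fields that are absent or empty in the metadata dict."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


def oversized(files):
    """Given a mapping of filename -> size in bytes, return names over the limit."""
    return [name for name, size in files.items() if size > MAX_FILE_BYTES]


# Illustrative values only.
draft = {"title": "Stream temperatures", "author": "Doe, Jane", "subject": "Earth Sciences"}
sizes = {"readings.csv": 52_000, "imagery.tif": 4 * 1000**3}

print("Missing fields:", missing_required(draft))
print("Files to ask datahelp@gmu.edu about:", oversized(sizes))
```

A metadata-only dataset would skip the file check entirely, since no files are uploaded for it.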
Accessing your published and unpublished datasets
To go back to any of your datasets after you have logged in to Dataverse, click your name at the top right and select "My Data." You will see your published and unpublished datasets. Click a dataset link to view the record.
Additional Metadata
Go to the dataset you want to edit. Go to the "Metadata" tab and select "Add + Edit Metadata." Fill in any additional fields and click "Save Changes."
Upload Additional Files
All files should be uploaded to a dataset before it is published. To upload new files, go to the dataset you want to update and click the "Upload Files" button in the "Files" tab.
Submit a Dataset for Review and Publication
When you publish a dataset, you make it available to the public so that other users can browse or search for it. Once your dataset is ready to go public, go to your dataset page and click the "Submit for Review" button on the right-hand side of the page. Your data files will be reviewed to verify that they are publishable.
- If there are any issues with the data files, they will be returned to you. Someone from the University Libraries Digital Scholarship Center will reach out to you with suggestions for modifications and guidance. What may hold up the process?
- Messy data
- Data that includes personally identifiable information (PII) - please consult these guidelines for removing PII
- Lack of documentation such as a readme file, codebook, data dictionary, and related - guidelines for creating documentation
- When files are ready for publication, you will be sent a deposit agreement to sign and the files will be published.
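Since personally identifiable information is one of the issues that can hold up review, a rough automated pass over text-based files may catch obvious cases before you submit. The sketch below is a hedged illustration, not a substitute for the guidelines referenced above: the two patterns (email-like strings and US Social Security number-like strings) are assumptions chosen for the example and will miss many kinds of PII such as names, addresses, and phone numbers.

```python
import re

# Illustrative patterns only; real PII takes many more forms than these.
EMAIL_LIKE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_LIKE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def flag_possible_pii(lines):
    """Return (line_number, kind) pairs for lines matching either pattern."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if EMAIL_LIKE.search(line):
            hits.append((number, "email-like"))
        if SSN_LIKE.search(line):
            hits.append((number, "ssn-like"))
    return hits


# Made-up rows standing in for the lines of a data file.
rows = ["id,value", "7,jane@example.org", "8,123-45-6789"]
for line_number, kind in flag_possible_pii(rows):
    print(f"line {line_number}: {kind}")
```

Any hit still needs a human decision about whether the value is actually identifying and how to remove or redact it.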
NOTE: Once a dataset is made public, it can no longer be unpublished. If you notice a problem with your dataset, contact datahelp@gmu.edu to get it resolved. Data files may need to be removed if you need to redact data or have inadvertently published personally identifiable information.
Sharing a Published Dataset
Every published dataset in Dataverse is assigned a Digital Object Identifier (DOI). Use the DOI as the link when sharing and publicizing your dataset. The DOI is part of your dataset citation, which appears in the blue box on your dataset page.
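To use the DOI as a link, it is written as a URL on the public DOI resolver, https://doi.org/. The following minimal sketch normalizes a few common ways a DOI might be written into that form. The function name and the example DOI are hypothetical and used only for illustration.

```python
def doi_to_url(doi):
    """Return a https://doi.org/ link for a DOI given with or without a common prefix."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return "https://doi.org/" + doi


# Hypothetical DOI, not a real dataset.
print(doi_to_url("doi:10.1234/ABC/XYZ123"))
```

The resulting link resolves to the dataset's landing page and remains the same if the repository's own web address changes, which is why it is preferable to copying the browser URL.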
For more detailed instructions, consult the Dataverse User Guide.
If you need assistance, contact datahelp@gmu.edu.
The above instructions were adapted with permission from the University of Virginia Libraries.