What is data curation?
In simple terms, data curation is the art of maintaining the value of data (source: TechRepublic). Curation at CanWIN consists of several steps that take a dataset from its raw state to a form that is understandable, retains its value over the long term, and is FAIR (Findable, Accessible, Interoperable and Reusable). These steps include:
- Collecting data
- Organizing data
- Cleaning data
- Enhancing/enriching data
- Standardizing data
- Storing/preserving data for future use.
Data curation takes place at all stages of the data life cycle.
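To make the cleaning and standardizing steps listed above more concrete, here is a minimal sketch in Python. It is not CanWIN's actual workflow. The sample records, the column names in `COLUMN_MAP`, and the missing-value codes in `MISSING` are all assumptions made up for illustration.

```python
from datetime import datetime

# Hypothetical raw records; the field names and values are invented for illustration.
RAW_ROWS = [
    {"Station": " ST01 ", "Date": "2019/07/04", "Temp (C)": "12.5"},
    {"Station": "st02", "Date": "2019/07/05", "Temp (C)": "NA"},
    {"Station": " ST01 ", "Date": "2019/07/04", "Temp (C)": "12.5"},  # duplicate
]

# Hypothetical mapping from raw column headers to standardized names.
COLUMN_MAP = {"Station": "station_id", "Date": "date", "Temp (C)": "water_temp_c"}

# Assumed codes that mean "no value" in the raw file.
MISSING = {"", "NA", "N/A", "-999"}


def clean_row(row):
    """Organize and clean one record: rename columns, trim text, mark missing values."""
    cleaned = {}
    for raw_name, value in row.items():
        value = value.strip()
        cleaned[COLUMN_MAP[raw_name]] = None if value in MISSING else value
    return cleaned


def standardize_row(row):
    """Standardize formats: upper-case IDs, ISO 8601 dates, numeric temperatures."""
    out = dict(row)
    out["station_id"] = row["station_id"].upper()
    out["date"] = datetime.strptime(row["date"], "%Y/%m/%d").date().isoformat()
    if row["water_temp_c"] is not None:
        out["water_temp_c"] = float(row["water_temp_c"])
    return out


def curate(rows):
    """Run the cleaning and standardizing steps, dropping exact duplicates."""
    seen = set()
    curated = []
    for row in rows:
        record = standardize_row(clean_row(row))
        # Sort by field name only, so values of mixed types are never compared.
        key = tuple(sorted(record.items(), key=lambda kv: kv[0]))
        if key not in seen:
            seen.add(key)
            curated.append(record)
    return curated


CURATED = curate(RAW_ROWS)
```

With the sample rows above, `CURATED` holds two records: the duplicate is dropped, the dates are in ISO 8601 form, and the `NA` temperature is kept as an explicit missing value rather than being silently removed.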
The Data Life Cycle
The sections below highlight key parts of the curation process recommended by CanWIN prior to uploading a dataset.
- Submitting data and metadata - the steps for submitting data and metadata to CanWIN's Data Catalogue when you do not have an account. If you need assistance filling out the metadata forms, the Help for completing metadata forms page covers the metadata fields that users most often have questions about.
- CanWIN Curated Vocabularies - best-practice vocabularies for your dataset's variable names and metadata fields, which can be added to the dataset to give users better context.
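As a rough illustration of how a curated vocabulary can be applied to variable names, the sketch below maps raw column names onto preferred terms and reports any that have no match. The vocabulary entries and aliases shown here are hypothetical and are not taken from CanWIN's actual vocabularies.

```python
# Hypothetical controlled vocabulary: preferred term -> accepted aliases.
# These entries are placeholders, not CanWIN's real vocabulary.
VOCABULARY = {
    "water_temperature": {"temp", "water temp", "wtemp"},
    "salinity": {"sal", "salin"},
}


def build_alias_index(vocabulary):
    """Return a lookup from every alias (and preferred term) to its preferred term."""
    index = {}
    for preferred, aliases in vocabulary.items():
        index[preferred] = preferred
        for alias in aliases:
            index[alias] = preferred
    return index


def map_variables(names, vocabulary):
    """Map raw variable names to preferred terms; collect names with no match."""
    index = build_alias_index(vocabulary)
    mapped = {}
    unmatched = []
    for name in names:
        key = name.strip().lower()  # compare case-insensitively, ignoring padding
        if key in index:
            mapped[name] = index[key]
        else:
            unmatched.append(name)
    return mapped, unmatched


MAPPED, UNMATCHED = map_variables(["Temp", "SAL", "chl_a"], VOCABULARY)
```

Here `Temp` and `SAL` resolve to `water_temperature` and `salinity`, while `chl_a` lands in `UNMATCHED`, which flags it for manual review against the vocabulary before upload.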