Data management, also referred to as “data curation,” is increasingly important as agencies collect and use larger and more complex data sets, and as more staff use data to inform decisions. The data management principle suggests that agencies should work to ensure that data remain relevant beyond their initial use. Only when data continue to be accessible and up to date do they retain their value for research and decision making. Data management and curation therefore require agencies to consider the following aspects:
- Storage: acquiring and maintaining the hardware and software required to store and retrieve data in the future. Proper storage methods, including servers and databases, are necessary to support easy update and retrieval of data and to prevent corruption or degradation of data. Cloud storage is an increasingly secure, economical, and accessible approach.
- Geocoding: assigning geographic coordinates to data or otherwise assigning location attributes during data collection, so that the data can be queried for their spatial properties. Geocoding is almost entirely automated when data come from newer sources such as GPS and cellular devices.
- Updates: adding new samples of data over time and amending or archiving existing data to ensure a database best reflects current real-world conditions as well as trends over time. Periodic updates and refreshes help avoid obsolescence, or the aging of valuable information contained in the data.
- Metadata: generating descriptive information – “data about the data” – to provide a dictionary and attributes that inform how data will be used. This metadata can include information about the origin, file format, structure, attributes, quality, caveats, and previous use of data. Good metadata is helpful for future users who may not be familiar with the data source, as it enables them to quickly learn the “background” or context of the data.
- Access: controlling who has the ability to view and edit data. Data could be “open” or widely accessible, or otherwise privileged or restricted.
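As an illustration only, the metadata, update, and access aspects above could be captured together in a simple structured record. The following is a minimal sketch in Python, not a recommended implementation. Every name in it (`DatasetRecord`, `is_stale`, `can_view`, the field names, the one-year refresh threshold, and the example dataset) is a hypothetical assumption and is not drawn from this report or from any agency's system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DatasetRecord:
    """Hypothetical metadata record: 'data about the data'."""
    name: str
    origin: str                       # where the data came from
    file_format: str                  # e.g. "CSV"
    attributes: dict[str, str]        # data dictionary: column -> description
    last_updated: date                # date of the most recent refresh
    access: str = "restricted"        # assumed values: "open" or "restricted"
    caveats: list[str] = field(default_factory=list)
    authorized_roles: set[str] = field(default_factory=set)


def is_stale(record: DatasetRecord, today: date, max_age_days: int = 365) -> bool:
    """Flag a dataset whose last refresh is older than an assumed threshold."""
    return (today - record.last_updated).days > max_age_days


def can_view(record: DatasetRecord, role: str) -> bool:
    """Open data are viewable by anyone; restricted data only by listed roles."""
    return record.access == "open" or role in record.authorized_roles


# Hypothetical example dataset, for illustration only.
counts = DatasetRecord(
    name="example_traffic_counts",
    origin="roadside sensors (hypothetical)",
    file_format="CSV",
    attributes={
        "site_id": "identifier of the count location",
        "lat": "latitude in decimal degrees",
        "lon": "longitude in decimal degrees",
        "volume": "vehicles counted per day",
    },
    last_updated=date(2023, 1, 15),
    caveats=["counts are missing for days when a sensor was offline"],
    authorized_roles={"analyst"},
)
```

With a record like this, a check such as `is_stale(counts, date.today())` could prompt a periodic refresh, and `can_view(counts, "analyst")` could back a basic access rule. An actual agency would more likely rely on its existing data catalog and identity systems than on hand-written code of this kind.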
Data management touches other principles discussed in this report. For example, access and storage are closely tied to the principle of security, and the decision to store data should be consistent with choices made in light of the data minimization principle.
Principle Checklist:
- Does your agency retain and store information that it believes will be valuable for research or decision making in the future?
- Does your agency prescribe or otherwise require how geographic or location-related attributes are recorded and used?
- Are your agency’s data frequently refreshed, and will they be updated or cleaned in the future?
- Have the data themselves been explained or described logically so that the data are understood and used appropriately?
- Are the data easy to find by those who need to use them?
- Are the data openly available, and are any restrictions clearly stated and justified?
Answering these questions can help your agency understand whether the principle of data management applies to your data collection and use activity, and identify possible actions or tools to aid implementation. If the answer to any of these questions is “yes,” then the principle of data management could apply to your agency.