Data management resources and reading
Need to learn more about data management? Here is a list of useful reading material, with brief summaries to help steer you towards the items most relevant to you. The resources are grouped into three categories: practical advice, principles, and implementations.
Practical advice
- Ten Simple Rules for Creating a Good Data Management Plan
Article by William K. Michener published in PLOS Computational Biology describing how to write a good data management plan in ten simple rules. Each of the rules is placed in the context of the data life cycle.
- Ten Simple Rules for Digital Data Storage
Article by Edmund M. Hart et al. published in PLOS Computational Biology giving practical advice on how to store and organise data files.
- How to share data with a statistician
Guide by Jeff Leek giving practical advice on how to share data with a statistician or data scientist.
- Tidy Data
Article by Hadley Wickham published in the Journal of Statistical Software introducing the concept of “tidy data”, a specific structure that makes it easier to visualise and work with tabular data.
Principles
- The FAIR Guiding Principles for scientific data management and stewardship
Article by Mark D. Wilkinson et al. published in Scientific Data describing principles required for machines to be able to discover and consume data. The FAIR principles described are: Findability, Accessibility, Interoperability, and Reusability.
- 10 aspects of highly effective research data
Elsevier blog post by Anita de Waard, Helena Cousijn, and IJsbrand Jan Aalbersberg describing an extension of the FAIR principles that is meant to function as a roadmap for incremental and continual improvements of data management processes and systems.
Implementations
- Implementing a genomic data management system using iRODS in the Wellcome Trust Sanger Institute
Article by Gen-Tao Chiang et al. published in BMC Bioinformatics describing their experience of implementing a data management solution using iRODS to manage the large volumes of sequencing data generated within the Wellcome Trust Sanger Institute.
- openBIS: a flexible framework for managing and analyzing complex data in biology research
Article by Angela Bauch et al. published in BMC Bioinformatics describing the implementation and usage of openBIS, a framework for constructing information systems for managing biological data.