A research group in the natural sciences aims to document the preparation and examination of samples in such a way that research questions, methods, and results remain traceable and reusable. Analyses of collected data are performed by a variety of researchers, with and without programming skills.
Frequently changing research questions, altered protocols, and high turnover among researchers characterize the research routine. Conducting laboratory work is costly and time-consuming, so as much legacy data as possible should be integrated into later analyses. Collected data must be archived for at least 10 years, and it should be possible to aggregate it in a structured manner for dissemination to other research groups or for publications.
Requirements: Flexibility and semantic data structuring
This case study describes requirements to help design data management in a FAIR (Findable, Accessible, Interoperable, Reusable) manner:
- Linking samples to SOPs (Standard Operating Procedures), source materials, (electronic) laboratory notebooks, and instrument settings and results, in addition to previous and resulting scientific publications.
- Flexibility when SOPs, experimental procedures, experimental setups, or the equipment used change: it must still be possible to run the same searches across experiments performed differently, in order to compare results or evaluate them collectively.
- Accessing and searching the data and their connections must be both user-friendly and powerful; programmatic access via APIs (application programming interfaces) must also be possible.
- Exporting individual records or larger data sets must be easy and must use standardized data formats for exchange and archiving.
Implementation with LinkAhead
LinkAhead links data records of various types with each other by default and makes these links searchable in the same way as the managed data itself. Thus, the complete lifecycle of publications, hypotheses based on them, experiments, analyses, and further publications can be represented. Data sets can be searched and accessed either directly in LinkAhead's web interface or via an API, for which freely available client libraries (on GitLab, e.g. for Python) already exist.
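What it means to search through links can be illustrated with a minimal plain-Python sketch. It does not use LinkAhead or its client libraries; the `Record` class, the `find` helper, and all record names are hypothetical and only show how samples might be filtered by a property of a linked SOP.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A generic record: a type, free-form properties, and links to other records."""
    record_type: str
    name: str
    properties: dict = field(default_factory=dict)
    links: list = field(default_factory=list)  # references to other Record objects


def find(records, record_type, linked_type=None, **linked_props):
    """Return records of `record_type`.

    If `linked_type` is given, keep only records that link to at least one
    record of that type whose properties match `linked_props`.
    """
    hits = []
    for rec in records:
        if rec.record_type != record_type:
            continue
        if linked_type is None:
            hits.append(rec)
            continue
        for target in rec.links:
            if target.record_type == linked_type and all(
                target.properties.get(key) == value
                for key, value in linked_props.items()
            ):
                hits.append(rec)
                break
    return hits


# Hypothetical example data: two samples prepared under different SOP versions.
sop_v1 = Record("SOP", "staining-protocol", {"version": 1})
sop_v2 = Record("SOP", "staining-protocol", {"version": 2})
paper = Record("Publication", "earlier-study")
sample_a = Record("Sample", "sample-A", links=[sop_v1, paper])
sample_b = Record("Sample", "sample-B", links=[sop_v2])
records = [sop_v1, sop_v2, paper, sample_a, sample_b]

# Search samples by a property of the SOP they are linked to.
by_sop_v2 = find(records, "Sample", linked_type="SOP", version=2)
print([rec.name for rec in by_sop_v2])
```

In this sketch the link is treated as just another searchable attribute, which is the point the paragraph above makes; a real system would additionally handle persistence, permissions, and a query language.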
Flexible data models are part of LinkAhead’s DNA, so later adaptations remain possible even if new types of data are added or the way they are connected changes fundamentally. Search queries then return both “old” and “new” data sets. The data held in LinkAhead, or parts of it, can be exported in a standardized way, for example as XML, and thus be passed on or archived independently.
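The following sketch, again in plain Python with only the standard library, illustrates the two ideas in this paragraph under simplifying assumptions: a query that matches on the properties it names and ignores any others returns both older and newer records, and the result can be serialized to XML. The record layout, the `query` and `to_xml` helpers, and the XML element names are hypothetical and do not reflect LinkAhead's actual export format.

```python
import xml.etree.ElementTree as ET

# Hypothetical records: the newer one carries a property the older one lacks,
# as might happen after a protocol or data model change.
records = [
    {"type": "Experiment", "name": "exp-2019-01",
     "properties": {"temperature_C": "21"}},
    {"type": "Experiment", "name": "exp-2024-07",
     "properties": {"temperature_C": "21", "humidity_percent": "40"}},
]


def query(records, record_type, **props):
    """Match on type and on the given properties only; extra properties are ignored."""
    return [
        rec for rec in records
        if rec["type"] == record_type
        and all(rec["properties"].get(key) == value for key, value in props.items())
    ]


def to_xml(records):
    """Serialize records to an XML string (element names are illustrative)."""
    root = ET.Element("Records")
    for rec in records:
        rec_el = ET.SubElement(root, "Record", type=rec["type"], name=rec["name"])
        for key, value in rec["properties"].items():
            prop_el = ET.SubElement(rec_el, "Property", name=key)
            prop_el.text = value
    return ET.tostring(root, encoding="unicode")


# The same query returns records created before and after the model change.
hits = query(records, "Experiment", temperature_C="21")
xml_text = to_xml(hits)
print(xml_text)
```

Because the query only constrains the properties it mentions, adding `humidity_percent` to newer records does not exclude the older ones from the result.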