Jan 28, 2023
Versioning: Insight into the life history of research data and research software
Changes to text-based research data or to an analysis script are quickly made, especially when working in a larger team. To keep research transparent, these changes need to be documented. Otherwise, even very practical and all too common questions in everyday research can cause difficulties. A few examples:
- What was the exact state of the analysis script when we evaluated the data three months ago?
- Exactly what changes have my colleagues made?
- How do I get to the part of the document that I accidentally deleted last week?
- When exactly did this error slip into our script?
- What exactly do the name suffixes of the different file versions mean?
In many cases, manually maintained change logs combined with systematic backups provide a remedy. For all its simplicity, however, such an approach is error-prone and requires a high degree of discipline.
Therefore, we recommend the use of an automatic version management system (e.g. Git, Subversion, Mercurial). These systems often have their origins in software development, but can be used largely independently of the content and file formats. In this case, you benefit from the following functionalities:
- Logging of changes to the files (incl. timestamp and person)
- Comparison between different versions of the files
- Restoration of older versions of the files
In addition, these systems make it possible to manage parallel lines of development and the shared work of several people on the same files in a professional way.
Git is one of the most widely used version control systems. The files are often initially managed in a local repository, which is essentially a simple folder. These local repositories can also be placed on a network drive or on web-based platforms such as GitHub or GitLab to make them available to a team. The version control functions are accessed via the command line of the operating system, a graphical user interface, or a corresponding web interface.
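To make the three functions listed above (logging, comparison, restoration) more concrete, here is a minimal sketch that drives the Git command line from Python. It assumes that Git is installed and on the PATH. The file name analysis.py, the threshold values, the author name and email, and the helpers git() and demo() are invented for illustration and are not part of any real project.

```python
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run a Git command inside `repo` and return its standard output."""
    result = subprocess.run(
        [
            "git",
            # Hypothetical author identity, set per call so that no global
            # Git configuration is needed or changed.
            "-c", "user.name=Example Researcher",
            "-c", "user.email=researcher@example.org",
            *args,
        ],
        cwd=repo,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def demo() -> dict:
    """Create a throwaway repository and show logging, comparison, restoration."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        script = repo / "analysis.py"  # hypothetical analysis script
        git(repo, "init")

        # First version of the script.
        script.write_text("threshold = 0.5\n")
        git(repo, "add", "analysis.py")
        git(repo, "commit", "-m", "Add analysis script")

        # Second version with a changed parameter.
        script.write_text("threshold = 0.7\n")
        git(repo, "add", "analysis.py")
        git(repo, "commit", "-m", "Raise threshold")

        # Logging: who changed what, and when (newest entry first).
        log = git(repo, "log", "--format=%an | %ad | %s").splitlines()

        # Comparison: differences between the previous and the current version.
        diff = git(repo, "diff", "HEAD~1", "HEAD", "--", "analysis.py")

        # Restoration: bring back the file as it was one commit earlier.
        git(repo, "checkout", "HEAD~1", "--", "analysis.py")
        restored = script.read_text()

        return {"log": log, "diff": diff, "restored": restored}


if __name__ == "__main__":
    outcome = demo()
    print("\n".join(outcome["log"]))
    print(outcome["diff"])
    print(outcome["restored"])
```

The same commands (git log, git diff, git checkout) can be typed directly on the command line. The Python wrapper is used here only to keep the example self-contained and repeatable.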
Members of TU Dresden can use the GitLab instance of TU Chemnitz free of charge.
Do you have further questions on this topic or would you like more detailed information on specific elements?
We would be glad to hear from you. Please contact us at Service Center Research Data by email or book an appointment.