Data management can broadly be considered any activity involving data other than actually using the data.
Data management is best illustrated by the following examples:
Researchers collaborating on projects often need to share primary data and preliminary results, so they need to transfer data between computers. The most common method for transferring files is by email attachment, but there are limits on the size of files that can be sent this way. Removable storage media, such as USB keys, CDs or DVDs, can transfer large amounts of data, but require the researcher to physically carry the data to its destination.
To assist good data management, VU has made the R: drive available to all researchers and research students. The VU R: drive is a central storage space which is secure and backed up. The default allocation for a project is 10 GB and more is available on request. The storage can be used for anything from working files to long-term retention of research data and files.
Benefits of using the R: drive:
Researchers often work on their university desktop as well as a laptop, and possibly a home computer. Typically files are just copied back and forth between the computers. This is the most obvious method but has a number of drawbacks:
If you find you are synchronising your data regularly and are experiencing difficulties with this, then you should consider using the R: drive and accessing it remotely using the VU VPN (Virtual Private Network). This way you can edit all your data in the one place. You will need to install the VU VPN software on the remote computer, run it and then connect. If you are using a VU laptop, you may have access to the R: drive as soon as you are connected. If not, you may need to follow the next section on non-VU PCs.
Download the instructions from the ITS Wireless Network Knowledgebase (ignore the confusing name) and download the software from the ITS downloads page (bottom of page). There are instructions and downloads for both Windows and Mac.
Many research projects are carried out collaboratively: between postgraduates and their supervisors, within departmental research groups, as cross-discipline research, and as inter-university research. When working with a large volume of research data, it is worth considering collaborative tools such as the R: drive or the AARNet CloudStor service. CloudStor allows people to send or distribute a large file to a number of internal and external colleagues. Despite the name, you cannot "store" files in CloudStor for any reasonable period. CloudStor is the equivalent of email for large files.
When the data is constantly being edited, especially by multiple users, it is a good idea to implement some form of version control to keep track of changes. This can be as simple as appending a number to the end of the file name after each major edit. For example:
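A numbering convention of this kind can also be applied by a small script. The Python sketch below is illustrative only: the `_v01` suffix style, the function name `next_version_name` and the file name `survey_results.csv` are assumptions made for the example, not a VU standard.

```python
import re
from pathlib import Path


def next_version_name(filename: str) -> str:
    """Return the file name to use for the next major edit.

    A trailing "_vNN" counter is incremented if present; otherwise
    "_v01" is appended before the file extension.
    """
    path = Path(filename)
    match = re.fullmatch(r"(.*)_v(\d+)", path.stem)
    if match:
        base = match.group(1)
        number = int(match.group(2)) + 1
        width = len(match.group(2))  # keep the existing zero padding
    else:
        base, number, width = path.stem, 1, 2
    return f"{base}_v{number:0{width}d}{path.suffix}"


if __name__ == "__main__":
    print(next_version_name("survey_results.csv"))      # survey_results_v01.csv
    print(next_version_name("survey_results_v01.csv"))  # survey_results_v02.csv
```

Zero-padded numbers are used here so that the files sort in edit order in a file browser.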
Such conventions are good for simple work but quickly become unmanageable when you have multiple authors or make lots of edits. The alternative is to use revision (or version) control software. Such programs offer several advantages:
The software prompts you to enter a description of the changes made, which makes it easier to pick up where you left off and for collaborators to see what you are doing.
You can make major changes with confidence, because you can revert to an earlier version if you make a mistake. You can also easily compare two versions to help you find errors.
It is useful for people who work on more than one computer: it implicitly provides synchronisation and helps to resolve conflicting changes.
TortoiseSVN is a popular client for the Subversion version control system. It integrates with Windows Explorer, which makes it one of the easiest to use.
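To give a sense of what comparing two versions involves, the following is a minimal Python sketch using the standard `difflib` module. It is not how TortoiseSVN or any particular version control program works internally; the function name `compare_versions` and the sample text are assumptions made for the example.

```python
import difflib


def compare_versions(old_text: str, new_text: str) -> list[str]:
    """Return a unified diff of two versions of a text, one entry per line.

    Removed lines start with "-", added lines start with "+".
    An empty list means the two versions are identical.
    """
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="old version",
            tofile="new version",
            lineterm="",
        )
    )


if __name__ == "__main__":
    old = "Sample A: 3.1\nSample B: 4.2\n"
    new = "Sample A: 3.1\nSample B: 4.7\n"
    for line in compare_versions(old, new):
        print(line)
```

Version control software performs this kind of comparison for you, against any earlier revision, without you having to keep the old copies by hand.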
Managing your data allows you to work more efficiently, produce higher quality data, achieve greater exposure for your research, and protect your data from being lost or misused.
Making regular backups of data is probably the most important and, fortunately, one of the easiest tasks to manage.
Although most people are quite aware of the risk and cost of losing data through hard drive failure or accidental deletion, it is best to have a policy and schedule in place for maintaining data backups. When considering your backup strategy, you need to know:
Backup security requires further mention. If the data is sensitive then it should not be stored on a computer that is connected to the internet, and preferably not on one connected to any network. If the data needs to be destroyed at the end of a project then consider what level of destruction is required: simply deleting the files is not enough, and a hard drive will need to be securely overwritten to ensure that no data can be recovered.
You can use the VU R: drive for backup. If you use it as your main storage, backups happen automatically.
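If you also keep your own copies, a backup routine can be scripted. The following Python sketch copies a project folder to a timestamped backup folder and verifies each copied file by checksum. It is a sketch under stated assumptions, not a VU-provided tool: the function names and any folder paths you pass in are hypothetical, and it does not replace the automatic backups of the R: drive.

```python
import hashlib
import shutil
from datetime import datetime
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_folder(source: Path, backup_root: Path) -> Path:
    """Copy `source` into a new timestamped folder under `backup_root`.

    Every copied file is checked against the original by checksum;
    an OSError is raised if any copy differs.
    """
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"{source.name}-{stamp}"
    shutil.copytree(source, target)  # fails if the target already exists
    for original in source.rglob("*"):
        if original.is_file():
            copy = target / original.relative_to(source)
            if sha256_of(original) != sha256_of(copy):
                raise OSError(f"Backup verification failed for {original}")
    return target
```

Each run creates a new folder rather than overwriting the previous one, so earlier backups remain available until you delete them according to your own retention schedule.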