Research Data Storage
Each research project has distinct data storage requirements, which must be matched against the capabilities of the available storage platforms to identify the most appropriate solution. Such requirements may include:
- High-capacity storage with provisions for ongoing growth,
- Storage of sensitive or restricted data, and
- Support for computation and analysis.
In some research contexts, particularly those involving Indigenous communities, data storage must also respect community preferences related to ownership and stewardship. Communities may request that data be stored on infrastructure they provide, thereby ensuring community possession and oversight rather than university-based custody. For researchers interested in this approach, the BC First Nations Data Governance Initiative (BCFNDGI) offers a Privacy and Security Policy Manual, which includes model policies for data storage.
To assist in selecting an appropriate storage environment, researchers are encouraged to consult the Research Data Storage Finder, an interactive resource developed by the Department of Library and Archives, which catalogues recommended storage and backup options.
Broadly, data storage platforms fall into three primary types (local, network, and cloud storage), each with its own advantages and limitations.
Local storage refers to devices that are either integrated into a researcher’s computer (e.g., internal hard drives or solid-state drives) or directly connected to it (e.g., external hard drives, USB flash drives, DVDs, and other physical media).
Advantages include:
- Simplicity of use: External drives can be attached directly to a computer and used immediately, without the need for specialized software or account setup.
- High transfer speed: Data can be written to or retrieved from local devices quickly, without depending on network bandwidth.
- Affordability for moderate capacity needs: Local devices are comparatively inexpensive for data volumes up to approximately 10 terabytes.
- Portability: External devices can be moved and transported with relative ease.
- Offline accessibility: Data remains available without requiring internet connectivity.
- Restricted physical access: Only individuals with direct possession of the device can access the stored data, which may enhance control over small-scale projects.
Network storage consists of devices connected to a local or institutional network rather than directly to a single computer. Such storage is typically operated by campus IT services, or by individual researchers using network-attached storage (NAS) devices.
Key strengths include:
- Controlled access: Data is available only to individuals authorized to connect to the designated server or institutional network.
- Centralization for collaborative work: Network storage facilitates shared access by multiple researchers, making it well-suited for collaborative groups or teams.
- Backup functionality: Network storage supports automated or scheduled backup processes, providing a reliable safeguard against data loss.
Cloud storage refers to platforms that host data online and allow remote access via the internet. Providers may be commercial services (e.g., Dropbox, Google Drive, Microsoft OneDrive) or open-source systems (e.g., Seafile, ownCloud, Nextcloud) that research organizations or institutions host themselves.
Advantages include:
- Version control and file recovery: Most services retain change histories, enabling reversion to earlier versions of documents and recovery of deleted files.
- File sharing and collaboration: Cloud platforms facilitate seamless sharing across collaborators, including joint editing and annotation.
- Universal accessibility: Data can be accessed from any device with an internet connection, regardless of location.
Limitations to consider:
- Access speed: Files may need to be downloaded prior to analysis, which can slow workflows when handling large datasets.
- Security risks: Cloud storage requires rigorous safeguards such as strong, unique passwords and ongoing account protection. Vulnerabilities remain, including the potential for platform breaches or unauthorized access.
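To make the access-speed limitation concrete, the following minimal sketch estimates how long a dataset would take to download before analysis can begin. The function name, the use of decimal gigabytes, and the example figures (500 GB over a 100 Mbit/s connection) are illustrative assumptions rather than measurements, and real transfers are usually slower than this idealized estimate because of protocol overhead and shared bandwidth.

```python
def transfer_time_hours(size_gb: float, bandwidth_mbps: float) -> float:
    """Estimate the hours needed to transfer size_gb gigabytes
    over a link sustaining bandwidth_mbps megabits per second.

    Assumes decimal units (1 GB = 1000 MB) and ignores overhead.
    """
    if size_gb < 0 or bandwidth_mbps <= 0:
        raise ValueError("size must be >= 0 and bandwidth must be > 0")
    size_megabits = size_gb * 1000 * 8      # gigabytes -> megabits
    seconds = size_megabits / bandwidth_mbps
    return seconds / 3600                   # seconds -> hours


# Hypothetical example: a 500 GB dataset on a 100 Mbit/s connection.
hours = transfer_time_hours(500, 100)
print(f"Estimated download time: {hours:.1f} hours")
```

Under these assumptions, the download alone takes roughly 11 hours, which is why large datasets are often better analyzed on storage located close to the computing resources.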
Ensuring the security of research data is an essential component of responsible research data management (RDM). This document outlines the use of 7-Zip for file and folder encryption and Cryptomator for cloud storage encryption, emphasizing up-to-date best practices for both individual and collaborative contexts.
Why Encrypt?
- Protect sensitive and confidential research data from unauthorized access
- Ensure compliance with institutional, legal, and ethical obligations for research data security
Recommended Tools
| Feature | 7-Zip | Cryptomator |
| --- | --- | --- |
| Cost/License | Free, open source | Free, open source |
| Platforms | Windows, Linux (CLI) | Windows, macOS, Linux, mobile (iOS, Android) |
| Encryption | AES-256 on entire archive; option to encrypt file names | AES encryption per file; transparent folder encryption for cloud |
| Primary Use | Local file/folder encryption and compression | Secure cloud sync (e.g., Google Drive) |
| Sharing | Send encrypted archive plus password | Share vault folder in cloud + vault password |
| Cloud Operation | Not optimized for cloud; for local files/folders | Designed for cloud storage, but when used with Google Drive, requires Google Drive for desktop to be in "mirror" mode |
| Download Location | 7-Zip Download | Cryptomator Download |
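As an illustration of the 7-Zip row above, the following minimal sketch assembles a 7-Zip command line that creates an AES-256 encrypted `.7z` archive with file names encrypted as well. It assumes the 7-Zip command-line program is installed and available as `7z`, `7zz`, or `7za`; the helper names `build_7z_encrypt_command` and `encrypt_folder` are hypothetical and exist only for this example. The `-p` switch is given without a value so that 7-Zip prompts for the password, rather than having it stored in a script or shell history.

```python
import shutil
import subprocess
from pathlib import Path


def build_7z_encrypt_command(source, archive, executable="7z"):
    """Return the argument list for creating an encrypted .7z archive.

    a        add files to an archive
    -t7z     use the 7z format (encrypted with AES-256)
    -mhe=on  also encrypt archive headers, hiding file names
    -p       no password given here, so 7-Zip prompts for one
    """
    return [executable, "a", "-t7z", "-mhe=on", "-p",
            str(archive), str(source)]


def encrypt_folder(source, archive):
    """Run 7-Zip on source; raises if no 7-Zip executable is found."""
    exe = shutil.which("7z") or shutil.which("7zz") or shutil.which("7za")
    if exe is None:
        raise FileNotFoundError("7-Zip command-line tool not found on PATH")
    subprocess.run(build_7z_encrypt_command(source, archive, exe), check=True)


# Show the command that would be run for a hypothetical folder.
print(build_7z_encrypt_command(Path("interview_data"), Path("interview_data.7z")))
```

Calling `encrypt_folder("interview_data", "interview_data.7z")` would then prompt for a password in the terminal. When sharing the resulting archive, the password should travel through a different channel than the archive itself.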