How can you set up research data management from the start?

A good plan is half the work.

Research Proposal

Research Proposal / Grant Office

Grant programmes from organisations like NWO, ZonMw and ERC require you to think about the method of data collection, the journey of the data in your research project and how to protect or share data during and after the research project. It is important to bear in mind the specific laws and regulations that apply to the kind of data that is collected. If a project involves data on individuals and organisations this impacts the design of the necessary IT infrastructure. A more detailed description of this will later be captured in the data management plan.

When writing your research proposal the following items are important:

  • Fill in the Data Management Section if your funder requires this
  • Planning: One of the early deliverables will be a detailed Data Management Plan
  • Budget: Take into account the costs (labour and material) for data storage during and data archiving after your project.
  • Writing: Funders that distribute grants like to maximise the effectiveness of this investment. It is therefore highly recommended that the data will be made Findable, Accessible, Interoperable and Re-usable (FAIR Principles). This does not mean that the data have to be open: Laws, licenses and contracts regarding personal and sensitive data may limit the possibility to share the data publicly.

The [RDM Support Desk](https://vu.nl/en/about-vu/more-about/rdm-support-desk) provides advice and help when writing a Data Management Section as part of the research proposal. Also make sure to reach out to the VU Grants Office (IXA-GO) as early as possible for advice and practical aid for your grant in general.

Data Management Section

Many funders require researchers to include a section in their project proposal about Research Data Management, in which they explain whether existing data will be reused, whether new data will be collected or generated during the project, and how they plan to structure, archive and share their data. Depending on requirements of the funder, the paragraph can be short or more extensive.

Funders may have different requirements for the data management section in the project proposal. Always check what your funder asks for. Below is a list of information on data management sections from main Dutch funding bodies.

We recommend that you ask the RDM Support Desk for advice when writing your data management section.

RDM Costs

Costs & Data Management

Many research funders encourage applicants to include data management and sharing costs in research proposals. Some funders will provide advice on costs related to data management. Some remarks on costs are provided here:

  • The Data Management Plan should describe the activities that incur costs and provide justification for the allocation of resources (example: acquisition of a programmer who will write software needed to capture the data).
  • No expenditure can be ‘double funded’, i.e. a service that is centrally supported by indirect costs must not be included as a direct cost as well (example: computers that are already provided to employees and paid for by the university may not be included).
  • The budget and justification should broadly indicate where RDM costs will be incurred, where possible. E.g. data capture and cleaning, data curation and preservation, data sharing.
  • Include budget for long-term storage if data are expected to be deposited in a repository not funded by the university or external funders (VU repositories are: DataverseNL, Yoda). 🔒 VU has an internal breakdown of costs for storage and archiving for VU-managed storage and repositories.

A practical costing tool is available from the UK Data Archive. Based on this costing tool, Utrecht University has developed a guide to calculate the costs of data management. You can use those guides as well to estimate the costs needed specifically for RDM.

Most material costs of the storage solutions offered by the VU are covered centrally (up to 500 GB), but if you need to specify the costs for your project, look at the 🔒 Research & Archiving Storage Cost Model

Examples to put in a data management plan:

| Data Stage | Dataset | Type of data | Costs |
| --- | --- | --- | --- |
| Raw data | Interviews | Audio files | Audio equipment rental; location rental costs; data storage & backup |
| Processed data | Transcription of interviews | Word files | Personnel costs: hiring research assistants for manual entry; data storage & backup |
| | Analysis software | R script | Personnel costs: programmer to write a programme to mine the data |
| Analysed data | Regression graphic | Photoshop files | Software costs |
| | Project Website | HTML, Java | Hosting fee; personnel to build initial website |

RDM Requirements

If you do research at the VU, you may be subject to the requirements for Research Data Management formulated by various parties. Please check which requirements apply to your research project.

Many funders have specific requirements for RDM. The exact requirements vary by funder. They usually include a Data Management Section in the project proposal and a Data Management Plan (DMP) after funding has been granted. As funding agencies invest financially in your research project, they often have demands concerning research integrity, data quality, data publication and reusability. As research output, data are often compared to a kind of public good that should be made available to the community for re-use if possible. Always check what demands are set by a funder before you apply.

Funding agencies

Data management section in project proposal

When you apply for a grant, some funders request a short data section in your project proposal or an outline of a Data Management Plan. Without these your proposal will not be eligible for review.

Data Management Plan

In a Data Management Plan (DMP; see also the section Data Management Plan) you explain how you will handle your research data. Check with your funder at what stage a DMP has to be submitted and how it should be composed. VU has a DMP template that has been acknowledged by NWO, ZonMw and ERC. We recommend that you use this VU template. See the DMP page for more information and instructions on how to select this template in DMPonline.

The tool DMPonline can be used to access and fill in a DMP template. You can also write a DMP in collaboration and invite a third party to comment or give feedback on your DMP. You can use the button ‘Request feedback’ to ask for feedback from a data steward. In order to write a DMP, you need to create your own account.

Overview of funders’ RDM requirements and DMP templates

The Consortium of European Social Science Data Archives (CESSDA) presents a comprehensive overview of data management requirements and templates of the main Dutch and European funding bodies. This is helpful if you want to quickly find more information. However, make sure you always check the details that you receive in the documentation of your actual funding agency, so that you are aware of all up-to-date requirements.

Publishing your data and terms of use

Normally a funder requires you to publish your data in a data repository at the end of the project (unless this is prohibited by legislation). For that reason, DMP templates usually include the following questions:

  • where your dataset can be found
  • whether your dataset has a Persistent Identifier
  • how your data are documented
  • whether your data may be reused freely or not and which terms and conditions apply

Please consider your funder’s data publishing requirements, so that you can take the necessary steps before and during your research project. For example, if you are working with personal data and you want to publish them in a data repository, this needs to be included in the informed consent forms that your participants have to sign.
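The four questions listed above can be tracked as a small checklist while you prepare your DMP. The following is a minimal sketch in Python, not part of any VU or funder tooling; the class name `DatasetPublication` and its field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class DatasetPublication:
    """Hypothetical record of the publication details a DMP template usually asks for."""
    repository: Optional[str] = None             # where the dataset can be found
    persistent_identifier: Optional[str] = None  # e.g. a DOI, once it has been assigned
    documentation: Optional[str] = None          # how the data are documented
    terms_of_use: Optional[str] = None           # licence or conditions for reuse

    def missing(self) -> List[str]:
        """Return the names of the questions that have not been answered yet."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Example: a draft in which only two of the four questions are answered.
draft = DatasetPublication(repository="DataverseNL", documentation="README and codebook")
print(draft.missing())  # ['persistent_identifier', 'terms_of_use']
```

A list like this is only a reminder; the actual wording and level of detail are set by your funder's template.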

Local requirements from your university and faculty

The VU is committed to supporting research that meets the highest requirements of replicability and transparency. The FAIR data principles, the purpose of which is to render research data Findable, Accessible, Interoperable and Reusable, the General Data Protection Regulation (GDPR) and the principles of Open Science are at the foundation of the Research Data Management (RDM) policy of the VU.

In addition to the central policy for RDM, faculties of the VU also have developed their own implementation of this policy.

Please check the relevant local policies and Standard Operating Procedures relevant for your faculty or department before you start your research project. An overview of all available policy documents can be found in the section VU policies and regulations.

Consortium partners

Partner institutions in a consortium may also have research data management requirements, for example with respect to data security. They may ask for:

  • certification in relation to data security of the VU’s infrastructure
  • statements from the IT department about the IT systems being used at the VU

The RDM Support Desk or your faculty’s research support office can help you with this.

Collaboration

Some research projects involve more than one partner organisation. Be sure to indicate exactly who is responsible for collecting and managing the data in each case, where, and how. If more than one organisation is involved, it may also be necessary to create a Consortium Agreement. Depending on the area or sector of each project and on the degree of technical complexity involved, the Consortium Agreement usually contains the following information:

  • provisions on the governance structure of the consortium;
  • technical provisions (e.g. the tasks of each party and the project schedule, description of the data collection responsibilities);
  • financial provisions (e.g. the distribution of funds among participants, the financial plan, etc).

The agreement can include a section on who is ultimately responsible for the data and whether the data will be shared afterwards or whether certain restrictions on re-use apply. These restrictions can also be related to copyright issues or pending patent requests. IXA can help you to draw up a consortium agreement. The RDM Support Desk at the University Library can also help with questions about legal matters.

If you are working with personal data, GDPR requires that all parties working with the data sign a joint controller agreement. You can ask your 🔒 Privacy Champion for advice about this. For multi-centre clinical research, a Clinical Trial Agreement is recommended.

For projects funded by the European Union, several sources are available:

Data Security

Data classification

‘Security’ is often regarded as a fixed state. Therefore, people tend to think of security measures as fixed solutions in the form of technological measures. In reality, security is an assessment of whether the level of protection against a certain threat deals with that threat adequately. Whether or not security is adequate depends on the value of the data and the quality of protective measures.

The value of data or applications is established through classification in Confidentiality, Integrity and Availability (CIA) or in Dutch Beschikbaarheid, Integriteit en Vertrouwelijkheid (BIV).

Traditionally, this classification assesses the value of an entity (data or application) to an organisation. For research data, however, the value to the University is in all cases the same. The value of each research project is the same. Does that mean that there is no need to classify research data? Referring back to the definition of security, it is the assessment of the level of protection against a certain threat and its adequacy depends on the value of (in this case) data. The reason to classify research data is that there is a huge variety in potential risks in case of data loss or theft.

The reason that VU and its researchers need to classify data is to understand the variety in risk that exists in order to assess if security measures are adequate.

Data classification is about the level of sensitivity (low, medium or high) of your data assets so you can judge the risks to your research (group). This will help you when deciding what security and protection measures you need to take for handling the data or parts of the data.

Data classification criteria

In order to classify your data collection or data processing (as low, medium or high), the following properties are considered.

  • Availability: what risks are associated with accessibility to data (i.e. how readily do the data need to be available for use and how damaging would it be to your research if data are lost), what measures should you take to prevent data loss?
  • Integrity: what do you do to prevent measurement or data entry errors, corruption of stored data or unauthorised changes to the stored data?
  • Confidentiality: how securely do data need to be managed to prevent sharing of data with unauthorised individuals? The necessity for confidentiality depends on the sensitivity of the information, either as sensitive personal information or confidential business information, as well as the vulnerability of the subjects from whom the data is collected and the laws that apply to the data being collected and analysed. In some cases, confidentiality can be very high; when the confidentiality is high or very high, please contact the RDM Support Desk.

For all of these aspects, the damage impact should be considered, i.e. the risks to all parties involved (participants, but also the VU as an institute, the researchers, any collaborators etc.). Untoward outcomes could be loss of privacy/secrecy, reputation damage, financial costs, fraud, mental, social or physical harm.
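To make the three criteria concrete, the sketch below records a low, medium or high rating for each of availability, integrity and confidentiality. It is a minimal illustration in Python and not the VU classification tool. Two things in it are assumptions rather than VU policy: the names `Classification`, `overall` and `needs_support_desk` are hypothetical, and the rule that the highest single rating determines the overall level is a common convention that the text above does not prescribe.

```python
from dataclasses import dataclass

LEVELS = ("low", "medium", "high")  # ordered from least to most sensitive


@dataclass(frozen=True)
class Classification:
    availability: str
    integrity: str
    confidentiality: str

    def __post_init__(self) -> None:
        # Reject ratings outside the three levels used in this section.
        for name in ("availability", "integrity", "confidentiality"):
            value = getattr(self, name)
            if value not in LEVELS:
                raise ValueError(f"{name} must be one of {LEVELS}, got {value!r}")

    def overall(self) -> str:
        """Assumption: the highest single rating sets the overall level."""
        ratings = (self.availability, self.integrity, self.confidentiality)
        return max(ratings, key=LEVELS.index)

    def needs_support_desk(self) -> bool:
        """High confidentiality is the case in which the text advises contacting the RDM Support Desk."""
        return self.confidentiality == "high"


# Example: interview recordings that must stay confidential but are easy to re-check.
interviews = Classification(availability="medium", integrity="low", confidentiality="high")
print(interviews.overall())             # high
print(interviews.needs_support_desk())  # True
```

The ratings themselves still have to come from a considered assessment of the damage impact; the code only keeps them together and flags the case that calls for advice.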

Examples of Highly classified data

Your data are classified as ‘high’ when you collect or process the following data:

  • personal data
  • state secrets
  • competitive corporate information
  • animal-testing data

Personal data

Do not confuse the risks of data loss with the need to comply with legal regulations. Data security is part of risk management and is aimed at balancing protection against productivity, investments against profit. The General Data Protection Regulation is a European Law in the legal area of Human Rights and concerns the use of personal data. Personal data are a type of data that is commonly processed in many fields of scientific research. You collect or process personal data when the data can be linked to a unique individual, either directly through direct identifiers such as name, address, IP-address etc., or indirectly through a combination of information. Personal data need to be protected. More information about personal data, data protection and the GDPR can be found in the section GDPR & Privacy.
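Indirect identification through a combination of information can be illustrated with a simple count. The sketch below, in Python, flags records whose combination of values occurs only once in a dataset. The function name `unique_combinations`, the column names and the sample records are all hypothetical and invented for illustration. Passing such a check does not make data anonymous; it only points at the most obvious cases.

```python
from collections import Counter


def unique_combinations(records, quasi_identifiers):
    """Return the records whose combination of quasi-identifier values occurs only once."""
    def key(record):
        return tuple(record[q] for q in quasi_identifiers)

    counts = Counter(key(r) for r in records)
    return [r for r in records if counts[key(r)] == 1]


# Invented sample data: two records share a combination, one stands alone.
records = [
    {"sex": "F", "age_band": "30-39", "postcode_area": "1081"},
    {"sex": "F", "age_band": "30-39", "postcode_area": "1081"},
    {"sex": "M", "age_band": "50-59", "postcode_area": "1081"},
]
at_risk = unique_combinations(records, ["sex", "age_band", "postcode_area"])
print(len(at_risk))  # 1
```

Here the third record is the only one with its combination of sex, age band and postcode area, so it could single out one person even though no name is stored.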

Data Classification tool for researchers

To help you to determine the data classification for your research data assets, the VU has developed a tool that will help you to assess and classify the availability, integrity and confidentiality risks of these assets. Based on your results from using the tool, you may need to seek further advice from VU Security and Privacy Experts (see below). Some basic security tips were compiled by the data steward of the Faculty of Behavioural and Movement Sciences.

VU Security and Privacy experts

VU Security and Privacy experts can help you with the details on these aspects.

  • General questions about information security: RDM Support Desk. If you need advice when determining the data classification of your data assets, you can contact them.
  • Reporting a (potential) data breach: IT Servicedesk. A data breach is an incident in which the confidentiality, integrity or availability of information or data processing systems may have been threatened, for example attempts to gain unauthorised access to information or systems (hacking), the loss of a USB stick with sensitive information, data theft of hardware.
  • Tailored advice or support: The RDM Support Desk can assist researchers in the process of requesting capacity at IT for setting up and/or assessing of information security plans or paragraphs. An information security plan is particularly important in projects with a complex infrastructure (e.g. international collaboration, use of various data sources and databases), tailored solutions and requirements from funding agencies or external partners.

Read more practical information about this below in the section Data Protection & Security, or the GDPR support section.

Data Protection & Security

Where sensitive information is collected, the researcher must consider the following:

  • who has access to the data during the study, and how the data will be made available after publication
  • what security regimes apply to sensitive data, and how data are protected
  • how data access during and after the project will be managed
  • how to deal with sensitive information
  • whether informed consent is required and how the forms will be accessed and stored

On the 🔒 VU Intranet information is available on Security, data loss and reporting incidents. Legal experts also can help you if you have questions about working with personal data and/or if you have to perform a Data Protection Impact Assessment. On the VU website you can find more information about 🔒 DPIAs at the VU. The data steward for the Faculty of Behavioural and Movement Sciences has also created a guide about data encryption.

GDPR & Privacy

GDPR in Practice

Important definitions

  • Personal data refers to any information relating to an identified or identifiable natural person (‘data subject’). See also the definition of ‘personal data’ according to the official text of the GDPR.
  • Data processing refers to any action performed on data, such as collecting, storing, modifying, distributing, deleting data. See also the definition of ‘processing’ according to the official text of the GDPR.
  • Direct and indirect identification: Some identifiers enable you to single out an individual directly, such as name, address, IP-address etc. Individuals can also be identified indirectly through:
    • a combination of information that uniquely singles out an individual (e.g. a male with breast cancer in a breast cancer registry, a pregnant individual over 50 etc.), this includes information in one record and information across different data files or datasets
    • unique information or patterns that are specific to an individual (e.g. genomic data, a very specific occupation, such as the president of a large company, repeated physical measurements or movement patterns that create a unique profile of an individual or measurements that are extreme and could be linked to subjects such as high-level athletes)
    • data that are linked to directly identifying information through a random identification code or number
  • Pseudonymous data: Data that are indirectly identifiable are generally considered to be pseudonymous; this means that they are NOT anonymous and still qualify as personal data. Therefore privacy laws, such as the GDPR, do in fact apply to these data. This is for example the case when direct identifiers are removed from the research data and put into a key file (or what is usually called a subject identification log in medical research) with which the direct identifiers can be mapped to the research data through unique codes, so that reidentification is possible. These data are therefore pseudonymous, and not anonymous. The LCRDM has made a reference card that illustrates the difference between pseudonymous and anonymous data.
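The key-file arrangement described under pseudonymous data can be sketched as follows. This is a minimal Python illustration, not a VU-endorsed procedure; the function name `pseudonymise`, the column names `name` and `email`, and the sample participants are hypothetical. Because the key file still maps each code back to a person, the research data it produces remain personal data under the GDPR.

```python
import secrets

DIRECT_IDENTIFIERS = ("name", "email")  # hypothetical column names


def pseudonymise(records, direct_identifiers=DIRECT_IDENTIFIERS):
    """Split records into research data and a key file, linked by random codes."""
    research_data, key_file = [], []
    used_codes = set()
    for record in records:
        code = secrets.token_hex(4)
        while code in used_codes:  # regenerate on the rare collision
            code = secrets.token_hex(4)
        used_codes.add(code)
        # The key file keeps the direct identifiers next to the code.
        key_file.append({"code": code, **{k: record[k] for k in direct_identifiers}})
        # The research data keep everything else, plus the code.
        research_data.append(
            {"code": code, **{k: v for k, v in record.items() if k not in direct_identifiers}}
        )
    return research_data, key_file


# Invented sample data.
participants = [
    {"name": "Participant A", "email": "a@example.org", "score": 12},
    {"name": "Participant B", "email": "b@example.org", "score": 17},
]
research_data, key_file = pseudonymise(participants)
```

In practice the key file would be stored separately from the research data, with access limited to the people who need it; where and how to store it is a question for your Privacy Champion or data steward.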

Background information

Privacy in research - Privacy five-step plan

Where research requires the collection of personal data, the researcher has to follow the Privacy five-step plan to make sure to carry out the research in line with the GDPR.

VSNU Code of Conduct for using personal data in research

The VSNU’s Code of Conduct for Research Integrity (Dutch, English, 2018) includes a reference to the GDPR and its Dutch implementation law UAVG. An updated Code of Conduct for Using Personal Data in Research which complies with the GDPR is still a work in progress.

Support within your faculty: Privacy Champions

Each faculty has one or more Privacy Champions, who are the first point of contact for questions relating to privacy and the GDPR. The Privacy Champions can help you with completing a Data Protection Impact Assessment, registering your research in the record of processing activities, designing informed consent forms and other questions relating to the GDPR. The 🔒 list of Privacy Champions can be found on VU’s website. It is important that you make an overview of what data you are collecting. Your privacy champion can help you with this.

An important issue in informed consent forms is the possible future (re-)use of the data. The Privacy Champion of the Faculty of Behavioural and Movement Sciences prepared a checklist for what to consider when creating an informed consent form. You should always ask your 🔒 Privacy Champion for advice when drawing up an informed consent form.

Complete a Data Protection Impact Assessment (DPIA)

When scientific research includes the processing of personal data, conducting a Data Protection Impact Assessment (DPIA) may be a legal requirement under the GDPR. If it is not a legal requirement, conducting a DPIA is always a helpful exercise to make sure that you address all legal aspects that need to be addressed. It is the best way to GDPR-proof your research.

What is a DPIA?

A DPIA is an assessment to identify the risks of processing personal data. It consists of a number of questions on the basis of which you determine whether the processing of personal data in your research project is legitimate and which measures should be taken to make sure this processing takes place within the boundaries of the GDPR. A DPIA doesn’t deliver an automatic report at the end, but it rather makes you think about all relevant topics you need to address before starting the processing of personal data. The outcome of a DPIA should be used to determine appropriate measures to mitigate the identified risks, such as data minimisation (not collecting more data than necessary), pseudonymising data, selecting appropriate tools for data storage and data sharing.

When is a DPIA required?

A DPIA is required when the processing of personal data is likely to result in a “high risk” for the participants of your research project. This is for example most likely the case when scientific research includes the processing of special categories of personal data, such as data concerning health, religious or philosophical beliefs, political opinions or criminal convictions and offences (see Privacy in Research - 10 key rules for more information about special categories of personal data).

There are two DPIA lists which describe situations in which a DPIA is required:

  • The Dutch data protection authority (Autoriteit Persoonsgegevens) has published a list of 17 “high risk” situations in which a DPIA is mandatory.
  • The European data protection authorities have together published a list of 9 criteria which can be used to determine whether there is a “high risk”.

You should consult your 🔒 Privacy Champion to determine whether a PreDPIA is required in your situation.

How can I complete a DPIA?

The VU has a DPIA template based on a form provided by the Dutch Government (see the original template if you wish to have more background information, only available in Dutch).

You should request the template from your 🔒 Privacy Champion.

Please complete a DPIA at least before you start collecting personal data. In some cases, it might be useful to have a look at the DPIA template at the stage of writing a research proposal.

If you are not sure whether it is required to conduct a DPIA or if you need help completing a DPIA, please contact your faculty’s 🔒 Privacy Champion. If needed they can contact the legal specialists of Institutional and Legal Affairs.

Policies & Regulations

VU General Policies and Regulations

Research data management policy

VU Amsterdam considers the careful handling of research data to be very important. The university has therefore formulated a Research Data Management policy which provides guidance for researchers and policy officers at VU Amsterdam.

Since the VU policy for RDM is formulated in general terms, faculties have worked out more detailed guidelines for their own faculty. These faculty-specific guidelines can be found below.

For RDM policies and guidelines at Amsterdam UMC, location VUmc, please get in touch with Research Data Management Support at Amsterdam UMC.

Academic integrity complaints procedure

VU Amsterdam and Amsterdam UMC, location VUmc, employ a joint policy for handling academic integrity complaints. This policy outlines the steps to be taken in the event of a complaint, the officers who play a role in this procedure, and what should be expected once a complaint has been lodged.

Confidential counsellors

The VU has a number of confidential counsellors who handle academic integrity issues.

Data breach incident report

From 2016 onwards, any data security breaches (particularly those that have, or are likely to have, serious adverse consequences to the protection of personal data) should be reported immediately to the IT Servicedesk. Read the 🔒 protocol reporting a data breach.

Regulations and Guidelines

Some faculties and departments have their own guidelines for RDM. You can find an overview of such guidelines below.

Ethics Committees

In cases where research involves human or animal participants, a research proposal may need to be reviewed by an ethics committee. VU and Amsterdam UMC, location VUmc, have several ethics committees, which are listed below. Please note that researchers at the VU also have to go to the METc at VUmc if their research is subject to the WMO, which is not restricted to research at VUmc.

Netherlands Code of Conduct for Scientific Integrity

Dutch scientists are required to comply with the Netherlands Code of Conduct for Research Integrity (VSNU, 2018). The principles of proper scientific and scholarly research, according to the Code of Conduct are:

  • Honesty
  • Scrupulousness
  • Transparency
  • Independence
  • Responsibility

The principles of honesty and transparency state explicit guidelines on the way in which you treat your research data:

  • Honesty: you should refrain from fabricating or falsifying data
  • Transparency:
    • You should ensure that it is clear to others what data your research is based on, how the data were obtained, what the results are and how you got to these results
    • All steps in your research process must be verifiable (e.g. choice of research question, research design, methodology, sources used), so that it is clear to others how your research was conducted

To live up to these general principles, the Code of Conduct provides the following standards, which are addressed in a DMP, for good research practices related to data management:

  • Provide a description of the way in which the collected research data are organised and classified, so that they can be verified and re-used (standard 3.2.10)
  • Make research data public upon completion of your research project; if this is not possible, explain why (standards 3.2.11 and 3.4.45)
  • Describe the data you have collected and used in your research honestly, scrupulously and transparently (standard 3.3.23)
  • Manage your data carefully and store both the raw and processed versions for a period appropriate for your discipline (standard 3.3.24)
  • Contribute towards making data FAIR, where possible (standard 3.3.25)
  • Be transparent about your methods and working procedures by using e.g. research protocols, logs, lab journals or reports to describe these processes (standard 3.4.35)

Strategy Evaluation Protocol

The Strategy Evaluation Protocol 2021-2027 (SEP) from the VSNU is used to assess the quality of research at Dutch universities, NWO and Academy institutes. It promotes the handling and storing of raw and processed data with care and integrity.

The SEP formulates questions on how a research institute deals with and stores raw and processed data. It also assesses the output of research institutes, including datasets, and the use of such output by peers and societal target groups.

By registering your datasets in the VU Research Portal, you contribute to an overview of datasets of your department, faculty and the VU as a whole.

NWO Data Policy

NWO aims to ensure that all the research it funds is openly accessible to everyone as part of its Open Science policy. Researchers are therefore expected to preserve the data resulting from their projects for at least ten years, unless legal provisions or discipline-specific guidelines dictate otherwise. As much as possible, research data should be made publicly available for re-use. As a minimum, NWO requires that the data underpinning research papers should be made available to other researchers at the time of the article’s publication, unless there are valid reasons not to do so.

The guiding principle here is ‘as open as possible, as closed as necessary.’ Due consideration is given to aspects such as privacy, public security, ethical limitations, property rights and commercial interests. In relation to research data, NWO recognizes that software (algorithms, scripts and code developed by researchers in the course of their work) may be necessary to access and interpret data. In such cases, the data management plan will be expected to address how information about such items will be made available alongside the data.

More information on data management is also available on the NWO website, where an NWO Data Management Template is made available. The VU Data Management template in DMPonline is certified by both NWO and ZonMw and can also be used by VU researchers for projects funded by both organizations.