This lesson is still being designed and assembled (Pre-Alpha version)

Separating identifying variables from your data

Overview

Teaching: 0 min
Exercises: 0 min
Questions
  • What is sensitive data?

  • How can we make data non-sensitive and still useful?

Objectives
  • First learning objective. (FIXME)

Sensitive data are data that can be used to identify an individual, species, object, or location in a way that introduces a risk of discrimination, harm, or unwanted attention. Major, familiar categories of sensitive data are:

  • personal data

  • health and medical data

  • ecological data that may place vulnerable species at risk

Separating or de-identifying your data is generally done to protect an individual's privacy. According to the Australian Privacy Act 1988, “personal information is de-identified if the information is no longer about an identifiable individual or an individual who is reasonably identifiable”. De-identified information is no longer considered personal information and can be shared. More information on the Commonwealth Privacy Act is available at https://www.legislation.gov.au/Details/C2016C00979

De-identifying aims to allow data to be published, shared, and reused by others without the possibility of individuals or locations being re-identified. It may also be used to protect the location of archaeological findings, cultural data, or the location of endangered species.

Any identifiers (such as name, date of birth, address, or geospatial location) should be removed from the main data set and replaced with a code or key. The key that links each code to its identifiers is then preferably encrypted and stored separately from the data. Storing de-identified data in a secure solution helps you meet safety, controlled-access, ethical, privacy, and funding agency requirements.
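The separation step described above can be sketched in a few lines of Python. This is a minimal illustration, not a complete de-identification procedure: the field names, the sample records, and the helper `separate_identifiers` are all hypothetical, and the sketch does not encrypt the key table.

```python
import secrets

# Hypothetical identifier columns; adjust these to match your own data set.
IDENTIFIER_FIELDS = ("name", "date_of_birth", "address")


def separate_identifiers(records, identifier_fields=IDENTIFIER_FIELDS):
    """Split records into a de-identified data set and a separate key table."""
    deidentified = []
    key_table = []
    for record in records:
        # Random code, so it cannot be derived from the identifiers themselves.
        code = secrets.token_hex(8)
        identifiers = {f: record[f] for f in identifier_fields if f in record}
        remainder = {f: v for f, v in record.items() if f not in identifier_fields}
        key_table.append({"code": code, **identifiers})
        deidentified.append({"code": code, **remainder})
    return deidentified, key_table


# Made-up sample records for illustration only.
records = [
    {"name": "Alex Example", "date_of_birth": "1980-01-01",
     "address": "1 Sample St", "blood_pressure": 120},
    {"name": "Sam Sample", "date_of_birth": "1975-06-30",
     "address": "2 Sample St", "blood_pressure": 135},
]

deidentified, key_table = separate_identifiers(records)
print(deidentified)  # codes and measurements only
print(key_table)     # codes and identifiers; store this separately
```

In practice the key table would be written to a different, access-controlled location from the de-identified data, and preferably encrypted.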

Re-identifying an individual is possible by recombining the de-identified data set and the identifiers.
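As a rough sketch of why the key must be stored separately, the following Python shows how anyone holding both files can rejoin them on the shared code. The function `reidentify`, the field names, and the sample values are hypothetical and only serve the illustration.

```python
def reidentify(deidentified, key_table):
    """Rejoin de-identified records with their identifiers via the shared code."""
    identifiers_by_code = {row["code"]: row for row in key_table}
    combined = []
    for record in deidentified:
        identifiers = identifiers_by_code.get(record["code"])
        if identifiers is None:
            continue  # no key for this code: the record stays de-identified
        combined.append({**identifiers, **record})
    return combined


# Made-up sample data for illustration only.
deidentified = [
    {"code": "a1", "blood_pressure": 120},
    {"code": "b2", "blood_pressure": 135},
]
key_table = [{"code": "a1", "name": "Alex Example"}]

reidentified = reidentify(deidentified, key_table)
print(reidentified)
```

Only the record whose code appears in the key table is re-identified; without the key, the remaining record cannot be linked back to a person by this route.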

Australian practical guidance for De-identification (ARDC)

The Australian Research Data Commons (ARDC), formerly known as the Australian National Data Service (ANDS), has released a fabulous guide on de-identification. The guide is intended for researchers who own a data set and wish to share it safely with fellow researchers or publish it. The guide is available at https://www.ands.org.au/working-with-data/sensitive-data/de-identifying-data

Here are examples of practical guidelines available nationally:

Tips for managing de-identification (ARDC)

Management of identifiable data (ARDC)

Data may often need to be identifiable (i.e. contain personal information) during the process of research, e.g. for analysis. If data are identifiable, ethical and privacy requirements can be met through access control and data security. This may take the form of:

Safely sharing sensitive data guide (ARDC)

Attribution:

Key Points

  • First key point. Brief Answer to questions. (FIXME)