Mitigating Cyber Attack Risks - Data De-identification

In the ever-expanding digital landscape, where data reigns supreme, privacy remains a precious commodity. As we navigate how businesses use new technologies, the need to safeguard personal data has never been more critical. In this era of data-driven decision-making, data de-identification plays an important role in maintaining the balance between hyper-personalisation and privacy.

This Insight focuses on the importance of de-identification in mitigating cyber risks, and how best to implement it in your business.

What is data de-identification?

‘De-identification’ is the process of removing or altering data so that individuals can no longer be ‘reasonably identified’. Personal identifiers can be removed, obscured, aggregated or altered in some way so that the data no longer contains personal information or information from which a person can be reasonably identified.

De-identification is used by Australian businesses as a privacy-enhancing tool. When done correctly, it can help your business comply with its obligations under the Privacy Act 1988 (Cth) (Privacy Act) and Australian Privacy Principles (APPs), build trust in your data governance, and mitigate the impact of a cyberattack on your organisation.

Methods of De-Identification

De-identification usually involves removing direct identifiers, or removing or altering other identifying information. Two examples of de-identification methods are the Safe Harbour method and pseudonymisation.

Safe Harbour method

The Safe Harbour method involves removing identifiers from data, for example:

Names
Dates, except the year
Geographic data
Phone numbers
Email addresses
Medical record numbers
Account numbers
Certificate/licence numbers
Web URLs
Device identifiers
IP addresses
Biometric identifiers

While this method is relatively simple and low cost, it can be restrictive and result in data having limited utility.
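As a rough illustration of the approach, the sketch below drops identifier fields from a single record and reduces any date to its year. The field names and the sample record are hypothetical and would need to be mapped to your own data; this is a minimal sketch, not a complete Safe Harbour implementation.

```python
from datetime import date

# Hypothetical field names for illustration only; a real schema will differ.
IDENTIFIER_FIELDS = {
    "name", "address", "phone", "email", "medical_record_number",
    "account_number", "licence_number", "url", "device_id",
    "ip_address", "biometric_id",
}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record without identifier fields, keeping only the year of any date."""
    cleaned = {}
    for field, value in record.items():
        if field in IDENTIFIER_FIELDS:
            continue  # drop the identifier entirely
        if isinstance(value, date):
            cleaned[field] = value.year  # dates are reduced to the year
        else:
            cleaned[field] = value
    return cleaned


# Made-up sample record
patient = {
    "name": "Jane Citizen",
    "email": "jane@example.com",
    "ip_address": "203.0.113.7",
    "date_of_birth": date(1985, 3, 14),
    "diagnosis": "asthma",
}
print(strip_identifiers(patient))
```

Only the birth year and the diagnosis survive in the output, which shows both the simplicity of the method and how much detail it discards.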

Pseudonymisation

Pseudonymisation involves masking personal identifiers by replacing them with temporary IDs, so that the data no longer directly identifies individuals. If pseudonymised data is supplemented with other data, the personal information in the original dataset may be reidentified.

If you use pseudonymised data, you must understand how that data is being used and manipulated, and if and how any personal information might be reidentified.

You must ensure all necessary consents have been obtained from individuals, and that any risks of reidentification are appropriately managed.
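The sketch below shows one possible way to pseudonymise an identifier, assuming a lookup table that is kept separately from the working data. The class name, method names and the "P-" prefix are hypothetical. It also shows why the reidentification risk needs managing: anyone who holds the lookup table can reverse the process.

```python
import secrets


class Pseudonymiser:
    """Replaces identifiers with random IDs and keeps the lookup table apart from the data."""

    def __init__(self):
        # identifier -> pseudonym; in practice this table needs to be stored
        # securely and separately from the pseudonymised dataset
        self._lookup = {}

    def pseudonym_for(self, identifier: str) -> str:
        """Return a stable random ID for the identifier, creating one if needed."""
        if identifier not in self._lookup:
            self._lookup[identifier] = "P-" + secrets.token_hex(8)
        return self._lookup[identifier]

    def reidentify(self, pseudonym: str):
        """Reverse a pseudonym; only possible with access to the lookup table."""
        for identifier, value in self._lookup.items():
            if value == pseudonym:
                return identifier
        return None


pseudonymiser = Pseudonymiser()
order = {"customer_email": "jane@example.com", "total": 120.50}
masked_order = {
    "customer_id": pseudonymiser.pseudonym_for(order["customer_email"]),
    "total": order["total"],
}
print(masked_order)
```

Because the same identifier always maps to the same ID, records can still be linked for analysis without exposing the underlying email address.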

Other techniques

Some other techniques used to de-identify data include:

redacting information (including pixelation in digital recordings);
aggregating data (summarising data from multiple sources);
hashing (one-way transformation of identifiers into fixed-length values);
generalising (for example, replacing an exact birth date with a broader bracket);
suppressing (marking some values as missing);
data swapping (for example, swapping salaries between individuals in the data set who share a postcode to ensure the aggregate remains valid); and
differential privacy (describing the patterns of groups within the dataset whilst withholding information about individuals).
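Several of the techniques above can be sketched in a few lines each. The functions below are illustrative only, and all function names and parameters are hypothetical. The hashing example uses a keyed hash (HMAC) rather than a plain hash, on the assumption that short identifiers such as phone numbers could otherwise be guessed by brute force. The differential privacy example adds Laplace noise to a count, which is one basic mechanism among many.

```python
import hashlib
import hmac
import random


def generalise_birth_year(year: int, width: int = 10) -> str:
    """Generalising: replace an exact birth year with a bracket such as '1980-1989'."""
    low = (year // width) * width
    return f"{low}-{low + width - 1}"


def suppress(value, should_suppress: bool):
    """Suppressing: mark a value as missing when it is too identifying to release."""
    return None if should_suppress else value


def keyed_hash(identifier: str, secret_key: bytes) -> str:
    """Hashing: one-way keyed hash (HMAC-SHA256) of an identifier."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def noisy_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Differential privacy: a count with Laplace noise of scale 1/epsilon added."""
    # The difference of two independent exponential draws with rate epsilon
    # follows a Laplace distribution centred on zero with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return true_count + noise


print(generalise_birth_year(1985))
print(suppress("2000", should_suppress=True))
print(keyed_hash("0400 000 000", secret_key=b"replace-with-a-real-secret"))
print(noisy_count(128, epsilon=0.5, rng=random.Random()))
```

A smaller epsilon adds more noise, which strengthens privacy but makes the reported count less accurate.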

Obligations under the Australian Privacy Principles (APPs)

Under the Privacy Act, personal information is “information or an opinion about an identified individual, or an individual who is reasonably identifiable”. To be de-identified, data must be removed or altered so that it can no longer be used to reasonably identify an individual, and is no longer personal information.

The definition of personal information may change in upcoming privacy reforms, following the Attorney-General’s Department’s review of the Privacy Act. The Government will be consulting with stakeholder groups before drafting legislation in 2024.

The APPs require that personal information must be de-identified or destroyed if an APP entity no longer needs that personal information for the purposes for which it was initially collected, used or disclosed.

There are limited exceptions to this requirement. For example, there are instances where organisations may collect health information about individuals, and the requirement does not apply to certain Commonwealth records.

Why you should de-identify data

There are several reasons why Australian businesses should de-identify personal information, including to:

comply with the APPs;
analyse datasets;
mitigate risks to the business, employees and customers if it suffers a cyberattack (as de-identification minimises the data and personal information vulnerable to a cyberattack); and
build customer trust.

If data is properly de-identified and an organisation suffers a data breach, it may not be required to report the breach to the Office of the Australian Information Commissioner (OAIC), as the data should not contain personal information. In this sense, de-identification can significantly limit risk exposure, protecting the organisation, its employees and customers.

Risks of Data De-identification

Utility of de-identified data

One of the challenges businesses face with data de-identification is balancing privacy protection against the utility of the data. Data that has been significantly altered with the aim of removing personal identifiers can prove less useful to an organisation, depending on what the data is used for. It can therefore be difficult to apply enough privacy protection while keeping the data fit for its purpose.

Re-identification

De-identifying data does not guarantee that the data is being processed fairly and ethically, or that it will remain de-identified.

Re-identification is where de-identified data is matched with other available auxiliary information, resulting in personal information in the original dataset being re-identified.
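To make the matching step concrete, the sketch below links a de-identified dataset to an auxiliary dataset on shared attributes such as postcode, birth year and sex. All records, names and field names are made up for illustration, and the function is a simplified stand-in for real linkage techniques.

```python
def link_records(deidentified, auxiliary, keys):
    """Return (name, record) pairs where exactly one de-identified record matches a known person."""
    matches = []
    for person in auxiliary:
        candidates = [
            record for record in deidentified
            if all(record[key] == person[key] for key in keys)
        ]
        if len(candidates) == 1:  # a unique match re-identifies that record
            matches.append((person["name"], candidates[0]))
    return matches


# De-identified dataset: names removed, other attributes retained
deidentified = [
    {"postcode": "2000", "birth_year": 1985, "sex": "F", "diagnosis": "asthma"},
    {"postcode": "2000", "birth_year": 1990, "sex": "M", "diagnosis": "diabetes"},
    {"postcode": "3000", "birth_year": 1990, "sex": "M", "diagnosis": "migraine"},
    {"postcode": "3000", "birth_year": 1990, "sex": "M", "diagnosis": "eczema"},
]

# Auxiliary information, for example from a public profile or another dataset
auxiliary = [
    {"name": "Jane Citizen", "postcode": "2000", "birth_year": 1985, "sex": "F"},
    {"name": "John Citizen", "postcode": "3000", "birth_year": 1990, "sex": "M"},
]

print(link_records(deidentified, auxiliary, keys=("postcode", "birth_year", "sex")))
```

In this example only Jane Citizen's record is re-identified, because her combination of attributes is unique. John Citizen matches two records, so his diagnosis cannot be determined from this data alone.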

Information is becoming increasingly accessible through various online platforms, and mass data analysis is becoming more commonplace as AI tools improve. Previously anonymous data points will be easier to piece together, leading to the resurrection of identities that were meant to remain obscured. This risk is amplified when seemingly unrelated datasets are combined to create a comprehensive mosaic of an individual’s life.

Re-identification not only jeopardises privacy but also opens avenues for malicious exploitation of personal information, from unauthorised, hyper-targeted advertising to more nefarious activities.

When handling de-identified data, you must always be aware of any potential risks that may result in that data being re-identified and misused. Your business needs adequate protections in place to manage this risk.

For more information on data governance see Sainty Law’s insight on data governance tools here.

Next Steps for your stored personal data

Data de-identification is an important step in an organisation’s data governance policies and cyber resilience. It’s important to educate employees on data de-identification, including when to de-identify and how to manage the risks of re-identification.

Contact Sainty Law at lawyers@saintylaw.com.au for more information and to seek guidance on your business’s privacy obligations and how best to improve your business’s data governance practices.