[liberationtech] CORRECTION: European privacy regulators' excellent paper on Anonymisation Techniques
Caspar Bowden (lists)
lists at casparbowden.net
Sun Apr 20 02:30:12 PDT 2014
On 17/04/14 09:09, Shava Nerad wrote:
>
> Do they have teeth to enforce that, Caspar? The political will, do
> you think?
>
Until/unless the new GDPR is adopted, enforcement depends on both the teeth
and the guts of DPAs under 28 national laws. Any of 500m data subjects can
file a complaint, citing appropriate chunks of the Opinion. Such a complaint
might be about refusal of rights to access (and, under the new GDPR, delete)
clickstream data (pseudonymous by Cookie|IP tuple), and whether such
data is treated as personal under the terms of privacy statements,
especially if transferred to the US under any legal ground.
Another good target is any type of social network graph data, nominally
de-identified but retaining social structure. Such data is almost
impossible to anonymize without vast data reduction.
In case anyone here is interested in jumping down the European rabbit hole,
here are a few background notes from work in progress - comments welcome.
The story begins in 1995 when, as the price for allowing the DP Directive
(EC95/46) to proceed, the UK engineered the shoving of what was previously
an Article defining anonymisation into a Recital (which Member States need
not transpose). Effectively the UK said "it's our country and we will
define pseudonymous as anonymous if we want to". But if you ask a
member of the public whether they can vote anonymously in the UK, they
usually say yes, then change their minds when told Parliament keeps a copy
of all voting slips and ballots, and that MI5 could join them up again if
they wanted to (and did in the 50s). That's the principle at stake.
In the new DP Regulation, in Council the UK wants to take away any data
breach notification to the individual for pseudonymous data, and worse,
LIBE defined pseudonymous to include "identity escrow", aka Trusted
Third Party (with Amendments promoted by ALDE and influenced by US/UK).
LIBE was also bamboozled into nullifying access and deletion rights to
pseudonymous data in a different misguided (or lethally deceptive) Amendment.
The mere creation or retention of personal data engages rights to
privacy and Data Protection, irrespective of how it is subsequently
used, and this may be disproportionate as spectacularly found by CJEU
last week. You can't have a single market with the 2 Member States with
the largest Internet sectors (UK, IE) arbitraging the vast loophole of
"pseudonymous=anonymous=unregulated".
This is the climax of a 20-year struggle over the term anonymity, which is
why the excellent WP29 Opinion 216 is so timely and welcome. The UK is
basically trying to get any privacy-promiscuous pseudonymity project off
the ground, with an OpenData/BigData tag, in a race before the sausage
machine of GDPR negotiation resumes. Exit from the EU (and probably CoE
108) would be the only way to continue the pretence after the
Regulation, or - as the UK is flailingly trying to do - provide so many
exemptions for "pseudonymisation" that it legitimises the UK's 20-year
out-on-a-limb position.
Comp.sci has developed a battery of techniques in the last 20 years to
distribute privacy risk and still do useful calculations. However, one
of the main conclusions of the Opinion was that no single metric or
prescription exists. True anonymisation remains an art which requires
PhD comp.sci expertise applied case-by-case.
The BigData hoopla of the last several years is essentially a propaganda
code-word for the idea that pseudonymous processing should be
de-regulated as "anonymous".
In 2011 the ICO held a workshop at Wellcome attended by UK stats research
bods, and Prof. Paul Ohm (flown in by them because of his breakthrough
paper describing both the NetFlix de-anonymisation and Differential
Privacy) ripped into their bogus pseudonymity=anonymity concept as
incompatible with Rec.26. They ignored that.
In 2012 they issued a Code of Practice. At the launch event I pointed
out in Q&A that it contained: (my emphasis)
* p.7 /We draw a distinction between anonymisation techniques used to
  produce aggregated information, for example, and those -- *such as
  pseudonymisation* -- that *produce anonymised* data but on an
  individual-level basis./
* p.21 /the possibility of linking several anonymised datasets to the
  same individual can be a precursor to identification. This does not
  mean though, that *effective anonymisation through
  pseudonymisation* becomes impossible/
* p.42 /Using a *trusted third party to anonymise* data/ (section)
    o [not re: pseudonymity per se, but reversibility is anonymity
      oxymoron]
* p.51 /Appendix 2 -- Some key anonymisation techniques/
    o /*Pseudonymisation*/ (section)
So the entire CoP is based on the false premise "pseudonymous = type of
anonymous", which is flatly contradicted by Recital 26 (the one defining
anonymity stringently), but on the face of it compatible with UK law,
because the UK never transposed Rec.26. For the last 19 years, whenever you
read "anonymous" in a UK policy document, the UK had two fingers crossed
behind its back - that pseudonymous data counted as "anonymous" (and
therefore unregulated).
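To make concrete why "pseudonymous" is not "anonymous", here is a minimal Python sketch of keyed-hash pseudonymisation. It is only an illustration under stated assumptions: the key, the function names and the records are all made up, and it is not how the CoP or any particular controller implements pseudonymisation. The point is that the pseudonym stays stable across records (so they remain linkable), and whoever holds the key can join them back up to a person.

```python
import hashlib
import hmac
from typing import List, Optional

# Hypothetical secret held by the controller (or a "trusted third party").
SECRET_KEY = b"held-by-controller"


def pseudonymise(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash, i.e. a stable pseudonym."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def reidentify(pseudonym: str, candidates: List[str],
               key: bytes = SECRET_KEY) -> Optional[str]:
    """Whoever holds the key and a list of candidates can reverse the mapping."""
    for candidate in candidates:
        if hmac.compare_digest(pseudonymise(candidate, key), pseudonym):
            return candidate
    return None


# Made-up records: (identifier, page visited).
records = [
    ("alice@example.org", "/clinic"),
    ("bob@example.org", "/news"),
    ("alice@example.org", "/pharmacy"),
]

# The "de-identified" release: names are gone, but rows 0 and 2 still
# carry the same pseudonym and so still describe the same person.
released = [(pseudonymise(who), page) for who, page in records]
```

Calling `reidentify(released[0][0], ["bob@example.org", "alice@example.org"])` hands back the original identifier to anyone with the key, which is the ballot-paper situation described earlier.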
There is also a sociology-of-science explanation for this confusion,
about the difference in outlook between a statistical and comp.sci
privacy researcher. Pseudonymisation is defined as "/formal
anonymisation/" as a term of art in the statistics literature
(and elsewhere). It isn't used in this sense in the computer science of
privacy (indeed it's a solecism).
Statistical researchers' *definition* for "anonymity" exempts
identification by the researcher. It's a blind spot, perhaps a cultural
assumption with origins of statistics at the heart of the state. Every
statistical agency in the EU - including Eurostat - releases data whilst
retaining the original data, but assesses the "anonymity" of their
disclosures exempting their own knowledge.
It isn't therefore very useful to start with this terminology (never
devised with privacy as the central concept), as the basis for a Code
supposed to reflect EU DP. But for 15 years no butter has melted in
the mouths of ICO officials when this point is put to them point blank. So
is that "well done ICO", or "what a complete waste of time"?
In contrast, WP29 in their exemplary new Opinion on Anonymisation
Techniques condense 15 yrs of comp.sci privacy research into three criteria:
Is it still possible to:
1. single out an individual?
2. link records relating to individuals?
3. infer information concerning individuals?
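The three criteria can be illustrated with a toy check in Python. This is a rough sketch, not the Opinion's method: the datasets, the choice of quasi-identifiers and the function names are all invented for illustration, and a real assessment is case-by-case and far less mechanical.

```python
from collections import Counter

# Made-up "de-identified" clickstream: (pseudonym, postcode area, birth year, page).
release = [
    ("u1", "N1", 1970, "/clinic"),
    ("u2", "N1", 1985, "/news"),
    ("u3", "E2", 1985, "/news"),
    ("u1", "N1", 1970, "/pharmacy"),
]
# A second made-up dataset that happens to reuse the same pseudonyms.
other = [("u1", "bought: insulin"), ("u3", "bought: book")]


def singled_out(rows, quasi_ids):
    """Criterion 1: quasi-identifier combinations held by exactly one pseudonym."""
    persons = {(r[0],) + tuple(r[i] for i in quasi_ids) for r in rows}
    counts = Counter(p[1:] for p in persons)
    return sorted(combo for combo, n in counts.items() if n == 1)


def linkable(rows_a, rows_b):
    """Criterion 2: pseudonyms present in both datasets, so records can be joined."""
    return sorted({r[0] for r in rows_a} & {r[0] for r in rows_b})


def inferable(rows, group_idx, attr_idx):
    """Criterion 3: groups whose members all share one attribute value,
    so the value can be inferred from group membership alone."""
    values = {}
    for r in rows:
        values.setdefault(r[group_idx], set()).add(r[attr_idx])
    return {g: next(iter(v)) for g, v in values.items() if len(v) == 1}
```

On this toy data every (postcode, birth year) combination singles out one person, `u1` and `u3` can be linked across the two datasets, and everyone born in 1985 can be inferred to have visited `/news`.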
Computer science is the only discipline that has rigorously studied
privacy exposure from the viewpoint of the individual human right to
privacy and Data Protection. These three WP criteria include the risks
statisticians implicitly exclude, which are the risks concomitant on
them having the data in the first place, and comp.sci has developed
techniques like secure multi-party computation and Private Information
Retrieval which obviate knowledge by a central party.
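As a flavour of how such techniques avoid a central party knowing the data, here is a toy Python sketch of one textbook building block, additive secret sharing, used to compute a sum. It is a simplification under loud assumptions: honest parties, private channels between them, an arbitrary modulus, and hypothetical function names. Real secure multi-party computation protocols are considerably more involved.

```python
import secrets

# Arbitrary public modulus, assumed larger than any total we might compute.
MODULUS = 2**61 - 1


def share(value, n_parties):
    """Split value into n shares that sum to value mod MODULUS.
    Any n-1 of the shares on their own look uniformly random."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def secure_sum(private_values):
    """Simulate n parties computing a total without anyone seeing all inputs."""
    n = len(private_values)
    # Party i splits its private value and sends share j to party j.
    distributed = [share(v, n) for v in private_values]
    # Party j adds up the shares it received and publishes only that subtotal.
    subtotals = [sum(distributed[i][j] for i in range(n)) % MODULUS
                 for j in range(n)]
    # The published subtotals reveal the total, not the individual inputs.
    return sum(subtotals) % MODULUS
```

For example, `secure_sum([3, 5, 9])` yields 17, yet no single simulated party ever holds another party's input in the clear.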