[Bigbang-dev] Data sharing allowance

Thomas Streinz tfs253 at nyu.edu
Fri Oct 1 13:21:31 CEST 2021


Hi group,

I have been a lurker on this mailing list for quite a while and I'm glad
that I may be able to provide some context on this issue that may be
helpful. Let me also state at the outset that the following does *not*
constitute legal advice and that I won't bill you 300 Euros for it either
(indeed, I'm afraid, that number may be way too low to get actual legal
advice that goes beyond reciting the relevant provisions of GDPR).

That said, I found this guidance from IAPP (the International Association
of Privacy Professionals, which has evolved into quite an influential
organization):
https://iapp.org/news/a/publicly-available-data-under-gdpr-main-considerations/
Note how some of the guidance provided there is in tension with pervasive
research practices, especially in data science fields ("when the data is
part of official registers, such registers should be consulted on a
need-to-know basis rather than copied in bulk just in case some data might
be relevant").

My reading of this and the relevant provisions of GDPR suggests a ton of
open questions, many of which indeed have not been resolved. For example,
it's not clear whether (for purposes of escaping the additional
requirements for sensitive data under Article 9) the data subjects in
question made the personal data contained in their email "manifestly"
public (that is: with the intention of further processing) - did the
participants foresee the eventual creation of BigBang? It's also not clear
to me how the requirements under Article 14 (need to inform data subjects)
can be fulfilled in practice.

The scope of the research exception (Article 89) has been contested for a
while and is a good example of the tensions in data protection law:
researchers were worried that data protection law might make their work
impossible; data protection activists were worried that an overly broad
exception would be exploited, including by commercial actors. The result is
a terribly drafted provision. In my personal political opinion, I don't
understand why Article 89 GDPR does not distinguish between public research
in the public interest and private research in the private interest. I
attach the leading commentary on Article 89, which unfortunately doesn't
offer much useful guidance for our purposes. At least it references the
relevant recitals at the beginning of GDPR which are part of the political
compromise and can be helpful to understand better what the lawmakers had
in mind (this is, for example, where the advice to use pseudonymization may
be coming from, because that idea is mentioned in the relevant recitals;
I'm not convinced this actually solves the problem because even
pseudonymized data remains personal data and it will often be easy to
re-identify the individuals if one wants to). I'm wondering, however, if it
might be feasible to make the datasets only available for research purposes
and only to other researchers to stay within the bounds of the research
exception?
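
To make the point about hashing and pseudonymization a bit more concrete,
here is a minimal sketch in Python. Everything in it is an assumption for
illustration only: the addresses are made up, and the function names
(plain_hash, keyed_pseudonym, reidentify) are hypothetical and not part of
BigBang. It shows that a plain hash of an email address can be reversed by
anyone who holds a list of candidate addresses, while a keyed hash (HMAC)
resists that particular attack as long as the key stays secret. Whoever
holds the key can still re-link the pseudonyms, which is one reason such
data would likely still count as personal data.

```python
import hashlib
import hmac
import secrets


def plain_hash(address: str) -> str:
    """Unkeyed SHA-256 of a normalized address (illustrative only)."""
    return hashlib.sha256(address.lower().encode("utf-8")).hexdigest()


def keyed_pseudonym(address: str, key: bytes) -> str:
    """HMAC-SHA256 of a normalized address under a secret key."""
    return hmac.new(key, address.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def reidentify(published: list[str], candidates: list[str]) -> dict[str, str]:
    """Dictionary attack: hash every candidate address and look for matches."""
    lookup = {plain_hash(c): c for c in candidates}
    return {h: lookup[h] for h in published if h in lookup}


# Made-up senders and a made-up list of addresses an outsider might know.
senders = ["alice@example.org", "bob@example.org"]
candidates = ["alice@example.org", "carol@example.org"]

# Plain hashing: any sender who appears in the candidate list is recovered.
published = [plain_hash(s) for s in senders]
recovered = reidentify(published, candidates)

# Keyed pseudonyms: the same attack finds nothing without the key.
key = secrets.token_bytes(32)
pseudonyms = [keyed_pseudonym(s, key) for s in senders]
recovered_keyed = reidentify(pseudonyms, candidates)
```

Note that this only addresses the address field. Names, signatures and
writing style in the message bodies can still identify people, so this is
not a claim that keyed hashing would satisfy GDPR.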

Like Niels, I have been worried for a while that data protection law might
eventually throw a wrench into the important work that this group is doing.
I haven't been privy to the whole conversation so far. I assume that the
issue is whether or not the datasets you have assembled can or should be
shared, and if so, under what conditions?

Note that the exceptions for "public" archives don't apply because those
provisions only refer to archives that are required by law (which is not
the case for IETF mailing lists). As Niels suggests, under a functional
analysis, this research should be treated the same as research scrutinizing
public communications of parliamentarians. Unfortunately, I doubt that a
European Court would see it that way.

Maybe we can discuss this at one of the next BigBang meetings, if that would
be helpful. One literature that I haven't consulted this morning concerns the
interplay between "open data" and data protection law, which may offer some
cues as to what's legally possible and what's clearly off limits (e.g. this
paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2695005).

Sorry this got so long. All best to all of you on this list (whether
actively participating or just lurking) -- Thomas

PS: For browsing GDPR, I recommend: https://gdpr-info.eu/ (which also lists
the relevant recitals under each article)


On Fri, Oct 1, 2021 at 5:49 AM Niels ten Oever <mail at nielstenoever.net>
wrote:

> Yeah, I was kind of afraid of this. I would definitely support spending
> some money on the legal advice.
>
> The weird thing is that data protection officers at universities deal with
> this all very differently; I guess GDPR is also still a developing practice.
> So it would be good to get a specialist to look at it.
>
> One part of this that the person did not reply to is that these
> mailing lists imho should be understood as public policy making. And policy
> makers have lower expectations of privacy. I think that argument can also be
> made because the openness of the mailing lists is also explicitly used as
> a legitimacy strategy for the standard-setting institutions.
>
> Best,
>
> Niels
>
>
> On 9/30/21 11:01 PM, Christoph Becker wrote:
> > Hi all,
> > you might have noticed that there has been discussion on how we should
> > share the datasets we have collected of public mailing archives. Our data
> > format is quite different from how they are presented on GNU Mailman or
> > Listserv, which creates certain points of concern we should not neglect.
> > I have been in contact with some people through the Prototype Fund and
> > have obtained the following advice:
> >
> > """
> > Since you are dealing with "fully or partially automated processing of
> > personal data" (Art. 2 Para. 1 GDPR), you fall under the provisions of the
> > GDPR. Where you got the data from should be irrelevant for this point.
> > Since you have collected the data without the consent of the persons, Art.
> > 14 GDPR (information obligation if the personal data was not collected from
> > the person concerned) could also be of interest. There are exceptions for
> > scientific purposes (Art. 89 GDPR), but here too you have to pay close
> > attention. Note that hashing mail addresses does not necessarily make the
> > data "less dangerous". It would be better to pseudonymize the whole thing.
> > My tip would be not to pass on any data, to refer to the scientific
> > aspect of the processing and to spend € 200-300 on legal advice.
> > """
> >
> > Through the Prototype Fund we have the financial means to pay for legal
> > advice.
> > Please share your thoughts, comments, ideas.
> >
> > Best Wishes,
> > Christoph
> >
> >
> > --
> > Christoph Becker (he/him/his)
> > PostDoc at the
> > Institute for Biodiversity and Ecosystem Dynamics and
> > Institute for Advanced Study
> > University of Amsterdam
> > P.O.Box 94248, NL - 1090 GE Amsterdam
> > The Netherlands
> > christovis.github.io/
> >
> > _______________________________________________
> > Bigbang-dev mailing list
> > Bigbang-dev at data-activism.net
> > https://lists.ghserv.net/mailman/listinfo/bigbang-dev
> >
>
> --
> Niels ten Oever, PhD
> Postdoctoral Researcher - Media Studies Department - University of
> Amsterdam
> Affiliated Faculty - Digital Democracy Institute - Simon Fraser University
> Research Fellow - Centre for Internet and Human Rights - European
> University Viadrina
> Associated Scholar - Centro de Tecnologia e Sociedade - Fundação Getúlio
> Vargas
>
> W: https://nielstenoever.net
> E: mail at nielstenoever.net
> T: @nielstenoever
> P/S/WA: +31629051853
> PGP: 2458 0B70 5C4A FD8A 9488 643A 0ED8 3F3A 468A C8B3
>
> Read my latest article on Internet infrastructure governance in
> Globalizations here:
> https://www.tandfonline.com/doi/full/10.1080/14747731.2021.1953221
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Article 89 (Safeguards and derogations for scientific and other purposes) - Christian Wiese Svanberg.pdf
Type: application/pdf
Size: 124903 bytes
Desc: not available
URL: <http://lists.ghserv.net/pipermail/bigbang-dev/attachments/20211001/7ae12d5c/attachment-0001.pdf>

