[Bigbang-dev] Data sharing allowance

Thomas Streinz tfs253 at nyu.edu
Fri Oct 1 20:43:43 CEST 2021


Thanks, Seb. I should have been clearer: the "making manifestly public"
prong only helps with Article 9 - *but not with other provisions*. In terms
of lawfulness of processing (Article 6), for example, there is a question
whether one could rely on Article 6(1)(f) (legitimate interests) by
claiming that there is (global?) public interest in this (personal) data
(contained in the emails) being publicly available or at least available to
researchers. The problem with this prong is that it's ultimately a
balancing exercise and there is a risk that a Court would say that the data
protection rights of the data subjects outweigh the public interest in
access to the emails they sent (this is one of many reasons why commercial
actors so often rely on Article 6(1)(a) - consent). So, unfortunately,
BigBang can't rest easy.

I'm also not quite sure (as in: genuinely uncertain) whether it's right to
say that the authors of emails assumed that their input would be publicly
available to (potentially) billions or mined by researchers in the way
BigBang does. Doesn't it make a difference (normatively) that the community
of Internet researchers was initially relatively small and close-knit and
that access to the public mailing lists was sought only by insiders?



On Fri, Oct 1, 2021 at 1:19 PM Sebastian Benthall <sbenthall at gmail.com>
wrote:

> Thanks so much, Thomas. Let me join the others in welcoming your input on
> this.
>
> My two cents are that we are totally fine with respect to the GDPR,
> because:
>
> > For example, it's not clear whether (for purposes of escaping the
> > additional requirements for sensitive data under Article 9) the data
> > subjects in question made the personal data contained in their email
> > "manifestly" public (that is: with the intention of further processing) -
> > did the participants foresee the eventual creation of BigBang?
>
> The answer to this question is "Yes". Not specifically BigBang, of course,
> but these are the people designing Internet protocols, who are the least
> naive people on the planet about what it means to put data in clear text on
> the Internet. Since "further processing" of this data includes being
> indexed by search engines, which has been going on since long before BigBang and
> has no doubt been used by the participants as they engage with these materials,
> the data absolutely IS manifestly public. We can rest easy.
>
>
>
> On Fri, Oct 1, 2021, 6:22 AM Thomas Streinz <tfs253 at nyu.edu> wrote:
>
>> Hi group,
>>
>> I have been a lurker on this mailing list for quite a while and I'm glad
>> that I may be able to provide some context on this issue that may be
>> helpful. Let me also state at the outset that the following does *not*
>> constitute legal advice and that I won't bill you 300 Euros for it either
>> (indeed, I'm afraid, that number may be way too low to get actual legal
>> advice that goes beyond reciting the relevant provisions of GDPR).
>>
>> That said, I found this guidance from IAPP (the International Association
>> of Privacy Professionals, which has evolved into quite an influential
>> organization):
>> https://iapp.org/news/a/publicly-available-data-under-gdpr-main-considerations/
>> Note how some of the guidance provided there is in tension with pervasive
>> research practices, especially in data science fields ("when the data is
>> part of official registers, such registers should be consulted on a
>> need-to-know basis rather than copied in bulk just in case some data might
>> be relevant").
>>
>> My reading of this and the relevant provisions of GDPR suggests a ton of
>> open questions, many of which indeed have not been resolved. For example,
>> it's not clear whether (for purposes of escaping the additional
>> requirements for sensitive data under Article 9) the data subjects in
>> question made the personal data contained in their email "manifestly"
>> public (that is: with the intention of further processing) - did the
>> participants foresee the eventual creation of BigBang? It's also not clear
>> to me how the requirements under Article 14 (need to inform data subjects)
>> can be fulfilled in practice.
>>
>> The scope of the research exception (Article 89) has been contested for a
>> while and is a good example of the tensions in data protection law:
>> researchers were worried that data protection law might make their work
>> impossible; data protection activists were worried that a too broad
>> exception would be exploited, including by commercial actors. The result is
>> a terribly drafted provision. In my personal political opinion, I don't
>> understand why Article 89 GDPR does not distinguish between public research
>> in the public interest and private research in the private interest. I
>> attach the leading commentary on Article 89, which unfortunately doesn't
>> offer much useful guidance for our purposes. At least it references the
>> relevant recitals at the beginning of GDPR which are part of the political
>> compromise and can be helpful to understand better what the lawmakers had
>> in mind (this is, for example, where the advice to use pseudonymization may
>> be coming from, because that idea is mentioned in the relevant recitals;
>> I'm not convinced this actually solves the problem because even
>> pseudonymized data remains personal data and it will often be easy to
>> re-identify the individuals if one wants to). I'm wondering, however, if it
>> might be feasible to make the datasets only available for research purposes
>> and only to other researchers to stay within the bounds of the research
>> exception?
>>
>> Like Niels, I have been worried for a while that data protection law
>> might eventually throw a wrench into the important work that this group is
>> doing. I haven't been privy to the whole conversation so far. I assume that
>> the issue is whether or not the datasets you have assembled can or should
>> be shared, and if so, under what conditions?
>>
>> Note that the exceptions for "public" archives don't apply because those
>> provisions only refer to archives that are required by law (which is not
>> the case for IETF mailing lists). As Niels suggests, under a functional
>> analysis, this research should be treated the same as research scrutinizing
>> public communications of parliamentarians. Unfortunately, I doubt that a
>> European Court would see it that way.
>>
>> Maybe we can discuss this at one of the next BigBang meetings, if that would
>> be helpful. One literature that I haven't consulted this morning concerns the
>> interplay between "open data" and data protection law, which may offer some
>> cues as to what's legally possible and what's clearly off limits (e.g., this
>> paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2695005).
>>
>> Sorry this got so long. All best to all of you on this list (whether
>> actively participating or just lurking) -- Thomas
>>
>> PS: For browsing GDPR, I recommend: https://gdpr-info.eu/ (which also
>> lists the relevant recitals under each article)
>>
>>
>> On Fri, Oct 1, 2021 at 5:49 AM Niels ten Oever <mail at nielstenoever.net>
>> wrote:
>>
>>> Yeah, I was kind of afraid of this. I would definitely support
>>> spending some money on the legal advice.
>>>
>>> Weird thing is that data protection officers at universities deal with
>>> all this very differently; I guess GDPR is also still a developing
>>> practice. So it would be good to get a specialist to look at it.
>>>
>>> One part of this that the person did not reply to is that these
>>> mailing lists imho should be understood as public policy making. And policy
>>> makers have lower expectations of privacy. I think that argument can also be
>>> made because the openness of the mailing lists is also explicitly used as a
>>> legitimacy strategy for the standard-setting institutions.
>>>
>>> Best,
>>>
>>> Niels
>>>
>>>
>>> On 9/30/21 11:01 PM, Christoph Becker wrote:
>>> > Hi all,
>>> > you might have noticed that there has been discussion on how we should
>>> > share the datasets we have collected of public mailing list archives. Our data
>>> > format is quite different from how they are presented on GNU Mailman or
>>> > Listserv, which creates certain points of concern we should not neglect.
>>> > I have been in contact with some people through the Prototype fund and
>>> > have obtained the following advice:
>>> >
>>> > """
>>> > Since you are dealing with "fully or partially automated processing of
>>> > personal data" (Art. 2 Para. 1 GDPR), you fall under the provisions of the
>>> > GDPR. Where you got the data from should be irrelevant for this point.
>>> > Since you have collected the data without the consent of the persons, Art.
>>> > 14 GDPR (information obligation if the personal data was not collected from
>>> > the person concerned) could also be of interest. There are exceptions for
>>> > scientific purposes (Art. 89 GDPR), but here too you have to pay close
>>> > attention. Note that hashing mail addresses does not necessarily make the
>>> > data "less dangerous". It would be better to pseudonymize the whole thing.
>>> > My tip would be not to pass on any data, to refer to the scientific
>>> > aspect of the processing and to spend € 200-300 on legal advice.
>>> > """
>>> >
>>> > Through the Prototype fund we have the financial means to pay for
>>> > legal advice.
>>> > Please share your thoughts, comments, ideas.
>>> >
>>> > Best Wishes,
>>> > Christoph
>>> >
>>> >
>>> > --
>>> > Christoph Becker (he/him/his)
>>> > PostDoc at the
>>> > Institute for Biodiversity and Ecosystem Dynamics and
>>> > Institute for Advanced Study
>>> > University of Amsterdam
>>> > P.O.Box 94248, NL - 1090 GE Amsterdam
>>> > The Netherlands
>>> > christovis.github.io/
>>> >
>>> > _______________________________________________
>>> > Bigbang-dev mailing list
>>> > Bigbang-dev at data-activism.net
>>> > https://lists.ghserv.net/mailman/listinfo/bigbang-dev
>>> >
>>>
>>> --
>>> Niels ten Oever, PhD
>>> Postdoctoral Researcher - Media Studies Department - University of
>>> Amsterdam
>>> Affiliated Faculty - Digital Democracy Institute - Simon Fraser
>>> University
>>> Research Fellow - Centre for Internet and Human Rights - European
>>> University Viadrina
>>> Associated Scholar - Centro de Tecnologia e Sociedade - Fundação Getúlio
>>> Vargas
>>>
>>> W: https://nielstenoever.net
>>> E: mail at nielstenoever.net
>>> T: @nielstenoever
>>> P/S/WA: +31629051853
>>> PGP: 2458 0B70 5C4A FD8A 9488 643A 0ED8 3F3A 468A C8B3
>>>
>>> Read my latest article on Internet infrastructure governance in
>>> Globalizations here:
>>> https://www.tandfonline.com/doi/full/10.1080/14747731.2021.1953221
>>>
>>
>