[Bigbang-dev] Data sharing allowance
Thomas Streinz
tfs253 at nyu.edu
Fri Oct 1 16:26:11 CEST 2021
Thanks! End of November is definitely feasible.
On Fri, Oct 1, 2021 at 10:20 AM Christoph Becker <chrbecker01 at gmail.com>
wrote:
> Hi Thomas,
> thank you so much for sharing your insights and helping us out with this
> tricky topic!
>
> > Can you give me a rough sense of the timeline you have in mind for this
> undertaking?
> There is no hard deadline (as far as I know). However, we have already
> received requests to access the data. Furthermore, upcoming workshops in
> which the BigBang software tool is going to be showcased are going to
> attract more attention.
> Therefore, I would suggest having an updated version of our access permit
> application ready by the end of November (if that is feasible).
>
> Best Wishes,
> Christoph
>
> P.S.: If anyone is lurking here who would like access sooner, please let
> yourself be heard :-)
>
>
> On Fri, Oct 1, 2021 at 14:50, Thomas Streinz <tfs253 at nyu.edu> wrote:
>
>> Thank you, Niels. Happy to help. I can also try to get advice from folks
>> who are more familiar with the relevant aspects of European data protection
>> law than I am (in particular researchers who have researched the research
>> exception). Can you give me a rough sense of the timeline you have in mind
>> for this undertaking? Updating the data access permit application seems
>> indeed like the way to go.
>>
>> Best,
>> Thomas
>>
>>
>> On Fri, Oct 1, 2021 at 9:31 AM Niels ten Oever <mail at nielstenoever.net>
>> wrote:
>>
>>> Thanks a lot for this Thomas. I think this sentence in your analysis
>>> very much echoes the path we are trying to go down now:
>>>
>>> > I'm wondering, however, if it might be feasible to make the datasets
>>> only available for research purposes and only to other researchers to stay
>>> within the bounds of the research exception?
>>>
>>> Building on Thomas' excellent analysis + suggestions, I think we should
>>> pursue the road we are on. Namely, create the data store, share it only
>>> with other researchers, and when we share it, do so based on an agreement
>>> that stipulates our understanding of the GDPR and why we think this is OK,
>>> and how they in turn should use the data if they agree.
>>>
>>> For this, I think we will need to slightly update our 'Data Access
>>> Permit Application' [0] with our GDPR interpretation. Thomas, could you
>>> perhaps help us with what such language could/should look like?
>>>
>>> Best,
>>>
>>> Niels
>>>
>>>
>>> [0]
>>> https://github.com/datactive/bigbang/blob/main/data_access_permit_application.md
>>>
>>> On 10/1/21 1:21 PM, Thomas Streinz wrote:
>>> > Hi group,
>>> >
>>> > I have been a lurker on this mailing list for quite a while and I'm
>>> glad that I may be able to provide some context on this issue that may be
>>> helpful. Let me also state at the outset that the following does *not*
>>> constitute legal advice and that I won't bill you 300 Euros for it either
>>> (indeed, I'm afraid, that number may be way too low to get actual legal
>>> advice that goes beyond reciting the relevant provisions of GDPR).
>>> >
>>> > That said, I found this guidance from IAPP (the International
>>> Association of Privacy Professionals, which has evolved into a quite
>>> influential organization):
>>> https://iapp.org/news/a/publicly-available-data-under-gdpr-main-considerations/
>>> Note how some of the guidance provided there is in tension with pervasive
>>> research practices, especially in data science fields ("when the data is
>>> part of official registers, such registers should be consulted on a
>>> need-to-know basis rather than copied in bulk just in case some data might
>>> be relevant").
>>> >
>>> > My reading of this and the relevant provisions of GDPR suggests a ton
>>> of open questions, many of which indeed have not been resolved. For
>>> example, it's not clear whether (for purposes of escaping the additional
>>> requirements for sensitive data under Article 9) the data subjects in
>>> question made the personal data contained in their email "manifestly"
>>> public (that is: with the intention of further processing) - did the
>>> participants foresee the eventual creation of BigBang? It's also not clear
>>> to me how the requirements under Article 14 (need to inform data subjects)
>>> can be fulfilled in practice.
>>> >
>>> > The scope of the research exception (Article 89) has been contested
>>> for a while and is a good example of the tensions in data protection law:
>>> researchers were worried that data protection law might make their work
>>> impossible; data protection activists were worried that a too broad
>>> exception would be exploited, including by commercial actors. The result is
>>> a terribly drafted provision. In my personal political opinion, I don't
>>> understand why Article 89 GDPR does not distinguish between public research
>>> in the public interest and private research in the private interest. I
>>> attach the leading commentary on Article 89, which unfortunately doesn't
>>> offer much useful guidance for our purposes. At least it references the
>>> relevant recitals at the beginning of GDPR which are part of the political
>>> compromise and can be helpful to understand better what the lawmakers had
>>> in mind (this is, for example, where the advice to use pseudonymization may
>>> be coming from, because that idea is mentioned in the relevant recitals;
>>> I'm not convinced this actually solves the problem because even
>>> pseudonymized data remains personal data and it will often be easy to
>>> re-identify the individuals if one wants to).
>>> I'm wondering, however, if it might be feasible to make the datasets only
>>> available for research purposes and only to other researchers to stay
>>> within the bounds of the research exception?
>>> >
>>> > Like Niels, I have been worried for a while that data protection law
>>> might eventually throw a wrench into the important work that this group is
>>> doing. I haven't been privy to the whole conversation so far. I assume that
>>> the issue is whether or not the datasets you have assembled can or should
>>> be shared, and if so, under what conditions?
>>> >
>>> > Note that the exceptions for "public" archives don't apply because
>>> those provisions only refer to archives that are required by law (which is
>>> not the case for IETF mailing lists). As Niels suggests, under a functional
>>> analysis, this research should be treated the same as research scrutinizing
>>> public communications of parliamentarians. Unfortunately, I doubt that a
>>> European Court would see it that way.
>>> >
>>> > Maybe we can discuss this at one of the next BigBang meetings, in case
>>> that is helpful. One literature that I haven't consulted this morning
>>> concerns the interplay between "open data" and data protection law, which
>>> may offer some cues as to what's legally possible and what's clearly off
>>> limits (e.g. this paper:
>>> https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2695005).
>>> >
>>> > Sorry this got so long. All best to all of you on this list (whether
>>> actively participating or just lurking) -- Thomas
>>> >
>>> > PS: For browsing GDPR, I recommend: https://gdpr-info.eu/ (which also
>>> lists the relevant recitals under each article)
>>> >
>>> >
>>> > On Fri, Oct 1, 2021 at 5:49 AM Niels ten Oever <mail at nielstenoever.net> wrote:
>>> >
>>> > Yeah, I was kind of afraid of this. I would definitely support
>>> spending some money on the legal advice.
>>> >
>>> > The weird thing is that data protection officers at universities all
>>> deal with this very differently; I guess GDPR is also still a developing
>>> practice. So it would be good to get a specialist to look at it.
>>> >
>>> > One part of this that the person did not reply to is that these
>>> mailing lists imho should be understood as public policy making. And policy
>>> makers have lower expectations of privacy. I think that argument can also be
>>> made because the openness of the mailing lists is also explicitly used as a
>>> legitimacy strategy for the standard-setting institutions.
>>> >
>>> > Best,
>>> >
>>> > Niels
>>> >
>>> >
>>> > On 9/30/21 11:01 PM, Christoph Becker wrote:
>>> > > Hi all,
>>> > > you might have noticed that there has been discussion on how we
>>> should share the datasets we have collected from public mailing list
>>> archives. Our data format is quite different from how they are presented on
>>> GNU Mailman or Listserv, which creates certain points of concern we should
>>> not neglect.
>>> > > I have been in contact with some people through the Prototype
>>> Fund and have obtained the following advice:
>>> > >
>>> > > """
>>> > > Since you are dealing with "fully or partially automated
>>> processing of personal data" (Art. 2 Para. 1 GDPR), you fall under the
>>> provisions of the GDPR. Where you got the data from should be irrelevant
>>> for this point. Since you have collected the data without the consent of
>>> the persons, Art. 14 GDPR (information obligation if the personal data was
>>> not collected from the person concerned) could also be of interest. There
>>> are exceptions for scientific purposes (Art. 89 GDPR), but here too you
>>> have to pay close attention. Note that hashing mail addresses does not
>>> necessarily make the data "less dangerous". It would be better to
>>> pseudonymize the whole thing.
>>> > > My tip would be not to pass on any data, to refer to the
>>> scientific aspect of the processing and to spend € 200-300 on legal advice.
>>> > > """
>>> > >
>>> > > Through the Prototype Fund we have the financial means to pay
>>> for legal advice.
>>> > > Please share your thoughts, comments, ideas.
>>> > >
>>> > > Best Wishes,
>>> > > Christoph
>>> > >
>>> > >
>>> > > --
>>> > > <><><><><><><><><><><><><><><><>
>>> > > Christoph Becker (he/him/his)
>>> > > PostDoc at the
>>> > > Institute for Biodiversity and Ecosystem Dynamics and
>>> > > Institute for Advanced Study
>>> > > University of Amsterdam
>>> > > P.O.Box 94248, NL - 1090 GE Amsterdam
>>> > > The Netherlands
>>> > > christovis.github.io/ <http://christovis.github.io/>
>>> > >
>>> > > _______________________________________________
>>> > > Bigbang-dev mailing list
>>> > > Bigbang-dev at data-activism.net
>>> > > https://lists.ghserv.net/mailman/listinfo/bigbang-dev
>>> > >
>>> >
>>> > --
>>> > Niels ten Oever, PhD
>>> > Postdoctoral Researcher - Media Studies Department - University of
>>> Amsterdam
>>> > Affiliated Faculty - Digital Democracy Institute - Simon Fraser
>>> University
>>> > Research Fellow - Centre for Internet and Human Rights - European
>>> University Viadrina
>>> > Associated Scholar - Centro de Tecnologia e Sociedade - Fundação
>>> Getúlio Vargas
>>> >
>>> > W: https://nielstenoever.net
>>> > E: mail at nielstenoever.net
>>> > T: @nielstenoever
>>> > P/S/WA: +31629051853
>>> > PGP: 2458 0B70 5C4A FD8A 9488 643A 0ED8 3F3A 468A C8B3
>>> >
>>> > Read my latest article on Internet infrastructure governance in
>>> Globalizations here:
>>> https://www.tandfonline.com/doi/full/10.1080/14747731.2021.1953221
>>> >
>>> >
>>>
>> _______________________________________________
>> Bigbang-dev mailing list
>> Bigbang-dev at data-activism.net
>> https://lists.ghserv.net/mailman/listinfo/bigbang-dev
>>
>
>
> --
> Christoph Becker (he/him/his)
> Assistant Researcher at the
> Institute for Biodiversity and Ecosystem Dynamics at
> University of Amsterdam
> Institute for Computational Cosmology and
> Institute for Data Science at
> Durham University
> United Kingdom
> website: christovis.github.io <http://christovis.github.io>
>