[Bigbang-dev] Clarifying theoretical commitments going into IETF 116

Mon Jan 30 15:25:41 CET 2023

Hi all,

Excellent suggestions - I am committed to working on accessibility of BigBang during the hackathon, mostly to:

- Get the dashboard working
- Improve the usability of the dashboard
- Select more specific analyses for the dashboard (find them in existing notebooks, or write new notebooks)
- Have a development pipeline and documentation of the dashboard (in a similar manner to BigBang) to foster its future development
- Talk to IETF leadership about relevant metrics
- Talk to Tools team about integration and exchange

It seems there will be more funding available for development in the coming period. To use it efficiently, it would be good to have everything neatly documented and streamlined.

I might also develop some user stories for answering particular questions and using the dashboard.

Best,

Niels

On 30-01-2023 09:34, Priyanka Sinha wrote:
> Great 👍🏾 Hope you had a good time.
> 
> Let us continue the discussion here.
> 
> Look forward to our monthly meeting on 21st February?
> 
> On Mon, 30 Jan, 2023, 13:59 Xue Li, <x.li3 at uva.nl> wrote:
> 
>     Hi all,
> 
>     I was on holiday in China till today. I just saw there are interesting comments and insights for the NLP side of work.
> 
>     I would be happy to join the meetings going forward and discuss the details further 😊.
> 
>     Best,
> 
>     Effy
> 
>     *From: *Bigbang-dev <bigbang-dev-bounces at data-activism.net> on behalf of Priyanka Sinha <priyanka.sinha.iitg at gmail.com>
>     *Date: *Monday, 30 January 2023 at 07:03
>     *To: *Sebastian Benthall <sbenthall at gmail.com>
>     *Cc: *bigbang-dev at data-activism.net
>     *Subject: *Re: [Bigbang-dev] Clarifying theoretical commitments going into IETF 116
> 
>     I agree with you. Please find my comments inline.
> 
>     On Wed, 25 Jan 2023 at 18:18, Sebastian Benthall <sbenthall at gmail.com> wrote:
> 
>              From a computational perspective, as I understand what you are saying, doing CI would mean I just look at the flow of dialogue, i.e., the turn-by-turn order of the messages (posts and comments) that people have posted. In a graph-theoretic sense, though, I can ignore the temporal aspect and treat the whole conversation together. Technically, this may avoid the short, noisy text that makes some statistical NLP methods struggle for lack of context. It may also be less complex computationally.
> 
>         Aha. I see what you mean. This does seem computationally tractable.
> 
>         It reminds me of some of the earliest work I did with BigBang.
> 
>         What comes to mind is that different working groups might be different 'contexts' and so have different patterns in how the discourse unfolds.
> 
>         To be honest, this is a bit of a stretch for CI as envisioned by Helen Nissenbaum. But when I originally approached Helen after working on BigBang, I was also thinking about mailing lists as contexts and messages sent as information flows. I suppose making this connection in a publication would be worthwhile :)
> 
>         To really make it work with CI, we would also need to track personal identifiers within email bodies, i.e., not only replies to people but also references to people. (This could potentially include legal persons, such as company names.) So entity recognition would be great for this, if it was working.
> 
>     So, identifying whether slightly different email addresses refer to the same person is something my algorithm, which I presented at the AID workshop, can do.
> 
>     Within the email body, entity recognition and perhaps coreference resolution (i.e., where the name of the person or company is not present but is referred to with pronouns such as he/she/they) have varying accuracy. I was happy to learn of Effy's work in this direction. Myself, I would try to use Effy's published work as well as Lauren Berk's (now Lauren Wheelock) work, https://github.com/lauren897 and https://dspace.mit.edu/handle/1721.1/127291?show=full, which, when I had attended, worked well for cases with short context.
> 
>         What kind of graph metrics would you find worth tracking?
> 
>     This is an interesting question for me, since I haven't thought of the graph from the perspective of measures like betweenness centrality. I thought of it as a representation from which we mine for insights, using new graph neural network algorithms. For example, suppose we represent the discourses as a multi-edged temporal graph, where the different types of edges represent the different aspects of the communication that we take into account. We could then extract graphlets (which in my mind are homeomorphic subgraph patterns; one could have maybe 15 nodes, being one set of folks who hold a particular view) and label these graphlets as different viewpoints on privacy? I apologize if this doesn't make sense; I haven't figured it out yet.
> 
>     Alternatively, we could take a direction where we don't do this and instead model the problem as an agent simulation where the goals are related to CI. Inside it, we represent the agents and their interactions in the graph structure, and we create a learning model whose weights we learn by trying to reach the goals based on the existing dialogue traces (aka mailing list conversations) we have.
> 
>             If the WN world view is so fine-grained that we need to look at timestamps and model in the continuous time domain, then I think that is too challenging for me, albeit interesting. If WN is just about major events, so that we can split our data into windows or chunks manually, then we avoid the problem.
> 
>         I need to dig deeper to recall exactly how the computational sociology components of WN work.
> 
>         But my sense is that the qualitative theory in WN is much richer than its technical operationalization.
> 
>         That leaves a big gap that we can start trying to fill.
> 
>         I don't think continuous time analysis will be necessary; windows or chunks should be fine.
> 
>     Awesome!!!!
> 
>         - S
> 
>     -priyanka
> 
> 
> _______________________________________________
> Bigbang-dev mailing list
> Bigbang-dev at data-activism.net
> https://lists.ghserv.net/mailman/listinfo/bigbang-dev
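
A minimal sketch of the two options discussed in the thread above: collapsing the whole conversation into one reply graph that ignores time, and splitting the same data into windows. It uses only the Python standard library, not BigBang's API. The toy messages, the tuple layout, the choice of calendar-month windows, and the function names reply_edges and monthly_windows are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical toy archive: (message_id, sender, in_reply_to, sent_at).
MESSAGES = [
    ("m1", "alice", None, datetime(2023, 1, 10)),
    ("m2", "bob", "m1", datetime(2023, 1, 12)),
    ("m3", "alice", "m2", datetime(2023, 2, 3)),
    ("m4", "carol", "m1", datetime(2023, 2, 20)),
]


def reply_edges(messages, sender_of):
    """Count directed (replier, original sender) edges, ignoring message order."""
    edges = Counter()
    for _, sender, parent, _ in messages:
        # Thread starters have no parent; unknown parents are skipped.
        if parent in sender_of:
            edges[(sender, sender_of[parent])] += 1
    return edges


def monthly_windows(messages):
    """Group messages into chunks keyed by (year, month) of the send date."""
    windows = defaultdict(list)
    for message in messages:
        sent_at = message[3]
        windows[(sent_at.year, sent_at.month)].append(message)
    return dict(windows)


# Look up senders across the full archive so cross-window replies resolve.
sender_of = {mid: sender for mid, sender, _, _ in MESSAGES}

# Option 1: one graph for the whole conversation, time collapsed.
whole = reply_edges(MESSAGES, sender_of)

# Option 2: one graph per window.
per_window = {
    key: reply_edges(chunk, sender_of)
    for key, chunk in monthly_windows(MESSAGES).items()
}
```

Replies are resolved against the full archive, so a February reply to a January message still counts as an edge in the February window.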

-- 
Niels ten Oever, PhD
Postdoctoral Researcher - Media Studies Department - University of Amsterdam
Affiliated Faculty - Digital Democracy Institute - Simon Fraser University
Non-Resident Fellow 2022-2023 - Center for Democracy & Technology
Associated Scholar - Centro de Tecnologia e Sociedade - Fundação Getúlio Vargas
Research Fellow - Centre for Internet and Human Rights - European University Viadrina

Vice chair - Global Internet Governance Academic Network (GigaNet)

W: https://nielstenoever.net
E: mail at nielstenoever.net
T: @nielstenoever
P/S/WA: +31629051853
PGP: 4254 ECD5 D4CF F6AF 8B91 0D9F EFAD 2E49 CC90 C10C

Read my latest article on network ideologies and how 5G reshapes the internet https://www.sciencedirect.com/science/article/pii/S0308596122001446