[Bigbang-dev] Clarifying theoretical commitments going into IETF 116

Priyanka Sinha priyanka.sinha.iitg at gmail.com
Mon Jan 30 09:34:44 CET 2023


Great 👍🏾 Hope you had a good time.

Let us continue the discussion here.

Looking forward to our monthly meeting on 21st February?

On Mon, 30 Jan, 2023, 13:59 Xue Li, <x.li3 at uva.nl> wrote:

> Hi all,
>
>
>
> I was on holiday in China until today. I just saw that there are
> interesting comments and insights on the NLP side of the work.
>
> I would be happy to join the meetings going forward and discuss the
> details further 😊.
>
>
>
> Best,
>
> Effy
>
>
>
> *From: *Bigbang-dev <bigbang-dev-bounces at data-activism.net> on behalf of
> Priyanka Sinha <priyanka.sinha.iitg at gmail.com>
> *Date: *Monday, 30 January 2023 at 07:03
> *To: *Sebastian Benthall <sbenthall at gmail.com>
> *Cc: *bigbang-dev at data-activism.net <bigbang-dev at data-activism.net>
> *Subject: *Re: [Bigbang-dev] Clarifying theoretical commitments going
> into IETF 116
>
> I agree with you. Please find my comments inline.
>
>
>
> On Wed, 25 Jan 2023 at 18:18, Sebastian Benthall <sbenthall at gmail.com>
> wrote:
>
> From a computational perspective, in my opinion and from what you are
> saying, doing CI would mean I just look at the flow of dialogue, i.e., the
> turn-by-turn order of the messages (posts and comments) that people have
> posted. In a graph-theoretic sense, though, I can ignore the temporal
> aspect and treat the whole conversation together. Technically, this may
> avoid the issues of short, noisy text, where some statistical NLP methods
> struggle because of the limited context. This may also be less complex
> computationally.
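
A minimal sketch of the order-free view described above, assuming each message is a dict with hypothetical `id`, `from`, and `in_reply_to` fields (these names are illustrative, not BigBang's schema). It collapses a thread into weighted sender-to-recipient edges, so the result does not depend on the order in which the messages arrive:

```python
from collections import Counter


def reply_graph(messages):
    """Collapse a thread into weighted (replier, replied_to) edges.

    Message order is ignored: only who replied to whom, and how often.
    """
    # Map every message id to its author so replies can be resolved.
    author_of = {m["id"]: m["from"] for m in messages}
    edges = Counter()
    for m in messages:
        parent = m.get("in_reply_to")
        if parent in author_of:  # skip thread starters and unknown parents
            edges[(m["from"], author_of[parent])] += 1
    return edges


# Hypothetical toy thread, for illustration only.
messages = [
    {"id": "m1", "from": "alice", "in_reply_to": None},
    {"id": "m2", "from": "bob", "in_reply_to": "m1"},
    {"id": "m3", "from": "alice", "in_reply_to": "m2"},
    {"id": "m4", "from": "bob", "in_reply_to": "m1"},
]
graph = reply_graph(messages)
```

Shuffling `messages` yields the same `graph`, which is the sense in which the temporal aspect is dropped.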
>
>
>
> Aha. I see what you mean. This does seem computationally tractable.
>
> It reminds me of some of the earliest work I did with BigBang.
>
>
>
> What comes to mind is that different working groups might be different
> 'contexts' and so have different patterns to how the discourse unfolds.
>
>
>
> To be honest, this is a bit of a stretch for CI as envisioned by Helen
> Nissenbaum. But when I originally approached Helen after working on
> BigBang, I also was thinking about mailing lists as contexts and messages
> sent as information flows. I suppose making this connection in a
> publication would be worthwhile :)
>
>
>
> To really make it work with CI, we would need to also track personal
> identifiers within email bodies. I.e., not only replies to people, but also
> references to people. (Maybe this would also include legal persons, such as
> company names.) So entity recognition would be great for this, if it were
> working.
>
>
>
> So, identifying whether email addresses that differ slightly refer to the
> exact same person is something my algorithm can do, which I have presented
> at the AID workshop.
>
>
>
> Within the email body, entity recognition, and perhaps coreference
> resolution (i.e., where the name of the person or company is not present
> but is referred to with pronouns such as he/she/they), has varying
> accuracy. I was happy to learn of Effy's work in this direction. Myself, I
> would try to use Effy's published work as well as Lauren Berk's (now Lauren
> Wheelock) work:
> https://github.com/lauren897
> https://dspace.mit.edu/handle/1721.1/127291?show=full
> which, when I had attended, worked well for cases with short context.
>
>
>
>
>
> What kind of graph metrics would you find worth tracking?
>
>
>
> This is an interesting question for me, since I haven't thought of the
> graph in terms of measures like betweenness centrality, etc. I thought of
> it as a representation that we mine for insights using new graph neural
> network algorithms. For example, suppose we represent the discourse as a
> multi-edged temporal graph, where the different types of edges represent
> the different aspects of the communication that we take into account. We
> could then work on extracting, say, graphlets (which in my mind are
> homeomorphic subgraph patterns; one could have maybe 15 nodes, which could
> be one set of folks that hold a particular view). We could then label
> these graphlets as different viewpoints on privacy?? I apologize if this
> doesn't make sense; I haven't figured it out yet. Alternatively, we could
> take a direction where we don't do this and instead model the problem as
> an agent simulation where the goals are related to CI. Inside, we
> represent the agents and their interactions in the graph structure, and we
> create a learning model whose weights we try to learn by trying to reach
> the goals based on the existing dialogue traces (aka mailing list
> conversations) we have.
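
One possible data structure for the multi-edged temporal graph mentioned above, as a sketch only. The class name, the edge kinds (`"reply"`, `"mention"`), and the integer timestamps are all assumptions made for illustration. It covers only the representation: graphlet extraction and graph neural networks are not attempted here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    kind: str       # hypothetical edge type, e.g. "reply" or "mention"
    timestamp: int  # e.g. seconds since the epoch


class TemporalMultigraph:
    """Stores parallel, typed, timestamped edges between participants."""

    def __init__(self):
        self.edges = []

    def add(self, source, target, kind, timestamp):
        self.edges.append(Edge(source, target, kind, timestamp))

    def layer(self, kind, start=None, end=None):
        """Adjacency sets for one edge type, restricted to [start, end)."""
        adjacency = defaultdict(set)
        for e in self.edges:
            if e.kind != kind:
                continue
            if start is not None and e.timestamp < start:
                continue
            if end is not None and e.timestamp >= end:
                continue
            adjacency[e.source].add(e.target)
        return dict(adjacency)


# Hypothetical toy data, for illustration only.
g = TemporalMultigraph()
g.add("bob", "alice", "reply", 10)
g.add("bob", "carol", "mention", 12)
g.add("alice", "bob", "reply", 25)

replies = g.layer("reply")
early_replies = g.layer("reply", start=0, end=20)
```

Keeping each edge type as a separately queryable layer means a subgraph-mining step could run per layer or per time slice without changing the stored data.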
>
>
>
>
>
> If the WN world view is so fine-grained that we need to look at timestamps
> and model in the continuous time domain, then I think that is too
> challenging for me, albeit interesting. If WN is just about major events,
> so that we can split our data into windows or chunks manually, then we
> avoid the problem.
>
>
>
> I need to dig deeper to recall exactly how the computational sociology
> components of WN work.
>
> But my sense is that the qualitative theory in WN is much richer than its
> technical operationalization.
>
> That leaves a big gap that we can start trying to fill.
>
>
>
> I don't think continuous time analysis will be necessary; windows or
> chunks should be fine.
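
A minimal sketch of the windows-or-chunks approach, assuming the chunk boundaries are chosen by hand (for example, at major events) and that each message is a hypothetical `(timestamp, message_id)` pair. The boundary date and message ids below are made up for illustration:

```python
from bisect import bisect_right
from datetime import datetime


def chunk_by_boundaries(messages, boundaries):
    """Split (timestamp, message_id) pairs into chunks at sorted boundaries.

    N boundaries give N + 1 chunks. A message stamped exactly at a
    boundary falls into the later chunk.
    """
    chunks = [[] for _ in range(len(boundaries) + 1)]
    for timestamp, message_id in messages:
        # Number of boundaries at or before the timestamp = chunk index.
        chunks[bisect_right(boundaries, timestamp)].append(message_id)
    return chunks


# Hypothetical boundary and messages, for illustration only.
boundaries = [datetime(2023, 2, 1)]
messages = [
    (datetime(2023, 1, 25, 18, 18), "m1"),
    (datetime(2023, 1, 30, 7, 3), "m2"),
    (datetime(2023, 2, 21, 12, 0), "m3"),
]
chunks = chunk_by_boundaries(messages, boundaries)
```

Each chunk can then be analysed as a static graph, which sidesteps continuous-time modelling.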
>
>
>
> Awesome!!!!
>
>
>
> - S
>
>
>
> -priyanka
>