[Bigbang-dev] Big Bang Notebook Questions Corinne

Niels ten Oever mail at nielstenoever.net
Fri Jul 27 21:12:35 CEST 2018


Hiya,

On 07/27/2018 05:58 PM, Corinne Cath wrote:
 <<snip>>
> I was wondering if the list-at-large has any ideas where the
> inconsistencies come from re: number of emails filtered per year or month?
> 
> Because it throws up some real questions regarding using the data
> presented by the tool in academic publications.

Well - that is why this notebook is still in the experimental folder :)

Am sure Nick or Sebastian have a simple explanation.

Cheers,

Niels

> 
> Best,
> 
> Corinne 
> 
> On Wed, Jul 25, 2018 at 2:03 PM, Beraldo, Davide <d.beraldo at uva.nl> wrote:
> 
>     Hey Corinne,
> 
>     So I have integrated the first two additional measures (relative
>     user activity and responsiveness - interesting measure btw).
> 
>     As for the third (you want to compute the yearly growth rate of
>     user activity, if I got it right), I have been struggling with it a
>     bit but haven't come to a working solution yet. Unfortunately (so to
>     say ;P) I'll be traveling till Monday, so I can start to work on it
>     again next week.
> 
>     Also, I have tested the inconsistency you mention and I do get it
>     on other mailing lists as well. Measuring the number of emails as
>     the length of the (filtered) archive gives a slightly different
>     result than filtering per month and summing the monthly counts. I
>     tried to dig into it but really cannot make sense of it at the moment.
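
One way such a gap can arise, offered as an assumption and not as a diagnosis of the notebook: if the per-month filter uses windows that end at midnight on the last day of each month, emails sent later that day fall into no window. The sketch below uses made-up data and a hypothetical "Date" column to show the effect and a way to list the rows that were missed.

```python
import pandas as pd

# Hypothetical stand-in for a list archive: one row per email.
emails = pd.DataFrame({
    "Date": pd.to_datetime(
        [
            "2015-01-15 10:00",
            "2015-01-31 18:30",  # late on the last day of January
            "2015-02-01 00:00",
            "2015-02-28 23:59",  # late on the last day of February
        ],
        utc=True,
    )
})

date_from = pd.Timestamp("2015-01-01", tz="UTC")
date_to = pd.Timestamp("2015-03-01", tz="UTC")

# Count 1: length of the filtered archive, using a half-open interval.
in_range = emails[(emails["Date"] >= date_from) & (emails["Date"] < date_to)]
total_by_length = len(in_range)

# Count 2: filter month by month with windows that end at midnight on the
# last day of each month, so anything sent later that day is skipped.
total_by_month = 0
counted = []
for start in pd.date_range(date_from, periods=2, freq="MS"):
    end = start + pd.offsets.MonthEnd(1)  # 00:00 on the month's last day
    in_month = emails[(emails["Date"] >= start) & (emails["Date"] <= end)]
    total_by_month += len(in_month)
    counted.extend(in_month.index)

# Rows present in the filtered archive but in none of the monthly windows.
missed = in_range.drop(index=counted)

# Bucketing the already-filtered rows keeps the two totals in agreement.
per_month = in_range.set_index("Date").resample("MS").size()
```

Here the archive length is 4 while the monthly windows only add up to 2, and `missed` holds the two late-evening emails. Whether the notebook's discrepancy has this cause would need checking against its actual filter code.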
> 
>     I attach the notebook with the additions so you can test it locally
>     if useful. I have integrated the new features in the 4th and 6th cells.
> 
>     I'll get back to you next week!
> 
>     Cheers,
>     Davide
> 
> 
>     ------------------------------------------------------------------------
>     *From:* cattekwaad at gmail.com on behalf of
>     Corinne Cath [corinnecath at gmail.com]
>     *Sent:* Tuesday, July 24, 2018 5:51 PM
>     *To:* Beraldo, Davide
>     *Cc:* bigbang-dev at data-activism.net
>     *Subject:* Re: [Bigbang-dev] Big Bang Notebook Questions Corinne
> 
>     Hi Davide,
> 
>     Many many thanks! So for the last one, I would like to see, for each
>     user, how many emails they sent in each year and how those yearly
>     counts relate to each other. My phrasing is a bit clumsy, my
>     apologies; allow me to demonstrate:
> 
>     So if Corinne sent 10 emails in year 1 and 20 in year 2, that is a
>     100% increase in her emails. I guess I could also do that in my
>     head or with a calculator.
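
A minimal sketch of the year-over-year measure, taking growth as (current - previous) / previous, which is an assumption about what is intended. Under that definition, going from 10 to 20 emails comes out as a 100% increase. The function name and the counts table are hypothetical.

```python
import pandas as pd


def yearly_growth_percent(previous, current):
    """Percentage change from one year's count to the next."""
    return (current - previous) / previous * 100


# Hypothetical yearly email counts per sender.
counts = pd.DataFrame(
    {"year_1": [10, 40], "year_2": [20, 30]},
    index=["Corinne", "another sender"],
)

growth = yearly_growth_percent(counts["year_1"], counts["year_2"])
```

Senders with zero emails in the earlier year would need separate handling, since the ratio is undefined there.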
> 
>     I will also send a longer email to the list explaining some issues I
>     ran into today!
> 
>     Best,
> 
> 
> 
>     On Tue, Jul 24, 2018 at 5:24 PM, Beraldo, Davide <d.beraldo at uva.nl> wrote:
> 
>         Hi Corinne,
> 
>         Sorry, I totally forgot to get back to you on this. Responding
>         to the coding-related questions.
> 
> 
>         /In [6]: top senders over a time period/
> 
>             Currently, it gives the absolute numbers, for example,
>             niels ten oever 77.0
> 
>             It would be great to know what percentage that represents of
>             the total number of emails sent in the period specified. So,
>             are those 77 emails 0.5%, 5%, 50%, etc. of the total over
>             that period?
> 
> 
>         Easy peasy
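
A rough sketch of the percentage column, assuming the archive is a pandas DataFrame with "From" and "Date" columns. Those column names and the sample rows are assumptions for illustration, not the notebook's actual code.

```python
import pandas as pd

# Hypothetical archive: one row per email.
emails = pd.DataFrame({
    "From": ["sender a", "sender a", "sender a", "sender b", "sender a"],
    "Date": pd.to_datetime(
        ["2014-10-05", "2014-11-20", "2015-03-01", "2015-06-15", "2016-01-10"],
        utc=True,
    ),
})

date_from = pd.Timestamp("2014-10-01", tz="UTC")
date_to = pd.Timestamp("2015-11-30", tz="UTC")

# Keep only the emails sent in the period of interest.
in_period = emails[(emails["Date"] >= date_from) & (emails["Date"] <= date_to)]

# Absolute counts per sender, then each sender's share of the period total.
counts = in_period["From"].value_counts()
share_percent = counts / counts.sum() * 100

top_senders = pd.DataFrame({"emails": counts, "percent": share_percent})
```

Dividing by the total for the filtered period, not the whole archive, is what makes the percentage answer the question as asked.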
> 
> 
>         /In [7]: number of emails in a time frame/
> 
>             I was wondering if it would also be possible to indicate the
>             number of threads versus single emails (with no responses) to
>             get a sense of how responsive a mailing list is in a certain
>             time period.
> 
>          
> 
>         Yes
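
One possible way to separate threads from unanswered emails, assuming each row carries the standard "Message-ID" and "In-Reply-To" headers as columns. This is a sketch on invented data and only looks at direct replies present in the same table.

```python
import pandas as pd

# Hypothetical archive slice: <2> replies to <1>, <4> replies to <2>,
# and <3> starts a conversation that nobody answers.
emails = pd.DataFrame({
    "Message-ID": ["<1>", "<2>", "<3>", "<4>"],
    "In-Reply-To": [None, "<1>", None, "<2>"],
})

# A root is an email that does not reply to anything.
roots = emails[emails["In-Reply-To"].isna()]

# Every Message-ID that at least one other email replies to.
replied_to = set(emails["In-Reply-To"].dropna())

has_reply = roots["Message-ID"].isin(replied_to)
n_threads = int(has_reply.sum())      # roots with at least one reply
n_singles = int((~has_reply).sum())   # roots nobody replied to

# Share of conversation starters that received an answer.
responsiveness = n_threads / len(roots)
```

Replies whose parent falls outside the selected time window are counted as neither threads nor singles here, which may or may not be the desired behaviour.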
> 
>             
> 
>             In [8]: I would be interested in, for instance, the average
>             number of emails per user, across multiple years.
> 
>             It seems that currently, the numbers presented are not the
>             average across the time specified but the absolute counts.
>             That might also be because for the test run, I set
> 
>             date_from = pd.datetime(2014,10,1,tzinfo=pytz.utc)
> 
>             date_to = pd.datetime(2015,11,30,tzinfo=pytz.utc)
> 
>             which is only a little over a year.
> 
> 
>         OK I think I misunderstood your question before. So you want to
>         do: for each user, count how many emails she sent in a
>         timeframe, divide by number of years? If so, be aware that if
>         you don't use a round number of years you would get inconsistent
>         results on the tails. Also, do you mean year as in 365 days from
>         date_from, or year as in 2015,2016,2017,...?
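
The two readings of "year" give different averages, which a small sketch can make concrete. It reuses the date range from the test run above; the "From" and "Date" column names and the sample emails are assumptions.

```python
import pandas as pd

# Hypothetical archive: sender "a" wrote three emails, sender "b" one.
emails = pd.DataFrame({
    "From": ["a", "a", "a", "b"],
    "Date": pd.to_datetime(
        ["2014-10-05", "2014-12-01", "2015-06-01", "2015-11-29"], utc=True
    ),
})

date_from = pd.Timestamp("2014-10-01", tz="UTC")
date_to = pd.Timestamp("2015-11-30", tz="UTC")

totals = emails.groupby("From").size()

# Reading 1: a year is 365 days, so divide by the elapsed span in years.
span_years = (date_to - date_from).days / 365
avg_per_365_days = totals / span_years

# Reading 2: a year is a calendar year. A partial year (here 2014, which
# only covers October to December) weighs as much as a full one.
per_year = (
    emails.groupby(["From", emails["Date"].dt.year]).size().unstack(fill_value=0)
)
avg_per_calendar_year = per_year.mean(axis=1)
```

With this range the span is 425 days, so the first reading divides by roughly 1.16 years while the second averages over two calendar years, one of them partial. That is the tail effect mentioned above.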
> 
> 
>         I'll try to incorporate these changes tomorrow, hope it's not
>         too late!
> 
> 
> 
>         Cheers,
> 
>         Davide
> 
> 
> 
> 
>     -- 
>     Corinne Cath
>     Ph.D. Candidate, Oxford Internet Institute & Alan Turing Institute
> 
>     Web: www.oii.ox.ac.uk/people/corinne-cath
>     Email: ccath at turing.ac.uk & corinnecath at gmail.com
>     Twitter: @C_Cath
> 
> 
> 
> 
> 

-- 
Niels ten Oever
Researcher and PhD Candidate
Datactive Research Group
University of Amsterdam

PGP fingerprint	   2458 0B70 5C4A FD8A 9488
                   643A 0ED8 3F3A 468A C8B3


