[Bigbang-dev] Big Bang Notebook Questions Corinne
Niels ten Oever
mail at nielstenoever.net
Fri Jul 27 21:12:35 CEST 2018
Hiya,
On 07/27/2018 05:58 PM, Corinne Cath wrote:
<<snip>>
> I was wondering if the list-at-large has any ideas where the
> inconsistencies come from re: number of emails filtered per year or month?
>
> Because it throws up some real questions regarding using the data
> presented by the tool in academic publications.
Well - that is why this notebook is still in the experimental folder :)
Am sure Nick or Sebastian have a simple explanation.
Cheers,
Niels
>
> Best,
>
> Corinne
>
> On Wed, Jul 25, 2018 at 2:03 PM, Beraldo, Davide <d.beraldo at uva.nl>
> wrote:
>
> Hey Corinne,
>
> So I have integrated the first two additional measures (relative
> user activity and responsiveness - interesting measure btw).
>
> As for the third (you want to compute the yearly growth rate of
> user activity, if I got it right), I have been struggling with it a
> bit but haven't come to a working solution yet. Unfortunately (so to
> say ;P) I'll be traveling till Monday, so I can start to work on it
> again next week.
>
> Also, I have tested the inconsistency you mention and I do get it
> on other mailing lists as well. When measuring the number of emails
> as the length of the (filtered) archive, it gives a slightly different
> result than when filtering per month and summing the totals. I tried
> to dig into it but really cannot make sense of it at the moment.
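
One way to track down a mismatch like this is to compute both totals on the same list of dates and compare them. The sketch below is not BigBang code: `month_windows`, `count_half_open`, `count_closed`, and the sample dates are all hypothetical. It illustrates one possible cause, namely that month windows which include both endpoints count a message sent exactly at midnight on the first of a month twice. Nothing in this thread confirms that this is what happens in the notebook.

```python
from datetime import datetime, timezone

UTC = timezone.utc

def next_month(d):
    """Return the first instant of the month following d."""
    year, month = (d.year + 1, 1) if d.month == 12 else (d.year, d.month + 1)
    return d.replace(year=year, month=month, day=1,
                     hour=0, minute=0, second=0, microsecond=0)

def month_windows(date_from, date_to):
    """Split [date_from, date_to) into consecutive (start, end) month windows."""
    windows = []
    start = date_from
    while start < date_to:
        end = min(next_month(start), date_to)
        windows.append((start, end))
        start = end
    return windows

def count_half_open(dates, start, end):
    """Count dates with start <= d < end (end excluded)."""
    return sum(1 for d in dates if start <= d < end)

def count_closed(dates, start, end):
    """Count dates with start <= d <= end (end included)."""
    return sum(1 for d in dates if start <= d <= end)

# Hypothetical message timestamps, one of them exactly on a month boundary.
dates = [
    datetime(2015, 1, 15, 9, 30, tzinfo=UTC),
    datetime(2015, 2, 1, 0, 0, tzinfo=UTC),
    datetime(2015, 2, 20, 18, 5, tzinfo=UTC),
    datetime(2015, 3, 31, 23, 59, tzinfo=UTC),
]
date_from = datetime(2015, 1, 1, tzinfo=UTC)
date_to = datetime(2015, 4, 1, tzinfo=UTC)

total = count_half_open(dates, date_from, date_to)
windows = month_windows(date_from, date_to)
monthly_half_open = sum(count_half_open(dates, s, e) for s, e in windows)
monthly_closed = sum(count_closed(dates, s, e) for s, e in windows)

print(total, monthly_half_open, monthly_closed)
```

With half-open windows the monthly sum matches the overall total; with closed windows the boundary message is counted in two months. Other candidates worth checking in the same way are timezone conversion and messages with missing or unparseable dates.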
>
> I attach the notebook with the additions so you can test it locally
> if useful. I have integrated the new features in the 4th and 6th cells.
>
> I'll get back to you next week!
>
> Cheers,
> Davide
>
>
> ------------------------------------------------------------------------
> *From:* cattekwaad at gmail.com on behalf of
> Corinne Cath [corinnecath at gmail.com]
> *Sent:* Tuesday, July 24, 2018 5:51 PM
> *To:* Beraldo, Davide
> *Cc:* bigbang-dev at data-activism.net
> *Subject:* Re: [Bigbang-dev] Big Bang Notebook Questions Corinne
>
> Hi Davide,
>
> Many many thanks! So for the last one, I would like to know, for
> each user, how many emails they sent in each year and how those
> numbers relate to each other. My phrasing is a bit clumsy, my
> apologies; allow me to demonstrate:
>
> So if Corinne sent 10 emails in year 1 and 20 in year 2, that is a
> 100% increase in her emails. I guess I could also do that in my
> head or with a calculator.
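
Under the usual definition of a year-over-year growth rate, (new - old) / old, going from 10 to 20 emails is a 100% increase. A minimal sketch of that calculation per sender follows. It is plain Python rather than notebook code, and `yearly_counts`, `growth_rates`, the sender names, and the counts are hypothetical.

```python
from collections import Counter

def yearly_counts(messages):
    """Count messages per (sender, year); messages is an iterable of such pairs."""
    return Counter(messages)

def growth_rates(counts, sender):
    """Return {year: percent change versus the previous year} for one sender.

    Only pairs of consecutive years in which the sender was active are
    compared, so the previous-year count is never zero.
    """
    years = sorted(y for (s, y) in counts if s == sender)
    rates = {}
    for prev, cur in zip(years, years[1:]):
        if cur - prev != 1:
            continue  # skip gaps: there is no previous-year count to compare with
        old, new = counts[(sender, prev)], counts[(sender, cur)]
        rates[cur] = 100.0 * (new - old) / old
    return rates

# Hypothetical data: (sender, year) for every message sent.
messages = (
    [("corinne", 2016)] * 10 + [("corinne", 2017)] * 20
    + [("davide", 2016)] * 8 + [("davide", 2017)] * 6
)
counts = yearly_counts(messages)

print(growth_rates(counts, "corinne"))
print(growth_rates(counts, "davide"))
```

Years in which a sender wrote nothing are skipped here; whether they should instead count as a drop to zero is a choice the analysis has to make explicitly.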
>
> I will also send a longer email to the list explaining some issues I
> ran into today!
>
> Best,
>
>
>
> On Tue, Jul 24, 2018 at 5:24 PM, Beraldo, Davide <d.beraldo at uva.nl>
> wrote:
>
> Hi Corinne,
>
> sorry I totally forgot to get back on this. Responding on
> coding-related questions.
>
>
> In [6]: top senders over a time period
>
> Currently, it gives the absolute numbers, for example, niels
> ten oever 77.0
>
> It would be great to know what percentage that represents of
> the total number of emails sent in the period specified. So,
> are those 77 emails 0.5%, 5%, 50%, etc. of the total over that
> period?
>
>
> Easy peasy
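
A minimal sketch of that percentage, assuming the sender of every message in the period is available as a list. `sender_shares` and the sample data are hypothetical; only the figure of 77 comes from the example above, and the total is invented for illustration.

```python
from collections import Counter

def sender_shares(senders):
    """Return [(sender, count, percent_of_total)], most active sender first."""
    counts = Counter(senders)
    total = sum(counts.values())
    return [(name, n, 100.0 * n / total) for name, n in counts.most_common()]

# Hypothetical period with 154 messages in total.
senders = ["niels"] * 77 + ["corinne"] * 23 + ["davide"] * 54
shares = sender_shares(senders)

for name, n, pct in shares:
    print(f"{name:10s} {n:4d} {pct:5.1f}%")
```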
>
>
> In [7]: number of emails in a time frame
>
> I was wondering if it would also be possible to indicate the
> number of threads versus single emails (with no responses) to
> get a sense of how responsive a mailing list is in a certain
> time period.
>
>
>
> Yes
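
One way to separate threads from unanswered emails is to follow each message's In-Reply-To header back to the message that started the conversation. The sketch below assumes every message is available as a `(message_id, in_reply_to)` pair; `thread_stats` and the sample ids are hypothetical, and this is not necessarily how the notebook computes responsiveness.

```python
def thread_stats(messages):
    """Return (threads_with_replies, single_messages).

    messages is a list of (message_id, in_reply_to) pairs, where
    in_reply_to is None for a message that starts a conversation.
    """
    parent = dict(messages)

    def root(mid):
        seen = set()  # guard against malformed reply loops
        # Walk upward while the parent is a message we actually have.
        while parent.get(mid) in parent and mid not in seen:
            seen.add(mid)
            mid = parent[mid]
        return mid

    sizes = {}
    for mid, _ in messages:
        r = root(mid)
        sizes[r] = sizes.get(r, 0) + 1

    threads = sum(1 for n in sizes.values() if n > 1)
    singles = sum(1 for n in sizes.values() if n == 1)
    return threads, singles

# Hypothetical archive slice: m5 replies to a message outside the time frame.
messages = [
    ("m1", None), ("m2", "m1"), ("m3", "m2"),
    ("m4", None),
    ("m5", "mX"),
    ("m6", None), ("m7", "m6"),
]
threads, singles = thread_stats(messages)
print(threads, singles)
```

A reply whose parent falls outside the filtered period is treated here as the start of its own conversation; that choice affects the ratio near the edges of the time frame.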
>
>
>
> In [8]: I would be interested in, for instance, the average
> number of emails per user, across multiple years.
>
> It seems that currently, the numbers presented are not the
> average across the time specified but the absolute count. That
> might also be because for the test run, I set
>
>
>
>
>
> import datetime
> import pytz
>
> date_from = datetime.datetime(2014, 10, 1, tzinfo=pytz.utc)
> date_to = datetime.datetime(2015, 11, 30, tzinfo=pytz.utc)
>
> which is only a little over a year.
>
>
> OK I think I misunderstood your question before. So you want to
> do: for each user, count how many emails she sent in a
> timeframe, divide by number of years? If so, be aware that if
> you don't use a round number of years you would get inconsistent
> results on the tails. Also, do you mean year as in 365 days from
> date_from, or year as in 2015,2016,2017,...?
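
The two readings of "year" can give quite different averages when the window is not a whole number of years. A minimal sketch using the window from the test run above; the function names and the count of 14 emails are hypothetical, and the standard library's `timezone.utc` stands in for `pytz.utc`.

```python
from datetime import datetime, timezone

UTC = timezone.utc

def avg_per_elapsed_year(n_emails, date_from, date_to):
    """Average per 365-day year, counted from date_from."""
    years = (date_to - date_from).days / 365
    return n_emails / years

def avg_per_calendar_year(n_emails, date_from, date_to):
    """Average over every calendar year the window touches, even partially."""
    years = date_to.year - date_from.year + 1
    return n_emails / years

date_from = datetime(2014, 10, 1, tzinfo=UTC)
date_to = datetime(2015, 11, 30, tzinfo=UTC)
n_emails = 14  # hypothetical count for one user in this window

elapsed = avg_per_elapsed_year(n_emails, date_from, date_to)
calendar = avg_per_calendar_year(n_emails, date_from, date_to)
print(round(elapsed, 2), calendar)
```

The window covers 425 days, so the elapsed-time average divides by about 1.16 years, while the calendar-year average divides by 2 even though only three months of 2014 are included. That partial year is the tail effect mentioned above.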
>
>
> I'll try to incorporate these changes tomorrow, hope it's not
> too late!
>
>
>
> Cheers,
>
> Davide
>
>
>
>
> --
> Corinne Cath
> Ph.D. Candidate, Oxford Internet Institute & Alan Turing Institute
>
> Web: www.oii.ox.ac.uk/people/corinne-cath
> <http://www.oii.ox.ac.uk/people/corinne-cath>
> Email: ccath at turing.ac.uk <mailto:ccath at turing.ac.uk> &
> corinnecath at gmail.com <mailto:corinnecath at gmail.com>
> Twitter: @C_Cath
>
>
>
>
>
>
--
Niels ten Oever
Researcher and PhD Candidate
Datactive Research Group
University of Amsterdam
PGP fingerprint 2458 0B70 5C4A FD8A 9488
643A 0ED8 3F3A 468A C8B3