[Bigbang-dev] Are gender diversity and draft productivity correlated? THE VERDICT
Colin Perkins
csp at csperkins.org
Fri Aug 28 10:09:18 CEST 2020
The AVT, AVTCORE, AVTEXT, PAYLOAD, and XRBLOCK WGs form a tightly related cluster, and we have email history going back more than 25 years, from the early research, through a massive boom in activity, to the current stage as the groups wind down. Those might be interesting to look at.
Or the IPv6 effort?
Colin
> On 26 Aug 2020, at 18:52, Niels ten Oever <mail at nielstenoever.net> wrote:
>
> Very interesting. I'd say the number of drafts and authors in hrpc is too low to make a statement about this, though. Could we do this for the HTTP and/or DNS WGs?
> On Aug 26, 2020, at 19:30, Sebastian Benthall <sbenthall at gmail.com> wrote:
> Hello,
>
> I'm revisiting the question of whether mailing list gender diversity and draft productivity of working groups are correlated.
>
> Putting aside for now all the methodological complications, here is how I am operationalizing the question:
> I'm looking specifically at the HRPC working group, with this data:
>
> Gender is inferred from first names using birth records. "unknown" is used for names that the current data set cannot classify as either men or women.
> I'm measuring "diversity" on any day as: (women's activity + unknown's activity) / (men's activity). Because, you know, this is probably close to what most people mean by diversity. (Recall that non-Western names are more likely to be categorized as "unknown".)
> I'm using a 100-day rolling average on the activity counts.
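A minimal sketch of the measure described above, assuming the daily activity counts per category are already available as equal-length lists. The function names, the trailing window, and the sample counts are illustrative assumptions, not BigBang's API and not HRPC data. Smoothing the counts before taking the ratio is one reading of the description.

```python
from collections import deque


def rolling_mean(values, window=100):
    """Trailing rolling mean; early days average over the days seen so far."""
    buf, total, out = deque(), 0.0, []
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            total -= buf.popleft()
        out.append(total / len(buf))
    return out


def diversity(women, unknown, men, window=100):
    """(women + unknown) / men on smoothed counts; None where men's activity is 0."""
    w = rolling_mean(women, window)
    u = rolling_mean(unknown, window)
    m = rolling_mean(men, window)
    return [(wi + ui) / mi if mi > 0 else None for wi, ui, mi in zip(w, u, m)]


# Hypothetical daily message counts, for illustration only.
women = [1, 0, 2, 1]
unknown = [0, 1, 0, 1]
men = [4, 2, 3, 3]
print(diversity(women, unknown, men, window=2))
```

Returning None on days with no activity from men avoids a division by zero. How such days were handled in the analysis above is not stated.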
> This is the matrix of Pearson correlations between each of these values:
>
>                women   unknown       men    drafts diversity
> women       1.000000  0.910922  0.804869  0.008890  0.160833
> unknown     0.910922  1.000000  0.808168  0.027502  0.245059
> men         0.804869  0.808168  1.000000  0.015406 -0.141915
> drafts      0.008890  0.027502  0.015406  1.000000  0.061884
> diversity   0.160833  0.245059 -0.141915  0.061884  1.000000
>
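For readers who want to see how a matrix like the one above is built, here is a small sketch of pairwise Pearson correlation in plain Python. The function names and the sample series are hypothetical; the original analysis most likely used a library routine instead.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series (raises on a constant series)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(series):
    """Map each pair of named series to its Pearson correlation."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Hypothetical smoothed activity series, for illustration only.
data = {
    "women": [1.0, 1.5, 2.0, 2.5, 2.0],
    "men": [4.0, 5.0, 6.5, 7.0, 6.0],
    "drafts": [0.0, 1.0, 0.0, 1.0, 2.0],
}
for name, row in correlation_matrix(data).items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```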
> Things to note:
> The activity of each gender is correlated with the activity of other genders.
> Diversity is anticorrelated with the number of men. This is expected based on how it was defined, and a good sanity check.
> Draft output is MORE correlated with diversity than it is with any individual gender!
> This last point is quite nice. It resonates with the work of Scott Page on the value of diversity to collective intelligence, for example.
>
> These numbers are a bit hard to interpret. How much should we trust them? These are the p-values associated with each correlation:
>                women   unknown       men    drafts diversity
> women              0         0         0    0.6925         0
> unknown            0         0         0     0.221         0
> men                0         0         0     0.493         0
> drafts        0.6925     0.221     0.493         0    0.0059
> diversity          0         0         0    0.0059         0
>
> Generally, p-values below .01 are considered "statistically significant" (a stricter cutoff than the more common .05), i.e. publishable.
> This correlation between diversity and draft output makes the cut!!
>
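The thread does not say how the p-values above were computed. As a rough illustration of what such a p-value means, here is a permutation test for a Pearson correlation. This is a swapped-in technique, not necessarily the method used above; the function names, the seed, and the sample data are assumptions.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least as large as observed."""
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    # The +1 terms count the observed ordering itself, so p is never exactly 0.
    return (hits + 1) / (n_perm + 1)


# Hypothetical series, for illustration only.
diversity = [0.2, 0.3, 0.25, 0.4, 0.5, 0.45, 0.6, 0.55]
drafts = [0, 1, 0, 1, 2, 1, 2, 3]
print(permutation_p_value(diversity, drafts))
```

Shuffling treats the days as exchangeable. A 100 day rolling average makes neighbouring days strongly dependent, so a p-value obtained this way is likely to be optimistic for smoothed series.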
> So the verdict is: for HRPC, YES, gender diversity is correlated with draft output.
>
> This result is robust to transformations of the activity scores into the log space, which is comforting.
> Further work is needed to see if this result is robust across other IETF working groups.
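A robustness check of the kind mentioned above could look roughly like this: compute the correlation on raw values and again on log-transformed values, then compare. Using log(1 + v) to cope with zero-activity days is an assumption, as are the function names and the sample data.

```python
import math


def log_counts(values):
    """log(1 + v), so that days with zero activity stay defined."""
    return [math.log1p(v) for v in values]


def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical activity series, for illustration only.
raw_x = [0, 1, 3, 10, 40]
raw_y = [1, 1, 2, 5, 9]
print("raw:", pearson(raw_x, raw_y))
print("log:", pearson(log_counts(raw_x), log_counts(raw_y)))
```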
>
> Nick, what would you say to including a result like this in the paper about IETF and gender?
>
> Cheers,
> Seb
>
>
> Bigbang-dev mailing list
> Bigbang-dev at data-activism.net
> https://lists.ghserv.net/mailman/listinfo/bigbang-dev
--
Colin Perkins
https://csperkins.org/