[Bigbang-dev] Are gender diversity and draft productivity correlated? THE VERDICT

Colin Perkins csp at csperkins.org
Fri Aug 28 10:09:18 CEST 2020


The AVT, AVTCORE, AVTEXT, PAYLOAD, and XRBLOCK WGs are very tightly related, and we have email history going back more than 25 years, from the early research, through a massive boom in activity, to the current stage as the groups wind down. Those might be interesting to look at.

Or the IPv6 effort?

Colin



> On 26 Aug 2020, at 18:52, Niels ten Oever <mail at nielstenoever.net> wrote:
> 
> Very interesting. I'd say the number of drafts and authors in hrpc is too low to make a statement about this though. Could we do this for the HTTP and/or DNS WGs?
> 
> On Aug 26, 2020, at 19:30, Sebastian Benthall <sbenthall at gmail.com> wrote:
> Hello,
> 
> I'm revisiting the question of whether mailing list gender diversity and draft productivity of working groups are correlated.
> 
> Putting aside for now all the methodological complications, here is how I am operationalizing the question:
> I'm looking specifically at the HRPC working group, with this data:
> 
> Gender is being detected based on first-name birth records. "unknown" is used for names that cannot be classified as either men or women with the current data set.
> I'm measuring "diversity" on any day as: (women's activity + unknown's activity) / (men's activity). This is probably close to what most people mean by diversity. (Recall that non-Western names are more likely to be categorized as "unknown".)
> I'm using a 100 day rolling average on the activity counts.
> This is the matrix of Pearson correlations between each of these values:
> 
>                women    unknown        men     drafts  diversity
> women       1.000000   0.910922   0.804869   0.008890   0.160833
> unknown     0.910922   1.000000   0.808168   0.027502   0.245059
> men         0.804869   0.808168   1.000000   0.015406  -0.141915
> drafts      0.008890   0.027502   0.015406   1.000000   0.061884
> diversity   0.160833   0.245059  -0.141915   0.061884   1.000000
> 
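The operationalization above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code used for the analysis: the function names are hypothetical, the rolling window is taken to be a trailing one, smoothing is assumed to be applied to the counts before the ratio is formed, and days with zero men's activity are simply skipped.

```python
from math import sqrt

def rolling_mean(values, window=100):
    """Trailing rolling mean; early entries average over the days seen so far (assumption)."""
    out, running = [], 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        out.append(running / min(i + 1, window))
    return out

def diversity(women, unknown, men):
    """(women + unknown) / men per day; None where men's activity is zero (assumption)."""
    return [(w + u) / m if m else None for w, u, m in zip(women, unknown, men)]

def pearson(xs, ys):
    """Pearson correlation over the days where both series have a value."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    return cov / sqrt(vx * vy)
```

With daily count lists `women`, `unknown`, `men` and `drafts` (hypothetical names), each list would be passed through `rolling_mean`, the smoothed counts combined with `diversity`, and `pearson` applied to every pair of series to fill in a matrix like the one above.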
> Things to note:
> The activity of each gender is correlated with the activity of other genders.
> Diversity is anticorrelated with the number of men. This is expected based on how it was defined, and a good sanity check.
> Draft output is MORE correlated with diversity than it is with any individual gender!
> This last point is quite nice. It resonates with the work of Scott Page on the value of diversity to collective intelligence, for example.
> 
> These numbers are a bit hard to interpret. How much should we trust them? These are the p-values associated with each correlation:
>                women    unknown        men     drafts  diversity
> women              0          0          0     0.6925          0
> unknown            0          0          0      0.221          0
> men                0          0          0      0.493          0
> drafts        0.6925      0.221      0.493          0     0.0059
> diversity          0          0          0     0.0059          0
> 
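The thread does not say how these p-values were computed. One way to sanity-check a single entry with only the standard library is a permutation test, sketched below. This is a swapped-in technique, not necessarily the method behind the table, and the function name and parameters are hypothetical. Shuffling treats days as exchangeable, which rolling-averaged series are not, so the result is at best a rough check.

```python
import random

def permutation_p_value(xs, ys, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the correlation between xs and ys.

    Shuffling ys leaves both variances unchanged, so comparing absolute
    covariances is equivalent to comparing absolute Pearson correlations.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]

    def stat(d):
        return abs(sum(a * b for a, b in zip(dx, d)))

    observed = stat(dy)
    shuffled = list(dy)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if stat(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)
```

The smallest value this can return is 1 / (n_perm + 1), so `n_perm` bounds how small a p-value the check can resolve.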
> Generally, p-values below .01 are considered "statistically significant", i.e. publishable.
> This correlation between diversity and draft output makes the cut!!
> 
> So the verdict is: for HRPC, YES, gender diversity is correlated with draft output.
> 
> This result is robust to transformations of the activity scores into the log space, which is comforting.
> Further work is needed to see if this result is robust across other IETF working groups.
> 
> Nick, what would you say to including a result like this in the paper about IETF and gender?
> 
> Cheers,
> Seb
> 
> 
> Bigbang-dev mailing list
> Bigbang-dev at data-activism.net
> https://lists.ghserv.net/mailman/listinfo/bigbang-dev



-- 
Colin Perkins
https://csperkins.org/



