[Bigbang-dev] Are gender diversity and draft productivity correlated? THE VERDICT

Niels ten Oever mail at nielstenoever.net
Wed Aug 26 19:52:19 CEST 2020


Very interesting. I'd say the number of drafts and authors in hrpc is too low to make a statement about this though. Could we do this for the HTTP and/or DNS WGs?

On Aug 26, 2020, at 19:30, Sebastian Benthall <sbenthall at gmail.com> wrote:
>Hello,
>
>I'm revisiting the question of whether mailing list gender diversity
>and draft productivity of working groups are correlated.
>
>Putting aside for now all the methodological complications, here is
>how I am operationalizing the question:
>
>  - I'm looking specifically at the HRPC working group, with this
>    data: [image: image.png]
>  - Gender is being detected based on first name birth records.
>    "unknown" is used for cases that cannot be determined as either
>    men or women with the current data set.
>  - I'm measuring "diversity" on any day as: (women's activity +
>    unknown's activity) / (men's activity). Because, you know, this is
>    probably close to what most people mean by diversity. (Recall that
>    non-Western names are more likely to be categorized as "unknown".)
>  - I'm using a 100 day rolling average on the activity counts.
>
>This is the matrix of Pearson correlations between each of these
>values:
>
>           women     unknown   men        drafts    diversity
>women      1.000000  0.910922   0.804869  0.008890   0.160833
>unknown    0.910922  1.000000   0.808168  0.027502   0.245059
>men        0.804869  0.808168   1.000000  0.015406  -0.141915
>drafts     0.008890  0.027502   0.015406  1.000000   0.061884
>diversity  0.160833  0.245059  -0.141915  0.061884   1.000000
>
>Things to note:
>
>  - The activity of each gender is correlated with the activity of
>    other genders.
>  - Diversity is anticorrelated with the number of men. This is
>    expected based on how it was defined, and a good sanity check.
>  - Draft output is MORE correlated with diversity than it is with any
>    individual gender!
>
>This last point is quite nice. It resonates with the work of Scott
>Page on the value of diversity to collective intelligence, for example.
>
>These numbers are a bit hard to interpret. How much should we trust
>them? These are the *p*-values associated with each correlation:
>
>           women   unknown  men     drafts  diversity
>women      0       0        0       0.6925  0
>unknown    0       0        0       0.221   0
>men        0       0        0       0.493   0
>drafts     0.6925  0.221    0.493   0       0.0059
>diversity  0       0        0       0.0059  0
>
>Generally, *p*-values below .01 are considered "statistically
>significant", i.e. publishable. This correlation between diversity and
>draft output makes the cut!!
>
>So the verdict is: for HRPC, YES, gender diversity is correlated with
>draft output.
>
>This result is robust to transformations of the activity scores into
>the log space, which is comforting. Further work is needed to see if
>this result is robust across other IETF working groups.
>
>Nick, what would you say to including a result like this in the paper
>about IETF and gender?
>
>Cheers,
>Seb
>
>
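For anyone who wants to try the same operationalization on another working group, here is a rough sketch of the steps described in the quoted message: a rolling average over the daily activity counts, the diversity ratio (women + unknown) / men, and a Pearson correlation against draft output. It is not the code that produced the numbers above. The function names, the toy counts, and the window of 3 are made up for illustration (the message uses a 100 day window), and it assumes that the counts are smoothed before the ratio is taken and that the draft series is smoothed the same way.

```python
from math import sqrt


def rolling_mean(values, window):
    """Trailing mean over `window` points; only full windows are returned."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        if i >= window - 1:
            out.append(running / window)
    return out


def diversity(women, unknown, men):
    """(women + unknown) / men for each day; None where men's activity is 0."""
    return [(w + u) / m if m else None for w, u, m in zip(women, unknown, men)]


def pearson(xs, ys):
    """Pearson correlation, skipping pairs where either value is None."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / sqrt(var_x * var_y)


# Made-up daily counts, purely illustrative (not HRPC data).
women = [1, 0, 2, 1, 3, 2, 1, 0]
unknown = [0, 1, 1, 0, 2, 1, 0, 1]
men = [5, 4, 6, 5, 7, 6, 4, 3]
drafts = [0, 0, 1, 0, 1, 0, 0, 1]

WINDOW = 3  # the quoted message uses 100 days

women_s = rolling_mean(women, WINDOW)
unknown_s = rolling_mean(unknown, WINDOW)
men_s = rolling_mean(men, WINDOW)
drafts_s = rolling_mean(drafts, WINDOW)

# Assumption: the ratio is computed on the smoothed counts.
diversity_s = diversity(women_s, unknown_s, men_s)

print("corr(drafts, diversity):", pearson(drafts_s, diversity_s))
print("corr(men, diversity):   ", pearson(men_s, diversity_s))
```

Pointing this at the HTTP or DNS lists would only require swapping in their daily counts and setting WINDOW back to 100.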