[Bigbang-dev] Gender diversity and draft productivity

Sebastian Benthall sbenthall at gmail.com
Tue Jul 7 19:00:53 CEST 2020


I've managed to make some progress on the question of how gender diversity
affects working group productivity.

I'd like some feedback on this very preliminary visualization of HRPC, as I
think it illustrates some of the analytic, methodological, and ethical
challenges of this question.

[image: image.png]
This plot shows a rolling average of email activity on HRPC's mailing list,
with vertical lines for the number of drafts published on each day.
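
For readers unfamiliar with the term, a rolling average of daily email
activity can be sketched roughly as below. This is only an illustration, not
the code behind the plot: the function name, the 7-day default window, and
the choice of a trailing window are all assumptions.

```python
from collections import Counter
from datetime import date, timedelta


def rolling_average(message_dates, window_days=7):
    """Trailing mean of messages per day, for every day in the observed range.

    The window length is an assumption; days before the first message
    count as zero activity.
    """
    per_day = Counter(message_dates)
    if not per_day:
        return {}
    start, end = min(per_day), max(per_day)
    averages = {}
    day = start
    while day <= end:
        # Counter returns 0 for days with no messages.
        window = [per_day[day - timedelta(days=i)] for i in range(window_days)]
        averages[day] = sum(window) / window_days
        day += timedelta(days=1)
    return averages


# Toy data: three messages on July 1 and one on July 3.
dates = [date(2020, 7, 1)] * 3 + [date(2020, 7, 3)]
for day, value in rolling_average(dates, window_days=2).items():
    print(day, value)
```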

In this plot, I've colored the draft lines based on a "gender tendency",
computed as follows:
 - BigBang's gender detector guesses, from each author's first name,
whether they are "male" (1), "female" (0), or "unknown" (0.5).
 - The values are then averaged over all the authors who published a draft
on that day.
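
The two steps above can be sketched roughly as follows. This is a minimal
illustration, not BigBang's actual code: `guess_gender` and its toy name
table are hypothetical stand-ins for the real gender detector, and the draft
data is made up.

```python
from collections import defaultdict
from statistics import mean

# Scores as described above: "male" = 1, "female" = 0, "unknown" = 0.5.
SCORES = {"male": 1.0, "female": 0.0, "unknown": 0.5}

# Hypothetical stand-in for the gender detector's name data.
TOY_NAME_TABLE = {"alice": "female", "bob": "male"}


def guess_gender(full_name):
    """Guess a label from the first name; unrecognized names are "unknown"."""
    parts = full_name.split()
    first = parts[0].lower() if parts else ""
    return TOY_NAME_TABLE.get(first, "unknown")


def gender_tendency(drafts):
    """Map each publication date to the mean score of that day's authors.

    `drafts` is an iterable of (date, [author names]) pairs.
    """
    scores_by_day = defaultdict(list)
    for day, authors in drafts:
        for author in authors:
            scores_by_day[day].append(SCORES[guess_gender(author)])
    return {day: mean(scores) for day, scores in scores_by_day.items()}


# Made-up example data.
drafts = [
    ("2020-07-01", ["Alice Example", "Bob Example"]),
    ("2020-07-01", ["Zzyzx Example"]),
    ("2020-07-02", ["Bob Example"]),
]
print(gender_tendency(drafts))
```

With this scoring, a day with one "male" and one "female" author gets the
same value (0.5) as a day with only "unknown" authors.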

There are some notable issues with this method.
 - The data is not always clean; e.g., Unicode errors in somebody's name
cause them to be identified as non-binary.
 - I get confused by this all the time, but I was under the impression that
"male" and "female" were sexes, not genders?
 - The name-based gender detector has a Western bias, and this leads to some
errors. I believe it is misgendering Gurshabad Grover as a woman.

I know these sorts of topics can be divisive. I wonder if anybody has ideas
for how to improve things that are actionable from an engineering
standpoint?

Is anybody on this list authoritative about the right kinds of gender
categories to use?

One idea is to use the IETF DataTracker's biography field and count
pronouns:
https://github.com/datactive/bigbang/issues/393
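
A rough sketch of what pronoun counting over a biography could look like is
below. It assumes the biography text has already been fetched as a plain
string (no DataTracker API call is shown), and the pronoun groups and
function names are illustrative assumptions rather than anything decided in
the issue.

```python
import re
from collections import Counter

# Assumed pronoun groups; which categories to use is an open question.
PRONOUN_GROUPS = {
    "she": {"she", "her", "hers", "herself"},
    "he": {"he", "him", "his", "himself"},
    "they": {"they", "them", "their", "theirs", "themself", "themselves"},
}


def count_pronouns(biography):
    """Count third-person pronouns in a biography, by group."""
    words = re.findall(r"[a-z]+", biography.lower())
    counts = Counter()
    for word in words:
        for group, pronouns in PRONOUN_GROUPS.items():
            if word in pronouns:
                counts[group] += 1
    return counts


def most_common_group(biography):
    """Return the most-used pronoun group, or None if absent or tied."""
    ranked = count_pronouns(biography).most_common(2)
    if not ranked:
        return None
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


# Made-up biography text.
bio = "She chairs a research group. Her work focuses on protocols."
print(count_pronouns(bio))
print(most_common_group(bio))
```

One caveat with this approach: "they" and "their" often refer to groups or
organizations in a biography, so those counts are noisier than the others.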

I'll step forward and say my view of this, which is: in no way, shape, or
form are we doing some sort of fundamental injustice or wrong by having an
imperfect solution to what is an inherently challenging engineering
problem. I'm not interested in moralizing on this topic. I would very much
like to improve the accuracy of the results, as far as that is possible,
and to get over some minimum ethical hurdles.

- S

