[Bigbang-dev] Gender diversity and draft productivity

Nick Doty npdoty at ischool.berkeley.edu
Sat Jul 11 00:37:30 CEST 2020


Sorry to be catching up on this thread late, but it’s a topic of great interest to me!

> On Jul 10, 2020, at 12:09 PM, Sebastian Benthall <sbenthall at gmail.com> wrote:
> 
> Thanks Gurshabad. This is very helpful.
> 
> I've done a deeper dive into the gender-detector package, and have a better sense now of what it's doing.
> 
> I've also realized that there was a bug in my code, and that this was part of misgendering Gurshabad. It is now saying "Gurshabad" is of "unknown" gender.

+1, that did sound like a bug to me, as the gender-detector package is both pretty conservative and dependent on imported birth records.

> Agree. Does it make sense to make this difference explicit, even if it's
> in the same category? eg. "Non-binary or could not be determined"
> 
> This is a good idea.
> Given our current methods, we have no way of determining if somebody considers themselves non-binary.
> So these people will always be of "unknown" gender, from the perspective of our research.
> I see that as good to flag.
> 
> An issue that has not yet been settled is how we are measuring "diversity", and how that measurement should reflect our uncertainty and the possibility of more than two represented gender categories.

So far I haven’t tried to capture or record people with non-binary genders, both because gender-detector and similar libraries can’t easily estimate it and because of the ethical concern that doing so could out or identify people. In general, my research has tried to estimate the gender breakdown of populations, not to record and publish individual people’s genders, both to avoid misgendering individuals and to avoid the privacy risk of disclosing someone’s gender.
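
As a rough illustration of the "populations, not individuals" approach, here is a minimal sketch that keeps only aggregate fractions and never stores a per-person label. The function name `gender_breakdown`, the `guess` callable, and the label strings are assumptions for the example, not BigBang's or gender-detector's actual API.

```python
from collections import Counter
from typing import Callable, Dict, Iterable


def gender_breakdown(
    names: Iterable[str], guess: Callable[[str], str]
) -> Dict[str, float]:
    """Return the fraction of names per guessed label.

    Only aggregate counts are kept; no name-to-label mapping is stored
    or returned, so nothing per-person can be published by accident.
    """
    counts = Counter(guess(name) for name in names)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Hypothetical stand-in for a real name-based guesser.
_STUB = {"alice": "women", "bob": "men"}


def stub_guess(name: str) -> str:
    return _STUB.get(name.lower(), "unknown")
```

Anyone the guesser cannot classify, including anyone who is non-binary, lands in "unknown", so that category should be reported alongside the other two rather than dropped.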

> Non-authoritative as well, but fwiw, in agreement with Juliana that
> 'man' and 'woman' are probably better to use here. Maybe someone can
> also comment on whether 'masculine'/'feminine' also work for this? (The
> advantage I see with this descriptor is that the results then clearly
> remark on names and not people, but there may be other problems with
> this terminology that I'm not aware of. Apologies in advance if this
> suggestion seems misguided; happy to learn.)
> 
> This sounds very sensible to me.

Thanks for raising that. I think I was using “male” and “female” in some graphs, and I agree that “men” and “women” is more apt terminology for gender identity.

> One counterpoint though is that, digging more into the gender-detector module, it looks like it's not using data about whether or not a name is historically or linguistically masculine or feminine.
> 
> Rather it has count data for each country: the number of "male" and "female" (its labels) that have that name in each country. (I'm not sure how this data was created. One of the people involved in that project, Nathan Mathias, is now a professor at Cornell and would probably weigh in if we asked him to.)
> 
> The gender guess is then based on whether or not the preponderance of uses of the name apply to "male" or "female" people. There's a confidence cutoff that's actually quite strict; anything below this confidence rate gets an "unknown" response.

Yes, these libraries import datasets that I believe come from local governments, which record names and recorded genders at birth. As you note, the cut-off requires quite high confidence: there must be enough recorded instances of the name, and those instances must skew overwhelmingly toward one recorded gender.
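
To make the cutoff logic concrete, here is a minimal sketch. It is not the gender-detector code itself: the table layout, the `MIN_TOTAL` and `MIN_SHARE` values, and the `guess` function are assumptions for illustration. Only the "Stephane" counts come from this thread.

```python
from typing import Dict, Tuple

# Hypothetical per-country tables: name -> (male_count, female_count).
# The "stephane" rows use the counts quoted in this thread.
COUNTS: Dict[str, Dict[str, Tuple[int, int]]] = {
    "us": {"stephane": (655, 1128)},
    "uk": {"stephane": (41, 0)},
}

MIN_TOTAL = 20    # assumed minimum number of recorded instances
MIN_SHARE = 0.99  # assumed minimum share for the majority label


def guess(name: str, country: str) -> str:
    """Return "male", "female", or "unknown" for a first name."""
    male, female = COUNTS.get(country, {}).get(name.lower(), (0, 0))
    total = male + female
    # Too few recorded instances: refuse to guess.
    if total < MIN_TOTAL:
        return "unknown"
    # Guess only when one label overwhelmingly dominates.
    if male / total >= MIN_SHARE:
        return "male"
    if female / total >= MIN_SHARE:
        return "female"
    return "unknown"
```

Under these assumed thresholds, "Stephane" is "male" with the UK table and "unknown" with the US table, and a name missing from the table is always "unknown".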

> Not arguing against the theory that a Western bias may exist in the
> dataset, but just stepping in to say that my name is not a good case to
> determine this: like lots of names following a Sikh naming convention, I
> don't think mine is specific for men/women.
> 
> Cool. Good to know! The BigBang code now reflects this.
> 
> Now I think the only names that are currently giving the code trouble are:
>  - "Stéphane Bortzmeyer". The dictionary is in ASCII and includes no accents. In the US dictionary, "Stephane" has a 655/1128 male/female count. In the UK dictionary, it has a 41/0 male/female count, and is considered "male". This actually accords with my intuition--without looking him up, I (from the US) had assumed Stéphane was a woman. Anyway, an interesting regional difference.
>  - "=?utf-8?q?St=C3=A9phane_Couture?=" who is "unknown"

Yes, I still see errors, and most often with names that in the US are strongly gendered but in other countries may not be gendered or may have a different gender balance. Those are cases where the US/Western focus also leads to incorrect data. But those instances have been rare when I’ve done manual checks with groups of people I know; more often the gender-detector library is recording genders as unknown.
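
For the two "Stéphane" cases specifically, one possible pre-processing step is to decode the RFC 2047 encoded-word from the raw header and strip accents before the dictionary lookup. A minimal sketch using only the Python standard library follows; the helper names `normalize_name` and `first_name` are hypothetical, and I am not claiming this is what BigBang currently does.

```python
import unicodedata
from email.header import decode_header, make_header


def normalize_name(raw: str) -> str:
    """Decode a raw From-header name and reduce it to unaccented text."""
    # Decode RFC 2047 encoded-words such as "=?utf-8?q?...?=".
    decoded = str(make_header(decode_header(raw)))
    # Decompose accented characters, then drop the combining marks,
    # so that "é" becomes "e" and matches an ASCII-only dictionary.
    return "".join(
        ch
        for ch in unicodedata.normalize("NFKD", decoded)
        if not unicodedata.combining(ch)
    )


def first_name(raw: str) -> str:
    """Return the first whitespace-separated token of the cleaned name."""
    return normalize_name(raw).split()[0]
```

Stripping accents is lossy and could merge names that are distinct in their own language, so it only makes sense as long as the dictionaries themselves are ASCII-only.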

> My conclusion is that while there's a fairly high error rate, the gender-detector module is good enough as is to proceed with. The errors should iron out as it's used at larger scale.
> 
> The next step is to get a sense of gendered mailing list participation change over time, which I believe has not been done yet.

I’d be interested in that. I have not looked at estimates of gender participation over time. I have compared different mailing lists/working groups, which seemed of interest. Some rough initial work in the graphs attached.


> <image.png>
> 

I think it would be better to use this method to look at the mailing list traffic by gender rather than the document authors: since there’s a small number of document editors, that’s something that could more easily be tagged by hand with higher precision.

I believe Jari was providing statistics on gender of RFC authors which used (at least in part) a manual list. He wouldn’t make that list public as a privacy matter, but it could be something he would be willing to share with researchers as long as we also kept it private.

> On the whole, this has been very helpful. Thanks to both Juliana and Gurshabad.
> 
> I hope this effort contributes towards some publishable research down the line. I anticipate that:
>  - The substance of this discussion is going to be critical to include in a Methods section of any research paper
>  - Depending on how deep we wind up going into it, an audit of the gender detection module and what we augment it with, the design process around it, etc., might be a publishable piece in its own right.

Yes, I found the methods and caveats about them to be the most detailed part of working/writing on this topic. In the draft I’d put together so far, I started with all the limitations of the method, and then tried to explain why it still might be useful to look at these estimates. I’m still cautious about publishing that because I don’t know how much we can look past those limitations and whether any harm can be done by publishing estimates, but I’d be interested to hear other perspectives. 

Maybe it would be best to work on a paper together that could include multiple reviews and perspectives.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gender-fraction-bars-20170718.png
Type: image/png
Size: 49686 bytes
Desc: not available
URL: <http://lists.ghserv.net/pipermail/bigbang-dev/attachments/20200710/1b08663d/attachment-0002.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gender-pies-20170730.png
Type: image/png
Size: 2872434 bytes
Desc: not available
URL: <http://lists.ghserv.net/pipermail/bigbang-dev/attachments/20200710/1b08663d/attachment-0003.png>

