<div dir="ltr">Hmmm.<div><br></div><div>Web mail archives of the http list at <a href="https://ietf.org/mail-archive/text/http/">https://ietf.org/mail-archive/text/http/</a> only go up to 2012.</div><div>Does that make sense to you?</div><div><br></div><div>It looks like there are several DNS working groups. Is there one in particular you think would be worth looking at?</div><div><br></div><div>Generalizing the code so that it can loop through many groups and compute results is the next step towards confirmation. It's probably worth looking at a couple of other concrete and well-understood examples before doing the big analysis, though.</div><div><br></div><div>- S</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Aug 26, 2020 at 1:52 PM Niels ten Oever &lt;<a href="mailto:mail@nielstenoever.net">mail@nielstenoever.net</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div dir="auto">Very interesting. I'd say the number of drafts and authors in hrpc is too low to make a statement about this though. Could we do this for the HTTP and/or DNS WGs?</div>
<div class="gmail_quote">On Aug 26, 2020, at 19:30, Sebastian Benthall &lt;<a href="mailto:sbenthall@gmail.com" target="_blank">sbenthall@gmail.com</a>&gt; wrote:<blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">Hello,<div><br></div><div>I'm revisiting the question of whether mailing list gender diversity and draft productivity of working groups are correlated.</div><div><br></div><div>Putting aside for now all the methodological complications, here is how I am operationalizing the question:</div><div><ul><li>I'm looking specifically at the HRPC working group, with this data:<br><div><img alt="image.png" width="418" height="221"><br></div></li><li><div>Gender is being detected based on first-name birth records. "unknown" is used for names that the current data set cannot classify as either men or women.</div></li><li>I'm measuring "diversity" on any day as: (women's activity + unknown's activity) / (men's activity). I chose this because it is probably close to what most people mean by diversity. (Recall that non-Western names are more likely to be categorized as "unknown".)<br></li><li>I'm using a 100-day rolling average on the activity counts.</li></ul><div>This is the matrix of Pearson correlations between each of these values:</div></div><div><br></div><div><table border="1"><thead><tr style="text-align:right"><th></th>
<th>women</th>
<th>unknown</th>
<th>men</th>
<th>drafts</th>
<th>diversity</th>
</tr>
</thead>
<tbody>
<tr>
<th>women</th>
<td><font color="#0000ff">1.000000</font></td>
<td><font color="#0000ff">0.910922</font></td>
<td><font color="#0000ff">0.804869</font></td>
<td>0.008890</td>
<td>0.160833</td>
</tr>
<tr>
<th>unknown</th>
<td><font color="#0000ff">0.910922</font></td>
<td><font color="#0000ff">1.000000</font></td>
<td><font color="#0000ff">0.808168</font></td>
<td>0.027502</td>
<td>0.245059</td>
</tr>
<tr>
<th>men</th>
<td><font color="#0000ff">0.804869</font></td>
<td><font color="#0000ff">0.808168</font></td>
<td><font color="#0000ff">1.000000</font></td>
<td>0.015406</td>
<td>-0.141915</td>
</tr>
<tr>
<th>drafts</th>
<td><font color="#cc0000">0.008890</font></td>
<td><font color="#cc0000">0.027502</font></td>
<td><font color="#cc0000">0.015406</font></td>
<td>1.000000</td>
<td><font color="#cc0000">0.061884</font></td>
</tr>
<tr>
<th>diversity</th>
<td><font color="#674ea7">0.160833</font></td>
<td><font color="#674ea7">0.245059</font></td>
<td><font color="#674ea7">-0.141915</font></td>
<td>0.061884</td>
<td>1.000000<br></td></tr></tbody></table><br>Things to note:</div><div><ul><li><font color="#0000ff">The activity of each gender category is strongly correlated with the activity of the others (r of 0.80 or higher).</font></li><li><font color="#674ea7">Diversity is anticorrelated with men's activity (r = -0.14). This is expected based on how it was defined, and a good sanity check.</font></li><li><font color="#cc0000">Draft output is MORE correlated with diversity (r = 0.06) than it is with the activity of any individual gender category (r at most 0.03)!</font></li></ul><div><font color="#000000">This last point is quite nice. It resonates with the work of Scott Page on the value of diversity to collective intelligence, for example.</font></div><div><font color="#000000"><br></font></div><div><font color="#000000">These numbers are a bit hard to interpret. How much should we trust them? These are the <i>p</i>-values associated with each correlation:</font></div><div><table border="1"><thead><tr style="text-align:right"><th></th>
<th>women</th>
<th>unknown</th>
<th>men</th>
<th>drafts</th>
<th>diversity</th>
</tr>
</thead>
<tbody>
<tr>
<th>women</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#cccccc">0.6925</font></td>
<td>0</td>
</tr>
<tr>
<th>unknown</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#cccccc">0.221</font></td>
<td>0</td>
</tr>
<tr>
<th>men</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#cccccc">0.493</font></td>
<td>0</td>
</tr>
<tr>
<th>drafts</th>
<td><font color="#cccccc">0.6925</font></td>
<td><font color="#cccccc">0.221</font></td>
<td><font color="#cccccc">0.493</font></td>
<td>0</td>
<td><font color="#ff0000">0.0059</font></td>
</tr>
<tr>
<th>diversity</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#ff0000">0.0059</font></td>
<td>0</td></tr></tbody></table></div><br></div><div>Conventionally, <i>p</i>-values below .05 are considered "statistically significant", i.e. publishable, and .01 is a stricter threshold.</div><div>At <i>p</i> = 0.0059, this correlation between diversity and draft output clears both!!</div><div><br></div><div><font color="#0000ff">So the verdict is: for HRPC, YES, gender diversity is correlated with draft output.</font></div><div><font color="#0000ff"><br></font></div><div><font color="#000000">This result is robust to transformations of the activity scores into the log space, which is comforting.</font></div><div><span style="color:rgb(0,0,0)">Further work is needed to see if this result is robust across other IETF working groups.</span></div><div><span style="color:rgb(0,0,0)"><br></span></div><div><font color="#000000">Nick, what would you say to including a result like this in the paper about IETF and gender?</font></div><div><font color="#000000"><br></font></div><div><font color="#000000">Cheers,<br>Seb</font></div><div><br></div></div>
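<div>The operationalization described above (a ratio-based diversity score, a trailing rolling average, and Pearson correlations) can be sketched in plain Python. This is only an illustration, not the code behind the tables: the function names and the toy counts are hypothetical, and it assumes that diversity is computed from the smoothed counts and that days with zero activity by men are left out of the correlation.</div>
<pre>
```python
import math


def rolling_mean(values, window):
    """Trailing mean; the first entries average over the days seen so far."""
    out, total = [], 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]  # drop the day that left the window
        out.append(total / min(i + 1, window))
    return out


def diversity(women, unknown, men):
    """(women + unknown) / men per day; NaN where men's activity is zero."""
    return [(w + u) / m if m else math.nan
            for w, u, m in zip(women, unknown, men)]


def pearson(xs, ys):
    """Pearson r over the days where both series are defined."""
    pairs = [(x, y) for x, y in zip(xs, ys)
             if not (math.isnan(x) or math.isnan(y))]
    n = len(pairs)
    if n == 0:
        return math.nan  # nothing to correlate
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs))
    if sx == 0 or sy == 0:
        return math.nan  # undefined for a constant series
    return cov / (sx * sy)


# Hypothetical daily counts, for illustration only (not HRPC data).
women = [1, 0, 2, 1, 3, 0, 2, 1]
unknown = [0, 1, 1, 0, 2, 1, 0, 1]
men = [4, 3, 5, 2, 6, 3, 4, 5]
drafts = [0, 1, 0, 0, 2, 0, 1, 0]

WINDOW = 3  # the email uses 100 days; shortened to fit the toy data
smooth = {name: rolling_mean(series, WINDOW)
          for name, series in [("women", women), ("unknown", unknown),
                               ("men", men), ("drafts", drafts)]}
smooth["diversity"] = diversity(smooth["women"], smooth["unknown"],
                                smooth["men"])

# One column of the correlation matrix: each series against draft output.
for name in ("women", "unknown", "men", "diversity"):
    print(name, round(pearson(smooth[name], smooth["drafts"]), 3))
```
</pre>
<div>For the <i>p</i>-values, an analytic test such as <code>scipy.stats.pearsonr</code> returns both the coefficient and a two-sided <i>p</i>-value. Whether the tables above were produced that way is not stated in this thread.</div>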
<pre><hr><br>Bigbang-dev mailing list<br><a href="mailto:Bigbang-dev@data-activism.net" target="_blank">Bigbang-dev@data-activism.net</a><br><a href="https://lists.ghserv.net/mailman/listinfo/bigbang-dev" target="_blank">https://lists.ghserv.net/mailman/listinfo/bigbang-dev</a><br></pre></blockquote></div></div></blockquote></div>