<div dir="ltr">Ok. Please stand by....<div><br></div><div>It seems like the datatracking library, when used to crawl for a large number of drafts, pulls an index and then makes calls to the datatracker web API for the draft metadata.</div><div><br></div><div>So I've had to write a new data collection script, similar to the script we use for scraping the mailing lists, to get the draft data. It's a slower process. But I should be able to compute these results once I have them downloaded locally.</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, Aug 26, 2020 at 4:09 PM Joey S <<a href="mailto:joeysalazar@article19.org">joeysalazar@article19.org</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
+1 to dnsop, their drafts are also quite numerous and with a very
active mailing list.<br>
<pre cols="72">--
Joey</pre>
<div>On 26-Aug-20 1:25 PM, Niels ten Oever
wrote:<br>
</div>
<blockquote type="cite">
<div dir="auto">Httpbis is the one you're looking for :)<br>
<br>
</div>
<div dir="auto">DNSops is also a nice big one.<br>
<br>
</div>
<div dir="auto">Cheers,<br>
<br>
</div>
<div dir="auto">Niels</div>
<div class="gmail_quote">On Aug 26, 2020, at 21:17, Sebastian
Benthall <<a href="mailto:sbenthall@gmail.com" target="_blank">sbenthall@gmail.com</a>>
wrote:
<blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr"> Hmmm.
<div> <br>
</div>
<div> Web mail archives of the http list at <a href="https://ietf.org/mail-archive/text/http/" target="_blank">https://ietf.org/mail-archive/text/http/</a>
only go up to 2012. </div>
<div> Does that make sense to you? </div>
<div> <br>
</div>
<div> It looks like there are several DNS working groups.
Any one in particular you think would be worth looking at?
</div>
<div> <br>
</div>
<div> Genericizing the code so that it can loop through many
groups and compute results is the next step towards
confirmation. Probably worth looking at a couple other
concrete and well-understood examples before doing the big
analysis though. </div>
<div> <br>
</div>
<div> - S </div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr"> On Wed, Aug 26, 2020 at
1:52 PM Niels ten Oever < <a href="mailto:mail@nielstenoever.net" target="_blank">mail@nielstenoever.net</a>>
wrote: <br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<div dir="auto"> Very interesting. I'd say the number of
drafts and authors in hrpc is too low to make a
statement about this though. Could we do this for the
HTTP and/or DNS WGs ? </div>
<div class="gmail_quote"> On Aug 26, 2020, at 19:30,
Sebastian Benthall < <a href="mailto:sbenthall@gmail.com" target="_blank">sbenthall@gmail.com</a>>
wrote:
<blockquote class="gmail_quote" style="margin:0pt 0pt 0pt 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr"> Hello,
<div> <br>
</div>
<div> I'm revisiting the question of whether
mailing list gender diversity and draft
productivity of working groups are correlated. </div>
<div> <br>
</div>
<div> Putting aside for now all the methodological
complications, here is how I am operationalizing
the question: </div>
<div>
<ul>
<li>I'm looking specifically at the HRPC
working group, with this data:<br>
<div> <img alt="image.png" width="418" height="221"> <br>
</div>
</li>
<li>
<div> Gender is being detected based on
first name birth records. "unknown" is
used for cases that cannot with the
current data set be determined as either
men or women. </div>
</li>
<li>I'm measuring "diversity" on any day as:
(women's activity + unknown's activity) /
(men's activity). Because, you know, this is
probably close to what most people mean
by diversity. (Recall that non-Western
names are more likely to be categorized as
"unknown".)<br>
</li>
<li>I'm using a 100 day rolling average on the
activity counts.</li>
</ul>
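<div> A minimal sketch of the smoothing and the ratio described above, in plain Python. The function names, the trailing (rather than centered) window, averaging over fewer days at the start of the series, and returning None on days with no men's activity are all assumptions for illustration, not necessarily what the actual analysis does: </div>
<pre>
```python
from collections import deque


def rolling_mean(values, window=100):
    """Trailing rolling mean of daily counts.

    Assumption: the first days average over however many days have been
    seen so far, instead of being dropped.
    """
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # this value is evicted by the append below
        buf.append(v)
        total += v
        out.append(total / len(buf))
    return out


def diversity(women, unknown, men):
    """(women + unknown) / men for each day.

    Assumption: days with zero men's activity yield None rather than
    raising ZeroDivisionError.
    """
    return [
        (w + u) / m if m != 0 else None
        for w, u, m in zip(women, unknown, men)
    ]
```
</pre>
<div> One would presumably smooth each activity series with rolling_mean first and then pass the smoothed series to diversity, though the order of those two steps is itself an assumption here. </div>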
<div> This is the matrix of Pearson correlations
between each of these values: </div>
</div>
<div> <br>
</div>
<div>
<table border="1">
<thead> <tr style="text-align:right">
<th><br>
</th>
<th>women</th>
<th>unknown</th>
<th>men</th>
<th>drafts</th>
<th>diversity</th>
</tr>
</thead> <tbody>
<tr>
<th>women</th>
<td><font color="#0000ff">1.000000</font></td>
<td><font color="#0000ff">0.910922</font></td>
<td><font color="#0000ff">0.804869</font></td>
<td>0.008890</td>
<td>0.160833</td>
</tr>
<tr>
<th>unknown</th>
<td><font color="#0000ff">0.910922</font></td>
<td><font color="#0000ff">1.000000</font></td>
<td><font color="#0000ff">0.808168</font></td>
<td>0.027502</td>
<td>0.245059</td>
</tr>
<tr>
<th>men</th>
<td><font color="#0000ff">0.804869</font></td>
<td><font color="#0000ff">0.808168</font></td>
<td><font color="#0000ff">1.000000</font></td>
<td>0.015406</td>
<td>-0.141915</td>
</tr>
<tr>
<th>drafts</th>
<td><font color="#cc0000">0.008890</font></td>
<td><font color="#cc0000">0.027502</font></td>
<td><font color="#cc0000">0.015406</font></td>
<td>1.000000</td>
<td><font color="#cc0000">0.061884</font></td>
</tr>
<tr>
<th>diversity</th>
<td><font color="#674ea7">0.160833</font></td>
<td><font color="#674ea7">0.245059</font></td>
<td><font color="#674ea7">-0.141915</font></td>
<td>0.061884</td>
<td>1.000000<br>
</td>
</tr>
</tbody>
</table>
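<div> For reference, a matrix like the one above could be computed along these lines. This is a sketch in plain Python with hypothetical function names; in practice pandas' DataFrame.corr() gives the same Pearson coefficients: </div>
<pre>
```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences.

    Raises ZeroDivisionError if either sequence is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(series):
    """Map each pair of named series to its Pearson correlation.

    `series` is a dict such as {"women": [...], "men": [...], ...}.
    """
    names = list(series)
    return {
        a: {b: pearson(series[a], series[b]) for b in names}
        for a in names
    }
```
</pre>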
<br>
Things to note: </div>
<div>
<ul>
<li><font color="#0000ff">The activity of each
gender is correlated with the activity of
other genders.</font></li>
<li><font color="#674ea7">Diversity is
anticorrelated with the number of men.
This is expected based on how it was
defined, and a good sanity check.</font></li>
<li><font color="#cc0000">Draft output is MORE
correlated with diversity than it is with
any individual gender!</font></li>
</ul>
<div> <font color="#000000">This last point is
quite nice. It resonates with the work of
Scott Page on the value of diversity to
collective intelligence, for example.</font>
</div>
<div> <font color="#000000"><br>
</font> </div>
<div> <font color="#000000">These numbers are a
bit hard to interpret. How much should we
trust them? These are the <i>p</i>-values
associated with each correlation:</font> </div>
<div>
<table border="1">
<thead> <tr style="text-align:right">
<th><br>
</th>
<th>women</th>
<th>unknown</th>
<th>men</th>
<th>drafts</th>
<th>diversity</th>
</tr>
</thead> <tbody>
<tr>
<th>women</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#cccccc">0.6925</font></td>
<td>0</td>
</tr>
<tr>
<th>unknown</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#cccccc">0.221</font></td>
<td>0</td>
</tr>
<tr>
<th>men</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#cccccc">0.493</font></td>
<td>0</td>
</tr>
<tr>
<th>drafts</th>
<td><font color="#cccccc">0.6925</font></td>
<td><font color="#cccccc">0.221</font></td>
<td><font color="#cccccc">0.493</font></td>
<td>0</td>
<td><font color="#ff0000">0.0059</font></td>
</tr>
<tr>
<th>diversity</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#ff0000">0.0059</font></td>
<td>0</td>
</tr>
</tbody>
</table>
</div>
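<div> As an illustration of what such a <i>p</i>-value measures, here is a sketch of a two-sided permutation test. Note this is a swapped-in technique, not the analytic <i>p</i>-value reported in the table (scipy.stats.pearsonr returns that kind directly). The function names, the permutation count, and the seed are assumptions. The test also treats days as exchangeable, which rolling-averaged series are not, so read it as a rough check only: </div>
<pre>
```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def permutation_p_value(x, y, n_permutations=2000, seed=0):
    """Estimate how often shuffled data correlates as strongly as observed.

    Shuffles y repeatedly and counts the shuffles whose absolute
    correlation with x is at least the observed one. The +1 terms keep
    the estimate strictly above zero.
    """
    rng = random.Random(seed)
    observed = abs(pearson(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson(x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)
```
</pre>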
<br>
</div>
<div> Generally, <i>p</i>-values below .05 are
considered "statistically significant", i.e.
publishable; .01 is a stricter cutoff. </div>
<div> This correlation between diversity and draft
output makes the cut!! </div>
<div> <br>
</div>
<div> <font color="#0000ff">So the verdict is:
for HRPC, YES, gender diversity is correlated
with draft output.</font> </div>
<div> <font color="#0000ff"><br>
</font> </div>
<div> <font color="#000000">This result is robust
to transformations of the activity scores into
the log space, which is comforting.</font> </div>
<div> <span style="color:rgb(0,0,0)">Further work
is needed to see if this result is robust
across other IETF working groups.</span> </div>
<div> <span style="color:rgb(0,0,0)"><br>
</span> </div>
<div> <font color="#000000">Nick, what would you
say to including a result like this in the
paper about IETF and gender?</font> </div>
<div> <font color="#000000"><br>
</font> </div>
<div> <font color="#000000">Cheers,<br>
Seb</font> </div>
<div> <br>
</div>
</div>
<pre> <hr>
Bigbang-dev mailing list
<a href="mailto:Bigbang-dev@data-activism.net" target="_blank">Bigbang-dev@data-activism.net</a>
<a href="https://lists.ghserv.net/mailman/listinfo/bigbang-dev" target="_blank">https://lists.ghserv.net/mailman/listinfo/bigbang-dev</a>
</pre>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
<br>
</blockquote>
<br>
</div>
</blockquote></div>