<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
Oh, that's interesting and unexpected! Thank you for sharing that
with us. How easily could non-technical or non-admin people do this
when trying to use the tool for something similar?<br>
<pre class="moz-signature" cols="72">--
Joey Salazar
Digital Sr. Programme Officer
ARTICLE 19
6E9C 95E5 5BED 9413 5D08 55D5 0A40 4136 0DF0 1A91</pre>
<div class="moz-cite-prefix">On 27-Aug-20 5:16 PM, Sebastian
Benthall wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAEYE9Oeq746jsdgX6aRdxzPyv7Mb=_c2LDHVub9Xq=mBam=oJw@mail.gmail.com">
<meta http-equiv="content-type" content="text/html; charset=UTF-8">
<div dir="ltr">Ok. Please stand by....
<div><br>
</div>
<div>It seems like the datatracking library, when used to crawl
a large number of drafts, pulls an index and then makes
calls to the datatracker web API for the draft
metadata.</div>
<div><br>
</div>
<div>So I've had to write a new data collection script, similar
to the script we use for scraping the mailing lists, to get
the draft data. It's a slower process. But I should be able to
compute these results once I have them downloaded locally.</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Aug 26, 2020 at 4:09
PM Joey S &lt;<a href="mailto:joeysalazar@article19.org"
moz-do-not-send="true">joeysalazar@article19.org</a>&gt;
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div> +1 to dnsop, their drafts are also quite numerous and
with a very active mailing list.<br>
<pre cols="72">--
Joey</pre>
<div>On 26-Aug-20 1:25 PM, Niels ten Oever wrote:<br>
</div>
<blockquote type="cite">
<div dir="auto">Httpbis is the one you're looking for :)<br>
<br>
</div>
<div dir="auto">DNSops is also a nice big one.<br>
<br>
</div>
<div dir="auto">Cheers,<br>
<br>
</div>
<div dir="auto">Niels</div>
<div class="gmail_quote">On Aug 26, 2020, at 21:17,
Sebastian Benthall &lt;<a
href="mailto:sbenthall@gmail.com" target="_blank"
moz-do-not-send="true">sbenthall@gmail.com</a>&gt;
wrote:
<blockquote class="gmail_quote" style="margin:0pt 0pt
0pt 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div dir="ltr"> Hmmm.
<div> <br>
</div>
<div> Web mail archives of the http list at <a
href="https://ietf.org/mail-archive/text/http/"
target="_blank" moz-do-not-send="true">https://ietf.org/mail-archive/text/http/</a>
only go up to 2012. </div>
<div> Does that make sense to you? </div>
<div> <br>
</div>
<div> It looks like there are several DNS working
groups. Any one in particular you think would be
worth looking at? </div>
<div> <br>
</div>
<div> Genericizing the code so that it can loop
through many groups and compute results is the
next step towards confirmation. Probably worth
looking at a couple other concrete and
well-understood examples before doing the big
analysis though. </div>
<div> <br>
</div>
<div> - S </div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr"> On Wed, Aug 26,
2020 at 1:52 PM Niels ten Oever &lt;<a
href="mailto:mail@nielstenoever.net"
target="_blank" moz-do-not-send="true">mail@nielstenoever.net</a>&gt;
wrote: <br>
</div>
<blockquote class="gmail_quote" style="margin:0px
0px 0px 0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div>
<div dir="auto"> Very interesting. I'd say the
number of drafts and authors in hrpc is too
low to make a statement about this though.
Could we do this for the HTTP and/or DNS WGs?
</div>
<div class="gmail_quote"> On Aug 26, 2020, at
19:30, Sebastian Benthall &lt;<a
href="mailto:sbenthall@gmail.com"
target="_blank" moz-do-not-send="true">sbenthall@gmail.com</a>&gt;
wrote:
<blockquote class="gmail_quote"
style="margin:0pt 0pt 0pt
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div dir="ltr"> Hello,
<div> <br>
</div>
<div> I'm revisiting the question of
whether mailing list gender diversity
and draft productivity of working groups
are correlated. </div>
<div> <br>
</div>
<div> Putting aside for now all the
methodological complications, here is
how I am operationalizing the question:
</div>
<div>
<ul>
<li>I'm looking specifically at the
HRPC working group, with this data:<br>
<div> <img alt="image.png"
moz-do-not-send="true"
width="418" height="221"> <br>
</div>
</li>
<li>
<div> Gender is being detected from first
names, based on birth records.
"unknown" is used for names that
cannot be classified as either men or
women with the current data
set. </div>
</li>
<li>I'm measuring "diversity" on any
day as: (women's activity +
unknown's activity) / (men's
activity). I chose this because it
is probably close to what most
people mean by diversity.
(Recall that non-Western names are
more likely to be categorized as
"unknown".)<br>
</li>
<li>I'm using a 100-day rolling
average on the activity counts.</li>
</ul>
<div> This is the matrix of Pearson
correlations between each of these
values: </div>
</div>
<div> <br>
</div>
<div>
<table border="1">
<thead> <tr style="text-align:right">
<th><br>
</th>
<th>women</th>
<th>unknown</th>
<th>men</th>
<th>drafts</th>
<th>diversity</th>
</tr>
</thead> <tbody>
<tr>
<th>women</th>
<td><font color="#0000ff">1.000000</font></td>
<td><font color="#0000ff">0.910922</font></td>
<td><font color="#0000ff">0.804869</font></td>
<td>0.008890</td>
<td>0.160833</td>
</tr>
<tr>
<th>unknown</th>
<td><font color="#0000ff">0.910922</font></td>
<td><font color="#0000ff">1.000000</font></td>
<td><font color="#0000ff">0.808168</font></td>
<td>0.027502</td>
<td>0.245059</td>
</tr>
<tr>
<th>men</th>
<td><font color="#0000ff">0.804869</font></td>
<td><font color="#0000ff">0.808168</font></td>
<td><font color="#0000ff">1.000000</font></td>
<td>0.015406</td>
<td>-0.141915</td>
</tr>
<tr>
<th>drafts</th>
<td><font color="#cc0000">0.008890</font></td>
<td><font color="#cc0000">0.027502</font></td>
<td><font color="#cc0000">0.015406</font></td>
<td>1.000000</td>
<td><font color="#cc0000">0.061884</font></td>
</tr>
<tr>
<th>diversity</th>
<td><font color="#674ea7">0.160833</font></td>
<td><font color="#674ea7">0.245059</font></td>
<td><font color="#674ea7">-0.141915</font></td>
<td>0.061884</td>
<td>1.000000<br>
</td>
</tr>
</tbody>
</table>
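<div> A minimal sketch of how the diversity measure, the rolling average,
and the Pearson correlations described above could be computed. It uses only
the Python standard library; the function names, the toy counts, and the
choice to smooth the counts before taking the ratio are assumptions made for
illustration, not the actual analysis code. </div>
<pre>
```python
from collections import deque
from math import sqrt


def rolling_mean(values, window):
    """Trailing mean over the last `window` days (fewer at the start)."""
    out, buf, total = [], deque(), 0.0
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            total -= buf.popleft()
        out.append(total / len(buf))
    return out


def diversity(women, unknown, men):
    """(women + unknown) / men per day; None where men's activity is zero."""
    return [(w + u) / m if m else None for w, u, m in zip(women, unknown, men)]


def pearson(xs, ys):
    """Pearson correlation, skipping days where either value is None."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = sqrt(sum((y - my) ** 2 for _, y in pairs))
    return cov / (sx * sy)


if __name__ == "__main__":
    # Made-up daily counts, for illustration only (not HRPC data).
    women = [1, 0, 2, 1, 3, 2]
    unknown = [0, 1, 1, 0, 2, 1]
    men = [4, 3, 5, 2, 6, 4]
    drafts = [0, 0, 1, 0, 2, 1]
    window = 3  # the analysis in this thread uses 100
    w, u, m, d = (rolling_mean(s, window) for s in (women, unknown, men, drafts))
    print(round(pearson(diversity(w, u, m), d), 3))
```
</pre>
<div> Days on which men's activity is zero are left out of the correlation
in this sketch, since the ratio is undefined there. </div>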
<br>
Things to note: </div>
<div>
<ul>
<li><font color="#0000ff">The activity
of each gender is correlated with
the activity of other genders.</font></li>
<li><font color="#674ea7">Diversity is
anticorrelated with the number of
men. This is expected based on how
it was defined, and a good sanity
check.</font></li>
<li><font color="#cc0000">Draft output
is MORE correlated with diversity
than it is with any individual
gender!</font></li>
</ul>
<div> <font color="#000000">This last
point is quite nice. It resonates
with the work of Scott Page on the
value of diversity to collective
intelligence, for example.</font> </div>
<div> <font color="#000000"><br>
</font> </div>
<div> <font color="#000000">These
numbers are a bit hard to interpret.
How much should we trust them? These
are the <i>p</i>-values associated
with each correlation:</font> </div>
<div>
<table border="1">
<thead> <tr
style="text-align:right">
<th><br>
</th>
<th>women</th>
<th>unknown</th>
<th>men</th>
<th>drafts</th>
<th>diversity</th>
</tr>
</thead> <tbody>
<tr>
<th>women</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#cccccc">0.6925</font></td>
<td>0</td>
</tr>
<tr>
<th>unknown</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#cccccc">0.221</font></td>
<td>0</td>
</tr>
<tr>
<th>men</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#cccccc">0.493</font></td>
<td>0</td>
</tr>
<tr>
<th>drafts</th>
<td><font color="#cccccc">0.6925</font></td>
<td><font color="#cccccc">0.221</font></td>
<td><font color="#cccccc">0.493</font></td>
<td>0</td>
<td><font color="#ff0000">0.0059</font></td>
</tr>
<tr>
<th>diversity</th>
<td>0</td>
<td>0</td>
<td>0</td>
<td><font color="#ff0000">0.0059</font></td>
<td>0</td>
</tr>
</tbody>
</table>
</div>
<br>
</div>
<div> Generally, <i>p</i>-values below
.01 (a stricter cutoff than the more common .05) are considered
"statistically significant", i.e. publishable. </div>
<div> This correlation between diversity
and draft output makes the cut!! </div>
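<div> One hedged way to sanity-check a <i>p</i>-value like the one above
without a statistics package is a permutation test. This is a swapped-in
technique, not necessarily how the table was produced; the function names
and the default number of trials are assumptions for illustration. </div>
<pre>
```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def permutation_p_value(xs, ys, trials=2000, seed=0):
    """Two-sided p-value: share of shuffles with |r| at least the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (trials + 1)
```
</pre>
<div> Note that rolling averages make neighbouring days dependent on each
other, so a shuffle-based (or textbook) <i>p</i>-value computed on smoothed
series will tend to look more significant than it should. </div>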
<div> <br>
</div>
<div> <font color="#0000ff">So the
verdict is: for HRPC, YES, gender
diversity is correlated with draft
output.</font> </div>
<div> <font color="#0000ff"><br>
</font> </div>
<div> <font color="#000000">This result
is robust to transformations of the
activity scores into the log space,
which is comforting.</font> </div>
<div> <span style="color:rgb(0,0,0)">Further
work is needed to see if this result
is robust across other IETF working
groups.</span> </div>
<div> <span style="color:rgb(0,0,0)"><br>
</span> </div>
<div> <font color="#000000">Nick, what
would you say to including a result
like this in the paper about IETF and
gender?</font> </div>
<div> <font color="#000000"><br>
</font> </div>
<div> <font color="#000000">Cheers,<br>
Seb</font> </div>
<div> <br>
</div>
</div>
<pre> <hr>
Bigbang-dev mailing list
<a href="mailto:Bigbang-dev@data-activism.net" target="_blank" moz-do-not-send="true">Bigbang-dev@data-activism.net</a>
<a href="https://lists.ghserv.net/mailman/listinfo/bigbang-dev" target="_blank" moz-do-not-send="true">https://lists.ghserv.net/mailman/listinfo/bigbang-dev</a>
</pre>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote>
</div>
<br>
</blockquote>
<br>
</div>
</blockquote>
</div>
</blockquote>
<br>
</body>
</html>