[Bigbang-dev] IETF list crawling

Nick Doty npdoty at ischool.berkeley.edu
Thu Sep 28 07:11:29 CEST 2017

Hi Niels,

Per the conversation on Gitter, I've been reviewing my logs from the IETF crawls I did at the end of July, and I'm not immediately seeing any Unicode issues preventing downloads. I've attached the log files (which are long! We should maybe try to make these more consistent and informative). The initial log file has some failures, but the run on July 31st seems to have been more successful. These runs didn't include my provenance code, so I can't easily tell you exactly which version of the BigBang code was running.

It's 13 gigabytes of email (!), and I don't think it's quite complete. I'm not sure my list of lists was comprehensive; I've attached that too.

Hope this helps,

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ietf_lists_normalized.txt
URL: <http://lists.ghserv.net/pipermail/bigbang-dev/attachments/20170927/0beb8e1d/attachment-0001.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: collect-20170730.log
Type: application/applefile
Size: 80 bytes
Desc: not available
URL: <http://lists.ghserv.net/pipermail/bigbang-dev/attachments/20170927/0beb8e1d/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: collect-20170731.log
Type: application/octet-stream
Size: 3100313 bytes
Desc: not available
URL: <http://lists.ghserv.net/pipermail/bigbang-dev/attachments/20170927/0beb8e1d/attachment-0001.obj>
