[Bigbang-user] R: Issue with listserv fetching (3GPP)
Christoph Becker
chrbecker01 at gmail.com
Fri Apr 23 00:23:45 CEST 2021
Hi Niels & Riccardo,
the argument 'instant_dump' for the ListservArchive class object does not
exist anymore in the up-to-date 'main' branch of the git repo.
@Niels: Do you mean that you did a 'git pull' and encountered the
TypeError caused by missing 'instant_dump' too?
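To illustrate where this kind of TypeError comes from, here is a minimal sketch. The two classes are hypothetical stand-ins, not BigBang's actual ListservArchive code, and their from_url signatures are assumptions made up for the demonstration. The point is only that a script still passing 'instant_dump' fails as soon as the method no longer accepts that keyword.

```python
class OldArchive:
    """Hypothetical stand-in for an older interface that still accepted instant_dump."""

    @classmethod
    def from_url(cls, name, url, instant_dump=False):
        return cls()


class NewArchive:
    """Hypothetical stand-in for a newer interface without instant_dump."""

    @classmethod
    def from_url(cls, name, url):
        return cls()


def try_call(archive_cls):
    """Call from_url the way an outdated script would and report what happens."""
    try:
        archive_cls.from_url(
            "3GPP", "https://list.etsi.org/scripts/wa.exe?", instant_dump=True
        )
        return "ok"
    except TypeError as err:
        # Python raises TypeError when a call passes a keyword the function does not define.
        return str(err)


old_result = try_call(OldArchive)
new_result = try_call(NewArchive)
print(old_result)
print(new_result)
```

So if the script and the library come from different versions of the repo, this is the error one would expect to see.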
But as I said in another message, 3GPP and IEEE cannot yet be scraped with
the 'conventional' method that BigBang uses for archives such as W3C.
I attached a small example that shows how you can currently scrape the
3GPP archive and save it to mbox files in the CONFIG.mail_path folder.
Be aware that this could take a very long time and use a lot of memory.
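In case the attachment is not readable in the archive, here is a rough sketch of the last step only, writing messages to an mbox file, using Python's standard mailbox module. It does not use BigBang's own classes and is not the attached script. The function name save_to_mbox, the sample message and the temporary folder (standing in for CONFIG.mail_path) are all made up for illustration.

```python
import mailbox
import tempfile
from email.message import EmailMessage
from pathlib import Path


def save_to_mbox(messages, mbox_path):
    """Append each message to the mbox file at mbox_path and return how many it holds."""
    box = mailbox.mbox(str(mbox_path))  # the file is created if it does not exist
    try:
        for msg in messages:
            box.add(msg)
        box.flush()  # write pending changes to disk
        return len(box)
    finally:
        box.close()


# Made-up message standing in for one scraped from a list archive.
msg = EmailMessage()
msg["From"] = "someone@example.org"
msg["To"] = "3GPP_TSG_RAN@example.org"
msg["Subject"] = "Test message"
msg.set_content("Body of the message.\n")

# A temporary folder stands in for CONFIG.mail_path here.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "3GPP_TSG_RAN.mbox"
    stored = save_to_mbox([msg], target)

    # Read the file back to check that the message survived the round trip.
    reread = mailbox.mbox(str(target))
    subjects = [m["Subject"] for m in reread]
    reread.close()
```

Writing one list at a time like this keeps only that list's messages in memory, which may help with the memory use mentioned above.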
Best Wishes,
Christoph
On Thu, 22 Apr 2021 at 17:17, Niels ten Oever <mail at nielstenoever.net> wrote:
> Hi Riccardo and Christoph,
>
> I see there might be an issue with the use of special characters in the
> mailing list URLs. To get it working I had to put a '\' in front of the '?',
> but this could also be fixed by putting " " around the URL. However, after
> that, fetching did not work either, so let's ask Christoph (cc).
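On the quoting point: shells that treat '?' as a wildcard (zsh, for instance) try to expand an unquoted URL containing it, which is why escaping or quoting helps. As a small sketch, Python's standard shlex module can show what a safely quoted command line would look like. The command is the one from Riccardo's message; nothing here is BigBang-specific.

```python
import shlex

# The URL contains '?', which many shells interpret as a glob character.
url = "https://list.etsi.org/scripts/wa.exe?A0=3GPP_TSG_RAN"
command = ["python3", "bin/collect_mail.py", "-u", url]

# shlex.join quotes every argument that contains shell-special characters.
quoted = shlex.join(command)
print(quoted)

# Splitting the quoted line again gives back the original arguments.
round_trip = shlex.split(quoted)
```

This only addresses how the URL reaches the script; it does not fix the 'instant_dump' error itself.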
>
> Cheers,
>
> Niels
>
> On 22-04-2021 17:43, Riccardo Nanni wrote:
> > Hi Niels,
> >
> > thanks for your answer!
> > I did, and I found the changes I can see on GitHub (e.g. the listserv.3GPP.txt file, etc.).
> > I did it again when I saw it didn't work, and it says 'già aggiornato' (already up to date).
> >
> > Riccardo
> >
> > ------------------------------------------------------------------------
> > *From:* Bigbang-user <bigbang-user-bounces at data-activism.net> on behalf of Niels ten Oever <mail at nielstenoever.net>
> > *Sent:* Thursday, 22 April 2021 17:38
> > *To:* bigbang-user at data-activism.net <bigbang-user at data-activism.net>
> > *Subject:* Re: [Bigbang-user] Issue with listserv
> >
> > Hi Riccardo,
> >
> > This is not a very informed response - but did you first do:
> >
> > git pull
> >
> > to ensure that you have the latest version with all the recent changes?
> >
> > Best,
> >
> > Niels
> >
> > On 22-04-2021 17:31, Riccardo Nanni wrote:
> >> Dear all,
> >>
> >> how are you?
> >> I tried to collect email from 3GPP by running these commands:
> >> python bin/collect_mail.py -u https://list.etsi.org/scripts/wa.exe?
> >> python3 bin/collect_mail.py -u https://list.etsi.org/scripts/wa.exe?
> >> AND
> >> python3 bin/collect_mail.py -f examples/url_collections/listserv.3GPP.txt
> >>
> >> Also tried to scrape a specific group's list with the same commands:
> >> https://list.etsi.org/scripts/wa.exe?A0=3GPP_TSG_RAN
> >>
> >> I get the following error:
> >> TypeError: from_url() got an unexpected keyword argument 'instant_dump'
> >>
> >> I don't understand what I'm missing. Can you help me, please?
> >> Thanks a lot in advance! The only similar question I could find on Stack Overflow has no answers...
> >>
> >> Riccardo
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> Bigbang-user mailing list
> >> Bigbang-user at data-activism.net
> >> https://lists.ghserv.net/mailman/listinfo/bigbang-user
> >>
> >
> > --
> > Niels ten Oever, PhD
> > Postdoctoral Researcher - Media Studies Department - University of Amsterdam
> > Research Fellow - Centre for Internet and Human Rights - European University Viadrina
> > Associated Scholar - Centro de Tecnologia e Sociedade - Fundação Getúlio Vargas
> >
> > https://nielstenoever.net - mail at nielstenoever.net - @nielstenoever - +31629051853
> > PGP: 2458 0B70 5C4A FD8A 9488 643A 0ED8 3F3A 468A C8B3
> >
> > Read my latest article on Internet infrastructure governance in New Media & Society here:
> > https://journals.sagepub.com/doi/full/10.1177/1461444820929320
> >
>
--
<><><><><><><><><><><><><><><><>
*Christoph Becker (he/him/his)*
*PhD at the*
*Institute for Data Science and*
*Institute for Computational Cosmology*
*Durham University*
*United Kingdom*
*christovis.github.io* <http://christovis.github.io>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example.py
Type: text/x-python
Size: 403 bytes
Desc: not available
URL: <http://lists.ghserv.net/pipermail/bigbang-user/attachments/20210422/698f204f/attachment-0001.py>