[R] Function gutenberg_download in the gutenbergr package
Jeff Newmiller
jdnewmil at dcn.davis.ca.us
Wed Jan 24 16:59:04 CET 2018
I have never used that package, but it seems obvious to me that you need to "reflect" on the meaning of the word "mirror". There is no reason to assume that a site hosting a mirror of the CRAN archive is also going to host a mirror of Project Gutenberg [1].
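As a minimal sketch of what passing a Project Gutenberg mirror (rather than a CRAN one) might look like: the mirror address below is an assumption used only for illustration and should be checked against the list in [1], and download_from_mirror is a hypothetical wrapper name, not part of gutenbergr.

```r
# Hypothetical mirror URL, used only for illustration; confirm a current
# entry in https://www.gutenberg.org/MIRRORS.ALL before relying on it.
pg_mirror <- "http://aleph.gutenberg.org"

# Hypothetical wrapper, so nothing touches the network until it is called.
download_from_mirror <- function(ids, mirror = pg_mirror) {
  if (!requireNamespace("gutenbergr", quietly = TRUE)) {
    stop("package 'gutenbergr' is not installed")
  }
  # Per the code quoted below, gutenberg_get_mirror() is only called when
  # `mirror` is NULL, so a non-NULL value skips the lookup that timed out.
  gutenbergr::gutenberg_download(ids, mirror = mirror)
}

# Usage (needs network access):
# hgwells <- download_from_mirror(c(35, 36, 5230, 159))
```

If the download still fails with a Project Gutenberg mirror, the problem is more likely in the package or the network than in the inputs.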
If the package still does not seem to work as designed once you know you are giving it reasonable inputs, please remember that contributed packages have maintainers [2], and not all of them subscribe to r-help.
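Before writing to a maintainer it may help to collect evidence about the function named in the error. The sketch below uses only base R; where_is is a hypothetical helper name introduced here, and it simply reports whether a function exists inside a package namespace and whether it is exported.

```r
# Hypothetical helper: is `fn` defined in the namespace of `pkg`, and is
# it among the package's exports (i.e. visible after library(pkg))?
where_is <- function(fn, pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("package '", pkg, "' is not installed")
  }
  ns <- asNamespace(pkg)
  c(
    in_namespace = exists(fn, envir = ns, inherits = FALSE),
    exported     = fn %in% getNamespaceExports(pkg)
  )
}

# Usage with the function from the error message (needs gutenbergr installed):
# where_is("read_zip_url", "gutenbergr")
```

A function that is documented but absent from the namespace would point to a packaging problem worth reporting.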
[1] https://www.gutenberg.org/MIRRORS.ALL
[2] ?maintainer
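For reference [2], a short sketch of looking up who to contact. maintainer() is in the utils package and reads the Maintainer field of an installed package; the variable names are only illustrative.

```r
pkg <- "gutenbergr"

# Returns a single string such as "Name <email>" for an installed package,
# or NA_character_ if the package is not installed.
who <- utils::maintainer(pkg)

if (is.na(who)) {
  message("Package '", pkg, "' is not installed here.")
} else {
  message("Maintainer of ", pkg, ": ", who)
}
```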
--
Sent from my phone. Please excuse my brevity.
On January 23, 2018 11:23:06 PM PST, Patrick Connolly <p_connolly at slingshot.co.nz> wrote:
>
>I've been working through https://www.tidytextmining.com/tidytext.html
>wherein everything worked until I got to this part in section 1.5
>
>> hgwells <- gutenberg_download(c(35, 36, 5230, 159))
>Determining mirror for Project Gutenberg from
>http://www.gutenberg.org/robot/harvest
>Error in open.connection(con, "rb") :
> Failed to connect to www.gutenberg.org port 80: Connection timed out
>
>Which indicates the problem is at the very start:
>
>    if (is.null(mirror)) {
>        mirror <- gutenberg_get_mirror(verbose = verbose)
>    }
>
>The documentation for gutenberg_get_mirror indicates there's nothing
>different I could set.
>
>So I tried specifying my usual mirror:
>
>> hgwells <- gutenberg_download(c(1260, 768, 969, 9182, 767),
>+     mirror = "http://cran.stat.auckland.ac.nz")
>Error in read_zip_url(full_url) : could not find function "read_zip_url"
>>
>
>Which is, indeed, strange since according to
>
>> help.search("read_zip_url")
>Help files with alias or concept or title matching ‘read_zip_url’ using
>regular expression matching:
>
>
>gutenbergr::read_zip_url
> Read a file from a .zip URL
> Aliases: read_zip_url
>
>[...]
>
>And according to
>library(help = "gutenbergr")
>
>[...]
>Index:
>
>gutenberg_authors Metadata about Project Gutenberg authors
>gutenberg_download Download one or more works using a Project
> Gutenberg ID
>gutenberg_get_mirror Get the recommended mirror for Gutenberg files
>gutenberg_metadata Gutenberg metadata about each work
>gutenberg_strip Strip header and footer content from a Project
> Gutenberg book
>gutenberg_subjects Gutenberg metadata about the subject of each
> work
>gutenberg_works Get a filtered table of Gutenberg work metadata
>read_zip_url Read a file from a .zip URL
>
>[...]
>
>However, when I look at the list for that part of the search(), there
>is no read_zip_url but all the rest of that list are present. So it's
>not surprising that it isn't found. But it puzzles me that it is not
>there.
>
>Ideas as to where I should proceed gratefully appreciated.
>
>
>> sessionInfo()
>R version 3.4.2 (2017-09-28)
>Platform: x86_64-pc-linux-gnu (64-bit)
>Running under: Ubuntu 14.04.5 LTS
>
>Matrix products: default
>BLAS: /home/hrapgc/local/R-3.4.2/lib/libRblas.so
>LAPACK: /home/hrapgc/local/R-3.4.2/lib/libRlapack.so
>
>locale:
> [1] LC_CTYPE=en_NZ.UTF-8 LC_NUMERIC=C
> [3] LC_TIME=en_NZ.UTF-8 LC_COLLATE=en_NZ.UTF-8
> [5] LC_MONETARY=en_NZ.UTF-8 LC_MESSAGES=en_NZ.UTF-8
> [7] LC_PAPER=en_NZ.UTF-8 LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
>[11] LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C
>
>attached base packages:
>[1] grDevices utils stats graphics methods base
>
>other attached packages:
> [1] sos_2.0-0          brew_1.0-6         gutenbergr_0.1.3   ggplot2_2.2.1
> [5] stringr_1.2.0      bindrcpp_0.2       dplyr_0.7.4        janeaustenr_0.1.5
> [9] tidytext_0.1.6     FactoMineR_1.38    readxl_1.0.0       tm_0.7-3
>[13] NLP_0.1-11         wordcloud_2.5      RColorBrewer_1.1-2 lattice_0.20-35
>
>loaded via a namespace (and not attached):
> [1] Rcpp_0.12.13 cellranger_1.1.0 compiler_3.4.2
> [4] plyr_1.8.4 bindr_0.1 tokenizers_0.1.4
> [7] tools_3.4.2 gtable_0.2.0 tibble_1.3.4
>[10] nlme_3.1-131 pkgconfig_2.0.1 rlang_0.1.2
>[13] Matrix_1.2-11 psych_1.7.8 curl_3.0
>[16] parallel_3.4.2 xml2_1.1.1 cluster_2.0.6
>[19] hms_0.3 flashClust_1.01-2 grid_3.4.2
>[22] scatterplot3d_0.3-40 glue_1.1.1 ellipse_0.3-8
>[25] R6_2.2.2 foreign_0.8-69 readr_1.1.1
>[28] purrr_0.2.4 tidyr_0.7.2 reshape2_1.4.2
>[31] magrittr_1.5 scales_0.5.0 SnowballC_0.5.1
>[34] MASS_7.3-47 leaps_3.0 assertthat_0.2.0
>[37] mnormt_1.5-5 colorspace_1.3-2 labeling_0.3
>[40] stringi_1.1.5 lazyeval_0.2.1 munsell_0.4.3
>[43] slam_0.1-42 broom_0.4.2
>>
>
>--
>~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>
> ___ Patrick Connolly
> {~._.~} Great minds discuss ideas
> _( Y )_ Average minds discuss events
>(:_~*~_:) Small minds discuss people
> (_)-(_) ..... Eleanor Roosevelt
>
>~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.
>