[R] read.*: How to read from a URL?
Martin Morgan
mtmorgan at fhcrc.org
Thu Dec 11 00:45:11 CET 2008
Martin Morgan wrote:
> Prof Brian Ripley wrote:
>> On Wed, 10 Dec 2008, hadley wickham wrote:
>>
>>> Hi Michael,
>>>
>>> In general, I think you should be able to do:
>>>
>>> gimage <- read.jpeg(url(gimageloc))
>>
>> Note that would not be really correct: it would need to be
>>
>> gimage <- read.jpeg(con <- url(gimageloc))
>> close(con)
>>
>> since it otherwise leaks a connection (which would eventually be closed).
>>
>> However, from ?read.jpeg
>>
>> Arguments:
>>
>> filename: filename of JPEG image
>>
>> so it does not accept a connection (and the source code will confirm
>> that). In fact virtually all functions that accept a 'file name or
>> connection' will work with URLs, as file() accepts URLs as well as
>> file names (see ?file).
>>
>> The issue is that writers of third-party readers should be encouraged
>> to support connections (which have been around for ca 7 years in R).
>> It is amazing how people take such innovations for granted.
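To make the "file name or connection" convention concrete, here is a minimal R-level sketch. The function name read_header_bytes() is hypothetical and only for illustration; it is not part of rimage or any other package. It opens a connection when given a character string (so a URL works too, since file() accepts URLs), and closes only what it opened itself.

```r
# Hypothetical reader: return the first `n` bytes from a file name, URL,
# or connection. Illustrates the "file name or connection" convention.
read_header_bytes <- function(file, n = 4L) {
  if (is.character(file)) {
    # file() accepts URLs as well as file names (see ?file)
    file <- file(file, open = "rb")
    on.exit(close(file))            # we opened it, so we close it
  } else if (!inherits(file, "connection")) {
    stop("'file' must be a character string or a connection")
  } else if (!isOpen(file)) {
    open(file, "rb")
    on.exit(close(file))            # opened here, so closed here
  }
  # A connection that arrived already open is left open for the caller.
  readBin(file, what = "raw", n = n)
}
```

Because of the on.exit() calls, read_header_bytes(url(gimageloc)) would not leak the connection, which is the point made above about close(con).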
>
> Perhaps the discussion belongs on R-devel, but is there an example of a
> user-contributed package that uses R's connections, either for parsing a
> URL or, for instance, a compressed file?
To clarify, I meant using a connection from the C level.
> Martin
>
>>
>>> or alternatively use the EBImage from bioconductor which will read
>>> from a url automatically (it also opens a much wider range of file
>>> types)
>>>
>>> library(EBImage)
>>> img <- readImage(gimageloc, TrueColor)
>>>
>>> Hadley
>>>
>>>
>>> On Wed, Dec 10, 2008 at 1:17 PM, Michael Friendly <friendly at yorku.ca>
>>> wrote:
>>>> The question is how to use a URL in place of a file= argument for
>>>> read.* functions that do not support this internally.
>>>>
>>>> e.g., utils::read.table() and her family all support a file= argument that
>>>> can take a URL equally well as a local file. So, if I have a file on the
>>>> web, I can equally well do
>>>>
>>>>> langren <- read.csv("langrens.csv", header=TRUE)
>>>>> langren <- read.csv("http://euclid.psych.yorku.ca/SCS/Gallery/Private/langrens.csv", header=TRUE)
>>>>
>>>> where the latter is more convenient for posts to this list or distributed
>>>> examples.
>>>> rimage::read.jpeg() doesn't support URLs, and the only way I've found is to
>>>> download the image file from a URL to a temp file, in several steps.
>>>> This is probably a more general problem than just read.jpeg,
>>>> so maybe there is a general idiom for this case, or better, other read.*
>>>> functions could be encouraged to support URLs.
>>>>
>>>>> library(rimage)
>>>>> # local file: OK
>>>>> gimage <- read.jpeg("C:/Documents/milestone/images/vanLangren/google-toledo-rome3.jpg")
>>>>>
>>>>> gimageloc <- "http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/google-toledo-rome3.jpg"
>>>>>
>>>>> dest <- paste(tempfile(),'.jpg', sep='')
>>>>> download.file(gimageloc, dest, mode="wb")
>>>> trying URL
>>>> 'http://euclid.psych.yorku.ca/SCS/Gallery/images/Private/Langren/google-toledo-rome3.jpg'
>>>>
>>>> Content type 'image/jpeg' length 35349 bytes (34 Kb)
>>>> opened URL
>>>> downloaded 34 Kb
>>>>
>>>>> dest
>>>> [1]
>>>> "C:\\DOCUME~1\\default\\LOCALS~1\\Temp\\Rtmp9nNTdV\\file5f906952.jpg"
>>>>> # Is there something simpler??
>>>>> gimage <- read.jpeg(dest)
>>>>
>>>>> # I thought file() might work, but evidently not.
>>>>> gimage <- read.jpeg(file(gimageloc))
>>>> Error in read.jpeg(file(gimageloc)) : Can't open file.
>>>>>
>>>>
>>>> --
>>>> Michael Friendly Email: friendly AT yorku DOT ca Professor,
>>>> Psychology
>>>> Dept.
>>>> York University Voice: 416 736-5115 x66249 Fax: 416 736-5814
>>>> 4700 Keele Street http://www.math.yorku.ca/SCS/friendly.html
>>>> Toronto, ONT M3J 1P3 CANADA
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>>
>>>
>>> --
>>> http://had.co.nz/
>>>
>>>
>>
>
>
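For readers that only take a file name, the download-to-temp-file steps quoted above can at least be wrapped once. This is a minimal sketch, not an established idiom: the helper name read_via_tempfile() and its arguments are hypothetical, and it assumes the reader takes the file name as its first argument.

```r
# Hypothetical helper: download `url` to a temporary file, apply `reader`
# to that file, and delete the file afterwards.
read_via_tempfile <- function(url, reader, ..., fileext = "") {
  dest <- tempfile(fileext = fileext)
  on.exit(unlink(dest))                   # remove the temp file on exit
  # mode = "wb" keeps binary files (e.g. JPEG) intact on Windows
  status <- download.file(url, dest, mode = "wb", quiet = TRUE)
  if (status != 0) stop("download failed for ", url)
  reader(dest, ...)
}

# Intended use with the example from this thread (needs rimage and network):
# gimage <- read_via_tempfile(gimageloc, read.jpeg, fileext = ".jpg")
```

This does not remove the extra download step; it only hides it, so supporting connections in the reader itself remains the better fix.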
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793