[R] Loading large tar.gz XenaHub Data into R

Spencer Brackett spbrackett20 using saintjosephhs.com
Fri Aug 2 02:06:44 CEST 2019


Thank you both for your advice! The command

z <- readLines(gzcon(url("https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz")))

worked out nicely.
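
For reference, a minimal offline sketch of the same approach. It assumes the file is a gzipped, tab-delimited table; the small temporary file and the names tmp, gz and methyl are made up for illustration and stand in for the real download.

```r
# Hypothetical stand-in for the remote file: a tiny gzipped, tab-separated
# table written to a temporary path so the sketch runs without a network.
tmp <- tempfile(fileext = ".gz")
con <- gzfile(tmp, "w")
writeLines(c("sample\tS1\tS2",
             "cg0001\t0.12\t0.87",
             "cg0002\t0.55\tNA"), con)
close(con)

# Reading the compressed bytes directly as text is what produces the
# "embedded nul" warnings. Wrapping a binary connection in gzcon()
# decompresses on the fly. For the remote file, url("<address>") would
# take the place of file(tmp, "rb").
gz <- gzcon(file(tmp, "rb"))
z <- readLines(gz)
close(gz)

# Parse the decompressed lines as a tab-delimited table.
methyl <- read.delim(text = z, check.names = FALSE)
print(dim(methyl))
```

For a file as large as this one, it may be more comfortable to save it to disk first with download.file(..., mode = "wb") and then read the local copy through gzfile(), so that a failed parse does not mean downloading everything again.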

On Thu, Aug 1, 2019 at 6:47 PM William Dunlap <wdunlap using tibco.com> wrote:

> By the way, instead of saying only that there were warnings, it would be
> nice to show some of them.  E.g.,
> > z <- readLines("https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz")
> [ Hit control-C or Esc to interrupt, or wait a long time ]
> There were 50 or more warnings (use warnings() to see the first 50)
> > warnings()
> Warning messages:
> 1: In readLines("https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz") :
>   line 1 appears to contain an embedded nul
> 2: In readLines("https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz") :
>   line 4 appears to contain an embedded nul
> 3: In readLines("https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz") :
>   line 7 appears to contain an embedded nul
>
> Bert's guess looks right, as the following gives 10 long lines of
> reasonable-looking data.  Remove the 'n=10' to get all of it.
>
> z <- readLines(gzcon(url("https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz")), n=10)
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
>
> On Thu, Aug 1, 2019 at 3:37 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
>> These are gzipped files, I assume. So see ?gzfile and associated info
>> for how to open a gzip connection and read from it. You may also
>> prefer to search (e.g. at rseek.org) on "read a gzipped file" or
>> similar for possible alternatives.
>>
>> Of course, if they're not gzipped files, then ignore the above. If
>> they are, your current approach is hopeless.
>>
>>
>> Cheers,
>> Bert
>>
>> On Thu, Aug 1, 2019 at 3:13 PM Spencer Brackett
>> <spbrackett20 using saintjosephhs.com> wrote:
>> >
>> > Good evening,
>> >
>> > I am attempting to load the following Xena dataset:
>> > https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz
>> >
>> > I am trying to unpack the dataset and read it into R as a table, but due to
>> > the size of the file, I am having some trouble. The following are the
>> > commands I have tried thus far.
>> >
>> > HumanMethylation450 <- fread("https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz")
>> >
>> > readLines("https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz")
>> >
>> > ### The two attempts above failed with warning messages from R ###
>> >
>> > Methyl <- read.delim("https://tcga.xenahubs.net/download/TCGA.GBMLGG.sampleMap/HumanMethylation450.gz")
>> >
>> > ## This attempt is still processing, but has been doing so for quite some time ##
>> >
>> > Any ideas as to what else I could try?
>> >
>> > Best,
>> >
>> > Spencer
>> >
>> > ______________________________________________
>> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
