[R] scan and skip - without line breaks in the input file

Balzer Susanne susanne.balzer at imr.no
Sat Feb 27 19:02:48 CET 2010


Hi David,

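That looks like magic, and it works, but only as long as you keep the connection open between calls.
Cool, that was the hint I needed!

> scan("myfile.txt", nmax=10)

will always give you the first 10 items, because passing a file name makes scan open (and close) a fresh connection on every call.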

However, I have already used the workaround with tr under Unix and changed all the tabs into line breaks (thanks, Claudia Beleites).

But good to know that scan also does the job.
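
For the archive, here is a minimal sketch of the "keep the connection open" idea applied to a file rather than a textConnection. The demo file, its 25 values, and the chunk size of 10 are made up for illustration; the point is only that file() is opened once and scan() is then called repeatedly on the same connection.

```r
# Hypothetical demo file: 25 tab-separated numbers on a single line, no line breaks.
path <- tempfile(fileext = ".txt")
cat(paste(1:25, collapse = "\t"), file = path)

con <- file(path, open = "r")  # open once; each scan() resumes where the last one stopped
chunks <- list()
repeat {
  x <- scan(con, nmax = 10, quiet = TRUE)
  if (length(x) == 0) break    # nothing left to read
  chunks[[length(chunks) + 1]] <- x
}
close(con)
unlink(path)

sapply(chunks, length)         # number of items read in each chunk
```

Calling scan(path, nmax = 10) in the loop instead would reopen the file every time and return the same first 10 items on each pass.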

Thanks again,

Susanne


-----Original Message-----
From: David Winsemius [mailto:dwinsemius at comcast.net]
Sent: 27 February 2010 18:46
To: Balzer Susanne
Cc: r-help at r-project.org help
Subject: Re: SV: [R] scan and skip - without line breaks in the input file


On Feb 27, 2010, at 11:47 AM, Balzer Susanne wrote:

> Hi David,
>
> Thanks for your quick response, but unfortunately n and nmax alone  
> don't do the job. If I want to read items no. 100001 to 200000, the  
> n=100000 option will work, but skip=100000 (to NOT read the first  
> 100000 items) won't.
>
> Or with your example,
>
> scan(textConnection('1 2 3 4 5 6 7'), skip=3) will never work, while

True.
>
> scan(textConnection('1 2 3 4 \n 5 \n 6 \n 7'), skip=3) will. But I  
> don't have line breaks in my file.

Right. That was what I was trying to help you deal with.

>
> Is there no way to specify the character for a line break in scan /  
> read.table / etc.?

Why are you fixating on linefeeds when you don't have any?????

 > closeAllConnections()
 > tc <- textConnection(paste(1:100, sep=" ", collapse=" "))
 > scan(tc, nmax=10)
Read 10 items
  [1]  1  2  3  4  5  6  7  8  9 10
 > scan(tc, nmax=10)
Read 10 items
  [1] 11 12 13 14 15 16 17 18 19 20
 > scan(tc, nmax=10)
Read 10 items
  [1] 21 22 23 24 25 26 27 28 29 30
 > scan(tc, nmax=10)
Read 10 items
  [1] 31 32 33 34 35 36 37 38 39 40
 > scan(tc, nmax=10)
Read 10 items
  [1] 41 42 43 44 45 46 47 48 49 50
 > scan(tc, nmax=10)
Read 10 items
  [1] 51 52 53 54 55 56 57 58 59 60
 > scan(tc, nmax=10)
Read 10 items
  [1] 61 62 63 64 65 66 67 68 69 70
 > scan(tc, nmax=10)
Read 10 items
  [1] 71 72 73 74 75 76 77 78 79 80
 > scan(tc, nmax=10)
Read 10 items
  [1] 81 82 83 84 85 86 87 88 89 90
 > scan(tc, nmax=10)
Read 10 items
  [1]  91  92  93  94  95  96  97  98  99 100
 > scan(tc, nmax=10)
Read 0 items
numeric(0)
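
The same behaviour can stand in for skip when there are no line breaks: read the unwanted items from the open connection, throw them away, then read the items you want. This is only a sketch; read_items() is a hypothetical helper written for this example (not part of base R), and the 100-item text connection is the toy input from above.

```r
# read_items() is a hypothetical helper, not a base R function.
# It discards the first `skip_items` values, then returns the next `n_items`.
read_items <- function(con, skip_items, n_items) {
  if (skip_items > 0) {
    invisible(scan(con, nmax = skip_items, quiet = TRUE))  # read and discard
  }
  scan(con, nmax = n_items, quiet = TRUE)
}

tc <- textConnection(paste(1:100, collapse = " "))
x <- read_items(tc, skip_items = 30, n_items = 5)
close(tc)
x
```

For a very large skip count the discarded vector is still built in memory for a moment, so discarding in several smaller reads may be the gentler option.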

-- 
David.




>
> Kind regards,
>
> Susanne
>
>
> -----Original Message-----
> From: David Winsemius [mailto:dwinsemius at comcast.net]
> Sent: 27 February 2010 17:38
> To: Balzer Susanne
> Cc: 'r-help at r-project.org'
> Subject: Re: [R] scan and skip - without line breaks in the input file
>
>
> On Feb 27, 2010, at 11:24 AM, Balzer Susanne wrote:
>
>> Dear all,
>>
>> I am trying to read in large amounts of data with scan. It's only one
>> variable, numeric values separated by tabs, and there are many of
>> them. So I was thinking that I could use the skip option and read in
>> 100000 values at a time - but skip doesn't work, probably because I
>> don't have line breaks in the txt file. So any value specified for
>> skip makes the scan function jump to the end of the file.
>
> ?scan
>
> Without a working example it is hard to be sure, but it appears from a
> rapid look at the help page that nmax is the argument you want.
>
>> scan(textConnection('1 2 3 4 5 6 7'), nmax=4)
> Read 4 items
> [1] 1 2 3 4
>
>
> (Ignores line-feeds)
>> scan(textConnection('1 2 \n 3 4 5 6 7'), nmax=4)
> Read 4 items
> [1] 1 2 3 4
>
>
> -- 
> David.
>>
>> Does anyone have a good idea? I would be extremely grateful.
>>
>> Kind regards,
>>
>> Susanne Balzer
>>
>>
>>
>> ****************************
>> Susanne Balzer
>> PhD Student
>> Institute of Marine Research
>> N-5073 Bergen, Norway
>> Phone: +47 55 23 69 45
>> susanne.balzer at imr.no
>> www.imr.no
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>

David Winsemius, MD
Heritage Laboratories
West Hartford, CT


