[R] Value Lookup from File without Slurping

Gabor Grothendieck ggrothendieck at gmail.com
Fri Jan 16 10:41:06 CET 2009


The sqldf package can read a large file straight into a database,
bypassing R, and then extract only the portion you need.  It makes
RSQLite easier to use by setting up the database for you and removing
it automatically once the extraction is done.  All of this takes two
lines: one to name the file and one to specify the extraction in SQL.
See Example 6 on the home page:

http://sqldf.googlecode.com#Example_6._File_Input
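
As a rough sketch of the two-line idea, the snippet below builds a tiny
stand-in for repo.txt and pulls out only the queried tags.  Several things
here are assumptions made for illustration and are not from the original
post: the temporary file, the tab separator, the plain "tag"/"value" header
line, and the variable names repo, sql, hits and result.  A real repo.txt
with a "# tag  value" comment header and ragged spacing would likely need
cleaning or different file.format settings first.

```r
library(sqldf)

# Hypothetical stand-in for repo.txt: tab-separated with a plain header line.
repo_path <- tempfile(fileext = ".txt")
writeLines(c("tag\tvalue",
             "AAA\t0.2", "AAT\t0.3", "AAC\t0.02",
             "AAG\t0.02", "ATA\t0.3", "ATT\t0.7"), repo_path)

qr <- c("AAC", "ATT")

# Build the IN (...) list from the query vector.
# (Assumes the tags contain no quote characters.)
in_list <- paste0("'", qr, "'", collapse = ", ")
sql <- sprintf("select tag, value from repo where tag in (%s)", in_list)

# Line 1: name the file.  Line 2: specify the extraction in SQL.
# dbname = tempfile() asks sqldf to use an on-disk database, not an in-memory one.
repo <- file(repo_path)
hits <- sqldf(sql, dbname = tempfile(),
              file.format = list(header = TRUE, sep = "\t"))
close(repo)

# Return the values in the same order as the query vector.
result <- hits$value[match(qr, hits$tag)]
print(result)
```

Only the matching rows come back into R as a data frame; the full file is
loaded into the temporary SQLite database, not into an R object.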

On Fri, Jan 16, 2009 at 4:12 AM, Carlos J. Gil Bellosta
<cgb at datanalytics.com> wrote:
> On Fri, 2009-01-16 at 18:02 +0900, Gundala Viswanath wrote:
>> Dear all,
>>
>> I have a repository file (let's call it repo.txt)
>> that contains two columns like this:
>>
>> # tag  value
>> AAA    0.2
>> AAT    0.3
>> AAC    0.02
>> AAG    0.02
>> ATA    0.3
>> ATT    0.7
>>
>> Given another query vector
>>
>> > qr <- c("AAC", "ATT")
>>
>> I would like to find the corresponding value for each query above,
>> yielding:
>>
>> 0.02
>> 0.7
>>
>> However, I want to avoid slurping the whole repo.txt into an object
>> (e.g. a hash).  Is there any way to do that?
>>
>> The reason is that repo.txt is very large (millions of lines,
>> with tag length > 30 bp), and my PC does not have enough memory
>> to hold it.
>>
>> - Gundala Viswanath
>> Jakarta - Indonesia
>
> Hello,
>
> You can always store your repo.txt into a database, say, SQLite, and
> select only the values you want via an SQL query.
>
> Thus, you will prevent loading the full file into memory.
>
> Best regards,
>
> Carlos J. Gil Bellosta
> http://www.datanalytics.com
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
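
The suggestion quoted above (store repo.txt in SQLite once, then select only
the wanted values) might look roughly like the following when done by hand
with DBI and RSQLite.  This is a sketch under assumptions, not anyone's
stated method: the table name repo, the chunk size, the temporary file paths
and the helper variable names are all hypothetical, and the sample file
simply reuses the six rows from the question.

```r
library(DBI)

# Hypothetical stand-in for repo.txt, using the layout shown in the question.
repo_path <- tempfile(fileext = ".txt")
writeLines(c("# tag  value",
             "AAA    0.2", "AAT    0.3", "AAC   0.02",
             "AAG   0.02", "ATA    0.3", "ATT   0.7"), repo_path)

# On-disk SQLite database; the primary key gives indexed lookups by tag.
con <- dbConnect(RSQLite::SQLite(), tempfile(fileext = ".sqlite"))
dbExecute(con, "CREATE TABLE repo (tag TEXT PRIMARY KEY, value REAL)")

# Load the file in chunks so that only chunk_size lines are in memory at once.
chunk_size <- 1000L
infile <- file(repo_path, open = "r")
repeat {
  lines <- readLines(infile, n = chunk_size)
  if (length(lines) == 0L) break
  lines <- lines[!grepl("^\\s*(#|$)", lines)]  # drop comment and blank lines
  if (length(lines) == 0L) next
  chunk <- read.table(text = lines, col.names = c("tag", "value"),
                      colClasses = c("character", "numeric"))
  dbWriteTable(con, "repo", chunk, append = TRUE)
}
close(infile)

# Look up only the queried tags, using one "?" placeholder per tag.
qr <- c("AAC", "ATT")
placeholders <- paste(rep("?", length(qr)), collapse = ", ")
hits <- dbGetQuery(
  con,
  sprintf("SELECT tag, value FROM repo WHERE tag IN (%s)", placeholders),
  params = as.list(qr)
)

# Return the values in the same order as the query vector.
result <- hits$value[match(qr, hits$tag)]
print(result)

dbDisconnect(con)
```

Unlike the temporary database that sqldf creates and removes, a database
file kept at a fixed path would only need to be loaded once and could then
serve many later lookups.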



