[BioC] R: Can an R script be run through a cron job ?
Steffen Moeller
steffen_moeller at gmx.de
Fri Nov 20 16:54:59 CET 2009
Hello, going back to the original question, I just wanted to point to
http://dirk.eddelbuettel.com/code/littler.html
which works just fine for me, also with the getopt package.
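
For the cron part of the original question, here is a minimal sketch of a script that cron can simply rerun: each invocation handles one batch, records its progress in a checkpoint file, and exits. The names run_next_batch, fetch_fun, state_file and out_dir are hypothetical, and fetch_fun is only a stand-in for the real remote query. The shebang uses Rscript; with littler it would be `#!/usr/bin/env r` instead.

```r
#!/usr/bin/env Rscript
# Hypothetical resumable job: each cron invocation handles one batch, then exits.
# Example crontab entry (paths are placeholders), running every 30 minutes:
#   */30 * * * * /usr/bin/Rscript /home/user/fetch_utr.R

# Process the next pending batch and record progress in a checkpoint file.
#   batches    : list of ID vectors to be processed, one element per batch
#   fetch_fun  : function(ids) returning a data.frame (stand-in for the real query)
#   state_file : path of the .rds file holding the index of the last finished batch
#   out_dir    : directory that receives one result file per batch
run_next_batch <- function(batches, fetch_fun, state_file, out_dir) {
  done <- if (file.exists(state_file)) readRDS(state_file) else 0L
  if (done >= length(batches)) return(invisible(FALSE))   # nothing left to do
  i <- done + 1L
  # A failed query leaves no partial file behind; the batch is retried next run.
  result <- tryCatch(fetch_fun(batches[[i]]), error = function(e) NULL)
  if (is.null(result)) return(invisible(FALSE))
  saveRDS(result, file.path(out_dir, sprintf("batch_%04d.rds", i)))
  saveRDS(i, state_file)                                   # advance the checkpoint
  invisible(TRUE)
}
```

Because the result file is written only after a batch succeeds, an interrupted run costs at most one batch, and the script never needs to sleep: cron decides when the next request is sent.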
Steffen
Cei Abreu-Goodger wrote:
> may I suggest the following:
>
> 1) First get all unique Ensembl transcript IDs
>
> 2) If there are too many, split into groups of ~1-5 thousand (I don't
> know what the optimum would be)
>
> 3) For each group of ids, use getSequence() to retrieve the 3'UTR.
>
> 4) rbind the results, check, save
>
> Cheers,
>
> Cei
>
>
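
A minimal sketch of the four steps quoted above, assuming the remote query is wrapped in a function that takes a vector of IDs and returns a data frame. The names fetch_in_chunks, fetch_fun, chunk_size and pause are hypothetical, and the default chunk size is only a guess within the suggested range.

```r
# Split IDs into groups, run one query per group, and combine the results.
#   ids        : character vector of Ensembl transcript IDs (may contain repeats)
#   fetch_fun  : function(ids) returning a data.frame (stand-in for the real query)
#   chunk_size : number of IDs sent per request
#   pause      : seconds to wait before each request
fetch_in_chunks <- function(ids, fetch_fun, chunk_size = 1000, pause = 0) {
  ids <- unique(ids)                                   # step 1: unique IDs
  if (length(ids) == 0) return(NULL)
  groups <- split(ids, ceiling(seq_along(ids) / chunk_size))  # step 2: groups
  parts <- lapply(groups, function(g) {                # step 3: one query per group
    Sys.sleep(pause)                                   # avoid hammering the service
    fetch_fun(g)
  })
  out <- do.call(rbind, parts)                         # step 4: rbind the results
  rownames(out) <- NULL
  out
}

# Possible use with biomaRt (untested here; needs network access, and
# `enst_ids` is a placeholder for your own vector of transcript IDs):
# mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
# utr <- fetch_in_chunks(enst_ids, function(g)
#   biomaRt::getSequence(id = g, type = "ensembl_transcript_id",
#                        seqType = "3utr", mart = mart),
#   pause = 5)
```

The checking and saving part of step 4 is left to the caller, for example by comparing the returned IDs against the requested ones before writing the file.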
> mauede at alice.it wrote:
>> I reattached my script. I had attached it to an earlier message that
>> maybe was overlooked.
>>
>> As you can see for yourself, I scan a big data set, named hsTargets, that
>> contains many target gene transcript IDs, each with a handle to the
>> corresponding miRNA.
>> I process this database one miRNA at a time. That is, I gather all
>> the transcript IDs for the current miRNA
>> and query biomaRt for the 3'UTRs of all the transcripts whose
>> ENST IDs are in a vector that I pass as an input parameter to the query.
>> So I do use the vectorized capabilities of R, don't I?
>>
>> My mistake is keeping the connection to biomaRt open while
>> processing as many miRNAs as I can.
>> I acknowledge that I have to improve my script so that it catches the
>> exception, deletes the file currently being written
>> (as in general it will be incomplete), and exits gracefully.
>> I also have to make my script pause and disconnect from biomaRt
>> regularly to avoid hammering the
>> service. Eventually my process could even terminate itself instead of
>> sleeping, after saving its current state. However, I would then need to
>> set up the task scheduler to restart it some time later ...
>>
>> Regards,
>> Maura
>>
>> -----Original message-----
>> From: Kasper Daniel Hansen [mailto:khansen at stat.berkeley.edu]
>> Sent: Fri 20/11/2009 15.12
>> To: mauede at alice.it
>> Cc: Bioconductor List
>> Subject: Re: [BioC] Can an R script be run through a cron job ?
>>
>> Maura
>>
>> Unfortunately you never showed us your code, despite repeated
>> requests to do so. That makes it hard to help (and frankly, ignoring
>> requests for information from people trying to help you is extremely
>> counterproductive).
>>
>> Your comments in your last email in the last thread indicate that
>> you have code that essentially does this:
>>
>> for(i in 1:100)
>> getBM(...)
>>
>> If this is true (which we would know if we could see the code), this is
>> why your script fails. There are two problems with this: (1) you are
>> not using the vectorized capabilities of R, and, more importantly, (2)
>> you are sending many requests to Biomart, and such behaviour
>> typically means your IP address may be banned temporarily. They don't
>> like people hammering their services with repeated requests.
>>
>> Instead you should create a query that essentially asks for all your
>> return objects in one request. That should be easy to write, and
>> will be much faster. You might think that processing the output is
>> slightly harder, but that is the thing to do (and with more R
>> experience, processing a big output is actually easier).
>>
>> Regarding your actual question in this email, you seem to be very
>> confused regarding the meaning of a batch job. This word has many
>> different interpretations (not related to R), so it is hard to google
>> for. What you are specifically asking for has everything to do with
>> what operating system you are using (Windows, Linux, OS X) and
>> nothing to do with R.
>>
>> Kasper
>>
>>
>> On Nov 19, 2009, at 18:24, <mauede at alice.it> wrote:
>>
>>> I am running a script that extracts many long strings from remote
>>> databases.
>>> Every now and then the remote database gets out of sync and closes
>>> the connection.
>>> I have been advised to implement an R script that queries the
>>> database in batch mode.
>>> I have never run an R script in batch mode. I think I have to use R
>>> CMD BATCH or something similar.
>>> Given the amount of data I am extracting, I am concerned about
>>> having to parse a huge data file looking for the
>>> information I need.
>>> The least painful modification would be to run the R script
>>> as is, but through a cron job, so that the script
>>> sleeps at a set interval and, when
>>> awakened, resumes from where it was interrupted.
>>> Is such a scheme doable in R? If it is, what are the most
>>> important commands to make a script sleep and wake up
>>> on a regular basis?
>>>
>>> Thank you in advance,
>>> Maura
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor