[R] process id of an R script
Mikkel Grum
mi2kelgrum at yahoo.com
Wed Sep 7 17:04:01 CEST 2011
I have a script that runs as a cron job every minute (on Ubuntu 10.10 and R 2.11.1), querying a database for new data. Most of the time it takes a few seconds to run, but once in a while it takes more than a minute, and the next run starts (on the same data) before the previous one has finished. In extreme cases this fills up memory with a large number of runs of the same script on the same data. My 'solution' has been to create a process id file containing the pid of the currently running script, after first checking whether a process id file already exists and whether the process it names is still running. I use the following code:
# Assume the newest R process is this one. Compare pids as numbers:
# max() on character strings is lexicographic ("999" > "1000").
pid <- max(as.integer(system("pgrep -x R", intern = TRUE)))
if (file.exists("/var/run/myscript.pid")) {
    old_pid <- read.table("/var/run/myscript.pid")[[1]]
    # ps prints a header line plus one line for the process if it is alive
    if (length(system(paste("ps -p", old_pid), intern = TRUE)) == 2) {
        stop("Myscript is already running in another process.")
    } else {
        # stale pid file: the earlier run is gone, so take over
        write(pid, "/var/run/myscript.pid")
    }
} else {
    write(pid, "/var/run/myscript.pid")
}
....my script .....
file.remove("/var/run/myscript.pid")
#The End
The trouble here is that I also have other R scripts running on the same system, so while taking the largest pid reported by pgrep -x R will almost always give me the right pid, it is not guaranteed to work. There are two situations where it could fail. First, when the process id numbers reach roughly 32000 and start over again, the newest process no longer has the largest pid. Second, if another R process starts up at the same time, the two scripts could pick up each other's process ids.
Is there a way to query for the process id of the specific R script, rather than all R processes?
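For illustration, here is a minimal sketch of one possible approach, not something confirmed in this thread: base R's Sys.getpid() returns the process id of the R session that calls it, so there is no need to guess from the pgrep output. The helper names pid_is_running, acquire_pid_file and run_once are hypothetical, and the liveness check assumes a Linux /proc filesystem.

```r
# TRUE if a process with this id currently exists.
# Assumption: Linux, where every live process has a /proc/<pid> directory.
pid_is_running <- function(pid) {
  file.exists(file.path("/proc", as.character(pid)))
}

# Try to claim the pid file at `path`. Returns FALSE if the file names a
# process that is still alive, otherwise writes our own pid and returns TRUE.
acquire_pid_file <- function(path) {
  if (file.exists(path)) {
    old_pid <- suppressWarnings(as.integer(readLines(path, n = 1)))
    if (length(old_pid) == 1 && !is.na(old_pid) && pid_is_running(old_pid)) {
      return(FALSE)
    }
    # otherwise the file is stale or unreadable and is overwritten below
  }
  writeLines(as.character(Sys.getpid()), path)
  TRUE
}

# Run `task` only if no earlier run holds the pid file; the file is
# removed when `task` finishes, even if it stops with an error.
run_once <- function(path, task) {
  if (!acquire_pid_file(path)) {
    stop("Myscript is already running in another process.")
  }
  on.exit(file.remove(path))
  task()
}
```

The script body would then be wrapped as run_once("/var/run/myscript.pid", function() { ....my script ..... }). Two caveats apply to this sketch: the check and the write are not atomic, so two runs starting within the same instant could both proceed, and a stale pid that the system has since reused for an unrelated process would be mistaken for a running script.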
Mikkel