[R] segfault debugging
Brian Ripley
ripley at stats.ox.ac.uk
Sat Dec 1 17:44:56 CET 2012
On 1 Dec 2012, at 16:09, William Dunlap <wdunlap at tibco.com> wrote:
>> valgrind is usually effective for this
>>
>> R -d valgrind -f myscript.R
>
> And adding the R command
> gctorture(TRUE)
> to the top of your script lets valgrind do a better job of
> finding memory misuse.
That makes things even slower, and it really only helps when PROTECT is used incorrectly (including not being used at all). This error looks more like a memory over-run.
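If gctorture() is tried anyway, the slowdown can be contained by switching it on only around the suspect call rather than for the whole script. A minimal sketch, assuming the crash has already been narrowed to one call; `suspect_call` and `with_gctorture` are hypothetical names used only for illustration:

```r
# Hypothetical stand-in for the R function that ends up calling compiled code.
suspect_call <- function(n) {
  x <- numeric(n)
  cumsum(x + 1)
}

# Enable GC torture only while `expr` is evaluated. gctorture() returns the
# previous setting, which is restored on exit (even if an R error occurs).
with_gctorture <- function(expr) {
  old <- gctorture(TRUE)
  on.exit(gctorture(old))
  expr   # the promise is forced here, i.e. with torture switched on
}

result <- with_gctorture(suspect_call(10))
print(result)
```

This only changes when garbage collections happen, so it can expose a missing PROTECT but will not by itself reveal a write past the end of an array.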
Note that valgrind is really only effective for under/over-run errors involving memory allocated by R if the build of R is instrumented (see 'Writing R Extensions').
>
> Bill Dunlap
> Spotfire, TIBCO Software
> wdunlap tibco.com
>
>
>> -----Original Message-----
>> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf
>> Of Martin Morgan
>> Sent: Saturday, December 01, 2012 6:54 AM
>> To: Duncan Murdoch
>> Cc: r-help at r-project.org; Donatella Quagli
>> Subject: Re: [R] segfault debugging
>>
>> On 12/01/2012 04:51 AM, Duncan Murdoch wrote:
>>> On 12-12-01 6:56 AM, Donatella Quagli wrote:
>>>> Thank you so far. Here is an excerpt from the gdb session after a crash:
>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>
>>>> 0xb7509a6b in Rf_allocVector () from /usr/lib/R/lib/libR.so
>>>> (gdb) backtrace
>>>> #0 0xb7509a6b in Rf_allocVector () from /usr/lib/R/lib/libR.so
>>>> #1 0xb744b64c in ?? () from /usr/lib/R/lib/libR.so
>>>> #2 0xb74c58bf in ?? () from /usr/lib/R/lib/libR.so
>>>> #3 0xb74c9c62 in Rf_eval () from /usr/lib/R/lib/libR.so
>>>> #4 0xb74ce60f in Rf_applyClosure () from /usr/lib/R/lib/libR.so
>>>> #5 0xb74c9f29 in Rf_eval () from /usr/lib/R/lib/libR.so
>>>> #6 0xb7503002 in Rf_ReplIteration () from /usr/lib/R/lib/libR.so
>>>> #7 0xb7503298 in ?? () from /usr/lib/R/lib/libR.so
>>>> #8 0xb7503812 in run_Rmainloop () from /usr/lib/R/lib/libR.so
>>>> #9 0xb7503839 in Rf_mainloop () from /usr/lib/R/lib/libR.so
>>>> #10 0x08048768 in main ()
>>>> #11 0xb728de46 in __libc_start_main (main=0x8048730 <main>, argc=8,
>>>> ubp_av=0xbfdb7824, init=0x80488a0 <__libc_csu_init>,
>>>> fini=0x8048890 <__libc_csu_fini>, rtld_fini=0xb7784590,
>>>> stack_end=0xbfdb781c) at libc-start.c:228
>>>> #12 0x08048791 in _start ()
>>>>
>>>> It seems to me that the error is in frame #0. Does it mean that there is a bug
>>>> in libR.so? What can I do next?
>>>
>>> It means that the error was detected when trying to do a memory allocation.
>>> That could be a bug in R, but more likely something else has damaged the memory
>>> management system structures, e.g. a function writing to memory that it doesn't
>>> own.
>>>
>>> Bugs like this are hard to track down, because the damage could have occurred a
>>> long time before it showed up, and small changes to your script could affect it.
>>>
>>> I would try to narrow it down to a single statement in your script. You might
>>> be able to deduce that from the last line printed before the crash. If you
>>> don't have any printing, you could try adding some, but as I mentioned above,
>>> that might make the bug behave differently.
>>>
>>> Another approach is to cut off statements at the end of your script. That
>>> probably won't affect the bug until you cut off the statement that actually
>>> triggered it (but it might, which is why this kind of bug is so frustrating to
>>> track down).
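One way to get "the last line printed before the crash" without editing the script is to evaluate it one top-level expression at a time and announce each expression first. A rough sketch only; `run_stepwise` is a hypothetical helper, and as noted above the extra printing may itself change how the bug behaves:

```r
# Evaluate a script one top-level expression at a time, printing each
# expression before it runs. After a crash, the last line printed shows
# the statement that was executing.
run_stepwise <- function(file, envir = globalenv()) {
  exprs <- parse(file = file)
  for (i in seq_along(exprs)) {
    cat(sprintf("[%d/%d] %s\n", i, length(exprs),
                paste(deparse(exprs[[i]]), collapse = " ")))
    flush(stdout())   # push the line out before a possible crash
    eval(exprs[[i]], envir = envir)
  }
  invisible(length(exprs))
}

# Demonstration on a tiny throwaway script.
script <- tempfile(fileext = ".R")
writeLines(c("a <- 1:3", "b <- sum(a)"), script)
env <- new.env()
n <- run_stepwise(script, envir = env)
```

For a real run the call would be something like run_stepwise("myscript.R"). R's own source("myscript.R", echo = TRUE) gives a similar trace.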
>>>
>>> If you find the bad statement, then look at calls to external code in it, or
>>> in code executed shortly before it. See if any of them look like they contain errors.
>>> Common errors are to write to an array without allocating it, or to write beyond
>>> the bounds of an array, or (in .Call() code) to allocate something and then fail
>>> to protect it from garbage collection.
>>>
>>> You could also figure out what the problem is that caused the seg fault in frame
>>> 0. It might be because some particular variable contains a garbage value. Then
>>> in a new run, you can ask gdb to break when that memory location takes on the
>>> garbage value. This is usually effective if you really can identify the bad
>>> value, but doing that can be hard, especially when you aren't familiar with how
>>> things normally work.
>>
>> valgrind is usually effective for this
>>
>> R -d valgrind -f myscript.R
>>
>> but it requires an operating system where it is available (e.g., Linux) and a
>> quick (say, less than tens of seconds) way of reproducing the bug (because
>> valgrind slows evaluation a lot). So the first step is really to narrow down your
>> large script to something that is easier to re-run, e.g., by saving the important
>> R objects to a file shortly before the problem section of your script, then
>> reproducing the problem by loading those and evaluating a few steps of the code.
>> The bug can still be intermittent; valgrind will likely spot the problem.
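The save-and-reload step described above might look roughly like this. It is a sketch under assumptions: the object names, the checkpoint path and the file name reproducer.R are all hypothetical, and both halves are shown in one session only so that the example runs as written.

```r
# --- In the full script, shortly before the section that crashes ---
# Hypothetical objects that the problem section needs.
inputs <- list(x = rnorm(5), k = 3L)
# tempdir() keeps this example self-contained; in real use choose a fixed
# path that survives across R sessions.
checkpoint <- file.path(tempdir(), "before_crash.rds")
saveRDS(inputs, checkpoint)

# --- In the short reproducer (e.g. reproducer.R) ---
restored <- readRDS(checkpoint)
stopifnot(identical(restored, inputs))   # objects round-trip unchanged
# ... followed by the few statements that trigger the fault, operating on
# restored$x and restored$k.
str(restored)
```

The reproducer is then the file to run under valgrind, e.g. R -d valgrind -f reproducer.R, which should take far less time than the full script.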
>>
>> Martin
>>
>>>
>>> Good luck!
>>>
>>> Duncan Murdoch
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>> --
>> Computational Biology / Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N.
>> PO Box 19024 Seattle, WA 98109
>>
>> Location: Arnold Building M1 B861
>> Phone: (206) 667-2793
>>