While debugging RavenDB’s new 32-bit pager for Linux-based ARM environments, whose platform-specific functionality is implemented in C and P/Invoked from C# code, I ran into an issue: on startup, RavenDB threw a segmentation fault and crashed. Since the C# code hadn’t changed much, my immediate suspect was some sort of pointer issue in the C code, such as dereferencing a null pointer.
The GNU Debugger or GDB is very good at tracing such issues. Let’s see how we can find a segfault source in a small example.
Consider the following code:
Now, let’s compile this and use GDB to find where the segfault happens:
Here, a.out is the executable compiled from segmentFaultThrower.c.
Running GDB this way yields the following output:
GNU gdb (Ubuntu 8.1-0ubuntu3.2) 8.1.0.20180409-git
At this point our application is loaded with GDB attached, but it is paused. Executing the run command actually starts the program.
The bt command (backtrace) will show the stack trace at the point where the segfault happened.
This is nice, but GDB can do better! Compiling our application with the -g flag includes debug information in the executable.
So, if we re-compile with the flag, start GDB and issue the run command, we will see the following:
The bt command will now print source code lines in the stack trace.
The new 32-bit pager that I mentioned at the beginning uses P/Invoke calls into C code that accesses operating-system APIs, such as memory-mapping functions.
I made sure to compile the C library with the -g flag and ran RavenDB with GDB attached. I saw the following output (notice the output of the bt command):
Thread 1 "Raven.Server" received signal SIGSEGV, Segmentation fault.
Such output looked weird to me, especially the corrupt stack part, so I looked at the relevant code.
The P/Invoke call in the C# part looked like this:
var rc = rvn_mmap_file(size,
In the C part, the rvn_mmap_file() signature looks like this:
EXPORT int32_t rvn_mmap_file(int64_t sz, int64_t flags, void *handle, int64_t offset, void **addr, int32_t *detailed_error_code)
In this case, int64_t and int32_t are typedefs for the platform’s 64-bit and 32-bit signed integer types, respectively.
The first thing I noticed was that the handle parameter value is 0 (a null pointer) and the offset parameter has an unreasonably large value.
In the C# code, by the time rvn_mmap_file() is invoked, _handle is guaranteed to have a value (otherwise the code would have failed earlier). Together with the corrupt stack notification from GDB while executing the bt command, this made me suspect that some argument offsets were wrong, since the segfault happens during this call.
After looking some more at the code, I noticed that the flags parameter is int64_t and the definition of the corresponding flags enum in C# looks like this:
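Since C# enums use System.Int32 as their underlying type by default, this was in fact the issue: the C# side passed a 32-bit value where the C side expected 64 bits, which shifted every argument after it. The fix was simply to change the flags type so the signature became: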
EXPORT int32_t rvn_mmap_file(int64_t sz, int32_t flags, void *handle, int64_t offset, void **addr, int32_t *detailed_error_code)
Usually, segmentation faults are associated with null pointer dereferences or other kinds of pointer issues, but as we can see here, this doesn’t have to be so.