Most people have heard of debugging tools like
gdb. For everything else, it’s
usually just continuous testing and catching problems by eye. But
sometimes it’s not that simple.
Take, for example (and I’ve been using this example a lot),
libdpx. I had been
trying to clean up the code for the past two days, but when I moved a simple
statement that should have worked, everything essentially went to hell. You can see
that commit here. Basically, the
alchanfree() call is commented out because if I tried to call it,
segfaults would be everywhere.
At first glance, this doesn’t look bad - we’re initialising a new channel, then sending the frame to the write frames method. Then we wait for something to come back. Because of cooperative threading, everything should work just dandily.
But when I uncommented the
alchanfree() method, stuff didn’t work. And I was
utterly confused. I was wondering why all of a sudden, I was getting segfaults
left and right.
Valgrind is this tool that not many people have heard of for some reason, yet
it is one of the most valuable tools for memory checking that you could ever
have. It basically acts as a middle man between your program and libc, intercepting
free/malloc/calloc/realloc calls and recording addresses, then
making sure whatever you allocated gets
free’d later. As a result, it can
also detect reads and writes to
free’d pieces of memory, which might segfault
under normal conditions (or, worse, silently appear to work).
But most people just ignore what
valgrind is saying. That’s bad. Don’t do that.
Some sample output:
==802== Memcheck, a memory error detector
==802== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==802== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==802== Command: ./check_dpx
==802==
==802== Syscall param timer_create(evp) points to uninitialised byte(s)
==802==    at 0x385AC03E72: timer_create@@GLIBC_2.3.3 (timer_create.c:82)
==802==    by 0x526E47B: srunner_run (check_run.c:407)
==802==    by 0x40361A: main (check.c:499)
==802==  Location 0xffefff750 is 0 bytes inside local_evp._sigev_un,
==802==  declared at timer_create.c:57, in frame #0 of thread 1
==802==  Uninitialised value was created by a stack allocation
==802==    at 0x3859015185: _dl_runtime_resolve (dl-trampoline.S:46)
==802==
==830== HEAP SUMMARY:
==830==     in use at exit: 572,679 bytes in 96 blocks
==830==   total heap usage: 339 allocs, 243 frees, 1,510,826 bytes allocated
==830==
==830== 3 bytes in 1 blocks are definitely lost in loss record 5 of 83
==830==    at 0x4A0645D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==830==    by 0x4C1A428: _dpx_frame_msgpack_from (frame.c:171)
==830==    by 0x4C149DB: _dpx_duplex_conn_read_frames (conn.c:60)
==830==    by 0x4E5948B: taskstart (task.c:71)
==830==    by 0x38594479FF: ??? (in /usr/lib64/libc-2.18.so)
==830==
==830== 72 bytes in 1 blocks are definitely lost in loss record 66 of 83
==830==    at 0x4A0645D: malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==830==    by 0x4C174BD: dpx_frame_new (frame.c:27)
==830==    by 0x402700: test_dpx_call (check.c:224)
==830==    by 0x4029E6: test_dpx_rpc_call (check.c:279)
==830==    by 0x526E87D: srunner_run (check_run.c:396)
==830==    by 0x40361A: main (check.c:499)
==830==
==830== 288 bytes in 1 blocks are possibly lost in loss record 72 of 83
==830==    at 0x4A081D4: calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==830==    by 0x3859011C44: _dl_allocate_tls (dl-tls.c:296)
==830==    by 0x3859808862: pthread_create@@GLIBC_2.2.5 (allocatestack.c:580)
==830==    by 0x4C1BCD5: dpx_init (dpx.c:172)
==830==    by 0x4028F1: test_dpx_rpc_call (check.c:263)
==830==    by 0x526E87D: srunner_run (check_run.c:396)
==830==    by 0x40361A: main (check.c:499)
==830==
==830== LEAK SUMMARY:
==830==    definitely lost: 75 bytes in 2 blocks
==830==    indirectly lost: 0 bytes in 0 blocks
==830==      possibly lost: 288 bytes in 1 blocks
==830==    still reachable: 572,316 bytes in 93 blocks
==830==         suppressed: 0 bytes in 0 blocks
==830== Reachable blocks (those to which a pointer was found) are not shown.
==830== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==830==
==830== For counts of detected and suppressed errors, rerun with: -v
==830== ERROR SUMMARY: 7 errors from 4 contexts (suppressed: 2 from 2)
All that output looks, frankly, terrifying at first. But it’s actually not that hard to interpret. It tells you how many blocks of memory it thinks are lost, and where they were allocated. That’s it!
How about output for invalid reads?
I don’t have those, because I rage… obliterated them. Yes. That.
My problem was that something was trying to write to the channel after it was free’d. That has now been solved, thanks to valgrind outputting 8 pages of “invalid write of size 8, here (insert stacktrace), to a block that was free’d here (insert stacktrace)”.
valgrind treats you like you don’t know what you’re doing
In my case, that’s probably true. Manual memory management sucks.
But it’s good, because there are definitely way more actual errors detected than false positives. Case in point, here was my recycling bin after I printed out my valgrind output and went through all of them one by one:
I feel old-fashioned because I do code reviews on paper, but hey, it’s easier for me. I can’t be the only one who agrees… right?
so what are you trying to tell me
Valgrind your program. It’s good for detecting memory leaks and the like, can tell you when you’re trying to murder poor memory regions you don’t have access to, and does other cool things too. Really.
Here, I’ll get you started! You can run valgrind with
valgrind --leak-check=yes --read-var-info=yes --track-origins=yes ./[program]
and receive your lovely output. By default, valgrind prints to stderr. Redirect
it to stdout (2>&1) if you want to pipe it to
ansi2html or something similar
to print it out.
what was the point of this article
I don’t know, just that I had a problem, used Valgrind, solved said problem?