Most people have heard of debugging tools like gdb. Beyond that, debugging is usually just continuous testing and catching problems by eye. But sometimes it’s not that simple.

Take libdpx, for example (and I’ve been using this example a lot). I spent the past two days trying to clean up the code, but when I moved a simple statement that should have worked, everything essentially went to hell. You can see that commit here. Basically, the alchanfree() call is commented out because if I tried to chanfree(), segfaults would be everywhere.

At first glance, this doesn’t look bad - we’re initialising a new channel, then sending the frame to the write frames method. Then we wait for something to come back. Because of cooperative threading, everything should work just dandily.

But when I uncommented the alchanfree() method, stuff didn’t work. And I was utterly confused. I was wondering why all of a sudden, I was getting segfaults left and right.

enter valgrind

Valgrind is a tool that surprisingly few people seem to have heard of, yet it is one of the most valuable memory checkers you could ever have. Its default tool, Memcheck, runs your program on a synthetic CPU and intercepts malloc/calloc/realloc/free calls, recording addresses so it can check that whatever you allocate is free’d later. As a result, it can also detect reads and writes to free’d pieces of memory, which under normal conditions might segfault or, worse, silently corrupt data.
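To make the bookkeeping idea concrete, here is a toy sketch in C of "remember every allocation, tick it off on free, see what’s left over". It is emphatically not how valgrind is implemented; it only mirrors the record-and-compare idea. Every name in it (demo_malloc, demo_free, demo_outstanding, DEMO_MAX_BLOCKS) is made up for illustration.

```c
#include <stdlib.h>

#define DEMO_MAX_BLOCKS 64   /* arbitrary cap, just for this sketch */

static void  *demo_live[DEMO_MAX_BLOCKS];  /* addresses handed out, not yet freed */
static size_t demo_live_count = 0;

/* malloc wrapper that remembers the address it returns. */
void *demo_malloc(size_t size) {
    if (demo_live_count == DEMO_MAX_BLOCKS)
        return NULL;                 /* table full: refuse rather than lose track */
    void *p = malloc(size);
    if (p != NULL)
        demo_live[demo_live_count++] = p;
    return p;
}

/* free wrapper: returns 0 if p was a live tracked block,
 * -1 otherwise (never allocated here, or already freed). */
int demo_free(void *p) {
    for (size_t i = 0; i < demo_live_count; i++) {
        if (demo_live[i] == p) {
            /* swap-remove: overwrite this slot with the last entry */
            demo_live[i] = demo_live[--demo_live_count];
            free(p);
            return 0;
        }
    }
    return -1;
}

/* Number of blocks still outstanding; nonzero at exit means something leaked. */
size_t demo_outstanding(void) {
    return demo_live_count;
}
```

If a program routed all its allocations through these wrappers, a nonzero demo_outstanding() at exit would be the toy equivalent of a leak report, and a -1 from demo_free() the toy equivalent of an invalid free. The real tool does far more, including tracking where each block was allocated.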

But most people just ignore what valgrind is saying. That’s bad. Don’t do that.

Some sample output:

==802== Memcheck, a memory error detector
==802== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==802== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==802== Command: ./check_dpx
==802== Syscall param timer_create(evp) points to uninitialised byte(s)
==802==    at 0x385AC03E72: timer_create@@GLIBC_2.3.3 (timer_create.c:82)
==802==    by 0x526E47B: srunner_run (check_run.c:407)
==802==    by 0x40361A: main (check.c:499)
==802==  Location 0xffefff750 is 0 bytes inside local_evp._sigev_un,
==802==  declared at timer_create.c:57, in frame #0 of thread 1
==802==  Uninitialised value was created by a stack allocation
==802==    at 0x3859015185: _dl_runtime_resolve (dl-trampoline.S:46)
==830==     in use at exit: 572,679 bytes in 96 blocks
==830==   total heap usage: 339 allocs, 243 frees, 1,510,826 bytes allocated
==830== 3 bytes in 1 blocks are definitely lost in loss record 5 of 83
==830==    at 0x4A0645D: malloc (in /usr/lib64/valgrind/
==830==    by 0x4C1A428: _dpx_frame_msgpack_from (frame.c:171)
==830==    by 0x4C149DB: _dpx_duplex_conn_read_frames (conn.c:60)
==830==    by 0x4E5948B: taskstart (task.c:71)
==830==    by 0x38594479FF: ??? (in /usr/lib64/
==830== 72 bytes in 1 blocks are definitely lost in loss record 66 of 83
==830==    at 0x4A0645D: malloc (in /usr/lib64/valgrind/
==830==    by 0x4C174BD: dpx_frame_new (frame.c:27)
==830==    by 0x402700: test_dpx_call (check.c:224)
==830==    by 0x4029E6: test_dpx_rpc_call (check.c:279)
==830==    by 0x526E87D: srunner_run (check_run.c:396)
==830==    by 0x40361A: main (check.c:499)
==830== 288 bytes in 1 blocks are possibly lost in loss record 72 of 83
==830==    at 0x4A081D4: calloc (in /usr/lib64/valgrind/
==830==    by 0x3859011C44: _dl_allocate_tls (dl-tls.c:296)
==830==    by 0x3859808862: pthread_create@@GLIBC_2.2.5 (allocatestack.c:580)
==830==    by 0x4C1BCD5: dpx_init (dpx.c:172)
==830==    by 0x4028F1: test_dpx_rpc_call (check.c:263)
==830==    by 0x526E87D: srunner_run (check_run.c:396)
==830==    by 0x40361A: main (check.c:499)
==830==    definitely lost: 75 bytes in 2 blocks
==830==    indirectly lost: 0 bytes in 0 blocks
==830==      possibly lost: 288 bytes in 1 blocks
==830==    still reachable: 572,316 bytes in 93 blocks
==830==         suppressed: 0 bytes in 0 blocks
==830== Reachable blocks (those to which a pointer was found) are not shown.
==830== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==830== For counts of detected and suppressed errors, rerun with: -v
==830== ERROR SUMMARY: 7 errors from 4 contexts (suppressed: 2 from 2)

All that output looks, frankly, terrifying at first. But it’s actually not that hard to interpret. It tells you how many blocks of memory it thinks are lost, and where they were allocated. That’s it!
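For a feel of what produces a "definitely lost" record, here is a minimal sketch of the pattern: a heap block whose only pointer goes out of scope. The demo_frame type and the demo_* functions are hypothetical stand-ins, not libdpx’s real frame code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical frame type; not libdpx's real dpx_frame. */
typedef struct {
    char  *payload;
    size_t len;
} demo_frame;

/* Allocate a frame holding a private copy of payload; NULL on failure. */
demo_frame *demo_frame_new(const char *payload) {
    demo_frame *f = malloc(sizeof *f);
    if (f == NULL)
        return NULL;
    f->len = strlen(payload);
    f->payload = malloc(f->len + 1);
    if (f->payload == NULL) {
        free(f);
        return NULL;
    }
    memcpy(f->payload, payload, f->len + 1);  /* copies the trailing '\0' too */
    return f;
}

/* Release the payload first, then the frame itself. */
void demo_frame_free(demo_frame *f) {
    if (f == NULL)
        return;
    free(f->payload);
    free(f);
}

/* Leaky: the only pointer to the frame dies when this function returns. */
size_t demo_payload_len_leaky(const char *payload) {
    demo_frame *f = demo_frame_new(payload);
    if (f == NULL)
        return 0;
    return f->len;               /* f is never freed */
}

/* Fixed: same work, but the frame is released before returning. */
size_t demo_payload_len_fixed(const char *payload) {
    demo_frame *f = demo_frame_new(payload);
    if (f == NULL)
        return 0;
    size_t len = f->len;
    demo_frame_free(f);
    return len;
}
```

Calling demo_payload_len_leaky() once from a main() and running the result under valgrind with --leak-check=full should, as far as I’d expect, count the frame as definitely lost and its payload as indirectly lost, with the stack trace pointing at the malloc calls in demo_frame_new(). The fixed version should come out clean.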

How about output for invalid reads?

I don’t have those, because I rage… obliterated them. Yes. That.

My problem was that something was trying to write to the channel after it was free’d. That has now been solved, thanks to valgrind outputting 8 pages of “invalid write of size 8, here (insert stacktrace), to a block that was free’d here (insert stacktrace)”.
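For anyone who hasn’t met this class of bug, here is a stripped-down sketch of a write-after-free and one common way to make it fail loudly instead. This is not the actual libdpx fix; demo_chan and its functions are invented for illustration, and the "size 8" in the comment assumes a platform where long is 8 bytes (such as 64-bit Linux).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a channel; not the real channel type. */
typedef struct {
    long last_value;
} demo_chan;

/* Allocate a zeroed channel; NULL on failure. */
demo_chan *demo_chan_new(void) {
    return calloc(1, sizeof(demo_chan));
}

/* Write into the channel. Returns 0 on success, -1 if the channel is gone. */
int demo_chan_send(demo_chan *c, long value) {
    if (c == NULL)
        return -1;
    c->last_value = value;
    return 0;
}

/* Free the channel AND clear the caller's pointer, so a later send through
 * that same pointer returns -1 instead of scribbling on freed memory. */
void demo_chan_free(demo_chan **cp) {
    assert(cp != NULL);
    free(*cp);
    *cp = NULL;
}

/* The buggy order, for comparison (only ever run this under valgrind):
 *
 *     demo_chan *c = demo_chan_new();
 *     free(c);
 *     c->last_value = 42;   // reported as an invalid write of size 8,
 *                           // inside a block that was free'd
 */
```

Clearing the pointer only protects that one pointer. If another task holds its own copy of the address, that copy still dangles, which is exactly the kind of thing valgrind’s stack traces help you track down.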

valgrind treats you like you don’t know what you’re doing

In my case, that’s probably true. Manual memory management sucks.

But it’s good, because there are definitely way more actual errors detected than false positives. Case in point, here was my recycling bin after I printed out my valgrind output and went through all of the errors one by one:

so much paper...

I feel old-fashioned because I do code reviews on paper, but hey, it’s easier for me. I can’t be the only one who does this… right?

so what are you trying to tell me

Valgrind your program. It’s good for detecting memory leaks and the like, it can tell you when you’re trying to murder poor memory regions you don’t have access to, and it does other cool things too. Really.

Here, I’ll get you started! You can run valgrind with valgrind --leak-check=yes --read-var-info=yes --track-origins=yes ./[program] and receive your lovely output. By default, valgrind prints to stderr. Redirect it to stdout (2>&1) if you want to pipe it to ansi2html or something similar to print it out.

what was the point of this article

I don’t know, just that I had a problem, used Valgrind, solved said problem?