Tag Archives: GSoC2012

The second hardest bug I have ever encountered, an essay on C: The Portable Assembly

C is a special language. I’ve always understood that, but I used to think it referred to C’s history. Reading tutorials as an intrepid learner, I kept coming across references to C as the syntactic parent of language X. What I did not know is that C is fundamentally different from all other modern languages.

C is a portable assembly, to borrow a common description. I’ve heard people refute the description, citing C’s expressive power and structured coding. I am forced to agree with their argument but not their conclusion. C does indeed look nothing like assembly, yet that difference is only on the surface, and the surface is easy to scratch through. C abstracts away the triviality of assembly but leaves the programmer exposed to the intricacies.

For generic programming C++ has templates, Java has Generics, yet C only has void pointers. Void pointers are a curtain beyond which the compiler cannot perform any type checking. For libjtapi I needed a generic dictionary data structure, which meant I needed void pointers. Thus over the weekend I found myself programming in C but debugging like I was in assembly.
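To make the curtain concrete, here is a minimal sketch of a void-pointer dictionary. It is not libjtapi’s code: the names `dict`, `dict_put` and `dict_get`, and the fixed capacity, are assumptions made up for illustration.

```c
#include <stddef.h>
#include <string.h>

#define DICT_CAPACITY 8

/* Hypothetical fixed-size dictionary; values are untyped void pointers. */
struct dict_entry {
    const char *key;
    void *value;
};

struct dict {
    struct dict_entry entries[DICT_CAPACITY];
    size_t count;
};

/* Store a pointer under a key, replacing any existing value.
   Returns 0 on success, -1 when the dictionary is full. */
int dict_put(struct dict *d, const char *key, void *value)
{
    size_t i;
    for (i = 0; i < d->count; i++) {
        if (strcmp(d->entries[i].key, key) == 0) {
            d->entries[i].value = value;
            return 0;
        }
    }
    if (d->count == DICT_CAPACITY)
        return -1;
    d->entries[d->count].key = key;
    d->entries[d->count].value = value;
    d->count++;
    return 0;
}

/* Look up a key. Returns NULL when the key is missing. The caller
   must remember the real type; the compiler cannot check it. */
void *dict_get(const struct dict *d, const char *key)
{
    size_t i;
    for (i = 0; i < d->count; i++) {
        if (strcmp(d->entries[i].key, key) == 0)
            return d->entries[i].value;
    }
    return NULL;
}
```

A caller reads a value back with something like `*(int *)dict_get(&d, "copies")`. If the pointer stored under that key was not really a pointer to an int, the compiler says nothing.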

The bug itself is less significant than what it took to debug it. It is a story that might illustrate why C is like no other modern language.

Of course the bug was my fault. There is a saying that poor programmers doubt their tools, something I have seen before. Despite knowing this truth I was almost ready to test my code with a different compiler, thinking GCC was at fault.

If I ignored the bug it disappeared. That is not how bugs are supposed to work. A Heisenbug.

The bug was sneaky; it hid itself. If I were to comment out the error case that checked for it, then of course libjtapi would never complain. That is an obvious result. The logical consequence of doing so would be unreported data loss. That is how bugs work. Yet if I did comment out the check, everything worked as intended and no data was lost. Line for line the input was the same as the output, just as it should be.

The next logical step would be tracing print statements showing what was occurring. In modern languages you can even dump complete data structures; in C you have to settle for printing primitives like integers. Yet even a single print statement would throw the code into a state of data loss. Data that was supposed to be in a dictionary was being reported as non-existent between two layers of abstraction.

I needed to find the bug’s on/off switch. Knowing what caused the bug would tell me the bug’s type, and that would tell me where to look.

My next step was to add an intentional time delay. A sleep(1), which pauses for one second, in the right spot would cause breakage in a different dictionary: progress. A sleep(0.001) call would bring us back to the land of functionality: more progress. I should make a side note here: this was not the next logical step of debugging. Libjtapi is not multi-threaded and memory writes are synchronous, so it was impossible for timing to cause the bug. Yet my brain, getting ever more tired, was not willing to accept this.
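A side note on the second call, assuming it was the POSIX sleep() from unistd.h with its prototype in scope: that function takes an unsigned int number of seconds, so a fractional argument is converted before the call. The sketch below uses a stand-in function, `seconds_received` (a made-up name), with the same parameter type to show what the callee would actually receive.

```c
/* Stand-in with the same parameter type as POSIX sleep().
   Returns the value the callee receives after conversion. */
unsigned int seconds_received(unsigned int seconds)
{
    return seconds;
}
```

Under that assumption `seconds_received(0.001)` yields 0, so the "short delay" would have been no delay at all, which fits with timing not being the real cause.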

In Calgary darkness is warded off by city lights

Programming at night is often a bad idea, doubly so for debugging. By now it was getting late and this bug was becoming mythical. The simple act of checking made it leap into existence and voodoo acts of timing would ward it off. A sense of dread took form. I knew this type of bug from assembly: my bug was somehow related to the intricacies of hardware. No amount of Stack Overflow is going to solve a fundamental misunderstanding of how my hardware works.

In the morning I took a different approach: maybe some variables were not getting initialized. Initializing one variable appeared to yield progress; everything worked. By now I was getting wise to the bug’s methods. I proceeded to add explicit initialization to other variables. The code responded in turn by switching between breaking in two places and not breaking anywhere.

This was real progress. My changes amounted to consuming room in the lower memory space by expanding my binary, which pushed all memory addresses, and thus pointers, up in memory. Thus the bug was somehow related to memory alignment. Memory alignment is a hardware restriction: a datatype can only be read from an address that is divisible by N, where N is based on the datatype’s size. Some void pointer was pointing to an address illegitimate for the datatype.
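The divisibility rule can be checked directly. The following is a minimal sketch, assuming a C11 compiler for `_Alignof` and C99’s `uintptr_t`; the helper name `is_aligned` is made up for illustration and is not part of libjtapi.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 when the address held in ptr is a multiple of alignment,
   0 otherwise. alignment must be non-zero. */
int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}
```

A check such as `is_aligned(p, _Alignof(int))` before casting a void pointer `p` to `int *` would flag an address that is illegitimate for the datatype. How the hardware reacts to a misaligned read varies: some platforms trap, others only slow down.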

Libjtapi keeps metadata in arrays prior to dynamic caching

The question thus became which pointer? Before I could answer that I needed to identify the structure into which said pointer pointed.

With the intention of narrowing the candidates I added global char variables to pad the binary. This let me cycle through the bug’s three states. The idea was to find some number with which to identify the structure’s size, and it worked. I got 3. But before I could put that to use I got the best hint possible: libjtapi started to segfault. The illegitimate pointer was getting nudged just outside the boundary of libjtapi’s mapped memory. With this giant arrow I squashed the bug with ease.

You can see the current fixed code at the top of this post. C is not at fault for the bug, I am. Yet it is not to C’s credit that I lost a day of productivity. Sure, an experienced C programmer could have avoided the bug. They would have noticed the invalid void cast that caused it. Yet so would an experienced assembly programmer.
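One common way to catch an invalid void cast at run time is to carry a type tag next to the pointer. This is only a sketch of that general technique, not the fix that went into libjtapi; `value_type`, `tagged_value` and `value_as_int` are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical type tags for values stored behind a void pointer. */
enum value_type { VALUE_INT, VALUE_STRING };

struct tagged_value {
    enum value_type type;
    void *data;
};

/* Returns a pointer to the int, or NULL when the value is missing
   or its tag says it is not an int. */
int *value_as_int(const struct tagged_value *v)
{
    if (v == NULL || v->type != VALUE_INT)
        return NULL;
    return (int *)v->data;
}
```

The cast still exists, but it now lives in one function that refuses to perform it when the tag disagrees.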

libjtapi data can now do a round trip

Today libjtapi reached a major milestone. It can now consume and produce a basic job ticket. This provides a way to inspect what data is in a libjtapi object. Prior to this I had to walk through the debugger for even the most trivial debugging. I am glad I took the assembly course last semester, otherwise progress would have been even slower.

The past week has been a flurry of bug fixes across the entire codebase. Libjtapi is now proper C89 and all buffer overruns have been eliminated.

Programming is a lot more fun now that I can see the results without a debugger. My next task is to eliminate memory leaks and add jtapi’s object destructors. After that comes support for the missing jtapi interfaces and objects.

Enforcing better software engineering on myself, an experiment

I like to pretend I am a software engineer despite being enrolled in my university’s computer science program.

One of the many ways I need to improve as a software engineer is to keep code at the same level of abstraction. An example of a failure to maintain coherent abstraction would be to manipulate one linked list by hand in the same function where you manipulate a different list using proper accessor methods. Instead you should be using accessor methods for both lists.
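As a minimal sketch of what that looks like in C (the `list` type and its functions are made up for illustration, not taken from libjtapi), the function that works with two lists goes through accessors for both and never touches a `next` pointer itself:

```c
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

struct list {
    struct node *head;
    size_t length;
};

/* Accessor: push a caller-owned node onto the front of the list. */
void list_push(struct list *l, struct node *n)
{
    n->next = l->head;
    l->head = n;
    l->length++;
}

/* Accessor: detach and return the front node, or NULL when empty. */
struct node *list_pop(struct list *l)
{
    struct node *n = l->head;
    if (n == NULL)
        return NULL;
    l->head = n->next;
    n->next = NULL;
    l->length--;
    return n;
}

/* One level of abstraction: both lists are handled through accessors.
   Returns 0 on success, -1 when the source list is empty. */
int list_move_front(struct list *from, struct list *to)
{
    struct node *n = list_pop(from);
    if (n == NULL)
        return -1;
    list_push(to, n);
    return 0;
}
```

The failure mode described above would be a `list_move_front` that calls `list_pop(from)` and then rewires `to->head` by hand.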

In libjtapi I am experimenting with not having header files. This will force me to treat the libjtapi call-stack like a Directed Acyclic Graph. The important part is the acyclic attribute. It means that a low level function cannot call a higher level function, and this is enforced by the compiler. I thus have to think hard about which layer of abstraction I am coding in at any point in time.
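A minimal sketch of the idea, assuming everything lives in one translation unit with no forward declarations; the function names are hypothetical. Each function can only see what is defined above it in the file:

```c
#include <string.h>

/* Layer 0: lowest level. Nothing else in this file is visible yet,
   so it cannot call upward. */
size_t key_length(const char *key)
{
    return strlen(key);
}

/* Layer 1: may call layer 0 because key_length is already defined. */
int key_is_valid(const char *key)
{
    return key != NULL && key_length(key) > 0;
}

/* Layer 2: may call layers 0 and 1. If key_length tried to call this
   function, no declaration would be visible at that point. */
int ticket_accepts_key(const char *key)
{
    return key_is_valid(key) && key_length(key) < 32;
}
```

One caveat: under C89 rules a call to a function that has not been declared yet is treated as an implicit declaration rather than rejected, so the enforcement likely depends on a compiler setting such as GCC’s `-Werror=implicit-function-declaration`, or on compiling as C99 or later.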

Or at least that is the hypothesis. I’ll see how the code matures.

libjtapi first code

After struggling with NetBeans as a build platform I’ve switched to CMake.

The motivation for NetBeans was to reduce the initial barriers to development. I expected that an eventual move to a full-fledged build system would be needed, but I wanted to put that off until libjtapi had a substantial codebase. The good news is that CMake was much easier to set up than I expected. Thus I have now started on libjtapi proper.

My next milestone is to get the libjtapi hello world compiling.

Then fill out the dummy functions so that the hello world does what it is supposed to do.

Then refactor and code review.

 

Printed jtapi & headers

Before I write a single line of code I want to build a thorough mental map of the jtapi standard. The first thing I did was print off the headers and study those. That only took me so far and raised some further questions. Now I’m reading the jtapi spec itself.

The jtapi spec is on top and the headers are on the bottom.

It goes to show how rarely I print that I was surprised by how many pages 79 pages are once printed.