
Nullptr

This last week, I spent a couple days converting all of the uses of NULL in Sauce to nullptr. As such, I wanted to take this opportunity to jot down some notes on what the differences are and why I believe that it is worth your time to convert your own code to use nullptr if you haven’t already.

NULL

Similar to many other C++ code bases, Sauce used the following code to define NULL:

#if !defined(NULL)
   #define NULL	0
#endif

While it is true that many commercial software packages and games have shipped with this definition of NULL, it is still problematic.

The reason is that NULL is simply an integer, not an actual null pointer. The danger stems from the fact that other constructs can be implicitly converted to and from integers. This means that using NULL can hide bugs.

In fact, I’ll be the first to admit that during the transition to nullptr, I found a few of these types of conversion errors in Sauce. Sure, the code still compiled and ran — but it was a bit disheartening to find them nonetheless.

Nullptr

Unlike NULL, nullptr is a keyword (available starting in C++11). It can be implicitly converted to any pointer type, but it cannot be implicitly converted to an integer. This allows the compiler to catch exactly the sort of type mismatches we would hope it would.

Let’s walk through an example to see the difference.

An Example

The problem we will explore in this example is the fact that booleans can be compared with NULL without a compiler warning. This is because the C++ standard declares that there is an implicit conversion from bool to int.

// signature:
Result* DoSomething();

// client code:
if (DoSomething() == NULL)
   printf("NULL\n");
else
   printf("NOT NULL\n");

Although this example is a bit contrived, the danger it exemplifies is real.

What happens if we change the return type of DoSomething() from Result* to bool? A change like this is certainly not unheard of — instead of using a full object, maybe we feel like we can reduce the result to a simple boolean.

// signature:
bool DoSomething();

// client code:
if (DoSomething() == NULL)
   printf("NULL\n");
else
   printf("NOT NULL\n");

The code still compiles — no additional warnings (even on Warning level 4!). That seems wrong… checking whether a boolean is equal to a null pointer is nonsensical, and the compiler should bark when it comes across code like this, right?

Unfortunately, the compiler can’t detect the issue because we are using a #define as a stand-in for a null pointer. Remember, it’s not really a null pointer, it’s just the same value that a null pointer evaluates to: 0. Therefore, we shouldn’t be surprised when corner cases like this result in unexpected behavior.

So what happens if we replace NULL with the nullptr keyword?

// signature:
bool DoSomething();

// client code:
if (DoSomething() == nullptr)
   printf("NULL\n");
else
   printf("NOT NULL\n");

Now the compiler will generate an error stating that there is no conversion from 'nullptr' to 'int'. This is much better. We know that there is a type mismatch in the comparison, and we can repair the issue.

Final Thoughts

Simply put, the nullptr keyword is a true null pointer, while NULL is not.
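Another way to see the difference is overload resolution. Here is a minimal sketch (the Foo overloads are purely illustrative); with NULL defined as plain 0, as in the Sauce definition above, a call like Foo(NULL) selects the integer overload, while nullptr can only select the pointer overload:

#include <cstdio>

void Foo(int)   { printf("Foo(int)\n"); }
void Foo(char*) { printf("Foo(char*)\n"); }

int main()
{
   Foo(0);         // what Foo(NULL) boils down to with "#define NULL 0" -- calls Foo(int)
   Foo(nullptr);   // nullptr converts only to pointer types -- calls Foo(char*)
   return 0;
}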

When I was first deciding whether I was going to undertake the conversion task, I felt a bit overwhelmed at the number of changes that I would have to make (at the time there were over 5000 instances of NULL in Sauce). However, as I mentioned earlier, had I not made the transition to nullptr, those silent implicit conversion bugs would surely still be there. As such, I feel that Sauce is far better off with nullptr.

Scoped Enums

For the most part, I really like the C++ language. That said, I also have a small list of things that I wish had been done differently. For years, enumerations have been at the top of that list. Enums have a couple of distinct problems that make them troublesome, and while there are techniques to mitigate some of their issues, they still remain fundamentally flawed.

Thankfully, C++11 added scoped enums (or “strongly-typed” enums), which address these problems head-on. In my opinion, the best part about scoped enums is that the new syntax is intuitive and feels natural to the C++ language.

In an effort to build a case for why scoped enums are superior, we will first discuss the aforementioned deficiencies of their unscoped counterparts. Throughout this discussion we will also outline how we addressed some of these concerns in Sauce. Afterward, we will explore scoped enums and the task of transitioning Sauce to use them.

Terminology

Before we begin, let’s briefly establish some terminology. An unscoped enum has the following form:

enum IDENTIFIER
{
   ENUMERATOR,
   ENUMERATOR,
   ENUMERATOR,
};

The identifier is also referred to as the “type” of the enum. The list inside the enum is composed of enumerators. Each enumerator has an integral value.

Problem 1: Enumerators are treated as integers inside the parent scope.

Aliased Values

Consider the case where you have two enums inside the same parent scope. Unfortunately, there is no enforcement by the compiler that a given enumerator is associated with one enum rather than the other. This can cause a couple of issues. Here’s an example:

namespace Example1
{
   enum Shape
   {
      eSphere,
      eBox,
      eCone,
   };

   enum Material
   {
      eColor,
      eTexture,
   };
}

Now let’s see what happens when we try to use these enums in some client code:

const Example1::Shape shape = Example1::eSphere;
if (shape == Example1::eSphere)
   printf("SPHERE\n");
if (shape == Example1::eBox)
   printf("BOX\n");
if (shape == Example1::eCone)
   printf("CONE\n");

if (shape == Example1::eColor)
   printf("COLOR\n");
if (shape == Example1::eTexture)
   printf("TEXTURE\n");

The code above prints out both “SPHERE” and “COLOR”. This is because unscoped enum enumerators are implicitly converted to integers and the value of shape is 0, which matches both eSphere and eColor.

Sadly, the only workable solution is to manually assign a value to each of the enumerators that is unique within the parent scope. This is far from ideal due to the added maintenance cost.
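For illustration, the manual workaround might look like this (the values and the Example1B name are arbitrary; the only requirement is that values don’t collide across the two enums):

namespace Example1B
{
   enum Shape
   {
      eSphere  = 0,
      eBox     = 1,
      eCone    = 2,
   };

   enum Material
   {
      eColor   = 100,   // manually offset so no Material value aliases a Shape value
      eTexture = 101,
   };
}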

Enumerator Name Clashes

Additionally, there is a second issue that arises from the fact that enums are swallowed into their parent scope: enumerator name clashes. For instance, consider modifying the previous case to add an “invalid” enumerator to each enum. While this makes sense conceptually, the following code will not compile:

namespace Example2A
{
   enum Shape
   {
      eInvalid,
      eSphere,
      eBox,
      eCone,
   };

   enum Material
   {
      eInvalid,
      eColor,
      eTexture,
   };
}

Although enumerator name clashes are not too common, it is generally bad practice to establish coding conventions that depend on the rarity of such situations.

Consequently, this usually forces you to mangle the enumerator names to include the enum type. Modifying the previous example might look something like this:

namespace Example2B
{
   enum Shape
   {
      eShape_Invalid,
      eShape_Sphere,
      eShape_Box,
      eShape_Cone,
   };

   enum Material
   {
      eMaterial_Invalid,
      eMaterial_Color,
      eMaterial_Texture,
   };
}

This version of the code will compile, but now the enumerator names look a little weird. It is also important to point out that we are now repeating ourselves: the enum identifier is embedded in each of its enumerators.

Another way to solve the name clash issue is to wrap the enum with an additional scoping object: namespace, class, or struct. Employing this method will allow us to keep our original enumerator names, which I like. However, it actually introduces a new problem: now we need two names… one for the scope and one for the enum itself.

Admittedly, there are a few different ways to handle this, but for the sake of the example let’s keep things simple:

namespace Example2C
{
   namespace Shape
   {
      enum Enum
      {
         eInvalid,
         eSphere,
         eBox,
         eCone,
      };
   };

   namespace Material
   {
      enum Enum
      {
         eInvalid,
         eColor,
         eTexture,
      };
   };
}

While the extra nesting does make the declaration a bit ugly, it solves the enumerator name clash problem. Furthermore, it also forces client code to prefix enumerators with their associated scoping object, which I personally consider a big win.

// in some Example2C function...

const Shape::Enum shape = GetShape();
if (shape == Shape::eInvalid)
   printf("Shape::Invalid\n");
if (shape == Shape::eSphere)
   printf("Shape::Sphere\n");
if (shape == Shape::eBox)
   printf("Shape::Box\n");
if (shape == Shape::eCone)
   printf("Shape::Cone\n");

const Material::Enum material = GetMaterial();
if (material == Material::eInvalid)
   printf("Material::Invalid\n");
if (material == Material::eColor)
   printf("Material::Color\n");
if (material == Material::eTexture)
   printf("Material::Texture\n");

In fact, before the transition to scoped enums, most of the enums in Sauce were scoped this way. Unfortunately, the availability of choices in situations like this breeds inconsistency. Sauce was no exception: namespace, class, and struct were all being employed as scoping objects for enums in different parts of the code base (needless to say, I was pretty disappointed by this discovery).

Problem 2: Unscoped Enums cannot be forward declared.

This bothers me a lot. I’m very meticulous with my forward declarations and header includes, but unscoped enums have, at times, undermined my efforts. I also feel like it subverts the C++ mantra of not paying for what you don’t use.

For instance, if you want to use an enum as a function parameter, the full enum definition must be available, requiring a header include if you don’t already have it.

The following is a stripped-down example of the case in point:

Shape.h

namespace Shape
{
   enum Enum
   {
      eInvalid,
      eSphere,
      eBox,
      eCone,
   };
}

ShapeOps.h

#include "Shape.h"    // <-- BOO!

namespace ShapeOps
{
   const char* GetName(const Shape::Enum shape);
}

Unfortunately, there is no way around using a full include with unscoped enums. The situation is even more costly if the enum is inside a class header file that has its own set of includes.

Scoped Enums

Scoped enums were introduced in C++11. I am excited to report that not only do they solve all of the issues discussed above, but they also provide the client code with clean, intuitive syntax.

A scoped enum has the following form:

enum class IDENTIFIER
{
   ENUMERATOR,
   ENUMERATOR,
   ENUMERATOR,
};

That’s right — all you have to do is add the class keyword after enum and you have a scoped enum!

Converting the final example from the last section to use a scoped enum looks like the following:

Shape.h

enum class Shape
{
   eInvalid,
   eSphere,
   eBox,
   eCone,
};

ShapeOps.h

enum class Shape;   // forward declaration -- YAY

namespace ShapeOps
{
   const char* GetName(const Shape shape);
}

Here is an example of client code:

const Shape shape = GetShape();
if (shape == Shape::eInvalid)
   printf("Shape::Invalid\n");
if (shape == Shape::eSphere)
   printf("Shape::Sphere\n");
if (shape == Shape::eBox)
   printf("Shape::Box\n");
if (shape == Shape::eCone)
   printf("Shape::Cone\n");

This is exactly what we were looking for all along!

Another advantage of scoped enums is that their enumerators cannot be implicitly converted to integers. Combined with the fact that each enumerator is scoped to its enum type, this turns the enumerator value aliasing we described earlier into a compile-time error.
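A quick sketch of what the compiler now rejects, plus the explicit escape hatch for when an integral value is genuinely needed:

enum class Shape    { eSphere, eBox, eCone };
enum class Material { eColor, eTexture };

Shape shape = Shape::eSphere;

// if (shape == Material::eColor)   // error: cannot compare Shape with Material
// int bad = Shape::eBox;           // error: no implicit conversion to int

int raw = static_cast<int>(Shape::eBox);   // explicit conversion still works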

Transitioning to Scoped Enums

Sauce is a fairly large code base: ~200K lines of code at the time of this writing. It took me a few days to convert 100+ unscoped enums to scoped enums. Because I was manually scoping all of the enums, this was not a simple “search and replace” task. Additionally, I spent the extra time replacing includes with forward declarations where appropriate.

Overall, I strongly believe the time investment was well worth it. The scoped enum syntax is natural, and the fact that they can be forward declared opens an opportunity to drop your header include count in some places. If you are considering transitioning your legacy code base to scoped enums, I highly recommend it!

UI 2.0

I just finished a complete overhaul of our User Interface system. This is a big deal because the Interface library has been an integral component in our engine, not to mention the fact that it was the product of many months of work.

Now that the redesign is complete, I wanted to take this opportunity to outline some of the decisions that were made and discuss why the redesign was needed in the first place. To keep things straight, throughout this article I will refer to the old system as “UI 1.0”, and the new system as “UI 2.0”.

A Brief Overview of UI 1.0

UI 1.0 was encapsulated in a single library called Interface. It was one of our largest libraries due to the number of controls it implemented:

  • Panel
  • Label
  • Button
  • CheckBox
  • ToggleButton
  • RadioButton
  • PictureBox
  • SceneView
  • Selector
  • TabControl
  • TextBox
  • NumericUpDown
  • ScrollBar (Horizontal and Vertical)
  • TrackBar (Horizontal and Vertical)
  • ScrollPanel
  • TableLayoutPanel

The Interface library was based on the set of controls I had created for our XNA codebase a couple years prior. For the most part, I was able to directly port the behavior logic into Sauce; however, for the visuals, our requirements were quite different. In particular, we wanted to support different variations of the controls: a basic one for our testers as well as one for each of the game projects.

To satisfy the requirement of visual variation, UI 1.0 was built around the concept of Styles: each Control (Panel, Button, CheckBox, etc.) had a corresponding Style (PanelStyle, ButtonStyle, CheckBoxStyle, respectively). Each control Style was an abstract interface which was implemented by our different variations.

An example for Button:

UI 1.0 Button Diagram

In Model-View-Controller terms, the Control classes encapsulated the Model and Controller components, while the Style was the View. This made sense because regardless of how it looked, the behavior of a Control remained unchanged. So the idea was that Controls could be written once, and custom Styles could be derived from a corresponding abstract base class.

In practice, a Style pointer would be assigned on each Control, which simply forwarded on the task of rendering to the Style (if it had been assigned). Also, Styles were designed to be shared across multiple Control instances. This way, we could adjust a Style and all the associated Controls would instantly be updated.

Design Flaws

While we had been able to use the Interface library with this feature set for the good part of the last two years, unfortunately, there were a couple of fundamental problems with its architecture that prevented us from doing some essential things.

First, it turned out that most (if not all) of the data members from the control were needed to render its visual representation. This spiraled into a mess where some Style routines required nearly ten parameters each.

Not only was this painful to work with, but it also gave rise to another problem. Some derived Styles required certain data while others did not; however, the only way to access the “extra” data was to add it to the parameter list(s) in the Style base class. This led to several verbose and sometimes unintuitive interfaces.

Second, as mentioned earlier, Styles were designed to be shared. While this meant that we needed fewer objects, it also meant we were unable to store any state for the purposes of rendering. In other words, Controls were required to house all of the state data. This broke two things: 1) a Control now needed to keep visual data when it was intended to be only the model and controller; and 2) new Controls needed to be written to support different views — which was the exact problem the Styles architecture was attempting to solve in the first place.

For example, there was no way to create a Button with a glow that would pulse. Since all Buttons were tied to the same Style, glow state data would need to be stored in the Button — but not all Buttons need a glow state!

Eventually, I realized that both of the design flaws actually stem from the same issue: MVC declares that the components should be separate (read: independent), but that should not be confused with restricted access. To be effective, the visual component needs access to all of the data about the Control, as well as have its own state data.

Introducing UI 2.0

Considering the issues outlined above were architectural, I knew that the UI system would have to be redesigned. There was no doubt that this was going to be a huge undertaking, so I decided to make a prioritized list of goals:

  1. Remove Styles and migrate to a system where Controls are responsible for display
  2. Design for Composite Controls
  3. Implement a real Scrollable Area Control
  4. Support for Nine-Patch based Controls
  5. Improve rendering performance
  6. Animation support

I eventually tabled the last two since they require some groundwork to be completed in our Graphics system before they can be implemented. Perhaps they will be at the top of the list for UI 3.0…

Redesigned Control Hierarchy

For 2.0, I decided to create three separate libraries:

  • Ui: contains the abstract base classes for standard controls.
  • BasicUi: contains an implementation of Ui controls, using simple borders and backgrounds.
  • FlexUi: contains an implementation of Ui controls, using Nine-Patch for visuals.

UI 2.0 Library Diagram

Each Control became an abstract base class, establishing the interface and handling the behavior logic. At the same time, Styles were removed and their functionality was extracted into respective derived classes.
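To make the split concrete, here is a rough sketch of the shape this takes; the names, members, and signatures are illustrative, not the actual Sauce interfaces:

// Illustrative only -- not the real Sauce classes.
struct InputState { bool mouseOver; bool mouseDown; };

// Ui library: the abstract base class owns the behavior logic.
class Button
{
public:
   Button() : mIsHovered(false), mIsPressed(false) {}
   virtual ~Button() {}

   // Behavior is written once, here.
   void Update(const InputState& input)
   {
      mIsHovered = input.mouseOver;
      mIsPressed = input.mouseOver && input.mouseDown;
   }

   // Display is supplied by the derived implementations.
   virtual void Render() = 0;

protected:
   bool mIsHovered;
   bool mIsPressed;
};

// BasicUi library: flat background, border, and text.
class BasicButton : public Button
{
public:
   virtual void Render() { /* draw solid background, border, label */ }
};

// FlexUi library: Nine-Patch visuals.
class FlexButton : public Button
{
public:
   virtual void Render() { /* draw Nine-Patch background, label */ }
};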

Composite Controls

Trying to create composite Controls in UI 1.0 was really painful. Styles would have to be passed down through the Control interface. This cemented the sub-controls utilized by the composite, stripping away flexibility.

In UI 2.0, I wanted to be able to use sub-controls without them being baked into the interface of the Control; in other words, I wanted them to be implementation details, which is what they actually are.

NumericUpDown Controls

My primary test case for Composite support was the NumericUpDown. In UI 1.0, the NumericUpDown had two buttons (+/-), but the value display was just static text, which could neither be edited nor copied. I really wanted to replace the static text with a TextBox control, but the Style framework was proving to be an obstacle instead of a means to a solution.

By implementing the Controls as a hierarchy with an abstract base class, creating Composites fell into place naturally. This was a pleasant and most welcome surprise, especially after working with the mess in 1.0.

The only difficulty I found was in determining where to place the sub-controls. In the NumericUpDown, I used virtual functions to instantiate derived versions of the TextBox and Buttons, and then used their abstract interfaces in the update logic. While this works just fine, it feels a bit inside-out. As I mentioned above, I came to the conclusion that sub-controls should be implementation details. To stand true to this statement, the TextBox and Buttons should really be created and updated in the derived controls instead of in the abstract base class. However, structuring Composites in that way also means that there is bound to be a decent amount of duplication of the update logic code in each of the derived controls. So at this point, I’m ambivalent as to which design is superior.

Scrollable Panel

UI 1.0 included a proof of concept implementation of a scrollable panel: ScrollPanel. Unfortunately, I quickly realized during development that there was just no way to add child Controls to the scroll canvas.

Scrollable Panels are a must-have feature for our target project, so this had to be addressed.

ScrollPanel

In UI 1.0, a ControlManager class handled all the rendering and intersection traversal through recursion. This was possible because the Control base class had a list of child controls that the ControlManager could access to drive the traversal. As such, Control implementations were very simple since they were only responsible for rendering themselves. However, this setup was far too rigid and did not allow Controls to render children within their Render() function.

For UI 2.0, I decided that each Control would have to be responsible for intersecting and rendering their child controls. While this places a lot more of a burden on each Control, it enables us to implement a virtual canvas for the ScrollPanel’s child controls. In practice, I found this structure to be a bit more intuitive than the former, since there is less code “hiding” in the base Control class implementation. It also made the base Control class a lot more lightweight, which is always a good thing.
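A sketch of the idea (again, illustrative names rather than the actual Sauce code): each Control renders its own children, so the ScrollPanel can simply shift where its virtual canvas lands on screen.

// Illustrative only -- not the real Sauce ScrollPanel.
class Control
{
public:
   virtual ~Control() {}

   // Each Control draws itself and its children at the given screen offset.
   virtual void Render(const float offsetX, const float offsetY) = 0;
};

class ScrollPanel : public Control
{
public:
   ScrollPanel() : mChildCount(0), mScrollX(0.0f), mScrollY(0.0f) {}

   virtual void Render(const float offsetX, const float offsetY)
   {
      // Draw the panel frame itself first (omitted here).

      // The children live on a virtual canvas; scrolling just shifts where that
      // canvas is drawn. Clipping to the panel bounds is assumed to be handled
      // by the draw routines.
      for (int i = 0; i < mChildCount; ++i)
         mChildren[i]->Render(offsetX - mScrollX, offsetY - mScrollY);
   }

private:
   static const int kMaxChildren = 32;
   Control* mChildren[kMaxChildren];
   int      mChildCount;
   float    mScrollX;
   float    mScrollY;
};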

Building a scrollable panel is no simple task. There are a lot of details to consider when intersecting and rendering a virtual canvas. Aside from getting the architecture right, this was probably the most difficult part of implementing UI 2.0.

Nine-Patch

As was the case in UI 1.0, I wanted to have two distinct sets of controls: one for tester widgets and a dashboard, and another set for in-game UI.

For the most part, I have made use of the first set, which I call “Basic UI”. Basic UI Controls are visually simple: solid color backgrounds, borders, simple text.

BasicUi-Button

I dubbed the “in-game” control library Flex UI. These controls primarily use a Nine-Patch to draw their backgrounds and borders.

FlexUi-Button

A Nine-Patch is actually just a single texture sliced into 9 parts (as shown in the figure below). The benefit of using a Nine-Patch is that you can keep crisp corners and edges while stretching the texture in the directions you would naturally expect.

Nine-Patch Example

The only caveat is that you need extra data to know where to make the slices. For now, the Flex UI assumes that the corners are 16 x 16 pixels, but the intent is to make the system more robust so that it can accept arbitrary slice sizes.
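To make the slicing concrete, here is a hypothetical sketch of how a Nine-Patch draw might compute its nine quads, assuming fixed 16 x 16 corners and some DrawTexturedQuad()-style helper (both names are illustrative, not the actual Sauce routines):

// Assumed helper: draws a screen rectangle mapped to a normalized sub-rectangle
// of the current texture. Illustrative only.
void DrawTexturedQuad(float x, float y, float w, float h,
                      float u, float v, float uw, float vh);

void DrawNinePatch(const float x, const float y,
                   const float width, const float height,
                   const float texWidth, const float texHeight)
{
   const float corner = 16.0f;   // Flex UI's current fixed slice size

   // Column/row edges on screen: corners keep their native size, the middle stretches.
   const float xs[4] = { x, x + corner, x + width  - corner, x + width  };
   const float ys[4] = { y, y + corner, y + height - corner, y + height };

   // Matching edges in normalized texture space.
   const float us[4] = { 0.0f, corner / texWidth,  1.0f - corner / texWidth,  1.0f };
   const float vs[4] = { 0.0f, corner / texHeight, 1.0f - corner / texHeight, 1.0f };

   // Nine quads: corners unstretched, edges stretched along one axis, center along both.
   for (int row = 0; row < 3; ++row)
      for (int col = 0; col < 3; ++col)
         DrawTexturedQuad(xs[col], ys[row],
                          xs[col + 1] - xs[col], ys[row + 1] - ys[row],
                          us[col], vs[row],
                          us[col + 1] - us[col], vs[row + 1] - vs[row]);
}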

Stylesheets and Factories

At the very beginning of the redesign, I was hoping to implement some sort of Stylesheet system. After attempting a proof of concept, I realized that the same types of problems I had been trying to avoid were beginning to make their way back into the system. Consequently, I tabled the idea.

Later on in development, I resolved that the best way for Composite controls to create their sub-controls was for all controls to have access to a ControlFactory. So I created the ControlFactory as an abstract base class that is implemented by the Basic UI and Flex UI systems.

As it turns out, the ControlFactory is actually the perfect place to put the Stylesheet since the visual data can be applied to the corresponding control type. The only thing missing (without some substantial changes) is that the Controls cannot be updated dynamically if a Stylesheet is modified. I decided that while such a feature is cool to have, it would never be used in a final game project.
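Roughly, the factory looks like this (a sketch only; the real Sauce interface differs in the details):

// Illustrative only.
class Button;
class TextBox;

class ControlFactory
{
public:
   virtual ~ControlFactory() {}

   // Composite controls request their sub-controls here, so a NumericUpDown built
   // through the FlexUi factory automatically gets FlexUi buttons and text boxes,
   // styled from whatever stylesheet data the factory holds.
   virtual Button*  CreateButton()  = 0;
   virtual TextBox* CreateTextBox() = 0;
};

class BasicControlFactory : public ControlFactory
{
public:
   virtual Button*  CreateButton();    // returns a BasicUi button
   virtual TextBox* CreateTextBox();   // returns a BasicUi text box
};

class FlexControlFactory : public ControlFactory
{
public:
   virtual Button*  CreateButton();    // returns a FlexUi button
   virtual TextBox* CreateTextBox();   // returns a FlexUi text box
};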

Final Thoughts

Although it took me a few months to complete, the UI 2.0 system is now in place and being used by the rest of the engine in the same capacity as the old Interface library. The effort was well worth it. I feel that the new architecture is far more flexible and extensible than the previous. Also, in addition to bringing the new features online, the overhaul gave me a chance to address a lot of the little things that had been bothering me, which is always nice.

JSON

Early last month, I set out to add support for JSON into our game engine. To my surprise, it turned out to be a fun and rewarding adventure.

JSON Logo

JSON is a very nice format that is fairly easy to parse. Its feature set is small and well defined, including:

  • explicit values for null, true, and false
  • numbers (integers and floating-point)
  • strings
  • arrays
  • hash tables

This feature set is perfect for configuration files, stylesheets, and the like. In the past, I have used XML for this sort of thing, but JSON is much more direct and compact.

Initially, I reached for an external library to wrap, just as we have done for many of the other file formats we support, namely: PNG, XML, FBX, and OGG. Of course, when it comes to external libraries, your mileage will vary. For example, we use TinyXML 2 as the basis for our XML library; it was a real pleasure to use — a very straightforward, well-designed interface. The FBX SDK, on the other hand, is pretty atrocious.

Unfortunately, I wasn’t very satisfied when it came to JSON. Many of the C++ JSON libraries out there make use of STL and/or Boost, dependencies we have striven to avoid. Eventually I settled on RapidJSON due to its high praise on the web; however, about halfway through my wrapper implementation, I concluded that its interface is not as clean and “wrappable” as I had originally thought.

After some reflection, I decided that the best way forward was to roll my own. I found that rolling your own is an excellent decision for a few reasons:

First, the JSON format is relatively small, unambiguous, and well documented. This allows you to focus on the architecture and interface of your wrapper. I found the experience both valuable and refreshing.

Second, you are able to employ the use of your native data structures. Naturally, this is a great way to test your functionality and interface. In the case of Sauce, I was able to leverage the following Core structures: String, Vector, Array, and HashMap.
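For instance, the value type at the heart of such a library might look roughly like the following. This is a hypothetical sketch using standard containers as stand-ins for Sauce’s String, Vector, and HashMap; it is not the actual Sauce JSON type.

#include <map>
#include <string>
#include <vector>

// Hypothetical sketch -- not the actual Sauce JSON types.
class JsonValue
{
public:
   enum class Type { Null, Bool, Number, String, Array, Object };

   JsonValue() : mType(Type::Null), mBool(false), mNumber(0.0) {}

   Type GetType() const { return mType; }

private:
   Type mType;

   // Scalars.
   bool        mBool;
   double      mNumber;
   std::string mString;                               // stand-in for Core String

   // Children are held by pointer here just to keep the sketch legal, self-contained
   // C++; the real thing would use the engine's own containers.
   std::vector<JsonValue*>           mArrayItems;     // stand-in for Core Vector/Array
   std::map<std::string, JsonValue*> mObjectMembers;  // stand-in for Core HashMap
};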

Last, but not least, I found it to be a whole lot of fun! It’s been a while since I’ve done anything like implementing a format encoder and decoder. Hopefully when you’re finished, you feel the same.

After I finished our JSON library, I converted our config files from XML to JSON with very little effort. The result is that our config files are more compact than they were with XML, and now we have the utilities required for future development. Overall, I feel it was well worth the time and effort.

Streams

In Sauce, we have a small, tight Streams library to handle the input and output of data in a standardized manner. After all, a game engine isn’t very exciting without the ability to read in configuration and asset data.

We use a stream as our main abstraction for data that flows in and out of the engine. In the case of input, the engine doesn’t need to know the source of those bytes; they could be coming from a file, memory, or over the network. The same holds true for output data. This is an extremely important feature that we can exploit for a number of uses, including testing.

Also, it should be noted that a stream is not responsible for interpreting the data. It is only responsible for reading bytes from a source or writing bytes to a destination.

As you might expect, we have two top level interfaces: InputStream and OutputStream. We’ve seen code bases where these are merged into a single Stream class that can read and write; however, we prefer to keep the operations separate and simple. Each of these interfaces has a number of implementations as described below.
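As a rough sketch, the two interfaces boil down to something like this (hypothetical signatures, not the actual Sauce declarations):

#include <cstddef>
#include <cstdint>

// Illustrative only.
enum class Endian { Little, Big };

class InputStream
{
public:
   virtual ~InputStream() {}

   // Reads up to byteCount bytes into buffer; returns how many bytes were read.
   virtual std::size_t ReadBytes(std::uint8_t* buffer, const std::size_t byteCount) = 0;

   // The endianness of the data in the stream, so readers know whether to swap.
   virtual Endian GetEndian() const = 0;
};

class OutputStream
{
public:
   virtual ~OutputStream() {}

   // Writes byteCount bytes from buffer; returns how many bytes were written.
   virtual std::size_t WriteBytes(const std::uint8_t* buffer, const std::size_t byteCount) = 0;

   virtual Endian GetEndian() const = 0;
};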

Input Streams

InputStreams

The primary function for an InputStream is to read bytes.

Also, we store the endianness of the stream. This is an important property of the stream for the code that interprets the data. If the stream and the host platform have different endians, the bytes need to be appropriately swapped after being read from the InputStream.

Our Streams library features three types of input streams:

  • File Input Stream
  • Memory Input Stream
  • Volatile Input Stream

File Input Stream

This is probably the first implementation of InputStream that comes to mind. The FileInputStream adapts our file system routines for opening and reading files to the InputStream interface.

As an optimization, we buffer the input from the file as read requests are made. However, this is an implementation detail that is not exposed in the class interface; we could just as well read directly from the file — the callsite shouldn’t know or care.

Memory Input Stream

The MemoryInputStream implements the InputStream interface for a block of memory. In our implementation, this block can be sourced from an array of bytes or a string.

This implementation in particular is extremely useful for mocking up data for tests. For example, instead of creating a separate file for each JSON test, we can put the contents into a string and wrap that in a MemoryInputStream for processing.
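Continuing the interface sketch from above, a MemoryInputStream over a string might look roughly like this (illustrative only, not the actual Sauce class):

#include <cstring>
#include <string>

class MemoryInputStream : public InputStream   // InputStream as sketched earlier
{
public:
   explicit MemoryInputStream(const std::string& source)
      : mBuffer(source)     // keeps its own copy of the source data
      , mPosition(0)
   {
   }

   virtual std::size_t ReadBytes(std::uint8_t* buffer, const std::size_t byteCount)
   {
      const std::size_t remaining = mBuffer.size() - mPosition;
      const std::size_t count = (byteCount < remaining) ? byteCount : remaining;
      std::memcpy(buffer, mBuffer.data() + mPosition, count);
      mPosition += count;
      return count;
   }

   // The stream endianness would normally be configurable; hard-coded here.
   virtual Endian GetEndian() const { return Endian::Little; }

private:
   std::string mBuffer;
   std::size_t mPosition;
};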

Volatile Input Stream

Simply put, the VolatileInputStream is an InputStream implementation for an external block of memory.

For safety, the MemoryInputStream makes a copy of the source buffer. This is because in many cases, the lifetime of an InputStream may be unknown or exceed the lifetime of the source buffer.

Of course, in the cases when we do know the lifetime of the source buffer will not exceed the use of the InputStream, we can make direct use of the source buffer. This is the core principle behind the VolatileInputStream.

Output Streams

OutputStreams

The primary function for an OutputStream is to write bytes.

Also, just like in the InputStream, we store the endianness of the stream. This is an important property of the stream for the code that writes the data. If the stream and the host platform have different endians, the bytes need to be appropriately swapped before being written to the OutputStream.

Our Streams library features two types of output streams:

  • File Output Stream
  • Memory Output Stream

File Output Stream

Similar to the input version, a FileOutputStream is a wrapper around our file system routines to open and write to a file.

However, unlike the FileInputStream, we do not buffer the output.

Memory Output Stream

The MemoryOutputStream implements the OutputStream interface for a block of memory. The internal byte buffer grows as bytes are written.

For convenience, we added a method to fetch the buffer contents as a string.

Again, this is extremely useful for testing code like file writers.

Readers and Writers

Admittedly, the stream interfaces are very primitive. They are so primitive, in fact, that they can be a bit painful to use by themselves in practice. Consequently, we wrote a few helper classes to operate on a higher level than just bytes.

We’ve found this to have been an excellent choice. It is not unusual for a single stream to be passed around to more than one consumer or producer. Separating the data (stream) from the operator (reader/writer) provides us the flexibility needed and the opportunity to expose a more refined client interface.

Readers

For InputStreams, we implemented a BinaryStreamReader and a TextStreamReader.

The BinaryStreamReader can read bytes and interpret them into primitive data types, as well as a couple of our Core data types: strings and guids. We use this extensively for reading data from our proprietary file formats.

The TextStreamReader can read the stream character by character, or whole strings at a time. This makes it ideal for performing text processing tasks like decoding JSON.

Writers

For OutputStreams, we implemented a parallel pair of writers: BinaryStreamWriter and TextStreamWriter. In both, we perform the appropriate byte swapping internally when writing multi-byte data types.

The BinaryStreamWriter can take the same set of data types supported by the Reader and write their bytes to the given OutputStream.

The TextStreamWriter can write characters or strings to the given OutputStream.
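As an illustration of the byte swapping, a hypothetical BinaryStreamWriter::WriteUInt32() built on the OutputStream sketched earlier might look like this (not the actual Sauce code):

#include <cstdint>

class BinaryStreamWriter
{
public:
   explicit BinaryStreamWriter(OutputStream& stream) : mStream(stream) {}

   void WriteUInt32(std::uint32_t value)
   {
      // Swap only if the stream's endianness differs from the host's.
      if (mStream.GetEndian() != GetHostEndian())
         value = (value >> 24) | ((value >> 8) & 0x0000FF00u) |
                 ((value << 8) & 0x00FF0000u) | (value << 24);

      mStream.WriteBytes(reinterpret_cast<const std::uint8_t*>(&value), sizeof(value));
   }

private:
   static Endian GetHostEndian()
   {
      const std::uint16_t probe = 0x0001;
      return (*reinterpret_cast<const std::uint8_t*>(&probe) == 0x01)
         ? Endian::Little : Endian::Big;
   }

   OutputStream& mStream;
};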

Summary

The Sauce Streams library has been a vital component to our development. We use it to read in models, textures, and configuration files; and we use it to write out saved games and screenshots.

We hope that this high-level discussion will help our readers with designing their own set of stream classes.

Assert

This article was originally published on November 15, 2008 at Pulsar Engine and my personal blog.

It has been updated to employ the conventions we use in Sauce instead of the ones I had previously used in Pulsar — aside from that, the content remains unchanged. This article still reflects my views and code.

Assert is one of the most important, indispensable, underused devices in a programmer’s toolkit.

One thing that I’ve found I can’t live without is the assert mechanism. In fact, it’s the first thing I wrote in the Sauce codebase. This is something that has “plagued” me ever since I was first introduced to it in Steve Maguire’s Writing Solid Code. Furthermore, throughout my reading adventures I’ve found a number of good passages about the assert mechanism. Unfortunately, this cardinal knowledge is spread across several books, so I’ve decided to compile a more compact, centralized synopsis to give readers a detailed overview.

assert()

Whenever I research something, I always find it best to approach it from multiple angles in hopes of gaining a more complete understanding. There are many different ways to “define” assert() — in fact, I have no fewer than 10 books in my personal library that devote some portion of their text to this very device! [1]

So let’s start where I started: Writing Solid Code by Steve Maguire. Maguire devotes an entire chapter to the assert() mechanism, in which he outlines its potential to improve both the development and quality of a codebase. And if that wasn’t enough, he also exposes a few common pitfalls that have been known to gobble up the unsuspecting unseasoned programmer.

assert is a debug-only macro that aborts execution if its argument is false.

— Steve Maguire, Writing Solid Code

I love Maguire’s straightforward description — it’s simple and precise. However, while this is a nicely formulated answer for something like an interview question, I don’t recommend using this as your explanation for someone who’s never heard of assert() before. In fact, as we will see throughout this article there are many implications that need to be dug out of this pithy answer and brought to light.

Also, I want to emphasize the fact that assert() is a macro (or, if you’re Stephen Dewhurst, a “pseudofunction”), not a function. This is an important point because we want to minimize the impact these checks make on our code, so that the difference between the execution code for the debug and release builds is negligible. Calling a proper function is likely to cause a state change in our code, which means more overhead: the saving and restoring of registers, a change in the callstack, etc.

The C Programming Language (also often referred to as K&R) defines assert() as follows:

The assert macro is used to add diagnostics to programs:

void assert(int expression)

If expression is zero when assert(expression) is executed, the assert macro will print on stderr a message, such as

Assertion failed: expression, file filename, line nnn

It then calls abort to terminate execution. The source filename and line number come from the preprocessor macros __FILE__ and __LINE__. If NDEBUG is defined at the time <assert.h> is included, the assert macro is ignored.

To be clear, the K&R reference listing is specific to the assert() macro provided in the C Standard Library via the <assert.h> header (or the C++ equivalent <cassert>). It outlines not only the exact signature, but also the details of its execution behavior. In comparison, Maguire’s definition is more about the concept of assert — the basics a programmer should bear in mind when they encounter or author an assert — allowing it to apply to any implementation: standard or custom.

Also, the fact that assert() is detailed in K&R shows that it is an established device in C and C++ (as well as other languages). This means that you can use the assert() macro and be confident that it is both a portable and communal means of enforcing a necessary condition.

Assert statements are like active comments — they not only make assumptions clear and precise, but if these assumptions are violated, they actually do something about it.

— John Lakos, Large-Scale C++ Software Design

[Asserts] occupy a useful niche situated somewhere between comments and exceptions for documenting and detecting illegal behavior.

— Stephen Dewhurst, C++ Gotchas

Although these aren’t quite definitions, I chose to examine this pair of related excerpts because I think that together they conjure up an interesting idea that serves as an excellent supplement to the already said definitions. In particular, the notion that an assert is an “active comment” should make any programmer’s ears perk up.

How many times have you written the assumptions your function makes in a comment at the top of your code? How many times have you traced bugs down deep into the bowels of a codebase only to find that somebody (maybe you!) disregarded the assumptions outlined in the comment? If you follow Lakos’ advice and employ the use of assert statements to enforce the assumptions made in your code, you can rest assured that you’ll be notified if any of those assumptions are ever ignored again.

Finally, I want to finish this section with an important point made by Noel Llopis:

Asserts are messages by programmers for programmers.

— Noel Llopis, C++ for Game Programmers

This detail about the assert mechanism is probably the most forgotten over any other. Asserts are added to the code for a number of reasons (which we will soon get to), but none of them include the end-user.

Common Usage

After reading the above, you must be itching to see how to actually use an assert() in practice, so let’s take a look at a few common, simple, and admittedly contrived examples.

Example 1: Prohibiting the Impossible

Whenever you find yourself thinking “but of course that could never happen,” add code to check it.

— Andrew Hunt and David Thomas, The Pragmatic Programmer

Let’s say you have an enumeration of error codes which includes both fatal and non-fatal errors. In an effort to make the distinction easier, the fatal error codes are negative while non-fatal error codes are positive. All errors are reported through the ErrorHandler, which creates different objects depending upon whether they are fatal or non-fatal. Thus, you could add a “sanity check” to verify this assumption in the following manner:

enum ErrorCode
{
   ...  /* Fatal Errors < 0 */
   ...  /* Non-Fatal Errors > 0 */
};

class Error
{
public:
   virtual ~Error() = 0;

protected:
   explicit Error(const ErrorCode errorCode)
      : mErrorCode(errorCode)
   {
   }

   ErrorCode mErrorCode;
};

/* a pure virtual destructor still needs a definition */
inline Error::~Error() {}

class FatalError : public Error
{
public:
   explicit FatalError(const ErrorCode errorCode)
      : Error(errorCode)
   {
      assert( mErrorCode < 0 );
   }

   virtual ~FatalError() {}
};

class NonFatalError : public Error
{
public:
   explicit NonFatalError(const ErrorCode errorCode)
      : Error(errorCode)
   {
      assert( mErrorCode > 0 );
   }

   virtual ~NonFatalError() {}
};

void ErrorHandler::HandleError(const ErrorCode errorCode)
{
   Error* error = NULL;
   if(errorCode < 0)
      error = new FatalError(errorCode);
   else if(errorCode > 0)
      error = new NonFatalError(errorCode);
   ...
}

The assert() statements are being used here to guard against the impossible. And it’s important to note that if the “impossible” does happen (and believe me, it can), you should terminate the program immediately with extreme prejudice[2].

Example 2: Enforcing Preconditions

Preconditions are the properties that the client code of a routine or class promise will be true before it calls the routine or instantiates the object. Preconditions are the client code’s obligations to the code it calls.

— Steve McConnell, Code Complete

Now let’s say you wrote a function that computes the square root for a given value: SquareRoot(). Since you’re not dealing with complex numbers in your code, you don’t want this function operating on a negative value. In fact, you’d like to be notified if this function does happen to receive a negative value because that would mean that whatever calculations are using this routine aren’t handling all the possibilities before calling your SquareRoot(). This sounds like a perfect place to use an assert()!

float SquareRoot(const float value)
{
   assert( value >= 0 );
   ...
}

In this case, we are using an assert() to enforce the precondition that the value parameter must be non-negative. Again, this is something that will help a programmer during an application’s development cycle, not an end-user.

Example 3: Enforcing Postconditions

Postconditions are the properties that the routine or class promises will be true when it concludes executing. Postconditions are the routine’s or class’s obligations to the code that uses it.

— Steve McConnell, Code Complete

Using the same SquareRoot() function from Example 2, let’s add a check to ensure that our final output value sqrtValue is non-negative.

float SquareRoot(const float value)
{
   ...
   assert( sqrtValue >= 0 );
   return sqrtValue;
}

We could also add a check to ensure that our computation is correct:

float SquareRoot(const float value)
{
   ...
   assert( sqrtValue >= 0 );
   assert( FloatEquals(sqrtValue*sqrtValue, value, EPSILON) );
   return sqrtValue;
}

Example 4: Switch Cases

This is one of my favorite places to use the assert() because it helps guard against mistakes while allowing for code extension.

Let’s say you have a set of different platforms your code can output asset files for: Win32, Xenon, and PS3. However, it’s important to write your code in a way that allows for extension for additional platforms (say the Nintendo Wii). Here’s how we can enforce this in our AssetFileWriter:

bool AssetFileWriter::Write(const byte* data,
   const char* filename, const Platform platform)
{
   switch(platform)
   {
      case Platform::WIN32:
         /* write data to file for Win32 */
         break;
      case Platform::XENON:
         /* write data to file for Xenon */
         break;
      case Platform::PS3:
         /* write data to file for PS3 */
         break;
      default:
         assert(false);  /* invalid platform specified */
   }
   ...
}

Now when we get the green light to add the Wii to our list of supported platforms, we will be notified if we happen to forget to add the case in the AssetFileWriter.

Caveats

As Noel pointed out to us earlier, asserts are employed for the benefit of programmers, not end-users. They are active only in _DEBUG builds and are compiled out for all others. The consequences of forgetting this principle can be particularly dangerous. The details are twofold:

  1. The assert() expression argument should not contain any required code. This is because all instances of assert() in your code are omitted when compiling a non-debug build of your application.

    The canonical example of this is using a required function call inside the assert() expression:

    bool WriteDataToFile(const char* filename, const byte* data);  /* prototype */
    
    assert( WriteDataToFile(filename, data) );  /* uh-oh... */
    

    Guess what happens when you compile a release build (or any build that defines NDEBUG, for that matter)… no file for you! The solution should be clear: save the result you want to check to a local variable and use that in the assert().

    const bool bFileWriteSuccessful = WriteDataToFile(filename, data);
    assert( bFileWriteSuccessful == true );  /* much better */
    

    To be perfectly clear, not all function calls must be avoided inside the assert(). For example, strlen() or another const function guaranteed to have no side effects is perfectly fine to use inside the assert() expression. However, in C++ Gotchas, Dewhurst touts:

    Proper use of assert avoids even a potential side effect in its condition.

    — Stephen Dewhurst, C++ Gotchas

    alluding to functions that appear to be const from the outside, but actually perform some operations in order to return a value.

    Another good, yet more subtle example is the following:

    assert( ++itemCount < maxItemCount );   /* doh! */
    arrayOfItems[itemCount] = itemToAdd;    /* add item to list */
    

    In this case you’ll find that the value of itemCount never changes in release builds because your increment was compiled out (which means you’ll just keep overwriting the current item!). Here, we need to push the required increment operation out of the assert:

    assert( (itemCount+1) < maxItemCount );   /* safe check */
    arrayOfItems[++itemCount] = itemToAdd;    /* add item to list */
    
  2. assert() should not be treated as a means of or substitution for proper error-checking.

    The following code snippet resembles a case I have seen more often than I’d like to admit.

    char filename[MAX_PATH];
    OpenFileDialog(filename, MAX_PATH);  /* fills in the filename buffer */
    assert( strlen(filename) > 0 );
    ...  /* do something with filename */
    

    Here the assert() is being used in a place where proper error-checking should be employed. This is dangerous because while the programmer is rightfully protecting the rest of their code from using an invalid filename, they are going about it in a way that assumes that a user will always proceed to choose a file in the OpenFile dialog. This is a bad assumption because the end-user most certainly has the right to (and will) change their mind about opening a file.

Asserts vs. Exceptions

An assert is different from an exception. A failed assertion represents an unrecoverable, yet programmer-controlled crash. A thrown exception represents an anomaly or deviation from the normal execution path — both of which have the possibility of being recoverable.

To be clear, you should use an assert() if the code after the assertion absolutely positively cannot execute properly without the expression in question evaluating to TRUE. One example of this is the pointer returned from malloc():

void* memBlock = malloc(sizeInBytes);
assert( memBlock != NULL );
...  /* use the allocated memory block */

If malloc() happens to return NULL (which can and does happen!), the code that follows, which uses memBlock, has no hope of working properly. As such, we have used an assert() so that we can control the termination of the application rather than allow continued execution to do something worse.

Another good place for an assert statement is right before accessing an array at a given index. For example, say we have an ItemContainer class that holds an array of Items internally which has a const accessor routine GetItem(). Then we could use an assert() to check the requested Item index in the following manner:

const Item& ItemContainer::GetItem(const int itemIndex) const
{
   assert( (itemIndex >= 0) && (itemIndex < mItemCount) );
   return mItemList[itemIndex];
}

Because we unarguably don’t want to be accessing memory outside the bounds of the array, we have employed an assertion to notify us if the calling code tries to do just that. The importance of this example is twofold: not only are we protecting ourselves from silently accessing “foreign” memory, but we are also enabling an additional internal layer of debugging for the calling code (which may have a bug in the way it computes the itemIndex parameter).

While I’m not going to go into all the details of exceptions and when and when not to use them, I will offer an example of usage for comparison.

Let’s say you have a routine that reads data from an external source (USB drive, network, etc.). What do you do if the data source is unplugged in the middle of a read? Do you terminate the program? NO WAY! The proper way to handle such a case is to throw an exception and let the exception propagate up to a safer point in the code where the exception can be handled by notifying the user with a nice little dialog box that says “I wasn’t done. Plug it back in!”, all without terminating the program. If it is the user’s intention to exit after they unplug the data source, force them to do so via the conventional means your application has made available to them — it’s just good manners.

Asserts vs. Unit Tests

There’s something else I want to be clear about: asserts are not a means of or substitution for unit testing. Just because we have included an assertion check to make sure the value of our SquareRoot() function is non-negative doesn’t mean we should forgo any appropriate unit testing.

Unit testing can only take you so far. If you write good tests, you can get good coverage for your code; but you don’t run unit tests for every value that ever enters your routine. That’s why if you forget a case in your unit tests, all your tests will pass but you’ll still have bugs. So both unit tests and asserts should be employed together. They are on the same team.

What if we want to test our assertions? Well, the best example I can give you for this is outlined in a post about Assert from the guys at Power of Two Games: Stupid C++ Tricks: Adventures in Assert. Briefly, in order to test that an assertion is thrown, you will need to add some additional hooks into the way you manage your assert() to ensure that your unit tests handle assertion failures gracefully.

Implementing a Custom Assert Macro

In this section I’m going to examine some custom implementations of assert(). For the most part they all implement the same functionality, but I found it very interesting how many different ways I had come across in the books on my shelf. We’ll also look at various implementation flaws (whether intentional or not) and finally wrap up with the implementation I use in Sauce.

The assert example from C++ Gotchas:

#ifndef NDEBUG
   #define assert(e) ((e) \
      ? ((void)0) \
      :__assert_failed(#e, __FILE__, __LINE__) )

#else
   #define assert(e) ((void)0)

#endif

I consider this the classic C skeleton of assert(). To customize, you would replace __assert_failed() with something that handles the assertion in a way tailored to your own needs.

Unfortunately, I have a problem with this method — unless __assert_failed() is a macro that includes the code to invoke a breakpoint (__asm int 3; and the like), it assumes you will break inside the handler function and not at the line that broke your code. A little annoying — but, seriously, even the little things are important. Following the recommendation to put the breakpoint code inside the assert() macro (not the handler function), you will be able to get the debugger to stop right at the failed assertion line rather than a few levels up the callstack.

The custom assert example from Writing Solid Code:

#ifdef DEBUG
   void __Assert(char* filename, unsigned int lineNumber);   /* prototype */

   #define ASSERT(f) \
      if(f)          \
         {}          \
      else           \
         __Assert(__FILE__, __LINE__)

#else
   #define ASSERT(f)

#endif

This was the first custom assert() implementation I was introduced to. Looking back on it, one thing I don’t like about it is that Maguire doesn’t include a text version of the expression in the parameters provided to the assert handler.

Also notice that instead of ((void)0), absolutely nothing is evaluated for the Release build. This is a bit troublesome because, if you noticed, Maguire purposefully left off the semicolon after __Assert(__FILE__, __LINE__) to give the compiler a reason to force the client to add one. But he’s left room for error in the Release build.

The custom assert example from Code Complete:

#define ASSERT(condition, message) {    \
   if( !(condition) ) {                 \
      LogError( "Assertion failed: ",   \
         #condition, message );         \
      exit( EXIT_FAILURE );             \
   }                                    \
}

At first you might say… Aha! What about a dangling else?! But if you look closely at McConnell’s code, he uses an additional pair of braces to enforce scoping on his version of assert(). The problem here is that you are no longer forced to append that semicolon.

Also, notice that McConnell doesn’t break either — he just exits. During development, I find it extremely handy to have a failed assert break into the debugger, not just print me an error message and bail.

Finally, where are the build configuration #ifdefs? As is, this code is going to run in all builds!

These might all sound like nit-picking, but hey, this is assert()! This is something I use EVERYWHERE, so it had better be solid, pristine, and easy to use.

The custom assert example from C++ for Game Programmers:

#ifdef _DEBUG
   #define DO_ASSERT
#endif

#ifdef DO_ASSERT
   #define Assert(exp, text) \
      if(!(exp) && MyAssert(exp, text, __FILE__, __LINE__)) \
         _asm int 3;

#else
   #define Assert(exp, text)

#endif

Noel’s got an interesting addition in his version of assert(): the DO_ASSERT symbol. I think this is actually a nice way to allow developers to turn assertions on and off for multiple builds, especially if you have more than just a Debug and Release configuration (e.g., an additional Debug build with optimizations turned on, or a “Final” build).

However, Noel’s version suffers from a very important problem: a dangling else. Let’s say you happen to put this assert inside an if / else block. Guess what, that else is going to attach itself to the if inside the assert() when the macro expands, which is not the logic you had in mind. This can be combated with the scoping we’ve seen earlier, or with my preference: a do / while(0) loop.

The authors of The Pragmatic Programmer argue that there’s no need to follow suit when it comes to printing an error message and terminating the application outright with abort() in your assertion. In fact, they claim that as long as your assertion handler is not dependent upon the data that triggered the assert, you are free to perform any necessary cleanup and exit as gracefully as possible. In other words, when an assertion fails, it puts control of the “crash” into the hands of the programmer.

A good example of this is exhibited by Steven Goodwin in Cross-Platform Game Programming:

#define sgxAssert(expr) \
   do { \
      if( !(expr) ) { \
         sgxTrace(SGX_ERR_ERROR, __FILE__, __LINE__, #expr); \
         sgxBreak(); \
      } \
   } while(0)

Again, this seems to be missing the #ifdef — maybe these authors are just trying to save trees?

Finally, taking all that I have learned — my implementation of assert() looks something like the following:

#ifdef SE_ASSERTS_ENABLED
   namespace Sauce { namespace AssertHandler {
      bool Report(const char* condition, const char* filename, int lineNumber);
   }}

   #define SE_ASSERT(cond) \
      do { \
         if( !(cond) && Sauce::AssertHandler::Report(#cond, __FILE__, __LINE__) ) \
            SE_DEBUG_BREAK(); \
      } while(0)

#else	// asserts not enabled
   #define SE_ASSERT(cond) \
      do { \
         SE_UNUSED(cond); \
      } while(0)

#endif  // end SE_ASSERTS_ENABLED

I’m utilizing the same trick Noel offers in his version to separate out the enabling of asserts via a single defined symbol SE_ASSERTS_ENABLED.

The SE_DEBUG_BREAK() macro calls __debugbreak() rather than straight __asm int 3; for the sake of x64 compatibility.

The SE_UNUSED() macro utilizes the sizeof() trick detailed in Charles’ article.

I’m also utilizing the do / while(0) loop convention to wrap my macros (including the SE_DEBUG_BREAK() and SE_UNUSED()) for scoping and to enforce the requirement of a semicolon.

My AssertHandler has been implemented in the same spirit as exhibited by the guys at Power of Two. This was mainly because I like the idea of being able to test with my asserts, and even test that the asserts are being thrown in the first place! If you haven’t taken a look at their article yet, I highly recommend you do.
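In the spirit of that article, the hooks amount to making the handler swappable. A hypothetical sketch (this is not the actual Sauce AssertHandler interface):

namespace Sauce { namespace AssertHandler {

   // Returns true if SE_ASSERT should break into the debugger.
   typedef bool (*Handler)(const char* condition, const char* filename, int lineNumber);

   // The default handler logs the failure and returns true (break).
   bool DefaultHandler(const char* condition, const char* filename, int lineNumber);

   // Tests install a handler that records the failure and returns false, so the
   // test framework can verify an assert fired without ever hitting SE_DEBUG_BREAK().
   void SetHandler(Handler handler);

   // Called by SE_ASSERT; forwards to whichever handler is currently installed.
   bool Report(const char* condition, const char* filename, int lineNumber);
}}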

Conclusion

Putting this article together has been a fun (and two-week-long) ride. I tried to take a comprehensive approach, and, as we can see, there’s a lot that goes into such a tiny little mechanism…

As I said in the beginning, assert() is one of the most important tools I have for code development and after reading this I hope it’s crystal clear why.

Footnotes

  1. I use a lot of them, but not all, as references throughout this article.
  2. In case you were wondering, why would we ever want to crash (as gracefully as possible) the program? Won’t users lose their unsaved work? Won’t they be mad? The answer is simple and resounding: YES. It is far better to abort the application than possibly allow for a user to continue using it in its invalid state while believing that everything is perfectly fine, or even worse corrupt their data. A user will most certainly be upset that they lost any unsaved work, but not anywhere near as angry as they would be if we corrupted their entire dataset!

References

Dewhurst, Stephen C. C++ Gotchas. 2003. pp 72-74.
Goodwin, Steven. Cross-Platform Game Programming. 2005. pp 262-265.
Hunt, Andrew and David Thomas. The Pragmatic Programmer. 2000. pp 113.
Lakos, John. Large-Scale C++ Software Design. 1996. pp 32-33.
Llopis, Noel. Asserting Oneself. Games From Within. 2005.
Llopis, Noel. C++ for Game Programmers. 2003. pp 386-402.
Kernighan, Brian and Dennis Ritchie. The C Programming Language. 2nd Ed. 1988. pp 253-254.
Maguire, Steve. Writing Solid Code. 1993. pp 13-44.
McConnell, Steve. Code Complete. 2nd Ed. 2004. pp 189-194.
Nicholson, Charles. Stupid C++ Tricks: Adventures in Assert. Originally posted on Power of Two Games. 2007.