C Coding Style

These C coding style recommendations, like many others, are intended to improve the robustness, maintainability, portability and performance of C programs.

1. C Language Dialects

The main constraint on the C language dialect is that it should compile with "enhanced" C89 compilers, that is, C compilers that understand long long types and map them to 64 bits.

There is no reason to ensure that the C language dialect is acceptable to a standard C++ compiler. Compiling for C++ comes at the cost of explicit casting of all the pointer conversions from/to void*. Adding type casts is bad programming style. In addition C++ has polluted the C namespace with extra reserved keywords. See Incompatibilities Between ISO C and ISO C++ for further information.
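As a small illustration of the cast-free style, the sketch below allocates an array without casting the result of calloc. The helper name Example_newInts is hypothetical and not part of any library discussed here.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate 'count' zero-initialized 32-bit integers. */
static int32_t *
Example_newInts(size_t count)
{
    // The void* returned by calloc converts implicitly to int32_t*: no cast.
    int32_t *ints = calloc(count, sizeof(*ints));
    return ints;
}
```

A C++ compiler would reject the initialization of ints, which is the cost mentioned above.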

When compiling with GCC under -Wall -Wno-unused, there should be no warnings. The -Wno-unused option is needed because in debug mode there are checking macros that verify code invariants and these macros are defined to nothing when building high-performance code. When the checking macros are the only ones to refer to some variables, this yields irrelevant warnings about unused variables.

1.1. Use Standard Types

Use <stdbool.h>. This header contains a typedef for bool and macros for true and false.

Use types from <stdint.h>. This header defines: int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, intptr_t, uintptr_t. In particular, the last two types should be used in place of ptrdiff_t and size_t. Don't assume void* fits in plain int.
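A minimal sketch of holding a pointer value in an integer: the pointer is converted to uintptr_t, never to plain int. The helper name Example_isAligned is hypothetical, and the sketch assumes that the alignment is a power of two and that the converted value is the machine address, as on common targets.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: is 'pointer' aligned on 'alignment' bytes?
   'alignment' is assumed to be a power of two. */
static bool
Example_isAligned(const void *pointer, uintptr_t alignment)
{
    // uintptr_t is wide enough to hold the pointer value; plain int may not be.
    uintptr_t address = (uintptr_t)pointer;
    return (address & (alignment - 1)) == 0;
}
```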

Assume int may be only 16 bits wide? The impact is that we would have to use int_least32_t for loop indices and array indexing. Conversely, if int is at least 32 bits wide, we can use plain int for indexing.

These header files are always available in a properly organized package even with C89 compilers thanks to the GNU autotools.

1.2. C Language Extensions

Don't use the GCC language extensions.

Assume that only the C89 standard C library is available.

Don't use alloca(), as it can silently fail and return a bogus pointer.

Assume that some of the code may be compiled with GCC -fshort-enums. To ensure inter-operability, no library should export types that depend on the storage size of an enum. In particular, avoid exporting pointer to enum types and structure definitions with enum members.
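One possible way to follow this rule, sketched below, is to keep the enumeration for its constant values and to give the exported structure member a fixed-size integer type. The names ExampleColor and ExamplePixel are hypothetical.

```c
#include <stdint.h>

/* Hypothetical enumeration, used only for its constant values. */
typedef enum {
    ExampleColor_Red,
    ExampleColor_Green,
    ExampleColor_Blue
} ExampleColor;

/* Exported structure: the member has a fixed-size integer type,
   so the layout does not depend on -fshort-enums. */
typedef struct {
    int32_t COLOR;      /* Holds an ExampleColor value. */
    int32_t WEIGHT;
} ExamplePixel;

static void
ExamplePixel_setColor(ExamplePixel *this, ExampleColor color)
{
    this->COLOR = (int32_t)color;
}

static ExampleColor
ExamplePixel_color(const ExamplePixel *this)
{
    return (ExampleColor)this->COLOR;
}
```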

Don't step into the reserved C namespace. This includes identifiers starting with a double underscore, identifiers starting with a single underscore followed by capital letter, and identifiers ending with _t.

1.3. Use of C99 Features

Allow C++ line comments and comments after preprocessor directives. The // comment is part of C99 standard. And with old C compilers one can always use a modern cpp.

No other C99 features except restrict and static inline.

Assume that the compiler will apply C99 aliasing rules to optimize code. This yields better optimizations by removing dependences, but the assumption is invalid in most cases of pointer casting. Precisely, according to ANSI C:

7: An object shall have its stored value accessed only by an lvalue expression that has one of the following types: {footnote 73}

  • a type compatible with the effective type of the object,
  • a qualified version of a type compatible with the effective type of the object,
  • a type that is the signed or unsigned type corresponding to the effective type of the object,
  • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
  • a character type.

{footnote 73} The intent of this list is to specify those circumstances in which an object may or may not be aliased.

To disable optimizations based on alias-analysis of ANSI C, the GCC option -fno-strict-aliasing can be used as a work-around.
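Code can often stay within these rules without the work-around by copying bytes instead of casting pointers. The sketch below reads the bit pattern of a float through memcpy. The helper name Example_floatBits is hypothetical, and the sketch assumes that float and uint32_t have the same size.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the bit pattern of 'value'.
   Assumes sizeof(float) == sizeof(uint32_t). */
static uint32_t
Example_floatBits(float value)
{
    uint32_t bits = 0;
    // Writing *(uint32_t *)&value instead would access a float object
    // through an lvalue of an incompatible type.
    memcpy(&bits, &value, sizeof(bits));
    return bits;
}
```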

1.4. Machine Dependencies

May assume that the target machine is byte-addressable with 8-bit bytes. Byte-addressable means that incrementing a pointer increases its value by the size of the pointed-to type.

May assume that the target machine uses 2's complement arithmetic.

Don't assume that large shift amounts are modular, e.g. that (foo<<(-1)) equals (foo<<31) on a 32-bit machine.
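When a shift amount is not known to be in range, it can be tested before shifting. The sketch below is one possible convention (out-of-range amounts yield zero), not the only reasonable one. The helper name Example_shiftLeft is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: shift 'value' left by 'amount' bits.
   Amounts outside 0..31 yield 0 by convention. */
static uint32_t
Example_shiftLeft(uint32_t value, int32_t amount)
{
    // Do not rely on what the hardware does with out-of-range amounts.
    if (amount < 0 || amount >= 32) return 0;
    return value << amount;
}
```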

2. Lexical Conventions

2.1. White Space

All lines should fit within a 100-character window without wrapping.

Indent with 2 or 4 spaces, and do not allow the editor to convert 8 consecutive spaces into <TAB> characters.

The <TAB> characters are only allowed for aligning comments. No source code line should begin with a <TAB>.

Here is a piece of advice from the GNU Coding Standard that can be used:

Please use formfeed characters (control-L) to divide the program into pages at logical places (but not within a function). It does not matter just how long the pages are, since they do not have to fit on a printed page. The formfeeds should appear alone on lines by themselves.

2.2. Block Braces

Use the K&R recommended layout, that is, the opening brace of a compound-statement ends the preceding line and the closing brace is on its own line at the level of the containing statement:

if (cond) {
    ...
}
for (i = init; i < limit; i++) {
    ...
}
if (cond1) {
    ...
} else if (cond2) {
    ...
} else {
    ...
}

Always use braces unless the whole if, for or while and the controlled statement fit on one line:

if (cond)
    statement;              // Bad.
if (cond) statement;        // Better.
if (cond) {
    statement;              // Best.
}

2.3. Switch Statements

Each case statement is indented at the same level as the switch.

Every switch statement must include the default case.

switch (val) {
case 1:
    ...
    break;
case 2: {
    ...
    break;
}
default:
    ...
    break;
}

2.4. Logical Expressions

Ensure that the control expression in if, while, for tests a logical value:

if (pointer) {         // Implicit test against NULL: avoid
    ...
}
if (pointer != NULL) { // Explicit test of a logical value: better
    ...
}

Avoid comparing an unsigned value with a signed value. If it cannot be avoided, use explicit conversions.
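A sketch of an unavoidable mixed comparison written with an explicit conversion. With 32-bit int, a direct (left < right) would convert a negative left operand to a large unsigned value, so the sign is tested first. The helper name Example_lessThan is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: mathematical comparison of a signed value
   with an unsigned value. */
static bool
Example_lessThan(int32_t left, uint32_t right)
{
    // A negative value is below any unsigned value.
    if (left < 0) return true;
    // Here left is non-negative, so the explicit conversion keeps its value.
    return (uint32_t)left < right;
}
```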

2.5. Function Calls

Function calls should not have whitespace before the opening '('.

Use ', ' to separate the arguments.

2.6. Function Definitions

According to the GNU Coding Standard:

It is also important for function definitions to start the name of the function in column zero. This helps people to search for function definitions, and may also help certain tools recognize them. Thus, the proper format is this:

static char *
concat (char *s1, char *s2)
{
  ...
}

In ANSI C, if the arguments don't fit nicely on one line, split it like this:

int
lots_of_args (int an_integer, long a_long, short a_short,
              double a_double, float a_float)

2.7. Multi-Line Statements

When an expression does not fit on a single line, break it up according to these rules:

long_function(expr1, expr2,
              expr3, expr4, expr5);

So, according to these rules and by using the One True Brace Style (1TBS) as seen in K&R:

if (   cond1
    && (   cond2
        || cond3)) {
    ...
}
for (iter1 = val1,
     iter2 = val2;
     iter1 < limit1
     && iter2 < limit2;
     ++iter1,
     iter2 = iter2 + expr2) {
    ...
}

2.8. Line and Block Comments

Use TODO and HACK tags in comments where applicable, so they can be globally searched.

Use line comments introduced with // ... in function bodies. The comment should apply to the next line(s).

Use block comments /* ... */ outside functions, mainly before the declaration or definitions.

Avoid fancy markup language inside comments, whether XML-like or Doxygen. The markup, if any, should be WLM (Wiki Like Markup).

3. Identifier Names

There are four main conventions for capitalizing identifiers:

Pascal case
The first letter in the identifier and the first letter of each subsequent concatenated word are capitalized. For example BackColor.
Camel case
The first letter of an identifier is lowercase and the first letter of each subsequent concatenated word is capitalized. For example backColor.
Upper case
All letters in the identifier are capitalized. For example BACKCOLOR.
Linux case
All letters are lowercase and the words are separated by '_'. For example back_color.

3.1. Type Names

All types defined by a module should have a name whose prefix is the name of the module. In addition, there should be no underscores in such type names to distinguish them from function names. For instance in some module Module.h:

typedef int (*ModuleCompare)(int, int);
typedef struct {
  int16_t REAL;
  int16_t IMAG;
} ModuleComplex;

3.2. Functions

All function names consist of upper and lower-case letters, digits, and underscores. If a name consists of multiple words, the individual words are capitalized; the other letters are lower case. Underscores can be used to separate multiple idioms. Functions associated with a type should be prefixed by the name of the type followed by an underscore. Functions with external linkage not associated with a type should be prefixed by the module name followed by an underscore.

static void someFunc(...);
void ModuleObj_someObjFunc(...);
void Module_someFunc(...);

3.3. Structures

Each data structure or type should have a set of functions associated with it (whose declarations immediately follow the struct definition). The names of these functions must start with the type name:

struct ModuleString_ {
  uint16_t LENGTH;
  char *NAME;
};
typedef struct ModuleString_ ModuleString_, *ModuleString;
ModuleString ModuleString_Ctor(ModuleString this, char *);
ModuleString ModuleString_Copy(ModuleString this, ModuleString that);
void ModuleString_Dtor(ModuleString this);
size_t ModuleString_Size(ModuleString this, char *);
uint16_t ModuleString_length(ModuleString this);
void ModuleString_setLength(ModuleString this, uint16_t length);
const char *ModuleString_name(ModuleString this);
void ModuleString_setName(ModuleString this, const char *name);

3.4. Enumerations

Enumeration names should have the same capitalization rules as types. All enumeration member names should use the enumeration name as a prefix. All enumerations should have some method to allow clients to print the enumeration.

typedef enum {
  ModuleFlag_Opened,
  ModuleFlag_Closed,
  ModuleFlag_
} ModuleFlag;
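The printing requirement could be met by a function that maps each member to its name. This is only a sketch: the function name ModuleFlag_name is hypothetical, and mapping any other value (including the trailing ModuleFlag_ member) to "?" is an assumption.

```c
typedef enum {
    ModuleFlag_Opened,
    ModuleFlag_Closed,
    ModuleFlag_
} ModuleFlag;

/* Hypothetical function: return a printable name for 'flag'. */
static const char *
ModuleFlag_name(ModuleFlag flag)
{
    switch (flag) {
    case ModuleFlag_Opened:
        return "Opened";
    case ModuleFlag_Closed:
        return "Closed";
    default:
        // Assumption: values without a name print as "?".
        return "?";
    }
}
```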

3.5. Global/Local Variables

All global variable names should be prefixed by the namespace followed by an underscore.

Many local variables are pointers to structures. Assuming the type(def) name of the structure pointer uses Pascal case, the local variable should have the corresponding name in camel case:

ModuleString_ string_;
ModuleString string = ModuleString_Ctor(&string_, name);
ModuleString this_string, string_1, string_2, old_string;

3.6. Macro Names

Macro names should start with the namespace prefix and have the remainder in upper case.

Temporary variables used inside macros should also follow this convention to avoid unexpected name clashes.
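A sketch of both rules, assuming a namespace called Module. The macro name Module_SWAP_INT32 and its temporary are hypothetical.

```c
#include <stdint.h>

/* Hypothetical macro: swap two int32_t lvalues.
   The temporary carries the macro's own prefix, so it cannot clash
   with a caller's variable named 'temp'. */
#define Module_SWAP_INT32(a, b) do { \
    int32_t Module_SWAP_INT32_TEMP = (a); \
    (a) = (b); \
    (b) = Module_SWAP_INT32_TEMP; \
} while (0)
```

With this naming, a call such as Module_SWAP_INT32(temp, other) works even though the caller has a variable named temp.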

4. Writing Better Code

4.1. Function Inlining

When the programmer judges inlining preferable for a given function, even if the cost of not inlining it would be affordable, she should declare it static inline.

Using static inline might be a performance problem for compilers that do not honor inline. These coding conventions recommend using function-like macros for accessing structure members between friends (modules of the same library), but keeping real functions for anything public (exported for use outside the library).
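The two forms of accessor can be sketched side by side. The type ExampleString and both accessor names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint16_t LENGTH;
    char *NAME;
} ExampleString;

/* Function-like macro, for use between friend modules of the library. */
#define ExampleString_LENGTH(this) ((this)->LENGTH)

/* Equivalent static inline accessor, for compilers that honor inline. */
static inline uint16_t
ExampleString_length(const ExampleString *this)
{
    return this->LENGTH;
}
```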

4.2. Variable Initializations

All scalar local variables should be initialized in their declaration. If a useful initialization value is not available, initialize the variable with zero.

All global variables should have one definition and zero or more declarations. Do not rely on the "common" feature of C for global variables, where several uninitialized definitions of the same variable are merged at link time.
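A sketch of the one-definition rule, shown here in a single listing. In a real package the declaration would sit in a header and the definition in exactly one source file. The variable name Module_verbosity is hypothetical.

```c
#include <stdint.h>

/* In the header (Module.h): a declaration, visible to all clients. */
extern int32_t Module_verbosity;

/* In exactly one source file (Module.c): the definition, with an initializer. */
int32_t Module_verbosity = 0;
```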

4.3. Unlocking Instruction Scheduling

Avoid using expressions that involve several memory indirections. These should be broken into a series of simpler access steps.

Put rvalues in local variables as early as possible.

Assign to lvalues as late as possible.
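These three rules can be sketched on a small reduction loop. The type ExampleVector and the function name are hypothetical.

```c
#include <stdint.h>

typedef struct {
    int32_t *DATA;
    int32_t COUNT;
    int32_t SUM;
} ExampleVector;

/* Hypothetical function: store the sum of the elements into this->SUM. */
static void
ExampleVector_computeSum(ExampleVector *this)
{
    // Read the members into local variables once, as early as possible.
    const int32_t *data = this->DATA;
    int32_t count = this->COUNT;
    int32_t sum = 0;
    int32_t i = 0;
    for (i = 0; i < count; i++) {
        sum += data[i];
    }
    // Assign to the structure member once, as late as possible.
    this->SUM = sum;
}
```

Writing this->SUM += this->DATA[i] inside the loop would repeat the memory indirections at every iteration, unless the compiler can prove that DATA and SUM do not alias.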

4.4. Memory Alignment of Objects

Lay out structure members in order of decreasing memory alignment constraint. Assume pointers are 32-bit or wider, so their alignment constraint is the same as that of int32_t or larger.

Avoid misaligned data even if the architecture supports it. Native types should be at addresses that are multiples of their size.
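The effect of member ordering can be sketched with two hypothetical structures that hold the same members. The exact sizes depend on the target ABI, so none are stated here.

```c
#include <stdint.h>

/* Members in arbitrary order: padding is typically inserted
   after FLAG and after KIND to align the members that follow. */
typedef struct {
    int8_t FLAG;
    int64_t TOTAL;
    int8_t KIND;
    int32_t COUNT;
} ExampleLoose;

/* Same members in order of decreasing alignment constraint:
   padding, if any, is only needed at the end. */
typedef struct {
    int64_t TOTAL;
    int32_t COUNT;
    int8_t FLAG;
    int8_t KIND;
} ExampleOrdered;
```

On common targets sizeof(ExampleOrdered) is smaller than sizeof(ExampleLoose).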

4.5. Checking Code Invariants

Design by contract. According to Bertrand Meyer, the key is "viewing the relationship between a class and its clients as a formal agreement, expressing each party's rights and obligations". In a procedural language, this implies that the caller must meet all the preconditions of the function being called, and that the callee must meet its own postconditions. The failure of either party to live up to the terms of the contract is a bug in the software. The minimum support for design by contract is to provide REQUIRE and ENSURE macros that check preconditions and postconditions respectively.

Distinguish the kinds of code invariants by checking them with different macros, such as Except_REQUIRE, Except_ENSURE and Except_CHECK.
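The definitions of these macros are not given here, so the sketch below assumes they map to assert in checking builds and to nothing otherwise. The configuration macro Example_CHECKING and the function Example_average are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed definitions: active only when Example_CHECKING is defined. */
#ifdef Example_CHECKING
#define Except_REQUIRE(cond) assert(cond)
#define Except_ENSURE(cond)  assert(cond)
#else
#define Except_REQUIRE(cond) ((void)0)
#define Except_ENSURE(cond)  ((void)0)
#endif

/* Hypothetical function: integer average of two non-negative values. */
static int32_t
Example_average(int32_t low, int32_t high)
{
    int32_t result = 0;
    // Precondition: the caller's obligation.
    Except_REQUIRE(0 <= low && low <= high);
    result = low + (high - low)/2;
    // Postcondition: the callee's obligation.
    Except_ENSURE(low <= result && result <= high);
    return result;
}
```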

4.6. Integer Variable Overflow

Try to avoid unsigned integer types for variables involved in arithmetic expressions, especially with loop induction variables. This is because overflow of unsigned integers has a defined effect in C, as opposed to signed integer overflow. A C compiler may be more aggressive with signed integer arithmetic because it may assume overflow will not occur.

Beware of structure members that use small integers. While this is useful to reduce the structure footprint, any mutator function should check that the value stored equals the value assigned to the structure member.
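A sketch of such a mutator, which reports whether the stored value equals the assigned one. A real mutator might use a checking macro instead of a return value. The type ExampleBuffer and the function name are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t LENGTH;
    char *NAME;
} ExampleBuffer;

/* Hypothetical mutator: returns false if 'length' does not fit the member. */
static bool
ExampleBuffer_setLength(ExampleBuffer *this, int32_t length)
{
    // The conversion silently discards the bits that do not fit in 16 bits.
    this->LENGTH = (uint16_t)length;
    // Compare the value stored with the value assigned.
    return this->LENGTH == length;
}
```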

4.7. Exception Handling

The C language does not have language support for exception handling. However, the setjmp and longjmp standard functions can be used to provide this support; see Exception Handling in C without C++.

We are writing system code for embedded applications, so the exceptions we must be able to recover from are out-of-memory conditions. Whenever an out-of-memory condition occurs, the code should longjmp to a point where the failed processing can be bypassed.
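A minimal sketch of this recovery scheme. All names are hypothetical, and the fixed budget Example_BUDGET is an assumption added so that the failure path is easy to exercise.

```c
#include <setjmp.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical memory budget, in bytes. */
#define Example_BUDGET 1024

/* Hypothetical recovery point, set by code that can bypass a failure. */
static jmp_buf Example_recovery;

/* Hypothetical allocator: jumps to the recovery point instead of returning NULL. */
static void *
Example_alloc(size_t size)
{
    void *memory = NULL;
    if (size <= Example_BUDGET) memory = malloc(size);
    if (memory == NULL) longjmp(Example_recovery, 1);
    return memory;
}

/* Hypothetical processing step: returns false when it had to be bypassed. */
static bool
Example_process(size_t size)
{
    char *buffer = NULL;
    if (setjmp(Example_recovery) != 0) {
        // Reached through longjmp: the failed processing is bypassed.
        return false;
    }
    buffer = Example_alloc(size);
    buffer[0] = '\0';
    free(buffer);
    return true;
}
```

Code between the setjmp and the failing allocation must not hold resources that the longjmp would leak. This sketch allocates nothing before the call that may fail.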

5. Package Organization

A package is a set of libraries or executables that is configured and installed as a unit. This is the traditional notion of GNU source packages. The package should have a configure script generated by GNU autoconf and its top-level Makefile must obey the GNU Coding Standard. One option is to write a Makefile.am and use GNU automake.