Identifier naming convention for object-oriented software development
Developed by Jeremy Neal Kelly
www.anthemion.org
Version 4.2
January 9, 2010
Iowan Notation is an identifier naming convention for object-oriented software development. It was designed originally for use with C++, but is largely applicable to other object-oriented languages. Its purpose is twofold:
| 1) | To make identifiers easier to generate and easier to read by promoting a 'separation of concerns' within them; |
| 2) | To avoid common mistakes by documenting language features with unusual side effects. |
Software development requires careful management of numerous details from varied domains: business concerns must be modeled, the platform must be usefully abstracted, and, at the lowest level, the programming language and its particular way of representing and transforming data must be carefully exploited.
To best aid the developer, identifiers must document all these concerns while remaining concise and legible. A program that fails to meet this standard fails necessarily to represent its problem clearly, and errors are certain to result. Identifiers should also be easy to generate, and ideally, predictable, such that different developers can hope to produce the same name for a given element.
Parts of this problem are too difficult to solve in a general way; business concerns and the data structures and algorithms that represent them vary too much to be categorized from afar. Other concerns are well-suited to categorization, however: language features, for instance, which are clearly defined and relatively few in number.
Iowan Notation honors this distinction by enforcing a 'separation of concerns' within identifiers. It labels important language features with a range of one-character prefixes, leaving the identifier root, which follows the prefixes, to document high-level concepts without concern for technical detail.
Consider a database application. The 'table' concept can be generally identified by the root 'Tbl'. Throughout the program, however, this concept may be used and referenced in many distinct ways. Applying some of the more common prefixes:
| Identifier | Referent |
| tTbl | Table type |
| oTbl | Local table instance |
| gpTbl | Global pointer to a table instance |
| esTbl | Class-private static table instance |
| arTbl | Function parameter passing a non-const reference to a table instance |
As shown, Iowan Notation allows a single root to produce a range of identifiers that are both concise and descriptive, distinct, yet obviously related. Details are documented without obscuring commonalities. The result is semantically dense but highly legible code.
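The table above can be sketched in C++ roughly as follows. This is an illustration only: the tCat class, the sReg function, and the gDemoTbl function are hypothetical names invented for this example, and the prefix order follows the identifiers shown in the table.

```cpp
#include <cassert>
#include <string>

// 't': a type. The public, non-static member 'Name' takes no prefix.
struct tTbl {
  std::string Name;
};

// 'g' + 'p': global pointer to a table instance.
tTbl* gpTbl = nullptr;

// Hypothetical catalog class, used only to host the static members.
class tCat {
public:
  // 's': static class function. 'a' + 'r': parameter passing a non-const reference.
  static void sReg(tTbl& arTbl) {
    esTbl = arTbl;      // copy into the class-private static instance
    gpTbl = &esTbl;     // publish it through the global pointer
    arTbl.Name += "*";  // visible to the caller, as the 'r' prefix warns
  }

private:
  // 'e' + 's': class-private static table instance.
  static tTbl esTbl;
};

tTbl tCat::esTbl;

// 'g': global function.
std::string gDemoTbl() {
  tTbl oTbl;  // 'o': local table instance
  oTbl.Name = "Cust";
  tCat::sReg(oTbl);
  assert(gpTbl != nullptr);
  return gpTbl->Name + "/" + oTbl.Name;
}
```

Every identifier shares the root 'Tbl', yet each one states how the table is held and where it lives.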
Static analysis tools like 'lint' help developers write reliable code. Some mistakes defy such analysis, however, as when technically correct code generates unwanted side-effects. The detection of these mistakes cannot be automated without needlessly flagging correct code. It can be facilitated, however, by labeling unusual properties. In Iowan Notation:
| 1) | References, static variables, and union members are marked to show that changes to such variables affect other variables or scopes; |
| 2) | Virtual functions are marked to show that their behavior may change when they are overridden in subclasses; |
| 3) | Macros are marked to warn against side-effects from duplicated parameter expressions. |
By documenting scopes, the notation provides an additional safeguard. Because the same concepts are referenced repeatedly throughout a program, it is easy to hide or 'shadow' names in wider scopes when declaring entities in narrower ones. This effect is difficult to spot, and can cause significant confusion. By generating distinct identifiers in distinct scopes, the notation avoids most such conflicts automatically.
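As a small sketch of this safeguard, consider a 'count' concept that appears in four scopes at once. The tCtr class and its members are hypothetical names chosen for this example.

```cpp
#include <cassert>

int gCt = 0;  // 'g': global count

class tCtr {
public:
  explicit tCtr(int aCt): eCt(aCt) {}

  // Adds aCt to the member count and to the global count.
  int Add(int aCt) {
    assert(aCt >= 0);
    // Global, member, parameter, and local all share the root 'Ct',
    // but none of them hides another.
    int oCt = eCt + aCt;
    eCt = oCt;
    gCt += aCt;
    return oCt;
  }

private:
  int eCt;  // 'e': class-private count
};
```

Without the prefixes, all four entities would most naturally be named 'Ct', and the parameter would silently hide both the member and the global.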
Unlike other notations, Iowan Notation does not document specific types. There are several reasons for this:
| 1) | Compilers flag most type-safety violations, so there is little need for help in this area; |
| 2) | A prefix list cannot enumerate non-fundamental types, which are innumerable, and of which it is necessarily ignorant; |
| 3) | Types often change in the course of development, and many such changes — replacing a fundamental type with a comparable larger type, replacing a class with another implementing the same interface — can be made to well-written code without detailed review. Requiring identifier changes in such cases would create unnecessary work for the developer. |
Though types are not explicitly documented, the identifier root strongly hints at a variable's type simply by describing its role.
In Iowan Notation, identifiers begin with zero or more lowercase prefix characters, one for each of the criteria below:
Non-static class-public variables and functions meet no criterion, and their identifiers are not prefixed. Other identifiers are prefixed at least once. Only qualities inherent to the identified entity are labeled; pointers to pointers, for instance, are prefixed with one 'p', not two. When multiple prefixes apply, they are affixed in the order specified above, with the exception of 'z', which may be placed anywhere within the prefix.
The identifier root follows the prefixes; it includes whatever text most succinctly describes the concept being represented. The first letter of every word within the root is capitalized, as in 'CamelCase':
string oNameLast;
It is acceptable and occasionally desirable to omit the root, producing an identifier of prefixes only, as in the loop index below:
for (int o = 0; o < eFlds.Ct(); ++o)
  cout << eFlds[o].Name() << endl;
|
z (no criterion) |
The 'z' prefix has no set meaning. It may be used to resolve name collisions with reserved words or third-party code, or for any other reason. |
|
t Type |
The 't' prefix applies to user-defined types and typedefs. It allows the same root to be shared by a type and an instance of that type:
tTbl oTbl;
It does not apply to template type parameters, for which the 'x' prefix is used instead. |
|
x Template parameter |
The 'x' prefix applies to template parameters, both type and non-type. By distinguishing template parameters, it clarifies template design:
template<class xNum>
struct gtPt {
xNum X, Y;
};
|
|
f Interface |
In languages that support them, the 'f' prefix applies to interfaces. It allows the same root to be shared by an interface and a type or instance implementing that interface. |
|
m Macro |
The 'm' prefix applies to macros. It warns against side-effects from duplicated parameter expressions and other preprocessor oddities:
// Probably a bad idea:
int oMin = mMin(++oX, ++oY);
|
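The hazard can be made concrete with a minimal sketch. The definition of mMin below is an assumption (the document does not give one), and gMin, tRes, gViaMacro, gViaFunc, and gCheckMin are hypothetical names. The macro parameters carry 'z', as the notation permits for avoiding collisions.

```cpp
#include <cassert>

// Assumed definition: the selected argument expression is evaluated twice.
#define mMin(zX, zY) ((zX) < (zY) ? (zX) : (zY))

// Function alternative: each argument is evaluated exactly once.
inline int gMin(int aX, int aY) { return aX < aY ? aX : aY; }

struct tRes { int Min, X, Y; };

tRes gViaMacro() {
  int oX = 1, oY = 5;
  int oMin = mMin(++oX, ++oY);  // ++oX runs in the test and again in the result
  tRes oRes = { oMin, oX, oY };
  return oRes;
}

tRes gViaFunc() {
  int oX = 1, oY = 5;
  int oMin = gMin(++oX, ++oY);  // each increment runs once
  tRes oRes = { oMin, oX, oY };
  return oRes;
}

void gCheckMin() {
  assert(gViaMacro().X == 3);  // incremented twice by the macro
  assert(gViaFunc().X == 2);   // incremented once by the function
}
```

The call sites look identical apart from the prefix, which is the only hint that the first one deserves a second look.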
|
n Namespace |
The 'n' prefix applies to namespaces. It helps distinguish namespaces from types, as when nested types or class-static entities are invoked:
// ctLog is a type:
ctLog::tLine oLine;
// nStr is a namespace:
oText = nStr::gTrim(oLine.Text());
|
|
g Global element |
The 'g' prefix applies to global types, interfaces, variables, and functions. It also applies to the values of global enumeration types, which are essentially global constants:
enum gtTurn {
gTurnLeft,
gTurnRight
};
In C#, the prefix can be omitted from enumeration values, since these are qualified with the type name when used. |
|
c Protected class member or internal linkage |
The 'c' prefix applies to class-protected types, interfaces, variables, and functions, as well as those with file scope and internal linkage. Outside of C#, it also applies to enumeration values with such scopes, these being essentially class-protected or internal-linkage constants. |
|
e Private class member |
The 'e' prefix applies to class-private types, interfaces, variables, and functions. Outside of C#, it also applies to the values of private enumeration types, which are essentially class-private constants. Distinguishing protected from private entities is helpful when implementing parent classes. |
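A minimal sketch of the protected/private distinction from a subclass's point of view. The tDoc and tDocLoud classes and their members are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>

class tDoc {
public:
  explicit tDoc(const std::string& aText): cText(aText), eCtRead(0) {}

  // Returns the text and counts the read.
  std::string Text() {
    ++eCtRead;
    return cText;
  }
  int CtRead() const { return eCtRead; }

protected:
  std::string cText;  // 'c': subclasses may use this

private:
  int eCtRead;        // 'e': only tDoc itself may use this
};

class tDocLoud: public tDoc {
public:
  explicit tDocLoud(const std::string& aText): tDoc(aText) {
    assert(!aText.empty());
    cText += "!";  // allowed: the prefix shows the member is protected
    // ++eCtRead;  // would not compile: the prefix shows the member is private
  }
};
```

When writing tDoc, the prefixes show at a glance which members a subclass could already depend on.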
|
a Function parameter |
The 'a' prefix applies to function parameters. Though they are essentially local variables, it is useful to distinguish them because modifications to reference parameters change data outside the local scope. This prefix does not apply to macro parameters, as these are not necessarily variables, and macros do not define scopes as such. |
|
o Local element |
The 'o' prefix applies to local variables. In languages that support them, it also applies to local types and functions. |
|
s Static class member or local static variable |
The 's' prefix applies to static class members and local static variables. It warns that changes to such variables manifest outside the current invocation or instance. It also warns against static initialization and deinitialization order fiascos. The prefix does not apply to globals declared 'static' for internal linkage. |
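A short sketch of both cases, a local static variable and a static class member. The names gIDNext, tSess, and sCt are hypothetical; the 'os' and 'es' combinations follow the scope-then-static order seen in 'esTbl'.

```cpp
#include <cassert>

// Returns 1, 2, 3, ... on successive calls.
int gIDNext() {
  static int osID = 0;  // 'o' + 's': local static, survives between calls
  return ++osID;
}

class tSess {
public:
  tSess() { ++esCt; }
  tSess(const tSess&) { ++esCt; }
  ~tSess() {
    assert(esCt > 0);
    --esCt;
  }

  // 's': static class function reporting the number of live instances.
  static int sCt() { return esCt; }

private:
  static int esCt;  // 'e' + 's': shared by every tSess instance
};

int tSess::esCt = 0;
```

In both cases the 's' marks state that outlives the call or instance that changes it.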
|
v Virtual function |
The 'v' prefix applies to virtual class functions. It warns that a function's behavior may change in subclasses, and helps keep virtual functions from being called unknowingly within constructors, where subclass overrides are not yet in effect. The prefix obviously cannot be applied to virtual destructors. |
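The constructor hazard can be sketched as follows. The tShape and tCirc classes, along with gName and gCheckShape, are hypothetical names for illustration. Both virtual functions are public, so they carry 'v' alone.

```cpp
#include <cassert>
#include <string>

class tShape {
public:
  // During construction of the base part, the call below resolves to
  // tShape::vName, even when a tCirc is being built.
  tShape() { eNameAtCtor = vName(); }
  virtual ~tShape() {}

  virtual std::string vName() const { return "shape"; }
  std::string NameAtCtor() const { return eNameAtCtor; }

private:
  std::string eNameAtCtor;
};

class tCirc: public tShape {
public:
  std::string vName() const override { return "circle"; }
};

// Calls through a base reference reach the override once construction is done.
std::string gName(const tShape& aShape) { return aShape.vName(); }

void gCheckShape() {
  tCirc oCirc;
  assert(oCirc.NameAtCtor() == "shape");  // override not used in the constructor
  assert(gName(oCirc) == "circle");       // override used afterwards
}
```

The 'v' in the constructor body is the visible cue that the call may not do what the subclass author expects.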
|
u Union member |
The 'u' prefix applies to union members. It warns that changes to such variables overwrite other parts of the union. |
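A minimal sketch of a tagged union with prefixed members. The tVal type and the gSetInt, gSetReal, and gInt functions are hypothetical names; the tag is added here so that only the most recently written member is ever read.

```cpp
#include <cassert>

struct tVal {
  enum tKind { KindInt, KindReal };
  tKind Kind;  // records which union member is currently valid
  union {
    int uInt;      // 'u': shares storage with uReal
    double uReal;  // 'u': shares storage with uInt
  };
};

void gSetInt(tVal& arVal, int aInt) {
  arVal.Kind = tVal::KindInt;
  arVal.uInt = aInt;
}

void gSetReal(tVal& arVal, double aReal) {
  arVal.Kind = tVal::KindReal;
  arVal.uReal = aReal;  // overwrites the storage that held uInt
}

// Reads the integer, but only while it is the valid member.
int gInt(const tVal& aVal) {
  assert(aVal.Kind == tVal::KindInt);
  return aVal.uInt;
}
```

The 'u' on each member is a reminder that an assignment to one invalidates the other.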
|
r Non-const reference |
The 'r' prefix applies to non-const references. It shows that modifications to such variables affect other variables, possibly outside the current scope. Because the data they refer to cannot be modified through them, const references do not receive this prefix. In C#, ref and out parameters include this prefix. Class or 'object' references, which are entirely distinct from C++ references, are prefixed with 'q' instead. |
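A short C++ sketch contrasting const and non-const references. The functions gSum, gScale, and gDemoScale are hypothetical names used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Const reference parameter: 'a' only, since nothing can change through it.
int gSum(const std::vector<int>& aNums) {
  int oSum = 0;
  for (std::size_t o = 0; o < aNums.size(); ++o) oSum += aNums[o];
  return oSum;
}

// Non-const reference parameter: 'a' + 'r', so the caller's data will change.
void gScale(std::vector<int>& arNums, int aMult) {
  for (std::size_t o = 0; o < arNums.size(); ++o) {
    int& orNum = arNums[o];  // 'o' + 'r': local non-const reference
    orNum *= aMult;
  }
}

int gDemoScale() {
  std::vector<int> oNums = {1, 2, 3};
  assert(gSum(oNums) == 6);
  gScale(oNums, 2);  // modifies oNums in place
  return gSum(oNums);
}
```

At the declaration, 'arNums' announces the side effect that 'aNums' rules out.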
|
q Object reference |
In languages that support them, the 'q' prefix applies to object references and object-referenced types. It shows that changes to a referenced instance persist outside the current scope. In C#, it also distinguishes classes, which use the prefix, from structures, which do not. This guards against unwanted copying and boxing. Note that object references are not part of C++; 'references' in this language differ fundamentally from the 'object references' found in C#, Java, and Delphi. C# does include C++-like references, however, in the form of ref and out parameters. |
|
p Data pointer |
The 'p' prefix applies to data pointers and data pointer types:
typedef tFld* tpFld;
tpFld opFld = 0;
|
|
d Function pointer or delegate |
The 'd' prefix applies to function pointers and function pointer types, including those referencing class functions. In languages that support them, it also applies to delegates. Distinguishing function pointers from data pointers allows the same root to be used by variables of both types:
typedef tTbl* (* tdTbl)(const string&);
tdTbl odTbl = &eTblFromFile;
tTbl* opTbl = odTbl("Cust");
|
|
i Iterator |
The 'i' prefix applies to iterators and iterator types. Though not a language feature, iterators are a common means of address, and labeling them allows the same root to be used when a concept is referenced in distinct ways:
tiRec oiRec(oTbl.First());
for (; !oiRec.EOT(); ++oiRec)
  cout << *oiRec << endl;
|
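The same idea can be sketched with standard library iterators. This assumes records are plain strings held in a std::vector; tRecs, tiRec, gJoin, and gCheckJoin are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <vector>

typedef std::vector<std::string> tRecs;
typedef tRecs::const_iterator tiRec;  // 't' + 'i': iterator type

// Joins all records, ending each with a semicolon.
std::string gJoin(const tRecs& aRecs) {
  std::string oText;
  for (tiRec oiRec = aRecs.begin(); oiRec != aRecs.end(); ++oiRec) {
    const std::string& oRec = *oiRec;  // const reference: no 'r'
    oText += oRec;
    oText += ';';
  }
  return oText;
}

void gCheckJoin() {
  tRecs oRecs = {"a", "b"};
  assert(gJoin(oRecs) == "a;b;");
}
```

Here 'oiRec' and 'oRec' share a root, and the 'i' alone shows which one addresses the record and which one is the record.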
|
h Handle |
The 'h' prefix applies to handles and handle types. Though not a language feature, handles are a common means of address, and labeling them allows the same root to be used when a concept is referenced in distinct ways:
typedef int thRec;
tRec oRec(oTbl.First());
thRec ohRec = oRec.h();
|
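A self-contained sketch of handles in use, assuming a handle is simply an index into a pool of records. The tPool class and its Add and Rec functions are hypothetical and are not part of the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

typedef int thRec;  // 't' + 'h': handle type, here an index into the pool

class tPool {
public:
  // Stores a record and returns a handle to it.
  thRec Add(const std::string& aRec) {
    eRecs.push_back(aRec);
    return static_cast<thRec>(eRecs.size() - 1);
  }

  // Looks up a record by handle; 'a' + 'h' marks a handle parameter.
  const std::string& Rec(thRec ahRec) const {
    assert(ahRec >= 0 && static_cast<std::size_t>(ahRec) < eRecs.size());
    return eRecs[static_cast<std::size_t>(ahRec)];
  }

private:
  std::vector<std::string> eRecs;
};
```

Because a handle here is a bare int, the 'h' is the only thing that separates it from an ordinary count or index.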
Earlier notation versions did not apply scope prefixes — 'g', 'c', and 'e' — to types or interfaces. Such elements can produce the same name-hiding problems that afflict variables and functions, so it makes sense to label their scopes. On the other hand, types are defined less often than instances, somewhat limiting the chance of a type name collision. Longer prefix strings are also less legible.
The decision ultimately may be better left to the programmer. Those who define many nested types may prefer to document type scopes. Those who do not may prefer to prefix types with 't' alone.
It is possible to change the access level of a virtual function when overriding it, rendering the scope prefix assigned to that function invalid. There is no way for a naming convention to account for this, however, and the practice is questionable from a design perspective, so it seems best simply to avoid it.
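The problem can be sketched in C++, where an override may be declared with a different access level than the function it overrides. The tTask and tTaskHid classes, along with gRun and gCheckRun, are hypothetical names.

```cpp
#include <cassert>

class tTask {
public:
  virtual ~tTask() {}
  // Public virtual function: 'v' with no scope prefix.
  virtual int vRun() { return 1; }
};

class tTaskHid: public tTask {
private:
  // Now private, so the notation would call for 'e' as well, but the
  // name must stay 'vRun' for this to remain an override.
  int vRun() override { return 2; }
};

// Access is checked against tTask, so the private override is still reached.
int gRun(tTask& arTask) { return arTask.vRun(); }

void gCheckRun() {
  tTaskHid oTask;
  assert(gRun(oTask) == 2);
}
```

No single name can be right for both classes, which is why the practice is better avoided.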
It would occasionally be helpful to label macro parameters, but the 'a' prefix would be misleading here, and it seems wasteful to dedicate a new prefix to this concern. To prevent name collisions, macro parameters may be prefixed with 'z'.
In earlier versions of the notation, functions were labeled with 'r' or 'p' if they returned non-const references or pointers. This clarified the effect of modifying such return values, but it made function identifiers somewhat difficult to interpret. The current version is simpler, and even without help, it seems unlikely that return value modification could cause much confusion, as it is possible only when references or pointers are returned:
IDNext() = oID;
Conversely, storing a return value for later use requires the declaration of a variable, which necessarily bears the appropriate prefixes:
int* opID = IDNext();
*opID = oID;
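Both cases can be sketched together, assuming an IDNext function that returns a non-const reference. The tIDs class, its ID function, and gDemoID are hypothetical names added for this example.

```cpp
#include <cassert>

class tIDs {
public:
  tIDs(): eID(0) {}

  // Returns a non-const reference; the function name itself carries no 'r'.
  int& IDNext() { return eID; }
  int ID() const { return eID; }

private:
  int eID;
};

int gDemoID() {
  tIDs oIDs;

  // Direct modification: assigning to a call result is unusual enough
  // to draw attention without a prefix.
  oIDs.IDNext() = 7;
  assert(oIDs.ID() == 7);

  // Stored for later use: the variable must be declared, and its
  // declaration carries the 'r' prefix.
  int& orID = oIDs.IDNext();
  orID = 9;
  return oIDs.ID();
}
```

In the second case the prefix travels with the variable, so later assignments to it remain recognizable as writes into the tIDs instance.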
Most resource management can and should be handled with RAII. This approach is sometimes impractical, however, and cannot be implemented in a meaningful sense within C# or Java. It might be useful, therefore, to label types and functions that incur cleanup obligations. This would go somewhat beyond documenting 'language features', however, and prefix strings in C# seem already too long. For now, at least, cleanup obligations remain undocumented.