Revision 1 (Kishan Parekh (TechPartner), 01/03/2023 04:47 PM) → Revision 2/6 (Kishan Parekh (TechPartner), 01/03/2023 04:54 PM)
# Coding Standards

## Why coding standards?

To understand where to start and where to head when designing coding standards, one must revisit some basic ideas about programming itself.

Novice programmers feel that as long as source is correct and bug-free, nothing else about it matters. With a little more experience, they learn about indenting. With a lot more experience, they learn about writing good comments. Most mediocre programmers write either too many comments or too few. This is because they are often unskilled in communicating, poor as writers and teachers. After insight into commenting comes insight into identifier design. One needs a good grasp of the English language to select sensible identifiers, which makes this a difficult job for unskilled communicators. Then, on a separate track, they learn about issues of software architecture, modularisation, and other aspects of good design. All this adds up to a large and difficult subject, and many books have been written about various corners of this territory.

What makes the software development process more confusing is that the large and thriving software development industry has attempted to term software design an engineering activity. It is an engineering activity only to the extent that a finished, tested software program is in many respects like a machine, and we know that it requires engineering to design and build machines. Because of this relationship with engineering, many assume that the process of building software is akin to that of building physical machines in a factory, where lines of skilled workers labour over their tasks in a long assembly line. The software development industry talks of developing software using the "software factory" approach, where roles are differentiated, and design, assembly, testing, and measurement are all assigned to separate teams.
This "factory" approach implicitly assumes that designing needs experienced designers, but code can be written by semi-skilled assembly-line workers. This extrapolation is rubbish. At best, this approach can produce only mediocre software to perform simple tasks like maintaining account balances for financial transactions. We believe that good code cannot be written this way.

A company engaged in executing one-shot software projects which address relatively simple business applications may sometimes succeed by using the "software factory" approach. But a software product company, which must live for many years with the goods and evils of the code it puts into its own products, cannot afford to leave coding to semi-skilled assembly-line workers. Developing difficult and long-lived software is a process very different from manufacturing machines in a factory. It is more like writing literature. You must know what you want to write, and why. Then you must have the skills to translate your realisations and insights into beautiful, simple and elegant code. This is a skilled, individual, cerebral activity. It is not a manufacturing process.

It is clear, by extrapolation, what we feel about industrial quality standards like ISO 9000-2001 and SEI CMM. We feel that these standards, when applied to software processes, do not necessarily help in the creation of beautiful, simple and elegant code. These standards are not without merit, though. They may bring standardisation and order into an otherwise unmanaged and unstructured software development process, especially when large developer teams are involved. The documentation that these standards impose helps to enforce a software *process*. But these standards cannot elevate the output of the team from structured, documented code to beautiful, simple and elegant code.
If one keeps in mind the similarity between writing literature and writing code, the reasons for this failure are obvious.

The goodness that we strive for in code, the goodness that we refer to with words like "beautiful, simple and elegant," is not an abstract or personal aesthetic that one aims for purely for emotional satisfaction. We firmly believe that good code makes good business sense. Good code costs less to write and maintain. It translates to happier customers and software users, and earns the software developing company more money in the long run. This may not always be true of one-shot software project companies, but is *always* true of software product companies. As a software products company, we have a business reason to take our software development process and the *goodness* of our code very seriously.

Once we understand this, we realise that coding standards are hard to get right. The belief that the output of a group of workers can be standardised is itself questionable in the context of software. This belief probably comes from the industrial-production mindset, and we have just said that good code is rarely born in that environment.

However, any code which expects to live long needs to be maintained; it needs care and feeding. This means that successive generations of workers need to tend to a body of code as it grows and morphs. It is important that some degree of uniformity be maintained between original code and new additions and modifications, however superficial this uniformity may be, purely to make this maintenance easier. Coding standards help to achieve this uniformity, provided they do not stifle the freedom a good programmer needs to write good code. Rules mandating self-documenting identifiers are a good example of standards which stifle good code. A sensitively designed coding standard should avoid such pitfalls.
This document attempts to specify coding standards to be used at Merce, keeping in mind the challenges that a coding standard designer must face. The quality of code is not determined by how closely it conforms to a coding standard; we must never let ourselves forget this. The yardsticks which determine excellence in code are far more difficult to pin down than a mere coding standard. We hope we will hold this in our hearts and minds as we approach the task of defining our coding standards and the larger task of developing software at Merce.

## Basic principles

Code has only a five-decade history. It is doubtful whether there were large software projects many years before the IBM OS/360 project. Therefore, coding standards have no centuries-old tradition to fall back on, no stories which can claim "This is how code was written in Middle England from the days of King Richard the Lionheart." In the absence of a tradition and philosophy of our own, we have to borrow from other, older disciplines to create a set of basic principles on which we can then do the detailing.

What are these basic principles? The following three come to mind:

* The basis of any coding standard should be the rules of good typesetting which are applied to English. These rules are quite robust, having seen more than a millennium of use and having evolved well with time. The low-level rules which govern how to place commas, parentheses, etc. are all directly applicable to source code. Since code is not organised into printed pages, page composition rules do not help much here. But source is written in units of source files. Therefore, a source file can be loosely treated as a chapter. Only loosely. After all, good source is read in a manner very similar to reading good English.
* Make the source file readable without the artefacts of typeface changes. All source files are laid out in a monospaced font of uniform size.
Therefore, normal typesetting artefacts like section headers and typeface changes are not available. The intelligent use of spaces and blank lines therefore becomes almost an art form of its own. Use too little and you crowd the lines. Use too much, and you break the flow and continuity. Just as blank lines control spacing in the vertical direction, the block structure of nested constructs is highlighted by indentation along the horizontal axis. Modern source is unreadable without strict indentation.
* Make the source file readable to a good programmer who is unfamiliar with the code. When designing a coding standard, your choices can make things easier or harder for different types of readers; no one size fits all. The different types of readers we encounter can be described as follows:
    * the bureaucrat, for whom code follows a template-oriented layout meant more for conforming to a standard than for assisting readability. Such code is, or at least appears to be, more suited to machine analysis than human reading. (See the section on self-documenting code for some examples and an explanation of the philosophy which drives such coding standards.)
    * the incompetent programmer, who, it is believed, needs copious amounts of comments and super-simple, widely-spaced tokens in his source. Usually such code is also written *by* incompetent programmers.
    * the good programmer who is intimately familiar with his code: no comments are needed, obfuscated code is not obfuscated to him, and coding standards are at some level unnecessary.
    * the good programmer who is unfamiliar with the code. For such programmers, some comments are a big help, and clear use of space to delineate blocks makes for easy reading.

You must choose the type of reader you want your coding standards to cater to. Many large software divisions or companies choose the bureaucratic approach.
Such companies also find it harder to retain really talented programmers; there is probably a connection. This document will assume that this coding standard is for the good programmer who is unfamiliar with the code.

## Language independent standard

This document aims to be independent of programming language. It applies to C, C++, Perl, shell scripts, and Java. Wherever there are language-specific issues, they have been highlighted with explicit observations. Therefore, it is implicit that this coding standard will not get into the deeper issues of language-specific source formatting. We feel that such detailing can be left to the programmer. After all, we expect this document to be used by good programmers.
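
As a small illustration of the second basic principle (blank lines for vertical grouping, indentation for block structure), here is a minimal sketch in C, one of the languages this document covers. The function `count_words` is hypothetical and invented for this example. The brace placement and four-space indent are one possible choice, not a rule laid down by this standard.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, for illustration only: count the words in a
 * NUL-terminated string, where a word is a run of non-space characters.
 */
size_t count_words(const char *text)
{
    const char *p;
    size_t count = 0;
    int in_word = 0;

    /* A blank line separates the declarations from the precondition. */
    assert(text != NULL);

    /* Indentation alone shows which statements belong to which branch. */
    for (p = text; *p != '\0'; p++) {
        if (isspace((unsigned char)*p)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            count++;
        }
    }

    return count;
}
```

A caller might write `size_t n = count_words("to be or not to be");`. The sketch relies on three blank lines and consistent indentation, with no typeface changes, to show a reader unfamiliar with the code where each step begins and ends.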