Source code testing rules, standards, references, etc. (incl. ParaSoft JTest).

In conversations with customers, I frequently see confusion about how "standards" are enforced with automated source code testing tools. Below is my summary of how this works. I mostly use ParaSoft JTest, but the same would apply to comparable tools from CAST Software, Fortify, Borland/Together, etc.

I'm using a certain vocabulary here just to make certain points clear. In my observation, it matters less that a project adopt this particular vocabulary than that the project participants have clarity about these ideas and use whatever vocabulary works for them.

Rules - Source code analysis tools like JTest come with hundreds of discrete, fine-grained "rules". For example, a JTest rule on JavaBeans [BEAN.JDBC-4] says "Do not use JDBC code in JavaBean classes". One of the optimization rules [OPT.PCS-3] says, "Use charAt() instead of startsWith() for one character comparisons". As you can see, these rules are very specific and technical.
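To make the granularity concrete, here is a minimal sketch of the kind of code the one-character comparison rule is about. The class and method names are hypothetical and chosen only for illustration; this shows the coding pattern, not how JTest itself detects it.

```java
// Illustration of the pattern behind a rule like OPT.PCS-3.
// Class and method names are hypothetical, for illustration only.
class OneCharComparison {

    // Style the rule would flag: startsWith() with a one-character string.
    static boolean isAbsoluteFlagged(String path) {
        return path.startsWith("/");
    }

    // Style the rule asks for: compare the first character directly.
    // The length check matters because charAt(0) throws on an empty string.
    static boolean isAbsolutePreferred(String path) {
        return path.length() > 0 && path.charAt(0) == '/';
    }

    public static void main(String[] args) {
        System.out.println(isAbsoluteFlagged("/tmp"));
        System.out.println(isAbsolutePreferred("/tmp"));
    }
}
```

Both methods give the same answer for any non-null string; the rule is purely about which form to prefer.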

These rules are intended to be turned on or off individually. JTest, for example, comes with about 493 of these rules. To use these tools successfully, endeavors of some defined scope (a subsystem, project, program, organization, etc.) need to figure out which of the rules they will turn on and which they will turn off. Rules may be included or excluded for a variety of reasons:
 - The rules may apply to a technology (J2ME, EJBs, etc.) that isn't being used in the code to be examined.
 - The rules may reflect "best practices" from the past (and be applicable to code of that era) but not reflect contemporary practices.
 - The rules may reflect "early adopter" practices appropriate for a cutting-edge project but not appropriate for a more conservative team.
 - A rule may, or may not, reflect fundamental architecture and design decisions about the project.

There are rules available in the tools that contradict each other. This is because tools of this type provide a generalized capability to evaluate source code against criteria, and which criteria should be applied depends upon the situation and the design decisions of the project. So, for example, there may be a rule that says "Always do X" and another rule that says "Never do X", and it is up to the development team to turn on the rule appropriate to the situation.

More is not better, and a common mistake is for teams to turn on all, or almost all, of the rules. This has two severely negative consequences. The first is that turning on rules that don't really apply to the project can generate a great many false errors. The other is that when one rule says "Do this" and another rule says "Never do this", having both turned on will get the code/build/test teams stuck in a loop that uses up lots of time and effort but never makes any progress.
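One way a team might guard against the second problem is a small sanity check over its chosen rule set. The sketch below is a hypothetical model, not JTest's configuration format or API: the rule IDs are made up, and the list of conflicting pairs is assumed to be maintained by the team.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model of a rule configuration; not JTest's actual format or API.
class RuleSetCheck {

    // A pair of rule IDs that demand opposite things ("Always do X" / "Never do X").
    static final class Conflict {
        final String ruleA;
        final String ruleB;

        Conflict(String ruleA, String ruleB) {
            this.ruleA = ruleA;
            this.ruleB = ruleB;
        }

        @Override
        public String toString() {
            return ruleA + " <-> " + ruleB;
        }
    }

    // Returns every known conflict whose two rules are both turned on.
    static List<Conflict> findActiveConflicts(Set<String> enabled, List<Conflict> known) {
        List<Conflict> active = new ArrayList<>();
        for (Conflict c : known) {
            if (enabled.contains(c.ruleA) && enabled.contains(c.ruleB)) {
                active.add(c);
            }
        }
        return active;
    }

    public static void main(String[] args) {
        // Made-up rule IDs, for illustration only.
        Set<String> enabled = new HashSet<>(
                Arrays.asList("STYLE.ALWAYS-X", "STYLE.NEVER-X", "PORT.ENV-1"));
        List<Conflict> known = Arrays.asList(
                new Conflict("STYLE.ALWAYS-X", "STYLE.NEVER-X"));
        System.out.println(findActiveConflicts(enabled, known));
    }
}
```

A non-empty result means the team has a decision to make before the rule set goes anywhere near a build.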

References - All of the tools of this type draw extensively from established references. ParaSoft JTest evaluates Java source code, so it will cite Effective Java by Joshua Bloch, Elements of Java Style by Ambler et al., and other well-known books or papers on Java programming. ParaSoft C++ Test (or other C++ tools) cites comparable references on C++ such as Effective C++ by Meyers, Large-Scale C++ Software Design by Lakos, and others.

There is a many-to-many relationship between the rules and the references. So, for example, a rule that says "Do not declare multiple variables in one statement" [JTest, COSTA.MVOX-3] will cite as references both Elements of Java Style by Ambler et al. and Sun's Code Conventions for the Java Programming Language (1999), since this recommendation appears in both of those references. Going the other way, something from a dominant reference on Java (e.g. "Design and document for inheritance or else prohibit it", Item #15, Effective Java by Bloch) might be broken down into several discrete rules in a tool like JTest.
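The many-to-many relationship can be pictured as a two-way index. The sketch below is only a way of thinking about it, assuming rules and references are identified by plain strings; the INHERIT rule IDs are hypothetical placeholders, since the actual rules a tool derives from Item 15 are not listed here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical two-way index between rule IDs and the references they cite.
class RuleReferenceIndex {
    private final Map<String, List<String>> referencesByRule = new LinkedHashMap<>();
    private final Map<String, List<String>> rulesByReference = new LinkedHashMap<>();

    // Records one link and keeps both directions in sync.
    void link(String ruleId, String reference) {
        referencesByRule.computeIfAbsent(ruleId, k -> new ArrayList<>()).add(reference);
        rulesByReference.computeIfAbsent(reference, k -> new ArrayList<>()).add(ruleId);
    }

    List<String> referencesFor(String ruleId) {
        return referencesByRule.getOrDefault(ruleId, Collections.emptyList());
    }

    List<String> rulesFor(String reference) {
        return rulesByReference.getOrDefault(reference, Collections.emptyList());
    }

    public static void main(String[] args) {
        RuleReferenceIndex index = new RuleReferenceIndex();
        // One rule citing two references.
        index.link("COSTA.MVOX-3", "Elements of Java Style");
        index.link("COSTA.MVOX-3", "Code Conventions for the Java Programming Language");
        // One reference item broken into several rules (rule IDs are placeholders).
        index.link("HYPOTHETICAL.INHERIT-1", "Effective Java, Item 15");
        index.link("HYPOTHETICAL.INHERIT-2", "Effective Java, Item 15");

        System.out.println(index.referencesFor("COSTA.MVOX-3"));
        System.out.println(index.rulesFor("Effective Java, Item 15"));
    }
}
```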

Source code tools generally provide only a partial implementation of the references they cite or the topics they address. These tools provide an automated capability to review software, but most references provide a topical (e.g. security) orientation for reviewing code, both automated and manual. So, for example, a JTest rule on portability [PORT.ENV-1] says, "Do not use System.getenv()" and cites Flanagan's Java in a Nutshell book as a reference. This is a good rule to have available, but the set of portability rules available is both an incomplete implementation of Flanagan's recommendations and incomplete relative to what is necessary to achieve a significant measure of portability.
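For readers unfamiliar with the portability rule, here is a minimal sketch of one common alternative to reading an environment variable: a Java system property with a fallback. The property name app.home and the class name are assumptions made up for this example, and this is not necessarily the remedy that JTest or Flanagan recommends.

```java
// Sketch of reading configuration from a system property instead of
// System.getenv(). The property name "app.home" is hypothetical.
class PortableConfig {
    static final String HOME_PROPERTY = "app.home";

    // Returns the value passed with -Dapp.home=... or the fallback if unset.
    static String appHome(String fallback) {
        return System.getProperty(HOME_PROPERTY, fallback);
    }

    public static void main(String[] args) {
        System.out.println(appHome("/opt/app"));
    }
}
```

Passing the rule takes one line like this; making an application portable takes a great deal more, which is the gap described above.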

Standards - It is easy for confusion to take over about what the word "standard(s)" means. One basic source of confusion is the scale meant by the word "standard". Some people will refer to a small, fine-grained item (such as a JTest rule) as a "standard" and then accumulate these into the "standards" they want applied to their code. Others will refer to a larger published work, saying something like, "I want to enforce this standard on our code" while pointing to a book (or other reference work) they got from a colleague or bookstore. One of the reasons why I favor using "rule" and "reference" is that it seems to break this confusion, and stakeholders more intuitively (and consistently) recognize the difference in scale here.