OO Software Metrics - Method Level Java Metrics - Halstead metrics, Cyclomatic Complexity, SEI MI (Maintainability Index)

Object-Oriented Software Metrics - Method Level Metrics

Method level
We can start at the method level with very basic things like the number of lines of code. This is a much derided figure because it is a popular tool at the management level, where some view it as a measure of productivity. I would certainly doubt its usefulness at a higher (e.g. class, package) level but at the method level I would consider it one of the most useful metrics. A big method is harder to understand. Partly this is just because there is more code to read, but there are also 'environmental' factors - you might have to scroll up and down to see all of a method so you can't grasp all of its meaning in a single 'eyeful'. A big method is probably big because it's trying to do too much - one of the golden rules of programming is that each method should perform a single, clear, distinct function. There is also the question of what constitutes a line of code - JHawk takes the approach of measuring the number of Java statements rather than the number of lines of code. This means that programmers are free to format their code as they see fit without being penalised for adding another line (e.g. by putting a bracket or a parameter on a separate line for the sake of clarity). You can find more about our views on lines of code, comments and statements in one of our 'Sidebars' papers which you can find here.
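
As a small illustration of why counting statements can be fairer than counting lines, consider the sketch below. The class and method names are hypothetical, and it says nothing about how JHawk actually parses code. Each method body consists of a single return statement, but the second spreads it over four physical lines.

```java
// Hypothetical example class: each method body is exactly one Java statement.
final class StatementCountExample {

    // One statement written on one physical line.
    static String labelCompact(String first, String last, int age) {
        return String.format("%s %s (%d)", first, last, age);
    }

    // The same single statement spread over four physical lines for readability.
    static String labelSpread(String first, String last, int age) {
        return String.format(
                "%s %s (%d)",
                first, last,
                age);
    }

    public static void main(String[] args) {
        System.out.println(labelCompact("Ada", "Lovelace", 36));
        System.out.println(labelSpread("Ada", "Lovelace", 36));
    }
}
```

A line count would score the second method body as four times longer than the first, while a statement count treats them as the same size.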

Also at the method level we can look at McCabe's Cyclomatic Complexity. This is a measure of the number of independent paths through a piece of code; in practice it is usually calculated as the number of decision points in the method plus one. Low Cyclomatic Complexity can be seen in methods where one line of code simply follows another e.g. where we set a number of attributes, perform a calculation and return a value. Higher complexity values will be found in methods that have a lot of if statements, for and while loops etc. This is a pretty intuitive metric - you could look at a method and deduce whether it had a high or low Cyclomatic Complexity. It is easy to get a computer to calculate this accurately so it's a reliable metric. Cyclomatic Complexities over 10 (the limit that McCabe originally suggested) are generally viewed as being bad.
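
To make this concrete, here is a minimal sketch with two hypothetical methods. It assumes the common counting convention of one plus the number of decision points; tools differ on whether boolean operators such as && count as decision points, so the figures in the comments are indicative rather than a statement of what JHawk reports.

```java
// Hypothetical example class used only to illustrate the counting convention.
final class CyclomaticExamples {

    // No decision points: one statement simply follows another, so complexity is 1.
    static double rectangleArea(double width, double height) {
        double area = width * height;
        return area;
    }

    // Decision points: the for loop (+1), the if (+1) and the && (+1).
    // Complexity is 4 if && is counted and 3 if it is not.
    static int countInRange(int[] values, int low, int high) {
        int count = 0;
        for (int value : values) {
            if (value >= low && value <= high) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(rectangleArea(2.0, 3.0));
        System.out.println(countInRange(new int[] {1, 5, 10, 15}, 5, 10));
    }
}
```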

Another set of metrics at the method level are the Halstead metrics. There are quite a few of these. They are primarily based on the number of operators (method names, arithmetical operators) and the number of operands (variables, numeric and string constants). The Halstead metrics give a sense of how complex the individual lines of code (or statements) are. The most basic is the Halstead Length, which simply totals the number of operators and operands - a small number of statements with a high Halstead Length would suggest that the individual statements are quite complex. The Halstead Vocabulary counts the distinct operators and operands and so gives a sense of the variety among the statements - for example are you using a small number of variables repeatedly (less complex) or are you using a large number of different variables, which will inevitably be more complex. The Halstead Volume uses the length and the vocabulary to give a measure of the amount of code written. The Halstead Difficulty uses a formula based on the number of unique operators and the number of unique and total operands to assess the complexity. It suggests how difficult the code is to write and maintain. The Halstead Effort combines the Volume and the Difficulty in an attempt to estimate the amount of work that it would take to recode a particular method. The Halstead Bugs attempts to estimate the number of bugs that there are likely to be in a particular piece of code.
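
The measures described above can be sketched in a few lines of Java. This uses the standard textbook definitions and is not confirmed to match JHawk's implementation; the class name is hypothetical. In particular, the bugs estimate is taken here as Volume / 3000, and some sources use a formula based on Effort instead.

```java
// Sketch of the textbook Halstead formulas; class and field names are hypothetical.
final class HalsteadSketch {
    final int distinctOperators; // usually written n1
    final int distinctOperands;  // usually written n2
    final int totalOperators;    // usually written N1
    final int totalOperands;     // usually written N2

    HalsteadSketch(int distinctOperators, int distinctOperands,
                   int totalOperators, int totalOperands) {
        this.distinctOperators = distinctOperators;
        this.distinctOperands = distinctOperands;
        this.totalOperators = totalOperators;
        this.totalOperands = totalOperands;
    }

    int length() { return totalOperators + totalOperands; }

    int vocabulary() { return distinctOperators + distinctOperands; }

    // Volume = Length * log2(Vocabulary)
    double volume() { return length() * (Math.log(vocabulary()) / Math.log(2.0)); }

    // Difficulty = (n1 / 2) * (N2 / n2)
    double difficulty() {
        return (distinctOperators / 2.0) * ((double) totalOperands / distinctOperands);
    }

    // Effort = Difficulty * Volume
    double effort() { return difficulty() * volume(); }

    // Assumed variant: Bugs = Volume / 3000
    double bugs() { return volume() / 3000.0; }

    public static void main(String[] args) {
        // total = price * quantity + price * tax;
        // Operators (ignoring ';' in this sketch): = * + *   -> 3 distinct, 4 total
        // Operands: total price quantity price tax           -> 4 distinct, 5 total
        HalsteadSketch h = new HalsteadSketch(3, 4, 4, 5);
        System.out.printf("length=%d vocabulary=%d volume=%.2f difficulty=%.3f effort=%.2f%n",
                h.length(), h.vocabulary(), h.volume(), h.difficulty(), h.effort());
    }
}
```

For the single statement in the example the Length is 9 and the Vocabulary is 7. How tokens such as semicolons, brackets and method names are classified is a choice each tool makes, so counts from different tools may not agree.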

The Halstead metrics have been around for some time (since 1977 in fact) - they predate object-oriented languages but are still relevant today. All of the measures are relevant at method level and can reasonably be presented as averages per method at class, package and system level. Halstead Length and Halstead Effort can reasonably be expressed as totals at the higher levels.

If you are interested in a deeper discussion on the Halstead Metrics we have featured them in one of our 'Sidebars' papers which you can find here.

The Maintainability Index (MI) is a set of polynomial metrics developed at the University of Idaho, using Halstead's effort and McCabe's cyclomatic complexity, plus some other factors relating to the number of lines of code (or statements in the case of JHawk, see above) and the percentage of comments. The MI is used primarily to determine if code has a high, medium or low degree of difficulty to maintain. It is language independent and was validated in the field by Hewlett-Packard (HP). HP concluded that modules with an MI less than 65 are difficult to maintain, modules between 65 and 85 have reasonable maintainability and those with an MI above 85 have excellent maintainability. (I’m sure you will be pleased to learn that all of the code modules in the JHawk distribution have values for MI greater than 65!)

JHawk gives separate figures for MI which takes account of comments and for MI which does not take account of comments. We have two main reasons for this -

Comments are subjective. Far better to measure the MI without considering comments and then to assess (by eye) whether the comments mitigate any modules marked as having a high maintenance requirement.
The formula used to calculate the comment part is controversial and is viewed as being language dependent. The use of sine in the calculation means that comments may add to or detract from the maintainability without any clear correlation.

MI can be calculated at method, class, package and system level.
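
The sketch below shows the general shape of an MI calculation and the HP rating bands described above. It is an assumption-laden illustration rather than JHawk's formula: published MI variants differ (some use Halstead Effort, others Halstead Volume), and this sketch uses the widely cited Volume-based form with its usual coefficients. The comment term assumes the percentage of comments is expressed on a 0 to 100 scale. All class and method names are hypothetical.

```java
// Sketch of one widely cited MI variant; not confirmed to be the formula JHawk uses.
final class MaintainabilityIndexSketch {

    // MI without the comment term (assumed Volume-based variant).
    static double withoutComments(double halsteadVolume, double cyclomatic, double statements) {
        return 171.0
                - 5.2 * Math.log(halsteadVolume)
                - 0.23 * cyclomatic
                - 16.2 * Math.log(statements);
    }

    // Comment term; percentComments is assumed to be on a 0-100 scale.
    static double commentTerm(double percentComments) {
        return 50.0 * Math.sin(Math.sqrt(2.4 * percentComments));
    }

    static double withComments(double halsteadVolume, double cyclomatic,
                               double statements, double percentComments) {
        return withoutComments(halsteadVolume, cyclomatic, statements)
                + commentTerm(percentComments);
    }

    // Rating bands as reported by HP: below 65, 65 to 85, above 85.
    static String rating(double mi) {
        if (mi < 65.0) {
            return "difficult to maintain";
        }
        if (mi <= 85.0) {
            return "reasonable maintainability";
        }
        return "excellent maintainability";
    }

    public static void main(String[] args) {
        double mi = withoutComments(250.0, 5.0, 20.0);
        System.out.printf("MI without comments: %.1f (%s)%n", mi, rating(mi));
        // Under the 0-100 assumption the sine term changes sign as comments increase.
        System.out.printf("comment term at 1%%: %.1f, at 10%%: %.1f%n",
                commentTerm(1.0), commentTerm(10.0));
    }
}
```

Under these assumptions the comment term is positive at 1% comments and negative at 10%, which illustrates why we prefer to report the MI without comments as a separate figure.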

There are a number of other metrics that can give some guidance to the quality of your code at method level -

No. of Arguments - If a method has a large number of arguments it may be a sign that the method is trying to do too much. It also means that this method signature has a greater propensity to change. You should check that all the arguments are used. If the arguments are primarily the attributes of another object then pass in the object. If the reason that you haven’t used the object is that you don’t want the class that the method is in to know about it, then create an interface for the object and have the object's class implement it. Another approach is to create a new class whose sole purpose is to hold the arguments - this could be a valid approach if this particular set of arguments is passed to numerous methods.
No. of comments - Comments are a matter of taste. JHawk only tells you the number of comments - no tool can tell you anything about the quality of comments. If the method is complex and has no comments this is probably bad, but an accessor method without comments is probably fine.
Variables Declared - if a lot of variables are being declared then your method may be doing too much - if that is the case you should split the method using the strategies outlined in the section below.
Variables Referenced - if a lot of variables are being referenced then your method may be doing too much - if that is the case you should split the method using the strategies outlined in the section below.
No. of Expressions - This is an alternative measure to the number of statements. If the average number of expressions per statement is high this may indicate a complex method.
Max nesting - too great a depth of nesting indicates complexity. Exceptions to this general rule would include iterating across multi-dimensional arrays. The total depth of nesting is also shown as a large number of nested statements may make the method difficult to read unless care has been taken with the formatting. You can reduce the depth of nesting by splitting the functionality of the method using the ‘Nested chunks’ strategy outlined in the section on splitting methods below.
No. of loops - loops are an important place to look for performance issues, because code inside a loop may be run many times. Code that is only run infrequently is unlikely to have a big impact on performance.
Number of external methods called - The more external methods that a class calls the more tightly bound that class is to other classes. If you feel that a class is too tightly bound then this column will help you to identify the methods that are responsible for external calls. You can see the actual methods called and the number of times that they are called by double clicking on the selected method record to bring up the method details dialog.
Number of classes referenced - This is similar to the situation with the external methods - you can use this to identify methods that reference external classes and identify them in the method details dialog.
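
The advice on reducing the number of arguments can be sketched as follows. All of the names are hypothetical. The first format method takes three loose arguments that all come from the same object; the second takes a single argument typed as an interface, so the formatter does not need to know about the concrete class.

```java
// Hypothetical interface exposing only what the formatter needs.
interface Addressee {
    String name();
    String street();
    String city();
}

// Hypothetical domain class that implements the interface.
final class Customer implements Addressee {
    private final String name;
    private final String street;
    private final String city;

    Customer(String name, String street, String city) {
        this.name = name;
        this.street = street;
        this.city = city;
    }

    @Override public String name() { return name; }
    @Override public String street() { return street; }
    @Override public String city() { return city; }
}

final class LabelFormatter {

    // Before: three arguments that are all attributes of one object.
    static String format(String name, String street, String city) {
        return name + "\n" + street + "\n" + city;
    }

    // After: one argument; LabelFormatter depends on Addressee, not on Customer.
    static String format(Addressee addressee) {
        return format(addressee.name(), addressee.street(), addressee.city());
    }

    public static void main(String[] args) {
        System.out.println(format(new Customer("Ada", "1 Main St", "Leeds")));
    }
}
```

If the same group of values is passed to many methods but does not belong to an existing object, a small holder class with the same three fields would serve the same purpose.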

How to split methods
There are a number of good reasons to split methods which are seen as being too large -

It encourages best practice by ensuring that individual methods perform a single function
It makes the methods easier to read and therefore to understand their function
It is harder for bugs to ‘hide’ in the code
By separating functionality out into smaller methods it is more likely that the code will be re-used

The actual process of splitting depends on what you are trying to do and the particular algorithm that you are trying to implement. There are three main types -

Sequential chunks - in this case you separate each chunk out into a method and your original method just calls each of the methods in succession, handling the interim variables that need to be passed from one method to another.

Nested chunks - this is the case where you have a number of nested loops each of which (apart from the deepest) optionally does some preparatory work, calls the inner loop and then optionally does some work after calling the loop. In this case each of the inner loops becomes a separate method and your original method contains only the outer loop and the call to the first nested loop.

Conditional chunks - this is where you have a number of alternative paths in your algorithm depending on condition(s) passed in or accessed by the method. Each conditional chunk should be implemented as a separate method and the original method should simply handle the outer conditions and decide which of the methods to call.
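
The three strategies might look like this once the split has been made. This is a minimal sketch with hypothetical names and made-up figures (the shipping prices in particular are invented), intended only to show the shape of the result.

```java
// Hypothetical examples of methods after splitting; all names and figures are illustrative.
final class SplitMethodExamples {

    // Sequential chunks: the original method calls each step in turn
    // and passes the interim values from one step to the next.
    static double invoiceTotal(double[] prices, double taxRate) {
        double subtotal = sum(prices);
        double tax = taxOn(subtotal, taxRate);
        return subtotal + tax;
    }

    static double sum(double[] values) {
        double total = 0.0;
        for (double value : values) {
            total += value;
        }
        return total;
    }

    static double taxOn(double amount, double rate) {
        return amount * rate;
    }

    // Nested chunks: the original method keeps only the outer loop
    // and calls a method that contains the inner loop.
    static int gridTotal(int[][] grid) {
        int total = 0;
        for (int[] row : grid) {
            total += rowTotal(row);
        }
        return total;
    }

    static int rowTotal(int[] row) {
        int total = 0;
        for (int cell : row) {
            total += cell;
        }
        return total;
    }

    // Conditional chunks: the original method only decides which path to take.
    static double shippingCost(boolean express, double weightKg) {
        if (express) {
            return expressCost(weightKg);
        }
        return standardCost(weightKg);
    }

    static double expressCost(double weightKg) {
        return 10.0 + 2.0 * weightKg;
    }

    static double standardCost(double weightKg) {
        return 5.0 + weightKg;
    }

    public static void main(String[] args) {
        System.out.println(invoiceTotal(new double[] {10.0, 30.0}, 0.25));
        System.out.println(gridTotal(new int[][] {{1, 2}, {3, 4}}));
        System.out.println(shippingCost(true, 2.0));
    }
}
```

Each extracted method has a maximum nesting depth of one and a low Cyclomatic Complexity, and each can be tested and re-used on its own.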

Of course many of your methods will actually be combinations of these different types but you can still apply the strategies as appropriate. In any case a good rule of thumb is that the entire text of a method should be viewable as a single screenful in whatever editor you are using.

Martin Fowler's book Refactoring is an excellent source of patterns to use when refactoring code. Remember that if you are refactoring code a good set of unit tests will help to ensure that you maintain the functional integrity of your code while improving its quality. If you are not yet familiar with the JUnit unit testing framework then you should be - you can find more information here.
