Code coverage metrics and Functional Test Coverage

Wednesday, 04 May 2011 09:52

There have been some articles and tweets about code coverage recently, and it seems that many developers are still laboring under a few misconceptions in this area.
Code coverage can be a very useful metric. However, you need to know how, and when, to use it. The link between code coverage and test quality is tenuous at best - in short, high code coverage is, in itself, no guarantee of well-tested code. And increasing code coverage for the sake of code coverage will not necessarily improve either the quality of your tests or the quality of your application. It is easy (and obviously a largely futile exercise) to achieve high code coverage metrics without actually testing anything at all.

Now don't go thinking I'm not a fan of test coverage. For the record, I am a huge fan of high test coverage, though I don't write tests explicitly with this aim (as I will discuss further on). As a metric, code coverage has its limitations, and should not be used for purposes for which it is poorly suited. Test coverage is excellent at indicating what code has not been exercised by your unit tests. Indeed, while high code coverage does not prove, in itself, that your code is well tested, low code coverage provides fairly conclusive evidence that your code is untested. An experienced developer will know how to use this information to complete her tests to cover important edge cases and boundary conditions.

But what of the broader picture? How do code coverage metrics help you deliver a useful, high-quality product to your users? Well-tested applications tend to be more reliable, easier to understand, easier to maintain and in the end faster to develop. This seems a no-brainer, but it is also the practical experience of countless TDD practitioners, and the result of quite a few academic studies. And, in my experience, the single most effective way to achieve high test quality is to use a combination of ATDD and TDD/BDD. Techniques such as Acceptance-Test Driven Development (ATDD) and example-based specifications are an excellent way to drive and track the development process.
This process drills down and fans out into Test-Driven Development (TDD), often with a behavioural flavour to it (BDD) at a lower level. This holistic approach has the major advantage of giving you confidence in your code on both a functional (does it do what the client wants) and a technical (does it work) level.

So what of test coverage? For a product owner, or for someone from QA, the notion of 90% test coverage is abstract at best. It may be able to indicate that all of the classes in the
What I would call Functional Test Coverage is a little different. Functional Test Coverage should give an indication of which features are done, in that they satisfy the acceptance criteria, and which features are still in progress. This sort of information is much more accessible to product owners than the number of lines of code exercised. This is along the lines of Acceptance-Test Driven Development, and can be a very powerful communication tool.

In ATDD, product owners express their requirements as stories (or features, or whatever). The form and content of automated functional tests should ideally be driven by the customer, though in practice QA or BA folks may play this role as well. It's a communication exercise. Each story has a set of acceptance criteria, typically expressed as examples of how the feature would work in different scenarios. Developers or testers automate these acceptance criteria (for web applications, this could involve using Selenium or WebDriver tests, for example). These tests are then run automatically, ideally whenever the code changes, or on a nightly basis if the tests take a very long time to run. The reports generated by these test runs give the product owners a very clear idea of which features have been implemented, which work, and how many are still pending implementation. BDD tools such as easyb or Cucumber are a great help in implementing this sort of test.

So where does that leave us with code coverage? In short, you really need both functional and technical test coverage metrics. However, high code coverage should be the natural outcome of good testing practices, not a goal to be aimed for. For this reason, I am not a big fan of aiming for a certain percentage of code coverage.
But, if I am working on a project using proper ATDD, BDD and TDD practices and the code coverage drops below, say, 90-95%, I will investigate, as it may be an indicator of an underlying problem or an area where good testing practices have not been followed.

If you would like to learn more about TDD, BDD and ATDD in practice, I will be running the next TDD, BDD and Testing Best Practices for Java Developers in Sydney on June 20-22. And for those in Europe and the UK, I will be running two online courses in the week of May 31: Fundamentals of Test-Driven Development in Java and Automated Web Testing with Selenium 2/Web Driver.
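As a rough illustration of the Functional Test Coverage idea described above, the sketch below turns the pass/fail results of each story's automated acceptance criteria into a per-story status. The class name, the story titles and the status rule (no automated criteria means pending, all passing means done, anything else means in progress) are assumptions made for this example; they are not taken from any particular BDD tool.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical report model, for illustration only.
class FunctionalCoverage {
    enum Status { DONE, IN_PROGRESS, PENDING }

    // Assumed rule: no automated criteria yet -> PENDING,
    // every criterion passing -> DONE, otherwise IN_PROGRESS.
    static Status statusOf(List<Boolean> criteriaResults) {
        if (criteriaResults.isEmpty()) {
            return Status.PENDING;
        }
        boolean allPass = criteriaResults.stream().allMatch(Boolean::booleanValue);
        return allPass ? Status.DONE : Status.IN_PROGRESS;
    }

    public static void main(String[] args) {
        // Each story maps to the results of its automated acceptance criteria.
        Map<String, List<Boolean>> stories = new LinkedHashMap<>();
        stories.put("Search catalogue by title", List.of(true, true, true));
        stories.put("Add item to basket", List.of(true, false));
        stories.put("Pay by credit card", List.<Boolean>of());

        stories.forEach((story, results) ->
                System.out.println(statusOf(results) + "  " + story));
    }
}
```

A report in this shape speaks in terms of stories rather than lines of code, which is the point of the metric: a product owner can read it without knowing anything about the code base.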
Comments (2)
written by John Ferguson Smart, May 05, 2011
Thanks for your comments, Tim. I'm not really talking about refactoring here - the point I wanted to make was on not focusing on the metrics without considering where the data is coming from. Of course, any half-decent refactoring will improve the quality of your code. But if you are refactoring, then hopefully you are using the code coverage metrics to spot untested and therefore possibly unreliable code, which is a noble goal. If you are writing tests which simply exercise the code with no assertions (or only very superficial ones), simply because your manager has fixed a 100% code coverage goal, then I would argue that the end result may not be particularly valuable, and may even be misleading (high code coverage but crappy tests). As with most TDD practitioners, for example, the code I write tends to have very high code coverage metrics, but this is simply a result (or, more accurately put, a by-product) of the way I write my tests.
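To make the point about assertion-free tests concrete, here is a minimal sketch in plain Java (no test framework, so that it runs on its own). `PriceCalculator` and its deliberately planted bug are hypothetical, invented for this example. Both "tests" execute every line of `applyDiscount`, so a coverage tool would report the method as fully covered either way, but only the second one can ever fail.

```java
// Hypothetical class under test; the bug is deliberate.
class PriceCalculator {
    // Should subtract the discount, but adds it instead.
    static double applyDiscount(double price, double percent) {
        return price + price * percent / 100.0;
    }
}

class CoverageWithoutAssertions {
    // Runs the code and checks nothing: full line coverage, zero verification.
    static void exercisesCodeOnly() {
        PriceCalculator.applyDiscount(100.0, 10.0);
    }

    // Same coverage, but states the expected outcome (100 less 10% is 90).
    static boolean checksTheResult() {
        return PriceCalculator.applyDiscount(100.0, 10.0) == 90.0;
    }

    public static void main(String[] args) {
        exercisesCodeOnly();
        System.out.println("assertion-free test: passed");
        System.out.println("test with assertion: "
                + (checksTheResult() ? "passed" : "FAILED"));
    }
}
```

The coverage figure is identical for the two tests; only the second one tells you that the discount is being added rather than subtracted.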
OK, the Java compiler ensures that you have no typos in uncovered code, but that is it. Uncovered code might do anything.
You make a couple of assertions which need explanation or backing up with examples:
"increasing code coverage for the sake of code coverage will not necessarily improve either the quality of your tests or the quality of your application. It is easy (and obviously a largely futile exercise) to achieve high code coverage metrics without actually testing anything at all."
I have not come across many ways in which the exercise of refactoring code and tests to hit the magic 100% does not improve the quality of both.
One might be to exercise all branches of two conditionals but on two separate tests so that not all combinations are exercised. This requires discipline.
Another is to run the application but make no assertions about the outcome: not really a test.
What other 'easy' ways are there for code coverage not to help?
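The two-conditionals case mentioned above can be sketched as follows. `GuestBilling` is a hypothetical class made up for this example. The two tests between them take both the true and the false path of each `if`, so branch coverage would be complete, yet the one combination they never try is the one that fails.

```java
// Hypothetical billing rule, for illustration only.
class GuestBilling {
    // Splits the bill between the guests who actually pay.
    static int costPerPayingGuest(int total, boolean firstGuestFree, boolean secondGuestFree) {
        int payingGuests = 2;
        if (firstGuestFree) {
            payingGuests -= 1;
        }
        if (secondGuestFree) {
            payingGuests -= 1;
        }
        return total / payingGuests; // divides by zero when both guests are free
    }

    public static void main(String[] args) {
        // Test 1: first conditional true, second false.
        System.out.println(costPerPayingGuest(100, true, false) == 100);
        // Test 2: first conditional false, second true.
        System.out.println(costPerPayingGuest(100, false, true) == 100);

        // The combination neither test exercised.
        try {
            costPerPayingGuest(100, true, true);
        } catch (ArithmeticException e) {
            System.out.println("untested combination fails: " + e.getMessage());
        }
    }
}
```

Every branch is covered after the first two calls, which is why a branch coverage report alone would not point anyone at the third case.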