Qafoo GmbH - passion for software quality

Help you and your team benefit from new perspectives on cutting-edge quality engineering techniques and tools through the Qafoo team weblog.

By Kore Nordmann, first published at Mon, 20 Jan 2014 11:19:12 +0100

Code Reuse By Inheritance

Inheritance

To me, inheritance has two properties:

  • Defining an is-a relationship

  • Making it possible to share code between classes by extending from a common base class

The is-a relationship between classes is one thing I like to use inheritance for. So, to me, a concrete Cache - for example a MemcacheCache implementation - extending from an abstract Cache base class is a very sane thing to do. Implementing an interface means, in my opinion, adding an additional usage aspect / feature to a class, while the base class defines the type of a class. Mind: The base class, most likely, only defines abstract methods and does not provide any implementation in this case. I discussed this in more detail in a dedicated blog post, which is why I skip a detailed explanation now.

Get a Qafoo expert on-site to train your team in object oriented design and support you in developing more efficient and maintainable web software.

The other thing you can use inheritance for is to share code between multiple classes. A prime example for this is Active Record, where a concrete record class (for example a BlogPost) extends from a base class which adds the database access logic.

I personally think that the latter is one of the worst things you can do in object oriented design. Bear with me for a moment and let me try to explain why I think that.

Active Record

With Active Record, the problem is fairly obvious: The base class usually implements the storage logic, while the extending class is supposed to implement the business logic and maybe additional table query logic.

The business logic (or domain logic) is the most important part of your application – the stuff you earn your money with. You probably want to test this logic since you might be broke if it fails too hard.

Unit-testing a class which uses logic from a base class, for example accessing the database, is a lot of work. The best way is usually to mock all methods (from the parent class) which access the database and then run the tests on the mocked subject while correctly simulating the return values from your database. This is damn tedious.

Not wanting to go into too much detail here, but testing an Active Record class as-is is often even worse since tests which hit your database are usually damn slow, harder to set up and harder to keep atomic. But most importantly, unit-tests should fail for only one reason, and also testing a database access layer will likely make it a lot harder to locate the exact reason for unit test failures.

A Smaller Example

One of my preferred examples when describing this to others is a mistake in a component design I made not too many years ago. I wanted to write a little diff component, which makes it possible to evaluate the line-wise diffs in source code, but also paragraph and word-wise diffs for wiki articles and blog posts.

When writing the component, I started implementing the most common diff algorithm on top of "tokens", which should allow me to handle all the cases mentioned before. Once the algorithm worked, I derived from the class, implementing the different diff flavours:

That worked fine for some time, but once the diff component was integrated into PHPUnit, Sebastian Bergmann noticed problems with large texts and we realized that there are better diff algorithms for particular large texts. The LineDiff and TextDiff classes, implementing their respective tokenizing rules, are still fine, but we wanted to replace the algorithm in the base class depending on the diff use case. This is, of course, not possible. A better class design would obviously be:

We can now replace the used diff algorithm, test it directly based on simple token streams, and every concrete Diff implementation (LineDiff, TextDiff, …) will still work with all the diff algorithm implementations you can come up with. A downside, of course, is that creating the Diff object gets slightly more complex, but this is usually handled by a Dependency Injection Container, anyway.

The Helper Method

The points where Code-Reuse-By-Inheritance still keeps creeping into my code from time to time are small helper methods. For example for complex data structures it is so easy to just define a method in the data structure which calculates some kind of hash, to apply simple transformations or to validate some values. The next thing you notice is that you will move those methods up in the extension hierarchy or copy it around (maybe slightly modified).

Why not make it a (single) public method in some Hasher, Validator or Visitor class? You might be surprised how much easier stuff gets to test and that you might even be able to re-use that code in even more places. I guess the fear of (many) classes applies here again, which I consider void. But this is another blog post.

Of course it is valid to have such helper methods during prototyping, but if you start using such code in multiple places, start testing it or start to put it into production you should refactor it to follow the mentioned concerns.

In most cases, those helper methods are also a different concern which you start mixing into your class. Even the concern seems closely related, it probably does not hurt to move it somewhere else and make it explicit.

It may happen, though, that people start to over-engineer given these constraints. But every developer walks on a very fine line between missing abstraction and over-engineering all the time, anyway. A good base for a decision could be: Will it ever make sense to somehow re-use this piece of code or may it be possible that someone wants to replace this implementation in a similar use case or during testing?

Testing Private Methods

The urge to test private or protected methods is, in my opinion, a code smell which should directly lead to refactoring.

Once you separate your concerns and move code you share by inheritance into its own classes, every non-trivial code will be contained in public methods which are easy to test. And more importantly: easier to understand.

Depth Of Inheritance Tree (DIT)

Then there is the "Depth of Inheritance Tree" (DIT) metric with a common boundary value of 5, while the counting even stops at component borders. To me, the maximum value for this metric should be considered 2. Except for some struct classes / value objects, there is, in my opinion, no reason for more then one level of extension of a class. If you use inheritance just for defining the _type_ of classes, you will never extend more than once. If you are tempted to do that, use aggregation instead and you are probably fine in 99% of all cases.

If you have valid use cases for an inheritance hierarchy greater than two, please share those with me.

Summary

To me, by now, Code-Reuse-By-Inheritance is a clear code smell. Every time I am tempted to do this or find this in existing code, I will refactor out the code into a separate class as soon as possible.

Comments

  • Stephan on Mon, 20 Jan 2014 19:41:20 +0100

    I just found myself doing Code-Reuse-By-Inheritance for my own phpunit testcases. How would you solve that ? I can't imagine to put every test into the dic.

  • Christian on Mon, 20 Jan 2014 20:11:12 +0100

    You actually made me think about the question regarding hierarchies greater than two a lot, and I found a (from my current point) valid example.

    We have some kind of a monitoring solution, which has an abstract class "Monitoring\Check", which has several abstract child classes (e.g. "Monitoring\Check\Client"), which in turn have concrete sub-classes (e.g. "Monitoring\Check\Client\Windows").

    What do you think about this example?

  • Kore on Mon, 20 Jan 2014 20:36:29 +0100

    @Stefan: Sadly it is not really possible to inject dependencies in PHPUnit-Test-Cases. This is one place where I commonly use code-reuse-by-inheritance myself. Sometimes we create test-helper-objects, which then can be implemented in a clean way.

    @Christian: Depends on what those objects are doing … sounds more like value-objects, or do they share implementation? When you got complex object graphs representing complex data structures, one sometimes tends to implement logic in there, which would make more sense to be implemented in a visitor or factory (just very common examples). But we should probably discuss this in more detail while drinking some beer (/ rum / gin).

  • Stephan on Mon, 20 Jan 2014 20:47:49 +0100

    @Kore: My idea was to use traits. I then could "use" the traits as needed in my tests. Would be a useful way to use traits in my opinion. Unfortunately our codebase is still php 5.3.

  • Larry Garfield on Tue, 21 Jan 2014 05:40:58 +0100

    Recent and relevant: I just posted my own issue on the subjects of traits in PHP 5.4 last week.

    http://garfieldtech.com/blog/beyond-abstract

    Digest version: As of PHP 5.4, abstract classes in general are a code smell. Use traits instead and separate the different kinds of reuse. If you're not on PHP 5.4 yet, here's your excuse to demand to be allowed to move to it. :-)