02.15
Werd homez.
Here is a Perl script that extracts dependency information from all the C++ source and header files it finds within a specified directory structure, and spits out a bunch of interesting stuff it can find out about them. It would be fairly trivial to convert it to parse Java or something: essentially just change “.cpp” to “.java” and “#include” to “import” or whatever it is.
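To give a feel for the core idea, here is a minimal sketch in Python (not the Perl script itself) of pulling #include targets out of source files. The names `direct_includes` and `scan_tree` are made up for illustration, and the regex is deliberately naive: it knows nothing about block comments, conditional compilation or macros.

```python
import re
from pathlib import Path

# Matches lines such as:  #include <vector>   or   #  include "foo/bar.h"
INCLUDE_RE = re.compile(r'^\s*#\s*include\s*[<"]([^>"]+)[>"]', re.MULTILINE)


def direct_includes(text):
    """Return the header names included directly by one source text."""
    return INCLUDE_RE.findall(text)


def scan_tree(root, extensions=(".cpp", ".h")):
    """Map each matching file under root (relative path) to its direct includes."""
    root = Path(root)
    found = {}
    for path in root.rglob("*"):
        if path.is_file() and path.suffix in extensions:
            text = path.read_text(errors="replace")
            found[str(path.relative_to(root))] = direct_includes(text)
    return found
```

Swapping the extensions and the regex for an `import` pattern would be the rough equivalent of the Java conversion mentioned above, with the same limits on accuracy.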
EDIT: Version 2.0 released. Changes since 1.1:
- Fixed a couple of fatal bugs;
- A few more command line options;
- It warns if the dependency graph is not levelizable;
- Experimental support for Java.
EDIT: Version 1.1 released. Changes since 1.0:
- Better documentation;
- More command line options;
- Option to generate a PNG image of the dependency graph (uses GraphViz; example below).
![Component dependency graph for "Colony" 2010-03-06](http://www.oktalist.com/images/colony_component_graph.png)
It gives each component (.cpp/.h pair) a level:
Level 1 means it depends only on external libraries (including system libraries); it can be tested independently of any other component in the system.
Level 2 means it depends on external libraries and one or more members of level 1; it can be tested independently of other components in levels 2 and higher.
Level n means it depends on external libraries and one or more members of levels <n; it can be tested independently of components in levels >n.
Circularly dependent components share a level and are marked with an asterisk.
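The levelling rules above can be sketched roughly like this in Python. This is not how the Perl script does it: the sketch groups circularly dependent components using Tarjan's strongly connected components algorithm, which is an illustrative choice, and `levelize`, `strongly_connected_components` and the example component names are all hypothetical. External libraries are assumed to be left out of the input already.

```python
def strongly_connected_components(deps):
    """Tarjan's algorithm. deps maps a component to the set of components
    it depends on. Groups are returned with dependencies before dependents."""
    index, lowlink = {}, {}
    stack, on_stack = [], set()
    groups = []
    counter = 0

    def visit(node):  # recursive, so very deep graphs may hit Python's recursion limit
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for dep in deps.get(node, ()):
            if dep not in index:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index[dep])
        if lowlink[node] == index[node]:
            group = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                group.add(member)
                if member == node:
                    break
            groups.append(group)

    for node in list(deps):
        if node not in index:
            visit(node)
    return groups


def levelize(deps):
    """Return (level, cyclic): level maps component -> level number,
    cyclic is the set of components that sit in a dependency cycle."""
    level, cyclic = {}, set()
    for group in strongly_connected_components(deps):
        # Levels of everything this group depends on outside itself.
        below = [level[d] for m in group for d in deps.get(m, ()) if d not in group]
        group_level = 1 + max(below, default=0)
        for member in group:
            level[member] = group_level
        if len(group) > 1:
            cyclic |= group
    return level, cyclic


# Hypothetical example: net and log depend on each other.
deps = {"app": {"net", "util"}, "net": {"util", "log"}, "log": {"net"}, "util": set()}
level, cyclic = levelize(deps)
for name in sorted(level, key=lambda n: (level[n], n)):
    print(level[name], name + ("*" if name in cyclic else ""))
```

In this example util lands on level 1, log and net share level 2 and get an asterisk, and app ends up on level 3.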
It then calculates the normalised cumulative component dependency (NCCD):
num_of_dependencies / (num_of_components * log(num_of_components))
An NCCD of 1.0 is the theoretical baseline of a system in which the dependency graph resembles a balanced binary tree. <1 is loosely coupled, >1 is becoming more tightly coupled. In practice, anything <1.5 is good, and anything >2.0 is not so good. 0.0 would mean that none of the components depends on any other.
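As a quick worked example, the formula above can be written out in Python. The function name `nccd` is made up, and the base of the logarithm is an assumption on my part: base 2 is used here because the baseline is described as a balanced binary tree, but the formula as written does not say which base is intended.

```python
import math


def nccd(num_of_dependencies, num_of_components):
    """num_of_dependencies / (num_of_components * log(num_of_components)).

    The log base is ASSUMED to be 2 (binary tree baseline); the original
    formula does not state it.
    """
    if num_of_components < 2:
        # log(1) is 0, so the formula is undefined for a single component.
        raise ValueError("need at least two components")
    return num_of_dependencies / (num_of_components * math.log2(num_of_components))


print(nccd(24, 8))  # 24 / (8 * 3)
print(nccd(0, 8))   # no component depends on any other
```

With 8 components, 24 dependencies puts you right on the 1.0 baseline, and anything above 48 takes you past the 2.0 mark.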
And when you achieve low coupling you automatically achieve high cohesion; the two are intertwined. Components become more cohesive when you strive to keep only closely tied functionality within each single component. Refactor mercilessly whenever a component is becoming too eclectic, and keep in mind the dependency graph when factoring out functionality into a new component. This encourages appropriate reuse, as it makes interfaces more general and allows clients to gain access to some functionality that they want without pulling in a load of useless garbage at the same time.
Sign off!
Acknowledgements/bibliography:
Large-Scale C++ Software Design, John Lakos, 1996, Addison-Wesley Publishing
Caveats:
Direct circular dependencies are allowed, but indirect circular dependencies (cycles in the dependency graph passing through three or more components) are not supported so far. Behaviour in that case is undefined.
bump ;)
new version w/ pretty graph
Pretty pictures!
I don’t do any C++, but you should try running it on OpenOffice.org and see if it explodes…
Submitted to reddit.
Is level 1070 bad?
OMG people are actually using it :O
reddit kicks ass – thanks bLaXe
Level 1070? Yeah, you’re pretty screwed (or the script is ;)
Your welcome :)
Yup, reddit is awesome. It’s a good feeling when your code helps someone else.
FYI – this page has had around 5000 pageviews since the start of the week (bfish averages around 10K pageviews a month). Not bad going I’d say…
My welcome what? ;)
Uploaded a new version again, this time with experimental support for Java (add “--lang java” on the command line) just for you Chris.
It can’t be foolproof though because Java lets you use any available package without an import statement just by using the fully qualified package name (had to refer to Java 1.1 in a Nutshell to remind myself how it works).
Also: Go Oktal, it’s your birthday, go Oktal, it’s your birthday \o/
There’s a Makefile::GraphViz module that does a similar thing, although you will need to have a Makefile in order to generate the graph. And it only interprets your Makefile, not your .cpp files. I bet you can find some gems in its module dependency list that might be able to help you implement this script.
Makefile::GraphViz doesn’t measure the amount of interdependency of components, or arrange them into levels (although its diagrams give the appearance of levelisation), which was the primary reason for my script. I added the diagram generation because it was easy to add given that the script already compiles an adjacency matrix of components.
Good point, though. The following command would generate a makefile suitable for Makefile::GraphViz describing just the dependencies between source files:
find . -name '*.cpp' -exec cpp -MM -MT '{}' '{}' ';'
But for my purposes having the script search through the source files itself seemed just as easy as parsing such a makefile, and more flexible.
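For comparison, parsing that kind of makefile output would look roughly like the Python sketch below. `parse_make_rules` is a hypothetical helper, not part of the script or of Makefile::GraphViz, and it only copes with plain "target: prerequisites" rules with backslash continuations. Paths containing spaces or colons would break it.

```python
def parse_make_rules(text):
    """Return {target: [prerequisites]} from simple 'target: a b c' rules."""
    rules = {}
    # Join backslash-continued lines into single logical lines first.
    logical = text.replace("\\\n", " ")
    for line in logical.splitlines():
        if ":" not in line or line.lstrip().startswith("#"):
            continue  # skip blanks, comments and anything that is not a rule
        target, _, prereqs = line.partition(":")
        rules.setdefault(target.strip(), []).extend(prereqs.split())
    return rules


# Made-up sample of the kind of output the find/cpp command produces.
sample = "./a.cpp: a.cpp a.h \\\n b.h\n./b.cpp: b.cpp b.h\n"
print(parse_make_rules(sample))
```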