See my VUW information page for contact details.
My CV is here.
Status: The thesis was submitted in December 2011 but has not yet been defended.
Abstract: Much of the cost of software development is maintenance. Well-structured software tends to be cheaper to maintain than poorly structured software, because it is easier to analyze and modify. The research described in this thesis concentrates on determining how to improve the structure of object-oriented classes, the fundamental unit of organization for object-oriented programs. Some refactoring tools can mechanically restructure object-oriented classes, given the appropriate inputs regarding what attributes and methods belong in the revised classes. We address the research question of determining what belongs in those classes, i.e., determining which methods and attributes most belong together and how those methods and attributes can be organized into classes. Clustering techniques can be useful for grouping entities that belong together; however, doing so requires matching an appropriate algorithm to the domain task and choosing appropriate inputs.
This thesis identifies clustering techniques suitable for determining the redistribution of existing attributes and methods among object-oriented classes, and discusses the strengths and weaknesses of these techniques. It then describes experiments using these techniques as the basis for refactoring open source Java classes and the changes in the class quality metrics that resulted. Based on these results and on others reported in the literature, it recommends particular clustering techniques for particular refactoring problems. These clustering techniques have been incorporated into an open source refactoring tool that provides low-cost assistance to programmers maintaining object-oriented classes. Such maintenance can reduce the total cost of software development.
This document summarizes my research career and discusses some areas I intend to look at in the future.
[Figure: ungrouped members of a class; grouped members of a class]
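To make the idea of grouping class members that belong together concrete, here is a minimal sketch. It is not the algorithm from the thesis or from the refactoring tool. It assumes each method is described by the set of attributes it uses, measures similarity with the Jaccard coefficient, and merges clusters by single linkage until no pair is similar enough. The `USES` table, the method names, and the `threshold` value are all hypothetical.

```python
from itertools import combinations

# Hypothetical input: for each method, the attributes it reads or writes.
USES = {
    "getName": {"name"},
    "setName": {"name"},
    "formatName": {"name", "title"},
    "deposit": {"balance"},
    "withdraw": {"balance", "limit"},
    "checkLimit": {"limit"},
}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_members(uses, threshold=0.3):
    """Single-linkage agglomerative clustering of methods (illustrative only).

    Repeatedly merges the two clusters that contain the most similar pair of
    methods, and stops when the best remaining similarity is below `threshold`.
    """
    clusters = [{m} for m in sorted(uses)]
    while len(clusters) > 1:
        best, best_pair = -1.0, None
        for i, j in combinations(range(len(clusters)), 2):
            sim = max(
                jaccard(uses[a], uses[b])
                for a in clusters[i]
                for b in clusters[j]
            )
            if sim > best:
                best, best_pair = sim, (i, j)
        if best < threshold:
            break
        i, j = best_pair  # i < j, so deleting index j leaves index i valid
        clusters[i] |= clusters[j]
        del clusters[j]
    return clusters


if __name__ == "__main__":
    for group in cluster_members(USES):
        print(sorted(group))
```

With this toy input the methods split into a name-related group and a balance-related group, each a candidate for its own class. Choosing the similarity measure, linkage rule, and stopping threshold is exactly the kind of matching of algorithm to task that the abstract refers to.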
I have this idea that social network analysis (SNA) techniques can be useful for the code analysis community. However, this is not my main research thrust, and I don't want to go off on too much of a tangent. I'm thinking that somebody might find these ideas worth pursuing in greater depth than I can. I'd be happy to contribute.
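As one small illustration of what applying SNA to code could look like, the sketch below treats a call graph as a social network and computes normalized in-degree and out-degree centrality for each function. This is only an assumed example of the general idea, not a method from my research. The `CALLS` graph and its function names are hypothetical.

```python
# Hypothetical call graph: caller -> set of callees.
CALLS = {
    "main": {"parse", "report"},
    "parse": {"log"},
    "report": {"format", "log"},
    "format": {"log"},
    "log": set(),
}


def degree_centrality(calls):
    """Return {node: (in_centrality, out_centrality)}, each scaled by 1 / (n - 1)."""
    nodes = set(calls) | {c for callees in calls.values() for c in callees}
    n = len(nodes)
    scale = 1 / (n - 1) if n > 1 else 0.0
    in_deg = {v: 0 for v in nodes}
    for callees in calls.values():
        for callee in callees:
            in_deg[callee] += 1
    return {
        v: (in_deg[v] * scale, len(calls.get(v, ())) * scale)
        for v in nodes
    }


if __name__ == "__main__":
    for name, (cin, cout) in sorted(degree_centrality(CALLS).items()):
        print(f"{name}: in={cin:.2f} out={cout:.2f}")
```

A function with high in-degree centrality (here `log`) is called from many places, so a change to it has wide reach. Richer SNA measures such as betweenness could be applied to the same graph.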
[Figure: ExtC screenshot, graph view]
[Figure: Metrics2 screenshot]
A list of refactoring references.
A collection of object-oriented cohesion references with associated abstracts.
SNA and SW Engineering references - Some previous work that applies SNA techniques to software.
Here are BibTeX references to papers about code similarity, graphs, SNA, knowledge representation, maintenance, metrics, patterns, query languages, refactoring, software clustering, and visualization that I've found interesting.
A big ol' list of object-oriented cohesion metrics and their acronyms.