Does coding style matter?
Code has to work correctly, but it should also be easy to maintain. Many factors have been proposed as aids to writing maintainable code, such as software metrics, code smells and object-oriented principles. In this project we’re interested in the impact of programming style on software maintainability.
Some programmers are very conscious of programming style; others aren’t. Does it really matter? By ‘matter’ we mean: does it have an impact on something truly important, such as bug rate?
For example, in a language like Java, it’s generally accepted that variables should be declared in the narrowest scope possible. However, using a field where a local variable would suffice will work fine, even though purists (like me, and probably you) will argue that it places an unnecessary cognitive burden on the person reading the code. The question in this project is: can we measure this? Does non-adherence to accepted programming style correlate with known problems such as bug rate or code churn? And does the reverse hold: does adherence correlate with fewer bugs and less churn?
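To make the scope example concrete, here is a minimal sketch. The class and method names (WideScope, NarrowScope, sum) are hypothetical and chosen only for illustration. Both classes return the same result; they differ only in where the accumulator is declared.

```java
// Hypothetical illustration: both classes compute the same sum.
class WideScope {
    // 'total' is only needed inside sum(), yet it is declared as a field,
    // so a reader has to check every other method to see whether it is used elsewhere.
    private int total;

    int sum(int[] values) {
        total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }
}

class NarrowScope {
    int sum(int[] values) {
        // 'total' exists only for the duration of this call and is visible nowhere else.
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }
}
```

Both versions work, which is exactly the point: the difference is invisible to a test suite and shows up, if at all, only in how hard the code is to read and change.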
This project seeks to investigate these issues empirically. A large number of GitHub repositories will be mined and the sequence of commits in each analysed. For each commit, adherence to programming style will be measured along with other factors such as bug rate (if available) and churn. This data will be analysed statistically to determine what correlations exist and which idioms really matter.
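As a rough sketch of the statistical step, the code below computes a Pearson correlation coefficient between two per-commit measurements, for instance a count of style violations and a churn figure. The choice of Pearson is an assumption made for illustration; the project does not prescribe a particular statistic, and a rank-based measure may turn out to suit this kind of data better. The class name StyleCorrelation and the method pearson are hypothetical.

```java
// Hypothetical helper: Pearson correlation between two equally long series,
// e.g. style violations per commit (x) and lines churned per commit (y).
final class StyleCorrelation {

    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two series of equal length, at least 2 points");
        }
        int n = x.length;

        // Means of both series.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of co-deviations and squared deviations.
        double cov = 0.0;
        double varX = 0.0;
        double varY = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            cov += dx * dy;
            varX += dx * dx;
            varY += dy * dy;
        }

        // Yields NaN if either series is constant (zero variance).
        return cov / Math.sqrt(varX * varY);
    }
}
```

A value near +1 would mean that commits with more violations tend to have more churn, a value near -1 the opposite, and a value near 0 no linear relationship. A correlation on its own says nothing about cause.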
The project would suit a student who wants to do real research and hopefully publish a paper based on their project work.