Article
A large scale study of multiple programming languages and code quality
2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER): March 16-18, 2016, Osaka: Proceedings
  • Pavneet Singh KOCHHAR, Singapore Management University
  • Withthige Dinusha Ruchira WIJEDASA, Singapore Management University
  • David LO, Singapore Management University
Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
3-2016
Abstract

Nowadays, most software uses multiple programming languages to implement certain functionalities based on the strengths and weaknesses of different languages. Researchers have studied the impact of individual programming languages on software quality; however, there has been little or no research on the impact of multiple languages on the quality of software. Does the use of multiple languages cause more bugs? Do certain languages, when used with other languages, make software more bug prone? What are the relationships between multi-language usage and various bug categories? In this study, we perform a large-scale empirical investigation to shed light on the answers to these questions. We gather a large dataset consisting of popular projects from GitHub (628 projects, 85 million SLOC, 134 thousand authors, 3 million commits, in 17 languages) to understand the impact of using multiple languages on software quality. We build multiple regression models to study the effects of using different languages on the number of bug-fixing commits while controlling for factors such as project size, team size, project age, and the number of commits. Our results show that, in general, implementing a project with more languages has a significant effect on project quality, as it increases defect proneness. Moreover, we find specific languages that are statistically significantly more defect prone when they are used in a multi-language setting. These include popular languages like C++, Objective-C, and Java. Furthermore, we note that the use of more languages significantly increases bug proneness across all bug categories. The effect is strongest for memory, concurrency, and algorithm bugs.
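
The regression setup described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic (only the project count of 628 echoes the abstract), the variable names and planted coefficients are hypothetical, and ordinary least squares on log-transformed values stands in for whatever model family the paper actually uses, which the abstract does not state. The sketch only shows the general idea of estimating the effect of the number of languages while holding project size, team size, project age, and commit count fixed.

```python
import numpy as np

# Hypothetical column names for the design matrix built below.
NAMES = ["intercept", "n_languages", "log_sloc",
         "log_authors", "log_age_days", "log_commits"]


def make_synthetic_projects(n_projects=628, lang_effect=0.10, seed=0):
    """Generate SYNTHETIC project data (not the paper's dataset).

    The response is the log of the number of bug-fixing commits, built as a
    linear function of the predictors plus Gaussian noise, with a known
    planted effect for the number of languages.
    """
    rng = np.random.default_rng(seed)
    n_languages = rng.integers(1, 9, size=n_projects)        # 1 to 8 languages
    log_sloc = rng.normal(10.0, 1.5, size=n_projects)        # project size
    log_authors = rng.normal(4.0, 1.0, size=n_projects)      # team size
    log_age_days = rng.normal(7.0, 0.5, size=n_projects)     # project age
    log_commits = rng.normal(8.0, 1.0, size=n_projects)      # total commits
    noise = rng.normal(0.0, 0.3, size=n_projects)

    log_bugfix = (-2.0
                  + lang_effect * n_languages
                  + 0.2 * log_sloc
                  + 0.1 * log_authors
                  + 0.1 * log_age_days
                  + 0.6 * log_commits
                  + noise)

    # Design matrix: intercept, variable of interest, then the controls.
    X = np.column_stack([np.ones(n_projects), n_languages, log_sloc,
                         log_authors, log_age_days, log_commits])
    return X, log_bugfix


def fit_ols(X, y):
    """Return least-squares coefficients for y ~ X (X includes the intercept)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


if __name__ == "__main__":
    X, y = make_synthetic_projects()
    for name, value in zip(NAMES, fit_ols(X, y)):
        print(f"{name:>12s}: {value:+.3f}")
```

Because the controls enter the same model as the language count, the coefficient on `n_languages` is read as the change in log bug-fixing commits per additional language for projects of comparable size, team, age, and activity. On real data, a count-oriented model would likely be a more natural choice than this log-linear stand-in.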

Keywords
  • Computer bugs
  • Java
  • Programming
  • Software quality
  • Google
ISBN
9781509018550
Identifier
10.1109/SANER.2016.112
Publisher
IEEE
City or Country
Piscataway, NJ
Creative Commons License
Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International
Additional URL
http://doi.org./10.1109/SANER.2016.112
Citation Information
Pavneet Singh KOCHHAR, Withthige Dinusha Ruchira WIJEDASA and David LO. "A large scale study of multiple programming languages and code quality" 2016 IEEE 23rd International Conference on Software Analysis, Evolution, and Reengineering (SANER): March 16-18, 2016, Osaka: Proceedings (2016)
Available at: http://works.bepress.com/david_lo/165/