
Open Source Software Evolution and Its Dynamics

by Jingwei Wu




Institution: University of Waterloo
Year: 2006
Keywords: Computer Science; Open Source; Software Evolution; Evolution Dynamics; Punctuated Equilibrium; Self-Organized Criticality
Record ID: 1779188
Full text PDF: http://hdl.handle.net/10012/1095


Abstract

This thesis presents an empirical study of software evolution based on the analysis of open source software (OSS) systems. Its main purpose is to aid in understanding OSS evolution. The work centers on collecting large quantities of structural data cost-effectively and analyzing that data to understand software evolution dynamics (the mechanisms and causes of change or growth).

We propose a multipurpose, systematic approach to extracting program facts (e.g., function calls). This approach is supported by a suite of C and C++ program extractors, which cover different steps in the program build process and handle both source and binary code. We present several heuristics for linking facts extracted from individual files into a combined system model of reasonable accuracy. We extract historical sequences of system models to aid software evolution analysis.

We propose that software evolution can be viewed as Punctuated Equilibrium (i.e., long periods of small changes interrupted occasionally by large avalanche changes). We develop two approaches to studying such dynamical behavior. One uses the evolution spectrograph to visualize file-level changes to the implemented system structure. The other relies on automated software clustering techniques to recover system design changes. We discuss lessons learned from using these approaches.

We present a new perspective on software evolution dynamics. From this perspective, an evolving software system responds to external events (e.g., new functional requirements) according to Self-Organized Criticality (SOC). SOC dynamics are characterized by two properties: (1) the probability distribution of change sizes is a power law; and (2) the time series of changes exhibits long-range correlations with power-law behavior. We present empirical evidence that SOC occurs in open source software systems.
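
As an illustration of property (1) above, the following is a minimal sketch of how one might estimate the exponent of a power-law distribution of change sizes. It is not the thesis's method, which this abstract does not describe. It uses the standard continuous maximum-likelihood estimator, alpha = 1 + n / sum(ln(x_i / x_min)), and checks it on synthetic data. The function names, the choice of x_min, and the synthetic sample are assumptions made for illustration only.

```python
import math
import random


def estimate_power_law_exponent(change_sizes, x_min=1.0):
    """Maximum-likelihood estimate of alpha for p(x) ~ x^(-alpha), x >= x_min.

    Hypothetical helper: `change_sizes` stands for any list of positive
    change sizes (for example, lines changed per release).
    """
    # Only observations at or above x_min belong to the power-law tail.
    tail = [x for x in change_sizes if x >= x_min]
    if not tail:
        raise ValueError("no observations at or above x_min")
    log_sum = sum(math.log(x / x_min) for x in tail)
    if log_sum == 0:
        raise ValueError("all observations equal x_min; alpha is undefined")
    return 1.0 + len(tail) / log_sum


def sample_power_law(n, alpha, x_min=1.0, seed=0):
    """Draw n synthetic change sizes by inverse-transform sampling.

    Synthetic data only; these values are not taken from the thesis.
    """
    rng = random.Random(seed)
    # rng.random() lies in [0, 1), so (1 - u) lies in (0, 1] and is never zero.
    return [
        x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0))
        for _ in range(n)
    ]


if __name__ == "__main__":
    sizes = sample_power_law(20000, alpha=2.5)
    print(round(estimate_power_law_exponent(sizes), 2))
```

On real change histories the lower cutoff x_min would itself have to be chosen or fitted, and a goodness-of-fit test would be needed before claiming power-law behavior. The sketch covers only the exponent estimate.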