Monday, November 19, 2012

An empirical study of speed and communication in globally distributed software development

James D. Herbsleb and Audris Mockus. An empirical study of speed and communication in globally distributed software development. IEEE Trans. Softw. Eng., 29(6):481–494, June 2003.


View the paper here.
View the presentation here.


Abstract


Global Software Development (GSD) is becoming an increasingly popular model of software development throughout the world. It has been found quantitatively that the distributed nature of GSD may increase the time required to implement a unit of development work, e.g., a modification request. In this paper, the authors use data from both the source code change management system and a survey to quantify the delay introduced by GSD and to explore the reasons and mechanisms behind it. One of their major findings is that a unit of work takes approximately two and a half times longer to complete in GSD than in single-site development. In addition, their research suggests that GSD work involves more people than comparable single-site work, and that involving more people increases development time significantly, since the number of people involved and calendar time are strongly connected. The analysis was performed in a different organization, with different products and different sites, and the results obtained were the same. Using the gathered data, the authors also report differences between GSD and same-site development by testing several hypotheses about characteristics of the distributed social network that may contribute to the delay. Finally, the authors discuss the implications of their findings for current GSD practice and suggest changes that may substantially improve the speed of development in GSD.


Discussions


The paper begins by discussing the effect of distance on communication. In any development work, communication is highly significant for the success of the project. With physical separation, however, the frequency of communication drops significantly, and communication problems, such as knowing whom to contact, increase. Much research has been done on these issues, but questions remain about their cumulative effect: for example, how distance affects the speed with which software engineering tasks are accomplished, and how distance is related to other important variables that influence speed, such as the size of a task or the number of people involved.

To answer the research questions, the authors use quantitative methods. The answer to the first question, whether distributed development introduces delay compared to same-site development, surprisingly turns out to be negative in their initial analysis. To find the root cause behind this result, they built a graphical model, which showed that distributed development requires more people, and that involving more people introduces more delay.
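To make this reasoning concrete, here is a minimal sketch of how a gap between distributed and same-site work can disappear once the number of people is held fixed. The records in `MRS` are invented purely for illustration and are not data from the paper, and the function names are my own. The sketch uses simple stratified averages, not the graphical model the authors actually built.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical modification requests: (distributed?, people involved, days to complete).
# These values are made up for illustration; they are NOT taken from the paper.
MRS = [
    (False, 1, 4), (False, 1, 6), (False, 2, 9), (False, 2, 11), (False, 3, 15),
    (True, 2, 10), (True, 3, 14), (True, 3, 16), (True, 4, 20), (True, 4, 22),
]


def mean_days_by_site(mrs):
    """Average completion time for same-site (False) vs distributed (True) work."""
    groups = defaultdict(list)
    for distributed, _people, days in mrs:
        groups[distributed].append(days)
    return {key: mean(values) for key, values in groups.items()}


def mean_days_by_people_and_site(mrs):
    """Average completion time within each (people, distributed) stratum."""
    groups = defaultdict(list)
    for distributed, people, days in mrs:
        groups[(people, distributed)].append(days)
    return {key: mean(values) for key, values in groups.items()}


raw = mean_days_by_site(MRS)
stratified = mean_days_by_people_and_site(MRS)

print("Raw averages (same-site vs distributed):", raw[False], raw[True])
for people in (2, 3):
    # Compare only strata where both kinds of work are present.
    print(f"{people} people:", stratified[(people, False)], stratified[(people, True)])
```

In this toy data the distributed requests look much slower overall, but for a fixed number of people the two groups take equally long. The whole gap comes from distributed requests involving more people, which is the kind of indirect relationship described above.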

The answer to the second question, namely which factors influence the time interval required to make a software change, is found to be size, diffusion, and the number of people involved. In addition, compared to the same-site social network, the social network in distributed development has very restricted information flow. People find it difficult to get in touch with the appropriate person at other sites and get much less useful information from them. Distant colleagues feel less of a sense of "teamness". The authors also found that reducing inter-dependencies among sites reduces delay.

The authors suggest that work should be split optimally across sites, using novel strategies to reduce inter-dependencies. Organizations should increase communication by introducing communication tools such as instant messaging. They also propose a tool called Experience Browser, which lets people in the organization be looked up based on their experience. Increased awareness through instant messaging and shared calendars also helps reduce delay in GSD.


This paper creates a structured mathematical model for analyzing the root cause of delay in globally distributed development. I find this paper highly innovative, as the authors used a graphical model based on collected data to examine in depth the relations between the various factors that affect the time required for distributed development. The graphical model shows not only the correlations between different parameters but also how strongly they are connected to each other. I believe that this method can be used in any organization to quantify the relations between various development factors, and that strategic decisions can be taken based on the analysis to improve and refine future project work.

I cannot quite agree, however, with the result that distribution across sites does not introduce delay directly, and that the larger number of people required by distribution is the main reason for the delay. From my personal experience, even a very small team, if distributed, will delay the project significantly. As the analysis was performed in only two organizations, the result may be somewhat biased. The same analysis in other organizations could shed light on this issue.

The paper suggests distributing the work optimally, but it does not define what an optimal distribution is. Distributing work optimally across sites is highly complicated, and organizations still suffer from bad decisions in this area today.

Nevertheless, I think the best contribution of this paper is the introduction of the mathematical and graphical method, which is highly creative, innovative, and useful.
