Tuesday, November 20, 2012

Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista

Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald Gall, and Brendan Murphy.
Does distributed development affect software quality?: an empirical case study of Windows Vista.
Commun. ACM, 52(8):85–93, August 2009.


View the paper here.
View the presentation here.

Abstract:-

    This paper compares distributed development with collocated development, using Windows Vista as the case study. Some of the authors are developers and researchers at Microsoft, so they have a good idea of what happened during Vista's development. The authors ask where more failures occurred: in files developed in a distributed manner or in files developed by collocated teams. The paper is therefore an analysis of post-release failures in software components. The authors found an almost negligible difference in failure rate between distributed and collocated development. They also examine why the difference was so small when the failure rate of distributed development was expected to be considerably higher, and they briefly describe Microsoft's approach to global software development.


Discussions:- 

    The study is pioneering and the first of its kind in several respects. Distributed development is divided into multiple levels of separation (building, campus, continent, and so on). The software studied comprises thousands of executables and libraries and was built by thousands of developers. Individual files are characterized numerically by a range of quality attributes. Finally, the developers involved in the project are skilled and have years of experience in their domain.

    The authors divide distribution into six levels. The first is the building: a file classified at the building level may have been worked on by developers on different floors of the same building, who enjoy more face-to-face and informal contact. The second is the cafeteria. At Microsoft, one cafeteria serves between one and five nearby buildings, and a file developed by people in that set of buildings is classified at the cafeteria level; these developers may share meals and meet by chance at meal times. The third is the campus, a group of buildings in one location. The fourth is the locality, which covers the campuses in one city or a group of adjacent cities. The fifth is the continent, which covers all campuses on the same continental landmass. The last is the world, which spans all continents. A file is assigned the lowest level in the hierarchy from which at least 75% of its commits were made. A commit is made after a significant amount of code has changed or a function point has been added.
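
    The 75% rule above can be sketched in a few lines of Python. This is only an illustration of the rule as summarized here, not the authors' tooling: the commit record layout, the `commit` and `classify` names, and the sample location labels are all assumptions made for the example.

```python
from collections import Counter

# Levels ordered from least to most distributed, as listed in the post.
LEVELS = ["building", "cafeteria", "campus", "locality", "continent"]
THRESHOLD = 0.75  # share of commits that must come from a single unit


def commit(building, cafeteria, campus, locality, continent):
    """Hypothetical commit record: where the committing developer sits."""
    return dict(zip(LEVELS, (building, cafeteria, campus, locality, continent)))


def classify(commits):
    """Return the lowest level at which one unit holds >= 75% of the commits."""
    if not commits:
        raise ValueError("a file needs at least one commit to be classified")
    total = len(commits)
    for level in LEVELS:
        counts = Counter(c[level] for c in commits)
        _, top = counts.most_common(1)[0]
        if top / total >= THRESHOLD:
            return level
    # No single continent reaches the threshold, so the file is world-level.
    return "world"


# Invented example: 8 of 10 commits come from one building.
example = (
    [commit("B1", "C1", "Main", "Metro", "NA")] * 8
    + [commit("B2", "C1", "Main", "Metro", "NA")] * 2
)
print(classify(example))  # building
```

    With the commits split evenly between the two buildings instead, the same file would be classified at the cafeteria level, because no single building would reach 75% while the shared cafeteria would.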

    The Microsoft architects distributed the work so that most of it was done at the building level of the hierarchy: about 68 percent of the files fell at that level. A Mann-Whitney test showed that, on average, files developed in a distributed manner had only 8 percent more failures. Further analysis showed that when the uneven number of developers and the differing distribution of work across teams are taken into account, this difference drops to 4.6 percent. When the levels of distribution were analyzed separately, the failure rate at the continent level was lower than that of collocated development.
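
    For readers unfamiliar with the Mann-Whitney test, it compares two groups by how their values rank against each other rather than by their means, which suits skewed data such as failure counts. Below is a minimal sketch of the U statistic only. The function name and the failure counts are invented for illustration and are not data from the paper.

```python
def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two samples using pairwise comparison.

    U_a counts the pairs (x, y) with x from a and y from b where x > y,
    with ties counted as one half. U_a + U_b always equals len(a) * len(b).
    """
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Invented post-release failure counts per file (not from the paper).
collocated = [0, 1, 1, 2, 3]
distributed = [0, 1, 2, 2, 4]
u_collocated, u_distributed = mann_whitney_u(collocated, distributed)
print(u_collocated, u_distributed)
```

    The two U values are close when neither group tends to rank above the other. Turning U into a p-value needs the sampling distribution as well, which a statistics library such as SciPy (`scipy.stats.mannwhitneyu`) provides.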

    To investigate this peculiar behavior, the authors studied several characteristics of the files in Windows Vista: size, complexity, code churn, test coverage, dependencies, and the people involved. The files developed in a distributed manner turned out to be less complex, with less code churn and fewer dependencies than collocated files.

    The authors also describe the approach Microsoft takes to improve global software development. Microsoft treats all its employees equally, so there is usually no negative competition among team members and the relationship between sites is good. Microsoft tries to reduce cultural barriers between teams by promoting visits to sites in different parts of the world, so that employees understand the culture, ethics, and behavior of their colleagues elsewhere. Microsoft encourages its employees to stay in touch with colleagues in other parts of the world and so makes sure that synchronous communication happens between sites. Microsoft uses only one type of configuration management tool, so the same process is followed at every site. Every file has a defined owner who is responsible for seeing that the file is in the right hands in each phase of the product's development lifecycle. Microsoft keeps common schedules and deadlines across all its sites. Microsoft maintains organizational integration by keeping managers at the same site, so that face-to-face communication takes place between them.

    Files that are to be developed in a distributed manner should have low complexity, low code churn, and few dependencies. If good software engineering practices are employed, distributed development is as good as single-site development. An organizationally compact but geographically distributed project would be better than a geographically local but organizationally distributed one.
