
Tuesday, November 20, 2012

Does Distributed Development Affect Software Quality? An Empirical Case Study of Windows Vista

Christian Bird, Nachiappan Nagappan, Premkumar Devanbu, Harald Gall, and Brendan Murphy.
Does distributed development affect software quality?: an empirical case study of Windows Vista.
Commun. ACM, 52(8):85–93, August 2009.


View the paper here.
View the presentation here.

Abstract:-

    This paper analyzes how distributed development compares with collocated development, using Windows Vista as the case study. Some of the authors are developers and researchers at Microsoft and have a good idea of what happened during the development of Vista. They ask where more failures occurred: in files developed in a distributed manner or in files developed by collocated teams. The paper is therefore an analysis of post-release failures in components. The authors found an almost negligible difference in failure rate between distributed and collocated development. They also examine why the difference was so small when the failure rate of distributed development might have been expected to be considerably higher. Finally, they briefly describe Microsoft’s approach to global software development.


Discussions:- 

    The authors describe the study as the first of its kind in several respects. Distributed development is divided into multiple levels of separation such as building, campus and continent. The software studied is composed of thousands of executables and libraries written by thousands of developers. Each of these files is characterized numerically with several quality attributes. Finally, the developers involved in the project are skilled and have years of experience in their domain.

    The authors divide distribution into six levels. The first level is the building. A file classified at the building level may have been worked on by developers on different floors of the same building, and developers who work in the same building enjoy more face-to-face and informal contact. The second level is the cafeteria. At Microsoft, one cafeteria serves between one and five nearby buildings, and a file developed within such a set of buildings is said to have cafeteria distribution. These developers may share meals and meet by chance at meal times. The third level is the campus, a group of buildings in one location. The fourth level is the locality, which covers the campuses in one city or in a group of adjacent cities. The fifth level is the continent: all campuses on the same continental landmass fall under this category. The last level is the world, which is the collection of continents. A file is assigned the lowest level in the hierarchy from which at least 75% of its commits were made. Commits are made after a significant amount of change has been made to the code or a function point has been added.
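
    The assignment rule above can be sketched in code. This is a minimal illustration and not the authors' implementation: the dictionary layout of a commit, the names classify_file and site, and the sample locations are all assumptions made for the example.

```python
from collections import Counter

# Levels ordered from the smallest unit to the largest, as in the summary above.
LEVELS = ["building", "cafeteria", "campus", "locality", "continent"]


def site(*names):
    """Build a commit location from five names, smallest unit first (hypothetical helper)."""
    return dict(zip(LEVELS, names))


def classify_file(commits, threshold=0.75):
    """Return the lowest level at which one unit holds >= threshold of the commits."""
    if not commits:
        raise ValueError("a file needs at least one commit to be classified")
    total = len(commits)
    for i, level in enumerate(LEVELS):
        # Identify a unit by its own name plus all enclosing units, so that
        # "B1" on two different campuses is not counted as the same place.
        units = Counter(tuple(c[name] for name in LEVELS[i:]) for c in commits)
        if max(units.values()) / total >= threshold:
            return level
    # No single continent reaches the threshold.
    return "world"


# Made-up commit locations: two buildings that share one cafeteria.
b1 = site("B1", "Cafe1", "CampusA", "LocalityA", "ContinentA")
b2 = site("B2", "Cafe1", "CampusA", "LocalityA", "ContinentA")
print(classify_file([b1, b1, b2, b2]))
```

    With the commits split evenly between the two buildings, neither building reaches the 75% threshold, so the file is classified one level up, at the cafeteria level.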

    The Microsoft architects had distributed the work so that the majority of it was done at the building level of the hierarchy: about 68 percent of the files were developed at that level. Analysis with the Mann-Whitney test showed that, on average, files developed in a distributed manner had only 8 percent more failures. Further analysis revealed that when the uneven distribution of developers and work across teams is taken into account, this difference comes down to 4.6 percent. When the analysis was repeated for the different levels of distribution, the failure rate at the continent level turned out to be lower than that of collocated development.
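
    For readers unfamiliar with the Mann-Whitney test, the statistic it is built on can be computed in a few lines. The sketch below is only an illustration: the function name is hypothetical, the failure counts are invented and are not data from the paper, and it computes the U statistic only, without the p-value that a full test would report.

```python
def mann_whitney_u(x, y):
    """Return (u_x, u_y), the Mann-Whitney U statistics of two samples.

    u_x counts the pairs (a, b), a from x and b from y, in which a > b;
    tied pairs count as one half. u_x + u_y always equals len(x) * len(y).
    """
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of equal values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        # Tied values share the average of the 1-based ranks i+1 .. j+1.
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - len(x) * (len(x) + 1) / 2
    u_y = len(x) * len(y) - u_x
    return u_x, u_y


# Invented post-release failure counts per file, for illustration only.
collocated = [0, 1, 1, 2, 3]
distributed = [0, 1, 2, 2, 4]
print(mann_whitney_u(collocated, distributed))
```

    The two U values are close to each other when the samples overlap heavily, which is the situation the summary describes for distributed and collocated files.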

    To investigate this unexpected result, the authors studied several characteristics of the files in Windows Vista: size, complexity, code churn, test coverage, dependencies and the people involved in the project. They found that the files developed in a distributed manner were less complex, had less code churn and had fewer dependencies than collocated files.

    The authors also describe the approach Microsoft takes to improve global software development. Microsoft treats all its employees equally, so there is usually no negative competition among team members and the relationship between sites is good. Microsoft tries to reduce cultural barriers between teams by promoting visits to sites in other parts of the world, so that employees understand the culture, ethics and behavior of their colleagues elsewhere. It encourages employees to stay in touch with colleagues in other parts of the world and so makes sure that synchronous communication happens between sites. It uses a single configuration management tool so that the same process is followed at all sites. Every file has a defined owner who is responsible for seeing that it is in the right hands during each phase of the development lifecycle. Microsoft keeps a common schedule of deadlines across all its sites. It maintains organizational integration by keeping managers at the same site so that they can communicate face to face.

    Files that are to be developed in a distributed manner should have low complexity, little code churn and few dependencies. If good software engineering practices are employed, distributed development is as good as single-site development. An organizationally compact but geographically distributed project would be better than a geographically local but organizationally distributed one.

Globally Distributed Software Development Project Performance: An Empirical Analysis

Narayan Ramasubbu and Rajesh Krishna Balan. Globally distributed software development
project performance: An empirical analysis. In ESEC-FSE’07, Cavtat, Croatia, September 2007.
ACM.


View the paper here.
View the presentation here.

Abstract:- 

    This paper is about the effect of distributed development on employee productivity and code quality. It is based on the results of an extensive two-year study by the authors. They first develop a model of distributed software development to study these effects. The model is heavily mathematical and draws on several concepts. The authors study a company that has been rated at level 5 of the capability maturity model (CMM). The unnamed company worked on a total of 42 projects over the two years, and the quality attributes measured in these projects were given as input to the model. The results were checked with several validation tests. The model passed these tests, so it could be used elsewhere where the circumstances are similar to those of the company studied. The model showed that dispersion has an impact on both employee productivity and code quality.

    Several steps have to be taken to counter these negative effects. All of them rest on following good software engineering practices throughout the product development lifecycle. The authors classify these practices into three approaches: prevention-based, appraisal-based and failure-based. Each approach matters in a particular part of the development phase of the project. The authors observed that a company whose practices follow these approaches mitigates the effects of dispersion on employee productivity and code quality well, and can even perform better than collocated development.


Discussions:-
    
    We learnt how a model is created from scratch and which elements the authors consider in constructing it. First, the authors consider the factors that affect software development. The first factor is work dispersion, for which they use the Herfindahl-Hirschman index to obtain a numeric value. The other factor is a collection of approaches which, according to the authors, companies working in a dispersed manner need to follow. The prevention-based approach covers the training an employee has to undergo before working on a project, as well as configuration management, task planning and scheduling processes. The appraisal-based approach covers practices followed while the project is ongoing, such as reviews and inspections. The failure-based approach covers practices followed once the project is nearing completion, such as testing, error tracking and correction.
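
    The Herfindahl-Hirschman index is the sum of the squared shares of each participant, so it equals 1 when all the work sits at one site and falls as the work is spread out. The sketch below shows one way to turn it into a dispersion value. Taking dispersion as 1 minus the index is an assumption made here for illustration, and the paper's exact scaling may differ. The function names and the effort figures are hypothetical.

```python
def herfindahl_hirschman_index(effort_by_site):
    """Sum of squared effort shares; 1.0 means all effort is at a single site."""
    total = sum(effort_by_site)
    if total <= 0:
        raise ValueError("total effort must be positive")
    return sum((effort / total) ** 2 for effort in effort_by_site)


def work_dispersion(effort_by_site):
    """Dispersion as 1 - HHI (an assumed scaling): 0.0 means fully collocated."""
    return 1.0 - herfindahl_hirschman_index(effort_by_site)


# Hypothetical effort (for example person-hours) spent at each site.
print(work_dispersion([100]))      # all work at one site
print(work_dispersion([90, 10]))   # mostly one site
print(work_dispersion([50, 50]))   # evenly split between two sites
```

    With two sites the value grows from 0 for a fully collocated project to 0.5 for an even split, which gives the model a single number to use for how dispersed the work is.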

    The model has two indicators: development productivity and conformance quality. There are also five control variables: team size, code size, reuse, upfront investment and design rework. Some of these control variables are related to development productivity, to conformance quality, or to both. The model tries to establish the relation between the factors affecting software development, the indicators and the control variables. It does so through two equations, each expressing one indicator as a function of these entities.

    The coefficients of the model are obtained from empirical data, namely the quality attributes and other parameters of the unnamed CMM level 5 company. They are estimated with two-stage least squares, with the Durbin-Wu-Hausman test used to check for endogeneity. The significance of each coefficient is tested with a two-tailed hypothesis test. The validity of the model as a whole is tested with an F-test. Both tests gave positive results, indicating that the model is statistically sound.

    Using the values from the model, the authors established that employee productivity decreases as dispersion increases, and that employee productivity has an inverse relation with code quality. The values also indicated that more appraisal-based and failure-based practices help reduce the productivity loss caused by dispersion. The model further indicated that more failure-based practices help increase the quality of the product.

    The main conclusion is that dispersion significantly reduces development productivity and also affects conformance quality. These negative effects can be significantly mitigated by deploying structured software engineering processes along the lines of the approaches described by the authors. Companies that opt for distributed development have to account for the inevitable loss in productivity and quality when they decide to move software production to a second or third location, for example to reduce labour costs. The model results suggest that companies that institute high-quality software processes are far more likely to overcome the effects of dispersion than companies that don’t.

Managing Complexity in Collaborative Software Development: On the Limits of Modularity

Marcelo Cataldo, Matthew Bass, James D. Herbsleb, and Len Bass. Managing complexity in
collaborative software development: On the limits of modularity. In CSCW’06, Banff, Alberta,
Canada, November 2006. ACM.


View the paper here.
View the presentation here.

Abstract:-

    This paper is about the common mistakes that occur in distributed development. It lists poor practices followed by organizations that develop software in a distributed manner and the serious consequences they suffer because of this negligence. Through four case studies, the paper aims to show that identifying and managing dynamic dependencies between the components of a software system is a constant challenge for software development organizations. The problems that occur in distributed development range from simple syntactic differences to complex semantic dependencies. To overcome them, the authors stress the need for communication and coordination.


Discussions:-

    The paper starts with Conway’s law, which states that organizations which design systems produce designs that are copies of the communication structures of those organizations. Conway’s law, which was proposed in 1968, doesn’t hold true today because communication tools were not as prevalent then as they are now. So, if proper communication and coordination are ensured between teams spread across the world, a single product can be built at lower cost, with more intellect, in less time and with better quality. Baldwin and Clark’s work suggests that to achieve this improvement, architects have to modularize the work very well, so that teams can work independently and in parallel with as little communication and coordination between them as possible.

    In the first case study, one team changed part of a design specification. They changed the syntactic form of the code supplied by the central team, but they did not document the change, nor did they mention it in forums, video conferences or similar channels. They did not even inform the other teams that were likely to use their piece of code. When another team finally used it, it had serious trouble understanding and resolving the problem. This shows the importance of documentation and communication in global software development.

    In the second case study, a team changed the semantics of an interface that was going to be used heavily by many other teams later on. The team made the change without proper approval from the central team. As a result, there was a large ripple effect: all the development centers had to change their code to suit the new interface. This again shows the importance of teams following the instructions of the central team and not creating their own rules.
    
    In the third case study, the work of a project was not modularized properly. Code with interdependencies was split across different teams. As a result, continuous communication and coordination was needed between the two teams, and it was not sustained for long. This shows the importance of the architect’s role in modularizing the work properly before dividing it among teams.
    
    The final case study shows the importance of a group’s contact person. In a country where the international language is not spoken, if the contact person who is the lone link between the two sides of the world falls sick, the effect on development can be catastrophic. That is what happened in the final case study.

    Communication and coordination are very important for teams performing distributed development. Even so, the work has to be modularized in such a way that as little communication and coordination as possible is needed between teams, so that each team can work independently of the others. Maintaining proper documentation is also extremely important in distributed development.

Developing a knowledge-based perspective on coordination: The case of global software projects

Julia Kotlarsky, Paul C. van Fenema, and Leslie P. Willcocks. Developing a knowledge-based perspective on coordination: The case of global software projects. Inf. Manage., 45(2):96–108, March 2008.

View the paper here.
View the presentation here.


Abstract:- 

    This paper explains the activities needed to increase coordination between sites in global software development. The authors have created a model based on these activities. The model is purely theoretical, with no mathematics involved. It consists of the mechanisms necessary for achieving coordination, of which the authors propose four: organization design, work-based, technology-based and social (inter-personal) mechanisms. The model developed by the authors is applied to two projects belonging to two companies, SAP and Baan. The authors examined whether their model gives the correct result. The model indicated that the SAP project would be successful and the Baan project unsuccessful. The same happened in reality, which supports the model proposed by the authors.


Discussions:-

    The authors have proposed a model for achieving coordination. The model takes into account several mechanisms, each of which in turn involves several activities. The first mechanism is organization design. It states that the organization structure has to be formed properly and that the designations of employees should follow a hierarchy. There has to be coordination among people at the same level, and for this to be achieved, team size should be small. Whenever contact has to be established between teams, the contact persons of the teams should take responsibility for establishing the communication.

    The next mechanism is work-based. Here the work should be divided correctly among the employees, with no one overworked or underworked. Employees then have equal responsibilities, and there is no envy among peers. Each employee also has a defined amount of knowledge about the project, so when anyone wants to know something about a piece of code, it is known who is responsible for it. The organization also has to step in to provide standard tools and should encourage its employees to follow standard methodologies and specifications.

    The next mechanism is technology-based. Here organizations should provide as much technological support as possible to aid coordination and communication. Shared databases should be provided across teams. The use of video conferences, forums, blogs and chats should be encouraged among members of different teams to ensure proper coordination. The authors also suggest using technologies that support global software development.

    The last mechanism is the social (inter-personal) mechanism. It is about building team spirit across different teams. To this end, teams from different parts of the world are encouraged to come together before the start of the project and spend some time understanding each other’s cultures, ethics, behaviors etc.

    The authors suggest that the organization design mechanism should facilitate knowledge flow, that is, establish a communication channel between teams. The work-based mechanism should help make knowledge explicit, which means there has to be exactly one responsible person for a piece of code. The technology-based mechanism should help amplify knowledge, which means the gap between teams should be bridged by means of technology. The social mechanism should help build social relations among the members of different teams.
  
    The proposed model was applied to one project at each of two companies, SAP and Baan. At SAP, all the activities proposed by the authors were followed, so the authors predicted that the project would be a success. At Baan, many of the proposed activities were not followed. The only activities that were largely followed belonged to the technology-based mechanism. Baan did not follow all the activities because the company was not financially stable and had to cut a lot of expenses. The model predicted that the Baan project would be a failure. After the projects were completed, the SAP project was found to be successful and the Baan project unsuccessful.

    To achieve proper coordination between teams in a global software development project, an organization needs to follow all the mechanisms which the authors have proposed: organization design, work-based, technology-based and social (inter-personal) mechanisms. Coordination is one of the most important factors to consider when organizations go for global software development.