Operational Excellence in Software Development: A Brief Introduction

I. Introduction

Experience tells us that delivering large software systems is fraught with complexity. The traditional way of dealing with this complexity was a mixture of best practices without theoretical backing and a variety of scientific (sometimes pseudo-scientific) frameworks poorly applied to problems in the field.

In this article, we describe Operational Excellence and how it can help overcome these challenges. But first, let us review the characteristics of a successful software company, examine its main challenges, and imagine new solutions.

II. Software Development and the Challenge of Complexity

A. Successful Software Business

Customer needs drive product design and production, while environmental changes push production, cost, quality, and sustainability to their limits.

Successful software businesses have the following properties:

  • Quality software products, including customer support and maintenance
  • Efficient project execution with on-time and within-budget deliveries
  • The ability to effectively respond to market changes and demands
  • Sustainability, or the ability to stay in the game for a long time

If we examine these properties closely, we will notice a striking resemblance to manufacturing and almost any other product-developing organisation. The requirements for a successful service-oriented organisation are also not so dissimilar; aside from the first requirement on quality, which might be rephrased to suit the service context, the remaining three remain the same.

Companies also have their challenges; let’s examine them next.

B. Common Challenges of Delivering Software

Uncertainty has many sources, creating a complex setting in which software development and delivery can easily struggle.

Here are some examples of why being a successful software company is more challenging than it looks.

  • It’s not only about technology or processes but also about people
      • Talent is the source of innovation, creativity, and contribution, and it can flourish only in the right context and within a healthy organizational culture supported by a suitable rewards system.
      • Processes and talent evolve together, influencing one another to shape the organisation’s performance.
  • Knowledge drain
      • While patents, software code, and other forms of intellectual property are vital organisational assets, the tacit knowledge accumulated over years of experience, embedded in processes and in people’s heads, is just as necessary.
      • Tacit knowledge is, by definition, difficult to codify, and losing talent also means losing knowledge. Organisations invest in training, education, coaching, and upskilling in the hope that they will receive due returns in the near future. Knowledge management is another critical challenge in software organisations.
  • Too much volatility
      • In software, areas of high volatility include business requirements (especially in novel products), novel technology, emerging market trends, customer preferences, global supply chains, and natural catastrophes.
      • What about stable products and markets (like digital payments), which, although constantly changing, evolve at a much slower pace? In these situations, uncertainty and volatility arise for different reasons, including the presence of large legacy systems and the complexity of delivering large-scale software integration projects.
  • Project uniqueness
      • Every software project has a unique element, which means there is never a shortage of novel situations. For many companies, levelling demand from software projects is very challenging.
      • With alternating periods of famine and feast, organisations must shuffle people around and rely on outside contractors and consultants.
      • In such situations, learning can be challenging to socialize and incorporate into existing processes, forcing every project to encounter the same problems repeatedly.
  • Too much uncertainty
      • Changes in organisational priorities, strategies, management, and hierarchies are constant sources of uncertainty and anxiety. This sometimes leads people to place their interests above those of the group, business, or organisation.
      • By seeking certainty, employees try to avoid failure (and, therefore, the opportunity to learn from mistakes) by sticking to what is considered safe rather than optimal.
  • Imbalance between theory and practice
      • When software professionals lack an in-depth understanding of the first principles (or theory) behind their best practices, their conviction in these practices rests on faith rather than on scientific and rational reasoning.
      • The consequence is dogmatic adherence to these best practices even when they stop working, as they often will when circumstances change.
  • Scaling Agile
      • Agile works well in small software teams, allowing them to tackle complex problems involving high degrees of uncertainty. This is done iteratively, where each new attempt is one step closer to the ideal solution.
      • The problem with Agile, however, is that it scales poorly. The higher up in the organisation one goes, the more certainty is required regarding delivery. In short, organisation planning cannot be done in two-week sprints.
      • Agile works even less well when projects scale horizontally, i.e. in programs where multiple delivery streams must be coordinated. In this scenario, deliveries have to be coordinated and agreed upon in advance, effectively pushing the teams to fall back on Waterfall.

In “Operational Excellence and the Structure of Software Development and Delivery”, we examined the complexity of diagnosing and troubleshooting organisational issues. We concluded that superficial solutions wouldn’t do, and successful ones would have to rise to the required level of sophistication and potency.

Luckily, many original thinkers have articulated the assumptions, frameworks, and paradigms upon which those sophisticated solutions should be built. These assumptions, frameworks, and paradigms will form the basis for our approach, which we now describe.

III. A New Framework for Thinking About Software Development

A. People, Processes, and Technology

The solution we propose, which will form the philosophical foundations of Operational Excellence in Software Development as we describe it, has three pillars: A) People, B) Processes, and C) Technology.

The interactions between these three entities in both directions allow complex behaviour to emerge.

More important are the interactions between these three pillars, which we summarize as follows:

  • The interplay between the right people and the right processes will produce a healthy organizational culture.
  • Using the right technology, people will enhance their understanding of its capabilities and potentialities and help drive and shape its evolution.
  • A synergy between technology and processes occurs when the latter relies on the former (such as in the use of automation or artificial intelligence).

B. Theoretical Foundations of the Operational Excellence Framework

These ideas are rooted in the Toyota Production System (TPS), Organizational Theory and Culture, and Complexity and the Social and Natural Sciences. Let’s review some of those principles.

Natural Sciences

  • Mathematics, physics, chemistry, and engineering, taught at school and university, discipline the mind and provide us with a powerful toolset of logical and rational mechanisms for solving physical, technological, and mechanical problems.
  • At the heart of these disciplines is the Scientific Method, which consists of three steps: A) observation and data collection, B) hypothesis formulation and idealized model creation, and C) hypothesis testing through experimentation.
  • However, applying the scientific method to social groups is orders of magnitude more challenging. First, you will never have enough data. Second, experiments are arduous to replicate. Finally, the insights are context-specific. Still, such is the nature of reality, and the scientific method is the best we have.

People Come Before Processes and Technology

  • One of the most influential factors that determine the success or failure of an organisation is its culture.
  • Unlike machines, cultures cannot be engineered, and once they set in, they are remarkably difficult to change.
  • Therefore, organisational culture and psychology must take a central position in any discussion on productivity, performance, and operational excellence.

Organisations as Complex Systems

In an effort to better understand organisations and their behaviour, organisational theorists created several models, inspired by the leading scientific paradigms of the day.

First, there was the “organisation like a machine” or mechanistic model, followed by Systems Thinking, Cybernetics, and finally Complex Systems. In between came Open Systems Theory, the Learning Organisation, and a few more.

An organisation comes into being, evolves, develops an identity, a structure, and a purpose, and subsequently disappears. It displays emergent behaviour, is hard to predict and control, and is composed of human beings. Thus, it has all the properties of a complex adaptive system.

The Software Value Chain: Infinite Variations on a Single Theme

Optimizing the software delivery process includes identifying where and how business value is generated (the value chain) and understanding the process that allows this to happen (the Software Development Lifecycle or SDLC).

  • Fortunately, both are easy to uncover through field observation. Agile is an excellent example of this and will be the centrepiece of Principle 4. The idea here is that Waterfall, DevOps, Agile, and their many variations are surprisingly similar under the hood.
  • This common denominator will help us find optimization techniques that apply to all project delivery methodologies.

These three foundations (organisations as complex systems, the importance of the human element, and the commonality of the underlying drivers in the value chain) will form the basis of the seven Principles of Operational Excellence in Software Development.

IV. The Route to Operational Excellence

[Figure: NATO Software Engineering Conference, 1968]

Understanding what a successful software business looks like and how it should operate requires effort, patience, and, most of all, curiosity. Even when you think you have the facts right, synthesizing them into a coherent and logical narrative is fraught with complexity, and deriving the correct insights is rare. Only time can tell how well such an exercise has succeeded.

My personal story started with some questions in two specific areas: software development processes and the structure of software businesses.

1. Software Development Processes

The first area had to do with software development processes, which differed wildly between teams and organisations. This puzzled me quite a bit as I thought surely it could not all be arbitrary, and there must be some fundamental underlying principles that any software practitioner would accept and apply in the field.

2. Structure of the Software Business

The second area covered the many relationships between the various units (business, management, operations and support, technology) of a software organisation. I was constantly searching for a hierarchy that never existed. Instead, it turned out to be a dense web of relationships.

Areas 1 and 2 above produced two pressing questions, which I could state as follows:

  • Question 1: On the structure of the software business
      • Is there a coherent and logical framework, firmly anchored in scientific theories and objective reality, that allows developers and technicians to comprehend software delivery holistically, from project inception to deployment and from business to technology?
  • Question 2: On software development processes
      • Is there a set of universal guidelines and best practices that could be deployed anywhere and at any time (i.e. context-free), allowing developers to excel in delivering top-of-the-line software products within budget and on time?

Or, combined…

  • How can software organisations achieve operational excellence in their development and delivery projects?

Arriving at the destination…

The search for answers to these two questions led me to explore the evolution of software development as a practice and of software as an industry. The framework that emerged was Operational Excellence in Software Development.

V. Developing a Framework for Operational Excellence

A. On the Shoulders of Giants

The answers to the two questions above came gradually and from varying and unrelated sources. Some ideas came from engineering courses, while others were distilled from manufacturing, software engineering, business management, and natural and social sciences.

Many of the concepts came from the seminal works of Robert Martin, Martin Fowler, Kent Beck, and a few other experts in the tech industry. However, these pioneers focused mainly on the software development process; the human side of the story, the organizational aspects, and group dynamics were conspicuously missing. This is where equally insightful experts like Peter Drucker, Henry Mintzberg, Ralph Stacey, Edgar Schein, and Dave Snowden rushed to fill the gaps. But that’s not all. The first place is reserved for Jeffrey Liker’s book “The Toyota Way”.

The ideas behind the Principles of Operational Excellence in Software Development are an assemblage of smaller, more varied concepts. Some parts fit together effortlessly, while others had to be refined and reshaped. Below is a summary of the sources that contributed the most to the Operational Excellence narrative that we are telling.

B. The Story of Toyota

Operational Excellence has its roots in the automotive industry. It started in the 1950s, when the post-war Japanese car manufacturer Toyota struggled to catch up with the potency of mass production, a method in which its giant American competitors had a significant edge. Henry Ford had first deployed a moving assembly line capable of mass-producing an entire vehicle in 1913.

With massive resources and a vast market at Ford’s disposal, there was very little the Japanese automakers could do to compete. By the late 1980s, however, the situation had changed drastically: Japanese carmakers, especially Toyota, were doing significantly better than their American counterparts. For the next few decades, the Japanese produced markedly better cars at highly competitive costs, and their market share kept increasing year on year.

To make that happen, Toyota and other Japanese manufacturers deployed a strategic weapon that would later be known as the Toyota Production System (or TPS). This system was a set of 14 business management principles (articulated fluently in a now classic work by Dr Jeffrey Liker) that Toyota lived by. The principles were also supplemented by a philosophy of business management that gave meaning to the organisation’s existence and the work Toyota’s employees were doing.

The Toyota Way, a brilliant book by Dr Jeffrey K. Liker, is a must-read for anyone interested in the greatest ideas of manufacturing and in how concepts like Kaizen and Kanban were invented.

As Toyota grew from a Japanese car manufacturer to a global player, it had to promote and apply the Toyota Production System in its overseas factories. To explain TPS in a language that transcended culture, The Toyota Way was introduced. Essentially, The Toyota Way is centred around respecting people and applying continuous improvement.

The Toyota Way and the Toyota Production System were so effective that professionals from different industries started looking at incorporating their ideas into their own businesses. Think of Kaizen, Kanban, and 5S. All three were developed or popularized at Toyota and carry its trademark. Although, on the surface, car manufacturing bears minimal resemblance to software product development (or services in general), there is enough overlap to warrant a serious discussion, as we have thoroughly discussed in “Operational Excellence and the Structure of Software Development and Delivery”.

We believe software development can benefit from various concepts successfully applied in the car business. This article will look at how software development and delivery can profit from Toyota’s philosophy and principles, not by repeating the tenets most elegantly narrated in Liker’s book but through practical guidelines that developers can immediately relate to.

Today, Toyota is a Japanese multinational automotive manufacturer with headquarters in Aichi, Japan. It is the largest automobile manufacturer in the world, producing about 10 million vehicles per year.

Two books, “The Toyota Way” and “The Machine That Changed the World,” popularized the story of Toyota and, more precisely, the Toyota Production System (TPS). Both books inspired many tech experts.

C. The Development of Large Software Systems

In 1970, Winston Royce published a seminal paper on the delivery of large software systems. In this paper, he described the basic steps and challenges of what later came to be known as the Waterfall project management model. If you read this paper, you will immediately notice that some of the fundamental problems associated with software development had already been uncovered, and that experts had already started looking for answers.

In 2001, a group of software professionals who had independently invented various ways of dealing with software delivery issues came together and launched the Agile movement. The Agile Manifesto set out four values, supported by 12 principles, that are now the basis of every software development methodology in this school.

A few years later, when the build and test automation tools improved, we got DevOps. Today, top tech companies use Continuous Integration and Continuous Deployment (CI/CD) practices to deploy production changes daily or even hourly. The next revolution will probably be around AI and Large Language Models (LLMs).

It is hard to think about anything in software without having all these models, their evolution, and first principles in mind.

D. Individual and Group Behaviour and Psychology

Until a decade ago, my understanding of software development and the management of a software business consisted of parallel narratives occasionally presenting irreconcilable paradoxes. The technical aspect of software development was straightforward. But whenever an individual or a group becomes part of the equation, we immediately move into a more complex domain.

[Figure: Organisational drivers of behaviour]

For instance, the following paradoxes were most intriguing:

  • Employee satisfaction vs productivity maximization
  • Individual interests (expectations, influence and control, and identity) vs group interests
  • Building the right software, the best software, or the one that has minimal time-to-market
  • Influencing and being influenced by environmental (or external) forces affecting the group to which I belonged

The key to reconciling these paradoxes lies in the fact that each person has multiple identities (Dave Snowden’s concept of anthro-complexity). These identities flip and switch depending on the current situation. For example, a software developer is also a team member, family member, community member, citizen of a country, an alumnus of a particular school, a human with unique experiences, and possibly a lot more. The (mostly unconscious) forces emanating from these identities, which act as attractors, influence our behaviour in an infinite variety of ways.

VI. Organisation’s Lifecycle and Operational Excellence

Operational Excellence requires a minimum level of organisational stability, where leaders and employees have the time to analyse and rethink their actions. They must also be able to run experiments in continuous improvement and transformations without jeopardizing production or sustaining damaging failures.

A. The Organisation’s Lifecycle

Various management scholars and practitioners have widely discussed and promoted the concept of an organization’s lifecycle and its different stages. Notable thinkers and management consultants were behind these ideas:

  • A management professor, Larry Greiner, proposed a model of organizational growth and evolution in his 1972 article “Evolution and Revolution as Organizations Grow.” Greiner’s model suggests that organizations go through phases of growth punctuated by periods of crisis, necessitating radical changes in management practices and structures.
  • Charles Handy, Ichak Adizes, and James O’Brien, management consultants, authors, and distinguished professors, developed similar Organizational Lifecycle models. Inspired by biological lifecycles, their models emphasize the predictable patterns of organizational development, including birth, growth, maturity, and decline.

B. Critiques of the Organisation’s Lifecycle Model

While the model of organizational lifecycles has been widely discussed and utilized, it is not without its critiques. Here are some common criticisms of this model:

  • Simplification and Generalization:
      • Critics argue that the lifecycle model oversimplifies the complex reality of organizational dynamics. It assumes a linear progression through distinct stages. In contrast, organizations often follow nonlinear paths with critical junctures (see Order out of Chaos by Ilya Prigogine and Isabelle Stengers for a fascinating description of complex systems). Organisations may also exhibit characteristics of different stages simultaneously.
  • Neglect of Organizational Resilience:
      • The model often overlooks organizations’ capacity to adapt, innovate, and overcome challenges. It assumes that decline is inevitable, whereas organizations can successfully navigate crises, reinvent themselves, and achieve sustained growth despite adversity. Of course, a great example would be Toyota.
  • Homogeneity of Organizational Experiences:
      • Critics argue that the model assumes a one-size-fits-all approach and fails to acknowledge the diversity of organizational experiences across industries, sectors, and contexts. Organisations have individual identities and unique contexts, making their evolution patterns challenging to replicate.
  • Limited Focus on External Factors:
      • The model focuses primarily on internal factors and organizational characteristics, often neglecting the impact of external factors such as market dynamics, technological advancements, regulatory changes, and competitive forces. These external factors can significantly shape an organization’s lifecycle and outcomes.

C. Organisation’s Lifecycle and Operational Excellence

The stability that Operational Excellence requires is, by definition, not present in startups or organisations in crisis. In startups, power hierarchies, typically dominated by the founders, dictate what can and can’t be done and how the organisation should operate. The main objective of startups is to establish themselves as viable players in the market. Sustainability, through Operational Excellence, is for a later stage.

Companies in crisis have high anxiety levels and do not have the luxury of slow, rational, and deliberate reflection on improvement through Operational Excellence. Their main objective is surviving the crisis.

Discussions on Operational Excellence can be futile at these two extremes: in startups and in times of crisis.

Therefore, the following prerequisites must be met before Operational Excellence can be conceived as a long-term goal.

  • First, the organisation must be well established and have mature processes and products. Its immediate objective must not be surviving the quarter.
  • Second, the organisation must not be dominated by a tiny minority of key individuals (such as its founders). This requires the organisation to be of a size where diversity occurs naturally.

VII. Summary

Let’s recap some of the key ideas of this article.

  • Software delivery requires much more than technical expertise, such as software design and programming; in fact, success revolves primarily around managing the software enterprise’s human resources and client interactions.
  • The creative functions responsible for producing the best software must necessarily include the people involved, the implemented processes, and the available technology. The interaction of these three elements is non-trivial and can produce the most complex behavioural patterns.
  • The philosophy, structure, function, and operation of a software development team must be rooted in natural or social science, i.e. it must use the scientific method to determine what works and what doesn’t.
  • Many drivers influence decision-making in individuals and groups. The only valid driver of an enterprise is its long-term survival and economic development.

Now that everything is in place, let’s examine the seven principles of Operational Excellence in Software Development.
