Principles of Operational Excellence in Software Development

1. Introduction

1.1 Operational Excellence and How Software Development Works

A long-standing curiosity about certain aspects of software development has accompanied me through 18 years of professional experience. After enough sweat and tears, I could paint a coherent picture of what this curiosity consisted of and even offer some insights into its answers. It could be expressed as follows:

  • First, on the structure of software development. My question was: Is there a coherent framework, firmly anchored in scientific theories and objective reality, that allows developers and technicians to comprehend software delivery holistically, from inception to deployment and from business to technology?
  • Second, on software delivery and execution: Is there a set of universal guidelines and best practices that could be deployed anywhere and at any time (i.e. context-free), allowing developers to excel in delivering top-of-the-line software products within budget and on time?

Both questions could be condensed further and fused into one: how can software organisations achieve operational excellence in their development and delivery projects?

The answer to the first question, on whether a deep philosophy and coherent framework for understanding software exists, is as follows. Until a few years ago, my understanding of software development and business management consisted of parallel narratives that occasionally presented irreconcilable paradoxes. For instance, the following (apparent) paradoxes were most intriguing:

  • Employee satisfaction vs productivity maximization
  • Individual interests (expectations, influence and control, and identity) vs group interests
  • Building the right software, the best software, or the one that has minimal time-to-market

Depending on whom you talk to (management vs. employee, individual vs. group), different justifications would be laid on the table, each equally compelling and truthful, at least in the eyes of their beholders.

As agents in a complex system, we are influenced by multiple factors, identities, and drivers. Our preferences and attitudes vary with context and constantly evolve through interactions with other individuals. Our behaviour as a group or organisation is unpredictable and almost uncontrollable.

After much research, the answer to the second question, concerning the universality of methodologies, methods, and practices, ran like this: they are not context-free, as Dave Snowden explains using his Cynefin framework. Context is key, and methodologies, processes, tools, and practices rarely apply efficiently in all situations.

What about cults and rituals and the dogmatic wars between the different schools of project management (Agile vs Waterfall vs DevOps, Scrum vs Kanban, or SAFe vs Waterfall)? How can we separate the core principles (delivering quality software on time and within budget) from commoditised half-truths? That was a follow-up question to the second, which also deserved close inspection.

1.2 A Philosophy and Framework of Software Rooted in Science

In the first part of this series, we focused on the structure of software development, relying on theories from cognitive science, group psychology, natural science, and organisational theory to define solid assumptions from which we could derive (with sufficient rigour) the seven disciplines that will be the central topics of our discussions here.

Any theory of organisational management must be rooted in natural sciences, psychology, organisational theory, and cognitive sciences. Otherwise, its utility will depend heavily on the specific context in which it was initially developed, and it will never scale.

These seven disciplines of operational excellence are not universal and will apply only in a specific context, which we will define in detail in the coming paragraphs. Although this boundary on the domain of applicability of these ideas is a limiting factor, the domain itself is broad enough to encompass many projects forming the bulk of the work in most software organisations.

More specifically, the principles of operational excellence will apply to large software development or integration projects with varying team sizes. Some of the seven principles will apply less effectively in small teams (or startups) working on novel ideas, technologies, or industries requiring a less structured, more dynamic approach.

Any discussion that involves human nature or group and organisational psychology is never bound to specific jobs or industries, as these two drivers of human behaviour are omnipresent in every organisation. Other, more technical aspects are industry-specific and would poorly transfer across domains.

In the remainder of this article, we will define Operational Excellence, specifically focusing on software development. We will discuss The Toyota Way and the Toyota Production System. We then describe the seven principles of Operational Excellence in Software Development and how to deploy them effectively in an organisation.

2. Operational Excellence in Software Development

2.1 A Brief History of Toyota and the Toyota Production System

Operational Excellence has its roots in the automotive industry. It started in the 1950s when post-war Japanese car manufacturers struggled to catch up with the potency of mass production – a method in which their giant American competitors had a significant edge. Henry Ford had first deployed an assembly line capable of mass-producing an entire vehicle in 1913.

With massive resources and a vast market at Ford’s disposal, there was very little the Japanese automakers could do to compete. The situation, however, had changed drastically by the late 1980s, when the Japanese carmakers, especially Toyota, were doing significantly better than their American counterparts. For the next few decades, the Japanese produced markedly better cars at highly competitive costs. Their market share kept increasing year on year.

To make that happen, Toyota and other Japanese manufacturers deployed a strategic weapon that would later be known as the Toyota Production System (or TPS), a set of 14 business management principles (articulated fluently in a now classic work by Dr Jeffrey Liker) that Toyota lived by. The principles were also supplemented by a philosophy of business management that gave meaning to the organisation’s existence and the work Toyota’s employees were doing.

As Toyota grew from a Japanese car manufacturer to a global player, it had to promote and apply the Toyota Production System in its overseas factories. To explain TPS in a language that transcended culture, The Toyota Way was introduced. Essentially, The Toyota Way is centred around respecting people and applying continuous improvement.

The Toyota Way and the Toyota Production System were so efficient that professionals from different industries started looking at incorporating their ideals into their own businesses. Think of Kaizen, Kanban, and 5S. All three originated at Toyota and carried its trademark. Although, on the surface, car manufacturing bears minimal resemblance to software product development (or services in general), there is enough overlap to warrant a serious discussion, as we have thoroughly discussed in Part 1.

We believe software development can benefit from various concepts successfully applied in the car business. This article will look at how software development and delivery can profit from the philosophy and principles of Toyota, not by repeating principles most elegantly narrated in Liker’s book but through practical guidelines that developers can immediately relate to.

2.2 Chaotic Organisations and Operational Excellence

In our preliminary (mostly theoretical) discussion of Operational Excellence and the Structure of Software Development and Delivery, we tried to convey a central idea on strategic management that applied equally well to specific instances like software development. The idea is that organisations are complex systems, and their evolution is chaotic and uncontrollable. This brings us to the question of how Toyota, for example, managed to negotiate the high waves associated with this complexity and consistently grow and do well over many decades.

Toyota’s success has been described briefly in an interview Dr Liker gave (the full story is, of course, in his books) and goes as follows. Management’s vision for the coming years or decades is articulated into clear targets and objectives, which are then attained by deploying Operational Excellence. In Liker’s words, Operational Excellence is about execution (people, processes, tools), or how to achieve those objectives.

Visions and long-term objectives can be reached if the organisational structure, processes, culture, and people support them. The ability to continually adjust and tune those elements, without compromising the core values or the targets, so that execution is smooth and effective is called Operational Excellence.

Once again, the ideas in the last few paragraphs seem paradoxical; on the one hand, we have a complex system that is practically unpredictable or unmanageable in the long term. On the other hand, we have a proven framework (Toyota Production System) for working under what we believe would be similar conditions.

When we look at the details, however, we find that the lower-level concepts, frameworks, and processes of the Toyota Production System (partially) overlap with those from Complexity Theory (promoted by key thinkers like Dave Snowden and Ralph Stacey). These similarities and differences are illustrated in the following table:

Problem | TPS Solution | Complexity Theory Solution
Theoretical framework | Long-term systems thinking | Complexity theory
Meaning and purpose | Benefiting society through meaningful contributions | [No equivalent]
Vision | Set by top management | The future is almost uncontrollable; management should focus on the present.
Sense-making | Genchi Genbutsu (personal involvement or “go see for yourself”) | Collecting intelligence via a human sensor network in real-time. Indexing narratives and stories told around watercoolers.
Hierarchies and decision-making | Top-down, making decisions slowly, coaching instead of forcing solutions | Self-organisation with boundaries, distributed ideation with centralised decision-making (or any other combination)
Planning | Cascading vision and plans from top management all the way to employees on the shop floor | Managing in the present, Agile, Scrum, sprints, adjacent possible
Root-cause analysis | 5 Whys, the Fishbone diagram | Causality is difficult to establish, strange attractors
Solution finding | Exploring many alternatives, avoiding solutions from above | Conducting safe-to-fail experiments, distributed ideation
Continuous improvement | Challenging objectives, Kaizen, Plan-Do-Check-Act (PDCA) | Enabling constraints, micro-interventions, fractal engagement
Establishing order | Standardized processes, a conservative approach, and a healthy respect for authority | Boundaries and governing constraints
Innovation | [No equivalent] | Exaptation, Serendipity
Leadership | Internally grow leaders who live the philosophy and understand the culture | Leaders must manage, and managers must lead. Leaders must manage the present and abandon hopes of controlling the long-term evolution of the organisation.
Quality Control | Built-in quality, visual controls, and one-piece flow to prevent issues from being hidden | [No equivalent]
Warehouse management | 5S | [No equivalent]
Design | Getting feedback from many sources, possibly from experts outside the field | Agile design, designing for serendipity
Flow | Respect for people, meaningful jobs | Changing the nature of people’s interactions rather than their mindsets
Technology | Technology serving the customers and the processes | The interplay between technology and changing client preferences.
A comparison of the Toyota Production System with concepts of business management grounded in complexity theory is shown in this table.

The columns in the table above are not isomorphic (they do not share the same meaning) but are loosely coupled. To clarify that relationship, we observe the following:

  • The complexity theory of management, especially as it appears in Stacey’s book, is more about understanding organisational behaviour, psychology, and evolution. Even Snowden’s ideas remain at a certain level of abstraction, and the practices proposed are quite sophisticated, with a steep learning curve.
  • On the other hand, the Toyota Production System is a handbook with practical guidelines on making decisions, finding solutions, eliminating waste, and preparing leaders. Liker’s book, The Toyota Way, contains many examples and anecdotal evidence to support the 14 principles presented.
  • The Toyota Production System has generic guidelines applicable to all industries in addition to those specific to manufacturing. For example, just-in-time (JIT), 5S, and one-piece flow do not readily apply outside manufacturing.
  • The Toyota Production System covers more areas of business management than the complexity management theory. In a sense, it appears more mature and powerful. That isn’t surprising, given how well-established (and simple and practical) its ideas are. On the other hand, the Toyota Production System is more conservative regarding innovation, for example.
  • While complexity theory has a solid scientific foundation, the Toyota Production System is the accumulation of vast industrial engineering knowledge acquired over many generations.
  • The various dimensions of organisational management touched upon by both schools vary from philosophical, psychological and organisational to technical, making Operational Excellence challenging as it involves taming those aspects and effectively deploying the tools, methodologies, and practices within real teams and workshops.

In Toyota’s world, Operational Excellence consists of putting the ideas presented in the table above into practice in mega-scale, multi-cultural, global organisations. In services, the exact meaning can be carried over, and Operational Excellence would encapsulate an organisation’s ability to deploy its resources efficiently and effectively to achieve its strategic goals.

2.3 Operational Excellence in Software Development

So where does that leave us regarding Operational Excellence in software development? First, I believe we need to understand the sources of complexity in software, without which this conversation may not be worthwhile. Experience shows us that, in software development, complexity arises as follows:

  • First, the human element is heavily involved in every stage of the software development lifecycle. There are hardly any repetitive processes (which is one reason why Six Sigma can never apply), and people’s behaviour is virtually unpredictable (the same goes for groups and organisations). Human beings, individually and collectively, are complex systems, and any system that involves them seems to inherit this property.
  • Second, decisions must be made under a high level of uncertainty. The fundamental idea here is that you will never have enough information to make the best decision, and you must find your way iteratively through many small experiments while things around you (technology, trends, customer preferences, competition) are changing all the time and sometimes at breathtaking speeds. This is due to the nature of software development in the 21st century, and very little (if anything) can be done to eliminate it.

Complex, uncertain, and uncontrollable as it may be, we are not saying that managing large software projects or businesses is hopeless. On the contrary, excellent theories and practices have been published and are ready to be deployed, even though many have not yet become mainstream. For example, we have tangible evidence of the benefits of Agile, DevOps, and test automation. However, these wins fall short of our objectives (and needs) for achieving Operational Excellence, hence the seven principles we will discuss shortly. Even with those principles, Operational Excellence can be achieved in certain contexts while others remain out of scope. We will discuss the domain of applicability of those seven principles shortly.

3. Challenges of Delivering Software and a Novel Approach to Software Development

3.1 Successful Software Business

The efficient organisation and its integration in a competitive environment.

Successful software businesses have the following properties, which, surprisingly, resemble those of manufacturing businesses to some degree and those of service businesses a great deal:

  • Quality software products, including customer support and maintenance
  • Efficient project execution with on-time and within-budget deliveries
  • The ability to effectively respond to market changes and demands
  • Sustainability, or the ability to stay in the game for a long time

3.2 Common Challenges of Delivering Software

Uncertainty has many sources, creating a complex setting in which software development and delivery can easily struggle.

Now that we know our targets for a successful software business, here are some examples of why that is more challenging than it looks.

  • It’s not only about technology or processes but also about people. Organisations or managers that ignore that fact, perhaps because it doesn’t make sense to them or maybe it’s too hard, are doomed to fail. People are the source of innovation, creativity, and contribution, and this happens only in the right context and within a friendly and healthy organisational culture.
  • Knowledge drain. While patents, software code, and other forms of intellectual property are vital organisational assets, so is tacit knowledge, accumulated over years of experience and held in processes and people’s heads. Tacit knowledge is, by definition, difficult to codify, and by losing talent, you also lose knowledge. Training, education, coaching, and up-skilling are all investments that organisations make in the hope that, in the near future, they will receive due returns. Knowledge build-up and retention are key challenges in software.
  • Too much volatility. The areas where high volatility is naturally found include business requirements (especially in novel products), novel technology, emerging market trends, customer preferences, global supply chains, natural catastrophes, and best practices. What about stable products and markets (like digital payments), which, although constantly changing, evolve at a much slower pace? In these situations, uncertainty and volatility will arise for different reasons, including, for example, the presence of large legacy systems and the complexity of delivering large-scale software integration projects.
  • Every project is unique. In software, every project has a unique element, which means there is never a shortage of novel situations. For many companies, levelling demand from software projects is very challenging, with periods of famine or feast alternating, driving organisations to shuffle people around and rely on outside contractors and consultants. In such situations, learnings can be difficult to socialize and incorporate into existing processes, forcing every project to encounter the same problems repeatedly.
  • Too much uncertainty. Changes in organisational priorities, strategies, management, and hierarchies are constant sources of uncertainty and anxiety. This leads people to sometimes place their interests above those of the group, business, or organisation. By seeking certainty, employees try to avoid failure (and, therefore, the opportunity to learn from mistakes) by sticking to what is considered safe rather than optimal.
  • Imbalance of theory and practice. When software professionals lack an in-depth understanding of the first principles (or theory) to back up their best practices, their conviction in these practices will rest on faith rather than scientific and rational reasoning. The consequence is dogmatic adherence to these best practices even when they stop working, which they often will as circumstances change.

These are just the high-level issues forming a common denominator between most software organisations. Differentiation and local problems will impose themselves when we get into the details. For the time being, I believe the list above will cover the majority of issues of interest.

In Operational Excellence and the Structure of Software Development and Delivery, we examined the complexity of diagnosing and troubleshooting organisational issues. We concluded that cheap solutions wouldn’t do, and successful ones would have to rise to the required level of sophistication and potency.

Luckily, many original thinkers have articulated what assumptions, frameworks, and paradigms those sophisticated solutions should be built upon. These assumptions, frameworks, and paradigms will form the basis for our approach, which we now describe.

3.3 A New Approach and Solution

Operational Excellence in Software Development revolves around gearing processes, technology, and people towards a common vision and strategy. The interactions between these three entities in both directions allow complex behaviour to emerge.

As you might have guessed already, the fundamental ideas behind the seven Principles of Operational Excellence in Software Development will be based on those from the Toyota Production System, Organisational Theory, and Complexity Science. These ideas are heavily influenced by this author’s personal experience and the specificities of the software industry.

  • Organisations are complex systems. The discipline, legacy, and practices (such as lifetime employment in one company) afforded by Japanese culture are not necessarily present outside Japan or Toyota. This means that the complex systems approach to understanding organisational behaviour will better mirror the real world than the Toyota Production System. Nevertheless, the reader going through the seven principles below will recognize patterns from many powerful practices promoted in the Toyota Production System.
  • People come before processes and technology in discussions about productivity, as we have endeavoured to show in Part 1, Operational Excellence and the Structure of Software Development and Delivery. This is where any serious discussion would need to start. Both The Toyota Way and Complexity Theory have strong views in that dimension. Still, I believe that individuality and the human element, as described in Complexity Theory, align more with what we can observe in reality. Hence, we will lean more toward Complexity Theory when discussing Organisational Culture, Group Behaviour, and Group Psychology.
  • Agile and the problem of scale. This brings us to the fundamental conflict between complexity science and its views on business management on the one hand and the Toyota Production System on the other. One is highly complex, emergent, and unpredictable, while the other appears to be the exact opposite. Complexity, emergence, and chaotic evolution are opposite to what senior management wants. Senior management wants long-term predictability and control without losing adaptability and flexibility, so two-week sprints in organisational planning and strategy-making are unrealistic. Agile is perfect in specific contexts (high customer interaction, high uncertainty, and small scale) but not in others.
  • The software value chain – infinite variations on a single theme. Optimizing the software delivery process includes identifying where and how business value is generated (the value chain) and understanding the process that allows this to happen (the Software Development Lifecycle or SDLC). Fortunately, both are easy to uncover once we dive under the surface. Agile is a great example of this and will be the centrepiece of Principle 4. The idea here is that Waterfall, DevOps, Agile, and their many variations are surprisingly similar under the hood. This common denominator will help us find optimization techniques that apply to all project delivery methodologies.

These three pillars, organisations as complex systems, the importance of the human element, and the commonality of the underlying drivers in the value chain, will form the basis of the seven Principles of Operational Excellence in Software Development. But before we tackle these seven principles, we must first discuss the limitations of this approach.

4. Context and Domain of Applicability

4.1 The Organisation’s Lifecycle

Various management scholars and practitioners have widely discussed and promoted the concept of an organization’s lifecycle and its different stages. Notable thinkers and management consultants were behind these ideas:

  • A management professor, Larry Greiner, proposed a model of organizational growth and evolution in his 1972 article “Evolution and Revolution as Organizations Grow.” Greiner’s model suggests that organizations go through phases of growth punctuated by periods of crisis, necessitating radical changes in management practices and structures.
  • Charles Handy, Ichak Adizes, and James O’Brien, management consultants, authors, and distinguished professors, developed similar Organizational Lifecycle models. Inspired by biological lifecycles, their models emphasize the predictable patterns of organizational development, including birth, growth, maturity, and decline.

4.2 Critiques of the Organisation’s Lifecycle Model

While the model of organizational lifecycles has been widely discussed and utilized, it is not without its critiques. Here are some common criticisms of this model:

  • Simplification and Generalization: Critics argue that the lifecycle model oversimplifies the complex reality of organizational dynamics. It assumes a linear progression through distinct stages. In contrast, organizations often experience nonlinear paths (with critical junctures; see Order out of Chaos by Ilya Prigogine for a fascinating description of complex systems). Organisations may also exhibit qualitatively different characteristics of different stages simultaneously.
  • Neglect of Organizational Resilience: The model often overlooks the capacity of organizations to adapt, innovate, and overcome challenges. It assumes that decline is inevitable, whereas organizations can successfully navigate crises, reinvent themselves, and achieve sustained growth despite adversity. Of course, a great example would be Toyota.
  • Homogeneity of Organizational Experiences: Critics argue that the model assumes a one-size-fits-all approach and fails to acknowledge the diversity of organizational experiences across industries, sectors, and contexts. Organisations have individual identities and unique contexts, making their evolution patterns difficult to replicate.
  • Limited Focus on External Factors: The model focuses primarily on internal factors and organizational characteristics, often neglecting the impact of external factors such as market dynamics, technological advancements, regulatory changes, and competitive forces. These external factors can significantly shape an organization’s lifecycle and outcomes.

4.3 Organisation’s Lifecycle and Operational Excellence

Operational Excellence requires a minimum level of organisational stability where leaders and employees have the time to analyse and rethink how they do things. They also must be able to run experiments in continuous improvement and transformations without jeopardizing production or sustaining damaging failures.

This stability is, by definition, not present in startups or organisations in crisis. In startups, power hierarchies, typically dominated by the founders, dictate what can and can’t be done and how the organisation should operate. The main objective of startups is to establish themselves as viable players in the market. Sustainability, through Operational Excellence, is for a later stage.

Companies in crisis typically have high anxiety levels and do not have the luxury of slow, rational, and deliberate pondering on improvement through Operational Excellence. Their main objective is surviving the crisis.

At these two extremes, startups and times of crises, discussions on Operational Excellence can be futile. Therefore, the following prerequisites must be available before Operational Excellence can be conceived as a long-term goal.

  • The organisation must be well established, with mature processes and products. Its immediate objective is not how to survive the quarter.
  • The organisation is not dominated by a tiny minority of key individuals (such as its founding fathers). This requires organisations to be of a certain size where diversity occurs naturally.

Now that we have everything in place, let’s look at the seven principles of Operational Excellence in Software Development.

5. Principle 1: Eliminate Cultural Blockers

5.1 What are Cultural Blockers?

The 15th State of Agile Report and the 2021 State of DevOps Report cite cultural blockers as primary challenges in adopting Agile or DevOps. In this sense, perhaps it makes sense that the first principle of Operational Excellence would focus on managing cultural blockers that impede growth and success. We say managing and not eliminating as the latter would be too difficult, if not impossible, as we shall see in a moment. So what are those cultural challenges?

Cultural blockers are prime detractors of Operational Excellence and can be defined as the answer to the following question:

  • Why are productivity, performance, and motivation low despite having the right mix of talents, skills, and processes?

The answer to this question can be provided by acknowledging that organisations are not machines, a point we have discussed extensively in Part 1. If they were, a combination of superbly engineered processes and great talent and skills would be enough to drive performance and productivity.

What is missing is the human element called organisational culture; this can be summarized as shared beliefs on how the system should be designed, driven, and led and what its ultimate objectives must be. Also fundamental to this concept is the employee’s perception of the nature of their contribution and how much meaning these contributions give to their efforts.

5.2 Organisational Culture

Edgar Schein, an influential American organizational psychologist and consultant, describes organisational culture as consisting of three layers: artifacts, values, and assumptions:

  • Artifacts are physical objects such as symbols, language, rituals, and dress codes. Artifacts are important in understanding a group’s culture but, studied in isolation, reveal little information on the subject.
  • Watching the group’s interactions can reveal much information about its espoused values, a second level of abstract concepts that inform observers on acceptable norms and standards of behaviour. Espoused values, declared publicly, may starkly contrast with people’s actual behaviour.
  • Finally, shared assumptions are the deepest level of abstract concepts. These assumptions are built over time, typically from traumatic experiences. Shared assumptions form the frame, or backdrop, against which the group’s behaviour can be deciphered.

In his seminal work, Gödel, Escher, Bach: an Eternal Golden Braid, professor Douglas Hofstadter explores the many layers of stability that underpin our individual worldview. To explain this concept, Hofstadter uses mathematical (and programming) notions of constants, parameters, and variables.

In the situation […] represented by the symbols, c establishes a global condition; p establishes some less global condition which can vary; and finally, v can run around while c and p are held fixed. It makes little sense to hold v fixed while c and p vary, for c and p establish the context in which v has meaning.

— Douglas Hofstadter, Gödel, Escher, Bach: an Eternal Golden Braid

The last sentence is pure insight. Shared assumptions and values (constants and parameters in Hofstadter’s metaphor) provide context without which our fleeting daily experiences have no meaning. This powerful idea explains why change (which consists of modifying what is supposed to be constant) is indigestible, traumatic, and anxiety-ridden.

To illustrate the impact of culture on organisational growth and evolution, consider the time management issue. For example, what are the mainstream views in your organisation on the following questions?

  • Is sacrificing quality (medium and long term) acceptable for faster time-to-market?
  • When people gather around the water cooler, do their discussions mostly revolve around a lost but glorious past, an expedient present, or a futuristic vision?
  • What is the average tenure of an employee?

The perception of time in a group of individuals is integral to an organisation’s culture. How you answer the above questions determines the disposition of an organisation to move forward, backward, or maintain the status quo.

The perception of space is equally important in organisational culture. This can be observed mostly in how offices are designed and how space is allocated according to rank and hierarchy.

5.3 Change Resistance

Cultures (tribal, familial, national, organisational) are cognitive constructs deployed as defensive mechanisms that help groups cope with internal integration challenges and challenges their environment poses.

  • Internal challenges include cohesion, integrating new members or ideas, and replacing old ones.
  • External pressures can come from novel technologies, new customer requirements, or competitors. Any environmental change can be a threat (or opportunity) to the business.

If the group is to survive, it needs to adapt and respond to those threats in new ways. Its leaders and members must unlearn old methods and learn new ones. This process is traumatic and does not always succeed.

Returning to our metaphor on constants, parameters, and variables, and the different levels of stability that each offers, we can make the following straightforward observation. The higher a belief sits in this hierarchy (variables at the bottom, constants at the top), the stiffer the resistance to changing it will be.

5.4 How Can Cultural Blockers Be Managed

Best practices for removing cultural blockers.

It might sound a bit defeatist to admit that organisational and cultural transformations are extremely difficult, if not impossible, to execute. Nevertheless, eminent thinkers and practitioners have presented several ways of managing change and softening cultural blockers. Leaders play a central role in all these practices. Let’s see what these practices offer.

  • Leading by example. Leaders (especially founding fathers) have a long-lasting influence on culture creation. During culture creation and maturity, the group closely monitors a leader’s actions and reactions and will modify its assumptions, values, and beliefs based on those observations. A leader wishing to influence a group’s behaviour can do so with genuine commitment to their ideals and by deploying other means, such as the right incentives and rewards system or an effective job enrichment plan, creating boundaries, and preparing the next generation of leaders.
  • Preserving the group’s identity and heritage. Group identity often revolves around shared cultural practices, traditions, language, rituals, and customs. It can be shaped and preserved through a shared history and collective memory, such as common historical events, achievements, or traumas. Regular social interaction among group members, such as gatherings, celebrations, or shared activities, fosters a sense of belonging and strengthens group identity. Maintaining strong social ties and solidarity within the group is crucial for preserving identity. A shared identity allows individuals to collaborate more effectively, possess the psychological safety to make risky changes, accumulate past experiences, and preserve gained knowledge. On the other hand, change that threatens a group’s identity will place them on the defensive, forming an effective cultural blocker.
  • Matching the desired culture with realistic expectations. Knowing what to expect from a group helps manage expectations and plan successful long-term strategies. Certain standards and value systems are more at home in some cultures but not others. In his talks on the Toyota Way, Dr Liker emphasizes that this framework (The Toyota Way) was created for plants outside Japan so that these would easily integrate with their Japanese counterparts. The Toyota Way has two pillars: respecting people and continuous improvement. In Japanese plants, The Toyota Way was implicit and taken for granted; it was their natural and default modus operandi. Matching the right culture with the right processes and expectations is key. Outside Toyota, organisations are best modelled as complex adaptive systems, and a new set of methods would apply.
  • Setting boundaries (or governing constraints). In Open Systems theory, a boundary separates the interior from the exterior of a system, through which energy, data, or some materials might flow. In this context, boundaries are meant to delineate acceptable behaviour, more like constraints than protective walls. A system with agents interacting without any constraint will only display random behaviour. As soon as constraints are introduced, order emerges. In human systems, constraints will arise when interaction starts, regardless of whether it’s designed or left alone. You want to introduce enabling and governing constraints so that self-organisation occurs along healthy routes. Without boundaries, toxic cultures can arise, which are difficult to change when beneficiaries of the status quo become too powerful.
  • Moving to an adjacent possible. Cybernetics is a theory that assumes human systems are equilibrium-seeking and continually adjust to match an external target. The difference between the external objective and the system’s output (the error) is fed back into the system via negative feedback loops to correct its behaviour. Cybernetics is easy to understand and, therefore, seems more realistic and practical than it is. Experience, however, has shown that managing complex systems by setting targets to be matched is incredibly challenging, especially if these targets are long-term. In that respect, cybernetics has outlived its usefulness: complex systems can be nudged in certain directions but cannot be directed as a machine is. Understanding the current situation is key, as it allows us to determine the next best thing to do (the adjacent possible). Failed transformations are typically blamed on cultural blockers, whereas the transformations themselves were never viable in the first place.
  • Employee motivation. Perhaps one of the most insightful bodies of work on satisfaction is that of the Hungarian-American psychologist Mihaly Csikszentmihalyi, who coined the term flow in the 1970s when discussing motivation. Flow is experienced during periods of intense focus while working on an activity that we choose, that is neither too far above nor too far below our skill level, and that has clear objectives and immediate rewards. Frederick Herzberg’s HBR article One More Time: How Do You Motivate Employees? is a classic on employee motivation, in which he painstakingly and humorously takes the reader through the different methods by which employees could be motivated, starting with the proverbial KITA. He concludes (unsurprisingly) that the KITA does not work, and the same goes for other methods like distributing benefits. To achieve “flow”, the employee’s job has to change along the lines described by Csikszentmihalyi. People will never improve on a dull job.
  • Lowering anxiety. Systemic high anxiety is a significant cultural blocker. When group anxiety is high, nothing can be achieved, and a leader must first bring anxiety back to normal levels. Anxiety increases during changing or troubled times when internal or external pressures threaten the group’s survival (see Schein’s model of organisational culture). There are alternative perspectives to this view, though. For example, John Kotter described his theory of change in an article published in the Harvard Business Review, where he recommends that leaders might resort to inducing fake crises to get people to accept change and move along. The two situations are not similar, and a head-to-head comparison would fail. Anxiety in a group is induced by uncertainty and puts the team on the defensive, whereas a crisis compels them to think differently and be more open to taking risks.
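
The negative feedback loop that cybernetics assumes (see “Moving to an adjacent possible” above) can be made concrete with a minimal sketch. The function name, the gain parameter, and all numbers below are illustrative assumptions rather than part of any cited model; the sketch only shows the mechanical idea of feeding the error back into the system.

```python
def negative_feedback(target, output, gain, steps):
    """Feed a fraction (gain) of the error back into the output at each step.

    All names and values are hypothetical; this is a toy proportional
    correction, not a model of a real organisation.
    """
    history = [output]
    for _ in range(steps):
        error = target - output      # distance between objective and output
        output += gain * error       # correction proportional to the error
        history.append(output)
    return history


# A simple, machine-like system closes half of the remaining gap each step.
trajectory = negative_feedback(target=100.0, output=0.0, gain=0.5, steps=5)
print(trajectory)
```

A machine-like system of this kind converges smoothly on its target. The argument of this section is that human systems do not behave this way: the “gain” is unknown, it changes with every intervention, and the target itself may no longer be relevant by the time it is reached.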

It is critical to understand that cultures are, as Snowden most eloquently puts it, emergent properties of complex systems and, therefore, cannot be engineered. You can’t design a social group, but you can manage an agent’s interactions to direct it. Most crucially, intervention in a complex system (such as changing interactions) will always have unintended consequences, making predictability ever more challenging.

Principle 2: Process Engineering, Management, and Governance

6.1 What Is a Process?

A process is a set of actions or tasks designed to produce a desired outcome by consuming and transforming resources like energy, data, and material. Processes are central to manufacturing, software development, science, and business management.

Processes can vary widely in nature (manual vs automated) and complexity (linear and sequential vs iterative, parallel, interlocking, with feedback loops).

6.2 The Relevance of Structure and Processes

Processes provide a structured method for delivering business value. However, too much structure stifles innovation, creativity, and the system’s ability to adapt to new challenges; it becomes fragile. On the other hand, too much dynamism induces chaos, an expensive state of affairs that cannot be maintained long, and which eventually forces the system to settle into an equilibrium state where processes emerge from self-organisation along paths not necessarily leading to desired outcomes (like maximum productivity or top quality).

The questions that arise can be stated as follows:

  • How rigid or flexible should processes be?
  • How are processes articulated, documented, and socialized?
  • How to determine when processes must be reexamined or modified?
  • How can managers know when to move to a different delivery methodology, i.e., when process redesign or reengineering is insufficient?
  • Who are the process owners, and what are their duties and responsibilities?

We will answer these questions briefly in the following sections. Still, the interested reader is invited to explore other articles on Process Engineering, Six Sigma, and Process Improvement on this website.

6.3 Agile Processes

Agile is currently in a state where every person you ask has a different opinion, interpretation, perspective, or story about it, to such an extent that there is hardly any common thread between them. This tragi-comical situation has left all Agile-related questions open-ended, and one, in particular, has to do with specifying software development processes and team structure. Let’s focus on processes and leave team structure for another article.

There is a mainstream view that being Agile involves abandoning processes, documentation, and management in favour of self-organisation. This is hardly what Agile’s founders had in mind, at least from what can be inferred by reading the Agile manifesto or listening to their public lectures.

Agile answered concerns about the software development and delivery methodology employed at the time, which was heavily centred on Waterfall. The resulting practices (XP, Scrum, Kanban…) still involved well-defined processes. These practices have become so ritualised, varied, and specialised that they lost much of their original potency. At no point did Agile stipulate that the best way forward in software development is to throw away everything that has worked in the past and replace it with something novel.

Process engineering and flexibility are unrelated to Agile, Waterfall, or any other flavour of preferred software development methodology and can be discussed separately. This is exactly what Principle 2: Process Engineering, Management, and Governance aims to achieve.

6.4 Process Engineering

The underlying metaphor when approaching the subject of Process Engineering is that of manufacturing, where employees in the workshop perform repetitive tasks to produce thousands of identical pieces. Although at the heart of the Toyota Production System and Quality Management, this metaphor is out of sync with software development, and if we must discuss process engineering in software, an alternative metaphor must be found. Therefore, we propose expanding the scope of Process Engineering in software development to mean the following:

  • Organisational processes (business, production, decision-making, or conflict resolution) must be firmly rooted in engineering and natural sciences. Without theory, we have prescription followers. With theory-informed practice, professionals generate solutions appropriate to their context with whatever ingredients they possess.
  • Creating lean processes exposes their inefficiencies and forces people to address them; otherwise, the process might fail and shut down. In contrast, processes that are not lean allow inefficiencies to go unnoticed for long periods of time. For example, imagine you are developing a piece of software that cannot be tested for another three months. If the module contains a defect, you might build additional features on top of it during those three months, making the defect harder to fix once it is uncovered.
  • Processes are designed to maximize the value delivery capabilities of the organisation. This requires a thorough understanding of the value chain or where and how value is created. In software, the value chain starts with a customer looking at what technology can do to help them solve a business problem. It ends with a software product that coevolves with their changing preferences.
  • Process efficiency and resilience. Processes must have built-in mechanisms that balance efficiency against resilience, since the two pull in opposite directions. Efficient processes are optimized for maximal productivity by removing all forms of redundancy. This extra leanness makes them less resilient and less diverse. Redundancies mean that subsystems can take over other subsystems’ work if the latter fail or shut down, while diversity (of opinion, outcomes, paths) enables innovation.
  • Processes are knowledge storage mechanisms. It is harder to document why a certain approach might fail and get people to become aware of it than to simply add a validation step in the production process that ensures it never gets taken. In this respect, sophisticated processes act as storage devices for learned lessons. This is another reason why being Agile should not mean discarding a legacy of embedded field expertise and know-how.
  • Any production process must follow the Software Development Lifecycle (or SDLC). The SDLC is common knowledge among software development communities. In large projects and corporations, there is a need to share ideas based on a common understanding of how software products are created. Common language, familiar thought processes, and a shared understanding of complex ideas facilitate collaboration, while a proprietary software development methodology requires a learning curve for new recruits.

6.5 Process Improvement, Reengineering, and Redesign

Managers dedicate much energy to discussing continuous improvement, a term that has become so clichéd that it has lost all meaning. At Toyota, continuous improvement has a well-defined meaning: How can the organisation rebuild its products, people, and processes to match a future vision?

In Principle 2: Process Engineering, Management, and Governance, we will focus more on the continuous improvement of software development processes and what guidelines software organisations should follow to keep their processes on par with the present challenges.

The below list includes our definitions of the relevant terms:

  • What is continuous improvement? Continuous improvement is an ongoing organizational process aiming to refine production processes for higher efficiency (speedy, low-cost, and flawless execution) and effectiveness (a measure of how close we come to the desired outcome). Improvement can be incremental or transformational.
  • Incremental improvement refers to isolated and localised efforts to reduce certain inefficiencies in how things are done. Incremental improvements typically stay within a unit or division and do not affect how this unit interacts with the outside world. It might involve using tools differently, shifting roles and duties around, and investing more in specific stages, skills, or areas.
  • Transformational improvement (or process reengineering or redesign) is a fundamental change in production processes. For example, in manufacturing, Toyota implements one-piece flow, a method radically different from Henry Ford’s mass production lines. In software, deploying Agile or DevOps to replace Waterfall project management methodologies is another example of transformational improvement.
  • The first objective of process improvement is to get rid of non-value-adding tasks. This is called waste elimination in manufacturing, and it is especially important for the people at Toyota. There is a parallel in Winston Royce’s (fascinating) paper on developing large software systems. Dr Royce states that only analysis and coding are truly value-creating stages, while the rest (requirement gathering, design, testing, operations, and support) is to be minimised.
  • The second objective of process improvement is to adapt to changing environmental conditions. While inefficient processes might result from outdated technologies, problem-solving methods, poor skills, or insufficient or antiquated knowledge, ineffective processes produce products that do not meet evolving business needs, like fast and on-demand delivery. In this case, a radical transformation is required to deal with novel challenges.

In the next section, we will cover a few ideas on how best (or mainly what prerequisites are needed) to implement process changes.

6.6 Process Engineering Guidelines for Software Development

The reader will find that the ideas we will present shortly on engineering and improving processes are in the same spirit as those in Part 1: Operational Excellence and the Structure of Software Development and Delivery and the remainder of this article. This consistency is reassuring as it shows that the models and frameworks we use (complexity management and The Toyota Way) are coherent, covering most dimensions of the interesting aspects of software development.

Process engineering guidelines for software engineers revolve around three key axes: process definition, individual and group psychology, and software value.

So how can we design and improve software development processes using concepts from complexity management and the Toyota Production System? Here are some guidelines.

  • Understanding the value chain. Software engineering university programs focus heavily on science, technology, engineering, and mathematics (STEM education). Very little effort is dedicated to disentangling the history of software, its evolution, its pervasiveness in our lives, and how business value is created from software applications. Engineers start their careers programming features, but not until quite late do they fully appreciate software delivery’s richness, depth, and complexity. The ultimate objective of software applications is to solve business needs, and as Stephen Covey says in his book The 7 Habits of Highly Effective People, we must “start with the end in mind”. In software, we always begin with a specific business need. The end is a software solution, and the value chain is the bridge that connects them.
  • The Software Development Lifecycle (SDLC). If the value chain articulates the philosophical nature of software delivery, the sequence of SDLC stages (analysis, design, development, testing, deployment) is the language that expresses it. A high-level examination of the SDLC shows a linear, sequential chain of tasks, each feeding into the next. A closer examination, however, reveals a more intricate structure with iterations, closed cycles, and feedback loops. Each stage of the SDLC requires specialized skill sets. A planning unit to coordinate the different stages becomes necessary. Putting the SDLC into practice involves deciding which stages are high-value-added and which are not, how to optimally allocate and utilise resources, how to structure teams, and which tasks to automate. None of these decisions is obvious, and all require a thorough understanding of these issues and the context in which they are meant to operate.
  • Human nature and individual and group psychology. Have you ever worked in an environment where designated team members could break the rules under special conditions? While processes are meant to apply uniformly to everybody and all circumstances, the reality is quite the opposite, especially in complex environments. Under such conditions, prescriptive processes are hardly present, while consultants, developers, and leaders are expected to use their common sense and expertise in the field to make sound judgments. The truth is that writing down everything people know is an impossible task, and processes, by definition, sit at a higher level of abstraction and must constantly be interpreted in the light of present circumstances. Processes must therefore not be rigid and tedious; they should leave room for diverse opinions and for flexibility in dealing with changing circumstances.
  • Use a data-driven scientific approach based on theory and field expertise. Nobody would argue the soundness of this approach to designing processes, but many would dismiss it on pragmatic grounds, such as urgency, the lack of reliable data, etc. In such cases, one of two approaches prevails. The first is where practitioners dogmatically follow a specific school of thought for no reason other than familiarity. The second is a chaotic, unstructured approach with ad-hoc rules constantly created and promptly discarded. Both will catastrophically fail when the challenges become too much. Ideally, we want to look up industry best practices, build on other people’s experience, and leverage our current strengths. A strong theoretical background will help us make sense of changes and identify the limitations of each method. With modern and sophisticated collaboration tools like JIRA and Confluence, data is less of a problem than it used to be.
  • Key Performance Indicators (KPIs) for measuring performance. Once the KPIs are published, they instantly replace the objectives they are meant to track. This is a very unfortunate fact repeatedly demonstrated. KPIs should reflect genuine performance changes in the processes they monitor and must be set up in a way that makes gaming them very difficult.
  • Process standardisation means that teams can benefit from lessons learned in other teams. It is also essential so that data collected from various teams can be analyzed and interpreted similarly. Process standardisation does not necessarily mean tedious and rigid processes. For example, a common coding style (or testing method) facilitates code legibility (or test reliability) in software programming without stifling innovation.
  • Process governance. Process owners should be assigned to monitor and track performance (efficiency and effectiveness). Ownership should reside at the lowest level, as close as possible to the people the processes impact. Incremental process improvement can be made via safe-to-fail experiments. These experiments are local and isolated, and should they fail, they can be reverted with little cost. If a small change proves its merits, it is articulated at a higher abstraction level to facilitate distribution and socialisation; at this point, it becomes standard.

These guidelines are enough to conclude a section on Principle 2: Process Engineering, Management, and Governance but remain at a high (and somewhat abstract) level. We have, however, discussed them in more detail in separate articles that focus on the technical and practical aspects of their implementation. The interested reader is invited to browse these articles for additional insights.

Principle 3: Draw a Line Below Which Quality Does Not Go

7.1 Problem Description

When you have aggressive deadlines or your project is running out of time or budget, there are only four things you can do:

  1. Reduce the scope
  2. Extend the deadline
  3. Commit additional resources
  4. Sacrifice quality

While the first two options are your best bets, and the third option is not always feasible, the fourth one (sacrificing product or delivery quality) is what most people would consider. There is a good explanation for why this can be appealing; sacrificing quality today will only surface in the future, and furthermore, it will be someone else’s responsibility.

Sacrificing quality creates a drag on your delivery capabilities. If left uncontrolled, it will transform your product from a lucrative asset to a liability. In the long run, sacrificing quality will have detrimental and irreversible effects on your business.

Peter M. Senge documented this organisational behavioural pattern in his 1990 book, The Fifth Discipline. Eroding Goals was the name Senge gave to an organisational dynamic describing a positive feedback loop where managers increasingly accept lowering standards to retain the same delivery capabilities.

In software development, Eroding Goals can be seen with rising technical debt. Technical debt makes the next delivery harder, requiring managers to accept more technical debt to keep deliveries on track. At some point, these become so slow that you risk being completely outrun by competition.

Keeping quality above a certain mark prevents organisations from slipping towards a path of Eroding Goals.
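
To make the Eroding Goals loop tangible, here is a minimal simulation sketch. It is a toy model built on stated assumptions, not something taken from Senge’s book: `scope`, `capacity`, `drag`, and `debt_cap` are hypothetical parameters, and the formula that slows the team down in proportion to its debt is chosen purely for illustration.

```python
def simulate_releases(n_releases, scope=12.0, capacity=10.0, drag=0.05, debt_cap=None):
    """Return the technical debt carried after each release (toy model).

    scope     work promised per release, in arbitrary units (assumed)
    capacity  work the team can deliver when it carries no debt (assumed)
    drag      how strongly each unit of debt slows the team down (assumed)
    debt_cap  the "line" below which quality does not go; None means no line
    """
    debt = 0.0
    debt_history = []
    for _ in range(n_releases):
        effective = capacity / (1.0 + drag * debt)  # debt slows delivery
        shortfall = max(0.0, scope - effective)     # work that does not fit
        if debt_cap is not None:
            # With a line drawn, shortcuts stop at the cap; the rest of the
            # shortfall must be handled by cutting scope or moving the date.
            shortfall = min(shortfall, max(0.0, debt_cap - debt))
        debt += shortfall                           # shortcuts become debt
        debt_history.append(debt)
    return debt_history


unbounded = simulate_releases(6)              # no line: debt accelerates
bounded = simulate_releases(6, debt_cap=5.0)  # line drawn: debt levels off
print([round(d, 2) for d in unbounded])
print([round(d, 2) for d in bounded])
```

In the unbounded run, each release adds more debt than the previous one because the existing debt has already reduced the team’s effective capacity; this is the reinforcing loop described above. With a cap, the debt levels off, and the pressure is pushed back onto scope and deadlines, where it can be negotiated openly.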

7.2 Levels of Software Quality

Shortfalls in software quality can be broken down into known limitations, bugs, technical debt, and architectural debt.

Quality shortfalls fall into four categories (known limitations, bugs, technical debt, and architectural debt), not all equally problematic. This means managers can be selective when deciding which to carry and which to avoid. Let’s see what these look like:

  • Known limitations are usually attached to specific features and limit the range of business functionality available within that feature. The core functionality is ideally delivered, but the nice-to-haves are deferred or postponed. This is not an issue but a pragmatic approach for minimizing time-to-market.
  • Bugs are more serious problems as they can severely impact the application’s service levels. In extreme cases, they may, for example, take down the application, in which case they become critical and must be fixed immediately. Carrying bugs over between releases or into production is unwise; fixing an aeroplane while it’s still on the ground is much easier.
  • Technical debt. When developers compromise code quality for faster delivery, technical debt increases. Examples of technical debt are duplicate code, hardcoded functionality, poor software architecture or design, and insufficient testing. Technical debt is invisible to application users and is, therefore, the easiest to rationalize. However, it can slowly accumulate until it significantly erodes a team’s long-term delivery ability.
  • Architectural debt is the most dangerous on the list as it’s both hidden and strategic. Examples of architectural debt include compromising a product’s architectural integrity by allowing it to support functionality it’s not designed for. Fixing architectural debt can require anything from refactoring a few classes to redesigning core modules. Ideally, you want to avoid this at all costs.

7.3 Draw a Line Below Which Quality Does Not Go

The best way to remedy such situations is to avoid being exposed in the first place. This can be achieved through an open and transparent discussion with business stakeholders every time the backlog is saturated with high-priority items and not waiting for it to overflow. A positive outcome of these discussions means a reshuffling of priorities, deadlines, or resources.

However, there will be times when moving things around (deadlines, scope, or resources) would not be feasible and when the only option left is delivering on time, regardless of the (technical) risk or cost (such as when the organisation’s reputation is at risk). What should managers do in this case?

The short answer to this question is drawing a line below which quality doesn’t go. The long answer can be found in the following paragraphs, where we list a few guidelines and best practices for keeping software quality up to scratch.

7.4 Guidelines for Maintaining Product and Delivery Quality

Though not mutually exclusive, product and delivery quality might collide head-on with scarce resources or aggressive deadlines. Product quality means building the right thing right. Quality delivery means fast and affordable delivery. Here is a list of actions managers can take when struggling with product vs delivery priorities. The list is ordered from top recommended to least.

  • Smooth deliveries through Operational Excellence. A recent survey concluded that roughly 70% of software projects fail to deliver on time and budget. The good news is that 30% did. In his book The Toyota Way, Dr Liker describes how Toyota achieved operational excellence by implementing one-piece flows, pull systems, load levelling, Kaizen, and many other ingenious practices. Operational excellence is more than an ideal state organisations hover over but never attain. The fact that companies like Toyota, IBM, and others have managed to survive for decades in their business is proof that it can be done.
  • Zigzagging your way to excellence. Suppose we could imagine an XY plane where the x-axis measured time and the y-axis quality. In that case, operational excellence can be represented by a straight line at 45 degrees where quality would continuously improve as a function of time. However, more realistic scenarios would show a ragged line with alternating horizontal and vertical segments; the horizontal segments show a dash to market with new features, while the vertical dashes show an increase in overall quality (through refactoring, redesign, or introduction of automation or new tools). The key idea here is to occasionally accept technical debt while simultaneously committing to periodic bouts of maintenance. Tactical decisions favour customer satisfaction through fast delivery, while strategic decisions lean towards continuously increasing product quality.
  • Relaxing delivery constraints. This can only work for products and brands dominating their market niche. The strategy here is to leverage the fact that customers are willing to wait as long as it takes to get a great product. For example, the average waiting time for a new Ferrari could be many months. Still, the customer is happy to wait. There are similar examples in software where, for example, the customer and the supplier are different divisions within the same company and have little choice but to work together.

A fourth option, which must never be applied, is sacrificing long-term quality (carrying architectural debt and ignoring technical debt) for short-term wins. Despite its undeniable irrationality, managers may focus solely on the present if their jobs, bonuses, or careers depend on it. This can be observed in systems where the interests of the individual and the group are misaligned or contradictory. Therefore, the rewards system of incentives, benefits, and promotions must favour strategic goals.
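The zigzag picture from the list above can be made concrete with a small sketch. The function names, the unit-free time and quality scales, and the default step sizes are illustrative assumptions, not part of the original text; vertical segments are idealised as taking no time, exactly as in the XY-plane description.

```python
def zigzag_path(cycles, feature_time=3, quality_gain=3):
    """Return (time, quality) corner points of the zigzag.

    Each cycle is one horizontal segment (a dash to market: time passes,
    quality stays flat) followed by one vertical segment (a bout of
    maintenance: quality rises). Units are arbitrary (assumption).
    """
    t, q = 0, 0
    points = [(t, q)]
    for _ in range(cycles):
        t += feature_time          # horizontal segment: ship features
        points.append((t, q))
        q += quality_gain          # vertical segment: refactor, automate
        points.append((t, q))
    return points


def max_quality_gap(points):
    """Largest distance below the ideal 45-degree line (quality == time)."""
    return max(t - q for t, q in points)


if __name__ == "__main__":
    path = zigzag_path(cycles=2)
    print(path)
    print("largest gap below the ideal line:", max_quality_gap(path))
```

With equal step sizes, the corners after each maintenance bout land back on the 45-degree line, and the gap never exceeds one delivery step. Skipping the vertical segments (the "fourth option") would let the gap grow with every cycle.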

Principle 4: Implement Agile Processes Where Applicable

8.1 What Is Agile and How Will It Help

Interestingly, two decades after the Agile Manifesto was announced, there is still no widespread agreement on what Agile looks like. While everybody agrees that Agile practices must include user stories, sprints, backlogs, and Kanban boards, the basic underlying principles of Agile (and what problems its original creators intended it to resolve) seem elusive.

While Agile training, certifications, and coaches abound, Agile seems to have lost its teeth. In Snowden’s words, Agile has been “domesticated”. The domestication metaphor and Snowden’s subsequent call to rewild Agile deserve some explanation. The curious reader is invited to view any of Snowden’s lectures on Agile or head to our article on Agile’s history and first principles.

So what is Agile, and why is it vital for operational excellence in software development? Agile was born from Extreme Programming (XP), Scrum, and what were known back then as lightweight methodologies. These were presented as an alternative to Waterfall and were designed to solve some of the problems that plagued large software development projects, of which the following two are most relevant to our discussion:

  • Working with unarticulated needs. How can software engineers accurately specify the design of a software application if users keep changing their minds about what they want? Because users can’t predict what technology can do for them, it is only when they start interacting with it (through a prototype, for example) that they refine their requirements. User preferences and the prototype they are working with “coevolve”. Because Waterfall required the design to be fixed upfront, it could not handle requirement volatility, and more “agile” methods were needed. Scrum, for example, is ideal for turning complex problems into complicated ones (in Cynefin terms). Complex problems admit various alternative solutions, whereas complicated problems have engineering solutions that already exist and, once identified, can be easily implemented.
  • Smaller, more frequent deliveries. Waterfall stages are sequential by design, and for large projects, the feedback loop between requirement gathering and user acceptance testing was measured in months, far longer than it should be. Even when the project could be broken up into small features that could be delivered independently, it was more economical to lump them into one large initiative. This, however, meant that customers waited longer than they should for new features. More importantly, project failure rates were extraordinarily high and, due to the size of the projects, extremely costly as well. What was missing was a delivery methodology that allowed big deliveries to be replaced with smaller, faster ones, and this is where Agile shines the brightest. The extra overhead of managing many smaller deliveries was offset by sophisticated tools for source code versioning, orchestration, automation, monitoring, knowledge sharing, collaboration, and test simulation.

Does this mean that Agile is the universal solution to software development projects? Is Waterfall obsolete? Not quite. Let’s look at why this is not the case.

8.2 It’s Not Agile’s Fault That It Can’t Scale

Agile does a fantastic job in small teams. However, because of its high interaction levels and dynamic nature, its cost at scale becomes prohibitive. Consider the following scenarios:

  • Managing large software integration projects. You are trying to deliver a software solution consisting of many platforms with intricate interdependencies between them. The platforms are mature, and specifications could be reasonably determined upfront with enough precision. Furthermore, a new feature might require changes in multiple platforms and, therefore, has to be designed and tested end-to-end. Aside from development and unit testing, which can be done in parallel per application, all other stages of the SDLC (requirement gathering and analysis, architecture and design, system integration testing, and user acceptance testing) must be done as a linear and sequential exercise. In this case, the sequential stages will necessarily follow a Waterfall model, while parallel development on the application level could be conducted with Agile methods.
  • Agile management and organisational strategy. The job of senior management is to make educated assumptions about the company’s future potential and evolution in a fast-changing industry and competitive market. A strategy would then be built on those assumptions, and an action plan would be created from that strategy. The vision, strategy, and action plan would be disseminated gradually among the various divisions, departments, and their teams, outlining the expected contribution of each unit. Units will then set up structures, request funding, purchase tools, hire resources, and expend significant energy to make that happen. With every decision made, the set of possible alternatives diminishes. Imagine an organisation with a few thousand people strategizing and planning in two-week sprints!

It might seem absurd to discuss organisational strategy after everything we said above and in the first part on the challenges involved in predicting an organisation’s future behaviour, let alone controlling it. A closer look, however, would dispel those doubts. Short sprints (a couple of weeks) in small teams seem appropriate. Larger teams can go for 3-6 month sprints, while it’s entirely reasonable for organisations to have 2-5 year plans.

A chisel and a jackhammer are conceptually similar, yet their domains of applicability are entirely different. Any tool or method works great in a specific domain, but when the context shifts, its effectiveness diminishes. This applies equally well to Agile, Waterfall, and anything in between.

Agile can scale, but not necessarily through SAFe, Kanban, or Scrum. A bit more on that in the next section.

8.3 Agile Across the Organisation

The following are some guidelines on how to make the best of Agile in your organisation.

  • Agile is not a mindset or culture and cannot be acquired from crash courses, two-day workshops, or hiring external Agile coaches. Agile has distinctive features like fast and frequent delivery and iterative methods. If these features don’t apply, you are not agile, and that’s OK. Waterfall project management is fine in some contexts, such as major infrastructure projects.
  • Use XP, Scrum, Kanban (or any variation) if your software team is small. This might sound unrealistically easy, and although it’s not, it’s worth the effort. Teams used to working in Waterfall will find the Agile learning curve quite steep. They might also lack the tools, especially in automation, to make it feasible. On the positive side, Agile methods can be broken down into their lowest coherent levels and recombined in different ways to suit new needs. More about this in a second.
  • Leverage knowledge management and collaboration tools. Being agile can be summed up in two ideas: a) a high degree of interaction with different stakeholders, and b) short and frequent deliveries. To efficiently manage agile teams, a high degree of coordination and planning is required, and this is facilitated immensely by collaboration and knowledge management tools like JIRA and Confluence. Delivering complex software with spreadsheets will be daunting and take away the precious effort and time that could be invested more wisely somewhere else.
  • Leverage development, testing, automation, and deployment tools. The extra overhead due to agile teams contributing daily to the master branch and the release team frequently deploying into production can be compensated by automating as much of the manual effort as possible. Development teams should have budgets allocated to continuous integration and continuous deployment tools, infrastructure, and talent. This initial investment is the biggest obstacle that teams must overcome to transition from Waterfall to Agile.
  • Scaling Agile. Although the original creators of the Agile Manifesto came from different backgrounds, its core principles are surprisingly coherent. I believe this coherence stems from the high level of abstraction of the twelve principles. Consequently, no specific implementation of Agile (Scrum, XP, Kanban…) or method (Test-Driven Development, sprint, user story, backlog, retrospective, sprint planning…) embodies all twelve principles, and none is universal or superior to the rest. All these methods and practices have bounded domains of applicability. Agile can only scale by decomposing a specific implementation into its fundamental methods and practices and recombining them in novel ways depending on the context. Think of the Hybrid model, for example, which combines elements of Waterfall and Agile. Theoretically, nothing is wrong with this model, as it embeds powerful Agile practices within a traditional and easy-to-explain Waterfall implementation.

Principle 5: Focus on Business Value

9.1 Customer and Business Value

So far, we have used customer value and business value interchangeably. It’s now time to get the right definitions in place.

  • Customer Value refers to the perceived benefits, advantages, or utility customers and users receive from a product, service, or overall experience a business provides. It focuses on meeting or exceeding customer expectations and addressing their needs, desires, and preferences.
  • Business value refers to the benefits or advantages a business or organization derives from its activities, strategies, or investments. It encompasses the positive impact on the company’s financial performance, operational efficiency, competitive advantage, brand reputation, and overall business goals.

Customer value can be measured in various ways, such as through:

  • Customer satisfaction
  • Loyalty
  • Repeat purchases
  • Referrals

Businesses must understand and deliver customer value as it contributes to customer retention, market competitiveness, and long-term success.

Business value can be measured through various financial metrics, such as:

  • Revenue growth
  • Profitability
  • Return on investment (ROI)
  • Market share
  • Shareholder value

Business value focuses on the outcomes and results achieved by the business as a whole.

While customer and business value are related and interconnected, they represent different perspectives. Customer value primarily focuses on meeting customer needs and creating a positive experience, while business value concentrates on the benefits and outcomes for the company. However, successful businesses recognize the importance of aligning customer and business value, as delivering superior customer value often leads to enhanced business performance and sustainable growth.

Customer and business values are the north stars on the map of operational excellence in software development. They are the centre of mass around which everything else gravitates.

While such a fact seems fairly obvious and does not need further elaboration, the reality is a bit more complex.

  • For starters, not all departments in an organisation are in touch with the customers and may not immediately see the product of their labour. In this case, they might take pride in the sophistication of their work, its quality, or its aesthetics but not in its utility to the business or the end user.
  • Secondly, compartmentalising organisational effort, a necessary mode of operation in large and complex projects, focuses the commercial and financial pressures on specific departments (sales, project management, finance) while shielding others (engineering, production, customer support). In such cases, we again observe other (localized) goals that might, in some cases, be antithetical to the overall business effort.

These complications typically manifest themselves in constant arguments between departments on whether to build the right product, the best product, or that which is easiest to launch in the market. While these discussions might be healthy in some cases, there will be instances where different units pull the cart in opposite directions. Rather than zigzagging towards the ultimate goal, they get stuck in vicious loops.

To remedy this problem, the role of the product manager was put in place. However, looking after a software product from a commercial and customer value perspective is insufficient; technical considerations must also be taken into account to sustain the product. This leads us to conclude that everyone involved in a software product must understand why this software is built (customer value) and why it is important for the organisation (business value).

9.2 Product Management

Product management encompasses the end-to-end process of conceptualizing, developing, launching, and managing products throughout their lifecycle. It involves a multidisciplinary approach, combining:

  • Marketing
  • Technology
  • Design
  • Business strategy

Product managers act as the orchestrators, aligning customer insights, market trends, and organizational goals to deliver value-added solutions. Their roles typically include the following activities:

  • Roadmap Development: A robust product roadmap is essential for guiding product development efforts. Product managers create and communicate roadmaps that outline the product’s future direction, including feature prioritization, release timelines, and resource allocation. Roadmaps are one of the top things organisations look for when assessing a software product for their next purchase. A poor roadmap suggests the product is neglected and risks being sunsetted before the buyer’s investment pays off.
  • Cross-Functional Collaboration: Product managers bridge the gap between various teams, including engineering, design, marketing, and sales. They facilitate effective communication, ensuring that all stakeholders are aligned, and work collaboratively to deliver high-quality products.
  • Product Development and Launch: Product managers oversee the product development lifecycle, collaborating with engineering teams to transform concepts into tangible products. They coordinate testing, iteration, and quality assurance processes, culminating in successful product launches that meet customer expectations.

Many things can go wrong when managing large software products in complex ecosystems. However, the following are the top pitfalls encountered:

  • Feature Overload: Trying to incorporate too many features or functionalities into a product can result in complexity, confusion, and diminished user experience. Product managers should adopt a user-centric approach, delivering core value and prioritizing features based on customer needs and market demands.
  • Lack of Iteration and Adaptability: Failing to iterate and adapt, or neglecting Agile design approaches, can result in outdated products that fail to address evolving customer requirements. Successful product managers embrace feedback loops, encourage experimentation, and continuously monitor market dynamics and changing user preferences to refine and improve their products.

9.3 Minimum Viable Products

The Minimum Viable Product (MVP) is a strategic concept that focuses on delivering:

  • A functional version of a product
  • With the minimum set of features required to satisfy early adopters and gather valuable feedback.

A Minimum Viable Product (MVP) is like a usable prototype put into active service and monitored for insights. An MVP provides the following advantages:

  • Rapid Validation of Ideas: By releasing an MVP, software teams can efficiently test assumptions and validate product ideas early in the development cycle. As Snowden puts it, users don’t know what they want until they understand what the technology can do for them, and the latter may only happen when they start interacting with the product. Developing multiple MVPs or multiple versions of the MVP aligns with the idea of conducting safe-to-fail experiments to probe a complex system.
  • Accelerated Time-to-Market: Developing and launching an MVP reduces the time required to bring a product to market. By focusing on the core functionalities that address mission-critical customer needs, teams can swiftly deploy an initial version of the product, gaining an early foothold and competitive advantage while continuing to enhance and expand its capabilities.
  • Cost Optimization: The MVP approach helps optimize development costs by focusing resources on essential features and reducing unnecessary expenditures. Once the product generates revenue, elaborate functionalities based on customer feedback can be implemented. This is again aligned with Agile design concepts.

There is, however, one common pitfall when working with MVPs: the interpretation of user feedback. Misinterpreting feedback or relying solely on anecdotal evidence can lead to misguided decisions. Teams should leverage robust feedback mechanisms like user analytics and surveys to extract actionable insights and make data-driven improvements.

9.4 Modernizing Legacy Software Products

Legacy software modernization involves upgrading and revitalizing existing software applications to:

  • Leverage modern technologies
  • Improve performance
  • Enhance user experiences
  • Align with current business needs
  • Address technical or design limitations
  • Overcome technical debt
  • Future-proof products while preserving valuable functionality and data.

Modernizing legacy software requires a significant initial investment and immense leadership support for the following reasons:

  • Firstly, the organisation has optimized its processes, structure, and commercials over many years around a successful product which has also raised many of its leading figures into prominence. A significant pushback will be expected when a novel idea or product suddenly threatens to dethrone existing products and reshuffle the power hierarchy.
  • Secondly, customers would rather buy old (and perhaps obsolete) products over acting as guinea pigs for a new platform despite any obvious advantages the latter might possess. In such cases, purchasing managers might prioritize their job safety over what’s best for the organisation. This dynamic raises the market entry barrier for new products.
  • Thirdly, there is no obvious and risk-free method to migrate existing customers without impacting the bottom line or risking customer turnover. It is wise not to cannibalize a successful and mature product for a young and unproven one.

Finally, organisations must distinguish between two motives for upgrading legacy software: on the one hand, its inherent limitations or an imminent threat from the competition; on the other, the desire of technicians in the organisation to switch to the latest and greatest technology stack. In the former case, we are correctly focusing on business value and its future. In the latter, we are succumbing to hype.

9.5 Guidelines for Prioritizing Business and Customer Value

The following guidelines help organisations keep their focus on customer and business value.

  • Have an experienced product manager on board who understands the business, the product, the customers, and the commercial aspects of running a business organisation.
  • Base feature prioritisation on data-driven decision-making. This makes it easier to maintain a sound product roadmap.
  • Subordinate technology to business value. Major investment decisions should have a clear business objective that is easily explainable at any management level.
  • Take the time to bridge departmental gaps and silos through informal networks that allow intergroup communication and awareness. DevOps is a great attempt, for example, to bridge the gap between those who create the software and those who run it. People could then rely on their informal networks in times of need, thus giving everybody the feeling they are genuinely in the same boat.
  • Ensure that local objectives on a departmental level do not seek to maximize the unit’s performance regardless of the cost to the business. This would be difficult in a culture that thrives on departmental feuds or management styles that overestimate the utility of key performance indicators, performance data, and statistical analysis methods.
  • When novel ideas can be turned into viable products, set up a new organisation to develop and promote the new product. This helps remove the friction with old products and existing structures, processes, and powerful stakeholders.

Principle 6: Use Mature and Proven Technology to Serve Your Customers

10.1 Business, Software, and Technology

The complex interplay between technology and business, especially in software, can be abstracted to the following ideas:

  • Technology enables faster delivery and better software quality. One of the obvious ways in which technology contributes to the software business is how well it helps developers do a better job. Through the automation of effort-intensive exercises like building and testing of software, technology allowed Agile and DevOps to become practical. The basic underlying ideas of Agile and DevOps existed long before they became mainstream. Developers, however, had to wait for the tools to catch up before they could implement them.
  • Technology coevolving with customer preferences. Technology plays such an influential role in businesses today, not only because it facilitates production but also because it radically alters our preferences as consumers. Here consumers are not limited to the end users of technological products in the conventional sense (smartphone users, car drivers, internet surfers, cardholders…) but also to software developers and technicians using technological tools. Both types (end users and technicians) have an insatiable desire to use the latest and greatest equipment or products that technology can offer (what in philosophy is called objects of material desire).
  • Managing a technology’s lifecycle. A developer spending sufficient time in an organisation may witness the rise and fall of a specific technology, design, or tool used in software development. Here there are two discernible dynamics (or two strange attractors, in complex systems terminology) in action. On the one hand, the tech staff push for more expenditure and investment in tools, infrastructure, and refactoring. On the other, management tries to maximize the return on investment in the existing technology, infrastructure, and tools. Every once in a while (hopefully at the right moment), the organisation will move from one attractor (new investment) to the other (maximizing ROI).
Two dynamics occur simultaneously in tech organisations: one dynamic involves technicians and customers pushing for technological advancements and upgrades, while business and management push to maximise ROI.

Software, in this sense, can be quite different from manufacturing. For example, Toyota cars benefit from 50 years of incremental improvements in precision instruments, electronics, engine and transmission designs, and aerodynamics. The Ford Model T, widely regarded as the first mass-produced affordable car, was launched in 1908, more than 110 years ago. Software, by contrast, is comparatively young.

Software is also more dynamic. In Part 1 (Operational Excellence and the Structure of Software Development and Delivery), we listed the most notable revolutions in the software industry since its inception, and there are quite a few, especially considering the short lifespan of the industry. By comparison, the automotive industry has progressed through fewer revolutions.

In software, the fast-paced evolution of technology is a constant challenge for all the reasons mentioned above and, therefore, needs special attention. The response to this challenge is in the sixth principle of Operational Excellence in Software Development: Use Mature and Proven Technology to Serve Your Customers.

Let’s examine this statement in more detail by exploring the pros and cons of the (sometimes premature) adoption of novel tech.

10.2 Mature vs Novel Tech, Which Is Best?

Unless you work in high-tech mega-corporations like Google or Tesla or in Research and Development projects, relying on mature and proven technology is generally the safer and more efficient approach.

The advantages of adopting such an approach are readily apparent:

  • First and foremost, utilizing mature technologies ensures easy access to a vast pool of resources with extensive knowledge and expertise. This becomes particularly crucial when seeking specific skill sets that may not be readily available within a niche market. Additionally, abundant online help, support, and documentation can be found for ubiquitous and mature technologies, further facilitating learning and troubleshooting.
  • Secondly, mature technologies often come equipped with industry-standard interfaces, allowing seamless integration with third-party systems and protocols. This interoperability greatly simplifies the development and implementation of complex solutions, enabling businesses to leverage existing infrastructure and tools.
  • Furthermore, relying on mature technologies contributes to the stability and reliability of your products. Over time, most bugs and issues have been identified and addressed, resulting in a more robust and dependable system. This, in turn, enhances customer satisfaction, reduces downtime, and minimizes the need for costly bug fixes or continuous maintenance.
  • Lastly, employing mature technologies promotes effective communication with customers and peers. By utilizing a standard dictionary of terms and definitions prevalent in the industry, misunderstandings and confusion can be minimized. This streamlined communication fosters collaboration and ensures everyone involved is on the same page, accelerating decision-making processes and driving overall efficiency.

While some may perceive this approach as lacking excitement or novelty, it is undeniably a safe, dependable, and profitable strategy. Let’s explore some concrete examples to illustrate these advantages further:

  • In the banking sector, major financial institutions rely on established technologies like mainframe systems to handle critical transactions. These systems have been refined and hardened over decades, making them incredibly reliable and resistant to failures. The stability of mature technology allows banks to conduct millions of transactions securely and efficiently, instilling confidence in customers and regulators.
  • In the web development industry, consider the adoption of open-source content management systems (CMS) like WordPress. With a large and active community of developers and extensive documentation, WordPress provides a robust foundation for building websites quickly and effectively. The availability of numerous plugins and themes further expands its functionality and customization options. By choosing a mature technology like WordPress, developers can tap into a wealth of knowledge and resources, ensuring efficient project delivery and seamless collaboration.
  • In the automotive industry, manufacturers rely on standardized protocols and interfaces like CAN (Controller Area Network) for electronic communication between various vehicle components. This industry-wide adoption of a mature technology allows different subsystems, such as the engine, transmission, and infotainment system, to work together harmoniously. It facilitates interoperability, enabling the integration of third-party components and reducing development time and costs.

These examples highlight how mature technologies bring tangible benefits across diverse sectors. By leveraging established solutions, businesses can capitalize on their wealth of resources, stability, and compatibility, resulting in safer, more efficient operations and ultimately driving profitability.

10.3 Guidelines for Working with Technology

The following are some best practices that can be used to great effect when dealing with technology:

  • Educate staff on the interplay between technology and business. This prevents staff from pushing for unreasonable investments in technical assets while allowing managers to dispel misconceptions about what new technologies (like cloud computing or micro-services) can and can’t do.
  • Allocate budget for maintaining existing technology assets. This budget will cover tools, licenses, infrastructure, and, most importantly, refactoring. While poorly maintained infrastructure can show immediately through severe service degradation, the problems emanating from mounting technical debt take a bit of time to express themselves, and by then, it might be too late.
  • Selecting a tech stack. Select a tech stack that does the job, is widely supported by the community, and has a long history (due to all the advantages that maturity brings). A good heuristic for estimating the life expectancy of a specific technology (paraphrased from Fooled by Randomness) is to look at its age. For example, if a programming language is 10 years old, expecting it to live for another 10 years is not unreasonable.
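The age heuristic in the last guideline reduces to simple arithmetic: expected remaining life is roughly equal to current age. The sketch below applies it to a set of candidates. The function names, the placeholder stack names, and the release years are hypothetical, chosen purely for illustration, and the result is a rough rule of thumb rather than a forecast.

```python
def expected_remaining_years(first_release_year: int, current_year: int) -> int:
    """Age heuristic: a technology that has survived N years
    can reasonably be expected to survive about N more."""
    age = current_year - first_release_year
    if age < 0:
        raise ValueError("first_release_year lies in the future")
    return age


def rank_by_expected_life(candidates: dict[str, int], current_year: int) -> list[tuple[str, int]]:
    """Sort candidates (name -> first release year), longest expected life first."""
    estimates = [
        (name, expected_remaining_years(year, current_year))
        for name, year in candidates.items()
    ]
    return sorted(estimates, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical candidates; names and years are placeholders.
    stacks = {"stack_a": 2020, "stack_b": 1995, "stack_c": 2010}
    for name, years in rank_by_expected_life(stacks, current_year=2025):
        print(f"{name}: roughly {years} more years")
```

Age is only one input. The guideline also asks that the stack does the job and is widely supported by the community, so a ranking like this should narrow the shortlist, not make the decision.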

11. Have We Missed Anything?

Yes: communication, leadership, group psychology, norms, culture, organisational behaviour and theory, interpersonal relationships, conflict resolution, resource allocation and management. The list could stretch quite a bit further. So where does Operational Excellence stand on these points, and how do these issues, in turn, affect an organisation’s ability to achieve high execution standards? We will try to answer these questions through a chess metaphor and some basic game theory concepts, specifically on infinite games.

11.1 Operational Excellence and Middle Game Mastery

In chess, the game can be divided into three main phases: the opening, the middle game, and the end game. Each phase has its own goals and strategies. We can think of organisational development and evolution along similar lines.

The opening game is similar to the founding stages of the organisation or group where culture, norms, and processes are created (initial conditions in complex systems parlance). These initial conditions have a long-standing impact on the group’s future development. The goal of the “opening game” in organisational development is to create a great culture, identify your value proposition, hire the right talent, and work on a strategy for growth and development.

During the middle game, the goal is to maximize productivity and exposure to serendipity, ward off external threats, maintain the group’s integrity, and leverage new opportunities. These can be achieved through the six principles of Operational Excellence in Software Development. However, the opening game must have prepared the ground for these six principles to be successfully implemented.

Operational Excellence is particularly out of place in the following situations:

  • Formative group stages. This is where team members have immediate concerns about their feelings and roles within the group and where they are not yet ready to progress towards a common goal.
  • Times of instability. In times of high instability (mergers, acquisitions, natural disasters, the emergence of disruptive technologies, and fierce competition), the overriding concern is short-term survival.

What about end games? The objectives and strategies can vary in chess endgames depending on the position and material balance. They typically include: checkmating the opponent’s king, promoting pawns, gaining a material advantage, and keeping one’s king safe. Organisations have no end games; therefore, the rules of play differ.

11.2 Short-Term Wins, Operational Excellence, and Infinite Games

In game theory, infinite games extend indefinitely over time, whereas closed games are finite with a specific endpoint. These two types of games differ in duration, strategies, and potential outcomes.

  • Closed games, or finite games, have a defined and fixed endpoint. The players in a closed game have limited moves or rounds before the game concludes. Examples of closed games include chess, tic-tac-toe, or a single hand of poker. In closed games, the strategies employed by players typically focus on achieving victory within the finite time frame.
  • In contrast, infinite games have no fixed endpoint and can continue indefinitely. Ongoing and evolving interactions between the players characterize them. Examples of infinite games include negotiations, business competition, or the interaction between countries in international relations. In infinite games, players often adopt strategies prioritising long-term sustainability, adaptation, and maintaining a competitive advantage over time.

Anyone running a business, especially one that aspires to Operational Excellence, must consider the following points:

  • Business is an infinite game. Although businesses have a lifecycle with a definite beginning and end, the lifetime of an organisation is, on average, long enough relative to the tenure of its employees to be considered infinite. The objective is, therefore, to remain in the game (sustainability) for as long as possible, not to secure short-term wins that do not contribute to that objective or, in some cases, undermine the organisation’s ability to sustain its existence.
  • Operational Excellence is not played at a tactical level. To illustrate this point, consider Toyota’s attitude towards employment and leadership. Toyota and other Japanese firms have traditionally offered lifetime employment to their employees, meaning employer and employee are closely bound and either grow or stagnate together. Toyota also prefers to nurture leadership talent internally rather than rely on outsiders in times of need. Growing leaders internally is likewise a long-term strategy with long-term objectives.

12. Conclusion

Writing down six principles and innumerable ideas, concepts, and theories on how something ought to be done is very different from putting those thoughts into practice. One cannot help but feel powerless when contemplating the job’s immensity or the intimidating height of its entry barrier.

This feeling of being overwhelmed calls into question the fruitfulness of such an endeavour and raises the possibility that people are better off focusing on low-hanging, isolated, and localised objectives in the hope that a global, cumulative, and incremental improvement will soon become noticeable. This reasoning is behind the concept of “managing in the present”, a fundamental approach to working with complex adaptive systems, as we have seen earlier.

Maybe the Toyota Production System and its version of operational excellence would not have been possible outside of Japanese culture. Maybe Schein’s organisational culture and processes model does not apply effectively outside Western cultures. This line of inquiry again pushes the issues of context and cross-industry learning to the forefront.

While the answers remain open, we have little choice but to hang on to what we believe is a consistent framework (and structure) for how software development should be carried out. Just as democracy became the norm only a few centuries ago, despite its fundamental concepts having been elucidated more than two thousand years earlier, we hope the young software industry will eventually adopt a similar framework. Whether it settles on the six principles above or on new ones altogether does not matter.

As far as this author is concerned, having a working compass and knowledge firmly rooted in theory and practice is crucial. This knowledge gives meaning to our actions and allows us to make sense of our experiences, while the compass guides our future steps. We hope the reader has found similar insights in these ideas.
