Principles of Operational Excellence in Software Development

Georges Lteif

Software Engineer

Last Updated on January 19, 2023.


1. Overview

Operational Excellence has its roots in the automotive industry. It started in the 1950s when post-war Japanese car manufacturers struggled to catch up with the potency of mass production – a method in which their giant American competitors had a significant edge.

Henry Ford deployed the assembly line for the first time in 1913. It was capable of mass-producing an entire vehicle.

With massive resources and a vast market at Ford’s disposal, there was very little the Japanese automakers could do to compete.

The situation, however, changed drastically in the late 1980s, and the Japanese carmakers, especially Toyota, were doing significantly better than their American counterparts.

For the next few decades, the Japanese produced markedly better cars at highly competitive costs. Their market share kept increasing year on year.

To make that happen, Toyota and other Japanese manufacturers deployed a strategic weapon: something that would later be known as Operational Excellence.

Toyota then moved on to perfect its production methods and management model.

While the former was subsequently known as the Toyota Production System (or TPS), the latter became The Toyota Way. Both heavily inspired many industries.

The Toyota Way and the Toyota Production System were so efficient that professionals from different industries started looking at incorporating their ideals into their own businesses.

Think of Kaizen, Kanban, and 5S. All three originated at Toyota and carried its trademark.

Although on the surface, car manufacturing bears minimal resemblance to the development of software products, there is enough overlap to warrant a serious discussion.

We believe software development is well-poised to benefit from various concepts successfully applied in the car business. This article will look at how software development and delivery can profit from the philosophy and principles of Toyota.


2. Challenges of Delivering Software

Let’s examine why new solutions are required for issues in a 70-year-old industry.

2.1 Types of Challenges

The challenges organisations face can be divided into shared and industry-specific.

Challenges of Software Delivery

Shared challenges are heavily centred around the human element and can include any of the following:

  1. Organisational culture
  2. Group behaviour and integration
  3. Effective leadership
  4. Stakeholder management
  5. Efficient collaboration
  6. Team motivation and morale
  7. Group and individual performance
  8. Dealing with threats and opportunities, both internal and external
  9. Risk management
  10. Recognizing and dealing with complexity

Industry-specific challenges are mainly technical. Examples include:

  1. Technical decisions on tools and infrastructure
  2. Creating efficient production processes
  3. Choosing a suitable project delivery methodology
  4. Dealing with novel technologies
  5. Encouraging innovation and creativity
  6. Implementing an effective software development lifecycle (SDLC)

Choices you make will create risks and opportunities, and balancing these aspects sustainably turns software development into an interesting adventure.

The flourishing of tools and methods, such as the multitude of Agile and DevOps variants, Six Sigma, Lean, etc. and their commodification and ritualization, placed enormous pressure on software professionals to conform to the most fashionable (rather than the most effective) ideas of the time.

You can’t be expected to perform with your tools and code working against you.

2.2 Why Software Projects Fail

A significant portion (around 80%) of IT projects either finish late, over budget, or with less business value than initially promised. Some projects never even make it into production.

Sources such as Forbes magazine, a paper published by the Association for Information Systems, and an article published by PMI give the following top reasons for IT project failures:

#  | Reason Why Projects Fail
1  | Unclear business requirements
2  | Mismatch between the solution design and implementation on the one hand and the business requirements on the other
3  | Substandard planning and execution and lack of project management skills
4  | Lack of standardised production processes
5  | Working in silos
6  | Lack of agile practices
7  | Friction caused by undefined roles
8  | Lack of discipline
9  | Inadequate testing and quality assurance
10 | Focusing on technical detail instead of business value
Top 10 Reasons Why IT Projects Fail

Agile was supposed to take over from Waterfall (at least where applicable) as the superior project delivery methodology, but that has not happened, even two decades after its inception.

Participants in the 2021 State of Agile Report survey provided the following answers regarding the slow adoption of Agile in their organisations:

# | Reason | Per cent of Participants
1 | Inconsistencies in production processes and practices | 46%
2 | Cultural clashes | 43%
3 | General organisational resistance to change | 42%
4 | Lack of skills and experience | 42%
5 | Absence of leadership participation | 41%
6 | Inadequate management support and sponsorship | 40%
Reasons behind the Slow Adoption of Agile

3. Addressing Software Delivery Challenges

3.1 A Problem-Solving Framework

While there is no guaranteed recipe for solving the vast array of issues presented above, there are heuristics that we can build into a framework. This framework will help us think about the problem and identify proper solutions.

We propose the following three steps to approaching any task:

  1. Step 1: Proper diagnosis and root cause analysis
  2. Step 2: Leverage Other People’s Experiences (OPE)
  3. Step 3: Create mechanisms that can deal with novel or atypical future challenges

We discuss these steps in the next sections.

Step 1: Root Cause Analysis

The omnipresence of the human element in (almost) every challenge listed above brings about enough complexity to rule out any obvious and straightforward solutions. This fact is, undoubtedly, the first to recognize if you are serious about change and improvement in your organization or group.

You must read the literature on group behaviour complexity and the technical challenges specific to software development to make sense of your experiences.

Binary views (good vs evil, skilled vs non-skilled, suitable vs not suitable) must be cast aside in favour of a graduated spectrum where two conflicting attributes are desirable in a specific ratio. For example, you want enough flexibility to allow for innovation, but you also want some order to avoid chaos.

These ideas heavily apply to the below topics:

On each occasion, we went back to the source to understand the circumstances that led to the creation of a specific idea or concept. We also looked at some retrospectives from influential opinion leaders who got their hands dirty trying out their ideas.

The research effort allows for an informed, evidence-based opinion on what works and does not. In most cases, firsthand experience was added to the mix.

Step 2: Other People’s Experience

Two practices have been applied historically with magnificent results:

  • Cross-industry knowledge sharing
  • Synthesizing new solutions from old ones

Cross-industry knowledge sharing is not easy; it requires a lot of poking around, an open mind, and no small amount of luck. The latter helps you stumble upon precious sources of knowledge where valuable techniques can be transferred from a different field or industry to your own.

Earlier, I divided software delivery challenges into two categories: shared problems mainly centred around human groups and industry-specific problems of a more technical nature.

We will lean on the Toyota management principles for inspiration on issues of a slightly technical nature, such as process design, process improvement, and quality management.

Moving solutions across borders is called creative problem-solving, and we have used this technique extensively when looking at Toyota and Six Sigma practices.

As we shall see later, Toyota has developed and deployed articulate and highly effective strategies for addressing quality and performance issues, with excellent results. Six Sigma has also taken proven techniques like DMAIC to a new level.

Synthesizing new solutions from old ones is a great strategy when you start observing cracks and crevices in your traditional methods.

The best example I can think of is the Hybrid project management model; it is neither Waterfall nor Agile, but many people appreciate its power and suitability in specific contexts.

The Hybrid model is Waterfall at heart, supplemented heavily with Agile techniques such as User Stories and Kanban boards. Exactly how Agile or how Waterfall a Hybrid implementation turns out to be depends on the implementers and their specific conditions.

This flexibility for combining elements from various spaces allows the emergence of tailor-made solutions.

Step 3: Dealing with Novelty

Many giants of the tech world (Xerox and Kodak, to name a few) lost their dominance or went under because they failed to recognize and deal with novelty.

People who successfully deal with new threats can quickly appreciate the significance of conflicting new data. They also have enough courage to acknowledge that something new is unfolding.

In some situations, incremental improvements are enough; more radical situations may require a cultural transformation.

Incremental improvements result from tweaking existing processes, sometimes called process redesign or reengineering.

Cultural transformations are traumatic experiences where leaders must unlearn old models and replace them with new ones.


3. Operational Excellence in Software Development

3.1 Approach

The Toyota Way involves 14 management principles implemented in the Toyota Production System (TPS).

Some of these principles, like the Just-in-Time (JIT) system, are very specific to car manufacturing, and although they would make for a great read, their utility does not readily come across in the software space. Other principles, like Genchi Genbutsu, will be incredibly beneficial.

Six Sigma is another place to look for solutions. Despite the limited utility of strategies that heavily depend on data and statistics in small to medium-sized businesses, the backbone of Six Sigma (the DMAIC problem-solving technique) is a valuable addition to our toolbox.

3.2 Objective

So why Operational Excellence? What exactly is it about, and what does it aim to achieve?

In summary, Operational Excellence describes the ability of an organisation to conduct its operations flawlessly. In the context of this website, the “operations” we are primarily interested in are the delivery of software solutions.

Origins of Operational Excellence

We have collected seven insightful quotes from leading figures in the business management and quality fields that we hope will give you a strong sense of what drives Operational Excellence.

3.3 Implementation Plan

Achieving Operational Excellence in Software Development requires us to follow the same three-step plan we outlined earlier (root cause analysis, finding creative solutions, and setting up governing processes).

Operational Excellence
Major Actors in a Software Business

As part of the root cause analysis, we need to figure out where the challenges originate, not just the symptoms we listed earlier.

Is it the people, the production processes, the technology stack, or some combination of those elements that make up the origin of delivery challenges?

The answer is a combination of all three. This statement is relatively accurate in most cases. Let’s have a closer look at each of these categories in turn.

3.3.1 People and the Organization’s Culture

At the core of every organization is a team of professionals, and this team is typically divided into management, technical, and administrative staff.

The team can be newly formed or might have a long-standing history. In the former’s case, the first formative months are decisive as this is when the group’s culture is created. In the latter’s case, the culture is already established, and its transformation in response to new threats can be particularly interesting to study.


We accept challenges with a creative spirit, and the courage to realize our own dreams without losing drive or energy. We approach our work vigorously, with optimism and a sincere belief in the value of our contributions.

In both cases, Operational Excellence cannot be attained without a relentless drive for continuous improvement by everybody, and maintaining constant momentum requires keeping the employees engaged and motivated.

Employees are usually motivated when:

  1. Their ultimate goal as a group gives meaning to their efforts (such as benefiting society and not just improving shareholder profits).
  2. They feel emotionally secure (as when their opinions are valued and their contributions appreciated).
  3. They have the opportunity to learn and grow.

Perhaps one of the most insightful bodies of work on satisfaction is that of the Hungarian-American psychologist Mihaly Csikszentmihalyi, who coined the term flow in the 1970s when discussing motivation.

Flow is experienced during:

  1. Intense focus
  2. On an activity that we choose
  3. That is neither too far above nor below our skill levels
  4. With clear objectives
  5. And immediate rewards

When approaching the human element in organizations, it is always helpful to remember two things.

  • First, hierarchies are complex systems and cannot be viewed through a simple mechanistic model. This characteristic of human groups means that making decisions, planning policies, and implementing strategies is not as easy as we would like it to be. Strong internal forces in complex systems make predicting the group's behaviour, among other things, very challenging.

Frameworks like Six Sigma systematically ignore the human element when addressing organisational performance issues and focus more on the technical aspects.

The Toyota Production System is quite the opposite, with 14 principles spanning the whole spectrum of issues to be tackled.

3.3.2 Process Management

A production process is a set of instructions applied to raw input to produce an output with increased business value over the inputs. These processes must be managed for efficiency (maximized value over cost) and effectiveness (maximal impact).

Production processes are usually set up in a specific context and can therefore become obsolete when the circumstances change. A process owner must monitor these processes and look for signs of degeneration and lowered performance. She will then implement methods to improve or replace them.

Process Management, Improvement, and Redesign

There are three stages to managing production processes:

  1. Design: During the process design phase, the process owner looks at the specific tools, infrastructure, and skills present in the team and designs processes fitted for this context.
  2. Incremental Improvement: This is an ongoing activity that provides remedies to issues arising from a process failure. Issues of such nature occur when the process is not correctly implemented, not followed, or inherently flawed. For example, a bug in a system interface that makes it into production may result from process failure. In this situation, improved measures for quality assurance can be implemented.
  3. Redesign: Process redesign is what you do when the processes become obsolete or require significant modification to continue doing their job. For example, a systematic failure to deliver quality products signifies outdated frameworks or methodologies rather than simple process failures. Moving to Agile or DevOps is one example.

The people at Toyota believe that following the right processes will always lead to the correct results. For them, it is preferable that you get incorrect results with the proper procedures over good results through flawed processes; you can only get lucky so many times.

3.3.3 Choosing the Right Technology

Going back to the Venn diagram at the top of this section, we see a tight connection between technology on the one hand and people and processes on the other.

Let’s explore these relationships through the following ideas:

  • Technology as a Strategic Choice: The selection of the technology stack, tools, and infrastructure is a strategic decision for two reasons. First, changing it once you acquire a few clients is extremely hard. Second, it directly influences many other strategic drivers, such as the ability to integrate your product with other technologies and keep up with innovation. For example, legacy mainframe applications still run core banking systems, at considerable cost and with considerable limitations for the client.
  • Technology and Processes: DevOps is a perfect example of the amalgamation of processes, technology, and tools. Choosing to offer Continuous Delivery through a DevOps model influences stack selection, solution architecture, software design, solution design, testing procedures (automation, quality assurance), technical debt management, deployment, and monitoring. A suitable choice of technology and compatible tools can make these more or less challenging.
  • Technology and People: The technology you use, its maturity, ubiquity, and popularity will dictate the size of the available pool of resources from which you can choose. Needless to say, a larger pool is preferable as it drives down the costs of resources and allows for hiring skilled and experienced talent ready to start delivering. Using legacy technology, despite its ubiquity, may not attract talent as people look for new challenges to broaden their horizons.

4. Core Principles of Operational Excellence in Software Development

We chose six principles to represent the core ideas of Operational Excellence in Software Development.

Core Principles of Operational Excellence in Software Development

4.1 Principle 1: Eliminating Cultural Blockers

The 15th State of Agile Report and the 2021 State of DevOps Report cite cultural blockers as primary challenges in adopting Agile or DevOps.

It therefore makes sense that the first principle of Operational Excellence would focus on managing cultural blockers that impede growth and success.

We say managing and not eliminating as the latter would be too difficult, if not impossible, as we shall see in a moment.

So what are those cultural challenges? Let’s have a look at some examples.

4.1.1 Resistance to Change

Cultures are cognitive constructs deployed as defensive mechanisms that help groups cope with internal integration challenges and challenges posed by their environment.

Examples of internal challenges are the group’s cohesion, integration of new members or ideas, and replacing old ones.

External pressures can come from novel technologies, new customer requirements, or competitors. Any environmental change can be a threat (or opportunity) to the business. If the group is to survive, it needs to adapt and respond to those threats in new ways.

Its leaders and members must unlearn old methods and learn new ones. This process is traumatic and does not necessarily succeed all the time.

A culture that recognises and acknowledges change as a constant of nature and has the psychological safety to carry out cultural transformations is much better adapted for survival.

4.1.2 Stakeholder Management

Stakeholders are people in the organisation with the power to advance or hinder its progress. Any business strategy is bound to fail if it does not adequately analyse and manage its stakeholders.

It is also essential to recognise that organisational hierarchies are complex systems, making them difficult to control and predict.

Power distribution among the stakeholders and their desire and ability to support/undermine your plans must be determined before the project kicks off and closely monitored during execution.
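
As a rough illustration of the stakeholder analysis described above, the sketch below scores each stakeholder on power and on support for the plan, then maps the pair to a coarse engagement strategy. The scales, the threshold, the strategy labels, and the sample stakeholders are all assumptions made for this example; they are not a standard model and would need tailoring to your organisation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stakeholder:
    name: str
    power: int    # assumed scale: 1 (low) to 5 (high) ability to support or undermine the plan
    support: int  # assumed scale: -2 (actively opposed) to +2 (actively supportive)


def engagement_strategy(s: Stakeholder, power_threshold: int = 3) -> str:
    """Map a stakeholder to a coarse engagement strategy (illustrative only)."""
    influential = s.power >= power_threshold
    if influential and s.support < 0:
        return "address concerns first"
    if influential:
        return "keep closely involved"
    if s.support < 0:
        return "monitor"
    return "keep informed"


# Hypothetical stakeholders, assessed before the project kicks off.
stakeholders = [
    Stakeholder("CTO", power=5, support=2),
    Stakeholder("Head of Operations", power=4, support=-1),
    Stakeholder("QA lead", power=2, support=1),
    Stakeholder("Legacy team lead", power=2, support=-2),
]

for s in stakeholders:
    print(f"{s.name}: {engagement_strategy(s)}")
```

Re-running the same assessment at regular intervals during execution is one simple way to monitor shifts in power or support.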

Implementing long-term strategic improvement plans (such as Agile or DevOps) will require patience as the results will not be immediately evident. Therefore, continuous support from the leadership over the plan’s lifecycle is vital.

4.1.3 Organisational Culture

Organisational culture determines how groups perceive time and space, dictates the reward and punishment system, and imposes group inclusion or exclusion rules.

For example, a culture that encourages hierarchical boundaries, punishes failures and constantly seeks stability and equilibrium does not necessarily facilitate innovation or the adoption of new ideas.

The perception of time is another example of how organisational culture governs the evolution of a group. A culture that focuses on the present may discourage spending time on experimentation and improvement if the results are not immediate.

Time management (beyond the to-do list) manifests in the dominant culture.

4.1.4 Continuous Learning and Self-Improvement

Engineers generally have a particular way of looking at things, shaped by their formal training. The world, in their view, can be modelled as a machine with inputs, outputs, and rules that govern its dynamics.

The natural world’s complexity, especially in organisational hierarchies, fades away due to the reductionist methods applied in the mechanistic model.

One way of combatting such tendencies is through studying philosophy, cultural anthropology, and organisational theory. This knowledge will inevitably lead to a radically different perception of the world, for example, using an informational paradigm instead of a mechanistic one.

A never-ending learning journey increases one’s awareness of their cognitive limitations, skewed perception of the world, and biased judgements.

This awareness is humbling and empowering, allowing them to make better decisions.

4.2 Principle 2: Process Management and Governance

The traditional discussions on production processes typically revolve around the following questions:

Here are some suggestions to tackle these questions:

  1. Industry best practices: Follow industry best practices when setting up your processes. Despite every project and organisation being unique in some aspect of their operations, some ground rules are robust and universal enough to remain efficient despite the variations. Standardise processes across all your teams, customers, and projects. With standardisation, it is easier to gauge the impact of any improvements you make and validate the results. It's also easier to switch your staff between teams and projects and onboard new members.
  2. Accountability: Appoint a process owner who oversees process performance and improvement. Their job would be to investigate failures and decide whether they are systemic (because the process is inadequate or flawed) or user error (because someone failed to follow the guidelines).
  3. Improvement: Great tools for process design, improvement, and redesign are available today (DMAIC, Six Sigma, Kaizen, Hansei). Attempt to solve issues with technical tools and evidence-based arguments instead of opinions and thoughts. This approach helps keep the discussion productive and non-personal.
  4. Cultural blockers and resistance to change: Expect to face at least two types of blockers. Cultural blockers result from applying processes that are incompatible with the group's culture. Resistance to change, on the other hand, is characteristic of complex systems, of which hierarchical human groups are an example.
  5. Coverage: The situations where an employee would be equipped with a written and well-defined process for handling them will always be limited. This limitation is due to the novelty and complexity of today's software solutions and projects. Under these circumstances, employees must use common sense, expertise, and personal initiative to determine the best course of action.
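
To make the accountability suggestion a little more concrete, here is a minimal sketch of how a process owner might triage failure records into systemic and user-error cases. The `Failure` record, its `guidelines_followed` field, and the threshold of two occurrences are hypothetical choices for illustration; a real investigation would rely on much richer evidence.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Failure:
    process: str
    guidelines_followed: bool  # hypothetical field, filled in during the investigation


def classify(failures: list[Failure], systemic_threshold: int = 2) -> dict[str, str]:
    """Label each process as 'systemic' or 'user error' (illustrative heuristic)."""
    # Count failures that occurred even though the guidelines were followed.
    followed = Counter(f.process for f in failures if f.guidelines_followed)
    labels = {}
    for process in sorted({f.process for f in failures}):
        # Repeated failures despite compliance point at the process itself.
        if followed[process] >= systemic_threshold:
            labels[process] = "systemic"
        else:
            labels[process] = "user error"
    return labels


# Hypothetical failure log.
log = [
    Failure("code review", guidelines_followed=True),
    Failure("code review", guidelines_followed=True),
    Failure("release checklist", guidelines_followed=False),
]
print(classify(log))
```

The point of the sketch is the distinction itself: failures that recur while people follow the guidelines call for process improvement or redesign, whereas failures caused by skipped guidelines call for training or better enforcement.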

4.3 Principle 3: Draw a Line Below Which Quality Does Not Go

When you have aggressive deadlines or your project is running out of time, there are only four things you can do:

  1. Reduce the scope
  2. Extend the deadline
  3. Commit additional resources
  4. Sacrifice quality
Software Project Constraints

While the first two options are your best bets, and the third option is not always feasible, the fourth option (sacrificing software and testing quality) is what most people consider an easy way out.

There is a good explanation for why this can be appealing, and that’s because the results of sacrificing quality today will only be evident in the future. They will end up being someone else’s responsibility.

Sacrificing quality increases technical debt, creating a drag on your delivery capabilities. If left uncontrolled, it will transform your product from a lucrative asset to a liability.

Organisational Behavioural Pattern of Eroding Goals

In the long term, sacrificing quality will have detrimental and irreversible effects on your business. This organisational behavioural pattern of Eroding Goals was documented in Peter M. Senge's book The Fifth Discipline (1990).

But what if postponing the delivery, reducing the scope, or adding resources is impossible, and the only open option is to lower your quality standards?

The answer is: draw a line below which you don’t go. This threshold is where the risks will start to outweigh the benefits.

Here are some examples that help illustrate the point.

  1. Postpone or decline non-strategic and non-profitable projects. Focusing on lucrative or strategic customers is vital, especially during a fast-growing period.
  2. Reduce or eliminate non-essential activities such as internal documentation, refactoring of old code, or daily meetings for smaller projects. Create different project delivery guidelines depending on project size.
  3. Carry over non-critical bugs into future releases. Manage your technical debt efficiently.
  4. Make software drops frequently so the customer can start testing and uncovering issues as early as possible.
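
One way to make such a line explicit is to encode it as a release gate that is checked before every drop. The sketch below is only an illustration: the `ReleaseCandidate` fields and both thresholds are assumptions, and each team would choose its own measures and limits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReleaseCandidate:
    critical_bugs: int
    minor_bugs: int
    test_pass_rate: float  # fraction of automated tests passing, 0.0 to 1.0


# Hypothetical thresholds: the "line" below which quality does not go.
MAX_CRITICAL_BUGS = 0
MIN_TEST_PASS_RATE = 0.95


def can_ship(rc: ReleaseCandidate) -> bool:
    """Minor bugs may be carried over; critical bugs or a low pass rate block the release."""
    return (
        rc.critical_bugs <= MAX_CRITICAL_BUGS
        and rc.test_pass_rate >= MIN_TEST_PASS_RATE
    )


# Seven minor bugs are carried over, yet the release stays above the line.
print(can_ship(ReleaseCandidate(critical_bugs=0, minor_bugs=7, test_pass_rate=0.97)))
# A single critical bug blocks the release regardless of the deadline.
print(can_ship(ReleaseCandidate(critical_bugs=1, minor_bugs=0, test_pass_rate=1.0)))
```

Writing the thresholds down, in code or otherwise, makes any later attempt to lower them a visible decision rather than a silent erosion.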

Ideally, it is best to avoid these situations altogether. You can do that with better planning, efficient processes and resource allocation, and being open and honest with your clients.

4.4 Principle 4: Implement Agile Processes Where Applicable

Once you strip away the hype accumulated around Agile and DevOps, you will find that their adoption offers immediate and substantial gains.

Granted, Agile is not for every project. Still, we can safely assume that of the 12 principles it advocates and the various practices that have flourished around it, there is a universal and generic set that can be applied everywhere.

Let’s look at the most powerful of these principles and practices.

  1. User Stories: User stories have brought the customer back to the forefront of the stage. Technical staff can easily get distracted from what is essential and valuable for the customer by purely technical matters to which the customer is oblivious.
  2. Face-to-face interactions over documentation: In this day and age, where teams are operating across different continents and time zones, it is easy to get sucked into excessive dependency on online collaboration tools such as emails and instant messaging. In these virtual worlds, misunderstandings proliferate, and progress is unnecessarily slow.
  3. Frequent iterations: In the pre-Agile world, customers would have to wait months before setting their eyes on the new software. It was often the case that what they got was substantially different from what they had expected. Agile tries to resolve this problem with frequent software drops that allow valuable information to find its way back to the developers, who can then adjust their requirements and design. The gains achieved are not small when we consider that unclear requirements are one of the top reasons for IT project failures.
  4. Automation: Automation tools were not mature enough at Agile's conception, but that has changed. Automation is a necessary tool for Agile to succeed, and it is impossible to release frequently and reliably without automating repetitive work, especially around building and testing.
  5. Agile Architecture: Architecture is "what is important and hard to change". So how does that work with Agile, where change is always welcome? Agile architecture is about shifting the question from "What is the best design I can come up with?" to "How can I design my architecture to make future changes feasible and easy to introduce?"
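
The Agile Architecture idea of designing for future change can be illustrated with a small sketch in which business logic depends on an interface rather than on a concrete implementation. The notification example and every name in it (`Notifier`, `EmailNotifier`, `SmsNotifier`, `OrderService`) are hypothetical and chosen purely for illustration; this is one common technique among many, not the definition of Agile architecture.

```python
from typing import Protocol


class Notifier(Protocol):
    """The stable interface that the rest of the system depends on."""

    def send(self, recipient: str, message: str) -> str: ...


class EmailNotifier:
    def send(self, recipient: str, message: str) -> str:
        return f"email to {recipient}: {message}"


class SmsNotifier:
    def send(self, recipient: str, message: str) -> str:
        return f"sms to {recipient}: {message}"


class OrderService:
    """Depends on the Notifier interface, not on a concrete channel."""

    def __init__(self, notifier: Notifier) -> None:
        self._notifier = notifier

    def confirm(self, customer: str, order_id: int) -> str:
        return self._notifier.send(customer, f"order {order_id} confirmed")


# Switching the channel requires no change to OrderService.
print(OrderService(EmailNotifier()).confirm("ana", 7))
print(OrderService(SmsNotifier()).confirm("ana", 7))
```

The design choice here is to keep the part that is expected to change (the delivery channel) behind a narrow boundary, so that a future change touches one small class instead of the whole service.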

4.5 Principle 5: Focus on Business Value

It can be deceptively easy to lose sight of what the customer genuinely needs and act upon what your product managers or technical staff think they need.

This diversion can happen when, for example, you measure your success based on how many new features you release instead of how many new customers you acquire.

It can also happen when you hand over the reins to your technical staff. They might, for example, switch to new technology (at an immense cost and risk) without giving two hoots about whether the customer will notice any benefits.

Numerous other examples follow the same trend. Overengineering a feature in the hope that future customers will be easier to integrate is another common mistake.

It's impossible to predict new customer trends and preferences, so you risk investing time in extensively parametrizing a feature that will never need to work differently than it does today.

Here are some things you can do to stay focused on what customers need:

  1. Create prototypes based on Minimum Viable Product (MVP) models when testing new ideas
  2. Take the customer along the journey from project inception to end
  3. Create a minimalistic feature for your first customer. Parametrize it for the next one.
  4. Always subordinate the technology to the business
  5. Separate and limit the Research and Development (R&D) you would like to do.
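
The third suggestion in the list above (a minimal feature first, parametrized only when the next customer needs it) might look like the following sketch. The late-fee feature, its rates, and the `LateFeePolicy` name are invented for illustration only.

```python
from dataclasses import dataclass


# Version 1, built for the first customer: one hard-coded rule, nothing configurable.
def late_fee_v1(days_late: int) -> float:
    return 5.0 * days_late if days_late > 0 else 0.0


# Version 2, written only when a second customer asks for different terms.
@dataclass(frozen=True)
class LateFeePolicy:
    daily_rate: float = 5.0  # default keeps the first customer's behaviour
    grace_days: int = 0


def late_fee(days_late: int, policy: LateFeePolicy = LateFeePolicy()) -> float:
    chargeable = max(0, days_late - policy.grace_days)
    return policy.daily_rate * chargeable


print(late_fee_v1(3))                                              # first customer
print(late_fee(3))                                                 # same result with defaults
print(late_fee(3, LateFeePolicy(daily_rate=2.0, grace_days=1)))    # second customer's terms
```

Only the parameters that a real customer asked for are introduced, and the defaults preserve the original behaviour, which keeps the cost of the change proportional to the business need.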

4.6 Principle 6: Use Mature and Proven Technology to Serve Your Customers

Unless you work in high-tech mega-corporations like Google or Tesla or in Research and Development projects, you are much safer using mature and proven technology.

The benefits of such an approach are easy to spot:

  • First, you can easily find resources with knowledge and expertise when the skillset you want is not residing in a small niche. The same applies to online help, support and documentation, which will be abundant for ubiquitous and mature technologies.
  • Second, you can integrate easily with third-party systems and protocols when implementing industry-standard interfaces.
  • Third, your products will be more stable and reliable as most bugs have already been addressed.
  • Finally, communication with customers and peers is smoother and less prone to misunderstandings when using a common dictionary of terms and definitions.

While the above might seem boring and perhaps not so cool, it is safe, dependable, and profitable. These properties are especially critical when your products are running critical business operations.

