Part 1: Solution Design — Introduction and First Principles

1. What Is System (or Solution) Design?

The kind of intellectual activity which creates a useful whole from its diverse parts may be called the design of a system.

— How Do Committees Invent, by Melvin Conway

I like to think about system (solution or software) design using the following metaphor. Imagine you are required to play a game governed by the following rules:

Building blocks: You are given infinitely many building blocks that can be assembled into different shapes and forms.
Block specialisation: Not all building blocks are equal; various block types exist within the mix. For example, some blocks are expensive, while others are cheap. Some have specialized functions, while others are generic.
Blocks, components, and interfaces: Individual blocks or assemblages of individual blocks can be connected via interfaces.

The game’s objective is to build a system according to the rules below:

Rule 1: Technical Consistency — You must follow the manufacturer’s instructions and industry best practices when assembling the blocks; otherwise, the structure will not hold.
Rule 2: Generation of Business Value — An acceptable system must generate business value. In other words, it must serve a business need and solve a business problem. It must be fit for purpose.
Rule 3: Service Quality — The assembled system must provide services with the best possible quality, the latter being two-dimensional with business and technical aspects. For example, technical aspects include execution speed, security, reliability, and scalability. On the other hand, business quality ensures a great user experience, low cost, and low maintenance.

The building blocks of an IT solution are technological structures that obey the laws of computer science and engineering. These blocks can be assembled to provide services that solve a business need.

We add the following constraints to make the problem more interesting (and closer to the real world).

Constraint 1: Limitation on the Schedule and Resources — You can use a limited number of blocks for this initiative. You also have a limited timeframe within which all project-related tasks must be completed.
Constraint 2: Integration — The final system must be fully integrated within a larger ecosystem of similar systems, connecting via interfaces. Once the system integration phase is complete, the ecosystem collectively produces the desired business value.

2. What Does the Solution Design Series Cover?

The solution design series will cover the topics below:

Solution Design: First Principles

What is design, and why is it vital for software solutions?

Design of Modern IT Systems

Design process of modern IT systems and large software integration projects.

Agile Design

Solution design under conditions of uncertainty.

Solution Design Documents

Generating detailed solution design documents (SDD).

High-Level Solution Design Documents (HLD)

How to create a high-level system design document or HLD.

Component-Based Software Engineering (CBSE)

A discussion of CBSE and its development process.

3. Solution Design Process

The initial stages of a design effort are concerned more with structuring the design activity than with the system itself. The full-blown design activity cannot proceed until certain preliminary milestones are passed. These include: […] Understanding of the boundaries, both on the design activity and the system to be designed, placed by the sponsor and the world’s realities.

— How Do Committees Invent, by Melvin Conway

Drawing parallels between assembling building blocks under well-defined constraints and software project implementations is relatively easy. Let’s articulate this further.

3.1 Building Blocks of IT Solutions

We can imagine building blocks as abstractions of lower-level, physical entities such as lines of code, modules, or applications that can be combined according to computer science and technology rules.

The set of possible combinations is practically infinite, and the system architect must select the optimal one from a vast design space.

On the surface, this may resemble building a LEGO structure, but there are significant distinctions.

The main distinctions lie in the following:

The maturity (stable, proven, time-tested) of the individual building blocks (implementations, applications, or modules)
The issue of requirement volatility, which we discuss further below
The integration within an environment that predates the new system and is much larger

These variables will determine the best design (top-down, bottom-up) and project delivery (Waterfall or Agile) approaches. We will say a bit more about that later.


3.2 Software Products and Design Optimization

Design is an optimization process aiming to produce a viable solution within the limited time and material at the organization’s disposal.

A viable product or solution can be sold to many customers within a market niche or dedicated to internal use. In both cases, the investment is justified by anticipating a positive return on investment (ROI).

When discussing optimisation, we generally refer to the well-known engineering problem of minimising a cost function.

In this exercise, independent parameters are varied, and the cost function is calculated for every combination of parameter values. The combination producing the lowest cost is then adopted.
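The exhaustive search described above can be sketched in a few lines of Python. This is a minimal illustration, not a real design tool: the parameter names, the options, and the cost figures are all hypothetical, and the cost function is assumed to be a simple sum.

```python
from itertools import product

# Hypothetical design parameters and their options (illustrative only).
OPTIONS = {
    "language": ["cpp", "python"],
    "database": ["relational", "document"],
    "deployment": ["cloud", "onsite"],
}

# Hypothetical per-option costs in arbitrary units.
COST = {
    "cpp": 8, "python": 5,
    "relational": 4, "document": 3,
    "cloud": 2, "onsite": 6,
}


def total_cost(design):
    """Assumed cost function: the sum of the costs of the chosen options."""
    return sum(COST[choice] for choice in design.values())


def cheapest_design(options):
    """Evaluate every combination of options and return the cheapest one."""
    names = list(options)
    candidates = (
        dict(zip(names, combo)) for combo in product(*options.values())
    )
    return min(candidates, key=total_cost)


best = cheapest_design(OPTIONS)
print(best, total_cost(best))
```

With three parameters of two options each, there are only eight candidates to evaluate. The number of candidates grows multiplicatively with every added parameter, which is one reason this brute-force approach rarely scales to real design problems.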

In IT, however, design optimisation works differently. Architects and senior engineers combine technological assets in multiple ways to produce several candidates before selecting a suitable one.

The optimization process relies on heuristics (like ranking decision criteria). The final design must be technically feasible and satisfy various functional and non-functional requirements.

Architects combine technological assets to produce IT solutions. Heuristics and field expertise are used to identify (or build) the optimal solution. The solution might be only locally optimal, since finding a global optimum is impossible, especially in the face of requirement volatility.

The technological assets can be the following:

Technology stack
Persistence (structured, unstructured), database engine
Deployment method (cloud, onsite, client/server, distributed)
Redundancy, backup and recovery mechanisms
Hardware capacity
DevOps or cloud-friendly software architecture

Optimisation can be particularly challenging because there are no rigid rules for constructing the cost function; the weight of the different parameters is context-specific and can vary between organisations.

In this case, optimisation is conducted using heuristics and good practices that allow variations to be considered according to the organisation’s business needs.
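One common heuristic of this kind is a weighted scoring matrix. The sketch below is a hedged illustration of how the same candidates can rank differently once the weights change; the candidates, criteria, scores, and both weight profiles are invented for the example.

```python
# Hypothetical candidates scored from 1 (poor) to 5 (excellent) per criterion.
CANDIDATES = {
    "managed cloud service": {"time_to_market": 5, "control": 2, "running_cost": 3},
    "in-house build": {"time_to_market": 2, "control": 5, "running_cost": 4},
}

# Two hypothetical organisations weighting the same criteria differently.
STARTUP_WEIGHTS = {"time_to_market": 0.6, "control": 0.1, "running_cost": 0.3}
BANK_WEIGHTS = {"time_to_market": 0.2, "control": 0.6, "running_cost": 0.2}


def weighted_score(scores, weights):
    """Sum of each criterion score multiplied by its weight."""
    return sum(weights[criterion] * scores[criterion] for criterion in weights)


def rank(candidates, weights):
    """Return candidate names ordered from highest to lowest weighted score."""
    return sorted(
        candidates,
        key=lambda name: weighted_score(candidates[name], weights),
        reverse=True,
    )


print(rank(CANDIDATES, STARTUP_WEIGHTS))
print(rank(CANDIDATES, BANK_WEIGHTS))
```

The scores stay the same in both calls; only the weights differ, and that alone is enough to reverse the ranking.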

The obvious decisions are made first. Some are based on first principles and axioms of technology (like selecting C++ if you value speed, or Python if you value productivity), and others on the decision-makers' subjective preferences and biases (familiarity, experience).

Further decisions are then built on top according to reasonable technology practices and the constraints of previous choices. The design can be tweaked further if the ratio of benefit to cost of change is high enough.

Generally, it is considered good practice to avoid over-committing to critical decisions early on. Experienced architects prefer delaying consequential and binding technical choices as long as possible until more information becomes available.

3.3 Balancing Constraints

Producing an ideal product is possible only in trivial cases. In contrast, most circumstances require difficult decisions and tactical (but not strategic!) compromises.

Conflict will almost certainly arise between the desire to go to market as soon as possible and the opposing (but equally valid) desire to produce a sustainable and technologically superior product.

Progress is measured through quality and time-to-market. It is acceptable to make tactical sacrifices (accumulating technical and architectural debt) if, in the long run, they serve the business and can be rectified.

The business usually champions swift time-to-market, while the developers support long-term sustainability and technical superiority.

Tactical compromises are essential to benefit from any market window of opportunity. Still, the general direction, the strategic orientation, must be towards achieving both in the long run.

You can accumulate technical debt to beat the competition and enhance your value proposition on condition that you invest the necessary resources to keep it under control.

In summary, designing the perfect product is an antipattern to be avoided since it requires (theoretically) an infinite (and unattainable) amount of information to consistently predict the client’s ever-changing preferences. A more pragmatic approach is preferred.

4. Solution Design Under Uncertainty

4.1 Requirement Volatility

If you think selecting an optimal setup from the infinite possible permutations is hard, consider how much harder it gets when the requirements evolve with time. Imagine the cost function you are trying to optimize has weights that evolve with time and parameters that drop in and out spontaneously.
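To make the moving target concrete, here is a small sketch in which the weights drift over time. Everything in it is an assumption made for illustration: the two designs, their scores, and the linear drift over twelve months. It models only drifting weights, not parameters appearing or disappearing.

```python
# Hypothetical scores (1 to 5) for two candidate designs.
DESIGNS = {
    "monolith": {"delivery_speed": 5, "scalability": 2},
    "microservices": {"delivery_speed": 2, "scalability": 5},
}


def weights_at(month):
    """Assumed drift: scalability gains weight linearly over twelve months."""
    scalability = min(1.0, month / 12)
    return {"delivery_speed": 1.0 - scalability, "scalability": scalability}


def score(design, weights):
    """Weighted sum of a design's scores under the given weights."""
    return sum(weights[k] * DESIGNS[design][k] for k in weights)


def best_design(month):
    """Return the highest-scoring design under the weights for that month."""
    weights = weights_at(month)
    return max(DESIGNS, key=lambda design: score(design, weights))


for month in (0, 4, 8, 12):
    print(month, best_design(month))
```

The design that wins at the start of the project is no longer the best choice by month 8, even though neither design has changed. Only the weights have moved.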

Seasoned engineers know that requirement volatility is integral to software delivery and is unlikely to be eliminated by any method at our disposal. We can either ignore this fact until it hits us or transform our processes to handle it effectively.

Requirement volatility can be explained by legacy systems, lack of skills, insufficient knowledge, and novel technology, ultimately leading to project risk that must be effectively managed if the project is to succeed.

This section examines the nature of uncertainty in business requirements, while the next will discuss design under such conditions.

There are two radically different sources of requirement volatility: incomplete knowledge and unarticulated needs. We examine these topics next.


4.2 Incomplete Knowledge

Incomplete knowledge may arise in the following scenarios:

The technology, idea, or business model is novel and complex, and best practices are yet to be established.
The system impacted involves poorly documented legacy software, and people carrying the knowledge are no longer part of the organisation.
Due to time and cost constraints, business analysts have not adequately articulated the business requirements, and critical decisions have been deferred.
Under internal or external environmental pressure, stakeholders change their priorities during implementation.
Planning, change management, effort estimation, and project management skills are poor.

Notice that adequate knowledge can be obtained in all the above scenarios if enough effort is invested. In the last two, this means locking in the requirements contractually and acquiring the right talent.

Incomplete knowledge is not unknowable knowledge. That is what sets it apart from unarticulated needs, our next topic.


4.3 Unarticulated Needs

Dave Snowden, the Welsh management consultant, researcher in complexity science, and developer of the Cynefin framework, thoroughly discussed unarticulated needs in his lectures.

The changing nature of user interactions with the system gives rise to unarticulated needs. This type of requirement volatility needs exploratory (Agile) methods to design a usable product.

In Snowden’s view, customers cannot articulate what they want before interacting with the product and understanding what the technology can do for them. Only then do the requirements become evident and existential.

Unarticulated needs are manifestations of a complex system where the rules of interactions between the agent and the system evolve, leading to further evolution of the system and the agent. It’s like a computer algorithm modifying itself based on new information.

So, how do we design a system before we know what the clients want? We will discuss some ideas in the following sections.


5. Product vs Solution

5.1 Term Definitions

So far, we have used the terms product and solution without a formal definition. Intuitively, we know they are not the same, but it’s not apparent how. The following paragraphs will hopefully polish our understanding of these words a bit.

The term product carries the connotation of a consumer product, something you pick off a supermarket shelf. It typically comes with a user manual and can be configured in a few different modes depending on how you want to use it.

On the other hand, a solution can be understood as an offering that addresses distinct business needs. A solution can involve several products or platforms customized according to a user’s preferences and context.

We define product and solution as follows: a solution is designed to address a particular problem, whereas a product is created to perform a specific function.

5.2 Comparing Products and Solutions

Product

A Software Product usually refers to one or more applications running on a computer system. It offers users one or more functionalities or services in a predetermined industry.

Solution

An Enterprise Solution may include integrated software products, hardware components, and enterprise databases. A software solution can be deployed in a data centre or cloud.

A software solution allows enterprises to offer digital end-to-end services to their customers.

Office 365 or a payment switch from FIS are examples of software products.

The software, servers, licenses, and user configuration for Office 365 collectively form an enterprise software solution.

A Software Product can be viewed as a self-contained entity with interfaces to the outside world, allowing integration with other products.

An Enterprise Solution is an umbrella term encompassing various products integrated to offer an end-to-end business solution.

6. Solution Design Methodologies

There are two classes of system designers. The first, if given five problems, will solve them one at a time. The second will come back and announce that these aren’t the real problems and will eventually propose a solution to the single problem which underlies the original five. This is the ‘system type’ who is great during the initial stages of a design project. However, you had better get rid of him after the first six months if you want to get a working system.

— NATO SOFTWARE ENGINEERING CONFERENCE 1968

6.1 Contextualisation of IT Projects

Experience has shown that universal solutions for complex problems are rare, and methods that work in one context may fail in another. Agile and Waterfall are perfect examples of context-sensitive solutions.

Similarly, solution design methodologies will vary depending on the subdomain where they are applied. We will use the following two-dimensional map to survey a proposed model of the IT project landscape and its subdomains.

Projects must be placed in their proper contexts if they are to succeed. Correct contextualisation will determine the necessary tools (Agile, DevOps, Waterfall) and practices that apply to the specific situation.

The above model defines four subdomains with varying degrees of scale and uncertainty.

Scale amplifies existing cracks and introduces unique challenges to the project delivery model. In large projects, interactions will rise dramatically, feedback loops will become longer, decision-making will be slower, and cost overruns will be more significant.

Uncertainty due to unarticulated needs or incomplete knowledge radically modifies the dynamics of the delivery model by pushing teams to move towards Agile, DevOps, automation, and deploying sophisticated risk management practices.

Examples of the four types of project contexts:

Type I: Trivial
Routine maintenance tasks or off-the-shelf projects with little or no integration
Type II: Novel Ideas or Technologies
New solutions to old or novel needs.
These might emerge as humble initiatives in startups with the potential of growing into mega-corporations.
Examples abound, from social media to search engines.
Type III: Large System Integration Projects
Replacing or upgrading legacy technologies, acquiring new platforms and solutions, and integrating them into an enterprise ecosystem.
Technologies and platforms are generally mature, with standard interfaces and designs.
Type IV: Mega-Projects
Megaprojects include nuclear plants, hydroelectric dams, and enterprise IT solutions.
They magnify errors and inconsistencies in processes, technologies, and designs.

6.2 Four Design Approaches

The diagram above places IT projects into four categories. Below is a summary of each type and the preferred design approach.

Quadrant 1 — Small-scale, low-uncertainty projects are generally trivial with mature and established production processes. Such projects include routine maintenance and upgrades, are scheduled well in advance, and require little project planning or design as the risk is low.
Quadrant 2 — Typical of startups, small-scale projects with high novelty, complexity, or requirement volatility use an evolving and malleable design refined over many iterations and releases. The preferred project management model is Agile.
Quadrant 3 — Large system integration projects seeking to assemble mature products using reliable technology and standardized interfaces fall into this category. Project risk and cost of change are usually very high in such projects, so careful and detailed planning is necessary to bring risk down to an acceptable level.
Quadrant 4 — Megaprojects are rare and characterized by massive scales and medium to high novelty. Initiatives on such a scale invariably overrun their costs and schedules and deliver comparatively low benefits. Modularized design that enables cumulative learning is critical in ensuring their success.
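The mapping from quadrant to preferred design approach summarised above can be written as a small lookup. This is only a sketch: reducing scale and uncertainty to two yes/no flags is an assumption made for brevity, and the names `Approach` and `classify` are hypothetical.

```python
from enum import Enum


class Approach(Enum):
    MINIMAL = "Quadrant 1: little project planning or design required"
    AGILE = "Quadrant 2: evolving design refined over iterations (Agile)"
    PLANNED = "Quadrant 3: careful and detailed up-front planning"
    MODULAR = "Quadrant 4: modularized design enabling cumulative learning"


def classify(large_scale: bool, high_uncertainty: bool) -> Approach:
    """Map a project's scale and uncertainty to the article's four quadrants."""
    if not large_scale and not high_uncertainty:
        return Approach.MINIMAL
    if not large_scale:
        return Approach.AGILE
    if not high_uncertainty:
        return Approach.PLANNED
    return Approach.MODULAR


# Example: a small project with volatile requirements, typical of a startup.
print(classify(large_scale=False, high_uncertainty=True).value)
```

In practice, both dimensions are continuous, and deciding where a project sits on each axis is itself a judgment call.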

7. Final Words

Business value is generated only in two phases of the Software Development Lifecycle: design and development. All other stages (planning, testing, deployment, operations, and maintenance) are burdens and extra costs to delivery that customers are not happy to pay for.

In addition to skills, tools, and expertise (which all tasks of the SDLC require), solution and software design heavily rely on creativity, innovation, and the ability to anticipate business needs. On this website, we focus heavily on Operational Excellence, whose objective is producing top-quality products. Operational Excellence in the software business (as we define it) is a framework for understanding software delivery and executing projects flawlessly.

We hope this article has brought the core concept of solution design closer to the reader and has provided them with the right intellectual tools to design solutions with excellence.
