Software Architecture and Design — Order, Complexity, and Chaos
Software architecture is one of those words we constantly throw around in our conversations without having a precise definition.
In its broadest sense, architecture refers to a coherent form or structure, typically in buildings, but what about software? Is there any coherence, tidiness, or order in a piece of code that we can point to and say: “this constitutes our architecture”?
It turns out that the answer is Yes. A structure (set of associations, a recognisable shape, or a degree of organization) may exist in a coherent and consistent, well-designed body of software.
Structure emerges when the nodes of a cluster become interconnected in a solid and rigid network, where a disruption at one end of the network can be felt on the other.
In the following sections, we will:
- Identify nodes in software code and their relationships and interactions
- Discuss when an emerging structure is “good” and how we can measure that “goodness”
- Define terms widely used in connection with architecture like context, model, domain, interface, and Agile architecture
- Distinguish between software architecture and deployment styles
- Introduce domain-driven design
- Discuss coupling and cohesion
- Elaborate on complexity in software architecture
The ideas we present here, like all the others on this website, are not intended as recipes or as rules for building sound architecture.
As Dave Snowden most elegantly put it, a recipe user can never work out what must be done if some ingredients are unavailable. On the other hand, a chef can prepare a great dish with any set of ingredients.
These articles aim to provide the reader with a practical and theoretical framework (not a recipe) that can be deployed to solve any software architecture problem.
2. Modelling and Software Architecture
2.1 Modelling Real-World Problems
The number and diversity of business problems that software solves today are immense, but we can roughly classify them into three large clusters:
- Data management and service delivery — This category represents anything from accounting to retail, payments, booking, online shopping, and stock management.
- Content management and delivery include social media, blogging websites, and knowledge management systems like Confluence.
- Embedded Applications — Every piece of hardware nowadays has a small computer, and on that computer is an embedded software program that operates it. This category includes watches, air conditioners, mobile phones, space rockets, or anything with a microchip.
One cannot but wonder how a complicated business problem in the real world gets translated into a software program that can deliver a sophisticated solution. It all starts with a model that software architects create.
The only valid model of a human system is the system itself.
— Murray Gell-Mann
A model can be described by the following ideas:
- It is an abstract representation of the physical problem at hand — A model is usually an approximation of the system, valid only in a specified domain. For example, we can model the surface of the Earth as a plane as long as the distances are small.
- We need a distinct language to describe it — Natural language is too broad, ambiguous, and culture-specific to be of any practical use (more on this later).
- The model we generate is subjective and, therefore, not unique — It is not always designed as a faithful representation of the world but to answer a particular subset of questions. For example, the air in a container is modelled as a volume with temperature and pressure. A more faithful representation may include the millions of molecules jiggling in all directions, but that problem is intractable and of little or no practical use.
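The flat-Earth example above can be made concrete. The following is a minimal sketch, assuming a spherical Earth with a mean radius of 6371 km; the function names are ours and purely illustrative. It compares a planar model with a spherical one (the haversine formula) and shows that the simpler model is only valid inside a limited domain.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def planar_distance_km(lat1, lon1, lat2, lon2):
    """Flat-Earth model: treat latitude/longitude differences as x/y offsets."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat) * EARTH_RADIUS_KM
    dy = math.radians(lat2 - lat1) * EARTH_RADIUS_KM
    return math.hypot(dx, dy)


def great_circle_distance_km(lat1, lon1, lat2, lon2):
    """Spherical model: haversine formula for the great-circle distance."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def relative_error(lat1, lon1, lat2, lon2):
    """How far the planar model drifts from the spherical one."""
    exact = great_circle_distance_km(lat1, lon1, lat2, lon2)
    approx = planar_distance_km(lat1, lon1, lat2, lon2)
    return abs(approx - exact) / exact
```

For two points about a kilometre apart, the two models agree almost exactly; for points separated by thousands of kilometres, the planar model drifts noticeably. Both models are "correct" within the questions they were built to answer.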
Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.
— Melvin E. Conway
2.2 Influential Factors
The following factors will broadly influence the software architecture we generate:
- Language — Some linguists argue that the language we speak shapes how we see the world, and not the other way around.
- Team organization — Allen Holub asserts that the software architecture we produce will mirror our internal structure as an organization. If two departments work on separate product areas, it is natural that two modules will be produced, possibly linked by an interface. On the other hand, with cross-functional self-organized teams, a different architecture may emerge.
- The Business Model — If a business offers services A and B to different client segments separately, we can expect the software architecture to carry that separation. Today’s business requirements will shape a product’s architecture for a long time.
2.3 Complexity and Software Architecture
Complex systems have distinct characteristics such as unpredictability, continuous novelty, and heavily connected internal elements, making them difficult to understand. Is software complex by that definition?
Software is undoubtedly complicated but predictable and static, and the relationships between its internal elements do not change. Software architecture cannot evolve to generate new behaviour; an external agent, the developer, must introduce new features.
However accurate the above statement may be, it does not explain why software is not dull or why a developer’s time is always overbooked. To fully address this apparent conflict, we must zoom out a bit. Have a look at the following diagram.
The software product is now embedded in a highly sophisticated and ever-changing ecosystem that includes clients, a global market full of competition, fast-paced scientific advancement, emerging technologies, and natural disasters with tremendous impacts on the business (like a global pandemic).
The ecosystem’s complexity places enormous pressure on the software business, which must adapt its value proposition or become extinct. Can we update our software architecture quickly to meet environmental challenges? This is where Agile architecture can prove invaluable.
2.4 Agile Software Architecture
Software architecture can be as orderly or chaotic as we want; it all depends on how much effort we invest in keeping it in shape.
More constraints on the design lead to a tidier architecture, but also to longer delivery times, a higher cost of change and, therefore, less agility. This self-imposed constraint is not ideal; we want to be fully Agile to keep up with the mounting pressure to adapt.
We can remain Agile while maintaining excellent software architecture by taking a pragmatic attitude towards orderliness.
In addition to the other Agile requirements, such as cross-functional, self-organising teams, adequate communication, and emphasis on automation, we may also decide to spend as little design time as possible while employing the industry’s best software architecture practices.
We accept that:
- Our current knowledge is not perfect — hence an ideal future-proof design is not realizable, and all we can do is keep going.
- Long to medium-term business requirements will inevitably change, ushering in a new paradigm of the business model we serve and, therefore, essential structural changes to the software.
- Postponing consequential decisions and relying on mature technology as much as possible is a winning strategy.
We can also create a complex design halfway between order and chaos. The advantage of introducing complexity at the architectural level is that it allows the system to grow in ways we could not have predicted.
You understand a complex system by interacting with it, not by analysing or modelling it. You can only understand it by interacting with real-time feedback loops over multiple agents, so you don’t get cognitive bias.
— Complex Adaptive Systems – Dave Snowden – DDD Europe 2018
Complex architecture cannot be built by following a top-down approach; it must be built from the bottom up on solid but simple rules.
You can go about this plan as follows. You create many small safe-to-fail experiments (smaller and more frequent releases in Agile terminology) and observe the results. You then eliminate the unintended consequences and build on the positive ones.
3. Software Architecture or Deployment Style?
Allen Holub distinguishes between software architecture and deployment style.
From his perspective, the (very) familiar three-tier model (User Interface/Business Logic/Database) pertains to the latter. This classification applies to other deployment patterns like Client/Server, MVC, Service-Oriented Architecture (SOA), and micro-services.
We present the following argument to understand why the layered or n-tier model and its cousins are not architecture. We gave three factors influencing architecture earlier: language, business model, and team organization.
Upon closer examination, we notice that the n-tier model is oblivious to these factors. Any business line, model, or problem can be solved with a software product that could be decomposed into these three layers.
This independence of deployment from architecture leads us to conclude that any discussion of the n-tier model in the context of architecture is probably moot.
Aside from marketing or convenience, there is very little gain in one deployment style over another. Deploying your database as a single or distributed system, on-premise or in the cloud, may affect the system’s low-level design but not its business or architecture.
4. The Language of Software Architecture
When nodes in a cluster connect to form a network of relationships according to specific rules, we obtain a structure. A structure can be orderly or chaotic, depending on the effectiveness of the constraints governing its creation and modification.
Software architecture is a discernible structure in a body of software code.
To describe architecture (or any model) effectively, we need a formal language, and the latter requires the following:
- Ontology — An ontology lists all that can exist in a language (objects, processes, states, and links in the Object-Process Methodology, or OPM, for example). An object can be a class, message, application, or system. A link can be hierarchical or indicate a dependency.
- Syntax — “Sentences” can be formed by grouping “words” where a word is an element of an ontology. Sentences are correct if they follow the language’s grammar rules or syntax.
- Semantics — Meaning is conveyed via semantics and covers all objects in an ontology. A sentence can be grammatically correct yet carry no meaning. Two different sentences are equivalent if their meaning is identical.
A comprehensive language like OPM can describe structure and behaviour. When we talk about software architecture, we mostly think of structure.
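To illustrate how ontology and syntax work together, here is a minimal sketch of a toy modelling language. The words are borrowed from the eCommerce example discussed in this article; the link names and the grammar table are hypothetical simplifications of ours and are not OPM itself.

```python
# Ontology: every word the language knows, tagged with its kind.
ONTOLOGY = {
    "cardholder": "agent",
    "eCommerce website": "object",
    "payment gateway": "object",
    "order": "object",
    "inventory": "object",
    "payment": "process",
    "online shopping": "process",
}

# Syntax (hypothetical): which kinds each link may connect, as (source, target).
GRAMMAR = {
    "triggers": ("agent", "process"),
    "enables": ("object", "process"),
    "comprises": ("object", "object"),
    "produces": ("process", "object"),
}


def is_well_formed(source: str, link: str, target: str) -> bool:
    """Return True if the sentence uses known words and obeys the grammar."""
    if source not in ONTOLOGY or target not in ONTOLOGY or link not in GRAMMAR:
        return False
    return (ONTOLOGY[source], ONTOLOGY[target]) == GRAMMAR[link]
```

Note that "inventory comprises eCommerce website" passes this check even though it is nonsense: the sentence is syntactically valid, and only semantics can reject it.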
Anybody who has done online shopping will recognize the diagram below. Here is what it says:
- A cardholder triggers an online shopping process.
- An eCommerce website enables online shopping.
- An eCommerce website comprises a payment gateway, an inventory, and an account management subsystem.
- Online shopping produces an order in one of two states: Placed and Paid.
- A payment process moves the order between its states.
The rest is self-explanatory. Three things worth noting:
- OPM, as a meta-language, allowed us to express the language of our model precisely
- The business model language we just created has the following ontology:
  - cardholder (agent)
  - eCommerce website (object)
  - payment gateway (object)
  - order (object)
  - inventory (object)
  - payment (process)
  - online shopping (process)
- Item, order, and deliverable are identical in the physical world. In our architectural language, however, item, order, and deliverable are separate objects in three domains. The significance of this domain separation will become apparent when we discuss Domain-Driven Design.
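The statements read off the diagram can be sketched in code. This is an illustrative translation only, under the assumption that processes map to functions and objects to data classes; the class and function names are ours.

```python
from dataclasses import dataclass
from enum import Enum


class OrderState(Enum):
    PLACED = "Placed"
    PAID = "Paid"


@dataclass(frozen=True)
class Order:
    order_id: str
    state: OrderState = OrderState.PLACED


def online_shopping(cardholder: str, order_id: str) -> Order:
    """Process triggered by a cardholder; it produces an order in the Placed state."""
    return Order(order_id=order_id)


def payment(order: Order) -> Order:
    """Process that moves an order from Placed to Paid."""
    if order.state is not OrderState.PLACED:
        raise ValueError(f"order {order.order_id} is not awaiting payment")
    return Order(order_id=order.order_id, state=OrderState.PAID)
```

The only way to reach the Paid state is through the payment process, which mirrors the link between process and state in the model.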
5. Domain-Driven Design (DDD)
5.1 What is Domain-Driven Design?
Eric Evans coined the term Domain-Driven Design (DDD) in a book he published in 2003. It is formulated around the following objectives:
- There is a core business domain on which the project should be focused.
- Complex software architecture should be based on a model of the domain.
- Continuous collaboration between technical and domain experts iteratively refines a conceptual model.
- The conceptual model defined addresses a subset of domain problems.
Domain-driven design requires developers to implement a significant amount of isolation and encapsulation to maintain the design’s integrity and purity. Maintaining high design standards can be costly, and the effort may not pay off if the domain is simple and obvious.
Models in Domain-Driven Design have the following properties:
- They are not a realistic representation of the world; models are not meant to express physical realities with arbitrary precision. They are approximations good enough to be used in practice to solve a cluster of similar problems.
- Models are specialized; they are not intended to solve every business problem, present and future.
- They are described in a specialized language, as natural language is too broad and ambiguous to be of any practical use.
5.3 Bounded Contexts
Some terminology definitions:
- A context is a setting that allows us to infer the meaning of a specific statement, event, or idea.
- A bounded context is a context with a well-defined boundary, inside which the meaning of every object can be determined without looking outside it.
Bounded contexts allow developers and analysts to focus on a narrow cross-section of the business model, thus avoiding being overwhelmed with information, some of which might not be relevant to the problem.
Also, bounded contexts allow the precise definition of terms. This definition can be obtained relatively quickly inside a bounded context since the number of stakeholders involved can be drastically reduced.
In our previous eCommerce example, item, order, and deliverable represent the same physical reality. Still, the inventory people and their peers in payments or delivery can define an item separately, each within their team, thus avoiding collisions or overuse of the same terms.
Business and technical staff jointly create a ubiquitous language specific to a domain, allowing them to communicate ideas efficiently.
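A minimal sketch of the item/order/deliverable separation might look as follows. The three classes stand for three bounded contexts; all field names and the translation function are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


# Inventory context: an "item" is something counted on a shelf.
@dataclass(frozen=True)
class InventoryItem:
    sku: str
    quantity_on_hand: int


# Payments context: the same physical thing is a line to be charged.
@dataclass(frozen=True)
class OrderLine:
    sku: str
    unit_price_cents: int
    quantity: int

    def total_cents(self) -> int:
        return self.unit_price_cents * self.quantity


# Delivery context: the same physical thing is a parcel to be shipped.
@dataclass(frozen=True)
class Deliverable:
    sku: str
    weight_grams: int
    address: str


def to_deliverable(line: OrderLine, grams_per_unit: int, address: str) -> Deliverable:
    """Translate at the boundary: delivery never sees prices."""
    return Deliverable(line.sku, grams_per_unit * line.quantity, address)
```

Each team can evolve its own definition freely; only the shared identifier (here, the SKU) and the translation at the boundary need to be agreed upon.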
What constitutes a boundary?
Anything the developer recognizes as separating two entities is a valid boundary. Separate databases, deployment units, namespaces, and packages are examples of such separations.
A modelling exercise generates alternative models, which are then isolated into bounded contexts, each designed to address a cluster of similar business problems.
The models and bounded contexts are owned by separate teams who would agree on terms, architecture, and processes. Because a bounded context does not necessarily focus on one technology or layer in the tech stack, it must be implemented by a cross-functional self-organized team.
5.4 Organization and Architecture
In one of his lectures, Allen Holub makes an interesting connection between team organization and the software architecture produced.
Holub argues that organising technical teams in silos of experts in specific domains (UI, database, backend) will produce an architecture that mirrors this organization.
The communication and decision-making challenges and the high cost of change introduced by this particular team organization will carry over to the software’s architecture.
On the other hand, a domain-driven design with cross-functional teams will create a cross-section of the product that covers the entire tech stack and will be organized around a specific business domain.
Eric Evans, who originally coined the term Domain-Driven Design, emphasizes that DDD need not be applied in every area of the software code but only where it makes sense. DDD is at its best when the business subdomain is complex, and a narrow focus will reduce its complexity.
5.5 Relationships Between Domains
Evans distinguishes between three kinds of domains in his approach to software design:
- New domains — where everything is tidy, the domain is isolated, and the model implemented narrowly focuses on specific business issues.
- Legacy domains — Legacy software products are ubiquitous, and including them in our design is not optional. Legacy domains are typically larger than new ones because they have existed longer.
- External domains — Solutions we build may need to integrate several applications, some provided by prominent third-party vendors; we have no control over the integration process in such applications.
Relationships between new internal domains can be on an equal footing, with interfaces translating messages to and from both areas.
On the other hand, Evans recommends that relations between new internal domains and large external ones be conformist, given how disproportionate the two parties are in size. For example, suppose you have an in-house application that needs to integrate with Microsoft Office. In that case, your application will likely conform to the latter’s specs rather than the other way around.
Finally, integration between your new software and your legacy system is facilitated by what the creator of DDD refers to as an Anti-Corruption Layer (or ACL). The latter’s (crucial) duty is to keep order and tidiness on one side and chaos on the other.
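As a rough illustration, an Anti-Corruption Layer can be as small as a single translation function. In the sketch below, the legacy record layout (CUST_NO, FNAME, LNAME, JOIN_DT) is invented for the example, as are the class and function names.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Customer:
    """Model used inside the new, tidy domain."""
    customer_id: int
    full_name: str
    joined_on: date


def translate_legacy_customer(record: dict) -> Customer:
    """Anti-Corruption Layer: the only code that knows the legacy layout."""
    first = record["FNAME"].strip().title()
    last = record["LNAME"].strip().title()
    return Customer(
        customer_id=int(record["CUST_NO"]),
        full_name=f"{first} {last}",
        joined_on=datetime.strptime(record["JOIN_DT"], "%Y%m%d").date(),
    )
```

Padded strings, upper-case names, and string-encoded dates stay on the legacy side; the new domain only ever sees a clean Customer.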
6. Software Design Practices
6.1 Coupling
Two entities A and B (they could be anything: classes, services, applications, or servers) are coupled if a change in A forces a change in B.
The worst forms of coupling are when an entity modifies another’s internal data or when two or more entities share a global data structure.
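The first of these forms is easy to reproduce. The sketch below uses a hypothetical Account class of our own; one caller reaches into its internal data, while the other goes through its interface.

```python
class Account:
    def __init__(self, balance_cents: int) -> None:
        self._balance_cents = balance_cents  # internal data

    @property
    def balance_cents(self) -> int:
        return self._balance_cents

    def withdraw(self, amount_cents: int) -> None:
        """The only sanctioned way to reduce the balance."""
        if amount_cents <= 0 or amount_cents > self._balance_cents:
            raise ValueError("invalid withdrawal amount")
        self._balance_cents -= amount_cents


def pay_tightly_coupled(account: Account, amount_cents: int) -> None:
    # Modifies another entity's internal data: bypasses its rules and breaks
    # as soon as Account changes how it stores the balance.
    account._balance_cents -= amount_cents


def pay_loosely_coupled(account: Account, amount_cents: int) -> None:
    # Depends only on the public interface.
    account.withdraw(amount_cents)
```

With the tightly coupled version, an overdraft silently produces a negative balance; with the loosely coupled one, the account enforces its own invariant.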
Coupling is not always visible, as Adam Tornhill suggests in one of his talks on technical debt management. While analyzing a code base, he found that two micro-services were highly coupled, not through code, but by belonging to the same product feature.
Moreover, the two services happened to be managed by different teams, resulting in (potentially slow and cumbersome) cross-team cooperation on every change.
It is essential to note that there are limitations to decoupling software components. Entirely decoupling components can lead to a system that is overly complex and difficult to understand. Decoupling also requires significant effort and resources and may not always be necessary or practical. Sometimes, a certain level of coupling between components is acceptable or even desirable; the choice is a trade-off between flexibility and simplicity.
Furthermore, it is crucial to consider the system’s scalability and performance when designing software. While decoupling components may make a system more flexible, it can also decrease performance. Therefore, a balance must be struck between flexibility, simplicity and performance.
In summary, decoupling software components is vital for creating a well-designed and flexible system, but it is not a one-size-fits-all solution. It is essential to consider the limitations and trade-offs involved in the process and to strike a balance between flexibility, simplicity, and performance.
6.2 Cohesion
Cohesion describes how closely the entities grouped in a module support a joint functionality. High cohesion in codebases is desirable, as it allows code changes to be local and restricted to the smallest possible context.
This locality reduces the amount of exploration a developer needs while analysing the code or the amount of regression testing required for good coverage.
The best form of cohesion occurs on a functional rather than a logical or procedural basis.
Grouping credit with debit cards in a payment software program might seem logical enough, but the business lines they belong to can be very different.
The (procedural) grouping of order creation, payment, and delivery in an eCommerce website might seem like a good idea, except these business processes belong to separate domains. They must be implemented in individual modules to allow reusability.
6.3 Conceptual Integrity of Software Architecture
The conceptual integrity of architecture ensures that a system is not modified to support functionality outside the one it was initially designed to provide. This characteristic is desirable for two reasons.
- Firstly, a design crafted by fewer architects and developers will be more consistent and coherent. Think of a painting completed by four or five artists!
- Secondly, conceptual integrity ensures that the product is reusable from a business perspective since it can be integrated into a different solution in a far more manageable manner than a bespoke, general-purpose product.
6.4 Deep vs Shallow Classes
John Ousterhout is a professor at Stanford University. He created a unique course on software design in which he introduced the subject of shallow vs deep classes under the topic of modular design.
In his view, classes and their interfaces and implementations can be designed to reduce the overall code complexity. His main ideas revolve around the following principles:
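- A class can be pictured as a rectangle whose area is proportional to the functionality it provides; the top edge represents its interface, and the depth represents the implementation hidden behind it.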
- An interface provides enough knowledge for other system modules to use the class functionality effectively. The best interfaces are simple, generic, and “leak” the least amount of information and design decisions used in the class.
- Shallow classes have large interfaces but tiny implementations. The cost of calling the interface in shallow classes is no less than writing the implementation itself. Shallow classes add complexity rather than abstracting it away.
- Deep classes, on the other hand, present significant advantages. First, they abstract away significant complexity by hiding design decisions. Second, changes in the implementation do not affect other modules (unless the interface has to change).
- The number of lines of code in a deep class becomes secondary (if not irrelevant) as long as its interface stays small relative to the functionality it hides.
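The contrast can be sketched with two small classes. Both are our own illustrations rather than examples taken from Ousterhout's course: the first is a pass-through wrapper, while the second hides lookup, loading, and a least-recently-used eviction policy behind a single method.

```python
from collections import OrderedDict


class ShallowList:
    """Shallow: every method is a one-line pass-through to a list, so the
    interface is as large as the implementation it is supposed to hide."""

    def __init__(self):
        self.items = []

    def add_item(self, item):
        self.items.append(item)

    def remove_item(self, item):
        self.items.remove(item)

    def count_items(self):
        return len(self.items)


class LruCache:
    """Deep: a one-method interface hides lookup, loading, and eviction."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._entries = OrderedDict()

    def get(self, key, load):
        """Return the cached value for key, calling load(key) on a miss."""
        if key in self._entries:
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        value = load(key)
        self._entries[key] = value
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value
```

The eviction policy of LruCache could be replaced without touching any caller, whereas any change to ShallowList leaks straight through to its users.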
7. A Measure of Good Design
What was appealing in Evans’ introduction to Domain-Driven Design is that it provided a different approach to creating software architecture and a measure of “goodness” of design and quality code.
A great design can be measured by how much context a developer requires to understand a specific piece of code; the smaller the necessary context, the better the design.
This definition also fits nicely with the low-coupling, high-cohesion principle.
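One rough way to make "required context" tangible is to count everything a module depends on, directly or indirectly. The sketch below is only a proxy under that assumption, not an established metric; the dependency map and module names are hypothetical.

```python
from collections import deque


def required_context(dependencies: dict[str, set[str]], module: str) -> set[str]:
    """Return every module reachable from `module` by following dependencies."""
    seen: set[str] = set()
    queue = deque([module])
    while queue:
        current = queue.popleft()
        for dep in dependencies.get(current, set()):
            if dep != module and dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


# Hypothetical dependency map: module -> modules it depends on directly.
DEPENDENCIES = {
    "checkout": {"payments", "inventory"},
    "payments": {"ledger"},
    "inventory": set(),
    "ledger": set(),
    "reports": {"checkout"},
}
```

Under this proxy, a developer working on "ledger" needs no additional context, while one working on "reports" must understand four other modules first.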