A Technical Debt Fairy Tale

Once upon a time, there was a lead developer called Annabel. She worked for vakation.com, a travel booking site on the Internet. She was the tech lead of the devops team that maintained the web-app front-end. Springtime was upon the land, for ‘twas April, a busy time, when people were starting to book their summer holidays. But sales were disappointing. Conversion of visits into actual bookings was getting worse and worse.

One morning Ramon, the team’s business owner, requested an urgent gathering: they had found one of the root causes of the problem that was plaguing them. “Our site does not show visitors whether the accommodations they are booking have facilities to cast streaming media from their phones to the TV in the room,” he said, with a frown on his face. “It turns out that many of our competitors now do show that information, and clients end up booking their travel on other sites. So we need to add the media casting info to the vakation.com website with great speed and utmost alacrity!”

Annabel turned to Edwin, who headed up the hotel reservation back-end system, and asked him whether the media casting info was available in the back-end system. Edwin smiled, for he had good news: the latest hotel booking communication standard included the media casting information! It was already present in their accommodations database. Vakation.com used an Enterprise Service Bus (ESB) to connect the various systems in their landscape, so Annabel asked Edwin to expose the desired info on the ESB for the web-app front-end to display. Seeing the look of joyful expectation on Annabel’s face, Edwin was happy to oblige. But things were not as merry as they seemed – for at that time, Edwin’s team had quite a full backlog: it looked like they wouldn’t have time for the change until June.

Now Annabel started to get worried: fearing Ramon’s wrath, she didn’t want to go back to him and tell him that she wouldn’t be able to fix the website until June. But she soon cheered up, for due to a stroke of fortune, the web-app front-end was on the same physical database as the hotel reservation back-end system, and Annabel’s team of diligent developers knew the table structure. So Annabel decided to temporarily ignore the company’s ESB policy and obtain the media casting information directly from the accommodations database, taking on some technical debt. Her mind was firmly set on refactoring the temporary fix as soon as the info was exposed on the ESB, hopefully in June.

Two months went by, after which Edwin perused his backlog, and lo and behold: he spotted the story card with Annabel’s request. With some trepidation in his heart, he visited Annabel in her lair, and asked her: “Fair Annabel, is this story still needed? Some other, more urgent stories have popped up. Would it be terrible if we pushed back the media casting story a few more sprints? After all, things are working now and nobody is complaining”. Failing to reach agreement, Edwin and Annabel decided to ask Ramon, the product’s business owner. But it turned out Ramon had trouble remembering that the fix was temporary, and was not at all interested in prioritizing the refactoring.

Days, weeks and months passed by. Eventually, it took until November before the info was finally accessible through the ESB. In the meantime, it turned out that some new members of Annabel’s team had been copying the method of accessing data in the hotel reservation system directly. Having run into Annabel’s temporary fix in the code, they felt comfortable using the same method – this time without even asking Edwin’s team. The cursed technical debt had multiplied itself! This had already led to the website breaking down after Edwin’s team had made some changes to their table structure, unaware that it was accessed directly by other teams. These outages had caused Ramon to wax angry and scold both teams: “Why did you allow things to become so bad? You should have managed that better!”

As we close the magical book on this horror story, there are some questions we may ask ourselves:

  • Was Annabel’s temporary fix to access Edwin’s team’s database directly a wise decision? After all, the company would have lost a lot of revenue if she had waited two months and made the fix the “right way”.
  • What about Edwin’s decision to postpone the exposure of the media casting data on the ESB? Doesn’t it make sense to prioritize by business value?
  • Was Ramon right to be upset with the teams? After all, he was completely uninterested in fixing the shortcut when it didn’t cause any problems yet.
  • Code scanning tools like SonarQube claim to detect technical debt, but would such tools have flagged this temporary fix as technical debt?
  • What could Annabel, Edwin and Ramon have done to prevent the problems?

 I look forward to reading your answers! Join the discussion on https://www.linkedin.com/pulse/technical-debt-fairy-tale-eltjo-poort/

Architectural design with autonomous teams

According to the agile manifesto, the best architectures emerge from self-organizing teams. The word emerge here has received some criticism, since it seems to imply that this happens automagically (it doesn’t), and that it doesn’t require significant effort (it does). I prefer the verb design here (instead of emerge), as it indicates an intentional activity. The key message of that (in)famous emergence principle is not to discourage intentional design, but to clarify that agile teams work best if they are autonomous. In other words, no outsider should dictate the architecture – the teams design it.

As a consequence of this, agile teams need architectural design competencies. Last year Transavia, one of our airline clients, acted on this need. They set out to hone their agile teams’ skills in this area, and asked us to develop an architectural design training curriculum for non-architects. Each of their 20 teams was invited to send a representative to the course, to become a guide to the architectural emergence process in their self-organizing team. Some teams sent product owners, others tech leads, and we saw the occasional information analyst. We got to spend 10 full days with them spread across three months in a kind of Architecture Boot Camp, and learnt a lot.

Autonomous architectural thinking

The learning for us started right away with the selection of the course topics. Apart from the necessary technical content, such as introductions to modern software technologies, application integration and security tactics, we had to get the concept of architectural thinking across. What is architectural thinking in an agile team?

Based on our previous experiences doing architectural design with teams, we decided to focus on two key aspects:

  • Seeing architecture in terms of stakeholder value
  • Making trade-offs in design decisions

Architecture in terms of stakeholder value

One of the problems agile teams have with architecture is that it is sometimes perceived as something that inhibits, rather than enables, business value. There can be several reasons for this. For example, under-the-hood improvements such as refactoring or laying architecture runway are seen as leaching capacity from the teams – capacity that could otherwise be used to create new features. Another example is architectural compliance rules, which can cause delays and make the teams feel constrained in their design choices.

In our curriculum we turn this perception around to make the teams and their business counterparts see architecture as a positive thing. A key skill is translating the impact of this architecture work (the enablers) into business terms, facilitating the explanation of their value to stakeholders. This also gives teams confidence to approach business stakeholders to participate in the design of ‘their’ enablers. Course attendees complete a number of exercises focused on speaking the language of the business, such as a business case for technical debt reduction. They also practice techniques such as architecture roadmapping to prioritize enablers based on current and future business needs.

Trade-offs in design decisions

Focus on stakeholder value also informs better design decisions. Knowledge of good design principles by itself is not enough at the architectural level: architecture is context, so for high-impact decisions the teams should be able to trace the decision criteria all the way to stakeholder value. And that often requires a much closer involvement of business stakeholders than teams are used to. But how do we get the attention of ‘the business’ for topics that, at first sight, seem purely technical?

Getting business stakeholders interested in design decisions may sometimes look like a hefty challenge, but this is where the ‘architecture in terms of stakeholder value’ skill pays off. Their involvement in the decision process is actually quite essential, because without it the team cannot ask the all-important why questions that reveal the real stories and drivers behind stakeholder requirements. Such background stories often lead to new, previously unknown decision criteria, which in turn can completely turn around the team’s design – leading to solutions that are a significantly better fit to the business needs.

One technique that proved very popular (and easy to apply in practice) is decision stories. A decision story captures the essence of a design decision in one sentence, just like a user story does for a business need: “In the context of <context>, facing <concerns>, we decided for <choice>, and not <rejected alternatives>, to achieve <criteria>, accepting <drawbacks>.” The power of the decision story is that it is concise enough to quickly get your head around the essence of a decision, but still encourages a reasonable level of justification. Including rejected alternatives and drawbacks has proved to be especially valuable, since it reflects the work that has gone into the decision, and shows the team’s awareness of alternatives and drawbacks. The decision story assures the reader that the team has not just followed their first hunch. Teams using this technique find it easier to convince stakeholders that their design is well justified. We use decision stories as a valuable addition to more comprehensive techniques like trade-off tables and ADRs (Architectural Decision Records).
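
As an illustration only, the decision story template can be captured in a few lines of Python. This is a minimal sketch: the DecisionStory class, its field names and the example values are hypothetical and are not part of any existing tool or of the course material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionStory:
    """One design decision, captured in the one-sentence template."""
    context: str
    concerns: str
    choice: str
    rejected: str
    criteria: str
    drawbacks: str

    def render(self) -> str:
        # Fill the slots of the template in the order the text gives them.
        return (
            f"In the context of {self.context}, facing {self.concerns}, "
            f"we decided for {self.choice}, and not {self.rejected}, "
            f"to achieve {self.criteria}, accepting {self.drawbacks}."
        )


# Hypothetical example values, not taken from a real project.
story = DecisionStory(
    context="the hotel search page",
    concerns="slow responses at peak load",
    choice="a read-through cache",
    rejected="a larger database server",
    criteria="sub-second search results",
    drawbacks="slightly stale availability data",
)
print(story.render())
```

Making every slot a required field mirrors the point made above: a story cannot be written down without naming the rejected alternatives and the accepted drawbacks.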

Experiences

That’s it, the skills that we found to be essential to architectural thinking. “Don’t you teach them about modeling and documentation?” you may ask. Sure, we spend some time on that too – but always in the light of stakeholder value, as seen in our value-driven architecture documentation approach.

It is now three months after the first batch of team members attended Architecture Boot Camp, and a first evaluation shows effects of the training coming to fruition. Teams are starting to document design decisions, and to involve their stakeholders in the process. During the evaluation it also became clear that the biggest challenge for some teams is indeed to get business stakeholders’ attention. Teams that were not well aligned with their business stakeholders beforehand found it much harder to apply what they learned. There is also a role here for the organization’s architects outside of the teams – they can coach the trained team members and help deal with concerns that cross team boundaries, riding the Architect Elevator (also part of the boot camp training).

Our discussions about architecture across the board show that it is not just Transavia that feels the need for architectural design skills in teams. In order to facilitate the development of these skills in other organizations, we recently made the Architecture Boot Camp available as an open registration curriculum. If you’re interested, the first open architecture boot camp will be run in The Netherlands from March to May 2023.

Architecture: the outside view

Last month, I was asked to give a second opinion on some key architectural decisions and the way they were working out in a client’s IT landscape. Such architecture assessments are some of my favorite engagements, especially when there is sufficient time to get to the bottom of things with a small team of experts. A thorough analysis, underpinned by firm evidence and fleshed out in concrete recommendations, can add tremendous value: value in terms of risk mitigation, prevention of rework (or failure) and freshly spotted opportunities.

The value of architecture assessments

Where does the added value come from? Of course, an outside reviewer can never have more depth of knowledge and experience about a system than the team that has been working on that system for months or years, no matter how experienced or erudite the reviewer is. There is, however, intrinsic value in what Daniel Kahneman calls the “outside view”.

When the same group of people make decisions together for a prolonged period of time, the quality of the decisions may start to suffer from tunnel vision. Tunnel vision is caused by the need to make progress and the desire to have harmony in the group. Time and peer pressure lead participants to leave alternatives unvoiced or to reject them prematurely, and so to see only information that confirms what they already thought, ignoring danger signals (confirmation bias). Asking someone from outside the team to provide a second opinion or perform an independent architecture assessment is a great way to obtain such an outside view, and combat tunnel vision as well as other forms of cognitive bias (such as “groupthink”, anchoring bias and cargo cult – see Philippe Kruchten’s presentation on cognitive biases in software architecture).

Can we quantify the value of the outside view? Almost 20 years ago, research suggested that a well-timed and well-executed architecture assessment saves 10% of a project’s costs on average. The same number probably applies to the delivery effort of epics, features and architecture runway in an agile context. It is likely the timing of independent assessments in an agile context will be different from more traditional up-front design situations. Regardless of where you are on the waterfall versus agile spectrum, independent architecture assessments should be timed when opportunity cost is low, and changes to the architecture are (still) relatively easy.

A good outside view

Providing outside views was one of the responsibilities of our “technical conscience” team at CGI (both for our own solutions and on-site for clients). For over a decade, we gathered many insights into how to add real value with our assessments. We certainly learned what not to do. A good architecture assessment is not:

  • A cat and mouse game between reviewer and team
  • A gate to pass or stamp of approval to obtain
  • An opportunity for the reviewer to show how clever they are
  • An assessment of the team’s performance
  • An audit or compliance check

Avoid these mistakes, as they will significantly reduce the outside view’s added value. In our experience, most of that value comes from:

  • Getting stakeholders together in a room: this is often the first time they hear each other talk about the solution or decision you’re assessing, which often leads to unexpected insights
  • Uncovering the trade-offs that have been made, creating visibility of (assumptions about) priorities
  • Identifying risks and opportunities when there’s still time to do something about them

As you can infer from the first bullet here, a good architecture assessment cannot be based on just a document describing the architecture. You need to talk to the team and their stakeholders to be able to fully understand the context and the drivers behind the choices that were made – ask the “why” questions.

Conclusion

Getting an “outside view” on architectural designs or decisions often yields good insights, and helps teams deal with risks and capitalize on opportunities at relatively low cost. Spot the best time, and ask someone who knows what they’re doing to assess your architecture. </begin shameless self-promotion> Let me know if I can help! </end>

Further resources: book Evaluating Software Architectures (SEI), interview with author Rick Kazman.

Between the Waterfall Wasteland and the Agile Outback

Eltjo Poort, IEEE Software, vol. 37, no. 01, pp. 92-97, 2020.

Agilists and architects too often talk past each other. In this issue’s “The Pragmatic Designer,” guest columnist Eltjo Poort helps to bridge the divide by identifying five architecture responsibilities. This enables teams to introspect about how well they are handling each, and encourages them to avoid the extremes.

A Map to Waterfall Wasteland and the Agile Outback

Over the past 18 months, we have been iteratively developing a way to assess maturity with respect to architecture in an agile context. We did this together with a number of client organizations that were interested in improving their architecture function. I will be sharing some insights from this development in upcoming blog posts. This first insight is about balancing the responsibilities of architecture work, to avoid getting lost in either the Waterfall Wasteland or the Agile Outback.

The nature of architecture

The perception of what ‘Architecture’ should mean in the context of software engineering has gone through a number of changes since it was first used in that context.

The original 1990s perception views architecture as a set of structures that represent an abstraction of a system being delivered – needed to deal with the growing complexity of typical software systems. In that view, the main responsibilities of the architect are to create (visual) models of the system, and to use those models to validate the architecture.

In the early 2000s, a second perception emerged, with a new responsibility focus: architects needed to make important decisions in order to create the right models of their solutions (‘right’ meaning that they fulfill their stakeholders’ needs). If ‘abstraction’ and ‘structures’ describe what the architect creates, the decision making refers to how they create.

Around 2010, partly under pressure of the agile movement’s focus on business value, a third perception was proposed: the why was added to the what and the how of architecture. This view shed light on the goal of architecture: to improve organizations’ control over risk and cost – not only during design, but extending the architects’ responsibility to the fulfillment of the solution.

The list of architecture responsibilities would not be complete without the one prerequisite without which the architect would be unable to fulfill any responsibility: understanding the context of the solution, meaning the stakeholders, their needs, and the environment in which the solution is to be delivered.

So we end up with five responsibilities of architects (or of the ‘architecture function’ of organizations): understanding context, making decisions, modeling, validating and fulfillment of the solution. These five responsibilities also map very nicely to Philippe Kruchten’s “Things architects really do”, to Eoin Woods’ “Architectural focus areas” and to the RCDA practices we have been applying for years.

Dependencies

It is important to note the many dependencies between the five responsibilities identified above. Just to mention a few:

  • modeling and decision making without understanding context will lead to wrong models and decisions
  • modeling actually implies decision making (about decompositions, relationships, etc.)
  • if there are no models and no decisions, there is nothing to validate
  • fulfillment of unvalidated decisions and models leads to trouble

So fulfilling the five responsibilities in isolation is not enough: they should be fulfilled in a coherent way.

Good architecture

Since architecture is context, there’s no such thing as good architecture in an absolute sense: the best one can hope for is an architecture that fits the stakeholder needs in its context. In my experience, the best fitting architectures result from paying proper attention to all five responsibilities mentioned above. This is not easy; due to mostly cultural pressures, dogmas and misconceptions, many organizations ignore some of the responsibilities, resulting in a flawed architecture function. Two extreme examples are the Waterfall Wasteland and the Agile Outback caricatures described below.

Paying proper attention to all five responsibilities, however, does not mean always paying equal attention: depending on the context, modeling may indeed require more attention than decision making, and validation may be more critical in some situations than in others.

When talking to teams, architects and stakeholders in different organizations, we started to notice some interesting patterns in the way they took up these responsibilities. We created caricatures to identify those patterns, and called these caricatures the Waterfall Wasteland and the Agile Outback.

The Waterfall Wasteland

In the Waterfall Wasteland, the architects live in an ivory tower. They ignore the decision making and fulfillment responsibilities, which they consider to be someone else’s. They have a very clear job description: to create perfect models and validate them against stakeholder needs. If the resulting solution is unsuccessful, it’s obviously not their fault. The idea that they would be responsible for decisions or share responsibility for successful delivery is abhorrent to them: it would mean that their success would depend on the capability of others, and on their ability to cooperate…

The Agile Outback

In the Agile Outback, teams usually don’t have architects (although they might use euphemisms like ‘pathfinder’ or ‘master builder’). Modeling is studiously avoided, since “The best architectures…emerge from self-organizing teams”, and doing modeling might offend the agile gods and disturb the ‘magical’ emergence process. Similarly, validating designs is considered a waste of time: failing early and often is a much quicker way to learn and improve.
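
The two caricatures can be read as patterns of neglected responsibilities. The sketch below is only a toy illustration of that reading: the classify function and its exact-match rule are assumptions made for this example, not an assessment method described in this article.

```python
# The five architecture responsibilities named earlier in the text.
RESPONSIBILITIES = frozenset({
    "understanding context",
    "making decisions",
    "modeling",
    "validating",
    "fulfillment",
})


def classify(attended):
    """Name the caricature that matches the set of neglected responsibilities."""
    unknown = set(attended) - RESPONSIBILITIES
    if unknown:
        raise ValueError(f"unknown responsibilities: {sorted(unknown)}")
    neglected = RESPONSIBILITIES - set(attended)
    if not neglected:
        return "balanced"
    if neglected == {"making decisions", "fulfillment"}:
        return "Waterfall Wasteland"
    if neglected == {"modeling", "validating"}:
        return "Agile Outback"
    # Any other gap is reported as-is rather than forced into a caricature.
    return "unbalanced: neglects " + ", ".join(sorted(neglected))


print(classify({"understanding context", "modeling", "validating"}))
print(classify({"understanding context", "making decisions", "fulfillment"}))
```

Real organizations rarely match a caricature exactly, which is why the fallback branch simply lists the gaps.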

Essential architecture skills

The five responsibilities of architects are a good basis for deriving the essential knowledge and skills of the architect:

  • For understanding context, we need analytic skills to identify the environment and stakeholders of our solution, communication and social skills to understand stakeholder needs and knowledge management skills to cultivate and share that understanding.
  • For making decisions, we need decision making skills like making trade-offs and prioritizing. We also need extensive knowledge of the relevant architectural tactics and strategies in our (business and technology) domains.
  • For modeling, we need creative skills for visualizing our design, and knowledge of relevant modeling languages and techniques like decomposition and composition.
  • For validating, we need analytic and trade-off skills (again), and knowledge about techniques for risk management, cost estimation and making business cases.
  • To fulfill our fulfillment responsibility, we need leadership skills like communication, convincing, listening and anticipation. But we also need to know a bit about the economy of delivering software-intensive systems and the relevant software lifecycles.

Looking at this list, it is no surprise that good overall architects are very rare. It might be a good idea to spread these responsibilities over a number of people, and create an architecture function in an organization: a (possibly virtual) team made up of members with complementary skills and knowledge.

Conclusion

Architecture has five responsibilities: understanding context, making decisions, modeling, validating and fulfillment. Organizations need to pay proper attention to each of these responsibilities; what is ‘proper’ depends on the context. Dogmatically denying some of the responsibilities will lead teams astray into the waterfall wasteland or the agile outback. The breadth of knowledge and skills needed to fulfill these responsibilities is substantial, and hard to find in one individual; it almost always takes a team to produce good architecture.

Value-driven Architecture Documentation

“[We value] working software over comprehensive documentation” features proudly on the front page of the Agile Manifesto. Further down on the page, the authors explain that they do not mean that there is no value in documentation (just less value than in the working software). The practice of documenting architectures in particular has significant positive correlations to solution success factors like predictability, customer satisfaction and technical fit. So we should not abandon the practice of architecture documentation, but we should document our architectures in a way that yields optimum business value.

Bloated architecture documentation

In the past few years, I have helped many organizations modernize their architecture way of working. At the start of these endeavors, the average architecture documents are usually quite large – 200 pages is not abnormal. The templates alone often contain dozens of pages. It didn’t use to be like that – when they started out documenting architectures 15 years or so ago, the templates were nice and small, and focused on a limited number of views, designed to show how a key set of stakeholder concerns had been addressed (such as Philippe Kruchten’s 4+1 Views). However, over the years new insights led to new sections being added to the template. Concerns that led to business risks or disruptions required new views, such as a Security view or an Integration view. The emerging realization that architecture is a set of design decisions led to the addition of new ‘Architectural decisions’ sections. Outsourcing part of the development sometimes required yet more detailed views.

The Architecture Document had become the single repository for all architectural knowledge related to a product. This is not a problem per se, except that these same documents were used to show stakeholders that their concerns were addressed by the architecture. Specifically, they were the vehicle for obtaining approval from business owners. Business stakeholders were expected to read, understand and approve hundreds of pages of architecture documentation. Naturally, this led to delays: the architects would spend more and more time preparing the documents, and the approvers would take more and more time approving them. Leaving blank parts ‘to be completed later’ did not help, because it increased uncertainty in the stakeholders about the completeness of what they were approving. Documentation bloat had caused the architectural feedback loop to become longer and longer, and caused significant delays when starting up new initiatives.

The value of architecture documentation

It is not surprising that documentation bloat causes organizations to start questioning the added value of architecture documentation: perhaps that value is lower than the cost of the delays the document causes. So let’s see if we can answer this question: “Why were we doing this again?”

In the organizations I work with, I see three main purposes for architecture documentation. We are communicating the architecture to show stakeholders how their concerns are addressed, often in stakeholder-specific views, but also to involve them in the architectural decision process (driving the architectural feedback loop). Additionally, we are communicating the architecture “to the future” for various purposes – let’s call this architectural knowledge preservation. This leads to three distinct business goals for architecture documentation:

  • Explain how stakeholder concerns are addressed (including business, DEV and OPS)
  • Preserve architectural knowledge (for analysis, realization, trouble-shooting,…)
  • Collaborate on architectural decision making

Each of these business goals has manifest business value, e.g. in terms of decision support, driving progress, design quality, risk and cost control.

Separate according to purpose

When we look at the audiences and timing of the three business goals identified above, we quickly see that they require very different language and rhythms in order to provide optimum value:

The insight that the language and timing requirements of these purposes are so different helps us understand that the ideal of “one document as the single repository of architectural knowledge” is not viable. If we want our business stakeholders to quickly grasp that their budget and delivery concerns are addressed, we should not send them a 200-page document and then ask their approval. If we want the development teams to easily learn how to implement our architecture, we should not bother them with exhaustive lists of business rationale behind all the underlying decisions (we should not hide that rationale from them either: sometimes it is needed to make the right development choices).

The picture above is an example of how to separate the various rhythms and languages of architecture documentation.

Conclusion

The insights above have helped us to significantly reduce the size of the architecture documents presented to stakeholders. In one organization this helped reduce the start-up time of smaller projects from (typically) two months to two weeks – a real boost to agility and benefit to the organization.

Move slow and fix things

Four years ago, Facebook changed its famous motto “Move fast and break things” to “Move fast with stable infra” (not quite as catchy?). Most people associate the original motto with disrupting established business models (see e.g. Jonathan Taplin’s book with the same name). Mark Zuckerberg himself, however, hinted that its origins lie in a particular attitude towards software engineering: “As developers, moving quickly was so important, we would even tolerate a few bugs to do it”. As time goes on, this attitude apparently does more harm than good – as even Facebook found out.

“Move fast and break things”, in software engineering, signifies the need to quickly create functionality in the beginning of a product life cycle. It is related to the first agile principle: “Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.” Functionality represents direct business value, and the “break things” part of the motto assigns lower priority to work that represents indirect business value, often called ‘enablers’.

Agile principles eight and nine, however, stress the importance of sustainable development and technical excellence, and the ability to “maintain a constant pace indefinitely”. Enablers create and maintain the architectural backbone of a product and safeguard long-term quality. An organization that neglects enablers in favor of functionality builds up technical debt; the longer enablers are postponed, the higher the debt and its impact on productivity and quality. A product or organization in technical debt cannot maintain a constant pace indefinitely.

All of the large organizations I encounter in my daily work are struggling with a burden of technical debt in one way or another. In many cases, the pressure to “move fast” has led to underfunding of enablers like ‘under the hood’ improvements and technical debt reduction. They have moved fast, and now things are broken.

How did we get here?

Discussions with architects, product owners and managers reveal some potential causes for the technical debt burden:

  • ‘Regular’ modern business practices: KPIs (Key Performance Indicators) often lead to prioritizing short-term, measurable success over long-term investments.
  • Overinflated stakeholder expectations: temporarily postponing enablers (often for good business reasons) leads to a steep growth in business value, delighting stakeholders. Teams then become reluctant to break the news that they cannot maintain this pace, forcing themselves to keep it up and build up more debt.
  • Cargo cult: the success of Internet giants prompts organizations to adopt their methods, practices and frameworks without properly considering the differences in culture and business model (and when the methods fail, their proponents call for even more extreme implementations, because “we must have been doing it wrong” or “we were sabotaged by non-believers”).
  • Misapplication of specific agile practices, such as:
    • WSJF (Weighted Shortest Job First) prioritization (specific to SAFe, the Scaled Agile Framework): due to their size and indirect business value, enablers generally cannot compete with features and functionality in WSJF prioritization. This is why SAFe advises allocating capacity for enablers separately; ignoring this advice leads to misapplication of WSJF to enablers, and build-up of technical debt.
    • MVP (Minimum Viable Product): an initial product with just enough features to satisfy early customers, and to provide feedback for future development. It is in the nature of MVPs to ignore ‘edge cases’ (rarely occurring uses of the product) and some non-functional requirements (NFRs, e.g. limited scalability). However, edge cases and NFRs can have major impact on the architecture. Continuing development based on an MVP without properly considering (or completely rebuilding) the architecture is a misapplication of the MVP practice that leads to build-up of technical debt.

Product evolution based on a Minimum Viable Product may require rebuilding the architecture
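To make the WSJF point above concrete, here is a minimal sketch in Python. It assumes the usual SAFe formulation, in which WSJF is the cost of delay (user-business value plus time criticality plus risk reduction/opportunity enablement) divided by job size. The backlog items, their scores and the capacities are invented for illustration, and the greedy `plan` helper is a simplification, not a SAFe-prescribed planning method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BacklogItem:
    name: str
    business_value: int    # user-business value (relative score)
    time_criticality: int  # relative score
    risk_reduction: int    # risk reduction / opportunity enablement (relative score)
    job_size: int          # relative size, used as a proxy for duration
    is_enabler: bool = False

    @property
    def cost_of_delay(self) -> int:
        return self.business_value + self.time_criticality + self.risk_reduction

    @property
    def wsjf(self) -> float:
        return self.cost_of_delay / self.job_size


def plan(items, capacity):
    """Greedily pick items in descending WSJF order while they still fit."""
    selected = []
    for item in sorted(items, key=lambda i: i.wsjf, reverse=True):
        if item.job_size <= capacity:
            selected.append(item)
            capacity -= item.job_size
    return selected


# Hypothetical backlog: every score below is an assumption.
backlog = [
    BacklogItem("Feature A", 8, 8, 2, job_size=3),
    BacklogItem("Feature B", 5, 3, 1, job_size=2),
    BacklogItem("Feature C", 7, 5, 3, job_size=5),
    BacklogItem("Feature D", 3, 2, 1, job_size=3),
    BacklogItem("Enabler X", 1, 2, 5, job_size=8, is_enabler=True),
]

# Scenario 1: one shared pool of 13 points. The large enabler is crowded out.
shared = plan(backlog, capacity=13)

# Scenario 2: the same 13 points, but 8 of them reserved for enablers.
features = [i for i in backlog if not i.is_enabler]
enablers = [i for i in backlog if i.is_enabler]
separate = plan(features, capacity=5) + plan(enablers, capacity=8)

print([i.name for i in shared])
print([i.name for i in separate])
```

With these invented scores, the enabler has the lowest WSJF of the backlog and never makes it into the shared plan, however often the planning is repeated. Reserving capacity for enablers gets it scheduled, at the price of two postponed features.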

The road ahead

During the panel discussion on the “Death of the Architect” at this year’s SATURN conference, I joked that the world is drowning in a pool of technical debt. That was a bit overdramatic, but nobody can deny that we have a problem. Not only is the technical debt burden draining businesses’ resources away from their ability to innovate, it is also causing real problems in society – that is how dependent we have become on properly working software. Architectures that do not support edge cases can even have significant ethical implications by excluding groups of clients and citizens from (sometimes essential) services (a number of cases like this are documented extensively in the Kafka Brigade’s ‘The Digital Cage’, currently only available in Dutch).

Technical debt causes real economic, societal and ethical problems. It is time we did something about it – starting by addressing the root causes mentioned above. I do not have the illusion that we can easily change the short-term focus in modern business governance practices, but we can certainly be more transparent in managing stakeholder expectations about sustainable development. We can also be smarter in applying agile practices – most of which is making sure we understand the practices and their limitations before we apply them.

Paying more attention to enablers means there will be less capacity for new things – initially, we will slow down. But maybe that’s a good thing. Grady Booch recently tweeted the question “What does software development look like in a post-agile world?” I hope we will slow down and fix things.

Opportunity Cost in the Technical Debt Business Case

A few years back, I discussed the business case for reducing technical debt, and the importance of accounting for the risk exposure in that business case. However, there is another item in that business case that deserves some attention: the opportunity cost, defined by The New Oxford American Dictionary as “the loss of potential gain from other alternatives when one alternative is chosen.”

When a Dev team spends resources and time on reducing technical debt (upgrading, refactoring, repairing), the team will produce fewer end-user stories during that time. Opportunity cost represents the business value that those end-user stories would have yielded, as a way of accounting for the scarcity of the team’s resources.

The literal term ‘opportunity cost’ is seldom heard during technical debt discussions, but it is often a major factor in deciding when to reduce the debt. Whenever a stakeholder (e.g. a product manager) says something like “Yes, we should do something about this debt, but we cannot afford to do it now”, she is probably referring to the business features that end-users are waiting for, or that have been promised before a certain deadline. In other words, the opportunity cost of reducing the technical debt – the potential gain from the alternative of delivering the business features on time – is higher than the interest on the technical debt incurred during that period.

The figure is an attempt to illustrate the opportunity cost by comparing two scenarios: in scenario 1, the technical debt is not paid back, and in scenario 2, the debt is paid back in release 1.2. The value curve at the top of the figure makes a little dip in scenario 2 (dashed line), compared to the continued growth of scenario 1. Using Philippe Kruchten’s backlog color coding, the figure shows that in scenario 1, release 1.2 introduces five new (green) user stories, while in scenario 2, there is only time for one user story because we have spent the rest of the resources on reducing the (black) technical debt. The gap between the dashed and the solid line represents the opportunity cost of reducing the technical debt. (In case you are wondering why the dashed line goes down in release 1.2, even though we are adding a user story: I always feel that existing business features in a solution are subject to some kind of value decay, due to growing expectations and demands from end-users – debatable I know, but beside the point of this blog post.)

Example

A good example of opportunity cost in architectural technical debt reduction was presented to me by attendees of an RCDA Practitioner Course a few months back. In their organization, a team had been developing business process automation features for 4 years. The organization had kept track of the labor cost savings attributed to that automation effort, which amounted to 9 FTE (full time equivalent positions) per year on average. The platform the software was running on was due for a major overhaul, because it could not easily be made compliant with new European Commission regulations (most notably GDPR). During the overhaul, they would not be able to develop new features – meaning an opportunity cost equivalent to 9 FTE per year, or 0.75 FTE for every month spent exclusively on the overhaul. A significant opportunity cost, but it was determined that the risk of non-compliance outweighed the opportunity cost, in favor of the overhaul.
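The arithmetic in this example can be sketched in a few lines of Python. Only the 9 FTE per year comes from the example. The overhaul duration, the effort cost and the reduced risk exposure below are made-up numbers, and expressing all of them in FTE is an assumption made purely to keep the units comparable.

```python
MONTHS_PER_YEAR = 12


def monthly_opportunity_cost(annual_value_fte: float) -> float:
    """Value forgone for every month in which no new features are delivered."""
    return annual_value_fte / MONTHS_PER_YEAR


def overhaul_opportunity_cost(annual_value_fte: float, overhaul_months: float) -> float:
    """Total value forgone while the team works exclusively on the overhaul."""
    return monthly_opportunity_cost(annual_value_fte) * overhaul_months


def net_benefit(reduced_risk_exposure: float, effort_cost: float,
                opportunity_cost: float) -> float:
    """Benefits side minus costs side; all amounts in the same unit."""
    return reduced_risk_exposure - (effort_cost + opportunity_cost)


# From the example: automation features save 9 FTE per year on average.
per_month = monthly_opportunity_cost(9)

# Hypothetical inputs, not taken from the example.
overhaul_months = 6
opportunity_cost = overhaul_opportunity_cost(9, overhaul_months)
result = net_benefit(reduced_risk_exposure=10.0, effort_cost=3.0,
                     opportunity_cost=opportunity_cost)

print(f"Opportunity cost per month: {per_month} FTE")
print(f"Opportunity cost of a {overhaul_months}-month overhaul: {opportunity_cost} FTE")
print(f"Net benefit of the overhaul: {result} FTE")
```

A positive net benefit corresponds to the outcome described above: the reduced risk exposure outweighs the effort and the opportunity cost combined. With a longer overhaul or a smaller risk, the same calculation could just as well argue for postponing.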

In conclusion: if you need to draw up a complete business case for taking care of a piece of technical debt, make sure you include the opportunity cost on the costs side. This will help to facilitate a rational discussion about the impact of delaying features, putting this (architectural) choice in its business context. And while you’re at it, don’t forget to include the reduced risk exposure on the benefits side!

What happened at SATURN 2018

The annual SATURN conference is usually one of the highlights of my professional year, and SATURN 2018 was no exception. This year’s edition took place in Plano, TX, and presented another jam-packed cornucopia of digital architectural knowledge in four parallel tracks over four days. Don’t be sad if you’ve missed it: all sessions have been recorded and will be posted to the SATURN playlist on YouTube. I will certainly check back there to see some of the talks I had to miss – because there was just too much going on in the parallel tracks.

Monday

Monday was warming-up day with full-day training sessions and a workshop on “growing great designers”. I attended the workshop, in which we discussed how to select and train designers and architects – but also how to get organizations to care about good design. It was interesting to brainstorm the essential “canon of knowledge” about software design, which made us realize that the paradigm of architecture as a set of design decisions is still not as accessible as we would like: there are great books and established training curricula about many other design paradigms, such as patterns, modeling and many architectural styles – but if someone asks “How do I put the idea of architecture as a risk-driven decision making discipline in practice?”, there’s no easy answer yet (although George Fairbanks’ “Just Enough Software Architecture” comes close, as does the RCDA training program). The workshop outcome can be found on github.

Tuesday

The conference proper took off with a keynote on evolutionary architecture by Rebecca Parsons of ThoughtWorks. She introduced the idea of an “Architectural fitness function” to describe how close an architecture is to its desired characteristics. Measuring the function over time would enable us to monitor architectural quality during an architecture’s evolution. Extending quality attribute scenarios with temporal behaviour?
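As a rough illustration of the idea (my own sketch, not the definition given in the keynote), a fitness function could score each release against a set of target values for desired characteristics. The characteristic names, targets and measurements below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Characteristic:
    name: str
    target: float
    higher_is_better: bool
    weight: float = 1.0


def fitness(characteristics, measurements) -> float:
    """Weighted score between 0 and 1; 1 means every target is met."""
    total_weight = sum(c.weight for c in characteristics)
    score = 0.0
    for c in characteristics:
        measured = measurements[c.name]
        if c.higher_is_better:
            ratio = measured / c.target
        elif measured <= 0:
            ratio = 1.0  # nothing measured against a "lower is better" target
        else:
            ratio = c.target / measured
        score += c.weight * min(ratio, 1.0)  # exceeding a target earns no bonus
    return score / total_weight


# Hypothetical desired characteristics.
desired = [
    Characteristic("p95_latency_ms", target=200, higher_is_better=False),
    Characteristic("test_coverage_pct", target=80, higher_is_better=True),
]

# Hypothetical measurements, taken at each release.
history = {
    "1.0": {"p95_latency_ms": 400, "test_coverage_pct": 40},
    "1.1": {"p95_latency_ms": 200, "test_coverage_pct": 80},
}

trend = {release: fitness(desired, m) for release, m in history.items()}
print(trend)
```

Tracking `trend` from release to release is one way to monitor architectural quality over time, in the spirit of the keynote. Which characteristics to include and how to weigh them is an architectural decision in itself.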

In the afternoon, we played the Smart Decisions Game – a fun way to introduce people to the idea of making architectural trade-offs based on quality attribute requirements. The task was to select machine learning algorithms for a specific context, and we each received a deck of cards. Each card summarized an architectural choice (an algorithm), including its impact on quality attributes. The game was extremely well prepared by the guys from SoftServe and Rick Kazman, and I was especially impressed by the ease with which we could plug these algorithms into their toolbox.

We closed the day with a nice reception, playing quaint folklore games like Cornhole and the bizarre Armadillo races…

https://twitter.com/SATURN_News/status/994200180185534466

Wednesday

Wednesday’s keynote by Ricardo Valerdi was a fascinating journey into the world of Virtual Reality. Ricardo told the story of the development of an app for creating concussion awareness among football players. It was a nice bridge between architecture and user experience, which becomes ever more important as an architectural concern.

Michael Keeling then presented a very practical (as always) lesson on managing technical debt.

I was especially happy to see Thijmen de Gooijer expanding on the metaphor of marriage for committing to architectural decisions which I had introduced a year earlier in my talk on architecture life cycles. Thijmen had some great examples. Trying out compatibility with a potential mate by making a trip to IKEA is a bit like setting up a continuous delivery pipeline – the only way to the exit passes by everything you could possibly run into, and if you have seen your mate’s reaction to all that is on display you get a pretty good idea of what life with that person will be like…

In the afternoon, I talked about Shortening the architectural feedback loop – an extended version of my blog post with the same name. We had nice discussions, and this talk was voted runner-up for the best presentation award. [edit: watch the talk on the SEI YouTube channel].

Death of the architect

After my talk, there was an amusing panel discussion about what’s happening to the role of software architect and how teams should make important crosscutting design decisions, called “Death of the architect”. The general consensus seemed to be that it is a good thing that architecture is becoming more of a shared concern than the domain of a single person – however, the pendulum seems to have swung too far in the direction of decentralized decision making, to the point where people don’t care about design anymore, or even refer to architecture as “the dark side”. This excessive denial of the importance of cross-cutting coherence has led to a worrying build-up of technical debt across many organizations, with some disastrous consequences. This prompted the remark that “If the architect is dead, the world is doomed: it will drown in a pool of technical debt.”

The other highlight of the afternoon was a nice talk by Sebastian von Conrad, about applying the concepts of Douglas Hubbard’s “How to measure anything” to quality attribute quantification.

Thursday

The final day started with an amusing game of slide roulette (a lot of fun, but why schedule it at 9AM???), after which Linda Northrop Award winner Eoin Woods gave a very nice acceptance keynote, describing the Five Ages of software architecture, and his outlook for the coming age of “Dissolving systems”.

After two interesting GDPR-related talks by David Max and Andrzej Knafel, it was time for the Ethical Software Architect workshop that I had prepared with Michael Keeling. [edit: watch the session on the SEI YouTube channel]. We looked at some examples of ethical impact of architectural decisions in news headlines, and at some reasoning frameworks architects can use when they are confronted with ethical dilemmas. We then split up into groups to discuss fictitious scenarios that any architect could sooner or later encounter. These discussions became so lively that we got some complaints from jealous speakers in neighboring rooms afterwards! The attendees were so happy with this session that they voted it number 1 for this year’s best presentation award – which says a lot about how architects feel about ethical aspects of their work. There seems to be a real need for more ethical guidance for architects in the digital world, and I would be surprised if it didn’t become a regular topic at architecture conferences in the coming years.

The closing keynote was by Michael Nygard, famous for best seller “Release It!”, a book about building software that survives the real world. Michael philosophized a bit about various categories of coupling between modules and systems: Operational Coupling (Consumer cannot run without the provider), Development Coupling (Changes in producer and consumer must be coordinated), Semantic Coupling (Change together because of shared concepts), Functional Coupling (Change together because of shared responsibility) and Incidental Coupling (Change together for no good reason). These musings reminded the “elderly” among us of work done in the 70s and 80s about modularity and the birth of object orientation, but it was nice to see these design fundamentals expressed anew by someone who has great impact on modern views of software engineering management.

Inspiration

All in all, the conference left me feeling very fulfilled and inspired (and not just because of the nice recognition I received from the attendees’ rating of my sessions).

Next year: Pittsburgh!