John Obelenus
Solving Problems & Saving Time through Software and Crushing Entropy
There are two major upsides to duplicating data. Make no mistake, when I say “duplicated” I mean it. I mean denormalized data.
I can already hear the chorus: “One Source Of Truth”. And I agree, there should be one source of truth *in most cases*. But sometimes that doesn’t apply.
Let me define the situation in which this does not apply: a complex, concurrent, multi-service, multi-user system. If you’ve got one of those, keep reading.
I liken this to double-entry bookkeeping. There is not one source of truth. You only get to the truth when you’re comparing two (or more) records.
This notion of double-entry bookkeeping is the first major upside to “duplicating” data. Or at the very least, storing it in multiple forms, at multiple stages along its transformation. Why? Because when something goes wrong you have a breadcrumb trail to help figure out at which stage it went wrong. When you only have one source of truth that is constantly being overwritten—and something goes wrong, because something always goes wrong—you’ll never have any insight into what the state of the data was before the error. It’s gone. Overwritten. Inaccessible.
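A minimal sketch of that breadcrumb trail (the pipeline, field names, and tax rate here are all hypothetical, purely for illustration): instead of overwriting one record in place, each transformation stage is stored alongside the last, so you can replay the trail when something goes wrong.

```python
# Hypothetical sketch: keep a snapshot of each stage of a transformation,
# rather than overwriting a single "source of truth" in place.

def normalize(order):
    # Example transformation: uppercase the currency code.
    return {**order, "currency": order["currency"].upper()}

def apply_tax(order, rate=0.08):
    # Example transformation: add a tax-inclusive total.
    return {**order, "total": round(order["subtotal"] * (1 + rate), 2)}

def run_pipeline(order):
    # Store every intermediate form, not just the final one.
    stages = [("raw", order)]
    order = normalize(order)
    stages.append(("normalized", order))
    order = apply_tax(order)
    stages.append(("taxed", order))
    return stages

stages = run_pipeline({"currency": "usd", "subtotal": 100.0})

# When something breaks, you inspect the trail stage by stage,
# instead of staring at one overwritten record.
for name, snapshot in stages:
    print(name, snapshot)
```

The storage cost is real, but the trade is exactly the double-entry one: you pay in space to keep the ability to compare records and locate the stage where things diverged.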
The second upside is optimization. The One Source Of Truth is often designed strictly and ontologically: all the relationships conform to the platonic ideal according to the essence and relationships of the object (sorry, I studied philosophy). This design values ontology over the pragmatism of a running process. Duplicating data flies in the face of this thinking and instead optimizes for access patterns; namely, how you’re going to read and write this data.
My recommendation is that your initial source of truth is write-optimized. Make it easy and fast to write your data. The faster and more compact your writes are in a distributed system the lower the probability of losing (much) data when problems arise.
Duplicate and transform your data into as many different shapes as you need in order to read fast.
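As a sketch of that split (every name here is invented for illustration), the write side can be a bare append-only log — fast, compact writes — while the read side duplicates the same data into shapes matched to the questions you actually ask:

```python
# Hypothetical sketch: a write-optimized append-only log, plus duplicated,
# read-optimized projections derived from it.

events = []  # the write side: append-only, fast and compact

def record(event):
    events.append(event)  # the only write path: an O(1) append

# Read side: duplicate the same data into shapes matching access patterns.
def by_user(log):
    # Answers "show me everything this user did" quickly.
    view = {}
    for e in log:
        view.setdefault(e["user"], []).append(e)
    return view

def totals(log):
    # Answers "how much has each user spent" quickly.
    view = {}
    for e in log:
        view[e["user"]] = view.get(e["user"], 0) + e["amount"]
    return view

record({"user": "alice", "amount": 30})
record({"user": "bob", "amount": 5})
record({"user": "alice", "amount": 12})

# Two "duplicated" read models, each optimized for a different question.
print(totals(events))                 # {'alice': 42, 'bob': 5}
print(len(by_user(events)["alice"]))  # 2
```

In a real system the projections would be materialized incrementally (or live in a separate read store) rather than rebuilt on every read, but the shape of the idea is the same: one cheap write, many purpose-built copies for reading.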
If you ever want to keep a system fast and reliable, you have two options: keep it really, really small, or embrace the CAP theorem and data duplication.
Read more: Against the Purists
It feels like whittling. Even though “construction” is the most frequently used metaphor for trying to describe what we do all day at work, it doesn’t feel right. I also don’t know why we only use wood-working metaphors. Any psychiatrists in the house?
It feels like whittling to me because of something called “composability.” Composability is all about combining pieces of code to create something bigger that does more work. There are several ways to do this, and at times there are very real technical challenges. But this is a lot harder than it sounds.
When we think of construction, the first attribute we think of is its composability. Any weekend warrior can go to Home Depot and get what they need, and so long as they’re paying attention, asking the right questions, and checking the specs, all the pieces will work together very nicely. As a software engineer, that would be amazing. I dare say we experience the opposite.
Very few things “Just Work”. Many commentators have lamented that a lot of programming tasks “these days” are just writing “glue” code to put various things together. I don’t disagree, but I also don’t think that is a Bad Thing™️, and it’s not easy to do well either. Because composability is hard.
In fact, it’s so hard that there are many times we have to pass on re-using working services and code, because retro-fitting them to our new purposes would actually take longer than just doing it over, precisely the way we need it. Composability can be harder than making something brand new.
This is why I chose whittling as my metaphor. Sometimes you get to simply take something you’ve already done and re-use it, whole-cloth. But, more often than not, you have to start from a blank slate, and mold the code/chisel the piece of wood until it works/looks just the way you need it to for what you’re doing.
Read more: What Building Software Feels Like
There are tons of opinions and ideas on how to get software engineers to work together, and I’ve shared some of my own before. I think the crucial one to get right is the idea of a Directly Responsible Individual (DRI). Whether it is a bug, a feature, a service, part of a product, or the whole product (for smaller products), there should be one DRI.
I believe this is the most effective, and most natural, way to do our work. I believe this is, more or less, how open source projects tend to work: single individuals own features, parts of the project, bug fixes, and processes. And decisions flow through those individuals.
It is very simple. The DRI is the source of truth for that indivisible thing. And “indivisible” is how you decide how many DRIs you need to have. The DRI is where the buck stops. They are fundamentally responsible for what they are given to work on. For a sports example: the coach is the DRI of the team; when the team fails to perform, the coach gets fired. Why? Because there was zero question who was responsible for the on-field performance of the team.
The DRI gets the final say. You can imagine this as if they have a 51% voting majority on every part of their project. Ownership in the strict sense of the word. They should, of course, ask for input, feedback, and opinions from others on their team and from people outside it. The individual knows their piece is but one piece of the greater whole, and everyone is working towards reaching the final goal. But the decision ultimately rests with them. Why? Because the responsibility of delivery rests with them. You cannot give them full responsibility without full autonomy; otherwise you’re sabotaging them from the start.
The list of responsibilities you give to the DRI is up to you. If TDD is important to you, make it their responsibility too. If you want them to fully document their project, from goals to decisions to user training, go ahead. If you don’t have a dedicated DevOps team, you can make it the DRI’s responsibility to manage the project in production. Of course, if you have a DevOps team, or a product team responsible for documentation, or a QA team, those teams ought to have their own responsibilities. Make the lines crystal clear.
Whether everyone works on their own, is paired up, or a mix of the two, there should still be a single DRI. If you’re pairing, give each person a related thing to own. They can switch during the day as to whose project they are working on, and who is making the final decision.
Whether you have teams of three, or teams of eight, or twenty, you should be able to chop up the work into indivisible pieces. Naturally, there will be pieces that have to work closely together; two different pieces sharing the same data store and data models, many pieces using one API service, etc. But so long as you have defined and documented interfaces, you can surface discussions around a ChangeLog when things need to change.
I believe this works because owning what you’re working on gets you invested in doing a good job. It’s yours, no one else’s. Since it doesn’t belong to anyone else, it will reflect you and your choices.
There can be no buck-passing; “Oh, I thought someone else was going to do that,” and there can be no wondering about why something didn’t get done. There can be no disagreement as to how something ought to work, because one person makes that decision. The only question becomes: is it performing?
This makes management’s job much clearer; they now have one job, and that job is to set context. Your manager, or lead, doesn’t have to wonder whether or not they should “jump in” and help you out. You can ask them. When you feel they are deciding things for you, you get to say: thank you for your input, I will make the decision. Both micromanagement and drive-by management need to be shut down, because it is no longer their job to make decisions. Their job is to set the context.
Management needs to give you the reasons why we’re doing this work. They need to give you the end goal, the job your software needs to do. They need to tell you what you’re starting with, and where you’re headed. They don’t get to decide the path you’re going to take.
Read more: A Responsible Individual
An adjective everyone wants to claim: resilient. “Our service/platform is resilient.” Many have gone into great detail teaching us all how to achieve resilient infrastructure.
But I think we have skipped over something. Something more fundamental. Conway’s Law:
[O]rganizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.
If you’re looking to build a resilient system, you had better have a resilient organization, and resilient communication; otherwise you’re going to be ice-skating uphill all the time. You may yet achieve your goal. And then one day you’ll realize you’ve slipped and lost that resiliency.
It comes down to actually creating a culture in your organization that truly values resiliency. That is the only way you can be sure you’ll continually reach your goal.
The biggest success story for this topic is Amazon. The first goal wasn’t resiliency; it was pleasing their customers. But Bezos knew that without a resilient system, the customer experience was always subject to volatility. We all know what he did. He separated his teams.
He demanded that when his teams communicated with one another they did so through official channels. Just like any other customer would. In order to be absolutely sure the customers would be treated well—he turned teams at the same company into customers. This is a rather extreme, and useful, form of “dog fooding” (using what you build).
Rather than just telling everyone to “Be More Resilient”, as if that is a thing you can expect to work, he introduced a constraint that resulted in changed behavior. And the result of that behavior was very resilient systems that are highly regarded by everyone in our industry.
Read more: Resilient Microservices
Search is a fairly solved technology these days. Google reigns supreme for searching the internet “writ large”. Since they’ve done such a good job, all the e-commerce and product sites have learned how to do it well. It is not hard to find the things you’re looking to buy.
But what about knowledge? And not just any knowledge—specialized knowledge. The knowledge that you and your organization know. The kind you don’t want anyone else to know about. It’s not necessarily “Top Secret”, but you don’t put it on blast for everyone to see.
Searching your own institutional knowledge is hard. After all, that is why it is still useful to “know where the bodies are buried”: institutional knowledge is hard to find. Part of the issue is that the search corpus is often spread out between many tools and data sources. A lot of it is never written down. And if it is written down—are you willing to bet money it is up-to-date and current? I wouldn’t.
Like I wrote about in Shared Understanding, Imperfect Representation:
There are lots of good tools now to understand what your system is actually doing. There are lots of good tools now for understanding what your users are actually doing. But within your team(s) no one person has the big picture. Everyone has a part of the model of how things ought to work. Stitching that together repeatedly, with little margin for error, is very, very hard. There are not very many tools that are specialized for that.
This is not simply a Software Engineering problem. This is a problem with all Knowledge Work. Why? Because there is no external referent for knowledge work. It is all concepts that live in our heads, relationships that exist between us, and words exchanged in person or written down. You can’t see this work like you would see a building in progress, a car on an assembly line, or even a bike being assembled in your local bike shop.
There are many tools out there that people are using, but no one likes them. None of them actually achieve the goal. I believe there are three reasons.
The first reason is that I don’t think they’re focused on the process—they’re focused on the output. Wikis and word documents presume that you’ve already come to a conclusion. This is why you have seven word documents all named the same with various “final-v1”, “v2”, “final-final” on the end of them. This is a learning process, not a static end; recognize that.
The second reason is that these tools make it too hard to get your thoughts out quickly, so people don’t use them. The tools ask too much of you up-front: what is the title, what kind of template do you want, what category does this belong to. Meanwhile, you’ve got a nugget of knowledge burning a hole in your skull, just trying to get out onto the page before you forget it or get distracted by the next thing.
The third reason is culture. There is value in being the person who knows where the bodies are buried. There is value in scarcity. If you don’t have a culture that values collaboration and writing things down I hope you’re not holding your breath on being able to find that one thing you’re looking for.
Read more: Finding Information Is Hard