Layers, Complexity, and Trends

15 Oct 2020

There is a short post that I cherish called "Complexity Has To Live Somewhere." Sometimes engineers need to be reminded that if everything were simple, clean, and easy, we wouldn't be getting paid. It is our job to create the interfaces, the abstractions, and the patterns that place the complexity where it belongs.

The truly insidious part about complexity is accidental complexity. To use a metaphor, it is "the cost of doing business." Accidental complexity is all the other crap that you have to do before you can do what you need to do. There are lots of reasons for accidental complexity: leaky abstractions, tech debt, and technical decisions made before your time.

I've written before about how writing software is more of an artisan process than a construction process. That is because this is still a young, immature industry, despite how large and monied it is. We're still learning how to do this. Humans have been constructing buildings for thousands of years; we've figured out a few things and standardized on nearly everything. That said, there are still leaky abstractions and technical debt in construction.

I live in a house constructed in 1905. This has taught me a lot. Nothing is square, level, or plumb. That said, if you've ever walked into new construction, nothing is square, level, or plumb there either. Over that time there have been decisions and bad work. As an unskilled construction worker I can get by fixing things in this house. But the true expert, the master of their craft in construction and carpentry, can walk in here and produce work that results in perfection despite what is underneath.

We aren't close in tech to allowing unskilled workers to make real progress. Hell, we're still trying to help ourselves.

Most of the startups that cater to other developers are working to eliminate accidental complexity for others by taking it on themselves and asking folks to pay for it. Many open source projects (I'm looking at you, Kubernetes) are designed explicitly for removing this accidental complexity. They band servers together and try to make managing all the servers in a data center look like managing a single server. You can only reason about things that can fit inside your head. K8s is an operational metaphor that tries to get everything into your head so you can reason about it.

I look at Lambdas and messaging services (whether it's AMQP or Kafka), and I see the attempt to make a distributed system made up of dozens of CPUs and hundreds of gigs of memory operate as if it were one system. By making servers "not matter" you can think about your running application as a logic puzzle that runs on one system.

This is not limited to system design. We do this when we architect our code as well. We create interfaces and patterns that put complexity in the one spot where it is necessary to deal with. And we attempt to wall off each different complexity from one another.
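As a rough illustration of putting complexity in one spot, here is a minimal Python sketch. Every name in it (`KeyValueStore`, `FlakyRemoteStore`, `RetryingStore`, `greeting`) is made up for this example. The retry logic lives in exactly one wrapper, and the calling code never has to know the network is unreliable.

```python
from typing import Protocol


class KeyValueStore(Protocol):
    """The simple interface the rest of the code is allowed to see."""

    def get(self, key: str) -> str: ...


class FlakyRemoteStore:
    """Hypothetical stand-in for a remote service that fails a few times first."""

    def __init__(self, data: dict, failures: int) -> None:
        self._data = dict(data)
        self._failures = failures

    def get(self, key: str) -> str:
        if self._failures > 0:
            self._failures -= 1
            raise ConnectionError("simulated network failure")
        return self._data[key]


class RetryingStore:
    """The one place where the retry complexity lives."""

    def __init__(self, inner: KeyValueStore, attempts: int = 3) -> None:
        if attempts < 1:
            raise ValueError("attempts must be at least 1")
        self._inner = inner
        self._attempts = attempts

    def get(self, key: str) -> str:
        last_error = None
        for _ in range(self._attempts):
            try:
                return self._inner.get(key)
            except ConnectionError as exc:
                last_error = exc  # remember the failure and try again
        raise last_error


def greeting(store: KeyValueStore, user_id: str) -> str:
    # Business logic: no retries, no network concerns, just the task.
    return f"Hello, {store.get(user_id)}!"


store = RetryingStore(FlakyRemoteStore({"u1": "Ada"}, failures=2))
print(greeting(store, "u1"))  # prints "Hello, Ada!"
```

When the abstraction leaks (say, the service is down for longer than three attempts), `greeting` still gets a `ConnectionError`, which is the point where you need someone who understands what is underneath.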

This is the trend we've been on in cloud computing: eliminating anything and everything we can that makes applications look distributed. We don't always succeed. Just like in construction, nothing is level, square, or plumb. When the abstractions leak we need the experts, the master artisans.

We are trying desperately to do more with less. Fewer engineers, fewer meetings, because those are the most expensive pieces of this puzzle: the network effect born of communicating. I'm not sure we're succeeding yet, but we're trying like hell.

Space Required

6 Oct 2020

There is a unique failure mode that I have experienced: trying too hard. I never saw this one coming, and I never expected it. "Trying too hard" means that you have a tight grip over each aspect of every project. The biggest symptom I have experienced from this failure mode: The Bystander Effect.

"Bystander effects" are when folks on the team wait for someone to tell them what to do or to make a decision. It is a lack of initiative. The team expects the leader to do all that. And this leader doesn't have to be a manager or "the boss", it can be a fellow team member. Trying too hard can result in unexpectedly becoming a gatekeeper. Everything needs their stamp, their confirmation, their inclusion. The team learns "this is how we do things here" and waits.

If you are seeing bystander effects, and lack of initiative, the antidote is space. Space is required for people to take ownership. Space is required for people to grow. Space is required for people to make improvements, and to improve themselves.

Creating space looks like putting in guardrails and explicitly stating your expectations. However, space is not a vacuum. It will be rocky at first, and folks will say it feels like a vacuum. Stating expectations is the tool to align everyone's behavior; this is the hands-off part (e.g. give the problem, and loosely define the process). The guardrails are where you want the team to include you in the process again. They will vary with your situation and may include: "if the solution will cost more than $X", "if the solution will take longer than Y", or, when there are multiple teams and boundaries involved, "if the solution requires changes to interfaces or APIs that other teams manage".

A good test of whether or not your team is set up for success and owning their own work is: can you go on a two-week vacation and see progress when you return?

Choosing What Problem To Have

1 May 2020

Most of our engineering decisions these days are about choosing which problems you want to have. For me, this realization came off the back of the popularization of the CAP theorem. Learning that the natural constraints of a distributed system force you to pick between strong consistency and availability (with eventual consistency) means you have to choose what problem you want to have. Would you rather your system have slower responses, or even be down? Or would you rather have data "look... off" for a while, and write rules to manage conflicts, or, heaven forbid, manage some manually? While the big players are building for the route of eventual consistency, it is absolutely legitimate to choose strong consistency. This is a thorny, if abstract, problem, but choosing your problems exists on smaller scales too.

A recent example of this for me is dealing with any sort of data de-normalization. There are a lot of engineering choices to make here around transactions, code simplicity and reuse, and managing the cost of compute time. Often the simplest and easiest-to-reason-about implementation is a standard ETL every X hours. The problem you've chosen to have is that for up to X hours your data is out of date. That is also an easy problem to explain to people. But this is not always a viable problem to deal with.

I love database triggers and procedures for these sorts of operations when the cost of compute is low enough. No matter how many points throughout the code touch this data, the database is always the commonality, so the code-reuse problem essentially disappears. Putting the operation in the database handles the transaction issue; the original operation is not completed until the trigger finishes. This works well enough when the compute cost is low. When it's high, you're going to slow down a high-traffic database with locks.
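Here is a minimal sketch of that trigger approach, using Python's built-in `sqlite3` module so it runs anywhere. The schema (`authors`, `posts`, and a de-normalized `post_count` column) is invented for illustration; a production database would have its own trigger syntax and locking behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE authors (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    post_count INTEGER NOT NULL DEFAULT 0  -- de-normalized value
);
CREATE TABLE posts (
    id INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES authors(id),
    title TEXT NOT NULL
);

-- Every code path that touches posts goes through these triggers.
CREATE TRIGGER posts_after_insert AFTER INSERT ON posts
BEGIN
    UPDATE authors SET post_count = post_count + 1 WHERE id = NEW.author_id;
END;

CREATE TRIGGER posts_after_delete AFTER DELETE ON posts
BEGIN
    UPDATE authors SET post_count = post_count - 1 WHERE id = OLD.author_id;
END;
""")


def post_count(author_id: int) -> int:
    row = conn.execute(
        "SELECT post_count FROM authors WHERE id = ?", (author_id,)
    ).fetchone()
    return row[0]


conn.execute("INSERT INTO authors (id, name) VALUES (1, 'ada')")
conn.execute("INSERT INTO posts (author_id, title) VALUES (1, 'first')")
conn.execute("INSERT INTO posts (author_id, title) VALUES (1, 'second')")
conn.execute("DELETE FROM posts WHERE title = 'first'")
conn.commit()

print(post_count(1))  # prints 1
```

The trigger's update runs inside the same transaction as the statement that fired it, so rolling back the insert also rolls back the counter. The cost is that the extra write happens on every insert and delete, which is exactly the problem you are choosing to have.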

If you put the ETL operation in a separate re-usable codebase that operates in a different transaction, and the compute cost is high, you have the potential for race conditions based on how you're kicking off your tasks and what run-once guarantees you have. Sometimes the constraints of your work choose the problems you are going to have.
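One common way to get a run-once guarantee is to have workers claim a unit of work atomically before starting it. The sketch below assumes a hypothetical `etl_runs` table keyed by the ETL window; the primary key lets the database decide the winner when two workers race. It uses `sqlite3` only to stay self-contained, and it is one option, not the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE etl_runs (window_start TEXT PRIMARY KEY, worker TEXT NOT NULL)"
)


def try_claim(window_start: str, worker: str) -> bool:
    """Return True only for the first worker to claim this ETL window."""
    cur = conn.execute(
        # The primary key rejects a second claim; OR IGNORE turns that
        # rejection into "zero rows inserted" instead of an exception.
        "INSERT OR IGNORE INTO etl_runs (window_start, worker) VALUES (?, ?)",
        (window_start, worker),
    )
    conn.commit()
    return cur.rowcount == 1


window = "2020-05-01T00:00"
print(try_claim(window, "worker-a"))  # prints True
print(try_claim(window, "worker-b"))  # prints False
```

This only guarantees that one worker starts the job. If that worker dies halfway through, you have picked a different problem: detecting and re-running abandoned claims.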

I see this playing out on the front end as well. I am, apparently, in the minority who thinks that the benefits of React and its ecosystem are not worth the problems they bring. I do not want, and I don't want my customers to have, giant payloads and big CPU loads (especially on lower-end devices). I honestly don't think the developer experience is even that good.

There are some upsides that come with second-order problems. There are tons of components and packages; like Python, there is a package for whatever you need to do. However, this leaves you a problem to solve: you are going to have to fight against the grain to make it all come together and look the way you actually need. My pet theory is that this is why there are so many front-end jobs. You can pick a template, a theme, or even a Facebook, Google, or Bootstrap design system and then grab a machete to make it match your brand. That is not a problem I want.

The other second-order problem is that developers only know how, or only want, to solve problems with React. Many have lost track of the underlying things we are trying to solve in the browser and deliver to the user. This is how people get blindsided: they've lost the forest for the trees, and they never see the next thing coming. And there are lots of possibilities on the front end that don't include shadow DOMs or JAMstacks. The JAMstack is a great solution for bringing the fastest first paint to the user by pushing content as far out to the edge as you can, which means it has to be static. That is one way to solve sending giant payloads: putting them as close to the user as possible.

If we aren't continually paying attention to which problems we are good at solving, capable of solving, as a team, we run the risk of picking a problem that can do some damage, by accident.

Modals Are The Worst

25 Apr 2020

This is my "Hot Take" for the front-end world. I truly think modals are bad. Bad for the end user, and a crutch that too many developers and UX folk rely on.

Let's start with the users because they matter most. We have implemented modals to take over the whole screen, to be an interruption, and to hide all the content behind them. My challenge for the experience is: replace every modal you have in your application with a browser alert() or confirm() call. The only difference is that the native dialog locks the main thread, but you've already locked the thread in the user's mind, so you might as well.

There have been numerous times where I need other data on the screen while I'm in the modal. But I can't see it; it's hidden behind the modal. And I can't get to it. And of course, once you close a modal, whatever you did in there is gone.

Looking at modals on mobile is the nail in the coffin. Due to the screen size, a modal must take up the entire screen. What is the difference between that and "navigating to a new page" in the mobile UI metaphor? When you actually navigate to a new screen you get the standard back button right where you expect it, right where it is in every single other application. By using a modal on mobile you've confused your user and broken the standard UI metaphor for these devices.

Using modals is how we developers show our laziness. It is a crutch both in terms of not wanting to do more work in creating new pages (or states), and not wanting to think about how to improve the UI so a modal window is unnecessary. There are lots of possibilities, ranging from small and unobtrusive, like inline editing, to more fundamental, like left/right pane splits (e.g. a list on the left, detail on the right). I think we really need to start thinking about task-specific UX where we optimize for the task a user is trying to do. Build a UI specifically for that task. Don't just throw a modal into the mix because it allows the user to do something. I am willing to bet it is clunky. And if your job were to use that modal window hundreds of times a day, you would quickly redesign it into something better.

I think the web would be much improved if ad blockers could block every custom modal window. Add modals to the deny list. There is an up-and-coming HTML <dialog> element, which is a replacement for alert() and confirm(). Let's build toward that, and destroy custom modals. I know people will abuse the dialog to take over the whole window, but it does not have to. And if it is an HTML element, it can be controlled by the browser; there can be system settings to protect users when we developers abuse it. I imagine the possibilities of site lists that block their use of the element.

Choices: Engineering, Organizations, and Constraints

4 Apr 2020

I have not found a better distillation of the Gordian knot that is modern software development. There was a time when one person could, effectively, have the entire computer in their mind and produce good software. That time is long since past. Due to the technical and economic choices we have made, and are living with, we are forced to work together. Michael Feathers is, I think, holding up mob programming as a solution to the problem. I absolutely think it is one possible solution. I think there is at least one other possible solution. They each have their own constraints and I want to walk through them as a thought exercise.

Mob Programming

What are the constraints around mob programming? Most teams have been doing it in person. As work moves towards remote teams we are going to lose the high fidelity of in-person communication and bonding as a team. As teams become more distributed across timezones, getting people together at the same time is going to be more difficult. For remote participants it is a lot harder to stay focused on people on the other side of a screen than if they were in person.

One constraint that remains in place whether it is in person or remote is the increased importance of personalities gelling. Programming can be a very intense mental activity at times, and, to me, it can start to feel like letting someone in your head. You're going to be with these people for many hours a day, so there are going to be interpersonal problems. The makeup of these teams and the management of these issues are crucial for long-term success. Any issues that would have come up in a team that is not mob programming are going to be multiplied by mob programming. These are real constraints that are going to require a lot of effort on the part of people managers. The folks working very closely together are going to need to be fully mature adults capable of managing their own emotions and being socially aware. These two traits are not something we commonly find in certain areas of our industry.

One Person Per Service

I'm not sure I've ever seen someone fully describe what I am about to try and describe. I'm not sure anyone we have all heard of, that I could point to, has done this. I think my principle here is: crystal clear divisions of responsibilities.

There has been a consistent trend in the industry around best practices of making things smaller, less dependent, and more frequent. One example: make smaller commits that are atomic, and deploy each of them. Don't work for a week on a feature branch, push 100 commits, and have to figure out merge-hell on Friday when you are trying to deploy. A second example is a preference for (responsibly using) microservices over monoliths; the goal has been to create fewer dependencies and smaller environments. I think I am advocating for that in terms of Engineering Org design as well.

If software is best when designed by one mind, let one mind be responsible for one service: its design, its environment, running it in production. If one mind is going to do this, as I said at the opening, the service has to be small enough to fit in one head. I think this fits in with microservices very well.

What are the constraints around this kind of approach? Communication between these minds becomes the single most important criterion for success. I maintain that we've seen an organization do this: Amazon. They removed "insider access" to teams within the company, making everyone rely on documented interfaces to work together. They created an artificial constraint in order to elicit a behavior they wanted. That level of communication is not easy. After all, how many companies and products have good documentation with clear technical writing? Not many (not even Amazon, I would argue; it is plentiful, but is it clear?). Another critical issue here is a bus factor of 1. Your internal documentation and code quality have to be very high.

So what do you do with all the people if there are just going to be single folks running around? Maybe your bus factor is not 1, maybe it's 2, because you pair a more senior engineer with another engineer closer to entry-level. This is a very clear mentorship model, with a clear sponsorship, promotion, and "graduation" experience: you get your own service. (In my experience mentorship situations become difficult when folks don't recognize they ought to be acting more like a mentor, or more like a mentee.) All these folks with services are going to need underlying building blocks. There are always common elements in any software design. Other people can work on those building blocks, with clear documented interfaces and responsibilities. I can imagine this growing into a structure similar to the way the open-source community operates today.

Mob programming is an attempt to solve a communication problem. Adopting it creates an artificial constraint: there is only one keyboard, in order to elicit a new behavior: you must communicate to get code written. I think I am merely flipping the situation. My artificial constraint: one person, one service, in order to elicit a new behavior: good written communication.

I want to embrace remote-first approaches, embrace distributed teams, embrace asynchronous teams. Doing that requires good written communication. You can always ask people to do it, but until it's actually required to make progress, you don't really know what you are going to get.

We are all trying to solve the same problem, trying to come up with what works for us. What works for one group, one company, won't work for another. There is not one way.
