How to Implement Backend Architecture Design

We all approach problems differently, and our solutions aren't always the same. What worked in one place might not work in another. With no universal answer, how do we approach backend architecture design?

Stages

There are three key stages:

  1. Research
  2. Implement
  3. Iterate

These stages form an infinite loop, continuously feeding into each other. You research, build, and refine. Ideally, the system solves some problems, but new tweaks often introduce new challenges, requiring further research and updates.

If your design never needs changes, it might be perfect, but that's rare. More likely, it means your system isn't being used at all.

Research

The main goal here is to define the primary purpose of the system. It's usually better for a system to focus on one thing and do it exceptionally well.

The key question: What is the primary purpose of the system? If you can't answer this, you haven't completed the first stage.

If your scope keeps expanding during research, you're dealing with "scope creep," something to avoid. Scope creep leads to complexity, and while building complex systems is one challenge, maintaining them is another.

Sure, you could build endlessly and leave it for someone else to handle, but that’s not a responsible approach. How long can you avoid taking ownership?

Performance must be considered. For a startup or a small service, don't worry about whether the server can handle 1 million requests per second. Instead, put some load on it and observe how it behaves. It's good to create your own artificial chaos before encountering the real thing. Test how much it can handle. Calculate if your bottleneck is in the near future or if you still have time. How many more users can it accommodate? Challenge your own design before your customers do. At the very least, you will know its weaknesses and when it will break.
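To make "calculate if your bottleneck is in the near future" concrete, here is a minimal sketch of the arithmetic. It assumes traffic compounds at a fixed monthly rate, which real growth rarely does, and every number in it is made up for illustration. The function name `months_until_capacity` is hypothetical.

```python
import math


def months_until_capacity(current_peak_rps: float,
                          measured_max_rps: float,
                          monthly_growth: float) -> float:
    """Estimate months until peak traffic reaches the measured limit.

    Assumes traffic compounds at a fixed monthly rate (a simplification).
    """
    if current_peak_rps <= 0 or measured_max_rps <= 0:
        raise ValueError("rates must be positive")
    if current_peak_rps >= measured_max_rps:
        return 0.0  # already at or past the limit
    if monthly_growth <= 0:
        return math.inf  # no growth, so the limit is never reached
    return math.log(measured_max_rps / current_peak_rps) / math.log(1 + monthly_growth)


# Hypothetical numbers: 120 req/s at peak today, the load test broke the
# service at around 900 req/s, and traffic grows 15% per month.
months = months_until_capacity(120, 900, 0.15)
print(f"Roughly {months:.1f} months of headroom")
```

With these made-up inputs the estimate comes out at a little over 14 months. The point is not the exact figure but knowing whether you have weeks or years before the weak spot matters.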

Understand your limitations: budget, tooling, resources, and time. It's well known that Python has a slow "for loop," but if it's looping over a few items, who cares? It won't be noticeably slow. However, if it's looping over thousands or millions of items, you've reached Python's limitations. Elixir, on the other hand, can offer better performance, scale more effectively, and handle more requests per second. But again, how many requests per second are you going to handle? If it's not too many, Python is fine. And if you don't have an Elixir developer on the team, is it worth hiring one just for this system?
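If you're unsure whether a loop is a problem, measure it instead of guessing. The sketch below times the same plain `for` loop over 10 items and over 1,000,000 items using the standard `timeit` module. The function names are made up for illustration, and the absolute numbers will differ from machine to machine.

```python
import timeit


def total_price(prices):
    """Sum a list with an explicit for loop, the pattern in question."""
    total = 0
    for price in prices:
        total += price
    return total


def seconds_per_call(n_items: int, repeats: int = 5) -> float:
    """Return the best-of-N wall time for one call over n_items values."""
    data = list(range(n_items))
    return min(timeit.repeat(lambda: total_price(data), number=1, repeat=repeats))


small = seconds_per_call(10)
large = seconds_per_call(1_000_000)
print(f"10 items:        {small:.6f} s")
print(f"1,000,000 items: {large:.6f} s")
```

If the large case is still fast enough for your requirements, the "slow for loop" is not your limitation yet.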

This is why you set your requirements first and then pick your tools. If a small change will get the ball rolling without creating future problems, call it an iteration and go that way. However, if you’re doing these iterations just to kick the can down the road, that’s not the right approach. Hopefully, others in the team will catch you and stop you from doing that, but it really depends on the company or team culture.

I’ve seen both. One place I worked had a “kick the can down the road” culture. Nothing got fixed; they just kept adding more if statements, creating a tangled mess of spaghetti code. No one cared, and the quality of the product didn’t interest anyone. I eventually had to leave the company for this reason. Other than that, the job was alright. People were nice enough and all that. But the team never had the desire to make quality products.

It’s the culture, and it’s hard to change. Usually, you need to fire at least half the team, preferably more, including the managers, and start from scratch. But that’s a different topic.

Laws and government regulations are part of the research phase. After doing all of this research, you need to make sure that what you plan to build is legal. Data protection laws are different in every country, and if you don’t follow them, there’s a risk your company might go kaput. Then all of your hard work is trash.

We are still in the research phase.

The next step is to document your findings. You might not use all the documents right away, maybe not at all, but they’ll be invaluable for future research. Documenting allows you to revisit ideas you’ve explored and saves time down the road. If someone new joins the team and suggests something you’ve already tried, you can refer them to the document and explain why it didn’t work.

Implement

After we are done with the research phase, we can move on to the implementation phase.

Here are the key points of the implementation phase:

  • Choose the right design
  • Define the architecture
  • Develop the architecture
  • Test the architecture
  • Deploy the architecture
  • Maintain the architecture

Choose the Right Design

The design must match the need; if it doesn't, why are we even doing this? The chosen design must also align with the budget, resources, team skills, and time. A design can be the best in the world, but if it takes a year to implement and you have six months, it's not the best design for you, because you don't have the time to fully implement it.

Define the Architecture

Now you are turning what you have researched and designed into a plan. This is where you take out your papers, draw diagrams, and create charts so that you have a visual plan. You start putting the puzzle pieces together to build the system: is this component going to work with that one? What about this interface?

By now, you should be able to hand this plan to your engineers, and they should be ready to build it without asking you many questions.

Develop the Architecture

Finally, we can start the development process. It took three separate blog posts, each probably more than two or three pages, to get here, but we’ve finally reached the development phase.

Now is the time to share your plan with the engineering team. They should be able to understand what you’re trying to build. If they’re asking too many questions or seem unsure, those questions need to be answered, and the plan must be reevaluated. If the requirements phase was done thoroughly, the engineering team should even be able to pick their own stack because they’ll clearly understand what will and won’t work for the requirements.

If the requirements are vague, the team will make assumptions, and the final result will depend on luck and on the choices they happen to make. This isn't the best time to gamble.

If the chosen technology doesn’t make sense to you or raises questions, address those concerns before development goes too far. Finding middle ground is crucial, and open communication is key. After all the hard work in previous stages, you don’t want to end up with something unrelated to the original goals. If that happens, what was the point of all the stages, phases, and planning?

Test the Architecture

Before deployment, we need to test if it’s going to work. Emulating loads and stress-testing the system is crucial. It’s much better for us to crash it during testing than for customers to do so. If we crash it, we’ll know where and how it failed. Maybe we won’t immediately understand why, but we’ll have the chance to figure it out. If a customer crashes it, though—good luck. Have you ever tried talking with a non-tech customer to figure out how they broke your system? It’s not fun. I’d much rather let my team break it.
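As a rough illustration of "emulating loads," here is a minimal load-generation sketch using only the Python standard library. `fake_handler` is a stand-in I made up; in practice you would replace it with a real call to your service, and for serious testing you would likely reach for a dedicated load-testing tool. The names `timed_call` and `run_load` are also hypothetical.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def fake_handler(request_id: int) -> int:
    """Stand-in for a real request; replace with a call to your service."""
    time.sleep(0.001)  # pretend the service does about 1 ms of work
    return request_id


def timed_call(handler, request_id):
    """Run one request and return (latency_seconds, succeeded)."""
    start = time.perf_counter()
    try:
        handler(request_id)
        ok = True
    except Exception:
        ok = False
    return time.perf_counter() - start, ok


def run_load(handler, total_requests: int, concurrency: int) -> dict:
    """Fire total_requests calls using `concurrency` worker threads."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(lambda i: timed_call(handler, i),
                                range(total_requests)))
    elapsed = time.perf_counter() - started
    latencies = sorted(latency for latency, _ in results)
    return {
        "completed": len(results),
        "errors": sum(1 for _, ok in results if not ok),
        "requests_per_second": len(results) / elapsed,
        # quantiles(n=20) returns 19 cut points; the last is the 95th percentile
        "p95_latency_s": statistics.quantiles(latencies, n=20)[-1],
    }


report = run_load(fake_handler, total_requests=200, concurrency=20)
print(report)
```

Raise `total_requests` and `concurrency` step by step and watch where the error count or the 95th-percentile latency starts to climb. That is the point where your team broke it before a customer did.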

You can create integration tests, end-to-end tests, and any other tests that suit your needs or are feasible with your resources. Let’s be real, many places don’t prioritize testing, but having at least a basic suite of tests can save a lot of headaches later.
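A basic suite doesn't have to be big. The sketch below uses Python's built-in `unittest` with an in-memory stand-in for the database. `InMemoryUserStore` and `sign_up` are hypothetical names invented for this example, not part of any real system.

```python
import unittest


class InMemoryUserStore:
    """Hypothetical stand-in for the real database layer."""

    def __init__(self):
        self._users = {}

    def add(self, email: str) -> None:
        if email in self._users:
            raise ValueError("duplicate email")
        self._users[email] = {"email": email}

    def count(self) -> int:
        return len(self._users)


def sign_up(store, email: str) -> bool:
    """Hypothetical service function: returns False on a duplicate."""
    try:
        store.add(email.strip().lower())
    except ValueError:
        return False
    return True


class SignUpTests(unittest.TestCase):
    def setUp(self):
        self.store = InMemoryUserStore()

    def test_new_user_is_stored(self):
        self.assertTrue(sign_up(self.store, "ada@example.com"))
        self.assertEqual(self.store.count(), 1)

    def test_duplicate_is_rejected_regardless_of_case(self):
        sign_up(self.store, "ada@example.com")
        self.assertFalse(sign_up(self.store, " ADA@example.com "))
        self.assertEqual(self.store.count(), 1)


# Run the suite explicitly so the example works from any entry point.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SignUpTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Two tests like these won't catch everything, but they will catch the day someone removes the lowercase step and duplicates start slipping in.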

Deploy the Architecture

At this point, we should feel confident about deploying. If not, it’s likely that some steps were missed, or certain issues didn’t get addressed during earlier phases. If that’s the case, it’s necessary to revisit those stages and resolve the concerns before moving forward. Once everything is in order, you’re ready for production.

However, keep in mind that during deployment you might encounter reasons to loop back to earlier stages, especially if unforeseen challenges arise. Additionally, if the company culture is hostile or discourages open communication, engineers might hesitate to raise concerns earlier in the process, which can lead to problems surfacing only at the deployment stage. Encouraging a culture where people feel comfortable speaking up is essential to avoiding these situations. You will have to address those problems anyway, so why not address them earlier, before a pile of dependencies has been built around them?

Maintain the Architecture

After deployment, we aren't done. Now we need to maintain the system, update it, add new features, fix bugs, and all that. To do that, we need actual data and real users. We test to make sure our system works, and we maintain it to keep it up and running all the time. Users will find and cause problems that you and your team can't.

Best Practices

Try to include all stakeholders from the beginning, so everyone knows what the project is before it starts and has a general idea of what is going to be built. If you need to reach out to someone and ask for their input, it never hurts to do so.

Use a modular approach whenever possible, and when you can’t, ask yourself why not.

Plan for scalability, but be realistic. If you’re a startup with 1,000 users, maybe plan for 10,000 or 50,000 first. Jumping from 1,000 to 250,000 users won’t be easy, and it could create unnecessary work that you don’t need right now.

Every business has a maximum number of potential customers. Some businesses will have more, and some will have fewer. If your product targets a niche market, your customer base will likely remain in the hundreds of thousands. On the other hand, if you’re building a social media platform, you might have millions of users. Plan accordingly.

I hope your company grows and you hit millions of users; that would be great. Here’s the thing: if you go from 10k users to a million, you’ll likely get a lot more funding, and your budget could grow by 100x or even more. You'll be able to rebuild your system with better tools, more experienced engineers, and all that. When Instagram was first launched, it was a simple app: add a picture, add friends, and that was about it. Something most of us could have built in a few weeks. Now, it’s one of the largest social media platforms in the world. I don't have the exact numbers, but I'm sure there aren’t many lines of code from the first few years that survived to today.

If your company makes money and grows, your system will eventually be rewritten, but don’t design your system for a million users when you only have 10,000. That growth will change the requirements along the way.

It helps to build modular systems around the services you use. For example, let’s say your system runs on AWS. It’s usually not the best idea to depend on one provider that heavily; you might want to be able to switch to Google Cloud when needed. Designing your system with modularity, where you can replace the AWS part with the Google part and still have your system running, will be a lifesaver. With this approach, you can also solve the scalability problem by replacing small chunks of the system to meet new requirements, preventing the need for a complete rewrite.
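One way to get that kind of swap-ability is to hide the provider behind a small interface that the rest of the code depends on. The sketch below is only an illustration: `BlobStore`, `InMemoryStore`, `LocalDiskStore`, and `save_avatar` are hypothetical names, and the two backends are local stand-ins. Real adapters for AWS or Google Cloud would wrap each provider's own SDK behind the same two methods and are not shown here.

```python
import tempfile
from pathlib import Path
from typing import Protocol


class BlobStore(Protocol):
    """The only storage surface the rest of the system is allowed to see."""

    def put(self, key: str, data: bytes) -> None: ...
    def get(self, key: str) -> bytes: ...


class InMemoryStore:
    """Backend that keeps blobs in a dict; handy for tests."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class LocalDiskStore:
    """Backend that writes blobs as files under a root directory."""

    def __init__(self, root: Path) -> None:
        self._root = Path(root)

    def put(self, key: str, data: bytes) -> None:
        path = self._root / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def get(self, key: str) -> bytes:
        return (self._root / key).read_bytes()


def save_avatar(store: BlobStore, user_id: int, image: bytes) -> str:
    """Business logic: it knows about BlobStore, not about any provider."""
    key = f"avatars/{user_id}.png"
    store.put(key, image)
    return key


# The same business code runs against either backend unchanged.
with tempfile.TemporaryDirectory() as tmp:
    for store in (InMemoryStore(), LocalDiskStore(Path(tmp))):
        key = save_avatar(store, 42, b"fake-image-bytes")
        print(type(store).__name__, "stored", key)
```

Switching providers then means writing one new class with `put` and `get`, not hunting down every place in the codebase that talks to storage.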

In the testing phase, load your system with more customers than you currently have. For example, if you have 10,000 users and expect growth to 20,000 soon, start testing with 20,000 or more. See how your system behaves and identify bottlenecks before the real growth happens.

If you have a security team, involve them in the research phase of the project because, as you might have guessed, security is crucial. If you mess up security, you may face legal challenges, and nobody likes that (except lawyers, and nobody likes lawyers).

If the security team says no, that "no" is one of the biggest "no"s you’ll find in most companies. They’re one of the few groups that can even say "no" to the CEO. That’s why it’s so important to have them involved in the research phase, so you can make changes early instead of building the whole system and finding out later that it’s a "no."