A lot of people in the tech industry are obsessed with this term: scalability. When developing a new feature, or planning a new project, this term is thrown around a lot.
People continuously ask questions like "will it scale?", "how scalable is it?", "if we scale up, will it break?", etc.
But there are multiple things that need to be taken into consideration: what are we actually talking about? Hardware? The software development team? The codebase? The support team's effort to onboard new users? There are a lot of things that need to be considered with each software project. If more clients hop on, more resources will be needed: more servers will be rented to handle the extra workload, more developers will be hired to handle new features, bugfixes and experiments, and so on.
When architecting a new project (or feature), we need to carefully consider some things such as:
I could go on with such questions, but that's not the point. I want to talk a little about some common scalability issues that usually happen with software projects, and how we can tackle them.
First, let's talk about the technical aspect of scaling up. When the project is young, barely an MVP, and only a handful of users are using the product at the same time, scalability is not an issue. We can handle the whole workload in a single process, on a single server.
But as more and more users hop in and become more active, inefficiencies start to become more obvious. Processes take too long, some things start to time out, the database load starts to become more visible.
What can we do when we see this increase of activity?
At first, when the project is young, you get the cheapest instances your cloud provider has available and the cheapest database. There's no point in spending thousands of dollars each month on an app that almost nobody is using. 2GB of RAM and one CPU is enough to run an MVP with some usage (depending on the product, of course: if your core offering involves a lot of machine learning or data processing, then you have to pump a lot of money into the product from the get-go).
When the limits of the cheapest server are reached, we can always scale vertically: upgrade the servers we already use so they have more resources. This is a simple and efficient solution at the beginning, because it doesn't require any development effort. The application will keep running just as it did before.
But at some point, we reach 100% resource consumption on the most powerful machine as well. What can we do next?
Next, we can deploy multiple instances of the same application on different servers, and put a load balancer in front of them to distribute the requests evenly between them. They will keep sharing the same underlying resources, such as the database, but the application should now run more smoothly. In theory, we could scale our app horizontally indefinitely, but sooner or later we will hit another limit: the database. Databases are pretty fast and efficient at data storage and retrieval, but they have their limits too.
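To make "distribute the requests evenly" a bit more concrete, here is a minimal sketch of round-robin selection, one common strategy a load balancer can use. The class name and the instance addresses are hypothetical, and a real load balancer also handles health checks, connections and failures, which are left out here.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hands out backend addresses in a fixed, rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(list(backends))

    def next_backend(self):
        # Each call returns the next instance in the rotation.
        return next(self._cycle)


# Hypothetical instance addresses, used only for illustration.
balancer = RoundRobinBalancer(["app-1:8000", "app-2:8000", "app-3:8000"])

# Simulate nine incoming requests and count where each one is sent.
assignments = Counter(balancer.next_backend() for _ in range(9))
print(assignments)
```

Every instance ends up with the same share of requests, but all of them still talk to the same database, which is why the database is the next limit we run into.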
If we are using relational databases, they are harder to scale using horizontal scaling. There are some solutions for clustering relational databases, but they are clunky and hard to manage. You would need to hire people specialized in database management just to scale the database horizontally and make sure you don't end up with inconsistent data.
A lot of frameworks offer an ORM as a layer of abstraction over the database access. You don't have to manually write queries, and you just write regular backend code.
These systems are enough in the beginning, but as the business requirements become more complex and the project evolves, we need to optimize how we access the database.
Here are the most common problems we usually deal with when the database starts to see increased loads:
Sometimes data doesn't change that often, and we can live with displaying slightly stale data. Ideally, it changes only rarely (maybe once every day/week/month). Then we can safely store the query result in a cache and, when we need it later, retrieve it from the cache instead of the database. Reading from a cache is usually much faster than running the query again, and it can reduce the load on the database quite a lot (and speed up our application).
Just make sure the cache entry expires after a set time, so that newer data eventually replaces the stale copy.
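As a rough illustration of caching with an expiry time, here is a small in-memory sketch. `TTLCache`, `load_report` and the 60-second lifetime are made-up names and values for the example; in a real project you would more likely use a dedicated cache server or your framework's caching layer. The clock is injectable only so that the example can simulate time passing without sleeping.

```python
import time


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: skip the database
        value = loader()  # missing or expired: run the real query
        self._store[key] = (now + self._ttl, value)
        return value


db_calls = []


def load_report():
    # Hypothetical stand-in for an expensive database query.
    db_calls.append("query")
    return {"total_users": 42}


fake_now = [0.0]  # simulated clock, in seconds
cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])

first = cache.get_or_load("report", load_report)   # hits the "database"
second = cache.get_or_load("report", load_report)  # served from the cache
fake_now[0] = 61.0                                 # the entry has expired
third = cache.get_or_load("report", load_report)   # hits the "database" again
print(len(db_calls))
```

Three reads cost only two queries here. The longer the expiry time, the fewer queries reach the database, at the price of showing older data.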
Another way to improve our application is to separate long-running tasks from our main user-serving process. In the web world, this translates to separating the processing tasks from the serving of web traffic.
By doing this, we allow the web-serving processes to handle more requests with their resources, and we queue up the processing tasks and resolve them as soon as we have resources available. If we see an increase in data processing, we can spawn extra workers to crunch the excess, and then kill them during the quieter hours. This way we can also reduce infrastructure costs, by not having idle servers just waiting during slower hours.
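The queue-and-workers idea can be sketched with Python's standard library. This is only an illustration under simplifying assumptions: threads stand in for what would normally be separate worker processes or servers fed by a message queue, and `run_workers` and the squaring "job" are hypothetical names chosen for the example.

```python
import queue
import threading


def run_workers(jobs, handler, worker_count):
    """Process jobs with a pool of workers that pull from a shared queue."""
    tasks = queue.Queue()
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            job = tasks.get()
            if job is None:  # sentinel: no more work for this worker
                return
            outcome = handler(job)
            with results_lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in threads:
        thread.start()

    for job in jobs:
        tasks.put(job)  # the "web" side only enqueues and moves on
    for _ in threads:
        tasks.put(None)  # one sentinel per worker shuts the pool down

    for thread in threads:
        thread.join()
    return results


# Hypothetical workload: squaring a number stands in for a slow task.
squares = run_workers(range(10), lambda n: n * n, worker_count=3)
print(sorted(squares))
```

The `worker_count` argument is the knob the paragraph above talks about: raise it when the queue grows, lower it during quiet hours.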
By operational scaling, I am talking about processes. Each development team should have its own processes in place to collaborate more efficiently and get things done faster and with fewer headaches. Scaling your team will force you to make some decisions regarding your processes: completely replace old processes that don't work well with a large number of people, or adapt them.
With more developers (and people in other roles) joining the team, a single ticket board will not be enough anymore. A single daily sync meeting with all developers at once will not be enough anymore. Teams will need to be formed based on code ownership and responsibility. Each team will need specific people filling some crucial roles, such as the team leader, the people responsible for code reviews, etc.
To function properly, a team should have the following:
As the codebase grows, adding new things will become slower and harder. More things can break, and the code is harder to navigate.
There is a set of good practices that all boil down to keeping the technical debt low.
I put "people" as a separate item from "teams" because you also need to focus on individuals. As you scale, more people join, each being driven by different things, wanting different things from life, preferring to collaborate in different ways. Some like meetings, some dislike them, some prefer in-person chit-chats, some prefer working from home, some are more productive later in the day, etc.
You have to figure out how to make everyone comfortable in their job, enable them to grow, keep them engaged and help them solve their own problems.
Here are some things you can do:
Scaling a company is hard. There are multiple things to keep in sight when doing so: the product (the actual application code, how it runs, how we deal with inefficiencies), the teams (how we split up the different domains of the product so that we have specialized teams responsible for each), and the people (how we involve people and enable them to grow and participate in building up the product and the business).