A lot of people in the tech industry are obsessed with one term: scalability. Whenever a new feature is developed or a new project is planned, this term gets thrown around a lot.
People continuously ask questions like "will it scale?", "how scalable is it?", "if we scale up, will it break?", etc.
But there are multiple things that need to be taken into consideration: what are we actually talking about? Hardware? The software development team? The codebase? The support team's effort to onboard new users? There are a lot of things that need to be considered with each software project. If more clients hop on, more resources will be needed: more servers will be rented to handle the extra workload, more developers will be hired to handle the new features, bugfixes and experiments, etc.
When architecting a new project (or feature), we need to carefully consider some things such as:
- how often will it be used?
- how many resources will it need per use? This isn't a concern when the project is young and doesn't have many active users, but as more users come on board, the small initial inefficiencies we ignored because "it's not really a priority to fix right now" will become a problem.
- will we need to manually intervene if something goes wrong? What are the chances of something going wrong?
- is the core algorithm efficient enough? How does the resource consumption change when one client has huge quantities of data vs. a lot of clients with less data each?
I could go on with such questions, but that's not the point. I want to talk a little about some common scalability issues that usually happen with software projects, and how we can tackle them.
First, let's talk about the technical aspect of scaling up. When the project is young, barely an MVP, and only a handful of users are using the product at the same time, scalability is not an issue. We can handle the whole workload in a single process, on a single server.
But as more and more users join and become more active, inefficiencies start to become more obvious. Processes take too long, some things start to time out, the database load starts to become more visible.
What can we do when we see this increase in activity?
Vertical scaling - aka throw more money at the problem
At first, when the project is young, you get the cheapest instances your cloud provider has available and the cheapest database. There's no point in spending thousands of dollars each month on an app that almost nobody is using. 2 GB of RAM and one CPU are enough to run an MVP with some usage (depending on the product, of course: if your core offering involves a lot of machine learning or data processing, you have to pump a lot of money into the product from the get-go).
When the limits of the cheapest server are reached, we can always scale vertically: upgrade the servers we already use to ones with more resources. This is a simple and efficient solution at the beginning, because it doesn't require any development effort. The application will keep running just like it did before.
But at some point, we reach 100% resource consumption on the most powerful machine as well. What can we do next?
Horizontal scaling - aka splitting the workload
Next, we can deploy multiple instances of the same application on different servers and put a load balancer in front of them to distribute the requests evenly. The instances keep sharing the same underlying resources, such as the database, but the application should now run more smoothly. In theory, we could scale our app horizontally indefinitely, but sooner or later we will hit another limit: the database. Databases are pretty fast and efficient at data storage and retrieval, but they have their limits too.
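To make "distribute the requests evenly" a bit more concrete, here is a minimal sketch of the simplest strategy a load balancer can use: round-robin. The class name and the instance addresses are made up for illustration, and a real load balancer also deals with health checks, timeouts and uneven request costs, which this sketch ignores.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in a fixed, rotating order."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the list forever: 1, 2, 3, 1, 2, 3, ...
        self._backends = cycle(backends)

    def pick(self):
        """Return the backend that should serve the next request."""
        return next(self._backends)


# Hypothetical addresses of three identical application instances.
balancer = RoundRobinBalancer(["app-1:8000", "app-2:8000", "app-3:8000"])

# Six incoming requests end up spread evenly: two per instance.
assignments = [balancer.pick() for _ in range(6)]
```

Notice that the instances themselves hold no state here. That is what makes this kind of scaling easy for the application and pushes the pressure down to the shared database.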
Relational databases are harder to scale horizontally. There are some solutions for clustering relational databases, but they are clunky and hard to manage. You would need to hire people specialized in database management just to scale the database horizontally and make sure you don't end up with inconsistent data.
A lot of frameworks offer an ORM as a layer of abstraction over database access. You don't have to write queries manually; you just write regular backend code.
These abstractions are enough in the beginning, but as the business requirements become more complex and the project evolves, we need to optimize how we access the database.
Here are the most common problems we usually deal with when the database starts to see increased loads:
- duplicated queries: we can query once, store the result in memory and then pass the data around in our code using various patterns, if the situation allows it (e.g. data changes occurring between two identical queries are not relevant).
- the N+1 queries problem: instead of using joins to retrieve related items, we run too many queries (one for the "parent" and then one more query for each "child"). This is often the case with ORMs that do lazy fetching. Identifying and fixing these problems is usually easy and can yield a huge performance boost. Queries are expensive, and we should strive to batch as much as possible.
- overfetching - retrieving from the database a lot of data we don't use.
- indexing common access patterns - if we find ourselves querying a lot on the same field in some tables (filtering, ordering, grouping), and those queries become rather slow (or even if the individual query is not slow, but if we do it a lot of times, the total time spent resolving it adds up), we should consider indexing.
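The N+1 problem from the list above is easier to recognize once you have seen it side by side with the fix. The sketch below uses Python's built-in sqlite3 module and an in-memory database; the authors/posts schema and the function names are invented for this example and are not taken from any particular ORM. It also creates an index on the column used for filtering and joining, as a nod to the last item in the list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    -- Index the column we filter and join on.
    CREATE INDEX idx_posts_author_id ON posts (author_id);
    INSERT INTO authors VALUES (1, 'Ana'), (2, 'Bogdan');
    INSERT INTO posts VALUES
        (1, 1, 'Scaling'), (2, 1, 'Caching'), (3, 2, 'Queues');
""")


def titles_n_plus_one(conn):
    """One query for the parents, then one extra query per parent."""
    queries = 1
    result = {}
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id",
            (author_id,),
        ).fetchall()
        queries += 1
        result[name] = [title for (title,) in rows]
    return result, queries


def titles_joined(conn):
    """A single JOIN fetches parents and children in one round trip.

    Note: an inner JOIN skips authors without posts; the sample data
    has none, so both functions return the same result here.
    """
    result = {}
    rows = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    )
    for name, title in rows:
        result.setdefault(name, []).append(title)
    return result, 1


slow, slow_queries = titles_n_plus_one(conn)  # 1 + 2 authors = 3 queries
fast, fast_queries = titles_joined(conn)      # always 1 query
```

With two authors the difference is 3 queries versus 1, which looks harmless. With 10,000 authors it is 10,001 versus 1, and that is usually where the database load starts to hurt.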
Caching - aka stop hitting the database so hard
Sometimes data doesn't change that often and we can live with displaying slightly stale data. Ideally, it changes rarely (maybe once every day/week/month). Then we can safely store the query result in a cache and, when we need it later, retrieve it from the cache instead of the database. Reading from a cache is much faster than querying the database, and it can reduce the database load quite a lot (and speed up our application).
Just make sure the cache entries expire after a set time, so that newer data eventually makes it into the cache.
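As a rough illustration of caching with expiry, here is a minimal in-process cache with a time-to-live. The TTLCache class, the load_report function and the 60-second TTL are all assumptions made up for this sketch; in a real project you would more likely reach for a dedicated cache server or an existing library, but the idea is the same.

```python
import time


class TTLCache:
    """Keeps values in memory and forgets them after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}   # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value, or call loader() on a miss or expiry."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no database hit
        value = loader()     # the expensive call, e.g. a database query
        self._entries[key] = (now + self._ttl, value)
        return value


# Stand-in for an expensive query; counts how often it really runs.
calls = []

def load_report():
    calls.append(1)
    return {"active_users": 42}


fake_now = [0.0]  # a fake clock we can move forward by hand
cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])

first = cache.get_or_load("report", load_report)   # miss: loads the data
second = cache.get_or_load("report", load_report)  # hit: served from memory
fake_now[0] = 61.0                                 # the entry is now too old
third = cache.get_or_load("report", load_report)   # expired: loads again
```

Three reads resulted in only two calls to the expensive loader. How long the TTL should be depends entirely on how stale the data is allowed to get.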
The master - worker pattern
Another way to improve our application is to separate long-running tasks from our main user-serving process. In the web world, this translates to separating the processing tasks from serving web traffic.
By doing this, we allow the web-serving processes to handle more requests with their resources, and we queue up the processing tasks and resolve them as soon as we have resources available. If we see an increase in data processing, we can spawn extra workers to crunch the excess, and then kill them during the quieter hours. This way we can also reduce infrastructure costs, by not having idle servers just waiting during slower hours.
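Here is a small sketch of that queue-and-workers idea, using threads inside a single process. That is an assumption made to keep the example runnable: in production the queue would typically be a separate message broker and the workers separate processes or servers. The run_workers function and its parameters are hypothetical names.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a worker to shut down


def run_workers(jobs, handler, num_workers):
    """Push jobs onto a queue and let num_workers threads process them."""
    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            job = tasks.get()
            if job is _STOP:
                return
            outcome = handler(job)
            with lock:  # results is shared between the workers
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()

    # The "master" side only enqueues work; it does no processing itself.
    for job in jobs:
        tasks.put(job)
    # One stop signal per worker, queued after all the real jobs.
    for _ in threads:
        tasks.put(_STOP)
    for thread in threads:
        thread.join()
    return results


# Completion order depends on scheduling, so only the set of results is fixed.
results = run_workers(range(1, 6), lambda n: n * n, num_workers=3)
```

Scaling up or down is then just a matter of changing num_workers, which is the in-process equivalent of spawning or killing worker machines.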
Operational and administrative scaling
By operational scaling, I am talking about processes. Each development team should have its own processes in place to collaborate more efficiently and get things done faster and with fewer headaches. Scaling your team will force you to make some decisions regarding your processes: completely replace old processes that don't work well with a large number of people, or adapt them.
With more developers (and not only developers) joining the team, a single ticket board will not be enough anymore. A single daily sync meeting with all developers at once will not be enough anymore. Teams will need to be formed based on code ownership and responsibility. Each team will need to have specific people filling some crucial roles, such as the team leader, the people responsible for code reviews, etc.
To function properly, a team should have the following:
- a dedicated way to track work - a task management platform, but one that also lets the team contact other teams and create tasks for them (for example, a bug discovered in another team's system that blocks an integration should not have to go through a long chain of command - people should collaborate directly between teams).
- some meetings for in-team collaboration (e.g. daily sync meetings) and some meetings for inter-team collaboration (e.g. product meetings, where topics such as features that span multiple teams, integrations and future improvements can be discussed).
- documentation - internal, for the team, and external, for other teams. You don't want people to have to ask a team the same question multiple times. You want all the frequently asked questions to be already answered and easily accessible by other teams. The questions directed to a team should not spark meetings, but instead should be resolvable just by sharing a link to the relevant documentation page.
As the codebase grows, adding new things will become slower and harder. More things can break, and things are harder to navigate.
There is a set of good practices that all boil down to keeping the technical debt low.
- Code reviews - these are crucial for keeping the code clean. There are things that developers miss on their own, and multiple eyes looking at the same code can catch more things before they hit production. This is especially crucial for newer employees, who are not yet accustomed to the codebase and the processes.
- CI/CD - it blows my mind that there are teams out there that don't run an automated test suite for all merge requests and before each deployment. In my opinion, this is crucial for moving fast (in spite of taking longer to merge code), because it increases the confidence that the system won't break, and it sometimes catches bugs/crashes that appear as a result of merging.
- A good architecture takes you a long way. Having the code structured properly and the sub-systems decoupled makes upgrading and testing way easier. Spending time and resources on figuring things out beforehand, with as few external dependencies as possible, will pay dividends long term. Being locked into external services will make the development experience a living hell, because most online services don't offer test environments for development.
- Building internal mini-frameworks for common tasks will end up saving a lot of time and headaches in the long term. If your team doesn't spend time making collaboration more efficient and instead spends time reinventing the wheel again and again, you will end up with the same functionality duplicated in multiple different ways (each programmer finds their own way of resolving the same problem).
I put "people" as a separate item from "teams" because you also need to focus on individuals. As you scale, more people join, each being driven by different things, wanting different things from life, preferring to collaborate in different ways. Some like meetings, some dislike them, some prefer in-person chit-chats, some prefer working from home, some are more productive later in the day, etc.
You have to figure out how to make all people comfortable at their job and enable them to grow, keep them engaged and help them solve their own problems.
Here are some things you can do:
- Learning sessions such as internal tech presentations and hackathons, encouraging collaboration and pair coding, encouraging people to ask questions, and shutting down the people who shame others for not knowing something (you shouldn't have them in your company in the first place).
- Open 1-to-1 discussions - there are specific conditions that have to be met for people to be 100% honest about the hard truths. You want to discover those. And then enable people to have the courage to tell you everything that they have to say (people often don't say the hard truths mainly because they fear repercussions).
- People in higher positions should be reachable and available - Usually in companies there is a disconnect between the management and the developers. That's wrong. The people building the product need to know the reasoning behind every feature decision, so that they know their work matters. Nobody likes to think that their work is useless. This can be done by encouraging developers to contact the decision makers whenever they have questions.
- Talk business with your colleagues - In most organizations, business decisions are taken in secret, behind closed doors, and the only things reaching the developers are the business requirements. What about the why? Why do we need that? What if there are some things that can be done better, easier and faster? Developers, knowing the code and the architecture, might have a better solution for the same problem, one which requires less effort and can yield better results. Decision-making should always involve the people who are responsible for the implementation.
- Keeping everybody in the loop - a good way to involve everybody is to be as transparent as possible about how the company is doing: how we are doing client-wise and finance-wise, what the forecasts are, what we are currently doing as a company, what our big goals are for the quarter or for the year. Having a periodic catch-up with everybody is crucial for this, because people want to get involved and see the big picture (companies commonly hold quarterly or yearly all-hands meetings to present the latest statistics).
Scaling a company is hard. There are multiple things to keep in sight when doing so: the product (the actual application code, how it runs, how we are dealing with inefficiencies), the teams (how we split up the different domains of the product so that we have specialized teams responsible for each), and the people (how we involve people and enable them to grow and participate in building up the product and the business).