Demystifying cloud: knowing 80% by learning 20%

Posted on 2021-03-24 by Vlad Călin

It is no secret that cloud computing is already an important part of the tech industry, and its adoption grows with each year. Companies and engineers that do not embrace it will eventually be left behind, while the organizations and teams that use the capabilities of the cloud reap the rewards.

The truth is, nowadays almost any organization, and in fact almost any individual, is already using the cloud. Each service you are using... is in the cloud. The email you are using... is in the cloud. The websites you are accessing... they are in the cloud. The days of companies having their own datacenters or servers on premises are largely over.

Even with this trend of rapidly adopting cloud technologies in the tech sector, there are unfortunately engineers who don't make it a priority to at least understand what the cloud offers. Some of those who are interested in the topic find out that "the cloud" has a very steep learning curve and get discouraged fast. But honestly, you can't really blame them. Just look at AWS's product section: over 250 existing cloud services, each one with its own use cases, integrations and API. It's no wonder the learning curve is getting steeper and steeper with each year.

The major cloud providers tend to add more and more services in their attempts to secure a piece of the market. There is no person in the world who is able to enumerate all of AWS's or Google Cloud's services by heart. There are just too many.

So how can one start the journey of understanding how the cloud works, and which puzzle pieces can be put together to build a "cloud native app", when there are so many places to start? That's a question I kept asking myself in the beginning, and now I am trying to offer some clarity, or at least some direction, to the people who are asking themselves the same question today.

I am a strong believer in Pareto's principle, and I am convinced that working less but smarter always beats working a lot and hard. Thus, understanding the core principles will most likely help you more in the long run than studying absolutely everything in depth.

I am going to offer some rules of thumb, or some "notes" about the secrets of the cloud, that I hope will help you better grasp what the cloud is actually about.

I am talking from experience, as I still remember the moment when it all clicked for me and suddenly all the services in AWS (or most of them) started to make some kind of sense. After that, I was able to navigate any cloud provider, compare their offerings and overall make better architectural decisions for my apps.

The building blocks of the cloud

To start working efficiently in the cloud, it is very important to understand its building blocks.

What are the most common pieces that are available in each cloud and how do they fit together in the grand scheme of things?

Compute resources

The most important pieces are the compute resources. They are in fact just virtual machines in the cloud, and 90% of the cloud services are built around them. They represent a standardized way to buy compute power based on one's needs, without having to commit to buying hardware.

Each cloud provider offers this kind of service: renting compute power, most commonly on an hourly basis, so that companies end up paying only for the resources they are using, without losing the flexibility and control of the servers.
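To make the "pay only for what you use" idea concrete, here is a minimal sketch of the arithmetic. The hourly rate is a made-up number, not a real price from any provider, and `on_demand_cost` is a hypothetical helper written just for this illustration.

```python
def on_demand_cost(hourly_rate, hours_used, instances=1):
    """Cost of renting `instances` virtual machines for `hours_used` hours each."""
    return round(hourly_rate * hours_used * instances, 2)


# Hypothetical rate in dollars per hour (an assumption, not a real price).
RATE = 0.10

# One server running around the clock for a 30-day month.
always_on = on_demand_cost(RATE, 30 * 24)

# The same server running only 8 hours a day on 22 working days.
office_hours = on_demand_cost(RATE, 8 * 22)

print(always_on, office_hours)
```

The point is not the exact numbers, but that shutting a server down when nobody needs it directly lowers the bill, which is not the case with hardware you have already bought.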

In AWS we're talking about EC2 (Elastic Compute Cloud), in Google Cloud we're talking about Compute Engine, in Microsoft's Azure we are talking about Virtual Machines (pretty straightforward).

Around these central pieces, there are some auxiliary smaller pieces that make the whole experience worthwhile:

  • Block storage - virtual "hard disks" that can be attached to our virtual servers, swapped between different servers or cloned on demand, in a few minutes with a click of a button (or API call).
  • Flexible IPs - each instance receives an internal IP that other servers in the same cloud can reach it by, and public IPs can be assigned to the instances we want.
  • Security policies/groups - firewalls configurable in bulk - having servers placed in different security groups, we can define some sets of rules (e.g. disallow all traffic to port 5432), and these rules are applied to the whole server group, in a matter of seconds.
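The security group idea from the list above can be sketched as a toy model: one set of rules shared by every server in the group. This is only an illustration of the concept. The `SecurityGroup` class and its fields are hypothetical, and real providers have their own rule semantics (AWS security groups, for instance, only contain allow rules).

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class SecurityGroup:
    """A named set of servers sharing one list of blocked ports (toy model)."""

    name: str
    blocked_ports: set[int] = field(default_factory=set)
    servers: set[str] = field(default_factory=set)

    def allows(self, server: str, port: int) -> bool:
        # Rules only apply to servers that belong to the group.
        if server not in self.servers:
            raise KeyError(f"{server} is not in group {self.name}")
        return port not in self.blocked_ports


db_group = SecurityGroup("databases", servers={"db-1", "db-2"})

# One change to the group takes effect for every member at once.
db_group.blocked_ports.add(5432)

print(db_group.allows("db-1", 5432))
print(db_group.allows("db-2", 22))
```

Changing the rule once on the group, instead of once per server, is what makes these firewalls "configurable in bulk".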

Of course there are others as well, but these are the most important, and the most used.

Storage resources

Another important type of service that is central to building things in the cloud is object storage. Think of it as an FTP server with practically unlimited capacity and scalability, accessible through a special HTTP-based protocol.

The main advantage consists of the efficient cost structure such products employ, which is usually based on the amount of data stored (per GB stored) and access traffic (upload/download traffic amount).
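A rough mental model for object storage is a key-value store that keeps count of the two things you are billed for. The sketch below is an in-memory stand-in, assuming nothing about any real provider's API. `ToyObjectStore` and its method names are made up for this example.

```python
class ToyObjectStore:
    """In-memory stand-in for an object storage bucket (illustration only)."""

    def __init__(self):
        self._objects = {}          # maps object key -> bytes
        self.bytes_transferred = 0  # upload + download traffic so far

    def put(self, key, data):
        # Uploading counts as traffic and replaces any previous object.
        self._objects[key] = data
        self.bytes_transferred += len(data)

    def get(self, key):
        # Downloading counts as traffic too.
        data = self._objects[key]
        self.bytes_transferred += len(data)
        return data

    @property
    def bytes_stored(self):
        return sum(len(data) for data in self._objects.values())


store = ToyObjectStore()
store.put("reports/2021/march.csv", b"id,total\n1,42\n")
content = store.get("reports/2021/march.csv")

print(store.bytes_stored, store.bytes_transferred)
```

Note that the "folders" in the key are just part of the name. There is no directory tree to manage, which is part of why these services scale so easily.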

Examples: AWS's S3, Google's Cloud Storage.

Serverless resources

Serverless is the new fancy thing everybody is losing their mind over nowadays: no more servers to manage, you just invoke functions in the cloud in pre-made environments, and pay for the function execution time. For this reason, this model is also known as Function as a Service (FaaS).

Although it has its limitations, it also has its advantages. Its main limitation is the lack of flexibility when it comes to the environment you want your code to execute in: you can pick from a handful of available environments, and if you need something custom, you are out of luck.
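To give an idea of what "just a function" looks like, here is a minimal sketch modeled on the `(event, context)` handler signature that AWS Lambda uses for Python. The shape of the `event` dictionary and of the returned value are assumptions made for this example, since in practice they depend on what triggers the function.

```python
import json


def handler(event, context):
    """Entry point the platform would call once per invocation."""
    # The "name" field is a hypothetical input chosen for this example.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


# Local invocation for testing. In the cloud, the platform supplies both arguments.
response = handler({"name": "cloud"}, None)
print(response["body"])
```

There is no server setup, no process management and no port to listen on in this code. All of that is the provider's job, and you are billed for the time the function actually runs.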

Examples: AWS's Lambda, Google's Cloud Functions

Access control

A special kind of service is the one through which access to the cloud resources is managed. It is based on configurations that tell who is allowed to access what. Usually it is modelled as RBAC (role-based access control), where each role is assigned multiple permissions on various resources, and each identity is assigned one or more roles.

This way, the cloud providers use the same authorization system for providing access either to real people (via the web app / console) or programmatic access for automations (by using access keys).
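The role-based model described above can be sketched in a few lines. Everything here is hypothetical: the role names, the identities, the `action`/`resource` format and the `is_allowed` helper are made up for illustration and do not follow any provider's policy language.

```python
from fnmatch import fnmatchcase

# Each role grants (action, resource pattern) pairs.
ROLES = {
    "storage-reader": [("read", "bucket/reports")],
    "vm-operator": [("start", "vm/*"), ("stop", "vm/*")],
}

# Each identity, human or automation, is assigned one or more roles.
ASSIGNMENTS = {
    "alice": ["storage-reader", "vm-operator"],
    "ci-bot": ["storage-reader"],
}


def is_allowed(identity, action, resource):
    """Return True if any role of the identity grants the action on the resource."""
    for role in ASSIGNMENTS.get(identity, []):
        for allowed_action, pattern in ROLES[role]:
            if action == allowed_action and fnmatchcase(resource, pattern):
                return True
    # Nothing matched: deny by default.
    return False


print(is_allowed("alice", "stop", "vm/web-1"))
print(is_allowed("ci-bot", "stop", "vm/web-1"))
```

The same check serves a person clicking in the console and a script using access keys. Only the identity differs.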

For AWS, it is IAM, for Google Cloud it is also IAM, for Azure it is Azure Active Directory.

Managed services

There are a ton of those. Basically any popular open source application has a managed version in some cloud, usually built from a combination of the services listed before.

Just to enumerate a few, the most frequent and noteworthy ones are:

  • Managed databases - databases that run on compute instances, with some tooling built around them to offer things like backups, monitoring, rollbacks, patching, etc. There are a lot, ranging from classic managed SQL databases and managed versions of popular open source software (e.g. Redis, offered by both AWS and Google, Elasticsearch, MongoDB) to custom software built by the cloud provider and offered as a managed service (AWS's Aurora, Google's BigQuery, etc).
  • Managed Kubernetes - an installation of the Kubernetes container orchestrator on virtual servers, with everything configured automatically.

Notable mentions

There are two more types of services one can usually find in a cloud provider's catalog, that are pretty neat and should definitely be understood by a software engineer:

  • Load balancers - servers placed in front of a group of services and instructed to forward incoming traffic based on some conditions. A usual architectural pattern is to have a bunch of instances of the same service behind a load balancer to distribute the traffic between the instances - for scalability purposes (balance the load... load balancer... get it?) but also for zero-downtime updates (performing rolling updates and gradually shifting the traffic from the old versions to the new ones).
  • CDN - Content Delivery Network - service that acts like a cache for static web resources that don't change too often. They tend to speed up web apps quite a bit because the caching is done in a geographical location close to the request source. So, if you have a majority of the traffic coming from a single region, CDNs are a pretty powerful ally.
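The load balancer pattern from the list above can be sketched with the simplest possible policy, round robin, where requests are handed to the instances in turn. This is a toy model: `RoundRobinBalancer` and the instance names are hypothetical, and real load balancers offer more policies plus health checks.

```python
class RoundRobinBalancer:
    """Hands out backends in turn and lets one be swapped while running."""

    def __init__(self, backends):
        self._backends = list(backends)
        self._next = 0

    def route(self):
        # Pick the next backend in the rotation.
        backend = self._backends[self._next % len(self._backends)]
        self._next += 1
        return backend

    def replace(self, old, new):
        # Rolling update step: swap one old instance for a new one.
        self._backends[self._backends.index(old)] = new


lb = RoundRobinBalancer(["app-v1-a", "app-v1-b"])
before = [lb.route() for _ in range(4)]

# Upgrade one instance. The other keeps serving traffic in the meantime.
lb.replace("app-v1-a", "app-v2-a")
after = [lb.route() for _ in range(2)]

print(before, after)
```

Repeating the `replace` step for each remaining instance shifts all traffic to the new version without ever having zero instances available.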

Conclusion

That would be all I wanted to say. I hope it helped you gain some clarity about the cloud because, in my opinion, it is the most important area of the tech industry, and sooner or later most developers will need to be acquainted with it. Even if their main job is not configuring and working directly with these cloud resources, it surely helps to see the bigger picture and to optimize or make better decisions.