Understanding Vertical Scaling vs Horizontal Scaling

Scaling is about allocating resources for an application and managing those resources efficiently to minimize contention. The user experience (UX) is negatively impacted when an application requires more resources than are available.

What is Scalability?

The scalability of an application is a measure of the number of users it can effectively support at the same time. The point at which an application cannot handle additional users effectively is the limit of its scalability.

Scalability reaches its limit when a critical hardware resource runs out, though scalability can sometimes be extended by providing additional hardware resources.

The hardware resources needed by an application usually include:

  1. CPU
  2. Physical memory
  3. Disk storage (capacity and throughput, e.g. HDD vs. SSD)
  4. Network Bandwidth
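
To make the idea of a critical resource concrete, here is a minimal sketch that finds which of the four resources runs out first on a single node. The capacity and demand figures, and the names NODE_CAPACITY, DEMAND_PER_1000_USERS, and max_users, are hypothetical and chosen only for illustration; a real application would measure these values.

```python
# Illustrative numbers only: what one node offers, and what 1,000
# concurrent users are assumed to consume of each resource.
NODE_CAPACITY = {"cpu_cores": 8, "memory_gb": 32, "disk_mbps": 400, "network_mbps": 1000}
DEMAND_PER_1000_USERS = {"cpu_cores": 20, "memory_gb": 50, "disk_mbps": 500, "network_mbps": 2000}


def max_users(capacity, demand_per_1000):
    """Return (user_limit, bottleneck_resource) for a single node."""
    # For each resource, how many users fit before that resource is exhausted.
    limits = {name: capacity[name] * 1000 // demand_per_1000[name] for name in capacity}
    # The resource with the smallest limit is the one that caps scalability.
    bottleneck = min(limits, key=limits.get)
    return limits[bottleneck], bottleneck


if __name__ == "__main__":
    limit, resource = max_users(NODE_CAPACITY, DEMAND_PER_1000_USERS)
    print(f"about {limit} users, limited by {resource}")
```

Under these assumed numbers the CPU is exhausted first, so adding memory or bandwidth alone would not raise the limit.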

Physical Server or Virtual Machine – It doesn’t matter

An application runs on multiple nodes, which have hardware resources. Application logic runs on compute nodes and data is stored on data nodes. There are other types of nodes, but these are the primary ones. A node might be a virtual machine (a slice of a physical server), an entire physical server, or even a cluster of servers, but the generic term node is useful when the underlying resource doesn’t matter.

Usually it doesn’t matter.

The manner in which we add these resources defines which of two scaling approaches we take.

• To vertically scale up is to increase overall application capacity by increasing the resources within existing nodes.
• To horizontally scale out is to increase overall application capacity by adding nodes.
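
The two approaches can be sketched with a toy capacity model. This is an idealized illustration, not a sizing tool: it assumes total capacity is simply the number of nodes multiplied by the users each node supports, and the names Cluster, scale_up, and scale_out are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cluster:
    nodes: int            # how many nodes serve the application
    users_per_node: int   # assumed capacity of each node

    @property
    def capacity(self):
        # Idealized: capacity grows linearly with nodes and node size.
        return self.nodes * self.users_per_node


def scale_up(cluster, factor):
    """Vertical scaling: same node count, each node made more capable."""
    return replace(cluster, users_per_node=cluster.users_per_node * factor)


def scale_out(cluster, extra_nodes):
    """Horizontal scaling: same node size, more nodes."""
    return replace(cluster, nodes=cluster.nodes + extra_nodes)


if __name__ == "__main__":
    base = Cluster(nodes=2, users_per_node=400)
    print(base.capacity, scale_up(base, 2).capacity, scale_out(base, 2).capacity)
```

In this model both paths reach the same total capacity; in practice they differ in cost, in downtime, and in how much the software has to change.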

Take this Example:

Increasing Capacity of Roadways

Consider a road for automobile travel. If the road was unable to support the desired volume of traffic, we could improve matters in a number of possible ways.

Option #1 – One improvement would be to upgrade the road materials (“the hardware”) from a dirt road to pavement to support higher travel speeds. This is vertically scaling up; the cars and trucks (“the software”) will be able to go faster.

Option #2 – Alternatively, we could widen the road to multiple lanes. This is horizontally scaling out; more cars and trucks can drive in parallel. And of course we could both upgrade the road materials and add more lanes, combining scaling up with scaling out.


Vertically Scaling Up

Vertically scaling up is also known simply as vertical scaling or scaling up. The main idea is to increase the capacity of individual nodes through hardware improvements. This might include adding memory, increasing the number of CPU cores, or other single node changes.

There are no guarantees that sufficiently capable hardware exists or is affordable. And once you have the hardware, you are also limited by the extent to which your software is able to take advantage of the hardware.
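
One common way to reason about that software limit is Amdahl's law, which the original text does not name; it is brought in here only as an illustration. The sketch assumes a fixed fraction of the work can run in parallel, and the function name speedup is hypothetical.

```python
def speedup(parallel_fraction, cores):
    """Amdahl's law: overall speedup when only part of the work
    can be spread across the available cores."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Assume half of the application's work is parallelizable.
    for cores in (1, 2, 4, 64):
        print(cores, round(speedup(0.5, cores), 2))
```

With half the work serial, the speedup stays below 2x however many cores are added, which is why bigger hardware alone may not help.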

Because hardware changes are involved, this approach usually involves downtime.

Horizontally Scaling Out

Horizontally scaling out, also known simply as horizontal scaling or scaling out, increases overall application capacity by adding entire nodes. Each additional node typically adds equivalent capacity, such as the same amount of memory and the same CPU.
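
When every node adds the same capacity, sizing a deployment becomes a simple division. The sketch below assumes a known, fixed per-node capacity, which is an idealization; nodes_needed is a hypothetical helper name.

```python
def nodes_needed(target_users, users_per_node):
    """Smallest number of equal-capacity nodes that covers the target load."""
    if users_per_node <= 0:
        raise ValueError("users_per_node must be positive")
    # Ceiling division using integers only, so no rounding surprises.
    return -(-target_users // users_per_node)


if __name__ == "__main__":
    # Assumed figures: each node handles 400 users, target is 1,000 users.
    print(nodes_needed(1000, 400))
```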

The architectural challenges in vertical scaling differ from those in horizontal scaling; the focus shifts from maximizing the power of individual nodes to combining the power of many nodes.

Homogeneous Nodes

When all the nodes supporting a specific function are configured identically—same hardware resources, same operating system, same function-specific software—we say these nodes are homogeneous.

Horizontal scaling with homogeneous nodes is an important simplification. If the nodes are homogeneous, then basic round-robin load balancing works nicely, capacity planning is easier, and it is easier to write rules for auto-scaling. If nodes can be different, it becomes more complicated to efficiently distribute requests because more context is needed.
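
A minimal sketch of why homogeneous nodes keep load balancing simple. With identical nodes, a plain rotation spreads requests evenly with no extra information. With different nodes, the balancer needs at least a weight per node. The node names and weights are made up for illustration, and the weighted scheme is one simple option among many.

```python
from collections import Counter
from itertools import cycle


def round_robin(nodes, request_count):
    """Hand out requests in strict rotation; returns requests per node."""
    rotation = cycle(nodes)
    return Counter(next(rotation) for _ in range(request_count))


def weighted_rotation(weights, request_count):
    """Rotation where each node appears once per unit of its weight.
    The weights are the extra context heterogeneous nodes require."""
    expanded = [node for node, weight in weights.items() for _ in range(weight)]
    return round_robin(expanded, request_count)


if __name__ == "__main__":
    print(round_robin(["node-a", "node-b", "node-c"], 9))
    print(weighted_rotation({"big": 2, "small": 1}, 9))
```

The first call needs only the list of nodes; the second works only if someone has measured and maintained the relative capacity of each node.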

Scalability is a Business Concern

A speedy website is good for business. A Compuware analysis of 33 major retailers across 10 million home page views showed that a 1-second delay in page load time reduced conversions by 7%. Google observed that adding a 500-millisecond delay to page response time caused a 20% decrease in traffic, while Yahoo! observed a 400-millisecond delay caused a 5-9% decrease. Amazon.com reported that a 100-millisecond delay caused a 1% decrease in retail revenue. Google has started using website performance as a signal in its search engine rankings.
