Elasticity vs Scalability

When discussing cloud computing or distributed systems, we often hear about elasticity and scalability. The two terms are not always used appropriately, so let’s shed some light on their subtle difference.

Scalability

Scalability is the characteristic that allows a computing system to add or remove resources on demand and quickly. In most cases, adding or removing resources should cause no downtime, or at most a very short one. There are two types of scalability, vertical and horizontal, the latter being the one most commonly used in modern cloud systems. Together they are also referred to as scalability with respect to the size of the system.

Another type we may need is geographic scalability, i.e., the system keeps its performance unchanged even as the distance between system nodes and users increases.

Finally (last but not least), we may also need administrative scalability, i.e., the distributed system continues to run the same way whether it is deployed on a geographically distributed data center managed by a single provider or on multiple data centers offered by different providers, so even if the system nodes are managed by different administrative entities.

Most distributed systems deal primarily with scalability with respect to size, which is what the term commonly refers to (ideally, of course, a system would be scalable in all three respects).

Regarding scalability with respect to size, we talk about scale-in/out for horizontal scalability: a service is offered by one or more servers, and we increase/decrease the number of servers offering that service.

We talk instead about scale-up/down for vertical scalability: changes in workload are addressed by adding/removing resources on the existing machines (e.g., more memory, more processing capacity).
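The difference between the two directions of size scalability can be sketched in a few lines of Python. This is only an illustrative model, not a real provisioning API: the `Cluster` type, its fields, and the "requests per second" unit are all assumptions made for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cluster:
    """Hypothetical model: identical servers behind one service."""
    servers: int              # how many servers offer the service
    capacity_per_server: int  # assumed unit: requests/s per server

    @property
    def total_capacity(self) -> int:
        return self.servers * self.capacity_per_server


def scale_out(cluster: Cluster, n: int) -> Cluster:
    """Horizontal scaling: add n servers of the same size."""
    return replace(cluster, servers=cluster.servers + n)


def scale_in(cluster: Cluster, n: int) -> Cluster:
    """Horizontal scaling: remove n servers, keeping at least one."""
    return replace(cluster, servers=max(1, cluster.servers - n))


def scale_up(cluster: Cluster, extra: int) -> Cluster:
    """Vertical scaling: make every server more powerful."""
    return replace(cluster,
                   capacity_per_server=cluster.capacity_per_server + extra)


def scale_down(cluster: Cluster, less: int) -> Cluster:
    """Vertical scaling: shrink every server, keeping capacity positive."""
    return replace(cluster,
                   capacity_per_server=max(1, cluster.capacity_per_server - less))


base = Cluster(servers=2, capacity_per_server=100)
print(scale_out(base, 2).total_capacity)   # 4 servers x 100
print(scale_up(base, 100).total_capacity)  # 2 servers x 200
```

Both calls reach the same total capacity here; the point of the sketch is only that scale-out changes the number of servers while scale-up changes the size of each one.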

Elasticity

Elasticity means that the “power” of a system scales automatically, increasing or decreasing to meet varying workloads as resources are added or removed proportionally. This is precisely why elasticity requires a scalable system. Elasticity thus builds on scalability and extends it by introducing the concept of resource management. Thanks to elasticity, it is no longer necessary to oversize the infrastructure for peak loads.
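As a minimal sketch of what "scaling automatically and proportionally" could look like, the following Python snippet picks the number of servers from the current demand. The function names, the per-server capacity, and the minimum/maximum bounds are hypothetical choices for illustration; real cloud autoscalers use richer policies (cooldowns, thresholds, predictions).

```python
import math


def servers_needed(demand: float, capacity_per_server: float,
                   min_servers: int = 1, max_servers: int = 100) -> int:
    """Smallest server count whose total capacity covers the demand,
    clamped to the allowed range (both bounds are assumptions)."""
    needed = math.ceil(demand / capacity_per_server)
    return max(min_servers, min(max_servers, needed))


def autoscale(demand_trace, capacity_per_server):
    """Apply the proportional rule at every time step of the trace."""
    return [servers_needed(d, capacity_per_server) for d in demand_trace]


# Demand in requests/s over four time steps; each server handles 100.
print(autoscale([50, 250, 900, 120], capacity_per_server=100))
# → [1, 3, 9, 2]
```

The allocation follows the workload up and back down, which is the behavior that removes the need to size the infrastructure on the peak.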

Scalable data center vs Static data center

In a cloud data center (right graph), the only unused resources are the gray area between resource capacity and actual resource demand. In the traditional, non-cloud solution, capacity is sized either on the peak (oversizing) or on the average value (accepting the risk of undersizing). Sizing on the peak fixes the capacity, and as the first graph shows, a much larger amount of resources is wasted: in traditional data centers server utilization is below 20% (Google itself, until 2007, reported a utilization never higher than 30%*). Network components in traditional data centers are utilized even less than servers, at under 10%.

Undersizing in traditional data centers, in turn, leads to user dissatisfaction and a shrinking user base, and therefore to reduced profits. According to a rule of thumb, if users do not receive the requested service within 4 seconds, they start to lose trust in the service provider and move to another one.
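The trade-off between sizing on the peak, sizing on the average, and following the demand can be made concrete with a small calculation. The demand values below are invented for illustration, and the elastic capacity is assumed to track demand with a fixed headroom of 10 units; neither comes from measured data.

```python
def provisioning_report(demand, capacity):
    """Compare a capacity plan against a demand trace (same length).

    utilization: share of provisioned capacity actually used
    wasted:      capacity provisioned but not needed (over-sizing)
    unmet:       demand that could not be served (under-sizing)
    """
    pairs = list(zip(demand, capacity))
    served = sum(min(d, c) for d, c in pairs)
    return {
        "utilization": served / sum(capacity),
        "wasted": sum(max(c - d, 0) for d, c in pairs),
        "unmet": sum(max(d - c, 0) for d, c in pairs),
    }


demand = [20, 30, 100, 40, 10]  # made-up workload per time step

peak_sized = [max(demand)] * len(demand)                # static, sized on peak
avg_sized = [sum(demand) / len(demand)] * len(demand)   # static, sized on average
elastic = [d + 10 for d in demand]                      # follows demand + headroom

for name, plan in [("peak", peak_sized), ("average", avg_sized),
                   ("elastic", elastic)]:
    print(name, provisioning_report(demand, plan))
```

With these numbers, peak sizing wastes the most capacity, average sizing wastes less but leaves part of the demand unserved, and the elastic plan keeps both waste and unmet demand low.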

How can cloud systems measure this elasticity?

Although elasticity is an obvious benefit of the cloud, at the moment there is no contract that quantifies it.

One possible metric is illustrated by the graph above. The x-axis shows time and the y-axis the amount of resources the cloud allocates over time. The red curve is the resource demand, i.e., the amount of resources needed to meet exactly the workload handled by the cloud system; the blue curve is the amount of resources the system is actually supplying at that moment. We speak of under-provisioning when the supply is below the demand and of over-provisioning when it is above. Good elasticity therefore means minimizing the red and blue areas as much as possible, so that the supply stays close to the demand curve.

The metrics proposed to measure elasticity therefore try to capture its accuracy, that is, the sum of the red and blue areas, and the reaction speed of the system (in the figure the system is quite slow), that is, the total time spent in over- and under-provisioning.
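The two quantities described above can be approximated from sampled curves. The sketch below assumes that demand and supply are sampled at the same, evenly spaced instants and treats each sample as a rectangle of width `step`; the function name, the returned keys, and the sample values are hypothetical.

```python
def elasticity_metrics(demand, supply, step=1.0):
    """Approximate accuracy and timing metrics from sampled curves.

    under_area / over_area: area between the curves while supply is
        below / above demand (rectangle approximation).
    under_time / over_time: total time spent under- / over-provisioned.
    """
    if len(demand) != len(supply):
        raise ValueError("demand and supply must have the same length")
    pairs = list(zip(demand, supply))
    return {
        "under_area": sum(max(d - s, 0) for d, s in pairs) * step,
        "over_area": sum(max(s - d, 0) for d, s in pairs) * step,
        "under_time": sum(step for d, s in pairs if s < d),
        "over_time": sum(step for d, s in pairs if s > d),
    }


demand = [2, 4, 6, 6, 3, 2]  # resources needed at each sample (made up)
supply = [3, 3, 4, 6, 6, 3]  # resources allocated; note the lag behind demand

print(elasticity_metrics(demand, supply))
# → {'under_area': 3.0, 'over_area': 5.0, 'under_time': 2.0, 'over_time': 3.0}
```

Smaller areas indicate a more accurate allocation, and shorter times indicate a faster reaction; a perfectly elastic system would score zero on all four values.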