Elasticity is the ability of your resources to scale in response to stated criteria, often CloudWatch alarms.
This is what happens when an Auto Scaling group adds instances behind a load balancer whenever a web application gets a lot of traffic.
Scalability is required for elasticity, but not the other way around.
Not all AWS services support elasticity, and even those that do often need to be configured in a certain way.
--
In summary: scalability gives you the ability to increase or decrease your resources, and elasticity lets those operations happen automatically according to configured rules.
One picture is worth a thousand words. I found it in Fundamentals of Software Architecture: An Engineering Approach by Mark Richards and Neal Ford.
Scalability is the ability to increase or decrease resources according to the system's workload demands; it does not have to happen automatically.
Elasticity is the ability to increase or decrease those resources automatically or dynamically as needed, so that the resources available always match the current demand.
So, in short: elasticity is the ability of a system to scale automatically.
Elasticity addresses the short-term, fluctuating requirements of a service or application, while scalability supports its long-term needs.
Elasticity is the ability of a system to increase (or decrease) its compute, storage, networking, etc. capacity based on specified criteria such as the total load on the system.
For example, you can implement a backend system that initially has 1 server in its cluster but configure it to add an extra instance to the cluster if the average per minute CPU utilization of all the servers in the cluster exceeds a given threshold (e.g. 70%).
Similarly, you can configure your system to remove servers from the backend cluster if the load on the system decreases and the average per-minute CPU utilization goes below a threshold defined by you (e.g. 30%).
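The scale-out/scale-in logic described above can be sketched as a simple decision function. The 70%/30% thresholds come from the example; the function itself, its name, and its size limits are illustrative and not tied to any particular cloud API:

```python
def desired_cluster_size(current_size, avg_cpu_percent,
                         scale_out_threshold=70.0,
                         scale_in_threshold=30.0,
                         min_size=1, max_size=10):
    """Return the cluster size an autoscaler would target.

    Adds one instance when the average per-minute CPU utilization
    exceeds the scale-out threshold, removes one when it drops below
    the scale-in threshold, and otherwise leaves the cluster unchanged.
    """
    if avg_cpu_percent > scale_out_threshold and current_size < max_size:
        return current_size + 1
    if avg_cpu_percent < scale_in_threshold and current_size > min_size:
        return current_size - 1
    return current_size
```

A real autoscaler (for example, AWS Auto Scaling driven by CloudWatch alarms) evaluates metrics like this over configured periods and applies cooldowns so the cluster does not oscillate between scaling out and scaling in.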
As another example, you can configure your system to increase the total disk space of your backend cluster by an order of 2 if more than 80% of the total storage currently available to it is used. If for whatever reason, at a later point, data is deleted from the storage and, say, the total used storage goes below 20%, you can decrease the total available disk space to its original value.
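The storage example can be sketched the same way. The 80%/20% thresholds and the doubling factor follow the text above; the function name and parameters are just an illustration:

```python
def desired_disk_space(total_gb, used_gb, original_gb,
                       grow_threshold=0.80, shrink_threshold=0.20,
                       growth_factor=2):
    """Return the total disk space the system should provision.

    Doubles capacity when more than 80% of the available storage is
    used; shrinks back to the original capacity when utilization falls
    below 20% (provided the remaining data still fits).
    """
    utilization = used_gb / total_gb
    if utilization > grow_threshold:
        return total_gb * growth_factor
    if utilization < shrink_threshold and total_gb > original_gb \
            and used_gb <= original_gb:
        return original_gb
    return total_gb
```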
But some systems (e.g. legacy software) are not distributed and may only be able to use one CPU core. So even though you can increase the compute capacity available to you on demand, the system cannot use this extra capacity in any way. Such systems are not scalable. A scalable system, by contrast, can use the increased compute capacity to handle more load without impacting the overall performance of the system.
A scalable system does not depend on elasticity though. Traditionally, IT departments could replace their existing servers with newer servers that had more CPUs, RAM, and storage and port the system to the new hardware to employ the extra compute capacity available to it.
Cloud environments (AWS, Azure, Google Cloud, etc.) offer elasticity, and some of their core services are also scalable out of the box. Furthermore, if you build scalable software, you can deploy it to these cloud environments and benefit from the elastic infrastructure they provide to automatically increase or decrease the compute resources available to you on demand.
From my limited understanding of those concepts, an example:
Say we have a system of 5 computers that does 5 work units; if we need one more work unit to be done, we'll have to use one more computer. That is a scalable system, but it is not elastic: somebody is going to have to go and get that other computer. Also, if a new computer is purchased and the extra work unit is no longer needed, the system is stuck with a redundant resource.
Now, let's say that the same system uses, instead of its own computers, a cloud service suited to its needs. Ideally, when the workload goes up one work unit, the cloud will provide the system with another "computing unit"; when the workload goes back down, the cloud will gracefully stop providing that computing unit. That is a situation where a system is both scalable and elastic.
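The five-computers example can be made concrete with a toy elastic pool that adds or releases one "computing unit" per work unit of demand. This is a deliberately simplified model (class and method names are made up for illustration), not how any real cloud provisioner works:

```python
class ElasticPool:
    """Toy model: one computing unit handles exactly one work unit."""

    def __init__(self, units=5):
        self.units = units

    def rebalance(self, workload):
        # Elasticity: the pool grows or shrinks to match demand
        # automatically, so no unit sits idle and none is missing.
        self.units = workload
        return self.units
```

With 5 units and a workload of 6, `rebalance(6)` grows the pool to 6; when the workload drops back to 5, `rebalance(5)` releases the extra unit instead of leaving it stranded, which is exactly what the manually scaled system above could not do.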
Scalability and Elasticity both refer to meeting traffic demand but in two different situations.
Scalability is meeting predictable traffic demand while elasticity is meeting sudden traffic demand.
image ref: https://www.skylinesacademy.com/blog/2020/3/6/az-900-cloud-concepts-scalability-and-elasticity
Both scalability and elasticity refer to the ability of a system to grow and shrink in capacity and resources, and to this extent they are effectively one and the same. The difference is usually in the needs and conditions under which this happens. Scalability is mostly manual, predictive, and planned for expected conditions. Elasticity is automatic and reactive to external stimuli and conditions: in other words, elasticity is automatic scalability in response to external conditions and situations.
Scalability refers to the ability of a system, network, or process to handle a growing amount of work or load by adding resources, typically in a predictable and controlled manner. In a scalable system, capacity can be made larger or smaller as needed to meet the changing demands of the workload.
Elasticity, on the other hand, refers to the ability of a system to automatically scale its resources up or down in response to changing demand. An elastic system adjusts its capacity to match the current workload without any manual intervention. This allows the system to be flexible and responsive, and minimizes waste by using only the resources that are needed.