Mellon Collie and the Infinite Scalability


Ever hear the phrase “Infinitely Scalable”, or “Infinite Scalability”? If you are in the field of IT, of course you have! This phrase makes me not just irritated, but also a bit sad and pensive. Both the “Infinite” part, and the “Scalable” part.

(Note: My blog focuses on data storage technology, and that is the lens through which I'll apply my critical view of this phrase.) (Note: The artwork in this post is from the Smashing Pumpkins album of virtually the same name… I did not ask for permission.)

Isilon says they are “Infinitely Scalable” (See http://emergingtechblog.emc.com/introducing-the-one-the-only-the-unified-scale-out-storage-system-from-isilon/), allowing users to …


…. take advantage of one simple, highly reliable, infinitely scalable storage system. ONLY from Isilon.

“Only from Isilon”, they say? What about Nasuni, who also claims Infinite Scalability? (See http://www.nasuni.com/news/nasuni-provides-global-architecture-engineering-firm-infinitely-scalable-primary-storage-continuous-data-protection/)


Nasuni Provides […] Infinitely Scalable Primary Storage and Continuous Data Protection

And Ceph? (See http://ceph.com/docs/master/architecture/)


Ceph provides an infinitely scalable Ceph Storage Cluster based upon RADOS …

And ScaleIO, which has been described by EMC’s Chad Sakac as (http://virtualgeek.typepad.com/virtual_geek/2015/05/emc-day-3-scaleio-unleashed-for-the-world.html) …


something that scales to infinity and beyond

Unfortunately, some of the "Infinite Scalability" lingo has been applied to products from NetApp (my current employer). To the best of my knowledge, this has not come from NetApp itself or from any of the lead NetApp evangelists. One such example can be found on Reddit (See https://www.reddit.com/r/storage/comments/3bti4b/netapps_clustered_data_ontap/):


Fastest SPC results of all the major players, interoperable with any cloud, can handle disks of all kinds together in the same chassis, does every protocol under the sun, and will (with time) be infinitely scalable in both directions with zero downtime ever.

While a NetApp Clustered ONTAP (cDOT) solution can be built at an enormous scale (approaching 100PB of protected usable capacity, pre-deduplication and compression, with no thin-provisioning, snapshot, or cloning "effective capacity" marketing – see http://www.netapp.com/us/products/storage-systems/fas8000/fas8000-tech-specs.aspx), and supports non-disruptive changes of scale (adding/removing controllers, adding/removing physical capacity), it is certainly not "Infinite". This is one of the many reasons I love the NetApp culture… no need to go overboard with "Infinite" when reality is impressive enough.

But I've had enough of those others. Enough of the "Infinite Scalability" noise.

Time to explore the topic of “Infinite Scalability”.


  • Is “Infinite Scalability” something that exists in reality?
  • Are we all in agreement on what “Scalable” means?
  • What is even being measured, and how?

And why do we even need to discuss this? Simple:

The Intellectual Arrogance in IT

Source: https://www.pinterest.com/pin/200973202095467701/

There are laws that exist in the universe. Many of these laws we are still learning, while others (such as the futility of keeping socks properly paired) are well understood. We need to dream big and be creative, but that does not mean that the laws of the universe can be warped to meet the demands of our imagination.

Many of the individuals in IT are extremely intelligent. So why do so many of the "thought leaders" in IT propose ideas suggesting that our industry has developed systems that somehow break these laws? These laws are readily recognized and appreciated in other science and engineering disciplines – why not in IT?

First, let us get a better grip on the definition of Scalability.

The Ways of Scale

When the term “Scalable” is used, consider that this word can have multiple meanings.


A solid starting point for the definition of Scalability can be found on Wikipedia (See https://en.wikipedia.org/wiki/Scalability):

Scalability is the capability of a system, network, or process to handle a growing amount of work, or its potential to be enlarged in order to accommodate that growth. […]

An algorithm, design, […] or other system is said to scale if it is suitably efficient and practical when applied to large situations […]. If the design or system fails when a quantity increases, it does not scale.

I find it useful to consider the root of the word scalability (scale), along with the above definition, to identify the three valid dimensions of Scal(ing).

  1. You can build something that is Of a Large Scale (or of a great scale)… such as a multi-petabyte storage platform.

    Here, the requirement is that, factoring in its size, it delivers a correspondingly large (but not necessarily linear) capacity.

  2. You can build something that has an architecture that Can be Built at Different Scales (but not necessarily able to change size once built).

    Consider the scalability of VSAN here (see http://www.vmware.com/files/pdf/products/vsan/VMware-Virtual-San6-Scalability-Performance-Paper.pdf). This document demonstrates that VSAN can be built at different sizes, and how well it works at those sizes, with data included for configurations of 4, 16, 32, or 64 hosts.

  3. The Ability of a System to Support Dynamic Resizing (or Dynamic Change of Scale), where the size of the system can be increased (or even decreased) without disruption.

    This dimension of Scalability is suggested with the use of phrases such as “seamless scalability”.

    An aspect to consider here is administrator involvement – is the resizing automatic? Some storage systems will add capacity to the global pool automatically, even non-disruptively (you may see the phrase "non-disruptive scalability"), while others require intervention such as disk configuration and data rebalancing (a small sketch of that rebalancing cost follows this list).
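
To make that rebalancing cost concrete, here is a minimal Python sketch. It is not modeled on any particular vendor's implementation; it simply shows what happens when data is placed by naive modulo hashing and one node is added to the pool.

```python
import hashlib

def placement(key: str, node_count: int) -> int:
    """Naive placement: hash the key and take it modulo the node count."""
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % node_count

keys = [f"block-{i}" for i in range(100_000)]

# Placement with 4 nodes, then again after growing the pool to 5 nodes.
before = {k: placement(k, 4) for k in keys}
after = {k: placement(k, 5) for k in keys}

moved = sum(1 for k in keys if before[k] != after[k])
print(f"{moved / len(keys):.0%} of blocks would have to move")  # roughly 80%
```

Smarter placement schemes (consistent hashing, for example) cut the movement down to roughly 1/N of the data, but they do not eliminate it – something still has to move the bits, and that work is what hides behind words like "seamless".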

With these scalability dimensions now identified, here are some additional observations:

  • A "Scalable" system need not support all three dimensions simultaneously. For example, just because a system can be built at a large scale, that does not mean it can be dynamically resized. The same is true for a system that can be built at different sizes.
  • If a “Scalable” system can be built at different scales, that does not necessarily mean it can be built at a large scale.
  • We have not closely reviewed the definition of a cluster (or a single system) with respect to scalability. Much of the definition of clustering depends critically upon the consistency of the system ("strong" vs "eventual", in cluster terms). We will review that later.

Scaling in The World

Scaling is not a new concept. I'm a big believer in looking at existing infrastructure (and at other engineering disciplines) for inspiration here.

Bridges

A person would never say they can build a bridge at any scale, much less of “infinite scale”. With respect to size, consider the obvious limits on bridge lengths (See Longest Suspension Bridges in the world https://en.wikipedia.org/wiki/List_of_longest_suspension_bridge_spans).

The Akashi Kaikyō Bridge between Kobe and Awaji Island in Japan

In a suspension bridge, the main cables suspend the deck (girder, roadway). Most of the bridge's weight (and that of any vehicles on the bridge) is suspended from the cables. The cables are held up only by the towers, which means that the towers support a tremendous weight (load). The steel cables are both strong and flexible, which makes long-span suspension bridges susceptible to wind forces. These days, engineers take special measures to assure stability ("aerodynamic stability") to minimize vibration and swaying in a suspension bridge under heavy winds. (The 1940 Tacoma Narrows Bridge is the world's most famous example of aerodynamic instability in a suspension bridge.)

Regarding the ability to change scale, a person would never say that a bridge can have lanes added or removed at any time. Consider the effort to add only a couple of lanes to an existing, active bridge: the Quesnell Bridge (a girder bridge located on Edmonton's busiest traffic corridor, the Whitemud Freeway, with volumes of more than 120,000 vehicles per day). When this section of the freeway needed widening, rerouting that traffic onto a detour was not an option. See the bridge widening project – http://www.cisc-icca.ca/projects/alberta/2011/whitemud-drive-%E2%80%93-quesnell-bridge-widening


“We needed to design a system that would be cost effective, feasible and involve minimum construction time while allowing traffic to continue to flow,” says Gary Kriviak […].

Early analysis determined that there was some reserve capacity for additional weight on the existing piers and foundations, indicating that pier cap extensions were a feasible approach for supporting a widened bridge deck. A more conventional pier widening scheme would require construction from the foundation level up.

If no reserve capacity for additional load had been available, new piers and foundations would have been required. Then – do the new lanes on the new piers join the old lanes, or do we have logically separate bridges? And if so, how does that impact the intersections on either side of the bridge? Not to mention, the planning alone here is costly. You don't just bolt on a few lanes because a data sheet says that you can.

In conclusion here, saying a bridge is "infinitely scalable" (build one of any size, and change its size at any time) would be "intellectually arrogant".

Airplanes


Ever try resizing an airplane after it was built? For example, extending the length of the fuselage to hold more capacity, or maybe bolting on an extra engine for more speed?

Although the design of an airplane is reasonably scalable (planes of different sizes sort of look the same), the laws of physics require planes of different sizes to be built slightly differently (different materials, fasteners, etc.). Any change to one dimension has a ripple effect on the whole. To get a feel for the countless interdependencies, check out the airplane design tool at http://developer.x-plane.com/manuals/planemaker/

Obviously, dynamic plane resizing is not possible, nor can planes be built at infinite size. As with bridges, saying airplane design is "infinitely scalable" (build one of any size, and change its size at any time) would also be "intellectually arrogant".

Skyscrapers

The engineering challenges of building a skyscraper are fascinating. See https://en.wikipedia.org/wiki/Skyscraper_design_and_construction


Source: http://wirednewyork.com/forum/printthread.php?t=21249&pp=15&page=3

  • A taller building requires more elevators to service the additional floors, but the elevator shafts consume valuable floor space. If the service core (which contains the elevator shafts) becomes too big, it can reduce the profitability of the building.
  • The load a skyscraper experiences is largely from the force of the building material itself. In most building designs, the weight of the structure is much larger than the weight of the material that it will support beyond its own weight. In technical terms, the dead load, the load of the structure, is larger than the live load, the weight of things in the structure (people, furniture, vehicles, etc.). As such, the amount of structural material required within the lower levels of a skyscraper will be much larger than the material required within higher levels. This is not always visually apparent.
  • The wind loading on a skyscraper is also considerable. In fact, the lateral wind load imposed on super-tall structures is generally the governing factor in the structural design. Wind pressure increases with height, so for very tall buildings, the loads associated with wind are larger than dead or live loads.

Again… thinking "infinite scalability" when considering the challenges of building a skyscraper is crazy. The difference is that the challenges are visible in a bridge, an airplane, or a skyscraper – in IT, all of this is much harder to visualize.

Consistency Impacts Scaling


It is important to consider the consistency of a system when discussing scaling. Specifically, in IT, Cluster Consistency impacts Scaling.

We have so far put forward some compelling cases of how "infinite scalability" is not a reality in other engineering disciplines. Yet you may still believe that IT is different, thinking "Hasn't the internet demonstrated that it is infinitely scalable (IP address limits aside)? If it can scale to be so enormous, why can't a simple storage system?" Here, scalable is a measurement of both overall size and the ability to support non-disruptive resizing.

For the internet, the secret to "scalability" lies in DNS. The distributed (and eventually consistent) nature of the name lookup protocol allows it to "scale well". The name servers are not strictly consistent with each other; they may update asynchronously. What this means (and we implicitly know this) is that the internet is not a cluster with strong consistency, but only eventual consistency. A good definition of these two types of consistency can be found at https://en.wikipedia.org/wiki/Scalability:

In the context of scale-out data storage, scalability is defined as the maximum storage cluster size which guarantees full data consistency, meaning there is only ever one valid version of stored data in the whole cluster, independently from the number of redundant physical data copies. Clusters which provide “lazy” redundancy by updating copies in an asynchronous fashion are called ‘eventually consistent’. This type of scale-out design is suitable when availability and responsiveness are rated higher than consistency, which is true for many web file hosting services or web caches (if you want the latest version, wait some seconds for it to propagate). For all classical transaction-oriented applications, this design should be avoided.

Many open source and even commercial scale-out storage clusters, especially those built on top of standard PC hardware and networks, provide eventual consistency only. […] Write operations invalidate other copies, but often don’t wait for their acknowledgements. Read operations typically don’t check every redundant copy prior to answering, potentially missing the preceding write operation. The large amount of metadata signal traffic would require specialized hardware and short distances to be handled with acceptable performance (i.e. act like a non-clustered storage device or database).

In other words, if the internet needed to be strongly consistent, it would likely be limited to a single NAS server sitting in a single city.
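
For intuition, here is a minimal Python sketch of that DNS-style behaviour. It is purely illustrative (not how BIND or any real name server works): the primary accepts an update immediately, the secondaries copy it later, and a resolver that happens to hit a stale secondary still gets an answer – just an old one.

```python
class NameServer:
    """A toy name server holding a dictionary of name -> address records."""
    def __init__(self):
        self.records = {}

    def resolve(self, name):
        return self.records.get(name)

primary = NameServer()
secondaries = [NameServer(), NameServer()]

def update(name, address):
    """Writes go to the primary only; secondaries are updated asynchronously."""
    primary.records[name] = address

def propagate():
    """Background 'zone transfer': secondaries catch up with the primary."""
    for s in secondaries:
        s.records = dict(primary.records)

update("example.com", "203.0.113.10")
propagate()
update("example.com", "203.0.113.99")          # the record changes on the primary

print(secondaries[0].resolve("example.com"))   # 203.0.113.10 - stale, but an answer
propagate()
print(secondaries[0].resolve("example.com"))   # 203.0.113.99 - eventually consistent
```

That willingness to hand out slightly stale answers is precisely what lets the system grow so large; demand strong consistency across every name server and the whole thing collapses back toward that single NAS server.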

Going back to the earlier discussion on scalability in other engineering disciplines – I suppose that if you loosened the definition of a skyscraper to include a collection of buildings, interconnected with hallways in a loose manner, then you could also say that a skyscraper is “infinitely scalable”. But we wouldn’t do that – a single building is considered to be the “consistency domain”, if you will, of a skyscraper.

What we see here are tradeoffs being identified in scaling. These are summarized in the CAP Theorem.

The CAP Theorem

Excellent read: http://guide.couchdb.org/draft/consistency.html

The CAP theorem describes a few different strategies for distributing application logic across networks. CouchDB’s solution uses replication to propagate application changes across participating nodes. This is a fundamentally different approach from consensus algorithms and relational databases, which operate at different intersections of consistency, availability, and partition tolerance.

The CAP theorem […] identifies three distinct concerns [and you can have two out of the three, but not all three]:

  • Consistency

    • All database clients see the same data, even with concurrent updates.
  • Availability

    • All database clients are able to access some version of the data.
  • Partition tolerance

    • The database can be split over multiple servers.


The CAP Theorem is usually discussed when analyzing databases, but it is valid when analyzing storage systems as well. It is the requirement of most storage systems (clusters) to provide strong consistency that ultimately limits their scalability (both size and resizing flexibility). A tip of the hat is deserved here to the application developers, who seem to have recognized these laws of the universe better than many platform techies who are preaching "infinite scalability".
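
As a toy illustration of the two-out-of-three tradeoff (my own sketch, not tied to CouchDB or any real database), consider a pair of replicas separated by a network partition. A CP-style store refuses the write rather than risk divergence; an AP-style store accepts it and lets the replicas disagree until the partition heals.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None

a, b = Replica("a"), Replica("b")
partitioned = True  # the link between the two replicas is down

def write_cp(value):
    """Consistency over availability: only commit if every replica can acknowledge."""
    if partitioned:
        raise RuntimeError("write rejected: cannot reach all replicas")
    a.value = b.value = value

def write_ap(value, local):
    """Availability over consistency: commit locally, replicate later."""
    local.value = value  # the other replica is now stale / divergent

try:
    write_cp("v2")
except RuntimeError as err:
    print(err)             # the CP system stays consistent but is unavailable

write_ap("v2", local=a)
print(a.value, b.value)    # 'v2' vs None - the AP system answers, but replicas diverge
```

A strongly consistent storage cluster is, by definition, living on the CP side of this tradeoff, and that choice is what caps how far (and how loosely) it can be stretched.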

Hadoop

There are some false beliefs that Hadoop is also infinitely scalable. (I won't include references to those docs – there are too many companies and names to list.) No surprise, it is not. (Thank you again NetApp: the Hadoop guide using E-Series does not contain a single use of the word "infinite" – see http://www.netapp.com/us/media/tr-3969.pdf.) One compromise that can be made when designing a Hadoop cluster is to restrict usage to large objects. By avoiding small objects, the absolute total capacity in bytes is maximized, since the object count (metadata) is what is constrained in a strongly consistent storage cluster.
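
To see why constraining the object count matters, here is a rough back-of-the-envelope calculation in Python. It leans on the commonly cited rule of thumb of roughly 150 bytes of NameNode heap per file/block object, plus an assumed 128 MB block size and 64 GB of metadata heap – all approximations for illustration, not guarantees.

```python
BYTES_PER_OBJECT = 150          # rough rule of thumb: NameNode heap per file/block object
BLOCK_SIZE = 128 * 1024**2      # assumed HDFS block size
NAMENODE_HEAP = 64 * 1024**3    # assumed heap available for metadata

def addressable_capacity(file_size: int) -> int:
    """Total bytes of data one NameNode heap can track, for a uniform file size."""
    blocks_per_file = max(1, file_size // BLOCK_SIZE)
    objects_per_file = 1 + blocks_per_file              # one file object plus its blocks
    max_files = NAMENODE_HEAP // (objects_per_file * BYTES_PER_OBJECT)
    return max_files * file_size

for size, label in [(1 * 1024**2, "1 MB files"), (1 * 1024**3, "1 GB files")]:
    print(f"{label}: ~{addressable_capacity(size) / 1024**5:.1f} PB addressable")
```

With these (made-up but plausible) numbers, the same heap that tracks well under a petabyte of tiny files can track tens of petabytes of large ones – same metadata budget, very different total capacity.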

Distributing the metadata in Hadoop also helps, but it is still not infinite – See http://www.smartdatacollective.com/michelenemschoff/191151/how-maximize-performance-and-scalability-within-your-hadoop-architecture

The default architecture of Hadoop utilizes a single NameNode as a master over the remaining data nodes. With a single NameNode, all data is forced into a bottleneck. This limits the Hadoop cluster to 50-200 million files.

The implementation of a single NameNode also requires the use of commercial-grade NAS, not budget-friendly commodity hardware.

A better alternative to the single NameNode architecture is one that uses a distributed metadata structure. A visualized comparison of the two architectures is provided below:

The fact that massive hyperscalers have used Hadoop feeds the infinite scalability myth. While that usage is true, many of them actually run multiple Hadoop systems that are loosely coupled (i.e. providing eventual consistency, not strong consistency), with redirectors used to map storage and tasks to the appropriate cluster.
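
Here is a hedged sketch of the "redirector" idea (my own simplification, not any hyperscaler's actual code): each dataset namespace is pinned to one of several independent Hadoop clusters, so each cluster's strongly consistent namespace stays small while the aggregate grows by simply adding clusters.

```python
# Static mapping of dataset namespaces to independent clusters. In practice this
# table would live in a small, separately managed directory service.
ROUTING_TABLE = {
    "clickstream": "hadoop-east",
    "billing": "hadoop-west",
    "sensor-archive": "hadoop-eu",
}

def route(dataset_path: str) -> str:
    """Redirector: return the cluster that owns a dataset's namespace."""
    namespace = dataset_path.split("/", 1)[0]
    try:
        return ROUTING_TABLE[namespace]
    except KeyError:
        raise LookupError(f"no cluster owns namespace {namespace!r}")

print(route("clickstream/2015/09/01"))   # -> hadoop-east
```

Note what is missing: nothing keeps the clusters consistent with each other. The "scalability" lives in the loose coupling, which is exactly the point.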

Quantifying Scalability

So we have established that scal(ing) is never “infinite”. But how is scaling / scalability measured?

I love the article “Scalability is not Boolean”, by Udi Dahan (See http://www.udidahan.com/2011/12/29/the-myth-of-infinite-scalability/)

The first issue with scalability is the use of the word as an adjective: scalable. [As in:] “Is the system scalable?” Or the similar verb use: “Does it scale?”

The problem here is the implication that there is a yes/no answer to the question [of scalability].

Scalability is more than a boolean yes or no, and more than just speed. It is complicated. Here are some aspects of measuring it (this list could be a long blog post unto itself, maybe someday in the future); a small sketch of putting a number on one of them follows the list:

  • Speed of Scaling
  • Linear vs. Non-Linear
  • Relativity of Size
  • Granularity of Scale
  • Quantity of Disruption during Change of Size
  • Amount of Intervention / Management / Planning of Change of Size
  • Risk / Availability Consistency
  • Part Flexibility
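
As promised, here is a small illustrative sketch for just one of these aspects (linear vs. non-linear). The throughput figures are made up; the point is the calculation: scaling efficiency is the measured throughput at N nodes divided by N times the single-node throughput, so a perfectly linear system stays at 100% and most real systems fall off.

```python
# Hypothetical throughput measurements (GB/s) at different cluster sizes.
measurements = {1: 2.0, 4: 7.4, 16: 24.0, 64: 70.0}

baseline = measurements[1]
for nodes, throughput in sorted(measurements.items()):
    efficiency = throughput / (nodes * baseline)
    print(f"{nodes:>3} nodes: {throughput:6.1f} GB/s, scaling efficiency {efficiency:.0%}")
```

With these made-up numbers the efficiency slides from 100% at one node down to the mid-50s at 64 nodes – "scalable", sure, but not linearly, and certainly not infinitely.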

In Conclusion

The bottom line is that IT is not that different from other established engineering disciplines, and in fact could stand to learn a lot from the scalability challenges they face.

Infinite Scalability, in particular for a system that requires some type of strong consistency, is simply not possible. The moment you hear the phrase, a good LOL is perfectly OK. It's marketing.

And when a system is described as scalable, stop and consider which dimension is being described, as the term "scalability" is rather overloaded, and there is no standard consensus on how to quantify it.

Thank you for reading, and good luck with it!
