Bursting NetApp Volumes to the Cloud? Avoid this Common Error.

The resource elasticity of public clouds offers unique opportunities for enterprises to increase efficiency and gain economic benefit by bursting their workloads to the cloud on demand.  EDA, Life Sciences, and Machine Learning are but a few examples of disciplines where cloud bursting can deliver massive infrastructure and business advantages.


Why burst to cloud?

The cloud offers elasticity – Expanding on-premises resources entails a long bring-up time and, if the needs are temporary, winding down that infrastructure can also be costly. As discussed here, the elasticity of cloud translates to economic value for your project.

The cloud offers scalability – Elasticity is an important attribute, but critical value lies in the ability to elastically scale to the sizes you desire, while also delivering linear performance growth. For example, if a platform can elastically scale from 10TB to 100TB, but your project requires bursting to 150TB, then that platform will simply not do. In some cases, the inability to predict how much scale you will need (“How successful will that Pokémon Go app be?”) is a key driver for the move to cloud. The beauty of cloud is that you don’t need to know in advance, as the platform will allow you to scale on demand and to the size that you need.

The cloud offers freedom of choice – Since you typically lease or buy vendor hardware for your on-premises environment, you are more than likely locked to certain vendors for relatively long periods of time. A properly architected cloud bursting solution can enable you to avoid similar issues in the cloud. Prices differ between cloud vendors and change periodically, particularly when using spot or preemptible instances. Even within a given cloud provider, prices can differ significantly between regions. When bursting for today’s chip design project, you might want to use region A; for the next project, two months later, you may want to choose region B. Don’t lock yourself in.
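To make the region A versus region B decision concrete, here is a minimal sketch of picking a burst region by price. The region names and per-instance prices are made-up placeholders for illustration, not real quotes from any cloud provider.

```python
# Hypothetical per-instance prices in US cents per hour.
# These numbers are placeholders for illustration, not real cloud pricing.
HOURLY_PRICE_CENTS = {
    "region-a": 40,
    "region-b": 31,
    "region-c": 36,
}


def cheapest_region(prices):
    """Return the name of the region with the lowest hourly price."""
    return min(prices, key=prices.get)


def burst_cost_cents(prices, region, instances, hours):
    """Total compute cost, in cents, of one burst in the given region."""
    return prices[region] * instances * hours


if __name__ == "__main__":
    region = cheapest_region(HOURLY_PRICE_CENTS)
    cost = burst_cost_cents(HOURLY_PRICE_CENTS, region, instances=100, hours=10)
    print(f"Burst in {region}: {cost / 100:.2f} USD")
```

Rerunning a comparison like this before each project only helps if your data can follow the compute to whichever region comes out cheapest.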


OK, I’ve decided to burst to cloud…now what do I do with my data?

Having realized that enterprise compute is being burst to cloud, multiple storage and data management vendors are rushing to address the corresponding requests to support bursting for file-centric workloads. It is important to note, however, that not all in-cloud NAS solutions are the same. Selecting the right storage infrastructure for your cloud bursting needs is critical to the success of your project.


The different approaches to in-cloud file access

Broadly speaking, solutions that provide cloud compute instances with access to file data fall into two categories:

“Near Cloud” box-based solutions – Some legacy vendors chose to service cloud customers by placing their hardware box physically close to cloud infrastructure. In many cases, the box is placed in a hosted location with connectivity to the cloud vendor’s data center. Applications running on compute instances in the cloud are then tunneled to the storage box for file services. While a new GUI or new APIs may be offered, at the end of the datapath, a legacy box is servicing the file system requests.

In-cloud, scale-out file systems – At Elastifile we chose a different approach. We decided to re-architect file storage from the ground up to fully leverage the advantages of cloud. Under this cloud-native approach, the same elastic, cloud-based resources that serve cloud compute are also used as the underlying infrastructure for a dynamically scalable file system. Because the solution is truly software-defined, features and capabilities (such as distributed file access and cloud-optimized data management) are also provided on demand using the cloud-native architecture.


Choosing the right solution for bursting file workloads to cloud

It might seem that no choice is necessary. Some self-appointed advisors may say, “Since you already selected storage on-premises, you might as well just use the same approach in the cloud.” Take care. In the immortal words of Admiral Ackbar: “It’s a trap!”

Needless to say, a better way to select the data management solution is to gauge the offering based on how it suits the task at hand.

In the case of cloud bursting, look to the reasons you are bursting to cloud, and pick the architecture that will help you achieve your goals:

Is the solution elastic? Can you increase and decrease usage easily, on demand, and pay only for what you are using? Some box-based solutions will require you to pre-purchase large amounts of storage to cover the box costs. Alternatively, they will charge inflated on-demand prices designed to cover the vendors’ own box-based expenses. A software-based scale-out file system is elastic by design. Nodes can be spun up on demand to increase capacity or performance, and those nodes can also be spun down, at any time, for cost savings.

Can the solution scale? With box-based solutions, file system requirements may outgrow the confines of the box (or the small cluster of boxes) offered by the vendor on location. Be sure to ask your vendor for the maximum file system size offered in-cloud. In contrast, a software-defined distributed file system can scale as needed in the cloud, with no hardware-centric restrictions. Add as much compute power as needed for your application, and match it with as many resources as needed for the file system.

Are you offered freedom of choice? Box-based solutions are limited to select regions where the boxes are physically nearby. Want to burst to a different region with lower compute costs? You’re out of luck. That is, unless you choose a cloud-native, software-defined solution, which, when well architected, can use the resources available in any region. Try it out here.
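The three questions above can be turned into a simple checklist. The sketch below is only an illustration: the StorageOption fields, the two example offerings, and their capacities and regions are hypothetical stand-ins, not measurements of any real product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    """A candidate in-cloud file solution (all fields are illustrative)."""
    name: str
    pay_per_use: bool        # can usage shrink, and does the bill shrink with it?
    max_capacity_tb: float   # largest file system the vendor offers in-cloud
    regions: frozenset       # regions where the solution can be deployed


def unmet_requirements(option, needed_tb, wanted_regions):
    """Return a list of reasons the option fails the three questions."""
    problems = []
    if not option.pay_per_use:
        problems.append("not elastic: capacity must be pre-purchased")
    if option.max_capacity_tb < needed_tb:
        problems.append(
            f"cannot scale: max {option.max_capacity_tb} TB < needed {needed_tb} TB"
        )
    missing = set(wanted_regions) - option.regions
    if missing:
        problems.append(
            "no freedom of choice: unavailable in " + ", ".join(sorted(missing))
        )
    return problems


# Hypothetical offerings; the values are made up for the example.
near_cloud_box = StorageOption("near-cloud box", False, 100, frozenset({"region-a"}))
scale_out_fs = StorageOption(
    "in-cloud scale-out", True, 1000, frozenset({"region-a", "region-b"})
)

if __name__ == "__main__":
    for option in (near_cloud_box, scale_out_fs):
        issues = unmet_requirements(option, 150, {"region-a", "region-b"})
        print(option.name, "->", issues or "meets all three requirements")
```

Plug in the answers your own vendors give you; an empty list means the offering passes all three questions for that particular burst.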


Still not sure if a solution you’re being offered fits your cloud needs?  As the old sleuth adage goes, “follow the datapath”. If you find yourself boxed in, you’ll know that something is amiss.