The thirst for data and how we quench it
When I heard that NetApp was creating a new distributed file system that could evolve how NAS works, I started to look into it out of curiosity. I had been curious about Infinite Volumes before, and wondered whether we really needed them. But this evolutionary step is exciting.
Now that ONTAP 9.1 is available, I thought it was about time I looked into FlexGroups in more detail.
There’s an excellent official Technical Report, TR-4557 – NetApp FlexGroup Technical Overview.
Data is growing.
When I started in the storage industry, a 4GB disk was absolutely massive, and the storage systems I worked on at the time could utilise 45 x 4GB drives. But that was just ridiculous: who on earth would need that much storage? The concept of 1TB of storage was just a wish-list dream. So it’s no great secret that over the years our thirst for storage has grown. 1TB was a pipe dream, then 100TB became the pipe dream, then 1PB, and now we are pipe dreaming of exabytes.
So file systems are getting bigger along with the size of the datasets. To give an example, my first digital camera …. A single photo would take up less than 1MB of space; now, with a reasonable camera, a single photo can be 10MB or more. So our evolving data is getting thirsty for space. NetApp have looked at this closely, taken a visionary approach, and designed FlexGroups to address this now and in the future.
FlexGroup has been designed to solve multiple issues for large-scale NAS workloads.
- Capacity – Scales up to 20 petabytes
- High file counts – Up to 400 billion files
- Performance – parallelized operations in NAS workloads, across CPUs, nodes, aggregates and constituent FlexVols
- Simplicity of deployment – Simple to use GUI in System Manager; avoid having to use junction paths to get larger than 100TB capacity
- Load balancing – Use all of your storage resources for a dataset
- Resiliency – Fix metadata errors in real time without taking downtime
So if you have a heavy NAS workload and you want to be able to throw all your available resources at it, then quench your thirst with a FlexGroup.
How does a FlexGroup work?
FlexGroup essentially takes the concept of a FlexVol and simply enhances it by joining multiple FlexVol member constituents into a single namespace that acts like a single FlexVol to clients and storage administrators.
To a NAS client, it would look like this:
Files are written to individual FlexVol constituents across the FlexGroup. Files are not striped. The amount of concurrency you would see in a FlexGroup would depend on the number of constituents you used. Right now, the maximum number of constituents for a FlexGroup is 200. Since the max volume size is 100TB and the max file count for each volume is 2 billion, that’s where we get our “20PB, 400 billion files” number. Keep in mind that those limits are simply the tested limits – theoretically, the limits are able to go much higher.
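The headline numbers are just the per-volume limits multiplied by the constituent count. A quick sketch of that arithmetic follows; the variable names are mine, and it assumes decimal units (1PB = 1000TB), which is what makes the result come out at exactly 20PB.

```python
# Tested limits quoted in the paragraph above.
MAX_CONSTITUENTS = 200
MAX_VOLUME_SIZE_TB = 100
MAX_FILES_PER_VOLUME = 2_000_000_000

# Assumption: decimal units, so 1 PB = 1000 TB.
TB_PER_PB = 1000

max_capacity_pb = MAX_CONSTITUENTS * MAX_VOLUME_SIZE_TB / TB_PER_PB
max_files = MAX_CONSTITUENTS * MAX_FILES_PER_VOLUME

print(f"{max_capacity_pb:g} PB, {max_files // 10**9} billion files")
# prints "20 PB, 400 billion files"
```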
So unlike a standalone volume, with a FlexGroup ONTAP will decide which of the constituent members is the best location to store each write. This works to keep the FlexGroup balanced without a performance penalty.
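To make the idea concrete, here is a toy sketch of whole-file placement across members. It is not how ONTAP actually chooses a member: the `Member` class, the `place_file` function and the "most free space wins" rule are all assumptions of mine for illustration. The only things taken from the text are that each file lands on exactly one constituent (no striping) and that placement aims to keep the members balanced.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """Hypothetical stand-in for one constituent FlexVol."""
    name: str
    capacity: int  # arbitrary units
    files: dict = field(default_factory=dict)  # file name -> size

    @property
    def free(self) -> int:
        return self.capacity - sum(self.files.values())


def place_file(members, file_name, size):
    """Store the whole file on a single member (files are not striped).

    Assumption: choose the member with the most free space.
    """
    candidates = [m for m in members if m.free >= size]
    if not candidates:
        raise OSError("no single member can hold this file")
    target = max(candidates, key=lambda m: m.free)
    target.files[file_name] = size
    return target.name


members = [Member(f"member_{i}", capacity=100) for i in range(4)]
for n in range(8):
    place_file(members, f"file_{n}", size=10)

print({m.name: len(m.files) for m in members})
# prints {'member_0': 2, 'member_1': 2, 'member_2': 2, 'member_3': 2}
```

With equally sized files the members fill evenly. The client only ever sees one namespace, so it never needs to know which member holds a given file.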
So how do I win?
When NAS operations can be allocated across multiple FlexVols, we don’t run into the issue of serialization in the system. Instead, we start spreading the workload across multiple file systems (FlexVols) joined together (the FlexGroup). And unlike Infinite Volumes, there is no concept of a single FlexVol to handle metadata operations – every member volume in a FlexGroup is eligible to process metadata operations.
That way, a client can access a persistent mount point that shows gobs of available space without having to traverse different file systems like you’d have to do with FlexVols.
It’s been tribal knowledge for a while now to create multiple FlexVols in large NAS environments to parallelize operations, but we still had the issue of 100TB limits and the notion of file systems changing when you traversed volumes that were junctioned to other volumes. Plus, storage administrators would be looking at a ton of work trying to figure out how best to lay out the data to get the best performance results.
Now, with NetApp FlexGroup, all of that architecture is done for you without needing to spend weeks architecting the layout.
So how is it faster?
In preliminary testing of a FlexGroup against a single FlexVol, we’ve seen up to 6x the performance. And that was with simple spinning SAS disk.
Adding more nodes and members can improve performance. Adding AFF into the mix can help latency.
In the first release of NetApp FlexGroup, we’ll have access to snapshot functionality. Essentially, this works the same as regular snapshots in ONTAP – it’s done at the FlexVol level and will capture a point in time of the filesystem and lock blocks into place with pointers. Because a FlexGroup is a collection of member FlexVols, we want to be sure snapshots are captured at the exact same time for filesystem consistency. As such, FlexGroup snapshots are coordinated by ONTAP to be taken at the same time.
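Why the coordination matters can be shown with a toy model. This is only a sketch under my own assumptions: `ToyFlexGroup`, its single write fence and the deep copies are hypothetical, and real ONTAP snapshots lock blocks with pointers rather than copying data. The point it illustrates is that every member is captured at the same logical moment, so no write can slip in between one member's capture and the next.

```python
import copy
import threading


class ToyFlexGroup:
    """Hypothetical model: each member is a dict of file name -> contents."""

    def __init__(self, member_count):
        self.members = [dict() for _ in range(member_count)]
        # Assumption: one fence shared by all writes and snapshots.
        self._fence = threading.Lock()
        self.snapshots = {}

    def write(self, member_index, name, data):
        with self._fence:  # writes wait while a snapshot is being taken
            self.members[member_index][name] = data

    def snapshot(self, label):
        with self._fence:  # no member changes until all are captured
            self.snapshots[label] = [copy.deepcopy(m) for m in self.members]


group = ToyFlexGroup(member_count=2)
group.write(0, "a.txt", "v1")
group.write(1, "b.txt", "v1")
group.snapshot("hourly.0")
group.write(0, "a.txt", "v2")  # a later change does not alter the snapshot

print(group.snapshots["hourly.0"])
# prints [{'a.txt': 'v1'}, {'b.txt': 'v1'}]
```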
How do you get NetApp FlexGroup?
NetApp FlexGroup is generally available as of ONTAP 9.1RC1. The release includes:
- NFSv3 and SMB 2.0/2.1 (RC2 for SMB support)
- Thin Provisioning
- User and group quota reporting
- Storage efficiencies (inline deduplication, compression, compaction; post-process deduplication)
- OnCommand Performance Manager and System Manager support
- All-flash FAS (incidentally, the *only* all-flash array that currently supports this scale)
- Sharing SVMs with FlexVols
- Constituent volume moves
- 20 PB, 400 billion files
How can a FlexGroup be enhanced?
While FlexGroup as a feature is awesome on its own, there are also a number of ONTAP 9 features added that make a FlexGroup even more attractive.
The benefits that can be added to a FlexGroup right out of the box include:
- 15 TB SSDs
- Per-aggregate CPs
- RAID-TEC – triple parity to add extra protection to your large data sets
So is that it? Not by a long shot. There are lots of rumours about other enhancements coming, so keep your eyes open and get ready to quench your data thirst.
Fast Lane Lead NetApp Expert