
Deduplication just went ballistic with Aggregate Inline Deduplication

Wednesday, September 27th, 2017

So with the release of ONTAP 9.2 came a long-desired feature: the ability to deduplicate data at the aggregate level and not just the volume level.

So why is this so key? Well, let's look at the historical view of how dedupe worked and its pitfalls. Dedupe has always been a great feature, but its biggest downfall shows up when, say, I send an attachment to many colleagues and there is a strong possibility that they will all want to save it for future use or reference. Now, if you're lucky, they may all save the same file into the same volume, in which case you win: dedupe will reduce all instances to just one. Unfortunately, we know that with IT we are never that lucky, so I may have some instances in one volume but more instances in other volumes, in which case I get the benefit on a per-volume basis but I still lose out.

With ONTAP 9.2 this has been addressed so that you can now dedupe at the aggregate level. You may have multiple copies of the file located in many volumes, but if those volumes are located in the same aggregate then aggregate dedupe can take all instances and reduce them to a single instance. Now you really do win with dedupe.

As a simplified technical concept: if you're not familiar with deduplication, it's a feature that allows data blocks that are identical to rely on pointers to a single block instead of keeping multiple copies of the same blocks.
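
To make that pointer idea concrete, here is a minimal sketch of hash-based block deduplication. It is purely illustrative (the block size, the DedupStore class and the SHA-256 fingerprinting are assumptions made for the example) and is not how ONTAP implements inline dedupe internally.

```python
# A minimal sketch of block-level deduplication: identical blocks are stored
# once and every duplicate becomes a pointer (here, a fingerprint) to that
# single copy. Block size, class and fingerprint scheme are illustrative only.
import hashlib
import os

BLOCK_SIZE = 4096

class DedupStore:
    def __init__(self):
        self.blocks = {}     # fingerprint -> the single physical copy of a block
        self.pointers = []   # logical view: one fingerprint per block written

    def write(self, data: bytes):
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(fp, block)   # store a new block only once
            self.pointers.append(fp)            # duplicates just add a pointer

    def ratio(self):
        # logical blocks written vs. physical blocks actually stored
        return len(self.pointers) / len(self.blocks)

store = DedupStore()
attachment = os.urandom(BLOCK_SIZE * 4)   # one file made of four unique blocks
for _ in range(10):                       # ten colleagues each keep a copy
    store.write(attachment)
print(f"Effective dedupe ratio: {store.ratio():.0f}:1")   # ~10:1
```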

This is all currently done inline (as data is ingested) only, and it is currently enabled by default on All Flash FAS systems. The space savings really show with workloads such as ESXi datastores, where you may be applying OS patches across multiple VMs in multiple datastores hosted in multiple FlexVol volumes, but all within one aggregate. Aggregate inline deduplication brings an average additional ~1.32:1 ratio of space savings for VMware workloads. Who doesn't want to save some space?
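
As a rough feel for what "an additional ~1.32:1" means, the snippet below turns the ratio into a percentage of further space freed. The 2:1 volume-level ratio is hypothetical, and treating the two ratios as multiplicative is an assumption for illustration, not a NetApp statement.

```python
# Rough arithmetic on the quoted "additional ~1.32:1" aggregate-level ratio.
volume_ratio = 2.0        # hypothetical existing volume-level savings
aggregate_ratio = 1.32    # additional cross-volume savings quoted above

combined = volume_ratio * aggregate_ratio
further_saving = 1 - 1 / aggregate_ratio
print(f"Combined ratio: {combined:.2f}:1")                       # 2.64:1
print(f"Aggregate dedupe frees a further {further_saving:.0%}")  # ~24%
```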

Should you require any information on training from Fast Lane, please contact us on:

Phone: 0845 470 1000

enquiries@flane.co.uk

 


NetApp ADP – Overview

Tuesday, October 4th, 2016

Since the "GA" release of Clustered Data ONTAP (CDoT) 8.3, and in subsequent releases, you have been able to utilise NetApp's Advanced Drive Partitioning (ADP).

Today ADP has two main implementations: Root-data partitioning and Flash Pool SSD Partitioning.

Root-Data Partitioning

Before ADP was released in 8.3, the smaller systems had an issue with excessive drive usage purely to get the system running. Best practice says that each node requires a 3-drive root aggregate (that's 6 out of the 12 internal drives gone), plus 2 hot spares (now you've lost 8), leaving just 4 drives, so your options were extremely limited. If you divide the number of data drives by the total number of drives, this gives a storage efficiency of ~17% (this does not sit well with customers).
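
For clarity, here is the arithmetic behind that ~17%, following the drive counts given above for a 12-drive entry system; the only added assumption is that the remaining data aggregate uses RAID-DP (two parity drives), which matches the constraints listed later in this post.

```python
# The arithmetic behind the ~17% figure for a 12-drive system without ADP.
total_drives = 12
root_drives = 3 * 2      # a 3-drive root aggregate per node
hot_spares = 2
data_parity = 2          # RAID-DP on the single data aggregate (assumed)

data_drives = total_drives - root_drives - hot_spares - data_parity
print(f"Data drives: {data_drives}")                              # 2
print(f"Storage efficiency: {data_drives / total_drives:.0%}")    # ~17%
```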

NetApp recognised that this was a real sticking point for customers and, as such, introduced ADP for the lower-end systems and All Flash FAS arrays to restore the competitive edge of these units.

Without ADP you have an active-passive configuration:

Being active-passive means that only one node of the pair actively accesses the data aggregate; should that controller fail, its partner takes over and continues operations. For most customers this was not ideal.

In the above example we lose a total of (8) drives to the root aggregates and parity, and at least (1) drive per node is used as a hot spare. This leaves only (2) drives to store actual data, which nets 1.47TB of usable capacity. Now, with ADP, instead of having to dedicate whole disks to each node's root aggregate, we can logically partition each drive into two separate segments: a smaller segment to be used for the root aggregate and a larger segment for data.

[Image: ADP high-level overview]

With ADP you now have either active-passive or active-active:

By using ADP you now have significantly more usable capacity for data. Using an active-passive configuration, where all of the remaining space is used for data, dividing the number of data drives by the total number of drives now gives a storage efficiency of ~77% (this is far more palatable for customers).

[Image: ADP active-passive layout]

You can also split the data partitions across two data aggregates and run active-active; however, you lose more capacity to parity with two data aggregates than with one, so efficiency drops from ~77% to about ~62%.

[Image: adding drives with ADP]

Constraints to consider before leveraging Root-Partitions 

  • HDD types that are not available as internal drives (ATA, FCAL and MSATA) cannot be partitioned
  • 100GB SSDs cannot be leveraged to create root partitions
  • MetroCluster and ONTAP-v do not support root partitions
  • Supported only on entry systems (2240, 2550, 2552, 2554) and All Flash FAS systems (systems with only SSDs attached).
  • Removing or failing one drive will cause a RAID rebuild and slight performance degradation for both the node's root aggregate and the underlying data aggregate.
  • Aggregates composed of partitioned drives must have a RAID type of RAID-DP.

Flash Pool SSD Partitioning

Previously, Flash Pools were created by dedicating a subset of SSDs, in either a RAID-4 or RAID-DP raid group, to an existing spinning-disk aggregate. In clusters with multiple aggregates this traditional approach is very wasteful. For example, if a 2-node cluster had (4) data aggregates, (3) on one node and (1) on the other, the system would require a minimum of (10) SSDs to allow for only (1) caching drive per data aggregate. If these are 400GB SSDs, then each aggregate would only get 330GB (the right-sized actual capacity) of cache out of the 4TB of total raw capacity.

With CDoT 8.3 a new concept, "Storage Pools", was introduced, which increases cache-allocation agility by granting the ability to provision cache based on capacity rather than number of drives. A Storage Pool allows the administrator to create one or more logical "pools" of SSDs, each of which is then divided into (4) equal slices (allocation units) that can be applied to existing data aggregates on either node in the HA pair.

[Image: Storage Pool]

Using the same example as described above, creating one large Storage Pool with (10) 400GB SSDs nets a total of (4) allocation units, each with 799.27GB of usable cache capacity, which can then be applied to our four separate aggregates.

By default, when a storage pool is created, (2) allocation units are assigned to each node in the HA pair. These allocation units can be re-assigned to whichever node/aggregate needs them, as long as they stay within the same HA pair, thereby making better use of your SSDs compared with the previous method, where those SSDs would be allocated to one aggregate and remain there.
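
As a purely conceptual illustration of that behaviour (the class names, node names and the reuse of the 799.27GB figure are assumptions for the example; this is not an ONTAP API), a Storage Pool can be thought of as four equal slices that default to two per node and can be re-homed within the HA pair:

```python
# Conceptual model of a Storage Pool split into 4 allocation units,
# defaulting to 2 per node and re-assignable within the HA pair.
from dataclasses import dataclass, field

@dataclass
class AllocationUnit:
    usable_gb: float
    owner: str                      # node currently owning this slice
    aggregate: str | None = None    # data aggregate it is caching, if any

@dataclass
class StoragePool:
    units: list[AllocationUnit] = field(default_factory=list)

    @classmethod
    def create(cls, usable_gb_per_unit: float, nodes=("node1", "node2")):
        # default: two allocation units assigned to each node in the HA pair
        units = [AllocationUnit(usable_gb_per_unit, nodes[i % 2]) for i in range(4)]
        return cls(units)

    def assign(self, unit_index: int, node: str, aggregate: str):
        au = self.units[unit_index]
        au.owner, au.aggregate = node, aggregate   # re-home within the HA pair

pool = StoragePool.create(usable_gb_per_unit=799.27)
pool.assign(0, "node1", "aggr_sas_01")   # hypothetical aggregate names
pool.assign(2, "node2", "aggr_sas_02")
print([(u.owner, u.aggregate, u.usable_gb) for u in pool.units])
```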

 


 

Pete Green
Fast Lane Lead NetApp Expert


Triple parity Raid – WHY?

Wednesday, September 21st, 2016

Ever since the dawn of disk drives we have acknowledged and accepted that magnetic media is not infinitely stable and reliable and as such we have come up with many ways to protect the data we store on disk.

We have the basics of RAID 1, where we just mirror the data to another drive. This allows a drive to fail while we still have all the data; however, this type of solution is expensive as we double the amount of storage required.

So moving forward we came to RAID 4 and RAID 5. Both add parity to enable the rebuild of any failed drive to a hot spare without the cost of replicating everything. This was later enhanced with RAID 6 and RAID-DP, both of which allow two concurrent drive failures without loss of access, and those failed drives can be successfully rebuilt.
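
To see why a single parity only covers a single failure, here is a toy XOR reconstruction; the block contents and drive counts are made up for the example, and real RAID 6 / RAID-DP implementations add a second, independently computed parity rather than just repeating this one.

```python
# Toy single-parity (RAID 4/5 style) reconstruction: parity is the XOR of the
# data blocks, so any ONE missing block can be rebuilt from the survivors.
# Lose two blocks and XOR alone cannot recover them.
from functools import reduce

def xor_blocks(blocks):
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

data = [b"AAAA", b"BBBB", b"CCCC"]      # blocks striped across three data drives
parity = xor_blocks(data)               # stored on the parity drive

# drive 2 fails: rebuild its block from the surviving data plus parity
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
print("Rebuilt block:", rebuilt)
```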

So why did we go from one parity to two? Mainly because we are storing more data in less physical space, so the drive mechanics are far more precise with very little tolerance, compared with older-style drives where the data was written to larger areas of the disk and generally had higher tolerances.

We are now breaching new realms with the higher-capacity drives, and they have some underlying quirks. Firstly, larger capacity typically means longer rebuilds when you have a failure, and these extended rebuild times extend your window of vulnerability to consecutive failures.

So we need better protection, especially when you consider that we are currently seeing a higher read error rate with SSD drives deployed throughout data centres.

So although today the thought process may hold you back from this additional level of protection, it will become common practice to add another level of protection, especially with the likes of the 15TB SSDs. With rebuild times of around 12 hours for 15TB, and with drive capacities continuing to grow, it won't be long before larger drives are available, and their rebuild times will probably extend in proportion to the additional capacity.
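
A quick back-of-the-envelope extrapolation from those figures is below; the 30TB and 60TB capacities are hypothetical, and the assumption that rebuild time scales linearly with capacity at a fixed rebuild rate is mine, not a NetApp figure.

```python
# Rough scaling of rebuild times from ~12 hours for a 15TB drive,
# assuming a constant rebuild rate.
baseline_tb, baseline_hours = 15, 12
rate_tb_per_hour = baseline_tb / baseline_hours          # ~1.25 TB/h

for capacity_tb in (15, 30, 60):                          # hypothetical sizes
    print(f"{capacity_tb}TB drive -> ~{capacity_tb / rate_tb_per_hour:.0f}h rebuild")
```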

Should you encounter a read error on a disk that is part of a RAID configuration, you are most probably never going to notice it, as the error will be hidden or masked by the parity checksum in a dual-parity configuration; but sustaining operations when there are two read errors can only be done with a third parity. With that in mind, and especially with higher capacities and extended rebuild times, it makes perfect sense to evolve and utilise triple parity.

With the ONTAP 9 release, NetApp® introduced RAID-TEC™, which serves two purposes. Firstly, it gives the added parity protection of a third parity drive; secondly, the reconstruction method has been redeveloped so that there is no performance degradation during a drive rebuild. This means you keep your performance and have a higher level of protection, giving a higher level of confidence.

The new RAID-TEC™ (Triple Erasure Coding) gives more usable space and better protection for large drive sizes. Another aspect is that any existing RAID-DP aggregate can be converted to RAID-TEC non-disruptively, which is a major plus point, and in comparison to the older RAID 4 and RAID-DP the supported drive counts are more conducive to storage efficiency.

 


Pete Green
Fast Lane Lead NetApp Expert


Securing your data beyond the physical realms using Storage Encryption

Tuesday, September 20th, 2016

For most organisations, having RAID-protected storage is a given, but what if you need complete peace of mind that your data is protected and "SECURE"?

For some organisations storage encryption is not considered, and this can be for several reasons. It could simply be a perception that the encryption process will carry an unwanted overhead, which may make it seem counterproductive. It could be that there is just a lack of understanding of what storage encryption is and what it can give you.

So let's take a look at what NetApp® offers.

NetApp® Storage Encryption (NSE) provides full-disk encryption and, what's more, does it without compromising storage efficiency or performance, using self-encrypting drives supplied by some of the leading drive manufacturers. NSE has the beauty of being a non-disruptive process that gives a comprehensive, cost-effective, hardware-based level of security, with a very simple approach to its operation and usage. Although it is a simple solution to use, this does not detract from its compliance with government and industry regulations.

NetApp® uses full-disk-encryption (FDE) capable disks. Data is not encrypted external to the disk drive itself; this is truly encryption of data at rest only. Once in the controller or on the network, data is not encrypted. What makes this so good is that the encryption engine is built into the disk, so all encryption takes place at close to line speed and carries no performance penalty: whether your system uses encryption or not, the performance will be the same. It is fair to say that encrypting disks cost more, so it becomes a question of whether that price is justified.

FDE disks require a key to be generated and pushed down to the disk to enable encryption of data. FDE-capable disks are available in varying sizes, from 600GB to 1.2TB performance drives and SSDs of 800GB and up, and there is the added advantage that if someone steals one drive, or a complete set of drives, without the key it is impossible to read the data.
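
A very loose sketch of that key model is below. It only illustrates the general self-encrypting-drive idea that a drive-internal data-encryption key is wrapped by an authentication key held outside the drive; the use of Python's Fernet, the variable names and the flow are all illustrative assumptions, not how NSE or the drive firmware actually implement it.

```python
# Conceptual sketch of the self-encrypting-drive key model: data is encrypted
# with an internal data-encryption key (DEK), and the DEK is wrapped by an
# authentication key pushed down from the controller / key manager. Without
# the authentication key, the DEK (and hence the data) is unreadable.
from cryptography.fernet import Fernet

authentication_key = Fernet.generate_key()   # held by the key manager (e.g. a KMIP server)
dek = Fernet.generate_key()                  # generated inside the drive

wrapped_dek = Fernet(authentication_key).encrypt(dek)     # only the wrapped DEK persists
ciphertext = Fernet(dek).encrypt(b"sensitive data at rest")

# A stolen drive holds ciphertext and the wrapped DEK, but not the authentication key.
# A legitimate controller unwraps the DEK and reads the data:
unwrapped_dek = Fernet(authentication_key).decrypt(wrapped_dek)
print(Fernet(unwrapped_dek).decrypt(ciphertext))
```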

So what exactly does NSE offer?

NSE supports the entire suite of NetApp storage efficiency technologies, including deduplication and both inline and post-process compression, as well as array-based anti-virus scanning. It also supports the SafeNet KeySecure encryption-key appliance, which strengthens and simplifies long-term key management. NSE complies with the OASIS KMIP standard and helps you comply with FISMA, HIPAA, PCI, Basel II, SB 1386 and E.U. Data Protection Directive 95/46/EC regulations, using FIPS 140-2 validated hardware.

 

peter-green
Pete Green
Fast Lane Lead NetApp Expert
