
Big data - no big deal?

Editorial Type: Opinion     Date: 09-2014





Geoffrey Noer, VP, Product Management at Panasas offers a guide to high-performance NAS in an increasingly data-centric world

Businesses are coming under growing pressure to use their data to gain a competitive edge. It is no surprise that more companies are looking to derive additional value from their data, whether in research, financial modelling or design. Storing the sheer volume of data is one headache; processing it into something meaningful is another.

Today, high performance computing (HPC) specialists are driving the technologies that make it possible to handle immense data volumes, and they rely on their IT infrastructure as the core of their business. Failure to address the challenges of data processing efficiently and cost-effectively inhibits innovation and discovery - and, for some companies, ultimately damages bottom-line financial results. So how do these specialists manage such high data volumes at such high speed?

BIG DATA, BIG PRESSURES
Large volumes of unstructured data, often referred to as "big data", are increasingly being processed in HPC environments such as those found in biotech. Biological research into the human genome has led to a new dawn of scientific research, made possible only with the advent of HPC. Nor is it just the biotech industry harnessing HPC power: other research laboratories, along with the oil & gas industry among many others, are collecting ever more data in pursuit of more accurate results. It is therefore no coincidence that the volume of storage required doubles roughly every three years.

True scale-out network attached storage (NAS) offers a seamless way to increase capacity as projects and demand grow. This scalability protects the company's initial storage investment, provided the system can scale out with no performance or management penalties. However, storing data is one thing - managing it is another.

PERFORMANCE, QUICK AS A FLASH
The reality is that flash is very fast, especially for small files. However, an all-flash storage solution is today impractical and costly for almost all multi-petabyte deployments. Conversely, a disk-only solution will struggle to deliver performance on a workload of millions of small files and associated file metadata. Consequently, the optimal solution is a hybrid system that delivers the performance of flash for small files and metadata alongside the low cost and high capacity of disk. To be fully effective, the hybrid architecture needs to be built on multiple storage tiers that are intelligently managed for maximum performance in relation to cost (a simple placement-policy sketch follows the list below):

• Tier 1: Fast RAM
  - Caching for all files (read-ahead and write-behind for data and metadata)
• Tier 2: Flash storage
  - Solid State Disk (SSD) drives accelerate small file and metadata performance
• Tier 3: Enterprise SATA drives
  - Large files reside on cost-effective enterprise-class SATA drives
  - RAID striping for performance and data protection
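To make the tiering concrete, here is a minimal placement-policy sketch in Python. The 64 KiB small-file threshold, the tier names and the FilePlacement class are illustrative assumptions for this article, not a description of any particular vendor's implementation.

# Minimal sketch of a tier-placement policy for a hybrid flash/disk system.
# Threshold, tier names and the FilePlacement class are assumptions.
from dataclasses import dataclass

SMALL_FILE_THRESHOLD = 64 * 1024  # 64 KiB: assumed cut-off for "small" files

@dataclass
class FilePlacement:
    path: str
    size_bytes: int
    is_metadata: bool = False

def choose_tier(item: FilePlacement) -> str:
    """Route metadata and small files to flash; large files to SATA.

    RAM caching (tier 1) applies to all files transparently, so this policy
    only decides between the persistent tiers.
    """
    if item.is_metadata or item.size_bytes < SMALL_FILE_THRESHOLD:
        return "flash"          # tier 2: SSD accelerates small I/O and metadata
    return "enterprise_sata"    # tier 3: cheap capacity, RAID-striped for large files

# Example: a 4 KiB config file lands on flash, a 2 GiB dataset on SATA.
print(choose_tier(FilePlacement("job.cfg", 4 * 1024)))        # -> flash
print(choose_tier(FilePlacement("results.h5", 2 * 1024**3)))  # -> enterprise_sata

In practice the storage system makes this decision itself, transparently to applications; the point is simply that metadata and small I/O land on flash while bulk capacity stays on inexpensive disk.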

Scale-out hybrid architectures combining flash and hard drives are emerging to tackle a much wider set of workload requirements by addressing small-file and large-file workloads within a single integrated architecture. Handling mixed workloads with a wide variety of file sizes is critical to achieving high real-world performance. The proof point is that almost all technical workloads have a substantial component of small files, so a hybrid platform makes sense both economically and in performance terms.

A proper architecture for the storage components is only the foundation, as large volumes of unstructured data must still be processed quickly to accelerate workflows. To achieve maximum performance, a scale-out storage architecture should incorporate as many performance elements as possible - while avoiding legacy NAS head approaches that serialise data access and limit linear scalability.

The most effective scale-out NAS architectures will include some or all of the following:

• Parallel data transfer (see the sketch after this list)
  - Fully parallel file system with a parallel data path
• Flash acceleration for small files and metadata
  - High IOPS for small-file workloads
  - Overall faster file processing
• Direct data access
  - Enables clients to access storage devices directly, without the performance bottlenecks common in legacy NAS head architectures
• Automated load balancing
  - Optimises performance by eliminating hotspots
• Linear performance scaling
  - Eliminates the need for multiple islands of storage
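As one way to picture the parallel data path, the following Python sketch shows a client reading file stripes directly from several storage nodes at once, rather than funnelling every byte through a single NAS head. The node names, stripe size and fetch_stripe helper are hypothetical stand-ins; a real parallel file system client would obtain the striping map from a metadata service and issue the reads over the network.

# Minimal sketch of parallel, direct-to-storage-node reads.
# Node list, stripe size and fetch_stripe are illustrative assumptions.
from concurrent.futures import ThreadPoolExecutor

STORAGE_NODES = ["node-01", "node-02", "node-03", "node-04"]  # assumed cluster
STRIPE_SIZE = 1 * 1024 * 1024  # 1 MiB stripes, assumed layout

def fetch_stripe(node: str, file_id: str, stripe_index: int) -> bytes:
    """Placeholder for a direct read of one stripe from one storage node."""
    return b"\0" * STRIPE_SIZE  # stand-in payload

def parallel_read(file_id: str, num_stripes: int) -> bytes:
    """Fetch all stripes concurrently, one request per owning node."""
    with ThreadPoolExecutor(max_workers=len(STORAGE_NODES)) as pool:
        futures = [
            pool.submit(fetch_stripe, STORAGE_NODES[i % len(STORAGE_NODES)], file_id, i)
            for i in range(num_stripes)
        ]
        return b"".join(f.result() for f in futures)

data = parallel_read("simulation_output.dat", num_stripes=8)
print(len(data))  # 8 MiB reassembled from four nodes in parallel

Because each stripe request goes straight to the node that owns it, adding storage nodes adds bandwidth - which is the essence of linear performance scaling.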



