Attack of the clones

Opinion | July 2013





Multiple copies of primary data could be an unrecognised contributor to rising storage costs, suggests Peter Eicher, Senior Product Specialist, Data Protection, Syncsort - but a solution is available

Anyone involved in information technology will be aware that data growth rates have been exploding, driving up not only spending on storage hardware but also all the associated operational costs of power, cooling, data centre footprint and so on. Yet raw growth is not the only thing forcing IT departments to increase storage spending. An often unnoticed cause is the creation and maintenance of multiple copies of primary data.

The analyst firm IDC refers to this problem as the "copy data" challenge, and they have compiled some interesting data points:

• More than 60 per cent of all enterprise disk capacity worldwide is filled with copy data.
• By 2016, spending on storage for copy data will approach $50 billion, and copy data capacity will exceed 315 million terabytes.

Shocking numbers! This raises the obvious question: if 60 per cent of your disk capacity is filled with copies of data, what is causing this? And what can you do about it?

The short answer is this: you are storing lots of data copies because groups within your IT department demand them. IT departments serve numerous "data consumers" that require access to information, and their reasons are all valid. Some need copies to run reports and data analytics, working on a copy so the number-crunching doesn't affect production systems. Software development and testing teams are always hunting for recent data sets to work with. Regulatory search and compliance teams request copies of their own.

The problem is made worse because these data consumers tend to work separately. When Group A asks for a copy of a large data set, it doesn't know or care that Group B also asked for that same data. So copies multiply across your organisation, and since there is no central management they tend to get lost in the day-to-day operations of busy IT staff. Many copies will end up sitting on spinning disk for months or even years, taking up valuable storage space but serving no purpose.

The situation can become quite extreme. IDC's research found some organisations with more than 100 copies of the same data! With this kind of waste going on, it's no surprise you are writing larger and larger cheques to storage vendors every year.

WHAT'S THE CURE FOR TOO MANY COPIES?
The best way to handle growing numbers of data copies is a technology approach that has recently been referred to as Copy Data Management. It's not fundamentally new technology, but rather a way of using existing technology specifically to manage copy data.

Copy Data Management begins by centralising your data protection onto a single system and then using that system as the source for all copies. The key is being able to make use of the data in ways that don't require physically copying it.

Disk vendors have supported this for a long time by providing zero-footprint snapshot clones. Clones are read/write-addressable volumes that are created "virtually" by the use of data pointers. These clones can be mounted and accessed without any additional storage being consumed (other than any new data written to the clone). NetApp is the disk vendor that has historically done the most to promote this method, with its FlexClone technology, but many other disk vendors have followed suit. Yes, there are implementation differences, but the goal is to provide access to data quickly and without the need to create file-by-file copies.
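To make the pointer mechanics concrete, here is a minimal, purely illustrative sketch in Python of how a zero-footprint clone can work: cloning duplicates only a map of block pointers, and fresh storage is consumed only when new data is written. This is a conceptual model, not NetApp's or any other vendor's actual implementation, and every name in it is invented for illustration.

# Illustrative copy-on-write clone model (not a real array's implementation).

class BlockStore:
    """Shared pool of immutable data blocks, each stored exactly once."""
    def __init__(self):
        self.blocks = {}        # block_id -> bytes
        self.next_id = 0

    def put(self, data: bytes) -> int:
        block_id = self.next_id
        self.blocks[block_id] = data
        self.next_id += 1
        return block_id

    def used(self) -> int:
        # Total physical capacity consumed across all volumes and clones.
        return sum(len(b) for b in self.blocks.values())

class Volume:
    """A volume is just a map from logical offsets to block pointers."""
    def __init__(self, store: BlockStore, pointers=None):
        self.store = store
        self.pointers = dict(pointers or {})   # offset -> block_id

    def write(self, offset: int, data: bytes):
        # Copy-on-write: a new block is allocated for the new data, so any
        # other volume still pointing at the old block is unaffected.
        self.pointers[offset] = self.store.put(data)

    def read(self, offset: int) -> bytes:
        return self.store.blocks[self.pointers[offset]]

    def clone(self) -> "Volume":
        # Zero-footprint: only the pointer map is copied, no data blocks.
        return Volume(self.store, self.pointers)

store = BlockStore()
prod = Volume(store)
for i in range(4):
    prod.write(i, b"x" * 1024)         # 4KB of "production" data

test = prod.clone()                    # instant, consumes no block space
assert store.used() == 4 * 1024

test.write(0, b"y" * 1024)             # only new writes consume space
assert store.used() == 5 * 1024
assert prod.read(0) == b"x" * 1024     # production data is untouched

The point of the sketch is the asymmetry it demonstrates: creating the clone costs nothing but a pointer map, while each write to the clone costs exactly one new block - which is why a mounted clone consumes no storage beyond the data written to it.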

But disk array clones alone don't solve the problem, because the data is not centralised. If your IT shop has several disk vendors on the floor (a common occurrence), then each array will have its own toolkit and it will be difficult to manage them centrally. You can also run into issues with sharing data across organisational silos: the development team may still want to copy the data to its own storage systems before working on it, and so on. Finally, if you perform read/write actions on a clone it's not "free" - it has an impact on your primary storage response rates, possibly slowing production systems.


