


Big data, copy data, junk data?

Editorial Type: Comment     Date: 05-2014




As ever, this issue of Storage magazine features a wide selection of case studies that illustrate best practices and some highly innovative uses of the technologies at the heart of the industry.

One that really struck me was the piece on page 28 from NetApp about the national synchrotron facility in Oxfordshire. Much like the Large Hadron Collider in Switzerland, this is an example of the kind of research setup that would have been simply impossible to envisage without the massive advances in storage and data management technologies of recent years. The Diamond Light Source project is using flash arrays to provide a 'high speed data buffer' - and interestingly, it is also using tape at the other end of the process (longer-term storage) simply because of the vast volumes involved. As is so often the case with current end-user stories, one of the key issues for this organisation was scalability and ease of expansion - crucial in this big data age.

Elsewhere in this issue, though, we see an opposing viewpoint from Actifio's CEO, Ash Ashutosh, who - while acknowledging the impact of big data - suggests that most businesses have a more pressing concern in copy data. As he explains: "If only 3% of data stored is 'big', what makes up the rest of it? It turns out that the real problem is data proliferation… The vast majority of stored data are extra copies of production data created by disparate data protection and management tools like backup, disaster recovery, development and testing, and analytics. According to IDC, global businesses will spend US$46 billion to store extra copies of their data in 2014. This 'copy data' glut in data centres costs businesses money, as they store and protect useless copies of an original."

Perhaps we as an industry are happy to maintain a vision of an ever-expanding data universe, simply because it allows us to develop and sell ever faster/larger solutions to address that need; but are we missing the key point of whether all that data actually needs to be stored (potentially in numerous repeated locations) at all? Compliance and governance requirements have helped the IT industry to convince the business world that everything must be kept at any cost, 'just in case'. But maybe more effort should be put into deciding which datasets are genuinely unique, before allocating more and more resources to storing them?

David Tyler
david.tyler@btc.co.uk
