Backup: Spoilt for choice?

Editorial Type: Strategy     Date: 03-2014





Recent years have brought a plethora of new technologies and approaches to the backup market. Bill Andrews, CEO of ExaGrid Systems, breaks down the options

For 50 years, organisations have used a backup application writing to tape for their disaster recovery. Over the past decade, the backup market has seen the introduction of many new technologies. This article discusses each of the backup storage media approaches and its fit within IT organisations, so you can easily determine which will best meet your business needs.

TAPE BACKUP
In the 1980s and 1990s, all organisations backed up to tape onsite and made tape copies to send offsite. The amount of backup data held onsite and offsite is anywhere from 40 to 100 times the primary storage data, because customers keep weeks to months of retention onsite and months to years of retention offsite. As a result, tape continues to be used, because its cost per GB is far lower than that of disk.
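
To see where a multiplier of that size comes from, consider a back-of-the-envelope calculation. The figures below are purely illustrative assumptions, not a recommended policy: a 10TB primary data set, twelve weekly fulls kept onsite, and thirty-six monthly fulls kept offsite.

    # Rough illustration of how retention multiplies backup storage.
    # All figures below are hypothetical; real policies vary widely.
    primary_tb = 10          # primary data set

    onsite_weeks = 12        # weekly full backups kept onsite
    offsite_months = 36      # monthly full backups kept offsite

    onsite_tb = primary_tb * onsite_weeks      # 120 TB
    offsite_tb = primary_tb * offsite_months   # 360 TB

    total_tb = onsite_tb + offsite_tb          # 480 TB
    print(f"Retained backup data: {total_tb} TB "
          f"({total_tb // primary_tb}x primary storage)")

Under these assumptions the retained backup data is 48 times the primary data; heavier retention schedules push the multiple towards the top of the 40 to 100 times range.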

DISK STAGING
Most organisations today use disk in between the backup application and the tape library for faster and more reliable backups and restores. This is called "disk staging." Because disk is still more expensive than tape, most organisations keep only one to two weeks of retention on the onsite disk staging area and hold the longer-term retention on tape. It is not uncommon to see two weeks of retention onsite on low-cost disk, eight to ten weeks onsite on tape, and longer-term retention of months to a year offsite on tape.

DATA DEDUPLICATION
Before we get into the remaining solutions, we need to explain data deduplication. Because of the cost of disk, organisations cannot afford to completely replace tape with disk; this is where data deduplication comes in. Backups are highly redundant: the same data is backed up each week, and only about 2% of it changes from week to week. Instead of storing all of the data for every backup, data deduplication breaks the data into blocks, zones, or bytes, compares them against data already stored, and keeps only what has changed. Disk with data deduplication uses about 1/20th of the space that 'straight' disk without data deduplication would use. However, it is not as simple as just adding data deduplication to disk, since how data deduplication is implemented can greatly affect backup and restore performance, backup window length, and total cost.
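
As an illustration only, and not any vendor's implementation, the sketch below shows the core idea of block-level deduplication: fingerprint each block, store a block only the first time it is seen, and describe every backup as a list of fingerprints pointing into a shared block store.

    import hashlib, random

    BLOCK_SIZE = 8 * 1024    # hypothetical 8KB blocks; real products differ

    def dedupe(data: bytes, store: dict) -> list:
        """Store only previously unseen blocks; return the fingerprint
        'recipe' that can rebuild this backup from the block store."""
        recipe = []
        for i in range(0, len(data), BLOCK_SIZE):
            block = data[i:i + BLOCK_SIZE]
            fingerprint = hashlib.sha256(block).hexdigest()
            if fingerprint not in store:    # store new data only
                store[fingerprint] = block
            recipe.append(fingerprint)
        return recipe

    random.seed(1)
    week1_data = bytearray(random.randbytes(1_048_576))  # ~1MB of "primary" data
    week2_data = bytearray(week1_data)
    week2_data[:20_000] = random.randbytes(20_000)       # ~2% changed this week

    store = {}
    week1 = dedupe(bytes(week1_data), store)
    week2 = dedupe(bytes(week2_data), store)
    print(f"{len(store)} unique blocks stored for "
          f"{len(week1) + len(week2)} block references")

The second week's backup adds only the few blocks that actually changed, which is how large reductions become possible once many weeks of retention reference the same block store.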

DATA DEDUPLICATION IN THE BACKUP SOFTWARE
Many backup applications have added data deduplication into the backup clients/agents, into the media server, or both. These implementations are adequate for small amounts of data or short retention of one to four weeks, but they struggle as data and retention grow. The reason is that data deduplication is very processor- and memory-intensive. Because the process is so compute-intensive, backup software implementations use far less aggressive algorithms so that backup performance isn't impacted. Instead of using very granular blocks, zones, or bytes, the backup software implementations use much larger fixed block sizes, such as 64KB or 120KB. The combination of larger and fixed blocks produces a deduplication ratio of less than 10 to 1, which is far lower than that of a dedicated, purpose-built appliance. The amount of disk and WAN bandwidth used over time is therefore much greater, and the total cost is higher than with a dedicated deduplication appliance.
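
The effect of block size is easy to see with a toy comparison. The sketch below is a simplification that assumes purely fixed-block deduplication and a handful of small, scattered changes to roughly 1MB of data, and simply counts how many of last week's blocks can be reused this week.

    import hashlib, random

    def unique_blocks(data: bytes, block_size: int) -> set:
        """Fingerprints of fixed-size blocks; a stand-in for dedupe metadata."""
        return {hashlib.sha256(data[i:i + block_size]).hexdigest()
                for i in range(0, len(data), block_size)}

    random.seed(2)
    last_week = random.randbytes(1_048_576)            # ~1MB of sample data
    this_week = bytearray(last_week)
    for offset in range(0, len(this_week), 100_000):   # small scattered edits
        this_week[offset] ^= 0xFF

    for block_size in (8 * 1024, 64 * 1024):           # granular vs large fixed blocks
        old = unique_blocks(last_week, block_size)
        new = unique_blocks(bytes(this_week), block_size)
        print(f"{block_size // 1024}KB blocks: {len(old & new)}/{len(new)} reused")

With granular 8KB blocks, almost all of last week's blocks are reused; with 64KB blocks, each small change dirties a far larger block, so much more "new" data has to be stored and replicated.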

FIRST GENERATION - INLINE DATA DEDUPLICATION SCALE-UP APPLIANCES
Dedicated disk-based backup appliances with data deduplication employ far more aggressive data deduplication algorithms and achieve a much higher deduplication ratio, averaging 20 to 1. These appliances use far less disk and bandwidth than deduplication performed in software and, as a result, are far less costly. Many of these appliances deduplicate the data inline, or "on the fly," on the way from the media server to the appliance. This approach can slow down backups, because a highly compute-intensive process is being performed during the backup itself. In addition, once the data lands on disk it is already deduplicated, making restores, offsite tape copies, and instant VM recoveries slow, since the data needs to be put back together, or "rehydrated," every time. Finally, as the data grows, the backup window also grows: a scale-up architecture has a fixed front-end controller and only adds disk capacity, not compute, as data grows. Eventually the backup window becomes so long that the controller must be replaced with a larger, faster, and far more expensive one, a so-called "forklift upgrade."
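
A rough calculation makes the backup window problem concrete. The ingest rate below is a hypothetical figure for a fixed front-end controller, not a quoted specification for any product.

    # Back-of-the-envelope sketch of the scale-up backup window problem:
    # the front-end controller's ingest rate stays fixed while the data
    # behind it grows. The 2 TB/hour rate is an assumed, illustrative figure.
    ingest_tb_per_hour = 2

    for full_backup_tb in (10, 20, 40, 100):
        window_hours = full_backup_tb / ingest_tb_per_hour
        print(f"{full_backup_tb} TB full backup -> {window_hours:.0f}-hour window")

In this hypothetical, by the time the full backup reaches 100TB it no longer fits in a 48-hour weekend window, and in a scale-up design the only remedy is the forklift controller upgrade described above.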


