{{about|backup in computer systems|other uses}}
{{Use dmy dates|date=August 2018}}
In [[information technology]], a '''backup''', or the process of backing up, refers to the copying into an [[archive file]] of computer [[data]] so it may be used to restore the original after a [[data loss]] event. The verb form is [[wikt:back up|"back up"]] (a [[phrasal verb]]), whereas the noun and adjective form is [[wikt:backup|"backup"]].<ref name="AHDictionaryBackup">{{cite web |title=back•up |url=https://www.ahdictionary.com/word/search.html?q=backup |website=The American Heritage Dictionary of the English Language |publisher=Houghton Mifflin Harcourt |accessdate=9 May 2018 |year=2018}}</ref>
Backups have two distinct purposes. The primary purpose is to recover data after its loss, be it by [[File deletion|data deletion]] or [[Data corruption|corruption]]. Data loss can be a common experience of computer users; a 2008 survey found that 66% of respondents had lost files on their home PC.<ref>[http://www.kabooza.com/globalsurvey.html Global Backup Survey] {{Webarchive|url=https://web.archive.org/web/20100327235844/http://www.kabooza.com/globalsurvey.html |date=27 March 2010 }}. Retrieved 15 February 2009</ref> The secondary purpose of backups is to recover data from an earlier time, according to a user-defined [[data retention]] policy, typically configured within a [[Backup software|backup application]] for how long copies of data are required.<ref name="NelsonPro11">{{cite book |url=https://books.google.com/books?id=r4uEEsq3CJYC&printsec=frontcover |title=Pro Data Backup and Recovery |chapter=Chapter 1: Introduction to Backup and Recovery |author=Nelson, S. |publisher=Apress |pages=1–16 |year=2011 |isbn=978-1-4302-2663-5 |accessdate=8 May 2018}}</ref> Though backups represent a simple form of [[disaster recovery]] and should be part of any [[disaster recovery plan]], backups by themselves should not be considered a complete disaster recovery plan. One reason for this is that not all backup systems are able to reconstitute a computer system or other complex configuration such as a [[computer cluster]], [[active directory]] server, or [[database server]] by simply restoring data from a backup.<ref name="CougiasTheBackup03">{{cite book |url=https://books.google.com/books?id=eLviiTag5A0C&pg=PA1 |title=The Backup Book: Disaster Recovery from Desktop to Data Center |chapter=Chapter 1: What's a Disaster Without a Recovery? |author=Cougias, D.J.; Heiberger, E.L.; Koop, K. |publisher=Network Frontiers |pages=1–14 |year=2003 |isbn=0-9729039-0-9}}</ref>
Since a backup system contains at least one copy of all data considered worth saving, the [[computer data storage|data storage]] requirements can be significant. Organizing this storage space and managing the backup process can be a complicated undertaking. A data repository model may be used to provide structure to the storage. Nowadays, there are many different types of [[data storage device]]s that are useful for making backups. There are also many different ways in which these devices can be arranged to provide geographic redundancy, [[data security]], and portability.
Before data are sent to their storage locations, they are selected, extracted, and manipulated. Many different techniques have been developed to optimize the backup procedure. These include optimizations for dealing with open files and live data sources as well as compression, encryption, and [[Data deduplication|de-duplication]], among others. Every backup scheme should include [[Dry run (testing)|dry runs]] that validate the reliability of the data being backed up. It is important to recognize the limitations and human factors involved in any backup scheme.
== Storage, the base of a backup system ==
=== Data repository models ===
Any backup strategy starts with the concept of a data repository. The backup data needs to be stored and should be organized to some degree. The organization could be as simple as a sheet of paper listing all backup media (CDs, etc.) and the dates they were produced. A more sophisticated setup could include a computerized index, catalog, or relational database. Different approaches have different advantages. Part of the model is the [[backup rotation scheme]].<ref name="DeanComp09">{{cite book |url=https://books.google.com/books?id=1QEMAAAAQBAJ&pg=PA602 |title=CompTIA Network+ 2009 in Depth |chapter=Chapter 14: Ensuring Integrity and Availability |author=Dean, T. |publisher=Cengage Learning |pages=571–614 |year=2009 |isbn=978-1-59863-878-3 |accessdate=8 May 2018}}</ref>
; Unstructured : An unstructured repository may simply be a stack of tapes or CD-Rs or DVD-Rs with minimal information about what was backed up and when. This is the easiest to implement, but probably the least likely to achieve a high level of recoverability as it lacks automation.
; Full only / [[system image|System imaging]] : A repository of this type contains complete system images taken at one or more specific points in time.<ref name="DeanComp09" /> This technology is frequently used by computer technicians to record known good configurations. Imaging<ref>{{Cite web |title=Five key questions to ask about your backup solution |url=http://sysgen.ca/five-key-backup-questions/ |website=sysgen.ca |accessdate=23 September 2015 |archive-url=https://web.archive.org/web/20160304042343/http://sysgen.ca/five-key-backup-questions/ |archive-date=4 March 2016 |dead-url=no |df=dmy-all}}</ref> is generally more useful for deploying a standard configuration to many systems rather than as a tool for making ongoing backups of diverse systems.
; [[Incremental backup|Incremental]] : An incremental style repository aims to make it more feasible to store backups from more points in time by organizing the data into increments of change between points in time. This eliminates the need to store duplicate copies of unchanged data: with full backups a lot of the data will be unchanged from what has been backed up previously.<ref name="DeanComp09" /> Typically, a ''full'' backup (of all files) is made on one occasion (or at infrequent intervals) and serves as the reference point for an incremental backup set. After that, a number of ''incremental'' backups are made after successive time periods. Restoring the whole system to the date of the last incremental backup would require starting from the last full backup taken before the data loss, and then applying in turn each of the incremental backups since then.<ref>[http://www.tech-faq.com/incremental-backup.shtml Incremental Backup] {{Webarchive|url=https://web.archive.org/web/20160621090117/http://www.tech-faq.com/incremental-backup.shtml |date=21 June 2016 }}. Retrieved 10 March 2006</ref> Additionally, some backup systems can reorganize the repository to synthesize full backups from a series of incrementals.
; [[Differential backup|Differential]] : Each differential backup saves the data that has changed since the last full backup.<ref name="DeanComp09" /> It has the advantage that at most two data sets are needed to restore the data. One disadvantage, compared to the incremental backup method, is that as the time since the last full backup (and thus the accumulated changes in data) increases, so does the time to perform the differential backup. Restoring an entire system requires starting from the most recent full backup and then applying only the latest differential backup made since that full backup.
:: Note: Vendors have standardized on the meaning of the terms "incremental backup" and "differential backup." However, there have been cases where conflicting definitions of these terms have been used. The most relevant characteristic of an incremental backup is which reference point it uses to check for changes. By standard definition, a differential backup copies files that have been created or changed since the last full backup, regardless of whether any other differential backups have been made since then, whereas an incremental backup copies files that have been created or changed since the most recent backup of any type (full or incremental). Other variations of incremental backup include multi-level incrementals and incremental backups that compare parts of files instead of just the whole file.
; Reverse delta : A reverse delta type repository stores a recent "mirror" of the source data and a series of differences between the mirror in its current state and its previous states. A reverse delta backup will start with a normal full backup. After the full backup is performed, the system will periodically synchronize the full backup with the live copy, while storing the data necessary to reconstruct older versions.<ref name="LeonSoftware15">{{cite book |url=https://books.google.com/books?id=pYcTBwAAQBAJ&pg=PA65 |title=Software Configuration Management Handbook |author=Leon, A. |publisher=Artech House |page=65 |year=2015 |isbn=978-1-60807-844-8 |accessdate=8 May 2018}}</ref> This can be done using either [[hard links]] or binary [[data comparison|diffs]]. This system works particularly well for large, slowly changing data sets.
; [[Continuous data protection]] : Instead of scheduling periodic backups, the system immediately logs every change on the host system. This is generally done by saving byte or block-level differences rather than file-level differences.<ref>[http://www.sertdatarecovery.com/business-data-backup-disaster-recovery-planning-resource.html Continuous Protection white paper] {{Webarchive|url=https://web.archive.org/web/20160304072358/http://www.sertdatarecovery.com/business-data-backup-disaster-recovery-planning-resource.html |date=4 March 2016 }}. (1 October 2005). Retrieved 10 March 2007</ref> It differs from simple [[disk mirroring]] in that it enables a roll-back of the log and thus restoration of old images of data.
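The difference between the incremental and differential models described above shows up most clearly at restore time. The following is a minimal Python sketch, not taken from any backup product: the <code>Backup</code> record, the <code>restore_chain</code> function, and the day-counter timestamps are all hypothetical names chosen for illustration, and the sketch assumes that a backup set uses either incrementals or differentials after a full backup, never both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int   # hypothetical day counter for when the backup ran
    kind: str  # "full", "incremental", or "differential"


def restore_chain(backups, target_day):
    """Return, in order, the backups needed to restore the state as of target_day.

    Assumption: each backup set uses either incrementals or differentials
    after a full backup, not a mixture of both.
    """
    history = sorted((b for b in backups if b.day <= target_day),
                     key=lambda b: b.day)
    full_positions = [i for i, b in enumerate(history) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup on or before the target day")
    start = full_positions[-1]          # most recent full backup
    later = history[start + 1:]
    incrementals = [b for b in later if b.kind == "incremental"]
    differentials = [b for b in later if b.kind == "differential"]
    if incrementals and differentials:
        raise ValueError("mixed incremental/differential sets are not modelled")
    if differentials:
        # Differential: the full backup plus only the latest differential.
        return [history[start], differentials[-1]]
    # Incremental: the full backup plus every increment since, in order.
    return [history[start]] + incrementals


weekly_incremental = [Backup(0, "full")] + [Backup(d, "incremental") for d in range(1, 7)]
weekly_differential = [Backup(0, "full")] + [Backup(d, "differential") for d in range(1, 7)]
print([b.day for b in restore_chain(weekly_incremental, 4)])   # [0, 1, 2, 3, 4]
print([b.day for b in restore_chain(weekly_differential, 4)])  # [0, 4]
```

In this toy schedule the incremental restore needs five data sets while the differential restore needs two, which mirrors the trade-off stated in the definitions: incrementals are smaller to take, differentials are simpler to restore.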
=== Storage media ===
[[File:DVD, USB flash drive and external hard drive.jpg|thumb|right|From left to right, a [[DVD]] disc in plastic cover, a [[USB flash drive]] and an [[external hard drive]]]]
Regardless of the repository model that is used, the data has to be stored on some data storage medium.
; [[Magnetic tape data storage|Magnetic tape]] : Magnetic tape has long been the most commonly used medium for bulk data storage, backup, archiving, and interchange. Tape has typically had an order of magnitude better capacity-to-price ratio when compared to hard disk, but the ratios for tape and hard disk have become closer.<ref>[http://www.storagesearch.com/engenio-art2.html Disk to Disk Backup versus Tape – War or Truce?] {{Webarchive|url=https://web.archive.org/web/20160712235906/http://www.storagesearch.com/engenio-art2.html |date=12 July 2016 }} (9 December 2004). Retrieved 10 March 2007</ref> [[Magnetic tape data storage#Chronological list of tape formats|Many tape formats have been]] proprietary or specific to certain markets like mainframes or a particular brand of personal computer, but by 2014 [[Linear Tape-Open#Market performance|LTO]] was edging out two other remaining viable "super" formats—[[IBM 3592]] (now also referred to as the TS11xx series) and [[StorageTek tape formats#T10000|Oracle StorageTek T10000]],<ref name="ForbesKeepingDataLongTime">{{cite web |last1=Coughlin |first1=Tom |title=Keeping Data for a Long Time |url=https://www.forbes.com/sites/tomcoughlin/2014/06/29/keeping-data-for-a-long-time/ |website=Forbes |publisher=Forbes Media LLC |accessdate=19 April 2018 |date=29 June 2014 |at=para. Magnetic Tapes(popular formats, storage life), para. Hard Disk Drives(active archive), para. First consider flash memory in archiving(... may not have good media archive life)}}</ref> and [[Digital Data Storage#Future|further development of the smaller-capacity DDS format had been canceled]]. 
: By 2017 [[Spectra Logic]], which builds [[tape library|tape libraries]] for both the LTO and TS11xx formats, was predicting that "Linear Tape Open (LTO) technology has been and will continue to be the primary tape technology."<ref name="SpectraLogicDigitalDataStorageOutlook2017">{{cite web |title=Digital Data Storage Outlook 2017 |url=https://spectralogic.com/wp-content/uploads/white-paper-digital-data-storage-outlook-2017-v3.pdf |website=Spectra |publisher=Spectra Logic |accessdate=11 July 2018 |page=14(Tape) |format=PDF |year=2017}}</ref> Tape is a [[sequential access]] medium, so even though access times may be poor, the rate of continuously writing or reading data can actually be very fast.
; [[Hard disk]]: The capacity-to-price ratio of hard disks has been improving for many years, making them more competitive with magnetic tape as a bulk storage medium. The main advantages of hard disk storage are low access times, availability, capacity and ease of use.<ref>{{cite web |url=http://www.tomshardware.com/2007/04/18/bye_bye_tape/ |title=Bye Bye Tape, Hello 5.3TB eSATA |accessdate=22 April 2007}}</ref> External disks can be connected via local interfaces like [[SCSI]], [[USB]], [[FireWire]], or [[eSATA]], or via longer distance technologies like [[Ethernet]], [[iSCSI]], or [[Fibre Channel]]. Some disk-based backup systems, via [[Virtual tape library|Virtual Tape Libraries]] or otherwise, support [[data deduplication]], which can dramatically reduce the amount of disk storage capacity consumed by daily and weekly backup data.<ref name="RetrospectWindows12UG">{{cite web |title=Retrospect ® 12 Windows User's Guide |url=http://download.retrospect.com/docs/win/v12/user_guide/Retrospect_Win_User_Guide-EN.pdf |website=Retrospect |publisher=Retrospect Inc. 
|accessdate=2 September 2018 |format=PDF |year=2017 |pages=30-31(deduplication via Snapshots), 41-43(removable disk drives), 31-32(Dashboard), 216-218(selector as subset filter for synthetic full backups), 426-427(E-mail)}}</ref><ref>{{Cite web |url=http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html |title=Symantec Shows Backup Exec a Little Dedupe Love; Lays out Source Side Deduplication Roadmap – DCIG |website=DCIG |access-date=26 February 2016 |archive-url=https://web.archive.org/web/20160304212819/http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html |archive-date=4 March 2016 |dead-url=no |df=dmy-all}}</ref><ref name="NetBackupDeduplicationGuide">{{cite web |title=Veritas NetBackup™ Deduplication Guide |url=https://www.veritas.com/content/support/en_US/doc/ka6j00000000ADEAA2 |website=Veritas |publisher=Veritas Technologies LLC |accessdate=26 July 2018 |year=2016}}</ref> One disadvantage of hard disk backups vis-a-vis tape is that hard drives are [[Hard disk drive#Magnetic recording|close-tolerance mechanical devices]] and may be more easily damaged, especially while being transported (e.g., for off-site backups).<ref name="PCWorldHardCoreDataPreservation">{{cite web |last1=Jacobi |first1=John L. |title=Hard-core data preservation: The best media and methods for archiving your data |url=https://www.pcworld.com/article/2984597/storage/hard-core-data-preservation-the-best-media-and-methods-for-archiving-your-data.html |website=PC World |accessdate=19 April 2018 |date=29 Feb 2016 |at=sec. 
External Hard Drives(on the shelf, magnetic properties, mechanical stresses, vulnerable to shocks)}}</ref> In the mid-2000s, several drive manufacturers began to produce portable drives employing [[Hard disk drive failure#Unloading|ramp loading and accelerometer]] technology (sometimes termed a "shock sensor"),<ref name="HGSTRampLoadUnload">{{cite web |title=Ramp Load/Unload Technology in Hard Disk Drives |url=https://www.hgst.com/sites/default/files/resources/LoadUnload_white_paper_FINAL.pdf |website=HGST |publisher=Western Digital |accessdate=29 June 2018 |page=3(sec. Enhanced Shock Tolerance) |format=PDF |date=November 2007}}</ref><ref name="ToshibaCanvio3.0PortableHDD">{{cite web |title=Toshiba Portable Hard Drive (Canvio® 3.0) |url=https://www.toshibadata.com.sg/Product-Canvio-Portable-Hard-Drive.aspx |website=Toshiba Data Dynamics Singapore |publisher=Toshiba Data Dynamics Pte Ltd |accessdate=16 June 2018 |year=2018 |at=sec. Overview(Internal shock sensor and ramp loading technology)}}</ref> and—by 2010—the industry average in drop tests for drives with that technology showed drives remaining intact and working after a 36-inch non-operating drop onto industrial carpeting.<ref name="IomegaDropShock">{{cite web |title=Iomega ® Drop Guard ™ Technology |url=https://www.doc-developpement-durable.org/file/Projets-informatiques/Drop%20Guard-disque-dur-tres-solide.pdf |website=Hard Drive Storage Solutions |publisher=Iomega Corp. |accessdate=12 July 2018 |pages=2(What is Drop Shock Technology?, What is Drop Guard Technology? (... 
40% above the industry average)), 3(*NOTE) |date=20 September 2010}}</ref> The manufacturers do not, however, guarantee these results and note that a drive may fail to survive even a shorter drop.<ref name="IomegaDropShock" /> Some manufacturers also offer 'ruggedized' portable hard drives, which include a shock-absorbing case around the hard disk, and claim a range of higher drop specifications.<ref name="PCMagBestRuggedHDDs&SSDs">{{cite web |last1=Burek |first1=John |title=The Best Rugged Hard Drives and SSDs |url=https://www.pcmag.com/roundup/361072/the-best-rugged-hard-drives-and-ssds |website=PC Magazine |publisher=Ziff Davis |accessdate=4 August 2018 |at=What Exactly Makes a Drive Rugged?(When a drive is encased ... you're mostly at the mercy of the drive vendor to tell you the rated maximum drop distance for the drive) |date=15 May 2018}}</ref><ref name="WirecutterBestPortableHardDrive2017Don'tBuy">{{cite web |last1=Krajeski |first1=Justin |last2=Streams |first2=Kimber |title=The Best Portable Hard Drive |url=https://web.archive.org/web/20170331161821/http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive |work=The New York Times |accessdate=4 August 2018 |archiveurl=https://web.archive.org/web/20170331161821/http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive |archivedate=31 March 2017 |date=20 March 2017}}</ref> Another disadvantage is that over a period of years the stability of hard disk backups is shorter than that of tape backups.<ref name="ForbesKeepingDataLongTime" /><ref name="IronMountainBestLong-TermDataArchiveSolutions">{{cite web |title=Best Long-Term Data Archive Solutions |url=http://www.ironmountain.com/resources/general-articles/b/best-long-term-data-archive-solutions |website=Iron Mountain |publisher=Iron Mountain Inc. |accessdate=19 April 2018 |year=2018 |at=sec. More Reliable(average mean time between failure ... 
rates, best practice for migrating data)}}</ref><ref name="PCWorldHardCoreDataPreservation" />
; [[Optical storage]] : Recordable [[CD]]s, [[DVD]]s, and [[Blu-ray Disc]]s are commonly used with personal computers and generally have low media unit costs. However, the capacities and speeds of these and other optical discs have traditionally been lower than that of hard disks or tapes (though advances in optical media are slowly shrinking that gap<ref name="WanOptical14">{{cite journal |title=Optical storage: An emerging option in long-term digital preservation |journal=Frontiers of Optoelectronics |author=Wan, S.; Cao, Q.; Xie, C. |volume=7 |issue=4 |pages=486–492 |year=2014 |doi=10.1007/s12200-014-0442-2}}</ref><ref name="ZhangHigh18">{{cite journal |title=High-capacity optical long data memory based on enhanced Young's modulus in nanoplasmonic hybrid glass composites |journal=Nature Communications |author=Zhang, Q.; Xia, Z.; Cheng, Y.-B.; Gu, M. |volume=9 |pages=1183 |year=2018 |doi=10.1038/s41467-018-03589-y}}</ref>). Many optical disk formats are [[Write Once Read Many|WORM]] type, which makes them useful for archival purposes since the data cannot be changed. The use of an auto-changer or jukebox can make optical discs a feasible option for larger-scale backup systems. Some optical storage systems allow for cataloged data backups without human contact with the discs, allowing for longer data integrity.
; [[SSD]]/[[Solid state storage]] : Also known as [[flash memory]], [[thumb drive]]s, [[USB flash drive]]s, [[CompactFlash]], [[SmartMedia]], [[Memory Stick]], [[Secure Digital card]]s, etc., these devices are relatively expensive for their low capacity in comparison to hard disk drives, but are very convenient for backing up relatively low data volumes. Unlike its magnetic counterpart, a [[solid-state drive]] contains no moving parts, which makes it less susceptible to physical damage, and it can have high throughput on the order of 500 Mbit/s to 6 Gbit/s. The capacity offered by SSDs continues to grow, and prices are gradually decreasing as they become more common.<ref name="MicheloniSolid17">{{cite journal |url=https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8013049 |title=Solid-State Drives (SSDs) |journal=Proceedings of the IEEE |author=Micheloni, R.; Olivo, P. |volume=105 |issue=9 |pages=1586–88 |year=2017 |doi=10.1109/JPROC.2017.2727228 |accessdate=8 May 2018}}</ref><ref name="PCMagBestRuggedHDDs&SSDs" /> Over a period of years, flash memory backups are less stable than hard disk backups.<ref name="ForbesKeepingDataLongTime" />
; [[Remote backup service|Remote backup service AKA cloud backup]] : As [[broadband Internet access]] becomes more widespread, remote backup services are gaining in popularity. Backing up via the Internet to a remote location can protect against events such as fires, floods, or earthquakes which could destroy locally-stored backups.<ref name="DellEMC">{{cite web |url=https://www.emc.com/corporate/glossary/remote-backup.htm |title=Remote Backup |work=EMC Glossary |publisher=Dell, Inc |accessdate=8 May 2018}}</ref> There are, however, a number of drawbacks to remote backup services. First, Internet connections are usually slower than local data storage devices. Residential broadband is especially problematic, as routine backups must use an upstream link that is usually much slower than the downstream link, which is used only occasionally to retrieve a file from backup. This tends to limit the use of such services to relatively small amounts of high-value data, even if a particular service provides initial [[seed loading]]. Second, users must trust a third-party service provider to maintain the privacy and integrity of their data, although confidentiality can be assured by encrypting the data before transmission to the backup service with an [[key (cryptography)|encryption key]] known only to the user. Ultimately, the backup service must itself use one of the above methods, so this could be seen as a more complex way of doing traditional backups.
; [[Floppy disk]] and its derivatives : During the 1980s and early 1990s, many personal/home computer users associated backing up mostly with copying to floppy disks. However, the data capacity of floppy disks did not keep pace with growing demands, rendering them effectively obsolete. Later "[[superfloppy]]" devices and [[Iomega REV|related "non-floppy"]] devices provide greater storage capacity and remain supported as backup media by some developers.<ref name="RetrospectWindows12UG" />
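The data deduplication mentioned above for disk-based backup systems can be illustrated with a small content-addressed chunk store. This is only a sketch under simplifying assumptions: the <code>DedupStore</code> class and its methods are hypothetical, chunks are fixed-size (commercial products may use variable-size chunking), and the chunk size is deliberately tiny so the effect is visible on short inputs.

```python
import hashlib

CHUNK_SIZE = 4  # assumption: unrealistically small, for illustration only


class DedupStore:
    """Hypothetical in-memory store that keeps each distinct chunk once."""

    def __init__(self):
        self.chunks = {}  # SHA-256 hex digest -> chunk bytes
        self.files = {}   # backup name -> ordered list of chunk digests

    def put(self, name, data):
        digests = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            # Store the chunk only if an identical one is not already present.
            self.chunks.setdefault(digest, chunk)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name):
        # Reassemble a backup from its chunk list.
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())


store = DedupStore()
store.put("monday", b"AAAABBBBCCCC")
store.put("tuesday", b"AAAABBBBDDDD")  # only the last chunk differs
print(store.stored_bytes())            # 16, although 24 bytes were backed up
```

Two daily backups that share most of their content occupy little more space than one, which is why deduplication pays off for repeated daily and weekly backups.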
=== Managing the data repository ===
Regardless of the data repository model, or data storage media used for backups, a balance needs to be struck between accessibility, security and cost. These media management methods are not mutually exclusive and are frequently combined to meet the user's needs. Using on-line disks for staging data before it is sent to a near-line [[tape library]] is a common example.
Data repository implementations include<ref name="StackpoleSoftware07">{{cite book |url=https://books.google.com/books?id=gjAhVzuV7k0C&pg=PA164 |title=Software Deployment, Updating, and Patching |author=Stackpole, B.; Hanrion, P. |publisher=CRC Press |pages=164–165 |year=2007 |isbn=978-1-4200-1329-0 |accessdate=8 May 2018}}</ref><ref name="GnanasundaramInfo12">{{cite book |url=https://books.google.com/books?id=PU7gkW9ArxIC&pg=PA255 |title=Information Storage and Management: Storing, Managing, and Protecting Digital Information in Classic, Virtualized, and Cloud Environments |editor=Gnanasundaram, S.; Shrivastava, A. |publisher=John Wiley and Sons |page=255 |year=2012 |isbn=978-1-118-23696-3 |accessdate=8 May 2018}}</ref>:
; [[Online|On-line]] : On-line backup storage is typically the most accessible type of data storage and can begin a restore within milliseconds. A good example is an internal hard disk or a [[disk array]] (possibly connected to a [[Storage area network|SAN]]). This type of storage is convenient and fast, but relatively expensive. On-line storage is quite vulnerable to being deleted or overwritten, whether by accident, by intentional malevolent action, or in the wake of a data-deleting [[Computer virus|virus]] payload.
; [[Nearline storage|Near-line]] : Near-line storage is typically less accessible and less expensive than on-line storage, but still useful for backup data storage. A good example would be a [[tape library]] with restore times ranging from seconds to a few minutes. A mechanical device is usually used to move media units from storage into a drive where the data can be read or written. Generally it has safety properties similar to on-line storage.
; [[Off-line storage|Off-line]] : Off-line storage requires some direct human action to provide access to the storage media: for example inserting a tape into a tape drive or plugging in a cable. Because the data are not accessible via any computer except during limited periods in which they are written or read back, they are largely immune to a whole class of on-line backup failure modes. Access time will vary depending on whether the media are on-site or off-site.
; [[Off-site data protection]]: To protect against a disaster or other site-specific problem, many people choose to send backup media to an off-site vault. The vault can be as simple as a system administrator's home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker with facilities for backup media storage. Importantly, a data replica ''can'' be off-site but also ''on-line'' (e.g., an off-site [[RAID]] mirror). Such a replica has fairly limited value as a backup, and should not be confused with an off-line backup.
; [[Backup site]] or disaster recovery center (DR center): In the event of a disaster, the data on backup media will not be sufficient to recover. Computer systems onto which the data can be restored and properly configured networks are necessary too. Some organizations have their own data recovery centers that are equipped for this scenario. Other organizations contract this out to a third-party recovery center. Because a DR site is itself a huge investment, backing up is very rarely considered the preferred method of moving data to a DR site. A more typical way would be remote [[disk mirroring]], which keeps the DR data as up to date as possible.
== Selection and extraction of data ==
A successful backup job starts with selecting and extracting coherent units of data. Most data on modern computer systems is stored in discrete units, known as [[Computer file|files]]. These files are organized into [[filesystem]]s. Files that are actively being updated can be thought of as "live" and present a challenge to back up. It is also useful to save metadata that describes the computer or the filesystem being backed up.
Deciding what to back up at any given time is a harder process than it seems. By backing up too much redundant data, the data repository will fill up too quickly. Backing up an insufficient amount of data can eventually lead to the loss of critical information.<ref name="LeesWhatTo17">{{cite web |url=https://irontree.co.za/what-to-backup-a-critical-look-at-your-data-1935.html |title=What to backup – a critical look at your data |author=Lees, D. |work=Irontree Blog |publisher=Irontree Internet Services CC |date=25 January 2017 |accessdate=8 May 2018}}</ref>
=== Files ===
; [[File copying|Copying files]] : With the '''file-level''' approach, making copies of files is the simplest and most common way to perform a backup. A means to perform this basic function is included in all backup software and all operating systems.
; Partial file copying: Instead of copying whole files, one can limit the backup to only the blocks or bytes within a file that have changed in a given period of time. This technique can use substantially less storage space on the backup medium, but requires a high level of sophistication to reconstruct files in a restore situation. Some implementations require integration with the source file system.
; Deleted files : To prevent the unintentional restoration of files that have been intentionally deleted, a record of the deletion must be kept.
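Partial file copying, as described above, can be sketched by comparing per-block digests of the old and new versions of a file and saving only the blocks that differ. The function names (<code>changed_blocks</code>, <code>apply_delta</code>) and the tiny block size are assumptions made for illustration. The example also shows why a restore is more involved than with whole-file copies: the previous version and the new length are both needed to rebuild the file.

```python
import hashlib

BLOCK_SIZE = 4  # assumption: tiny block size so short inputs span several blocks


def block_digests(data):
    """Digest of each fixed-size block of data."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
            for i in range(0, len(data), BLOCK_SIZE)]


def changed_blocks(old, new):
    """Return {block_index: block_bytes} for blocks of new that differ from old."""
    old_digests = block_digests(old)
    delta = {}
    for index, digest in enumerate(block_digests(new)):
        if index >= len(old_digests) or old_digests[index] != digest:
            delta[index] = new[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE]
    return delta


def apply_delta(old, delta, new_length):
    """Rebuild the new version from the old version plus the saved blocks."""
    blocks = [old[i:i + BLOCK_SIZE] for i in range(0, len(old), BLOCK_SIZE)]
    needed = -(-new_length // BLOCK_SIZE)  # ceiling division
    # Drop surplus blocks if the file shrank, pad with empty ones if it grew.
    blocks = blocks[:needed] + [b""] * (needed - len(blocks))
    for index, block in delta.items():
        blocks[index] = block
    return b"".join(blocks)[:new_length]


old_version = b"aaaabbbbcccc"
new_version = b"aaaaXXXXccccdd"
delta = changed_blocks(old_version, new_version)
print(sorted(delta))  # [1, 3]: only two of the four blocks need to be stored
```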
=== Filesystems ===
; Filesystem dump: Instead of copying files within a file system, a copy of the whole filesystem itself can be made at the '''block level'''. This is also known as a ''raw partition backup'' and is related to [[disk image|disk imaging]]. The process usually involves unmounting the filesystem and running a program like [[dd (Unix)|dd]].<ref name="PrestonBackup07">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA111 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=111–114 |year=2007 |isbn=978-0-596-55504-7 |accessdate=8 May 2018}}</ref> Because the disk is read sequentially and with large buffers, this type of backup can be much faster than reading every file normally, especially when the filesystem contains many small files, is highly fragmented, or is nearly full. However, because this method also reads the free disk blocks that contain no useful data, it can be slower than conventional reading, especially when the filesystem is nearly empty. Some filesystems, such as [[XFS]], provide a "dump" utility that reads the disk sequentially for high performance while skipping unused sections. The corresponding restore utility can selectively restore individual files or the entire volume at the operator's choice.<ref name="PrestonUnix99">{{cite book |url=https://books.google.com/books?id=_i1sO47qNnMC&pg=PA73 |title=Unix Backup & Recovery |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=73–91 |year=1999 |isbn=978-1-56592-642-4 |accessdate=8 May 2018}}</ref>
; Identification of changes: Some filesystems have an [[archive bit]] for each file that says it was recently changed. Some backup software looks at the date of the file and compares it with the last backup to determine whether the file was changed.
; [[Versioning file system]] : A versioning filesystem keeps track of all changes to a file and makes those changes accessible to the user. Generally this gives access to any previous version, all the way back to the file's creation time. An example of this is the Wayback versioning filesystem for Linux.<ref>[http://www.aqualab.cs.northwestern.edu/publications/Cornell04VFS.html Wayback: A User-level V File System for Linux] {{Webarchive|url=https://web.archive.org/web/20070406204849/http://www.aqualab.cs.northwestern.edu/publications/Cornell04VFS.html |date=6 April 2007 }} (2004). Retrieved 10 March 2007</ref>
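The date-comparison approach to identifying changes described above can be sketched with Python's standard library. The function name <code>files_changed_since</code> is hypothetical, and the sketch assumes that a file's modification time is a trustworthy change indicator, which is not always the case (for example, when a program resets timestamps after writing).

```python
import os


def files_changed_since(root, last_backup_time):
    """Return paths under root whose modification time is later than
    last_backup_time (seconds since the epoch, as returned by time.time())."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Compare the file's date with the time of the last backup.
            if os.stat(path).st_mtime > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

A backup tool built this way would record the start time of each run and pass it as <code>last_backup_time</code> on the next run. An archive bit serves the same purpose without a timestamp comparison.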
=== Live data ===
If a computer system is in use while it is being backed up, the possibility of files being open for reading or writing is real. If a file is open, the contents on disk may not correctly represent what the owner of the file intends. This is especially true for database files of all kinds. The term [[fuzzy backup]] can be used to describe a backup of live data that looks like it ran correctly, but does not represent the state of the data at any single point in time. This is because the data being backed up changed in the period of time between when the backup started and when it finished.<ref name="LiotineMission03">{{cite book |url=https://books.google.com/books?id=LecC2BhPPxMC&pg=PA244 |title=Mission-critical Network Planning |author=Liotine, M. |publisher=Artech House |page=244 |year=2003 |isbn=978-1-58053-559-5 |accessdate=8 May 2018}}</ref>
Backup options for live (and other) data availability scenarios include<ref name="deGuiseEnterprise08">{{cite book |url=https://books.google.com/books?id=2OtqvySBTu4C&pg=PA50 |title=Enterprise Systems Backup and Recovery: A Corporate Insurance Policy |author=de Guise, P. |publisher=CRC Press |pages=50–54 |year=2008 |isbn=978-1-4200-7640-0}}</ref>:
; [[Snapshot (computer storage)|Snapshot]] backup: A snapshot is an instantaneous function of some storage systems that presents a copy of the file system as if it were frozen at a specific point in time, often by a [[copy-on-write]] mechanism. An effective way to back up live data is to temporarily [[quiesce]] them (e.g., close all files), take a snapshot, and then resume live operations. At this point the snapshot can be backed up through normal methods.<ref>[http://edseek.com/~jasonb/articles/dirvish_backup/snapshot.html What is a Snapshot backup?] {{Webarchive|url=https://web.archive.org/web/20070403041940/http://edseek.com/~jasonb/articles/dirvish_backup/snapshot.html |date=3 April 2007 }}. Retrieved 10 March 2007</ref> While a snapshot is very handy for viewing a filesystem as it was at a different point in time, it is hardly an effective backup mechanism by itself.
; Open file backup: Many backup software packages feature the ability to handle open files in backup operations. Some simply check for openness and try again later. [[File locking]] is useful for regulating access to open files.
: When attempting to understand the logistics of backing up open files, one must consider that the backup process could take several minutes to back up a large file such as a database. To back up a file that is in use, it is vital that the entire backup represent a single-moment snapshot of the file, rather than a simple read-through copy. This is a challenge when the file is constantly changing. Either the database file must be locked to prevent changes, or a method must be implemented to ensure that the original snapshot is preserved long enough to be copied while changes continue to be recorded. If a file is backed up while it is being changed, so that the first part of the backup holds data from ''before'' a change and later parts hold data from ''after'' it, the result is a corrupted, unusable file, because most large files contain internal references between their various parts that must remain consistent throughout the file.
; Cold database (offline) backup: During a cold backup, the database is closed or locked and not available to users. The datafiles do not change during the backup process so the database is in a consistent state when it is returned to normal operation.<ref>[http://www.wisc.edu/drmt/oratips/sess003.html#coldbackup Oracle Tips] {{Webarchive|url=https://web.archive.org/web/20070302110933/http://www.wisc.edu/drmt/oratips/sess003.html#coldbackup |date=2 March 2007 }} (10 December 1997). Retrieved 10 March 2007</ref>
; Hot database (online) backup: Some database management systems offer a means to generate a backup image of the database while it is online and usable ("hot"). This usually includes an inconsistent image of the data files plus a log of changes made while the procedure is running. Upon a restore, the changes in the log files are reapplied to bring the copy of the database up-to-date (the point in time at which the initial hot backup ended).<ref>[http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup Oracle Tips] {{Webarchive|url=https://web.archive.org/web/20070302110933/http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup |date=2 March 2007 }} (10 December 1997). Retrieved 10 March 2007</ref>
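
The combination of an inconsistent image and a change log described for hot backups can be illustrated with a toy model. The sketch below is not how any real database management system implements online backup: the dictionary of "pages", the function names, and the way writes are scheduled during the copy are all assumptions chosen to keep the example small.

```python
def hot_backup(db, writes_during_backup):
    """Copy db page by page while writes arrive.

    writes_during_backup maps a step number to a list of (page, value)
    writes applied to the live database just before that step's copy.
    Returns (fuzzy_image, change_log).
    """
    image, log = {}, []
    for step, page in enumerate(sorted(db)):
        for target, value in writes_during_backup.get(step, []):
            db[target] = value           # live update while the backup runs
            log.append((target, value))  # record the update in the change log
        image[page] = db[page]           # copy the page as it is right now
    return image, log


def restore(image, log):
    """Rebuild the database by replaying the change log over the image."""
    restored = dict(image)
    for page, value in log:
        restored[page] = value
    return restored
```

If pages <code>a</code> and <code>c</code> are both updated after <code>a</code> has been copied but before <code>c</code> has, the image holds the old <code>a</code> and the new <code>c</code>, a state the database never had. Replaying the log over that image yields the state at the moment the backup ended.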
=== Metadata ===
Not all information stored on the computer is stored in files. Accurately recovering a complete system from scratch requires keeping track of this non-file data too.<ref name="Gresovnik1">{{cite web |url=http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html |title=Preparation of Bootable Media and Images |last=Grešovnik |first=Igor |date=April 2016 |archive-url=https://web.archive.org/web/20160425113119/http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html |archivedate=25 April 2016 |access-date=21 April 2016}}</ref>
; System description: System specifications are needed to procure an exact replacement after a disaster.
; [[Boot sector]] : The boot sector can sometimes be recreated more easily than it can be saved. Still, it usually isn't a normal file and the system won't boot without it.
; [[Disk partitioning|Partition]] layout: The layout of the original disk, as well as partition tables and filesystem settings, is needed to properly recreate the original system.
; File [[metadata]] : Each file's permissions, owner, group, ACLs, and any other metadata need to be backed up for a restore to properly recreate the original environment.
; System metadata: Different operating systems have different ways of storing configuration information. [[Microsoft Windows]] keeps a [[Windows Registry|registry]] of system information that is more difficult to restore than a typical file.
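
As a rough illustration of the ''File metadata'' item, the following Python sketch records a file's permission bits, owner, group, and modification time, and reapplies part of that record to a restored file. It assumes a POSIX-like system; the function names and the dictionary layout are hypothetical, and ACLs and other extended metadata are deliberately not covered.

```python
import os
import stat


def capture_metadata(path):
    """Record basic metadata for path without following symbolic links."""
    st = os.stat(path, follow_symlinks=False)
    return {
        "mode": stat.S_IMODE(st.st_mode),  # permission bits only
        "uid": st.st_uid,                  # numeric owner
        "gid": st.st_gid,                  # numeric group
        "mtime_ns": st.st_mtime_ns,        # modification time, nanoseconds
    }


def apply_metadata(path, meta):
    """Reapply permissions and modification time to a restored file.

    Restoring ownership (os.chown) usually needs elevated privileges,
    so it is left out of this sketch.
    """
    os.chmod(path, meta["mode"])
    os.utime(path, ns=(meta["mtime_ns"], meta["mtime_ns"]))
```

A real backup tool would store such a record alongside the file contents so that both can be restored together.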
== Manipulation of data and dataset optimization ==
It is frequently useful or required to manipulate the data being backed up to optimize the backup process. These manipulations can provide many benefits, including improved backup speed, restore speed, data security, and media usage, and reduced bandwidth requirements.
; [[Data compression|Compression]] : Various schemes can be employed to shrink the size of the source data to be stored so that it uses less storage space. Compression is frequently a built-in feature of tape drive hardware.<ref name="CherrySecuring15">{{cite book |url=https://books.google.com/books?id=SD_LAwAAQBAJ&pg=PA306 |title=Securing SQL Server: Protecting Your Database from Attackers |author=Cherry, D. |publisher=Syngress |pages=306–308 |year=2015 |isbn=978-0-12-801375-5 |accessdate=8 May 2018}}</ref>
; [[Data deduplication|Deduplication]] : When multiple similar systems are backed up to the same destination storage device, the backed-up data can contain a great deal of redundancy. For example, if 20 Windows workstations were backed up to the same data repository, they might share a common set of system files. The data repository only needs to store one copy of those files to be able to restore any one of those workstations. This technique can be applied at the file level or even on raw blocks of data, potentially resulting in a massive reduction in required storage space.<ref name="CherrySecuring15" /> Deduplication can occur on a server before any data moves to backup media, sometimes referred to as source/client side deduplication. This approach also reduces the bandwidth required to send backup data to its target media. The process can also occur at the target storage device, sometimes referred to as inline or back-end deduplication.
; [[Replication (computer science)|Duplication]] : Sometimes backup jobs are duplicated to a second set of storage media. This can be done to rearrange the backup images to optimize restore speed or to have a second copy at a different location or on a different storage medium.
; [[Encryption]] : High-capacity removable storage media such as backup tapes present a data security risk if they are lost or stolen.<ref>[http://www.securityfocus.com/news/11048 Backups tapes a backdoor for identity thieves] {{Webarchive|url=https://web.archive.org/web/20160405033517/http://www.securityfocus.com/news/11048 |date=5 April 2016 }} (28 April 2004). Retrieved 10 March 2007</ref> Encrypting the data on these media can mitigate this problem, but introduces new ones. Encryption is a CPU-intensive process that can slow down backup speeds, and the security of the encrypted backups is only as effective as the security of the key management policy.<ref name="CherrySecuring15" />
; [[Multiplexing]] : When there are many more computers to be backed up than there are destination storage devices, the ability to use a single storage device with several simultaneous backups can be useful.<ref name="PrestonBackup07-02">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA219 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=219–220 |year=2007 |isbn=978-0-596-55504-7 |accessdate=8 May 2018}}</ref>
; Refactoring: The process of rearranging the backup sets in a data repository is known as refactoring. For example, if a backup system uses a single tape each day to store the incremental backups for all the protected computers, restoring one of the computers could potentially require many tapes. Refactoring could be used to consolidate all the backups for a single computer onto a single tape. This is especially useful for backup systems that do ''incrementals forever'' style backups.
; [[Disk staging|Staging]] : Sometimes backup jobs are copied to a staging disk before being copied to tape.<ref name="PrestonBackup07-02" /> This process is sometimes referred to as D2D2T, an acronym for Disk to Disk to Tape. This can be useful if there is a problem matching the speed of the final destination device with the source device as is frequently faced in network-based backup systems. It can also serve as a centralized location for applying other data manipulation techniques.
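
Block-level deduplication, as described above, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: blocks have a fixed size, each block is identified by its SHA-256 digest, and the names <code>dedup_store</code> and <code>rebuild</code> are hypothetical. Real products often use variable-size chunking and persistent indexes.

```python
import hashlib


def dedup_store(streams, block_size=4096):
    """Store each distinct block once.

    streams maps a source name to its bytes. Returns (store, recipes):
    store maps a digest to block bytes, and recipes maps each source
    name to the ordered list of digests needed to rebuild it.
    """
    store = {}
    recipes = {}
    for name, data in streams.items():
        digests = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)  # keep only the first copy
            digests.append(digest)
        recipes[name] = digests
    return store, recipes


def rebuild(store, recipe):
    """Reassemble one source from its list of block digests."""
    return b"".join(store[d] for d in recipe)
```

With two sources that share most of their content, the store holds the shared blocks only once, while each source can still be rebuilt in full from its own recipe.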
== Managing the backup process ==
As long as new data are being created and changes are being made, backups will need to be performed at frequent intervals. Individuals and organizations with anything from one computer to thousands of computer systems all require protection of data. The scales may be very different, but the objectives and limitations are essentially the same. Those who perform backups need to know how successful the backups are, regardless of scale.
=== Objectives ===
; [[Recovery point objective]] (RPO) : The point in time that the restarted infrastructure will reflect. Essentially, this is the roll-back that will be experienced as a result of the recovery. The most desirable RPO would be the point just prior to the data loss event. Making a more recent recovery point achievable requires increasing the frequency of [[file synchronization|synchronization]] between the source data and the backup repository.<ref>[http://www.riskythinking.com/glossary/recovery_point_objective.php Definition of ''recovery point objective''] {{Webarchive|url=https://web.archive.org/web/20070513180844/http://www.riskythinking.com/glossary/recovery_point_objective.php |date=13 May 2007 }}. Retrieved 10 March 2007</ref><ref>{{Cite web |title=Top four things to consider in business continuity planning |url=http://sysgen.ca/top-four-things-business-continuity-planning/ |website=sysgen.ca |accessdate=23 September 2015 |archive-url=https://web.archive.org/web/20160304075050/http://sysgen.ca/top-four-things-business-continuity-planning/ |archive-date=4 March 2016 |dead-url=no |df=dmy-all}}</ref>
; [[Recovery time objective]] (RTO) : The amount of time elapsed between disaster and restoration of business functions.<ref>[http://www.riskythinking.com/glossary/recovery_time_objective.php Definition of ''recovery time objective''] {{Webarchive|url=https://web.archive.org/web/20070516081425/http://www.riskythinking.com/glossary/recovery_time_objective.php |date=16 May 2007 }}. Retrieved 7 March 2007</ref>
; [[Data security]] : In addition to preserving access to data for its owners, data must be restricted from unauthorized access. Backups must be performed in a manner that does not compromise the original owner's undertaking. This can be achieved with data encryption and proper media handling policies.<ref name="LittleImplement03">{{cite book |url=https://books.google.com/books?id=_DqO6kizEDUC&pg=PA17 |title=Implementing Backup and Recovery: The Readiness Guide for the Enterprise |chapter=Chapter 2: Business Requirements of Backup Systems |author=Little, D.B. |publisher=John Wiley and Sons |pages=17–30 |year=2003 |isbn=978-0-471-48081-5 |accessdate=8 May 2018}}</ref>
; [[Data retention]] period : Regulations and policy can lead to situations where backups are expected to be retained for a particular period, but not any further. Retaining backups after this period can lead to unwanted liability and sub-optimal use of storage media.<ref name="LittleImplement03" />
=== Limitations ===
An effective backup scheme will take into consideration the following situational limitations<ref name="NelsonPro11-2">{{cite book |url=https://books.google.com/books?id=r4uEEsq3CJYC&printsec=frontcover |title=Pro Data Backup and Recovery |chapter=Chapter 9: Putting It All Together: Sample Backup Environments |author=Nelson, S. |publisher=Apress |pages=203–246 |year=2011 |isbn=978-1-4302-2663-5 |accessdate=8 May 2018}}</ref>:
; Backup window: The period of time when backups are permitted to run on a system is called the backup window. This is typically the time when the system sees the least usage and the backup process will have the least amount of interference with normal operations. The backup window is usually planned with users' convenience in mind. If a backup extends past the defined backup window, a decision is made whether it is more beneficial to abort the backup or to lengthen the backup window.
; Performance impact: All backup schemes have some performance impact on the system being backed up. For example, for the period of time that a computer system is being backed up, the hard drive is busy reading files for the purpose of backing up, and its full bandwidth is no longer available for other tasks. Such impacts should be analyzed.
; Costs of hardware, software, labor: All types of storage media have a finite capacity with a real cost. Matching the correct amount of storage capacity (over time) with the backup needs is an important part of the design of a backup scheme. Any backup scheme has some labor requirement, but complicated schemes have considerably higher labor requirements. The cost of commercial backup software can also be considerable.
; Network bandwidth: Distributed backup systems can be affected by limited network bandwidth.
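
Whether a backup fits its window under a given bandwidth limit is a matter of simple arithmetic. The sketch below assumes a constant sustained throughput and decimal units (1 GB = 1000 MB); the function names are hypothetical, and real throughput varies with compression, file sizes, and contention.

```python
def backup_duration_hours(data_gb, throughput_mb_per_s):
    """Hours needed to move data_gb gigabytes at a sustained throughput."""
    seconds = (data_gb * 1000) / throughput_mb_per_s  # 1 GB = 1000 MB
    return seconds / 3600


def fits_window(data_gb, throughput_mb_per_s, window_hours):
    """True if the estimated duration does not exceed the backup window."""
    return backup_duration_hours(data_gb, throughput_mb_per_s) <= window_hours
```

For example, moving 3600 GB at a sustained 100 MB/s takes 36,000 seconds, or 10 hours, so such a backup would not fit an 8-hour window.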
=== Implementation ===
Meeting the defined objectives in the face of the above limitations can be a difficult task. The tools and concepts below can make that task more achievable.
; Scheduling: Using a [[job scheduler]] can greatly improve the reliability and consistency of backups by removing part of the human element. Many backup software packages include this functionality.
; Authentication: Over the course of regular operations, the user accounts and/or system agents that perform the backups need to be authenticated at some level. The power to copy all data off of or onto a system requires unrestricted access. Using an authentication mechanism is a good way to prevent the backup scheme from being used for unauthorized activity.
; Chain of trust : Removable [[storage media]] are physical items and must only be handled by trusted individuals. Establishing a chain of trusted individuals (and vendors) is critical to defining the security of the data.
=== Measuring the process ===
To ensure that the backup scheme is working as expected, the following best practices should be enacted<ref name="AkhtarDatabase12">{{cite journal |title=Database Backup and Recovery Best Practices |journal=ISACA Journal |author=Akhtar, A.N.; Buchholtz, J.; Ryan, M.; Setty, K. |volume=1 |pages=1–6 |year=2012 |url=https://www.isaca.org/Journal/archives/2012/Volume-1/Pages/Database-Backup-and-Recovery-Best-Practices.aspx |accessdate=8 May 2018}}</ref><ref name=DorionBackupReportingTool>{{cite web |last1=Dorion |first1=Pierre |title=Why you need a data backup reporting tool |url=http://searchdatabackup.techtarget.com/tip/Why-you-need-a-data-backup-reporting-tool |website=TechTarget |publisher=Tech Target Inc. |accessdate=13 November 2017 |date=June 2008}}</ref><ref name="PritchardCloud17">{{cite web |url=https://www.computerweekly.com/feature/Cloud-to-cloud-backup-What-it-is-and-why-you-need-it |title=Cloud-to-cloud backup: What it is and why you need it |author=Pritchard, S. |work=Computer Weekly |publisher=TechTarget |date=December 2017 |accessdate=8 May 2018}}</ref>:
; [[Backup validation]] : (also known as "backup success validation") Provides information about the backup, and proves compliance to regulatory bodies outside the organization: for example, an insurance company in the USA might be required under [[Health Insurance Portability and Accountability Act|HIPAA]] to demonstrate that its client data meet records retention requirements.<ref>[http://www.hipaadvisory.com/regs/recordretention.htm HIPAA Advisory] {{Webarchive|url=https://web.archive.org/web/20070411135655/http://www.hipaadvisory.com/regs/recordretention.htm |date=11 April 2007 }}. Retrieved 10 March 2007</ref> Disaster, data complexity, data value and increasing dependence upon ever-growing volumes of data all contribute to the anxiety around and dependence upon successful backups to ensure [[business continuity]]. Thus many organizations rely on third-party or "independent" solutions to test, validate, and optimize their backup operations (backup reporting).
; Reporting: In larger configurations, reports are useful for monitoring media usage, device status, errors, vault coordination and other information about the backup process.
; Logging: In addition to the history of computer generated reports, activity and change logs are useful for monitoring backup system events.
; Validation: Many backup programs use [[checksum]]s or [[hash function|hashes]] to validate that the data was accurately copied. These offer several advantages. First, they allow data integrity to be verified without reference to the original file: if the file as stored on the backup medium has the same checksum as the saved value, then it is very probably correct. Second, some backup programs can use checksums to avoid making redundant copies of files, and thus improve backup speed. This is particularly useful for the de-duplication process.
; Monitored backup: Backup processes can be monitored locally via a software dashboard or by a third-party monitoring center. Both alert users to any errors that occur during automated backups. Some third-party monitoring services also allow collection of historical metadata that can be used for storage resource management purposes, such as projecting data growth and locating redundant primary storage capacity and reclaimable backup capacity.
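
The checksum-based check described under ''Validation'' can be sketched as follows. The example assumes that a SHA-256 digest was recorded when the backup was written; the function names are hypothetical, and actual backup programs may use other hash functions or store digests differently.

```python
import hashlib


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_backup(path, recorded_digest):
    """True if the stored copy still matches the digest saved at backup time."""
    return file_digest(path) == recorded_digest
```

Because only the stored copy and the saved digest are needed, the check can be run without access to the original file.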
== Enterprise client-server backup ==
"Enterprise client-server" backup software describes a class of software applications that back up data from a variety of client computers centrally to one or more server computers, with the particular needs of [[Company|enterprises]] in mind. They may employ a scripted client–server<ref name="KissellTakeControl2.0">{{cite book |last1=Kissell |first1=Joe |title=Take Control of Mac OS X Backups |date=2007 |publisher=TidBITS Electronic Publishing |location=Ithaca, NY |isbn=0-9759503-0-4 |edition=Version 2.0 |url=http://people.fas.harvard.edu/~techtool/pages/Take_Control_of_Mac_OS_X_Backups_(2.0).pdf |accessdate=22 September 2017 |ref=Kissell |pages=24 (client-server), 127 (script), 165 (client-server), 128 (subvolume—''later'' renamed Favorite Folder in Macintosh variant)}}</ref> backup model<ref name="EnterpriseBackupChallenges">{{cite web |last1=Rassokhin? |first1=Alexander? |title=Enterprise Network Backup Challenges |url=http://www.backupschedule.net/enterprise-network-backup.html |website=All About Backup |publisher=Novosoft LLC |accessdate=13 November 2017 |year=2012}}</ref> with a backup [[server (computing)|server]] program running on one computer, and with small-footprint [[client (computing)|client]] programs (referred to as "agents" in some applications) running on the other computers being backed up, in either a single platform or [[heterogeneous network|mixed platform network]]. Enterprise-specific requirements<ref name="EnterpriseBackupChallenges" /> include the need to back up large amounts of data on a systematic basis, to adhere to legal requirements for the maintenance and archiving of files and data, and to satisfy short-recovery-time objectives. 
To satisfy these requirements, which World Backup Day (31 March)<ref name="CBC-WorldBackupDay">{{cite web |url=http://www.cbc.ca/news/technology/world-backup-day-1.3510588 |title=World Backup Day highlights importance of protecting data |last=Misener |first=Dan |date=29 March 2016 |publisher=CBC News}}</ref><ref name="ZDNetWorldBackupDay">{{cite web |title=World Backup Day: deutliche Lücken zwischen Sicherheitsrisiko und Nutzerverhalten |url=http://www.zdnet.de/88291257/ |publisher=[[ZDNet]] |date=31 March 2017 |language=de-DE |first=Anja |last=Schmoll-Trautmann}}</ref><ref name="eWeekWorldBackupDay">{{cite web |last1=Preimesberger |first1=Chris |title=World Backup Day 2017: 'We Don't Know the Day Nor the Hour' |url=http://www.eweek.com/storage/world-backup-day-2017-we-don-t-know-the-day-nor-the-hour |website=eWeek |publisher=QuinStreet |accessdate=11 November 2017 |date=31 March 2017 |at=Ian Wood of Veritas}}</ref> highlights, it is typical for an enterprise to appoint a backup administrator, who is a part of office administration rather than of the IT staff, and whose role is "being the keeper of the data".<ref name="DorionBackupAdminRole">{{cite web |last1=Dorion |first1=Pierre |title=The true role of a backup administrator |url=http://searchdatabackup.techtarget.com/news/1322981/The-true-role-of-a-backup-administrator |website=TechTarget |publisher=TechTarget, Inc. |accessdate=13 November 2017 |date=4 August 2008 |quote=On the other hand, the role of a backup administrator should be one of administration, not operation....whose role is "being the keeper of the data"}}</ref>
Such applications make cumulative backups of ''multiple'' client machines' source files to, or do restores from, what would ordinarily be referred to as an [[archive file]]. However, some of these applications use (or once used<ref name="BackupExecArchivingOptionNoLongeSupported">{{cite web |title=Backup Exec Archiving Option is no longer supported for Backup Exec 15 Feature Pack 1 |url=https://www.veritas.com/support/en_US/article.100023956 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=13 May 2018 |date=30 June 2015}}</ref>) the term "archive" to refer to a backup operation that deletes data from a client source once the data's backup is complete.<ref name="NetBackupWhatIsArchiving">{{cite web |last1=Bokelman |first1=Seth |title=what is archiving in Netbackup? |url=https://vox.veritas.com/t5/NetBackup/what-is-archiving-in-Netbackup/m-p/490153#M112727 |website=VOX |publisher=Veritas Technologies LLC |accessdate=13 May 2018 |date=26 February 2012}}</ref><ref name="RetrospectMac14UG">{{cite web |title=Retrospect ® 14.0 Mac User's Guide |url=http://download.retrospect.com/docs/mac/v14/user_guide/Retrospect_Mac_User_Guide-EN.pdf |website=Retrospect |publisher=Retrospect Inc. |accessdate=28 March 2017 |format=PDF |date=March 2017}}</ref> Therefore, the discussion of these applications will use the non-proprietary term "set(s) of backups" instead of "archive file(s)".
=== Performance ===
The [[Hard disk drive#Price evolution|steady improvement in hard disk drive price per byte]] has made feasible a [[Backup#Manipulation of data and dataset optimization|disk-to-disk-to-tape]] strategy, combining the speed of disk backup and restore with the capacity and low cost of tape for offsite archival and disaster recovery purposes.<ref name="FernandoCombineDiskTapeBenefits">{{cite web |last1=Fernando |first1=Sal |title=Combine disk, tape benefits to protect data |url=http://www.zdnet.com/article/combine-disk-tape-benefits-to-protect-data/ |publisher=ZDNet |accessdate=13 November 2017 |date=30 April 2008}}</ref> This, with [[Comparison of file systems#File capabilities|file system technology]], has led to features such as:
; Improved disk-to-disk-to-tape capabilities: Enable automated transfers to tape for safe offsite storage of disk sets of backups that were created for fast onsite restores.<ref name="EMCRetroWindows7">{{cite web |title=New EMC Dantz Retrospect 7 Improves Data Protection for SMBs and the Distributed Enterprise |url=http://www.emc.com/about/news/press/us/2005/20050131-2906.htm |website=DellEMC [current] |publisher=EMC Corp. [orig. publisher] |accessdate=23 November 2016 |date=31 January 2005}}</ref><ref name="NetBackupAboutReplicationDirector">{{cite web |title=About NetBackup Replication Director |url=https://www.veritas.com/support/en_US/doc/59229900-126796169-0/v58079997-126796169 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=13 July 2017}}</ref><ref name="BackupExecDuplicatingBackedUpData">{{cite web |title=Symantec Backup Exec: About duplicating backed up data |url=http://backup-exec.helpmax.net/en/backing-up-data/about-duplicating-backed-up-data/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=13 January 2018}}</ref>
; Create synthetic full backups: For example, onto tapes from existing disk sets of backups—by copying multiple backups of the same source(s) from one set of backups to another. This is termed a [[Incremental backup#Synthetic full backup|"synthetic full backup"]] because, after the transfer, the destination set of backups contains the same data it would after full backups.<ref name="EMCRetroWindows7" /><ref name="NetBackupAboutSyntheticBackups">{{cite web |title=About synthetic backups |url=https://www.veritas.com/content/support/en_US/doc/18716246-126559472-0/id-SF780163836-126559472 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=25 September 2017}}</ref><ref name="BackupExecSyntheticFullBackup">{{cite web |title=Symantec Backup Exec: About the synthetic backup feature |url=http://backup-exec.helpmax.net/en/symantec-backup-exec-advanced-disk-based-backup-option/about-the-synthetic-backup-feature/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=13 January 2018}}</ref> One application can exclude<ref group=note name=RetrospectExclusionInclusion>Exclusion and/or inclusion is done with Selectors in the Windows variant; this misleading term has been changed to Rules in the Macintosh variant.</ref> files and folders from the synthetic full backup.<ref name="RetrospectWindows12UG" />
; Automated data grooming: Frees up space on disk sets of backups by removing out-of-date backup data—usually based on an administrator-defined retention period.<ref name="eWeekWorldBackupDay" /><ref name="FernandoCombineDiskTapeBenefits" /><ref name="EMCRetroWindows7" /><ref name="NetBackupStorageLifecyclePolicy">{{cite web |last1=Kaczorek |first1=Mariusz |title=NetBackup Storage Lifecycle Policy (SLP): Overview |url=https://www.settlersoman.com/netbackup-storage-lifecycle-policy-slp-overview/ |website=Settlersoman |publisher=Settlersoman |accessdate=2 February 2018 |date=15 August 2015}}</ref><ref name="BackupExecDataGrooming">{{cite web |last1=Jain |first1=Hemant |title=VOX Knowledge Base: Data Protection Knowledge Base: Data Protection |url=https://vox.veritas.com/t5/Articles/Automated-Disk-management-and-Data-retention-in-Backup-Exec-DLM/ta-p/809167 |website=VOX |publisher=Veritas Technologies LLC |accessdate=13 January 2018 |date=14 April 2015 |quote=Employee [of Veritas]}}</ref><ref group=note>A few backup applications—mostly free ones—term this "pruning" instead of "grooming", but other applications use the term "pruning" to mean omitting certain ''types'' of files from backups.</ref> One method of removing data is to keep the last backup of each day/week/month for the last respective week/month/specified-number-of-months, permitting compliance with regulatory requirements.<ref name="RetrospectMac12UG">{{cite web |title=Retrospect ® 12.0 Mac User's Guide |url=http://download.retrospect.com/docs/mac/v12/user_guide/Retrospect_Mac_User_Guide-EN.pdf |website=Retrospect |publisher=Retrospect Inc. 
|accessdate=28 December 2017 |format=PDF |year=2015}}</ref> One application has a "performance-optimized grooming" mode that removes only the outdated information it can quickly delete from a set of backups.<ref name="TitBITSMacintosh13">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 13 |url=https://tidbits.com/article/16311 |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=27 October 2016 |date=5 March 2016}}</ref> This is the only mode of grooming allowed for cloud sets of backups, and is also up to 5 times as fast when used on locally stored disk sets of backups. The "storage-optimized grooming" mode reclaims more space because it rewrites the set of backups, and in this application also supports compliance with the [[GDPR]]<ref name="RetrospectKnowledgeBase">{{cite web |title=Support: Knowledge Base |url=https://www.retrospect.com/en/support/kb/ |website=Retrospect |publisher=Retrospect Inc. |accessdate=25 August 2018 |date=2 July 2018 |at=#Resources (Auto Launching Guide ..., ... difference between "Backup" and "Duplicate", Avid Support ..., Instant Scan FAQ), #Email Backup, #Top Articles (BackupBot – Deep Dive into ProactiveAI, How to Set Up Remote Backup, GDPR – Deep Dive into Data Retention Policies, Deep Dive - Components of a Retrospect Backup)}}</ref> by excluding data via rules<ref group=note name=RetrospectExclusionInclusion />, which can instead be used for other filtering.<ref name="TitBITSMacintosh15.1.1">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 15.1.1 |url=https://tidbits.com/watchlist/retrospect-15-1-1/ |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=20 June 2018 |date=28 May 2018}}</ref>
; Multithreaded backup server: Capable of simultaneously performing multiple backup, restore, and copy operations in separate "activity threads" (once needed only by those who could afford multiple tape drives).<ref name="EnterpriseBackupChallenges" /><ref name="NetBackupMultistreamMultiplex">{{cite web |title=What is the difference between multiplexing and multistreaming? |url=https://www.veritas.com/support/en_US/article.TECH10085 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=19 November 2017 |date=29 January 2015}}</ref><ref name="BackupExecRunConcurrentJobs">{{cite web |last1=McMillen |first1=Robert |title=How to run concurrent jobs in Backup exec 15 |url=https://www.youtube.com/watch?v=1-9x9So038g |via=YouTube |publisher=Google |accessdate=14 January 2018 |format=Video |date=21 July 2015}}</ref> In one application, all the categories of information for a particular "backup server" are stored by it; when an [[backup#User interface|"Administration Console"]] process is started, its process synchronizes information with all running LAN/WAN backup servers.<ref name="TidBITSEMCShips">{{cite web |last1=Engst |first1=Adam |title=EMC Ships Modernized Retrospect 8 |url=https://tidbits.com/article/10159 |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=12 September 2017 |date=23 March 2009}}</ref>
; Block-level incremental backup: The ability to back up only the blocks of a file that have changed, a [[Incremental backup#Block level incremental|refinement of incremental backup]] that saves space<ref name="TitBITSMacintosh11">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 11 |url=https://tidbits.com/article/14573 |website=TitBITS |publisher=TidBITS Publishing Inc. |accessdate=27 April 2017 |date=6 March 2014}}</ref><ref name="NetBackupBlockLevelOracle">{{cite web |title=How Veritas NetBackup block-level incremental backup works for Oracle database files |url=https://sort.symantec.com/public/documents/sfha/6.0/aix/productguides/html/sf_adv_ora/ch21s01s01.htm |website=Symantec |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |year=2013}}</ref><ref name="BackupExecBlockLevel">{{cite web |last1=Harbaugh |first1=Logan |title=Developing a Real Backup Plan with Symantec's Backup Exec 15 |url=https://edtechmagazine.com/higher/article/2015/10/developing-real-backup-plan-symantecs-backup-exec-15 |website=EdTech |publisher=CDW LLC |accessdate=14 January 2018 |date=Fall 2015}}</ref> and may save time.<ref name="EnterpriseBackupChallenges" /><ref name="WhitehouseFile-levelBlock-levelDedup">{{cite web |last1=Whitehouse |first1=Lauren |title=The pros and cons of file-level vs. block-level data deduplication technology |url=http://searchdatabackup.techtarget.com/tip/The-pros-and-cons-of-file-level-vs-block-level-data-deduplication-technology |website=TechTarget |publisher=Tech Target Inc. |accessdate=13 November 2017 |date=September 2008}}</ref> Such [[Backup#Files|partial file copying]] is especially applicable to a [[database]].
; "Instant" scanning of client volumes: Uses the [[USN Journal]] on Windows NTFS and [[FSEvents]] on macOS to reduce the time spent scanning<ref name="RetrospectKnowledgeBase" /> both on incremental backups, fitting more sources into the [[Backup#Limitations|backup window]],<ref name="EnterpriseBackupChallenges" /><ref name="NetBackupAccelerator">{{cite web |title=About the Accelerator feature in NetBackup 7.5 |url=https://www.veritas.com/support/en_US/article.000086263 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=10 November 2017}}</ref><ref name="BackupExecDeterminingIfFileBackedUp">{{cite web |title=Veritas Backup Exec Administrator's Guide: How Backup Exec determines if a file has been backed up |url=https://www.veritas.com/content/support/en_US/doc/59226269-99535599-0/v63768146-99535599 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=7 February 2018 |date=11 November 2017}}</ref> and on restores.<ref name="TidBITSMac10">{{cite web |last1=Engst |first1=Adam |title=Retrospect 10 Reduces Backup Time with Instant Scan Technology |url=https://tidbits.com/article/13379 |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=25 October 2016 |date=6 November 2012}}</ref>
; Cramming or evading the [[Backup#Limitations|backup window]]: One application has the "multiplexed backup" capability of cramming the [[Backup#Limitations|backup window]] by sending data from multiple clients to a single tape drive simultaneously; "this is useful for low end clients with slow throughput ... [that] cannot send data fast enough to keep the tape drive busy .... will reduce the performance of restores."<ref name="NetBackupMultistreamMultiplex" /> Another application allows an enterprise that has computers transiently connecting to the network over a long workday to evade the window by using [[Retrospect (software)#Small-group features|Proactive scripts]].
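The block-level incremental idea described above can be illustrated with a minimal sketch. It is not the method of any particular product: the fixed 4096-byte block size, the use of SHA-256 digests to detect changes, and the function names are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; real products choose their own


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old_hashes, new_data, block_size=BLOCK_SIZE):
    """Return {block_index: block_bytes} for blocks that are new or
    whose digest differs from the one recorded at the previous backup."""
    changed = {}
    for index, digest in enumerate(block_hashes(new_data, block_size)):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * block_size
            changed[index] = new_data[start:start + block_size]
    return changed
```

Only the returned blocks would need to be copied to the backup. A real implementation would also have to record the new file length so that a truncated file can be restored correctly, which this sketch leaves out.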
=== Source file integrity ===
; Backing up interactive applications : [[Interactive computing|Such applications]] must be protected by having their services [[quiesce|paused]] while their [[Backup#Live data|live data]] is being backed up, and then [[:wikt:unpause|unpaused]].<ref name="EnterpriseBackupSoftware: WorkstationsEmailDatabases">{{cite web |last1=Rassokhin? |first1=Alexander? |title=Enterprise Backup Software: Backup Network Workstations, Email and Databases |url=http://www.backupschedule.net/enterprise-backup.html |website=All about Backup |publisher=Novosoft LLC |accessdate=24 January 2018 |year=2012}}</ref> Some enterprise backup applications accomplish pausing/unpausing of services via built-in provisions—for many specific databases and other interactive applications—that become automatically part of the backup software's script execution; these provisions [[Retrospect (software)#Editions and Add-Ons|may be purchased separately]].<ref name="NetBackupDatabase&AppAgentCompatibility">{{cite web |title=Veritas NetBackup ™ 8.0 – 8.x.x Database and Application Agent Compatibility List |url=https://www.veritas.com/content/support/en_US/doc/NB_80_DBSCL |website=Veritas |publisher=Veritas Technologies LLC (US) |accessdate=19 November 2017 |date=17 November 2017}}</ref><ref name="BackupExecAgents&Options">{{cite web |title=Backup Exec TM 16 Agents and Options |url=https://www.veritas.com/content/dam/Veritas/docs/data-sheets/be16-agents-and-options.pdf |website=Veritas |publisher=Veritas Technologies LLC |accessdate=14 January 2018 |year=2016}}</ref> However, another application has added [[Scripting language#Extension/embeddable languages|"script hooks"]] that enable the optional automatic execution—at specific events during runs of a GUI-coded backup script—of portions of an external script containing commands pre-written in a standard [[scripting language]].
Since the external script is provided by an installation's backup administrator, its code activated by the "script hooks" may accomplish not only data protection—via pausing/unpausing interactive services—but also [[Backup#User interface|integration with monitoring systems]].<ref name="RetrospectMac14UG" />
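The pause/unpause pattern described above can be sketched as follows. This is a generic illustration, not the hook interface of any backup application: the function names and the idea of passing the pause, resume and copy steps as callables are assumptions.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(pause, resume):
    """Pause a service for the duration of the block and always resume it,
    even if the backup step raises an exception."""
    pause()
    try:
        yield
    finally:
        resume()


def run_backup(copy_data, pause, resume):
    """Copy live data while the owning service is paused."""
    with quiesced(pause, resume):
        copy_data()
```

The `finally` clause is the important design choice: an interactive service left paused after a failed backup would be a worse outcome than the missed backup itself.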
=== User interface ===
To accommodate the requirements of a backup administrator who may not be part of the IT staff with access to the secure server area, enterprise client-server software may include features such as:
; Administration Console:
:The backup administrator's backup server [[GUI]] management and near-term reporting tool.<ref name="DorionBackupReportingTool" /> Its window shows the selected backup server, with a standard toolbar on top. A sidebar on the left or navigation bar shows the clickable categories of backup server information for it; each category shows a panel, which may have a specialized toolbar below or in place of the standard toolbar. The built-in categories include activities (thus providing [[Backup#Measuring the process|monitored backup]]), past backups of each individual source, scripts/policies/jobs (terminology depending on the application), sources (directly/indirectly), sets of backups, and storage devices.<ref name="RetrospectMac14UG" /><ref name="NetBackupAdminGuideVol.1">{{cite web |title=Symantec NetBackup ™ Administrator's Guide, Volume I Windows |url=http://www-personal.umich.edu/~danno/symantec/NetBackup_AdminGuideI_WinServer.pdf |website=Symantec |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |pages=35–45(Administration Console), 833–843(Activity Monitor), 888–894(Reports utility), 912(Remote Administration Console), 915–938(Java Console) |year=2012}}</ref><ref name="BackupExecAdminConsole">{{cite web |title=Symantec Backup Exec: About the Administration Console |url=http://backup-exec.helpmax.net/en/introducing-backup-exec/about-the-administration-console/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=10 December 2017}}</ref>
; User-initiated backups and restores: These supplement the administrator-initiated backups and restores which backup applications have always had, and relieve the administrator of time-consuming tasks.<ref name="DorionBackupAdminRole" /> The user designates the date of the past backup from which files or folders are to be restored—once IT staff has mounted the proper backup volume on the backup server.<ref name="FernandoCombineDiskTapeBenefits" /><ref name="RetrospectMac14UG" /><ref name="NetBackupOperationalRestore">{{cite web |title=OpsCenter Operational Restore |url=https://www.veritas.com/support/en_US/article.100038022 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=12 March 2012}}</ref><ref name="BackupExecUserRetrieve">{{cite web |title=How Backup Exec Retrieve works |url=http://backup-exec.helpmax.net/en/using-backup-exec-retrieve/how-backup-exec-retrieve-works/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=14 January 2018}}</ref>
; High-level/long-term reports supplementing the Administration Console<ref name="DorionBackupReportingTool" />: Within one application's Console panel displayed by clicking the name of the backup server itself in the sidebar, an activities pane on the top left of the displayed [[Dashboard (business)|Dashboard]] has a moving bar graph for each activity going on for the backup server together with a pause and stop button for the activity. Three more panes give the results of activities in the past week: backups each day, sources backed up, and sources not backed up. Finally a storage pane has a line for each set of backups, showing the last-modified date and depictions of the total bytes used and available.<ref name="TitBITSMacintosh11" /><ref name="RetrospectMac14UG" /> For the application's Windows variant, the Dashboard acts as a display-only substitute for a non-existent Console.<ref name="RetrospectWindows12UG" /> Other applications have a separate reporting facility that can cover multiple backup servers.<ref name="NetBackupOperationsManager">{{cite web |last1=Antony |first1=Erica |author2=Tim Burlowski |title=NetBackup Operations Manager: Monitoring, Alerting and Reporting for Veritas NetBackup |url=https://vox.veritas.com/t5/Articles/NetBackup-Operations-Manager-Monitoring-Alerting-and-Reporting/ta-p/806080 |website=Symantec |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |pages=4–5(monitoring), 6–7(alerting), 7(3rdPartyEventMgmt.), 11–18(reporting) |format=PDF attachment |date=January 2008}}</ref><ref name="BackupExecEnterpriseDataProtection">{{cite web |title=Windows® Enterprise Data Protection with Symantec Backup Exec™ |url=http://www.r2gen.com.br/images/symantec/pdf/symantec_protegendo_sua_empresa.pdf |website=Symantec |publisher=Veritas Technologies LLC |accessdate=14 January 2018 |pages=5–8 (CASO) |format=PDF |year=2007}}</ref>
; E-mailing of notifications about operations to chosen recipients<ref name="DorionBackupReportingTool" />: Can alert the recipient to, e.g., errors or warnings, with a log to assist in pinpointing problems.<ref name="RetrospectWindows12UG" /><ref name="NetBackupOperationsManager" /><ref name="BackupExecConfigureNotifications">{{cite web |title=How to configure notification recipients in Backup Exec 12.0 and above |url=https://www.veritas.com/support/en_US/article.100016176 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=10 November 2017}}</ref>
; Integration with monitoring systems<ref name="DorionBackupReportingTool" />: Such systems provide [[Backup#Measuring the process|backup validation]]. One application's administrators can deploy custom scripts that—invoking [[webhook]] code via [[Backup#Source file integrity|script hooks]]—populate such systems as the freeware [[Nagios]] and [[IFTTT]] and the [[freemium]] [[Slack (software)|Slack]] with script successes and failures corresponding to the activities category of the Console, per-source backup information corresponding to the past backups category of the Console, and media requests.<ref name="RetrospectMac14UG" /> Another application has integration with two of the developer's monitoring systems, one that is part of the client-server backup application and one that is more generalized.<ref name="NetBackupOperationsManager" /> Yet another application has integration with a monitoring system that is part of the client-server backup application,<ref name="BackupExecJobMonitor">{{cite web |title=Veritas Backup Exec Administrator's Guide: About the Job Monitor |url=https://www.veritas.com/content/support/en_US/doc/59226269-99535599-0/v76313540-99535599 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=11 November 2017}}</ref> but can also be integrated with Nagios.<ref name="BackupExecNagiosPlugins">{{cite web |title=Nagios plugins for monitoring BackupExec |url=https://exchange.nagios.org/directory/Plugins/Backup-and-Recovery/BackupExec |website=Nagios Exchange |publisher=Nagios Enterprises |accessdate=15 January 2018}}</ref>
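As an illustration of how a custom script might report a script success or failure to a monitoring system through a webhook, the sketch below builds an HTTP POST request carrying a JSON message. The URL, the function name, and the single `text` field are assumptions; each monitoring system defines its own endpoint and payload format, so the payload would need to be adapted.

```python
import json
import urllib.request


def build_webhook_request(url, script_name, succeeded):
    """Build (but do not send) a JSON POST request reporting a backup result."""
    status = "succeeded" if succeeded else "failed"
    payload = {"text": f"Backup script '{script_name}' {status}"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

A hook script would pass the returned object to `urllib.request.urlopen` to deliver the notification. Building the request separately from sending it keeps the message format easy to check without network access.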
=== LAN/WAN/Cloud ===
; Advanced network client support: All applications include support for multiple network interfaces.<ref name="EnterpriseBackupChallenges" /><ref name="EMCRetroMac8">{{cite web |title=EMC Announces Retrospect 8.0 Backup and Recovery Software For Mac |url=http://www.infotomorrowmag.com/about/news/press/2009/20090106-02.htm |website=DellEMC [current] |publisher=EMC Corp. [orig. publisher] |accessdate=10 November 2016 |date=6 January 2009}}</ref><ref name="BackupExecConfiguringNetworkOptionsBackup">{{cite web |title=Veritas Backup Exec Administrator's Guide: Configuring network options for backup jobs |url=https://www.veritas.com/content/support/en_US/doc/59226269-99535599-0/v96257307-99535599 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=17 November 2017}}</ref> However, one application, unless [[Data deduplication#Source versus target deduplication|deduplication is done by a separate sub-application between the client and the backup server]], cannot provide "resilient network connections" for machines on a WAN.<ref name="NetBackupDeduplication">{{cite web |title=Veritas NetBackup™ Deduplication Guide |url=https://www.veritas.com/content/support/en_US/doc/ka6j00000000ADEAA2 |website=Veritas |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |page=171(Resilient network properties) |format=PDF |year=2016}}</ref> One application can extend support to "remote" clients anywhere on the Internet for a [[Retrospect (software)#Small-group features|Proactive]] script and for [[Backup#User interface|user-initiated backups/restores]].<ref name="RetrospectKnowledgeBase" />
; Cloud seeding and Large-Scale Recovery: Because of a large amount of data already backed up,<ref name="EnterpriseBackupChallenges" /> an enterprise adopting [[Backup#Storage_media|cloud backup]] likely will need to do [[Seed loading|"seeding"]]. This service copies a large volume of locally stored backup data onto a large-capacity disk device, which is then physically shipped to the cloud storage site and uploaded.<ref name="WhatIsAWSSnowball?">{{cite web |title=What Is an AWS Snowball Appliance? |url=https://docs.aws.amazon.com/snowball/latest/ug/whatissnowball.html |website=AWS |publisher=Amazon.com |accessdate=8 March 2018 |year=2018}}</ref><ref name="RouseCloudSeedingDef">{{cite web |last1=Rouse |first1=Margaret |title=Definition: cloud seeding |url=http://searchdatabackup.techtarget.com/definition/cloud-seeding |website=TechTarget |publisher=Tech Target Inc. |accessdate=16 November 2017 |date=December 2011}}</ref> After the large initial upload, the enterprise's backup software can be reconfigured to read from and write to the backup incrementally in its cloud location.<ref name="RetrospectChangingPathsMac">{{cite web |title=Changing paths Cloud Mac |url=https://www.youtube.com/watch?v=Ac3BhXO4T1g |via=YouTube |publisher=Retrospect Inc. |accessdate=7 October 2016 |format=Video |date=29 February 2016}}</ref> The service may need to be employed in reverse for faster [[Disaster_recovery|large-scale data recovery]] times than would be possible via an Internet connection.<ref name="WhatIsAWSSnowball?" 
/> Some applications offer seeding and large-scale recovery via third-party services, which may use a high-speed Internet channel to/from cloud storage rather than a shippable physical device.<ref name="NetBackupAmazonStorageGateway">{{cite web |last1=High |first1=Dave |author2=Mahmud, Fozz |title=NBU and the Amazon Storage Gateway VTL |url=https://www.youtube.com/watch?v=rU1rFK9o20s |website=Veritas |publisher=Veritas Technologies LLC |accessdate=17 January 2018 |format=Video |date=10 March 2016}}</ref><ref name="BackupExecCloudConnector">{{cite web |title=Backup Exec 16: Best Practices for Using the Veritas Backup Exec Cloud Connector |url=https://www.veritas.com/content/support/en_US/doc/72686287-129480082-0/v128967126-129480082 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=25 October 2017}}</ref>
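The reason seeding is attractive can be seen from simple arithmetic on transfer time. The sketch below uses assumed, purely illustrative figures (100 terabytes of existing backup data and a link that sustains 100 megabits per second); neither figure comes from the cited sources.

```python
SECONDS_PER_DAY = 86_400


def upload_days(data_bytes, link_bits_per_second):
    """Days needed to move data_bytes over a link sustaining the given rate."""
    return data_bytes * 8 / link_bits_per_second / SECONDS_PER_DAY


# Assumed example: 100 TB over a sustained 100 Mbit/s link.
print(round(upload_days(100e12, 100e6), 1))  # 92.6
```

Under these assumptions the initial upload would take about three months, which is why physically shipping a storage device, or using a much faster dedicated channel, may be preferred for the first full copy and for large-scale recovery.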
== See also ==
;About backup
* Backup software
** [[List of backup software]]
* [[Glossary of backup terms]]
* [[Remote backup service]]
* [[Virtual backup appliance]]
;Related topics
* [[Data consistency]]
* [[Data degradation]]
* [[Data proliferation]]
* [[Database dump]]
* [[Digital preservation]]
* [[Disaster recovery and business continuity auditing]]
* [[File synchronization]]
* [[Information repository]]
== Notes ==
{{reflist|group=note}}
== References ==
{{Reflist|2}}
== External links ==
{{Wiktionary|back up}}
{{Wiktionary|backup}}
{{Commons category|Backup}}
[[Category:Data security]]
[[Category:Backup| ]]' |
Unified diff of changes made by edit (edit_diff ) | '@@ -1,196 +1,1 @@
-{{about|backup in computer systems|other uses}}
-{{Use dmy dates|date=August 2018}}
-
-In [[information technology]], a '''backup''', or the process of backing up, refers to the copying into an [[archive file]] of computer [[data]] so it may be used to restore the original after a [[data loss]] event. The verb form is [[wikt:back up|"back up"]] (a [[phrasal verb]]), whereas the noun and adjective form is [[wikt:backup|"backup"]].<ref name="AHDictionaryBackup">{{cite web |title=back•up |url=https://www.ahdictionary.com/word/search.html?q=backup |website=The American Heritage Dictionary of the English Language |publisher=Houghton Mifflin Harcourt |accessdate=9 May 2018 |year=2018}}</ref>
-
-Backups have two distinct purposes. The primary purpose is to recover data after its loss, be it by [[File deletion|data deletion]] or [[Data corruption|corruption]]. Data loss can be a common experience of computer users; a 2008 survey found that 66% of respondents had lost files on their home PC.<ref>[http://www.kabooza.com/globalsurvey.html Global Backup Survey] {{Webarchive|url=https://web.archive.org/web/20100327235844/http://www.kabooza.com/globalsurvey.html |date=27 March 2010 }}. Retrieved 15 February 2009</ref> The secondary purpose of backups is to recover data from an earlier time, according to a user-defined [[data retention]] policy, typically configured within a [[Backup software|backup application]] for how long copies of data are required.<ref name="NelsonPro11">{{cite book |url=https://books.google.com/books?id=r4uEEsq3CJYC&printsec=frontcover |title=Pro Data Backup and Recovery |chapter=Chapter 1: Introduction to Backup and Recovery |author=Nelson, S. |publisher=Apress |pages=1–16 |year=2011 |isbn=978-1-4302-2663-5 |accessdate=8 May 2018}}</ref> Though backups represent a simple form of [[disaster recovery]] and should be part of any [[disaster recovery plan]], backups by themselves should not be considered a complete disaster recovery plan. One reason for this is that not all backup systems are able to reconstitute a computer system or other complex configuration such as a [[computer cluster]], [[active directory]] server, or [[database server]] by simply restoring data from a backup.<ref name="CougiasTheBackup03">{{cite book |url=https://books.google.com/books?id=eLviiTag5A0C&pg=PA1 |title=The Backup Book: Disaster Recovery from Desktop to Data Center |chapter=Chapter 1: What's a Disaster Without a Recovery? |author=Cougias, D.J.; Heiberger, E.L.; Koop, K. |publisher=Network Frontiers |pages=1–14 |year=2003 |isbn=0-9729039-0-9}}</ref>
-
-Since a backup system contains at least one copy of all data considered worth saving, the [[computer data storage|data storage]] requirements can be significant. Organizing this storage space and managing the backup process can be a complicated undertaking. A data repository model may be used to provide structure to the storage. Nowadays, there are many different types of [[data storage device]]s that are useful for making backups. There are also many different ways in which these devices can be arranged to provide geographic redundancy, [[data security]], and portability.
-
-Before data are sent to their storage locations, they are selected, extracted, and manipulated. Many different techniques have been developed to optimize the backup procedure. These include optimizations for dealing with open files and live data sources as well as compression, encryption, and [[Data deduplication|de-duplication]], among others. Every backup scheme should include [[Dry run (testing)|dry runs]] that validate the reliability of the data being backed up. It is important to recognize the limitations and human factors involved in any backup scheme.
-
-== Storage, the base of a backup system ==
-
-=== Data repository models ===
-Any backup strategy starts with a concept of a data repository. The backup data needs to be stored, and probably should be organized to a degree. The organisation could be as simple as a sheet of paper with a list of all backup media (CDs, etc.) and the dates they were produced. A more sophisticated setup could include a computerized index, catalog, or relational database. Different approaches have different advantages. Part of the model is the [[backup rotation scheme]].<ref name="DeanComp09">{{cite book |url=https://books.google.com/books?id=1QEMAAAAQBAJ&pg=PA602 |title=CompTIA Network+ 2009 in Depth |chapter=Chapter 14: Ensuring Integrity and Availability |author=Dean, T. |publisher=Cengage Learning |pages=571–614 |year=2009 |isbn=978-1-59863-878-3 |accessdate=8 May 2018}}</ref>
-
-; Unstructured : An unstructured repository may simply be a stack of tapes or CD-Rs or DVD-Rs with minimal information about what was backed up and when. This is the easiest to implement, but probably the least likely to achieve a high level of recoverability as it lacks automation.
-; Full only / [[system image|System imaging]] : A repository of this type contains complete system images taken at one or more specific points in time.<ref name="DeanComp09" /> This technology is frequently used by computer technicians to record known good configurations. Imaging<ref>{{Cite web |title=Five key questions to ask about your backup solution |url=http://sysgen.ca/five-key-backup-questions/ |website=sysgen.ca |accessdate=23 September 2015 |archive-url=https://web.archive.org/web/20160304042343/http://sysgen.ca/five-key-backup-questions/ |archive-date=4 March 2016 |dead-url=no |df=dmy-all}}</ref> is generally more useful for deploying a standard configuration to many systems rather than as a tool for making ongoing backups of diverse systems.
-; [[Incremental backup|Incremental]] : An incremental style repository aims to make it more feasible to store backups from more points in time by organizing the data into increments of change between points in time. This eliminates the need to store duplicate copies of unchanged data: with full backups a lot of the data will be unchanged from what has been backed up previously.<ref name="DeanComp09" /> Typically, a ''full'' backup (of all files) is made on one occasion (or at infrequent intervals) and serves as the reference point for an incremental backup set. After that, a number of ''incremental'' backups are made after successive time periods. Restoring the whole system to the date of the last incremental backup would require starting from the last full backup taken before the data loss, and then applying in turn each of the incremental backups since then.<ref>[http://www.tech-faq.com/incremental-backup.shtml Incremental Backup] {{Webarchive|url=https://web.archive.org/web/20160621090117/http://www.tech-faq.com/incremental-backup.shtml |date=21 June 2016 }}. Retrieved 10 March 2006</ref> Additionally, some backup systems can reorganize the repository to synthesize full backups from a series of incrementals.
-; [[Differential backup|Differential]] : Each differential backup saves the data that has changed since the last full backup.<ref name="DeanComp09" /> It has the advantage that only a maximum of two data sets are needed to restore the data. One disadvantage, compared to the incremental backup method, is that as time from the last full backup (and thus the accumulated changes in data) increases, so does the time to perform the differential backup. Restoring an entire system would require starting from the most recent full backup and then applying just the last differential backup since the last full backup.
-:: Note: Vendors have standardized on the meaning of the terms "incremental backup" and "differential backup." However, there have been cases where conflicting definitions of these terms have been used. The most relevant characteristic of an incremental backup is which reference point it uses to check for changes. By standard definition, a differential backup copies files that have been created or changed since the last full backup, regardless of whether any other differential backups have been made since then, whereas an incremental backup copies files that have been created or changed since the most recent backup of any type (full or incremental). Other variations of incremental backup include multi-level incrementals and incremental backups that compare parts of files instead of just the whole file.
-; Reverse delta : A reverse delta type repository stores a recent "mirror" of the source data and a series of differences between the mirror in its current state and its previous states. A reverse delta backup will start with a normal full backup. After the full backup is performed, the system will periodically synchronize the full backup with the live copy, while storing the data necessary to reconstruct older versions.<ref name="LeonSoftware15">{{cite book |url=https://books.google.com/books?id=pYcTBwAAQBAJ&pg=PA65 |title=Software Configuration Management Handbook |author=Leon, A. |publisher=Artech House |page=65 |year=2015 |isbn=978-1-60807-844-8 |accessdate=8 May 2018}}</ref> This can either be done using [[hard links]], or using binary [[data comparion|diffs]]. This system works particularly well for large, slowly changing, data sets.
-; [[Continuous data protection]] : Instead of scheduling periodic backups, the system immediately logs every change on the host system. This is generally done by saving byte or block-level differences rather than file-level differences.<ref>[http://www.sertdatarecovery.com/business-data-backup-disaster-recovery-planning-resource.html Continuous Protection white paper] {{Webarchive|url=https://web.archive.org/web/20160304072358/http://www.sertdatarecovery.com/business-data-backup-disaster-recovery-planning-resource.html |date=4 March 2016 }}. (1 October 2005). Retrieved 10 March 2007</ref> It differs from simple [[disk mirroring]] in that it enables a roll-back of the log and thus restoration of old images of data.
-
-=== Storage media ===
-[[File:DVD, USB flash drive and external hard drive.jpg|thumb|right|From left to right, a [[DVD]] disc in plastic cover, a [[USB flash drive]] and an [[external hard drive]]]]
-Regardless of the repository model that is used, the data has to be stored on some data storage medium.
-
-; [[Magnetic tape data storage|Magnetic tape]] : Magnetic tape has long been the most commonly used medium for bulk data storage, backup, archiving, and interchange. Tape has typically had an order of magnitude better capacity-to-price ratio when compared to hard disk, but the ratios for tape and hard disk have become closer.<ref>[http://www.storagesearch.com/engenio-art2.html Disk to Disk Backup versus Tape – War or Truce?] {{Webarchive|url=https://web.archive.org/web/20160712235906/http://www.storagesearch.com/engenio-art2.html |date=12 July 2016 }} (9 December 2004). Retrieved 10 March 2007</ref> [[Magnetic tape data storage#Chronological list of tape formats|Many tape formats have been]] proprietary or specific to certain markets like mainframes or a particular brand of personal computer, but by 2014 [[Linear Tape-Open#Market performance|LTO]] was edging out two other remaining viable "super" formats—[[IBM 3592]] (now also referred to as the TS11xx series) and [[StorageTek tape formats#T10000|Oracle StorageTek T10000]],<ref name="ForbesKeepingDataLongTime">{{cite web |last1=Coughlin |first1=Tom |title=Keeping Data for a Long Time |url=https://www.forbes.com/sites/tomcoughlin/2014/06/29/keeping-data-for-a-long-time/ |website=Forbes |publisher=Forbes Media LLC |accessdate=19 April 2018 |date=29 June 2014 |at=para. Magnetic Tapes(popular formats, storage life), para. Hard Disk Drives(active archive), para. First consider flash memory in archiving(... may not have good media archive life)}}</ref> and [[Digital Data Storage#Future|further development of the smaller-capacity DDS format had been canceled]]. 
By 2017 [[Spectra Logic]], which builds [[tape library|tape libraries]] for both the LTO and TS11xx formats, was predicting that "Linear Tape Open (LTO) technology has been and will continue to be the primary tape technology."<ref name="SpectraLogicDigitalDataStorageOutlook2017">{{cite web |title=Digital Data Storage Outlook 2017 |url=https://spectralogic.com/wp-content/uploads/white-paper-digital-data-storage-outlook-2017-v3.pdf |website=Spectra |publisher=Spectra Logic |accessdate=11 July 2018 |page=14(Tape) |format=PDF |year=2017}}</ref> Tape is a [[sequential access]] medium, so even though access times may be poor, the rate of continuously writing or reading data can actually be very fast.
-; [[Hard disk]]: The capacity-to-price ratio of hard disks has been improving for many years, making them more competitive with magnetic tape as a bulk storage medium. The main advantages of hard disk storage are low access times, availability, capacity and ease of use.<ref>{{cite web |url=http://www.tomshardware.com/2007/04/18/bye_bye_tape/ |title=Bye Bye Tape, Hello 5.3TB eSATA |accessdate=22 April 2007}}</ref> External disks can be connected via local interfaces like [[SCSI]], [[USB]], [[FireWire]], or [[eSATA]], or via longer distance technologies like [[Ethernet]], [[iSCSI]], or [[Fibre Channel]]. Some disk-based backup systems, via [[Virtual tape library|Virtual Tape Libraries]] or otherwise, support [[data deduplication]], which can dramatically reduce the amount of disk storage capacity consumed by daily and weekly backup data.<ref name="RetrospectWindows12UG">{{cite web |title=Retrospect ® 12 Windows User's Guide |url=http://download.retrospect.com/docs/win/v12/user_guide/Retrospect_Win_User_Guide-EN.pdf |website=Retrospect |publisher=Retrospect Inc. 
|accessdate=2 September 2018 |format=PDF |year=2017 |pages=30-31(deduplication via Snapshots), 41-43(removable disk drives), 31-32(Dashboard), 216-218(selector as subset filter for synthetic full backups), 426-427(E-mail)}}</ref><ref>{{Cite web |url=http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html |title=Symantec Shows Backup Exec a Little Dedupe Love; Lays out Source Side Deduplication Roadmap – DCIG |website=DCIG |access-date=26 February 2016 |archive-url=https://web.archive.org/web/20160304212819/http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html |archive-date=4 March 2016 |dead-url=no |df=dmy-all}}</ref><ref name="NetBackupDeduplicationGuide">{{cite web |title=Veritas NetBackup™ Deduplication Guide |url=https://www.veritas.com/content/support/en_US/doc/ka6j00000000ADEAA2 |website=Veritas |publisher=Veritas Technologies LLC |accessdate=26 July 2018 |year=2016}}</ref> One disadvantage of hard disk backups vis-a-vis tape is that hard drives are [[Hard disk drive#Magnetic recording|close-tolerance mechanical devices]] and may be more easily damaged, especially while being transported (e.g., for off-site backups).<ref name="PCWorldHardCoreDataPreservation">{{cite web |last1=Jacobi |first1=John L. |title=Hard-core data preservation: The best media and methods for archiving your data |url=https://www.pcworld.com/article/2984597/storage/hard-core-data-preservation-the-best-media-and-methods-for-archiving-your-data.html |website=PC World |accessdate=19 April 2018 |date=29 Feb 2016 |at=sec. 
External Hard Drives(on the shelf, magnetic properties, mechanical stresses, vulnerable to shocks)}}</ref> In the mid-2000s, several drive manufacturers began to produce portable drives employing [[Hard disk drive failure#Unloading|ramp loading and accelerometer]] technology (sometimes termed a "shock sensor"),<ref name="HGSTRampLoadUnload">{{cite web |title=Ramp Load/Unload Technology in Hard Disk Drives |url=https://www.hgst.com/sites/default/files/resources/LoadUnload_white_paper_FINAL.pdf |website=HGST |publisher=Western Digital |accessdate=29 June 2018 |page=3(sec. Enhanced Shock Tolerance) |format=PDF |date=November 2007}}</ref><ref name="ToshibaCanvio3.0PortableHDD">{{cite web |title=Toshiba Portable Hard Drive (Canvio® 3.0) |url=https://www.toshibadata.com.sg/Product-Canvio-Portable-Hard-Drive.aspx |website=Toshiba Data Dynamics Singapore |publisher=Toshiba Data Dynamics Pte Ltd |accessdate=16 June 2018 |year=2018 |at=sec. Overview(Internal shock sensor and ramp loading technology)}}</ref> and—by 2010—the industry average in drop tests for drives with that technology showed drives remaining intact and working after a 36-inch non-operating drop onto industrial carpeting.<ref name="IomegaDropShock">{{cite web |title=Iomega ® Drop Guard ™ Technology |url=https://www.doc-developpement-durable.org/file/Projets-informatiques/Drop%20Guard-disque-dur-tres-solide.pdf |website=Hard Drive Storage Solutions |publisher=Iomega Corp. |accessdate=12 July 2018 |pages=2(What is Drop Shock Technology?, What is Drop Guard Technology? (... 
40% above the industry average)), 3(*NOTE) |date=20 September 2010}}</ref> The manufacturers do not, however, guarantee these results and note that a drive may fail to survive even a shorter drop.<ref name="IomegaDropShock" /> Some manufacturers also offer 'ruggedized' portable hard drives, which include a shock-absorbing case around the hard disk, and claim a range of higher drop specifications.<ref name="PCMagBestRuggedHDDs&SSDs">{{cite web |last1=Burek |first1=John |title=The Best Rugged Hard Drives and SSDs |url=https://www.pcmag.com/roundup/361072/the-best-rugged-hard-drives-and-ssds |website=PC Magazine |publisher=Ziff Davis |accessdate=4 August 2018 |at=What Exactly Makes a Drive Rugged?(When a drive is encased ... you're mostly at the mercy of the drive vendor to tell you the rated maximum drop distance for the drive) |date=15 May 2018}}</ref><ref name="WirecutterBestPortableHardDrive2017Don'tBuy">{{cite web |last1=Krajeski |first1=Justin |last2=Streams |first2=Kimber |title=The Best Portable Hard Drive |url=https://web.archive.org/web/20170331161821/http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive |work=The New York Times |accessdate=4 August 2018 |archiveurl=https://web.archive.org/web/20170331161821/http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive |archivedate=31 March 2017 |date=20 March 2017}}</ref> Another disadvantage is that over a period of years the stability of hard disk backups is shorter than that of tape backups.<ref name="ForbesKeepingDataLongTime" /><ref name="IronMountainBestLong-TermDataArchiveSolutions">{{cite web |title=Best Long-Term Data Archive Solutions |url=http://www.ironmountain.com/resources/general-articles/b/best-long-term-data-archive-solutions |website=Iron Mountain |publisher=Iron Mountain Inc. |accessdate=19 April 2018 |year=2018 |at=sec. More Reliable(average mean time between failure ... 
rates, best practice for migrating data)}}</ref><ref name="PCWorldHardCoreDataPreservation" />
-; [[Optical storage]] : Recordable [[CD]]s, [[DVD]]s, and [[Blu-ray Disc]]s are commonly used with personal computers and generally have low media unit costs. However, the capacities and speeds of these and other optical discs have traditionally been lower than those of hard disks or tapes (though advances in optical media are slowly shrinking that gap<ref name="WanOptical14">{{cite journal |title=Optical storage: An emerging option in long-term digital preservation |journal=Frontiers of Optoelectronics |author=Wan, S.; Cao, Q.; Xie, C. |volume=7 |issue=4 |pages=486–492 |year=2014 |doi=10.1007/s12200-014-0442-2}}</ref><ref name="ZhangHigh18">{{cite journal |title=High-capacity optical long data memory based on enhanced Young's modulus in nanoplasmonic hybrid glass composites |journal=Nature Communications |author=Zhang, Q.; Xia, Z.; Cheng, Y.-B.; Gu, M. |volume=9 |pages=1183 |year=2018 |doi=10.1038/s41467-018-03589-y}}</ref>). Many optical disc formats are [[Write Once Read Many|WORM]] type, which makes them useful for archival purposes since the data cannot be changed. The use of an auto-changer or jukebox can make optical discs a feasible option for larger-scale backup systems. Some optical storage systems allow for cataloged data backups without human contact with the discs, allowing for longer data integrity.
-; [[SSD]]/[[Solid state storage]] : Also known as [[flash memory]], [[thumb drive]]s, [[USB flash drive]]s, [[CompactFlash]], [[SmartMedia]], [[Memory Stick]], [[Secure Digital card]]s, etc., these devices are relatively expensive for their low capacity in comparison to hard disk drives, but are very convenient for backing up relatively low data volumes. Unlike its magnetic drive counterpart, a [[solid-state drive]] contains no moving parts, which makes it less susceptible to physical damage, and it can have high throughput, on the order of 500 Mbit/s to 6 Gbit/s. The capacity offered by SSDs continues to grow and prices are gradually decreasing as they become more common.<ref name="MicheloniSolid17">{{cite journal |url=https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8013049 |title=Solid-State Drives (SSDs) |journal=Proceedings of the IEEE |author=Micheloni, R.; Olivo, P. |volume=105 |issue=9 |pages=1586–88 |year=2017 |doi=10.1109/JPROC.2017.2727228 |accessdate=8 May 2018}}</ref><ref name="PCMagBestRuggedHDDs&SSDs" /> Over a period of years the stability of flash memory backups is shorter than that of hard disk backups.<ref name="ForbesKeepingDataLongTime" />
-; [[Remote backup service|Remote backup service AKA cloud backup]] : As [[broadband Internet access]] becomes more widespread, remote backup services are gaining in popularity. Backing up via the Internet to a remote location can protect against events such as fires, floods, or earthquakes which could destroy locally-stored backups.<ref name="DellEMC">{{cite web |url=https://www.emc.com/corporate/glossary/remote-backup.htm |title=Remote Backup |work=EMC Glossary |publisher=Dell, Inc |accessdate=8 May 2018}}</ref> There are, however, a number of drawbacks to remote backup services. First, Internet connections are usually slower than local data storage devices. Residential broadband is especially problematic, as routine backups must use an upstream link that is usually much slower than the downstream link, which is used only occasionally to retrieve a file from backup. This tends to limit the use of such services to relatively small amounts of high-value data, even if a particular service provides initial [[seed loading]]. Second, users must trust a third-party service provider to maintain the privacy and integrity of their data, although confidentiality can be assured by encrypting the data before transmission to the backup service with an [[key (cryptography)|encryption key]] known only to the user. Ultimately the backup service must itself use one of the above methods, so this could be seen as a more complex way of doing traditional backups.
-; [[Floppy disk]] and its derivatives : During the 1980s and early 1990s, many personal/home computer users associated backing up mostly with copying to floppy disks. However, the data capacity of floppy disks did not keep pace with growing demands, rendering them effectively obsolete. Later "[[superfloppy]]" devices and [[Iomega REV|related "non-floppy"]] devices provide greater storage capacity and remain supported as backup media by some developers.<ref name="RetrospectWindows12UG" />
-
-=== Managing the data repository ===
-Regardless of the data repository model, or data storage media used for backups, a balance needs to be struck between accessibility, security and cost. These media management methods are not mutually exclusive and are frequently combined to meet the user's needs. Using on-line disks for staging data before it is sent to a near-line [[tape library]] is a common example.
-
-Data repository implementations include<ref name="StackpoleSoftware07">{{cite book |url=https://books.google.com/books?id=gjAhVzuV7k0C&pg=PA164 |title=Software Deployment, Updating, and Patching |author=Stackpole, B.; Hanrion, P. |publisher=CRC Press |pages=164–165 |year=2007 |isbn=978-1-4200-1329-0 |accessdate=8 May 2018}}</ref><ref name="GnanasundaramInfo12">{{cite book |url=https://books.google.com/books?id=PU7gkW9ArxIC&pg=PA255 |title=Information Storage and Management: Storing, Managing, and Protecting Digital Information in Classic, Virtualized, and Cloud Environments |editor=Gnanasundaram, S.; Shrivastava, A. |publisher=John Wiley and Sons |page=255 |year=2012 |isbn=978-1-118-23696-3 |accessdate=8 May 2018}}</ref>:
-
-; [[Online|On-line]] : On-line backup storage is typically the most accessible type of data storage, and can begin a restore within milliseconds. A good example is an internal hard disk or a [[disk array]] (possibly connected to a [[Storage area network|SAN]]). This type of storage is convenient and fast, but relatively expensive. On-line storage is quite vulnerable to being deleted or overwritten, either by accident, by intentional malevolent action, or in the wake of a data-deleting [[Computer virus|virus]] payload.
-; [[Nearline storage|Near-line]] : Near-line storage is typically less accessible and less expensive than on-line storage, but still useful for backup data storage. A good example would be a [[tape library]] with restore times ranging from seconds to a few minutes. A mechanical device is usually used to move media units from storage into a drive where the data can be read or written. Generally it has safety properties similar to on-line storage.
-; [[Off-line storage|Off-line]] : Off-line storage requires some direct human action to provide access to the storage media: for example inserting a tape into a tape drive or plugging in a cable. Because the data are not accessible via any computer except during limited periods in which they are written or read back, they are largely immune to a whole class of on-line backup failure modes. Access time will vary depending on whether the media are on-site or off-site.
-; [[Off-site data protection]]: To protect against a disaster or other site-specific problem, many people choose to send backup media to an off-site vault. The vault can be as simple as a system administrator's home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker with facilities for backup media storage. Importantly, a data replica ''can'' be off-site but also ''on-line'' (e.g., an off-site [[RAID]] mirror). Such a replica has fairly limited value as a backup, and should not be confused with an off-line backup.
-; [[Backup site]] or disaster recovery center (DR center): In the event of a disaster, the data on backup media will not be sufficient to recover. Computer systems onto which the data can be restored and properly configured networks are necessary too. Some organizations have their own data recovery centers that are equipped for this scenario. Other organizations contract this out to a third-party recovery center. Because a DR site is itself a huge investment, backing up is very rarely considered the preferred method of moving data to a DR site. A more typical way would be remote [[disk mirroring]], which keeps the DR data as up to date as possible.
-
-== Selection and extraction of data ==
-A successful backup job starts with selecting and extracting coherent units of data. Most data on modern computer systems is stored in discrete units, known as [[Computer file|files]]. These files are organized into [[filesystem]]s. Files that are actively being updated can be thought of as "live" and present a challenge to back up. It is also useful to save metadata that describes the computer or the filesystem being backed up.
-
-Deciding what to back up at any given time is harder than it seems. Backing up too much redundant data fills the data repository too quickly. Backing up too little data can eventually lead to the loss of critical information.<ref name="LeesWhatTo17">{{cite web |url=https://irontree.co.za/what-to-backup-a-critical-look-at-your-data-1935.html |title=What to backup – a critical look at your data |author=Lees, D. |work=Irontree Blog |publisher=Irontree Internet Services CC |date=25 January 2017 |accessdate=8 May 2018}}</ref>
-
-=== Files ===
-; [[File copying|Copying files]] : With the '''file-level''' approach, making copies of files is the simplest and most common way to perform a backup. A means to perform this basic function is included in all backup software and all operating systems.
-
-; Partial file copying: Instead of copying whole files, one can limit the backup to only the blocks or bytes within a file that have changed in a given period of time. This technique can use substantially less storage space on the backup medium, but requires a high level of sophistication to reconstruct files in a restore situation. Some implementations require integration with the source file system.
-
-; Deleted files : To prevent the unintentional restoration of files that have been intentionally deleted, a record of the deletion must be kept.
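
The partial file copying idea described above can be illustrated with a minimal sketch. It assumes fixed-size blocks compared by SHA-256 digest; the block size and the function names (<code>changed_blocks</code>, <code>restore</code>) are hypothetical choices made for illustration and are not taken from any particular backup product, which may detect changes quite differently.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch; real tools choose their own


def block_digests(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Map block index -> new block contents, for blocks that differ from old."""
    old_digests = block_digests(old, block_size)
    delta = {}
    for index, offset in enumerate(range(0, len(new), block_size)):
        block = new[offset:offset + block_size]
        digest = hashlib.sha256(block).hexdigest()
        # A block is stored if it is new or its digest no longer matches.
        if index >= len(old_digests) or old_digests[index] != digest:
            delta[index] = block
    return delta


def restore(old, delta, new_length, block_size=BLOCK_SIZE):
    """Rebuild the newer version from the older version plus the stored delta."""
    blocks = []
    for index, offset in enumerate(range(0, new_length, block_size)):
        if index in delta:
            blocks.append(delta[index])
        else:
            blocks.append(old[offset:offset + block_size])
    return b"".join(blocks)
```

Only the entries in the delta need to be written to the backup medium, but a restore needs both the earlier copy and the delta, which is where the extra reconstruction effort comes from.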
-
-=== Filesystems ===
-; Filesystem dump: Instead of copying files within a file system, a '''block-level''' copy of the whole filesystem itself can be made. This is also known as a ''raw partition backup'' and is related to [[disk image|disk imaging]]. The process usually involves unmounting the filesystem and running a program like [[dd (Unix)|dd]].<ref name="PrestonBackup07">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA111 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=111–114 |year=2007 |isbn=978-0-596-55504-7 |accessdate=8 May 2018}}</ref> Because the disk is read sequentially and with large buffers, this type of backup can be much faster than reading every file normally, especially when the filesystem contains many small files, is highly fragmented, or is nearly full. However, because this method also reads free disk blocks that contain no useful data, it can also be slower than conventional reading, especially when the filesystem is nearly empty. Some filesystems, such as [[XFS]], provide a "dump" utility that reads the disk sequentially for high performance while skipping unused sections. The corresponding restore utility can selectively restore individual files or the entire volume at the operator's choice.<ref name="PrestonUnix99">{{cite book |url=https://books.google.com/books?id=_i1sO47qNnMC&pg=PA73 |title=Unix Backup & Recovery |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=73–91 |year=1999 |isbn=978-1-56592-642-4 |accessdate=8 May 2018}}</ref>
-
-; Identification of changes: Some filesystems have an [[archive bit]] for each file that says it was recently changed. Some backup software looks at the date of the file and compares it with the last backup to determine whether the file was changed.
-
-; [[Versioning file system]] : A versioning filesystem keeps track of all changes to a file and makes those changes accessible to the user. Generally this gives access to any previous version, all the way back to the file's creation time. An example of this is the Wayback versioning filesystem for Linux.<ref>[http://www.aqualab.cs.northwestern.edu/publications/Cornell04VFS.html Wayback: A User-level V File System for Linux] {{Webarchive|url=https://web.archive.org/web/20070406204849/http://www.aqualab.cs.northwestern.edu/publications/Cornell04VFS.html |date=6 April 2007 }} (2004). Retrieved 10 March 2007</ref>
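
The date-comparison approach mentioned under "Identification of changes" might look roughly like the following sketch. It assumes the time of the last backup is available as a POSIX timestamp; the function name <code>files_changed_since</code> is hypothetical, and real backup software may rely on archive bits, change journals or catalogs instead.

```python
import os


def files_changed_since(root, last_backup_time):
    """Return paths under root modified after last_backup_time.

    last_backup_time is a POSIX timestamp (seconds since the epoch).
    """
    changed = []
    for directory, _subdirs, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(directory, name)
            # Compare the file's modification time with the last backup.
            if os.path.getmtime(path) > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

A comparison based only on modification times cannot notice files whose timestamps were reset or that were deleted since the last backup, so it is best read as an illustration of the principle rather than a complete selection policy.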
-
-=== Live data ===
-If a computer system is in use while it is being backed up, the possibility of files being open for reading or writing is real. If a file is open, the contents on disk may not correctly represent what the owner of the file intends. This is especially true for database files of all kinds. The term [[fuzzy backup]] can be used to describe a backup of live data that looks like it ran correctly, but does not represent the state of the data at any single point in time. This is because the data being backed up changed in the period of time between when the backup started and when it finished.<ref name="LiotineMission03">{{cite book |url=https://books.google.com/books?id=LecC2BhPPxMC&pg=PA244 |title=Mission-critical Network Planning |author=Liotine, M. |publisher=Artech House |page=244 |year=2003 |isbn=978-1-58053-559-5 |accessdate=8 May 2018}}</ref>
-
-Backup options for live (and other) data availability scenarios include<ref name="deGuiseEnterprise08">{{cite book |url=https://books.google.com/books?id=2OtqvySBTu4C&pg=PA50 |title=Enterprise Systems Backup and Recovery: A Corporate Insurance Policy |author=de Guise, P. |publisher=CRC Press |pages=50–54 |year=2008 |isbn=978-1-4200-7640-0}}</ref>:
-
-; [[Snapshot (computer storage)|Snapshot]] backup: A snapshot is an instantaneous function of some storage systems that presents a copy of the file system as if it were frozen at a specific point in time, often by a [[copy-on-write]] mechanism. An effective way to back up live data is to temporarily [[quiesce]] them (e.g., close all files), take a snapshot, and then resume live operations. At this point the snapshot can be backed up through normal methods.<ref>[http://edseek.com/~jasonb/articles/dirvish_backup/snapshot.html What is a Snapshot backup?] {{Webarchive|url=https://web.archive.org/web/20070403041940/http://edseek.com/~jasonb/articles/dirvish_backup/snapshot.html |date=3 April 2007 }}. Retrieved 10 March 2007</ref> While a snapshot is very handy for viewing a filesystem as it was at a different point in time, it is hardly an effective backup mechanism by itself.
-
-; Open file backup: Many backup software packages feature the ability to handle open files in backup operations. Some simply check for openness and try again later. [[File locking]] is useful for regulating access to open files.
-: When attempting to understand the logistics of backing up open files, one must consider that the backup process could take several minutes to back up a large file such as a database. In order to back up a file that is in use, it is vital that the entire backup represent a single-moment snapshot of the file, rather than a simple copy of a read-through. This represents a challenge when backing up a file that is constantly changing. Either the database file must be locked to prevent changes, or a method must be implemented to ensure that the original snapshot is preserved long enough to be copied, all while changes are being preserved. If a file is backed up while it is being changed, the first part of the backup may represent data from ''before'' the change while later parts represent data from ''after'' it. The result is a corrupted, unusable file, because most large files contain internal references between their various parts that must remain consistent throughout the file.
-
-; Cold database (offline) backup: During a cold backup, the database is closed or locked and not available to users. The datafiles do not change during the backup process so the database is in a consistent state when it is returned to normal operation.<ref>[http://www.wisc.edu/drmt/oratips/sess003.html#coldbackup Oracle Tips] {{Webarchive|url=https://web.archive.org/web/20070302110933/http://www.wisc.edu/drmt/oratips/sess003.html#coldbackup |date=2 March 2007 }} (10 December 1997). Retrieved 10 March 2007</ref>
-
-; Hot database (online) backup: Some database management systems offer a means to generate a backup image of the database while it is online and usable ("hot"). This usually includes an inconsistent image of the data files plus a log of changes made while the procedure is running. Upon a restore, the changes in the log files are reapplied to bring the copy of the database up-to-date (the point in time at which the initial hot backup ended).<ref>[http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup Oracle Tips] {{Webarchive|url=https://web.archive.org/web/20070302110933/http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup |date=2 March 2007 }} (10 December 1997). Retrieved 10 March 2007</ref>
-
-=== Metadata ===
-Not all information stored on the computer is stored in files. Accurately recovering a complete system from scratch requires keeping track of this non-file data too.<ref name="Gresovnik1">{{cite web |url=http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html |title=Preparation of Bootable Media and Images |last=Grešovnik |first=Igor |date=April 2016 |archive-url=https://web.archive.org/web/20160425113119/http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html |archivedate=25 April 2016 |access-date=21 April 2016}}</ref>
-; System description: System specifications are needed to procure an exact replacement after a disaster.
-; [[Boot sector]] : It can sometimes be easier to recreate the boot sector than to save it. Still, it usually is not a normal file, and the system will not boot without it.
-; [[Disk partitioning|Partition]] layout: The layout of the original disk, as well as partition tables and filesystem settings, is needed to properly recreate the original system.
-; File [[metadata]] : Each file's permissions, owner, group, ACLs, and any other metadata need to be backed up for a restore to properly recreate the original environment.
-; System metadata: Different operating systems have different ways of storing configuration information. [[Microsoft Windows]] keeps a [[Windows Registry|registry]] of system information that is more difficult to restore than a typical file.
-
-== Manipulation of data and dataset optimization ==
-It is frequently useful or required to manipulate the data being backed up to optimize the backup process. These manipulations can provide many benefits including improved backup speed, restore speed, data security, media usage and/or reduced bandwidth requirements.
-; [[Data compression|Compression]] : Various schemes can be employed to shrink the size of the source data to be stored so that it uses less storage space. Compression is frequently a built-in feature of tape drive hardware.<ref name="CherrySecuring15">{{cite book |url=https://books.google.com/books?id=SD_LAwAAQBAJ&pg=PA306 |title=Securing SQL Server: Protecting Your Database from Attackers |author=Cherry, D. |publisher=Syngress |pages=306–308 |year=2015 |isbn=978-0-12-801375-5 |accessdate=8 May 2018}}</ref>
-; [[Data deduplication|Deduplication]] : When multiple similar systems are backed up to the same destination storage device, there exists the potential for much redundancy within the backed up data. For example, if 20 Windows workstations were backed up to the same data repository, they might share a common set of system files. The data repository only needs to store one copy of those files to be able to restore any one of those workstations. This technique can be applied at the file level or even on raw blocks of data, potentially resulting in a massive reduction in required storage space.<ref name="CherrySecuring15" /> Deduplication can occur on a server before any data moves to backup media, sometimes referred to as source/client side deduplication. This approach also reduces bandwidth required to send backup data to its target media. The process can also occur at the target storage device, sometimes referred to as inline or back-end deduplication.
-;[[Replication (computer science)|Duplication]] : Sometimes backup jobs are duplicated to a second set of storage media. This can be done to rearrange the backup images to optimize restore speed or to have a second copy at a different location or on a different storage medium.
-; [[Encryption]] : High-capacity removable storage media such as backup tapes present a data security risk if they are lost or stolen.<ref>[http://www.securityfocus.com/news/11048 Backups tapes a backdoor for identity thieves] {{Webarchive|url=https://web.archive.org/web/20160405033517/http://www.securityfocus.com/news/11048 |date=5 April 2016 }} (28 April 2004). Retrieved 10 March 2007</ref> Encrypting the data on these media can mitigate this problem, but presents new problems. Encryption is a CPU-intensive process that can slow down backup speeds, and the security of the encrypted backups is only as effective as the security of the key management policy.<ref name="CherrySecuring15" />
-; [[Multiplexing]] : When there are many more computers to be backed up than there are destination storage devices, the ability to use a single storage device with several simultaneous backups can be useful.<ref name="PrestonBackup07-02">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA219 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=219–220 |year=2007 |isbn=978-0-596-55504-7 |accessdate=8 May 2018}}</ref>
-; Refactoring: The process of rearranging the backup sets in a data repository is known as refactoring. For example, if a backup system uses a single tape each day to store the incremental backups for all the protected computers, restoring one of the computers could potentially require many tapes. Refactoring could be used to consolidate all the backups for a single computer onto a single tape. This is especially useful for backup systems that do ''incrementals forever'' style backups.
-; [[Disk staging|Staging]] : Sometimes backup jobs are copied to a staging disk before being copied to tape.<ref name="PrestonBackup07-02" /> This process is sometimes referred to as D2D2T, an acronym for Disk to Disk to Tape. This can be useful if there is a problem matching the speed of the final destination device with the source device as is frequently faced in network-based backup systems. It can also serve as a centralized location for applying other data manipulation techniques.
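
The file-level deduplication described above (many workstations sharing one stored copy of common system files) can be sketched as a toy content-addressed repository. The class name <code>DedupStore</code>, its methods and the use of SHA-256 are assumptions made for illustration only; actual products differ in chunking, indexing and collision handling.

```python
import hashlib


class DedupStore:
    """Toy content-addressed repository: identical data is stored only once."""

    def __init__(self):
        self.chunks = {}   # digest -> bytes, one copy per unique content
        self.catalog = {}  # (client, path) -> digest

    def put(self, client, path, data):
        """Record a file for a client, storing its bytes only if they are new."""
        digest = hashlib.sha256(data).hexdigest()
        self.chunks.setdefault(digest, data)
        self.catalog[(client, path)] = digest
        return digest

    def get(self, client, path):
        """Return the stored bytes for a client's file."""
        return self.chunks[self.catalog[(client, path)]]

    def stored_bytes(self):
        """Total bytes actually held in the repository."""
        return sum(len(data) for data in self.chunks.values())
```

If the digest is computed on the client before anything is sent, the same lookup also shows why source-side deduplication saves bandwidth: data whose digest is already known to the repository does not need to be transmitted again.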
-
-== Managing the backup process ==
-As long as new data are being created and changes are being made, backups will need to be performed at frequent intervals. Individuals and organizations with anything from one computer to thousands of computer systems all require protection of data. The scales may be very different, but the objectives and limitations are essentially the same. Those who perform backups need to know how successful the backups are, regardless of scale.
-
-=== Objectives ===
-; [[Recovery point objective]] (RPO) : The point in time that the restarted infrastructure will reflect. Essentially, this is the roll-back that will be experienced as a result of the recovery. The most desirable RPO would be the point just prior to the data loss event. Making a more recent recovery point achievable requires increasing the frequency of [[file synchronization|synchronization]] between the source data and the backup repository.<ref>[http://www.riskythinking.com/glossary/recovery_point_objective.php Definition of ''recovery point objective''] {{Webarchive|url=https://web.archive.org/web/20070513180844/http://www.riskythinking.com/glossary/recovery_point_objective.php |date=13 May 2007 }}. Retrieved 10 March 2007</ref><ref>{{Cite web |title=Top four things to consider in business continuity planning |url=http://sysgen.ca/top-four-things-business-continuity-planning/ |website=sysgen.ca |accessdate=23 September 2015 |archive-url=https://web.archive.org/web/20160304075050/http://sysgen.ca/top-four-things-business-continuity-planning/ |archive-date=4 March 2016 |dead-url=no |df=dmy-all}}</ref>
-; [[Recovery time objective]] (RTO) : The amount of time elapsed between disaster and restoration of business functions.<ref>[http://www.riskythinking.com/glossary/recovery_time_objective.php Definition of ''recovery time objective''] {{Webarchive|url=https://web.archive.org/web/20070516081425/http://www.riskythinking.com/glossary/recovery_time_objective.php |date=16 May 2007 }}. Retrieved 7 March 2007</ref>
-; [[Data security]] : In addition to preserving access to data for its owners, data must be restricted from unauthorized access. Backups must be performed in a manner that does not compromise the original owner's undertaking. This can be achieved with data encryption and proper media handling policies.<ref name="LittleImplement03">{{cite book |url=https://books.google.com/books?id=_DqO6kizEDUC&pg=PA17 |title=Implementing Backup and Recovery: The Readiness Guide for the Enterprise |chapter=Chapter 2: Business Requirements of Backup Systems |author=Little, D.B. |publisher=John Wiley and Sons |pages=17–30 |year=2003 |isbn=978-0-471-48081-5 |accessdate=8 May 2018}}</ref>
-; [[Data retention]] period : Regulations and policy can lead to situations where backups are expected to be retained for a particular period, but not any further. Retaining backups after this period can lead to unwanted liability and sub-optimal use of storage media.<ref name="LittleImplement03" />
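
As a rough illustration of the recovery point objective, the sketch below computes how much recent work would be rolled back for a given backup schedule. The function names and the default 24-hour objective are hypothetical and chosen only for this example.

```python
from datetime import datetime, timedelta


def achieved_recovery_point(backup_times, loss_time):
    """Return the most recent backup taken at or before loss_time, or None."""
    candidates = [t for t in backup_times if t <= loss_time]
    return max(candidates) if candidates else None


def data_loss_window(backup_times, loss_time):
    """Return the roll-back as a timedelta, or None if no backup exists."""
    point = achieved_recovery_point(backup_times, loss_time)
    return None if point is None else loss_time - point


def meets_rpo(backup_times, loss_time, rpo=timedelta(hours=24)):
    """Check whether the roll-back stays within the stated objective."""
    window = data_loss_window(backup_times, loss_time)
    return window is not None and window <= rpo
```

With nightly backups at 02:00 and a loss at 17:30, the roll-back is 15 hours 30 minutes; meeting a one-hour objective in the same scenario would require synchronizing far more often.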
-
-=== Limitations ===
-An effective backup scheme will take into consideration the following situational limitations<ref name="NelsonPro11-2">{{cite book |url=https://books.google.com/books?id=r4uEEsq3CJYC&printsec=frontcover |title=Pro Data Backup and Recovery |chapter=Chapter 9: Putting It All Together: Sample Backup Environments |author=Nelson, S. |publisher=Apress |pages=203–246 |year=2011 |isbn=978-1-4302-2663-5 |accessdate=8 May 2018}}</ref>:
-
-; Backup window: The period of time when backups are permitted to run on a system is called the backup window. This is typically the time when the system sees the least usage and the backup process will have the least amount of interference with normal operations. The backup window is usually planned with users' convenience in mind. If a backup extends past the defined backup window, a decision is made whether it is more beneficial to abort the backup or to lengthen the backup window.
-; Performance impact: All backup schemes have some performance impact on the system being backed up. For example, for the period of time that a computer system is being backed up, the hard drive is busy reading files for the purpose of backing up, and its full bandwidth is no longer available for other tasks. Such impacts should be analyzed.
-; Costs of hardware, software, labor: All types of storage media have a finite capacity with a real cost. Matching the correct amount of storage capacity (over time) with the backup needs is an important part of the design of a backup scheme. Any backup scheme has some labor requirement, but complicated schemes have considerably higher labor requirements. The cost of commercial backup software can also be considerable.
-; Network bandwidth: Distributed backup systems can be affected by limited network bandwidth.
-
-=== Implementation ===
-Meeting the defined objectives in the face of the above limitations can be a difficult task. The tools and concepts below can make that task more achievable.
-; Scheduling: Using a [[job scheduler]] can greatly improve the reliability and consistency of backups by removing part of the human element. Many backup software packages include this functionality.
-; Authentication: Over the course of regular operations, the user accounts and/or system agents that perform the backups need to be authenticated at some level. The power to copy all data off of or onto a system requires unrestricted access. Using an authentication mechanism is a good way to prevent the backup scheme from being used for unauthorized activity.
-; Chain of trust : Removable [[storage media]] are physical items and must only be handled by trusted individuals. Establishing a chain of trusted individuals (and vendors) is critical to defining the security of the data.
-
-=== Measuring the process ===
-To ensure that the backup scheme is working as expected, the following best practices should be enacted<ref name="AkhtarDatabase12">{{cite journal |title=Database Backup and Recovery Best Practices |journal=ISACA Journal |author=Akhtar, A.N.; Buchholtz, J.; Ryan, M.; Setty, K. |volume=1 |pages=1–6 |year=2012 |url=https://www.isaca.org/Journal/archives/2012/Volume-1/Pages/Database-Backup-and-Recovery-Best-Practices.aspx |accessdate=8 May 2018}}</ref><ref name=DorionBackupReportingTool>{{cite web |last1=Dorion |first1=Pierre |title=Why you need a data backup reporting tool |url=http://searchdatabackup.techtarget.com/tip/Why-you-need-a-data-backup-reporting-tool |website=TechTarget |publisher=Tech Target Inc. |accessdate=13 November 2017 |date=June 2008}}</ref><ref name="PritchardCloud17">{{cite web |url=https://www.computerweekly.com/feature/Cloud-to-cloud-backup-What-it-is-and-why-you-need-it |title=Cloud-to-cloud backup: What it is and why you need it |author=Pritchard, S. |work=Computer Weekly |publisher=TechTarget |date=December 2017 |accessdate=8 May 2018}}</ref>:
-
-; [[Backup validation]] : (also known as "backup success validation") Provides information about the backup, and proves compliance to regulatory bodies outside the organization: for example, an insurance company in the USA might be required under [[Health Insurance Portability and Accountability Act|HIPAA]] to demonstrate that its client data meet records retention requirements.<ref>[http://www.hipaadvisory.com/regs/recordretention.htm HIPAA Advisory] {{Webarchive|url=https://web.archive.org/web/20070411135655/http://www.hipaadvisory.com/regs/recordretention.htm |date=11 April 2007 }}. Retrieved 10 March 2007</ref> Disaster, data complexity, data value and increasing dependence upon ever-growing volumes of data all contribute to the anxiety around and dependence upon successful backups to ensure [[business continuity]]. Thus many organizations rely on third-party or "independent" solutions to test, validate, and optimize their backup operations (backup reporting).
-; Reporting: In larger configurations, reports are useful for monitoring media usage, device status, errors, vault coordination and other information about the backup process.
-; Logging: In addition to the history of computer-generated reports, activity and change logs are useful for monitoring backup system events.
-; Validation: Many backup programs use [[checksum]]s or [[hash function|hashes]] to validate that the data was accurately copied. These offer several advantages. First, they allow data integrity to be verified without reference to the original file: if the file as stored on the backup medium has the same checksum as the saved value, then it is very probably correct. Second, some backup programs can use checksums to avoid making redundant copies of files, and thus improve backup speed. This is particularly useful for the de-duplication process.
-; Monitored backup: Backup processes can be monitored locally via a software dashboard or by a third-party monitoring center. Both alert users to any errors that occur during automated backups. Some third-party monitoring services also allow collection of historical metadata that can be used for storage resource management purposes, such as projecting data growth and locating redundant primary storage capacity and reclaimable backup capacity.
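
The checksum-based validation described in the list above can be sketched roughly as follows. This is a minimal illustration, not the method of any particular backup program: the function names <code>sha256_of</code> and <code>verify_backup</code> are hypothetical, and SHA-256 is assumed as the hash only for the sake of the example.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so large backup files need not fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(backup_path, saved_checksum):
    """Check a stored copy against the checksum saved at backup time.

    The original file is not needed: only the backup copy and the
    previously recorded checksum are compared.
    """
    return sha256_of(backup_path) == saved_checksum
```

A matching checksum indicates that the stored copy is very probably intact; a mismatch indicates that the copy differs from what was originally written.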
-
-== Enterprise client-server backup ==
-
-"Enterprise client-server" backup software describes a class of software applications that back up data from a variety of client computers centrally to one or more server computers, with the particular needs of [[Company|enterprises]] in mind. They may employ a scripted client–server<ref name="KissellTakeControl2.0">{{cite book |last1=Kissell |first1=Joe |title=Take Control of Mac OS X Backups |date=2007 |publisher=TidBITS Electronic Publishing |location=Ithaca, NY |isbn=0-9759503-0-4 |edition=Version 2.0 |url=http://people.fas.harvard.edu/~techtool/pages/Take_Control_of_Mac_OS_X_Backups_(2.0).pdf |accessdate=22 September 2017 |ref=Kissell |pages=24 (client-server), 127 (script), 165 (client-server), 128 (subvolume—''later'' renamed Favorite Folder in Macintosh variant)}}</ref> backup model<ref name="EnterpriseBackupChallenges">{{cite web |last1=Rassokhin? |first1=Alexander? |title=Enterprise Network Backup Challenges |url=http://www.backupschedule.net/enterprise-network-backup.html |website=All About Backup |publisher=Novosoft LLC |accessdate=13 November 2017 |year=2012}}</ref> with a backup [[server (computing)|server]] program running on one computer, and with small-footprint [[client (computing)|client]] programs (referred to as "agents" in some applications) running on the other computers being backed up, in either a single platform or [[heterogeneous network|mixed platform network]]. Enterprise-specific requirements<ref name="EnterpriseBackupChallenges" /> include the need to back up large amounts of data on a systematic basis, to adhere to legal requirements for the maintenance and archiving of files and data, and to satisfy short-recovery-time objectives. 
To satisfy these requirements, which World Backup Day (31 March)<ref name="CBC-WorldBackupDay">{{cite web |url=http://www.cbc.ca/news/technology/world-backup-day-1.3510588 |title=World Backup Day highlights importance of protecting data |last=Misener |first=Dan |date=29 March 2016 |publisher=CBC News}}</ref><ref name="ZDNetWorldBackupDay">{{cite web |title=World Backup Day: deutliche Lücken zwischen Sicherheitsrisiko und Nutzerverhalten |url=http://www.zdnet.de/88291257/ |publisher=[[ZDNet]] |date=31 March 2017 |language=de-DE |first=Anja |last=Schmoll-Trautmann}}</ref><ref name="eWeekWorldBackupDay">{{cite web |last1=Preimesberger |first1=Chris |title=World Backup Day 2017: 'We Don't Know the Day Nor the Hour' |url=http://www.eweek.com/storage/world-backup-day-2017-we-don-t-know-the-day-nor-the-hour |website=eWeek |publisher=QuinStreet |accessdate=11 November 2017 |date=31 March 2017 |at=Ian Wood of Veritas}}</ref> highlights, it is typical for an enterprise to appoint a backup administrator, who is a part of office administration rather than of the IT staff, and whose role is "being the keeper of the data".<ref name="DorionBackupAdminRole">{{cite web |last1=Dorion |first1=Pierre |title=The true role of a backup administrator |url=http://searchdatabackup.techtarget.com/news/1322981/The-true-role-of-a-backup-administrator |website=TechTarget |publisher=TechTarget, Inc. |accessdate=13 November 2017 |date=4 August 2008 |quote=On the other hand, the role of a backup administrator should be one of administration, not operation....whose role is "being the keeper of the data"}}</ref>
-
-Such applications make cumulative backups of ''multiple'' client machines' source files to, or do restores from, what would ordinarily be referred to as an [[archive file]]. However, some of these applications use (or once used<ref name="BackupExecArchivingOptionNoLongeSupported">{{cite web |title=Backup Exec Archiving Option is no longer supported for Backup Exec 15 Feature Pack 1 |url=https://www.veritas.com/support/en_US/article.100023956 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=13 May 2018 |date=30 June 2015}}</ref>) the term "archive" to refer to a backup operation that deletes data from a client source once the data's backup is complete.<ref name="NetBackupWhatIsArchiving">{{cite web |last1=Bokelman |first1=Seth |title=what is archiving in Netbackup? |url=https://vox.veritas.com/t5/NetBackup/what-is-archiving-in-Netbackup/m-p/490153#M112727 |website=VOX |publisher=Veritas Technologies LLC |accessdate=13 May 2018 |date=26 February 2012}}</ref><ref name="RetrospectMac14UG">{{cite web |title=Retrospect ® 14.0 Mac User's Guide |url=http://download.retrospect.com/docs/mac/v14/user_guide/Retrospect_Mac_User_Guide-EN.pdf |website=Retrospect |publisher=Retrospect Inc. |accessdate=28 March 2017 |format=PDF |date=March 2017}}</ref> Therefore, the discussion of these applications will use the non-proprietary term "set(s) of backups" instead of "archive file(s)".
-
-=== Performance ===
-
-The [[Hard disk drive#Price evolution|steady improvement in hard disk drive price per byte]] has made feasible a [[Backup#Manipulation of data and dataset optimization|disk-to-disk-to-tape]] strategy, combining the speed of disk backup and restore with the capacity and low cost of tape for offsite archival and disaster recovery purposes.<ref name="FernandoCombineDiskTapeBenefits">{{cite web |last1=Fernando |first1=Sal |title=Combine disk, tape benefits to protect data |url=http://www.zdnet.com/article/combine-disk-tape-benefits-to-protect-data/ |publisher=ZDNet |accessdate=13 November 2017 |date=30 April 2008}}</ref> This, with [[Comparison of file systems#File capabilities|file system technology]], has led to features such as:
-; Improved disk-to-disk-to-tape capabilities: Enable automated transfers to tape for safe offsite storage of disk sets of backups that were created for fast onsite restores.<ref name="EMCRetroWindows7">{{cite web |title=New EMC Dantz Retrospect 7 Improves Data Protection for SMBs and the Distributed Enterprise |url=http://www.emc.com/about/news/press/us/2005/20050131-2906.htm |website=DellEMC [current] |publisher=EMC Corp. [orig. publisher] |accessdate=23 November 2016 |date=31 January 2005}}</ref><ref name="NetBackupAboutReplicationDirector">{{cite web |title=About NetBackup Replication Director |url=https://www.veritas.com/support/en_US/doc/59229900-126796169-0/v58079997-126796169 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=13 July 2017}}</ref><ref name="BackupExecDuplicatingBackedUpData">{{cite web |title=Symantec Backup Exec: About duplicating backed up data |url=http://backup-exec.helpmax.net/en/backing-up-data/about-duplicating-backed-up-data/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=13 January 2018}}</ref>
-; Create synthetic full backups: For example, onto tapes from existing disk sets of backups—by copying multiple backups of the same source(s) from one set of backups to another. This is termed a [[Incremental backup#Synthetic full backup|"synthetic full backup"]] because, after the transfer, the destination set of backups contains the same data it would after full backups.<ref name="EMCRetroWindows7" /><ref name="NetBackupAboutSyntheticBackups">{{cite web |title=About synthetic backups |url=https://www.veritas.com/content/support/en_US/doc/18716246-126559472-0/id-SF780163836-126559472 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=25 September 2017}}</ref><ref name="BackupExecSyntheticFullBackup">{{cite web |title=Symantec Backup Exec: About the synthetic backup feature |url=http://backup-exec.helpmax.net/en/symantec-backup-exec-advanced-disk-based-backup-option/about-the-synthetic-backup-feature/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=13 January 2018}}</ref> One application can exclude<ref group=note name=RetrospectExclusionInclusion>Exclusion and/or inclusion is done with Selectors in the Windows variant; this misleading term has been changed to Rules in the Macintosh variant.</ref> files and folders from the synthetic full backup.<ref name="RetrospectWindows12UG" />
-
-; Automated data grooming: Frees up space on disk sets of backups by removing out-of-date backup data—usually based on an administrator-defined retention period.<ref name="eWeekWorldBackupDay" /><ref name="FernandoCombineDiskTapeBenefits" /><ref name="EMCRetroWindows7" /><ref name="NetBackupStorageLifecyclePolicy">{{cite web |last1=Kaczorek |first1=Mariusz |title=NetBackup Storage Lifecycle Policy (SLP): Overview |url=https://www.settlersoman.com/netbackup-storage-lifecycle-policy-slp-overview/ |website=Settlersoman |publisher=Settlersoman |accessdate=2 February 2018 |date=15 August 2015}}</ref><ref name="BackupExecDataGrooming">{{cite web |last1=Jain |first1=Hemant |title=VOX Knowledge Base: Data Protection Knowledge Base: Data Protection |url=https://vox.veritas.com/t5/Articles/Automated-Disk-management-and-Data-retention-in-Backup-Exec-DLM/ta-p/809167 |website=VOX |publisher=Veritas Technologies LLC |accessdate=13 January 2018 |date=14 April 2015 |quote=Employee [of Veritas]}}</ref><ref group=note>A few backup applications—mostly free ones—term this "pruning" instead of "grooming", but other applications use the term "pruning" to mean omitting certain ''types'' of files from backups.</ref> One method of removing data is to keep the last backup of each day/week/month for the last respective week/month/specified-number-of-months, permitting compliance with regulatory requirements.<ref name="RetrospectMac12UG">{{cite web |title=Retrospect ® 12.0 Mac User's Guide |url=http://download.retrospect.com/docs/mac/v12/user_guide/Retrospect_Mac_User_Guide-EN.pdf |website=Retrospect |publisher=Retrospect Inc. 
|accessdate=28 December 2017 |format=PDF |year=2015}}</ref> One application has a "performance-optimized grooming" mode that only removes outdated information from a set of backups that it can quickly delete.<ref name="TitBITSMacintosh13">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 13 |url=https://tidbits.com/article/16311 |website=TitBITS |publisher=TidBITS Publishing Inc. |accessdate=27 October 2016 |date=5 March 2016}}</ref> This is the only mode of grooming allowed for cloud sets of backups, and is also up to 5 times as fast when used on locally stored disk sets of backups. The "storage-optimized grooming" mode reclaims more space because it rewrites the set of backups, and in this application also permits exclusion compliance with the [[GDPR]]<ref name="RetrospectKnowledgeBase">{{cite web |title=Support: Knowledge Base |url=https://www.retrospect.com/en/support/kb/ |website=Retrospect |publisher=Retrospect Inc. |accessdate=25 August 2018 |date=2 July 2018 |at=#Resources (Auto Launching Guide ..., ... difference between "Backup" and "Duplicate", Avid Support ..., Instant Scan FAQ), #Email Backup, #Top Articles (BackupBot – Deep Dive into ProactiveAI, How to Set Up Remote Backup, GDPR – Deep Dive into Data Retention Policies, Deep Dive - Components of a Retrospect Backup)}}</ref> via rules<ref group=note name=RetrospectExclusionInclusion />—that can instead be used for other filtering.<ref name="TitBITSMacintosh15.1.1">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 15.1.1 |url=https://tidbits.com/watchlist/retrospect-15-1-1/ |website=TitBITS |publisher=TidBITS Publishing Inc. |accessdate=20 June 2018 |date=28 May 2018}}</ref>
-; Multithreaded backup server: Capable of simultaneously performing multiple backup, restore, and copy operations in separate "activity threads" (once needed only by those who could afford multiple tape drives).<ref name="EnterpriseBackupChallenges" /><ref name="NetBackupMultistreamMultiplex">{{cite web |title=What is the difference between multiplexing and multistreaming? |url=https://www.veritas.com/support/en_US/article.TECH10085 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=19 November 2017 |date=29 January 2015}}</ref><ref name="BackupExecRunConcurrentJobs">{{cite web |last1=McMillen |first1=Robert |title=How to run concurrent jobs in Backup exec 15 |url=https://www.youtube.com/watch?v=1-9x9So038g |via=YouTube |publisher=Google |accessdate=14 January 2018 |format=Video |date=21 July 2015}}</ref> In one application, all the categories of information for a particular "backup server" are stored by it; when an [[backup#User interface|"Administration Console"]] process is started, its process synchronizes information with all running LAN/WAN backup servers.<ref name="TidBITSEMCShips">{{cite web |last1=Engst |first1=Adam |title=EMC Ships Modernized Retrospect 8 |url=https://tidbits.com/article/10159 |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=12 September 2017 |date=23 March 2009}}</ref>
-; Block-level incremental backup: The ability to back up only the blocks of a file that have changed, a [[Incremental backup#Block level incremental|refinement of incremental backup]] that saves space<ref name="TitBITSMacintosh11">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 11 |url=https://tidbits.com/article/14573 |website=TitBITS |publisher=TidBITS Publishing Inc. |accessdate=27 April 2017 |date=6 March 2014}}</ref><ref name="NetBackupBlockLevelOracle">{{cite web |title=How Veritas NetBackup block-level incremental backup works for Oracle database files |url=https://sort.symantec.com/public/documents/sfha/6.0/aix/productguides/html/sf_adv_ora/ch21s01s01.htm |website=Symantec |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |year=2013}}</ref><ref name="BackupExecBlockLevel">{{cite web |last1=Harbaugh |first1=Logan |title=Developing a Real Backup Plan with Symantec's Backup Exec 15 |url=https://edtechmagazine.com/higher/article/2015/10/developing-real-backup-plan-symantecs-backup-exec-15 |website=EdTech |publisher=CDW LLC |accessdate=14 January 2018 |date=Fall 2015}}</ref> and may save time.<ref name="EnterpriseBackupChallenges" /><ref name="WhitehouseFile-levelBlock-levelDedup">{{cite web |last1=Whitehouse |first1=Lauren |title=The pros and cons of file-level vs. block-level data deduplication technology |url=http://searchdatabackup.techtarget.com/tip/The-pros-and-cons-of-file-level-vs-block-level-data-deduplication-technology |website=TechTarget |publisher=Tech Target Inc. |accessdate=13 November 2017 |date=September 2008}}</ref> Such [[Backup#Files|partial file copying]] is especially applicable to a [[database]].
-; "Instant" scanning of client volumes: Uses the [[USN Journal]] on Windows NTFS and [[FSEvents]] on macOS to reduce the scanning component<ref name="RetrospectKnowledgeBase" /> time on both incremental backups, fitting more sources into the [[Backup#Limitations|backup window]],<ref name="EnterpriseBackupChallenges" /><ref name="NetBackupAccelerator">{{cite web |title=About the Accelerator feature in NetBackup 7.5 |url=https://www.veritas.com/support/en_US/article.000086263 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=10 November 2017}}</ref><ref name="BackupExecDeterminingIfFileBackedUp">{{cite web |title=Veritas Backup Exec Administrator's Guide: How Backup Exec determines if a file has been backed up |url=https://www.veritas.com/content/support/en_US/doc/59226269-99535599-0/v63768146-99535599 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=7 February 2018 |date=11 November 2017}}</ref> and on restores.<ref name="TidBITSMac10">{{cite web |last1=Engst |first1=Adam |title=Retrospect 10 Reduces Backup Time with Instant Scan Technology |url=https://tidbits.com/article/13379 |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=25 October 2016 |date=6 November 2012}}</ref>
-; Cramming or evading the [[Backup#Limitations|backup window]]: One application has the "multiplexed backup" capability of cramming the [[Backup#Limitations|backup window]] by sending data from multiple clients to a single tape drive simultaneously; "this is useful for low end clients with slow throughput ... [that] cannot send data fast enough to keep the tape drive busy .... will reduce the performance of restores."<ref name="NetBackupMultistreamMultiplex" /> Another application allows an enterprise that has computers transiently connecting to the network over a long workday to evade the window by using [[Retrospect (software)#Small-group features|Proactive scripts]].
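
The retention rule mentioned under automated data grooming (keep the last backup of each day for the past week, of each week for the past month, and of each month for a specified number of months) can be sketched as below. This is only an illustrative approximation: the function name <code>select_backups_to_keep</code>, the 7-day and 31-day cut-offs, and the use of ISO calendar weeks are assumptions made for the example, not the behavior of any specific application.

```python
from datetime import datetime, timedelta


def select_backups_to_keep(timestamps, now, months=12):
    """Return the set of backup timestamps retained by the grooming policy."""
    keep = {}
    for ts in sorted(timestamps):
        age = now - ts
        if age <= timedelta(days=7):
            key = ("day", ts.date())                   # one per calendar day
        elif age <= timedelta(days=31):
            key = ("week", tuple(ts.isocalendar()[:2]))  # one per ISO week
        elif age <= timedelta(days=31 * months):
            key = ("month", (ts.year, ts.month))       # one per calendar month
        else:
            continue                                   # older than the retention period
        # Timestamps are visited in ascending order, so the latest
        # backup in each bucket overwrites the earlier ones.
        keep[key] = ts
    return set(keep.values())
```

Every backup not in the returned set would be a candidate for removal; how the space is actually reclaimed depends on the application's grooming mode.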
-
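
Block-level incremental backup, as described in the list above, can be illustrated by hashing fixed-size blocks and copying only those whose hash has changed since the previous run. The sketch below assumes a 4096-byte block size and in-memory data purely for illustration; the names <code>block_hashes</code> and <code>changed_blocks</code> are hypothetical, and real products may detect changed blocks by other means.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size, chosen only for this example


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old_hashes, new_data, block_size=BLOCK_SIZE):
    """Return ({block_index: block_bytes}, new_hashes) for blocks that differ.

    Only blocks whose hash differs from the previous backup, or that lie
    beyond the end of the previous version, need to be copied.
    """
    new_hashes = block_hashes(new_data, block_size)
    changed = {}
    for index, digest in enumerate(new_hashes):
        if index >= len(old_hashes) or old_hashes[index] != digest:
            start = index * block_size
            changed[index] = new_data[start:start + block_size]
    return changed, new_hashes
```

For a large database file in which only a few blocks change between backups, the changed set is a small fraction of the file, which is where the space saving comes from.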
-=== Source file integrity ===
-; Backing up interactive applications : [[Interactive computing|Such applications]] must be protected by having their services [[quiesce|paused]] while their [[Backup#Live data|live data]] is being backed up, and then [[:wikt:unpause|unpaused]].<ref name="EnterpriseBackupSoftware: WorkstationsEmailDatabases">{{cite web |last1=Rassokhin? |first1=Alexander? |title=Enterprise Backup Software: Backup Network Workstations, Email and Databases |url=http://www.backupschedule.net/enterprise-backup.html |website=All about Backup |publisher=Novosoft LLC |accessdate=24 January 2018 |year=2012}}</ref> Some enterprise backup applications accomplish pausing/unpausing of services via built-in provisions—for many specific databases and other interactive applications—that become automatically part of the backup software's script execution; these provisions [[Retrospect (software)#Editions and Add-Ons|may be purchased separately]].<ref name="NetBackupDatabase&AppAgentCompatibility">{{cite web |title=Veritas NetBackup ™ 8.0 – 8.x.x Database and Application Agent Compatibility List |url=https://www.veritas.com/content/support/en_US/doc/NB_80_DBSCL |website=Veritas |publisher=Veritas Technologies LLC (US) |accessdate=19 November 2017 |date=17 November 2017}}</ref><ref name="BackupExecAgents&Options">{{cite web |title=Backup Exec TM 16 Agents and Options |url=https://www.veritas.com/content/dam/Veritas/docs/data-sheets/be16-agents-and-options.pdf |website=Veritas |publisher=Veritas Technologies LLC |accessdate=14 January 2018 |year=2016}}</ref> However another application has also added [[Scripting language#Extension/embeddable languages|"script hooks"]] that enable the optional automatic execution—at specific events during runs of a GUI-coded backup script—of portions of an external script containing commands pre-written in a standard [[scripting language]]. 
Since the external script is provided by an installation's backup administrator, its code activated by the "script hooks" may accomplish not only data protection—via pausing/unpausing interactive services—but also [[Backup#User interface|integration with monitoring systems]].<ref name="RetrospectMac14UG" />
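
The pause/back up/unpause sequence described above can be sketched as a context manager that guarantees the service is resumed even if the backup fails. This is a generic illustration, not the script-hook interface of any particular product: <code>quiesced</code>, <code>pause</code>, and <code>resume</code> are hypothetical names, and the callables stand in for whatever commands an administrator's external script would run.

```python
from contextlib import contextmanager


@contextmanager
def quiesced(pause, resume):
    """Pause a service for the duration of a backup, then resume it.

    `pause` and `resume` are caller-supplied callables (assumed here),
    for example wrappers around a database's own suspend/resume commands.
    """
    pause()
    try:
        yield
    finally:
        # Runs whether the backup succeeded or raised, so the
        # service is never left paused.
        resume()
```

A backup script would wrap the copy step in <code>with quiesced(pause_db, resume_db):</code> so that the live data is not modified while it is being read.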
-
-=== User interface ===
-
-To accommodate the requirements of a backup administrator who may not be part of the IT staff with access to the secure server area, enterprise client-server software may include features such as:
-
-; Administration Console:
-:The backup administrator's backup server [[GUI]] management and near-term reporting tool.<ref name="DorionBackupReportingTool" /> Its window shows the selected backup server, with a standard toolbar on top. A sidebar on the left or navigation bar shows the clickable categories of backup server information for it; each category shows a panel, which may have a specialized toolbar below or in place of the standard toolbar. The built-in categories include activities—thus providing [[Backup#Measuring the process|monitored backup]], past backups of each individual source, scripts/policies/jobs (terminology depending on the application), sources (directly/indirectly), sets of backups, and storage devices.<ref name="RetrospectMac14UG" /><ref name="NetBackupAdminGuideVol.1">{{cite web |title=Symantec NetBackup ™ Administrator's Guide, Volume I Windows |url=http://www-personal.umich.edu/~danno/symantec/NetBackup_AdminGuideI_WinServer.pdf |website=Symantec |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |pages=35–45(Administration Console), 833–843(Activity Monitor), 888–894(Reports utility), 912(Remote Administration Console), 915–938(Java Console) |year=2012}}</ref><ref name="BackupExecAdminConsole">{{cite web |title=Symantec Backup Exec: About the Administration Console |url=http://backup-exec.helpmax.net/en/introducing-backup-exec/about-the-administration-console/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=10 December 2017}}</ref>
-; User-initiated backups and restores: These supplement the administrator-initiated backups and restores which backup applications have always had, and relieve the administrator of time-consuming tasks.<ref name="DorionBackupAdminRole" /> The user designates the date of the past backup from which files or folders are to be restored—once IT staff has mounted the proper backup volume on the backup server.<ref name="FernandoCombineDiskTapeBenefits" /><ref name="RetrospectMac14UG" /><ref name="NetBackupOperationalRestore">{{cite web |title=OpsCenter Operational Restore |url=https://www.veritas.com/support/en_US/article.100038022 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=12 March 2012}}</ref><ref name="BackupExecUserRetrieve">{{cite web |title=How Backup Exec Retrieve works |url=http://backup-exec.helpmax.net/en/using-backup-exec-retrieve/how-backup-exec-retrieve-works/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=14 January 2018}}</ref>
-; High-level/long-term reports supplementing the Administration Console<ref name="DorionBackupReportingTool" />: Within one application's Console panel displayed by clicking the name of the backup server itself in the sidebar, an activities pane on the top left of the displayed [[Dashboard (business)|Dashboard]] has a moving bar graph for each activity going on for the backup server together with a pause and stop button for the activity. Three more panes give the results of activities in the past week: backups each day, sources backed up, and sources not backed up. Finally a storage pane has a line for each set of backups, showing the last-modified date and depictions of the total bytes used and available.<ref name="TitBITSMacintosh11" /><ref name="RetrospectMac14UG" /> For the application's Windows variant, the Dashboard acts as a display-only substitute for a non-existent Console.<ref name="RetrospectWindows12UG" /> Other applications have a separate reporting facility that can cover multiple backup servers.<ref name="NetBackupOperationsManager">{{cite web |last1=Antony |first1=Erica |author2=Tim Burlowski |title=NetBackup Operations Manager: Monitoring, Alerting and Reporting for Veritas NetBackup |url=https://vox.veritas.com/t5/Articles/NetBackup-Operations-Manager-Monitoring-Alerting-and-Reporting/ta-p/806080 |website=Symantec |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |pages=4–5(monitoring), 6–7(alerting), 7(3rdPartyEventMgmt.), 11–18(reporting) |format=PDF attachment |date=January 2008}}</ref><ref name="BackupExecEnterpriseDataProtection">{{cite web |title=Windows® Enterprise Data Protection with Symantec Backup Exec™ |url=http://www.r2gen.com.br/images/symantec/pdf/symantec_protegendo_sua_empresa.pdf |website=Symantec |publisher=Veritas Technologies LLC |accessdate=14 January 2018 |pages=5–8 (CASO) |format=PDF |year=2007}}</ref>
-; E-mailing of notifications about operations to chosen recipients<ref name="DorionBackupReportingTool" />: Can alert the recipient to, e.g., errors or warnings, with a log to assist in pinpointing problems.<ref name="RetrospectWindows12UG" /><ref name="NetBackupOperationsManager" /><ref name="BackupExecConfigureNotifications">{{cite web |title=How to configure notification recipients in Backup Exec 12.0 and above |url=https://www.veritas.com/support/en_US/article.100016176 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=10 November 2017}}</ref>
-; Integration with monitoring systems<ref name="DorionBackupReportingTool" />: Such systems provide [[Backup#Measuring the process|backup validation]]. One application's administrators can deploy custom scripts that—invoking [[webhook]] code via [[Backup#Source file integrity|script hooks]]—populate such systems as the freeware [[Nagios]] and [[IFTTT]] and the [[freemium]] [[Slack (software)|Slack]] with script successes and failures corresponding to the activities category of the Console, per-source backup information corresponding to the past backups category of the Console, and media requests.<ref name="RetrospectMac14UG" /> Another application has integration with two of the developer's monitoring systems, one that is part of the client-server backup application and one that is more generalized.<ref name="NetBackupOperationsManager" /> Yet another application has integration with a monitoring system that is part of the client-server backup application,<ref name="BackupExecJobMonitor">{{cite web |title=Veritas Backup Exec Administrator's Guide: About the Job Monitor |url=https://www.veritas.com/content/support/en_US/doc/59226269-99535599-0/v76313540-99535599 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=11 November 2017}}</ref> but can also be integrated with Nagios.<ref name="BackupExecNagiosPlugins">{{cite web |title=Nagios plugins for monitoring BackupExec |url=https://exchange.nagios.org/directory/Plugins/Backup-and-Recovery/BackupExec |website=Nagios Exchange |publisher=Nagios Enterprises |accessdate=15 January 2018}}</ref>
-
-=== LAN/WAN/Cloud ===
-; Advanced network client support: All applications include support for multiple network interfaces.<ref name="EnterpriseBackupChallenges" /><ref name="EMCRetroMac8">{{cite web |title=EMC Announces Retrospect 8.0 Backup and Recovery Software For Mac |url=http://www.infotomorrowmag.com/about/news/press/2009/20090106-02.htm |website=DellEMC [current] |publisher=EMC Corp. [orig. publisher] |accessdate=10 November 2016 |date=6 January 2009}}</ref><ref name="BackupExecConfiguringNetworkOptionsBackup">{{cite web |title=Veritas Backup Exec Administrator's Guide: Configuring network options for backup jobs |url=https://www.veritas.com/content/support/en_US/doc/59226269-99535599-0/v96257307-99535599 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=17 November 2017}}</ref> However, one application, unless [[Data deduplication#Source versus target deduplication|deduplication is done by a separate sub-application between the client and the backup server]], cannot provide "resilient network connections" for machines on a WAN.<ref name="NetBackupDeduplication">{{cite web |title=Veritas NetBackup™ Deduplication Guide |url=https://www.veritas.com/content/support/en_US/doc/ka6j00000000ADEAA2 |website=Veritas |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |page=171(Resilient network properties) |format=PDF |year=2016}}</ref> One application can extend support to "remote" clients anywhere on the Internet for a [[Retrospect (software)#Small-group features|Proactive]] script and for [[Backup#User interface|user-initiated backups/restores]].<ref name="RetrospectKnowledgeBase" />
-; Cloud seeding and Large-Scale Recovery: Because of a large amount of data already backed up,<ref name="EnterpriseBackupChallenges" /> an enterprise adopting [[Backup#Storage_media|cloud backup]] likely will need to do [[Seed loading|"seeding"]]. This service copies a large volume of locally stored backup data onto a large-capacity disk device, which is then physically shipped to the cloud storage site and uploaded.<ref name="WhatIsAWSSnowball?">{{cite web |title=What Is an AWS Snowball Appliance? |url=https://docs.aws.amazon.com/snowball/latest/ug/whatissnowball.html |website=AWS |publisher=Amazon.com |accessdate=8 March 2018 |year=2018}}</ref><ref name="RouseCloudSeedingDef">{{cite web |last1=Rouse |first1=Margaret |title=Definition: cloud seeding |url=http://searchdatabackup.techtarget.com/definition/cloud-seeding |website=TechTarget |publisher=Tech Target Inc. |accessdate=16 November 2017 |date=December 2011}}</ref> After the large initial upload, the enterprise's backup software can be reconfigured to read from and write to the backup incrementally in its cloud location.<ref name="RetrospectChangingPathsMac">{{cite web |title=Changing paths Cloud Mac |url=https://www.youtube.com/watch?v=Ac3BhXO4T1g |via=YouTube |publisher=Retrospect Inc. |accessdate=7 October 2016 |format=Video |date=29 February 2016}}</ref> The service may need to be employed in reverse for faster [[Disaster_recovery|large-scale data recovery]] times than would be possible via an Internet connection.<ref name="WhatIsAWSSnowball?" 
/> Some applications offer seeding and large-scale recovery via third-party services, which may use a high-speed Internet channel to/from cloud storage rather than a shippable physical device.<ref name="NetBackupAmazonStorageGateway">{{cite web |last1=High |first1=Dave |author2=Mahmud, Fozz |title=NBU and the Amazon Storage Gateway VTL |url=https://www.youtube.com/watch?v=rU1rFK9o20s |website=Veritas |publisher=Veritas Technologies LLC |accessdate=17 January 2018 |format=Video |date=10 March 2016}}</ref><ref name="BackupExecCloudConnector">{{cite web |title=Backup Exec 16: Best Practices for Using the Veritas Backup Exec Cloud Connector |url=https://www.veritas.com/content/support/en_US/doc/72686287-129480082-0/v128967126-129480082 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=25 October 2017}}</ref>
-
-== See also ==
-;About backup
-* Backup software
-** [[List of backup software]]
-* [[Glossary of backup terms]]
-* [[Remote backup service]]
-* [[Virtual backup appliance]]
-
-;Related topics
-* [[Data consistency]]
-* [[Data degradation]]
-* [[Data proliferation]]
-* [[Database dump]]
-* [[Digital preservation]]
-* [[Disaster recovery and business continuity auditing]]
-* [[File synchronization]]
-* [[Information repository]]
-
-== Notes ==
-{{reflist|group=note}}
-
-== References ==
-{{Reflist|2}}
-
-== External links ==
-{{Wiktionary|back up}}
-{{Wiktionary|backup}}
-{{Commons category|Backup}}
-
-[[Category:Data security]]
-[[Category:Backup| ]]
+zero<ref></ref>
' |
Lines removed in edit (removed_lines ) | [
0 => '{{about|backup in computer systems|other uses}}',
1 => '{{Use dmy dates|date=August 2018}}',
2 => false,
3 => 'In [[information technology]], a '''backup''', or the process of backing up, refers to the copying into an [[archive file]] of computer [[data]] so it may be used to restore the original after a [[data loss]] event. The verb form is [[wikt:back up|"back up"]] (a [[phrasal verb]]), whereas the noun and adjective form is [[wikt:backup|"backup"]].<ref name="AHDictionaryBackup">{{cite web |title=back•up |url=https://www.ahdictionary.com/word/search.html?q=backup |website=The American Heritage Dictionary of the English Language |publisher=Houghton Mifflin Harcourt |accessdate=9 May 2018 |year=2018}}</ref>',
4 => false,
5 => 'Backups have two distinct purposes. The primary purpose is to recover data after its loss, be it by [[File deletion|data deletion]] or [[Data corruption|corruption]]. Data loss can be a common experience of computer users; a 2008 survey found that 66% of respondents had lost files on their home PC.<ref>[http://www.kabooza.com/globalsurvey.html Global Backup Survey] {{Webarchive|url=https://web.archive.org/web/20100327235844/http://www.kabooza.com/globalsurvey.html |date=27 March 2010 }}. Retrieved 15 February 2009</ref> The secondary purpose of backups is to recover data from an earlier time, according to a user-defined [[data retention]] policy, typically configured within a [[Backup software|backup application]] for how long copies of data are required.<ref name="NelsonPro11">{{cite book |url=https://books.google.com/books?id=r4uEEsq3CJYC&printsec=frontcover |title=Pro Data Backup and Recovery |chapter=Chapter 1: Introduction to Backup and Recovery |author=Nelson, S. |publisher=Apress |pages=1–16 |year=2011 |isbn=978-1-4302-2663-5 |accessdate=8 May 2018}}</ref> Though backups represent a simple form of [[disaster recovery]] and should be part of any [[disaster recovery plan]], backups by themselves should not be considered a complete disaster recovery plan. One reason for this is that not all backup systems are able to reconstitute a computer system or other complex configuration such as a [[computer cluster]], [[active directory]] server, or [[database server]] by simply restoring data from a backup.<ref name="CougiasTheBackup03">{{cite book |url=https://books.google.com/books?id=eLviiTag5A0C&pg=PA1 |title=The Backup Book: Disaster Recovery from Desktop to Data Center |chapter=Chapter 1: What's a Disaster Without a Recovery? |author=Cougias, D.J.; Heiberger, E.L.; Koop, K. |publisher=Network Frontiers |pages=1–14 |year=2003 |isbn=0-9729039-0-9}}</ref>',
6 => false,
7 => 'Since a backup system contains at least one copy of all data considered worth saving, the [[computer data storage|data storage]] requirements can be significant. Organizing this storage space and managing the backup process can be a complicated undertaking. A data repository model may be used to provide structure to the storage. Nowadays, there are many different types of [[data storage device]]s that are useful for making backups. There are also many different ways in which these devices can be arranged to provide geographic redundancy, [[data security]], and portability.',
8 => false,
9 => 'Before data are sent to their storage locations, they are selected, extracted, and manipulated. Many different techniques have been developed to optimize the backup procedure. These include optimizations for dealing with open files and live data sources as well as compression, encryption, and [[Data deduplication|de-duplication]], among others. Every backup scheme should include [[Dry run (testing)|dry runs]] that validate the reliability of the data being backed up. It is important to recognize the limitations and human factors involved in any backup scheme.',
10 => false,
11 => '== Storage, the base of a backup system ==',
12 => false,
13 => '=== Data repository models ===',
14 => 'Any backup strategy starts with a concept of a data repository. The backup data needs to be stored, and should be organized to some degree. The organization could be as simple as a sheet of paper with a list of all backup media (CDs, etc.) and the dates they were produced. A more sophisticated setup could include a computerized index, catalog, or relational database. Different approaches have different advantages. Part of the model is the [[backup rotation scheme]].<ref name="DeanComp09">{{cite book |url=https://books.google.com/books?id=1QEMAAAAQBAJ&pg=PA602 |title=CompTIA Network+ 2009 in Depth |chapter=Chapter 14: Ensuring Integrity and Availability |author=Dean, T. |publisher=Cengage Learning |pages=571–614 |year=2009 |isbn=978-1-59863-878-3 |accessdate=8 May 2018}}</ref>',
15 => false,
16 => '; Unstructured : An unstructured repository may simply be a stack of tapes or CD-Rs or DVD-Rs with minimal information about what was backed up and when. This is the easiest to implement, but probably the least likely to achieve a high level of recoverability as it lacks automation.',
17 => '; Full only / [[system image|System imaging]] : A repository of this type contains complete system images taken at one or more specific points in time.<ref name="DeanComp09" /> This technology is frequently used by computer technicians to record known good configurations. Imaging<ref>{{Cite web |title=Five key questions to ask about your backup solution |url=http://sysgen.ca/five-key-backup-questions/ |website=sysgen.ca |accessdate=23 September 2015 |archive-url=https://web.archive.org/web/20160304042343/http://sysgen.ca/five-key-backup-questions/ |archive-date=4 March 2016 |dead-url=no |df=dmy-all}}</ref> is generally more useful for deploying a standard configuration to many systems rather than as a tool for making ongoing backups of diverse systems.',
18 => '; [[Incremental backup|Incremental]] : An incremental style repository aims to make it more feasible to store backups from more points in time by organizing the data into increments of change between points in time. This eliminates the need to store duplicate copies of unchanged data: with full backups a lot of the data will be unchanged from what has been backed up previously.<ref name="DeanComp09" /> Typically, a ''full'' backup (of all files) is made on one occasion (or at infrequent intervals) and serves as the reference point for an incremental backup set. After that, a number of ''incremental'' backups are made after successive time periods. Restoring the whole system to the date of the last incremental backup would require starting from the last full backup taken before the data loss, and then applying in turn each of the incremental backups since then.<ref>[http://www.tech-faq.com/incremental-backup.shtml Incremental Backup] {{Webarchive|url=https://web.archive.org/web/20160621090117/http://www.tech-faq.com/incremental-backup.shtml |date=21 June 2016 }}. Retrieved 10 March 2006</ref> Additionally, some backup systems can reorganize the repository to synthesize full backups from a series of incrementals.',
19 => '; [[Differential backup|Differential]] : Each differential backup saves the data that has changed since the last full backup.<ref name="DeanComp09" /> It has the advantage that at most two data sets are needed to restore the data. One disadvantage, compared to the incremental backup method, is that as time from the last full backup (and thus the accumulated changes in data) increases, so does the time to perform the differential backup. Restoring an entire system would require starting from the most recent full backup and then applying just the most recent differential backup made since that full backup.',
20 => ':: Note: Vendors have standardized on the meaning of the terms "incremental backup" and "differential backup." However, there have been cases where conflicting definitions of these terms have been used. The most relevant characteristic of an incremental backup is which reference point it uses to check for changes. By standard definition, a differential backup copies files that have been created or changed since the last full backup, regardless of whether any other differential backups have been made since then, whereas an incremental backup copies files that have been created or changed since the most recent backup of any type (full or incremental). Other variations of incremental backup include multi-level incrementals and incremental backups that compare parts of files instead of just the whole file.',
21 => '; Reverse delta : A reverse delta type repository stores a recent "mirror" of the source data and a series of differences between the mirror in its current state and its previous states. A reverse delta backup will start with a normal full backup. After the full backup is performed, the system will periodically synchronize the full backup with the live copy, while storing the data necessary to reconstruct older versions.<ref name="LeonSoftware15">{{cite book |url=https://books.google.com/books?id=pYcTBwAAQBAJ&pg=PA65 |title=Software Configuration Management Handbook |author=Leon, A. |publisher=Artech House |page=65 |year=2015 |isbn=978-1-60807-844-8 |accessdate=8 May 2018}}</ref> This can be done using either [[hard links]] or binary [[data comparison|diffs]]. This system works particularly well for large, slowly changing data sets.',
22 => '; [[Continuous data protection]] : Instead of scheduling periodic backups, the system immediately logs every change on the host system. This is generally done by saving byte or block-level differences rather than file-level differences.<ref>[http://www.sertdatarecovery.com/business-data-backup-disaster-recovery-planning-resource.html Continuous Protection white paper] {{Webarchive|url=https://web.archive.org/web/20160304072358/http://www.sertdatarecovery.com/business-data-backup-disaster-recovery-planning-resource.html |date=4 March 2016 }}. (1 October 2005). Retrieved 10 March 2007</ref> It differs from simple [[disk mirroring]] in that it enables a roll-back of the log and thus restoration of old images of data.',
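The difference between restoring from incremental and from differential backups, as described in the repository models above, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the <code>Backup</code> record, its <code>day</code> field and the <code>restore_chain</code> function are hypothetical names, and the sketch assumes that a backup set uses either incrementals or differentials after its full backup, not a mixture of both.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    day: int   # hypothetical timestamp, e.g. days since some start date
    kind: str  # "full", "incremental" or "differential"


def restore_chain(backups: List[Backup]) -> List[Backup]:
    """Return the backups to apply, in order, to restore the latest state."""
    ordered = sorted(backups, key=lambda b: b.day)
    full_positions = [i for i, b in enumerate(ordered) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup available to start the restore from")
    last_full = full_positions[-1]
    later = ordered[last_full + 1:]
    kinds = {b.kind for b in later}
    if kinds == {"incremental"}:
        # Every increment since the last full backup must be applied in turn.
        return [ordered[last_full]] + later
    if kinds == {"differential"}:
        # Only the most recent differential is needed on top of the full backup.
        return [ordered[last_full], later[-1]]
    if not kinds:
        return [ordered[last_full]]
    # Assumption of this sketch: mixed sets are out of scope.
    raise ValueError("mixed incremental/differential sets are not handled")
```

For a full backup followed by three incrementals the chain has four members, whereas for a full backup followed by three differentials it has only two, which mirrors the "at most two data sets" property of the differential model.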
23 => false,
24 => '=== Storage media ===',
25 => '[[File:DVD, USB flash drive and external hard drive.jpg|thumb|right|From left to right, a [[DVD]] disc in plastic cover, a [[USB flash drive]] and an [[external hard drive]]]]',
26 => 'Regardless of the repository model that is used, the data has to be stored on some data storage medium.',
27 => false,
28 => '; [[Magnetic tape data storage|Magnetic tape]] : Magnetic tape has long been the most commonly used medium for bulk data storage, backup, archiving, and interchange. Tape has typically had an order of magnitude better capacity-to-price ratio when compared to hard disk, but the ratios for tape and hard disk have become closer.<ref>[http://www.storagesearch.com/engenio-art2.html Disk to Disk Backup versus Tape – War or Truce?] {{Webarchive|url=https://web.archive.org/web/20160712235906/http://www.storagesearch.com/engenio-art2.html |date=12 July 2016 }} (9 December 2004). Retrieved 10 March 2007</ref> [[Magnetic tape data storage#Chronological list of tape formats|Many tape formats have been]] proprietary or specific to certain markets like mainframes or a particular brand of personal computer, but by 2014 [[Linear Tape-Open#Market performance|LTO]] was edging out two other remaining viable "super" formats—[[IBM 3592]] (now also referred to as the TS11xx series) and [[StorageTek tape formats#T10000|Oracle StorageTek T10000]],<ref name="ForbesKeepingDataLongTime">{{cite web |last1=Coughlin |first1=Tom |title=Keeping Data for a Long Time |url=https://www.forbes.com/sites/tomcoughlin/2014/06/29/keeping-data-for-a-long-time/ |website=Forbes |publisher=Forbes Media LLC |accessdate=19 April 2018 |date=29 June 2014 |at=para. Magnetic Tapes(popular formats, storage life), para. Hard Disk Drives(active archive), para. First consider flash memory in archiving(... may not have good media archive life)}}</ref> and [[Digital Data Storage#Future|further development of the smaller-capacity DDS format had been canceled]]. 
By 2017 [[Spectra Logic]], which builds [[tape library|tape libraries]] for both the LTO and TS11xx formats, was predicting that "Linear Tape Open (LTO) technology has been and will continue to be the primary tape technology."<ref name="SpectraLogicDigitalDataStorageOutlook2017">{{cite web |title=Digital Data Storage Outlook 2017 |url=https://spectralogic.com/wp-content/uploads/white-paper-digital-data-storage-outlook-2017-v3.pdf |website=Spectra |publisher=Spectra Logic |accessdate=11 July 2018 |page=14(Tape) |format=PDF |year=2017}}</ref> Tape is a [[sequential access]] medium, so even though access times may be poor, the rate of continuously writing or reading data can actually be very fast.',
29 => '; [[Hard disk]]: The capacity-to-price ratio of hard disks has been improving for many years, making them more competitive with magnetic tape as a bulk storage medium. The main advantages of hard disk storage are low access times, availability, capacity and ease of use.<ref>{{cite web |url=http://www.tomshardware.com/2007/04/18/bye_bye_tape/ |title=Bye Bye Tape, Hello 5.3TB eSATA |accessdate=22 April 2007}}</ref> External disks can be connected via local interfaces like [[SCSI]], [[USB]], [[FireWire]], or [[eSATA]], or via longer distance technologies like [[Ethernet]], [[iSCSI]], or [[Fibre Channel]]. Some disk-based backup systems, via [[Virtual tape library|Virtual Tape Libraries]] or otherwise, support [[data deduplication]], which can dramatically reduce the amount of disk storage capacity consumed by daily and weekly backup data.<ref name="RetrospectWindows12UG">{{cite web |title=Retrospect ® 12 Windows User's Guide |url=http://download.retrospect.com/docs/win/v12/user_guide/Retrospect_Win_User_Guide-EN.pdf |website=Retrospect |publisher=Retrospect Inc. 
|accessdate=2 September 2018 |format=PDF |year=2017 |pages=30-31(deduplication via Snapshots), 41-43(removable disk drives), 31-32(Dashboard), 216-218(selector as subset filter for synthetic full backups), 426-427(E-mail)}}</ref><ref>{{Cite web |url=http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html |title=Symantec Shows Backup Exec a Little Dedupe Love; Lays out Source Side Deduplication Roadmap – DCIG |website=DCIG |access-date=26 February 2016 |archive-url=https://web.archive.org/web/20160304212819/http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html |archive-date=4 March 2016 |dead-url=no |df=dmy-all}}</ref><ref name="NetBackupDeduplicationGuide">{{cite web |title=Veritas NetBackup™ Deduplication Guide |url=https://www.veritas.com/content/support/en_US/doc/ka6j00000000ADEAA2 |website=Veritas |publisher=Veritas Technologies LLC |accessdate=26 July 2018 |year=2016}}</ref> One disadvantage of hard disk backups vis-a-vis tape is that hard drives are [[Hard disk drive#Magnetic recording|close-tolerance mechanical devices]] and may be more easily damaged, especially while being transported (e.g., for off-site backups).<ref name="PCWorldHardCoreDataPreservation">{{cite web |last1=Jacobi |first1=John L. |title=Hard-core data preservation: The best media and methods for archiving your data |url=https://www.pcworld.com/article/2984597/storage/hard-core-data-preservation-the-best-media-and-methods-for-archiving-your-data.html |website=PC World |accessdate=19 April 2018 |date=29 Feb 2016 |at=sec. 
External Hard Drives(on the shelf, magnetic properties, mechanical stresses, vulnerable to shocks)}}</ref> In the mid-2000s, several drive manufacturers began to produce portable drives employing [[Hard disk drive failure#Unloading|ramp loading and accelerometer]] technology (sometimes termed a "shock sensor"),<ref name="HGSTRampLoadUnload">{{cite web |title=Ramp Load/Unload Technology in Hard Disk Drives |url=https://www.hgst.com/sites/default/files/resources/LoadUnload_white_paper_FINAL.pdf |website=HGST |publisher=Western Digital |accessdate=29 June 2018 |page=3(sec. Enhanced Shock Tolerance) |format=PDF |date=November 2007}}</ref><ref name="ToshibaCanvio3.0PortableHDD">{{cite web |title=Toshiba Portable Hard Drive (Canvio® 3.0) |url=https://www.toshibadata.com.sg/Product-Canvio-Portable-Hard-Drive.aspx |website=Toshiba Data Dynamics Singapore |publisher=Toshiba Data Dynamics Pte Ltd |accessdate=16 June 2018 |year=2018 |at=sec. Overview(Internal shock sensor and ramp loading technology)}}</ref> and—by 2010—the industry average in drop tests for drives with that technology showed drives remaining intact and working after a 36-inch non-operating drop onto industrial carpeting.<ref name="IomegaDropShock">{{cite web |title=Iomega ® Drop Guard ™ Technology |url=https://www.doc-developpement-durable.org/file/Projets-informatiques/Drop%20Guard-disque-dur-tres-solide.pdf |website=Hard Drive Storage Solutions |publisher=Iomega Corp. |accessdate=12 July 2018 |pages=2(What is Drop Shock Technology?, What is Drop Guard Technology? (... 
40% above the industry average)), 3(*NOTE) |date=20 September 2010}}</ref> The manufacturers do not, however, guarantee these results and note that a drive may fail to survive even a shorter drop.<ref name="IomegaDropShock" /> Some manufacturers also offer 'ruggedized' portable hard drives, which include a shock-absorbing case around the hard disk, and claim a range of higher drop specifications.<ref name="PCMagBestRuggedHDDs&SSDs">{{cite web |last1=Burek |first1=John |title=The Best Rugged Hard Drives and SSDs |url=https://www.pcmag.com/roundup/361072/the-best-rugged-hard-drives-and-ssds |website=PC Magazine |publisher=Ziff Davis |accessdate=4 August 2018 |at=What Exactly Makes a Drive Rugged?(When a drive is encased ... you're mostly at the mercy of the drive vendor to tell you the rated maximum drop distance for the drive) |date=15 May 2018}}</ref><ref name="WirecutterBestPortableHardDrive2017Don'tBuy">{{cite web |last1=Krajeski |first1=Justin |last2=Streams |first2=Kimber |title=The Best Portable Hard Drive |url=https://web.archive.org/web/20170331161821/http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive |work=The New York Times |accessdate=4 August 2018 |archiveurl=https://web.archive.org/web/20170331161821/http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive |archivedate=31 March 2017 |date=20 March 2017}}</ref> Another disadvantage is that over a period of years the stability of hard disk backups is shorter than that of tape backups.<ref name="ForbesKeepingDataLongTime" /><ref name="IronMountainBestLong-TermDataArchiveSolutions">{{cite web |title=Best Long-Term Data Archive Solutions |url=http://www.ironmountain.com/resources/general-articles/b/best-long-term-data-archive-solutions |website=Iron Mountain |publisher=Iron Mountain Inc. |accessdate=19 April 2018 |year=2018 |at=sec. More Reliable(average mean time between failure ... 
rates, best practice for migrating data)}}</ref><ref name="PCWorldHardCoreDataPreservation" />',
30 => '; [[Optical storage]] : Recordable [[CD]]s, [[DVD]]s, and [[Blu-ray Disc]]s are commonly used with personal computers and generally have low media unit costs. However, the capacities and speeds of these and other optical discs have traditionally been lower than those of hard disks or tapes (though advances in optical media are slowly shrinking that gap<ref name="WanOptical14">{{cite journal |title=Optical storage: An emerging option in long-term digital preservation |journal=Frontiers of Optoelectronics |author=Wan, S.; Cao, Q.; Xie, C. |volume=7 |issue=4 |pages=486–492 |year=2014 |doi=10.1007/s12200-014-0442-2}}</ref><ref name="ZhangHigh18">{{cite journal |title=High-capacity optical long data memory based on enhanced Young's modulus in nanoplasmonic hybrid glass composites |journal=Nature Communications |author=Zhang, Q.; Xia, Z.; Cheng, Y.-B.; Gu, M. |volume=9 |pages=1183 |year=2018 |doi=10.1038/s41467-018-03589-y}}</ref>). Many optical disc formats are [[Write Once Read Many|WORM]] type, which makes them useful for archival purposes since the data cannot be changed. The use of an auto-changer or jukebox can make optical discs a feasible option for larger-scale backup systems. Some optical storage systems allow for cataloged data backups without human contact with the discs, allowing for longer data integrity.',
31 => '; [[SSD]]/[[Solid state storage]] : Also known as [[flash memory]], [[thumb drive]]s, [[USB flash drive]]s, [[CompactFlash]], [[SmartMedia]], [[Memory Stick]], [[Secure Digital card]]s, etc., these devices are relatively expensive for their low capacity in comparison to hard disk drives, but are very convenient for backing up relatively low data volumes. Unlike its magnetic drive counterpart, a [[solid-state drive]] contains no moving parts, making it less susceptible to physical damage, and can offer throughput on the order of 500 Mbit/s to 6 Gbit/s. The capacity offered by SSDs continues to grow, and prices are gradually decreasing as they become more common.<ref name="MicheloniSolid17">{{cite journal |url=https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8013049 |title=Solid-State Drives (SSDs) |journal=Proceedings of the IEEE |author=Micheloni, R.; Olivo, P. |volume=105 |issue=9 |pages=1586–88 |year=2017 |doi=10.1109/JPROC.2017.2727228 |accessdate=8 May 2018}}</ref><ref name="PCMagBestRuggedHDDs&SSDs" /> Over a period of years, flash memory backups are less stable than hard disk backups.<ref name="ForbesKeepingDataLongTime" />',
32 => '; [[Remote backup service|Remote backup service AKA cloud backup]] : As [[broadband Internet access]] becomes more widespread, remote backup services are gaining in popularity. Backing up via the Internet to a remote location can protect against events such as fires, floods, or earthquakes which could destroy locally-stored backups.<ref name="DellEMC">{{cite web |url=https://www.emc.com/corporate/glossary/remote-backup.htm |title=Remote Backup |work=EMC Glossary |publisher=Dell, Inc |accessdate=8 May 2018}}</ref> There are, however, a number of drawbacks to remote backup services. First, Internet connections are usually slower than local data storage devices. Residential broadband is especially problematic, as routine backups must use an upstream link that is usually much slower than the downstream link, which is used only occasionally to retrieve a file from backup. This tends to limit the use of such services to relatively small amounts of high-value data, even if a particular service provides initial [[seed loading]]. Second, users must trust a third-party service provider to maintain the privacy and integrity of their data, although confidentiality can be assured by encrypting the data before transmission to the backup service with an [[key (cryptography)|encryption key]] known only to the user. Ultimately the backup service must itself use one of the above methods, so this could be seen as a more complex way of doing traditional backups.',
33 => '; [[Floppy disk]] and its derivatives : During the 1980s and early 1990s, many personal/home computer users associated backing up mostly with copying to floppy disks. However, the data capacity of floppy disks did not keep pace with growing demands, rendering them effectively obsolete. Later "[[superfloppy]]" devices and [[Iomega REV|related "non-floppy"]] devices provide greater storage capacity and remain supported as backup media by some developers.<ref name="RetrospectWindows12UG" />',
34 => false,
35 => '=== Managing the data repository ===',
36 => 'Regardless of the data repository model, or data storage media used for backups, a balance needs to be struck between accessibility, security and cost. These media management methods are not mutually exclusive and are frequently combined to meet the user's needs. Using on-line disks for staging data before it is sent to a near-line [[tape library]] is a common example.',
37 => false,
38 => 'Data repository implementations include<ref name="StackpoleSoftware07">{{cite book |url=https://books.google.com/books?id=gjAhVzuV7k0C&pg=PA164 |title=Software Deployment, Updating, and Patching |author=Stackpole, B.; Hanrion, P. |publisher=CRC Press |pages=164–165 |year=2007 |isbn=978-1-4200-1329-0 |accessdate=8 May 2018}}</ref><ref name="GnanasundaramInfo12">{{cite book |url=https://books.google.com/books?id=PU7gkW9ArxIC&pg=PA255 |title=Information Storage and Management: Storing, Managing, and Protecting Digital Information in Classic, Virtualized, and Cloud Environments |editor=Gnanasundaram, S.; Shrivastava, A. |publisher=John Wiley and Sons |page=255 |year=2012 |isbn=978-1-118-23696-3 |accessdate=8 May 2018}}</ref>:',
39 => false,
40 => '; [[Online|On-line]] : On-line backup storage is typically the most accessible type of data storage, and can begin a restore within milliseconds. A good example is an internal hard disk or a [[disk array]] (possibly connected to a [[Storage area network|SAN]]). This type of storage is convenient and fast, but relatively expensive. On-line storage is quite vulnerable to being deleted or overwritten, either by accident, by intentional malevolent action, or in the wake of a data-deleting [[Computer virus|virus]] payload.',
41 => '; [[Nearline storage|Near-line]] : Near-line storage is typically less accessible and less expensive than on-line storage, but still useful for backup data storage. A good example would be a [[tape library]] with restore times ranging from seconds to a few minutes. A mechanical device is usually used to move media units from storage into a drive where the data can be read or written. Generally it has safety properties similar to on-line storage.',
42 => '; [[Off-line storage|Off-line]] : Off-line storage requires some direct human action to provide access to the storage media: for example inserting a tape into a tape drive or plugging in a cable. Because the data are not accessible via any computer except during limited periods in which they are written or read back, they are largely immune to a whole class of on-line backup failure modes. Access time will vary depending on whether the media are on-site or off-site.',
43 => '; [[Off-site data protection]]: To protect against a disaster or other site-specific problem, many people choose to send backup media to an off-site vault. The vault can be as simple as a system administrator's home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker with facilities for backup media storage. Importantly, a data replica ''can'' be off-site but also ''on-line'' (e.g., an off-site [[RAID]] mirror). Such a replica has fairly limited value as a backup, and should not be confused with an off-line backup.',
44 => '; [[Backup site]] or disaster recovery center (DR center): In the event of a disaster, the data on backup media will not be sufficient to recover. Computer systems onto which the data can be restored and properly configured networks are necessary too. Some organizations have their own data recovery centers that are equipped for this scenario. Other organizations contract this out to a third-party recovery center. Because a DR site is itself a huge investment, backing up is very rarely considered the preferred method of moving data to a DR site. A more typical way would be remote [[disk mirroring]], which keeps the DR data as up to date as possible.',
45 => false,
46 => '== Selection and extraction of data ==',
47 => 'A successful backup job starts with selecting and extracting coherent units of data. Most data on modern computer systems is stored in discrete units, known as [[Computer file|files]]. These files are organized into [[filesystem]]s. Files that are actively being updated can be thought of as "live" and present a challenge to back up. It is also useful to save metadata that describes the computer or the filesystem being backed up.',
48 => false,
49 => 'Deciding what to back up at any given time is a harder process than it seems. By backing up too much redundant data, the data repository will fill up too quickly. Backing up an insufficient amount of data can eventually lead to the loss of critical information.<ref name="LeesWhatTo17">{{cite web |url=https://irontree.co.za/what-to-backup-a-critical-look-at-your-data-1935.html |title=What to backup – a critical look at your data |author=Lees, D. |work=Irontree Blog |publisher=Irontree Internet Services CC |date=25 January 2017 |accessdate=8 May 2018}}</ref>',
50 => false,
51 => '=== Files ===',
52 => '; [[File copying|Copying files]] : With the '''file-level''' approach, making copies of files is the simplest and most common way to perform a backup. A means to perform this basic function is included in all backup software and all operating systems.',
53 => false,
54 => '; Partial file copying: Instead of copying whole files, one can limit the backup to only the blocks or bytes within a file that have changed in a given period of time. This technique can use substantially less storage space on the backup medium, but requires a high level of sophistication to reconstruct files in a restore situation. Some implementations require integration with the source file system.',
55 => false,
56 => '; Deleted files : To prevent the unintentional restoration of files that have been intentionally deleted, a record of the deletion must be kept.',
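The partial file copying idea above can be illustrated with a minimal sketch that compares per-block hashes of a file against those recorded at the previous backup. The names <code>block_hashes</code> and <code>changed_blocks</code>, the fixed 4096-byte block size, and the choice of SHA-256 are assumptions made for this example; real backup software may detect changed blocks quite differently, for instance through integration with the source file system.

```python
import hashlib
from typing import List

BLOCK_SIZE = 4096  # assumed block size for this sketch


def block_hashes(data: bytes, block_size: int = BLOCK_SIZE) -> List[str]:
    """Hash each fixed-size block of the file contents."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(old_hashes: List[str], new_data: bytes,
                   block_size: int = BLOCK_SIZE) -> List[int]:
    """Return indices of blocks that are new or differ from the last backup."""
    new_hashes = block_hashes(new_data, block_size)
    return [
        i for i, digest in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != digest
    ]
```

Only the blocks at the returned indices would need to be written to the backup medium. A restore would then have to reassemble the file from the stored blocks, and the backup would also need to record the new file length so that a file that has shrunk is truncated correctly, which is part of the sophistication mentioned above.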
57 => false,
58 => '=== Filesystems ===',
59 => '; Filesystem dump: Instead of copying files within a file system, a '''block-level''' copy of the whole filesystem itself can be made. This is also known as a ''raw partition backup'' and is related to [[disk image|disk imaging]]. The process usually involves unmounting the filesystem and running a program like [[dd (Unix)]].<ref name="PrestonBackup07">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA111 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=111–114 |year=2007 |isbn=978-0-596-55504-7 |accessdate=8 May 2018}}</ref> Because the disk is read sequentially and with large buffers, this type of backup can be much faster than reading every file normally, especially when the filesystem contains many small files, is highly fragmented, or is nearly full. But because this method also reads the free disk blocks that contain no useful data, it can also be slower than conventional reading, especially when the filesystem is nearly empty. Some filesystems, such as [[XFS]], provide a "dump" utility that reads the disk sequentially for high performance while skipping unused sections. The corresponding restore utility can selectively restore individual files or the entire volume at the operator's choice.<ref name="PrestonUnix99">{{cite book |url=https://books.google.com/books?id=_i1sO47qNnMC&pg=PA73 |title=Unix Backup & Recovery |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=73–91 |year=1999 |isbn=978-1-56592-642-4 |accessdate=8 May 2018}}</ref>',
60 => false,
61 => '; Identification of changes: Some filesystems have an [[archive bit]] for each file that says it was recently changed. Some backup software looks at the date of the file and compares it with the last backup to determine whether the file was changed.',
62 => false,
63 => '; [[Versioning file system]] : A versioning filesystem keeps track of all changes to a file and makes those changes accessible to the user. Generally this gives access to any previous version, all the way back to the file's creation time. An example of this is the Wayback versioning filesystem for Linux.<ref>[http://www.aqualab.cs.northwestern.edu/publications/Cornell04VFS.html Wayback: A User-level V File System for Linux] {{Webarchive|url=https://web.archive.org/web/20070406204849/http://www.aqualab.cs.northwestern.edu/publications/Cornell04VFS.html |date=6 April 2007 }} (2004). Retrieved 10 March 2007</ref>',
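The date-comparison approach to identifying changes described above might look roughly like the following sketch, which walks a directory tree and yields files whose modification time is later than the time of the last backup. The function name <code>changed_since</code> and the idea of passing the last backup time as a Unix timestamp are assumptions for illustration; the sketch does not use an archive bit, and it cannot detect deleted files.

```python
import os
from pathlib import Path
from typing import Iterator


def changed_since(root: str, last_backup_time: float) -> Iterator[Path]:
    """Yield files under root modified after last_backup_time (Unix seconds)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                modified = path.stat().st_mtime
            except OSError:
                # The file vanished or is unreadable; skip it in this sketch.
                continue
            if modified > last_backup_time:
                yield path
```

Relying on modification times alone is a simplification: timestamps can be changed by other programs, so such a check is best treated as a heuristic rather than a guarantee that nothing else has changed.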
64 => false,
65 => '=== Live data ===',
66 => 'If a computer system is in use while it is being backed up, the possibility of files being open for reading or writing is real. If a file is open, the contents on disk may not correctly represent what the owner of the file intends. This is especially true for database files of all kinds. The term [[fuzzy backup]] can be used to describe a backup of live data that looks like it ran correctly, but does not represent the state of the data at any single point in time. This is because the data being backed up changed in the period of time between when the backup started and when it finished.<ref name="LiotineMission03">{{cite book |url=https://books.google.com/books?id=LecC2BhPPxMC&pg=PA244 |title=Mission-critical Network Planning |author=Liotine, M. |publisher=Artech House |page=244 |year=2003 |isbn=978-1-58053-559-5 |accessdate=8 May 2018}}</ref>',
67 => false,
68 => 'Backup options for live (and other) data availability scenarios include<ref name="deGuiseEnterprise08">{{cite book |url=https://books.google.com/books?id=2OtqvySBTu4C&pg=PA50 |title=Enterprise Systems Backup and Recovery: A Corporate Insurance Policy |author=de Guise, P. |publisher=CRC Press |pages=50–54 |year=2008 |isbn=978-1-4200-7640-0}}</ref>:',
69 => false,
70 => '; [[Snapshot (computer storage)|Snapshot]] backup: A snapshot is an instantaneous function of some storage systems that presents a copy of the file system as if it were frozen at a specific point in time, often by a [[copy-on-write]] mechanism. An effective way to back up live data is to temporarily [[quiesce]] them (e.g., close all files), take a snapshot, and then resume live operations. At this point the snapshot can be backed up through normal methods.<ref>[http://edseek.com/~jasonb/articles/dirvish_backup/snapshot.html What is a Snapshot backup?] {{Webarchive|url=https://web.archive.org/web/20070403041940/http://edseek.com/~jasonb/articles/dirvish_backup/snapshot.html |date=3 April 2007 }}. Retrieved 10 March 2007</ref> While a snapshot is very handy for viewing a filesystem as it was at a different point in time, it is hardly an effective backup mechanism by itself.',
71 => false,
72 => '; Open file backup: Many backup software packages feature the ability to handle open files in backup operations. Some simply check for openness and try again later. [[File locking]] is useful for regulating access to open files.',
73 => ': When attempting to understand the logistics of backing up open files, one must consider that the backup process could take several minutes to back up a large file such as a database. To back up a file that is in use, it is vital that the entire backup represent a single-moment snapshot of the file, rather than a simple copy of a read-through. This is a challenge when the file is constantly changing. Either the database file must be locked to prevent changes, or a method must be implemented to ensure that the original snapshot is preserved long enough to be copied while changes continue to be recorded. If a file is backed up while it is being changed, so that the first part of the backup represents data ''before'' a change and later parts represent data ''after'' it, the result is a corrupted, unusable file, because most large files contain internal references between their various parts that must remain consistent throughout the file.',
74 => false,
75 => '; Cold database (offline) backup: During a cold backup, the database is closed or locked and not available to users. The datafiles do not change during the backup process so the database is in a consistent state when it is returned to normal operation.<ref>[http://www.wisc.edu/drmt/oratips/sess003.html#coldbackup Oracle Tips] {{Webarchive|url=https://web.archive.org/web/20070302110933/http://www.wisc.edu/drmt/oratips/sess003.html#coldbackup |date=2 March 2007 }} (10 December 1997). Retrieved 10 March 2007</ref>',
76 => false,
77 => '; Hot database (online) backup: Some database management systems offer a means to generate a backup image of the database while it is online and usable ("hot"). This usually includes an inconsistent image of the data files plus a log of changes made while the procedure is running. Upon a restore, the changes in the log files are reapplied to bring the copy of the database up to date as of the point in time at which the initial hot backup ended.<ref>[http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup Oracle Tips] {{Webarchive|url=https://web.archive.org/web/20070302110933/http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup |date=2 March 2007 }} (10 December 1997). Retrieved 10 March 2007</ref>',
78 => false,
79 => '=== Metadata ===',
80 => 'Not all information stored on the computer is stored in files. Accurately recovering a complete system from scratch requires keeping track of this non-file data too.<ref name="Gresovnik1">{{cite web |url=http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html |title=Preparation of Bootable Media and Images |last=Grešovnik |first=Igor |date=April 2016 |archive-url=https://web.archive.org/web/20160425113119/http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html |archivedate=25 April 2016 |access-date=21 April 2016}}</ref>',
81 => '; System description: System specifications are needed to procure an exact replacement after a disaster.',
82 => '; [[Boot sector]] : It is sometimes easier to recreate the boot sector than to save it. Still, it usually isn't a normal file and the system won't boot without it.',
83 => '; [[Disk partitioning|Partition]] layout: The layout of the original disk, as well as partition tables and filesystem settings, is needed to properly recreate the original system.',
84 => '; File [[metadata]] : Each file's permissions, owner, group, ACLs, and any other metadata need to be backed up for a restore to properly recreate the original environment.',
85 => '; System metadata: Different operating systems have different ways of storing configuration information. [[Microsoft Windows]] keeps a [[Windows Registry|registry]] of system information that is more difficult to restore than a typical file.',
86 => false,
87 => '== Manipulation of data and dataset optimization ==',
88 => 'It is frequently useful or required to manipulate the data being backed up to optimize the backup process. These manipulations can improve backup speed, restore speed, data security, and media usage, and can reduce bandwidth requirements.',
89 => '; [[Data compression|Compression]] : Various schemes can be employed to shrink the size of the source data to be stored so that it uses less storage space. Compression is frequently a built-in feature of tape drive hardware.<ref name="CherrySecuring15">{{cite book |url=https://books.google.com/books?id=SD_LAwAAQBAJ&pg=PA306 |title=Securing SQL Server: Protecting Your Database from Attackers |author=Cherry, D. |publisher=Syngress |pages=306–308 |year=2015 |isbn=978-0-12-801375-5 |accessdate=8 May 2018}}</ref>',
90 => '; [[Data deduplication|Deduplication]] : When multiple similar systems are backed up to the same destination storage device, there exists the potential for much redundancy within the backed up data. For example, if 20 Windows workstations were backed up to the same data repository, they might share a common set of system files. The data repository only needs to store one copy of those files to be able to restore any one of those workstations. This technique can be applied at the file level or even on raw blocks of data, potentially resulting in a massive reduction in required storage space.<ref name="CherrySecuring15" /> Deduplication can occur on a server before any data moves to backup media, sometimes referred to as source/client side deduplication. This approach also reduces bandwidth required to send backup data to its target media. The process can also occur at the target storage device, sometimes referred to as inline or back-end deduplication.',
91 => '; [[Replication (computer science)|Duplication]] : Sometimes backup jobs are duplicated to a second set of storage media. This can be done to rearrange the backup images to optimize restore speed or to have a second copy at a different location or on a different storage medium.',
92 => '; [[Encryption]] : High-capacity removable storage media such as backup tapes present a data security risk if they are lost or stolen.<ref>[http://www.securityfocus.com/news/11048 Backups tapes a backdoor for identity thieves] {{Webarchive|url=https://web.archive.org/web/20160405033517/http://www.securityfocus.com/news/11048 |date=5 April 2016 }} (28 April 2004). Retrieved 10 March 2007</ref> Encrypting the data on these media can mitigate this problem but introduces new ones. Encryption is a CPU-intensive process that can slow down backups, and the security of the encrypted backups is only as effective as the key management policy.<ref name="CherrySecuring15" />',
93 => '; [[Multiplexing]] : When there are many more computers to be backed up than there are destination storage devices, the ability to use a single storage device with several simultaneous backups can be useful.<ref name="PrestonBackup07-02">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA219 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=219–220 |year=2007 |isbn=978-0-596-55504-7 |accessdate=8 May 2018}}</ref>',
94 => '; Refactoring: The process of rearranging the backup sets in a data repository is known as refactoring. For example, if a backup system uses a single tape each day to store the incremental backups for all the protected computers, restoring one of the computers could potentially require many tapes. Refactoring could be used to consolidate all the backups for a single computer onto a single tape. This is especially useful for backup systems that do ''incrementals forever'' style backups.',
95 => '; [[Disk staging|Staging]] : Sometimes backup jobs are copied to a staging disk before being copied to tape.<ref name="PrestonBackup07-02" /> This process is sometimes referred to as D2D2T, an acronym for Disk to Disk to Tape. This can be useful if there is a problem matching the speed of the final destination device with the source device as is frequently faced in network-based backup systems. It can also serve as a centralized location for applying other data manipulation techniques.',
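Block-level deduplication, as described in the list above, can be illustrated with a minimal Python sketch. It assumes fixed-size blocks keyed by their SHA-256 digest and an in-memory dictionary standing in for the data repository; the names `dedup_store` and `restore`, and the 4096-byte block size, are hypothetical choices for illustration and do not describe any specific product.

```python
import hashlib


def dedup_store(data, store, block_size=4096):
    """Store data in fixed-size blocks, keeping one copy of each unique block.

    store is a dict mapping a SHA-256 hex digest to the block's bytes.
    Returns the ordered list of digests needed to rebuild data.
    """
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        key = hashlib.sha256(block).hexdigest()
        # A block already present in the repository is not stored again.
        store.setdefault(key, block)
        recipe.append(key)
    return recipe


def restore(recipe, store):
    """Rebuild the original bytes from the ordered list of block digests."""
    return b"".join(store[key] for key in recipe)
```

If two workstations share a block of identical system data, backing up both adds that block to `store` only once, while each workstation keeps its own `recipe` for restoring.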
96 => false,
97 => '== Managing the backup process ==',
98 => 'As long as new data are being created and changes are being made, backups will need to be performed at frequent intervals. Individuals and organizations with anything from one computer to thousands of computer systems all require protection of data. The scales may be very different, but the objectives and limitations are essentially the same. Those who perform backups need to know how successful the backups are, regardless of scale.',
99 => false,
100 => '=== Objectives ===',
101 => '; [[Recovery point objective]] (RPO) : The point in time that the restarted infrastructure will reflect. Essentially, this is the roll-back that will be experienced as a result of the recovery. The most desirable RPO would be the point just prior to the data loss event. Making a more recent recovery point achievable requires increasing the frequency of [[file synchronization|synchronization]] between the source data and the backup repository.<ref>[http://www.riskythinking.com/glossary/recovery_point_objective.php Definition of ''recovery point objective''] {{Webarchive|url=https://web.archive.org/web/20070513180844/http://www.riskythinking.com/glossary/recovery_point_objective.php |date=13 May 2007 }}. Retrieved 10 March 2007</ref><ref>{{Cite web |title=Top four things to consider in business continuity planning |url=http://sysgen.ca/top-four-things-business-continuity-planning/ |website=sysgen.ca |accessdate=23 September 2015 |archive-url=https://web.archive.org/web/20160304075050/http://sysgen.ca/top-four-things-business-continuity-planning/ |archive-date=4 March 2016 |dead-url=no |df=dmy-all}}</ref>',
102 => '; [[Recovery time objective]] (RTO) : The amount of time elapsed between disaster and restoration of business functions.<ref>[http://www.riskythinking.com/glossary/recovery_time_objective.php Definition of ''recovery time objective''] {{Webarchive|url=https://web.archive.org/web/20070516081425/http://www.riskythinking.com/glossary/recovery_time_objective.php |date=16 May 2007 }}. Retrieved 7 March 2007</ref>',
103 => '; [[Data security]] : In addition to preserving access to data for its owners, data must be restricted from unauthorized access. Backups must be performed in a manner that does not compromise the original owner's undertaking. This can be achieved with data encryption and proper media handling policies.<ref name="LittleImplement03">{{cite book |url=https://books.google.com/books?id=_DqO6kizEDUC&pg=PA17 |title=Implementing Backup and Recovery: The Readiness Guide for the Enterprise |chapter=Chapter 2: Business Requirements of Backup Systems |author=Little, D.B. |publisher=John Wiley and Sons |pages=17–30 |year=2003 |isbn=978-0-471-48081-5 |accessdate=8 May 2018}}</ref>',
104 => '; [[Data retention]] period : Regulations and policy can lead to situations where backups are expected to be retained for a particular period, but not any further. Retaining backups after this period can lead to unwanted liability and sub-optimal use of storage media.<ref name="LittleImplement03" />',
105 => false,
106 => '=== Limitations ===',
107 => 'An effective backup scheme will take into consideration the following situational limitations<ref name="NelsonPro11-2">{{cite book |url=https://books.google.com/books?id=r4uEEsq3CJYC&printsec=frontcover |title=Pro Data Backup and Recovery |chapter=Chapter 9: Putting It All Together: Sample Backup Environments |author=Nelson, S. |publisher=Apress |pages=203–246 |year=2011 |isbn=978-1-4302-2663-5 |accessdate=8 May 2018}}</ref>:',
108 => false,
109 => '; Backup window: The period of time when backups are permitted to run on a system is called the backup window. This is typically the time when the system sees the least usage and the backup process will have the least amount of interference with normal operations. The backup window is usually planned with users' convenience in mind. If a backup extends past the defined backup window, a decision is made whether it is more beneficial to abort the backup or to lengthen the backup window.',
110 => '; Performance impact: All backup schemes have some performance impact on the system being backed up. For example, for the period of time that a computer system is being backed up, the hard drive is busy reading files for the purpose of backing up, and its full bandwidth is no longer available for other tasks. Such impacts should be analyzed.',
111 => '; Costs of hardware, software, labor: All types of storage media have a finite capacity with a real cost. Matching the correct amount of storage capacity (over time) with the backup needs is an important part of the design of a backup scheme. Any backup scheme has some labor requirement, but complicated schemes have considerably higher labor requirements. The cost of commercial backup software can also be considerable.',
112 => '; Network bandwidth: Distributed backup systems can be affected by limited network bandwidth.',
113 => false,
114 => '=== Implementation ===',
115 => 'Meeting the defined objectives in the face of the above limitations can be a difficult task. The tools and concepts below can make that task more achievable.',
116 => '; Scheduling: Using a [[job scheduler]] can greatly improve the reliability and consistency of backups by removing part of the human element. Many backup software packages include this functionality.',
117 => '; Authentication: Over the course of regular operations, the user accounts and/or system agents that perform the backups need to be authenticated at some level. The power to copy all data off of or onto a system requires unrestricted access. Using an authentication mechanism is a good way to prevent the backup scheme from being used for unauthorized activity.',
118 => '; Chain of trust : Removable [[storage media]] are physical items and must only be handled by trusted individuals. Establishing a chain of trusted individuals (and vendors) is critical to defining the security of the data.',
119 => false,
120 => '=== Measuring the process ===',
121 => 'To ensure that the backup scheme is working as expected, the following best practices should be enacted<ref name="AkhtarDatabase12">{{cite journal |title=Database Backup and Recovery Best Practices |journal=ISACA Journal |author=Akhtar, A.N.; Buchholtz, J.; Ryan, M.; Setty, K. |volume=1 |pages=1–6 |year=2012 |url=https://www.isaca.org/Journal/archives/2012/Volume-1/Pages/Database-Backup-and-Recovery-Best-Practices.aspx |accessdate=8 May 2018}}</ref><ref name=DorionBackupReportingTool>{{cite web |last1=Dorion |first1=Pierre |title=Why you need a data backup reporting tool |url=http://searchdatabackup.techtarget.com/tip/Why-you-need-a-data-backup-reporting-tool |website=TechTarget |publisher=Tech Target Inc. |accessdate=13 November 2017 |date=June 2008}}</ref><ref name="PritchardCloud17">{{cite web |url=https://www.computerweekly.com/feature/Cloud-to-cloud-backup-What-it-is-and-why-you-need-it |title=Cloud-to-cloud backup: What it is and why you need it |author=Pritchard, S. |work=Computer Weekly |publisher=TechTarget |date=December 2017 |accessdate=8 May 2018}}</ref>:',
122 => false,
123 => '; [[Backup validation]] : (also known as "backup success validation") Provides information about the backup, and proves compliance to regulatory bodies outside the organization: for example, an insurance company in the USA might be required under [[Health Insurance Portability and Accountability Act|HIPAA]] to demonstrate that its client data meet records retention requirements.<ref>[http://www.hipaadvisory.com/regs/recordretention.htm HIPAA Advisory] {{Webarchive|url=https://web.archive.org/web/20070411135655/http://www.hipaadvisory.com/regs/recordretention.htm |date=11 April 2007 }}. Retrieved 10 March 2007</ref> Disaster, data complexity, data value and increasing dependence upon ever-growing volumes of data all contribute to the anxiety around and dependence upon successful backups to ensure [[business continuity]]. Thus many organizations rely on third-party or "independent" solutions to test, validate, and optimize their backup operations (backup reporting).',
124 => '; Reporting: In larger configurations, reports are useful for monitoring media usage, device status, errors, vault coordination and other information about the backup process.',
125 => '; Logging: In addition to the history of computer generated reports, activity and change logs are useful for monitoring backup system events.',
126 => '; Validation: Many backup programs use [[checksum]]s or [[hash function|hashes]] to validate that the data was accurately copied. These offer several advantages. First, they allow data integrity to be verified without reference to the original file: if the file as stored on the backup medium has the same checksum as the saved value, then it is very probably correct. Second, some backup programs can use checksums to avoid making redundant copies of files, and thus improve backup speed. This is particularly useful for the deduplication process.',
127 => '; Monitored backup: Backup processes can be monitored locally via a software dashboard or by a third-party monitoring center. Both alert users to any errors that occur during automated backups. Some third-party monitoring services also allow collection of historical metadata that can be used for storage resource management purposes such as projecting data growth and locating redundant primary storage capacity and reclaimable backup capacity.',
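The checksum validation described above can be sketched as follows. This is a minimal illustration, assuming SHA-256 as the hash and a digest recorded at backup time; the function names `sha256_of_file` and `verify_backup` are hypothetical and are not taken from any backup program.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Reading in chunks keeps memory use constant for large files.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(backup_path, recorded_digest):
    """Return True if the stored copy still matches the digest saved at backup time."""
    return sha256_of_file(backup_path) == recorded_digest
```

Because only the saved digest is needed, the stored copy can be checked without access to the original file.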
128 => false,
129 => '== Enterprise client-server backup ==',
130 => false,
131 => '"Enterprise client-server" backup software describes a class of software applications that back up data from a variety of client computers centrally to one or more server computers, with the particular needs of [[Company|enterprises]] in mind. They may employ a scripted client–server<ref name="KissellTakeControl2.0">{{cite book |last1=Kissell |first1=Joe |title=Take Control of Mac OS X Backups |date=2007 |publisher=TidBITS Electronic Publishing |location=Ithaca, NY |isbn=0-9759503-0-4 |edition=Version 2.0 |url=http://people.fas.harvard.edu/~techtool/pages/Take_Control_of_Mac_OS_X_Backups_(2.0).pdf |accessdate=22 September 2017 |ref=Kissell |pages=24 (client-server), 127 (script), 165 (client-server), 128 (subvolume—''later'' renamed Favorite Folder in Macintosh variant)}}</ref> backup model<ref name="EnterpriseBackupChallenges">{{cite web |last1=Rassokhin? |first1=Alexander? |title=Enterprise Network Backup Challenges |url=http://www.backupschedule.net/enterprise-network-backup.html |website=All About Backup |publisher=Novosoft LLC |accessdate=13 November 2017 |year=2012}}</ref> with a backup [[server (computing)|server]] program running on one computer, and with small-footprint [[client (computing)|client]] programs (referred to as "agents" in some applications) running on the other computers being backed up, in either a single platform or [[heterogeneous network|mixed platform network]]. Enterprise-specific requirements<ref name="EnterpriseBackupChallenges" /> include the need to back up large amounts of data on a systematic basis, to adhere to legal requirements for the maintenance and archiving of files and data, and to satisfy short-recovery-time objectives. 
To satisfy these requirements, which World Backup Day (31 March)<ref name="CBC-WorldBackupDay">{{cite web |url=http://www.cbc.ca/news/technology/world-backup-day-1.3510588 |title=World Backup Day highlights importance of protecting data |last=Misener |first=Dan |date=29 March 2016 |publisher=CBC News}}</ref><ref name="ZDNetWorldBackupDay">{{cite web |title=World Backup Day: deutliche Lücken zwischen Sicherheitsrisiko und Nutzerverhalten |url=http://www.zdnet.de/88291257/ |publisher=[[ZDNet]] |date=31 March 2017 |language=de-DE |first=Anja |last=Schmoll-Trautmann}}</ref><ref name="eWeekWorldBackupDay">{{cite web |last1=Preimesberger |first1=Chris |title=World Backup Day 2017: 'We Don't Know the Day Nor the Hour' |url=http://www.eweek.com/storage/world-backup-day-2017-we-don-t-know-the-day-nor-the-hour |website=eWeek |publisher=QuinStreet |accessdate=11 November 2017 |date=31 March 2017 |at=Ian Wood of Veritas}}</ref> highlights, it is typical for an enterprise to appoint a backup administrator, who is a part of office administration rather than of the IT staff, and whose role is "being the keeper of the data".<ref name="DorionBackupAdminRole">{{cite web |last1=Dorion |first1=Pierre |title=The true role of a backup administrator |url=http://searchdatabackup.techtarget.com/news/1322981/The-true-role-of-a-backup-administrator |website=TechTarget |publisher=TechTarget, Inc. |accessdate=13 November 2017 |date=4 August 2008 |quote=On the other hand, the role of a backup administrator should be one of administration, not operation....whose role is "being the keeper of the data"}}</ref>',
132 => false,
133 => 'Such applications make cumulative backups of ''multiple'' client machines' source files to, or do restores from, what would ordinarily be referred to as an [[archive file]]. However, some of these applications use (or once used<ref name="BackupExecArchivingOptionNoLongeSupported">{{cite web |title=Backup Exec Archiving Option is no longer supported for Backup Exec 15 Feature Pack 1 |url=https://www.veritas.com/support/en_US/article.100023956 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=13 May 2018 |date=30 June 2015}}</ref>) the term "archive" to refer to a backup operation that deletes data from a client source once the data's backup is complete.<ref name="NetBackupWhatIsArchiving">{{cite web |last1=Bokelman |first1=Seth |title=what is archiving in Netbackup? |url=https://vox.veritas.com/t5/NetBackup/what-is-archiving-in-Netbackup/m-p/490153#M112727 |website=VOX |publisher=Veritas Technologies LLC |accessdate=13 May 2018 |date=26 February 2012}}</ref><ref name="RetrospectMac14UG">{{cite web |title=Retrospect ® 14.0 Mac User's Guide |url=http://download.retrospect.com/docs/mac/v14/user_guide/Retrospect_Mac_User_Guide-EN.pdf |website=Retrospect |publisher=Retrospect Inc. |accessdate=28 March 2017 |format=PDF |date=March 2017}}</ref> Therefore, the discussion of these applications will use the non-proprietary term "set(s) of backups" instead of "archive file(s)".',
134 => false,
135 => '=== Performance ===',
136 => false,
137 => 'The [[Hard disk drive#Price evolution|steady improvement in hard disk drive price per byte]] has made feasible a [[Backup#Manipulation of data and dataset optimization|disk-to-disk-to-tape]] strategy, combining the speed of disk backup and restore with the capacity and low cost of tape for offsite archival and disaster recovery purposes.<ref name="FernandoCombineDiskTapeBenefits">{{cite web |last1=Fernando |first1=Sal |title=Combine disk, tape benefits to protect data |url=http://www.zdnet.com/article/combine-disk-tape-benefits-to-protect-data/ |publisher=ZDNet |accessdate=13 November 2017 |date=30 April 2008}}</ref> This, with [[Comparison of file systems#File capabilities|file system technology]], has led to features such as:',
138 => '; Improved disk-to-disk-to-tape capabilities: Enable automated transfers to tape for safe offsite storage of disk sets of backups that were created for fast onsite restores.<ref name="EMCRetroWindows7">{{cite web |title=New EMC Dantz Retrospect 7 Improves Data Protection for SMBs and the Distributed Enterprise |url=http://www.emc.com/about/news/press/us/2005/20050131-2906.htm |website=DellEMC [current] |publisher=EMC Corp. [orig. publisher] |accessdate=23 November 2016 |date=31 January 2005}}</ref><ref name="NetBackupAboutReplicationDirector">{{cite web |title=About NetBackup Replication Director |url=https://www.veritas.com/support/en_US/doc/59229900-126796169-0/v58079997-126796169 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=13 July 2017}}</ref><ref name="BackupExecDuplicatingBackedUpData">{{cite web |title=Symantec Backup Exec: About duplicating backed up data |url=http://backup-exec.helpmax.net/en/backing-up-data/about-duplicating-backed-up-data/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=13 January 2018}}</ref>',
139 => '; Create synthetic full backups: For example, onto tapes from existing disk sets of backups—by copying multiple backups of the same source(s) from one set of backups to another. This is termed a [[Incremental backup#Synthetic full backup|"synthetic full backup"]] because, after the transfer, the destination set of backups contains the same data it would after full backups.<ref name="EMCRetroWindows7" /><ref name="NetBackupAboutSyntheticBackups">{{cite web |title=About synthetic backups |url=https://www.veritas.com/content/support/en_US/doc/18716246-126559472-0/id-SF780163836-126559472 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=25 September 2017}}</ref><ref name="BackupExecSyntheticFullBackup">{{cite web |title=Symantec Backup Exec: About the synthetic backup feature |url=http://backup-exec.helpmax.net/en/symantec-backup-exec-advanced-disk-based-backup-option/about-the-synthetic-backup-feature/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=13 January 2018}}</ref> One application can exclude<ref group=note name=RetrospectExclusionInclusion>Exclusion and/or inclusion is done with Selectors in the Windows variant; this misleading term has been changed to Rules in the Macintosh variant.</ref> files and folders from the synthetic full backup.<ref name="RetrospectWindows12UG" />',
140 => false,
141 => '; Automated data grooming: Frees up space on disk sets of backups by removing out-of-date backup data—usually based on an administrator-defined retention period.<ref name="eWeekWorldBackupDay" /><ref name="FernandoCombineDiskTapeBenefits" /><ref name="EMCRetroWindows7" /><ref name="NetBackupStorageLifecyclePolicy">{{cite web |last1=Kaczorek |first1=Mariusz |title=NetBackup Storage Lifecycle Policy (SLP): Overview |url=https://www.settlersoman.com/netbackup-storage-lifecycle-policy-slp-overview/ |website=Settlersoman |publisher=Settlersoman |accessdate=2 February 2018 |date=15 August 2015}}</ref><ref name="BackupExecDataGrooming">{{cite web |last1=Jain |first1=Hemant |title=VOX Knowledge Base: Data Protection Knowledge Base: Data Protection |url=https://vox.veritas.com/t5/Articles/Automated-Disk-management-and-Data-retention-in-Backup-Exec-DLM/ta-p/809167 |website=VOX |publisher=Veritas Technologies LLC |accessdate=13 January 2018 |date=14 April 2015 |quote=Employee [of Veritas]}}</ref><ref group=note>A few backup applications—mostly free ones—term this "pruning" instead of "grooming", but other applications use the term "pruning" to mean omitting certain ''types'' of files from backups.</ref> One method of removing data is to keep the last backup of each day/week/month for the last respective week/month/specified-number-of-months, permitting compliance with regulatory requirements.<ref name="RetrospectMac12UG">{{cite web |title=Retrospect ® 12.0 Mac User's Guide |url=http://download.retrospect.com/docs/mac/v12/user_guide/Retrospect_Mac_User_Guide-EN.pdf |website=Retrospect |publisher=Retrospect Inc. 
|accessdate=28 December 2017 |format=PDF |year=2015}}</ref> One application has a "performance-optimized grooming" mode that removes only the outdated information that it can quickly delete from a set of backups.<ref name="TitBITSMacintosh13">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 13 |url=https://tidbits.com/article/16311 |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=27 October 2016 |date=5 March 2016}}</ref> This is the only mode of grooming allowed for cloud sets of backups, and is also up to 5 times as fast when used on locally stored disk sets of backups. The "storage-optimized grooming" mode reclaims more space because it rewrites the set of backups, and in this application also permits exclusion compliance with the [[GDPR]]<ref name="RetrospectKnowledgeBase">{{cite web |title=Support: Knowledge Base |url=https://www.retrospect.com/en/support/kb/ |website=Retrospect |publisher=Retrospect Inc. |accessdate=25 August 2018 |date=2 July 2018 |at=#Resources (Auto Launching Guide ..., ... difference between "Backup" and "Duplicate", Avid Support ..., Instant Scan FAQ), #Email Backup, #Top Articles (BackupBot – Deep Dive into ProactiveAI, How to Set Up Remote Backup, GDPR – Deep Dive into Data Retention Policies, Deep Dive - Components of a Retrospect Backup)}}</ref> via rules<ref group=note name=RetrospectExclusionInclusion /> that can instead be used for other filtering.<ref name="TitBITSMacintosh15.1.1">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 15.1.1 |url=https://tidbits.com/watchlist/retrospect-15-1-1/ |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=20 June 2018 |date=28 May 2018}}</ref>',
142 => '; Multithreaded backup server: Capable of simultaneously performing multiple backup, restore, and copy operations in separate "activity threads" (once needed only by those who could afford multiple tape drives).<ref name="EnterpriseBackupChallenges" /><ref name="NetBackupMultistreamMultiplex">{{cite web |title=What is the difference between multiplexing and multistreaming? |url=https://www.veritas.com/support/en_US/article.TECH10085 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=19 November 2017 |date=29 January 2015}}</ref><ref name="BackupExecRunConcurrentJobs">{{cite web |last1=McMillen |first1=Robert |title=How to run concurrent jobs in Backup exec 15 |url=https://www.youtube.com/watch?v=1-9x9So038g |via=YouTube |publisher=Google |accessdate=14 January 2018 |format=Video |date=21 July 2015}}</ref> In one application, all the categories of information for a particular "backup server" are stored by it; when an [[backup#User interface|"Administration Console"]] process is started, its process synchronizes information with all running LAN/WAN backup servers.<ref name="TidBITSEMCShips">{{cite web |last1=Engst |first1=Adam |title=EMC Ships Modernized Retrospect 8 |url=https://tidbits.com/article/10159 |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=12 September 2017 |date=23 March 2009}}</ref>',
143 => '; Block-level incremental backup: The ability to back up only the blocks of a file that have changed, a [[Incremental backup#Block level incremental|refinement of incremental backup]] that saves space<ref name="TitBITSMacintosh11">{{cite web |last1=Schmitz |first1=Agen |title=Retrospect 11 |url=https://tidbits.com/article/14573 |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=27 April 2017 |date=6 March 2014}}</ref><ref name="NetBackupBlockLevelOracle">{{cite web |title=How Veritas NetBackup block-level incremental backup works for Oracle database files |url=https://sort.symantec.com/public/documents/sfha/6.0/aix/productguides/html/sf_adv_ora/ch21s01s01.htm |website=Symantec |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |year=2013}}</ref><ref name="BackupExecBlockLevel">{{cite web |last1=Harbaugh |first1=Logan |title=Developing a Real Backup Plan with Symantec's Backup Exec 15 |url=https://edtechmagazine.com/higher/article/2015/10/developing-real-backup-plan-symantecs-backup-exec-15 |website=EdTech |publisher=CDW LLC |accessdate=14 January 2018 |date=Fall 2015}}</ref> and may save time.<ref name="EnterpriseBackupChallenges" /><ref name="WhitehouseFile-levelBlock-levelDedup">{{cite web |last1=Whitehouse |first1=Lauren |title=The pros and cons of file-level vs. block-level data deduplication technology |url=http://searchdatabackup.techtarget.com/tip/The-pros-and-cons-of-file-level-vs-block-level-data-deduplication-technology |website=TechTarget |publisher=Tech Target Inc. |accessdate=13 November 2017 |date=September 2008}}</ref> Such [[Backup#Files|partial file copying]] is especially applicable to a [[database]].',
144 => '; "Instant" scanning of client volumes: Uses the [[USN Journal]] on Windows NTFS and [[FSEvents]] on macOS to reduce scanning time<ref name="RetrospectKnowledgeBase" /> on both incremental backups, which fits more sources into the [[Backup#Limitations|backup window]],<ref name="EnterpriseBackupChallenges" /><ref name="NetBackupAccelerator">{{cite web |title=About the Accelerator feature in NetBackup 7.5 |url=https://www.veritas.com/support/en_US/article.000086263 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=10 November 2017}}</ref><ref name="BackupExecDeterminingIfFileBackedUp">{{cite web |title=Veritas Backup Exec Administrator's Guide: How Backup Exec determines if a file has been backed up |url=https://www.veritas.com/content/support/en_US/doc/59226269-99535599-0/v63768146-99535599 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=7 February 2018 |date=11 November 2017}}</ref> and restores.<ref name="TidBITSMac10">{{cite web |last1=Engst |first1=Adam |title=Retrospect 10 Reduces Backup Time with Instant Scan Technology |url=https://tidbits.com/article/13379 |website=TidBITS |publisher=TidBITS Publishing Inc. |accessdate=25 October 2016 |date=6 November 2012}}</ref>',
145 => '; Cramming or evading the [[Backup#Limitations|backup window]]: One application has the "multiplexed backup" capability of cramming the [[Backup#Limitations|backup window]] by sending data from multiple clients to a single tape drive simultaneously; "this is useful for low end clients with slow throughput ... [that] cannot send data fast enough to keep the tape drive busy .... will reduce the performance of restores."<ref name="NetBackupMultistreamMultiplex" /> Another application allows an enterprise that has computers transiently connecting to the network over a long workday to evade the window by using [[Retrospect (software)#Small-group features|Proactive scripts]].',
146 => false,
147 => '=== Source file integrity ===',
148 => '; Backing up interactive applications: [[Interactive computing|Such applications]] must be protected by having their services [[quiesce|paused]] while their [[Backup#Live data|live data]] is being backed up, and then [[:wikt:unpause|unpaused]].<ref name="EnterpriseBackupSoftware: WorkstationsEmailDatabases">{{cite web |last1=Rassokhin? |first1=Alexander? |title=Enterprise Backup Software: Backup Network Workstations, Email and Databases |url=http://www.backupschedule.net/enterprise-backup.html |website=All about Backup |publisher=Novosoft LLC |accessdate=24 January 2018 |year=2012}}</ref> Some enterprise backup applications accomplish pausing/unpausing of services via built-in provisions (for many specific databases and other interactive applications) that automatically become part of the backup software's script execution; these provisions [[Retrospect (software)#Editions and Add-Ons|may be purchased separately]].<ref name="NetBackupDatabase&AppAgentCompatibility">{{cite web |title=Veritas NetBackup ™ 8.0 – 8.x.x Database and Application Agent Compatibility List |url=https://www.veritas.com/content/support/en_US/doc/NB_80_DBSCL |website=Veritas |publisher=Veritas Technologies LLC (US) |accessdate=19 November 2017 |date=17 November 2017}}</ref><ref name="BackupExecAgents&Options">{{cite web |title=Backup Exec TM 16 Agents and Options |url=https://www.veritas.com/content/dam/Veritas/docs/data-sheets/be16-agents-and-options.pdf |website=Veritas |publisher=Veritas Technologies LLC |accessdate=14 January 2018 |year=2016}}</ref> However, another application has also added [[Scripting language#Extension/embeddable languages|"script hooks"]] that enable the optional automatic execution, at specific events during runs of a GUI-coded backup script, of portions of an external script containing commands pre-written in a standard [[scripting language]]. Since the external script is provided by an installation's backup administrator, the code activated by the "script hooks" may accomplish not only data protection (by pausing/unpausing interactive services) but also [[Backup#User interface|integration with monitoring systems]].<ref name="RetrospectMac14UG" />',
149 => false,
150 => '=== User interface ===',
151 => false,
152 => 'To accommodate the requirements of a backup administrator who may not be part of the IT staff with access to the secure server area, enterprise client-server software may include features such as:',
153 => false,
154 => '; Administration Console:',
155 => ':The backup administrator's backup server [[GUI]] management and near-term reporting tool.<ref name="DorionBackupReportingTool" /> Its window shows the selected backup server, with a standard toolbar on top. A sidebar on the left or navigation bar shows the clickable categories of backup server information for it; each category shows a panel, which may have a specialized toolbar below or in place of the standard toolbar. The built-in categories include activities—thus providing [[Backup#Measuring the process|monitored backup]], past backups of each individual source, scripts/policies/jobs (terminology depending on the application), sources (directly/indirectly), sets of backups, and storage devices.<ref name="RetrospectMac14UG" /><ref name="NetBackupAdminGuideVol.1">{{cite web |title=Symantec NetBackup ™ Administrator's Guide, Volume I Windows |url=http://www-personal.umich.edu/~danno/symantec/NetBackup_AdminGuideI_WinServer.pdf |website=Symantec |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |pages=35–45(Administration Console), 833–843(Activity Monitor), 888–894(Reports utility), 912(Remote Administration Console), 915–938(Java Console) |year=2012}}</ref><ref name="BackupExecAdminConsole">{{cite web |title=Symantec Backup Exec: About the Administration Console |url=http://backup-exec.helpmax.net/en/introducing-backup-exec/about-the-administration-console/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=10 December 2017}}</ref>',
156 => '; User-initiated backups and restores: These supplement the administrator-initiated backups and restores which backup applications have always had, and relieve the administrator of time-consuming tasks.<ref name="DorionBackupAdminRole" /> The user designates the date of the past backup from which files or folders are to be restored—once IT staff has mounted the proper backup volume on the backup server.<ref name="FernandoCombineDiskTapeBenefits" /><ref name="RetrospectMac14UG" /><ref name="NetBackupOperationalRestore">{{cite web |title=OpsCenter Operational Restore |url=https://www.veritas.com/support/en_US/article.100038022 |website=Veritas Support |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |date=12 March 2012}}</ref><ref name="BackupExecUserRetrieve">{{cite web |title=How Backup Exec Retrieve works |url=http://backup-exec.helpmax.net/en/using-backup-exec-retrieve/how-backup-exec-retrieve-works/ |website=''Helpmax.net'' |publisher=HelpMax Software Help & Shop Inc. |accessdate=14 January 2018}}</ref>',
157 => '; High-level/long-term reports supplementing the Administration Console<ref name="DorionBackupReportingTool" />: Within one application's Console panel displayed by clicking the name of the backup server itself in the sidebar, an activities pane on the top left of the displayed [[Dashboard (business)|Dashboard]] has a moving bar graph for each activity going on for the backup server together with a pause and stop button for the activity. Three more panes give the results of activities in the past week: backups each day, sources backed up, and sources not backed up. Finally a storage pane has a line for each set of backups, showing the last-modified date and depictions of the total bytes used and available.<ref name="TitBITSMacintosh11" /><ref name="RetrospectMac14UG" /> For the application's Windows variant, the Dashboard acts as a display-only substitute for a non-existent Console.<ref name="RetrospectWindows12UG" /> Other applications have a separate reporting facility that can cover multiple backup servers.<ref name="NetBackupOperationsManager">{{cite web |last1=Antony |first1=Erica |author2=Tim Burlowski |title=NetBackup Operations Manager: Monitoring, Alerting and Reporting for Veritas NetBackup |url=https://vox.veritas.com/t5/Articles/NetBackup-Operations-Manager-Monitoring-Alerting-and-Reporting/ta-p/806080 |website=Symantec |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |pages=4–5(monitoring), 6–7(alerting), 7(3rdPartyEventMgmt.), 11–18(reporting) |format=PDF attachment |date=January 2008}}</ref><ref name="BackupExecEnterpriseDataProtection">{{cite web |title=Windows® Enterprise Data Protection with Symantec Backup Exec™ |url=http://www.r2gen.com.br/images/symantec/pdf/symantec_protegendo_sua_empresa.pdf |website=Symantec |publisher=Veritas Technologies LLC |accessdate=14 January 2018 |pages=5–8 (CASO) |format=PDF |year=2007}}</ref>',
158 => '; E-mailing of notifications about operations to chosen recipients<ref name="DorionBackupReportingTool" />: Can alert the recipient to, e.g., errors or warnings, with a log to assist in pinpointing problems.<ref name="RetrospectWindows12UG" /><ref name="NetBackupOperationsManager" /><ref name="BackupExecConfigureNotifications">{{cite web |title=How to configure notification recipients in Backup Exec 12.0 and above |url=https://www.veritas.com/support/en_US/article.100016176 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=10 November 2017}}</ref>',
159 => '; Integration with monitoring systems<ref name="DorionBackupReportingTool" />: Such systems provide [[Backup#Measuring the process|backup validation]]. One application's administrators can deploy custom scripts that—invoking [[webhook]] code via [[Backup#Source file integrity|script hooks]]—populate such systems as the freeware [[Nagios]] and [[IFTTT]] and the [[freemium]] [[Slack (software)|Slack]] with script successes and failures corresponding to the activities category of the Console, per-source backup information corresponding to the past backups category of the Console, and media requests.<ref name="RetrospectMac14UG" /> Another application has integration with two of the developer's monitoring systems, one that is part of the client-server backup application and one that is more generalized.<ref name="NetBackupOperationsManager" /> Yet another application has integration with a monitoring system that is part of the client-server backup application,<ref name="BackupExecJobMonitor">{{cite web |title=Veritas Backup Exec Administrator's Guide: About the Job Monitor |url=https://www.veritas.com/content/support/en_US/doc/59226269-99535599-0/v76313540-99535599 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=11 November 2017}}</ref> but can also be integrated with Nagios.<ref name="BackupExecNagiosPlugins">{{cite web |title=Nagios plugins for monitoring BackupExec |url=https://exchange.nagios.org/directory/Plugins/Backup-and-Recovery/BackupExec |website=Nagios Exchange |publisher=Nagios Enterprises |accessdate=15 January 2018}}</ref>',
160 => false,
161 => '=== LAN/WAN/Cloud ===',
162 => '; Advanced network client support: All applications include support for multiple network interfaces.<ref name="EnterpriseBackupChallenges" /><ref name="EMCRetroMac8">{{cite web |title=EMC Announces Retrospect 8.0 Backup and Recovery Software For Mac |url=http://www.infotomorrowmag.com/about/news/press/2009/20090106-02.htm |website=DellEMC [current] |publisher=EMC Corp. [orig. publisher] |accessdate=10 November 2016 |date=6 January 2009}}</ref><ref name="BackupExecConfiguringNetworkOptionsBackup">{{cite web |title=Veritas Backup Exec Administrator's Guide: Configuring network options for backup jobs |url=https://www.veritas.com/content/support/en_US/doc/59226269-99535599-0/v96257307-99535599 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=17 November 2017}}</ref> However, one application, unless [[Data deduplication#Source versus target deduplication|deduplication is done by a separate sub-application between the client and the backup server]], cannot provide "resilient network connections" for machines on a WAN.<ref name="NetBackupDeduplication">{{cite web |title=Veritas NetBackup™ Deduplication Guide |url=https://www.veritas.com/content/support/en_US/doc/ka6j00000000ADEAA2 |website=Veritas |publisher=Veritas Technologies LLC (US) |accessdate=18 November 2017 |page=171(Resilient network properties) |format=PDF |year=2016}}</ref> One application can extend support to "remote" clients anywhere on the Internet for a [[Retrospect (software)#Small-group features|Proactive]] script and for [[Backup#User interface|user-initiated backups/restores]].<ref name="RetrospectKnowledgeBase" />',
163 => '; Cloud seeding and Large-Scale Recovery: Because of the large amount of data already backed up,<ref name="EnterpriseBackupChallenges" /> an enterprise adopting [[Backup#Storage_media|cloud backup]] will likely need to do [[Seed loading|"seeding"]]. This service copies a large volume of locally stored backup data onto a large-capacity disk device, which is then physically shipped to the cloud storage site and uploaded.<ref name="WhatIsAWSSnowball?">{{cite web |title=What Is an AWS Snowball Appliance? |url=https://docs.aws.amazon.com/snowball/latest/ug/whatissnowball.html |website=AWS |publisher=Amazon.com |accessdate=8 March 2018 |year=2018}}</ref><ref name="RouseCloudSeedingDef">{{cite web |last1=Rouse |first1=Margaret |title=Definition: cloud seeding |url=http://searchdatabackup.techtarget.com/definition/cloud-seeding |website=TechTarget |publisher=Tech Target Inc. |accessdate=16 November 2017 |date=December 2011}}</ref> After the large initial upload, the enterprise's backup software can be reconfigured to read from and write to the backup incrementally in its cloud location.<ref name="RetrospectChangingPathsMac">{{cite web |title=Changing paths Cloud Mac |url=https://www.youtube.com/watch?v=Ac3BhXO4T1g |via=YouTube |publisher=Retrospect Inc. |accessdate=7 October 2016 |format=Video |date=29 February 2016}}</ref> The service may need to be employed in reverse for faster [[Disaster_recovery|large-scale data recovery]] times than would be possible via an Internet connection.<ref name="WhatIsAWSSnowball?" /> Some applications offer seeding and large-scale recovery via third-party services, which may use a high-speed Internet channel to/from cloud storage rather than a shippable physical device.<ref name="NetBackupAmazonStorageGateway">{{cite web |last1=High |first1=Dave |author2=Mahmud, Fozz |title=NBU and the Amazon Storage Gateway VTL |url=https://www.youtube.com/watch?v=rU1rFK9o20s |website=Veritas |publisher=Veritas Technologies LLC |accessdate=17 January 2018 |format=Video |date=10 March 2016}}</ref><ref name="BackupExecCloudConnector">{{cite web |title=Backup Exec 16: Best Practices for Using the Veritas Backup Exec Cloud Connector |url=https://www.veritas.com/content/support/en_US/doc/72686287-129480082-0/v128967126-129480082 |website=Veritas Support |publisher=Veritas Technologies LLC |accessdate=15 January 2018 |date=25 October 2017}}</ref>',
164 => false,
165 => '== See also ==',
166 => ';About backup',
167 => '* Backup software',
168 => '** [[List of backup software]]',
169 => '* [[Glossary of backup terms]]',
170 => '* [[Remote backup service]]',
171 => '* [[Virtual backup appliance]]',
172 => false,
173 => ';Related topics',
174 => '* [[Data consistency]]',
175 => '* [[Data degradation]]',
176 => '* [[Data proliferation]]',
177 => '* [[Database dump]]',
178 => '* [[Digital preservation]]',
179 => '* [[Disaster recovery and business continuity auditing]]',
180 => '* [[File synchronization]]',
181 => '* [[Information repository]]',
182 => false,
183 => '== Notes ==',
184 => '{{reflist|group=note}}',
185 => false,
186 => '== References ==',
187 => '{{Reflist|2}}',
188 => false,
189 => '== External links ==',
190 => '{{Wiktionary|back up}}',
191 => '{{Wiktionary|backup}}',
192 => '{{Commons category|Backup}}',
193 => false,
194 => '[[Category:Data security]]',
195 => '[[Category:Backup| ]]'
] |