Examine individual changes
Appearance
This page allows you to examine the variables generated by the Edit Filter for an individual change.
Variables generated for this change
Variable | Value |
---|---|
Edit count of the user (user_editcount ) | null |
Name of the user account (user_name ) | '103.21.53.106' |
Age of the user account (user_age ) | 0 |
Groups (including implicit) the user is in (user_groups ) | [
0 => '*'
] |
Rights that the user has (user_rights ) | [
0 => 'createaccount',
1 => 'read',
2 => 'edit',
3 => 'createtalk',
4 => 'writeapi',
5 => 'viewmywatchlist',
6 => 'editmywatchlist',
7 => 'viewmyprivateinfo',
8 => 'editmyprivateinfo',
9 => 'editmyoptions',
10 => 'abusefilter-log-detail',
11 => 'urlshortener-create-url',
12 => 'centralauth-merge',
13 => 'abusefilter-view',
14 => 'abusefilter-log',
15 => 'vipsscaler-test'
] |
Whether the user is editing from mobile app (user_app ) | false |
Whether or not a user is editing through the mobile interface (user_mobile ) | false |
Page ID (page_id ) | 533867 |
Page namespace (page_namespace ) | 0 |
Page title without namespace (page_title ) | 'Backup' |
Full page title (page_prefixedtitle ) | 'Backup' |
Edit protection level of the page (page_restrictions_edit ) | [] |
Last ten users to contribute to the page (page_recent_contributors ) | [
0 => 'KH-1',
1 => '103.21.53.106',
2 => 'ClueBot NG',
3 => 'Citation bot',
4 => 'Pancho507',
5 => '203.81.71.103',
6 => 'GreenC bot',
7 => 'InternetArchiveBot',
8 => 'WereSpielChequers',
9 => '106.219.54.197'
] |
Page age in seconds (page_age ) | 523790322 |
Action (action ) | 'edit' |
Edit summary/reason (summary ) | '/* Incremental */ ' |
Old content model (old_content_model ) | 'wikitext' |
New content model (new_content_model ) | 'wikitext' |
Old page wikitext, before the edit (old_wikitext ) | '{{Use dmy dates|date=October 2019}}
{{about|backup in computer systems|other uses}}
In [[information technology]], a '''backup''', or '''data backup''' is a copy of [[computer data]] taken and stored elsewhere so that it may be used to restore the original after a [[data loss]] event. The verb form, referring to the process of doing so, is "[[wikt:back up|back up]]", whereas the noun and adjective form is "[[wikt:backup|backup]]".<ref name="AHDictionaryBackup">{{cite web |title=back•up |url=https://www.ahdictionary.com/word/search.html?q=backup |website=The American Heritage Dictionary of the English Language |publisher=Houghton Mifflin Harcourt |accessdate=9 May 2018 |year=2018}}</ref> Backups can be used to recover data after its loss from [[File deletion|data deletion]] or [[Data corruption|corruption]], or to recover data from an earlier time.<ref name="NelsonPro11">{{cite book
|url=https://books.google.com/books?id=r4uEEsq3CJYC
|title=Pro Data Backup and Recovery
|chapter=Chapter 1: Introduction to Backup and Recovery
|author=S. Nelson |publisher=Apress |pages=1–16 |year=2011
|isbn=978-1-4302-2663-5 |accessdate=8 May 2018}}</ref> Backups provide a simple form of [[disaster recovery]]; however not all backup systems are able to reconstitute a computer system or other complex configuration such as a [[computer cluster]], [[active directory]] server, or [[database server]].<ref name="CougiasTheBackup03Chapter01">{{cite book |chapter-url=https://books.google.com/books?id=eLviiTag5A0C&pg=PA1 |title=The Backup Book: Disaster Recovery from Desktop to Data Center |chapter=Chapter 1: What's a Disaster Without a Recovery? |author=Cougias, D.J. |author2=Heiberger, E.L. |author3=Koop, K. |publisher=Network Frontiers |pages=1–14 |year=2003 |isbn=0-9729039-0-9}}</ref>
A backup system contains at least one copy of all data considered worth saving. The [[computer data storage|data storage]] requirements can be large. An [[information repository]] model may be used to provide structure to this storage. There are different types of [[data storage device]]s used for copying backups of data that is already in secondary storage onto [[archive file]]s.<ref group = note name=ArchiveFileMayNotContainOld/HistoricalMaterial>In contrast to everyday use of the term "archive", the data stored in an "archive file" is not necessarily old or of historical interest.</ref><ref name="KissellTakeControlMacOSX">{{cite book
|author=Joe Kissell
|title=Take Control of Mac OS X Backups |date=2007
|publisher=TidBITS Electronic Publishing |location=Ithaca, NY
|isbn=978-0-9759503-0-2 |edition=Version 2.0
|url=http://people.fas.harvard.edu/~techtool/pages/Take_Control_of_Mac_OS_X_Backups_(2.0).pdf
|accessdate=17 May 2019 |ref=Kissell
|pages=18–20 ("The Archive", meaning information repository, including versioning), 24 (client-server), 82–83 (archive file), 112–114 (Off-site storage backup rotation scheme), 126–141 (old Retrospect terminology and GUI—still used in Windows variant), 165 (client-server), 128 (subvolume—later renamed Favorite Folder in Macintosh variant)}}</ref> There are also different ways these devices can be arranged to provide geographic dispersion, [[data security]], and portability.
Data is selected, extracted, and manipulated for storage. The process can include methods for [[Data_consistency#Point-in-time_consistency|dealing with live data]], including open files, as well as compression, encryption, and [[Data deduplication|de-duplication]]. Additional techniques apply to [[enterprise client-server backup]]. Backup schemes may include [[Dry run (testing)|dry runs]] that validate the reliability of the data being backed up. There are limitations<ref name=OutaBiz>{{cite newspaper |newspaper=[[The New York Times]]
|url=https://www.nytimes.com/2018/01/11/smarter-living/backing-up-your-photos.html
|title=A Beginner's Guide to Backing Up Photos
|author=Terry Sullivan |date=11 January 2018
|quote=a hard drive ... an established company ... declared bankruptcy ... where many ... had ...}}</ref> and human factors involved in any backup scheme.
==Storage==
A backup strategy requires an information repository, "a secondary storage space for data"<ref name="wiseGEEKInformationRepository">{{cite web |last1=McMahon |first1=Mary |title=What Is an Information Repository? |url=https://www.wisegeek.com/what-is-an-information-repository.htm |website=wiseGEEK |publisher=Conjecture Corporation |accessdate=8 May 2019 |date=1 April 2019 |quote=In the sense of an approach to data management, an information repository is a secondary storage space for data.}}</ref> that aggregates backups of data "sources". The repository could be as simple as a list of all backup media (DVDs, etc.) and the dates produced, or could include a computerized index, catalog, or [[relational database]].
The backup data needs to be stored, requiring a [[backup rotation scheme]],<ref name="KissellTakeControlMacOSX" /> which is a system of backing up data to computer media that limits the number of backups of different dates retained separately, by appropriate re-use of the data storage media by overwriting of backups no longer needed. The scheme determines how and when each piece of removable storage is used for a backup operation and how long it is retained once it has backup data stored on it. The 3-2-1 rule can aid in the backup process. It states that there should be at least 3 copies of the data, stored on 2 different types of storage media, and one copy should be kept offsite, in a remote location (this can include [[cloud storage]]). 2 or more different media should be used to eliminate data loss due to similar reasons (for example, optical discs may tolerate being underwater while LTO tapes may not, and SSDs cannot fail due to [[head crash]]es or damaged spindle motors since they don't have any moving parts, unlike hard drives) An offsite copy protects against fire, theft of physical media (such as tapes or discs) and natural disasters like floods and earthquakes.<ref>https://www.nakivo.com/blog/3-2-1-backup-rule-efficient-data-protection-strategy/</ref> Disaster protected hard drives like those made by [[ioSafe]] are an alternative to an offsite copy, but they have limitations like only being able to resist fire for a limited period of time, so an offsite copy still remains as the ideal choice.
===Backup methods===
====Unstructured====
An unstructured repository may simply be a stack of tapes, DVD-Rs or external HDDs with minimal information about what was backed up and when. This method is the easiest to implement, but unlikely to achieve a high level of recoverability as it lacks automation.
====Full only/System imaging====
A repository using this backup method contains complete source data copies taken at one or more specific points in time.<ref name="NakivoTypesOfBackup">{{cite web |last1=Mayer |first1=Alex |title=Backup Types Explained: Full, Incremental, Differential, Synthetic, and Forever-Incremental |url=https://www.nakivo.com/blog/backup-types-explained-full-incremental-differential-synthetic-and-forever-incremental/ |website=Nakivo Blog |publisher=Nakivo |accessdate=17 May 2019 |at=Full Backup, Incremental Backup, Differential Backup, Mirror Backup, Reverse Incremental Backup, Continuous Data Protection (CDP), Synthetic Full Backup, Forever-Incremental Backup
|date=6 November 2017}}</ref> Copying [[system image]]s, this method is frequently used by computer technicians to record known good configurations. However, imaging<ref>{{Cite web |title=Five key questions to ask about your backup solution |url=http://sysgen.ca/five-key-backup-questions/ |website=sysgen.ca |accessdate=23 September 2015 |archive-url=https://web.archive.org/web/20160304042343/http://sysgen.ca/five-key-backup-questions/ |archive-date=4 March 2016 |url-status=live |df=dmy-all|at=Does your company have a low tolerance to longer "data access outages" and/or would you like to minimize the time your company may be without its data?|date=23 March 2014 }}</ref> is generally more useful as a way of deploying a standard configuration to many systems rather than as a tool for making ongoing backups of diverse systems.
====Incremental====
An [[incremental backup]] stores data changed since a reference point in time.<ref name="NakivoWhatIsIncrementalBackup">{{cite web |last1=Reed |first1=Jessie |title=What Is Incremental Backup? |url=https://www.nakivo.com/blog/what-is-incremental-backup/ |website=Nakivo Blog |publisher=Nakivo |accessdate=17 May 2019 |at=Reverse incremental, Multilevel incremental, Block-level |date=27 February 2018}}</ref> Duplicate copies of unchanged data aren't copied.<ref name="NakivoTypesOfBackup" /> Typically a full backup of all files is once or at infrequent intervals, serving as the reference point for an incremental repository. Subsequently, a number of incremental backups are made after successive time periods. Restores begin with the last full backup and then apply the incrementals.<ref name="Tech-FAQIncrementalBackup">{{cite web |title=Incremental Backup |url=http://www.tech-faq.com/incremental-backup.shtml |website=Tech-FAQ |publisher=Independent Media |accessdate=10 March 2006 |archiveurl=https://web.archive.org/web/20160621090117/http://www.tech-faq.com/incremental-backup.shtml |archivedate=21 June 2016 |date=13 June 2005}}</ref>
Some backup systems<ref name="PondHowTimeMachineWorks">{{cite web | last1=Pond | first1=James| url=http://baligu.com/pondini/TM/Works.html | title=How Time Machine Works its Magic |website=Apple OSX and Time Machine Tips |publisher=baligu.com| accessdate=19 May 2019 | date=31 August 2013 | at=File System Event Store,Hard Links}}</ref> can create a {{visible anchor|synthetic full backup}} from a series of incrementals, thus providing the equivalent of frequently doing a full backup.<ref name="NakivoTypesOfBackup"/> When done to modify a single archive file, this speeds restores of recent versions of files.
====Near-CDP{{anchor|Continuous_data_protection}}====
[[Continuous Data Protection]] (CDP) refers to a backup that instantly saves a copy of every change made to the data. This allows restoration of data to any point in time and is the most comprehensive and advanced data protection.<ref name=InformationWeekWhyCDPGettingMorePractical>{{cite web
|author=Behzad Behtash
|url=https://www.informationweek.com/why-continuous-data-protections-getting-more-practical/d/d-id/1088883
|title=Why Continuous Data Protection's Getting More Practical
|work=Disaster recovery/business continuity
|publisher=InformationWeek |date=6 May 2010 |accessdate=12 November 2011 |quote=A true CDP approach should capture all data writes, thus continuously backing up data and eliminating backup windows.... CDP is the gold standard—the most comprehensive and advanced data protection. But "near CDP" technologies can deliver enough protection for many companies with less complexity and cost. For example, snapshots can provide a reasonable near-CDP-level of protection for file shares, letting users directly access data on the file share at regular intervals--say, every half-hour or 15 minutes. That's certainly a higher level of protection than tape-based or disk-based nightly backups and may be all you need.}}</ref> Near-CDP backup applications—often [[List of backup software#Proprietary|marketed]] as "CDP"—automatically take incremental backups at a specific interval, for example every 15 minutes, one hour, or 24 hours. They can therefore only allow restores to an interval boundary.<ref name=InformationWeekWhyCDPGettingMorePractical /> Near-CDP backup applications use [[Journaling file system|journaling]] and are typically based on periodic "snapshots",<ref name="ComputerWeeklyCDPExplained">{{cite web |title=Continuous data protection (CDP) explained: True CDP vs near-CDP |url=https://www.computerweekly.com/Continuous-data-protection-CDP-explained-True-CDP-vs-near-CDP |website=ComputerWeekly.com |publisher=TechTarget |accessdate=22 June 2019 |date=July 2010 |quote=... copies data from a source to a target. True CDP does this every time a change is made, while so-called near-CDP does this at pre-set time intervals. Near-CDP is effectively the same as snapshotting....True CDP systems record every write and copy them to the target where all changes are stored in a log. [new paragraph] By contrast, near-CDP/snapshot systems copy files in a straightforward manner but require applications to be quiesced and made ready for backup, either via the application's backup mode or using, for example, Microsoft's Volume Shadow Copy Services (VSS).}}</ref> [[file system permissions|read-only]] copies of the data frozen at a particular [[point in time]].
Near-CDP (except for [[Apple Time Machine]])<ref name="PondiniHowTMWorksItsMagic">{{cite web |last1=Pond |first1=James |title=How Time Machine Works its Magic |url=https://www.baligu.com/pondini/TM/Works.html |website=Apple OSX and Time Machine Tips |publisher=Baligu.com (as mirrored after James Pond died in 2013) |accessdate=10 July 2019 |date=31 August 2013 |quote=The File System Event Store is a hidden log that OSX keeps on each HFS+ formatted disk/partition of changes made to the data on it. It doesn’t list every file that’s changed, but each directory (folder) that’s had anything changed inside it.}}</ref> [[Intent log|intent-logs]] every change on the host system,<ref name="deGuiseEnterprise09#A.3.3">{{cite book |url=https://books.google.com/books?id=2OtqvySBTu4C&pg=PA287|title=Enterprise Systems Backup and Recovery: A Corporate Insurance Policy |author=de Guise, P. |publisher=CRC Press |pages=285–287 |year=2009 |isbn=978-1-4200-7639-4}}</ref> often by saving byte or block-level differences rather than file-level differences.<ref name="NakivoTypesOfBackup" /> This backup method differs from simple [[disk mirroring]]<ref name="NakivoTypesOfBackup" /> in that it enables a roll-back of the log and thus a restoration of old images of data. Intent-logging allows precautions for the consistency of live data, protecting ''self-consistent'' files but requiring ''applications'' "be quiesced and made ready for backup."
Near-CDP is more practicable for ordinary personal backup applications, as opposed to ''true'' CDP, which must be run in conjunction with a virtual machine<ref name="VictorWuEMCRecoverPointVM">{{cite web |last1=Wu |first1=Victor |title=EMC RecoverPoint for Virtual Machine Overview |url=https://wuchikin.wordpress.com/2017/03/04/emc-recoverpoint-for-virtual-machine-overview/ |website=Victor Virtual |publisher=WuChiKin |accessdate=22 June 2019 |date=4 March 2017 |quote=The splitter splits out the Write IOs to the VMDK/RDM of a VM and sends a copy to the production VMDK and also to the RecoverPoint for VMs cluster.}}</ref><ref name="RES-QServicesZertoOrVeeam">{{cite web |title=Zerto or Veeam? |url=https://resqdr.com/zerto-or-veeam/ |website=RES-Q Services |accessdate=7 July 2019 |date=March 2017 |quote=Zerto doesn’t use snapshot technology like Veeam. Instead, Zerto deploys small virtual machines on its physical hosts. These Zerto VMs capture the data as it is written to the host and then send a copy of that data to the replication site.....However, Veeam has the advantage of being able to more efficiently capture and store data for long-term retention needs. There is also a significant pricing difference, with Veeam being cheaper than Zerto.}}</ref> or equivalent<ref name="CloudEndureAgentRelated">{{cite web |title=Agent Related |url=https://docs.cloudendure.com/Content/FAQ/FAQ/Agent_Related.htm |website=CloudEndure.com |accessdate=3 July 2019 |at=What does the CloudEndure Agent do? |date=2019 |quote=The CloudEndure Agent performs an initial block-level read of the content of any volume attached to the server and replicates it to the Replication Server. The Agent then acts as an OS-level read filter to capture writes and synchronizes any block level modifications to the CloudEndure Replication Server, ensuring near-zero RPO.}}</ref> and is therefore generally used in enterprise client-server backups.
====Reverse incremental====
A [[Incremental backup#Reverse incremental|Reverse incremental]] backup method stores a recent archive file "mirror" of the source data and a series of differences between the "mirror" in its current state and its previous states. A reverse incremental backup method starts with a non-image full backup. After the full backup is performed, the system periodically synchronizes the full backup with the live copy, while storing the data necessary to reconstruct older versions. This can either be done using [[hard links]]—as Apple Time Machine does, or using binary [[data comparison|diffs]].
====Differential====
A differential backup saves only the data that has changed since the last full backup. This means a maximum of two backups from the repository are used to restore the data. However, as time from the last full backup (and thus the accumulated changes in data) increases, so does the time to perform the differential backup. Restoring an entire system requires starting from the most recent full backup and then applying just the last differential backup.
A differential backup copies files that have been created or changed since the last full backup, regardless of whether any other differential backups have been made since, whereas an incremental backup copies files that have been created or changed since the most recent backup of any type (full or incremental). Other variations of incremental backup include multi-level incrementals and block-level incrementals that compare parts of files instead of just entire files.
===Storage media===
[[File:DVD, USB flash drive and external hard drive.jpg|thumb|right|From left to right, a [[DVD]] disc in plastic cover, a USB flash drive and an [[external hard drive]]]]
Regardless of the repository model that is used, the data has to be copied onto an archive file data storage medium. The medium used is also referred to as the type of backup destination.
====Magnetic tape====
[[Magnetic tape data storage|Magnetic tape]] was for a long time the most commonly used medium for bulk data storage, backup, archiving, and interchange. It was previously a less expensive option, but this is no longer the case for smaller amounts of data.<ref name=EngenioDiskToDiskVsTape>{{cite web |date=9 December 2004 |accessdate=26 May 2019
|url=http://www.storagesearch.com/engenio-art2.html |archive-url=https://web.archive.org/web/20050207082953/http://www.storagesearch.com/engenio-art2.html |url-status=dead |archive-date=7 February 2005 |title=Disk to Disk Backup versus Tape – War or Truce? |last1=Gardner |first1=Steve |at=Peaceful coexistence |publisher=Engenio
}}</ref> Tape is a [[sequential access]] medium, so the rate of continuously writing or reading data can be very fast.
Many tape formats have been proprietary or specific to certain markets like mainframes or a particular brand of personal computer. By 2014 [[Linear Tape-Open#Market performance|LTO]] had become the primary tape technology.<ref name="SpectraLogicDigitalDataStorageOutlook2017">{{cite web |title=Digital Data Storage Outlook 2017 |url=https://spectralogic.com/wp-content/uploads/white-paper-digital-data-storage-outlook-2017-v3.pdf |website=Spectra |publisher=Spectra Logic |accessdate=11 July 2018 |page=7(Solid-State), 10(Magnetic Disk), 14(Tape), 17(Optical) |year=2017}}</ref> The other remaining viable "super" format is the [[IBM 3592]] (also referred to as the TS11xx series). The [[StorageTek tape formats#T10000|Oracle StorageTek T10000]] was discontinued in 2016.<ref name="ForbesKeepingDataLongTime">{{cite web |author=Tom Coughlin |title=Keeping Data for a Long Time |url=https://www.forbes.com/sites/tomcoughlin/2014/06/29/keeping-data-for-a-long-time/ |website=Forbes |publisher=Forbes Media LLC |accessdate=19 April 2018 |date=29 June 2014 |at=para. Magnetic Tapes(popular formats, storage life), para. Hard Disk Drives(active archive), para. First consider flash memory in archiving(... may not have good media archive life)}}</ref>
====Hard disk====
The use of [[hard disk]] storage has increased over time as it has become progressively cheaper. Hard disks are usually easy to use, widely available, and can be accessed quickly.<ref name="SpectraLogicDigitalDataStorageOutlook2017" /> However, hard disk backups are [[Hard disk drive#Magnetic recording|close-tolerance mechanical devices]] and may be more easily damaged than tapes, especially while being transported.<ref name="PCWorldHardCoreDataPreservation">{{cite web |last1=Jacobi |first1=John L. |title=Hard-core data preservation: The best media and methods for archiving your data |url=https://www.pcworld.com/article/2984597/storage/hard-core-data-preservation-the-best-media-and-methods-for-archiving-your-data.html |website=PC World |accessdate=19 April 2018 |date=29 February 2016 |at=sec. External Hard Drives(on the shelf, magnetic properties, mechanical stresses, vulnerable to shocks), Tape, Online storage}}</ref> In the mid-2000s, several drive manufacturers began to produce portable drives employing [[Hard disk drive failure#Unloading|ramp loading and accelerometer]] technology (sometimes termed a "shock sensor"),<ref name="HGSTRampLoadUnload">{{cite web |title=Ramp Load/Unload Technology in Hard Disk Drives |url=https://www.hgst.com/sites/default/files/resources/LoadUnload_white_paper_FINAL.pdf |website=HGST |publisher=Western Digital |accessdate=29 June 2018 |page=3(sec. Enhanced Shock Tolerance) |date=November 2007}}</ref><ref name="ToshibaCanvio3.0PortableHDD">{{cite web |title=Toshiba Portable Hard Drive (Canvio® 3.0) |url=https://www.toshibadata.com.sg/Product-Canvio-Portable-Hard-Drive.aspx |website=Toshiba Data Dynamics Singapore |publisher=Toshiba Data Dynamics Pte Ltd |accessdate=16 June 2018 |year=2018 |at=sec. Overview(Internal shock sensor and ramp loading technology)}}</ref> and by 2010 the industry average in drop tests for drives with that technology showed drives remaining intact and working after a 36-inch non-operating drop onto industrial carpeting.<ref name="IomegaDropShock">{{cite web |title=Iomega Drop Guard ™ Technology |url=https://www.doc-developpement-durable.org/file/Projets-informatiques/Drop%20Guard-disque-dur-tres-solide.pdf |website=Hard Drive Storage Solutions |publisher=Iomega Corp. |accessdate=12 July 2018 |pages=2(What is Drop Shock Technology?, What is Drop Guard Technology? (... features special internal cushioning .... 40% above the industry average)), 3(*NOTE) |date=20 September 2010}}</ref> Some manufacturers also offer 'ruggedized' portable hard drives, which include a shock-absorbing case around the hard disk, and [[MIL-STD-810#Applicability to "ruggedized" consumer products|claim]] a range of higher drop specifications.<ref name="IomegaDropShock" /><ref name=PCMagBest>
{{cite web |author=John Burek
|title=The Best Rugged Hard Drives and SSDs
|url=https://www.pcmag.com/roundup/361072/the-best-rugged-hard-drives-and-ssds
|website=[[PC Magazine]] |publisher=Ziff Davis |accessdate=4 August 2018
|at=What Exactly Makes a Drive Rugged?(When a drive is encased ... you're mostly at the mercy of the drive vendor to tell you the rated maximum drop distance for the drive) |date=15 May 2018}}</ref><ref name="WirecutterBestPortableHardDrive2017Don'tBuy">{{cite web
|author=Justin Krajeski |author2=Kimber Streams
|title=The Best Portable Hard Drive
|url=http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive
|work=[[The New York Times]] |accessdate=4 August 2018
|archive-url=https://web.archive.org/web/20170331161821/http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive |url-status=dead |archivedate=31 March 2017 |date=20 March 2017}}</ref> Over a period of years the stability of hard disk backups is shorter than that of tape backups.<ref name="ForbesKeepingDataLongTime" /><ref name="IronMountainBestLong-TermDataArchiveSolutions">{{cite web |title=Best Long-Term Data Archive Solutions |url=http://www.ironmountain.com/resources/general-articles/b/best-long-term-data-archive-solutions |website=Iron Mountain |publisher=Iron Mountain Inc. |accessdate=19 April 2018 |year=2018 |at=sec. More Reliable(average mean time between failure ... rates, best practice for migrating data)}}</ref><ref name="PCWorldHardCoreDataPreservation" />
External hard disks can be connected via local interfaces like [[SCSI]], [[USB]], [[FireWire]], or [[eSATA]], or via longer-distance technologies like [[Ethernet]], [[iSCSI]], or [[Fibre Channel]]. Some disk-based backup systems, via [[Virtual tape library|Virtual Tape Libraries]] or otherwise, support data deduplication, which can reduce the amount of disk storage capacity consumed by daily and weekly backup data.<ref name="KissellTakeControlBackingUp">{{cite book |last1=Kissell |first1=Joe |title=Take Control of Backing Up Your Mac |date=2011 |publisher=TidBITS Publishing Inc. |location=Ithaca NY |isbn=978-1-61542-394-1 |page=41(Deduplication) |url=https://books.google.com/books?id=ANe3k_7bnAcC&q=retrospect+deduplication&pg=PT41 |accessdate=17 September 2019}}</ref><ref>{{Cite web |url=http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html |title=Symantec Shows Backup Exec a Little Dedupe Love; Lays out Source Side Deduplication Roadmap – DCIG |website=DCIG |access-date=26 February 2016 |archive-url=https://web.archive.org/web/20160304212819/http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html |archive-date=4 March 2016 |url-status=live |df=dmy-all}}</ref><ref name="NetBackupDeduplicationGuide">{{cite web |title=Veritas NetBackup™ Deduplication Guide |url=https://www.veritas.com/content/support/en_US/doc/ka6j00000000ADEAA2 |website=Veritas |publisher=Veritas Technologies LLC |accessdate=26 July 2018 |year=2016}}</ref>
====Optical storage====
[[Optical storage]] uses lasers to store and retrieve data. Recordable [[CD]]s, DVDs, and [[Blu-ray Disc]]s are commonly used with personal computers and are generally cheap. In the past, the capacities and speeds of these discs have been lower than hard disks or tapes, although advances in optical media are slowly shrinking that gap.<ref name="WanOptical14">{{cite journal
|title=Optical storage: An emerging option in long-term digital preservation
|journal=Frontiers of Optoelectronics
|author1=S. Wan |author2=Q. Cao |author3=C. Xie |volume=7 |issue=4 |pages=486–492 |year=2014
|doi=10.1007/s12200-014-0442-2|s2cid=60816607
}}</ref><ref>{{cite journal
|title=High-capacity optical long data memory based on enhanced Young's modulus in nanoplasmonic hybrid glass composites
|journal=Nature Communications
|author1=Q. Zhang |author2=Z. Xia |author3=Y.-B. Cheng |author4=M. Gu
|volume=9 |issue=1
|pages=1183 |year=2018 |doi=10.1038/s41467-018-03589-y|pmid=29568055
|bibcode=2018NatCo...9.1183Z
|pmc=5864957 }}</ref> Many optical disc formats are [[Write Once Read Many|WORM]] type, which makes them useful for archival purposes since the data cannot be changed. Some optical storage systems allow for cataloged data backups without human contact with the discs, allowing for longer data integrity. A French study in 2008 indicated that the lifespan of typically-sold [[CD-R#Lifespan|CD-Rs]] was 2–10 years,<ref name=INA_CD-R_Study>{{cite web
|url=http://www.ina.fr/video/3571726001/20-heures-emission-du-3-mars-2008.fr.html
|title= Journal de 20 Heures |accessdate=3 March 2008
|at=approximately minute 30 of the TV news broadcast
|work= Institut national de l'audiovisuel
|author1=Gérard Poirier |author2=Foued Berahou
|date=3 March 2008}}</ref> but one manufacturer later estimated the longevity of its CD-Rs with a gold-sputtered layer to be as high as 100 years.<ref>{{cite web
|url=http://delkin.com/i-5937134-archival-gold-cd-r-300-year-disc-binder-of-10-discs-with-scratch-armor-surface.html
|title=Archival Gold CD-R "300 Year Disc" Binder of 10 Discs with Scratch Armor Surface
|archivedate=27 September 2013 |archiveurl=https://web.archive.org/web/20130927170900/http://delkin.com/i-5937134-archival-gold-cd-r-300-year-disc-binder-of-10-discs-with-scratch-armor-surface.html
|website=Delkin Devices |publisher=Delkin Devices Inc.}}</ref> Sony's [[Optical Disc Archive]]<ref name="SpectraLogicDigitalDataStorageOutlook2017" /> can in 2016 reach a read rate of 250MB/s.<ref name="SonyOpticalDiscArchiveGen2">{{cite web |title=Optical Disc Archive Generation 2 |url=https://pro.sony/s3/cms-static-content/file/49/1237494482649.pdf |website=Optical Disc Archive |publisher=Sony |accessdate=15 August 2019 |page=12(World’s First 8-Channel Optical Drive Unit) |date=April 2016}}</ref>
====Solid-state drive====
[[Solid-state drives]] (SSDs) use [[integrated circuit]] assemblies to store data. [[Flash memory]], [[thumb drive]]s, [[USB flash drive]]s, [[CompactFlash]], [[SmartMedia]], [[Memory Stick]]s, and [[Secure Digital card]] devices are relatively expensive for their low capacity, but convenient for backing up relatively low data volumes. A solid-state drive does not contain any movable parts, making it less susceptible to physical damage, and can have huge throughput of around 500 Mbit/s up to 6 Gbit/s. Available SSDs have become more capacious and cheaper.<ref>{{cite journal
|title=Solid-State Drives (SSDs) |journal=Proceedings of the IEEE
|author1=R. Micheloni |author2=P. Olivo
|volume=105 |issue=9 |pages=1586–88 |year=2017
|doi=10.1109/JPROC.2017.2727228 }}</ref><ref name=PCMagBest/> Flash memory backups are stable for fewer years than hard disk backups.<ref name="ForbesKeepingDataLongTime" />
====Remote backup service====
[[Remote backup service]]s or cloud backups involve service providers storing data offsite. This has been used to protect against events such as fires, floods, or earthquakes which could destroy locally stored backups.<ref name="DellRemoteBackup">{{cite web
|url=https://www.emc.com/corporate/glossary/remote-backup.htm
|title=Remote Backup |work=EMC Glossary |publisher=Dell, Inc
|accessdate=8 May 2018
|quote=Effective remote backup requires that production data be regularly backed up to a location far enough away from the primary location so that both locations would not be affected by the same disruptive event.}}</ref> Cloud-based backup (through services like or similar to [[Google Drive]], and [[Microsoft OneDrive]]) provides a layer of data protection.<ref name="PCWorldHardCoreDataPreservation" /> However, the users must trust the provider to maintain the privacy and integrity of their data, with confidentiality enhanced by the use of [[encryption]]. Because speed and availability are limited by a user's online connection,<ref name="PCWorldHardCoreDataPreservation" /> users with large amounts of data may need to use cloud seeding and large-scale recovery.
===Management===
Various methods can be used to manage backup media, striking a balance between accessibility, security and cost. These media management methods are not mutually exclusive and are frequently combined to meet the user's needs. Using on-line disks for staging data before it is sent to a near-line [[tape library]] is a common example.<ref name="StackpoleSoftware07">{{cite book |url=https://books.google.com/books?id=gjAhVzuV7k0C&pg=PA164 |title=Software Deployment, Updating, and Patching |author=Stackpole, B. |author2=Hanrion, P. |publisher=CRC Press |pages=164–165 |year=2007 |isbn=978-1-4200-1329-0 |accessdate=8 May 2018}}</ref><ref name="GnanasundaramInfo12">{{cite book |url=https://books.google.com/books?id=PU7gkW9ArxIC&pg=PA255 |title=Information Storage and Management: Storing, Managing, and Protecting Digital Information in Classic, Virtualized, and Cloud Environments |editor=Gnanasundaram, S. |editor2=Shrivastava, A. |publisher=John Wiley and Sons |page=255 |year=2012 |isbn=978-1-118-23696-3 |accessdate=8 May 2018}}</ref>
====Online====
[[Online]] backup storage is typically the most accessible type of data storage, and can begin a restore in milliseconds. An internal hard disk or a [[disk array]] (maybe connected to [[Storage area network|SAN]]) is an example of an online backup. This type of storage is convenient and speedy, but is vulnerable to being deleted or overwritten, either by accident, by malevolent action, or in the wake of a data-deleting [[Computer virus|virus]] payload.
====Near-line====
[[Nearline storage]] is typically less accessible and less expensive than online storage, but still useful for backup data storage. A mechanical device is usually used to move media units from storage into a drive where the data can be read or written. Generally it has safety properties similar to on-line storage. An example is a [[tape library]] with restore times ranging from seconds to a few minutes.
====Off-line====
[[Off-line storage]] requires some direct action to provide access to the storage media: for example, inserting a tape into a tape drive or plugging in a cable. Because the data is not accessible via any computer except during limited periods in which they are written or read back, they are largely immune to on-line backup failure modes. Access time varies depending on whether the media are on-site or off-site.
====Off-site data protection====
Backup media may be sent to an [[off-site data protection|off-site]] vault to protect against a disaster or other site-specific problem. The vault can be as simple as a system administrator's home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker with facilities for backup media storage. A data replica can be off-site but also on-line (e.g., an off-site [[RAID]] mirror). Such a replica has fairly limited value as a backup.
====Backup site====
A [[backup site]] or disaster recovery center is used to store data that can enable computer systems and networks to be restored and properly configure in the event of a disaster. Some organisations have their own data recovery centres, while others contract this out to a third-party. Due to high costs, backing up is rarely considered the preferred method of moving data to a DR site. A more typical way would be remote [[disk mirroring]], which keeps the DR data as up to date as possible.
==Selection and extraction of data==
A backup operation starts with selecting and extracting coherent units of data. Most data on modern computer systems is stored in discrete units, known as [[Computer file|files]]. These files are organized into [[filesystem]]s. Deciding what to back up at any given time involves tradeoffs. By backing up too much redundant data, the information repository will fill up too quickly. Backing up an insufficient amount of data can eventually lead to the loss of critical information.<ref name="LeesWhatTo17">{{cite web |url=https://irontree.co.za/what-to-backup-a-critical-look-at-your-data-1935.html |title=What to backup – a critical look at your data |author=Lee
|work=Irontree Blog |publisher=Irontree Internet Services CC
|date=25 January 2017 |accessdate=8 May 2018}}</ref>
===Files===
*[[File copying|Copying files]] : Making copies of files is the simplest and most common way to perform a backup. A means to perform this basic function is included in all backup software and all operating systems.
*Partial file copying: A backup may include only the blocks or bytes within a file that have changed in a given period of time. This can substantially reduce needed storage space, but requires higher sophistication to reconstruct files in a restore situation. Some implementations require integration with the source file system.
*Deleted files : To prevent the unintentional restoration of files that have been intentionally deleted, a record of the deletion must be kept.
*Versioning of files : Most backup applications, other than those that do only full only/System imaging, also back up files that have been modified since the last backup. "That way, you can retrieve many different versions of a given file, and if you delete it on your hard disk, you can still find it in your [information repository] archive."<ref name="KissellTakeControlMacOSX" />
===Filesystems===
*Filesystem dump: A copy of the whole filesystem in block-level can be made. This is also known as a "raw partition backup" and is related to [[disk image|disk imaging]]. The process usually involves unmounting the filesystem and running a program like [[dd (Unix)]].<ref name="PrestonBackup07">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA111 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=111–114 |year=2007 |isbn=978-0-596-55504-7 |accessdate=8 May 2018}}</ref> Because the disk is read sequentially and with large buffers, this type of backup can be faster than reading every file normally, especially when the filesystem contains many small files, is highly fragmented, or is nearly full. But because this method also reads the free disk blocks that contain no useful data, this method can also be slower than conventional reading, especially when the filesystem is nearly empty. Some filesystems, such as [[XFS]], provide a "dump" utility that reads the disk sequentially for high performance while skipping unused sections. The corresponding restore utility can selectively restore individual files or the entire volume at the operator's choice.<ref name="PrestonUnix99">{{cite book |url=https://archive.org/details/unixbackuprecove00wcur |url-access=registration |title=Unix Backup & Recovery |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=[https://archive.org/details/unixbackuprecove00wcur/page/73 73]–91 |year=1999 |isbn=978-1-56592-642-4 |accessdate=8 May 2018}}</ref>
*Identification of changes: Some filesystems have an [[archive bit]] for each file that says it was recently changed. Some backup software looks at the date of the file and compares it with the last backup to determine whether the file was changed.
*[[Versioning file system]] : A versioning filesystem tracks all changes to a file. The [[NILFS]] versioning filesystem for Linux is an example.<ref name="NILFSHome">{{cite web |title=NILFS Home |url=https://nilfs.sourceforge.io/en/ |website=NILFS Continuous Snapshotting System |publisher=NILFS Community |accessdate=22 August 2019 |date=2019}}</ref>
===Live data===
Files that are actively being updated present a challenge to back up. One way to back up live data is to temporarily [[quiesce]] them (e.g., close all files), take a "snapshot", and then resume live operations. At this point the snapshot can be backed up through normal methods.<ref name="CougiasTheBackup03Chapter11">{{cite book |chapter-url=https://books.google.com/books?id=eLviiTag5A0C&pg=PA360|title=The Backup Book: Disaster Recovery from Desktop to Data Center |chapter=Chapter 11: Open file backup for databases |author=Cougias, D.J. |author2=Heiberger, E.L. |author3=Koop, K. |publisher=Network Frontiers |pages=356–360 |year=2003 |isbn=0-9729039-0-9}}</ref> A [[Snapshot (computer storage)|snapshot]] is an instantaneous function of some [[file system|filesystems]] that presents a copy of the filesystem as if it were frozen at a specific point in time, often by a [[copy-on-write]] mechanism. Snapshotting a file while it is being changed results in a corrupted file that is unusable. This is also the case across interrelated files, as may be found in a conventional database or in applications such as [[Microsoft Exchange Server]].<ref name="ComputerWeeklyCDPExplained" /> The term [[fuzzy backup]] can be used to describe a backup of live data that looks like it ran correctly, but does not represent the state of the data at a single point in time.<ref name="LiotineMission03">{{cite book |url=https://books.google.com/books?id=LecC2BhPPxMC&pg=PA244 |title=Mission-critical Network Planning |author=Liotine, M. |publisher=Artech House |page=244 |year=2003 |isbn=978-1-58053-559-5 |accessdate=8 May 2018}}</ref>
Backup options for data files that cannot be or are not quiesced include:<ref name="deGuiseEnterprise09#3.4.7">{{cite book |url=https://books.google.com/books?id=2OtqvySBTu4C&pg=PA50 |title=Enterprise Systems Backup and Recovery: A Corporate Insurance Policy |author=de Guise, P. |publisher=CRC Press |pages=50–54 |year=2009 |isbn=978-1-4200-7639-4}}</ref>
*Open file backup: Many backup software applications undertake to back up open files in an internally consistent state.<ref name="HandyBackupOpenFileWindows">{{cite web |title=Open File Backup Software for Windows |url=https://www.handybackup.net/open-file-backup.shtml
|website=Handy Backup |publisher=Novosoft LLC
|accessdate=29 November 2018 |date=8 November 2018}}</ref> Some applications simply check whether open files are in use and try again later.<ref name="CougiasTheBackup03Chapter11" /> Other applications exclude open files that are updated very frequently.<ref name="ArqTroubleshootingBackingUpOpen/lockedFiles">{{cite web |last1=Reitshamer |first1=Stefan |title=Troubleshooting backing up open/locked files on Windows |url=https://www.arqbackup.com/blog/troubleshooting-backing-up-openlocked-files-on-windows/ |website=Arq Blog
|publisher=Haystack Software
|accessdate=29 November 2018 |date=5 July 2017 |at=Stefan Reitshamer is the principal developer of Arq}}</ref> Some [[High availability|low-availability]] interactive applications can be backed up via natural/induced pausing.
*Interrelated database files backup: Some interrelated database file systems offer a means to generate a "hot backup"<ref name="UWiscOracleBackups">{{cite web |last1=Boss |first1=Nina |title=Oracle Tips Session #3: Oracle Backups |url=http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup |archive-url=https://web.archive.org/web/20070302110933/http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup |url-status=dead |website=www.wisc.edu |publisher=University of Wisconsin |accessdate=1 December 2018 |archivedate=2 March 2007 |date=10 December 1997}}</ref> of the database while it is online and usable. This may include a snapshot of the data files plus a snapshotted log of changes made while the backup is running. Upon a restore, the changes in the log files are applied to bring the copy of the database up to the point in time at which the initial backup ended.<ref name="ArcserveOracleWhatIsARCHIVE-LOG">{{cite web |title=What is ARCHIVE-LOG and NO-ARCHIVE-LOG mode in Oracle and the advantages & disadvantages of these modes? |url=https://support.arcserve.com/s/article/202080249?language=en_US |website=Arcserve Backup |publisher=Arcserve |accessdate=29 November 2018 |date=27 September 2018}}</ref> Other low-availability interactive applications can be backed up via coordinated snapshots. However, [[High availability|genuinely-high-availability]] interactive applications can be only be backed up via Continuous Data Protection.
===Metadata===
Not all information stored on the computer is stored in files. Accurately recovering a complete system from scratch requires keeping track of this [[metadata|non-file data]] too.<ref name="Gresovnik1">{{cite web |url=http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html |title=Preparation of Bootable Media and Images |last=Grešovnik |first=Igor |date=April 2016 |archive-url=https://web.archive.org/web/20160425113119/http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html |archivedate=25 April 2016 |access-date=21 April 2016}}</ref>
*System description: System specifications are needed to procure an exact replacement after a disaster.
*[[Boot sector]] : The boot sector can sometimes be recreated more easily than saving it. It usually isn't a normal file and the system won't boot without it.
*[[Disk partitioning|Partition]] layout: The layout of the original disk, as well as partition tables and filesystem settings, is needed to properly recreate the original system.
*File [[metadata]] : Each file's permissions, owner, group, ACLs, and any other metadata need to be backed up for a restore to properly recreate the original environment.
*System metadata: Different operating systems have different ways of storing configuration information. [[Microsoft Windows]] keeps a [[Windows Registry|registry]] of system information that is more difficult to restore than a typical file.
==Manipulation of data and dataset optimization==
It is frequently useful or required to manipulate the data being backed up to optimize the backup process. These manipulations can improve backup speed, restore speed, data security, media usage and/or reduced bandwidth requirements.
===Automated data grooming===
Out-of-date data can be automatically deleted, but for personal backup applications—as opposed to enterprise client-server backup applications where automated data "grooming" can be customized—the deletion<ref group=note>Some backup applications—notably [[Rsync#History|rsync]] and [[Code42#File_backup_and_sharing_services|CrashPlan]]—term removing backup data "pruning" instead of "grooming".[https://linux.die.net/man/1/rsync][https://support.code42.com/Administrator/5/Monitoring_and_managing/Archive_maintenance#Prune]</ref> can at most<ref name="PondiniFAQ12">{{cite web |last1=Pond |first1=James |title=12. Should I delete old backups? If so, How? |url=https://www.baligu.com/pondini/TM/12.html |website=Time Machine |publisher=baligu.com |accessdate=21 June 2019 |at=Green box, Gray box |date=2 June 2012}}</ref> be globally delayed or be disabled.<ref name="WirecutterBestOnlineCloudBackupService">{{cite web |last1=Kissell |first1=Joe |title=The Best Online Cloud Backup Service |url=https://thewirecutter.com/reviews/best-online-backup-service/ |website=wirecutter |publisher=The New York Times|accessdate=21 June 2019 |at=Next, there’s file retention. |date=12 March 2019}}</ref>
===Compression===
Various schemes can be employed to [[Data compression|shrink]] the size of the source data to be stored so that it uses less storage space. Compression is frequently a built-in feature of tape drive hardware.<ref name="CherrySecuring15">{{cite book |url=https://books.google.com/books?id=SD_LAwAAQBAJ&pg=PA306 |title=Securing SQL Server: Protecting Your Database from Attackers
|author=D. Cherry |publisher=Syngress |pages=306–308 |year=2015
|isbn=978-0-12-801375-5 |accessdate=8 May 2018}}</ref>
===Deduplication===
Redundancy due to backing up similarly configured workstations can be reduced, thus storing just one copy. This technique can be applied at the file or raw block level. This potentially large reduction<ref name="CherrySecuring15" /> is called [[Data deduplication|deduplication]]. It can occur on a server before any data moves to backup media, sometimes referred to as source/client side deduplication. This approach also reduces bandwidth required to send backup data to its target media. The process can also occur at the target storage device, sometimes referred to as inline or back-end deduplication.
===Duplication===
Sometimes backups are [[Replication (computer science)|duplicated]] to a second set of storage media. This can be done to rearrange the archive files to optimize restore speed, or to have a second copy at a different location or on a different storage medium—as in the disk-to-disk-to-tape capability of Enterprise client-server backup.
===Encryption===
High-capacity removable storage media such as backup tapes present a data security risk if they are lost or stolen.<ref>[http://www.securityfocus.com/news/11048 Backups tapes a backdoor for identity thieves] {{Webarchive|url=https://web.archive.org/web/20160405033517/http://www.securityfocus.com/news/11048 |date=5 April 2016 }} (28 April 2004). Retrieved 10 March 2007</ref> [[Encryption|Encrypting]] the data on these media can mitigate this problem, however encryption is a CPU intensive process that can slow down backup speeds, and the security of the encrypted backups is only as effective as the security of the key management policy.<ref name="CherrySecuring15" />
===Multiplexing===
When there are many more computers to be backed up than there are destination storage devices, the ability to use a single storage device with several simultaneous backups can be useful.<ref name="PrestonBackup07-02">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA219 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=219–220 |year=2007 |isbn=978-0-596-55504-7 |accessdate=8 May 2018}}</ref> However cramming the scheduled [[Glossary_of_backup_terms#Terms and definitions|backup window]] via "multiplexed backup" is only used for tape destinations.<ref name="PrestonBackup07-02" />
===Refactoring===
The process of rearranging the sets of backups in an archive file is known as refactoring. For example, if a backup system uses a single tape each day to store the incremental backups for all the protected computers, restoring one of the computers could require many tapes. Refactoring could be used to consolidate all the backups for a single computer onto a single tape, creating a "synthetic full backup". This is especially useful for backup systems that do incrementals forever style backups.
===Staging===
Sometimes backups are copied to a [[Disk staging|staging]] disk before being copied to tape.<ref name="PrestonBackup07-02" /> This process is sometimes referred to as D2D2T, an acronym for [[Disk-to-disk-to-tape]]. It can be useful if there is a problem matching the speed of the final destination device with the source device, as is frequently faced in network-based backup systems. It can also serve as a centralized location for applying other data manipulation techniques.
===Objectives===
*[[Disaster recovery#Recovery Point Objective|Recovery point objective]] (RPO) : The point in time that the restarted infrastructure will reflect, expressed as "the maximum targeted period in which data (transactions) might be lost from an IT service due to a major incident". Essentially, this is the roll-back that will be experienced as a result of the recovery. The most desirable RPO would be the point just prior to the data loss event. Making a more recent recovery point achievable requires increasing the frequency of [[file synchronization|synchronization]] between the source data and the backup repository.<ref name="RiskyThinkingDefRPO">{{cite web |title=Recovery Point Objective (Definition) |url=https://www.riskythinking.com/glossary/recovery_point_objective.php |website=ARL Risky Thinking |publisher=Albion Research Ltd. |accessdate=4 August 2019 |date=2007}}</ref>
*Recovery time objective (RTO) : The amount of time elapsed between disaster and restoration of business functions.<ref name="RiskyThinkingDefRTO">{{cite web |title=Recovery Time Objective (Definition) |url=https://www.riskythinking.com/glossary/recovery_time_objective.php |website=ARL Risky Thinking |publisher=Albion Research Ltd. |accessdate=4 August 2019 |date=2007}}</ref>
*[[Data security]] : In addition to preserving access to data for its owners, data must be restricted from unauthorized access. Backups must be performed in a manner that does not compromise the original owner's undertaking. This can be achieved with data encryption and proper media handling policies.<ref name="LittleImplement03">{{cite book |chapter-url=https://books.google.com/books?id=_DqO6kizEDUC&pg=PA17 |title=Implementing Backup and Recovery: The Readiness Guide for the Enterprise |chapter=Chapter 2: Business Requirements of Backup Systems |author=Little, D.B. |publisher=John Wiley and Sons |pages=17–30 |year=2003 |isbn=978-0-471-48081-5 |accessdate=8 May 2018}}</ref>
*[[Data retention]] period : Regulations and policy can lead to situations where backups are expected to be retained for a particular period, but not any further. Retaining backups after this period can lead to unwanted liability and sub-optimal use of storage media.<ref name="LittleImplement03" />
*[[Checksum]] or [[hash function]] validation : Applications that back up to tape archive files need this option to verify that the data was accurately copied.<ref name="BackupExecVerify&WriteChecksumsToMedia">{{cite web |title=How do the "verify" and "write checksums to media" processes work and why are they necessary? |url=https://www.veritas.com/support/en_US/article.100030833.html |website=Veritas Support |publisher=Veritas.com |accessdate=16 September 2019 |date=15 October 2015 |at=Write checksums to media}}</ref>
*[[Business_process_management#Monitoring|Backup process monitoring]] : Enterprise client-server backup applications need a user interface that allows administrators to monitor the backup process, and proves compliance to regulatory bodies outside the organization; for example, an insurance company in the USA might be required under [[Health Insurance Portability and Accountability Act|HIPAA]] to demonstrate that its client data meet records retention requirements.<ref>[http://www.hipaadvisory.com/regs/recordretention.htm HIPAA Advisory] {{Webarchive|url=https://web.archive.org/web/20070411135655/http://www.hipaadvisory.com/regs/recordretention.htm |date=11 April 2007 }}. Retrieved 10 March 2007</ref>
*[[Enterprise_client-server_backup#User-initiated_backups_and_restores|User-initiated backups and restores]] : To avoid or recover from ''minor'' disasters, such as inadvertently deleting or overwriting the "good" versions of one or more files, the computer user—rather than an administrator—may initiate backups and restores (from not necessarily the most-recent backup) of files or folders.
==See also==
;About backup
* Backup software & services
** [[List of backup software]]
** [[List of online backup services]]
* [[Glossary of backup terms]]
* [[Virtual backup appliance]]
;Related topics
* [[Data consistency]]
* [[Data degradation]]
* [[Data proliferation]]
* [[Database dump]]
* [[Digital preservation]]
* [[Disaster recovery and business continuity auditing]]
==Notes==
{{reflist|group=note}}
==References==
{{reflist|30em}}
{{Wiktionary|back up}}
{{Wiktionary|backup}}
{{Commons category|Backup}}
[[Category:Backup| ]]
[[Category:Computer data]]
[[Category:Content management systems|*]]
[[Category:Data management]]
[[Category:Data security]]
[[Category:Records management]]' |
New page wikitext, after the edit (new_wikitext ) | '{{Use dmy dates|date=October 2019}}
{{about|backup in computer systems|other uses}}
In [[information technology]], a '''backup''', or '''data backup''' is a copy of [[computer data]] taken and stored elsewhere so that it may be used to restore the original after a [[data loss]] event. The verb form, referring to the process of doing so, is "[[wikt:back up|back up]]", whereas the noun and adjective form is "[[wikt:backup|backup]]".<ref name="AHDictionaryBackup">{{cite web |title=back•up |url=https://www.ahdictionary.com/word/search.html?q=backup |website=The American Heritage Dictionary of the English Language |publisher=Houghton Mifflin Harcourt |accessdate=9 May 2018 |year=2018}}</ref> Backups can be used to recover data after its loss from [[File deletion|data deletion]] or [[Data corruption|corruption]], or to recover data from an earlier time.<ref name="NelsonPro11">{{cite book
|url=https://books.google.com/books?id=r4uEEsq3CJYC
|title=Pro Data Backup and Recovery
|chapter=Chapter 1: Introduction to Backup and Recovery
|author=S. Nelson |publisher=Apress |pages=1–16 |year=2011
|isbn=978-1-4302-2663-5 |accessdate=8 May 2018}}</ref> Backups provide a simple form of [[disaster recovery]]; however not all backup systems are able to reconstitute a computer system or other complex configuration such as a [[computer cluster]], [[active directory]] server, or [[database server]].<ref name="CougiasTheBackup03Chapter01">{{cite book |chapter-url=https://books.google.com/books?id=eLviiTag5A0C&pg=PA1 |title=The Backup Book: Disaster Recovery from Desktop to Data Center |chapter=Chapter 1: What's a Disaster Without a Recovery? |author=Cougias, D.J. |author2=Heiberger, E.L. |author3=Koop, K. |publisher=Network Frontiers |pages=1–14 |year=2003 |isbn=0-9729039-0-9}}</ref>
A backup system contains at least one copy of all data considered worth saving. The [[computer data storage|data storage]] requirements can be large. An [[information repository]] model may be used to provide structure to this storage. There are different types of [[data storage device]]s used for copying backups of data that is already in secondary storage onto [[archive file]]s.<ref group = note name=ArchiveFileMayNotContainOld/HistoricalMaterial>In contrast to everyday use of the term "archive", the data stored in an "archive file" is not necessarily old or of historical interest.</ref><ref name="KissellTakeControlMacOSX">{{cite book
|author=Joe Kissell
|title=Take Control of Mac OS X Backups |date=2007
|publisher=TidBITS Electronic Publishing |location=Ithaca, NY
|isbn=978-0-9759503-0-2 |edition=Version 2.0
|url=http://people.fas.harvard.edu/~techtool/pages/Take_Control_of_Mac_OS_X_Backups_(2.0).pdf
|accessdate=17 May 2019 |ref=Kissell
|pages=18–20 ("The Archive", meaning information repository, including versioning), 24 (client-server), 82–83 (archive file), 112–114 (Off-site storage backup rotation scheme), 126–141 (old Retrospect terminology and GUI—still used in Windows variant), 165 (client-server), 128 (subvolume—later renamed Favorite Folder in Macintosh variant)}}</ref> There are also different ways these devices can be arranged to provide geographic dispersion, [[data security]], and portability.
Data is selected, extracted, and manipulated for storage. The process can include methods for [[Data_consistency#Point-in-time_consistency|dealing with live data]], including open files, as well as compression, encryption, and [[Data deduplication|de-duplication]]. Additional techniques apply to [[enterprise client-server backup]]. Backup schemes may include [[Dry run (testing)|dry runs]] that validate the reliability of the data being backed up. There are limitations<ref name=OutaBiz>{{cite newspaper |newspaper=[[The New York Times]]
|url=https://www.nytimes.com/2018/01/11/smarter-living/backing-up-your-photos.html
|title=A Beginner's Guide to Backing Up Photos
|author=Terry Sullivan |date=11 January 2018
|quote=a hard drive ... an established company ... declared bankruptcy ... where many ... had ...}}</ref> and human factors involved in any backup scheme.
==Storage==
A backup strategy requires an information repository, "a secondary storage space for data"<ref name="wiseGEEKInformationRepository">{{cite web |last1=McMahon |first1=Mary |title=What Is an Information Repository? |url=https://www.wisegeek.com/what-is-an-information-repository.htm |website=wiseGEEK |publisher=Conjecture Corporation |accessdate=8 May 2019 |date=1 April 2019 |quote=In the sense of an approach to data management, an information repository is a secondary storage space for data.}}</ref> that aggregates backups of data "sources". The repository could be as simple as a list of all backup media (DVDs, etc.) and the dates produced, or could include a computerized index, catalog, or [[relational database]].
The backup data needs to be stored, requiring a [[backup rotation scheme]],<ref name="KissellTakeControlMacOSX" /> which is a system of backing up data to computer media that limits the number of backups of different dates retained separately, by appropriate re-use of the data storage media by overwriting of backups no longer needed. The scheme determines how and when each piece of removable storage is used for a backup operation and how long it is retained once it has backup data stored on it. The 3-2-1 rule can aid in the backup process. It states that there should be at least 3 copies of the data, stored on 2 different types of storage media, and one copy should be kept offsite, in a remote location (this can include [[cloud storage]]). 2 or more different media should be used to eliminate data loss due to similar reasons (for example, optical discs may tolerate being underwater while LTO tapes may not, and SSDs cannot fail due to [[head crash]]es or damaged spindle motors since they don't have any moving parts, unlike hard drives) An offsite copy protects against fire, theft of physical media (such as tapes or discs) and natural disasters like floods and earthquakes.<ref>https://www.nakivo.com/blog/3-2-1-backup-rule-efficient-data-protection-strategy/</ref> Disaster protected hard drives like those made by [[ioSafe]] are an alternative to an offsite copy, but they have limitations like only being able to resist fire for a limited period of time, so an offsite copy still remains as the ideal choice.
===Backup methods===
====Unstructured====
An unstructured repository may simply be a stack of tapes, DVD-Rs or external HDDs with minimal information about what was backed up and when. This method is the easiest to implement, but unlikely to achieve a high level of recoverability as it lacks automation.
====Full only/System imaging====
A repository using this backup method contains complete source data copies taken at one or more specific points in time.<ref name="NakivoTypesOfBackup">{{cite web |last1=Mayer |first1=Alex |title=Backup Types Explained: Full, Incremental, Differential, Synthetic, and Forever-Incremental |url=https://www.nakivo.com/blog/backup-types-explained-full-incremental-differential-synthetic-and-forever-incremental/ |website=Nakivo Blog |publisher=Nakivo |accessdate=17 May 2019 |at=Full Backup, Incremental Backup, Differential Backup, Mirror Backup, Reverse Incremental Backup, Continuous Data Protection (CDP), Synthetic Full Backup, Forever-Incremental Backup
|date=6 November 2017}}</ref> Copying [[system image]]s, this method is frequently used by computer technicians to record known good configurations. However, imaging<ref>{{Cite web |title=Five key questions to ask about your backup solution |url=http://sysgen.ca/five-key-backup-questions/ |website=sysgen.ca |accessdate=23 September 2015 |archive-url=https://web.archive.org/web/20160304042343/http://sysgen.ca/five-key-backup-questions/ |archive-date=4 March 2016 |url-status=live |df=dmy-all|at=Does your company have a low tolerance to longer "data access outages" and/or would you like to minimize the time your company may be without its data?|date=23 March 2014 }}</ref> is generally more useful as a way of deploying a standard configuration to many systems rather than as a tool for making ongoing backups of diverse systems.
====Incremental====
An [[incremental backup]]
Dhananjay
====Near-CDP{{anchor|Continuous_data_protection}}====
[[Continuous Data Protection]] (CDP) refers to a backup that instantly saves a copy of every change made to the data. This allows restoration of data to any point in time and is the most comprehensive and advanced data protection.<ref name=InformationWeekWhyCDPGettingMorePractical>{{cite web
|author=Behzad Behtash
|url=https://www.informationweek.com/why-continuous-data-protections-getting-more-practical/d/d-id/1088883
|title=Why Continuous Data Protection's Getting More Practical
|work=Disaster recovery/business continuity
|publisher=InformationWeek |date=6 May 2010 |accessdate=12 November 2011 |quote=A true CDP approach should capture all data writes, thus continuously backing up data and eliminating backup windows.... CDP is the gold standard—the most comprehensive and advanced data protection. But "near CDP" technologies can deliver enough protection for many companies with less complexity and cost. For example, snapshots can provide a reasonable near-CDP-level of protection for file shares, letting users directly access data on the file share at regular intervals--say, every half-hour or 15 minutes. That's certainly a higher level of protection than tape-based or disk-based nightly backups and may be all you need.}}</ref> Near-CDP backup applications—often [[List of backup software#Proprietary|marketed]] as "CDP"—automatically take incremental backups at a specific interval, for example every 15 minutes, one hour, or 24 hours. They can therefore only allow restores to an interval boundary.<ref name=InformationWeekWhyCDPGettingMorePractical /> Near-CDP backup applications use [[Journaling file system|journaling]] and are typically based on periodic "snapshots",<ref name="ComputerWeeklyCDPExplained">{{cite web |title=Continuous data protection (CDP) explained: True CDP vs near-CDP |url=https://www.computerweekly.com/Continuous-data-protection-CDP-explained-True-CDP-vs-near-CDP |website=ComputerWeekly.com |publisher=TechTarget |accessdate=22 June 2019 |date=July 2010 |quote=... copies data from a source to a target. True CDP does this every time a change is made, while so-called near-CDP does this at pre-set time intervals. Near-CDP is effectively the same as snapshotting....True CDP systems record every write and copy them to the target where all changes are stored in a log. [new paragraph] By contrast, near-CDP/snapshot systems copy files in a straightforward manner but require applications to be quiesced and made ready for backup, either via the application's backup mode or using, for example, Microsoft's Volume Shadow Copy Services (VSS).}}</ref> [[file system permissions|read-only]] copies of the data frozen at a particular [[point in time]].
Near-CDP (except for [[Apple Time Machine]])<ref name="PondiniHowTMWorksItsMagic">{{cite web |last1=Pond |first1=James |title=How Time Machine Works its Magic |url=https://www.baligu.com/pondini/TM/Works.html |website=Apple OSX and Time Machine Tips |publisher=Baligu.com (as mirrored after James Pond died in 2013) |accessdate=10 July 2019 |date=31 August 2013 |quote=The File System Event Store is a hidden log that OSX keeps on each HFS+ formatted disk/partition of changes made to the data on it. It doesn’t list every file that’s changed, but each directory (folder) that’s had anything changed inside it.}}</ref> [[Intent log|intent-logs]] every change on the host system,<ref name="deGuiseEnterprise09#A.3.3">{{cite book |url=https://books.google.com/books?id=2OtqvySBTu4C&pg=PA287|title=Enterprise Systems Backup and Recovery: A Corporate Insurance Policy |author=de Guise, P. |publisher=CRC Press |pages=285–287 |year=2009 |isbn=978-1-4200-7639-4}}</ref> often by saving byte or block-level differences rather than file-level differences.<ref name="NakivoTypesOfBackup" /> This backup method differs from simple [[disk mirroring]]<ref name="NakivoTypesOfBackup" /> in that it enables a roll-back of the log and thus a restoration of old images of data. Intent-logging allows precautions for the consistency of live data, protecting ''self-consistent'' files but requiring ''applications'' "be quiesced and made ready for backup."
Near-CDP is more practicable for ordinary personal backup applications, as opposed to ''true'' CDP, which must be run in conjunction with a virtual machine<ref name="VictorWuEMCRecoverPointVM">{{cite web |last1=Wu |first1=Victor |title=EMC RecoverPoint for Virtual Machine Overview |url=https://wuchikin.wordpress.com/2017/03/04/emc-recoverpoint-for-virtual-machine-overview/ |website=Victor Virtual |publisher=WuChiKin |accessdate=22 June 2019 |date=4 March 2017 |quote=The splitter splits out the Write IOs to the VMDK/RDM of a VM and sends a copy to the production VMDK and also to the RecoverPoint for VMs cluster.}}</ref><ref name="RES-QServicesZertoOrVeeam">{{cite web |title=Zerto or Veeam? |url=https://resqdr.com/zerto-or-veeam/ |website=RES-Q Services |accessdate=7 July 2019 |date=March 2017 |quote=Zerto doesn’t use snapshot technology like Veeam. Instead, Zerto deploys small virtual machines on its physical hosts. These Zerto VMs capture the data as it is written to the host and then send a copy of that data to the replication site.....However, Veeam has the advantage of being able to more efficiently capture and store data for long-term retention needs. There is also a significant pricing difference, with Veeam being cheaper than Zerto.}}</ref> or equivalent<ref name="CloudEndureAgentRelated">{{cite web |title=Agent Related |url=https://docs.cloudendure.com/Content/FAQ/FAQ/Agent_Related.htm |website=CloudEndure.com |accessdate=3 July 2019 |at=What does the CloudEndure Agent do? |date=2019 |quote=The CloudEndure Agent performs an initial block-level read of the content of any volume attached to the server and replicates it to the Replication Server. The Agent then acts as an OS-level read filter to capture writes and synchronizes any block level modifications to the CloudEndure Replication Server, ensuring near-zero RPO.}}</ref> and is therefore generally used in enterprise client-server backups.
====Reverse incremental====
A [[Incremental backup#Reverse incremental|Reverse incremental]] backup method stores a recent archive file "mirror" of the source data and a series of differences between the "mirror" in its current state and its previous states. A reverse incremental backup method starts with a non-image full backup. After the full backup is performed, the system periodically synchronizes the full backup with the live copy, while storing the data necessary to reconstruct older versions. This can either be done using [[hard links]]—as Apple Time Machine does, or using binary [[data comparison|diffs]].
====Differential====
A differential backup saves only the data that has changed since the last full backup. This means a maximum of two backups from the repository are used to restore the data. However, as time from the last full backup (and thus the accumulated changes in data) increases, so does the time to perform the differential backup. Restoring an entire system requires starting from the most recent full backup and then applying just the last differential backup.
A differential backup copies files that have been created or changed since the last full backup, regardless of whether any other differential backups have been made since, whereas an incremental backup copies files that have been created or changed since the most recent backup of any type (full or incremental). Other variations of incremental backup include multi-level incrementals and block-level incrementals that compare parts of files instead of just entire files.
===Storage media===
[[File:DVD, USB flash drive and external hard drive.jpg|thumb|right|From left to right, a [[DVD]] disc in plastic cover, a USB flash drive and an [[external hard drive]]]]
Regardless of the repository model that is used, the data has to be copied onto an archive file data storage medium. The medium used is also referred to as the type of backup destination.
====Magnetic tape====
[[Magnetic tape data storage|Magnetic tape]] was for a long time the most commonly used medium for bulk data storage, backup, archiving, and interchange. It was previously a less expensive option, but this is no longer the case for smaller amounts of data.<ref name=EngenioDiskToDiskVsTape>{{cite web |date=9 December 2004 |accessdate=26 May 2019
|url=http://www.storagesearch.com/engenio-art2.html |archive-url=https://web.archive.org/web/20050207082953/http://www.storagesearch.com/engenio-art2.html |url-status=dead |archive-date=7 February 2005 |title=Disk to Disk Backup versus Tape – War or Truce? |last1=Gardner |first1=Steve |at=Peaceful coexistence |publisher=Engenio
}}</ref> Tape is a [[sequential access]] medium, so the rate of continuously writing or reading data can be very fast.
Many tape formats have been proprietary or specific to certain markets like mainframes or a particular brand of personal computer. By 2014 [[Linear Tape-Open#Market performance|LTO]] had become the primary tape technology.<ref name="SpectraLogicDigitalDataStorageOutlook2017">{{cite web |title=Digital Data Storage Outlook 2017 |url=https://spectralogic.com/wp-content/uploads/white-paper-digital-data-storage-outlook-2017-v3.pdf |website=Spectra |publisher=Spectra Logic |accessdate=11 July 2018 |page=7(Solid-State), 10(Magnetic Disk), 14(Tape), 17(Optical) |year=2017}}</ref> The other remaining viable "super" format is the [[IBM 3592]] (also referred to as the TS11xx series). The [[StorageTek tape formats#T10000|Oracle StorageTek T10000]] was discontinued in 2016.<ref name="ForbesKeepingDataLongTime">{{cite web |author=Tom Coughlin |title=Keeping Data for a Long Time |url=https://www.forbes.com/sites/tomcoughlin/2014/06/29/keeping-data-for-a-long-time/ |website=Forbes |publisher=Forbes Media LLC |accessdate=19 April 2018 |date=29 June 2014 |at=para. Magnetic Tapes(popular formats, storage life), para. Hard Disk Drives(active archive), para. First consider flash memory in archiving(... may not have good media archive life)}}</ref>
====Hard disk====
The use of [[hard disk]] storage has increased over time as it has become progressively cheaper. Hard disks are usually easy to use, widely available, and can be accessed quickly.<ref name="SpectraLogicDigitalDataStorageOutlook2017" /> However, hard disk backups are [[Hard disk drive#Magnetic recording|close-tolerance mechanical devices]] and may be more easily damaged than tapes, especially while being transported.<ref name="PCWorldHardCoreDataPreservation">{{cite web |last1=Jacobi |first1=John L. |title=Hard-core data preservation: The best media and methods for archiving your data |url=https://www.pcworld.com/article/2984597/storage/hard-core-data-preservation-the-best-media-and-methods-for-archiving-your-data.html |website=PC World |accessdate=19 April 2018 |date=29 February 2016 |at=sec. External Hard Drives(on the shelf, magnetic properties, mechanical stresses, vulnerable to shocks), Tape, Online storage}}</ref> In the mid-2000s, several drive manufacturers began to produce portable drives employing [[Hard disk drive failure#Unloading|ramp loading and accelerometer]] technology (sometimes termed a "shock sensor"),<ref name="HGSTRampLoadUnload">{{cite web |title=Ramp Load/Unload Technology in Hard Disk Drives |url=https://www.hgst.com/sites/default/files/resources/LoadUnload_white_paper_FINAL.pdf |website=HGST |publisher=Western Digital |accessdate=29 June 2018 |page=3(sec. Enhanced Shock Tolerance) |date=November 2007}}</ref><ref name="ToshibaCanvio3.0PortableHDD">{{cite web |title=Toshiba Portable Hard Drive (Canvio® 3.0) |url=https://www.toshibadata.com.sg/Product-Canvio-Portable-Hard-Drive.aspx |website=Toshiba Data Dynamics Singapore |publisher=Toshiba Data Dynamics Pte Ltd |accessdate=16 June 2018 |year=2018 |at=sec. Overview(Internal shock sensor and ramp loading technology)}}</ref> and by 2010 the industry average in drop tests for drives with that technology showed drives remaining intact and working after a 36-inch non-operating drop onto industrial carpeting.<ref name="IomegaDropShock">{{cite web |title=Iomega Drop Guard ™ Technology |url=https://www.doc-developpement-durable.org/file/Projets-informatiques/Drop%20Guard-disque-dur-tres-solide.pdf |website=Hard Drive Storage Solutions |publisher=Iomega Corp. |accessdate=12 July 2018 |pages=2(What is Drop Shock Technology?, What is Drop Guard Technology? (... features special internal cushioning .... 40% above the industry average)), 3(*NOTE) |date=20 September 2010}}</ref> Some manufacturers also offer 'ruggedized' portable hard drives, which include a shock-absorbing case around the hard disk, and [[MIL-STD-810#Applicability to "ruggedized" consumer products|claim]] a range of higher drop specifications.<ref name="IomegaDropShock" /><ref name=PCMagBest>
{{cite web |author=John Burek
|title=The Best Rugged Hard Drives and SSDs
|url=https://www.pcmag.com/roundup/361072/the-best-rugged-hard-drives-and-ssds
|website=[[PC Magazine]] |publisher=Ziff Davis |accessdate=4 August 2018
|at=What Exactly Makes a Drive Rugged?(When a drive is encased ... you're mostly at the mercy of the drive vendor to tell you the rated maximum drop distance for the drive) |date=15 May 2018}}</ref><ref name="WirecutterBestPortableHardDrive2017Don'tBuy">{{cite web
|author=Justin Krajeski |author2=Kimber Streams
|title=The Best Portable Hard Drive
|url=http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive
|work=[[The New York Times]] |accessdate=4 August 2018
|archive-url=https://web.archive.org/web/20170331161821/http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive |url-status=dead |archivedate=31 March 2017 |date=20 March 2017}}</ref> Over a period of years the stability of hard disk backups is shorter than that of tape backups.<ref name="ForbesKeepingDataLongTime" /><ref name="IronMountainBestLong-TermDataArchiveSolutions">{{cite web |title=Best Long-Term Data Archive Solutions |url=http://www.ironmountain.com/resources/general-articles/b/best-long-term-data-archive-solutions |website=Iron Mountain |publisher=Iron Mountain Inc. |accessdate=19 April 2018 |year=2018 |at=sec. More Reliable(average mean time between failure ... rates, best practice for migrating data)}}</ref><ref name="PCWorldHardCoreDataPreservation" />
External hard disks can be connected via local interfaces like [[SCSI]], [[USB]], [[FireWire]], or [[eSATA]], or via longer-distance technologies like [[Ethernet]], [[iSCSI]], or [[Fibre Channel]]. Some disk-based backup systems, via [[Virtual tape library|Virtual Tape Libraries]] or otherwise, support data deduplication, which can reduce the amount of disk storage capacity consumed by daily and weekly backup data.<ref name="KissellTakeControlBackingUp">{{cite book |last1=Kissell |first1=Joe |title=Take Control of Backing Up Your Mac |date=2011 |publisher=TidBITS Publishing Inc. |location=Ithaca NY |isbn=978-1-61542-394-1 |page=41(Deduplication) |url=https://books.google.com/books?id=ANe3k_7bnAcC&q=retrospect+deduplication&pg=PT41 |accessdate=17 September 2019}}</ref><ref>{{Cite web |url=http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html |title=Symantec Shows Backup Exec a Little Dedupe Love; Lays out Source Side Deduplication Roadmap – DCIG |website=DCIG |access-date=26 February 2016 |archive-url=https://web.archive.org/web/20160304212819/http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html |archive-date=4 March 2016 |url-status=live |df=dmy-all}}</ref><ref name="NetBackupDeduplicationGuide">{{cite web |title=Veritas NetBackup™ Deduplication Guide |url=https://www.veritas.com/content/support/en_US/doc/ka6j00000000ADEAA2 |website=Veritas |publisher=Veritas Technologies LLC |accessdate=26 July 2018 |year=2016}}</ref>
====Optical storage====
[[Optical storage]] uses lasers to store and retrieve data. Recordable [[CD]]s, DVDs, and [[Blu-ray Disc]]s are commonly used with personal computers and are generally cheap. In the past, the capacities and speeds of these discs have been lower than hard disks or tapes, although advances in optical media are slowly shrinking that gap.<ref name="WanOptical14">{{cite journal
|title=Optical storage: An emerging option in long-term digital preservation
|journal=Frontiers of Optoelectronics
|author1=S. Wan |author2=Q. Cao |author3=C. Xie |volume=7 |issue=4 |pages=486–492 |year=2014
|doi=10.1007/s12200-014-0442-2|s2cid=60816607
}}</ref><ref>{{cite journal
|title=High-capacity optical long data memory based on enhanced Young's modulus in nanoplasmonic hybrid glass composites
|journal=Nature Communications
|author1=Q. Zhang |author2=Z. Xia |author3=Y.-B. Cheng |author4=M. Gu
|volume=9 |issue=1
|pages=1183 |year=2018 |doi=10.1038/s41467-018-03589-y|pmid=29568055
|bibcode=2018NatCo...9.1183Z
|pmc=5864957 }}</ref> Many optical disc formats are [[Write Once Read Many|WORM]] type, which makes them useful for archival purposes since the data cannot be changed. Some optical storage systems allow for cataloged data backups without human contact with the discs, allowing for longer data integrity. A French study in 2008 indicated that the lifespan of typically-sold [[CD-R#Lifespan|CD-Rs]] was 2–10 years,<ref name=INA_CD-R_Study>{{cite web
|url=http://www.ina.fr/video/3571726001/20-heures-emission-du-3-mars-2008.fr.html
|title= Journal de 20 Heures |accessdate=3 March 2008
|at=approximately minute 30 of the TV news broadcast
|work= Institut national de l'audiovisuel
|author1=Gérard Poirier |author2=Foued Berahou
|date=3 March 2008}}</ref> but one manufacturer later estimated the longevity of its CD-Rs with a gold-sputtered layer to be as high as 100 years.<ref>{{cite web
|url=http://delkin.com/i-5937134-archival-gold-cd-r-300-year-disc-binder-of-10-discs-with-scratch-armor-surface.html
|title=Archival Gold CD-R "300 Year Disc" Binder of 10 Discs with Scratch Armor Surface
|archivedate=27 September 2013 |archiveurl=https://web.archive.org/web/20130927170900/http://delkin.com/i-5937134-archival-gold-cd-r-300-year-disc-binder-of-10-discs-with-scratch-armor-surface.html
|website=Delkin Devices |publisher=Delkin Devices Inc.}}</ref> Sony's [[Optical Disc Archive]]<ref name="SpectraLogicDigitalDataStorageOutlook2017" /> can in 2016 reach a read rate of 250MB/s.<ref name="SonyOpticalDiscArchiveGen2">{{cite web |title=Optical Disc Archive Generation 2 |url=https://pro.sony/s3/cms-static-content/file/49/1237494482649.pdf |website=Optical Disc Archive |publisher=Sony |accessdate=15 August 2019 |page=12(World’s First 8-Channel Optical Drive Unit) |date=April 2016}}</ref>
====Solid-state drive====
[[Solid-state drives]] (SSDs) use [[integrated circuit]] assemblies to store data. [[Flash memory]], [[thumb drive]]s, [[USB flash drive]]s, [[CompactFlash]], [[SmartMedia]], [[Memory Stick]]s, and [[Secure Digital card]] devices are relatively expensive for their low capacity, but convenient for backing up relatively low data volumes. A solid-state drive does not contain any movable parts, making it less susceptible to physical damage, and can have huge throughput of around 500 Mbit/s up to 6 Gbit/s. Available SSDs have become more capacious and cheaper.<ref>{{cite journal
|title=Solid-State Drives (SSDs) |journal=Proceedings of the IEEE
|author1=R. Micheloni |author2=P. Olivo
|volume=105 |issue=9 |pages=1586–88 |year=2017
|doi=10.1109/JPROC.2017.2727228 }}</ref><ref name=PCMagBest/> Flash memory backups are stable for fewer years than hard disk backups.<ref name="ForbesKeepingDataLongTime" />
====Remote backup service====
[[Remote backup service]]s or cloud backups involve service providers storing data offsite. This has been used to protect against events such as fires, floods, or earthquakes which could destroy locally stored backups.<ref name="DellRemoteBackup">{{cite web
|url=https://www.emc.com/corporate/glossary/remote-backup.htm
|title=Remote Backup |work=EMC Glossary |publisher=Dell, Inc
|accessdate=8 May 2018
|quote=Effective remote backup requires that production data be regularly backed up to a location far enough away from the primary location so that both locations would not be affected by the same disruptive event.}}</ref> Cloud-based backup (through services like or similar to [[Google Drive]], and [[Microsoft OneDrive]]) provides a layer of data protection.<ref name="PCWorldHardCoreDataPreservation" /> However, the users must trust the provider to maintain the privacy and integrity of their data, with confidentiality enhanced by the use of [[encryption]]. Because speed and availability are limited by a user's online connection,<ref name="PCWorldHardCoreDataPreservation" /> users with large amounts of data may need to use cloud seeding and large-scale recovery.
===Management===
Various methods can be used to manage backup media, striking a balance between accessibility, security and cost. These media management methods are not mutually exclusive and are frequently combined to meet the user's needs. Using on-line disks for staging data before it is sent to a near-line [[tape library]] is a common example.<ref name="StackpoleSoftware07">{{cite book |url=https://books.google.com/books?id=gjAhVzuV7k0C&pg=PA164 |title=Software Deployment, Updating, and Patching |author=Stackpole, B. |author2=Hanrion, P. |publisher=CRC Press |pages=164–165 |year=2007 |isbn=978-1-4200-1329-0 |accessdate=8 May 2018}}</ref><ref name="GnanasundaramInfo12">{{cite book |url=https://books.google.com/books?id=PU7gkW9ArxIC&pg=PA255 |title=Information Storage and Management: Storing, Managing, and Protecting Digital Information in Classic, Virtualized, and Cloud Environments |editor=Gnanasundaram, S. |editor2=Shrivastava, A. |publisher=John Wiley and Sons |page=255 |year=2012 |isbn=978-1-118-23696-3 |accessdate=8 May 2018}}</ref>
====Online====
[[Online]] backup storage is typically the most accessible type of data storage, and can begin a restore in milliseconds. An internal hard disk or a [[disk array]] (maybe connected to [[Storage area network|SAN]]) is an example of an online backup. This type of storage is convenient and speedy, but is vulnerable to being deleted or overwritten, either by accident, by malevolent action, or in the wake of a data-deleting [[Computer virus|virus]] payload.
====Near-line====
[[Nearline storage]] is typically less accessible and less expensive than online storage, but still useful for backup data storage. A mechanical device is usually used to move media units from storage into a drive where the data can be read or written. Generally it has safety properties similar to on-line storage. An example is a [[tape library]] with restore times ranging from seconds to a few minutes.
====Off-line====
[[Off-line storage]] requires some direct action to provide access to the storage media: for example, inserting a tape into a tape drive or plugging in a cable. Because the data is not accessible via any computer except during limited periods in which they are written or read back, they are largely immune to on-line backup failure modes. Access time varies depending on whether the media are on-site or off-site.
====Off-site data protection====
Backup media may be sent to an [[off-site data protection|off-site]] vault to protect against a disaster or other site-specific problem. The vault can be as simple as a system administrator's home office or as sophisticated as a disaster-hardened, temperature-controlled, high-security bunker with facilities for backup media storage. A data replica can be off-site but also on-line (e.g., an off-site [[RAID]] mirror). Such a replica has fairly limited value as a backup.
====Backup site====
A [[backup site]] or disaster recovery center is used to store data that can enable computer systems and networks to be restored and properly configure in the event of a disaster. Some organisations have their own data recovery centres, while others contract this out to a third-party. Due to high costs, backing up is rarely considered the preferred method of moving data to a DR site. A more typical way would be remote [[disk mirroring]], which keeps the DR data as up to date as possible.
==Selection and extraction of data==
A backup operation starts with selecting and extracting coherent units of data. Most data on modern computer systems is stored in discrete units, known as [[Computer file|files]]. These files are organized into [[filesystem]]s. Deciding what to back up at any given time involves tradeoffs. By backing up too much redundant data, the information repository will fill up too quickly. Backing up an insufficient amount of data can eventually lead to the loss of critical information.<ref name="LeesWhatTo17">{{cite web |url=https://irontree.co.za/what-to-backup-a-critical-look-at-your-data-1935.html |title=What to backup – a critical look at your data |author=Lee
|work=Irontree Blog |publisher=Irontree Internet Services CC
|date=25 January 2017 |accessdate=8 May 2018}}</ref>
===Files===
*[[File copying|Copying files]] : Making copies of files is the simplest and most common way to perform a backup. A means to perform this basic function is included in all backup software and all operating systems.
*Partial file copying: A backup may include only the blocks or bytes within a file that have changed in a given period of time. This can substantially reduce needed storage space, but requires higher sophistication to reconstruct files in a restore situation. Some implementations require integration with the source file system.
*Deleted files : To prevent the unintentional restoration of files that have been intentionally deleted, a record of the deletion must be kept.
*Versioning of files : Most backup applications, other than those that do only full only/System imaging, also back up files that have been modified since the last backup. "That way, you can retrieve many different versions of a given file, and if you delete it on your hard disk, you can still find it in your [information repository] archive."<ref name="KissellTakeControlMacOSX" />
===Filesystems===
*Filesystem dump: A copy of the whole filesystem in block-level can be made. This is also known as a "raw partition backup" and is related to [[disk image|disk imaging]]. The process usually involves unmounting the filesystem and running a program like [[dd (Unix)]].<ref name="PrestonBackup07">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA111 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=111–114 |year=2007 |isbn=978-0-596-55504-7 |accessdate=8 May 2018}}</ref> Because the disk is read sequentially and with large buffers, this type of backup can be faster than reading every file normally, especially when the filesystem contains many small files, is highly fragmented, or is nearly full. But because this method also reads the free disk blocks that contain no useful data, this method can also be slower than conventional reading, especially when the filesystem is nearly empty. Some filesystems, such as [[XFS]], provide a "dump" utility that reads the disk sequentially for high performance while skipping unused sections. The corresponding restore utility can selectively restore individual files or the entire volume at the operator's choice.<ref name="PrestonUnix99">{{cite book |url=https://archive.org/details/unixbackuprecove00wcur |url-access=registration |title=Unix Backup & Recovery |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=[https://archive.org/details/unixbackuprecove00wcur/page/73 73]–91 |year=1999 |isbn=978-1-56592-642-4 |accessdate=8 May 2018}}</ref>
*Identification of changes: Some filesystems have an [[archive bit]] for each file that says it was recently changed. Some backup software looks at the date of the file and compares it with the last backup to determine whether the file was changed.
*[[Versioning file system]] : A versioning filesystem tracks all changes to a file. The [[NILFS]] versioning filesystem for Linux is an example.<ref name="NILFSHome">{{cite web |title=NILFS Home |url=https://nilfs.sourceforge.io/en/ |website=NILFS Continuous Snapshotting System |publisher=NILFS Community |accessdate=22 August 2019 |date=2019}}</ref>
===Live data===
Files that are actively being updated present a challenge to back up. One way to back up live data is to temporarily [[quiesce]] them (e.g., close all files), take a "snapshot", and then resume live operations. At this point the snapshot can be backed up through normal methods.<ref name="CougiasTheBackup03Chapter11">{{cite book |chapter-url=https://books.google.com/books?id=eLviiTag5A0C&pg=PA360|title=The Backup Book: Disaster Recovery from Desktop to Data Center |chapter=Chapter 11: Open file backup for databases |author=Cougias, D.J. |author2=Heiberger, E.L. |author3=Koop, K. |publisher=Network Frontiers |pages=356–360 |year=2003 |isbn=0-9729039-0-9}}</ref> A [[Snapshot (computer storage)|snapshot]] is an instantaneous function of some [[file system|filesystems]] that presents a copy of the filesystem as if it were frozen at a specific point in time, often by a [[copy-on-write]] mechanism. Snapshotting a file while it is being changed results in a corrupted file that is unusable. This is also the case across interrelated files, as may be found in a conventional database or in applications such as [[Microsoft Exchange Server]].<ref name="ComputerWeeklyCDPExplained" /> The term [[fuzzy backup]] can be used to describe a backup of live data that looks like it ran correctly, but does not represent the state of the data at a single point in time.<ref name="LiotineMission03">{{cite book |url=https://books.google.com/books?id=LecC2BhPPxMC&pg=PA244 |title=Mission-critical Network Planning |author=Liotine, M. |publisher=Artech House |page=244 |year=2003 |isbn=978-1-58053-559-5 |accessdate=8 May 2018}}</ref>
Backup options for data files that cannot be or are not quiesced include:<ref name="deGuiseEnterprise09#3.4.7">{{cite book |url=https://books.google.com/books?id=2OtqvySBTu4C&pg=PA50 |title=Enterprise Systems Backup and Recovery: A Corporate Insurance Policy |author=de Guise, P. |publisher=CRC Press |pages=50–54 |year=2009 |isbn=978-1-4200-7639-4}}</ref>
*Open file backup: Many backup software applications undertake to back up open files in an internally consistent state.<ref name="HandyBackupOpenFileWindows">{{cite web |title=Open File Backup Software for Windows |url=https://www.handybackup.net/open-file-backup.shtml
|website=Handy Backup |publisher=Novosoft LLC
|accessdate=29 November 2018 |date=8 November 2018}}</ref> Some applications simply check whether open files are in use and try again later.<ref name="CougiasTheBackup03Chapter11" /> Other applications exclude open files that are updated very frequently.<ref name="ArqTroubleshootingBackingUpOpen/lockedFiles">{{cite web |last1=Reitshamer |first1=Stefan |title=Troubleshooting backing up open/locked files on Windows |url=https://www.arqbackup.com/blog/troubleshooting-backing-up-openlocked-files-on-windows/ |website=Arq Blog
|publisher=Haystack Software
|accessdate=29 November 2018 |date=5 July 2017 |at=Stefan Reitshamer is the principal developer of Arq}}</ref> Some [[High availability|low-availability]] interactive applications can be backed up via natural/induced pausing.
*Interrelated database files backup: Some interrelated database file systems offer a means to generate a "hot backup"<ref name="UWiscOracleBackups">{{cite web |last1=Boss |first1=Nina |title=Oracle Tips Session #3: Oracle Backups |url=http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup |archive-url=https://web.archive.org/web/20070302110933/http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup |url-status=dead |website=www.wisc.edu |publisher=University of Wisconsin |accessdate=1 December 2018 |archivedate=2 March 2007 |date=10 December 1997}}</ref> of the database while it is online and usable. This may include a snapshot of the data files plus a snapshotted log of changes made while the backup is running. Upon a restore, the changes in the log files are applied to bring the copy of the database up to the point in time at which the initial backup ended.<ref name="ArcserveOracleWhatIsARCHIVE-LOG">{{cite web |title=What is ARCHIVE-LOG and NO-ARCHIVE-LOG mode in Oracle and the advantages & disadvantages of these modes? |url=https://support.arcserve.com/s/article/202080249?language=en_US |website=Arcserve Backup |publisher=Arcserve |accessdate=29 November 2018 |date=27 September 2018}}</ref> Other low-availability interactive applications can be backed up via coordinated snapshots. However, [[High availability|genuinely-high-availability]] interactive applications can be only be backed up via Continuous Data Protection.
===Metadata===
Not all information stored on the computer is stored in files. Accurately recovering a complete system from scratch requires keeping track of this [[metadata|non-file data]] too.<ref name="Gresovnik1">{{cite web |url=http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html |title=Preparation of Bootable Media and Images |last=Grešovnik |first=Igor |date=April 2016 |archive-url=https://web.archive.org/web/20160425113119/http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html |archivedate=25 April 2016 |access-date=21 April 2016}}</ref>
*System description: System specifications are needed to procure an exact replacement after a disaster.
*[[Boot sector]] : The boot sector can sometimes be recreated more easily than saving it. It usually isn't a normal file and the system won't boot without it.
*[[Disk partitioning|Partition]] layout: The layout of the original disk, as well as partition tables and filesystem settings, is needed to properly recreate the original system.
*File [[metadata]] : Each file's permissions, owner, group, ACLs, and any other metadata need to be backed up for a restore to properly recreate the original environment.
*System metadata: Different operating systems have different ways of storing configuration information. [[Microsoft Windows]] keeps a [[Windows Registry|registry]] of system information that is more difficult to restore than a typical file.
==Manipulation of data and dataset optimization==
It is frequently useful or required to manipulate the data being backed up to optimize the backup process. These manipulations can improve backup speed, restore speed, data security, media usage and/or reduced bandwidth requirements.
===Automated data grooming===
Out-of-date data can be automatically deleted, but for personal backup applications—as opposed to enterprise client-server backup applications where automated data "grooming" can be customized—the deletion<ref group=note>Some backup applications—notably [[Rsync#History|rsync]] and [[Code42#File_backup_and_sharing_services|CrashPlan]]—term removing backup data "pruning" instead of "grooming".[https://linux.die.net/man/1/rsync][https://support.code42.com/Administrator/5/Monitoring_and_managing/Archive_maintenance#Prune]</ref> can at most<ref name="PondiniFAQ12">{{cite web |last1=Pond |first1=James |title=12. Should I delete old backups? If so, How? |url=https://www.baligu.com/pondini/TM/12.html |website=Time Machine |publisher=baligu.com |accessdate=21 June 2019 |at=Green box, Gray box |date=2 June 2012}}</ref> be globally delayed or be disabled.<ref name="WirecutterBestOnlineCloudBackupService">{{cite web |last1=Kissell |first1=Joe |title=The Best Online Cloud Backup Service |url=https://thewirecutter.com/reviews/best-online-backup-service/ |website=wirecutter |publisher=The New York Times|accessdate=21 June 2019 |at=Next, there’s file retention. |date=12 March 2019}}</ref>
===Compression===
Various schemes can be employed to [[Data compression|shrink]] the size of the source data to be stored so that it uses less storage space. Compression is frequently a built-in feature of tape drive hardware.<ref name="CherrySecuring15">{{cite book |url=https://books.google.com/books?id=SD_LAwAAQBAJ&pg=PA306 |title=Securing SQL Server: Protecting Your Database from Attackers
|author=D. Cherry |publisher=Syngress |pages=306–308 |year=2015
|isbn=978-0-12-801375-5 |accessdate=8 May 2018}}</ref>
===Deduplication===
Redundancy due to backing up similarly configured workstations can be reduced, thus storing just one copy. This technique can be applied at the file or raw block level. This potentially large reduction<ref name="CherrySecuring15" /> is called [[Data deduplication|deduplication]]. It can occur on a server before any data moves to backup media, sometimes referred to as source/client side deduplication. This approach also reduces bandwidth required to send backup data to its target media. The process can also occur at the target storage device, sometimes referred to as inline or back-end deduplication.
===Duplication===
Sometimes backups are [[Replication (computer science)|duplicated]] to a second set of storage media. This can be done to rearrange the archive files to optimize restore speed, or to have a second copy at a different location or on a different storage medium—as in the disk-to-disk-to-tape capability of Enterprise client-server backup.
===Encryption===
High-capacity removable storage media such as backup tapes present a data security risk if they are lost or stolen.<ref>[http://www.securityfocus.com/news/11048 Backups tapes a backdoor for identity thieves] {{Webarchive|url=https://web.archive.org/web/20160405033517/http://www.securityfocus.com/news/11048 |date=5 April 2016 }} (28 April 2004). Retrieved 10 March 2007</ref> [[Encryption|Encrypting]] the data on these media can mitigate this problem, however encryption is a CPU intensive process that can slow down backup speeds, and the security of the encrypted backups is only as effective as the security of the key management policy.<ref name="CherrySecuring15" />
===Multiplexing===
When there are many more computers to be backed up than there are destination storage devices, the ability to use a single storage device with several simultaneous backups can be useful.<ref name="PrestonBackup07-02">{{cite book |url=https://books.google.com/books?id=6-w4fXbBInoC&pg=PA219 |title=Backup & Recovery: Inexpensive Backup Solutions for Open Systems |author=Preston, W.C. |publisher=O'Reilly Media, Inc |pages=219–220 |year=2007 |isbn=978-0-596-55504-7 |accessdate=8 May 2018}}</ref> However cramming the scheduled [[Glossary_of_backup_terms#Terms and definitions|backup window]] via "multiplexed backup" is only used for tape destinations.<ref name="PrestonBackup07-02" />
===Refactoring===
The process of rearranging the sets of backups in an archive file is known as refactoring. For example, if a backup system uses a single tape each day to store the incremental backups for all the protected computers, restoring one of the computers could require many tapes. Refactoring could be used to consolidate all the backups for a single computer onto a single tape, creating a "synthetic full backup". This is especially useful for backup systems that do incrementals forever style backups.
===Staging===
Sometimes backups are copied to a [[Disk staging|staging]] disk before being copied to tape.<ref name="PrestonBackup07-02" /> This process is sometimes referred to as D2D2T, an acronym for [[Disk-to-disk-to-tape]]. It can be useful if there is a problem matching the speed of the final destination device with the source device, as is frequently faced in network-based backup systems. It can also serve as a centralized location for applying other data manipulation techniques.
===Objectives===
*[[Disaster recovery#Recovery Point Objective|Recovery point objective]] (RPO) : The point in time that the restarted infrastructure will reflect, expressed as "the maximum targeted period in which data (transactions) might be lost from an IT service due to a major incident". Essentially, this is the roll-back that will be experienced as a result of the recovery. The most desirable RPO would be the point just prior to the data loss event. Making a more recent recovery point achievable requires increasing the frequency of [[file synchronization|synchronization]] between the source data and the backup repository.<ref name="RiskyThinkingDefRPO">{{cite web |title=Recovery Point Objective (Definition) |url=https://www.riskythinking.com/glossary/recovery_point_objective.php |website=ARL Risky Thinking |publisher=Albion Research Ltd. |accessdate=4 August 2019 |date=2007}}</ref>
*Recovery time objective (RTO) : The amount of time elapsed between disaster and restoration of business functions.<ref name="RiskyThinkingDefRTO">{{cite web |title=Recovery Time Objective (Definition) |url=https://www.riskythinking.com/glossary/recovery_time_objective.php |website=ARL Risky Thinking |publisher=Albion Research Ltd. |accessdate=4 August 2019 |date=2007}}</ref>
*[[Data security]] : In addition to preserving access to data for its owners, data must be restricted from unauthorized access. Backups must be performed in a manner that does not compromise the original owner's undertaking. This can be achieved with data encryption and proper media handling policies.<ref name="LittleImplement03">{{cite book |chapter-url=https://books.google.com/books?id=_DqO6kizEDUC&pg=PA17 |title=Implementing Backup and Recovery: The Readiness Guide for the Enterprise |chapter=Chapter 2: Business Requirements of Backup Systems |author=Little, D.B. |publisher=John Wiley and Sons |pages=17–30 |year=2003 |isbn=978-0-471-48081-5 |accessdate=8 May 2018}}</ref>
*[[Data retention]] period : Regulations and policy can lead to situations where backups are expected to be retained for a particular period, but not any further. Retaining backups after this period can lead to unwanted liability and sub-optimal use of storage media.<ref name="LittleImplement03" />
*[[Checksum]] or [[hash function]] validation : Applications that back up to tape archive files need this option to verify that the data was accurately copied.<ref name="BackupExecVerify&WriteChecksumsToMedia">{{cite web |title=How do the "verify" and "write checksums to media" processes work and why are they necessary? |url=https://www.veritas.com/support/en_US/article.100030833.html |website=Veritas Support |publisher=Veritas.com |accessdate=16 September 2019 |date=15 October 2015 |at=Write checksums to media}}</ref>
*[[Business_process_management#Monitoring|Backup process monitoring]] : Enterprise client-server backup applications need a user interface that allows administrators to monitor the backup process, and proves compliance to regulatory bodies outside the organization; for example, an insurance company in the USA might be required under [[Health Insurance Portability and Accountability Act|HIPAA]] to demonstrate that its client data meet records retention requirements.<ref>[http://www.hipaadvisory.com/regs/recordretention.htm HIPAA Advisory] {{Webarchive|url=https://web.archive.org/web/20070411135655/http://www.hipaadvisory.com/regs/recordretention.htm |date=11 April 2007 }}. Retrieved 10 March 2007</ref>
*[[Enterprise_client-server_backup#User-initiated_backups_and_restores|User-initiated backups and restores]] : To avoid or recover from ''minor'' disasters, such as inadvertently deleting or overwriting the "good" versions of one or more files, the computer user—rather than an administrator—may initiate backups and restores (from not necessarily the most-recent backup) of files or folders.
==See also==
;About backup
* Backup software & services
** [[List of backup software]]
** [[List of online backup services]]
* [[Glossary of backup terms]]
* [[Virtual backup appliance]]
;Related topics
* [[Data consistency]]
* [[Data degradation]]
* [[Data proliferation]]
* [[Database dump]]
* [[Digital preservation]]
* [[Disaster recovery and business continuity auditing]]
==Notes==
{{reflist|group=note}}
==References==
{{reflist|30em}}
{{Wiktionary|back up}}
{{Wiktionary|backup}}
{{Commons category|Backup}}
[[Category:Backup| ]]
[[Category:Computer data]]
[[Category:Content management systems|*]]
[[Category:Data management]]
[[Category:Data security]]
[[Category:Records management]]' |
Unified diff of changes made by edit (edit_diff ) | '@@ -37,6 +37,8 @@
====Incremental====
-An [[incremental backup]] stores data changed since a reference point in time.<ref name="NakivoWhatIsIncrementalBackup">{{cite web |last1=Reed |first1=Jessie |title=What Is Incremental Backup? |url=https://www.nakivo.com/blog/what-is-incremental-backup/ |website=Nakivo Blog |publisher=Nakivo |accessdate=17 May 2019 |at=Reverse incremental, Multilevel incremental, Block-level |date=27 February 2018}}</ref> Duplicate copies of unchanged data aren't copied.<ref name="NakivoTypesOfBackup" /> Typically a full backup of all files is once or at infrequent intervals, serving as the reference point for an incremental repository. Subsequently, a number of incremental backups are made after successive time periods. Restores begin with the last full backup and then apply the incrementals.<ref name="Tech-FAQIncrementalBackup">{{cite web |title=Incremental Backup |url=http://www.tech-faq.com/incremental-backup.shtml |website=Tech-FAQ |publisher=Independent Media |accessdate=10 March 2006 |archiveurl=https://web.archive.org/web/20160621090117/http://www.tech-faq.com/incremental-backup.shtml |archivedate=21 June 2016 |date=13 June 2005}}</ref>
-Some backup systems<ref name="PondHowTimeMachineWorks">{{cite web | last1=Pond | first1=James| url=http://baligu.com/pondini/TM/Works.html | title=How Time Machine Works its Magic |website=Apple OSX and Time Machine Tips |publisher=baligu.com| accessdate=19 May 2019 | date=31 August 2013 | at=File System Event Store,Hard Links}}</ref> can create a {{visible anchor|synthetic full backup}} from a series of incrementals, thus providing the equivalent of frequently doing a full backup.<ref name="NakivoTypesOfBackup"/> When done to modify a single archive file, this speeds restores of recent versions of files.
+An [[incremental backup]]
+
+Dhananjay
+
====Near-CDP{{anchor|Continuous_data_protection}}====
' |
New page size (new_size ) | 53021 |
Old page size (old_size ) | 54741 |
Size change in edit (edit_delta ) | -1720 |
Lines added in edit (added_lines ) | [
0 => 'An [[incremental backup]] ',
1 => '',
2 => 'Dhananjay',
3 => ''
] |
Lines removed in edit (removed_lines ) | [
0 => 'An [[incremental backup]] stores data changed since a reference point in time.<ref name="NakivoWhatIsIncrementalBackup">{{cite web |last1=Reed |first1=Jessie |title=What Is Incremental Backup? |url=https://www.nakivo.com/blog/what-is-incremental-backup/ |website=Nakivo Blog |publisher=Nakivo |accessdate=17 May 2019 |at=Reverse incremental, Multilevel incremental, Block-level |date=27 February 2018}}</ref> Duplicate copies of unchanged data aren't copied.<ref name="NakivoTypesOfBackup" /> Typically a full backup of all files is once or at infrequent intervals, serving as the reference point for an incremental repository. Subsequently, a number of incremental backups are made after successive time periods. Restores begin with the last full backup and then apply the incrementals.<ref name="Tech-FAQIncrementalBackup">{{cite web |title=Incremental Backup |url=http://www.tech-faq.com/incremental-backup.shtml |website=Tech-FAQ |publisher=Independent Media |accessdate=10 March 2006 |archiveurl=https://web.archive.org/web/20160621090117/http://www.tech-faq.com/incremental-backup.shtml |archivedate=21 June 2016 |date=13 June 2005}}</ref>',
1 => 'Some backup systems<ref name="PondHowTimeMachineWorks">{{cite web | last1=Pond | first1=James| url=http://baligu.com/pondini/TM/Works.html | title=How Time Machine Works its Magic |website=Apple OSX and Time Machine Tips |publisher=baligu.com| accessdate=19 May 2019 | date=31 August 2013 | at=File System Event Store,Hard Links}}</ref> can create a {{visible anchor|synthetic full backup}} from a series of incrementals, thus providing the equivalent of frequently doing a full backup.<ref name="NakivoTypesOfBackup"/> When done to modify a single archive file, this speeds restores of recent versions of files.'
] |
All external links added in the edit (added_links ) | [] |
All external links removed in the edit (removed_links ) | [
0 => 'http://baligu.com/pondini/TM/Works.html',
1 => 'http://www.tech-faq.com/incremental-backup.shtml',
2 => 'https://web.archive.org/web/20160621090117/http://www.tech-faq.com/incremental-backup.shtml',
3 => 'https://www.nakivo.com/blog/what-is-incremental-backup/'
] |
All external links in the new text (all_links ) | [
0 => 'https://linux.die.net/man/1/rsync',
1 => 'https://support.code42.com/Administrator/5/Monitoring_and_managing/Archive_maintenance#Prune',
2 => 'https://www.ahdictionary.com/word/search.html?q=backup',
3 => 'https://books.google.com/books?id=r4uEEsq3CJYC',
4 => 'https://books.google.com/books?id=eLviiTag5A0C&pg=PA1',
5 => 'http://people.fas.harvard.edu/~techtool/pages/Take_Control_of_Mac_OS_X_Backups_(2.0).pdf',
6 => 'https://www.nytimes.com/2018/01/11/smarter-living/backing-up-your-photos.html',
7 => 'https://www.wisegeek.com/what-is-an-information-repository.htm',
8 => 'https://www.nakivo.com/blog/backup-types-explained-full-incremental-differential-synthetic-and-forever-incremental/',
9 => 'http://sysgen.ca/five-key-backup-questions/',
10 => 'https://web.archive.org/web/20160304042343/http://sysgen.ca/five-key-backup-questions/',
11 => 'https://www.informationweek.com/why-continuous-data-protections-getting-more-practical/d/d-id/1088883',
12 => 'https://www.computerweekly.com/Continuous-data-protection-CDP-explained-True-CDP-vs-near-CDP',
13 => 'https://www.baligu.com/pondini/TM/Works.html',
14 => 'https://books.google.com/books?id=2OtqvySBTu4C&pg=PA287',
15 => 'https://wuchikin.wordpress.com/2017/03/04/emc-recoverpoint-for-virtual-machine-overview/',
16 => 'https://resqdr.com/zerto-or-veeam/',
17 => 'https://docs.cloudendure.com/Content/FAQ/FAQ/Agent_Related.htm',
18 => 'https://web.archive.org/web/20050207082953/http://www.storagesearch.com/engenio-art2.html',
19 => 'http://www.storagesearch.com/engenio-art2.html',
20 => 'https://spectralogic.com/wp-content/uploads/white-paper-digital-data-storage-outlook-2017-v3.pdf',
21 => 'https://www.forbes.com/sites/tomcoughlin/2014/06/29/keeping-data-for-a-long-time/',
22 => 'https://www.pcworld.com/article/2984597/storage/hard-core-data-preservation-the-best-media-and-methods-for-archiving-your-data.html',
23 => 'https://www.hgst.com/sites/default/files/resources/LoadUnload_white_paper_FINAL.pdf',
24 => 'https://www.toshibadata.com.sg/Product-Canvio-Portable-Hard-Drive.aspx',
25 => 'https://www.doc-developpement-durable.org/file/Projets-informatiques/Drop%20Guard-disque-dur-tres-solide.pdf',
26 => 'https://www.pcmag.com/roundup/361072/the-best-rugged-hard-drives-and-ssds',
27 => 'https://web.archive.org/web/20170331161821/http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive',
28 => 'http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive',
29 => 'http://www.ironmountain.com/resources/general-articles/b/best-long-term-data-archive-solutions',
30 => 'https://books.google.com/books?id=ANe3k_7bnAcC&q=retrospect+deduplication&pg=PT41',
31 => 'http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html',
32 => 'https://web.archive.org/web/20160304212819/http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html',
33 => 'https://www.veritas.com/content/support/en_US/doc/ka6j00000000ADEAA2',
34 => '//doi.org/10.1007%2Fs12200-014-0442-2',
35 => 'https://api.semanticscholar.org/CorpusID:60816607',
36 => '//www.ncbi.nlm.nih.gov/pmc/articles/PMC5864957',
37 => 'https://ui.adsabs.harvard.edu/abs/2018NatCo...9.1183Z',
38 => '//doi.org/10.1038%2Fs41467-018-03589-y',
39 => '//pubmed.ncbi.nlm.nih.gov/29568055',
40 => 'http://www.ina.fr/video/3571726001/20-heures-emission-du-3-mars-2008.fr.html',
41 => 'https://web.archive.org/web/20130927170900/http://delkin.com/i-5937134-archival-gold-cd-r-300-year-disc-binder-of-10-discs-with-scratch-armor-surface.html',
42 => 'http://delkin.com/i-5937134-archival-gold-cd-r-300-year-disc-binder-of-10-discs-with-scratch-armor-surface.html',
43 => 'https://pro.sony/s3/cms-static-content/file/49/1237494482649.pdf',
44 => '//doi.org/10.1109%2FJPROC.2017.2727228',
45 => 'https://www.emc.com/corporate/glossary/remote-backup.htm',
46 => 'https://books.google.com/books?id=gjAhVzuV7k0C&pg=PA164',
47 => 'https://books.google.com/books?id=PU7gkW9ArxIC&pg=PA255',
48 => 'https://irontree.co.za/what-to-backup-a-critical-look-at-your-data-1935.html',
49 => 'https://books.google.com/books?id=6-w4fXbBInoC&pg=PA111',
50 => 'https://archive.org/details/unixbackuprecove00wcur',
51 => 'https://archive.org/details/unixbackuprecove00wcur/page/73',
52 => 'https://nilfs.sourceforge.io/en/',
53 => 'https://books.google.com/books?id=eLviiTag5A0C&pg=PA360',
54 => 'https://books.google.com/books?id=LecC2BhPPxMC&pg=PA244',
55 => 'https://books.google.com/books?id=2OtqvySBTu4C&pg=PA50',
56 => 'https://www.handybackup.net/open-file-backup.shtml',
57 => 'https://www.arqbackup.com/blog/troubleshooting-backing-up-openlocked-files-on-windows/',
58 => 'https://web.archive.org/web/20070302110933/http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup',
59 => 'http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup',
60 => 'https://support.arcserve.com/s/article/202080249?language=en_US',
61 => 'https://web.archive.org/web/20160425113119/http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html',
62 => 'http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html',
63 => 'https://www.baligu.com/pondini/TM/12.html',
64 => 'https://thewirecutter.com/reviews/best-online-backup-service/',
65 => 'https://books.google.com/books?id=SD_LAwAAQBAJ&pg=PA306',
66 => 'http://www.securityfocus.com/news/11048',
67 => 'https://web.archive.org/web/20160405033517/http://www.securityfocus.com/news/11048',
68 => 'https://books.google.com/books?id=6-w4fXbBInoC&pg=PA219',
69 => 'https://www.riskythinking.com/glossary/recovery_point_objective.php',
70 => 'https://www.riskythinking.com/glossary/recovery_time_objective.php',
71 => 'https://books.google.com/books?id=_DqO6kizEDUC&pg=PA17',
72 => 'https://www.veritas.com/support/en_US/article.100030833.html',
73 => 'http://www.hipaadvisory.com/regs/recordretention.htm',
74 => 'https://web.archive.org/web/20070411135655/http://www.hipaadvisory.com/regs/recordretention.htm',
75 => 'https://www.nakivo.com/blog/3-2-1-backup-rule-efficient-data-protection-strategy/'
] |
Links in the page, before the edit (old_links ) | [
0 => '//doi.org/10.1007%2Fs12200-014-0442-2',
1 => '//doi.org/10.1007%2Fs12200-014-0442-2',
2 => '//doi.org/10.1038%2Fs41467-018-03589-y',
3 => '//doi.org/10.1038%2Fs41467-018-03589-y',
4 => '//doi.org/10.1109%2FJPROC.2017.2727228',
5 => '//doi.org/10.1109%2FJPROC.2017.2727228',
6 => '//pubmed.ncbi.nlm.nih.gov/29568055',
7 => '//pubmed.ncbi.nlm.nih.gov/29568055',
8 => '//www.ncbi.nlm.nih.gov/pmc/articles/PMC5864957',
9 => '//www.ncbi.nlm.nih.gov/pmc/articles/PMC5864957',
10 => 'http://baligu.com/pondini/TM/Works.html',
11 => 'http://delkin.com/i-5937134-archival-gold-cd-r-300-year-disc-binder-of-10-discs-with-scratch-armor-surface.html',
12 => 'http://people.fas.harvard.edu/~techtool/pages/Take_Control_of_Mac_OS_X_Backups_(2.0).pdf',
13 => 'http://sysgen.ca/five-key-backup-questions/',
14 => 'http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive',
15 => 'http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html',
16 => 'http://www.hipaadvisory.com/regs/recordretention.htm',
17 => 'http://www.ina.fr/video/3571726001/20-heures-emission-du-3-mars-2008.fr.html',
18 => 'http://www.ironmountain.com/resources/general-articles/b/best-long-term-data-archive-solutions',
19 => 'http://www.securityfocus.com/news/11048',
20 => 'http://www.storagesearch.com/engenio-art2.html',
21 => 'http://www.tech-faq.com/incremental-backup.shtml',
22 => 'http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup',
23 => 'http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html',
24 => 'https://api.semanticscholar.org/CorpusID:60816607',
25 => 'https://archive.org/details/unixbackuprecove00wcur',
26 => 'https://archive.org/details/unixbackuprecove00wcur/page/73',
27 => 'https://books.google.com/books?id=2OtqvySBTu4C&pg=PA50',
28 => 'https://books.google.com/books?id=2OtqvySBTu4C&pg=PA287',
29 => 'https://books.google.com/books?id=6-w4fXbBInoC&pg=PA111',
30 => 'https://books.google.com/books?id=6-w4fXbBInoC&pg=PA219',
31 => 'https://books.google.com/books?id=ANe3k_7bnAcC&q=retrospect+deduplication&pg=PT41',
32 => 'https://books.google.com/books?id=LecC2BhPPxMC&pg=PA244',
33 => 'https://books.google.com/books?id=PU7gkW9ArxIC&pg=PA255',
34 => 'https://books.google.com/books?id=SD_LAwAAQBAJ&pg=PA306',
35 => 'https://books.google.com/books?id=_DqO6kizEDUC&pg=PA17',
36 => 'https://books.google.com/books?id=eLviiTag5A0C&pg=PA1',
37 => 'https://books.google.com/books?id=eLviiTag5A0C&pg=PA360',
38 => 'https://books.google.com/books?id=gjAhVzuV7k0C&pg=PA164',
39 => 'https://books.google.com/books?id=r4uEEsq3CJYC',
40 => 'https://docs.cloudendure.com/Content/FAQ/FAQ/Agent_Related.htm',
41 => 'https://irontree.co.za/what-to-backup-a-critical-look-at-your-data-1935.html',
42 => 'https://linux.die.net/man/1/rsync',
43 => 'https://nilfs.sourceforge.io/en/',
44 => 'https://pro.sony/s3/cms-static-content/file/49/1237494482649.pdf',
45 => 'https://resqdr.com/zerto-or-veeam/',
46 => 'https://spectralogic.com/wp-content/uploads/white-paper-digital-data-storage-outlook-2017-v3.pdf',
47 => 'https://support.arcserve.com/s/article/202080249?language=en_US',
48 => 'https://support.code42.com/Administrator/5/Monitoring_and_managing/Archive_maintenance#Prune',
49 => 'https://thewirecutter.com/reviews/best-online-backup-service/',
50 => 'https://ui.adsabs.harvard.edu/abs/2018NatCo...9.1183Z',
51 => 'https://web.archive.org/web/20050207082953/http://www.storagesearch.com/engenio-art2.html',
52 => 'https://web.archive.org/web/20070302110933/http://www.wisc.edu/drmt/oratips/sess003.html#Hotbackup',
53 => 'https://web.archive.org/web/20070411135655/http://www.hipaadvisory.com/regs/recordretention.htm',
54 => 'https://web.archive.org/web/20130927170900/http://delkin.com/i-5937134-archival-gold-cd-r-300-year-disc-binder-of-10-discs-with-scratch-armor-surface.html',
55 => 'https://web.archive.org/web/20160304042343/http://sysgen.ca/five-key-backup-questions/',
56 => 'https://web.archive.org/web/20160304212819/http://www.dcig.com/2009/07/symantec-shows-backup-exec-a-l.html',
57 => 'https://web.archive.org/web/20160405033517/http://www.securityfocus.com/news/11048',
58 => 'https://web.archive.org/web/20160425113119/http://www2.arnes.si/~ljc3m2/igor/blogs/technical/bootable_media_creation.html',
59 => 'https://web.archive.org/web/20160621090117/http://www.tech-faq.com/incremental-backup.shtml',
60 => 'https://web.archive.org/web/20170331161821/http://thewirecutter.com/reviews/best-portable-hard-drive/#dont-buy-a-rugged-portable-hard-drive',
61 => 'https://wuchikin.wordpress.com/2017/03/04/emc-recoverpoint-for-virtual-machine-overview/',
62 => 'https://www.ahdictionary.com/word/search.html?q=backup',
63 => 'https://www.arqbackup.com/blog/troubleshooting-backing-up-openlocked-files-on-windows/',
64 => 'https://www.baligu.com/pondini/TM/12.html',
65 => 'https://www.baligu.com/pondini/TM/Works.html',
66 => 'https://www.computerweekly.com/Continuous-data-protection-CDP-explained-True-CDP-vs-near-CDP',
67 => 'https://www.doc-developpement-durable.org/file/Projets-informatiques/Drop%20Guard-disque-dur-tres-solide.pdf',
68 => 'https://www.emc.com/corporate/glossary/remote-backup.htm',
69 => 'https://www.forbes.com/sites/tomcoughlin/2014/06/29/keeping-data-for-a-long-time/',
70 => 'https://www.handybackup.net/open-file-backup.shtml',
71 => 'https://www.hgst.com/sites/default/files/resources/LoadUnload_white_paper_FINAL.pdf',
72 => 'https://www.informationweek.com/why-continuous-data-protections-getting-more-practical/d/d-id/1088883',
73 => 'https://www.nakivo.com/blog/3-2-1-backup-rule-efficient-data-protection-strategy/',
74 => 'https://www.nakivo.com/blog/backup-types-explained-full-incremental-differential-synthetic-and-forever-incremental/',
75 => 'https://www.nakivo.com/blog/what-is-incremental-backup/',
76 => 'https://www.nytimes.com/2018/01/11/smarter-living/backing-up-your-photos.html',
77 => 'https://www.pcmag.com/roundup/361072/the-best-rugged-hard-drives-and-ssds',
78 => 'https://www.pcworld.com/article/2984597/storage/hard-core-data-preservation-the-best-media-and-methods-for-archiving-your-data.html',
79 => 'https://www.riskythinking.com/glossary/recovery_point_objective.php',
80 => 'https://www.riskythinking.com/glossary/recovery_time_objective.php',
81 => 'https://www.toshibadata.com.sg/Product-Canvio-Portable-Hard-Drive.aspx',
82 => 'https://www.veritas.com/content/support/en_US/doc/ka6j00000000ADEAA2',
83 => 'https://www.veritas.com/support/en_US/article.100030833.html',
84 => 'https://www.wisegeek.com/what-is-an-information-repository.htm'
] |
Whether or not the change was made through a Tor exit node (tor_exit_node ) | false |
Unix timestamp of change (timestamp ) | 1603366069 |