Disk formatting: Difference between revisions

From Wikipedia, the free encyclopedia
{{short description|Process of preparing a data storage device for initial use}}
[[File:FORMAT.COM format c drive.png|thumb|Formatting a hard disk drive using [[MS-DOS]]]]
'''Disk formatting''' is the process of preparing a [[data storage device]] such as a [[hard disk drive]], [[solid-state drive]], [[floppy disk]], [[memory card]] or [[USB flash drive]] for initial use. In some cases, the formatting operation may also create one or more new [[file system]]s. The first part of the formatting process that performs basic medium preparation is often referred to as "low-level formatting".<ref name="Tanenbaum">{{cite book |title=Modern Operating Systems |edition=2nd |author-last=Tanenbaum |author-first=Andrew |author-link=Andrew S. Tanenbaum |date=2001 |publisher=Prentice Hall |at=section 3.4.2, Disk Formatting |isbn=0130313580 |url=https://archive.org/details/modernoperatings00tane |url-access=registration }}</ref> [[Disk partitioning|Partitioning]] is the common term for the second part of the process, dividing the device into several sub-devices and, in some cases, writing information to the device allowing an [[operating system]] to be booted from it.<ref name="Tanenbaum" /><ref>{{cite web|url=https://docs.microsoft.com/en-us/windows/win32/fileio/disk-devices-and-partitions|title=Disk Devices and Partitions|date=7 January 2021|website=[[Microsoft Docs]]}}</ref> The third part of the process, usually termed "high-level formatting" most often refers to the process of generating a new file system.<ref name="Tanenbaum"/> In some operating systems all or parts of these three processes can be combined or repeated at different levels{{refn|E.g., formatting a volume, formatting a [[Virtual Storage Access Method]] [[Virtual Storage Access Method#Linear VSAM organization|Linear Data Set (LDS)]] on the volume to contain a [[zFS (z/OS file system)|zFS]] and formatting the zFS in [[UNIX System Services]].|group="lower-alpha"}} and the term "format" is understood to mean an operation in which a new disk medium is fully prepared to store [[Computer file|files]]. 
Some formatting utilities distinguish between a quick format, which does not erase all existing data, and a long format, which does erase all existing data.


As a general rule,{{refn|Not true for CMS file system<ref>{{cite manual
| author = IBM
| title = z/VM CMS Commands and Utilities Reference
| chapter = FORMAT
| chapter-url = http://publib.boulder.ibm.com/infocenter/zvm/v5r4/index.jsp?topic=/com.ibm.zvm.v54.dmsb4/for.htm
| version = z/VM Version 5 Release 4
| publisher = IBM
| date = 2008
| url = http://publib.boulder.ibm.com/infocenter/zvm/v5r4/index.jsp?topic=/com.ibm.zvm.v54.dmsb4
| id = SC24-6073-03
| quote = When you do not specify either the RECOMP or LABEL option, the disk area is initialized by writing a device-dependent number of records (containing binary zeros) on each track. Any previous data on the disk is erased.
|mode=cs2
}}</ref> on a CMS minidisk, TSS VAM-formatted volume,<ref>{{cite manual
| author = IBM
| title = IBM System/360 Time Sharing System System Logic Summary Program Logic Manual
| section = Virtual Access Methods
| publisher = IBM
| url = http://bitsavers.trailing-edge.com/pdf/ibm/360/tss/GY28-2009-2_Time_Sharing_System_System_Logic_Summary_PLM_Jun70.pdf
| format = PDF
| id = GY28-2009-2
| quote = The direct access volumes, on which TSS/360 virtual organization data sets are stored, have fixed-length, page size data blocks. No key field is required. The record overflow feature is utilized to allow data blocks to span tracks, as required. The entire volume, with the current exception of part of the first cylinder, which is used for identification, is formatted into page size blocks.
| page = 56 (PDF 66)
|mode=cs2
}}</ref> z/OS Unix file systems<ref>{{cite manual
| title = z/OS 2.4 File System Administration
| id = SC23-6887-40
| section = ioeagfmt
| section-url = https://www.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R4sc236887/$file/ioea700_v2r4.pdf#page=142
| pages = 116–119
| url = https://www.ibm.com/servers/resourcelink/svc00100.nsf/pages/zOSV2R4sc236887/$file/ioea700_v2r4.pdf
| publisher = IBM
}}
</ref> or VSAM<!-- Add AMS citation --> in IBM mainframes|group="lower-alpha"}} formatting a disk by default leaves most if not all existing data on the disk medium, some or most of which might be recoverable with [[Privilege (computing)|privileged]]<ref group="lower-alpha">E.g., AMASPZAP in MVS</ref> or [[Data recovery|special tools]].<ref>{{cite web
|url = https://www.linux.com/news/how-recover-lost-files-after-you-accidentally-wipe-your-hard-drive/
|title = How to recover lost files after you accidentally wipe your hard drive
|last = Hermans
|first = Sherman
|date = 28 August 2006
|publisher = Linux.com
|access-date = 28 November 2019
}}</ref> Special tools can remove user data by a single [[Data erasure|overwrite]] of all files and free space.<ref>{{cite web
|url = http://www.infosecisland.com/blogview/16130-The-Urban-Legend-of-Multipass-Hard-Disk-Overwrite.html
|title = The Urban Legend of Multipass Hard Disk Overwrite and DoD 5220-22-M
|last = Smithson
|first = Brian
|date = 29 August 2011
|publisher = Infosec Island
|access-date = 22 November 2012
|archive-date = 5 October 2018
|archive-url = https://web.archive.org/web/20181005045747/http://www.infosecisland.com/blogview/16130-The-Urban-Legend-of-Multipass-Hard-Disk-Overwrite.html
|url-status = dead
}}</ref>


== History ==


A [[Block (data storage)|block]], a contiguous number of [[byte]]s, is the minimum unit of storage that is read from and written to a disk by a disk driver. The earliest disk drives had fixed block sizes (e.g. the block size of the [[IBM 350]] disk storage unit of the late 1950s was 100 six-bit characters), but starting with the [[IBM 1301|1301]]<ref>{{cite web |url=http://www-03.ibm.com/ibm/history/exhibits/storage/storage_1301.html |archive-url=https://web.archive.org/web/20050426065056/http://www-03.ibm.com/ibm/history/exhibits/storage/storage_1301.html |url-status=dead |archive-date=April 26, 2005 |title=IBM&nbsp;1301 disk storage unit |date=23 January 2003 |publisher=[[IBM]] |access-date=2010-06-24}}</ref> IBM marketed subsystems that featured variable block sizes: a particular track could have blocks of different sizes. The disk subsystems and other [[direct access storage device]]s on the [[IBM System/360]] expanded this concept in the form of [[Count Key Data]] (CKD) and later [[Count Key Data#ECKD|Extended Count Key Data]] (ECKD); however, variable block sizes in HDDs fell out of use in the 1990s. One of the last HDDs to support variable block size was the IBM 3390 Model 9, announced in May 1993.<ref>{{cite web |url=http://www-03.ibm.com/ibm/history/exhibits/storage/storage_3390.html |archive-url=https://web.archive.org/web/20050124005029/http://www-03.ibm.com/ibm/history/exhibits/storage/storage_3390.html |url-status=dead |archive-date=January 24, 2005 |title=IBM 3390 direct access storage device |date=23 January 2003 |publisher=[[IBM]]}}</ref>


Modern hard disk drives, such as [[Serial attached SCSI]] (SAS)<ref group="lower-alpha">"The LBAs on a logical unit shall begin with zero and shall be contiguous up to the last logical block on the logical unit"., Information technology&nbsp;— Serial Attached SCSI - 2 (SAS-2), INCITS 457 Draft 2, May 8, 2009, chapter 4.1 Direct-access block device type model overview.</ref> and [[Serial ATA]] (SATA)<ref>ISO/IEC 791D:1994, AT Attachment Interface for Disk Drives (ATA-1), section 7.1.2</ref> drives, appear at their [[interface (computing)|interface]]s as a contiguous set of fixed-size blocks; for many years 512 bytes long but beginning in 2009 and accelerating through 2011, all major hard disk drive manufacturers began releasing hard disk drive platforms using the [[Advanced Format]] of 4096 byte logical blocks.<ref>{{cite web |url=http://www.anandtech.com/show/2888 |title=Western Digital's Advanced Format: The 4K Sector Transition Begins |author-first=Ryan |author-last=Smith |date=2009-12-18 |publisher=[[Anandtech]]}}</ref><ref>{{cite web |url=http://www.seagate.com/tech-insights/advanced-format-4k-sector-hard-drives-master-ti/ |title=Transition to Advanced Format 4K Sector Hard Drives |publisher=[[Seagate Technology]]}}</ref>
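
The fixed-size block model can be made concrete with a short Python sketch. It maps a logical block address (LBA) to a byte offset in a contiguous, zero-based array of blocks and checks whether a run of 512-byte blocks covers whole 4096-byte blocks. The function names are illustrative, and the pairing of 512-byte addressing with 4096-byte blocks is an assumption made for the example rather than something the text above states.

```python
LEGACY_BLOCK = 512      # bytes, the long-standing block size
ADVANCED_BLOCK = 4096   # bytes, the Advanced Format block size


def lba_to_offset(lba, block_size=LEGACY_BLOCK):
    """Byte offset of a block in a contiguous, zero-based array of blocks."""
    if lba < 0:
        raise ValueError("LBAs begin with zero")
    return lba * block_size


def is_aligned(start_lba, count, small=LEGACY_BLOCK, large=ADVANCED_BLOCK):
    """True if the run [start_lba, start_lba + count) covers whole large blocks."""
    start = lba_to_offset(start_lba, small)
    length = count * small
    return start % large == 0 and length % large == 0


print(lba_to_offset(2048))   # 1048576
print(is_aligned(2048, 8))   # True
print(is_aligned(63, 8))     # False
```

A run that starts at block 63 straddles two 4096-byte blocks, which is why the last call reports a misaligned range.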


[[Floppy disk]]s generally only used fixed block sizes but these sizes were a function of the host's [[Operating System|OS]] and its interaction with its [[Floppy disk controller|controller]] so that a particular type of media (e.g., 5¼-inch DSDD) would have different block sizes depending upon the host OS and controller.


[[Optical disc]]s generally only use fixed block sizes.


== Disk formatting process ==


Formatting a disk for use by an operating system and its applications typically involves three different processes.<ref group="lower-alpha">Each process may involve multiple steps, and steps of different processes may be interleaved.</ref>


# Low-level formatting (i.e., closest to the hardware) marks the surfaces of the disks with markers indicating the start of a recording block (typically today called sector markers) and other information like block [[Cyclic redundancy check|CRC]] to be used later, in normal operations, by the [[disk controller]] to read or write data. This is intended to be the permanent foundation of the disk, and is often completed at the factory.
# [[Disk partitioning|Partitioning]] divides a disk into one or more regions, writing data structures to the disk to indicate the beginning and end of the regions. This level of formatting often includes checking for defective tracks or defective sectors.
# High-level formatting creates the [[file system]] format within a disk partition or a [[logical volume]].<ref name="Tanenbaum" /> This formatting includes the data structures used by the OS to identify the logical drive or partition's contents. This may occur during operating system installation, or when adding a new disk. [[List of file systems|Disk and distributed file systems]] may specify an optional boot block, and/or various volume and directory information for the operating system.
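
As an illustration of the second step, the Python sketch below reads the kind of data structure a partitioning tool writes to mark where regions begin and end. It assumes the classic PC master boot record (MBR) layout, which the text above does not name: four 16-byte entries at byte offset 446 of the first 512-byte sector, closed by the signature bytes 55 AA. The demo sector and its single partition are made up and exist only in memory.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446        # the partition table starts here in a classic MBR
ENTRY_SIZE = 16
ENTRY_FORMAT = "<B3sB3sII"  # status, CHS start, type, CHS end, first LBA, sector count
SIGNATURE = b"\x55\xaa"   # last two bytes of the sector


def parse_mbr(sector):
    """Return (type, first_lba, sector_count) for each used partition entry."""
    if len(sector) != SECTOR_SIZE or sector[510:512] != SIGNATURE:
        raise ValueError("not a classic MBR sector")
    regions = []
    for i in range(4):
        start = TABLE_OFFSET + i * ENTRY_SIZE
        entry = sector[start:start + ENTRY_SIZE]
        _status, _chs_first, ptype, _chs_last, first_lba, count = struct.unpack(
            ENTRY_FORMAT, entry)
        if ptype != 0:  # type 0 marks an unused slot
            regions.append((ptype, first_lba, count))
    return regions


def make_demo_mbr():
    """Build an in-memory sector holding one hypothetical partition."""
    sector = bytearray(SECTOR_SIZE)
    entry = struct.pack(ENTRY_FORMAT, 0x80, b"\0\0\0", 0x0C, b"\0\0\0", 2048, 204800)
    sector[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = entry
    sector[510:512] = SIGNATURE
    return bytes(sector)


for ptype, first, count in parse_mbr(make_demo_mbr()):
    print(f"type=0x{ptype:02x} first_lba={first} last_lba={first + count - 1}")
```

Other partitioning schemes record the same kind of information (start and extent of each region) in different structures.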


=== Low-level formatting of floppy disks ===
The low-level format of floppy disks (and early hard disks) is performed by the disk drive's controller.


For a standard [[Floppy disk#microfloppy|1.44 MB floppy disk]], low-level formatting normally writes 18 [[Disk sector|sector]]s of 512 [[byte]]s to each of 160 tracks (80 on each side) of the floppy disk, providing 1,474,560 bytes of storage on the disk.
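
The capacity figure follows directly from the geometry, as this small Python check shows. The function name is illustrative; the numbers are the ones given in the paragraph above.

```python
def formatted_capacity(sectors_per_track, bytes_per_sector, tracks_per_side, sides):
    """User-data capacity in bytes for a fixed-geometry disk."""
    return sectors_per_track * bytes_per_sector * tracks_per_side * sides


capacity = formatted_capacity(sectors_per_track=18, bytes_per_sector=512,
                              tracks_per_side=80, sides=2)
print(capacity)                  # 1474560
print(capacity / (1000 * 1024))  # 1.44
```

The second line of output shows where the "1.44 MB" label comes from: it counts the capacity in units of 1000 × 1024 bytes, which is neither a decimal megabyte nor a binary one.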


Physical sectors are actually larger than 512 bytes, as in addition to the 512 byte data field they include a sector identifier field, [[Cyclic redundancy check|CRC]] bytes (in some cases [[Error detection and correction|error correction bytes]]) and gaps between the fields. These additional bytes are not normally included in the quoted figure for overall storage capacity of the disk.
Different low-level formats can be used on the same [[Recording medium|media]]; for example, large records can be used to cut down on inter-record gap size.


Several [[freeware]], [[shareware]] and [[free software]] programs (e.g. [[GParted]], [[Fdformat|FDFORMAT]], NFORMAT, [[VGA-Copy]] and 2M) allowed considerably more control over formatting, allowing the formatting of high-density 3.5" disks with a capacity up to 2 MB.


Techniques used include:
* increasing the number of tracks (most drives could tolerate extension to 82 tracks; some could handle more, but others could jam).


[[Linux]] supports a variety of sector sizes,<ref>{{Cite web|url=https://tools.ietf.org/doc/fdutils/Fdutils.html#Media-description|title = Fdutils}}</ref> and [[DOS]] and [[Microsoft Windows|Windows]] support a large-record-size [[Distribution Media Format|DMF]]-formatted floppy format.<ref>{{cite web
|url=http://support.microsoft.com/kb/120348
|title=Definition of Distribution Media Format (DMF)
|publisher=[[Microsoft Knowledge Base]]
|date=2007-01-19
|access-date=2011-10-16
|archive-url=https://web.archive.org/web/20110914232540/http://support.microsoft.com/kb/120348
|archive-date=2011-09-14
|url-status=dead}}</ref>

After establishing the structure of tracks, a formatter also needs to fill the entire floppy and look for [[bad sector]]s. Traditionally, the physical sectors were initialized with a fill value of <code>0xF6</code> as per the INT&nbsp;1Eh's [[Disk Parameter Table]] (DPT)<!-- TBD: IIRC this resembles a bit pattern optimized for MFM controllers --> during format on IBM compatible machines. This value is also used on the [[Atari Portfolio]]. [[CP/M]] 8-inch floppies typically came pre-formatted with a value of <code>0xE5</code>,<ref name="Schulman_1994_Undocumented-DOS">{{cite book |author-first1=Andrew |author-last1=Schulman |author-first2=Ralf D. |author-last2=Brown |author-link2=Ralf D. Brown |author-first3=David |author-last3=Maxey |author-first4=Raymond J. |author-last4=Michels |author-first5=Jim |author-last5=Kyle |title=Undocumented DOS: A programmer's guide to reserved MS-DOS functions and data structures - expanded to include MS-DOS 6, Novell DOS and Windows 3.1 |publisher=[[Addison Wesley]] |edition=2 |date=1994 |orig-year=November 1993<!-- first printing --> |isbn=0-201-63287-X |url=https://archive.org/details/undocumenteddosp00andr_0 }} (xviii+856+vi pages, 3.5"-floppy) Errata: [https://web.archive.org/web/20190417215556/http://www.cs.cmu.edu/afs/cs/user/ralf/pub/books/UndocumentedDOS/errata.ud2][https://web.archive.org/web/20190417212906/https://www.pcjs.org/pubs/pc/programming/Undocumented_DOS/#errata-2nd-edition]</ref> and by way of [[Digital Research]] this value was also used on [[Atari ST]] and some [[Amstrad]] formatted floppies.<ref group="lower-alpha" name="NB_Magic_E5"/> Amstrad otherwise used <code>0xF4</code> as a fill value.
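
The role of the fill value can be sketched in Python with an in-memory model. It only imitates the data field of a freshly formatted sector; a real formatter works through the disk controller, and the helper names here are hypothetical.

```python
FILL_IBM_PC = 0xF6   # fill value used on IBM compatible machines
FILL_CPM = 0xE5      # fill value of pre-formatted CP/M 8-inch floppies
SECTOR_SIZE = 512


def blank_sector(fill, size=SECTOR_SIZE):
    """Data field of a freshly formatted sector: every byte is the fill value."""
    return bytes([fill]) * size


def looks_unwritten(sector, fill):
    """True if the sector still holds nothing but the fill value."""
    return len(sector) > 0 and all(byte == fill for byte in sector)


sector = blank_sector(FILL_IBM_PC)
print(looks_unwritten(sector, FILL_IBM_PC))   # True
print(looks_unwritten(sector, FILL_CPM))      # False
```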


=== Low-level formatting (LLF) of hard disks ===
[[File:IBM PC XT 10 meg MFM low level format.jpg|thumb|Low-level format of a 10-megabyte [[IBM Personal Computer XT|IBM PC XT]] hard drive]]


Hard disk drives prior to the 1990s typically had a separate [[disk controller]] that defined how data was encoded on the media. With the media, the drive and/or the controller possibly procured from separate vendors, users were often able to perform low-level formatting. Separate procurement also had the potential of incompatibility between the separate components such that the subsystem would not reliably store data.<ref group="lower-alpha">This problem became common in PCs where users used RLL controllers with MFM drives; [https://web.archive.org/web/20180609221448/http://webpages.charter.net/dperr/diskguid.txt "MFM drives should not be used on RLL controllers.".]</ref>


User-instigated low-level formatting (LLF) of [[hard disk drives]] was common for [[minicomputer]] and [[personal computer]] systems until the 1990s. [[IBM]] and other mainframe system vendors typically supplied their hard disk drives (or media in the case of removable media HDDs) with a low-level format. Typically this involved subdividing each track on the disk into one or more blocks which would contain the user data and associated control information. Different computers used different block sizes and IBM notably used [[Count Key Data|variable block sizes]] but the popularity of the IBM PC caused the industry to adopt a standard of 512 user data bytes per block by the middle 1980s.


Depending upon the system, low-level formatting was generally done by an operating system utility. On IBM compatible PCs the formatting routine was part of the BIOS and was started from the MS-DOS [[DEBUG (DOS Command)|debug]] program, which transferred control to the routine at an address that differed between BIOSes.<ref>[http://support.microsoft.com/kb/60089 Using DEBUG to Start a Low-Level Format], Microsoft</ref>


==== Transition away from LLF ====


Starting in the late 1980s, driven by the volume of IBM compatible PCs, HDDs became routinely available pre-formatted with a compatible low-level format. At the same time, the industry moved from [[Hard disk drive interface#BSDI|''historical (dumb) bit serial interfaces'']] to modern (intelligent) [[Hard disk drive interface#BSI|''bit serial interfaces'']] and [[Hard disk drive interface#WSI|''word serial interfaces'']] wherein the low-level format was performed at the factory.<ref>{{cite web|publisher=The NOSPIN Group, Inc.|url=http://freepctech.com/pc/001/007.shtml|title=Low level formatting an IDE hard drive|website=FreePCTech.com|archive-url=https://web.archive.org/web/20120716043736/http://freepctech.com/pc/001/007.shtml|archive-date=July 16, 2012|access-date=December 24, 2003}}</ref><ref>{{cite web|website=The PC Guide. Site Version: 2.2.0 - Version Date: April 17, 2001|url=http://www.pcguide.com/ref/hdd/geom/formatUtilities-c.html|title=Low-Level Format, Zero-Fill and Diagnostic Utilities|access-date=May 24, 2007|archive-url=https://web.archive.org/web/20190103014814/http://www.pcguide.com/ref/hdd/geom/formatUtilities-c.html|archive-date= January 3, 2019}}</ref> Accordingly, it is not possible for an end user to low-level format a modern hard disk drive.


=== Modern disks: reinitialization ===
Today, an [[end-user]], in most cases, should never perform a low-level formatting of an IDE or ATA hard drive, and in fact it is often not possible to do so on modern hard drives because the formatting is done on a [[servowriter]] before the disk is assembled into a drive in the factory.<ref>The NOSPIN Group, Inc. (n.d.). ''[http://freepctech.com/pc/001/007.shtml Low level formatting an IDE hard drive]''. Retrieved December 24, 2003.</ref><ref>The PC Guide. Site Version: 2.2.0 - Version Date: April 17, 2001 ''[http://www.pcguide.com/ref/hdd/geom/formatUtilities-c.html Low-Level Format, Zero-Fill and Diagnostic Utilities]''. Retrieved May 24, 2007.</ref>
Modern hard drives can no longer perform post-production LLF, i.e. to re-establish the basic layout of "tracks" and "blocks" on the recording surface. ''Reinitialization'' refers to processes that return a disk to a factory-like configuration: no data, no partitioning, all blocks available to use.


==== Disk reinitialization ====
==== Command-set support ====
SCSI provides a {{tt|Format Unit}} command. This command performs the needed certification step to weed out [[bad sector]]s and has the ability to change sector size. The command-line sg_format program may be used to issue the command.<ref>{{man|8|sg_format|Linux}}</ref> A variety of sector sizes may be chosen, but are not available on all devices: 512, 520, 524, 528, 4096, 4112, 4160, and 4224-byte sectors.<ref>[http://www.seagate.com/docs/pdf/whitepaper/tp595_building_faster_more_flexible_infrastructure.pdf Seagate SAS drives] {{webarchive |url=https://web.archive.org/web/20101129180307/http://www.seagate.com/docs/pdf/whitepaper/tp595_building_faster_more_flexible_infrastructure.pdf |date=2010-11-29}}</ref> Although the SCSI command provides many options, even resizing, it does not touch on the track layer where low-level format happens.<ref>{{cite web |title=INCITS 506-202x - Information technology - SCSI Block Commands - 4 (SBC-4) draft revision 22 |url=https://standards.incits.org/apps/group_public/download.php/124286/livelink |access-date=22 May 2023 |date=15 September 2020}}</ref>
{{Refimprove section|date=July 2009}}


ATA does not expose low-level format functionality, but it allows the sector size to be changed via {{tt|SET SECTOR CONFIGURATION}} ({{tt|--set-sector-size}} in <code>[[hdparm]]</code>). (Consumer drives usually only support 512 and [[Advanced Format|4096-byte sector]]s.) Although a sector-size change may scramble data, it is not a safe way of erasing data, nor is any certification done. ATA offers a separate {{tt|SECURITY ERASE}} ({{tt|--security-erase}} in <code>[[hdparm]]</code>) command for erasure.<ref>{{man|8|hdparm|Linux}}</ref>
While it is generally impossible to perform a complete LLF on most ''modern'' hard drives (since the mid-1990s) outside the factory,<ref>Many enterprise class HDDs can be low-level formatted to block sizes other than 512 bytes, e.g., [http://www.seagate.com/docs/pdf/whitepaper/tp595_building_faster_more_flexible_infrastructure.pdf Seagate SAS drives] support sector sizes of 512, 520, 524 or 528 bytes and can be reformatted from one size to another</ref> the term "low-level format" is still used for what could be called the ''reinitialization'' of a hard drive to its ''factory configuration'' (and even these terms may be misunderstood).


[[NVMe]] drives have a standard method of formatting, available in, for example, the Linux command-line program {{tt|nvme format}}. Sector size change and secure erase options are available.<ref>{{man|1|nvme-format|Linux}}</ref> Note that NVMe drives are generally solid-state, making this "track" distinction useless.
The present ambiguity in the term ''low-level format'' seems to be due to both inconsistent documentation on web sites and the belief by many users that any process below a high-level (file system) format must be called a ''low-level'' format. Since much of the low-level formatting process can today only be performed at the factory, various drive manufacturers describe reinitialization software as LLF utilities on their web sites. Since users generally have no way to determine the difference between a complete LLF and ''reinitialization'' (they simply observe that running the software results in a hard disk that must be high-level formatted), both the misinformed user and ''mixed signals'' from various drive manufacturers have perpetuated this error. Note: Whatever possible misuse of such terms may exist (search hard drive manufacturers' web sites for all these terms), many sites do make such ''reinitialization'' utilities available (possibly as bootable floppy diskette or CD image files), to both overwrite every byte ''and'' check for damaged sectors on the hard disk.


[[Seagate Technology]] drives offer a [[TTL serial]] debugging console.<ref>{{cite web |title=Seagate Serial Talk {{!}} OS/2 Museum |url=https://www.os2museum.com/wp/seagate-serial-talk/}}</ref> Among other things, the console can format the "system" and "user" partitions while performing defect checks (re-initialization over pre-established logical blocks) and modify track parameters (managing the ''real'' low-level format).<ref>{{cite web |title=F3 Serial Port Diagnostics |url=https://dokumen.tips/documents/f3-serialport-diagnostics.html?page=1}} older version available from </ref>
Reinitialization should include identifying (and sparing out if possible) any sectors which cannot be written to and read back from the drive correctly. The term has, however, been used by some to refer to only a portion of that process, in which every sector of the drive is written to, usually by writing a specific value to every addressable location on the disk.
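The write-and-read-back check described above can be illustrated with a minimal Python sketch. It is not how any manufacturer's utility works: the function name <code>reinitialize</code>, the 512-byte sector size and the use of an ordinary seekable file object in place of a drive are all assumptions made for illustration, and the sketch only reports failed sectors rather than sparing them out.

```python
import io


def reinitialize(device, sector_size=512, fill=0x00):
    """Write `fill` to every whole sector, read it back, and return the
    indices of sectors that could not be written or verified.

    `device` is any seekable binary file object opened for reading and
    writing (a disk image here; a real drive is only assumed to behave
    the same way). A trailing partial sector, if any, is left untouched.
    """
    device.seek(0, io.SEEK_END)
    total_bytes = device.tell()
    pattern = bytes([fill]) * sector_size
    bad_sectors = []
    for index in range(total_bytes // sector_size):
        offset = index * sector_size
        try:
            device.seek(offset)
            device.write(pattern)
            device.seek(offset)
            if device.read(sector_size) != pattern:
                bad_sectors.append(index)  # written data did not read back
        except OSError:
            bad_sectors.append(index)      # the I/O itself failed
    return bad_sectors


if __name__ == "__main__":
    image = io.BytesIO(b"\xaa" * 4 * 512)  # hypothetical 4-sector image
    print(reinitialize(image))             # indices of failed sectors
```

A utility that performs only the write loop, without the read-back comparison, corresponds to the narrower use of the term mentioned above.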


==== Disk-filling ====
Traditionally, the physical sectors were initialized with a filler value of <code>0xF6</code> as per the INT&nbsp;1Eh's [[Disk Parameter Table]] (DPT)<!-- TBD: IIRC this resembles a bit pattern optimized for MFM controllers --> during format on IBM compatible machines. This value is also used on the [[Atari Portfolio]]. 8-inch [[CP/M]] floppies typically came pre-formatted with a value of <code>0xE5</code>,<ref name="Schulman_1994_Undocumented-DOS"/> and by way of [[Digital Research]] this value was also used on [[Atari ST]] and some [[Amstrad]] formatted floppies.<ref group="NB" name="NB_Magic_E5"/> Amstrad otherwise used <code>0xF4</code> as a format filler value. Some modern formatters wipe hard disks with a value of <code>0x00</code> instead, sometimes also called ''zero-filling'', whereas a value of <code>0xFF</code> is used on flash disks to reduce [[Program-erase cycle|wear]]. The latter value is typically also the default value used on ROM disks (which cannot be reformatted). (Some advanced formatting tools allow the format filler byte to be configured.<ref group="NB" name="NB_Format_Wipe"/>)
When the hard drive's built-in reinitialization function (see above) is unavailable due to driver or system limitations, it is possible to fill the entire disk instead. On older hard drives without [[bad sector]] management,<ref>{{cite web |title=BadBlockHowto – smartmontools |url=https://www.smartmontools.org/wiki/BadBlockHowto |website=www.smartmontools.org}}</ref> a program will also need to check for any damaged sectors and try to spare them out. On newer drives with defect management, reallocated sectors may be left unerased, whereas the built-in re-initialization function will erase them.<ref name="Secure Deletion" />


One popular method for performing only the zero-fill operation on a hard disk is by writing zero-value bytes to the drive using the Unix [[dd (Unix)|dd]] utility with the [[/dev/zero]] stream as the input file and the drive itself or a specific partition as the output file.<ref>[http://www.myfixlog.com/fix.php?fid=58 How to Securely Erase (Wipe) a Hard Drive for Free with DD]</ref> This command may take many hours to complete, and can erase all files and file systems.
In modern times, it is most common to fill hard drives with a value of <code>0x00</code>. One popular method for performing this zero-fill operation on a hard disk is by writing zero-value bytes to the drive using the Unix [[dd (Unix)|dd]] utility with the [[/dev/zero]] stream as the input file and the drive itself (or a specific partition) as the output file.<ref>{{cite web|url=http://www.myfixlog.com/fix.php?fid=58|title=How to Securely Erase (Wipe) a Hard Drive for Free with DD|website=myfixlog.com|archive-url=https://web.archive.org/web/20160418143615/http://www.myfixlog.com/fix.php?fid=58|archive-date=April 18, 2016}}</ref> This command may take many hours to complete, and will erase all files and file systems.
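The zero-fill operation can be sketched in Python as a rough analogue of copying from /dev/zero onto a target. The function name <code>zero_fill</code>, the image file name and the block size are assumptions chosen for illustration; the sketch is meant to be run against a disk image file, since pointing it at a real device would destroy its contents.

```python
import os


def zero_fill(path, block_size=1024 * 1024):
    """Overwrite every byte of an existing file with 0x00 and return the
    number of bytes written. The size of the target is left unchanged."""
    zeros = bytes(block_size)  # a reusable buffer of zero-value bytes
    with open(path, "r+b") as target:
        remaining = target.seek(0, os.SEEK_END)  # total size in bytes
        written = 0
        target.seek(0)
        while remaining > 0:
            chunk = min(block_size, remaining)
            target.write(zeros[:chunk])
            written += chunk
            remaining -= chunk
        target.flush()
        os.fsync(target.fileno())  # ask the OS to push the data to storage
    return written


if __name__ == "__main__":
    # "disk.img" is a hypothetical image file created only for this demo.
    with open("disk.img", "wb") as image:
        image.write(b"\xf6" * 4096)
    print(zero_fill("disk.img"))
```

As with the command-line approach, the run time grows with the size of the target, and every file and file system stored on it is lost.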


A value of <code>0xFF</code> is used on flash disks to reduce [[Program-erase cycle|wear]]. This value is typically also the default value used on ROM disks (which cannot be reformatted). Some advanced tools allow configuring the fill value.<ref group="lower-alpha" name="NB_Format_Wipe"/>
Another method for [[SCSI]] disks may use the sg_format<ref>[http://sg.danny.cz/sg/sg3_utils.html SG.danny.cz]</ref> command to issue a low level [[SCSI Format Unit Command]].


Overwriting the drive with a zero-fill-command is not necessarily a secure method of erasing sensitive data, or of preparing a drive for use with an encrypted filesystem.<ref>[http://www.globallinuxsecurity.pro/quickly-fill-a-disk-with-random-bits-without-dev-urandom/]</ref>
Zero-filling a drive is not a secure method of preparing a drive for use with an encrypted filesystem. Doing so voids the [[deniable encryption|plausible deniability of the process]], as the encrypted areas (indistinguishable from random without a key, unless the cipher is compromised) will stand out among zero blocks. The correct technique is to zero-fill inside a temporary encrypted layer then discard the key and layer setup. ([[/dev/urandom]] provides similar safety, but tends to be slow.)<ref>[http://www.globallinuxsecurity.pro/quickly-fill-a-disk-with-random-bits-without-dev-urandom/ Quickly fill a disk with random bits]</ref>
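Why encrypted areas stand out on a zero-filled drive can be shown with a small Python sketch. The function name <code>block_map</code>, the 512-byte block size and the constant byte <code>0x5A</code> standing in for ciphertext are assumptions made for illustration only.

```python
def block_map(image, block_size=512):
    """Return one character per block of `image`:
    '.' for a block containing only zero bytes, '#' for any other block."""
    return "".join(
        "#" if any(image[start:start + block_size]) else "."
        for start in range(0, len(image), block_size)
    )


# A zero-filled "drive" of 8 blocks, with blocks 2 to 4 later overwritten.
disk = bytearray(8 * 512)
disk[2 * 512:5 * 512] = b"\x5a" * (3 * 512)  # stand-in for encrypted data

print(block_map(bytes(disk)))  # ..###...
```

Had the drive been filled with random-looking data beforehand, every block would be marked <code>#</code> and the extent of the used area could not be read off in this way.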

==== Confusion ====
{{More citations needed section|date=July 2009}}
The present ambiguity in the term ''low-level format'' seems to be due to both inconsistent documentation on web sites and the belief by many users that any process below a high-level (file system) format must be called a ''low-level'' format. Since much of the low-level formatting process can today only be performed at the factory, various drive manufacturers describe reinitialization software as LLF utilities on their web sites. Since users generally have no way to determine the difference between a complete LLF and ''reinitialization'' (they simply observe that running the software results in a hard disk that must be high-level formatted), both the misinformed user and mixed signals from various drive manufacturers have perpetuated this error.

Note: whatever possible misuse of such terms may exist, many sites do make such ''reinitialization'' utilities available (possibly as bootable floppy diskette or CD image files), to both overwrite every byte ''and'' check for damaged sectors on the hard disk.


=== Partitioning ===
{{main|Disk partitioning}}


Partitioning is the process of writing information into blocks of a storage device or medium that allows access by an operating system. Some operating systems allow the device (or its medium) to appear as multiple devices; i.e. partitioned into multiple devices.
Partitioning is the process of writing information into blocks of a storage device or medium to divide the device into several sub-devices, each of which is treated by the operating system as a separate device and, in some cases, to allow an operating system to be booted from the device.


On [[MS-DOS]], [[Microsoft Windows]], and UNIX-based operating systems (such as [[BSD]], [[Linux]] and [[Mac OS X]]) this is normally done with a [[partition editor]], such as [[fdisk]], [[GNU Parted]], and [[Disk Utility]]. These operating systems support multiple partitions.
On [[MS-DOS]], [[Microsoft Windows]], and UNIX-based operating systems (such as [[BSD]], [[Linux]] and [[macOS]]) this is normally done with a [[partition editor]], such as [[fdisk]], [[GNU Parted]], or [[Disk Utility]]. These operating systems support multiple partitions.

In current IBM mainframe OSs derived from [[OS/360]] and [[DOS/360]], such as [[z/OS]] and [[z/VSE]], this is done by the INIT command of the ICKDSF utility.<ref>[http://publibz.boulder.ibm.com/epubs/pdf/ick4020f.pdf Device Support Facilities User's Guide and Reference]</ref> These OSs support only a single partition per device, called a volume. The ICKDSF functions include creating a volume label and writing a Record 0 on every track.


Floppy disks are not partitioned; however, depending on the OS, they may require volume information in order to be accessed.


[[Partition editor]]s and ICKDSF today do not handle low level functions for HDDs and optical disk drives such as writing timing marks, and they cannot reinitialize a modern disk that has been degaussed or otherwise lost the factory formatting.
[[Partition editor]]s and ICKDSF today do not handle low-level functions for HDDs and optical disc drives such as writing timing marks, and they cannot reinitialize a modern disk that has been degaussed or otherwise lost the factory formatting.

IBM operating systems derived from [[CP-67]], e.g., [[z/VM]], maintain partitioning information for [[VM (operating system)#Minidisks|minidisks]] externally to the drive.


=== High-level formatting ===


High-level formatting is the process of setting up an empty file system on a disk partition or [[logical volume]] and, for PCs, installing a [[boot sector]]. This is a fast operation, and is sometimes referred to as ''quick formatting''.
High-level formatting is the process of setting up an empty file system on a disk partition or a [[logical volume]] and, for PCs, installing a [[boot sector]].<ref name="Tanenbaum" /> This is often a fast operation, and is sometimes referred to as ''quick formatting''.


The entire logical drive or partition may optionally be scanned for defects, which may take considerable time.
Formatting an entire logical drive or partition may optionally scan for defects, which may take considerable time.


In the case of floppy disks, both high- and low-level formatting are customarily performed in one pass by the disk formatting software. 8-inch floppies typically came low-level formatted and were filled with a format filler value of <code>0xE5</code>.<ref name="Schulman_1994_Undocumented-DOS"/><ref group="NB" name="NB_Magic_E5"/> Since the 1990s, most 5.25-inch and 3.5-inch floppies have been shipped pre-formatted from the factory as DOS [[FAT12]] floppies.
In the case of floppy disks, both high- and low-level formatting are customarily performed in one pass by the disk formatting software. Eight-inch floppies typically came low-level formatted and were filled with a format filler value of <code>0xE5</code>.<ref name="Schulman_1994_Undocumented-DOS"/><ref group="lower-alpha" name="NB_Magic_E5"/> Since the 1990s, most 5.25-inch and 3.5-inch floppies have been shipped pre-formatted from the factory as DOS [[FAT12]] floppies.


In current IBM mainframe operating systems derived from [[OS/360]] or [[DOS/360]], this may be done as part of allocating a file, by a utility specific to the file system or, in some older access methods, on the fly as new data are written.
In current IBM mainframe operating systems derived from [[OS/360]] and [[DOS/360]], such as [[z/OS]] and [[z/VSE]], formatting of drives is done by the INIT command of the [[ICKDSF]] utility.<ref>{{Cite web |url=http://publibz.boulder.ibm.com/epubs/pdf/ick4020f.pdf |title=Device Support Facilities User's Guide and Reference |access-date=2010-12-27 |archive-date=2021-12-09 |archive-url=https://web.archive.org/web/20211209100904/http://publibz.boulder.ibm.com/epubs/pdf/ick4020f.pdf |url-status=dead }}</ref> These OSs support only a single partition per device, called a volume. The ICKDSF functions include writing a Record 0 on every track, writing [[Initial Program Load|IPL]] text, creating a volume label, creating a [[Volume Table of Contents]] (VTOC) and, optionally, creating a VTOC index (VTOCIX). High-level formatting may also be done as part of allocating a file, by a utility specific to a file system or, in some older access methods, on the fly as new data are written. In z/OS UNIX System Services, there are three distinct levels of high-level formatting:
*Initializing a volume with ICKDSF
*Initializing a [[VSAM]] Linear Data Set (LDS) as part of allocating it on the volume with Access Method Services (IDCAMS) DEFINE
*Initializing a [[zFS (z/OS file system)|zFS]] aggregate in the LDS using ioeagfmt.

In IBM operating systems derived from [[CP-67]], formatting a volume initializes track 0 and a dummy VTOC. Guest operating systems are responsible for formatting [[minidisk (VM)|minidisks]]; the CMS FORMAT command formats a [[CMS file system]] on a CMS minidisk.


== Host protected area ==
{{main|Host protected area}}


The host protected area, sometimes referred to as hidden protected area, is an area of a [[hard drive]] that is high level formatted so that the area is not normally visible to its [[operating system]] (OS).
The host protected area, sometimes referred to as hidden protected area, is an area of a [[hard drive]] that is high-level formatted such that the area is not normally visible to its [[operating system]] (OS).


== Reformatting {{anchor|REFORMAT}} ==<!-- incoming links -->
Reformatting is a [[#HIGH|high-level formatting]] performed on a functioning disk drive to free the contents of its medium. Reformatting is unique to each operating system because what actually is done to existing data varies by OS. The most important aspect of the process is that it frees disk space for use by other data. To actually "erase" everything requires overwriting each block of data on the medium; something that is not done by many PC high-level formatting utilities.
Reformatting is a [[#HIGH|high-level formatting]] performed on a functioning disk drive to free the medium of its contents. Reformatting is unique to each operating system because what actually is done to existing data varies by OS. The most important aspect of the process is that it frees disk space for use by other data. To actually "erase" everything requires overwriting each block of data on the medium, something that is not done by many high-level formatting utilities.


Reformatting often carries the implication that the operating system and all other software will be reinstalled after the format is complete. Rather than fixing an installation suffering from malfunction or security compromise, it is sometimes judged easier to simply reformat everything and start from scratch. Various colloquialisms exist for this process, such as "wipe and reload", "nuke and pave", "reimage", etc.
Reformatting often carries the implication that the operating system and all other software will be reinstalled after the format is complete. Rather than fixing an installation suffering from malfunction or security compromise, it may be necessary to simply reformat everything and start from scratch. Various colloquialisms exist for this process, such as "wipe and reload", "nuke and pave", "reimage", etc. However, reformatting a drive containing only user data does not require reinstallation of the OS.


== Formatting ==
[[Image:Unformat.gif|thumb|MS-DOS 6.22a FORMAT /U switch failing to overwrite content of partition]]


''format command'': Under [[MS-DOS]], [[PC-DOS]], [[OS/2]] and [[Microsoft Windows]], disk formatting can be performed by the <code>[[format (command)|format]]</code> [[command (computing)|command]]. The <code>format</code> program usually asks for confirmation beforehand to prevent accidental removal of data, but some versions of DOS have an undocumented <code>/AUTOTEST</code> option; if used, the usual confirmation is skipped and the format begins right away. The WM/FormatC [[Macro virus (computing)|macro virus]] uses this command to format drive C: as soon as a document is opened.
''format command'': Under [[MS-DOS]], [[PC DOS]], [[OS/2]] and [[Microsoft Windows]], disk formatting can be performed by the <code>[[format (command)|format]]</code> [[command (computing)|command]]. The <code>format</code> program usually asks for confirmation beforehand to prevent accidental removal of data, but some versions of DOS have an undocumented <code>/AUTOTEST</code> option; if used, the usual confirmation is skipped and the format begins right away. The WM/FormatC [[Macro virus (computing)|macro virus]] uses this command to format drive C: as soon as a document is opened.


''Unconditional format'': There is also the undocumented <code>/U</code> parameter that performs an ''unconditional'' format which under most circumstances overwrites the entire partition,<ref>{{cite web |url = http://www.mdgx.com/secrets.htm#FORMAT-U |title = AXCEL216 / MDGx MS-DOS Undocumented + Hidden Secrets |accessdate = 2008-06-07}}</ref> preventing the recovery of data through software. Note however that the <code>/U</code> switch only works reliably with floppy diskettes (see image to the right). This is because, unless <code>/Q</code> is used, floppies are always low-level formatted in addition to being high-level formatted. Under certain circumstances with hard drive partitions, however, the <code>/U</code> switch merely prevents the creation of <code>[[unformat (command)|unformat]]</code> information in the partition to be formatted while otherwise leaving the partition's contents entirely intact (still on disk but marked deleted). In such cases, the user's data remain ripe for recovery with specialist tools such as [[EnCase]] or [[disk editor]]s. Reliance upon <code>/U</code> for secure overwriting of hard drive partitions is therefore inadvisable, and purpose-built tools such as [[DBAN]] should be considered instead.
''Unconditional format'': There is also the <code>/U</code> parameter that performs an ''unconditional'' format which under most circumstances overwrites the entire partition,<ref>{{cite web |url = http://www.mdgx.com/secrets.htm#FORMAT-U |title = AXCEL216 / MDGx MS-DOS Undocumented + Hidden Secrets |access-date = 2008-06-07}}</ref> preventing the recovery of data through software. Note however that the <code>/U</code> switch only works reliably with floppy diskettes (see image to the right). This is because, unless <code>/Q</code> is used, floppies are always low-level formatted in addition to being high-level formatted. Under certain circumstances with hard drive partitions, however, the <code>/U</code> switch merely prevents the creation of <code>[[unformat (command)|unformat]]</code> information in the partition to be formatted while otherwise leaving the partition's contents entirely intact (still on disk but marked deleted). In such cases, the user's data remain ripe for recovery with specialist tools such as [[EnCase]] or [[disk editor]]s. Reliance upon <code>/U</code> for secure overwriting of hard drive partitions is therefore inadvisable, and purpose-built tools such as [[DBAN]] should be considered instead.


''Overwriting'': In Windows Vista and later, a non-quick format overwrites the disk as it goes; this is not the case in Windows XP and earlier.<ref>
''Overwriting'': In Windows Vista and later, a non-quick format overwrites the disk as it goes; this is not the case in Windows XP and earlier.<ref>
Line 175: Line 196:
| quote = The format command behavior has changed in Windows Vista. By default in Windows Vista, the format command writes zeros to the whole disk when a full format is performed. In Windows XP and in earlier versions of the Windows operating system, the format command does not write zeros to the whole disk when a full format is performed.
| quote = The format command behavior has changed in Windows Vista. By default in Windows Vista, the format command writes zeros to the whole disk when a full format is performed. In Windows XP and in earlier versions of the Windows operating system, the format command does not write zeros to the whole disk when a full format is performed.
| date = 2009-02-23
| date = 2009-02-23
| accessdate = 2012-10-24
| access-date = 2012-10-24
}}
}}
</ref>
</ref>


''OS/2'': Under OS/2, if you use the <code>/L</code> parameter, which specifies a ''long'' format, then format will overwrite the entire partition or logical drive. Doing so enhances the ability of [[CHKDSK]] to recover files.
''OS/2'': Under OS/2, format will overwrite the entire partition or logical drive if the <code>/L</code> parameter is used, which specifies a ''long'' format. Doing so enhances the ability of [[CHKDSK]] to recover files.


=== Unix-like operating systems ===


High-level formatting of disks on these systems is traditionally done using the <code>[[mkfs]]</code> command. On Linux (and potentially other systems as well) <code>mkfs</code> is typically a wrapper around filesystem-specific commands which have the name <code>mkfs''.fsname''</code>, where ''fsname'' is the name of the filesystem with which to format the disk.<ref>{{cite web |url = http://linux.die.net/man/8/mkfs |title = mkfs(8) - Linux man page |accessdate = 2010-04-25}}</ref> Some filesystems which are not supported by certain implementations of <code>mkfs</code> have their own manipulation tools; for example [[Ntfsprogs]] provides a format utility for the [[NTFS]] filesystem.
High-level formatting of disks on these systems is traditionally done using the <code>[[mkfs]]</code> command. On Linux (and potentially other systems as well) <code>mkfs</code> is typically a wrapper around filesystem-specific commands which have the name <code>mkfs''.fsname''</code>, where ''fsname'' is the name of the filesystem with which to format the disk.<ref>{{cite web |url = http://linux.die.net/man/8/mkfs |title = mkfs(8) - Linux man page |access-date = 2010-04-25}}</ref> Some filesystems which are not supported by certain implementations of <code>mkfs</code> have their own manipulation tools; for example [[Ntfsprogs]] provides a format utility for the [[NTFS]] filesystem.
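The naming convention for the filesystem-specific commands can be sketched in Python as follows. The function names <code>mkfs_helper_name</code> and <code>find_mkfs_helper</code> are hypothetical, and the sketch only looks a helper up on the search path; it does not run it, and it is not a description of how any particular <code>mkfs</code> implementation locates its helpers.

```python
import shutil


def mkfs_helper_name(fsname):
    """Build the conventional helper name 'mkfs.<fsname>' for a filesystem."""
    if not fsname or "/" in fsname:
        raise ValueError("invalid filesystem name: %r" % (fsname,))
    return "mkfs." + fsname


def find_mkfs_helper(fsname):
    """Return the full path of the helper if it is on PATH, else None."""
    return shutil.which(mkfs_helper_name(fsname))


if __name__ == "__main__":
    for name in ("ext4", "vfat", "ntfs"):
        # Whether a helper is found depends on what is installed locally.
        print(mkfs_helper_name(name), "->", find_mkfs_helper(name))
```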


Some Unix and Unix-like operating systems have higher-level formatting tools, usually for the purpose of making disk formatting easier and/or allowing the user to partition the disk with the same tool. Examples include [[GNU Parted]] (and its various GUI frontends such as [[GParted]] and the [[KDE Partition Manager]]) and the [[Disk Utility]] application on [[Mac OS X]].


== Recovery of data from a formatted disk ==
{{original research|section|date=March 2011}}


As in file deletion by the operating system, data on a disk are not fully erased during every<ref>Data are destroyed in PC operating systems when the '''/L''' (long) option is used on format, for a [[Data set (IBM mainframe)#Partitioned datasets|Partitioned Data Set (PDS)]] in [[MVS]] and for newer file systems on IBM mainframes.</ref> high-level format. Instead, the area on the disk containing the data is merely marked as available, and retains the old data until it is overwritten. If the disk is formatted with a different file system than the one which previously existed on the partition, some data may be overwritten that wouldn't be if the same file system had been used. However, under some file systems (e.g., NTFS, but not FAT), the file indexes (such as $MFTs under NTFS, inodes under ext2/3, etc.) may not be written to the same exact locations. And if the partition size is increased, even FAT file systems will overwrite more data at the beginning of that new partition.
As in file deletion by the operating system, data on a disk are not fully erased during every high-level format. Instead, the area on the disk containing the data is merely marked as available, and retains the old data until it is overwritten. If the disk is formatted with a different file system than the one which previously existed on the partition, some data may be overwritten that wouldn't be if the same file system had been used. However, under some file systems (e.g., NTFS, but not FAT), the file indices (such as $MFTs under NTFS, inodes under ext2/3, etc.) may not be written to the same exact locations. And if the partition size is increased, even FAT file systems will overwrite more data at the beginning of that new partition.
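The difference between marking space as available and actually overwriting it can be illustrated with a toy model in Python. The class <code>ToyVolume</code> and its methods are hypothetical and do not correspond to any real file system; the directory dictionary stands in for the file index, and the list of blocks stands in for the data area.

```python
class ToyVolume:
    """A toy volume: a file index (directory) plus fixed-size data blocks."""

    def __init__(self, block_count, block_size=16):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(block_count)]
        self.directory = {}  # file name -> index of the block holding it

    def write_file(self, name, data):
        """Store `data` (at most one block) in the first free block."""
        if len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        used = set(self.directory.values())
        for index in range(len(self.blocks)):
            if index not in used:
                self.blocks[index] = data.ljust(self.block_size, b"\x00")
                self.directory[name] = index
                return index
        raise OSError("volume full")

    def quick_format(self):
        """Reset only the index; the data blocks keep their old contents."""
        self.directory = {}

    def full_format(self, fill=0x00):
        """Reset the index and overwrite every data block with `fill`."""
        self.directory = {}
        self.blocks = [bytes([fill]) * self.block_size for _ in self.blocks]


volume = ToyVolume(block_count=4)
slot = volume.write_file("letter.txt", b"secret")
volume.quick_format()
print(volume.directory)                      # {}
print(volume.blocks[slot][:6])               # b'secret'
volume.full_format()
print(volume.blocks[slot][:6])               # b'\x00\x00\x00\x00\x00\x00'
```

After the quick format the volume appears empty, yet the old bytes remain in the data area until a later write or an overwriting format replaces them.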


From the perspective of preventing the recovery of sensitive data through recovery tools, the data must either be completely overwritten (every sector) with random data before the format, or the format program itself must perform this overwriting, as the [[DOS]] <code>FORMAT</code> command did with floppy diskettes, filling every data sector with the format filler byte value (typically <code>0xF6</code><!-- as per the INT 1Eh disk parameter table. TBD: IIRC this resembles a bit pattern optimized for FM/MFM controllers -->).
From the perspective of preventing the recovery of sensitive data through recovery tools, the data must be completely overwritten (every sector), either by a separate tool, or during formatting. Data are destroyed in DOS, OS/2, and Windows when the '''/L''' (long) option is used on format and always for a [[Data set (IBM mainframe)#Partitioned datasets|Partitioned Data Set (PDS)]] in [[MVS]] and for newer file systems on IBM mainframes.


{{main|data erasure}}
However, there are applications and tools, especially those used in forensic information technology, that can recover data that has been conventionally erased. In order to avoid the recovery of sensitive data, governmental organizations and large companies use information destruction methods like the [[Gutmann method]].<ref>[http://www.youbioit.com/en/article/shared-information/5317/deleting-files-permanently Deleting files permanently]</ref> For average users there are also special applications that can perform complete data destruction by overwriting previous information. Although there are applications that perform multiple writes, a [[data erasure#Number of overwrites needed|single write]] is generally all that is needed on modern hard disk drives.
It is disputed whether one pass of zero-fill is enough to destroy sensitive data on older (until the 1990s) magnetic storage: Gutmann (known for his 35-pass [[Gutmann method]]) claims that [[magnetic force microscopy]] may be able to "see" old bits on a floppy,<ref name="Gutmann">Gutmann, Peter. (July 22–25, 1996) ''[https://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html Secure Deletion of Data from Magnetic and Solid-State Memory.]'' University of Auckland Department of Computer Science. Epilogue section.</ref> but the sources he cited do not prove this. Random fill is believed to be stronger than a fixed pattern fill.<ref>{{cite web|date=2003|title=Can Intelligence Agencies Recover Overwritten Data?|author=Daniel Feenberg|url=http://www.nber.org/sys-admin/overwritten-data-gutmann.html|access-date=2007-12-10}}</ref> One pass of zero fill is sufficient to prevent [[data remanence]], according to NIST (2014) and Wright et al. (2008).<ref>{{cite conference
| first = Craig | last = Wright
|author2=Kleiman, Dave
|author2-link=Dave Kleiman
|author3=Shyaam, Sundhar R.S.
| conference = Information Systems Security ICISS 2008 | title = Overwriting Hard Drive Data: The Great Wiping Controversy | series = Lecture Notes in Computer Science | publisher = Springer Berlin / Heidelberg | isbn = 978-3-540-89861-0 | doi = 10.1007/978-3-540-89862-7_21
| pages = 243–257 |date=December 2008
| volume = 5352 }}</ref><ref>{{cite tech report
| url = https://csrc.nist.gov/publications/detail/sp/800-88/rev-1/final
| title = Special Publication 800-88 Rev. 1: Guidelines for Media Sanitization
| publisher = [[National Institute of Standards and Technology|NIST]]
| date = December 2014 | doi = 10.6028/NIST.SP.800-88r1
| access-date = 2018-06-26
| last1 = Kissel
| first1 = Richard
| last2 = Regenscheid
| first2 = Andrew
| last3 = Scholl
| first3 = Matthew
| last4 = Stine
| first4 = Kevin
| doi-access = free
}}</ref> The ''[[Secure Erase]]'' option built into hard drives is considered trustworthy,<ref name="Secure Deletion">{{cite web
| url=http://www.upenn.edu/computing/security/privacy/data_clear.php
| title=Secure Data Deletion
| date=June 7, 2012
| access-date=9 December 2013
}}</ref><ref>{{cite web
|url=http://tinyapps.org/docs/wipe_drives_hdparm.html
|title=ATA Secure Erase (SE) and hdparm
}} Created: 2011.02.21, updated: 2013.04.02.</ref> with the caveat that early [[solid state drives]] are known to mis-implement the function.<ref name="Wei2011">{{cite q | Q115346857
| journal = FAST'11: Proceedings of the 9th USENIX conference on File and storage technologies
| access-date = 2018-01-08
| ref = {{sfnref|Wei|2011}}
}}</ref>

[[Degaussing]] is effective without controversy; however, this may render the drive [[Degaussing#Irreversible damage to some media types|unusable]].<ref name="Secure Deletion" />


== See also ==

* [[Data erasure]]
* [[Data recovery]]
* [[Data remanence]]
* [[Drive mapping]]
* [[Drive mapping]]
* [[List of default file systems]]
* [[Comparison of file systems]]


== Notes ==
{{Notelist|refs=

<ref group="lower-alpha" name="NB_Magic_E5">The fact that 8-inch CP/M floppies came pre-formatted with a filler value of <code>0xE5</code> is the reason why the value of <code>0xE5</code> has a special meaning in directory entries in [[FAT12]], [[FAT16]] and [[FAT32]] file systems. This allowed [[86-DOS]] to use 8-inch floppies out of the box or with only the FAT initialized.<!-- <ref name="Schulman_1994_Undocumented-DOS"/> --></ref>
<ref group="lower-alpha" name="NB_Format_Wipe">One utility providing an option to specify the desired fill value for hard disks is DR-DOS' FDISK&nbsp;R2.31 with its optional wipe parameter <code>/W:246</code> (for a fill value of <code>0xF6</code>). In contrast to other [[FDISK]] utilities, DR-DOS FDISK is not only a partitioning tool, but can also format freshly created partitions as [[#FAT12|FAT12]], [[#FAT16|FAT16]] or [[#FAT32|FAT32]]. This reduces the risk of accidentally formatting the wrong volume.</ref>
}}


== References ==
{{reflist|refs=

{{reflist|30em|refs=
<ref name="Schulman_1994_Undocumented-DOS">Andrew Schulman, Ralf Brown, David Maxey, Raymond J. Michels, Jim Kyle (1994). ''Undocumented DOS''. Addison Wesley, second edition. ISBN 0-201-63287-X, ISBN 978-0-201-63287-3.
</ref>
}}
}}


== External links ==
* [https://technet.microsoft.com/en-us/library/cc750198.aspx Windows NT Workstation Resource Kit, Chapter 17 - Disk and File System Basics], section "Formatting Hard Disks and Floppy Disks"
* [http://www.cs.auckland.ac.nz/~pgut001/pubs/secure_del.html ''Secure Deletion of Data from Magnetic and Solid-State Memory''] by Peter Gutmann
* [http://support.microsoft.com/?kbid=302686 ''Differences between a Quick format and a regular format during a "clean" installation of Windows XP''] from Microsoft Help and Support
* [http://support.microsoft.com/?scid=kb%3Ben-us%3B255867&x=17&y=15 support.microsoft.com — How to Use the Fdisk Tool and the Format Tool to Partition or Repartition a Hard Disk]
* [https://technet.microsoft.com/en-us/library/cc512587.aspx ''Help: I Got Hacked. Now What Do I Do?'']—Microsoft Tech Net: Why you should wipe a compromised drive to the bare metal. Article by Jesper M. Johansson, Ph.D., CISSP, MCSE, MCP+I
* [http://www.ultimatebootcd.com Ultimate Boot CD] - Free utility including many useful dos/linux based tools for system maintenance. It's bootable from a CD or USB and has its own operating systems so it's completely independent from external software.


[[Category:Rotating disc computer storage media|Formatting of disks]]
[[Category:File system management]]
[[Category:DOS on IBM PC compatibles]]
[[Category:OS/2]]
[[Category:Windows administration]]

Latest revision as of 06:40, 6 December 2024

Disk formatting is the process of preparing a data storage device such as a hard disk drive, solid-state drive, floppy disk, memory card or USB flash drive for initial use. In some cases, the formatting operation may also create one or more new file systems. The first part of the formatting process that performs basic medium preparation is often referred to as "low-level formatting".[1] Partitioning is the common term for the second part of the process, dividing the device into several sub-devices and, in some cases, writing information to the device allowing an operating system to be booted from it.[1][2] The third part of the process, usually termed "high-level formatting", most often refers to the process of generating a new file system.[1] In some operating systems all or parts of these three processes can be combined or repeated at different levels[a] and the term "format" is understood to mean an operation in which a new disk medium is fully prepared to store files. Some formatting utilities distinguish between a quick format, which does not erase all existing data, and a long format, which does.

As a general rule,[b] formatting a disk by default leaves most if not all existing data on the disk medium, some or most of which might be recoverable with privileged[c] or special tools.[6] Special tools can remove user data by a single overwrite of all files and free space.[7]

History

A block, a contiguous number of bytes, is the minimum unit of storage that is read from and written to a disk by a disk driver. The earliest disk drives had fixed block sizes (e.g. the block size of the IBM 350 disk storage unit of the late 1950s was 100 six-bit characters), but starting with the 1301[8] IBM marketed subsystems that featured variable block sizes: a particular track could have blocks of different sizes. The disk subsystems and other direct access storage devices on the IBM System/360 expanded this concept in the form of Count Key Data (CKD) and later Extended Count Key Data (ECKD). Variable block sizes in HDDs fell out of use in the 1990s; one of the last HDDs to support variable block size was the IBM 3390 Model 9, announced in May 1993.[9]

Modern hard disk drives, such as Serial attached SCSI (SAS)[d] and Serial ATA (SATA)[10] drives, appear at their interfaces as a contiguous set of fixed-size blocks. For many years these blocks were 512 bytes long, but beginning in 2009 and accelerating through 2011, all major hard disk drive manufacturers began releasing hard disk drive platforms using the Advanced Format of 4096-byte logical blocks.[11][12]

Floppy disks generally used only fixed block sizes, but these sizes were a function of the host's OS and its interaction with its controller, so that a particular type of media (e.g., 5¼-inch DSDD) would have different block sizes depending upon the host OS and controller.

Optical discs generally only use fixed block sizes.

Disk formatting process

Formatting a disk for use by an operating system and its applications typically involves three different processes.[e]

  1. Low-level formatting (i.e., closest to the hardware) marks the surfaces of the disks with markers indicating the start of a recording block (typically today called sector markers) and other information like block CRC to be used later, in normal operations, by the disk controller to read or write data. This is intended to be the permanent foundation of the disk, and is often completed at the factory.
  2. Partitioning divides a disk into one or more regions, writing data structures to the disk to indicate the beginning and end of the regions. This level of formatting often includes checking for defective tracks or defective sectors.
  3. High-level formatting creates the file system format within a disk partition or a logical volume.[1] This formatting includes the data structures used by the OS to identify the logical drive or partition's contents. This may occur during operating system installation, or when adding a new disk. Disk and distributed file systems may specify an optional boot block, and/or various volume and directory information for the operating system.

Low-level formatting of floppy disks

The low-level format of floppy disks (and early hard disks) is performed by the disk drive's controller.

For a standard 1.44 MB floppy disk, low-level formatting normally writes 18 sectors of 512 bytes to each of 160 tracks (80 on each side) of the floppy disk, providing 1,474,560 bytes of storage on the disk.
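
The capacity figure follows directly from the disk geometry. The following is a minimal sketch of that arithmetic; the function and parameter names are illustrative and are not taken from any formatting tool.

```python
def floppy_capacity(sectors_per_track: int, tracks_per_side: int,
                    sides: int = 2, bytes_per_sector: int = 512) -> int:
    """Return the user-data capacity in bytes for a given floppy geometry."""
    return sectors_per_track * tracks_per_side * sides * bytes_per_sector


# Standard high-density 3.5-inch geometry: 18 sectors, 80 tracks per side, 2 sides.
standard = floppy_capacity(18, 80)
print(standard)          # 1474560
print(standard / 1024)   # 1440.0 (KiB), the origin of the "1.44 MB" label
```

The same function can be used to estimate non-standard geometries, for example `floppy_capacity(21, 82)` for 21 sectors per track on 82 tracks per side.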

Physical sectors are actually larger than 512 bytes, as in addition to the 512 byte data field they include a sector identifier field, CRC bytes (in some cases error correction bytes) and gaps between the fields. These additional bytes are not normally included in the quoted figure for overall storage capacity of the disk.

Different low-level formats can be used on the same media; for example, large records can be used to cut down on inter-record gap size.

Several freeware, shareware and free software programs (e.g. GParted, FDFORMAT, NFORMAT, VGA-Copy and 2M) offered considerably more control over formatting, making it possible to format high-density 3.5" disks with a capacity of up to 2 MB.

Techniques used include:

  • head/track sector skew (moving the sector numbering forward at side change and track stepping to reduce mechanical delay),
  • interleaving sectors (to boost throughput by organizing the sectors on the track),
  • increasing the number of sectors per track (while a normal 1.44 MB format uses 18 sectors per track, it is possible to increase this to a maximum of 21), and
  • increasing the number of tracks (most drives could tolerate extension to 82 tracks: though some could handle more, others could jam).
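
The first two techniques in the list above can be illustrated with a small sketch. It assumes the usual textbook scheme in which consecutive sector numbers are placed a fixed number of physical slots apart (the interleave factor) and each track's layout is rotated by a fixed number of slots (the skew). The function names are hypothetical, and real formatters may choose their layouts differently.

```python
def interleaved_layout(sectors: int, interleave: int) -> list[int]:
    """Return sector numbers in physical order around one track.

    Consecutive logical sectors are placed `interleave` physical slots
    apart; an occupied slot is skipped by moving to the next free one.
    """
    slots = [0] * sectors          # 0 marks a free slot; sectors are numbered from 1
    pos = 0
    for number in range(1, sectors + 1):
        while slots[pos] != 0:
            pos = (pos + 1) % sectors
        slots[pos] = number
        pos = (pos + interleave) % sectors
    return slots


def skewed_layout(layout: list[int], skew: int, track: int) -> list[int]:
    """Rotate a track layout so that it starts `skew * track` slots later."""
    shift = (skew * track) % len(layout)
    return layout[-shift:] + layout[:-shift] if shift else list(layout)


print(interleaved_layout(9, 2))   # [1, 6, 2, 7, 3, 8, 4, 9, 5]
print(skewed_layout(interleaved_layout(4, 1), skew=1, track=1))   # [4, 1, 2, 3]
```

With an interleave of 1 the sectors appear in plain ascending order; larger factors give a slow controller time to process one sector before the next logical sector passes under the head.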

Linux supports a variety of sector sizes,[13] and DOS and Windows support the large-record-size Distribution Media Format (DMF) for floppies.[14]

After establishing the structure of tracks, a formatter also needs to fill the entire floppy and look for bad sectors. Traditionally, the physical sectors were initialized with a fill value of 0xF6 as per the INT 1Eh's Disk Parameter Table (DPT) during format on IBM compatible machines. This value is also used on the Atari Portfolio. CP/M 8-inch floppies typically came pre-formatted with a value of 0xE5,[15] and by way of Digital Research this value was also used on Atari ST and some Amstrad formatted floppies.[f] Amstrad otherwise used 0xF4 as a fill value.
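
The link between the 0xE5 fill value and FAT directory entries can be made concrete with a short sketch. It assumes the common FAT layout of 32-byte directory entries in which a first byte of 0xE5 marks an unused entry and a first byte of 0x00 marks the end of the directory; the constant and function names are illustrative only.

```python
DIR_ENTRY_SIZE = 32        # bytes per FAT directory entry (assumed layout)
FREE_MARKER = 0xE5         # first byte of an unused (deleted) entry
END_MARKER = 0x00          # first byte of an entry that ends the directory


def classify_entries(sector: bytes) -> list[str]:
    """Label each 32-byte directory entry in a sector by its first byte."""
    labels = []
    for offset in range(0, len(sector), DIR_ENTRY_SIZE):
        first = sector[offset]
        if first == FREE_MARKER:
            labels.append("free")
        elif first == END_MARKER:
            labels.append("end")
        else:
            labels.append("in use")
    return labels


# A 512-byte sector as it would look straight after a fill with 0xE5.
fresh_sector = bytes([FREE_MARKER]) * 512
print(classify_entries(fresh_sector))   # sixteen "free" labels
```

Under these assumptions a sector that has only been filled with 0xE5 already reads as a directory of sixteen unused entries, which is consistent with a pre-formatted floppy being usable without further directory initialization.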

Low-level formatting (LLF) of hard disks

Low-level format of a 10-megabyte IBM PC XT hard drive

Hard disk drives prior to the 1990s typically had a separate disk controller that defined how data was encoded on the media. With the media, the drive and/or the controller possibly procured from separate vendors, users were often able to perform low-level formatting. Separate procurement also had the potential of incompatibility between the separate components such that the subsystem would not reliably store data.[g]

User-instigated low-level formatting (LLF) of hard disk drives was common for minicomputer and personal computer systems until the 1990s. IBM and other mainframe system vendors typically supplied their hard disk drives (or media in the case of removable media HDDs) with a low-level format. Typically this involved subdividing each track on the disk into one or more blocks which would contain the user data and associated control information. Different computers used different block sizes and IBM notably used variable block sizes but the popularity of the IBM PC caused the industry to adopt a standard of 512 user data bytes per block by the middle 1980s.

Depending upon the system, low-level formatting was generally done by an operating system utility. On IBM-compatible PCs, the MS-DOS debug program was used to transfer control to a low-level format routine in the BIOS, hidden at different addresses in different BIOSes.[16]

Transition away from LLF

Starting in the late 1980s, driven by the volume of IBM compatible PCs, HDDs became routinely available pre-formatted with a compatible low-level format. At the same time, the industry moved from historical (dumb) bit serial interfaces to modern (intelligent) bit serial interfaces and word serial interfaces wherein the low-level format was performed at the factory.[17][18] Accordingly, it is not possible for an end user to low-level format a modern hard disk drive.

Modern disks: reinitialization

Modern hard drives can no longer perform post-production LLF, that is, re-establish the basic layout of "tracks" and "blocks" on the recording surface. Reinitialization refers to processes that return a disk to a factory-like configuration: no data, no partitioning, all blocks available to use.

Command-set support

SCSI provides a Format Unit command. This command performs the needed certification step to weed out bad sectors and has the ability to change sector size. The command-line sg_format program may be used to issue the command.[19] A variety of sector sizes may be chosen, but are not available on all devices: 512, 520, 524, 528, 4096, 4112, 4160, and 4224-byte sectors.[20] Although the SCSI command provides many options, even resizing, it does not touch on the track layer where low-level format happens.[21]

ATA does not expose low-level format functionality, but it allows the sector size to be changed via SET SECTOR CONFIGURATION (--set-sector-size in hdparm). (Consumer drives usually only support 512 and 4096-byte sectors.) Although a sector-size change may scramble data, it is not a safe way of erasing data, nor is any certification done. ATA offers a separate SECURITY ERASE (--security-erase in hdparm) command for erasure.[22]

NVMe drives have a standard method of formatting, available in, for example, the Linux command-line program nvme format. Sector size change and secure erase options are available.[23] Note that NVMe drives are generally solid-state, so the "track" distinction does not apply to them.

Seagate Technology drives offer a TTL serial debugging console.[24] Among other things, the console can format the "system" and "user" partitions while performing defect checks (re-initialization over pre-established logical blocks) and modify track parameters (managing the real low-level format).[25]

Disk-filling

When the hard drive's built-in reinitialization function (see above) is unavailable due to driver or system limitations, it is possible to fill the entire disk instead. On older hard drives without bad sector management,[26] a program will also need to check for any damaged sectors and try to spare them out. On newer drives with defect management, reallocated sectors may be left unerased, whereas the built-in re-initialization function will erase them.[27]

In modern times, it is most common to fill hard drives with a value of 0x00. One popular method for performing this zero-fill operation on a hard disk is to write zero-value bytes to the drive using the Unix dd utility, with the /dev/zero stream as the input file and the drive itself (or a specific partition) as the output file.[28] This command may take many hours to complete, and will erase all files and file systems.
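
The zero-fill operation can also be sketched in a few lines of Python. This is an illustration of the idea rather than a replacement for dd: the function name `zero_fill` is hypothetical, and the demonstration deliberately runs against a scratch file standing in for a drive. Pointing the function at a real device node would destroy its contents.

```python
import os
import tempfile


def zero_fill(path: str, chunk_size: int = 1024 * 1024) -> int:
    """Overwrite an existing file or block device with zero bytes.

    Returns the number of bytes written. The target size is found by
    seeking to the end, which works for regular files and, on Linux,
    for block devices.
    """
    zeros = bytes(chunk_size)
    written = 0
    with open(path, "r+b", buffering=0) as target:
        size = target.seek(0, os.SEEK_END)
        target.seek(0)
        while written < size:
            step = min(chunk_size, size - written)
            written += target.write(zeros[:step])
        os.fsync(target.fileno())
    return written


# Demonstration on a scratch file standing in for a drive.
with tempfile.TemporaryDirectory() as workdir:
    image = os.path.join(workdir, "disk.img")
    with open(image, "wb") as f:
        f.write(os.urandom(3 * 1024 * 1024 + 123))   # pretend old data
    print(zero_fill(image))                           # 3145851
    with open(image, "rb") as f:
        print(any(f.read()))                          # False: only zero bytes remain
```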

A value of 0xFF is used on flash disks to reduce wear. This value is typically also the default value used on ROM disks (which cannot be reformatted). Some advanced tools allow configuring the fill value.[h]

Zero-filling a drive is not a secure method of preparing a drive for use with an encrypted filesystem. Doing so voids the plausible deniability of the process, as the encrypted areas (indistinguishable from random without a key, unless the cipher is compromised) will stand out among zero blocks. The correct technique is to zero-fill inside a temporary encrypted layer then discard the key and layer setup. (/dev/urandom provides similar safety, but tends to be slow.)[29]
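
Why zero blocks give the layout away can be shown with a toy example. The sketch below builds a mock eight-block disk image in memory, zero-fills it, and overwrites three blocks with random bytes standing in for ciphertext; the block size, names and layout are assumptions made for the illustration.

```python
import os

BLOCK_SIZE = 512


def zero_block_map(image: bytes, block_size: int = BLOCK_SIZE) -> list[bool]:
    """Return True for every block that consists only of zero bytes."""
    zero = bytes(block_size)
    return [image[i:i + block_size] == zero
            for i in range(0, len(image), block_size)]


# Mock 8-block disk: zero-filled, then blocks 2-4 overwritten with
# random bytes standing in for encrypted data.
disk = bytearray(8 * BLOCK_SIZE)
disk[2 * BLOCK_SIZE:5 * BLOCK_SIZE] = os.urandom(3 * BLOCK_SIZE)
print(zero_block_map(bytes(disk)))
# [True, True, False, False, False, True, True, True]
```

An observer running such a scan learns exactly which blocks hold data. Had the whole image been filled with random-looking bytes first, every block would be reported as non-zero and the used region would not stand out.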

Confusion

The present ambiguity in the term low-level format seems to be due to both inconsistent documentation on web sites and the belief by many users that any process below a high-level (file system) format must be called a low-level format. Since much of the low-level formatting process can today only be performed at the factory, various drive manufacturers describe reinitialization software as LLF utilities on their web sites. Since users generally have no way to determine the difference between a complete LLF and reinitialization (they simply observe that running the software results in a hard disk that must be high-level formatted), both the misinformed user and mixed signals from various drive manufacturers have perpetuated this error.

Note: whatever possible misuse of such terms may exist, many sites do make such reinitialization utilities available (possibly as bootable floppy diskette or CD image files), to both overwrite every byte and check for damaged sectors on the hard disk.

Partitioning

Partitioning is the process of writing information into blocks of a storage device or medium to divide the device into several sub-devices, each of which is treated by the operating system as a separate device and, in some cases, to allow an operating system to be booted from the device.

On MS-DOS, Microsoft Windows, and UNIX-based operating systems (such as BSD, Linux and macOS) this is normally done with a partition editor, such as fdisk, GNU Parted, or Disk Utility. These operating systems support multiple partitions.
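
As an illustration of the data structures a partition editor writes, the sketch below builds and parses a partition table in memory. The text does not name a particular scheme; the example assumes the classic PC master boot record (MBR) layout, with four 16-byte entries starting at byte 446 of the first sector and a 0x55AA signature in the last two bytes. The function names are hypothetical, and the CHS address fields are left zeroed for simplicity.

```python
import struct

SECTOR_SIZE = 512
TABLE_OFFSET = 446         # partition table starts here in the first sector
ENTRY_SIZE = 16
SIGNATURE = b"\x55\xaa"    # boot signature in the last two bytes
# status, CHS start (3 bytes), type, CHS end (3 bytes), LBA start, sector count
ENTRY_FORMAT = "<B3sB3sII"


def build_mbr(partitions: list[tuple[int, int, int]]) -> bytes:
    """Build a first sector from up to four (type, start_lba, sector_count) tuples."""
    sector = bytearray(SECTOR_SIZE)
    for index, (ptype, start, count) in enumerate(partitions[:4]):
        entry = struct.pack(ENTRY_FORMAT, 0x00, b"\0\0\0", ptype, b"\0\0\0", start, count)
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        sector[offset:offset + ENTRY_SIZE] = entry
    sector[510:512] = SIGNATURE
    return bytes(sector)


def parse_mbr(sector: bytes) -> list[tuple[int, int, int]]:
    """Return (type, start_lba, sector_count) for each non-empty table entry."""
    if sector[510:512] != SIGNATURE:
        raise ValueError("missing 0x55AA boot signature")
    found = []
    for index in range(4):
        offset = TABLE_OFFSET + index * ENTRY_SIZE
        _, _, ptype, _, start, count = struct.unpack_from(ENTRY_FORMAT, sector, offset)
        if ptype != 0:         # type 0 marks an unused entry
            found.append((ptype, start, count))
    return found


layout = [(0x0C, 2048, 204800), (0x83, 206848, 409600)]
print(parse_mbr(build_mbr(layout)))
# [(12, 2048, 204800), (131, 206848, 409600)]
```

Each entry records where a region begins and how many sectors it spans, which is all the operating system needs in order to treat the region as a separate device.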

Floppy disks are not partitioned; however, depending upon the OS, they may require volume information in order to be accessed by the OS.

Partition editors and ICKDSF today do not handle low-level functions for HDDs and optical disc drives such as writing timing marks, and they cannot reinitialize a modern disk that has been degaussed or has otherwise lost its factory formatting.

IBM operating systems derived from CP-67, e.g., z/VM, maintain partitioning information for minidisks externally to the drive.

High-level formatting

High-level formatting is the process of setting up an empty file system on a disk partition or a logical volume and, for PCs, installing a boot sector.[1] This is often a fast operation, and is sometimes referred to as quick formatting.

Formatting an entire logical drive or partition may optionally scan for defects, which may take considerable time.

In the case of floppy disks, both high- and low-level formatting are customarily performed in one pass by the disk formatting software. Eight-inch floppies typically came low-level formatted and were filled with a format filler value of 0xE5.[15][f] Since the 1990s, most 5.25-inch and 3.5-inch floppies have been shipped pre-formatted from the factory as DOS FAT12 floppies.

In current IBM mainframe operating systems derived from OS/360 and DOS/360, such as z/OS and z/VSE, formatting of drives is done by the INIT command of the ICKDSF utility.[30] These OSs support only a single partition per device, called a volume. The ICKDSF functions include writing a Record 0 on every track, writing IPL text, creating a volume label, creating a Volume Table of Contents (VTOC) and, optionally, creating a VTOC index (VTOCIX). High-level formatting may also be done as part of allocating a file, by a utility specific to a file system or, in some older access methods, on the fly as new data are written. In z/OS Unix System Services, there are three distinct levels of high-level formatting:

  • Initializing a volume with ICKDSF
  • Initializing a VSAM Linear Data Set (LDS) as part of allocating it on the volume with Access Method Services (IDCAMS) DEFINE
  • Initializing a zFS aggregate in the LDS using ioeagfmt.

In IBM operating systems derived from CP-67, formatting a volume initializes track 0 and a dummy VTOC. Guest operating systems are responsible for formatting minidisks; the CMS FORMAT command formats a CMS file system on a CMS minidisk.

Host protected area

The host protected area, sometimes referred to as hidden protected area, is an area of a hard drive that is high-level formatted such that the area is not normally visible to its operating system (OS).

Reformatting

Reformatting is a high-level formatting performed on a functioning disk drive to free the medium of its contents. Reformatting is unique to each operating system because what actually is done to existing data varies by OS. The most important aspect of the process is that it frees disk space for use by other data. To actually "erase" everything requires overwriting each block of data on the medium, something that is not done by many high-level formatting utilities.

Reformatting often carries the implication that the operating system and all other software will be reinstalled after the format is complete. Rather than fixing an installation suffering from malfunction or security compromise, it may be necessary to simply reformat everything and start from scratch. Various colloquialisms exist for this process, such as "wipe and reload", "nuke and pave", "reimage", etc. However, reformatting a drive containing only user data does not require reinstallation of the OS.

Formatting

DOS, OS/2 and Windows

MS-DOS 6.22a FORMAT /U switch failing to overwrite content of partition

format command: Under MS-DOS, PC DOS, OS/2 and Microsoft Windows, disk formatting can be performed by the format command. The format program usually asks for confirmation beforehand to prevent accidental removal of data, but some versions of DOS have an undocumented /AUTOTEST option; if used, the usual confirmation is skipped and the format begins right away. The WM/FormatC macro virus uses this command to format drive C: as soon as a document is opened.

Unconditional format: There is also the /U parameter, which performs an unconditional format that under most circumstances overwrites the entire partition,[31] preventing the recovery of data through software. Note, however, that the /U switch only works reliably with floppy diskettes (see image). This is because, unless /Q is used, floppies are always low-level formatted in addition to being high-level formatted. Under certain circumstances with hard drive partitions, however, the /U switch merely prevents the creation of unformat information in the partition to be formatted while otherwise leaving the partition's contents entirely intact (still on disk but marked deleted). In such cases, the user's data remain ripe for recovery with specialist tools such as EnCase or disk editors. Reliance upon /U for secure overwriting of hard drive partitions is therefore inadvisable, and purpose-built tools such as DBAN should be considered instead.

Overwriting: In Windows Vista and later, a non-quick format overwrites the data as it goes. This is not the case in Windows XP and earlier.[32]

OS/2: Under OS/2, format will overwrite the entire partition or logical drive if the /L parameter is used, which specifies a long format. Doing so enhances the ability of CHKDSK to recover files.

Unix-like operating systems

High-level formatting of disks on these systems is traditionally done using the mkfs command. On Linux (and potentially other systems as well) mkfs is typically a wrapper around filesystem-specific commands which have the name mkfs.fsname, where fsname is the name of the filesystem with which to format the disk.[33] Some filesystems which are not supported by certain implementations of mkfs have their own manipulation tools; for example Ntfsprogs provides a format utility for the NTFS filesystem.
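
The wrapper convention described above can be sketched as follows. The helper functions are hypothetical; the sketch only derives the conventional `mkfs.fsname` command name and checks whether such a program is on the search path, and it does not format anything.

```python
import shutil
from typing import Optional


def mkfs_helper_name(fsname: str) -> str:
    """Return the conventional helper name for a filesystem type."""
    if not fsname or "/" in fsname:
        raise ValueError("expected a bare filesystem name such as 'ext4'")
    return f"mkfs.{fsname}"


def find_mkfs_helper(fsname: str) -> Optional[str]:
    """Look the helper up on PATH; None means it is not installed here."""
    return shutil.which(mkfs_helper_name(fsname))


print(mkfs_helper_name("ext4"))   # mkfs.ext4
print(find_mkfs_helper("ext4"))   # a path to the helper, or None if it is absent
```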

Some Unix and Unix-like operating systems have higher-level formatting tools, usually for the purpose of making disk formatting easier and/or allowing the user to partition the disk with the same tool. Examples include GNU Parted (and its various GUI frontends such as GParted and the KDE Partition Manager) and the Disk Utility application on Mac OS X.

Recovery of data from a formatted disk

As in file deletion by the operating system, data on a disk are not fully erased during every high-level format. Instead, the area on the disk containing the data is merely marked as available, and retains the old data until it is overwritten. If the disk is formatted with a different file system than the one which previously existed on the partition, some data may be overwritten that wouldn't be if the same file system had been used. However, under some file systems (e.g., NTFS, but not FAT), the file indices (such as $MFTs under NTFS, inodes under ext2/3, etc.) may not be written to the same exact locations. And if the partition size is increased, even FAT file systems will overwrite more data at the beginning of that new partition.
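
The difference between marking space as available and actually erasing it can be shown with a toy model. The `ToyVolume` class below is purely illustrative and does not correspond to any real file system: it keeps a flat list of blocks plus an index from file names to block numbers.

```python
class ToyVolume:
    """A toy volume: a flat block store plus a name -> block-list index."""

    def __init__(self, block_count: int, block_size: int = 16) -> None:
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(block_count)]
        self.index: dict[str, list[int]] = {}
        self.free = list(range(block_count))

    def write_file(self, name: str, data: bytes) -> None:
        used = []
        for start in range(0, len(data), self.block_size):
            block_no = self.free.pop(0)
            chunk = data[start:start + self.block_size]
            self.blocks[block_no] = chunk.ljust(self.block_size, b"\0")
            used.append(block_no)
        self.index[name] = used

    def quick_format(self) -> None:
        """Reset the index only; block contents are left untouched."""
        self.index.clear()
        self.free = list(range(len(self.blocks)))

    def full_format(self) -> None:
        """Reset the index and overwrite every block with zeros."""
        self.quick_format()
        self.blocks = [bytes(self.block_size) for _ in self.blocks]


vol = ToyVolume(block_count=4)
vol.write_file("note.txt", b"secret payload")
vol.quick_format()
print(vol.index)                                    # {}
print(b"secret payload" in b"".join(vol.blocks))    # True: data still present
vol.full_format()
print(b"secret payload" in b"".join(vol.blocks))    # False
```

After the quick format the volume looks empty through its index, yet a tool that reads the raw blocks still finds the old contents; only the overwriting format removes them.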

From the perspective of preventing the recovery of sensitive data through recovery tools, the data must be completely overwritten (every sector), either by a separate tool, or during formatting. Data are destroyed in DOS, OS/2, and Windows when the /L (long) option is used on format and always for a Partitioned Data Set (PDS) in MVS and for newer file systems on IBM mainframes.

It is disputed whether one pass of zero-fill is enough to destroy sensitive data on older (until the 1990s) magnetic storage: Gutmann (known for his 35-pass Gutmann method) claims that magnetic force microscopy may be able to "see" old bits on a floppy,[34] but the sources he cited do not prove this. Random fill is believed to be stronger than a fixed pattern fill.[35] One pass of zero fill is sufficient to prevent data remanence, according to NIST (2014) and Wright et al. (2008).[36][37] The Secure Erase option built into hard drives is considered trustworthy,[27][38] with the caveat that early solid state drives are known to mis-implement the function.[39]

Degaussing is effective without controversy; however, this may render the drive unusable.[27]

See also

Notes

  a. E.g., formatting a volume, formatting a Virtual Storage Access Method Linear Data Set (LDS) on the volume to contain a zFS and formatting the zFS in UNIX System Services.
  b. Not true for CMS file system[3] on a CMS minidisk, TSS VAM-formatted volume,[4] z/OS Unix file systems[5] or VSAM in IBM mainframes.
  c. E.g., AMASPZAP in MVS.
  d. "The LBAs on a logical unit shall begin with zero and shall be contiguous up to the last logical block on the logical unit." Information technology — Serial Attached SCSI - 2 (SAS-2), INCITS 457 Draft 2, May 8, 2009, chapter 4.1 Direct-access block device type model overview.
  e. Each process may involve multiple steps, and steps of different processes may be interleaved.
  f. The fact that 8-inch CP/M floppies came pre-formatted with a filler value of 0xE5 is the reason why the value of 0xE5 has a special meaning in directory entries in FAT12, FAT16 and FAT32 file systems. This allowed 86-DOS to use 8-inch floppies out of the box or with only the FAT initialized.
  g. This problem became common in PCs where users used RLL controllers with MFM drives; "MFM drives should not be used on RLL controllers."
  h. One utility providing an option to specify the desired fill value for hard disks is DR-DOS' FDISK R2.31 with its optional wipe parameter /W:246 (for a fill value of 0xF6). In contrast to other FDISK utilities, DR-DOS FDISK is not only a partitioning tool, but can also format freshly created partitions as FAT12, FAT16 or FAT32. This reduces the risk of accidentally formatting the wrong volume.

References

  1. ^ a b c d e Tanenbaum, Andrew (2001). Modern Operating Systems (2nd ed.). Prentice Hall. section 3.4.2, Disk Formatting. ISBN 0130313580.
  2. ^ "Disk Devices and Partitions". Microsoft Docs. 7 January 2021.
  3. ^ "FORMAT", z/VM CMS Commands and Utilities Reference, z/VM Version 5 Release 4, IBM, 2008, SC24-6073-03, When you do not specify either the RECOMP or LABEL option, the disk area is initialized by writing a device-dependent number of records (containing binary zeros) on each track. Any previous data on the disk is erased.
  4. ^ IBM, "Virtual Access Methods", IBM System/360 Time Sharing System System Logic Summary Program Logic Manual (PDF), IBM, p. 56 (PDF 66), GY28-2009-2, The direct access volumes, on which TSS/360 virtual organization data sets are stored, have fixed-length, page size data blocks. No key field is required. The record overflow feature is utilized to allow data blocks to span tracks, as required. The entire volume, with the current exception of part of the first cylinder, which is used for identification, is formatted into page size blocks.
  5. ^ "ioeagfmt" (PDF). z/OS 2.4 File System Administration (PDF). IBM. pp. 116–119. SC23-6887-40.
  6. ^ Hermans, Sherman (28 August 2006). "How to recover lost files after you accidentally wipe your hard drive". Linux.com. Retrieved 28 November 2019.
  7. ^ Smithson, Brian (29 August 2011). "The Urban Legend of Multipass Hard Disk Overwrite and DoD 5220-22-M". Infosec Island. Archived from the original on 5 October 2018. Retrieved 22 November 2012.
  8. ^ "IBM 1301 disk storage unit". IBM. 23 January 2003. Archived from the original on April 26, 2005. Retrieved 2010-06-24.
  9. ^ "IBM 3390 direct access storage device". IBM. 23 January 2003. Archived from the original on January 24, 2005.
  10. ^ ISO/IEC 791D:1994, AT Attachment Interface for Disk Drives (ATA-1), section 7.1.2
  11. ^ Smith, Ryan (2009-12-18). "Western Digital's Advanced Format: The 4K Sector Transition Begins". Anandtech.
  12. ^ "Transition to Advanced Format 4K Sector Hard Drives". Seagate Technology.
  13. ^ "Fdutils".
  14. ^ "Definition of Distribution Media Format (DMF)". Microsoft Knowledge Base. 2007-01-19. Archived from the original on 2011-09-14. Retrieved 2011-10-16.
  15. ^ a b Schulman, Andrew; Brown, Ralf D.; Maxey, David; Michels, Raymond J.; Kyle, Jim (1994) [November 1993]. Undocumented DOS: A programmer's guide to reserved MS-DOS functions and data structures - expanded to include MS-DOS 6, Novell DOS and Windows 3.1 (2 ed.). Addison Wesley. ISBN 0-201-63287-X. (xviii+856+vi pages, 3.5"-floppy) Errata: [1][2]
  16. ^ Using DEBUG to Start a Low-Level Format, Microsoft
  17. ^ "Low level formatting an IDE hard drive". FreePCTech.com. The NOSPIN Group, Inc. Archived from the original on July 16, 2012. Retrieved December 24, 2003.
  18. ^ "Low-Level Format, Zero-Fill and Diagnostic Utilities". The PC Guide. Site Version: 2.2.0 - Version Date: April 17, 2001. Archived from the original on January 3, 2019. Retrieved May 24, 2007.
  19. ^ sg_format(8) – Linux Programmer's Manual – Administration and Privileged Commands
  20. ^ Seagate SAS drives Archived 2010-11-29 at the Wayback Machine
  21. ^ "INCITS 506-202x - Information technology - SCSI Block Commands - 4 (SBC-4) draft revision 22". 15 September 2020. Retrieved 22 May 2023.
  22. ^ hdparm(8) – Linux Programmer's Manual – Administration and Privileged Commands
  23. ^ nvme-format(1) – Linux User Manual – User Commands
  24. ^ "Seagate Serial Talk | OS/2 Museum".
  25. ^ "F3 Serial Port Diagnostics". older version available from
  26. ^ "BadBlockHowto – smartmontools". www.smartmontools.org.
  27. ^ a b c "Secure Data Deletion". June 7, 2012. Retrieved 9 December 2013.
  28. ^ "How to Securely Erase (Wipe) a Hard Drive for Free with DD". myfixlog.com. Archived from the original on April 18, 2016.
  29. ^ Quickly fill a disk with random bits
  30. ^ "Device Support Facilities User's Guide and Reference" (PDF). Archived from the original (PDF) on 2021-12-09. Retrieved 2010-12-27.
  31. ^ "AXCEL216 / MDGx MS-DOS Undocumented + Hidden Secrets". Retrieved 2008-06-07.
  32. ^ "MSKB941961: Change in the behavior of the format command in Windows Vista". Microsoft Corporation. 2009-02-23. Retrieved 2012-10-24. The format command behavior has changed in Windows Vista. By default in Windows Vista, the format command writes zeros to the whole disk when a full format is performed. In Windows XP and in earlier versions of the Windows operating system, the format command does not write zeros to the whole disk when a full format is performed.
  33. ^ "mkfs(8) - Linux man page". Retrieved 2010-04-25.
  34. ^ Gutmann, Peter (July 22–25, 1996). Secure Deletion of Data from Magnetic and Solid-State Memory. University of Auckland Department of Computer Science. Epilogue section.
  35. ^ Daniel Feenberg (2003). "Can Intelligence Agencies Recover Overwritten Data?". Retrieved 2007-12-10.
  36. ^ Wright, Craig; Kleiman, Dave; Shyaam, Sundhar R.S. (December 2008). Overwriting Hard Drive Data: The Great Wiping Controversy. Information Systems Security ICISS 2008. Lecture Notes in Computer Science. Vol. 5352. Springer Berlin / Heidelberg. pp. 243–257. doi:10.1007/978-3-540-89862-7_21. ISBN 978-3-540-89861-0.
  37. ^ Kissel, Richard; Regenscheid, Andrew; Scholl, Matthew; Stine, Kevin (December 2014). Special Publication 800-88 Rev. 1: Guidelines for Media Sanitization (Technical report). NIST. doi:10.6028/NIST.SP.800-88r1. Retrieved 2018-06-26.
  38. ^ "ATA Secure Erase (SE) and hdparm". Created: 2011.02.21, updated: 2013.04.02.
  39. ^ Michael Wei; Laura M. Grupp; Frederick E. Spada; Steven Swanson (2011). "Reliably Erasing Data From Flash-Based Solid State Drives". FAST'11: Proceedings of the 9th USENIX conference on File and storage technologies. Wikidata Q115346857. Retrieved 2018-01-08.