
initial ramdisk

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Scruffy brit (talk | contribs) at 09:50, 3 August 2008 (Re-wrote the rationale section, splitted out a methodoolgy section, and removed some material that conflicts with kernel behaviour and docs.). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

The initial ramdisk, or initrd, is a temporary file system commonly used by the Linux kernel during boot[1]. The initrd is typically used to make preparations before the real root file system can be mounted.

Rationale

The complete Linux driver set, covering every interface and file system that might be required, runs to many megabytes of object code. It is undesirable to waste memory on drivers for hardware that the system does not have, so many drivers are provided as modules that are loaded only when they are needed, rather than being compiled monolithically into the kernel. This saves memory, because drivers that are compiled "monolithically" into the kernel cannot be unloaded.

A system can only boot if it has drivers for the boot hardware. In principle this could include any extant network or storage device, yet if every driver were pre-loaded just "in case" it was needed, the kernel would become infeasibly large.

This problem is solved by storing the modules in a ramdisk, which is loaded into memory alongside the kernel during the operating system boot process. As all the modules are available in memory, they can be loaded into the kernel without resorting to other drivers. Once the kernel has loaded all the drivers it needs for the available hardware, the ramdisk can be released and the "real" root file system mounted. Hence initrd, the INITial RamDisk, was created.

Methodology

An initrd comprises a small yet valid root file system: it contains an executable "init" or "linuxrc" program and all the libraries and interpreters that program requires. The programs contained in the initrd run in userspace; they direct the kernel to load the appropriate modules and, once the system is properly prepared, mount the main file system.

The construction of an initrd image is described in the kernel documentation. Essentially, the files that will be required at boot time are assembled into a file-system tree. This tree can then be archived into a single file using cpio, or turned into a file-system image by use of either a loopback device or a small mountable device. The archive (or image file) can be compressed with gzip to save further space.

At boot time, the boot loader loads the kernel and the ramdisk into memory. The boot loader then passes the memory location of the ramdisk, and control of the system, to the kernel.

In principle, an initrd can be part of any Linux system that either requires boot-time configuration or can function from a read-only file system. To support its use on PCs, both the LILO and GRUB boot loaders provide syntax and functionality for loading initrd images.

The initrd system has attracted some criticism for employing "kludges":

  • The "loopback" method uses the kernel file system drivers to manipulate what is a file, rather than a hardware device. This is unpopular with many administrators because it requires root privileges. As it is theoretically possible to create a filesystem image purely from userspace, this breaks the doctrine of "no unnecessary root activity".
  • The file system used on the ramdisk has to be compiled directly into the kernel. If the same file system is not used by the rest of the system, this can commit a considerable amount of code to the kernel.

Both of these points have been addressed by the recent introduction of cpio-based ramdisks, which can be assembled from userspace and have simpler drivers.

Initrd has been in use for a long time[1], although the more recently developed initramfs, introduced in kernel 2.6, makes significant improvements. [2]

Initramfs in comparison with initrd

initramfs is an alternative, simpler method of having files available at boot time without having them in a persistent mountable filesystem.

With initrd, the kernel creates a memory-backed block device, loads it with data from a file, then mounts the device to create a filesystem image. With initramfs, the kernel creates a filesystem image directly from a file, without involving any block device. [2]

With initrd, the file contains a filesystem in on-disk format, while with initramfs, the file contains a compressed cpio archive. This makes it more convenient for a system builder to create and modify the contents of the filesystem with initramfs: no mounting, unmounting, or loop device is involved, and no privileged system operation is needed at all.

With initramfs, the filesystem is typically a tmpfs, shmfs, or ramfs filesystem. These filesystem types did not exist when initrd became popular, and that history is usually the reason that initrd is used instead of initramfs.

The boot-time operation of initrd is slightly simpler than that of initramfs. With initrd, the boot loader copies the filesystem directly from the file to the memory that backs a ramdisk, then the kernel need only define a ramdisk over that same memory and do a standard filesystem mount. With initramfs, there are equivalent steps to get the cpio archive into memory and create a filesystem image, plus an additional procedure to unpack the cpio archive into the new filesystem image. The kernel must know the cpio archive format. Some of the computation that happens at build time with initrd happens at boot time with initramfs.

Uses

As well as loading the code needed before booting from a fixed disk, a ramdisk (either initrd or initramfs) may be useful as a rescue disk for applying security updates, backing up files, conducting forensics or debugging hardware problems. It may also obviate a hard disk altogether, for example to provide network-stored OS images or to boot from a slow medium such as a CD-ROM.

Many Linux distributions ship a single, generic kernel image that is intended to boot on as wide a variety of hardware as possible. The drivers included with this generic kernel image must be modular, as it is not possible to statically compile everything into the one kernel without making it too large to boot on computers with limited memory or from lower-capacity media like floppy disks.

This then raises the problem of detecting and loading the modules necessary to mount the root file system at boot time (or, for that matter, deducing where or what the root file system is).

To further complicate matters, the root file system may be on a software RAID volume, LVM, a network file system of some sort (NFS is common on diskless computers) or on an encrypted partition. All of these require special preparations to mount.

End-user implementation

The kernel image and initrd image must both be stored somewhere accessible by the boot firmware of the computer or the Linux bootloader. On PCs, this is usually:

  • The root file system itself
  • A small ext2 or FAT-formatted partition on a local disk (a boot partition)
  • A filesystem on CD in case of Live CDs
  • A TFTP server (on systems that can boot from Ethernet)

The bootloader will load the kernel and initrd image into memory and then start the kernel, passing in the memory address of the initrd. At the end of its boot sequence, the kernel tries to determine the format of the image from its first few blocks of data:

  • If the image is an (optionally gzip-compressed) file system image, then it will be made available as a special block device (/dev/ram), which is then mounted as the initial root file system. The driver for that file system must be compiled statically into the kernel. Many distributions originally used compressed ext2 file systems as initrd images. Others (including Debian 3.1) used cramfs in order to boot on memory-limited systems, since the cramfs image can be mounted in-place without requiring extra space for decompression.
    Once the initial root file system is up, the kernel executes "/linuxrc" (linux run command) as its first process. When it exits, the kernel assumes that "/linuxrc" has mounted the real root file system and executes "/sbin/init" to begin the normal user-space boot process.
  • If the image is a gzip-compressed cpio archive, it will be unpacked by the kernel in an intermediate step into a tmpfs, which then becomes the initial root file system. This scheme has been dubbed initramfs and is available on Linux 2.6.13 onwards. It has the advantage of not requiring an intermediate file system to be compiled into the kernel.
    On an initramfs, the kernel executes "/init" as its first process. "/init" is not expected to exit.
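The idea of recognising an image from its first few blocks can be sketched as a check of well-known magic numbers. The function `classify` is an invented name for this illustration and does not mirror the kernel's actual detection code. It assumes the gzip magic bytes 1f 8b, the cpio "newc" magic "070701", the cramfs magic 0x28CD3D45 at offset 0 (little-endian here), and the ext2 superblock magic 0xEF53 at byte offset 1080.

```python
import gzip
import struct

GZIP_MAGIC = b"\x1f\x8b"
CPIO_NEWC_MAGIC = b"070701"
CRAMFS_MAGIC = 0x28CD3D45   # 32-bit value at the start of the image
EXT2_MAGIC = 0xEF53         # 16-bit value at offset 1024 + 56
EXT2_MAGIC_OFFSET = 1080


def classify(image: bytes) -> str:
    """Guess the image type from magic numbers near the start of the data."""
    if image[:2] == GZIP_MAGIC:
        # Decompress and look again; fine for a sketch, wasteful for big images.
        return "gzip+" + classify(gzip.decompress(image))
    if image[:6] == CPIO_NEWC_MAGIC:
        return "cpio archive (initramfs)"
    if len(image) >= 4 and struct.unpack_from("<I", image, 0)[0] == CRAMFS_MAGIC:
        return "cramfs image (initrd)"
    if (len(image) >= EXT2_MAGIC_OFFSET + 2
            and struct.unpack_from("<H", image, EXT2_MAGIC_OFFSET)[0] == EXT2_MAGIC):
        return "ext2 image (initrd)"
    return "unknown"


# Synthetic inputs that carry only the magic numbers, not real file systems.
fake_ext2 = bytes(EXT2_MAGIC_OFFSET) + struct.pack("<H", EXT2_MAGIC)
fake_cpio = gzip.compress(CPIO_NEWC_MAGIC + b"0" * 104)
```

The result of the check decides which of the two paths above is taken: a block device with "/linuxrc", or an unpacked archive with "/init".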

Some Linux distributions will generate a customized initrd image which contains only whatever is necessary to boot that particular computer, such as ATA, SCSI and filesystem kernel modules. These typically embed the location and type of the root file system.

Other distributions (such as Fedora and Ubuntu) generate a more generic initrd image. These start only with the device name of the root file system (or its UUID) and must discover everything else at boot time. In this case, a complex cascade of tasks must be performed to get the root file system mounted:

  • Any hardware drivers that the boot process depends on must be loaded. A common arrangement is to pack kernel modules for common storage devices onto the initrd and then invoke a hotplug agent to pull in modules matching the computer's detected hardware.
  • If the root file system is on NFS:
    • Bring up the primary network interface.
    • Invoke a DHCP client, with which it can obtain a DHCP lease.
    • Extract the address of the NFS server from the lease.
    • Mount the NFS share.
  • If the root file system appears to be on a software RAID device, there is no way of knowing which devices the RAID volume spans; the standard MD utilities must be invoked to scan all available block devices and bring the required one online.
  • If the root file system appears to be on a logical volume, the LVM utilities must be invoked to scan for and activate the volume group containing it.
  • If the root file system is on an encrypted block device:
    • Invoke a helper script to prompt the user to type in a passphrase and/or insert a hardware token (such as a smart card or a USB security dongle).
    • Create a decryption target with the device mapper.
  • Perform any maintenance tasks which cannot otherwise be safely done on a mounted root file system.
  • Mount the root file system read-only.
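The cascade above is essentially a sequence of conditional steps, which can be sketched as a function that turns a description of the root device into an ordered task list. The `RootSpec` class, the `plan_root_mount` function, the step wording and the device path are all hypothetical. A real initrd would run the corresponding tools rather than return strings.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RootSpec:
    """Hypothetical description of where the real root file system lives."""
    device: str
    on_nfs: bool = False
    on_raid: bool = False
    on_lvm: bool = False
    encrypted: bool = False


def plan_root_mount(root: RootSpec) -> List[str]:
    """Return the ordered tasks a generic initrd would run for this root."""
    steps = ["load drivers for detected hardware"]
    if root.on_nfs:
        steps += [
            "bring up primary network interface",
            "obtain DHCP lease",
            "extract NFS server address from lease",
            "mount NFS share",
        ]
    if root.on_raid:
        steps.append("scan block devices and assemble RAID volume")
    if root.on_lvm:
        steps.append("scan for and activate volume group")
    if root.encrypted:
        steps += [
            "prompt for passphrase or hardware token",
            "create decryption target with device mapper",
        ]
    steps.append("run maintenance tasks on unmounted root")
    steps.append(f"mount {root.device} read-only")
    return steps


# Example: a logical volume on top of software RAID (illustrative path).
plan = plan_root_mount(RootSpec("/dev/vg0/root", on_raid=True, on_lvm=True))
```

Keeping the order fixed reflects the layering of the storage stack: a device must be assembled, activated or decrypted before anything stacked on top of it can be found.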

The final root file system cannot simply be mounted over "/", since that would make the scripts and tools on the initial root file system inaccessible for any final cleanup tasks. Instead, it is mounted at a temporary mount point and rotated into place with pivot_root(8) (which was introduced specifically for this purpose). This leaves the initial root file system at a mount point (such as "/initrd") where normal boot scripts can later unmount it to free up memory held by the initrd.

When using an initramfs image, it is not possible to use pivot_root. Instead, "switch_root" must be used.

Most initial root file systems implement "/linuxrc" or "/init" as a shell script and thus include a minimal shell (usually /bin/ash) along with some essential user-space utilities (usually the BusyBox toolkit). To further save space, the shell, utilities and their supporting libraries are typically compiled with space optimizations enabled (such as with gcc's "-Os" flag) and linked against a stripped-down version of the C library such as dietlibc or klibc.

Some distributions (notably, SUSE Linux and Ubuntu) further use the initrd to paint a bootsplash animation onto the display early on in the boot process.

Notes

References

  • Landley, Rob (15 March 2005), Introducing initramfs, a new model for initial RAM disks, linuxdevices.com