inode
In computing, an inode is a data structure on a traditional Unix-style file system such as UFS. An inode stores basic information about a regular file, directory, or other file system object.
Details
When a file system is created, data structures that contain information about files are created. Each file has an inode and is identified by an inode number (often referred to as an "i-number" or "inode") in the file system where it resides.
Inodes store information on files such as user and group ownership, access mode (read, write, execute permissions) and type of file. There is a fixed number of inodes, which indicates the maximum number of files each file system can hold. Typically when a file system is created about 1% of it is devoted to inodes.
The term inode usually refers to inodes on block devices that manage regular files, directories, and possibly symbolic links. The concept is particularly important to the recovery of damaged file systems.
- The inode number indexes a table of inodes in a known location on the device; from the inode number, the kernel can access the contents of the inode, including the data pointers, and so the contents of the file.
- A file's inode number can be found using the ls -i command, while the ls -l command will retrieve inode information.
- Non-traditional Unix-style filesystems such as ReiserFS may avoid having a table of inodes, but must store equivalent data in order to provide equivalent function. The data may be called stat data, in reference to the stat system call that provides the data to programs.
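The table lookup described above can be illustrated with a little arithmetic. The following is only a sketch and assumes an ext2-style layout, in which inode numbers start at 1 and the inode table is split evenly across block groups. The function name and the parameter values used in the example are illustrative and do not describe any particular file system.

```python
def locate_inode(ino, inodes_per_group, inode_size):
    """Map an inode number to (block group, index in group, byte offset in table).

    Assumes an ext2-style layout: inode numbers start at 1 and each block
    group holds its own fixed-size slice of the inode table.
    """
    if ino < 1:
        raise ValueError("inode numbers start at 1 in this layout")
    group, index = divmod(ino - 1, inodes_per_group)
    return group, index, index * inode_size


if __name__ == "__main__":
    # Illustrative parameters: 8192 inodes per group, 256-byte inodes.
    print(locate_inode(12, 8192, 256))    # (0, 11, 2816)
    print(locate_inode(8193, 8192, 256))  # (1, 0, 0)
```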
File names and directory implications
- Inodes do not contain filenames, only other file metadata.
- Unix directories are lists of "link" structures, each of which contains one filename and one inode number.
- The kernel must search a directory looking for a particular filename and then convert the filename to the correct corresponding inode number if the name is found.
The kernel's in-memory representation of this data is called struct inode in Linux. Systems derived from BSD use the term vnode, with the v of vnode referring to the kernel's virtual file system layer.
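The relationship between filenames and inode numbers can be observed from user space. The sketch below assumes a Unix-like system and a file system that supports hard links. The helper name list_links and the file names are made up for the illustration.

```python
import os
import tempfile


def list_links(directory):
    """Return sorted (filename, inode number) pairs for the entries in directory."""
    with os.scandir(directory) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        original = os.path.join(d, "original.txt")
        alias = os.path.join(d, "alias.txt")
        with open(original, "w") as f:
            f.write("data")
        os.link(original, alias)  # a second directory entry for the same inode
        for name, ino in list_links(d):
            print(name, ino)
        print("link count:", os.stat(original).st_nlink)
```

Both names should be listed with the same inode number, and the link count rises to 2 because two directory entries now refer to that inode.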
POSIX inode description
The POSIX standard mandates filesystem behavior that is strongly influenced by traditional UNIX filesystems. Regular files are required to have the following attributes:
- The length of the file in bytes.
- Device ID (this identifies the device containing the file).
- The User ID of the file's owner.
- The Group ID of the file.
- The file mode, which determines what users can read, write, and execute the file.
- Timestamps telling when the inode itself was last changed (ctime, change time), the file content last modified (mtime, modification time), and last accessed (atime, access time).
- A reference count telling how many hard links point to the inode.
- Pointers to the disk blocks that store the file's contents.
The stat system call retrieves a file's inode number and some of the information in the inode.
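As a minimal sketch, the attributes listed above can be read through Python's os.stat, which wraps the stat system call on Unix-like systems. The function name describe_inode and the dictionary keys are chosen for this example only. The block pointers are not exposed through this interface.

```python
import os
import stat
import tempfile


def describe_inode(path):
    """Return the inode attributes that os.stat() exposes for path."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                 # inode number
        "device": st.st_dev,                # device containing the file
        "size": st.st_size,                 # length of the file in bytes
        "uid": st.st_uid,                   # user ID of the owner
        "gid": st.st_gid,                   # group ID of the file
        "mode": stat.filemode(st.st_mode),  # file type and permission bits
        "links": st.st_nlink,               # number of hard links
        "atime": st.st_atime,               # last access
        "mtime": st.st_mtime,               # last content modification
        "ctime": st.st_ctime,               # last inode change (on Unix)
    }


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as f:
        f.write(b"hello")
        f.flush()
        for key, value in describe_inode(f.name).items():
            print(f"{key:>6}: {value}")
```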
Word origin
The exact reason for designating these as "i" nodes is uncertain. When asked, Unix pioneer Dennis Ritchie replied:[citation needed]
In truth, I don't know either. It was just a term that we started to use. "Index" is my best guess, because of the slightly unusual file system structure that stored the access information of files as a flat array on the disk, with all the hierarchical directory information living aside from this. Thus the i-number is an index in this array, the i-node is the selected element of the array. (The "i-" notation was used in the 1st edition manual; its hyphen became gradually dropped).
Variations in inode file systems
There are some important variations in inode file systems in current use:
Defragmentation
- On most systems, an inode file system would have to be offline to be fully defragmented, although some online defragmentation tools exist.
- When defragmented, inode file systems can have higher data extraction rates than FAT32 or NTFS under optimal conditions.
Hashing related issues
- Basing the file hash on the inode number is somewhat restrictive, as it assumes that every file system can uniquely identify a file in 32 bits. This is a problem at least for the NFS file system, which would prefer to use the 256-bit file handle as the unique identifier in the hash.
- i_generation (inode data structure field) : The intent of i_generation is to be able to distinguish between an inode before and after a delete/reuse cycle. This is important for NFS. Currently, only ext2 and nfsd maintain this field.
- i_generation (inode data structure, VFS interoperability) : It is not clear if i_generation could be exported to the VFS layer at all. It has a specific use that cannot be readily transferred to other file systems.
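The purpose of a generation counter can be shown with a toy model. This is not the kernel's or nfsd's implementation. The class InodeTable and its methods are hypothetical and only show how an (inode number, generation) pair lets a stale handle be told apart from a handle to the file that later reused the same inode number.

```python
class InodeTable:
    """Toy model of inode reuse with a per-inode generation counter."""

    def __init__(self):
        self.generation = {}  # inode number -> generation of latest allocation
        self.live = set()     # inode numbers currently in use

    def allocate(self, ino):
        """Mark ino as in use and return a handle (ino, generation)."""
        self.generation[ino] = self.generation.get(ino, 0) + 1
        self.live.add(ino)
        return (ino, self.generation[ino])

    def free(self, ino):
        """Release ino so that it can be reused by a later file."""
        self.live.discard(ino)

    def is_current(self, handle):
        """True only if the handle refers to the file now occupying the inode."""
        ino, gen = handle
        return ino in self.live and self.generation.get(ino) == gen


if __name__ == "__main__":
    table = InodeTable()
    old = table.allocate(12)
    table.free(12)             # the file is deleted
    new = table.allocate(12)   # inode 12 is reused for another file
    print(table.is_current(old))  # False
    print(table.is_current(new))  # True
```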
Swap files
- Swap files in Linux typically cannot be located in files whose inodes are doubly indirect, or in some cases singly indirect. [citation needed]
- The swap file and sleep mode swap file don't need to be any bigger than 10 GiB, as a matter of practice. [citation needed]
Practical considerations
Many programs used by system administrators on UNIX operating systems designate a file by its inode number. The disk integrity checking utility fsck and the pfiles command are examples. Thus, the need naturally arises to translate inode numbers to file pathnames and vice versa. This can be accomplished using the file-finding utility find with the option -inum, or the ls command with the appropriate option, which on many platforms is -i.
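For illustration, the translation from an inode number back to pathnames can be sketched by walking a directory tree, which is roughly what find does with -inum. The function name find_by_inode is an assumption of this example. Because inode numbers are only unique within one file system, entries on a different device than the starting directory are ignored. The starting directory itself is not tested.

```python
import os
import tempfile


def find_by_inode(root, inode_number):
    """Return every path below root, on the same device, with the given inode number."""
    root_dev = os.lstat(root).st_dev
    matches = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # do not follow symbolic links
            except OSError:
                continue  # the entry vanished or cannot be examined
            if st.st_dev == root_dev and st.st_ino == inode_number:
                matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        first = os.path.join(d, "first.txt")
        second = os.path.join(d, "second.txt")
        open(first, "w").close()
        os.link(first, second)  # both names now share one inode
        for path in find_by_inode(d, os.stat(first).st_ino):
            print(path)
```

A file with several hard links yields several pathnames, which is why the result is a list rather than a single path.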
It is possible to "run out" of inodes. When this happens, no new files can be created on the device, even though there may be free space available.
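Inode consumption can be checked before it becomes a problem. The following sketch assumes a Unix-like system and uses os.statvfs. The function name inode_usage is made up for this example. Some file systems allocate inodes dynamically and may report a total of zero, which the example treats as "no fixed limit reported".

```python
import os


def inode_usage(path="/"):
    """Return (total, free) inode counts for the file system containing path."""
    vfs = os.statvfs(path)
    return vfs.f_files, vfs.f_ffree


if __name__ == "__main__":
    total, free = inode_usage("/")
    if total:
        used = total - free
        print(f"{used} of {total} inodes in use ({100 * used / total:.1f}%)")
    else:
        print("this file system does not report a fixed inode count")
```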
Y2038 problem
Some inode file systems are safe from the Y2038 problem (the overflow of Unix time stored as a signed 32-bit number), but not all inode file systems in use are protected from it. When setting up a server, it will become more important over time to avoid file systems that are not Y2038 safe. POSIX in its latest revision supports system time and date calls that are Y2038 safe.
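The overflow can be worked through directly. The sketch below assumes timestamps stored as a signed 32-bit count of seconds since 1970-01-01 UTC, which is the case that gives rise to the problem. The helper names are chosen for this example.

```python
from datetime import datetime, timedelta, timezone

INT32_MAX = 2**31 - 1
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_utc(seconds):
    """Convert a count of seconds since the Unix epoch to a UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


def wrap_int32(value):
    """Interpret value as a two's-complement signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


if __name__ == "__main__":
    print(to_utc(INT32_MAX).isoformat())  # 2038-01-19T03:14:07+00:00
    wrapped = wrap_int32(INT32_MAX + 1)   # one second later, stored in 32 bits
    print(wrapped)                        # -2147483648
    print(to_utc(wrapped).isoformat())    # 1901-12-13T20:45:52+00:00
```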