OpenVZ

Developer(s)	Community project, supported by Parallels
Repository	src.openvz.org/scm/ovz/openvz-docs.git
Operating system	Linux
Platform	x86, x86-64, IA-64, PowerPC, SPARC
Type	OS-level virtualization
License	GNU GPL v.2
Website	http://openvz.org

OpenVZ (Open Virtuozzo) is an operating system-level virtualization technology based on the Linux kernel and operating system. OpenVZ allows a physical server to run multiple isolated operating system instances, known as containers, Virtual Private Servers (VPSs), or Virtual Environments (VEs). It is similar to FreeBSD Jails and Solaris Zones.

OpenVZ requires both the host and guest OS to be Linux (although different containers can run different Linux distributions). A reboot is also required if container processes hang on I/O. However, OpenVZ claims a performance advantage: according to its website, the performance penalty is only 1–3% compared with a standalone server.^[1] One independent performance evaluation^[2] confirms this; another shows more significant performance penalties^[3] depending on the metric used.

OpenVZ is the basis of Virtuozzo Containers, a proprietary software product provided by Parallels, Inc. OpenVZ is licensed under the GPL version 2 and is supported and sponsored by Parallels, although the company does not offer commercial end-user support for OpenVZ.

OpenVZ consists of a custom kernel and user-level tools.

OpenVZ compared to other virtualization technologies

OpenVZ is not true virtualization but containerization, like FreeBSD Jails. Technologies such as VMware and Xen are more flexible in that they virtualize the entire machine and can run multiple operating systems, at the expense of the greater overhead required to handle hardware virtualization. OpenVZ uses a single patched Linux kernel and can therefore run only Linux. However, because it does not have the overhead of a true hypervisor, it is very fast and efficient. The disadvantage of this approach is the single kernel: all guests must function with the same kernel version that the host uses.

The advantage, however, is that memory allocation is soft: memory not used in one virtual environment can be used by others or for disk caching. OpenVZ uses a common file system, so each virtual environment is just a directory of files that is isolated using chroot; newer versions of OpenVZ also allow a container to have its own file system.^[4] A container can therefore be cloned by copying the files in one directory to another, creating a config file for the new container, and starting it.
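The cloning procedure described above can be sketched in Python. This is a minimal illustration rather than an OpenVZ tool: the function name clone_container and the directory layout (files under <root>/private/<ID>, configuration in <root>/conf/<ID>.conf) are assumptions made for the example, and the sketch assumes the source container is stopped while its files are copied.

```python
import shutil
from pathlib import Path


def clone_container(root: Path, src_id: int, dst_id: int) -> Path:
    """Copy a container's file tree and config under a new container ID.

    Assumed (hypothetical) layout: <root>/private/<id> holds the files and
    <root>/conf/<id>.conf holds the configuration.
    """
    src_private = root / "private" / str(src_id)
    dst_private = root / "private" / str(dst_id)
    src_conf = root / "conf" / f"{src_id}.conf"
    dst_conf = root / "conf" / f"{dst_id}.conf"

    if dst_private.exists() or dst_conf.exists():
        raise FileExistsError(f"container {dst_id} already exists")

    # symlinks=True keeps symbolic links as links instead of following them.
    shutil.copytree(src_private, dst_private, symlinks=True)
    # The clone starts from the source's settings; per-container values such
    # as the IP address would still have to be edited afterwards.
    shutil.copyfile(src_conf, dst_conf)
    return dst_private
```

After a call such as clone_container(Path("/vz"), 101, 102), the new container would still need to be started with the usual tools; the paths in that call are illustrative only.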

Kernel

The OpenVZ kernel is a Linux kernel, modified to add support for OpenVZ containers. The modified kernel provides virtualization, isolation, resource management, and checkpointing.

Virtualization and isolation

Each container is a separate entity, and behaves largely as a physical server would. Each has its own:

Files: System libraries, applications, virtualized /proc and /sys, virtualized locks, etc.

Users and groups: Each container has its own root user, as well as other users and groups.

Process tree: A container only sees its own processes (starting from init). PIDs are virtualized, so that the init PID is 1 as it should be.

Network: Virtual network device, which allows a container to have its own IP addresses, as well as its own set of netfilter (iptables) and routing rules.

Devices: If needed, any container can be granted access to real devices like network interfaces, serial ports, disk partitions, etc.

IPC objects: Shared memory, semaphores, messages.

Resource management

OpenVZ resource management consists of four components: two-level disk quota, fair CPU scheduler, disk I/O scheduler, and user beancounters. These resources can be changed during container run time, eliminating the need to reboot.

Two-level disk quota

Each container can have its own disk quotas, measured in terms of disk blocks and inodes (roughly number of files). Within the container, it is possible to use standard tools to set UNIX per-user and per-group disk quotas.

CPU scheduler

The CPU scheduler in OpenVZ is a two-level implementation of fair-share scheduling strategy.

On the first level, the scheduler decides which container it is to give the CPU time slice to, based on per-container cpuunits values. On the second level the standard Linux scheduler decides which process to run in that container, using standard Linux process priorities.

It is possible to set a different cpuunits value for each container. Real CPU time will be distributed proportionally to these values.
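The proportional distribution can be illustrated with a few lines of Python. This is only a sketch of the arithmetic, assuming that every container is competing for the CPU at the same time; the function name cpu_shares and the container names and cpuunits values are made up for the example.

```python
def cpu_shares(cpuunits: dict[str, int]) -> dict[str, float]:
    """Return each container's fraction of CPU time under full contention."""
    total = sum(cpuunits.values())
    if total <= 0:
        raise ValueError("at least one container needs a positive cpuunits value")
    # Each container's share is its own weight divided by the sum of weights.
    return {ct: units / total for ct, units in cpuunits.items()}


shares = cpu_shares({"ct101": 1000, "ct102": 2000, "ct103": 1000})
for ct, share in shares.items():
    print(f"{ct}: {share:.0%} of CPU time")
```

With these weights, ct102 receives half of the CPU time and the other two containers a quarter each. Only the ratios between the values matter, not their absolute size.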

Strict limits, such as 10% of total CPU time, are also possible.

I/O scheduler

Similar to the CPU scheduler described above, the I/O scheduler in OpenVZ is also two-level, utilizing Jens Axboe's CFQ I/O scheduler on its second level.

Each container is assigned an I/O priority, and the scheduler distributes the available I/O bandwidth according to the priorities assigned. Thus no single container can saturate an I/O channel.

User Beancounters

User Beancounters is a set of per-container counters, limits, and guarantees. About 20 parameters are meant to control all aspects of container operation and to prevent a single container from monopolizing system resources.

These resources primarily consist of memory and various in-kernel objects such as IPC shared memory segments, and network buffers. Each resource can be seen from /proc/user_beancounters and has five values associated with it: current usage, maximum usage (for the lifetime of a container), barrier, limit, and fail counter. The meaning of barrier and limit is parameter-dependent; in short, those can be thought of as a soft limit and a hard limit. If any resource hits the limit, the fail counter for it is increased. This allows the owner to detect problems by monitoring /proc/user_beancounters in the container.
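Monitoring of this kind can be sketched as a small Python parser. The names Beancounter, parse_beancounters and failing are hypothetical, and the SAMPLE text is invented for illustration rather than captured from a real system. The sketch assumes the view from inside a single container, where each resource appears once; the five fields follow the order given above (held, maxheld, barrier, limit, failcnt).

```python
from typing import NamedTuple


class Beancounter(NamedTuple):
    held: int      # current usage
    maxheld: int   # maximum usage over the container's lifetime
    barrier: int   # soft limit
    limit: int     # hard limit
    failcnt: int   # fail counter


def parse_beancounters(text: str) -> dict[str, Beancounter]:
    """Parse beancounter lines into {resource: Beancounter}.

    Lines that do not end in five integers (version line, column header)
    are skipped. An optional leading "<id>:" column is ignored.
    """
    result = {}
    for line in text.splitlines():
        tokens = line.split()
        if len(tokens) < 6:
            continue
        try:
            values = [int(tok) for tok in tokens[-5:]]
        except ValueError:
            continue
        # The resource name is the token just before the five numbers.
        result[tokens[-6]] = Beancounter(*values)
    return result


def failing(counters: dict[str, Beancounter]) -> list[str]:
    """Return the names of resources whose fail counter is non-zero."""
    return sorted(name for name, bc in counters.items() if bc.failcnt > 0)


# Invented sample in the general shape of the file; values are illustrative.
SAMPLE = """\
Version: 2.5
       uid  resource      held  maxheld  barrier  limit  failcnt
      101:  privvmpages   5000     6200    65536  69632        0
            numfile        310      440      400    440        3
            numpty           1        2       16     16        0
"""

print(failing(parse_beancounters(SAMPLE)))
```

On a real system the text would come from reading /proc/user_beancounters instead of SAMPLE. A periodic check that compares fail counters between two readings would show which limit a container is currently hitting.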

User Beancounters parameters
Parameter	Meaning
lockedpages	The memory not allowed to be swapped out (locked with the mlock() system call), in pages.
shmpages	The total size of shared memory (including IPC, shared anonymous mappings and tmpfs objects) allocated by the processes of a particular VPS, in pages.
privvmpages	The size of private (or potentially private) memory allocated by an application. The memory that is always shared among different applications is not included in this resource parameter.
numfile	The number of files opened by all VPS processes.
numflock	The number of file locks created by all VPS processes.
numpty	The number of pseudo-terminals, such as an ssh session, the screen or xterm applications, etc.
numsiginfo	The number of siginfo structures (essentially, this parameter limits the size of the signal delivery queue).
dcachesize	The total size of dentry and inode structures locked in the memory.
physpages	The total size of RAM used by the VPS processes. This is an accounting-only parameter currently. It shows the usage of RAM by the VPS. For the memory pages used by several different VPSs (mappings of shared libraries, for example), only the corresponding fraction of a page is charged to each VPS. The sum of the physpages usage for all VPSs corresponds to the total number of pages used in the system by all the accounted users.
numiptent	The number of IP packet filtering entries.

Checkpointing and live migration

A live migration and checkpointing feature was released for OpenVZ in the middle of April 2006. This makes it possible to move a container from one physical server to another without shutting down the container. The process is known as checkpointing: a container is frozen and its whole state is saved to a file on disk. This file can then be transferred to another machine and a container can be unfrozen (restored) there; the delay is roughly a few seconds. Because state is usually preserved completely, this pause may appear to be an ordinary computational delay.
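The freeze, transfer, and restore sequence can be sketched as a list of commands built in Python. This is a hedged outline, not a migration tool: it assumes the chkpnt and restore subcommands of vzctl with their --dumpfile option, root SSH access to the destination, and that the container's files and configuration are already present on the destination server. The function name migration_plan, the host name, and the dump file path are made up for the example, and nothing is executed.

```python
import shlex


def migration_plan(ctid: int, dest_host: str, dumpfile: str) -> list[list[str]]:
    """Return the commands for a checkpoint, transfer, restore sequence.

    Nothing is executed here; the caller decides how to run each step.
    """
    return [
        # 1. Freeze the container and write its state to a dump file.
        ["vzctl", "chkpnt", str(ctid), "--dumpfile", dumpfile],
        # 2. Copy the dump file to the destination server.
        ["scp", dumpfile, f"root@{dest_host}:{dumpfile}"],
        # 3. Restore (unfreeze) the container from the dump on the destination.
        ["ssh", f"root@{dest_host}",
         shlex.join(["vzctl", "restore", str(ctid), "--dumpfile", dumpfile])],
    ]


for step in migration_plan(101, "node2.example.com", "/tmp/ct101.dump"):
    print(shlex.join(step))
```

Returning the plan instead of running it keeps the sketch safe to try on any machine; the pause seen by users corresponds to the time between step 1 and the end of step 3.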

OpenVZ distinct features

Scalability

As OpenVZ employs a single kernel model, it is as scalable as the Linux kernel; that is, it supports up to 4096 CPUs and up to 64 GiB of RAM on 32-bit with PAE. Note that 64-bit kernels are strongly recommended for production. A single container can scale up to the whole physical system, i.e. use all the CPUs and all the RAM.

Performance

The virtualization overhead observed in OpenVZ is minimal, so more computing power is available for each container.^[2]

Density

By decreasing the overhead required for each container, it is possible to serve more containers from a given physical server, so long as the computational demands do not exceed the physical availability.

Mass-management

An administrator (i.e. root) of an OpenVZ physical server (also known as a hardware node or host system) can see all the running processes and files of all the containers on the system, and this has convenience implications. Some fixes (such as a kernel update) will affect all containers automatically, while other changes can simply be "pushed" to all the containers by a simple shell script.

Compare this with managing a VMware- or Xen-based virtualized environment: to apply a security update to 10 virtual servers, one needs either a more elaborate pull system (on all the virtual servers) for such updates, or an administrator who logs in to each virtual server and applies the update. This makes OpenVZ more convenient in cases where a pull system has not been or cannot be implemented.
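The "push" approach mentioned above is usually a short shell loop; the same idea is sketched here in Python. It assumes the vzlist and vzctl exec commands are available on the hardware node. The names run, push_to_all and Runner are hypothetical, and the runner parameter exists only so the logic can be exercised without an OpenVZ host.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], str]


def run(cmd: Sequence[str]) -> str:
    """Run a command on the hardware node and return its standard output."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


def push_to_all(command: Sequence[str], runner: Runner = run) -> dict[str, str]:
    """Run `command` inside every running container; return output per CTID."""
    # vzlist prints running containers by default; -H drops the header row
    # and -o ctid limits the output to the container ID column.
    ctids = runner(["vzlist", "-H", "-o", "ctid"]).split()
    # vzctl exec runs the given command inside the named container.
    return {ctid: runner(["vzctl", "exec", ctid, *command]) for ctid in ctids}
```

A call such as push_to_all(["uptime"]) would run the command in each container in turn. Because check=True raises on the first failure, a production script would probably want to catch errors per container instead of stopping.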

Similar technologies

Other implementations of operating system-level virtualization technology

Limitations

OpenVZ restricts access to /dev devices to a small subset. A container may be affected by lacking access to devices that are used not to reach physical hardware but to add or configure kernel-level features.

/dev/loopN is often restricted in deployments, as it relies on a limited pool of kernel threads. Its absence restricts the ability to mount disk images. Some workarounds exist using FUSE.

OpenVZ supports only some VPN technologies, namely those based on PPP (such as PPTP/L2TP) and TUN/TAP. IPsec is not supported inside containers, including L2TP secured with IPsec.

Full virtualization solutions are free of these limitations.

References

[1] Official OpenVZ web site, http://openvz.org/
[2] HPL-2007-59 technical report, http://www.hpl.hp.com/techreports/2007/HPL-2007-59R1.html?jumpid=reg_R1002_USEN
[3] (Ottawa) Linux Symposium Proceedings, Volume I, July 2008, http://www.linuxsymposium.org/2008/ols-2008-Proceedings-V1.pdf
[4] Ploop, OpenVZ wiki, http://wiki.openvz.org/Ploop

External links

OpenVZ official web site
