Io uring: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
Fmf2009 (talk | contribs)
“Facebook” is a now-defunct name for the Meta corporation
Tags: Mobile edit Mobile app edit iOS app edit
add full name with reference to primary source
 
(28 intermediate revisions by 12 users not shown)
Line 1: Line 1:
{{Short description|Linux kernel interface for storage devices}}
{{Short description|Linux kernel interface for storage devices}}
{{DISPLAYTITLE:io_uring}}
{{DISPLAYTITLE:io_uring}}
'''io_uring''' (previously known as '''aioring''') is a [[Linux kernel]] [[system call]] interface for storage device [[asynchronous I/O]] operations addressing performance issues with similar interfaces provided by functions like {{Code|read()}}/{{Code|write()}} or {{Code|aio_read()}}/{{Code|aio_write()}} etc. for [[file operation|operations]] on data accessed by [[file descriptor]]s.<ref name="Axboe2019" /><ref>{{Cite web|last=Axboe|first=Jens|date=October 15, 2019|title=Efficient IO with io_uring|url=https://kernel.dk/io_uring.pdf|url-status=live}}</ref>{{Rp|page=2}}
'''io_uring'''{{efn|Input/output user ring<ref>{{cite web|url=https://fosstodon.org/@axboe/113541516336804778|first1=Jens|last1=Axboe|author-link=Jens Axboe|title=@axboe@fosstodon.org}}</ref>}} (previously known as '''aioring''') is a [[Linux kernel]] [[system call]] interface for storage device [[asynchronous I/O]] operations addressing performance issues with similar interfaces provided by functions like {{Code|read()}}/{{Code|write()}} or {{Code|aio_read()}}/{{Code|aio_write()}} etc. for [[file operation|operations]] on data accessed by [[file descriptor]]s.{{R|name=Phoronix2019}}<ref>{{Cite web|last=Axboe|first=Jens|date=October 15, 2019|title=Efficient IO with io_uring|url=https://kernel.dk/io_uring.pdf}}</ref>{{Rp|page=2}}


It was primarily developed by [[Jens Axboe]] at Meta.<ref name="Axboe2019" />
Development is ongoing, worked on primarily by [[Jens Axboe]] at [[Meta Platforms|Meta]].{{R|name=Phoronix2019}}


==Interface==
==Interface==


Internally it works by creating two buffers dubbed as "queue rings" ([[circular buffer]]s) for storage of submission and completion of I/O requests (for storage devices, submission queue (SQ) and completion queue (CQ) respectively).<ref name=":1">{{Cite web|title=Getting Hands-on with io_uring using Go|url=https://developers.mattermost.com/blog/hands-on-iouring-go/|access-date=2021-11-20|website=developers.mattermost.com|language=en-us}}</ref> Keeping these buffers shared between the kernel and application helps to boost the [[IOPS|I/O performance]] by eliminating the need to issue extra and expensive system calls to copy these buffers between the two.<ref name="Axboe2019">{{Cite web|title=Linux Kernel Getting io_uring To Deliver Fast & Efficient I/O - Phoronix|url=https://www.phoronix.com/scan.php?page=news_item&px=Linux-io_uring-Fast-Efficient|url-status=live|access-date=2021-03-14|website=[[Phoronix]]}}</ref><ref name=":0">{{Cite web|title=The rapid growth of io_uring [LWN.net]|url=https://lwn.net/Articles/810414/|access-date=2021-11-20|website=lwn.net}}</ref><ref name=":1" /> According to the io_uring design paper, the SQ buffer is writable only by consumer application, and the CQ buffer only by the kernel.{{R|name=Axboe2019|page=3}}
It works by creating two [[circular buffer]]s, called "queue rings", for storage of submission and completion of I/O requests, respectively. For storage devices, these are called the submission queue (SQ) and completion queue (CQ).<ref name=":1">{{Cite web|title=Getting Hands-on with io_uring using Go|url=https://developers.mattermost.com/blog/hands-on-iouring-go/|access-date=2021-11-20|website=developers.mattermost.com|language=en-us}}</ref> Keeping these buffers shared between the kernel and application helps to boost the [[IOPS|I/O performance]] by eliminating the need to issue extra and expensive system calls to copy these buffers between the two.<ref name="Phoronix2019">{{Cite web |title=Linux Kernel Getting io_uring To Deliver Fast & Efficient I/O |date=2019-02-14 |url=https://www.phoronix.com/scan.php?page=news_item&px=Linux-io_uring-Fast-Efficient |access-date=2021-03-14 |website=[[Phoronix]]}}</ref><ref name=":1" /><ref name=":0">{{Cite web|title=The rapid growth of io_uring [LWN.net]|url=https://lwn.net/Articles/810414/|access-date=2021-11-20|website=lwn.net}}</ref> According to the io_uring design paper, the SQ buffer is writable only by consumer applications, and the CQ buffer is writable only by the kernel.{{R|name=Phoronix2019|page=3}}


[[eBPF]] can be combined with io_uring.<ref>{{Cite web |title=BPF meets io_uring [LWN.net] |url=https://lwn.net/Articles/847951/ |access-date=2023-04-17 |website=[[LWN.net]]}}</ref>
The API provided by {{Code|liburing}} library for userspace (applications) can be used to interact with the kernel interface more easily.<ref name="Axboe2019" />{{R|name=Axboe2019|page=12}}


==History==
Both kernel interface and library were adopted in Linux 5.1 kernel version.<ref name="Axboe2019" /><ref name=":0" /><ref>{{Cite web|title=Faster IO through io_uring {{!}} Kernel Recipes 2019|url=https://kernel-recipes.org/en/2019/talks/faster-io-through-io_uring/|access-date=2021-03-14|language=en-GB}}</ref>


The Linux kernel has had [[asynchronous I/O]] since version 2.5, but it was seen as difficult to use and inefficient.<ref>{{Cite web|last=Corbet|first=Jonathan|title=Ringing in a new asynchronous I/O API|url=https://lwn.net/Articles/776703/|url-status=live|access-date=2021-03-14|website=[[LWN.net]]}}</ref> The old API only supported certain niche [[use cases]].<ref>{{cite web | url = https://kernel.dk/axboe-kr2022.pdf | title = What’s new with io_uring | access-date = 2022-06-01}}</ref>
The Linux kernel has supported [[asynchronous I/O]] since version 2.5, but it was seen as difficult to use and inefficient.<ref>{{Cite web|last=Corbet|first=Jonathan|title=Ringing in a new asynchronous I/O API|url=https://lwn.net/Articles/776703/|access-date=2021-03-14|website=[[LWN.net]]}}</ref> This older API only supported certain niche [[use cases]],<ref>{{cite web | url = https://kernel.dk/axboe-kr2022.pdf | title = What's new with io_uring | access-date = 2022-06-01}}</ref> notably it only enables asynchronous operation when using the O_DIRECT flag and while accessing already allocated files. This prevents utilizing the [[page cache]], while also exposing the application to complex O_DIRECT semantics. Linux AIO also does not support sockets, so it cannot be used to multiplex network and disk I/O.<ref>{{Cite web |url=http://code.google.com/p/kernel/wiki/AIOUserGuide |title=Linux Asynchronous I/O |date=2014-04-21 |archive-url=https://web.archive.org/web/20150406015143/http://code.google.com/p/kernel/wiki/AIOUserGuide |access-date=2023-06-16 |archive-date=2015-04-06 |quote=Blocking during io_submit on ext4, on buffered operations, network access, pipes, etc. Some operations are not well-represented by the AIO interface. With completely unsupported operations like buffered reads, operations on a socket or pipes, the entire operation will be performed during the io_submit syscall, with the completion available immediately for access with io_getevents. AIO access to a file on a filesystem like ext4 is partially supported: if a metadata read is required to look up the data block (ie if the metadata is not already in memory), then the io_submit call will block on the metadata read. Certain types of file-enlarging writes are completely unsupported and block for the entire duration of the operation. }}</ref>

The io_uring kernel interface was adopted in Linux kernel version 5.1 to resolve the deficiencies of Linux AIO.{{R|name=Phoronix2019}}<ref name=":0" /><ref>{{Cite web|title=Faster IO through io_uring |website=Kernel Recipes 2019 |url=https://kernel-recipes.org/en/2019/talks/faster-io-through-io_uring/|access-date=2021-03-14|language=en-GB}}</ref> The liburing library provides an [[API]] to interact with the kernel interface easily from [[userspace]].{{R|name=Phoronix2019|page=12}}

==Security==

io_uring has been noted for exposing a significant attack surface and structural difficulties integrating it with the [[Linux Security Modules|Linux security subsystem]].<ref>{{Cite web |url=https://lwn.net/Articles/902466/ |title=Security requirements for new kernel features |date=2022-07-28 |access-date=2023-06-16 |website=[[LWN.net]] |last=Corbet |first=Jonathan}}</ref>

In June 2023, Google's security team reported that 60% of the [[Exploit (computer security)|exploits]] submitted to their [[bug bounty program]] in 2022 were exploits of the Linux kernel's io_uring vulnerabilities. As a result, <code>io_uring</code> was disabled for apps in [[Android (operating system)|Android]], and disabled entirely in [[ChromeOS]] as well as Google servers.<ref>{{cite web |last1=Koczka |first1=Tamás |title=Learnings from kCTF VRP's 42 Linux kernel exploits submissions |url=https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html |website=Google Online Security Blog |publisher=Google |access-date=14 June 2023 |language=en |archive-url=https://web.archive.org/web/20240922183950/https://security.googleblog.com/2023/06/learnings-from-kctf-vrps-42-linux.html |archive-date=2024-09-22 |url-status=live |quote=60% of the submissions exploited the io_uring component of the Linux kernel}}</ref> [[Docker (software)|Docker]] also consequently disabled io_uring from their default [[seccomp]] profile.<ref>{{Cite web |title=Update RuntimeDefault seccomp profile to disallow io_uring related syscalls by vinayakankugoyal · Pull Request #9320 · containerd/containerd |url=https://github.com/containerd/containerd/pull/9320 |date=2023-11-02 |access-date=2024-10-20 |website=GitHub |language=en |archive-url=https://web.archive.org/web/20240106225425/https://github.com/containerd/containerd/pull/9320 |archive-date=2024-01-06 |url-status=live}}</ref>

== Notes ==
{{Notelist}}


== References ==
== References ==

Latest revision as of 20:09, 25 November 2024

io_uring[a] (previously known as aioring) is a Linux kernel system call interface for asynchronous I/O operations on storage devices. It addresses performance issues with similar interfaces, such as read()/write() and aio_read()/aio_write(), for operations on data accessed through file descriptors.[2][3]: 2 

Development is ongoing and is led primarily by Jens Axboe at Meta.[2]

Interface


It works by creating two circular buffers, called "queue rings", that are shared between the kernel and the application: the submission queue (SQ), which holds I/O requests, and the completion queue (CQ), which holds their results.[4] Sharing these buffers boosts I/O performance by eliminating the extra, expensive system calls otherwise needed to copy their contents between the two.[2][4][5] According to the io_uring design paper, the SQ buffer is writable only by the application, and the CQ buffer is writable only by the kernel.[3]: 3 
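
The two-ring arrangement can be illustrated with a toy model. The sketch below is not the kernel interface or the liburing API: the names `Ring`, `toy_kernel`, and `demo` are hypothetical, a Python bytes object stands in for a storage device, and the "kernel" is an ordinary function call rather than a separate execution context. It only shows the division of labour, in which the application produces submission entries and consumes completion entries, while the other side does the reverse.

```python
class Ring:
    """Fixed-size single-producer/single-consumer circular buffer."""

    def __init__(self, entries):
        # A power-of-two size lets an index be wrapped with a bit mask.
        if entries <= 0 or entries & (entries - 1):
            raise ValueError("entries must be a power of two")
        self.slots = [None] * entries
        self.mask = entries - 1
        self.head = 0  # advanced only by the consumer
        self.tail = 0  # advanced only by the producer

    def push(self, item):
        if self.tail - self.head == len(self.slots):
            return False  # ring is full
        self.slots[self.tail & self.mask] = item
        self.tail += 1
        return True

    def pop(self):
        if self.head == self.tail:
            return None  # ring is empty
        item = self.slots[self.head & self.mask]
        self.head += 1
        return item


def toy_kernel(sq, cq, storage):
    """Drain the SQ, 'perform' each read, and post results to the CQ."""
    handled = 0
    while (req := sq.pop()) is not None:
        user_data, offset, length = req
        cq.push((user_data, storage[offset:offset + length]))
        handled += 1
    return handled


def demo():
    storage = b"hello io_uring"  # stand-in for a storage device
    sq, cq = Ring(4), Ring(4)
    # Application side: queue two read requests without waiting on either.
    sq.push(("req-a", 0, 5))
    sq.push(("req-b", 6, 8))
    # Kernel side (simulated): consume submissions, produce completions.
    toy_kernel(sq, cq, storage)
    # Application side: reap completions, matched to requests by their tag.
    results = []
    while (cqe := cq.pop()) is not None:
        results.append(cqe)
    return results
```

In this model, each ring has exactly one writer for its tail and one for its head, which is why no copying between the two sides is needed. A real implementation additionally needs memory barriers and a way to notify the kernel, both of which the sketch omits.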

eBPF can be combined with io_uring.[6]

History


The Linux kernel has supported asynchronous I/O since version 2.5, but it was seen as difficult to use and inefficient.[7] This older API supported only certain niche use cases:[8] notably, it enables asynchronous operation only when the O_DIRECT flag is used and the files being accessed are already allocated. This prevents use of the page cache and exposes the application to complex O_DIRECT semantics. Linux AIO also does not support sockets, so it cannot be used to multiplex network and disk I/O.[9]

The io_uring kernel interface was adopted in Linux kernel version 5.1 to resolve the deficiencies of Linux AIO.[2][5][10] The liburing library provides an API that makes the kernel interface easier to use from userspace.[3]: 12 

Security


io_uring has been noted for exposing a significant attack surface and for being structurally difficult to integrate with the Linux security subsystem.[11]

In June 2023, Google's security team reported that 60% of the exploits submitted to its bug bounty program in 2022 targeted vulnerabilities in the Linux kernel's io_uring component. As a result, io_uring was disabled for apps in Android, and disabled entirely in ChromeOS and on Google servers.[12] Docker subsequently blocked io_uring in its default seccomp profile.[13]

Notes

  1. ^ Input/output user ring[1]

References

  1. ^ Axboe, Jens. "@axboe@fosstodon.org".
  2. ^ a b c d e f "Linux Kernel Getting io_uring To Deliver Fast & Efficient I/O". Phoronix. 2019-02-14. Retrieved 2021-03-14.
  3. ^ Axboe, Jens (October 15, 2019). "Efficient IO with io_uring" (PDF).
  4. ^ a b "Getting Hands-on with io_uring using Go". developers.mattermost.com. Retrieved 2021-11-20.
  5. ^ a b "The rapid growth of io_uring". LWN.net. Retrieved 2021-11-20.
  6. ^ "BPF meets io_uring". LWN.net. Retrieved 2023-04-17.
  7. ^ Corbet, Jonathan. "Ringing in a new asynchronous I/O API". LWN.net. Retrieved 2021-03-14.
  8. ^ "What's new with io_uring" (PDF). Retrieved 2022-06-01.
  9. ^ "Linux Asynchronous I/O". 2014-04-21. Archived from the original on 2015-04-06. Retrieved 2023-06-16. Blocking during io_submit on ext4, on buffered operations, network access, pipes, etc. Some operations are not well-represented by the AIO interface. With completely unsupported operations like buffered reads, operations on a socket or pipes, the entire operation will be performed during the io_submit syscall, with the completion available immediately for access with io_getevents. AIO access to a file on a filesystem like ext4 is partially supported: if a metadata read is required to look up the data block (ie if the metadata is not already in memory), then the io_submit call will block on the metadata read. Certain types of file-enlarging writes are completely unsupported and block for the entire duration of the operation.
  10. ^ "Faster IO through io_uring". Kernel Recipes 2019. Retrieved 2021-03-14.
  11. ^ Corbet, Jonathan (2022-07-28). "Security requirements for new kernel features". LWN.net. Retrieved 2023-06-16.
  12. ^ Koczka, Tamás. "Learnings from kCTF VRP's 42 Linux kernel exploits submissions". Google Online Security Blog. Google. Archived from the original on 2024-09-22. Retrieved 14 June 2023. 60% of the submissions exploited the io_uring component of the Linux kernel
  13. ^ "Update RuntimeDefault seccomp profile to disallow io_uring related syscalls by vinayakankugoyal · Pull Request #9320 · containerd/containerd". GitHub. 2023-11-02. Archived from the original on 2024-01-06. Retrieved 2024-10-20.