Tiled rendering: Difference between revisions
Remove Tags: Mobile edit Mobile web edit |
Adding local short description: "Process of rending a computer graphics image", overriding Wikidata description "process of subdividing a computer graphics image by a regular grid in optical space and rendering each section of the grid, or tile, separately" |
||
(36 intermediate revisions by 24 users not shown) | |||
Line 1: | Line 1: | ||
{{Short description|Process of rending a computer graphics image}} |
|||
'''Tiled rendering''' is the process of subdividing a [[computer graphics]] image by a regular [[Grid (spatial index)|grid]] in [[optical space]] and rendering each section of the grid, or ''tile'', separately. The advantage to this design is that the amount of memory and bandwidth is reduced compared to ''[[Immediate mode (computer graphics)|immediate mode]]'' rendering systems that draw the entire frame at once. This has made tile rendering systems particularly common for low-power [[handheld device]] use. Tiled rendering is sometimes known as a "sort middle" architecture, because it performs the sorting of the geometry in the middle of the [[graphics pipeline]] instead of near the end.<ref>{{cite web | url=https://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15869-f11/www/readings/molnar94_sorting.pdf | title=A Sorting Classification of Parallel Rendering | publisher=[[IEEE]] | first=Steven | last=Molnar | date=1994-04-01 | access-date=2012-08-24 | archive-url=https://web.archive.org/web/20140912130015/https://www.cs.cmu.edu/afs/cs.cmu.edu/academic/class/15869-f11/www/readings/molnar94_sorting.pdf | archive-date=2014-09-12 | url-status=live }}</ref> |
|||
==Basic concept== |
==Basic concept== |
||
Creating a 3D image for display consists of a series of steps. First, the objects to be displayed are loaded into memory from individual ''models''. The system then applies mathematical functions to transform the models into a common coordinate system, the ''world view''. From this world view, a series of polygons (typically triangles) is created that approximates the original models as seen from a particular viewpoint, the ''camera''. Next, a compositing system produces an image by rendering the triangles and applying ''textures'' to the outside. Textures are small images that are painted onto the triangles to produce realism. The resulting image is then combined with various special effects, and moved into a [[frame buffer]], which video hardware then scans to produce the displayed image. This basic conceptual layout is known as the ''display pipeline''. |
Creating a 3D image for display consists of a series of steps. First, the objects to be displayed are loaded into memory from individual ''models''. The system then applies mathematical functions to transform the models into a common coordinate system, the ''world view''. From this world view, a series of polygons (typically triangles) is created that approximates the original models as seen from a particular viewpoint, the ''camera''. Next, a compositing system produces an image by rendering the triangles and applying ''textures'' to the outside. Textures are small images that are painted onto the triangles to produce realism. The resulting image is then combined with various special effects, and moved into a [[frame buffer]], which video hardware then scans to produce the displayed image. This basic conceptual layout is known as the ''display pipeline''. |
||
Line 6: | Line 9: | ||
Tiled renderers address this concern by breaking down the image into sections known as tiles, and rendering each one separately. This reduces the amount of memory needed during the intermediate steps, and the amount of data being moved about at any given time. To do this, the system sorts the triangles making up the geometry by location, allowing to quickly find which triangles overlap the tile boundaries. It then loads just those triangles into the rendering pipeline, performs the various rendering operations in the [[GPU]], and sends the result to the [[frame buffer]]. Very small tiles can be used, 16×16 and 32×32 pixels are popular tile sizes, which makes the amount of memory and bandwidth required in the internal stages small as well. And because each tile is independent, it naturally lends itself to simple parallelization. |
Tiled renderers address this concern by breaking down the image into sections known as tiles, and rendering each one separately. This reduces the amount of memory needed during the intermediate steps, and the amount of data being moved about at any given time. To do this, the system sorts the triangles making up the geometry by location, allowing to quickly find which triangles overlap the tile boundaries. It then loads just those triangles into the rendering pipeline, performs the various rendering operations in the [[GPU]], and sends the result to the [[frame buffer]]. Very small tiles can be used, 16×16 and 32×32 pixels are popular tile sizes, which makes the amount of memory and bandwidth required in the internal stages small as well. And because each tile is independent, it naturally lends itself to simple parallelization. |
||
In a typical tiled renderer, geometry must first be transformed into screen space and assigned to screen-space tiles. This requires some storage for the lists of geometry for each tile. In early tiled systems, this was performed by the [[CPU]], but all modern hardware contains hardware to accelerate this step. The list of geometry can also be sorted front to back, allowing the GPU to use [[hidden surface removal]] to avoid processing pixels that are hidden behind others, saving on memory bandwidth for unnecessary texture lookups.<ref>{{cite web | url=http://www.imgtec.com/powervr/insider/powervr_presentations/GDC%20HardwareAndOptimisation.pdf | title=PowerVR: A Master Class in Graphics Technology and Optimization | date=2012-01-14 | |
In a typical tiled renderer, geometry must first be transformed into screen space and assigned to screen-space tiles. This requires some storage for the lists of geometry for each tile. In early tiled systems, this was performed by the [[CPU]], but all modern hardware contains hardware to accelerate this step. The list of geometry can also be sorted front to back, allowing the GPU to use [[hidden surface removal]] to avoid processing pixels that are hidden behind others, saving on memory bandwidth for unnecessary texture lookups.<ref>{{cite web | url=http://www.imgtec.com/powervr/insider/powervr_presentations/GDC%20HardwareAndOptimisation.pdf | title=PowerVR: A Master Class in Graphics Technology and Optimization | date=2012-01-14 | access-date=2014-01-11 | publisher=[[Imagination Technologies]] | archive-url=https://web.archive.org/web/20131003002919/http://www.imgtec.com/powervr/insider/powervr_presentations/GDC%20HardwareAndOptimisation.pdf | archive-date=2013-10-03 | url-status=live }}</ref> |
||
There are two main disadvantages of the tiled approach. One is that some triangles may be drawn several times if they overlap several tiles. This means the total rendering time would be higher than an immediate-mode rendering system. There are also possible issues when the tiles have to be stitched together to make a complete image, but this problem was solved long ago. More difficult to solve is that some image techniques are applied to the frame as a whole, and these are difficult to implement in a tiled render where the idea is to not have to work with the entire frame. These tradeoffs are well known, and of minor consequence for systems where the advantages are useful; tiled rendering systems are widely found in handheld computing devices. |
There are two main disadvantages of the tiled approach. One is that some triangles may be drawn several times if they overlap several tiles. This means the total rendering time would be higher than an immediate-mode rendering system. There are also possible issues when the tiles have to be stitched together to make a complete image, but this problem was solved long ago{{Citation needed|date=June 2018}}. More difficult to solve is that some image techniques are applied to the frame as a whole, and these are difficult to implement in a tiled render where the idea is to not have to work with the entire frame. These tradeoffs are well known, and of minor consequence for systems where the advantages are useful; tiled rendering systems are widely found in handheld computing devices. |
||
Tiled rendering should not be confused with tiled/nonlinear [[framebuffer]] addressing schemes, which make adjacent pixels also adjacent in memory.<ref>{{cite web | url=http://www.x.org/wiki/Development/Documentation/HowVideoCardsWork | title=How Video Cards Work | publisher=[[X.Org Foundation]] | first=Alex | last=Deucher | date=2008-05-16 | |
Tiled rendering should not be confused with tiled/nonlinear [[framebuffer]] addressing schemes, which make adjacent pixels also adjacent in memory.<ref>{{cite web | url=http://www.x.org/wiki/Development/Documentation/HowVideoCardsWork | title=How Video Cards Work | publisher=[[X.Org Foundation]] | first=Alex | last=Deucher | date=2008-05-16 | access-date=2010-05-27 | archive-url=https://web.archive.org/web/20100521083733/http://www.x.org/wiki/Development/Documentation/HowVideoCardsWork | archive-date=2010-05-21 | url-status=live }}</ref> These addressing schemes are used by a wide variety of architectures, not just tiled renderers. |
||
==Early work== |
==Early work== |
||
Much of the early work on tiled rendering was done as part of the Pixel Planes 5 architecture (1989).<ref>{{cite web | url=http://www.cs.unc.edu/~pxfl/history.html | title=History | work=Pixel-Planes | publisher=[[University of North Carolina at Chapel Hill]] | first=Jim | last=Mahaney | date=1998-06-22| |
Much of the early work on tiled rendering was done as part of the Pixel Planes 5 architecture (1989).<ref>{{cite web | url=http://www.cs.unc.edu/~pxfl/history.html | title=History | work=Pixel-Planes | publisher=[[University of North Carolina at Chapel Hill]] | first=Jim | last=Mahaney | date=1998-06-22 | access-date=2008-08-04 | archive-url=https://web.archive.org/web/20080929160951/http://www.cs.unc.edu/~pxfl/history.html | archive-date=2008-09-29 | url-status=live }}</ref><ref>{{cite book | chapter-url=http://dl.acm.org/citation.cfm?id=74341 | chapter=Pixel-planes 5: a heterogeneous multiprocessor graphics system using processor-enhanced memories | publisher=[[Association for Computing Machinery|ACM]] | first=Henry | last=Fuchs | title=Proceedings of the 16th annual conference on Computer graphics and interactive techniques - SIGGRAPH '89 | date=1989-07-01| pages=79–88 | doi=10.1145/74333.74341 | isbn=0201504340 | s2cid=1778124 | access-date=2012-08-24}}</ref> |
||
The Pixel Planes 5 project validated the tiled approach and invented a lot of the techniques now viewed as standard for tiled renderers. It is the work most widely cited by other papers in the field. |
The Pixel Planes 5 project validated the tiled approach and invented a lot of the techniques now viewed as standard for tiled renderers. It is the work most widely cited by other papers in the field. |
||
Line 19: | Line 22: | ||
The tiled approach was also known early in the history of software rendering. Implementations of [[Reyes rendering]] often divide the image into "tile buckets". |
The tiled approach was also known early in the history of software rendering. Implementations of [[Reyes rendering]] often divide the image into "tile buckets". |
||
==Commercial products – Desktop and |
==Commercial products – Desktop and console== |
||
Early in the development of desktop GPUs, several companies developed tiled architectures. Over time, these were largely supplanted by immediate-mode GPUs with fast custom external memory systems. |
Early in the development of desktop GPUs, several companies developed tiled architectures. Over time, these were largely supplanted by immediate-mode GPUs with fast custom external memory systems. |
||
Major examples of this are: |
Major examples of this are: |
||
* [[PowerVR]] rendering architecture (1996): The [[rasterizer]] consisted of a 32×32 tile into which [[polygon]]s were [[rasterize]]d across the image across multiple [[pixel]]s in parallel. On early [[personal computer|PC]] versions, tiling was performed in the [[display driver]] running on the [[Central processing unit|CPU]]. In the application of the [[Dreamcast]] console, tiling was performed by a piece of hardware. This facilitated [[Deferred shading|deferred rendering]]—only the visible pixels were [[texture-mapped]], saving [[shading]] calculations and texture-[[bandwidth (computing)|bandwidth]]. |
* [[PowerVR]] rendering architecture (1996): The [[rasterizer]] consisted of a 32×32 tile into which [[polygon]]s were [[rasterize]]d across the image across multiple [[pixel]]s in parallel. On early [[personal computer|PC]] versions, tiling was performed in the [[display driver]] running on the [[Central processing unit|CPU]]. In the application of the [[Dreamcast]] console, tiling was performed by a piece of hardware. This facilitated [[Deferred shading|deferred rendering]]—only the visible pixels were [[texture-mapped]], saving [[shading]] calculations and texture-[[bandwidth (computing)|bandwidth]]. |
||
* [[Oak Technology]] (1997) Warp 5. The Oak chip is the first in the market to combine tiling with other high-performance rendering algorithms such as antialiasing and trilinear mip-mapped textures, per Jon Peddie, president of Jon Peddie Associates.<ref>{{cite news |last1=Maclellan |first1=Andrew |title=Oak intros 3-D chip Warp 5 accelerator uses Talisman like rendering scheme |issue=1063 |publisher=Electronic Buyers News |date=June 23, 1997}}</ref> |
|||
* [[Microsoft Talisman]] (1996) |
* [[Microsoft Talisman]] (1996) |
||
* [[Dreamcast]] (powered by PowerVR chipset) (1998) |
* [[Dreamcast]] (powered by PowerVR chipset) (1998) |
||
* Gigapixel GP-1 (1999)<ref>{{cite web | url=https://www.theregister.co.uk/1999/10/06/gigapixel_takes_on_3dfx_s3/ | title=GigaPixel takes on 3dfx, S3, Nvidia with... tiles | work=Gigapixel | publisher=[[The Register]] | first=Tony | last=Smith | date=1999-10-06 | |
* Gigapixel GP-1 (1999)<ref>{{cite web | url=https://www.theregister.co.uk/1999/10/06/gigapixel_takes_on_3dfx_s3/ | title=GigaPixel takes on 3dfx, S3, Nvidia with... tiles | work=Gigapixel | publisher=[[The Register]] | first=Tony | last=Smith | date=1999-10-06 | access-date=2012-08-24 | archive-url=https://web.archive.org/web/20121003004013/http://www.theregister.co.uk/1999/10/06/gigapixel_takes_on_3dfx_s3/ | archive-date=2012-10-03 | url-status=live }}</ref> |
||
* [[Larrabee (microarchitecture)|Intel Larrabee GPU]] (2009) (canceled) |
* [[Larrabee (microarchitecture)|Intel Larrabee GPU]] (2009) (canceled) |
||
* [[PS Vita]] (powered by PowerVR chipset) (2011)<ref>{{cite web | url=http://3dsforums.com/lounge-2/develop-2011-ps-vita-most-developer-friendly-hardware-sony-has-ever-made-19841/ | title=Develop 2011: PS Vita is the most developer friendly hardware Sony has ever made | work=PS Vita | publisher=[[3dsforums]] | |
* [[PS Vita]] (powered by PowerVR chipset) (2011)<ref>{{cite web | url=http://3dsforums.com/lounge-2/develop-2011-ps-vita-most-developer-friendly-hardware-sony-has-ever-made-19841/ | title=Develop 2011: PS Vita is the most developer friendly hardware Sony has ever made | work=PS Vita | publisher=[[3dsforums]] | date=2011-07-21 | access-date=2011-07-21 }}{{Dead link|date=June 2018 |bot=InternetArchiveBot |fix-attempted=no }}</ref> |
||
* [[Nvidia]] GPUs based on the [[Maxwell (microarchitecture)|Maxwell architecture]] and later architectures (2014)<ref>{{Cite news | url = http://www.realworldtech.com/tile-based-rasterization-nvidia-gpus/ | title = Tile-based Rasterization in Nvidia GPUs | first = David | last = Kanter | date = August 1, 2016 | newspaper = Real World Technologies | access-date = April 1, 2016}}</ref> |
* [[Nvidia]] GPUs based on the [[Maxwell (microarchitecture)|Maxwell architecture]] and later architectures (2014)<ref>{{Cite news | url = http://www.realworldtech.com/tile-based-rasterization-nvidia-gpus/ | title = Tile-based Rasterization in Nvidia GPUs | first = David | last = Kanter | date = August 1, 2016 | newspaper = Real World Technologies | access-date = April 1, 2016 | archive-url = https://web.archive.org/web/20160804205844/http://www.realworldtech.com/tile-based-rasterization-nvidia-gpus/ | archive-date = 2016-08-04 | url-status = live }}</ref> |
||
*[[AMD]] GPUs based on the [[Graphics Core Next#fifth|Vega (GCN5) architecture]] and later architectures (2017)<ref>{{Cite web|url=https://pcper.com/2017/01/amd-vega-gpu-architecture-preview-redesigned-memory-architecture/2/|title=AMD Vega GPU Architecture Preview: Redesigned Memory Architecture|website=PC Perspective|date=5 January 2017 |language=en-US|access-date=2020-01-04}}</ref><ref>{{Cite web|url=https://www.anandtech.com/show/11002/the-amd-vega-gpu-architecture-teaser|title=The AMD Vega GPU Architecture Teaser: Higher IPC, Tiling, & More, Coming in H1'2017|last=Smith|first=Ryan|website=www.anandtech.com|access-date=2020-01-04}}</ref> |
|||
*[[Intel]] Gen11 GPU and later architectures (2019)<ref>{{Cite web | url=https://software.intel.com/sites/default/files/managed/db/88/The-Architecture-of-Intel-Processor-Graphics-Gen11_R1new.pdf | title=Intel Processor Graphics Gen11 Architecture | access-date=2024-08-13 | website=software.intel.com}}</ref><ref>{{cite tweet|number=1126251762657124358|user=intelnews|title=Intel's @gregorymbryant at today's...|date=8 May 2019}}</ref><ref>{{Cite web | url=https://newsroom.intel.com/wp-content/uploads/sites/11/2019/05/10th-Gen-Intel-Core-Product-Brief.pdf | title=Big-time entertainment in remarkably thin and lightdesign | archive-url=https://web.archive.org/web/20190528103709/https://newsroom.intel.com/wp-content/uploads/sites/11/2019/05/10th-Gen-Intel-Core-Product-Brief.pdf | archive-date=2019-05-28}}</ref> |
|||
Examples of non-tiled architectures that use large on-chip buffers are: |
Examples of non-tiled architectures that use large on-chip buffers are: |
||
* [[Xbox 360]] (2005): the [[GPU]] contains an embedded 10 [[ |
* [[Xbox 360]] (2005): the [[GPU]] contains an embedded 10 [[megabyte|MB]] [[eDRAM]]; this is not sufficient to hold the raster for an entire 1280×720 image with 4× [[multisample anti-aliasing]], so a tiling solution is superimposed when running in HD resolutions and 4× MSAA is enabled.<ref>{{cite web|url=http://msdn.microsoft.com/en-us/library/bb464139.aspx|title=XNA Game Studio 4.0 Refresh|first=Tara | last=Meyer|website=msdn.microsoft.com|date=29 September 2011 |access-date=2014-05-15|archive-url=https://web.archive.org/web/20150107120850/http://msdn.microsoft.com/en-us/library/bb464139.aspx|archive-date=2015-01-07|url-status=live}}</ref> |
||
* [[Xbox One]] (2013): the [[GPU]] contains an embedded 32 [[ |
* [[Xbox One]] (2013): the [[GPU]] contains an embedded 32 [[megabyte|MB]] [[eSRAM]], which can be used to hold all or part of an image. It is not a tiled architecture, but is flexible enough that software developers can emulate tiled rendering.<ref>{{cite web|url=http://www.neowin.net/news/xbox-one-developer-upcoming-sdk-improvements-will-allow-for-more-1080p-games|title=Xbox One developer: upcoming SDK improvements will allow for more 1080p games|date=29 July 2023 }}</ref>{{Failed verification|date=June 2016}} |
||
==Commercial products – Embedded== |
==Commercial products – Embedded== |
||
Line 39: | Line 45: | ||
Tile-based immediate mode rendering (TBIM): |
Tile-based immediate mode rendering (TBIM): |
||
* [[ARM Holdings|ARM]] [[Mali (GPU)|Mali]] series.<ref>{{cite web | url=http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0363d/CJAEEJCF.html | title=Mali rendering strategy | publisher=ARM }}</ref> |
* [[ARM Holdings|ARM]] [[Mali (GPU)|Mali]]{{which|date=July 2020}} series.<ref>{{cite web | url=http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0363d/CJAEEJCF.html | title=Mali rendering strategy | publisher=ARM | access-date=2018-10-27 | archive-url=https://web.archive.org/web/20160304051042/http://infocenter.arm.com/help/index.jsp?topic=%2Fcom.arm.doc.dui0363d%2FCJAEEJCF.html | archive-date=2016-03-04 | url-status=live }}</ref> |
||
* [[Qualcomm]] [[Adreno]] (series 300 and newer can dynamically switch to immediate/direct mode rendering via FlexRender).<ref>{{cite web | url=https://lwn.net/Articles/638908/ | title=An update on the freedreno graphics driver | publisher=lwn.net }}</ref><ref>{{cite web |url=https://www.qualcomm.com/media/documents/files/the-rise-of-mobile-gaming-on-android-qualcomm-snapdragon-technology-leadership.pdf |title=The rise of mobile gaming on android |publisher=Qualcomm |
* [[Qualcomm]] [[Adreno]] (series 300 and newer can also dynamically switch to immediate/direct mode rendering via FlexRender).<ref>{{cite web | url=https://lwn.net/Articles/638908/ | title=An update on the freedreno graphics driver | publisher=lwn.net | access-date=2015-09-15 | archive-url=https://web.archive.org/web/20150905111118/https://lwn.net/Articles/638908/ | archive-date=2015-09-05 | url-status=live }}</ref><ref>{{cite web |url=https://www.qualcomm.com/media/documents/files/the-rise-of-mobile-gaming-on-android-qualcomm-snapdragon-technology-leadership.pdf |title=The rise of mobile gaming on android |publisher=Qualcomm |page=5 |access-date=17 September 2015 |archive-url=https://web.archive.org/web/20141109010801/https://www.qualcomm.com/media/documents/files/the-rise-of-mobile-gaming-on-android-qualcomm-snapdragon-technology-leadership.pdf |archive-date=2014-11-09 |url-status=live }}</ref><ref>{{Cite web|url=https://www.anandtech.com/show/4686/samsung-galaxy-s-2-international-review-the-best-redefined|title=Samsung Galaxy S 2 (International) Review - The Best, Redefined|first1=Brian | last1=Klug | first2=Anand | last2=Lal Shimpi|date=September 11, 2011|website=www.anandtech.com|access-date=2020-01-04}}</ref> |
||
Tile-based deferred rendering (TBDR): |
Tile-based deferred rendering (TBDR): |
||
* [[ |
* [[Arm Holdings|Arm]] [[Mali (GPU)|Mali]]{{which|date=July 2020}} series.<ref>{{cite web | url=https://developer.arm.com/documentation/dui0555/a/introduction/the-mali-gpu-hardware/tile-based-rendering | title=Tile based rendering | publisher=[[Arm Holdings|Arm]] | access-date=2020-07-13}}</ref> |
||
* [[Imagination Technologies]] [[PowerVR]] 5/6/7 series.<ref>{{cite web | url=http://blog.imgtec.com/powervr/a-look-at-the-powervr-graphics-architecture-tile-based-rendering | title=A look at the PowerVR graphics architecture: Tile-based rendering | publisher=Imagination Technologies | access-date=2015-09-15 | archive-url=https://web.archive.org/web/20150405010019/http://blog.imgtec.com/powervr/a-look-at-the-powervr-graphics-architecture-tile-based-rendering | archive-date=2015-04-05 | url-status=live }}</ref> |
|||
* [[Broadcom]] [[VideoCore|VideoCore IV]] series.<ref>{{cite web | url=http://www.broadcom.com/docs/support/videocore/VideoCoreIV-AG100-R.pdf | title=VideoCoreIV-AG100 | publisher=Broadcom | date=2013-09-18}}</ref> |
* [[Broadcom]] [[VideoCore|VideoCore IV]] series.<ref>{{cite web | url=http://www.broadcom.com/docs/support/videocore/VideoCoreIV-AG100-R.pdf | title=VideoCoreIV-AG100 | publisher=Broadcom | date=2013-09-18 | access-date=2015-01-10 | archive-url=https://web.archive.org/web/20150301043328/http://www.broadcom.com/docs/support/videocore/VideoCoreIV-AG100-R.pdf | archive-date=2015-03-01 | url-status=live }}</ref> |
||
* [[Apple silicon]] GPUs.<ref>{{cite web|url=https://developer.apple.com/videos/play/wwdc2020/10631|title=Bring your Metal app to Apple Silicon Macs|website=developer.apple.com|access-date=2020-07-13}}</ref> |
|||
[[Vivante]] produces mobile GPUs which have tightly coupled frame buffer memory (similar to the Xbox 360 GPU described above). Although this can be used to render parts of the screen, the large size of the rendered regions means that they are not usually described as using a tile-based architecture. |
[[Vivante]] produces mobile GPUs which have tightly coupled frame buffer memory (similar to the Xbox 360 GPU described above). Although this can be used to render parts of the screen, the large size of the rendered regions means that they are not usually described as using a tile-based architecture. |
||
Line 52: | Line 61: | ||
* [[Scanline rendering]] |
* [[Scanline rendering]] |
||
* [[Tile-based video game]] |
* [[Tile-based video game]] |
||
* [[Web Map Tile Service]] |
|||
==References== |
==References== |
||
{{reflist}} |
{{reflist}} |
||
{{Graphics Processing Unit}} |
|||
{{DEFAULTSORT:Tiled Rendering}} |
{{DEFAULTSORT:Tiled Rendering}} |
Latest revision as of 01:24, 15 November 2024
Tiled rendering is the process of subdividing a computer graphics image by a regular grid in optical space and rendering each section of the grid, or tile, separately. The advantage of this design is that it needs less memory and bandwidth than immediate mode rendering systems, which draw the entire frame at once. This has made tiled rendering systems particularly common in low-power handheld devices. Tiled rendering is sometimes known as a "sort middle" architecture, because it sorts the geometry in the middle of the graphics pipeline instead of near the end.[1]
Basic concept
Creating a 3D image for display consists of a series of steps. First, the objects to be displayed are loaded into memory from individual models. The system then applies mathematical functions to transform the models into a common coordinate system, the world view. From this world view, a series of polygons (typically triangles) is created that approximates the original models as seen from a particular viewpoint, the camera. Next, a compositing system produces an image by rendering the triangles and applying textures to the outside. Textures are small images that are painted onto the triangles to produce realism. The resulting image is then combined with various special effects, and moved into a frame buffer, which video hardware then scans to produce the displayed image. This basic conceptual layout is known as the display pipeline.
Each of these steps increases the amount of memory needed to hold the resulting image. By the end of the pipeline the image is so large that typical graphics card designs often use specialized high-speed memory and a very fast computer bus to provide the bandwidth required to move it in and out of the various sub-components of the pipeline. This sort of support is possible on dedicated graphics cards, but as power and size budgets become more limited, providing enough bandwidth becomes expensive in design terms.
Tiled renderers address this concern by breaking the image into sections known as tiles and rendering each one separately. This reduces the amount of memory needed during the intermediate steps and the amount of data being moved about at any given time. To do this, the system sorts the triangles making up the geometry by location, which lets it quickly find the triangles that overlap each tile. It then loads just those triangles into the rendering pipeline, performs the various rendering operations in the GPU, and sends the result to the frame buffer. Very small tiles can be used; 16×16 and 32×32 pixels are popular sizes, which keeps the memory and bandwidth required in the internal stages small as well. And because each tile is independent, the approach lends itself naturally to simple parallelization.
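The sorting step can be illustrated with a minimal sketch. It assumes triangles are given as three (x, y) screen-space vertices and assigns each one to every tile its bounding box touches, which is a conservative test and not necessarily what any particular GPU does. The function name bin_triangles and the 16-pixel default are illustrative choices, not part of any real API.

```python
import math
from collections import defaultdict

TILE = 16  # tile edge in pixels; 16 and 32 are the sizes mentioned in the text


def bin_triangles(triangles, width, height, tile=TILE):
    """Map each tile coordinate (tx, ty) to the indices of the triangles
    whose screen-space bounding box touches that tile.

    Hypothetical helper: a bounding-box test may list a triangle in a tile
    it does not actually cover, but it never misses one that it does.
    """
    tiles_x = (width + tile - 1) // tile
    tiles_y = (height + tile - 1) // tile
    bins = defaultdict(list)
    for index, tri in enumerate(triangles):
        xs = [x for x, _ in tri]
        ys = [y for _, y in tri]
        # Convert the bounding box to tile units and clamp it to the screen.
        x0 = max(math.floor(min(xs)) // tile, 0)
        x1 = min(math.floor(max(xs)) // tile, tiles_x - 1)
        y0 = max(math.floor(min(ys)) // tile, 0)
        y1 = min(math.floor(max(ys)) // tile, tiles_y - 1)
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                bins[(tx, ty)].append(index)
    return bins


triangles = [
    [(2, 2), (10, 3), (5, 12)],      # fits inside a single tile
    [(10, 10), (40, 12), (20, 30)],  # spans several tiles
]
bins = bin_triangles(triangles, width=64, height=32)
total_entries = sum(len(indices) for indices in bins.values())
```

In this example the second triangle is listed in six tiles, so the per-tile lists hold more entries than there are triangles. Each tile can then be rendered from its own list, independently of the others.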
In a typical tiled renderer, geometry must first be transformed into screen space and assigned to screen-space tiles. This requires some storage for the per-tile lists of geometry. In early tiled systems, this step was performed by the CPU, but all modern hardware includes dedicated logic to accelerate it. The list of geometry can also be sorted front to back, allowing the GPU to use hidden surface removal to avoid processing pixels that are hidden behind others, which saves the memory bandwidth that unnecessary texture lookups would consume.[2]
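The effect of front-to-back ordering can be shown with a toy model. The sketch below assumes a single small tile, a simple "closer wins" depth test, and layers that each cover the whole tile at a constant depth; count_shaded is a hypothetical helper, and real hardware resolves visibility in more elaborate ways.

```python
def count_shaded(layer_depths, tile=4):
    """Count how many pixels get shaded (textured) when full-tile layers
    are drawn in the given order with a less-than depth test.

    Assumption: every layer covers all tile*tile pixels at one depth.
    """
    depth = [[float("inf")] * tile for _ in range(tile)]
    shaded = 0
    for z in layer_depths:
        for y in range(tile):
            for x in range(tile):
                if z < depth[y][x]:  # fragment is visible so far
                    depth[y][x] = z
                    shaded += 1      # a texture lookup would happen here
    return shaded


back_to_front = [0.9, 0.5, 0.1]
front_to_back = sorted(back_to_front)

shaded_unsorted = count_shaded(back_to_front)
shaded_sorted = count_shaded(front_to_back)
```

Drawn back to front, every layer passes the depth test and all three are shaded; drawn front to back, only the nearest layer is shaded and the later ones are rejected before any texture lookup.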
There are two main disadvantages of the tiled approach. One is that some triangles may be drawn several times if they overlap several tiles, so the total rendering time can be higher than on an immediate-mode rendering system. There are also possible issues when the tiles have to be stitched together to make a complete image, but this problem was solved long ago[citation needed]. The other, which is more difficult to solve, is that some image techniques are applied to the frame as a whole, and these are difficult to implement in a tiled renderer, where the idea is to avoid working with the entire frame. These tradeoffs are well known and of minor consequence for systems where the advantages are useful; tiled rendering systems are widely found in handheld computing devices.
Tiled rendering should not be confused with tiled/nonlinear framebuffer addressing schemes, which make adjacent pixels also adjacent in memory.[3] These addressing schemes are used by a wide variety of architectures, not just tiled renderers.
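To make the distinction concrete, the sketch below compares a plain row-major framebuffer layout with one hypothetical tiled addressing scheme that stores each 4×4 block of pixels contiguously. The block size and the function names are assumptions made for illustration; actual hardware layouts vary and are often more intricate.

```python
def linear_offset(x, y, width):
    """Pixel index in an ordinary row-major framebuffer."""
    return y * width + x


def tiled_offset(x, y, width, tile=4):
    """Pixel index in a hypothetical layout that stores each tile*tile
    block contiguously, with blocks in row-major order.

    Assumes width is a multiple of tile.
    """
    tiles_per_row = width // tile
    tile_index = (y // tile) * tiles_per_row + (x // tile)
    within_tile = (y % tile) * tile + (x % tile)
    return tile_index * tile * tile + within_tile


WIDTH = 16
# Two vertically adjacent pixels: one full row apart in the linear layout,
# but only one tile row apart in the tiled layout.
linear_gap = linear_offset(0, 1, WIDTH) - linear_offset(0, 0, WIDTH)
tiled_gap = tiled_offset(0, 1, WIDTH) - tiled_offset(0, 0, WIDTH)
```

This only changes where pixels live in memory. It says nothing about the order in which geometry is rendered, which is why such addressing schemes appear in immediate-mode hardware as well.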
Early work
Much of the early work on tiled rendering was done as part of the Pixel Planes 5 architecture (1989).[4][5]
The Pixel Planes 5 project validated the tiled approach and introduced many of the techniques now viewed as standard for tiled renderers. It is the work most widely cited by other papers in the field.
The tiled approach was also known early in the history of software rendering. Implementations of Reyes rendering often divide the image into "tile buckets".
Commercial products – Desktop and console
Early in the development of desktop GPUs, several companies developed tiled architectures. Over time, these were largely supplanted by immediate-mode GPUs with fast custom external memory systems.
Major examples of tiled architectures are:
- PowerVR rendering architecture (1996): The rasterizer consisted of a 32×32 tile into which polygons were rasterized across the image, with multiple pixels processed in parallel. On early PC versions, tiling was performed in the display driver running on the CPU. In the Dreamcast console, tiling was performed by dedicated hardware. This facilitated deferred rendering: only the visible pixels were texture-mapped, saving shading calculations and texture bandwidth.
- Oak Technology Warp 5 (1997): According to Jon Peddie, president of Jon Peddie Associates, the Oak chip was the first on the market to combine tiling with other high-performance rendering algorithms such as antialiasing and trilinear mip-mapped textures.[6]
- Microsoft Talisman (1996)
- Dreamcast (powered by PowerVR chipset) (1998)
- Gigapixel GP-1 (1999)[7]
- Intel Larrabee GPU (2009) (canceled)
- PS Vita (powered by PowerVR chipset) (2011)[8]
- Nvidia GPUs based on the Maxwell architecture and later architectures (2014)[9]
- AMD GPUs based on the Vega (GCN5) architecture and later architectures (2017)[10][11]
- Intel Gen11 GPU and later architectures (2019)[12][13][14]
Examples of non-tiled architectures that use large on-chip buffers are:
- Xbox 360 (2005): the GPU contains an embedded 10 MB eDRAM; this is not sufficient to hold the raster for an entire 1280×720 image with 4× multisample anti-aliasing, so a tiling solution is superimposed when running at HD resolutions with 4× MSAA enabled.[15]
- Xbox One (2013): the GPU contains an embedded 32 MB eSRAM, which can be used to hold all or part of an image. It is not a tiled architecture, but is flexible enough that software developers can emulate tiled rendering.[16][failed verification]
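The Xbox 360 figures above can be checked with a short calculation. The sketch assumes 8 bytes per sample (4 bytes of colour plus 4 bytes of depth/stencil) and treats 10 MB as 10 × 1024 × 1024 bytes; neither value is stated in the text, so both are labelled as assumptions in the code.

```python
import math

EDRAM_BYTES = 10 * 1024 * 1024  # assumption: "10 MB" read as 10 MiB
BYTES_PER_SAMPLE = 8            # assumption: 4 B colour + 4 B depth/stencil


def framebuffer_bytes(width, height, samples, bytes_per_sample=BYTES_PER_SAMPLE):
    """Storage needed for a multisampled colour + depth render target."""
    return width * height * samples * bytes_per_sample


def tiles_needed(width, height, samples):
    """Number of passes needed if each pass must fit in the eDRAM."""
    return math.ceil(framebuffer_bytes(width, height, samples) / EDRAM_BYTES)


needed_4x = framebuffer_bytes(1280, 720, 4)
tiles_4x = tiles_needed(1280, 720, 4)
tiles_no_msaa = tiles_needed(1280, 720, 1)
```

Under these assumptions a 1280×720 target with 4× MSAA needs 29,491,200 bytes, which exceeds the eDRAM and has to be split into three sections, while the same resolution without MSAA fits in one.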
Commercial products – Embedded
Because it needs relatively little external memory bandwidth and only a modest amount of on-chip memory, tiled rendering is a popular technology for embedded GPUs. Current examples include:
Tile-based immediate mode rendering (TBIM):
- Arm Mali[which?] series.[17]
- Qualcomm Adreno (series 300 and newer can also dynamically switch to immediate/direct mode rendering via FlexRender).[18][19][20]
Tile-based deferred rendering (TBDR):
- Arm Mali[which?] series.[21]
- Imagination Technologies PowerVR 5/6/7 series.[22]
- Broadcom VideoCore IV series.[23]
- Apple silicon GPUs.[24]
Vivante produces mobile GPUs which have tightly coupled frame buffer memory (similar to the Xbox 360 GPU described above). Although this can be used to render parts of the screen, the large size of the rendered regions means that they are not usually described as using a tile-based architecture.
See also
- Tessellation (computer graphics)
- Texture atlas
- Scanline rendering
- Tile-based video game
- Web Map Tile Service
References
- ^ Molnar, Steven (1994-04-01). "A Sorting Classification of Parallel Rendering" (PDF). IEEE. Archived (PDF) from the original on 2014-09-12. Retrieved 2012-08-24.
- ^ "PowerVR: A Master Class in Graphics Technology and Optimization" (PDF). Imagination Technologies. 2012-01-14. Archived (PDF) from the original on 2013-10-03. Retrieved 2014-01-11.
- ^ Deucher, Alex (2008-05-16). "How Video Cards Work". X.Org Foundation. Archived from the original on 2010-05-21. Retrieved 2010-05-27.
- ^ Mahaney, Jim (1998-06-22). "History". Pixel-Planes. University of North Carolina at Chapel Hill. Archived from the original on 2008-09-29. Retrieved 2008-08-04.
- ^ Fuchs, Henry (1989-07-01). "Pixel-planes 5: a heterogeneous multiprocessor graphics system using processor-enhanced memories". Proceedings of the 16th annual conference on Computer graphics and interactive techniques - SIGGRAPH '89. ACM. pp. 79–88. doi:10.1145/74333.74341. ISBN 0201504340. S2CID 1778124. Retrieved 2012-08-24.
- ^ Maclellan, Andrew (June 23, 1997). "Oak intros 3-D chip Warp 5 accelerator uses Talisman like rendering scheme". No. 1063. Electronic Buyers News.
- ^ Smith, Tony (1999-10-06). "GigaPixel takes on 3dfx, S3, Nvidia with... tiles". Gigapixel. The Register. Archived from the original on 2012-10-03. Retrieved 2012-08-24.
- ^ "Develop 2011: PS Vita is the most developer friendly hardware Sony has ever made". PS Vita. 3dsforums. 2011-07-21. Retrieved 2011-07-21.[permanent dead link]
- ^ Kanter, David (August 1, 2016). "Tile-based Rasterization in Nvidia GPUs". Real World Technologies. Archived from the original on 2016-08-04. Retrieved April 1, 2016.
- ^ "AMD Vega GPU Architecture Preview: Redesigned Memory Architecture". PC Perspective. 5 January 2017. Retrieved 2020-01-04.
- ^ Smith, Ryan. "The AMD Vega GPU Architecture Teaser: Higher IPC, Tiling, & More, Coming in H1'2017". www.anandtech.com. Retrieved 2020-01-04.
- ^ "Intel Processor Graphics Gen11 Architecture" (PDF). software.intel.com. Retrieved 2024-08-13.
- ^ @intelnews (8 May 2019). "Intel's @gregorymbryant at today's..." (Tweet) – via Twitter.
- ^ "Big-time entertainment in remarkably thin and light design" (PDF). Archived from the original (PDF) on 2019-05-28.
- ^ Meyer, Tara (29 September 2011). "XNA Game Studio 4.0 Refresh". msdn.microsoft.com. Archived from the original on 2015-01-07. Retrieved 2014-05-15.
- ^ "Xbox One developer: upcoming SDK improvements will allow for more 1080p games". 29 July 2023.
- ^ "Mali rendering strategy". ARM. Archived from the original on 2016-03-04. Retrieved 2018-10-27.
- ^ "An update on the freedreno graphics driver". lwn.net. Archived from the original on 2015-09-05. Retrieved 2015-09-15.
- ^ "The rise of mobile gaming on android" (PDF). Qualcomm. p. 5. Archived (PDF) from the original on 2014-11-09. Retrieved 17 September 2015.
- ^ Klug, Brian; Lal Shimpi, Anand (September 11, 2011). "Samsung Galaxy S 2 (International) Review - The Best, Redefined". www.anandtech.com. Retrieved 2020-01-04.
- ^ "Tile based rendering". Arm. Retrieved 2020-07-13.
- ^ "A look at the PowerVR graphics architecture: Tile-based rendering". Imagination Technologies. Archived from the original on 2015-04-05. Retrieved 2015-09-15.
- ^ "VideoCoreIV-AG100" (PDF). Broadcom. 2013-09-18. Archived (PDF) from the original on 2015-03-01. Retrieved 2015-01-10.
- ^ "Bring your Metal app to Apple Silicon Macs". developer.apple.com. Retrieved 2020-07-13.