U-Net
Latest revision as of 22:11, 2 January 2025
U-Net is a convolutional neural network that was developed for image segmentation.[1] The network is based on a fully convolutional neural network[2] whose architecture was modified and extended to work with fewer training images and to yield more precise segmentation. Using the U-Net architecture, segmentation of a 512 × 512 image takes less than a second on a GPU that was current in 2015.[1][3][4][5]
The U-Net architecture has also been employed in diffusion models for iterative image denoising.[6] This technology underlies many modern image generation models, such as DALL-E, Midjourney, and Stable Diffusion.
Description
The U-Net architecture stems from the so-called "fully convolutional network".[2]
The main idea is to supplement a usual contracting network with successive layers in which pooling operations are replaced by upsampling operators, so that these layers increase the resolution of the output. A subsequent convolutional layer can then learn to assemble a precise output based on this information.[1]
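As a rough illustration of the resolution change described above, the following sketch applies 2 × 2 max pooling to a small feature map and then doubles its size again. Nearest-neighbour repetition is used here purely as a simple stand-in for an upsampling operator, and the function names `max_pool_2x2` and `upsample_nearest_2x` are hypothetical, not taken from any U-Net implementation.

```python
def max_pool_2x2(grid):
    """Halve height and width by keeping the maximum of each 2x2 block."""
    return [
        [max(grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1])
         for c in range(0, len(grid[0]) - 1, 2)]
        for r in range(0, len(grid) - 1, 2)
    ]


def upsample_nearest_2x(grid):
    """Double height and width by repeating every value in a 2x2 block.

    Assumption: nearest-neighbour repetition stands in for the upsampling
    operator; it is not a learned operation.
    """
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # separate copy so rows are not aliased
    return out


feature_map = [
    [1, 2, 0, 1],
    [3, 4, 1, 0],
    [0, 1, 5, 6],
    [1, 0, 7, 8],
]

pooled = max_pool_2x2(feature_map)      # 2x2: [[4, 1], [1, 8]]
restored = upsample_nearest_2x(pooled)  # 4x4 again, but coarser in detail
```

The restored map has the original resolution but not the original detail, which is why a subsequent convolutional layer is needed to assemble a precise output.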
One important modification in U-Net is that there are a large number of feature channels in the upsampling part, which allow the network to propagate context information to higher-resolution layers. As a consequence, the expansive path is more or less symmetric to the contracting path and yields a u-shaped architecture. The network has no fully connected layers and uses only the valid part of each convolution.[2] Large images are segmented tile by tile; to predict the pixels in the border region of the image, the missing context is extrapolated by mirroring the input image. This tiling strategy is important for applying the network to large images, since otherwise the resolution would be limited by the GPU memory.
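The mirroring step can be sketched in a few lines. This is a minimal example on a 2D list, assuming reflection about the border pixel without repeating it; the helper name `mirror_pad` and the padding width are illustrative choices rather than details given in the text.

```python
def mirror_pad(image, pad):
    """Extend a 2D list on all four sides by mirroring it at the borders.

    Assumption: the border pixel itself is not repeated, so [1, 2, 3]
    padded by 1 becomes [2, 1, 2, 3, 2].
    """
    if pad < 0 or pad >= len(image) or pad >= len(image[0]):
        raise ValueError("pad must be non-negative and smaller than each image dimension")

    def reflect(seq):
        left = seq[1:pad + 1][::-1]        # mirrored values before the start
        right = seq[-pad - 1:-1][::-1]     # mirrored values after the end
        return left + seq + right

    rows = [reflect(list(row)) for row in image]  # mirror left and right
    return [list(row) for row in reflect(rows)]   # mirror top and bottom


tile = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
padded = mirror_pad(tile, 1)
for row in padded:
    print(row)
# [5, 4, 5, 6, 5]
# [2, 1, 2, 3, 2]
# [5, 4, 5, 6, 5]
# [8, 7, 8, 9, 8]
# [5, 4, 5, 6, 5]
```

Padding an input tile in this way gives the border pixels the surrounding context that valid convolutions would otherwise lack.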
Network architecture
The network consists of a contracting path and an expansive path, which gives it the u-shaped architecture. The contracting path is a typical convolutional network that consists of repeated application of convolutions, each followed by a rectified linear unit (ReLU) and a max pooling operation. During the contraction, the spatial information is reduced while feature information is increased. The expansive path combines the feature and spatial information through a sequence of up-convolutions and concatenations with high-resolution features from the contracting path.[7]
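Because only the valid part of each convolution is used, the output of such a network is smaller than its input. The sketch below traces the spatial size through the two paths, assuming the configuration of the original paper (four resolution levels, two unpadded 3 × 3 convolutions per level, 2 × 2 max pooling and 2 × 2 up-convolutions); these defaults and the function name `unet_output_size` are assumptions that are not stated in this article.

```python
def unet_output_size(input_size, depth=4, convs_per_level=2, kernel=3):
    """Return the output width (or height) of a U-Net with valid convolutions.

    Assumed layout: `depth` pooling steps on the way down, one bottleneck
    level, and `depth` up-convolutions on the way up. Every level applies
    `convs_per_level` unpadded convolutions with a square `kernel`.
    """
    shrink = convs_per_level * (kernel - 1)  # pixels lost per level
    size = input_size

    # Contracting path: convolutions, then 2x2 max pooling.
    for _ in range(depth):
        size -= shrink
        if size <= 0 or size % 2:
            raise ValueError("size before pooling must be positive and even")
        size //= 2

    # Bottleneck level: convolutions only.
    size -= shrink

    # Expansive path: 2x2 up-convolution doubles the size, then convolutions.
    for _ in range(depth):
        size = size * 2 - shrink

    if size <= 0:
        raise ValueError("input is too small for this depth")
    return size


print(unet_output_size(572))  # 388
```

Under these assumptions a 572 × 572 input tile yields a 388 × 388 segmentation map, so input tiles have to overlap for the output tiles to cover a large image without gaps.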
Applications
There are many applications of U-Net in biomedical image segmentation, such as brain image segmentation (BraTS[8]) and liver image segmentation (SLIVER07[9]), as well as protein binding site prediction.[10] U-Net implementations have also found use in the physical sciences, for example in the analysis of micrographs of materials.[11][12][13] Variations of the U-Net have also been applied to medical image reconstruction.[14] Variants and applications of U-Net include:
- Pixel-wise regression using U-Net and its application on pansharpening;[15]
- 3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation;[16]
- TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation;[17]
- Image-to-image translation to estimate fluorescent stains;[18]
- Binding site prediction of protein structure.[10]
History
U-Net was created by Olaf Ronneberger, Philipp Fischer, and Thomas Brox in 2015 and reported in the paper "U-Net: Convolutional Networks for Biomedical Image Segmentation".[1] It is an improvement on and development of the fully convolutional network (FCN) described by Evan Shelhamer, Jonathan Long, and Trevor Darrell in "Fully Convolutional Networks for Semantic Segmentation" (2014).[2]
References
- Ronneberger O, Fischer P, Brox T (2015). "U-Net: Convolutional Networks for Biomedical Image Segmentation". arXiv:1505.04597 [cs.CV].
- Shelhamer E, Long J, Darrell T (Nov 2014). "Fully Convolutional Networks for Semantic Segmentation". IEEE Transactions on Pattern Analysis and Machine Intelligence. 39 (4): 640–651. arXiv:1411.4038. doi:10.1109/TPAMI.2016.2572683. PMID 27244717. S2CID 1629541.
- Nazem, Fatemeh; Ghasemi, Fahimeh; Fassihi, Afshin; Mehri Dehnavi, Alireza (2021). "3D U-Net: A Voxel-based method in binding site prediction of protein structure". Journal of Bioinformatics and Computational Biology. 19 (2). doi:10.1142/S0219720021500062. PMID 33866960.
- Nazem, Fatemeh; Ghasemi, Fahimeh; Fassihi, Afshin; Mehri Dehnavi, Alireza (2023). "A GU-Net-Based Architecture Predicting Ligand–Protein-Binding Atoms". Journal of Medical Signals & Sensors. 13 (1): 1–10. doi:10.4103/jmss.jmss_142_21. PMC 10246592. PMID 37292445.
- Nazem, Fatemeh; Ghasemi, Fahimeh; Fassihi, Afshin; Mehri Dehnavi, Alireza (2024). "Deep attention network for identifying ligand-protein binding sites". Journal of Computational Science. 81. doi:10.1016/j.jocs.2024.102368.
- Ho, Jonathan (2020). "Denoising Diffusion Probabilistic Models". arXiv:2006.11239 [cs.LG].
- "U-Net code".
- "MICCAI BraTS 2017: Scope | Section for Biomedical Image Analysis (SBIA) | Perelman School of Medicine at the University of Pennsylvania". www.med.upenn.edu. Retrieved 2018-12-24.
- "SLIVER07 : Home". www.sliver07.org. Retrieved 2018-12-24.
- Nazem F, Ghasemi F, Fassihi A, Dehnavi AM (April 2021). "3D U-Net: A voxel-based method in binding site prediction of protein structure". Journal of Bioinformatics and Computational Biology. 19 (2): 2150006. doi:10.1142/S0219720021500062. PMID 33866960. S2CID 233300145.
- Chen, Fu-Xiang Rikudo; Lin, Chia-Yu; Siao, Hui-Ying; Jian, Cheng-Yuan; Yang, Yong-Cheng; Lin, Chun-Liang (2023-02-14). "Deep learning based atomic defect detection framework for two-dimensional materials". Scientific Data. 10 (1): 91. Bibcode:2023NatSD..10...91C. doi:10.1038/s41597-023-02004-6. ISSN 2052-4463. PMC 9929095. PMID 36788235.
- Shi, Peng; Duan, Mengmeng; Yang, Lifang; Feng, Wei; Ding, Lianhong; Jiang, Liwu (2022-06-22). "An Improved U-Net Image Segmentation Method and Its Application for Metallic Grain Size Statistics". Materials. 15 (13): 4417. Bibcode:2022Mate...15.4417S. doi:10.3390/ma15134417. ISSN 1996-1944. PMC 9267311. PMID 35806543.
- Patrick, Matthew J; Eckstein, James K; Lopez, Javier R; Toderas, Silvia; Asher, Sarah A; Whang, Sylvia I; Levine, Stacey; Rickman, Jeffrey M; Barmak, Katayun (2023-11-15). "Automated Grain Boundary Detection for Bright-Field Transmission Electron Microscopy Images via U-Net". Microscopy and Microanalysis. 29 (6): 1968–1979. arXiv:2312.09392. doi:10.1093/micmic/ozad115. ISSN 1431-9276. PMID 37966960.
- Andersson J, Ahlström H, Kullberg J (September 2019). "Separation of water and fat signal in whole-body gradient echo scans using convolutional neural networks". Magnetic Resonance in Medicine. 82 (3): 1177–1186. doi:10.1002/mrm.27786. PMC 6618066. PMID 31033022.
- Yao W, Zeng Z, Lian C, Tang H (2018-10-27). "Pixel-wise regression using U-Net and its application on pansharpening". Neurocomputing. 312: 364–371. doi:10.1016/j.neucom.2018.05.103. ISSN 0925-2312. S2CID 207119255.
- Çiçek Ö, Abdulkadir A, Lienkamp SS, Brox T, Ronneberger O (2016). "3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation". arXiv:1606.06650 [cs.CV].
- Iglovikov V, Shvets A (2018). "TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation". arXiv:1801.05746 [cs.CV].
- Kandel ME, He YR, Lee YJ, Chen TH, Sullivan KM, Aydin O, et al. (December 2020). "Phase imaging with computational specificity (PICS) for measuring dry mass changes in sub-cellular compartments". Nature Communications. 11 (1): 6256. arXiv:2002.08361. Bibcode:2020NatCo..11.6256K. doi:10.1038/s41467-020-20062-x. PMC 7721808. PMID 33288761.
Implementations
- Tensorflow Unet by J Akeret (2017)
- U-Net source code from Pattern Recognition and Image Processing at Computer Science Department of the University of Freiburg, Germany.