Saliency map
In computer vision, a saliency map is an image that shows each pixel's unique quality.[citation needed] The goal of a saliency map is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. For example, if a pixel has a high grey level or another unique color quality in a color image, that pixel's quality will show in the saliency map in an obvious way. Saliency is a kind of image segmentation.
The result of a saliency map is a set of contours extracted from the image (see edge detection). Each of the pixels in a region is similar with respect to some characteristic or computed property, such as color, intensity, or texture.
Saliency background
Saliency is a part of image segmentation. In computer vision, image segmentation is the process of partitioning a digital image into multiple segments (sets of pixels, also known as superpixels). The goal of segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze. Image segmentation is typically used to locate objects and boundaries (lines, curves, etc.) in images. More precisely, image segmentation is the process of assigning a label to every pixel in an image such that pixels with the same label share certain characteristics.
The result of image segmentation is a set of segments that collectively cover the entire image, or a set of contours extracted from the image (see edge detection). Each of the pixels in a region is similar with respect to some characteristic or computed property, such as color, intensity, or texture. Adjacent regions are significantly different with respect to the same characteristic(s). When applied to a stack of images, typical in medical imaging, the resulting contours after image segmentation can be used to create 3D reconstructions with the help of interpolation algorithms like marching cubes.
Example implementation
First, we calculate the distance from each pixel's value to the values of all the other pixels in the same frame, and then sum these absolute differences. We use I to denote a pixel value, which gives the following equation:
- $\mathrm{SALS}(I_k) = \sum_{i=1}^{N} \left| I_k - I_i \right|$
Here $I_i$ denotes the value of each other pixel in the current frame, and each pixel value $I_k$ lies in the range $[0, 255]$. The following equation is the expanded form:
- $\mathrm{SALS}(I_k) = |I_k - I_1| + |I_k - I_2| + \cdots + |I_k - I_N|$
where $N$ is the total number of pixels in the current frame. We can then restructure the formula by grouping together the terms that share the same value of $I$:
- $\mathrm{SALS}(I_k) = \sum_{n=0}^{255} F_n \left| I_k - I_n \right|$
where $F_n$ is the frequency of the value $I_n$, and $n$ ranges over $[0, 255]$. The frequencies are expressed in the form of a histogram, which can be computed in $O(N)$ time.
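The grouped equation can be evaluated directly from the histogram. The following is a minimal MATLAB sketch of that computation, not taken from the source; the variable names (frame, counts, sals, salmap) and the use of histcounts and mat2gray are our assumptions, and frame is assumed to be a single-channel uint8 image with values in [0, 255].

    % Histogram-based saliency for one grey-scale frame (a sketch, not the article's code).
    counts = histcounts(frame(:), 0:256);               % F_n: frequency of each grey level n = 0..255
    sals = zeros(1, 256);                               % SALS(k) for every possible pixel value k
    for k = 0:255
        sals(k + 1) = sum(counts .* abs(k - (0:255)));  % sum over n of F_n * |k - n|
    end
    salmap = sals(double(frame) + 1);                   % look up each pixel's saliency value
    salmap = mat2gray(salmap);                          % normalise to [0, 1] for display

The histogram pass costs $O(N)$, and the double loop over grey levels is the constant $256 \times 256$ amount of work counted in the next section.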
Time complexity
This saliency map algorithm has $O(N)$ time complexity. Computing the histogram takes $O(N)$ time, where $N$ is the number of pixels in a frame, and the subtraction and multiplication in the grouped equation need 256 operations for each of the 256 possible pixel values. Consequently, the time complexity of this algorithm is $O(N) + O(256 \times 256)$, which equals $O(N)$.
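Written as one line (our restatement of the count above):

- $\underbrace{O(N)}_{\text{histogram}} + \underbrace{O(256 \times 256)}_{\text{grouped sum}} = O(N)$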
Pseudo code
All of the following code is pseudo-MATLAB code. First, read the data from the video sequence.
    for k = 2:1:13                          % from frame 2 to frame 13; k increases by one each loop
        I  = imread(currentfilename);       % read the current frame
        I1 = im2single(I);                  % convert the image to single (required by vl_slic)
        l  = imread(lastfilename);          % read the previous frame
        I2 = im2single(l);
        regionSize  = 10;                   % SLIC superpixel size (experimentally chosen parameter)
        regularizer = 1;                    % SLIC regularizer parameter
        segments1 = vl_slic(I1, regionSize, regularizer);   % superpixels of the current frame
        segments2 = vl_slic(I2, regionSize, regularizer);   % superpixels of the previous frame
        numsuppix = max(segments1(:));      % number of superpixels; see http://www.vlfeat.org/overview/slic.html
        regstats1 = regionprops(segments1, 'all');   % region characteristics of the current frame
        regstats2 = regionprops(segments2, 'all');   % region characteristics of the previous frame
        % ... (the loop body continues with the processing steps below)
After we read the data, we run the superpixel step on each frame. spnum1 and spnum2 denote the superpixel counts of the current frame and the previous frame, respectively.
First, we calculate the centre distance between each pair of superpixels. This is the core code:

    for i = 1:1:spnum1                      % over the superpixels of the current frame; i increases each loop
        for j = 1:1:spnum2                  % over the superpixels of the previous frame; j increases each loop
            centredist(i, j) = sum((center(i) - center(j)));   % calculate the centre distance
        end
    end
Then we calculate the color distance of each superpixel pair; we call this process the contrast function.
    for i = 1:1:spnum1                      % over the superpixels of the current frame
        for j = 1:1:spnum2                  % over the superpixels of the previous frame
            posdiff(i, j) = sum((regstats1(j).Centroid' - mupwtd(:, i)));   % calculate the color distance
        end
    end
After these two processes, we get a saliency map, and then store all of these maps into a new file folder.
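The article does not show how the pairwise distances are turned into an image or written out. As one possible reading, here is a minimal sketch that assigns each superpixel a saliency value (here simply the row sum of posdiff, an assumption on our part), paints it back onto the pixel grid, and saves the map; outfolder and the file-name pattern are hypothetical.

    saliency = sum(abs(posdiff), 2);        % one value per current-frame superpixel (assumed combination)
    salmap = zeros(size(segments1), 'single');
    for i = 1:spnum1
        salmap(segments1 == i) = saliency(i);   % every pixel inherits its superpixel's saliency
    end
    salmap = mat2gray(salmap);              % normalise to [0, 1]
    outfolder = 'saliencymaps';             % hypothetical output folder
    if ~exist(outfolder, 'dir'), mkdir(outfolder); end
    imwrite(salmap, fullfile(outfolder, sprintf('frame%02d.png', k)));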
Difference in algorithms
The major difference between the first and second saliency functions lies in the contrast function. If spnum1 and spnum2 both refer to the number of superpixels in the current frame, the contrast function belongs to the first saliency function. If spnum1 refers to the current frame's superpixel count and spnum2 to the previous frame's, the contrast function belongs to the second saliency function. For the third saliency function, we use the contrast function that takes centre distances within the same frame to produce a per-frame saliency map, apply it to every frame, and then subtract the previous frame's saliency map from the current frame's map; the resulting image is the result of the third saliency function.
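The frame-differencing step of the third saliency function is then a single subtraction per frame. A minimal sketch, assuming salmap1 and salmap2 are the saliency maps of the current and previous frames computed as above:

    saldiff = salmap1 - salmap2;            % current frame's map minus the previous frame's map
    saldiff = mat2gray(saldiff);            % renormalise the difference image
    % saldiff is the result of the third saliency function for this frame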
References
- Zhai, Yun; Shah, Mubarak (2006-10-23). "Visual Attention Detection in Video Sequences Using Spatiotemporal Cues". Proceedings of the 14th ACM International Conference on Multimedia. MM '06. New York, NY, USA: ACM: 815–824. doi:10.1145/1180639.1180824. ISBN 1595934472.
- VLFeat: http://www.vlfeat.org/index.html