Submodular set function: Difference between revisions
Mathreader17 (talk | contribs) add reference on Lovász extension |
Mathreader17 (talk | contribs) Add citation to budget-additive function |
||
Line 15: | Line 15: | ||
satisfies the first condition above, but the second condition fails when <math>S</math> and <math>T</math> are infinite sets with finite intersection. |
satisfies the first condition above, but the second condition fails when <math>S</math> and <math>T</math> are infinite sets with finite intersection. |
||
== Types of submodular functions == |
== Types and examples of submodular functions == |
||
=== Monotone === |
=== Monotone === |
||
A submodular function <math>f</math> is ''monotone'' if for every <math>T\subseteq S</math> we have that <math>f(T)\leq f(S)</math>. Examples of monotone submodular functions include: |
A submodular function <math>f</math> is ''monotone'' if for every <math>T\subseteq S</math> we have that <math>f(T)\leq f(S)</math>. Examples of monotone submodular functions include: |
||
; Linear (Modular) functions : Any function of the form <math>f(S)=\sum_{i\in S}w_i</math> is called a linear function. Additionally if <math>\forall i,w_i\geq 0</math> then f is monotone. |
; Linear (Modular) functions : Any function of the form <math>f(S)=\sum_{i\in S}w_i</math> is called a linear function. Additionally if <math>\forall i,w_i\geq 0</math> then f is monotone. |
||
; Budget-additive functions : Any function of the form <math>f(S)=\min\left\{B,~\sum_{i\in S}w_i\right\}</math> for each <math>w_i\geq 0</math> and <math>B\geq 0</math> is called budget additive. |
; [[Budget-additive valuation|Budget-additive functions]] : Any function of the form <math>f(S)=\min\left\{B,~\sum_{i\in S}w_i\right\}</math> for each <math>w_i\geq 0</math> and <math>B\geq 0</math> is called budget additive.<ref name="BF" /> |
||
; Coverage functions : Let <math>\Omega=\{E_1,E_2,\ldots,E_n\}</math> be a collection of subsets of some [[matroid|ground set]] <math>\Omega'</math>. The function <math>f(S)=\left|\bigcup_{E_i\in S}E_i\right|</math> for <math>S\subseteq \Omega</math> is called a coverage function. This can be generalized by adding non-negative weights to the elements. |
; Coverage functions : Let <math>\Omega=\{E_1,E_2,\ldots,E_n\}</math> be a collection of subsets of some [[matroid|ground set]] <math>\Omega'</math>. The function <math>f(S)=\left|\bigcup_{E_i\in S}E_i\right|</math> for <math>S\subseteq \Omega</math> is called a coverage function. This can be generalized by adding non-negative weights to the elements. |
||
; [[Entropy (information theory)|Entropy]] : Let <math>\Omega=\{X_1,X_2,\ldots,X_n\}</math> be a set of [[random variables]]. Then for any <math>S\subseteq \Omega</math> we have that <math>H(S)</math> is a submodular function, where <math>H(S)</math> is the entropy of the set of random variables <math>S</math>, a fact known as [[Entropic vector#Shannon-type inequalities and Γn|Shannon's inequality]].<ref>{{Cite web|url = https://www.cs.cmu.edu/~aarti/Class/10704_Spring15/lecs/lec3.pdf|title = Information Processing and Learning|publisher = cmu}}</ref> Further inequalities for the entropy function are known to hold, see [[entropic vector]]. |
; [[Entropy (information theory)|Entropy]] : Let <math>\Omega=\{X_1,X_2,\ldots,X_n\}</math> be a set of [[random variables]]. Then for any <math>S\subseteq \Omega</math> we have that <math>H(S)</math> is a submodular function, where <math>H(S)</math> is the entropy of the set of random variables <math>S</math>, a fact known as [[Entropic vector#Shannon-type inequalities and Γn|Shannon's inequality]].<ref>{{Cite web|url = https://www.cs.cmu.edu/~aarti/Class/10704_Spring15/lecs/lec3.pdf|title = Information Processing and Learning|publisher = cmu}}</ref> Further inequalities for the entropy function are known to hold, see [[entropic vector]]. |
||
Line 40: | Line 40: | ||
== Continuous extensions == |
== Continuous extensions == |
||
=== Generic definition === |
|||
For a submodular function <math>f:2^{\Omega}\rightarrow \mathbb{R}</math> with <math>|\Omega|=n</math>, we can associate each <math>S\subseteq \Omega</math> with a binary vector <math>x^{S}\in \{0, 1\}^{n}</math> such that <math>x_{i}^{S}=1</math> when <math>i\in S</math>, and <math>x_{i}^{S}=0</math> otherwise. In this sense, the submodular function can be seen as a function defined on <math>\{0, 1\}^{n}</math>. We can then define the continuous [[Restriction_(mathematics)#Extension_of_a_function|extension]] of <math>f</math> to be any continuous function <math>F:\mathbb{R}^{n}\rightarrow \mathbb{R}</math> such that it matches the value of the submodular function on <math>x\in \{0, 1\}^{n}</math>, i.e. <math>F(x^{S})=f(S)</math>. Some commonly used continuous extensions are as follows. |
For a submodular function <math>f:2^{\Omega}\rightarrow \mathbb{R}</math> with <math>|\Omega|=n</math>, we can associate each <math>S\subseteq \Omega</math> with a binary vector <math>x^{S}\in \{0, 1\}^{n}</math> such that <math>x_{i}^{S}=1</math> when <math>i\in S</math>, and <math>x_{i}^{S}=0</math> otherwise. In this sense, the submodular function can be seen as a function defined on <math>\{0, 1\}^{n}</math>. We can then define the continuous [[Restriction_(mathematics)#Extension_of_a_function|extension]] of <math>f</math> to be any continuous function <math>F:\mathbb{R}^{n}\rightarrow \mathbb{R}</math> such that it matches the value of the submodular function on <math>x\in \{0, 1\}^{n}</math>, i.e. <math>F(x^{S})=f(S)</math>. Some commonly used continuous extensions are as follows. |
||
Line 106: | Line 107: | ||
<ref name="FNS">M. Feldman, J. Naor and R. Schwartz, A unified continuous greedy algorithm for submodular maximization, Proc. of 52nd FOCS (2011).</ref> |
<ref name="FNS">M. Feldman, J. Naor and R. Schwartz, A unified continuous greedy algorithm for submodular maximization, Proc. of 52nd FOCS (2011).</ref> |
||
<ref name="L">{{cite journal |author-link1=László Lovász |last1=Lovász |first1=L. |date=1983 |title=Submodular functions and convexity |url= |journal=Mathematical Programming The State of the Art |pages=235-257 |doi=10.1007/978-3-642-68874-4_10 }}</ref> |
<ref name="L">{{cite journal |author-link1=László Lovász |last1=Lovász |first1=L. |date=1983 |title=Submodular functions and convexity |url= |journal=Mathematical Programming The State of the Art |pages=235-257 |doi=10.1007/978-3-642-68874-4_10 }}</ref> |
||
<ref name="BF">{{cite encyclopedia |last1=Buchbinder |first1=Niv |last2=Feldman |first2=Moran |title=Submodular Functions Maximization Problems |encyclopedia= Handbook of Approximation Algorithms and Metaheuristics, Second Edition: Methodologies and Traditional Applications |year=2018 |editor1-last=Gonzalez |editor1-first=Teofilo F. |publisher=Chapman and Hall/CRC |doi=10.1201/9781351236423 |url=https://www.taylorfrancis.com/chapters/edit/10.1201/9781351236423-42/submodular-functions-maximization-problems-niv-buchbinder-moran-feldman}}</ref> |
|||
}} |
}} |
||
Revision as of 18:43, 21 October 2022
In mathematics, a submodular set function (also known as a submodular function) is a set function that, informally, exhibits diminishing returns: the incremental value gained by adding a single element to an input set decreases as the input set grows. This diminishing returns property makes submodular functions suitable for many applications, including approximation algorithms, game theory (as functions modeling user preferences) and electrical networks. More recently, submodular functions have also proved useful in several real-world problems in machine learning and artificial intelligence, including automatic summarization, multi-document summarization, feature selection, active learning, sensor placement, image collection summarization and many other domains.[1][2][3][4]
Definition
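If <math>\Omega</math> is a finite set, a submodular function is a set function <math>f:2^{\Omega}\rightarrow \mathbb{R}</math>, where <math>2^{\Omega}</math> denotes the power set of <math>\Omega</math>, which satisfies one of the following equivalent conditions.[5]
- For every <math>X, Y \subseteq \Omega</math> with <math>X \subseteq Y</math> and every <math>x \in \Omega \setminus Y</math> we have that <math>f(X\cup \{x\})-f(X)\geq f(Y\cup \{x\})-f(Y)</math>.
- For every <math>S, T \subseteq \Omega</math> we have that <math>f(S)+f(T)\geq f(S\cup T)+f(S\cap T)</math>.
- For every <math>X\subseteq \Omega</math> and <math>x_1,x_2\in \Omega\setminus X</math> such that <math>x_1\neq x_2</math> we have that <math>f(X\cup \{x_1\})+f(X\cup \{x_2\})\geq f(X\cup \{x_1,x_2\})+f(X)</math>.
A nonnegative submodular function is also a subadditive function, but a subadditive function need not be submodular. If <math>\Omega</math> is not assumed finite, then the above conditions are not equivalent. In particular a function <math>f</math> defined by <math>f(S)=1</math> if <math>S</math> is finite and <math>f(S)=0</math> if <math>S</math> is infinite satisfies the first condition above, but the second condition fails when <math>S</math> and <math>T</math> are infinite sets with finite intersection.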
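For a small finite ground set, the first two conditions can be checked by brute force. The following is a minimal sketch rather than a practical algorithm, since it enumerates all pairs of subsets. The helper names `powerset`, `is_submodular` and `has_diminishing_returns`, the tolerance `tol`, and the two test functions are illustrative assumptions, not part of the article.

```python
from itertools import combinations

def powerset(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)

def is_submodular(f, ground, tol=1e-12):
    """Second condition: f(S) + f(T) >= f(S | T) + f(S & T) for all S, T."""
    subsets = list(powerset(ground))
    return all(
        f(S) + f(T) + tol >= f(S | T) + f(S & T)
        for S in subsets
        for T in subsets
    )

def has_diminishing_returns(f, ground, tol=1e-12):
    """First condition: the gain of adding x to X is at least the gain of adding x to Y when X is a subset of Y."""
    ground = frozenset(ground)
    subsets = list(powerset(ground))
    for X in subsets:
        for Y in subsets:
            if not X <= Y:
                continue
            for x in ground - Y:
                if f(X | {x}) - f(X) + tol < f(Y | {x}) - f(Y):
                    return False
    return True

ground = {1, 2, 3, 4}
sqrt_card = lambda S: len(S) ** 0.5    # concave in |S| (assumed example)
square_card = lambda S: len(S) ** 2    # convex in |S| (assumed counterexample)

print(is_submodular(sqrt_card, ground), has_diminishing_returns(sqrt_card, ground))
print(is_submodular(square_card, ground), has_diminishing_returns(square_card, ground))
```

On this example the two checks agree with each other, as the equivalence of the conditions on a finite ground set requires.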
Types and examples of submodular functions
Monotone
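A submodular function <math>f</math> is monotone if for every <math>T\subseteq S</math> we have that <math>f(T)\leq f(S)</math>. Examples of monotone submodular functions include:
- Linear (Modular) functions
- Any function of the form <math>f(S)=\sum_{i\in S}w_i</math> is called a linear function. Additionally if <math>\forall i,w_i\geq 0</math> then <math>f</math> is monotone.
- Budget-additive functions
- Any function of the form <math>f(S)=\min\left\{B,~\sum_{i\in S}w_i\right\}</math> for each <math>w_i\geq 0</math> and <math>B\geq 0</math> is called budget additive.[6]
- Coverage functions
- Let <math>\Omega=\{E_1,E_2,\ldots,E_n\}</math> be a collection of subsets of some ground set <math>\Omega'</math>. The function <math>f(S)=\left|\bigcup_{E_i\in S}E_i\right|</math> for <math>S\subseteq \Omega</math> is called a coverage function. This can be generalized by adding non-negative weights to the elements.
- Entropy
- Let <math>\Omega=\{X_1,X_2,\ldots,X_n\}</math> be a set of random variables. Then for any <math>S\subseteq \Omega</math> we have that <math>H(S)</math> is a submodular function, where <math>H(S)</math> is the entropy of the set of random variables <math>S</math>, a fact known as Shannon's inequality.[7] Further inequalities for the entropy function are known to hold, see entropic vector.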
- Matroid rank functions
- Let <math>\Omega=\{e_1,e_2,\dots,e_n\}</math> be the ground set on which a matroid is defined. Then the rank function of the matroid is a submodular function.[8]
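Two of the monotone examples above, coverage and budget-additive functions, can be written down directly. The sketch below is only an illustration: the helper names `coverage`, `budget_additive` and `marginal_gain`, and the sample sets, weights and budget, are assumptions chosen so that the diminishing marginal gain is easy to see.

```python
def coverage(sets):
    """Return f with f(S) = size of the union of sets[i] for i in S."""
    def f(S):
        covered = set()
        for i in S:
            covered |= sets[i]
        return len(covered)
    return f

def budget_additive(weights, budget):
    """Return f with f(S) = min(budget, sum of weights[i] for i in S)."""
    def f(S):
        return min(budget, sum(weights[i] for i in S))
    return f

def marginal_gain(f, S, x):
    """Increase in f when element x is added to S."""
    return f(set(S) | {x}) - f(set(S))

# Hypothetical data: three named subsets of the ground set {1, ..., 6}.
sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}}
cov = coverage(sets)
gain_small = marginal_gain(cov, set(), "b")        # "b" added to the empty collection
gain_large = marginal_gain(cov, {"a", "c"}, "b")   # "b" added to a larger collection

capped = budget_additive({"a": 3, "b": 4, "c": 5}, budget=6)
print(gain_small, gain_large, capped({"a"}), capped({"a", "b"}), capped({"a", "b", "c"}))
```

Adding "b" to the empty collection covers two new elements, while adding it to the collection that already contains "a" and "c" covers nothing new.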
Non-monotone
A submodular function that is not monotone is called non-monotone.
Symmetric
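A non-monotone submodular function <math>f</math> is called symmetric if for every <math>S\subseteq \Omega</math> we have that <math>f(S)=f(\Omega\setminus S)</math>. Examples of symmetric non-monotone submodular functions include:
- Graph cuts
- Let <math>\Omega=\{v_1,v_2,\dots,v_n\}</math> be the vertices of a graph. For any set of vertices <math>S\subseteq \Omega</math> let <math>f(S)</math> denote the number of edges <math>e=(u,v)</math> such that <math>u\in S</math> and <math>v\in \Omega\setminus S</math>. This can be generalized by adding non-negative weights to the edges.
- Mutual information
- Let <math>\Omega=\{X_1,X_2,\ldots,X_n\}</math> be a set of random variables. Then for any <math>S\subseteq \Omega</math> we have that <math>f(S)=I(S;\Omega\setminus S)</math> is a submodular function, where <math>I(S;\Omega\setminus S)</math> is the mutual information.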
Asymmetric
A non-monotone submodular function which is not symmetric is called asymmetric.
- Directed cuts
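- Let <math>\Omega=\{v_1,v_2,\dots,v_n\}</math> be the vertices of a directed graph. For any set of vertices <math>S\subseteq \Omega</math> let <math>f(S)</math> denote the number of edges <math>e=(u,v)</math> such that <math>u\in S</math> and <math>v\in \Omega\setminus S</math>. This can be generalized by adding non-negative weights to the directed edges.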
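The difference between the symmetric and asymmetric cases can be seen on a three-vertex example. This is a small sketch under assumed data: the function names `undirected_cut` and `directed_cut` and the edge list are hypothetical, and the same edge list is read once as undirected and once as directed.

```python
def undirected_cut(edges):
    """Return f with f(S) = number of edges having exactly one endpoint in S."""
    def f(S):
        S = set(S)
        return sum(1 for u, v in edges if (u in S) != (v in S))
    return f

def directed_cut(edges):
    """Return f with f(S) = number of directed edges (u, v) leaving S."""
    def f(S):
        S = set(S)
        return sum(1 for u, v in edges if u in S and v not in S)
    return f

vertices = {1, 2, 3}
edges = [(1, 2), (2, 3), (1, 3)]   # assumed example graph

und = undirected_cut(edges)
dire = directed_cut(edges)

# Symmetric: a set and its complement have the same undirected cut value.
print(und({1}), und(vertices - {1}))
# Asymmetric: the directed cut value depends on the direction of crossing edges.
print(dire({1}), dire(vertices - {1}))
# Non-monotone: the value rises from the empty set and falls back at the full set.
print(und(set()), und({1}), und(vertices))
```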
Continuous extensions
Generic definition
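For a submodular function <math>f:2^{\Omega}\rightarrow \mathbb{R}</math> with <math>|\Omega|=n</math>, we can associate each <math>S\subseteq \Omega</math> with a binary vector <math>x^{S}\in \{0, 1\}^{n}</math> such that <math>x_{i}^{S}=1</math> when <math>i\in S</math>, and <math>x_{i}^{S}=0</math> otherwise. In this sense, the submodular function can be seen as a function defined on <math>\{0, 1\}^{n}</math>. We can then define the continuous extension of <math>f</math> to be any continuous function <math>F:\mathbb{R}^{n}\rightarrow \mathbb{R}</math> such that it matches the value of the submodular function on <math>x\in \{0, 1\}^{n}</math>, i.e. <math>F(x^{S})=f(S)</math>. Some commonly used continuous extensions are as follows.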
Lovász extension
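This extension is named after mathematician László Lovász.[9] Consider any vector <math>\mathbf{x}=\{x_1,x_2,\dots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the Lovász extension is defined as <math>f^L(\mathbf{x})=\mathbb{E}(f(\{i|x_i\geq \lambda\}))</math> where the expectation is over <math>\lambda</math> chosen from the uniform distribution on the interval <math>[0,1]</math>. The Lovász extension is a convex function if and only if <math>f</math> is a submodular function.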
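Because the threshold set only changes when the threshold crosses one of the coordinates, the expectation over a uniform threshold on the unit interval reduces to a finite sum over the coordinates sorted in decreasing order. The sketch below evaluates the extension this way. The function name `lovasz_extension`, the dictionary representation of the vector, and the example function (the size of the set capped at 2) are assumptions made for illustration.

```python
def lovasz_extension(f, x):
    """Evaluate E[f({i : x[i] >= t})] for t uniform on [0, 1].

    `x` maps each ground-set element to a value in [0, 1].
    """
    order = sorted(x, key=x.get, reverse=True)
    if not order:
        return float(f(frozenset()))
    # Thresholds above the largest coordinate select the empty set.
    value = (1.0 - x[order[0]]) * f(frozenset())
    prefix = set()
    for k, elem in enumerate(order):
        prefix.add(elem)
        # Thresholds between the next coordinate and this one select `prefix`.
        next_value = x[order[k + 1]] if k + 1 < len(order) else 0.0
        value += (x[elem] - next_value) * f(frozenset(prefix))
    return value

f = lambda S: min(len(S), 2)   # assumed example function

on_vertex = lovasz_extension(f, {"a": 1.0, "b": 0.0, "c": 1.0})
in_middle = lovasz_extension(f, {"a": 0.5, "b": 0.5, "c": 0.5})
print(on_vertex, in_middle)
```

At a 0/1 vector the result coincides with the value of the set function on the corresponding set, which is the defining property of an extension.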
Multilinear extension
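Consider any vector <math>\mathbf{x}=\{x_1,x_2,\ldots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the multilinear extension is defined as <math>F(\mathbf{x})=\sum_{S\subseteq \Omega} f(S) \prod_{i\in S} x_i \prod_{i\notin S} (1-x_i)</math>.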
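The multilinear extension is the expected value of the function on a random set that contains each element independently with the probability given by its coordinate. For a small ground set it can be evaluated exactly by enumerating all subsets, as in the sketch below. The name `multilinear_extension`, the dictionary representation of the vector, and the example function are illustrative assumptions, and the enumeration takes exponential time.

```python
from itertools import combinations

def multilinear_extension(f, x):
    """Sum f(S) * P(S) over all subsets S, with independent inclusion probabilities x[i]."""
    elems = list(x)
    total = 0.0
    for r in range(len(elems) + 1):
        for combo in combinations(elems, r):
            S = frozenset(combo)
            prob = 1.0
            for e in elems:
                prob *= x[e] if e in S else 1.0 - x[e]
            total += prob * f(S)
    return total

f = lambda S: min(len(S), 1)   # assumed example: 1 if S is non-empty, else 0

half = multilinear_extension(f, {"a": 0.5, "b": 0.5})
vertex = multilinear_extension(f, {"a": 1.0, "b": 0.0})
print(half, vertex)
```

With both coordinates equal to one half, the value is the probability that the random set is non-empty.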
Convex closure
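Consider any vector <math>\mathbf{x}=\{x_1,x_2,\dots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the convex closure is defined as <math>f^-(\mathbf{x})=\min\left(\sum_S \alpha_S f(S):\sum_S \alpha_S 1_S=\mathbf{x},\sum_S \alpha_S=1,\alpha_S\geq 0\right)</math>. The convex closure of any set function is convex over <math>[0,1]^n</math>. It can be shown that <math>f^L(\mathbf{x})=f^-(\mathbf{x})</math> for submodular functions.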
Concave closure
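Consider any vector <math>\mathbf{x}=\{x_1,x_2,\dots,x_n\}</math> such that each <math>0\leq x_i\leq 1</math>. Then the concave closure is defined as <math>f^+(\mathbf{x})=\max\left(\sum_S \alpha_S f(S):\sum_S \alpha_S 1_S=\mathbf{x},\sum_S \alpha_S=1,\alpha_S\geq 0\right)</math>.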
Properties
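- The class of submodular functions is closed under non-negative linear combinations. Consider any submodular functions <math>f_1,f_2,\ldots,f_k</math> and non-negative numbers <math>\alpha_1,\alpha_2,\ldots,\alpha_k</math>. Then the function <math>g</math> defined by <math>g(S)=\sum_{i=1}^k \alpha_i f_i(S)</math> is submodular.
- For any submodular function <math>f</math>, the function defined by <math>g(S)=f(\Omega \setminus S)</math> is submodular.
- The function <math>g(S)=\min(f(S),c)</math>, where <math>c</math> is a real number, is submodular whenever <math>f</math> is monotone submodular. More generally, <math>g(S)=h(f(S))</math> is submodular, for any non-decreasing concave function <math>h</math>.
- Consider a random process where a set <math>T</math> is chosen with each element in <math>\Omega</math> being included in <math>T</math> independently with probability <math>p</math>. Then the following inequality is true: <math>\mathbb{E}[f(T)]\geq p f(\Omega)+(1-p) f(\varnothing)</math>, where <math>\varnothing</math> is the empty set. More generally, consider the following random process where a set <math>S</math> is constructed as follows. For each of <math>1\leq i\leq l, A_i\subseteq \Omega</math>, construct <math>S_i</math> by including each element in <math>A_i</math> independently into <math>S_i</math> with probability <math>p_i</math>. Furthermore let <math>S=\cup_{i=1}^l S_i</math>. Then the following inequality is true: <math>\mathbb{E}[f(S)]\geq \sum_{R\subseteq [l]} \Pi_{i\in R}p_i \Pi_{i\notin R}(1-p_i)f(\cup_{i\in R}A_i)</math>.[citation needed]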
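The first sampling inequality in the last property can be checked numerically on a small example by computing the expectation exactly. The sketch below does this for one assumed function (the square root of the set size, which is a concave function of the cardinality) and one assumed probability. The helper name `expected_value` is hypothetical, and a single example illustrates the inequality without proving it.

```python
from itertools import combinations

def expected_value(f, ground, p):
    """Exact E[f(T)] when each element of `ground` joins T independently with probability p."""
    items = sorted(ground)
    n = len(items)
    total = 0.0
    for r in range(n + 1):
        for combo in combinations(items, r):
            # Every subset of size r has probability p^r * (1 - p)^(n - r).
            total += p ** r * (1 - p) ** (n - r) * f(frozenset(combo))
    return total

ground = {1, 2, 3}
f = lambda S: len(S) ** 0.5    # assumed example function
p = 0.5

lhs = expected_value(f, ground, p)
rhs = p * f(frozenset(ground)) + (1 - p) * f(frozenset())
print(lhs, rhs, lhs >= rhs)
```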
Optimization problems
Submodular functions have properties which are very similar to convex and concave functions. For this reason, an optimization problem which concerns optimizing a convex or concave function can also be described as the problem of maximizing or minimizing a submodular function subject to some constraints.
Submodular set function minimization
The simplest minimization problem is to find a set which minimizes a submodular function; this is the unconstrained problem. This problem can be solved in (strongly)[10][11] polynomial time.[12][13] Computing the minimum cut in a graph is a special case of this general minimization problem. However, adding even a simple constraint such as a cardinality lower bound makes the minimization problem NP-hard, with polynomial factor lower bounds on the approximation factor.[14][15]
Submodular set function maximization
Unlike the case of minimization, maximizing a submodular function is NP-hard even in the unconstrained setting. For instance, max cut is a special case even when the function is required only to be non-negative. The unconstrained problem can be shown to be inapproximable if the function is allowed to be negative. Theory and enumeration algorithms for finding local and global maxima (minima) of submodular (supermodular) functions can be found in B. Goldengorin. European Journal of Operational Research 198(1):102-112, DOI: 10.1016/j.ejor.2008.08.022. There has been extensive work on constrained submodular function maximization when the functions are non-negative. Typically, the approximation algorithms for these problems are based on either greedy algorithms or local search algorithms.
- The problem of maximizing a non-negative symmetric submodular function admits a 1/2 approximation algorithm.[16] Computing the maximum cut of a graph is a special case of this problem.
- The more general problem of maximizing a non-negative submodular function also admits a 1/2 approximation algorithm.[17]
- The problem of maximizing a monotone submodular function subject to a cardinality constraint admits a <math>1 - 1/e</math> approximation algorithm.[18][page needed][19] The maximum coverage problem is a special case of this problem.
- The more general problem of maximizing a monotone submodular function subject to a matroid constraint also admits a <math>1 - 1/e</math> approximation algorithm.[20][21][22]
Many of these algorithms can be unified within a semi-differential based framework of algorithms.[15]
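For the cardinality-constrained monotone case, the standard greedy rule repeatedly adds the element with the largest marginal gain. The following is a minimal sketch on a small maximum coverage instance. The names `greedy_max` and `cover`, the sample sets, and the alphabetical tie-breaking are assumptions for illustration, and the brute-force optimum is included only for comparison on this tiny instance.

```python
from itertools import combinations
from math import e

def greedy_max(f, ground, k):
    """Pick up to k elements, each time adding the one with the largest positive marginal gain."""
    chosen = set()
    for _ in range(k):
        best, best_gain = None, 0
        for x in sorted(ground - chosen):   # sorted only to make tie-breaking deterministic
            gain = f(chosen | {x}) - f(chosen)
            if gain > best_gain:
                best, best_gain = x, gain
        if best is None:                    # no element improves the value
            break
        chosen.add(best)
    return chosen

# Hypothetical maximum coverage instance.
sets = {"a": {1, 2, 3, 4}, "b": {1, 2, 5}, "c": {3, 4, 6}, "d": {7}}

def cover(S):
    """Coverage function: number of elements covered by the chosen sets."""
    covered = set()
    for name in S:
        covered |= sets[name]
    return len(covered)

picked = greedy_max(cover, set(sets), 2)
greedy_value = cover(picked)
optimum = max(cover(set(pair)) for pair in combinations(sets, 2))
print(sorted(picked), greedy_value, optimum, greedy_value >= (1 - 1 / e) * optimum)
```

On this instance the greedy choice is not optimal, but its value stays within the stated approximation factor of the optimum.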
Related optimization problems
Apart from submodular minimization and maximization, another natural problem is Difference of Submodular Optimization.[23][24] Unfortunately, this problem is not only NP-hard, but also inapproximable.[24] A related optimization problem is to minimize or maximize a submodular function subject to a submodular level set constraint (also called submodular optimization subject to a submodular cover or submodular knapsack constraint). This problem admits bounded approximation guarantees.[25] Another optimization problem involves partitioning data based on a submodular function so as to maximize the average welfare. This problem is called the submodular welfare problem.[26]
Applications
Submodular functions naturally occur in several real-world applications in economics, game theory, machine learning and computer vision. Owing to the diminishing returns property, submodular functions naturally model the cost of items, since the discount often grows as one buys more items. Submodular functions model notions of complexity, similarity and cooperation when they appear in minimization problems. In maximization problems, on the other hand, they model notions of diversity, information and coverage. For more information on applications of submodularity, particularly in machine learning, see [4][27][28].
Citations
- ^ H. Lin and J. Bilmes, A Class of Submodular Functions for Document Summarization, ACL-2011.
- ^ S. Tschiatschek, R. Iyer, H. Wei and J. Bilmes, Learning Mixtures of Submodular Functions for Image Collection Summarization, NIPS-2014.
- ^ A. Krause and C. Guestrin, Near-optimal nonmyopic value of information in graphical models, UAI-2005.
- ^ a b A. Krause and C. Guestrin, Beyond Convexity: Submodularity in Machine Learning, Tutorial at ICML-2008
- ^ (Schrijver 2003, §44, p. 766)
- ^ Buchbinder, Niv; Feldman, Moran (2018). "Submodular Functions Maximization Problems". In Gonzalez, Teofilo F. (ed.). Handbook of Approximation Algorithms and Metaheuristics, Second Edition: Methodologies and Traditional Applications. Chapman and Hall/CRC. doi:10.1201/9781351236423.
- ^ "Information Processing and Learning" (PDF). cmu.
- ^ Fujishige (2005) p.22
- ^ Lovász, L. (1983). "Submodular functions and convexity". Mathematical Programming The State of the Art: 235–257. doi:10.1007/978-3-642-68874-4_10.
- ^ Iwata, S.; Fleischer, L.; Fujishige, S. (2001). "A combinatorial strongly polynomial algorithm for minimizing submodular functions". J. ACM. 48 (4): 761–777. doi:10.1145/502090.502096. S2CID 888513.
- ^ Schrijver, A. (2000). "A combinatorial algorithm minimizing submodular functions in strongly polynomial time". J. Combin. Theory Ser. B. 80 (2): 346–355. doi:10.1006/jctb.2000.1989.
- ^ Grötschel, M.; Lovasz, L.; Schrijver, A. (1981). "The ellipsoid method and its consequences in combinatorial optimization". Combinatorica. 1 (2): 169–197. doi:10.1007/BF02579273. hdl:10068/182482. S2CID 43787103.
- ^ Cunningham, W. H. (1985). "On submodular function minimization". Combinatorica. 5 (3): 185–192. doi:10.1007/BF02579361. S2CID 33192360.
- ^ Z. Svitkina and L. Fleischer, Submodular approximation: Sampling-based algorithms and lower bounds, SIAM Journal on Computing (2011).
- ^ a b R. Iyer, S. Jegelka and J. Bilmes, Fast Semidifferential based submodular function optimization, Proc. ICML (2013).
- ^ U. Feige, V. Mirrokni and J. Vondrák, Maximizing non-monotone submodular functions, Proc. of 48th FOCS (2007), pp. 461–471.
- ^ N. Buchbinder, M. Feldman, J. Naor and R. Schwartz, A tight linear time (1/2)-approximation for unconstrained submodular maximization, Proc. of 53rd FOCS (2012), pp. 649-658.
- ^ G. L. Nemhauser, L. A. Wolsey and M. L. Fisher, An analysis of approximations for maximizing submodular set functions I, Mathematical Programming 14 (1978), 265–294.
- ^ Williamson, David P. "Bridging Continuous and Discrete Optimization: Lecture 23" (PDF).
- ^ G. Calinescu, C. Chekuri, M. Pál and J. Vondrák, Maximizing a submodular set function subject to a matroid constraint, SIAM J. Comp. 40:6 (2011), 1740-1766.
- ^ M. Feldman, J. Naor and R. Schwartz, A unified continuous greedy algorithm for submodular maximization, Proc. of 52nd FOCS (2011).
- ^ Y. Filmus, J. Ward, A tight combinatorial algorithm for submodular maximization subject to a matroid constraint, Proc. of 53rd FOCS (2012), pp. 659-668.
- ^ M. Narasimhan and J. Bilmes, A submodular-supermodular procedure with applications to discriminative structure learning, In Proc. UAI (2005).
- ^ a b R. Iyer and J. Bilmes, Algorithms for Approximate Minimization of the Difference between Submodular Functions, In Proc. UAI (2012).
- ^ R. Iyer and J. Bilmes, Submodular Optimization Subject to Submodular Cover and Submodular Knapsack Constraints, In Advances of NIPS (2013).
- ^ J. Vondrák, Optimal approximation for the submodular welfare problem in the value oracle model, Proc. of STOC (2008), pp. 461–471.
- ^ http://submodularity.org/.
- ^ J. Bilmes, Submodularity in Machine Learning Applications, Tutorial at AAAI-2015.
References
- Schrijver, Alexander (2003), Combinatorial Optimization, Springer, ISBN 3-540-44389-4
- Lee, Jon (2004), A First Course in Combinatorial Optimization, Cambridge University Press, ISBN 0-521-01012-8
- Fujishige, Satoru (2005), Submodular Functions and Optimization, Elsevier, ISBN 0-444-52086-4
- Narayanan, H. (1997), Submodular Functions and Electrical Networks, ISBN 0-444-82523-1
- Oxley, James G. (1992), Matroid theory, Oxford Science Publications, Oxford: Oxford University Press, ISBN 0-19-853563-5, Zbl 0784.05002
External links
- http://www.cs.berkeley.edu/~stefje/references.html has a longer bibliography