Exponential search
Revision as of 12:51, 12 April 2015
Class | Search algorithm |
---|---|
Data structure | Array |
Worst-case performance | O(log i) |
Best-case performance | O(1) |
Average performance | O(log i) |
Worst-case space complexity | O(1) |
Optimal | Yes |
In computer science, an exponential search (also called doubling search or galloping search)[1] is an algorithm, created by Jon Bentley and Andrew Chi-Chih Yao in 1976, for searching sorted, unbounded (infinite) lists.[2] There are several ways to implement it; the most common is to determine a range in which the search key resides and then perform a binary search within that range. This takes O(log i) time, where i is the position of the search key in the list if the key is present, or the position where the key would be inserted if it is not.
Exponential search can also be used to search bounded lists. It can even outperform more traditional searches for bounded lists, such as binary search, when the element being searched for is near the beginning of the array. This is because exponential search runs in O(log i) time, where i is the index of the element being searched for in the list, whereas binary search runs in O(log n) time, where n is the number of elements in the list.
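The difference can be made concrete by counting element probes. The following is a minimal sketch, not a reference implementation: the function names binary_probes and exponential_probes are hypothetical, and both searches are written in lower-bound style so that the counts are comparable.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of elements a lower-bound style binary
// search inspects when searching the half-open range [lo, hi).
std::size_t binary_probes(const std::vector<int>& a, std::size_t lo,
                          std::size_t hi, int key)
{
    std::size_t probes = 0;
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        ++probes;
        if (a[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return probes;
}

// Hypothetical helper: probes made by the doubling phase plus the
// binary search over the range that the doubling phase selects.
std::size_t exponential_probes(const std::vector<int>& a, int key)
{
    std::size_t probes = 0;
    std::size_t bound = 1;
    while (bound < a.size()) {
        ++probes;
        if (!(a[bound] < key))
            break;          // a[bound] >= key: the range is found
        bound *= 2;
    }
    const std::size_t hi = std::min(bound + 1, a.size());
    return probes + binary_probes(a, bound / 2, hi, key);
}
```

For a vector holding 0, 1, ..., 2^20 - 1 and the key 5, this sketch makes 7 probes with exponential search and 20 with binary search over the whole vector.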
Algorithm
Exponential search allows for searching through a sorted, unbounded list for a specified input value (the search "key"). The algorithm consists of two stages. The first stage determines a range in which the search key would reside if it were in the list. In the second stage, a binary search is performed on this range. In the first stage, assuming that the list is sorted in ascending order, the algorithm looks for the first exponent, j, for which the element at index 2^j is greater than or equal to the search key. This index, 2^j, becomes the upper bound for the binary search, with the previous power of 2, 2^(j-1), being the lower bound.[3]
#include <algorithm>  // std::lower_bound, std::min

// Returns the index of key in the ascending-sorted array arr[0..size-1],
// or -1 if key is not present.
template <typename T>
int exponential_search(const T* arr, int size, const T& key)
{
    if (!arr || size <= 0)
        return -1;
    if (arr[0] == key)
        return 0;
    // Stage 1: double the bound until it passes the end of the array
    // or reaches an element that is not smaller than the key.
    int bound = 1;
    while (bound < size && arr[bound] < key)
        bound *= 2;
    // Stage 2: binary search in [bound/2, min(bound, size - 1)].
    const T* first = arr + bound / 2;
    const T* last = arr + std::min(bound + 1, size);
    const T* it = std::lower_bound(first, last, key);
    return (it != last && *it == key) ? static_cast<int>(it - arr) : -1;
}
In each step, the algorithm compares the search key with the element at the current search index. If the element at the current index is smaller than the search key, the algorithm repeats, skipping to the next search index by doubling it, that is, by moving to the next power of 2.[3] If the element at the current index is greater than or equal to the search key, the algorithm knows that the search key, if it is contained in the list at all, is located in the interval formed by the previous search index, 2^(j-1), and the current search index, 2^j. The binary search is then performed on this interval, returning either a failure, if the search key is not in the list, or the position of the search key in the list.
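The same two stages carry over to a list with no known length. The sketch below is an illustration under stated assumptions: the unbounded list is modelled by a hypothetical callable value_at(i) that is non-decreasing in i and eventually reaches the key (otherwise the doubling loop would not terminate), and the function name unbounded_lower_bound is likewise hypothetical.

```cpp
#include <cstdint>
#include <functional>

// Returns the smallest index i with value_at(i) >= key.
// Assumes value_at is non-decreasing and eventually reaches key.
std::uint64_t unbounded_lower_bound(
    const std::function<std::uint64_t(std::uint64_t)>& value_at,
    std::uint64_t key)
{
    if (value_at(0) >= key)
        return 0;
    // Stage 1: double hi until value_at(hi) >= key.
    std::uint64_t hi = 1;
    while (value_at(hi) < key)
        hi *= 2;
    // hi / 2 is either 0 or the previously probed index, so
    // value_at(lo) < key is already known to hold.
    std::uint64_t lo = hi / 2;
    // Stage 2: binary search keeping value_at(lo) < key <= value_at(hi).
    while (hi - lo > 1) {
        const std::uint64_t mid = lo + (hi - lo) / 2;
        if (value_at(mid) < key)
            lo = mid;
        else
            hi = mid;
    }
    return hi;
}
```

With value_at(i) = i * i and key 1000, the doubling stage stops at index 32 and the binary search then confirms 32 as the first index whose square is at least 1000.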
Performance
The first stage of the algorithm takes O(log i) time, where i is the index where the search key would be in the list. This is because, in determining the upper bound for the binary search, the while loop is executed at most ⌈log i⌉ times (logarithms are base 2). Since the list is sorted, after doubling the search index ⌈log i⌉ times, the algorithm will be at a search index that is greater than or equal to i, as 2^⌈log i⌉ ≥ i. As such, the first stage of the algorithm takes O(log i) time.
The second part of the algorithm also takes O(log i) time. As the second stage is simply a binary search, it takes O(log n) time, where n is the size of the interval being searched. The size of this interval is 2^j - 2^(j-1) where, as seen above, j = log i. This means that the size of the interval being searched is 2^(log i) - 2^(log i - 1) = 2^(log i - 1). This gives a run time of log(2^(log i - 1)) = log(i) - 1 = O(log i).
This gives the algorithm a total runtime, calculated by summing the runtimes of the two stages, of O(log i) + O(log i) = 2 O(log i) = O(log i).
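As a worked instance of this analysis, consider a key at position i = 1000 with base-2 logarithms. The list length n = 10^9 is an assumed figure chosen only for comparison, and the step counts are approximate:

```latex
\begin{align*}
\text{stage 1: } & \lceil \log_2 1000 \rceil = 10 \text{ doublings, since } 2^{10} = 1024 \ge 1000 \\
\text{stage 2: } & \text{interval between } 2^{9} \text{ and } 2^{10} \text{ of size } 2^{10} - 2^{9} = 512,
                   \quad \log_2 512 = 9 \text{ halvings} \\
\text{total: }   & 10 + 9 = 19 \text{ steps, versus } \lceil \log_2 10^{9} \rceil = 30
                   \text{ for a binary search over all } n \text{ elements}
\end{align*}
```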
Alternatives
Bentley and Yao suggested several variations for exponential search.[2] These variations consist of performing a binary search, as opposed to a unary search, when determining the upper bound for the binary search in the second stage of the algorithm. This splits the first stage of the algorithm into two parts, making the algorithm a three-stage algorithm overall. The new first stage determines a value j', much like before, such that the element at index 2^(j') is larger than the search key and the element at index 2^(j'/2) is lower than the search key. Previously, j' was determined in a unary fashion by calculating the next power of 2 (i.e., adding 1 to j). In the variation, it is proposed that j' is doubled instead (e.g., jumping from 2^2 to 2^4 as opposed to 2^3). The first j' such that the element at index 2^(j') is greater than the search key forms a much rougher upper bound than before. Once this j' is found, the algorithm moves to its second stage and a binary search is performed on the interval formed by j'/2 and j', giving the more accurate upper bound exponent j. From here, the third stage of the algorithm performs the binary search on the interval between 2^(j-1) and 2^j, as before. The performance of this variation is O(log i).
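The three stages can be sketched as follows. This is an illustrative reading of the description above for a bounded vector, not Bentley and Yao's own formulation: the name nested_exponential_search and the helper lambda reached are hypothetical, and an index past the end of the vector is treated as if it held an element larger than any key.

```cpp
#include <algorithm>
#include <cstddef>
#include <limits>
#include <vector>

// Returns the index of key in the ascending-sorted vector a, or -1.
long nested_exponential_search(const std::vector<int>& a, int key)
{
    const std::size_t n = a.size();
    if (n == 0)
        return -1;
    // True when index 2^e is past the end or holds an element >= key.
    auto reached = [&](unsigned e) {
        if (e >= std::numeric_limits<std::size_t>::digits)
            return true;    // 2^e would not fit in std::size_t
        const std::size_t idx = std::size_t{1} << e;
        return idx >= n || !(a[idx] < key);
    };
    unsigned j = 0;         // smallest exponent for which reached(j) holds
    if (!reached(0)) {
        // Stage 1: double the exponent itself (1, 2, 4, 8, ...).
        unsigned hi = 1;
        while (!reached(hi))
            hi *= 2;
        // Stage 2: binary search over exponents, keeping
        // reached(lo) false and reached(hi) true.
        unsigned lo = hi / 2;
        while (hi - lo > 1) {
            const unsigned mid = lo + (hi - lo) / 2;
            if (reached(mid))
                hi = mid;
            else
                lo = mid;
        }
        j = hi;
    }
    // Stage 3: binary search among the indices from 2^(j-1) to 2^j.
    const std::size_t first = (j == 0) ? 0 : (std::size_t{1} << (j - 1));
    const std::size_t last = std::min((std::size_t{1} << j) + 1, n);
    const auto begin = a.begin() + static_cast<std::ptrdiff_t>(first);
    const auto end = a.begin() + static_cast<std::ptrdiff_t>(last);
    const auto it = std::lower_bound(begin, end, key);
    return (it != end && *it == key) ? static_cast<long>(it - a.begin()) : -1;
}
```

The doubling of the exponent makes the first stage probe indices 2, 4, 16, 256, and so on, which is why the second stage is needed to narrow the exponent before the final binary search over elements.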
Bentley and Yao generalize this variation into one where any number, k, of binary searches is performed during the first stage of the algorithm, giving the k-nested binary search variation. The asymptotic runtime does not change for the variations, running in O(log i) time, as with the original exponential search algorithm.
Also, a data structure with a tight version of the dynamic finger property can be given when the above result of the k-nested binary search is used on a sorted array.[4] Using this, the number of comparisons done during a search is log(d) + log log(d) + ... + O(log* d), where d is the difference in rank between the last element that was accessed and the current element being accessed.
References
- [1] Baeza-Yates, Ricardo; Salinger, Alejandro (2010), "Fast intersection algorithms for sorted sequences", in Elomaa, Tapio; Mannila, Heikki; Orponen, Pekka (eds.), Algorithms and Applications: Essays Dedicated to Esko Ukkonen on the Occasion of His 60th Birthday, Lecture Notes in Computer Science, vol. 6060, Springer, pp. 45–61, doi:10.1007/978-3-642-12476-1_3, ISBN 9783642124754.
- [2] Bentley, Jon L.; Yao, Andrew C. (1976). "An Almost Optimal Algorithm For Unbounded Searching". Information Processing Letters. 5 (3). Elsevier: 82–87. doi:10.1016/0020-0190(76)90071-5. ISSN 0020-0190.
- [3] Jonsson, Håkan (2011-04-19). "Exponential Binary Search". Retrieved 2014-03-24.
- [4] Andersson, Arne; Thorup, Mikkel (2007). "Dynamic ordered sets with exponential search trees". Journal of the ACM (JACM). 54 (3). ACM: 13. doi:10.1145/1236457.1236460. ISSN 0004-5411.