Difference between revisions of "Tìm kiếm nhị phân"
Line 140:
In the best case, where the target value is the middle element of the array, its position is returned after one iteration.{{Sfn|Chang|2003|p=169}}
==== Space complexity ====
Line 176:
</math>
where <math>n</math> is an integer,
==== Unsuccessful search ====
Line 202:
== Binary search versus other schemes ==
Sorted arrays with binary search are a very inefficient solution when insertion and deletion operations are interleaved with retrieval, as each such operation takes <math display="inline">O(n)</math> time. In addition, sorted arrays can complicate memory use, especially when elements are often inserted into the array.{{Sfn|Knuth|1997|loc=§2.2.2 ("Sequential Allocation")}} There are other data structures that support much more efficient insertion and deletion. Binary search can be used to perform exact matching and [[Tập hợp (cấu trúc dữ liệu trừu tượng)|set membership]] (determining whether a target value is in a collection of values).
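For instance, set membership on a sorted array reduces to a single binary search; a minimal sketch using Python's standard-library <code>bisect</code> module (the function name is illustrative):

```python
import bisect

def contains(sorted_arr, target):
    """Set membership on a sorted array: one binary search, O(log n)."""
    i = bisect.bisect_left(sorted_arr, target)
    return i < len(sorted_arr) and sorted_arr[i] == target
```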
=== Linear search ===
[[Tìm kiếm tuyến tính|Linear search]] is a simple search algorithm that checks every record until it finds the target value. Linear search can be done on a linked list, which allows for faster insertion and deletion than an array.
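The scan described above can be sketched in a few lines (an illustrative version, not from any library):

```python
def linear_search(records, target):
    """Check each record in order until the target is found; works on
    any iterable, including a linked list, and needs no sorting."""
    for i, value in enumerate(records):
        if value == target:
            return i
    return -1
```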
=== Trees ===
Line 226:
=== Other data structures ===
There exist data structures that may improve on binary search in some cases, for both searching and the other operations available for sorted arrays. For example, searches, approximate matches, and the operations available to sorted arrays can be performed more efficiently than binary search on specialized data structures such as van Emde Boas trees, [[fusion tree]]s, [[trie]]s, and [[mảng bit|bit array]]s. These specialized data structures are usually only faster because they take advantage of the properties of keys with a certain attribute (usually keys that are small integers), and thus will be time or space consuming for keys that lack that attribute.<ref name="pred" /> As long as the keys can be ordered, these operations can always be done at least efficiently on a sorted array regardless of the keys. Some structures, such as Judy arrays, use a combination of approaches to mitigate this while retaining efficiency and the ability to perform approximate matching.<ref name="judyarray" />
==Variations==
===Uniform binary search===
{{main|Uniform binary search}}
[[File:Uniform binary search.svg|thumb|upright=1.0|[[Uniform binary search]] stores the difference between the current and the two next possible middle elements instead of specific bounds.]]
Uniform binary search stores, instead of the lower and upper bounds, the difference in the index of the middle element from the current iteration to the next iteration. A [[lookup table]] containing the differences is computed beforehand. For example, if the array to be searched is {{math|[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]}}, the middle element (<math>m</math>) would be {{math|6}}. In this case, the middle element of the left subarray ({{math|[1, 2, 3, 4, 5]}}) is {{math|3}} and the middle element of the right subarray ({{math|[7, 8, 9, 10, 11]}}) is {{math|9}}. Uniform binary search would store the value of {{math|3}} as both indices differ from {{math|6}} by this same amount.{{Sfn|Knuth|1998|loc=§6.2.1 ("Searching an ordered table"), subsection "An important variation"}} To reduce the search space, the algorithm either adds or subtracts this change from the index of the middle element. Uniform binary search may be faster on systems where it is inefficient to calculate the midpoint, such as on [[decimal computer]]s.{{Sfn|Knuth|1998|loc=§6.2.1 ("Searching an ordered table"), subsection "Algorithm U"}}
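The delta-table idea can be sketched as follows: an illustrative Python version (function names are not from any library) precomputes Knuth's table of index differences and then moves a single probe index up or down by those amounts. For the example array above, the first probe lands on {{math|6}} and the next stored difference is {{math|3}}.

```python
def make_deltas(n):
    """Knuth's lookup table: deltas[j] = floor((n + 2**j) / 2**(j + 1)),
    ending with a 0 entry that signals an unsuccessful search."""
    deltas = []
    j = 0
    while not deltas or deltas[-1] > 0:
        deltas.append((n + (1 << j)) >> (j + 1))
        j += 1
    return deltas

def uniform_binary_search(a, target):
    """Search sorted list a by moving a probe index by precomputed
    deltas instead of maintaining lower and upper bounds."""
    n = len(a)
    if n == 0:
        return -1
    deltas = make_deltas(n)
    i = deltas[0] - 1                # first probe position (0-based)
    for delta in deltas[1:]:
        if 0 <= i < n and a[i] == target:
            return i
        if delta == 0:               # table exhausted: target absent
            return -1
        # Out-of-range probes act as -infinity / +infinity sentinels.
        if i < 0 or (i < n and a[i] < target):
            i += delta
        else:
            i -= delta
    return -1
```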
===Exponential search{{anchor|One-sided search}}===
{{main|Exponential search}}
[[File:Exponential search.svg|thumb|upright=1.5|Visualization of [[exponential search]]ing finding the upper bound for the subsequent binary search]]
Exponential search extends binary search to unbounded lists. It starts by finding the first element with an index that is both a power of two and greater than the target value. Afterwards, it sets that index as the upper bound, and switches to binary search. A search takes <math display="inline">\lfloor \log_2 x + 1\rfloor</math> iterations before binary search is started and at most <math display="inline">\lfloor \log_2 x \rfloor</math> iterations of the binary search, where <math display="inline">x</math> is the position of the target value. Exponential search works on bounded lists, but becomes an improvement over binary search only if the target value lies near the beginning of the array.{{sfn|Moffat|Turpin|2002|p=33}}
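The doubling phase described above can be sketched as follows (an illustrative Python version; the binary-search step reuses the standard-library <code>bisect</code> module):

```python
import bisect

def exponential_search(a, target):
    """Double an index bound (1, 2, 4, ...) until an element >= target
    is found or the list ends, then binary search the bracketed range."""
    if not a:
        return -1
    bound = 1
    while bound < len(a) and a[bound] < target:
        bound *= 2
    lo = bound // 2                 # last bound known to hold a value < target
    hi = min(bound + 1, len(a))     # bisect's hi parameter is exclusive
    i = bisect.bisect_left(a, target, lo, hi)
    return i if i < len(a) and a[i] == target else -1
```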
===Interpolation search===
{{main|Interpolation search}}
[[File:Interpolation search.svg|thumb|upright=1.5|Visualization of [[interpolation search]] using linear interpolation. In this case, no searching is needed because the estimate of the target's location within the array is correct. Other implementations may specify another function for estimating the target's location.]]
Instead of calculating the midpoint, interpolation search estimates the position of the target value, taking into account the lowest and highest elements in the array as well as length of the array. It works on the basis that the midpoint is not the best guess in many cases. For example, if the target value is close to the highest element in the array, it is likely to be located near the end of the array.{{Sfn|Knuth|1998|loc=§6.2.1 ("Searching an ordered table"), subsection "Interpolation search"}}
A common interpolation function is [[linear interpolation]]. If <math>A</math> is the array, <math>L, R</math> are the lower and upper bounds respectively, and <math>T</math> is the target, then the target is estimated to be about <math>(T - A_L) / (A_R - A_L)</math> of the way between <math>L</math> and <math>R</math>. When linear interpolation is used, and the distribution of the array elements is uniform or near uniform, interpolation search makes <math display="inline">O(\log \log n)</math> comparisons.{{Sfn|Knuth|1998|loc=§6.2.1 ("Searching an ordered table"), subsection "Interpolation search"}}{{Sfn|Knuth|1998|loc=§6.2.1 ("Searching an ordered table"), subsection "Exercise 22"}}<ref>{{cite journal|last1=Perl|first1=Yehoshua|last2=Itai|first2=Alon|last3=Avni|first3=Haim|title=Interpolation search—a log log ''n'' search|journal=[[Communications of the ACM]]|date=1978|volume=21|issue=7|pages=550–553|doi=10.1145/359545.359557}}</ref>
In practice, interpolation search is slower than binary search for small arrays, as interpolation search requires extra computation. Its time complexity grows more slowly than binary search, but this only compensates for the extra computation for large arrays.{{Sfn|Knuth|1998|loc=§6.2.1 ("Searching an ordered table"), subsection "Interpolation search"}}
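Assuming linear interpolation as described above, one possible sketch (illustrative, not a canonical implementation):

```python
def interpolation_search(a, target):
    """Estimate the target's position by linear interpolation between
    the lowest and highest elements instead of probing the middle."""
    lo, hi = 0, len(a) - 1
    while lo <= hi and a[lo] <= target <= a[hi]:
        if a[hi] == a[lo]:                      # avoid division by zero
            return lo if a[lo] == target else -1
        # Index estimate: lo + (fraction of value range) * (index range).
        pos = lo + (target - a[lo]) * (hi - lo) // (a[hi] - a[lo])
        if a[pos] == target:
            return pos
        if a[pos] < target:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1
```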
=== Fractional cascading ===
{{Main|Fractional cascading}}
[[File:Fractional cascading.svg|thumb|upright=2.5|In [[fractional cascading]], each array has pointers to every second element of another array, so only one binary search has to be performed to search all the arrays.]]
Fractional cascading is a technique that speeds up binary searches for the same element in multiple sorted arrays. Searching each array separately requires <math display="inline">O(k \log n)</math> time, where <math display="inline">k</math> is the number of arrays. Fractional cascading reduces this to <math display="inline">O(k + \log n)</math> by storing specific information in each array about each element and its position in the other arrays.<ref name="ChazelleLiu2001">{{cite conference|last1=Chazelle|first1=Bernard|last2=Liu|first2=Ding|authorlink1=Bernard Chazelle|title=Lower bounds for intersection searching and fractional cascading in higher dimension|conference=33rd [[Symposium on Theory of Computing|ACM Symposium on Theory of Computing]]|pages=322–329|date=6 July 2001|doi=10.1145/380752.380818|url=https://dl.acm.org/citation.cfm?doid=380752.380818 |accessdate=30 June 2018 |publisher=ACM|isbn=978-1-58113-349-3}}</ref><ref name="ChazelleLiu2004">{{cite journal|last1=Chazelle|first1=Bernard|last2=Liu|first2=Ding|authorlink1=Bernard Chazelle|title=Lower bounds for intersection searching and fractional cascading in higher dimension|journal=Journal of Computer and System Sciences|date=1 March 2004 |volume=68|issue=2|pages=269–284 |language=en |issn=0022-0000|doi=10.1016/j.jcss.2003.07.003|citeseerx=10.1.1.298.7772|url=http://www.cs.princeton.edu/~chazelle/pubs/FClowerbounds.pdf|accessdate=30 June 2018}}</ref>
Fractional cascading was originally developed to efficiently solve various [[computational geometry]] problems. Fractional cascading has been applied elsewhere, such as in [[data mining]] and [[Internet Protocol]] routing.<ref name="ChazelleLiu2001" />
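As a rough illustration of the idea (not the full data structure), the following sketch handles the two-array case: the first array is augmented with every second element of the second, and each augmented entry stores precomputed positions in both originals, playing the role of the "bridges", so one binary search plus constant extra work locates the target's lower bound in both arrays. All names are illustrative.

```python
import bisect

def build_cascade(a1, a2):
    """Augment a1 with every second element of a2 (and a2's last element);
    each entry is (value, lower-bound index in a1, lower-bound index in a2)."""
    promoted = a2[::2] + a2[-1:]
    keys = sorted(set(a1) | set(promoted))
    return [(v, bisect.bisect_left(a1, v), bisect.bisect_left(a2, v))
            for v in keys]

def cascade_search(aug, a1, a2, target):
    """One binary search in the augmented array, then O(1) local fix-up,
    yields the lower-bound position of target in both arrays at once."""
    keys = [e[0] for e in aug]  # a real structure keeps this list around
    i = bisect.bisect_left(keys, target)
    if i == len(aug):
        return len(a1), len(a2)
    _, p1, p2 = aug[i]
    # Skipped (non-promoted) a2 elements can make p2 overshoot slightly.
    while p2 > 0 and a2[p2 - 1] >= target:
        p2 -= 1
    return p1, p2
```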
=== Generalization to graphs ===
Binary search has been generalized to work on certain types of graphs, where the target value is stored in a vertex instead of an array element. Binary search trees are one such generalization—when a vertex (node) in the tree is queried, the algorithm either learns that the vertex is the target, or otherwise which subtree the target would be located in. However, this can be further generalized as follows: given an undirected, positively weighted graph and a target vertex, the algorithm learns upon querying a vertex that it is equal to the target, or it is given an incident edge that is on the shortest path from the queried vertex to the target. The standard binary search algorithm is simply the case where the graph is a path. Similarly, binary search trees are the case where the edges to the left or right subtrees are given when the queried vertex is unequal to the target. For all undirected, positively weighted graphs, there is an algorithm that finds the target vertex in <math>O(\log n)</math> queries in the worst case.<ref>{{cite conference|last1=Emamjomeh-Zadeh|first1=Ehsan|last2=Kempe|first2=David|last3=Singhal|first3=Vikrant|title=Deterministic and probabilistic binary search in graphs|date=2016|pages=519–532|conference=48th [[Symposium on Theory of Computing|ACM Symposium on Theory of Computing]]|arxiv=1503.00805|doi=10.1145/2897518.2897656}}</ref>
=== Noisy binary search ===
[[File:Noisy binary search.svg|thumb|upright=1.5|In noisy binary search, there is a certain probability that a comparison is incorrect.]]
Noisy binary search algorithms solve the case where the algorithm cannot reliably compare elements of the array. For each pair of elements, there is a certain probability that the algorithm makes the wrong comparison. Noisy binary search can find the correct position of the target with a given probability that controls the reliability of the yielded position. Every noisy binary search procedure must make at least <math>(1 - \tau)\frac{\log_2 (n)}{H(p)} - \frac{10}{H(p)}</math> comparisons on average, where <math>H(p) = -p \log_2 (p) - (1 - p) \log_2 (1 - p)</math><!-- Attribution of LaTeX code: see history of https://en.wikipedia.org/wiki/Binary_entropy_function --> is the [[binary entropy function]] and <math>\tau</math> is the probability that the procedure yields the wrong position.<ref>{{cite conference |last1=Ben-Or |first1=Michael |last2=Hassidim |first2=Avinatan |title=The Bayesian learner is optimal for noisy binary search (and pretty good for quantum as well) |date=2008 |book-title=49th [[Annual IEEE Symposium on Foundations of Computer Science|Symposium on Foundations of Computer Science]] |pages=221–230 |doi=10.1109/FOCS.2008.58 |ref=harv |url=http://www2.lns.mit.edu/~avinatan/research/search-full.pdf |isbn=978-0-7695-3436-7}}</ref><ref name="pelc1989">{{cite journal|last1=Pelc|first1=Andrzej|title=Searching with known error probability|journal=Theoretical Computer Science|date=1989|volume=63|issue=2|pages=185–202|doi=10.1016/0304-3975(89)90077-7}}</ref><ref>{{cite conference|last1=Rivest|first1=Ronald L.|last2=Meyer|first2=Albert R.|last3=Kleitman|first3=Daniel J.|last4=Winklmann|first4=K.|authorlink1=Ronald Rivest|authorlink2=Albert R. 
Meyer|authorlink3=Daniel Kleitman|title=Coping with errors in binary search procedures|conference=10th [[Symposium on Theory of Computing|ACM Symposium on Theory of Computing]]|doi=10.1145/800133.804351}}</ref> The noisy binary search problem can be considered as a case of the [[Ulam's game|Rényi-Ulam game]],<ref>{{cite journal|last1=Pelc|first1=Andrzej|title=Searching games with errors—fifty years of coping with liars|journal=Theoretical Computer Science|date=2002|volume=270|issue=1–2|pages=71–109|doi=10.1016/S0304-3975(01)00303-6}}</ref> a variant of [[Twenty Questions]] where the answers may be wrong.<ref>{{Cite journal | last1=Rényi | first1=Alfréd | title=On a problem in information theory | language=Hungarian | mr=0143666 | year=1961 | journal=Magyar Tudományos Akadémia Matematikai Kutató Intézetének Közleményei| volume=6 | pages=505–516}}</ref>
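A simple way to cope with unreliable comparisons is to repeat each one and take a majority vote; the sketch below simulates a comparator that errs with probability <math>p</math>. This scheme is only an illustration and is not the information-theoretically optimal procedure discussed above; all names are illustrative.

```python
import random

def majority(noisy_cmp, repeats):
    """Ask an unreliable yes/no comparator several times; a majority
    vote boosts the per-step probability of a correct answer."""
    return sum(1 for _ in range(repeats) if noisy_cmp()) * 2 > repeats

def noisy_binary_search(a, target, p, repeats=15):
    """Binary search when each comparison independently errs with
    probability p. Returns the leftmost insertion point for target
    (its index when present), correct with high probability."""
    if not a:
        return -1
    lo, hi = 0, len(a) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        # Truthful answer to "is a[mid] < target?", flipped w.p. p.
        noisy_cmp = lambda: (a[mid] < target) != (random.random() < p)
        if majority(noisy_cmp, repeats):
            lo = mid + 1
        else:
            hi = mid
    return lo
```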
=== Quantum binary search ===
Classical computers are bounded to the worst case of exactly <math display="inline">\lfloor \log_2 n + 1 \rfloor</math> iterations when performing binary search. [[Quantum algorithm]]s for binary search are still bounded to a proportion of <math display="inline">\log_2 n</math> queries (representing iterations of the classical procedure), but the constant factor is less than one, providing for a lower time complexity on [[quantum computing|quantum computers]]. Any ''exact'' quantum binary search procedure—that is, a procedure that always yields the correct result—requires at least <math display="inline">\frac{1}{\pi}(\ln n - 1) \approx 0.22 \log_2 n</math> queries in the worst case, where <math display="inline">\ln</math> is the [[natural logarithm]].<ref>{{cite journal|last1=Høyer|first1=Peter|last2=Neerbek|first2=Jan|last3=Shi|first3=Yaoyun|title=Quantum complexities of ordered searching, sorting, and element distinctness|journal=[[Algorithmica]]|date=2002|volume=34|issue=4|pages=429–448|doi=10.1007/s00453-002-0976-3|ref=harv|arxiv=quant-ph/0102078}}</ref> There is an exact quantum binary search procedure that runs in <math display="inline">4 \log_{605} n \approx 0.433 \log_2 n</math> queries in the worst case.<ref name="quantumalgo">{{cite journal|last1=Childs|first1=Andrew M.|last2=Landahl|first2=Andrew J.|last3=Parrilo|first3=Pablo A.|title=Quantum algorithms for the ordered search problem via semidefinite programming|journal=Physical Review A|date=2007|volume=75|issue=3|at=032335|doi=10.1103/PhysRevA.75.032335|ref=harv|arxiv=quant-ph/0608161|bibcode=2007PhRvA..75c2335C}}</ref> In comparison, [[Grover's algorithm]] is the optimal quantum algorithm for searching an unordered list of elements, and it requires <math>O(\sqrt{n})</math> queries.<ref>{{cite conference |last1=Grover |first1=Lov K. 
| authorlink=Lov Grover | title=A fast quantum mechanical algorithm for database search | conference=28th [[Symposium on Theory of Computing|ACM Symposium on Theory of Computing]] |pages=212–219|date=1996| location=Philadelphia, PA | doi=10.1145/237814.237866| arxiv=quant-ph/9605043}}</ref>
== History ==