Binary search¶

Binary search is also known as binary search algorithm. In a narrow sense, binary search is a search algorithm that finds a particular element in an ordered array. This is also a saying that most people know. In fact, the generalized binary search reduces the size of the problem to half of its original size. Similarly, the rule of thirds is to reduce the size of the problem to ⅓ of the original.

What this article brings to you is Dichotomous search in a narrow sense. If you want to know about other binary searches in a broad sense, you can check a blog post I wrote before Dichotomy from the question of rat poisoning

Although the basic idea of binary search is relatively simple, the details can be overwhelming... — Gartner

When Jon Bentley assigned the binary search problem to students in professional programming courses, 90% of the students still could not give the correct answer after spending a few hours, mainly because these wrong programs were facing the boundary value. Failed to run at the time, or returned an error result. A study conducted in 1988 showed that only 5 of 20 textbooks correctly implemented binary search. Not only that, the binary search algorithm in Bentley's 1986 book "Programming Pearls" has the problem of integer overflow, which has not been discovered for more than two decades. The same overflow problem in the binary search algorithm implemented by the Java language library has existed for more than nine years before being fixed.

It can be seen that binary search is not simple. This article will try to take you closer to ta, understand the underlying logic of ta, and provide templates to help you write bug free binary search codes.

Problem definition¶

Given an ordered array of numbers nums, and give you a number target. Ask whether there is a target in nums. If it exists, return its index in nums. If it does not exist, it returns -1.

This is the simplest form of binary search. Of course, binary search also has many distortions, which is also the reason why binary search is prone to errors and difficult to grasp.

Common variants are:

If there are multiple elements that meet the condition, the leftmost index that meets the condition is returned.
If there are multiple elements that meet the condition, the index of the rightmost one that meets the condition is returned.
The array is not ordered as a whole. For example, ascending and then descending, or descending and then ascending.
Turn a one-dimensional array into a two-dimensional array.
Next, we will check them one by one.

Premise¶

-The array is ordered (if unordered, we can also consider sorting, but pay attention to the complexity of sorting)

the term¶

Terms used in binary search:

target : the value to find
index : current position
l and r : left and right pointers
mid : The midpoint of the left and right pointers, the index used to determine whether we should search to the left or to the right

Common question type¶

Find a number¶

Algorithm Description:

Start with the middle element of the array first, if the middle element happens to be the element you want to find, the search process ends;
If the target element is greater than the middle element, search in the half of the array that is greater than the middle element, and start the comparison from the middle element as in the beginning.
If the target element is smaller than the middle element, search in the half of the array that is smaller than the middle element, and start the comparison from the middle element as at the beginning.
If the array is empty at a certain step, it means it cannot be found.

Complexity Analysis

Average time complexity: \(O(logN)\)
Worst time complexity: \(O(logN)\)
Optimal time complexity: \(O(1)\)
Space complexity
Iteration: \(O(1)\)
Recursion: \(O(logN)\) (no tail call elimination)

The following complexity is similar, so I won't repeat it.

This search algorithm reduces the search range by half every time it compares, which is a typical binary search.

This is the simplest type of binary search, let's fix it first. Let's take a concrete example, so that it is convenient for everyone to increase the sense of substitution. Suppose nums is [1,3,4,6,7,8,10,13,14] and target is 4.

The element in the middle of the array is 7
7 > 4, because the numbers to the right of 7 are all greater than 7, it cannot be the answer. We abbreviated the range to the left of 7.
Now the middle element is 3
3 < 4, since the numbers to the left of 3 are all less than 3, it cannot be the answer. We abbreviated the range to the right of 3.
At this time, the middle element is 4, which is exactly what we are looking for, just return its index 2.

How to convert the above algorithm into easy-to-understand executable code? Even with such a simple, unpretentious binary search, the differences written by different people are very big. If you don't have a mental framework to guide you, then you may write very different code at different times. In this case, your chances of making mistakes will be greatly increased.

Here is an introduction to a thinking framework and code template that I often use.

:::tip

why var index = left + ((right - left) / 2); (i.e. left + half)

To reduce the chances of overflow
Array could have negative elements

:::

Code template¶

C# Solution¶

using System;
using System.Collections.Generic;

namespace Algorithms.Simple
{
  public class BinarySearch
  {
    public static int Do(int[] nums, int target)
    {
      int left = 0;
      int right = nums.Length - 1;

      while (left <= right)
      {
        var midIndex = left + ((right - left) / 2);
        if (nums[midIndex] == target) return midIndex;

        if (target < nums[midIndex])
        {
          right = midIndex - 1;
        }
        else
        {
          left = midIndex + 1;
        }
      }

      return -1;
    }
  }
}

C# Tests¶

using Algorithms.Simple;
using Xunit;

namespace AlgorithmTests.Simple
{
  public class BinarySearchTests
  {
    [Fact]
    public void ValidSearch()
    {
      Assert.Equal(4, BinarySearch.Do(new int[] { -1, 0, 3, 5, 9, 12 }, 9));
    }

    [Fact]
    public void InvalidSearch()
    {
      Assert.Equal(-1, BinarySearch.Do(new int[] { -1, 0, 3, 5, 9, 12 }, 2));
    }
  }
}

Variations:

Find the leftmost value that satisfies the condition (shrink to left)
Find the rightmost value that satisfies the condition (shrink to right)

Find the leftmost insertion position¶

Find the rightmost insertion position¶

Variant: 33.search-in-rotated-sorted-array 81.search-in-rotated-sorted-array-ii

Extension¶

What if the title does not let you return true and false, but returns the index with the leftmost/rightmost equal to target? Doesn't this establish a connection with the previous knowledge? For example, I asked you to find the leftmost index equal to target in a rotating array, which is actually Interview Question 10.03. Search Rotation Array .

The idea is similar to the previous leftmost satisfaction, which is still achieved by compressing the interval, updating the spare tire, and finally returning to the spare tire. Look at the code specifically.

Python Code:

```py{137-141} class Solution: def search(self, nums: List[int], target: int) -> int: l, r = 0, len(nums)-1 while l <= r: mid = l + (r-l) // 2 # # the first half is ordered if nums[l] <nums[mid]: # target is in the first half if nums[l] <= target <= nums[mid]: r = mid-1 else: l = mid + 1 # # the second half is ordered elif nums[l]> nums[mid]: # target is in the second half if nums[l] <= target or target <= nums[mid]: r = mid-1 else: l = mid + 1 elif nums[l] == nums[mid]: if nums[l] != target: l += 1 else: # l is a spare tire r = l-1 return l if l <len(nums) and nums[l] == target else -1

### Two-dimensional array

There is no essential difference between binary search of a two-dimensional array and one-dimensional array. We will explain it through two questions.

#### 74. Search two-dimensional matrix

##### Subject address

https://leetcode-cn.com/problems/search-a-2d-matrix/

##### Title description

Write an efficient algorithm to determine whether there is a target value in the mxn matrix. The matrix has the following characteristics:

The integers in each row are arranged in ascending order from left to right. The first integer in each line is greater than the last integer in the previous line. Example 1:

enter: matrix = [ [1, 3, 5, 7], [10, 11, 16, 20], [23, 30, 34, 50] ] target = 3 Output: true Example 2:

enter: matrix = [ [1, 3, 5, 7], [10, 11, 16, 20], [23, 30, 34, 50] ] target = 13 Output: false

##### Ideas

Simply put, it is to cut a one-dimensional ordered array into several segments of the same length, and then concatenate these segments into a two-dimensional array. Your task is to find the target in this concatenated two-dimensional array.

It should be noted that there are no duplicate elements in the array.

> What should we do if there are duplicate elements?

algorithm:

-Select the lower left corner of the matrix as the starting element Q
-If Q> target, there is no need to look at the elements on the right and below (relative to the elements on the right of the one-dimensional array)
-If Q <target, there is no need to look at the left and above elements (relative to the left element of the one-dimensional array)
-If Q == target, return True directly
-If you return it, you can't find it, return False

##### Code (Python)

```py


class Solution:
    def searchMatrix(self, matrix: List[List[int]], target: int) -> bool:
        m = len(matrix)
        if m == 0:
            return False
        n = len(matrix[0])

        x = m-1
        y = 0
        while x >= 0 and y <n:
            if matrix[x][y]> target:
                x -= 1
            elif matrix[x][y] <target:
                y += 1
            else:
                return True
        return False

Complexity Analysis

-Time complexity: The worst case is that there is only one row or only one column, and the time complexity is \(O(M * N)\). In more cases, the time complexity is \(O(M + N)\) -Space complexity: \(O(1)\)

Leetcode 240. Search for two-dimensional matrix II has undergone a little change, and is no longer the first one in each row The integer is greater than the last integer in the previous row, but the elements in each column are arranged in ascending order from top to bottom. We can still choose to make a dichotomy from the bottom left.

Find the best value (improved dichotomy)¶

All of the above are to find a given value, this time we try to find the maximum value (minimum or maximum). Let's take the smallest as an example to explain how to cut into this kind of problem.

153. Find the smallest value in a rotated sorted array¶

Subject address¶

https://leetcode-cn.com/problems/find-minimum-in-rotated-sorted-array/

Title description¶

Assume that the array sorted in ascending order is rotated at a point unknown in advance.

(For example, the array [0,1,2,4,5,6,7] may become [4,5,6,7,0,1,2] ).

Please find the smallest element.

You can assume that there are no duplicate elements in the array.

Example 1:

Input: [3,4,5,1,2]
Output: 1
Example 2:

Input: [4,5,6,7,0,1,2]
Output: 0

Dichotomy¶

Ideas¶

The idea is the same as finding the specified value. We still:

-Initialize the head and tail pointers l and r -If nums[mid] is greater than nums[r], it means that mid is in the ordered part on the left. Since the smallest must be on the right, the left interval can be contracted, that is, l = mid + 1 -Otherwise, shrink the right side, that is, r = mid (no r = mid-1)

There is no point in judging the equal sign here, because the title does not let us find the specified value

-Exit the loop when l >= r or nums[l] <nums[r]

nums[l] <nums[r], indicating that the interval [l, r] is already in order, so nums[l] is what we want to find

Code (Python)¶

class Solution:
    def findMin(self, nums: List[int]) -> int:
        l, r = 0, len(nums)-1

        while l <r:
            # important
            if nums[l] <nums[r]:
                return nums[l]
            mid = (l + r) // 2
            # left part
            if nums[mid]> nums[r]:
                l = mid + 1
            else:
                # right part
                r = mid
        # l or r is not important
        return nums[l]

Complexity Analysis

-Time complexity: \(O(log N)\) -Space complexity: \(O(1)\)

Another dichotomy¶

Ideas¶

Of course, we can also compare with nums[l] instead of nums[r] above, we find:

-The elements on the left of the rotation point ** are greater than ** the first element of the array

-The elements on the right of the rotation point ** are less than ** the first element of the array

This establishes the relationship between nums[mid] and nums[0].

Specific algorithm:

Find the middle element mid of the array.
If the middle element> the first element of the array, we need to search to the right of mid.

-If the middle element <= the first element of the array, we need to search to the left of mid.

In the above example, the middle element 6 is larger than the first element 4, so continue searching on the right side of the middle point.

When we find the rotation point, we stop searching, when any of the following conditions are met:

-nums[mid]> nums[mid + 1], so mid+1 is the minimum value.

-nums[mid-1]> nums[mid], so mid is the minimum value.

Code (Python)¶

class Solution:
    def findMin(self, nums):
        # If the list has just one element then return that element.
        if len(nums) == 1:
            return nums[0]

        # left pointer
        left = 0
        # right pointer
        right = len(nums)-1

        # if the last element is greater than the first element then there is no rotation.
        # eg 1 <2 <3 <4 <5 <7. Already sorted array.
        # Hence the smallest element is first element. A[0]
        if nums[right]> nums[0]:
            return nums[0]

        # Binary search way
        while right >= left:
            # Find the mid element
            mid = left + (right-left) / 2
            # if the mid element is greater than its next element then mid+1 element is the smallest
            # This point would be the point of change. From higher to lower value.
            if nums[mid]> nums[mid + 1]:
                return nums[mid + 1]
            # if the mid element is lesser than its previous element then mid element is the smallest
            if nums[mid-1]> nums[mid]:
                return nums[mid]

            # if the mid elements value is greater than the 0th element this means
            # the least value is still somewhere to the right as we are still dealing with elements greater than nums[0]
            if nums[mid]> nums[0]:
                left = mid + 1
            # if nums[0] is greater than the mid value then this means the smallest value is somewhere to the left
            else:
                right = mid-1

Complexity Analysis

-Time complexity: \(O(log N)\) -Space complexity: \(O(1)\)

Binary Tree¶

For a given binary tree, any node has at most two child nodes. From this definition, we seem to be able to smell a bit of dichotomy, but this is not dichotomy. However, there are many connections between binary trees and binary, let's take a look.

The simplest, if the binary tree is a binary search tree (BST). So in fact, the process of searching in a binary search tree species is dichotomy.

As shown above, we need to search for 7 in such a binary search tree. Then our search path will be 8 -> 3 -> 6 -> 7, which is also a dichotomy. It's just that the lower bound of the time complexity is worse than the ordinary ordered sequence to find a given value dichotomy, because the binary search tree is not necessarily a binary balanced tree.

We talked about binary search trees above, let's look at a similarly special tree-a complete binary tree. If we number all the nodes of a complete binary tree (binary), they will be 01, 10, 11, ... in order.

In fact, the number in the last line is the path from the root node to the node. Where 0 means to the left and 1 means to the right. (The first digit is not used). Let's take 101 in the last line as an example. We need to execute left once and then right once.

In fact, the principle is not difficult, if you have used an array to represent a complete binary tree, then it is easy to understand. We can find that the numbers of the parent nodes are all twice the number of the left node, and they are all twice the number of the right node + 1. From a binary point of view: The number of the parent node shifted by one bit to the left is the number of the left node, and shifted by one bit to the left + 1 is the number of the right node. So conversely, knowing the last bit of the child node, we can know whether it is the left node or the right node of the parent node.

Topic recommendation¶

-875. Koko who loves bananas -300. Longest Ascending Subsequence -354. Russian doll envelope problem -Interview Question 17.08. Circus Tower

The next three questions are recommended to be done together

to sum up¶

Binary search is a very important and difficult core algorithm, and everyone must understand it. Some questions can be divided into two directly, and some questions are only one part of the link. Either way, we need to be very familiar with the idea of dichotomy and code templates.

The basic question types of binary search are:

-Find the elements that meet the conditions and return the corresponding index -If there are multiple elements that meet the condition, the leftmost index that meets the condition is returned. -If there are multiple elements that meet the condition, the index of the rightmost one that meets the condition is returned. -The array is not ordered as a whole. For example, ascending and then descending, or descending and then ascending. -Turn a one-dimensional array into a two-dimensional array. -Locally orderly find the largest (smallest) element -. . .

Regardless of the type, our thinking framework is similar, both:

-First define search interval (very important) -Define loop end conditions based on the search interval -Take the middle element and the target element for comparison (the target element may be the element to be found or the first and last element of the array, etc.) (very important) -Shrink the interval based on the result of the comparison, and discard the illegal solution (that is, dichotomies)

If it is in overall order, usually only nums[mid] and target are compared. If it is locally ordered, it may need to be compared with specific elements around it.

You can use this thinking framework in combination with the several question types introduced in this article for practice. If necessary, you can use the problem-solving template provided by me to provide problem-solving speed while effectively reducing the probability of error.

Special attention should be paid to the fact that the presence or absence of repeated elements has a great influence on the binary algorithm. We need to be careful.