Imagine we have written down the heights of all the buildings in the world in increasing order, so the first element is the world's shortest building and the last one is the tallest. We play a game: given a query number that represents the height of a building, you should reply with the position (the index) of that building in the list. If there is no building with the requested height, you reply with -1.
A naive approach would be to run a for loop over each and every element in the array and check it against the query. In the worst case, this performs n operations for each query (where n is the number of elements in the array).
h = [...]  # Heights of all the buildings in increasing order
q = int(input())  # The query height

for i, height in enumerate(h):
    if height == q:  # In case we find the element
        print(i)     # Print the index
        break        # And stop
else:  # In case we didn't find a building with height q (for-else: runs only if the loop didn't break)
    print(-1)  # Print -1 to indicate that we didn't find it
Yet, realizing that the list is sorted in increasing order, you start to notice that we perform a ton of redundant operations. Instead of looping over the list from the beginning to the very end, we could have skipped large parts of it and arrived at the correct index much quicker.
As the whole array is sorted in increasing order, a more efficient way of approaching this problem is to look at the middle element of the array. If that element is smaller than q, we can throw away the whole left half of the array and concentrate only on the right one. If that element is larger than q, we can throw away the whole right half and concentrate on the left one. Repeatedly looking at the middle element and throwing away the redundant half until we find the right element is essentially the Binary Search algorithm.
h = [...]  # Heights of all the buildings in increasing order
q = int(input())  # The query height

l, r = 0, len(h)  # The answer always lies in [l; r). Note that r is exclusive
while r - l > 1:  # There is more than one candidate left to consider
    mid = (l + r) // 2  # Take the middle index
    if h[mid] > q:  # h[mid] > q => throw away the right half
        r = mid
    else:  # h[mid] <= q => throw away the left half
        l = mid
# We should arrive at h[l] being the query element
print(l if h[l] == q else -1)
Imagine we have an array of building heights h = [20, 22, 23, 23, 34, 49, 52, 55, 58] and we would like to know the index of a building with height 49.
With this algorithm, we would perform several iterations:
mid = (0 + 9) // 2 = 4 → h[4] = 34 ≤ 49 → we throw away the left part (l becomes 4).
mid = (4 + 9) // 2 = 6 → h[6] = 52 > 49 → we throw away the right part (r becomes 6).
mid = (4 + 6) // 2 = 5 → h[5] = 49 ≤ 49 → we throw away the left part (l becomes 5).
l = 5, r = 6 → r - l == 1 → the while loop ends and we print 5, as h[5] = 49.
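For reference, Python's standard library implements the same idea in the bisect module. Here is a minimal sketch of the same lookup with bisect_left, using the sample array above:

from bisect import bisect_left

h = [20, 22, 23, 23, 34, 49, 52, 55, 58]
q = 49

i = bisect_left(h, q)  # Leftmost position at which q could be inserted to keep h sorted
print(i if i < len(h) and h[i] == q else -1)  # Prints 5

Note that bisect_left lands on the leftmost matching index, while the hand-written loop above lands on the rightmost one; for an array without duplicate heights the two coincide.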
This small example might not seem like much of an optimization, but imagine if we had 10 billion buildings. Looping over each and every element would take 10 billion iterations in the worst case. Splitting the array in half every time and throwing away the redundant part, on the other hand, is a huge performance gain. Let's see how many iterations it would take to converge to an answer if we keep halving 10 billion elements:
Spoiler: it's only 33 iterations instead of 10 billion
10,000,000,000 / 2 = 5,000,000,000
5,000,000,000 / 2 = 2,500,000,000
2,500,000,000 / 2 = 1,250,000,000
1,250,000,000 / 2 = 625,000,000
625,000,000 / 2 = 312,500,000
312,500,000 / 2 = 156,250,000
156,250,000 / 2 = 78,125,000
78,125,000 / 2 = 39,062,500
39,062,500 / 2 = 19,531,250
19,531,250 / 2 = 9,765,625
9,765,625 / 2 = 4,882,812
4,882,812 / 2 = 2,441,406
2,441,406 / 2 = 1,220,703
1,220,703 / 2 = 610,351
610,351 / 2 = 305,175
305,175 / 2 = 152,587
152,587 / 2 = 76,293
76,293 / 2 = 38,146
38,146 / 2 = 19,073
19,073 / 2 = 9,536
9,536 / 2 = 4,768
4,768 / 2 = 2,384
2,384 / 2 = 1,192
1,192 / 2 = 596
596 / 2 = 298
298 / 2 = 149
149 / 2 = 74
74 / 2 = 37
37 / 2 = 18
18 / 2 = 9
9 / 2 = 4
4 / 2 = 2
2 / 2 = 1
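It is easy to double-check this count with a few lines of code; the sketch below repeatedly halves 10 billion with integer division, exactly as in the table above, and counts the steps:

n = 10_000_000_000  # Ten billion elements
steps = 0
while n > 1:  # Keep halving until a single candidate remains
    n //= 2
    steps += 1
print(steps)  # Prints 33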
So, imagine an algorithm that does a linear search by going through each and every item in a 10-billion-item collection, and suppose checking each element takes 1 second. That algorithm would take about 317 years to complete, while a binary search would take only about 33 seconds!
This is the difference between doing O(n) operations and O(log n) operations.
Challenge
Given n GPAs from 1 to 4 (1 is the lowest grade and 4 is a perfect one) and several thresholds, you are asked to tell, for each threshold, how many students would pass to the next grade if a GPA at or above the threshold is required to pass.
Input
The first line of the input contains a single integer n (1 ≤ n ≤ ), the number of students.
The next line contains n floating-point numbers representing the GPAs of the students in increasing order (each between 1 and 4).
The third line contains a single integer q (1 ≤ q ≤ ), the number of thresholds to consider.
The last line of the input contains q floating-point numbers representing the minimum thresholds to pass to the next grade (each between 1 and 4).
Output
The program should print q lines. Line i should contain the number of students that would pass to the next grade given the i-th threshold.
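Since the GPAs are given in increasing order, each threshold query can be answered with a binary search. The sketch below is one possible solution using bisect_left from the standard library; it assumes the input format described above and is not necessarily the only valid approach:

from bisect import bisect_left

n = int(input())  # Number of students
gpas = list(map(float, input().split()))  # GPAs in increasing order
q = int(input())  # Number of thresholds
thresholds = list(map(float, input().split()))

for t in thresholds:
    first = bisect_left(gpas, t)  # Index of the first GPA that is >= t
    print(n - first)  # Everyone from that index onward passes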