When dealing with algorithms and data structures, people sometimes have a skewed perception of those being very complicated. Yet, in the majority of cases, they are simple and intuitive. The only thing required to master those is some practice and hands-on experience to get the intuition of how the algorithms actually work.
When dealing with specific problems and exercises, it soon becomes clear that certain groups and categories of problems can be viewed together. They share certain characteristics that make it possible to apply some techniques to the whole group of those problems.
One of those groups is the implementation problems. Those problems usually include the steps to get to the solution right in the problem statement. So, the actions the program needs to perform are described in the problem itself. The task of the software engineer becomes to implement those steps optimally and obtain the desired result. It’s important to note that even though the steps are described in the statement itself, it might not be straightforward to implement those right away.
Challenge - Find the missing number
Given all the numbers from 1 to n except one, you are asked to find the missing number.
The first line of the input contains a single integer n (2 ≤ n ≤ 100 000).
The second line of the input contains n-1 space-separated integers (1 ≤ ≤ n).
The program should print the missing number.
3 4 1
Tutorial on Complexity and Big Notation
Even though the problem described above might seem easy, the naive solution won’t be fast enough to pass all the test cases.
A naive approach might involve reading all the elements from the input into a list and then looping through all the numbers from 1...n and checking if the number is in the list. In case it’s not - we found the answer:
n = int(input())
a = [int(ai) for ai in input().split()]
for i in range(1, n + 1): # n operations
if i not in a: # n operations (goes through all the elements)
print(i) # We found the solution!
This algorithm is actually very slow. It goes through all the numbers 1...n and checks if a number is in a list. Checking if a number is in a list is a linear operation, meaning that the program goes through each and every element in the list a and checks if the number i is in it or not. Therefore, to check if i is in a, the program needs to perform n operations, as the list length is ~n (n-1 to be precise, but that doesn’t make a big difference).
So, this algorithm performs ~ operations in total. The outer loop performs n operations while checking if an item is in the list performs n operations. So, in total, that amounts to around operations (in the worst case).
Our computers can perform ~10-100 million operations in one second. So, if the time limit for a problem is 1 second (which is a typical time limit for algorithmic problems), we should aim to perform at most 100 million operations in total.
In the problem described above, the number n can be up to 100 000, which means that the algorithm would perform operations, which is 10 billion operations. On a normal computer that would take about 100 seconds to complete, while the time limit for the problem is 1 second. So, we should come up with a more optimal solution.
Big O notation is a mathematical notation used to describe the upper bound of an algorithm's growth rate of time complexity as the size of the input grows. It provides a rough estimate of the number of operations an algorithm will perform relative to the size of its input.
For this problem, we estimated that the algorithm would perform ~ operations in total. So, the algorithm complexity can be written as .
We will discuss this in more detail, providing some more intuition and formality later in the course.