Adrien M.
Adrien is from Epitech's class of 2024 (Lyon). He is passionate about Python and video game programming.
Basile S.
Basile is from Epitech's class of 2024 (Lyon). He is passionate about Web programming and artificial intelligence.

A Technical Overview of the Most Popular Sorting Algorithms

In computer science, arranging elements in an ordered sequence is called “sorting”. It is a very common operation in many applications, and many efficient algorithms have been developed to perform it.

The most common uses of sorted sequences are:

  • making lookup or search efficient.
  • making merging of sequences efficient.
  • enabling processing of data in a defined order.

For example, you might want to give priority to customers who have been with you the longest, to those who have spent the most money, or to those who bought something yesterday, so you need to sort them by different criteria. Sorting is critical for the performance of information systems, since it is quicker to find an element in an ordered list than in an unordered one.
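
In Python, for instance, such multi-criteria sorting might look like the sketch below (the customer records are invented for illustration):

# Hypothetical customer records: (name, years_as_customer, total_spent)
customers = [
    ("Alice", 8, 1200.0),
    ("Bob", 2, 5000.0),
    ("Carol", 8, 300.0),
]
# Longest-standing customers first; ties broken by total spent.
ranked = sorted(customers, key=lambda c: (-c[1], -c[2]))
print(ranked)
# [('Alice', 8, 1200.0), ('Carol', 8, 300.0), ('Bob', 2, 5000.0)]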

This article discusses sorting strategies and compares the performance of the corresponding algorithms.

What does a sorting algorithm depend on?

A sorting algorithm is evaluated on two factors: time complexity and space complexity. We can also talk about the stability of the algorithm.

Time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. For example, O(log N) means that the time goes up linearly while N goes up exponentially: if it takes 1 second to process 10 elements, it will take 2 seconds for 100 elements, 3 seconds for 1,000 elements, and so on. Space complexity measures the total amount of memory that an algorithm needs to run as a function of its input size.

The stability of a sorting algorithm depends on how the algorithm treats equal (or repeated) elements. Stable sorting algorithms preserve the relative order of equal elements, while unstable sorting algorithms don’t.
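
As a quick illustration, here is what stability looks like in practice, using Python's built-in sorted (which is stable): records sharing the same key keep their original relative order.

records = [("banana", 2), ("apple", 1), ("cherry", 2), ("date", 1)]
stable = sorted(records, key=lambda r: r[1])
print(stable)
# [('apple', 1), ('date', 1), ('banana', 2), ('cherry', 2)]
# 'banana' still precedes 'cherry' among the records with key 2.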

Sorting Algorithm Evaluation

We tested different sorting algorithms in order to identify the fastest ones and to explain their main principles. We implemented six different algorithms.

Selection Sort

This algorithm selects the smallest element of the unsorted part of the list in each iteration and places that element at the beginning of the unsorted part.

def selectionSort(tab):
    for i in range(len(tab)):
        # Find the index of the smallest element in the unsorted part.
        min_idx = i
        for j in range(i + 1, len(tab)):
            if tab[j] < tab[min_idx]:
                min_idx = j
        # Move it to the front of the unsorted part.
        tab[i], tab[min_idx] = tab[min_idx], tab[i]
    return tab

Selection Sort Complexity:
  • Time Complexity: Best O(n²), Worst O(n²), Average O(n²)
  • Space Complexity: O(1)
  • Stable: No
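
A quick sanity check of the implementation above:

print(selectionSort([29, 10, 14, 37, 13]))
# [10, 13, 14, 29, 37]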


Insertion Sort

For this algorithm, we assume that the first element is already sorted. We then take each following element and move it to the left, one position at a time, until it reaches its correct place among the already sorted elements.

def insertionSort(tab):
    for i in range(1, len(tab)):
        j = i
        # Move tab[i] to the left until it reaches its sorted position.
        # The strict < leaves equal elements in place, keeping the sort stable.
        while j > 0 and tab[j] < tab[j - 1]:
            tab[j], tab[j - 1] = tab[j - 1], tab[j]
            j -= 1
    return tab

Insertion Sort Complexity:
  • Time Complexity: Best O(n), Worst O(n²), Average O(n²)
  • Space Complexity: O(1)
  • Stable: Yes


Bubble Sort

This algorithm repeatedly compares adjacent elements and swaps them when they are out of order, until the list is sorted.

def bubbleSort(tab):
    for i in range(len(tab) - 1):
        swapped = False
        # Each pass bubbles the largest remaining element to the end.
        for j in range(0, len(tab) - i - 1):
            if tab[j] > tab[j + 1]:
                tab[j], tab[j + 1] = tab[j + 1], tab[j]
                swapped = True
        # If no swap occurred, the list is already sorted (best case O(n)).
        if not swapped:
            break
    return tab

Bubble Sort Complexity:
  • Time Complexity: Best O(n), Worst O(n²), Average O(n²)
  • Space Complexity: O(1)
  • Stable: Yes


Quick Sort

For this one, we divide the list using a pivot element. The pivot is positioned so that the elements smaller than it end up to its left and the elements greater than it to its right. We thus obtain two sublists, which are divided in the same way until only sublists of at most one element remain. The sorted sublists are then recombined around their pivots.

def partition(tab):
    # Split the elements after the pivot (tab[0]) into those smaller
    # than the pivot and those greater than or equal to it.
    left = []
    right = []
    for i in range(1, len(tab)):
        if tab[i] < tab[0]:
            left.append(tab[i])
        else:
            right.append(tab[i])
    return tab[0], left, right

def quickSort(tab):
    if len(tab) <= 1:
        return tab
    pivot, left, right = partition(tab)
    # Sort each side, then recombine around the pivot.
    return quickSort(left) + [pivot] + quickSort(right)

Quick Sort Complexity:
  • Time Complexity: Best O(n log n), Worst O(n²), Average O(n log n)
  • Space Complexity: O(log n) for the in-place variant (this out-of-place implementation allocates O(n) for its sublists)
  • Stable: No


Merge Sort

This algorithm resembles quick sort. We divide the list into sublists by cutting it in halves. When we reach lists of one element, we merge them back into ever larger sorted lists until the whole list is rebuilt.

def mergeSort(tab):
    if len(tab) > 1:
        middle = len(tab) // 2
        # Sort each half on a copy of the data.
        left = mergeSort(tab[:middle])
        right = mergeSort(tab[middle:])
        i = j = k = 0
        # Merge the two sorted halves back into tab.
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:  # <= keeps equal elements in order (stable)
                tab[k] = left[i]
                i += 1
            else:
                tab[k] = right[j]
                j += 1
            k += 1
        # Copy whatever remains of either half.
        while i < len(left):
            tab[k] = left[i]
            i += 1
            k += 1
        while j < len(right):
            tab[k] = right[j]
            j += 1
            k += 1
    return tab

Merge Sort Complexity:
  • Time Complexity: Best O(n log n), Worst O(n log n), Average O(n log n)
  • Space Complexity: O(n)
  • Stable: Yes


Shell Sort

It’s a generalized version of the insertion sort we saw earlier. It sorts groups of elements that are far away from each other. The idea is to arrange the list of elements so that, starting anywhere, taking every h-th element produces a sorted list; such a list is said to be h-sorted. For example, [2, 5, 4, 8, 9, 6] is 3-sorted, because the subsequences [2, 8], [5, 9] and [4, 6] are each sorted.

def shellSort(array):
    n = len(array)
    # Start with a large gap and shrink it each pass.
    interval = n // 2
    while interval > 0:
        # Perform a gapped insertion sort for this interval.
        for i in range(interval, n):
            temp = array[i]
            j = i
            while j >= interval and array[j - interval] > temp:
                array[j] = array[j - interval]
                j -= interval
            array[j] = temp
        interval //= 2
    return array

Shell Sort Complexity:
  • Time Complexity: Best O(n log n), Worst O(n²), Average O(n log n)
  • Space Complexity: O(1)
  • Stable: No


Evaluation

For our evaluation experiment, we used Python to implement the algorithms. We took a list of 5,000 numbers, sorted it 10 times with each algorithm and averaged the durations for stability purposes (the list is always the same, and the order of its elements is not changed between tests).

The Python implementation of the sorting algorithms is available here!

The PC used has an Intel(R) Core(TM) i7-8750H CPU @ 2.20 GHz and 16 GB of RAM. Since timings depend on the processor's efficiency, results may differ on other machines. Moreover, the implementations could probably be improved, and the number of runs (only 10 per algorithm) could be increased to get a better estimate of the time taken.
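
As an illustration, a timing harness along these lines could produce such measurements (this is a sketch, not our exact script; sort function names refer to the implementations above):

import random
import time

def benchmark(sort_fn, data, runs=10):
    # Average the wall-clock time of `runs` sorts of the same input.
    total = 0.0
    for _ in range(runs):
        copy = list(data)  # each run sorts a fresh copy of the same list
        start = time.perf_counter()
        sort_fn(copy)
        total += time.perf_counter() - start
    return total / runs

random.seed(42)  # keep the list identical across algorithms
numbers = [random.randint(0, 100000) for _ in range(5000)]
print("Selection Sort:", benchmark(selectionSort, numbers))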

Results

We have collected the average time for each algorithm in seconds:

  • Selection Sort, averaged over 10 sorts of 5,000 elements: 1.313 seconds
  • Insertion Sort, averaged over 10 sorts of 5,000 elements: 1.727 seconds
  • Bubble Sort, averaged over 10 sorts of 5,000 elements: 2.42 seconds
  • Quick Sort, averaged over 10 sorts of 5,000 elements: 0.078 seconds
  • Merge Sort, averaged over 10 sorts of 5,000 elements: 0.078 seconds
  • Shell Sort, averaged over 10 sorts of 5,000 elements: 0.099 seconds

According to our experiment, Merge Sort and Quick Sort seem to be the fastest algorithms. What do you think?
