Data Structures and Object-Oriented Programming

Chapter 12: Sorting and Searching Algorithms

  • Computers spend a lot of time putting data in order (sorting) or finding something inside ordered data (searching).
  • Good algorithms make these jobs much faster, especially as the list grows.

2. Selection Sort - "Pick the smallest each time"

What happensPicture it
Look through the whole list, find the smallest item, put it in front. Then repeat with the rest of the list.Like choosing the shortest player to stand first in a line, then the next shortest, and so on.
  • Steps

    1. Start at position 0.
    2. Scan the rest of the list to find the minimum.
    3. Swap it into position 0.
    4. Move to position 1 and repeat until the end.
  • Cost: always O(n2)O(n^2) comparisons (slow for big lists).

  • When it's fine: tiny lists, or when swapping is expensive but comparisons are cheap.


3. Insertion Sort - "Build the list like arranging playing cards"

  • You keep a sorted left side and insert the next item into its proper spot.
[sorted part] | key | [unsorted part]
  • In the worst case (list is backwards) it still does about n(n-1)/2 moves → O(n2)O(n^2).

  • Best case (nearly sorted): almost no moves → close to O(n)O(n)!

  • Stable (equal items keep their original order).

  • Great for:

    • Very small lists (it's what Python's built-in sort uses for tiny chunks).
    • Touch-ups on nearly-sorted data.

4. Merge Sort - "Divide, sort, and stitch together"

  1. Divide the list in half until you have single-element lists.
  2. Merge pairs of lists back together, always pulling out the smaller front item.
  • Time: Every level of the "split/merge" tree touches all n items, and the tree height is logz nO(nlogn)O(n \log n) guaranteed.
  • Space: Needs extra arrays during merging (not in-place).
  • Stable and predictable, good for huge datasets where worst-case time must be bounded.

5. Quick Sort - "Partition around a pivot"

  1. Pick a pivot value.
  2. Put smaller items on the left, larger on the right.
  3. Recursively sort the two halves.
  • Average/Best: O(nlogn)O(n \log n) - often faster than merge sort in practice because of cache friendliness and no extra arrays.
  • Worst: O(n2)O(n^2) if pivots are terrible (e.g., always choose the first element and the list is already sorted).
  • Tips to avoid worst-case: choose a random pivot or the median of three elements.
  • Not stable, but in-place (uses little extra memory).

6. Binary Search - "Halve the search space"

  • Works only on a sorted list.

  • Compare target with the middle item:

    • If equal → found.
    • If smaller → search left half.
    • If larger → search right half.
  • Repeats until the range is empty.

  • Time: each step halves the range → O(logn)O(\log n).

  • Return: often the index, or -1 if not present.


7. bisect Module - "Binary search built-in"

Python's bisect gives you the same power without writing the loop:

FunctionWhat it does
bisect_left(a, x)Index to insert x before equal items.
bisect_right(a, x)Index to insert x after equal items.
insort_left(a, x)Insert x in place, keeping the list sorted (shifts items).
insort_right(a, x)Same but after equal items.
  • Lookup is O(logn)O(\log n), insertion is O(n)O(n) because items move to make room.

8. Cheat-Sheet Comparison

AlgorithmTypical TimeWorst-CaseExtra SpaceStable?Good For
Selection SortO(n2)O(n^2)O(n2)O(n^2)O(1)O(1)Tiny lists, minimal swaps
Insertion SortO(n2)O(n^2) (see note 1)O(n2)O(n^2)O(1)O(1)Small or nearly-sorted data
Merge SortO(nlogn)O(n \log n)O(nlogn)O(n \log n)O(n)O(n)Large datasets, need stability
Quick SortO(nlogn)O(n \log n)O(n2)O(n^2)O(logn)O(\log n) (see note 2)General-purpose fastest
Binary SearchO(logn)O(\log n)-O(1)O(1)-Finding items in sorted lists

Notes

  1. Best case is close to O(n)O(n)
  2. Average extra space (call stack)

9. Key Take-Aways

  • Know your data size & ordering - simple O(n2)O(n^2) sorts are fine for tiny inputs.
  • Merge vs. Quick - merge is safer; quick is usually quicker.
  • Binary search is a huge speed-up once the data is already sorted.
  • Python tip: prefer the built-in sorted() or list.sort() (Timsort). They pick the best strategy automatically.
Back to Data Structures and Object-Oriented Programming