This is a study guide I created to help prepare for the scary CS225 final exam at U of I! This is by no means complete, I only included content that I found unfamiliar.
- Lecture Slides (
/slides
) - Lecture Hangouts (
/handouts
) - Lecture Code (
/code
) - Lecture Notes by TAs (
/notes
) - Spring 2019 Course Website (Cloned Copy)
- C++ Basics
- List
- Iterator
- Stack
- Queue
- Binary Tree
- Traversal
- Search
- BST Tree (Binary Search Tree)
- AVL Tree
- k-d Tree
- B Tree
- Hashing
- Heaps (Priority Queues)
- Disjoint Sets
- Graphs
- Kruskal's MST (Minimum Spanning Tree)
- Prim's MST
- Dijkstra's SSSP (Single Source Shortest Path)
- Floyd-Warshall's APSP (All Pairs Shortest Path)
Separate the interface from implementation but still keep them as a cohesive unit.
Telling compiler that this particular file will be included only once within this single compilation.
An alias to an existing variable. Never creates new memory.
Stores a memory address of the instance instead of storing data.
Default type memory, starts near the top of memory.
- Starts from low memory values and grows up.
- Use
new
to create,delete
to free. - Memory leak is when we don't
delete
when done.
Cube joinCubes(const Cube & c1)
to prevent the parameters passed in to be changed.double getVolume() const;
means thatgetVolume()
guarantees it will not modify the value.
- Allow us to override the function in derived classes.
Let's say we have 2 classes...
class Animal
{
public:
_____ void eat() { std::cout << "I'm eating generic food."; }
};
class Cat : public Animal
{
public:
void eat() { std::cout << "I'm eating a rat."; }
};
and I create these in main.cpp
...
Animal *animal = new Animal;
Cat *cat = new Cat;
this should be easy...
animal->eat(); // Outputs: "I'm eating generic food."
cat->eat(); // Outputs: "I'm eating a rat."
but now with this new function I created in main.cpp
...
void func(Animal *xyz) { xyz->eat(); }
Case 1, void eat()...
in class Animal
(without virtual
):
func(animal); // Outputs: "I'm eating generic food."
func(cat); // Outputs: "I'm eating generic food."
Case 2, virtual void eat()...
in class Animal
(with virtual
):
func(animal); // Outputs: "I'm eating generic food."
func(cat); // Outputs: "I'm eating a rat."
- A dynamic data type
- Using
Template <typename T>
so that we do not need to write same function for various types
template <typename T>
T maximum(T a, T b) {
T result;
result = (a > b) ? a : b;
return result;
}
ADT describes functionality but not implementation details.
// Implementation
class List {
public:
/* ... */
private:
class ListNode {
T & data;
ListNode * next;
ListNode(T & data) : data(data), next(NULL) { }
};
};
// Find: Recursive
ListNode *& _index(unsigned index) {
return _index_helper(index, _head);
}
ListNode *& _index_helper(unsigned index, ListNode *& node) {
if (index == 0) {
return node;
} else {
return _index_helper(index - 1, node -> next);
}
// Find: Iterative
// Moving the pointer one by one until it's at the right location.
ListNode *& _index(unsigned index) {
if (index == 0) {
return head;
} else {
ListNode *thru = head;
for (unsigned i = 0; i < index - 1; i++) {
thru = thru->next;
}
return thru->next;
}
}
// Operator []
T & operator[](unsigned index) {
ListNode *& d = _index(index);
return d -> data;
}
// Insert
void nsert(const T & t, unsigned index) {
ListNode *& node = _index(index);
ListNode * newNode = new ListNode(t);
newNode -> next = node;
node = newNode;
}
To implement an iterator, must have
begin()
end()
operator++
operator*
operator!=
for (std::vector<Animal>::iterator it = zoo.begin(); it != zoo.end(); it++) {
std::cout << (*it).name << " " << (*it).food << std::endl;
}
Singly Linked List | ArrayList | |
---|---|---|
Insert/Remove at Front | O(1) | O(1) (amortized) |
Insert at a given element | O(1) | O(n) (needs to copy everything after) |
Remove a given element | O(1) | O(n) |
Insert an arbitrary element | O(n) | O(n) |
Remove at arbitrary location | O(n) | O(n) |
Like dining hall plates.
- push
- pop
Like a line is Disneyland.
- enqueue
- dequeue
Each node has at most 2 children.
Each node has either 2 or 0 children.
Full tree + all leaves same distance from root.
Full tree + perfect tree except last level, where all leaves are "pushed to the left."
To traverse a tree means we process every element exactly once in a tree.
// pre-order
void preOrder(TreeNode * cur) {
if (cur != NULL) {
print(cur->data); // print is some imaginary function
preOrder(cur->left);
preOrder(cur->right);
}
}
// in-order
void inOrder(TreeNode * cur) {
if (cur != NULL) {
inOrder(cur->left);
print(cur->data);
inOrder(cur->right);
}
}
// post-order
void postOrder(TreeNode * cur) {
if (cur != NULL) {
postOrder(cur->left);
postOrder(cur->right);
print(cur->data);
}
}
- Visit nearby nodes quickly by each level.
- In other words, visit the direct descendants quicker (level-order traversal).
- Find the endpoint of a path quickly, and to move deeper into the tree as quickly as possible.
- In other words, we visit the leaves as soon as possible.
Binary tree but
- Everything to the left of root < root
- Everything to the right of root > root
// Find
TreeNode *& _find(TreeNode *& root, const K & key) const {
If (root == NULL || key == root->key) {
return root; //root is null when we cannot find
}
If (key < root->key) {
return _find(root->left, key);
}
If (key > root->key) {
return _find(root->right, key);
}
}
- No child: Just remove
- 1 child: Swap with child then remove
- 2 children: Swap with IOP then remove, like this:
- Find inorder predecessor (IOP), the largest node in the left subtree (the rightmost node).
- Swap IOP and the node we want to delete.
- The node is now a leaf, so we can remove it.
Operation | BST Average Case | BST Worst Case | Sorted Array | Sorted List |
---|---|---|---|---|
find | O(lg n) | O(h) <= O(n) | O(lg n) with binary search | O(n) no binary search |
insert | O(lg n) | O(h) <= O(n) | O(n) with find O(lg n), move O(n) | O(n) find data with O(n) |
delete | O(lg n) | O(h) <= O(n) | O(n) | O(n) |
traverse | O(n) | O(n) | O(n) | O(n) |
b = height(TR) - height(TL)
- left heavy: b negative
- right heavy: b positive
- balanced: if |b| <= 1
- lowest point of imbalance: the deepest node in the tree that is out of balance
- 4 kinds of rotation (L, R, LR, RL)
- Simple rotations: stick
- Complex rotations: elbow
- All rotations local
- All rotations run in O(1)
Self-Balancing BST where the difference between heights of left and right subtrees cannot be more than 1.
To maintain height of tree:
- Check for imbalance.
- Correct it (do rotations).
- Update height.
- O(lg(n)) for all operations.
- Find: 0
- Insert: up to 1 (L, R, LR, or RL)
- Delete: up to h (O(lg(n)))
- Almost the same as AVL Trees.
- Maximal height is 2 * lg(n) = O(lg(n)).
Worst runtime | Unsorted Array | Sorted Array | Unsorted List | Sorted List | Binary Tree (unsorted) | BST | AVL |
---|---|---|---|---|---|---|---|
find | O(n) | O(lg n) Binary search | O(n) | O(n) | O(n) | O(h)<=n | O(lg n) |
insert | O(1)* InsertEnd and resize properly | O(n) Shifting up to ½ data | O(1) InsertFront | O(n) | O(1) Insert at root | O(h)<=n | O(lg n) |
remove | O(n) | O(n) | O(n) | O(n) | O(n) | O(h)<=n | O(lg n) |
traverse | O(n) | O(n) | O(n) | O(n) | O(n) | O(n) | O(n) |
*: amortized runtime
- aka k-dimensional tree.
- A BST that organizes points in k dimensional space.
- Every node in a k-d Tree contains one point.
- Every parent node splits the space into two subspaces based on a certain dimension.
- Every node in its left subtree is in the left subspace, and every node in its right subtree is in the right subspace.
- The dimension that a node is splitting on depends on which level of the tree this node is in.
- Useful for range and nearest neighbor searches.
- Each node splits the space in a certain dimension.
- Nodes on the i th level split the space in the i th dimension.
- If i is greater than k, the dimension wraps back around to i mod k.
Example: a 2-d k-d tree, each node on a odd level splits the x dimension, and each node on a even level splits the y dimension:
- Each node split the space such that number of nodes in the left subspace = right subspace.
- We pick the median among the nodes for the current dimension and make it the subroot.
Given this array to construct a k-d tree:
Find the median in terms of current dimension (dimension 1):
Partition so that smaller on left, greater on right:
Find the median in terms of current dimension (dimension 2):
Repeat this process until the array only consists of one node:
- Each node holes 2 arrays
- first holds the keys of the node
- second holds pointers to the children
- Keys within a node are ordered
- Leaves contain no more than
m-1
keys - Internal nodes have exactly one more child than keys
- Root nodes can have
2
tom
children - Internal nodes can have
ceil(m/2)
tom
children - All leaves are at the same depth, which makes the tree balanced
template <class K, class V>
V BTree<K, V>::find(const BTreeNode* subroot, const K& key) const
{
/* If first_larger_idx is a valid index and the key there is the key we
* are looking for, we are done. */
/* Otherwise, we need to figure out which child to explore. For this we
* can actually just use first_larger_idx directly. E.g.
* | 1 | 5 | 7 | 8 |
* Suppose we are looking for 6. first_larger_idx is 2. This means we want to
* explore the child between 5 and 7. The children vector has a pointer for
* each of the horizontal bars. The index of the horizontal bar we want is
* 2, which is conveniently the same as first_larger_idx. If the subroot is
* a leaf and we didn't find the key in it, then we have failed to find it
* anywhere in the tree and return the default V.
*/
size_t first_larger_idx = insertion_idx(subroot->elements, key);
if (first_larger_idx < subroot->elements.size() && subroot->elements[first_larger_idx].key == key) {
return subroot->elements[first_larger_idx].value;
}
if (subroot->is_leaf) {
return V();
} else {
return find(subroot->children[first_larger_idx], key);
}
}
- Search:
O(log n)
- Combining/Splitting of Keys:
O(m)
- Insert/Delete:
O((m/log m) * log n)
- The hash function: f(h) -> Integers
- The compression: Integer -> array
- Handling collisions
void insert(K key, V value) {
elems++;
if (shouldResize()) { resizeTable(); }
size_t idx = hash(key, size);
pair<K, V> insert(key, value);
// table is list<pair<K, V>>[size];
table[idx].push_front(insert);
}
void remove(K key) {
typename list<pair<K, V>>::iterator it;
size_t idx = hash(key, size);
for (it = table[idx].begin(); it != table[idx].end(); it++) {
if (it->first == key) {
table[idx].remove(it);
elems--;
return;
}
}
}
V find(K key) {
typename list<pair<K, V>>::iterator it;
size_t idx = hash(key, size);
for (it = table[idx].begin(); it != table[idx].end(); it++) {
if (it->first == key) {
return it->second;
}
}
}
void resizeTable() {
typename list<pair<K, V>>::iterator it;
size_t newSize = findPrime(s * size);
list<pair<K, V>> * newTable = new list<pair<K, V>>[newSize];
for (size_t i = 0; i < size; i++) {
for (it = table[i].begin(); it != table[i].end(); i++) {
size_t = newIdx = hash(key, size);
pair<K, V> newPair(it_.first, it->second);
newTable[newIdx].push_front(newPair);
}
}
size = newSize;
table = newTable;
}
Flags for whether or not to probe forward when looking at a particular cell in the table.
bool * should_probe;
void insert(K key, V value) {
elems++;
if (shouldResize()) { resizeTable(); }
size_t insertIndex = hash(key, size);
while (table[insertIndex] != NULL) {
insertIndex = ((insertIndex + 1) % size);
}
table[insertIndex] = new pair<K, V>(key, value);
should_probe[insertIndex] = true;
}
void remove(K key) {
int removeIndex = findIndex(key);
if (removeIndex != -1) {
delete table[removeIndex];
table[removeIndex] = NULL;
elems--;
}
}
V find(K & key) {
// see below
int idx = findIndex(key);
if (idx != -1) {
return table[idx]->second;
}
return V();
}
int findIndex(K & key) {
size_t findIndex = hash(key, size);
size_t orgIndex = findIndex;
while (should_probe[findIndex] == true) {
if (table[findIndex]->first == key) {
return findIndex;
}
findIndex = (findIndex + 1) % size;
if (findIndex == orgIndex) {
break;
}
}
return -1;
}
void resizeTable() {
size_t newSize = findPrime(size * 2);
pair<K, V>** resizeTemp = new pair<K, V>*[newSize];
delete[] should_probe;
should_probe = new bool[newSize];
for (size_t i = 0; i < newSize; i++) {
resizeTemp[i] = NULL;
should_probe[i] = false;
}
for (size_t i = 0; i < size; i++) {
if (table[i]) {
size_t index = hash(table[i]->first, newSize);
while (resizeTemp[index] != NULL) {
index = ((index + 1) % newSize);
}
resizeTemp[index] = table[i];
should_probe[index] = true;
}
}
delete[] table;
table = resizeTemp;
size = newSize;
}
void insert(K & key, V & value) {
elems++;
if (shouldResize()) {
resizeTable();
}
size_t insertIndex = hashes::hash(key, size);
size_t secondHash = hashes::secondary_hash(key, size);
while (table[insertIndex] != NULL) {
insertIndex = ((insertIndex + secondHash) % size);
}
table[insertIndex] = new pair<K, V>(key, value);
should_probe[insertIndex] = true;
}
void remove(K & key) {
int removeIndex = findIndex(key);
if (removeIndex != -1) {
delete table[removeIndex];
table[removeIndex] = NULL;
elems--;
}
}
V find(K & key) const {
int idx = findIndex(key);
if (idx != -1) return table[idx]->second;
return V();
}
int findIndex(K & key) {
size_t findIndex = hashes::hash(key, size);
size_t secondHash = hashes::secondary_hash(key, size);
size_t orgIndex = findIndex;
while (should_probe[findIndex]) {
if (table[findIndex]->first == key) {
return findIndex;
}
findIndex = (findIndex + secondHash) % size;
if (findIndex == orgIndex) {
break;
}
}
return -1;
}
void resizeTable() {
size_t newSize = findPrime(size * 2);
pair<K, V>** temp = new pair<K, V>*[newSize];
// create a new should_probe that fits the new size
delete[] should_probe;
should_probe = new bool[newSize];
// init the new table we created
for (size_t i = 0; i < newSize; i++) {
temp[i] = NULL;
should_probe[i] = false;
}
for (size_t slot = 0; slot < size; slot++) {
if (table[slot] != NULL) {
size_t h = hashes::hash(table[slot]->first, newSize);
size_t jump = hashes::secondary_hash(table[slot]->first, newSize);
size_t i = 0;
size_t idx = h;
while (temp[idx] != NULL) {
++i;
idx = (h + jump * i) % newSize;
}
temp[idx] = table[slot];
should_probe[idx] = true;
}
}
delete[] table;
table = temp;
size = newSize;
}
Whenever remove is called, the data structure pops out an element with a predetermined property (for example, the smallest element in minHeap).
- Map level order tree traversal to an array or vector.
- Use trees just for representation.
- Root starts at
1
for convenience.
void heapifyUp(idx) {
if (idx = root()) { return; }
size_t parent = parent(idx);
// if index is more "min" than parent
if (higherPiority(_elem[idx], _elem[parent])) {
swap(elem[idx], _elem[parent]);
heapiftUp(parent);
}
}
void heapifyDown(idx) {
if (hasAChild(idx) == false) { return; }
size_t minChild = maxPriorityChild(idx);
// if minChild is more "min" than index
if (higherPiority(_elem[minChild], _elem[idx])) {
swap(_elem[minChild], _elem[idx];
heapifyDown(minChild);
}
}
void DisjointSets::addelements(int num) {
for (int i = 0; i < num; i++) {
elems.push_back(-1);
}
}
int DisjointSets::find(int elem) {
if (elems[elem] < 0) {
return elem;
} else {
return find(elems[elem]);
}
}
void DisjointSets::setunion(int a, int b) {
int a_root = find(a);
int b_root = find(b);
int size = elems[a_root] + elems[b_root];
if (elems[a_root] <= elems[b_root]) {
elems[b_root] = a_root;
elems[a_root] = size;
} else {
elems[a_root] = b_root;
elems[b_root] = size;
}
}
int DisjointSets::size(int elem) {
if (elem >= (int)elems.size() || elem < 0) {
return 0;
} else {
return elems[find(elem)] * -1;
}
}
-
Edge List
- Vertex collection: Use a hash table (find/remove/insert will be O(1)).
- Edge collections: Use a linked list (hash table is not good because we have many collisions (no random distribution, violates SUHA) )
-
Adjacency Matrix
- Maintain a hash table of vertices and a list of edges.
- Add an nn matrix → store a pointer to the edge in edge list for every index in the matrix where the two vertices are adjacent.
-
ADJ List
- We will maintain a hash table of vertices, and every vertex in the table has a linked list of pointers which point to edges in the edge list.
- Elements from the edge list will point back to the hash table.
KruskalMST(G):
DisjointSets forest
foreach (Vertex v : G):
forest.makeSet(v)
PriorityQueue Q // min edge weight
foreach (Edge e : G):
Q.insert(e)
Graph T = (V, {})
while |T.edges()| < n-1:
Vertex (u, v) = Q.removeMin()
if forest.find(u) != forest.find(v):
T.addEdge(u, v)
forest.union( forest.find(u),
forest.find(v) )
return T
PrimMST(G, s):
Input: G, Graph;
s, vertex in G, starting vertex
Output: T, a minimum spanning tree (MST) of G
foreach (Vertex v : G):
d[v] = +inf
p[v] = NULL
d[s] = 0
PriorityQueue Q // min distance, defined by d[v]
Q.buildHeap(G.vertices())
Graph T // "labeled set"
repeat n times:
Vertex m = Q.removeMin()
T.add(m)
foreach (Vertex v : neighbors of m not in T):
if cost(v, m) < d[v]:
d[v] = cost(v, m)
p[v] = m
return T
Dijkstra(Graph, source, destination):
initialize distances // initialize tentative distance value
initialize previous // initialize a map that maps current node -> its previous node
initialize priority_queue // initialize the priority queue
initialize visited
while the top of priority_queue is not destination:
get the current_node from priority_queue
for neighbor in current_node's neighbors and not in visited:
if update its neighbor's distances:
previous[neighbor] = current_node
save current_node into visited
extract path from previous
return path and distance
Floyd-Warshall’s Algorithm solves the problem Dijkstra's algorithm has with negative edges.