Welcome to Skyzhou's Notes
Welcome!
Notes available so far:
Some pages' $\LaTeX$ may not render right away; a manual refresh usually fixes it.
Sorry about that!
CSC3001 Discrete Mathematics
L0 Introduction
Main Course Topics
- Propositional Logic
- Sets, First-Order Logic
- Methods of Proofs
- Mathematical Induction
- Recursion
- Greatest Common Divisors
- Modular Arithmetic
- Chinese Remainder Theorem
- Graphs
- Graph Matching
- Graph Coloring
- Combinatorial Proofs and Their Principles
- Counting by Mapping
- Cardinality
L1 Propositional Logic
Statement (Proposition)
A Statement is a sentence that is either True or False.
e.g. $2+2=4 \to \text{True}$
Non-example: $x+y>0$, because it is true for some values of $x$ and $y$ but false for others.
Logic Operators
NOT $\lnot$
Alternative notation: $\lnot P = \overline{P}$
| $P$ | $\lnot P$ |
|---|---|
| T | F |
| F | T |
AND $\land$
| $P$ | $Q$ | $P\land Q$ |
|---|---|---|
| T | T | T |
| T | F | F |
| F | T | F |
| F | F | F |
OR $\lor$
| $P$ | $Q$ | $P\lor Q$ |
|---|---|---|
| T | T | T |
| T | F | T |
| F | T | T |
| F | F | F |
XOR $\oplus$
Exclusive OR
| $P$ | $Q$ | $P\oplus Q$ |
|---|---|---|
| T | T | F |
| T | F | T |
| F | T | T |
| F | F | F |
De Morgan's Laws
Logical Equivalence: Two statements are logically equivalent if they have the same truth table.
De Morgan's Laws:
$$ \lnot (P \land Q) \equiv \lnot P \lor \lnot Q $$ $$ \lnot (P \lor Q) \equiv \lnot P \land \lnot Q $$
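These laws can be verified mechanically. A throwaway C++ check (my own, not course material) that enumerates every truth assignment of $P$ and $Q$:
#include <cassert>
int main() {
    // Exhaustively check both laws over all four truth assignments of P and Q.
    for (int p = 0; p <= 1; ++p) {
        for (int q = 0; q <= 1; ++q) {
            bool P = p, Q = q;
            assert((!(P && Q)) == (!P || !Q));   // not(P and Q) == (not P) or (not Q)
            assert((!(P || Q)) == (!P && !Q));   // not(P or Q) == (not P) and (not Q)
        }
    }
    return 0;
}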
Tautology and Contradiction
A tautology is a statement that is always true.
- e.g. $P \lor \lnot P$
A contradiction is a statement that is always false. (Negation of a tautology)
- e.g. $P \land \lnot P$
Conditional Statement
If $\to$
If P then Q: $P \to Q \equiv \lnot P \lor Q$
| $P$ | $Q$ | $P \to Q$ |
|---|---|---|
| T | T | T |
| T | F | F |
| F | T | T |
| F | F | T |
Convention: when the hypothesis $P$ is false, the implication claims nothing that can be violated, so it is taken to be (vacuously) true.
Negation of If-Then: $\lnot(P \to Q) \equiv P \land \lnot Q$
Contrapositive: The contrapositive of $P \to Q$ is $\lnot Q \to \lnot P$
$$ P \to Q \equiv \lnot P \lor Q \equiv Q \lor \lnot P \equiv \lnot Q \to \lnot P $$
If and Only If $\iff$
P if Q means "if Q then P", or "Q implies P". We also say Q is a sufficient condition for P.
P only if Q means "if P then Q", or "P implies Q". We also say Q is a necessary condition for P.
P if and only if (iff) Q means P and Q are logically equivalent.
| $P$ | $Q$ | $P \iff Q$ |
|---|---|---|
| T | T | T |
| T | F | F |
| F | T | F |
| F | F | T |
$$ P \iff Q \equiv (P \to Q) \land (Q \to P) \equiv (P \to Q) \land (\lnot P \to \lnot Q) $$
Argument
An argument is a sequence of statements.
All statements but the final one are called assumptions or hypotheses.
The final statement is called the conclusion.
An argument is valid if: whenever all the assumptions are true, then the conclusion is true. Or we say the conclusion follows from the assumptions.
L2 Sets
Basic Definitions
Defining Sets
Set: A set is an unordered collection of distinct objects.
Elements / Members: The objects in a set are called the elements or members of the set S, and we say S contains its elements.
e.g. $S=\{2,3,5,7\} = \{3,7,2,5\}$ (Unordered)
Alternatively, we use the notation $\{x \in A | P(x)\}$ to define the set of elements $x$ in $A$ such that $x$ satisfies property $P$.
Classical Sets
- $\mathbb{Z}$: The set of all integers
- $\mathbb{Z^+}$: The set of all positive integers
- $\mathbb{Z^-}$: The set of all negative integers
- $\mathbb{N}$: The set of all nonnegative integers
- $\mathbb{R}$: The set of all real numbers
- $\mathbb{Q}$: The set of all rational numbers
- $\mathbb{C}$: The set of all complex numbers
Membership
The most basic question in set theory is whether an element is in a set.
- $A \subseteq B$: $A$ is a subset of $B$
- $A = B$: $A$ is equal to $B$
- $A \subset B$: $A$ is a proper subset of $B$
Operations on Sets
Basic Operations on Sets
Intersection: $A \cap B = \{x \in U | x \in A \text{ and } x \in B\}$
Two sets are said to be disjoint if their intersection is an empty set.
Union: $A \cup B = \{x \in U | x \in A \text{ or } x \in B\}$
And we have $|A \cup B| = |A|+|B| - |A \cap B|$
Complement: $\overline{A} = A^c = \{x \in U | x \notin A\}$
And if $A \subseteq B$, then $\overline{B} \subseteq \overline{A}$
Difference: $A - B = \{ x \in U | x \in A \text{ and }x \notin B \}$
And we have $|A-B| = |A| - |A \cap B|$
Big set operators: $$ \begin{align*} \bigcup_{i=0}^n A_i &= \{ x \mid x \in A_i \text{ for at least one } i = 0, 1, 2, \cdots, n \} \\ \bigcap_{i=0}^n A_i &= \{ x \mid x \in A_i \text{ for all } i = 0, 1, 2, \cdots, n \} \end{align*} $$
Partition of Sets
Partition: A collection of nonempty sets $\{ A_1, A_2, \cdots, A_n \}$ is a partition of a set A if and only if $$ \begin{align*} & A = A_1 \cup A_2 \cup \cdots \cup A_n \\ & A_1, A_2, \cdots, A_n \text{ are mutually disjoint (or pairwise disjoint)} \end{align*} $$
Cartesian Products
Cartesian Products: Given two sets $A$ and $B$, the Cartesian product $A \times B$ is the set of all ordered pairs $(a,b)$, where $a$ is in $A$ and b is in $B$. That is: $$ A \times B = \{(a,b) | a \in A, b \in B\} $$
e.g. Let $A = \{ a, b, c, \cdots, z \}$. Let $B = \{ 0, 1, 2, \cdots, 9 \}$. Then $A \times B = \{ (a,1), (a,2), \cdots, (a, 9), (b, 1), (b, 2), \cdots, (z, 9) \}$
Well... it is a bit like how the outer product of two vectors gives a matrix, but for sets.
Set Identities
Set Laws
Let $A$, $B$ and $C$ be subsets of a universal set $U$, and we have:
- Commutative Law
- $A \cup B = B \cup A$
- $A \cap B = B \cap A$
- Associative Law
- $(A \cup B) \cup C = A \cup (B \cup C)$
- $(A \cap B) \cap C = A \cap (B \cap C)$
- Distributive Law
- $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$
- $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$
- Identity Law
- $A \cup \emptyset = A$
- $A \cap U = A$
- Complement Law
- $A \cup A^c = U$
- $A \cap A^c = \emptyset$
- De Morgan's Law
- $(A \cup B)^c = A^c \cap B^c$
- $(A \cap B)^c = A^c \cup B^c$
- Set Difference Law
- $A - B = A \cap B^c$
L3 First-order Logic
Quantifiers
Predicates
Predicate: Propositions (i.e. statements) with variables.
- e.g. $P(x, y): x + 2 = y$
The domain of a variable is the set of all values that may be substituted in place of the variable.
The truth of a quantified statement depends on the domain.
The Universal Quantifier $\forall$
$\forall x$: For ALL $x$
- e.g. $\forall x, x^2 \geq x$
The Existential Quantifier $\exists$
$\exists x$: There EXISTS some $x$ (perhaps one)
- e.g. $\exists x \in \mathbb{Z}^+, x=x^2$
e.g. How to write prime(p) ?
A prime number is a natural number greater than $1$ that has no positive divisors other than $1$ and itself.
- Conditions: $p \not= ab$
- Variables: $p, a, b$
- Domains: $p > 1$, $p \in \mathbb{Z}$; $a, b \in \mathbb{Z}^+$
- Add quantifiers and reconcile the order:
$$ (p > 1) \land (p \in \mathbb{Z}) \land (\forall a, b \in \mathbb{Z}^+, (p \not= ab) \lor (a=1) \lor (a=p)) $$
Negation
One simple equivalence: $$ \lnot \forall x, P(x) \equiv \exists x, \lnot P(x) $$ e.g. "Not everyone likes you" = "There exists someone who doesn't like you."
Multiple Quantifiers
The order of quantifiers is very important.
Arguments of Quantified Statements
Predicate Calculus Validity
A quantified argument is valid if the conclusion is true whenever the assumptions are true.
Arguments with Quantified Statements
Universal Instantiation: $$ \begin{align*} & \forall x , P(x) \\ \therefore \ & P(a) \end{align*} $$
Universal Modus Ponens $$ \begin{align*} & \forall x , (P(x) \to Q(x)) \\ & P(a) \\ \therefore \ & Q(a) \end{align*} $$
Universal Modus Tollens: $$ \begin{align*} & \forall x, P(x) \to Q(x) \\ & \lnot Q(a) \\ \therefore \ & \lnot P(a) \end{align*} $$
Universal Generalization: $$ \frac{A \to R(c)}{A \to \forall x , R(x)} \qquad \\ \text{(valid rule, provided $c$ is independent of $A$)} $$
CSC3200 Data Structures and Advanced Programming
Introduction
Teaching C++ first.
Then data structures.
Assignment 1
Trigonometric Functions
Implement some of the trigonometric functions from scratch without using any mathematical functions from STL.
student_math.h
#ifndef STUDENT_MATH_H
#define STUDENT_MATH_H
namespace student_std {
double sin(double x); //x is an angle in radian
double sin_deg(double x); //x is an angle in degree
double cos(double x); //x is an angle in radian
double cos_deg(double x); //x is an angle in degree
double tan(double x); //x is an angle in radian
double tan_deg(double x); //x is an angle in degree
double cot(double x); //x is an angle in radian
double cot_deg(double x); //x is an angle in degree
}
#endif
student_math.cpp
#include "student_math.h"
const double pi = 3.1415926535897932384626;
const double esp = 1e-8;
double toRadian(double x) {
return pi*x/180.0;
}
double dcmp(double x) {
return (x < -esp ? -1 : (x > esp ? 1 : 0));
}
double fabs(double x) {
return (dcmp(x) == -1 ? -x : x);
}
double TrigonometricLagrange(double x, bool issin) {
while(x > 2.0*pi) x -= 2.0*pi;
while(x < -2.0*pi) x += 2.0*pi;
double numerator = 1, denominator = 1;
double timescnt = 1;
if(issin) {
numerator = x;
timescnt = 2;
}
double current = numerator/denominator, last = numerator/denominator;
double re = current;
while(fabs(current) > esp) {
current = current*x*x/(timescnt)/(timescnt+1.0)*(-1.0);
timescnt += 2;
re += current;
}
return (fabs(re) > esp ? re : 0);
}
namespace student_std {
double sin(double x) {
return TrigonometricLagrange(x, 1);
}
double sin_deg(double x) {
return sin(toRadian(x));
}
double cos(double x) {
return TrigonometricLagrange(x, 0);
}
double cos_deg(double x) {
return cos(toRadian(x));
}
double tan(double x) {
return sin(x)/cos(x);
}
double tan_deg(double x) {
return sin(toRadian(x))/cos(toRadian(x));
}
double cot(double x) {
return 1.0/tan(x);
}
double cot_deg(double x) {
return 1.0/tan(toRadian(x));
}
}
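A minimal usage sketch (my own test scaffold, not part of the assignment handout), assuming the two files above are compiled together:
#include <iostream>
#include "student_math.h"
int main() {
    // Known reference values: sin(30 deg) = 0.5, cos(60 deg) = 0.5, tan(45 deg) = 1.
    std::cout << student_std::sin_deg(30.0) << "\n";
    std::cout << student_std::cos_deg(60.0) << "\n";
    std::cout << student_std::tan_deg(45.0) << "\n";
    return 0;
}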
String
Implement a C++ string class from scratch without using STL.
student_string.h
#ifndef STUDENT_STRING_H
#define STUDENT_STRING_H
#define MAXLEN 256
namespace student_std {
class string {
public:
string();
string(const char* str);
string(string const&);
~string();
int get_strlen() const;
const char* get_c_str() const;
void strcat(string const&);
string& operator=(string const&);
string& operator+=(string const&);
char& operator[](int);
const char& operator[](int) const;
void to_upper();
void to_lower();
void strcopy(string const&);
bool equals(string const&) const;
bool equals_ignore_case(string const&) const;
void trim(); // Removes spaces ' ' from beginning and end
bool is_alphabetic() const;
private: // hint
char c_str[MAXLEN];
int strlen;
};
}
#endif
student_string.cpp
#include "student_string.h"
namespace student_std {
string::string() {
this->c_str[0] = '\0';
this->strlen = 0;
}
string::string(const char* str) {
int len = 0;
while(str[len] != '\0') {
len++;
}
this->strlen = len;
for(int i = 0; i <= len; i++) {
this->c_str[i] = str[i];
}
}
string::string(string const& str) {
this->strlen = str.strlen;
for(int i = 0; i <= this->strlen; i++) {
this->c_str[i] = str.c_str[i];
}
}
string::~string() {}
int string::get_strlen() const {
return this->strlen;
}
const char* string::get_c_str() const {
return this->c_str;
}
void string::strcat(string const& str) {
int len = this->strlen;
for(int i = 0; i <= str.strlen; i++) {
this->c_str[i+len] = str.c_str[i];
}
this->strlen = len+str.strlen;
}
string& string::operator=(string const& str) {
if(this == &str) {
return *this;
}
this->strlen = str.strlen;
for(int i = 0; i <= this->strlen; i++) {
this->c_str[i] = str.c_str[i];
}
return *this;
}
string& string::operator+=(string const& str) {
this->strcat(str);
return *this;
}
char& string::operator[](int i) {
return this->c_str[i];
}
const char& string::operator[](int i) const {
return this->c_str[i];
}
void string::to_upper() {
for(int i = 0; i < this->strlen; i++) {
if(this->c_str[i] >= 'a' && this->c_str[i] <= 'z') {
this->c_str[i] = char(c_str[i]+('A'-'a'));
}
}
}
void string::to_lower() {
for(int i = 0; i < this->strlen; i++) {
if(this->c_str[i] >= 'A' && this->c_str[i] <= 'Z') {
this->c_str[i] = char(c_str[i]+('a'-'A'));
}
}
}
void string::strcopy(string const& str) {
*this = str;
}
bool string::equals(string const& str) const {
if(this->strlen != str.strlen) return false;
for(int i = 0; i < this->strlen; i++) {
if(this->c_str[i] != str.c_str[i]) return false;
}
return true;
}
bool string::equals_ignore_case(string const& str) const {
string a = *this, b = str;
a.to_lower();
b.to_lower();
return a.equals(b);
}
void string::trim() {
int l = 0, r = this->strlen-1;
while(l < this->strlen && this->c_str[l] == ' ') {
l++;
}
while(r >= l && this->c_str[r] == ' ') {
r--;
}
for(int i = 0; i < r-l+1; i++) {
this->c_str[i] = this->c_str[i+l];
}
for(int i = r-l+1; i <= this->strlen; i++) {
this->c_str[i] = '\0';
}
this->strlen = r-l+1;
}
bool string::is_alphabetic() const {
for(int i = 0; i < this->strlen; i++) {
if(!((this->c_str[i] >= 'a' && this->c_str[i] <= 'z') || (this->c_str[i] >= 'A' && this->c_str[i] <= 'Z'))) {
return false;
}
}
return true;
}
}
Assignment 2
Vector
Implement a C++ vector (dynamic array) from scratch without using STL.
student_vector.h
#ifndef STUDENT_VECTOR_H
#define STUDENT_VECTOR_H
#include <cstddef>
#include <cassert>
#include <stdexcept>
#include <utility>
#include <iterator>
namespace student_std {
template <typename T>
class vector {
public:
using size_type = std::size_t;
using difference_type = std::ptrdiff_t;
using value_type = T;
private:
T* s;
size_type sz;
size_type cap;
void reallocate(size_type newcap) {
if(newcap < sz) newcap = sz;
T* news = new T[newcap];
for(size_type i = 0; i < sz; i++) {
news[i] = s[i];
}
delete[] s;
s = news;
cap = newcap;
}
public:
vector() {
s = nullptr;
sz = 0;
cap = 0;
}
vector(const vector& other) {
s = nullptr;
sz = other.sz;
cap = other.cap;
s = new T[cap];
for(size_type i = 0; i < sz; i++) {
s[i] = other.s[i];
}
}
vector& operator= (const vector& other) {
if(this == &other) return *this;
delete[] s;
sz = other.sz;
cap = other.cap;
s = new T[cap];
for(size_type i = 0; i < sz; i++) {
s[i] = other.s[i];
}
return *this;
}
~vector() {
delete[] s;
}
size_type size() const {
return sz;
}
size_type capacity() const {
return cap;
}
bool empty() const {
return sz == 0;
}
const T* data() const {
return s;
}
const T& at(size_type p) const {
if(p >= sz) throw std::out_of_range("vector::at out of range");
else return s[p];
}
const T& operator[] (size_type p) const {
assert(p >= 0 && p < sz);
return s[p];
}
const T& front() const {
assert(sz > 0);
return s[0];
}
const T& back() const {
assert(sz > 0);
return s[sz-1];
}
void reserve(size_type newcap) {
if(newcap > cap) {
reallocate(newcap);
}
}
void push_back(const T& val) {
if(sz == cap) {
size_type newcap = (cap == 0 ? 1 : cap * 2);
reserve(newcap);
}
s[sz] = val;
sz++;
}
void pop_back() {
assert(sz > 0);
sz--;
}
void resize(size_type newsz, const T& val) {
if(newsz > cap) {
reserve(newsz);
}
if(newsz > sz) {
for(size_type i = sz; i < newsz; i++) {
s[i] = val;
}
}
sz = newsz;
}
void resize(size_type newsz) {
resize(newsz, T());
}
void clear() {
sz = 0;
}
void swap(vector& other) {
std::swap(s, other.s);
std::swap(sz, other.sz);
std::swap(cap, other.cap);
}
T* data() {
return s;
}
T& at(size_type p) {
if(p >= sz) throw std::out_of_range("vector::at out of range");
else return s[p];
}
T& operator[] (size_type p) {
assert(p >= 0 && p < sz);
return s[p];
}
T& front() {
assert(sz > 0);
return s[0];
}
T& back() {
assert(sz > 0);
return s[sz-1];
}
public:
class iterator {
public:
using difference_type = std::ptrdiff_t;
using value_type = T;
using pointer = T*;
using reference = T&;
using iterator_category = std::random_access_iterator_tag;
private:
pointer ptr;
public:
iterator() {
ptr = nullptr;
}
iterator(pointer p) {
ptr = p;
}
reference operator*() const {
return *ptr;
}
pointer operator->() const {
return ptr;
}
iterator& operator++() {
ptr++;
return *this;
}
iterator operator++(int) {
iterator tmp = *this;
ptr++;
return tmp;
}
iterator& operator--() {
ptr--;
return *this;
}
iterator operator--(int) {
iterator tmp = *this;
ptr--;
return tmp;
}
iterator& operator+=(difference_type num) {
ptr += num;
return *this;
}
iterator& operator-=(difference_type num) {
ptr -= num;
return *this;
}
iterator operator+(difference_type num) const {
return iterator(ptr+num);
}
iterator operator-(difference_type num) const {
return iterator(ptr-num);
}
difference_type operator-(const iterator& other) const {
return ptr-other.ptr;
}
bool operator==(const iterator& other) const{
return ptr == other.ptr;
}
bool operator!=(const iterator& other) const{
return ptr != other.ptr;
}
bool operator<=(const iterator& other) const{
return ptr <= other.ptr;
}
bool operator>=(const iterator& other) const{
return ptr >= other.ptr;
}
bool operator<(const iterator& other) const{
return ptr < other.ptr;
}
bool operator>(const iterator& other) const{
return ptr > other.ptr;
}
};
public:
iterator begin() {
return iterator(s);
}
iterator end() {
return iterator(s+sz);
}
iterator erase(iterator it) {
assert(it >= begin() && it < end());
T* p = &(*it);
for(T* i = p; i+1 < s+sz; i++) {
*i = *(i+1);
}
sz--;
return iterator(p);
}
iterator erase(iterator it1, iterator it2) {
assert(it1 >= begin() && it1 < it2 && it2 <= end());
T* p1 = &(*it1);
T* p2 = &(*it2);
size_type erasecnt = p2 - p1;
for(T* i = p1; i+erasecnt < s + sz; i++) {
*i = *(i+erasecnt);
}
sz -= erasecnt;
return iterator(p1);
}
iterator insert(iterator it, const T& val) {
assert(it >= begin() && it <= end());
size_type p = it - begin();
if(sz == cap) reserve(cap == 0 ? 1 : cap * 2);
it = iterator(s + p);
for(size_type i = sz; i > p; i--) {
s[i] = s[i-1];
}
s[p] = val;
sz++;
return it;
}
};
}
#endif
Assignment 3
Maze
Too simple.
Priority Queue
Priority queue with list and $O(n)$ operation ??? What the fk are you doing CUHKSZ
student_priority_queue.h
#ifndef STUDENT_PRIORITY_QUEUE_H
#define STUDENT_PRIORITY_QUEUE_H
#include <list>
namespace student_std {
template <typename T, typename Container = std::list<T>>
class priority_queue {
public:
using container_type = Container;
using value_type = typename Container::value_type;
using size_type = typename Container::size_type;
// ...
private:
Container c;
static bool cmp(T a, T b) {
return b < a;
}
public:
priority_queue() = default;
priority_queue(const priority_queue &other) {
c = other.c;
}
priority_queue &operator = (const priority_queue &other) {
if(this != &other) c = other.c;
return *this;
}
value_type const& top() const {
return *c.begin();
}
void pop() {
c.erase(c.begin());
}
size_type size() const {
return c.size();
}
bool empty() const {
return c.empty();
}
void push(const value_type &val) {
auto it = c.begin();
while(it != c.end() && val < *it) it++;
c.insert(it, val);
}
void swap(priority_queue &other) {
c.swap(other.c);
}
};
}
#endif
Assignment 4
Unordered Map
Implement an unordered map using bucket hashing.
Bucket type: vector and list
student_unordered_map.h
#ifndef STUDENT_UNORDERED_MAP_H
#define STUDENT_UNORDERED_MAP_H
#include <vector>
#include <list>
#include <functional>
#include <utility>
#include <cassert>
namespace student_std {
template <typename Key, typename T, typename Hash = std::hash<Key>>
class unordered_map {
public:
using key_type = Key;
using mapped_type = T;
using size_type = std::size_t;
using difference_type = std::ptrdiff_t;
using value_type = std::pair<Key, T>;
using hasher = Hash;
using reference = value_type&;
using const_reference = const value_type&;
private:
std::vector <std::list <value_type>> buckets;
size_type sz = 0;
hasher hsh;
size_type index_for(Key const& k) const {
return hsh(k) % buckets.size();
}
double load_factor() const {
return double(sz) / double(buckets.size());
}
void rehash(size_type newsz) {
auto tmp = std::move(buckets);
buckets = std::vector <std::list <value_type>> (newsz);
for(auto& lst : tmp) {
for(auto& x : lst) {
size_type idx = index_for(x.first);
buckets[idx].push_back(std::move(x));
}
}
}
public:
unordered_map(size_type init_sz = 8) : buckets(init_sz) {}
size_type size() const {
return sz;
}
bool empty() const {
return sz == 0;
}
size_type bucket_count() const {
return buckets.size();
}
bool contains(key_type const& k) const {
size_type idx = index_for(k);
for(auto const& x : buckets[idx]) {
if(x.first == k) {
return true;
}
}
return false;
}
void clear() {
for(auto& x : buckets) {
x.clear();
}
sz = 0;
}
size_type erase(key_type const& k) {
size_type idx = index_for(k);
auto& lst = buckets[idx];
for(auto it = lst.begin(); it != lst.end(); it++) {
if(it->first == k) {
lst.erase(it);
sz--;
return 1;
}
}
return 0;
}
T& operator [](key_type const& k) {
size_type idx = index_for(k);
auto& lst = buckets[idx];
for(auto& x : lst) {
if(x.first == k) {
return x.second;
}
}
lst.emplace_back(k, T());
sz++;
if(load_factor() >= 2.0) {
rehash(buckets.size()*2);
}
idx = index_for(k);
auto& nlst = buckets[idx];
for(auto& x : nlst) {
if(x.first == k) {
return x.second;
}
}
assert(false);
static T re{};
return re;
}
};
}
#endif
BST's Inorder Iterator
Implement the binary tree traversal inorder forward iterator from scratch.
student_inorder_iterator.h
#ifndef STUDENT_INORDER_ITERATOR_H
#define STUDENT_INORDER_ITERATOR_H
#include <cstddef>
#include <iterator>
namespace student_std {
template <typename BinaryTree>
class inorder_iterator {
public:
using value_type = typename BinaryTree::value_type;
using difference_type = std::ptrdiff_t;
using iterator_category = std::forward_iterator_tag;
using reference = value_type const&;
using pointer = value_type const*;
private:
BinaryTree const* cur;
public:
inorder_iterator() : cur(nullptr) {}
inorder_iterator(BinaryTree const* node) {
cur = node;
if(cur) {
while(cur->left()) cur = cur->left();
}
}
reference operator*() const {
return cur->value();
}
pointer operator->() const {
return &(cur->value());
}
inorder_iterator& operator++() {
if(!cur) return *this;
if(cur->right()) {
cur = cur->right();
while(cur->left()) cur = cur->left();
return *this;
}
auto p = cur->parent();
while(p && p->right() == cur) {
cur = p;
p = p->parent();
}
cur = p;
return *this;
}
inorder_iterator operator++(int) {
inorder_iterator tmp = *this;
++(*this);
return tmp;
}
bool operator==(inorder_iterator const& other) const {
return cur == other.cur;
}
bool operator!=(inorder_iterator const& other) const {
return !(*this == other);
}
};
}
#endif
Assignment 5
AVL Tree
Implement an AVL Tree from scratch.
student_avl_tree.h
#ifndef STUDENT_AVL_TREE_H
#define STUDENT_AVL_TREE_H
#include <algorithm>
#include <functional>
#include <utility>
#include <cstddef>
#include <iterator>
namespace student_std {
template <typename Key, typename Comp = std::less<Key>>
class avl_tree {
class avl_node {
public:
using size_type = std::size_t;
using difference_type = std::ptrdiff_t;
avl_node(const Key& k, avl_node* p = nullptr) :
m_key(k), m_parent(p), m_left(nullptr), m_right(nullptr), m_size(1), m_height(0) {}
Key const& value() const { return m_key; };
avl_node const* parent() const { return m_parent; };
avl_node const* left() const { return m_left; };
avl_node const* right() const{ return m_right; };
size_type size() const { return m_size; }
std::ptrdiff_t height() const { return m_height; }
private:
size_type m_size;
std::ptrdiff_t m_height;
Key m_key;
avl_node* m_parent;
avl_node* m_left;
avl_node* m_right;
friend class avl_tree;
};
class iterator {
public:
using value_type = avl_node;
using reference = value_type const&;
using pointer = value_type const*;
using difference_type = std::ptrdiff_t;
using iterator_category = std::bidirectional_iterator_tag;
iterator(avl_node* node = nullptr) : m_node{node} {}
iterator(pointer node) : m_node{const_cast<avl_node*>(node)} {}
iterator& operator++() { // O(log n)
if (m_node == nullptr) return *this;
if (m_node->m_right) {
m_node = m_node->m_right;
while (m_node->m_left) {
m_node = m_node->m_left;
}
} else {
avl_node* p = m_node->m_parent;
while (p && m_node == p->m_right) {
m_node = p;
p = p->m_parent;
}
m_node = p;
}
return *this;
}
iterator operator++(int) { // O(log n)
iterator tmp = *this;
++(*this);
return tmp;
}
iterator& operator--() { // O(log n)
if (m_node == nullptr) return *this;
if (m_node->m_left) {
m_node = m_node->m_left;
while (m_node->m_right) {
m_node = m_node->m_right;
}
} else {
avl_node* p = m_node->m_parent;
while (p && m_node == p->m_left) {
m_node = p;
p = p->m_parent;
}
m_node = p;
}
return *this;
}
iterator operator--(int) { // O(log n)
iterator tmp = *this;
--(*this);
return tmp;
}
reference operator*() const { // O(1)
return *m_node;
}
pointer operator->() const { // O(1)
return m_node;
}
bool operator==(iterator const& other) const {
return m_node == other.m_node;
}
bool operator!=(iterator const& other) const {
return m_node != other.m_node;
}
private:
avl_node* m_node;
friend class avl_tree;
};
public:
using key_type = Key;
using node_type = avl_node;
using size_type = std::size_t;
using comparison = Comp;
using const_iterator = iterator;
avl_tree() : m_root(nullptr), m_comp(comparison()) {}
~avl_tree() { clear(m_root); }
iterator insert(Key const& key) { // O(log n)
if (!m_root) {
m_root = new avl_node(key);
return iterator(m_root);
}
avl_node* current = m_root;
avl_node* parent = nullptr;
while (current) {
parent = current;
if (m_comp(key, current->m_key)) {
current = current->m_left;
} else if (m_comp(current->m_key, key)) {
current = current->m_right;
} else {
return iterator(current);
}
}
avl_node* new_node = new avl_node(key, parent);
if (m_comp(key, parent->m_key)) {
parent->m_left = new_node;
} else {
parent->m_right = new_node;
}
rebalance_path(new_node);
return iterator(new_node);
}
iterator erase(Key const& key) { // O(log n)
avl_node* node = find_node(key);
if (!node) return end();
avl_node* node_to_delete = node;
avl_node* successor_to_return = nullptr;
if (node->m_right) {
successor_to_return = node->m_right;
while (successor_to_return->m_left) {
successor_to_return = successor_to_return->m_left;
}
} else {
avl_node* p = node->m_parent;
while (p && node == p->m_right) {
node = p;
p = p->m_parent;
}
successor_to_return = p;
}
if (node_to_delete->m_left && node_to_delete->m_right) {
avl_node* successor_swap = node_to_delete->m_right;
while (successor_swap->m_left) {
successor_swap = successor_swap->m_left;
}
node_to_delete->m_key = successor_swap->m_key;
// The original node now holds the successor's key, so it (not the node
// about to be freed) is the correct in-order successor to return.
successor_to_return = node_to_delete;
node_to_delete = successor_swap;
}
avl_node* child = (node_to_delete->m_left) ? node_to_delete->m_left : node_to_delete->m_right;
avl_node* parent_of_deleted = node_to_delete->m_parent;
avl_node* rebalance_start_node = parent_of_deleted;
if (child) {
child->m_parent = parent_of_deleted;
}
if (!parent_of_deleted) {
m_root = child;
} else {
if (parent_of_deleted->m_left == node_to_delete) {
parent_of_deleted->m_left = child;
} else {
parent_of_deleted->m_right = child;
}
}
delete node_to_delete;
if (rebalance_start_node) {
rebalance_path(rebalance_start_node);
}
return iterator(successor_to_return);
}
iterator find(Key const& key) const { // O(log n)
avl_node* res = find_node(key);
return iterator(res);
}
bool contains(Key const& key) const { // O(log n)
return find_node(key) != nullptr;
}
size_type size() const { // O(1)
return m_root ? m_root->m_size : 0;
}
std::ptrdiff_t height() const { // O(1)
return m_root ? m_root->m_height : -1;
}
std::ptrdiff_t getBL() const {
return get_balance_factor(m_root);
}
iterator begin() const { // O(log n)
if (!m_root) return iterator(static_cast<avl_node*>(nullptr));
avl_node* curr = m_root;
while (curr->m_left) {
curr = curr->m_left;
}
return iterator(curr);
}
iterator end() const {
return iterator(static_cast<avl_node*>(nullptr));
}
iterator root() const { // O(1)
return iterator(m_root);
}
private:
avl_node* m_root;
comparison m_comp;
avl_node* find_node(Key const& key) const { // O(log n)
avl_node* curr = m_root;
while (curr) {
if (m_comp(key, curr->m_key)) {
curr = curr->m_left;
} else if (m_comp(curr->m_key, key)) {
curr = curr->m_right;
} else {
return curr;
}
}
return nullptr;
}
void clear(avl_node* node) {
if (node) {
clear(node->m_left);
clear(node->m_right);
delete node;
}
}
std::ptrdiff_t get_height(avl_node* n) const {
return n ? n->m_height : -1;
}
size_type get_size(avl_node* n) const {
return n ? n->m_size : 0;
}
std::ptrdiff_t get_balance_factor(avl_node* n) const {
if (!n) return 0;
return get_height(n->m_left) - get_height(n->m_right);
}
void update_stats(avl_node* n) {
if (n) {
n->m_height = 1 + std::max(get_height(n->m_left), get_height(n->m_right));
n->m_size = 1 + get_size(n->m_left) + get_size(n->m_right);
}
}
void rotate_right(avl_node* y) {
avl_node* x = y->m_left;
avl_node* T2 = x->m_right;
x->m_right = y;
y->m_left = T2;
x->m_parent = y->m_parent;
y->m_parent = x;
if (T2) T2->m_parent = y;
if (x->m_parent) {
if (x->m_parent->m_left == y) x->m_parent->m_left = x;
else x->m_parent->m_right = x;
} else {
m_root = x;
}
update_stats(y);
update_stats(x);
}
void rotate_left(avl_node* x) {
avl_node* y = x->m_right;
avl_node* T2 = y->m_left;
y->m_left = x;
x->m_right = T2;
y->m_parent = x->m_parent;
x->m_parent = y;
if (T2) T2->m_parent = x;
if (y->m_parent) {
if (y->m_parent->m_left == x) y->m_parent->m_left = y;
else y->m_parent->m_right = y;
} else {
m_root = y;
}
update_stats(x);
update_stats(y);
}
void rebalance_path(avl_node* node) {
while (node) {
update_stats(node);
std::ptrdiff_t balance = get_balance_factor(node);
// Left Heavy
if (balance > 1) {
if (get_balance_factor(node->m_left) < 0) {
rotate_left(node->m_left); // LR
}
rotate_right(node); // LL or LR
}
// Right Heavy
else if (balance < -1) {
if (get_balance_factor(node->m_right) > 0) {
rotate_right(node->m_right); // RL
}
rotate_left(node); // RR or RL
}
node = node->m_parent;
}
}
};
}
#endif
CSC4120 Design and Analysis of Algorithms
L0 Introduction
Main Course Topics
- Key algorithm concepts (asymptotic complexity, divide-and-conquer)
- Basic data structure algorithms
- Heaps (priority queues)
- Binary search trees (scheduling)
- Hashing (cryptography, dictionaries)
- Graph searches (reachability analysis)
- Shortest paths (Google maps, navigation)
- Greedy algorithms (minimum spanning trees, Huffman codes)
- Dynamic programming (any optimization problem)
- Network flows
- Complexity (NP-complete, poly reductions, approximation algorithms)
Textbook: Algorithms by S. Dasgupta, C.H. Papadimitriou, and U.V. Vazirani.
L1 Basic Concepts of Algorithmic Performance
Asymptotic Complexity
What is $T(n)$?
$T(n)$ is the largest (worst) execution time over all inputs of size $n$.
Asymptotic Complexity
Goal: Capture the growth of $T(n)$ of an algorithm as the problem instance size $n \rightarrow \infty$
Key Idea: Keep only the rate of growth, i.e., the dominant term when $n \rightarrow \infty$, forgetting multiplicative constants and lower order terms.
Three Os
- $\Theta$: Grows asymptotically as fast as ($=$)
    - Tight Bound
    - $f(n) = \Theta(g(n)) \Leftrightarrow \exists c_1, c_2 > 0, n_0 \text{ s.t. } c_1 g(n) \leq f(n) \leq c_2 g(n) \text{ for } n \geq n_0$
    - e.g. $100n^2 + 5n - 5 = \Theta(n^2)$ and $0.1n^2 + 10n^{1.5} - 5 = \Theta(n^2)$
- $O$: Grows asymptotically at most as fast as ($\leq$)
    - Upper Bound
    - $f(n) = O(g(n)) \Leftrightarrow \exists c > 0, n_0 \text{ s.t. } f(n) \leq c g(n) \text{ for } n \geq n_0$
    - e.g. $2n^{1.5}-5 \leq 2n^{1.5} \leq 2n^2 \Rightarrow 2n^{1.5}-5 = O(n^{1.5}) = O(n^2)$
    - e.g. $n^{1.5} \leq n^2 \leq 2^n \Rightarrow n^{1.5} = O(n^2), n^2 = O(2^n), n^{1.5} = O(2^n)$
- $\Omega$: Grows asymptotically at least as fast as ($\geq$)
    - Lower Bound
    - $f(n) = \Omega(g(n)) \Leftrightarrow \exists c > 0, n_0 \text{ s.t. } f(n) \geq c g(n) \text{ for } n \geq n_0$
    - e.g. $5 \leq 2n \leq n^{1.5}+1 \leq 2^n \Rightarrow 2n = \Omega(1), 2^n = \Omega(n^{1.5})$
In Practice
Conclusions
- To measure the performance of algorithms we measure their asymptotic complexity.
- Big $O$ is used to describe an upper bound for the running time of an algorithm.
- Big $\Omega$ is used to describe a lower bound for the running time of an algorithm.
- Big $\Theta$ is used to denote asymptotic equivalence of the running times of two algorithms: when $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$, we have $f(n) = \Theta(g(n))$.
L2 Divide and Conquer
Divide and Conquer (Sorting and Master Theorem)
Basic Divide and Conquer
There are 3 steps:
- Divide the input into multiple smaller (usually disjoint) parts.
- Conquer each of the parts separately (using recursive calls).
- Combine results from different calls.
$$ T(n) = \text{divide time} \\ +T(n_1) + T(n_2) + \cdots + T(n_k) \\ +\text{combine time} $$
Sorting
A non-D&C solution: Insertion Sort
void insertionSort(int arr[], int n) {
for (int i = 1; i < n; i++) {
int key = arr[i];
int j = i - 1;
while (j >= 0 && arr[j] > key) {
arr[j + 1] = arr[j];
j--;
}
arr[j + 1] = key;
}
}
The complexity is $O(n^2)$ since it runs two nested loops of up to $n$ steps each.
But $O(n^2)$ is too slow for most problems. Can it be much faster? Of course. We can use divide and conquer here!
A D&C solution: Merge Sort
void merge(vector<int>& arr, int l, int m, int r) {
int n1 = m - l + 1;
int n2 = r - m;
vector<int> L(n1), R(n2);
for (int i = 0; i < n1; i++) L[i] = arr[l + i];
for (int j = 0; j < n2; j++) R[j] = arr[m + 1 + j];
int i = 0, j = 0, k = l;
while (i < n1 && j < n2) {
if (L[i] <= R[j]) arr[k++] = L[i++];
else arr[k++] = R[j++];
}
while (i < n1) arr[k++] = L[i++];
while (j < n2) arr[k++] = R[j++];
}
void mergeSort(vector<int>& arr, int l, int r) {
if (l < r) {
int m = l + (r - l) / 2;
mergeSort(arr, l, m);
mergeSort(arr, m + 1, r);
merge(arr, l, m, r);
}
}
From the function mergeSort we can see that we divide the array into $2$ parts and conquer them separately and recursively. And the merge function has time complexity $\Theta(n)$.
From the recursion tree we can see that there are only $\log n$ levels of recursion. Thus, we have
$$ \begin{align*} T(n) &= 2T(n/2) + cn \\ &= (cn) \log n + nT(1) \\ &= \Theta (n \log n) \end{align*} $$
Where $cn$ denotes the divide-and-merge time at each level, and $T(1)$ represents the cost of a leaf node of the recursion tree.
Geometric Sums
Consider a geometric series $$ S = 1 + x + x^2 + \cdots + x^h = \frac{x^{h+1}-1}{x-1} \ \ (\text{for } x \not= 1) $$ If $x < 1$, then $$ 1 + x + x^2 + \cdots + x^h \in \left[ 1, \frac{1}{1-x} \right] \Rightarrow S = \Theta(1) $$ If $x > 1$, then $$ 1 + x + x^2 + \cdots + x^h < \frac{x}{x-1} x^h \Rightarrow S = \Theta(x^h) $$
Master Theorem
Consider the recursion:
$$ T(n) = aT(n/b) + f(n) $$
- Number of levels: $\log_b n = \Theta(\log n)$
- Number of leaves: $L = a^{\log_b n} = n^{\log_b a}$
We use the recursion tree to calculate the cost. The cost of the $i$-th level (root at $i=0$) is: $$ \begin{align*} C_i &= (\text{Number of Nodes at Level } i) \times (f \text{ of Each Subproblem}) \\ &= a^i f(n/b^i) \end{align*} $$ So that, we have: $$ T(n) = \sum_{i=0}^h C_i = \sum_{i=0}^{\log_b n} a^i f(n/b^i) $$
We compare the cost near the root with the cost at the leaves to find the dominant term. The usual method is to compare $f(n)$ with $L(n) = n^{\log_b a}$.
There are 3 situations here:
- Geometrically Increasing (Leaves): $$ f(n) = O(L^{1-\epsilon}) \Rightarrow T(n) = \Theta(L) $$ e.g. $T(n) = 2T(n/2) + \sqrt{n}$, we have $a=2, b=2, L=n^{\log_b a} = n$, and $\sqrt{n} = n^{1/2} = O(L^{1-\epsilon})$, thus $T(n) = \Theta(n)$
- Roughly Equal Levels: $$ f(n) = \Theta(L) \Rightarrow T(n) = \Theta (L \log n) \\ f(n) = \Theta(L \log n) \Rightarrow T(n) = \Theta (L \log^2 n) $$ e.g. $T(n) = 2T(n/2) + n$, we have $L=n^{\log_2 2} = n$, and $n = \Theta(L)$, thus $T(n) = \Theta(n \log n)$
- Geometrically Decreasing (Root) + Regularity Condition: $$ f(n) = \Omega(L^{1+\epsilon}) \text{ and } af(n/b) \leq sf(n), s < 1 \Rightarrow T(n) = \Theta(f(n)) $$ e.g. $T(n) = 2T(n/2) + n^2$, we have $L=n^{\log_2 2} = n$, $n^2 = \Omega(L^{1+\epsilon})$, and $2f(n/2) = 2(n/2)^2 = \frac{1}{2} f(n)$, thus $T(n) = \Theta(n^2)$
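One more worked example (binary search, not one of the lecture examples) to see case 2 with $L = 1$: $$ T(n) = T(n/2) + 1: \quad a = 1,\ b = 2,\ L = n^{\log_2 1} = 1,\ f(n) = 1 = \Theta(L) \Rightarrow T(n) = \Theta(L \log n) = \Theta(\log n) $$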
Applications of D&C
Convex Hull
Given $n$ points $S = \{ (x_i, y_i) \mid i = 1, 2, \cdots, n \}$ s.t. all have different $x$ and $y$ coordinates, and no three of them lie on a line.
Convex Hull (CH): The smallest polygon containing all points in $S$. Represented by the sequence of points in the boundary in clockwise order.
How to use D&C here?
- Sort points by $x$-coordinate.
- For our input set $S$ of points:
- Divide (by $x$-coordinates) into left-half A and right-half B.
- Compute CH(A) and CH(B).
- Merge CH(A) and CH(B) to get CH(A+B).
It has the complexity time $T(n) = 2T(n/2) + \Theta(n) \rightarrow \Theta(n \log n)$.
BUT, how to merge?
Algorithm: Start with $(a_1, b_1)$. While $y(i, j)$ increases, keep moving by one step, either $a_i$ counterclockwise or $b_j$ clockwise. When it cannot increase anymore, we have found the upper tangent, say $(a_u, b_u)$. Similarly, find the lower tangent $(a_l, b_l)$. Then the CH = points of A clockwise from $a_l$ to $a_u$ + points of B clockwise from $b_u$ to $b_l$. Hence total time $\Theta(n)$
Median Finding / Peak Finding
Not interested.
Fast Fourier Transform
Multiplication of polynomials as list of coefficients.
Consider: $$ \begin{align*} A(x) &= a_0 + a_1 x + \cdots + a_n x^n = \sum_{i=0}^{n}a_i x^i \\ B(x) &= b_0 + b_1 x + \cdots + b_n x^n = \sum_{i=0}^{n}b_i x^i \end{align*} $$ To calculate: $$ \begin{align*} C(x) &= A(x) B(x) = c_0 + c_1 x + \cdots + c_{2n}x^{2n} =\sum_{i=0}^{2n}c_i x^i \\ \text{where }c_k &= a_0 b_k + a_1 b_{k-1} + \cdots + a_k b_0 = \sum_{i=0}^k a_i b_{k-i} \end{align*} $$
Complexity: For $c_k \rightarrow O(k)$, for $C(x) \rightarrow O(n^2)$
Can we do it faster? Like $O(n \log n)$?
Multiplication of polynomials as list of values.
Consider: A degree-$n$ polynomial is uniquely characterized by its values at any $n+1$ distinct points $x_0, x_1, \cdots, x_n$.
That is, if we fix any $n+1$ distinct points, we can specify any degree-$n$ polynomial $A(x) = a_0 + a_1 x + \cdots + a_n x^n$ by the $n+1$ values: $$ A(x_0), A(x_1), \cdots, A(x_n) $$ Then we can have: $$ \begin{align*} C(x) &= A(x) B(x) \\ C(x_i) &= A(x_i) B(x_i) \\ \text{where } i&=0,1,\cdots,2n \end{align*} $$ Polynomial multiplication is $O(n)$ using this representation!
But how can we get the coefficients of $C(x)$ ?
Evaluation and Interpolation
Evaluation: $a_0, a_1, \cdots, a_n \rightarrow A(x_0), A(x_1), \cdots, A(x_n)$
$$ A(x) = M_n(x) \cdot a \\ $$ $$ \begin{bmatrix} A(x_0) \\ A(x_1) \\ \vdots \\ A(x_{n}) \end{bmatrix}= \begin{bmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{bmatrix} \cdot \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_{n} \end{bmatrix} $$
Interpolation: $A(x_0), A(x_1), \cdots, A(x_n) \rightarrow a_0, a_1, \cdots, a_n$ $$ a = M_n^{-1}(x) \cdot A(x) \\ $$ $$ \begin{bmatrix} a_0 \\ a_1 \\ \vdots \\ a_n \end{bmatrix}= \begin{bmatrix} 1 & x_0 & x_0^2 & \cdots & x_0^n \\ 1 & x_1 & x_1^2 & \cdots & x_1^n \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & x_n & x_n^2 & \cdots & x_n^n \end{bmatrix} ^{-1} \cdot \begin{bmatrix} A(x_0) \\ A(x_1) \\ \vdots \\ A(x_n) \end{bmatrix} $$
Here $M_n$ is Vandermonde matrix.
Consider $C(x_i) = A(x_i) B(x_i)$, from the two matrix multiplication we can know that: If we have every $A(x_i)$ and $B(x_i)$ we can use $O(n)$ to multiply and get $C(x_i)$, then transfer it into every coefficient $c_i$.
But, using matrix multiplication directly needs time complexity $O(n^2)$. Can we optimize it to $O(n \log n)$? Yes, and here comes the Fast Fourier Transform (FFT).
FFT
Let's calculate evaluation using FFT first.
Consider a polynomial $P(x)$ whose length $n$ is even, s.t. $$ \begin{align*} P(x) &= a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1} \\ &= P_{\text{even}}(x^2) + x \cdot P_{\text{odd}}(x^2) \\ \text{where } P_{\text{even}}(x) &= a_0 + a_2 x + a_4 x^2 + \cdots + a_{n-2} x^{n/2-1} \\ P_{\text{odd}}(x) &= a_1 + a_3 x + a_5 x^2 + \cdots + a_{n-1} x^{n/2-1} \end{align*} $$
By doing this, we divide the length $n$ into two halves of length $n/2$. Then we conquer each half and combine them together.
Let $A_k$ denote $A(x_k)$.
The question is: How can we divide $A_k$?
And here's the most essential part: we use the primitive $n$-th root of unity $\omega$.
Now, think about a complex equation: $$ z^n = (re^{i\theta})^n = 1 $$ We can get the solutions: $$ \begin{align*} r &= 1 \\ \theta &= \frac{2k\pi}{n} \\ \text{where } k &= 0, 1, \cdots, n-1 \end{align*} $$ Thus, $$ \begin{align*} \omega_n &= e^{\frac{2\pi}{n}i} \\ \omega_n^k &= e^{\frac{2\pi}{n}ki} = e^{i\theta} \\ (\omega_n^k)^n &= 1 \end{align*} $$ For each $k$, we want to calculate: $$ \begin{align*} A_k &= P(\omega_n^k) \\ &= P_{\text{even}}(\omega_n^{2k}) + \omega_n^k \cdot P_{\text{odd}}(\omega_n^{2k}) \\ &= P_{\text{even}}(\omega_{n/2}^{k}) + \omega_n^k \cdot P_{\text{odd}}(\omega_{n/2}^{k}) \end{align*} $$ Since $$ \omega_n^{2k} = e^{\frac{2\pi}{n} 2ki} = e^{\frac{2\pi}{n/2} ki} = \omega_{n/2}^k $$ This gives the Discrete Fourier Transform (DFT) values $A_k$.
And we found that: $$ \begin{align*} A_{k+n/2} &= P(\omega_n^{k+n/2}) \\ &= P_{\text{even}}(\omega_{n/2}^{k+n/2}) + \omega_n^{k+n/2} \cdot P_{\text{odd}}(\omega_{n/2}^{k+n/2}) \\ &= P_{\text{even}}(\omega_{n/2}^{k} \cdot \omega_{n/2}^{n/2}) + \omega_n^k \cdot \omega_n^{n/2} \cdot P_{\text{odd}}(\omega_{n/2}^{k} \cdot \omega_{n/2}^{n/2}) \\ &= P_{\text{even}}(\omega_{n/2}^{k}) - \omega_n^k \cdot P_{\text{odd}}(\omega_{n/2}^{k}) \end{align*} $$ Since $$ \omega_n^n = e^{\frac{2\pi}{n}ni} = e^{2\pi i} = 1 \\ \omega_n^{n/2} = e^{\frac{2\pi}{n} (n/2)i} = e^{\pi i} = -1 $$
Combining $A_k$ and $A_{k+n/2}$ together for $0 \leq k < n/2$, we can do D&C here! Since $P_{\text{even}}(\omega_{n/2}^{k})$ and $P_{\text{odd}}(\omega_{n/2}^{k})$ can calculate from the half DFT.
The time complexity is $T(n) = 2T(n/2) + O(n) = O(n \log n)$. We get the whole evaluation.
IFFT
To do interpolation, we need the Inverse Fast Fourier Transform (IFFT).
Inverse FFT is easy to explain using Linear Algebra.
It's easy since: $$ a = M_n^{-1}(\omega) \cdot A(\omega) = \frac{1}{n} M_n(\omega^{-1}) \cdot A(\omega) $$ And it's over.
Let $A(\omega) = \text{FFT}(a, \omega)$, then $a = \frac{1}{n} \text{FFT}(A, \omega^{-1})$.
The Whole Process
- Pad $A(x)$ and $B(x)$ with zeros to length $n=2^k$ so that the FFT's divide and conquer splits evenly.
- Calculate $A(\omega^k)$ and $B(\omega^k)$ for each $k=0,1,\cdots,n-1$, where $\omega = e^{\frac{2\pi}{n}i}$ using FFT.
- Calculate $C(\omega^k) = A(\omega^k) \cdot B(\omega^k)$.
- Calculate $c_i$ using IFFT.
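To make the whole pipeline concrete, here is a minimal recursive FFT sketch in C++ (my own illustration, not course-provided code; the function names fft and multiply are arbitrary). fft implements the even/odd split derived above, and multiply runs pad, evaluate, pointwise product, interpolate.
#include <cmath>
#include <complex>
#include <vector>

using cd = std::complex<double>;
const double PI = std::acos(-1.0);

// Evaluate the polynomial with coefficient vector a (size n, a power of two)
// at all n-th roots of unity. If invert is true, use omega^{-1} instead,
// which (after dividing by n) gives the inverse transform.
void fft(std::vector<cd>& a, bool invert) {
    int n = a.size();
    if (n == 1) return;
    std::vector<cd> even(n / 2), odd(n / 2);
    for (int i = 0; i < n / 2; i++) {
        even[i] = a[2 * i];
        odd[i] = a[2 * i + 1];
    }
    fft(even, invert);                                // DFT of P_even
    fft(odd, invert);                                 // DFT of P_odd
    double ang = 2 * PI / n * (invert ? -1 : 1);
    for (int k = 0; k < n / 2; k++) {
        cd w = std::polar(1.0, ang * k);              // omega_n^k
        a[k] = even[k] + w * odd[k];                  // A_k
        a[k + n / 2] = even[k] - w * odd[k];          // A_{k+n/2}
    }
}

// Multiply two integer polynomials given as coefficient lists.
std::vector<long long> multiply(std::vector<long long> A, std::vector<long long> B) {
    std::vector<cd> fa(A.begin(), A.end()), fb(B.begin(), B.end());
    int n = 1;
    while (n < (int)(A.size() + B.size())) n <<= 1;   // pad with zeros to a power of two
    fa.resize(n);
    fb.resize(n);
    fft(fa, false);                                   // evaluation
    fft(fb, false);
    for (int i = 0; i < n; i++) fa[i] *= fb[i];       // pointwise products C(omega^k)
    fft(fa, true);                                    // interpolation (IFFT)
    std::vector<long long> c(n);
    for (int i = 0; i < n; i++) c[i] = std::llround(fa[i].real() / n);  // divide by n
    return c;
}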
L3 Randomized Algorithms
Basic Concepts of Randomized Algorithms
Randomized Algorithms: On the same input, different executions of the same algorithm may result in different running times and different outputs.
Type of polynomial randomized algorithms:
- Monte Carlo: Bounded number of steps, but might not find the answer.
- Runs in polynomial time, $\Pr[\text{answer is correct}] = p_0 > \frac{1}{2}$.
- Always able to check the answer for correctness.
- How to use: Repeat if necessary, use the majority of answers.
- Las Vegas: No guarantee on max number of steps, always finds answer.
- Runs in expected polynomial time.
- Does not stop until it finds the correct answer.
$\overline{T}(n)$: Expected time to terminate for a given input of size $n$.
Quicksort
Consider combining D&C and randomization:
- Pick a pivot $x$ of the array uniformly at random.
- Divide: Partition the array into subarrays $A$ (elements $< x$) and $B$ (elements $> x$) using $x$.
- Conquer: Recursively sort $A$, $B$.
- Combine: $A$, $x$, $B$.
$\overline{T}(n) = \Theta(n \log n)$ for all inputs. $\Rightarrow$ Random selection of $x$ ensures that no particular input is consistently bad.
Or we can say $\Theta(n \log n)$ is the average complexity of quicksort.
Freivalds' Algorithm
Given $n \times n$ matrices $A$, $B$ and $C$, check that $AB = C$ (yes or no).
It's very slow to use brute force algorithm since matrices multiplication is $O(n^3)$.
Consider randomization (Freivalds' Algorithm):
- Choose a random binary vector $r$.
- If $A(Br) \equiv Cr \pmod 2$ then output "yes", else output "no".
Property: If $AB \not= C$, then $\text{Pr}[A(Br) \not= Cr] \geq \frac{1}{2}$.
That is, do 10 tests; if all say "yes", the probability of error is $< \left( \frac{1}{2} \right)^{10}$.
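A sketch of one way to implement the check in C++ (my own code, using exact integer arithmetic over a random 0/1 vector rather than the mod-2 formulation above; the function names are illustrative):
#include <random>
#include <vector>

using matrix = std::vector<std::vector<long long>>;

// One round of Freivalds' check: pick a random 0/1 vector r and compare
// A(Br) with Cr, which costs only O(n^2) instead of O(n^3).
bool freivalds_once(const matrix& A, const matrix& B, const matrix& C, std::mt19937& rng) {
    int n = A.size();
    std::vector<long long> r(n), Br(n, 0), ABr(n, 0), Cr(n, 0);
    for (int i = 0; i < n; i++) r[i] = rng() & 1;
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) Br[i] += B[i][j] * r[j];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) ABr[i] += A[i][j] * Br[j];
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++) Cr[i] += C[i][j] * r[j];
    return ABr == Cr;
}

// Repeat the test; if AB != C, each round wrongly says "yes" with probability
// <= 1/2, so after `rounds` independent rounds the error probability is < (1/2)^rounds.
bool probably_equal(const matrix& A, const matrix& B, const matrix& C, int rounds = 10) {
    std::mt19937 rng(std::random_device{}());
    for (int t = 0; t < rounds; t++)
        if (!freivalds_once(A, B, C, rng)) return false;  // definitely AB != C
    return true;                                          // probably AB == C
}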
L4 Lower Bounds and Heaps
Comparison Model
Comparison Sort
All the sorting algorithms we've seen so far are comparison sort (merge sort, insertion sort, bubble sort, etc.).
And the best running time that we've seen for comparison sort is $O(n \log n)$. Is it the best we can do?
Decision-Tree
A given recipe for sorting $n$ numbers $a_1, a_2, \cdots, a_n$.
Nodes are suggested comparisons: $i, j$ means compare $a_i$ to $a_j$ for $i, j \in \{1, 2, \cdots, n\}$
The figure below is an example of decision-tree of sorting $a_1 = 9, a_2 = 4, a_3 = 6$.
From the tree we can see that:
- A path from the root to the leaves of the tree represents a trace of comparisons that the algorithm may perform.
- The running time of the algorithm = the length of the path taken.
- Worst-case running time = height(depth) of tree
The tree must contain $\geq n!$ leaves since there are $n!$ possible permutations. And a height-$h$ binary tree has $L_h \leq 2^h$ leaves.
Hence: $$ n! \leq L_h \leq 2^h \Rightarrow h \geq \log(n!) \Rightarrow h = \Omega(n \log n) $$
By Stirling's approximation: $$ n! \approx \sqrt{2 \pi n} \left(\frac{n}{e}\right)^n \\ \log(n!) \approx n \log n - n \log e + \frac{1}{2} \log (2 \pi n) = \Theta(n \log n) $$
Theorem: Any decision tree that can sort $n$ elements must have height $\Omega(n \log n)$.
Linear Time Sorting
As mentioned above, comparison sorting has a lower-bound complexity of $\Omega(n \log n)$, so does linear time sorting exist?
Counting Sort
Given: An array $a_0, a_1, \cdots, a_{n-1}$ of $n$ keys to be sorted.
Count how many times each value between $a_{min}$ and $a_{max}$ appears, then use the prefix-sum array to find the final position of each element. The time complexity is $O(n+k)$, where $k = a_{max} - a_{min} + 1$.
void counting_sort(vector<int>& a) {
int n = a.size();
int mn = *min_element(a.begin(), a.end());
int mx = *max_element(a.begin(), a.end());
int k = mx - mn + 1;
vector<int> count(k, 0);
for (int x : a) count[x - mn]++;
for (int i = 1; i < k; i++) count[i] += count[i - 1];
vector<int> res(n);
for (int i = n - 1; i >= 0; i--) {
int x = a[i];
res[--count[x - mn]] = x;
}
a = res;
}
Radix Sort
Use a radix(base) $k$ for an integer $x$: $$ x = a_{d-1} k^{d-1} + \cdots + a_0 k^0 \to a_{d-1}a_{d-2}\cdots a_0 $$
Idea: Sort on least-significant digit first with auxiliary stable sort.
We are given $n$ integers, each integer $< M$, written in base $k$ with $d = \lceil \log_k M \rceil$ digits. Then we have time complexity $O((n+k)d) = O((n+k) \log_k M)$.
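A sketch of LSD radix sort in C++ (my own illustration; the name radix_sort and the default base 256 are arbitrary choices), reusing the stable counting-sort idea above one digit at a time:
#include <algorithm>
#include <vector>

// Sort nonnegative integers by processing digits in base k, least significant
// first, with a stable counting sort per digit: O((n + k) * d) total.
void radix_sort(std::vector<int>& a, int k = 256) {
    if (a.empty()) return;
    int mx = *std::max_element(a.begin(), a.end());
    std::vector<int> res(a.size());
    for (long long exp = 1; mx / exp > 0; exp *= k) {         // one pass per digit
        std::vector<int> count(k, 0);
        for (int x : a) count[(x / exp) % k]++;
        for (int i = 1; i < k; i++) count[i] += count[i - 1];  // prefix sums
        for (int i = (int)a.size() - 1; i >= 0; i--)           // stable placement
            res[--count[(a[i] / exp) % k]] = a[i];
        a = res;
    }
}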
Heaps
Storing Objects in Data Structures
We operate on objects of the form: object $x = (x.\text{key}, x.\text{data})$.
The attribute $x.\text{key}$ uniquely characterizes the object $x$. And we store $x.\text{key}$ combined with a pointer to the location of $x.\text{data}$ in memory.
Inserting and searching for object $x$ is done by inserting and searching for $x.\text{key}$. Or we can say we mostly focus on $x.\text{key}$ instead of $x.\text{data}$.
Priority Queue
$S$: set of elements $\{ x_1, x_2, \cdots, x_n \}$.
$x_i = (\text{key}_i, \text{data}_i)$.
Operations: $\text{Insert}(S,x), \max(S), \text{Extract-max}(S), \text{Incr-key}(S, x, k)$.
Heap
Heap: An array visualized as a nearly complete binary tree.
We navigate the tree conceptually but move through the array: the array carries the full information of the tree, without pointers.
Max(Min) Heap Property: the key of a node is $\geq (\leq)$ than the keys of its children.
Binary tree structure stored in an array ($1$-indexed), with index arithmetic as follows:
int s[MAXN], sz;
int parent(int i) {
return i >> 1; // i/2
}
int left(int i) {
return i << 1; // i*2
}
int right(int i) {
return i << 1 | 1; // i*2+1
}
And the height of the tree is $h = \log n$.
Max Heapify: Correct a single violation of the heap property occurring at the root $i$ of a subtree in $O(\log n)$.
void max_heapify(int i) {
int l = left(i), r = right(i), maxi = i;
if(l <= sz && s[l] > s[maxi]) maxi = l;
if(r <= sz && s[r] > s[maxi]) maxi = r;
if(maxi != i) {
swap(s[i], s[maxi]);
max_heapify(maxi);
}
}
Build Max Heap: Convert an array $A$ into a max heap. And we find that elements from $A_{\lfloor n/2 \rfloor+1}$ to $A_n$ are leaves of the tree, and they won't violate the rule. So we only need to heapify $A_1$ to $A_{\lfloor n/2 \rfloor}$. The time complexity is $O(n)$.
void build_max_heap(int n) {
sz = n;
for(int i = n/2; i >= 1; i--) max_heapify(i);
}
Insert: We first put the new element at the back of the array, then go up the tree level by level until there is no violation.
void insert(int val) {
sz++;
s[sz] = val;
int i = sz;
while(i > 1 && s[parent(i)] < s[i]) {
swap(s[i], s[parent(i)]);
i = parent(i);
}
}
Extract Max: The max element must be the first one of the array, we can just swap it with the last element, then heapify.
int extract_max() {
if(!sz) return -1;
int maxn = s[1];
s[1] = s[sz];
sz--;
max_heapify(1);
return maxn;
}
Heap Sort: Like extracting max element for $n$ times. Time complexity $O(n \log n)$.
void heap_sort() {
int n = sz;
for(int i = n; i >= 2; i--) {
swap(s[1], s[i]);
sz--;
max_heapify(1);
}
}
L5 Trees
Binary Search Tree
A Binary Search Tree (BST) is a binary tree in which every node $x$ satisfies:
- All keys in its left subtree are less than $x.\text{key}$
- All keys in its right subtree are greater than $x.\text{key}$
And we can easily see that the left subtree and right subtree of $x$ are both BSTs.
Define Node: Using C++ struct and pointers.
struct node {
int key;
node* left;
node* right;
node(int k): key(k), left(NULL), right(NULL) {}
};
Insert: Start from the root; if the $\text{key}$ is less, go to the left, else to the right. Do the whole process recursively until an empty place is found, and put the node there. Average time complexity $O(\log n)$, but worst case $O(n)$.
node* insert(node* root, int key) {
if(!root) return new node(key);
if(key < root->key) {
root->left = insert(root->left, key);
} else if(key > root->key) {
root->right = insert(root->right, key);
}
return root;
}
Build Tree: Actually there is no separate build step; you just insert each new key starting from the root.
node* root = NULL;
root = insert(root, a);
Inorder Traversal: The inorder traversal of a BST visits the keys in sorted order, due to the BST property.
void printTree(node* root) {
if(!root) return;
printTree(root->left);
cout << root->key << " ";
printTree(root->right);
}
Search: Typical binary search on a tree.
node* search(node* root, int key) {
if(!root || root->key == key) {
return root;
}
if(key < root->key) {
return search(root->left, key);
} else {
return search(root->right, key);
}
}
Delete: A bit complicated.
- Leaf (no children): Delete directly.
- One child: Replace the node with its child.
- Two children: Find the minimum of the right subtree, copy its key into the node to be deleted, then delete that minimum node from the right subtree.
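The deleteNode below relies on a findMin helper (the leftmost node of a subtree holds the minimum key); since it is not defined until the AVL section, here is a minimal version for this node type:
node* findMin(node* root) {
    while(root->left) root = root->left;   // keep going left until the smallest key
    return root;
}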
node* deleteNode(node* root, int key) {
if(!root) return NULL;
if(key < root->key)
root->left = deleteNode(root->left, key);
else if(key > root->key)
root->right = deleteNode(root->right, key);
else {
if(!root->left && !root->right) {
delete root;
return NULL;
} else if(!root->left) {
node* temp = root->right;
delete root;
return temp;
} else if(!root->right) {
node* temp = root->left;
delete root;
return temp;
} else {
node* temp = findMin(root->right);
root->key = temp->key;
root->right = deleteNode(root->right, temp->key);
}
}
return root;
}
Warning: We usually insert distinct keys, but if duplicate keys can occur, we can keep a count in each node, or treat the right subtree as holding keys greater than or equal to the node's key.
AVL Tree
An AVL (Adelson-Velskii and Landis) Tree is a self-balancing binary search tree.
Property: For every node, store its height:
- Leaves have height $0$
- NIL has height $-1$
Invariant: For every node $x$, the heights of its left child and right child differ by at most $1$.
And we can know that AVL trees always have height $\Theta(\log n)$.
Define Node: Like BST, but with height.
struct node {
int key, height;
node* left;
node* right;
node(int k): key(k), height(1), left(NULL), right(NULL) {}
};
Get Height: For convenience, I take leaves height as $1$, NIL height as $0$.
int getHeight(node* u) {
return u ? u->height : 0;
}
Get Balance: Use the definition of balance, $\text{BL} = \text{height}(\text{left subtree}) - \text{height}(\text{right subtree})$. Notice that if $\text{BL} > 1$, the left subtree is heavy; if $\text{BL} < -1$, the right subtree is heavy; otherwise the tree is balanced at this node.
int getBalance(node* u) {
return u ? getHeight(u->left) - getHeight(u->right) : 0;
}
Right Rotate:
A B
/ \ / \
B T3 -> T1 A
/ \ / \
T1 T2 T2 T3
node* rightRotate(node* u) {
node* v = u->left;
node* w = v->right;
v->right = u;
u->left = w;
u->height = max(getHeight(u->left), getHeight(u->right)) + 1;
v->height = max(getHeight(v->left), getHeight(v->right)) + 1;
return v;
}
Left Rotate:
A B
/ \ / \
T1 B -> A T3
/ \ / \
T2 T3 T1 T2
node* leftRotate(node* u) {
node* v = u->right;
node* w = v->left;
v->left = u;
u->right = w;
u->height = max(getHeight(u->left), getHeight(u->right)) + 1;
v->height = max(getHeight(v->left), getHeight(v->right)) + 1;
return v;
}
Insert: Consider insertion with 4 situations (L and R mean inserting into the left or right subtree):
- LL: Do right rotation once.
- RR: Do left rotation once.
- LR: Do left rotation, then right rotation.
- RL: Do right rotation, then left rotation.
node *insert(node* u, int key) {
if(!u) return new node(key);
if(key < u->key) u->left = insert(u->left, key);
else if(key > u->key) u->right = insert(u->right, key);
else return u;
u->height = max(getHeight(u->left), getHeight(u->right)) + 1;
int BL = getBalance(u);
if(BL > 1 && key < u->left->key) return rightRotate(u); // LL
if(BL < -1 && key > u->right->key) return leftRotate(u); // RR
if(BL > 1 && key > u->left->key) { // LR
u->left = leftRotate(u->left);
return rightRotate(u);
}
if(BL < -1 && key < u->right->key) { // RL
u->right = rightRotate(u->right);
return leftRotate(u);
}
return u;
}
Delete: The same as deletion in a BST, then check whether the tree is still balanced.
node* findMin(node* u) {
while(u->left) u = u->left;
return u;
}
node* deleteNode(node* u, int key) {
// BST Deletion first
if(!u) return NULL;
if(key < u->key) u->left = deleteNode(u->left, key);
else if(key > u->key) u->right = deleteNode(u->right, key);
else {
if(!u->left || !u->right) {
node* temp = u->left ? u->left : u->right;
if(!temp) {
temp = u;
u = NULL;
} else {
*u = *temp;
}
delete temp;
} else {
node* temp = findMin(u->right);
u->key = temp->key;
u->right = deleteNode(u->right, temp->key);
}
}
if(!u) return NULL;
// Keep balance
u->height = max(getHeight(u->left), getHeight(u->right)) + 1;
int BL = getBalance(u);
if(BL > 1 && getBalance(u->left) >= 0)
return rightRotate(u); // LL
if(BL > 1 && getBalance(u->left) < 0) {
u->left = leftRotate(u->left);
return rightRotate(u); // LR
}
if(BL < -1 && getBalance(u->right) <= 0)
return leftRotate(u); // RR
if(BL < -1 && getBalance(u->right) > 0) {
u->right = rightRotate(u->right);
return leftRotate(u); // RL
}
return u;
}
Time Complexity: In balanced BST, all operations take $O(\log n)$.
Red-Black Tree
Shit.
Interval Tree
An Interval Tree is a balanced BST whose nodes are intervals.
Define Node: An interval has $l$ and $r$, while $\text{maxn}$ is the maximum $r$ over all intervals in the whole subtree, and we use $l$ as the $\text{key}$ of the BST.
struct interval {
int l, r;
};
struct node {
interval i;
int maxn;
node* left;
node* right;
node(interval in): i(in), maxn(in.r), left(NULL), right(NULL) {}
};
Insert: Like regular BST, using $l$ as the $\text{key}$, and record $\text{maxn}$.
node* insert(node* u, interval i) {
if(!u) return new node(i);
if(i.l < u->i.l) u->left = insert(u->left, i);
else u->right = insert(u->right, i);
u->maxn = max(u->maxn, i.r);
return u;
}
Search: To get all the intervals that overlap with $i$, there are 3 cases:
- If the node has overlap with $i$, record the answer.
- If its left subtree's $\text{maxn}$ is at least $i.l$, search in the left subtree.
- If the node's $l$ is at most $i.r$, search in the right subtree.
bool isOverlap(interval a, interval b) {
return a.l <= b.r && a.r >= b.l;
}
void searchAll(node* u, interval i, vector<interval>& re) {
if(!u) return;
if(isOverlap(u->i, i)) re.push_back(u->i);
if(u->left && u->left->maxn >= i.l) searchAll(u->left, i, re);
if(u->right && u->i.l <= i.r) searchAll(u->right, i, re);
}
The regular Interval Tree is like a plain BST with worst-case time complexity $O(n)$ per operation, but it can be built on an AVL tree or another balanced tree, reducing the time complexity to $O(\log n)$.
vEB Tree
A vEB (van Emde Boas) Tree is a data structure that supports insert, delete, successor, and predecessor on an integer set $S \subseteq \{ 0, 1, \cdots, U-1 \}$ in $O(\log \log U)$ time per operation, where $U$ is the size of the universe (e.g. $U=2^{16}$).
L6 Hashing
Basic Concepts of Hashing
Hash Function: A function that maps data of arbitrary size to fixed-sized values. And the goal is to propose a data structure such that for key $k$:
- $\text{add}(k), \text{delete}(k) = O(1)$
- $\text{search}(k) = O(1)$ on the average
From what I've learnt, the most important thing is not the precise hash function itself, but the core idea of hashing.
Hash Table: A table that stores the indices and values produced by hashing.
Making Choices
Choosing the Keys:
- By Nature: Randomly from $U$. (Simple uniform hashing)
- By an Opponent: who knows the hash function we use (when it is fixed) and picks keys so as to create the longest possible chain. (Universal hashing)
Choosing the Hash Functions:
- We use a fixed hash function everyone knows.
- Before each experiment we change the hash function, choosing it randomly from a "universal" set of functions known to everyone.
Special Case (perfect hashing): We know the keys, then choose the hash function.
Performance: Average length of the chain in a typical slot (keys with the same hash value).
Models for Randomness
Simple Uniform Hashing Assumption (SUHA)
$h$ is fixed: In every experiment each key $k$'s value is chosen randomly (uniformly) from $U$.
- Then, $h$ is "good" if $h(k)$ scatters uniformly over the hash table: $$ P_k[h(k) = i] = \frac{1}{m}, \forall i = 1, 2, \cdots, m $$
- We call this the Simple Uniform Hashing Assumption.
Universal Hashing
The set of keys to be stored is assumed fixed: In every experiment, $h$ is randomly (uniformly) chosen from a set $H$.
- Then, $H$ is "good" if $$ P_h[h(k_1) = h(k_2)] \leq \frac{1}{m}, \forall k_1 \not= k_2 $$
- $H$ is a set of Universal Hash Functions
Actually, Universal Hashing is not a single hash function but a way of designing a set of hash functions; a function is chosen from the set at random each time it is used. In other words, Universal Hashing achieves SUHA-like behavior through randomization, in order to decrease the probability of collisions.
Universal Hashing Function Examples
The Matrix Method
Settings:
- Keys are $u$-bits long, $|U| = 2^u$
- Hash values are $b$-bits long, hash table size $m = 2^b$
We pick $h$ to be a random $b \times u$ $0/1$ matrix and define: $$ h(x) = h \cdot x \pmod 2 $$
which means matrix-vector multiplication modulo $2$.
Claim: For $x \not= y, P_h[h(x) = h(y)] \leq \frac{1}{m} = \frac{1}{2^b}$
e.g. $$ x = 10 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, h = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 0 \end{bmatrix} $$ And we get $$ h(x) = h \cdot x = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = 6 $$
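A small sketch of the matrix method (my own illustrative code; the matrix is drawn with `mt19937` and a fixed seed, which is an assumption, not something specified in the notes):
#include <bits/stdc++.h>
using namespace std;

// h(x) = M * x (mod 2), where M is a random b x u 0/1 matrix
struct MatrixHash {
    int u, b;
    vector<vector<int>> M;
    MatrixHash(int u, int b): u(u), b(b), M(b, vector<int>(u)) {
        mt19937 rng(12345);                                  // fixed seed so the example is reproducible
        for (auto& row : M) for (int& bit : row) bit = rng() & 1;
    }
    int hash(unsigned x) {                                   // key x has u bits
        int out = 0;
        for (int i = 0; i < b; i++) {
            int bit = 0;
            for (int j = 0; j < u; j++)
                bit ^= M[i][j] & ((x >> (u - 1 - j)) & 1);   // dot product mod 2
            out = (out << 1) | bit;                          // assemble the b output bits
        }
        return out;                                          // a value in [0, 2^b)
    }
};

int main() {
    MatrixHash h(4, 3);                                      // u = 4 key bits, b = 3 hash bits (m = 8)
    cout << h.hash(10) << " " << h.hash(11) << endl;
    return 0;
}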
Linear Hashing for Vector Keys
Settings:
- Suppose keys are vectors $x = (x_1, x_2, x_3, x_4)$, where each $x_i$ is an $8$-bit integer (i.e. $0 \leq x_i \leq 255$).
- We use a hash table of size $m = p$, where $p$ is a prime number slightly greater than $255$ (e.g. $257$).
Choose the vector of coefficients $a = (a_1, a_2, a_3, a_4)$ where $a_i$ is (uniformly) randomly selected in the range $\{0, 1, \cdots, m-1\}$ And we define: $$ h_a(x) = \left( \sum_{i=1}^4 a_i x_i \right) \bmod m $$
So the hash family is $$ H = \{ h_a : a \in [0, m-1]^4 \} $$
Claim: For $x \not= y, P_a[h_a(x) = h_a(y)] = \frac{1}{m}$
Affine Hashing for Integer Keys
Also called the Carter-Wegman construction
Settings:
- Keys are integers $k \in U$
- Pick a prime number $p$ s.t. $|U| \leq p \leq 2|U|$
- Table size $m \leq p$
We randomly choose $a \in \{ 1, 2, \cdots, p-1 \}, b \in \{ 0, 1, \cdots, p-1 \}$, and define: $$ h_{a,b}(k) = ((a \cdot k + b) \bmod p) \bmod m $$
Claim: For $x \not= y, P_{a,b}[h_{a,b}(x) = h_{a,b}(y)] \leq \frac{1}{m}$
e.g. $|U| = 2^4 = 16, p = 17, m = 6, k = 8, a = 3, b = 4$, then: $$ h_{3,4}(8) = ((3 \cdot 8 + 4) \bmod 17) \bmod 6 = 5 $$
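A minimal sketch of the Carter-Wegman construction, reusing the numbers from the example above ($p = 17$, $m = 6$); drawing $a, b$ with `mt19937` is my own choice for illustration:
#include <bits/stdc++.h>
using namespace std;

struct AffineHash {
    long long p, m, a, b;                       // h_{a,b}(k) = ((a*k + b) mod p) mod m
    AffineHash(long long p, long long m, unsigned seed): p(p), m(m) {
        mt19937 rng(seed);
        a = 1 + rng() % (p - 1);                // a in {1, ..., p-1}
        b = rng() % p;                          // b in {0, ..., p-1}
    }
    long long operator()(long long k) const { return ((a * k + b) % p) % m; }
};

int main() {
    AffineHash h(17, 6, 42);
    cout << h(8) << endl;                       // some slot in {0, ..., 5}
    // The fixed example from the notes, a = 3, b = 4:
    cout << ((3 * 8 + 4) % 17) % 6 << endl;     // 5
    return 0;
}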
Rolling Hash
Sliding window, but in hashing.
Rabin-Karp Algorithm
Consider a string $S = s_0 s_1 s_2 \cdots s_{n-1}$. We define a window of length $k$ s.t. $T = s_i s_{i+1} \cdots s_{i+k-1}$. And we use the polynomial hash function: $$ H_i = H(T) = (s_i \cdot p^{k-1} + s_{i+1} p^{k-2} + \cdots + s_{i+k-1} \cdot p^0) \bmod M $$ Where $p$ is the base, $M$ is a large prime number and $s_i$ is the ASCII number.
From the function above we can get: $$ H_{i+1} = ((H_i - s_i \cdot p^{k-1}) \cdot p + s_{i+k}) \bmod M $$
Then each window's hash can be updated in $O(1)$, so we can check in $O(1)$ per position whether a given string of length $k$ appears in $S$.
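A runnable sketch of Rabin-Karp substring search using the rolling hash above (the base $p = 131$ and modulus $M = 10^9 + 7$ are my own choices; on a hash match the strings are compared directly to rule out collisions):
#include <bits/stdc++.h>
using namespace std;
typedef long long LL;

const LL P = 131, MOD = 1e9 + 7;

// Return the first index where pattern t appears in s, or -1.
int rabinKarp(const string& s, const string& t) {
    int n = s.size(), k = t.size();
    if (k > n) return -1;
    LL pk = 1, ht = 0, hs = 0;                       // pk = P^(k-1) mod MOD
    for (int i = 0; i < k - 1; i++) pk = pk * P % MOD;
    for (int i = 0; i < k; i++) {
        ht = (ht * P + t[i]) % MOD;                  // hash of the pattern
        hs = (hs * P + s[i]) % MOD;                  // hash of the first window
    }
    for (int i = 0; ; i++) {
        if (hs == ht && s.compare(i, k, t) == 0) return i;              // verify to avoid false positives
        if (i + k >= n) return -1;
        hs = ((hs - s[i] * pk % MOD + MOD) * P + s[i + k]) % MOD;       // slide the window by one
    }
}

int main() {
    cout << rabinKarp("abracadabra", "cad") << endl;  // 4
    cout << rabinKarp("abracadabra", "xyz") << endl;  // -1
    return 0;
}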
Table Doubling
We define Load Factor: $$ \alpha = \frac{n}{m} $$ Where $n$ is the element number in hash table, $m$ is the bucket number of hash table.
We know we need a hash table to store the keys and values, but if the number of elements in the table grows too large, there will be more and more collisions, which slows hashing down.
So when $\alpha$ becomes too large (usually when it reaches $1$), we double the bucket number $m$ and rehash all the elements into the new table.
Average Cost of Insertions $=\frac{2^{i+2}c + 2^i c_0}{2^i} = O(1)$
Or observe that, up to any time $t$, paying $O(1)$ per insert covers the total actual cost at all times. We say that insert has an amortized cost of $O(1)$.
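A tiny sketch of table doubling on a growable array (illustrative only; a real hash table would rehash keys into new buckets, here we just copy elements to show where the $O(n)$ cost comes from):
#include <bits/stdc++.h>
using namespace std;

struct GrowArray {
    int n = 0, m = 1;                    // n elements stored, m slots allocated
    int* a = new int[1];
    long long copies = 0;                // total elements copied by all doublings
    void insert(int x) {
        if (n == m) {                    // load factor reached 1: double and copy everything
            int* b = new int[2 * m];
            for (int i = 0; i < n; i++) { b[i] = a[i]; copies++; }   // the O(n) step
            delete[] a; a = b; m *= 2;
        }
        a[n++] = x;                      // the O(1) step
    }
};

int main() {
    GrowArray t;
    for (int i = 0; i < (1 << 20); i++) t.insert(i);
    // Total copy work is 1 + 2 + 4 + ... < 2n, so the amortized cost per insert is O(1).
    cout << "inserts = " << t.n << ", copies = " << t.copies
         << ", copies per insert = " << (double)t.copies / t.n << endl;
    return 0;
}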
Perfect Hashing
Bloom Filters
L7 Amortized Analysis
Why Amortized Complexity?
Think about a doubling hash table: some operations are cheap, like computing a hash in $O(1)$, but some are expensive, like the $O(n)$ rehash when the table doubles.
Amortized analysis is a way to bound the average cost of each operation over any sequence of operations, without assuming randomized input.
Actual Cost and Amortized Cost
$c(t) = $ actual cost of the operation at time $t$. It can be a function of the state of the data structure, e.g. depends on $n$ = number of elements stored at time $t$.
$a(t) = $ amortized cost of the operation at time $t$. Assume $a(t) = O(1) \ \forall t$, i.e., $a(t) \leq c_a \ \forall t$
Suppose that we have shown that on any trajectory $0, 1, \cdots, T$, $$ \sum_{0 \leq t \leq T} c(t) \leq \sum_{0 \leq t \leq T} a(t) $$
Which means at any operation, the total actual cost $\leq$ the total budget cost we assigned. That is, our budget is enough to cover all the cost.
For any length $T$, $$ \sum_{0 \leq t \leq T} c(t) \leq \sum_{0 \leq t \leq T} a(t) \leq c_a T $$
Amortization Methods
Aggregate Method
The easiest way of amortization.
$$ \text{amortized cost per op} = \frac{\text{total cost of } k \text{ ops}}{k} $$
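For instance, the aggregate method applied to the doubling table above (a worked calculation under the simplifying assumption that $n$ is a power of two, with $c_0$ the cost of a plain insert and $c$ the cost of copying one element during a doubling):
$$ \text{total cost of } n \text{ inserts} \leq n c_0 + c\,(1 + 2 + 4 + \cdots + n) \leq n c_0 + 2nc $$
$$ \text{amortized cost per op} = \frac{n c_0 + 2nc}{n} = c_0 + 2c = O(1) $$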
Amortized Bound
Define an amortized cost $a_{\text{op}}(t) = O(1)$ to each operation type s.t. for any possible sequence of operations, the total cost is covered. That is, $$ \sum(\text{amortized costs}) \geq \sum(\text{actual costs}) $$
Methods: Accounting method, Charging method, Potential method.
L8 Graphs
Graph Definition
$G = (V, E)$
$V$: A set of vertices. $|V|$ usually denoted by $n$.
$E \subseteq V \times V$: A set of edges (pair of vertices). $|E|$ usually denoted by $m$.
$m \leq n^2 = O(n^2)$
Two flavors:
- Directed graph: Edges have order.
- Undirected graph: Ignore order. Then only $\frac{n(n-1)}{2}$ possible edges.
Problems Related to Graphs
Reachability: Find all states reachable from a given state.
Shortest Paths: Find the shortest path between state $s_0$ and $s_d$.
Cycle: Find if a directed graph has a cycle.
Topological Sort: Find a causal ordering of the nodes if it exists.
Minimum Spanning Tree: Find the minimum subgraph that contains all nodes.
Strongly Connected Components: Maximal sets of nodes such that every node in the set can reach every other node in the set (mixing).
Graph Representation
Four representations with pros/cons (a small adjacency-list sketch follows the list):
- Adjacency Lists (of neighbors for each vertex)
- Incidence Lists (of edges from each vertex)
- Adjacency Matrix (0-1, which pairs are adjacent)
- Implicit representation (as a find neighbor function)
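A minimal sketch of the adjacency-list representation used by the DFS/BFS code below (the `MAXN` constant and 1-indexed vertices follow the convention of the later snippets):
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100005;

int n, m;
vector<int> g[MAXN];            // g[u] = list of neighbors of u

int main() {
    cin >> n >> m;
    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        g[u].push_back(v);      // directed edge u -> v
        // g[v].push_back(u);   // add the reverse edge as well for an undirected graph
    }
    for (int u = 1; u <= n; u++) {
        cout << u << ":";
        for (int v : g[u]) cout << " " << v;
        cout << "\n";
    }
    return 0;
}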
DFS
Depth-First Search (DFS) is a versatile linear-time procedure that reveals a wealth of information about a graph.
int n, m;
vector <int> g[MAXN];
int vis[MAXN]; // 0: WHITE, 1: GREY, 2: BLACK
int d[MAXN], f[MAXN], cnt;
void dfs(int u) {
cnt++;
d[u] = cnt;
vis[u] = 1;
for(int x = 0; x < g[u].size(); x++) {
int v = g[u][x];
if(!vis[v]) dfs(v);
if(vis[v] == 1) cout << "Cycle!" << endl;
}
cnt++;
f[u] = cnt;
vis[u] = 2;
}
Properties of DFS
When DFS a graph, we will record two time stamp of each vertex:
- $d[u]$: Discover Time, when $u$ is first visited.
- $f[u]$: Finish Time, when we finish exploring all neighbors of $u$.
Of course, it's easy to see that $u.d < u.f$
Then we can get $4$ different types of edges; for an edge $u \to v$:
- Tree Edge: When we first discover a new vertex $v$ from $u$.
- $u.d < v.d < v.f < u.f$
- Back Edge: An edge that points back to an ancestor in the DFS tree.
- $v.d < u.d < u.f < v.f$
- Forward Edge: An edge that goes from a node to its descendant in the DFS tree, but not a direct tree edge.
- $u.d < v.d < v.f < u.f$
- Cross Edge: An edge that goes between two different DFS branches (neither ancestor nor descendant).
- $v.d < v.f < u.d < u.f$
White Path Theorem
We colored the vertex in DFS:
- White: Not yet visited
- Gray: Currently being explored
- Black: Finished exploring
If, during a DFS, there is a path from a vertex $u$ to a vertex $v$ that only goes through white vertices, then $v$ will become a descendant of $u$ in the DFS tree.
We can use it to detect cycle in graph: When we see a back edge, or meet grey node, means we found a path from a node $u$ back to one of its ancestors $v$, and there is a cycle.
Topological Sort
Topological Sort of a DAG $G=(V,E)$.
Theorem: $G$ is a DAG iff there is a linear ordering of all its $v \in V$ s.t., if $(u, v) \in E$, then $u < v$ in the above ordering. Or we say $(u,v)$ goes always from left to right.
Algorithm: Given DAG $G$.
- Run any DFS($G$), compute finishing time of nodes.
- Linear order: The nodes in decreasing order of finishing times.
Corollary: In a DAG, $(u, v) \in E \implies u.f > v.f$ for any DFS. A sketch of the resulting algorithm:
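This sketch reuses the DFS convention above: each finished vertex is appended to a list, and reversing the list gives decreasing finish times (it assumes the input graph is a DAG):
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100005;

int n, m;
vector<int> g[MAXN];
int vis[MAXN];
vector<int> order;                         // vertices in increasing finish time

void dfs(int u) {
    vis[u] = 1;
    for (int v : g[u]) if (!vis[v]) dfs(v);
    order.push_back(u);                    // u finishes here
}

int main() {
    cin >> n >> m;
    for (int i = 0; i < m; i++) {
        int u, v; cin >> u >> v;
        g[u].push_back(v);
    }
    for (int u = 1; u <= n; u++) if (!vis[u]) dfs(u);
    reverse(order.begin(), order.end());   // decreasing finish time = topological order
    for (int u : order) cout << u << " ";
    cout << endl;
    return 0;
}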
BFS
Breadth-First Search (BFS) explores a graph level by level, visiting all neighbors of a node before moving deeper. It is usually implemented with a queue.
Used very often in Shortest Path Algorithms.
int n, m;
vector <int> g[MAXN];
bool vis[MAXN];
void bfs(int st) {
queue <int> q;
q.push(st);
vis[st] = true;
while(!q.empty()) {
int u = q.front(); q.pop();
for(int x = 0; x < g[u].size(); x++) {
int v = g[u][x];
if(!vis[v]) {
vis[v] = true;
q.push(v);
}
}
}
}
SCC
Strongly Connected Component (SCC): Maximal set of vertices $C \subseteq V$ such that for every pair of vertices $u, v \in C$, we have both $u \leadsto v$ and $v \leadsto u$.
$u \leadsto v$ means there is a path from $u$ to $v$.
Kosaraju's Algorithm
Goal: Find all SCCs.
Key Observation: If you collapse each SCC into a single node, the resulting graph is a DAG. This DAG has a topological order.
If you start a DFS on a node located in a sink SCC (an SCC with no outgoing edges in the SCC DAG), the DFS will visit all nodes within that SCC and then stop, without leaking into other SCCs. You could then remove this SCC and repeat the process.
Idea: Reverse the direction of all edges in $G$ to create the transpose graph $G^T$. Run a DFS on $G$ to compute finish times, then run a second DFS on $G^T$, starting vertices in decreasing order of finish time; every tree of the second DFS is exactly one SCC.
int n, m;
vector<int> g[MAXN]; // original graph
vector<int> gr[MAXN]; // reversed graph
int vis[MAXN];
int d[MAXN], f[MAXN], cnt;
vector<int> order; // store vertices by finish time
vector<int> component; // store one SCC
void dfs1(int u) {
vis[u] = 1;
cnt++;
d[u] = cnt;
for (int v : g[u]) {
if (!vis[v])
dfs1(v);
}
cnt++;
f[u] = cnt;
order.push_back(u); // store vertex by finish time
vis[u] = 2;
}
void dfs2(int u) {
vis[u] = 1;
component.push_back(u);
for (int v : gr[u]) {
if (!vis[v])
dfs2(v);
}
}
// Kosaraju main
void kosaraju() {
cnt = 0;
memset(vis, 0, sizeof(vis));
for (int i = 1; i <= n; i++) {
if (!vis[i])
dfs1(i);
}
memset(vis, 0, sizeof(vis));
reverse(order.begin(), order.end());
cout << "Strongly Connected Components:\n";
for (int u : order) {
if (!vis[u]) {
component.clear();
dfs2(u);
for (int x : component) cout << x << " ";
cout << "\n";
}
}
}
And we can find all the SCCs using 2 DFS, with time complexity $O(n + m)$.
After Midterm (1 DAY BEFORE FINAL)
Damn, no time left; gotta speedrun the rest.
Shortest Paths (Single Source)
For a graph $G(V,E)$, the shortest-path distance between two vertices $u, v \in V$ is denoted $\delta(u, v)$, and the triangle inequality follows easily: $$ \delta(s, t) \leq \delta(s, u) + \delta(u, t) $$
Bellman-Ford
Core idea: relax all edges for $n-1$ rounds; complexity $O(nm)$.
struct Edge {
int u, v;
LL w;
};
vector<Edge> edges;
vector<LL> dist;
int n, m, s;
bool bellman_ford() {
dist.assign(n + 1, LLONG_MAX);
dist[s] = 0;
// n-1 rounds of relaxation
for (int i = 1; i <= n - 1; i++) {
bool updated = false;
for (auto &e : edges) {
if (dist[e.u] != LLONG_MAX &&
dist[e.u] + e.w < dist[e.v]) {
dist[e.v] = dist[e.u] + e.w;
updated = true;
}
}
if (!updated) break; // early exit
}
// one extra pass to check for a negative cycle
for (auto &e : edges) {
if (dist[e.u] != LLONG_MAX &&
dist[e.u] + e.w < dist[e.v]) {
return false; // negative cycle found
}
}
return true; // no negative cycle
}
Dijkstra
Core idea: at every step, greedily fix the vertex with the smallest current distance; once it is fixed, it will never change again.
The algorithm maintains two sets: $S$, the vertices whose shortest distances have been fixed, and $V-S$, the vertices that have not. Each step does three things:
- Find the vertex $u$ in $V-S$ with the smallest $d$
- Add $u$ to $S$
- Use $u$ to relax all of its outgoing edges
This $u$ is always the unfixed vertex currently closest to $s$, so no detour can ever make it shorter.
Note: this only works when there are no negative edge weights.
struct node {
int v, dis;
};
struct pri {
int ans, id;
bool operator < (const pri &x) const{return x.ans < ans;}
};
priority_queue <pri> q;
vector <node> g[MAXN];
int n, m, d[MAXN], pre[MAXN];
bool vis[MAXN];
void Dijkstra(int st) {
for(int x = 1; x <= n; x++) {
d[x] = INF;
pre[x] = -1;
vis[x] = false;
}
d[st] = 0;
q.push((pri){0, st});
while(!q.empty()) {
pri tmp = q.top();
q.pop();
int u = tmp.id;
if(!vis[u]) {
vis[u] = true;
for(int y = 0; y < g[u].size(); y++) {
int v = g[u][y].v;
if(!vis[v] && d[u] + g[u][y].dis < d[v]) {
d[v] = d[u] + g[u][y].dis;
pre[v] = u;
q.push((pri){d[v], v});
}
}
}
}
}
// backtrack from ed
void printPath(int ed) {
vector <int> ans;
for(int u = ed; u != -1; u = pre[u]) ans.push_back(u);
reverse(ans.begin(), ans.end());
for(int u : ans) cout << u << " ";
cout << endl;
}
A*
A* can be seen as an upgraded version of Dijkstra. Comparison:
- Dijkstra: priority = $g(v)$: past cost = cost to reach $v$
- A*: priority = $f(v)$ = $g(v) + h(v)$
Here, $h(v)$ is a heuristic function; for A* on a 2D grid it is usually the Euclidean distance from $v$ to the goal.
struct node {
int x, y;
double val;
bool operator < (const node &a) const {
return val > a.val;
}
} pre[MAXN][MAXN];
int dx[8] = {0, 1, 0, -1, 1, 1, -1, -1}, dy[8] = {1, 0, -1, 0, 1, -1, 1, -1};
string s[MAXN];
int g[MAXN][MAXN];
double f[MAXN][MAXN];
bool vis[MAXN][MAXN];
int n, m, ex, ey;
vector <node> ansPath;
double hfun(int ux, int uy) {
return sqrt((double)((ux-ex)*(ux-ex)+(uy-ey)*(uy-ey)));
}
void buildPath() {
int ux = ex, uy = ey;
while(!(ux == -1 && uy == -1)) {
ansPath.push_back((node){ux, uy, 0});
int vx = pre[ux][uy].x, vy = pre[ux][uy].y;
ux = vx, uy = vy;
}
}
void AStar(int sx, int sy) {
cout << "A*: " << sx << " " << sy << "-> " << ex << " " << ey << endl;
for(int x = 0; x < n; x++) {
for(int y = 0; y < m; y++) g[x][y] = INF;
}
g[sx][sy] = 0;
f[sx][sy] = hfun(sx, sy);
pre[sx][sy] = (node){-1, -1};
priority_queue <node> q;
q.push((node){sx, sy, f[sx][sy]});
while(!q.empty()) {
node tmp = q.top(); q.pop();
int ux = tmp.x, uy = tmp.y, val = tmp.val;
if(vis[ux][uy]) continue;
vis[ux][uy] = true;
if(ux == ex && uy == ey) {
buildPath();
return;
}
for(int o = 0; o < 8; o++) {
int vx = ux + dx[o], vy = uy + dy[o];
if(vx < 0 || vx >= n || vy < 0 || vy >= m) continue;
if(s[vx][vy] == '*') continue;
int tmpg = g[ux][uy] + 1;
if(tmpg < g[vx][vy]) {
pre[vx][vy] = (node){ux, uy, 0};
g[vx][vy] = tmpg;
f[vx][vy] = g[vx][vy] + hfun(vx, vy);
q.push((node){vx, vy, f[vx][vy]});
}
}
}
}
Greedy
Minimum Spanning Tree
Kruskal's algorithm: keep adding the next shortest edge that does not create a cycle, maintained with a disjoint set union (union-find).
struct edge {
int u, v, w;
} e[MAXN];
int n, m;
int fa[MAXN];
vector <edge> MST;
bool cmp(edge x, edge y) {
return x.w < y.w;
}
int find(int x) {
return fa[x] == x ? x : fa[x] = find(fa[x]);
}
int Kruskal() {
for(int x = 1; x <= n; x++) fa[x] = x;
int ans = 0;
sort(e+1, e+m+1, cmp);
for(int x = 1; x <= m; x++) {
int u = find(e[x].u), v = find(e[x].v);
if(u == v) continue;
fa[v] = u;
ans += e[x].w;
MST.push_back(e[x]);
}
return ans;
}
Prim's MST: start from an arbitrary vertex; at every step pick the minimum-weight edge connecting the chosen vertex set to the unchosen set.
This relies on the Cut Property:
For a graph $G(V,E)$, a cut splits all vertices into two disjoint sets $S$ and $V-S$; the cut edges are the edges connecting the two sets.
In a connected weighted undirected graph, for any cut $(S, V-S)$, if a cut edge $e$ has the minimum weight among all cut edges, then $e$ belongs to some minimum spanning tree of the graph.
The proof is a simple exchange argument by contradiction.
The implementation of Prim's algorithm is very similar to Dijkstra's: maintain a min-heap and greedily add edges; a sketch is given below.
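A sketch of Prim with a min-heap, written in the same style as the Dijkstra code above (the adjacency-list layout and 1-indexed vertices are my assumptions; it also assumes the graph is connected):
#include <bits/stdc++.h>
using namespace std;
const int MAXN = 100005;
const int INF = 0x3f3f3f3f;

struct node { int v, dis; };
vector<node> g[MAXN];
int n, m, d[MAXN];
bool vis[MAXN];

// Returns the total weight of a minimum spanning tree.
int Prim(int st) {
    priority_queue<pair<int,int>, vector<pair<int,int>>, greater<pair<int,int>>> q;  // (key, vertex)
    for (int x = 1; x <= n; x++) { d[x] = INF; vis[x] = false; }
    d[st] = 0;
    q.push({0, st});
    int ans = 0;
    while (!q.empty()) {
        int u = q.top().second, key = q.top().first; q.pop();
        if (vis[u]) continue;                    // already inside the tree
        vis[u] = true;
        ans += key;                              // cheapest edge crossing the cut (S, V - S)
        for (auto& e : g[u])
            if (!vis[e.v] && e.dis < d[e.v]) {   // relax with the edge weight, not a path length
                d[e.v] = e.dis;
                q.push({d[e.v], e.v});
            }
    }
    return ans;
}

int main() {
    cin >> n >> m;
    for (int i = 0; i < m; i++) {
        int u, v, w; cin >> u >> v >> w;
        g[u].push_back({v, w});
        g[v].push_back({u, w});                  // undirected graph
    }
    cout << Prim(1) << endl;
    return 0;
}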
Huffman Coding
Used for lossless compression of strings: characters that occur frequently get shorter codes and rare characters get longer ones, whereas the traditional (fixed-length) method gives every character the same code length.
Huffman tree: a binary prefix-code tree, $0$ for the left branch and $1$ for the right.
Given symbols $c_1, c_2, \cdots, c_n$ and the frequency (weight) $w_i$ of each symbol, the goal is to build a binary tree with one leaf per symbol that minimizes the weighted path length $W$:
$$ \text{minimize } W = \sum_{i=1}^n w_i \cdot \text{depth}(c_i) $$
Core idea (greedy): repeatedly merge the two nodes with the smallest current weights.
Again this is maintained with a min-heap; see the sketch below.
Because Huffman coding builds a prefix tree, no character's code is a prefix of another's, so decoding is unambiguous.
Decoding is simple: walk down the Huffman tree from the root following each bit, output the symbol whenever a leaf is reached, then return to the root.
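A sketch of building a Huffman code with a min-heap and reading off the prefix codes (the frequencies below are my own toy example; the tree is built exactly by the merge-two-smallest rule described above):
#include <bits/stdc++.h>
using namespace std;

struct HNode {
    long long w; char c; HNode *l, *r;
    HNode(long long w, char c, HNode* l = nullptr, HNode* r = nullptr): w(w), c(c), l(l), r(r) {}
};
struct Cmp { bool operator()(HNode* a, HNode* b) { return a->w > b->w; } };  // min-heap by weight

void collect(HNode* u, string code, map<char,string>& out) {
    if (!u->l && !u->r) { out[u->c] = code; return; }   // leaf: one symbol
    collect(u->l, code + "0", out);                      // left branch = 0
    collect(u->r, code + "1", out);                      // right branch = 1
}

int main() {
    vector<pair<char,long long>> freq = {{'a', 45}, {'b', 13}, {'c', 12}, {'d', 16}, {'e', 9}, {'f', 5}};
    priority_queue<HNode*, vector<HNode*>, Cmp> q;
    for (auto& p : freq) q.push(new HNode(p.second, p.first));
    while (q.size() > 1) {                               // merge the two smallest weights
        HNode* a = q.top(); q.pop();
        HNode* b = q.top(); q.pop();
        q.push(new HNode(a->w + b->w, 0, a, b));
    }
    map<char,string> code;
    collect(q.top(), "", code);
    for (auto& p : code) cout << p.first << ": " << p.second << "\n";   // prefix-free codes
    return 0;
}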
Dynamic Programming
I wrote two blog posts about DP before; they may serve as a quick reference.
Dynamic programming breaks a complex problem into overlapping subproblems, records the answers to those subproblems, and combines them into the answer to the original problem.
The most important property is the absence of after-effects: the future depends only on the present, not on the past. Once a state is fixed, how we reached it does not matter; later decisions depend only on the current state.
How to design a DP formulation:
- Define the State: which variables do I need to describe the current situation? The state must contain all information that constrains future decisions.
- Define the Choices: what operations can I perform in the current state?
- Write down the Definition of the DP array: state in words, as concretely as possible, what $dp[\dots]$ means.
- Write the Transition: turn the choices into a formula, usually a min, max or sum.
$$ \text{current state} = \text{best of}\big(\text{previous state} + \text{cost/gain of this choice}\big) $$
Floyd-Warshall Algorithm (All-Pairs Shortest Paths)
Definition: $$ dp^{(k)}[i][j] = \text{the length of the shortest } i \to j \text{ path that only uses } \{1,2,\cdots,k\} \text{ as intermediate vertices} $$
状态转移方程 $$ dp^{(k)}[i][j] = \min\{dp^{(k-1)}[i][j], dp^{(k-1)}[i][k] + dp^{(k-1)}[k][j] \} $$
int n, m;
int d[MAXN][MAXN];
bool findN;
void Floyd() {
findN = false;
for(int k = 1; k <= n; k++)
for(int i = 1; i <= n; i++)
for(int j = 1; j <= n; j++)
if(d[i][k] < INF && d[k][j] < INF)
d[i][j] = min(d[i][j], d[i][k] + d[k][j]);
for(int x = 1; x <= n; x++)
if(d[x][x] < 0) findN = true; // negative cycle found
}
Interval DP
Core idea: the solution of a large interval = solutions of smaller intervals + merge cost.
Standard pattern: define $dp[i][j]$ as the optimal answer for interval $[i, j]$ after performing a series of operations.
Transition: to compute $dp[i][j]$, cut the interval $[i, j]$ into $[i, k]$ and $[k+1, j]$, enumerate the split point $k$, and take the best.
$$ dp[i][j] = \min_{i \leq k < j}{ dp[i][k] + dp[k+1][j] } + \text{Cost}(i, j) $$
Here $\text{Cost}(i, j)$ is the cost of merging the two parts.
Iteration order: enumerate intervals from short to long. Compute all intervals of length $1$ first, then length $2$, and so on, until the whole interval is obtained. A sketch is given below.
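As a concrete instance of this pattern, here is a sketch of the classic stone-merging problem (my own choice of example): $dp[i][j]$ is the minimum cost of merging stones $i..j$ into one pile, and $\text{Cost}(i,j)$ is the total weight of the interval, taken from a prefix sum:
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n; cin >> n;
    vector<long long> a(n + 1), pre(n + 1, 0);
    for (int i = 1; i <= n; i++) { cin >> a[i]; pre[i] = pre[i - 1] + a[i]; }

    const long long INF = 1e18;
    vector<vector<long long>> dp(n + 2, vector<long long>(n + 2, 0));
    for (int len = 2; len <= n; len++)               // enumerate interval length, small to large
        for (int i = 1; i + len - 1 <= n; i++) {
            int j = i + len - 1;
            dp[i][j] = INF;
            for (int k = i; k < j; k++)              // enumerate the split point
                dp[i][j] = min(dp[i][j], dp[i][k] + dp[k + 1][j]);
            dp[i][j] += pre[j] - pre[i - 1];         // Cost(i, j): merging the two piles
        }
    cout << dp[1][n] << endl;                        // minimum total merge cost
    return 0;
}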
Longest Common Subsequence (LCS)
Given two strings, find their longest common subsequence (not required to be contiguous).
Transition: $$ dp[i][j] = \begin{cases} dp[i-1][j-1] + 1 & \text{if } s1[i] = s2[j] \\ \max \{dp[i-1][j], dp[i][j-1] \} & \text{otherwise}\end{cases} $$ where $dp[i][j]$ is the LCS length of the first $i$ characters of $s1$ and the first $j$ characters of $s2$. A sketch follows.
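A direct implementation of the recurrence above (1-indexed DP table so that row $0$ and column $0$ are the empty-prefix base cases):
#include <bits/stdc++.h>
using namespace std;

int main() {
    string s1, s2;
    cin >> s1 >> s2;
    int n = s1.size(), m = s2.size();
    vector<vector<int>> dp(n + 1, vector<int>(m + 1, 0));
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= m; j++) {
            if (s1[i - 1] == s2[j - 1]) dp[i][j] = dp[i - 1][j - 1] + 1;   // characters match
            else dp[i][j] = max(dp[i - 1][j], dp[i][j - 1]);               // drop one character
        }
    cout << dp[n][m] << endl;   // length of the LCS
    return 0;
}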
Travelling Salesman Problem (TSP)
Define the state $dp[mask][i]$: $mask$ is an $n$-bit bitmask encoding the set of cities already visited, and $i$ is the city we currently stop at (it must be in $mask$). The value is the minimum total distance of starting from the source, visiting every city in $mask$, and ending at $i$.
Transition: suppose we are at state $dp[mask][i]$; the previous step must have been a state that had not yet visited $i$, stopping at some city $j$.
The previous set $prev\_mask$ is $mask \text{ xor } (1 << i)$, i.e. remove the $1$ at bit $i$ of $mask$.
So the transition is: $$ dp[mask][i] = \min_{j \in prev\_ mask} \{dp[prev\_ mask][j] + d[j][i] \} $$
void TSP() {
cin >> n;
for (int i = 0; i < n; i++)
for (int j = 0; j < n; j++)
cin >> w[i][j];
int full = (1 << n);
for (int mask = 0; mask < full; mask++)
for (int i = 0; i < n; i++)
dp[mask][i] = INF;
dp[1 << 0][0] = 0;
for (int mask = 0; mask < full; mask++) {
if (!(mask & 1)) continue;
for (int i = 0; i < n; i++) {
if (!(mask & (1 << i))) continue;
int prev = mask ^ (1 << i);
if (prev == 0 && i != 0) continue;
for (int j = 0; j < n; j++) {
if (!(prev & (1 << j))) continue;
dp[mask][i] = min( dp[mask][i], dp[prev][j] + w[j][i]);
}
}
}
int ans = INF, all = full - 1;
for (int i = 1; i < n; i++) {
ans = min(ans, dp[all][i] + w[i][0]);
}
cout << ans << "\n";
}
Knapsack Problems
0/1 Knapsack:
There are $n$ items with weight $w_i$ and value $v_i$; each item can be taken at most once, and the knapsack capacity is $S$. $$ dp[i][j] = \max\{dp[i-1][j], dp[i-1][j-w[i]]+v[i] \} $$ where $dp[i][j]$ is the maximum value achievable using the first $i$ items under capacity $j$.
void Knapsack01() {
int S, n;
cin >> S >> n;
for(int x = 1; x <= n; x++) cin >> w[x] >> v[x];
for(int j = 0; j <= S; j++) {
for(int i = 1; i <= n; i++) {
if(j-w[i] < 0) dp[i][j] = dp[i-1][j];
else dp[i][j] = max(dp[i-1][j], dp[i-1][j-w[i]] + v[i]);
}
}
cout << dp[n][S] << endl;
}
Time complexity: $O(nS)$
Unbounded (Complete) Knapsack:
Same as above, but each item can be taken any number of times: $$ dp[j] = \max \{ dp[j], \max_i \{ dp[j-w[i]]+v[i] \} \} $$ where $dp[j]$ is the maximum value under capacity $j$.
void KnapsackComplete() {
int S, n;
cin >> S >> n;
for(int x = 1; x <= n; x++) cin >> w[x] >> v[x];
for(int j = 0; j <= S; j++) {
for(int i = 1; i <= n; i++) {
if(j-w[i] >= 0)
dp[j] = max(dp[j], dp[j-w[i]] + v[i]);
}
}
cout << dp[S] << endl;
}
Time complexity: $O(nS)$
Balanced Partition Problem
Given a set of $n$ positive integers, partition it into two subsets so that the difference between the subset sums is minimized.
$$ \min_{S_1, S_2} d = \left| \sum_{i \in S_1} k_i - \sum_{j \in S_2} k_j \right| $$
Core solution: reduce it to a 0/1 knapsack. Compute the total $\text{Sum}$ and find a subset whose sum is as close as possible to $\frac{\text{Sum}}{2}$; this is exactly a 0/1 knapsack with capacity $\lfloor \frac{\text{Sum}}{2} \rfloor$ where each number's weight and value are both the number itself. EZ. A sketch follows.
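A sketch of the reduction (1D 0/1 knapsack with capacity $\lfloor \text{Sum}/2 \rfloor$; here `dp[j]` is the largest achievable subset sum not exceeding $j$):
#include <bits/stdc++.h>
using namespace std;

int main() {
    int n; cin >> n;
    vector<int> a(n);
    int sum = 0;
    for (int& x : a) { cin >> x; sum += x; }

    int cap = sum / 2;
    vector<int> dp(cap + 1, 0);                 // dp[j] = best subset sum <= j
    for (int x : a)
        for (int j = cap; j >= x; j--)          // reverse order: each number used at most once
            dp[j] = max(dp[j], dp[j - x] + x);

    int s1 = dp[cap], s2 = sum - s1;            // s1 <= sum/2 <= s2
    cout << "minimum difference = " << s2 - s1 << endl;
    return 0;
}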
Network Flow
Imagine a network made of pipes:
- Source ($S$): the waterworks, with an unlimited supply of water
- Sink ($T$): the sea, which can absorb an unlimited amount of water
- Capacity ($c$): the maximum amount of water that pipe $(u,v)$ can carry per unit time
- Flow ($f$): the amount of water actually flowing
And the following rules:
- Capacity constraint: $f(u,v) \leq c(u,v)$ (a pipe must not burst)
- Flow conservation: for every node other than $S$ and $T$, inflow = outflow
The question: without bursting any pipe, how much water can we push from the source $S$ to the sink $T$?
Ford-Fulkerson Method
Core idea: as long as we can still find a path along which flow can be pushed, push as much as possible; when no such path remains, the current flow is the maximum flow.
Such a "path we can push flow along" is called an Augmenting Path.
Reverse edges: pushing flow along an arbitrarily chosen path may block a better route, so we add a reverse edge. Whenever we push $k$ units of flow along edge $(u,v)$:
- the forward capacity $c(u,v)$ decreases by $k$
- the reverse capacity $c(v,u)$ increases by $k$
Walking along a reverse edge in the Residual Graph is equivalent to undoing a previous push!
Edmonds-Karp Algorithm
Use BFS on the residual graph to find augmenting paths, so that every path found has the fewest edges.
Augmenting paths here may include reverse edges, subject to the hard requirement that the residual capacity is $> 0$.
BFS always finds the augmenting path with the fewest edges (shortest-path property), which bounds the number of augmentations by $O(nm)$; multiplying by the $O(m)$ cost of one augmentation gives a total complexity of $O(nm^2)$.
LL BFS()
{
memset(last, -1, sizeof(last));
queue <LL> q;
q.push(s);
flow[s] = INF;
while(!q.empty())
{
int u = q.front(); q.pop();
if(u == t) break;
for(LL x = head[u]; x > 0; x = edge[x].next)
{
LL v = edge[x].to, w = edge[x].w;
if(w > 0 && last[v] == -1)
{
last[v] = x;
flow[v] = min(flow[u], w);
q.push(v);
}
}
}
return last[t] != -1;
}
LL EK()
{
LL ans = 0;
while(BFS())
{
ans += flow[t];
for(LL x = t; x != s; x = edge[last[x] ^ 1].to)
{
edge[last[x]].w -= flow[t];
edge[last[x]^1].w += flow[t];
}
}
return ans;
}
The code above was hastily written long ago using a linked forward-star (array-based adjacency) edge list, so bear with it.
Max-Flow Min-Cut Theorem
A cut is a partition of the vertex set; in network flow we specifically mean an $s$-$t$ cut, i.e. we split the vertex set $V$ into $S$ and $T$ such that:
- the source $s \in S$
- the sink $t \in T$
- $S \cup T = V$ and $S \cap T = \emptyset$
The capacity of the cut, written $C(S,T)$, is the sum of the capacities of all edges going from $S$ to $T$. $$ C(S,T) = \sum_{u \in S, v \in T} c(u,v) $$
Complexity
The following content was generated by Gemini.
These four concepts (P, NP, NP-Hard, NP-Complete) are cornerstones of theoretical computer science, but their names are extremely easy to misunderstand.
The most common misconception is: "NP means Not Polynomial, i.e. the hard problems." That is completely wrong!
We will use two viewpoints, verification and reduction, to understand them thoroughly.
1. P (Polynomial Time): "problems we can actually solve"
Definition: problems solvable in polynomial time $O(n^k)$. Intuition: problems a computer can knock out quickly.
- Examples:
- Sort an array. (Merge sort, $O(n \log n)$)
- Find the shortest path between two points on a map. (Dijkstra, $O(E \log V)$)
- Decide whether a number is even. ($O(1)$)
2. NP (Nondeterministic Polynomial Time): "problems whose answers we can check"
Naming pitfall: NP does not mean "Non-Polynomial"; it means "Nondeterministic Polynomial". The name comes from nondeterministic Turing machines, which is rather abstract; you can safely ignore it.
The real definition (verification view): a problem is in NP if, given a proposed solution (a Certificate/Witness), you can verify in polynomial time whether that solution is correct.
Intuition: I may not know how to solve it, but if you hand me a guessed answer, I can quickly tell you whether the guess is right.
- Example (Sudoku):
- Solving an empty $100 \times 100$ Sudoku is extremely hard (possibly exponential time).
- But if I hand you a fully filled-in grid and ask "is this a solution?", you only need to scan the rows, columns and boxes, and within seconds you can answer Yes or No.
- So Sudoku is in NP.
The relationship between P and NP:
- $P \subseteq NP$.
- Why? Because if you can solve a problem quickly, you can certainly verify an answer quickly (just solve it yourself and compare).
- The eternal open question: does $P = NP$? (That is: can every problem that is quickly verifiable also be quickly solvable? The vast majority of scientists currently believe $P \neq NP$.)
3. NP-Hard: "the big boss"
Definition: a problem X is NP-Hard if every NP problem can be reduced to X in polynomial time.
In plain words: "I am a tough one. If anyone manages to solve me in polynomial time, then every NP problem in the world (Sudoku, TSP, graph coloring, ...) is solved as well."
Properties:
- It is not necessarily in NP: an NP-Hard problem may not even admit efficient verification of answers.
- It marks a lower bound on difficulty: it is at least as hard as the hardest problems in NP.
- Example:
- The Halting Problem: given a piece of code, decide whether it loops forever. It is undecidable, harder than every NP problem, so it is NP-Hard, but it is not in NP (you cannot even verify an answer).
4. NP-Complete (NPC): "the hardest problems inside NP"
Definition: a problem is NPC if it satisfies two conditions:
- It belongs to NP (answers can be verified quickly).
- It is NP-Hard (every NP problem reduces to it).
Intuition: NPC problems are the hardest, most central handful of problems inside NP, sitting on the boundary in the P/NP picture. Their magic lies in their equivalence: if you solve any one NPC problem (find a polynomial algorithm), you solve every NP problem (i.e. you prove P = NP).
- Classic NPC examples:
- SAT (Boolean satisfiability): the first problem ever proven NPC (the granddaddy of them all).
- TSP (decision version): given a length $L$, is there a tour of length less than $L$?
- Maximum Clique: does the graph contain $K$ pairwise-adjacent vertices?
- Knapsack (decision version).
5. Summary and comparison (one table)
| Class | Core ability | Plain-language reading | Typical examples |
|---|---|---|---|
| P | Can be solved | Easy problems; the computer flies through them | Sorting, shortest paths, minimum spanning tree |
| NP | Can be verified | Only a wildly lucky guess finds the answer, but I can check whether it is right | Sudoku, integer factorization, and all P problems |
| NPC | Verifiable + hardest | The root of all evil inside NP; crack one and you crack them all | SAT, TSP (decision), graph coloring |
| NP-Hard | Not necessarily verifiable + extremely hard | No upper bound on difficulty; contains NPC and problems even harder than NPC | Halting problem, TSP (optimization), a winning strategy for Go |
6. A frequently tested detail: optimization vs decision
When algorithm textbooks discuss P/NP, they usually talk about decision problems, i.e. problems whose answer is only Yes/No.
- TSP optimization version: "what is the length of the shortest tour?" $\to$ NP-Hard.
- TSP decision version: "does there exist a tour of length $< 100$?" $\to$ NP-Complete.
Why? Because the optimization version is hard to verify: if I give you a tour and claim "this is the shortest", how do you verify, in polynomial time, that no shorter tour exists? The decision version, however, is easy to verify: I give you a tour shorter than 100, you add up its length, it is indeed below 100, verification passes.
Summary
- P: easy to solve.
- NP: easy to check.
- NPC: the hardest problems within NP; cracking one would change the world (cryptography collapses).
- NP-Hard: a problem is NP-Hard as long as every NP problem can be transformed into it.
Next time someone asks, you can put on a deep look and say: "P is the set of problems you can solve, NP is the set you can verify, and NPC problems are the ultimate representatives among those verification problems."
CSC3060 Introduction to Computer Systems
L0 前言
课本:深入理解计算机系统 Computer Systems: A Programmer's Perspective, 3/E (CS:APP3e)
网站:https://hansimov.gitbook.io/csapp
L1 计算机系统漫游
计算机系统是由硬件和系统软件组成的,它们共同工作来运行应用程序。
本书先用一个小例子来引起读者兴趣,并让读者构建一个简单的框架。
#include <stdio.h>
int main() {
printf("FUCK, world!\n");
return 0;
}
这个程序是从源文件开始的,比如文件名叫做 hello.c,用 ASCII 标准来表示文本字符存在系统里。
C 编程语言:是由贝尔实验室的人创建的,有三个特点让它变得成功:1. 与 Unix 操作系统关系密切,或者说其开始就是作为 Unix 的程序语言开发出来的,也写了很多 Unix 的工具和库;2. 小而简单,易于学习,也易于移植到不同的计算机上;3. C 是为实践目的设计的。这使得其成为系统级编程的首选,也适用于应用级程序。
接着,这个程序会被编译成不同的格式,由高级编程语言转化为低级机器语言指令,打包。下图为编译系统。
了解编译系统之后,我们可以:
- 优化程序性能
- 理解链接时出现的错误
- 避免安全漏洞
此刻,我们已经将源程序编译成了可执行目标文件 hello,并被存放在磁盘上。然后我们将其运行!关于这个时候程序发生的事情,我们需要了解系统的硬件组成(下图)。
总线:贯穿整个系统的电子管道,携带信息字节并负责在各个部件间传递。
I/O 设备:Input / Output 设备是系统与外界的联系通道,例如键盘、鼠标、显示器和磁盘。每个 I/O 设备都是通过一个控制器或适配器与 I/O 总线连接起来的,控制器是主板上的芯片组,适配器则是一块插在主板插槽上的卡。
主存:临时储存设备,在处理器执行程序时用来存放程序和程序处理的数据,是由一组 DRAM(动态随机存取存储器)芯片组成的,本质上是线性的字节数组。
处理器:中央处理单元(CPU)的简称,是解释或执行存储在主存中指令的引擎,其核心是一个被称为程序计数器(PC)的字长大小的存储设备(或寄存器),还有算术逻辑单元(ALU)。CPU 在指令的要求下可能会执行这些操作:
- 加载:从主存复制一个字节(字)到寄存器,覆盖寄存器原来的内容
- 存储:从寄存器复制一个字节(字)到主存的某个位置,覆盖原来内容
- 更新:复制两个寄存器的内容到 ALU,ALU 将两个字相加,并将结果存放到一个寄存器中,覆盖原来内容
- I/O 读写:将一个 I/O 设备中复制一个字节(字)到寄存器,或者反过来
- 转移:从指令本身中抽取一个字,并将其复制到 PC 中,覆盖原来的值
CSC3170 数据库系统 Database System
Part 1 简介 Introduction
结构化和非结构化的文本信息 Structured and Unstructured Textual Information
一个数据库管理系统(Database Management System, DBMS)是一个复杂的软件系统,其任务是帮助管理大量复杂的结构化信息。
结构化信息通常由离散的单元(实体 Entities)组成。
相同类型的实体会有预设的组织方案:
- 相同数量的性质 Attributes
- 每个性质都有预设的格式
非结构化的信息通常指没有固定格式、没有区分信息的数据,一般为普通的文本。
数据库系统 Database System
包含信息的数据集合叫做数据库。
其应用非常广泛,包括但不限于:
- 销售:顾客、产品、购买
- 会计:支付、发票
- 人力资源:应聘者信息、销售人员、税收
- 管理层
- 银行和金融
- 大学
数据库的目的是什么?主要是为了高效地存储、维护和查找大量数据,而且要保证不出现安全问题。
数据视图 View of Data
数据库系统是一些相互关联的数据以及一组使得用户可以访问和修改这些数据的程序的集合,其主要目的是给用户提供数据的抽象视图。
数据模型 Data Model
数据模型是描述数据、数据联系、数据语义以及一致性约束的概念工具的集合。其可以被划分为四类:
- 关系模型(Relational Model):利用表的集合来表示数据和数据间的联系,是最广泛的数据模型。
- 实体-联系模型(Entity-relationship Model):主要用作数据库设计。
- 半结构化数据模型(Semi-structured Data Model):允许数据定义中某些相同类型的数据项含有不同的属性集,例如 JSON, XML。
- 基于对象的数据模型(Object-based Data Model):面向对象的程序设计,例如 Java, C++, C#。
下面这个图是最常见的关系模型例子:
数据抽象 Data Abstraction
一个可用的系统必须要能高效地检索数据,系统开发人员通过如下几个层次的数据抽象来对用户屏蔽复杂性,以简化交互:
- 物理层(Physical Level):最低层次的抽象,描述数据实际上是怎样储存的。
- 逻辑层(Logical Level):描述数据库中存什么数据以及这些数据间存在什么联系。
- 视图层(View Level):只描述整个数据库的某个部分。
实例和模式 Instances and Schemas
随着时间的推移,信息会被插入或删除,数据库也就发生了改变。特定时刻存储在数据库中的信息的集合称作数据库的一个实例 Instance,而数据库的总体设计称作数据库模式 Schema。
数据库有几种模式:
- 物理模式(Physical Schema),在物理层描述数据库的设计。
- 逻辑模式(Logical Schema),在逻辑层描述数据库的设计。
- 子模式(Subschema),描述数据库的不同视图。
物理数据隔离 Physical Data Independence:在不改变逻辑模式的情况下改变物理模式的能力。
数据库语言 Database Language
数据定义语言 Data Definition Language (DDL)
定义数据库模式的专用标记,例如:
DDL 编译器会生成一系列表格模板,存在数字典 Data Dictionary 里。
数据字典包含元数据 Metadata:
- 数据库模式
- 完整性约束 Integrity Constraints
- Primary Key
- 授权
- 验证谁可以访问
数据操纵语言 Data Manipulation Language (DML)
可以使得用户访问或操纵,那些按照某种适当的数据模型组织起来的数据,通常也叫做查询语言。基本上有两种类型:
- 过程化的 DML(Procedural DML),要求用户指定需要什么数据以及如何获得这些数据
- 声明式的 DML(Declarative DML),只要求用户指定需要什么数据,不必指明如何获得
声明式的 DML 通常也被称为非过程化的 DML。
结构化的查询语言 Structured Query Language (SQL)
SQL 是非过程化的,一个查询包含输入一个或多个表格,返回一个表格。例子:
数据库设计 Database Design
设计数据库通常结构的流程:
- 逻辑设计 Logical Design:决定数据库模式。数据库设计要求我们能找到一个“好的”关系模式。
- 物理设计 Physical Design:决定数据库的物理层。
数据库组成 Database Components
一个数据库系统可以被划分成以下几个功能性模块:
- 储存管理器 Storage Manager
- 查询处理器 Query Processor
储存管理器 Storage Manager
储存管理器是数据库系统中负责在数据库中储存的低层数据与应用程序以及向系统提交的查询之间提供接口的部件。
储存管理器负责以下任务:
- 与操作系统文件管理器交互
- 高效地储存、检索和更新数据
储存管理器实现了以下几种数据结构:
- 数据文件(Data File):储存数据库它本身
- 数据字典(Data Dictionary):储存数据库结构的元数据
- 索引(Index):能提供对数据项的快速访问
查询处理器 Query Processor
查询处理器组件包括:
- DDL 解释器(DDL Interpreter):解释 DDL 语句并将这些定义记录在数据字典中
- DML 编译器(DML Compiler):将查询语言中的 DML 语句翻译为包括一系列查询执行引擎能理解的低级指令的执行方案
- 一个查询通常可以被翻译成给出相同结果的多个候选执行计划中的任何一个。DML 编译器还进行查询优化(Query Optimization),就是从几个候选执行计划中选择代价最小的那个执行计划。
- 查询执行引擎(Query Evaluation Engine):它执行由 DML 编译器产生的低级指令。
事务管理 Transaction Management
事务 Transaction是数据库应用中完成单一逻辑功能的操作集合。
数据库架构 Database Architecture
- 中心化的数据库
- 一些核心,共享内存
- 客户端-服务器
- 一个服务器机器执行基于多个客户端的工作
- 平行数据库
- 多核,共享内存
- 共享硬盘
- 什么都不共享
- 分布式数据库
- 地理上的分布
- 模式 / 数据异质性
数据库用户和管理员 Database Users and Administrator
数据库用户:
- 普通(初学者=)用户
- 应用程序员
- 复杂(老练)用户
数据库管理员:
- 模式定义
- 存储结构及存取方法定义
- 模式及物理组织的修改
- 数据访问授权
- 日常维护
Part 2 关系模型介绍 Introduction to Relational Model
关系数据库由表 Table的集合构成,每张表被赋予一个唯一的名称。
一个关系表的例子:
关系的数学定义 Mathematical Definition of Relation
笛卡尔积 Cartesian Product
考虑两个任意集合 $A$ 和 $B$,包括所有有序对 $(a,b), a \in A, b \in B$ 的集合叫做 $A$ 和 $B$ 的笛卡尔积。 $$ A \times B = \{(a,b):a \in A, b \in B\} $$ 如果 $A=B$,我们通常将 $A \times A$ 写成 $A^2$。
- e.g. 如果$A = \mathbb{R}$,那么 $\mathbb{R}^2$ 就是二维笛卡尔平面。
记集合 $A$ 的基数 Cardinality 为 $|A|$,则有 $$ |A \times B| = |A| \times |B| $$
从 $A$ 到 $B$ 的一个二元关系 Binary Relation $R$,就是 $A \times B$ 的一个子集,即 $R \subseteq A \times B$
一般来说,定义在集合 $D_1,D_2,\cdots,D_n$ 上的 $n$ 元关系 n-ary Relation $R$ 是笛卡尔积 $$ D_1 \times D_2 \times \cdots \times D_n $$ 的一个子集,即 $$ R \subseteq \prod_{k=1}^n D_k $$ 其中,$D_1,D_2,\cdots,D_n$ 通常被称为域 Domain。
并且该笛卡尔积的基数为 $$ \left| \prod_{k=1}^n D_k \right| = \prod_{k=1}^n \left| D_k \right| $$ 也就是说,笛卡尔积的元素个数等于各个集合基数的乘积。
如果某个特定元组 tuple 出现在关系 $R$ 中,就表示该元组所表达的信息在当前是真实且有效的信息。
举个例子,假设
- $D_1=$ 学号
- $D_2=$ 学院
- $D_3=$ 生源城市 如果 $$ (114514, \text{CS}, \text{Shenzhen}) \in R $$ 表示学号为 $114514$ 的学生属于 CS 学院,并且来自深圳。但是 $$ (114515, \text{CE}, \text{Guangzhou}) \notin R $$ 表示“学号为 $114515$ 的学生属于 CE 且来自广州”这一说法不成立。
关系模式和实例 Relation Schema and Instance
$A_1, A_2, \cdots, A_n$ 被称为属性 attributes。
$R(A_1, A_2, \cdots, A_n)$ 被称为关系模式 relation schema,有时也被称为关系方案 relation scheme,它由关系名和属性列表组成,这些属性对应表中的列 columns。
- e.g. instructor(ID, name, dept_name, salary)
定义在关系模式 $R$ 上的关系实例 relation instance(也被称为关系状态 relation state)记为 $$ r(R) $$ 它由实际的数据值组成
- 一个关系当前的取值通常通过一张表 table 来表示
关系 $r$ 中的一个元素 $t$ 被称为元组 tuple,在表中用一行 row 来表示。
模式描述的是逻辑结构,而实例是某一时刻数据的快照。
例如下图
属性 Attributes
每个属性允许取值的集合称为该属性的域 domain。
属性值通常要求是原子性的 atomic,即不可再分的。
特殊值 null 被认为是每个域中的一个成员,表示该值未知。
null 值会使许多数据库操作的定义变得更加复杂。
键 Keys
设 $K \subseteq R$,如果属性集合 $K$ 的取值足以认证唯一标识关系 $R$ 的一个元组 tuple,则称 $K$ 为 $R$ 的一个超键 superkey。
- e.g. $\{\text{ID}\}$ 和 $\{\text{ID}, \text{name}\}$ 都是关系 instructor 的超键。
如果一个超键 $K$ 是最小的 minimal,则称之为候选键 candidate key。
- e.g. $\{\text{ID}\}$ 是 instructor 的一个候选键。
在所有候选键中,会选择一个作为主键 primary key。
外键 foreign key 约束:一个关系中的某个值必须出现在另一个关系中。
- 引用关系 referencing relation
- 被引用关系 referenced relation
关系代数 Relational Algebra
关系代数由一组运算组成,这些运算接受一个或两个关系作为输入,并生成一个新的关系作为它们的结果。
六个基础运算符:
- 选择 select:$\sigma$
- 投影 project:$\Pi$
- 集合并 union / 交 intersection:$\cup$
- 集合差 set difference:$-$
- 笛卡尔积 Cartesian product:$\times$
- 更名 rename:$\rho$
选择运算 Select Operation
选择运算选出给定满足给定谓语的元组。
记号:$\sigma_p(r)$
其中,$p$ 被称作选择谓词 selection predicate。
- e.g. 选择满足在 Physics 部门的 instructor:
$$ \sigma_{\text{dept name="Physics"}}(\text{instructor}) $$
在选择运算中我们允许使用 $=,\not=,>,\geq,<,\leq$,也可以利用连接符号 $\land,\lor,\lnot$ 结合谓语。
- e.g. 选择满足在 Physics 部门且薪水高于 $90,000 的 instructor:
$$ \sigma_{\text{dept name="Physics" } \land \text{ salary > 90,000}}(\text{instructor}) $$
投影运算 Project Operation
投影运算是一种一元运算,返回它的参数关系,但滤掉了特定的属性。
记号:$\Pi_{A_1, A_2, A_3, \cdots, A_k} (r)$
其中 $A_1, A_2, A_3, \cdots, A_k$ 是属性名,$r$ 是关系名。
- e.g. 查询以下属性的 instructor
$$ \Pi_{\text{ID, name, salary}}(\text{instructor}) $$
其实就是去掉了 dept_name。
关系运算的组合 Composition of Relational Operations
因为一个关系代数运算的结果也是关系,所以可以直接连接起来(就像函数括号),例如:
$$ \Pi_{\text{name}}(\sigma_{\text{dept name="Physics"}}(\text{instructor})) $$
笛卡尔积运算 Cartesian Product Operation
笛卡尔积运算用叉号 $\times$ 表示,允许我们结合来自任意两个关系的信息。其作用就是,假设现在有 $n$ 行的关系 $A$ 和 $m$ 行的关系 $B$,那么 $A \times B$ 就会得到一个 $n*m$ 的关系,$A$ 中的每一个关系都对应 $B$ 中的所有关系。
- e.g. instructor $\times$ teaches
这里找不到原来两张表的电子版了,但大概就是能看出来是原来的一条左边的关系对应右边的所有关系,这样形成一个 $n*m$ 行的关系
连接运算 Join Operation
连接运算是我们能够选择与笛卡尔积合并到单个运算中。
考虑关系 $r(R)$ 和 $r(S)$,并令 $\theta$ 为 $R \cup S$ 模式的属性上的一个谓词。连接运算的定义是: $$ r \bowtie s = \sigma_\theta (r \times s) $$
- e.g. $\text{instructor} \bowtie_{\text{instructor.id = teaches.id}} \text{teaches} \\ = \sigma_{\text{instructor.id = teaches.id}}(\text{instructor} \times \text{teaches})$
其实就是先做了笛卡尔积运算,再做一次选择运算。
集合运算 Set Operation
并运算 Union Operation
并运算使我们能够合并两个关系。
为了使并集运算有意义,必须要保证:
- 输入并运算的两个关系具有相同数量的属性,一个关系的属性数量被称为它的元数 Arity。
- 当属性有关联的类型时,对于每个 $i$,两个输入关系的第 $i$ 个属性的类型必须相同。
这样的关系被称为相容 Compatible 关系。
记号:$r \cup s$
交运算 Intersection Operation
交运算允许我们找到同时出现在两个输入关系中的元组。
与并集运算一样,我们要保证交是在相容关系之间进行运算的。
记号:$r \cap s$
集差运算 Set-Difference Operation
集差运算能使我们找到在一个关系中但不在另一个关系中的元组。
与并集运算一样,我们要保证集差是在相容关系之间进行运算的。
记号:$r - s$
因为教授的 PPT 上并没有很好的例子图示,在这里就不给出了,而且这三个集合运算也很好理解。
赋值运算 Assignment Operation
有时候通过将一个关系代数表达式中的一部分赋值给临时的关系变量,可以方便地编写该表达式,赋值运算用 $\leftarrow$ 表示,其工作方式类似于程序语言中的赋值。
- e.g. 找到所有在 Physics 和 Music 部门的 instructor $$ \begin{align*} &\text{Physics} \leftarrow \sigma_{\text{dept name="Physics"}}(\text{instructor}) \\ &\text{Music} \leftarrow \sigma_{\text{dept name="Music"}}(\text{instructor}) \\ &\text{Physics} \cup \text{Music} \end{align*} $$
更名运算 Rename Operation
关系表达式的结果并没有可以用来指代它们的名称,更名运算使我们能够做到这一点。
记号:$\rho_x(E)$
能返回以 $x$ 命名的表达式 $E$ 的结果。
Part 3 实体-联系模型 The Entity-Relationship Model
实体-联系数据模型(E-R 数据模型)被开发来方便数据库的设计,它是通过允许定义代表数据库全局逻辑结构的企业模式 Enterprise Schema 来做到的。
E-R 数据模型包含下面三个基本概念:
- 实体集 Entity Sets
- 关系集 Relationship Sets
- 属性 Attributes
实体集 Entity Sets
实体 Entity 是现实世界中可以区分于所有其它对象的一个“事物”或“对象”。
- e.g. 特定的人,公司,活动
实体集 Entity Set 是共享相同性质或属性的、具有相同类型的实体的集合。
- e.g. 所有人的集合,公司集合,产品
实体通过一组属性 Attributes 来表示,属性是实体集中每个成员所拥有的描述性性质。
- e.g. $\text{instructor}=(\text{ID, name, salary})$
属性的子集构成了一个实体集的主键 Primary Key,即它可以唯一认定集合的元素。
实体集在 ER 图里的表示:
关系集 Relationship Sets
联系 Relationship 是多个实体间的相互关联,关系集 Relationship Set 是相同类型联系的集合。
关系集的数学定义: $$ \{(e_1, e_2, \cdots, e_n) : e_1 \in E_1, e_2 \in E_2, \cdots, e_n \in E_n \} $$ 其中,$(e_1, e_2, \cdots, e_n)$ 是个联系。
例如,我们定义一个关系集 advisor 表示 student 和 instructor 之间的关联,然后我们就可以在实体之间连线。
关系集在 ER 图里的表示:
联系也可以具有被称作描述性属性 Descriptive Attribute 的属性。
例如下图,每个联系具有一个 date,表示 student 和 instructor 建立 advisor 联系的时间。
带有属性的关系集在 ER 图里的表示:
实体集的关系不需要是独特的,实体在联系中扮演的功能被称为实体的角色 Role。
下图中,course_id 和 prereq_id 被称作角色。
复杂属性 Complex Attributes
映射基数 Mapping Cardinality
映射基数 Mapping Cardinality 或基数比率表示一个实体能通过一个关系集关联的另一些实体的数量,多用于描述而二元关系集。
对于一个二元关系集来说,映射基数必然是以下情况之一:
- 一对一 One-to-one
- 一对多 One-to-Many
- 多对一 Many-to-one
- 多对多 Many-to-many
在 ER 图中,我们用单箭头直线($\rightarrow$)来表示“一”,用无箭头直线($-$)来表示“多”。下图从左上到右下分别是一对一、一对多、多对一、多对多。
其中,一对一在这里的意思就是,一位导师最多可以指导一名学生,一名学生最多只能有一位导师;一对多在这里的意思就是,一位导师可以指导多名学生,但一名学生最多只能有一位导师;其余同理。
全部和部分参与 Total and Partial Participation
如果实体集 $E$ 中的每个实体都必须参与到关系集 $R$ 的至少一个联系中,那么实体集 $E$ 在关系集 $R$ 中的参与就被称为是全部的 Total,否则就被称为是部分的 Partial。
全部参与用双直线表示。
例如下图:
student 在 advisor 关系集中的参与是全部的,即每个学生都至少有一位导师。instructor 在 advisor 关系集中的参与是部分的,即一位导师不是必须要指导学生。
更复杂的约束 More Complex Constraints
线段上可以有一个关联的最小和最大基数,用 $l..h$ 的形式表示,其中 * 表没有限制。示例如下图:
导师可以指导 $0$ 个或多个学生;一个学生至少有一个导师,无上限。
主键 Primary Key
弱实体集 Weak Entity Sets
表示成模式 Representing Schema
Part 4 实体-关系特征拓展 Extended Entity-Relationship Features
Part 5 关系数据库设计和规范化 Relational Database Design and Normalization
Part 6 函数依赖理论 Functional Dependency Theory
Part 7 多值依赖和第四范式 Multivalued Dependencies and Fourth Normal Form
Generated by ChatGPT
下面从 Primary Key 开始,按你给出的课程结构一直讲到 Multivalued Dependency 与 Fourth Normal Form。我尽量保持 定义式、考试导向、少换行、逻辑紧凑的形式。
Part 3(后半)Primary Key, Weak Entity, Schema Representation
Primary Key(主键)
定义:在关系模式 (R) 中,若属性集合 (K) 满足以下两个条件,则称 (K) 为 Primary Key: (1) 唯一性(uniqueness):任意两个不同元组在属性 (K) 上的取值不同。 (2) 最小性(minimality):(K) 的任何真子集都不满足唯一性。
换句话说,Primary Key 是从 Candidate Keys(候选键) 中选出的一个作为主要标识符的键。Candidate Key 是所有满足唯一性与最小性的属性集合,而 Superkey 是任何能唯一标识元组的属性集合(不要求最小性)。因此关系为: Superkey ⊇ Candidate Key ⊇ Primary Key。
Prime Attribute 指属于某个 Candidate Key 的属性,Non-prime Attribute 指不属于任何 Candidate Key 的属性。
在关系模型中通常还涉及 Foreign Key:若关系 (R) 中属性集合 (F) 的值必须出现在另一关系 (S) 的 Primary Key 中,则称 (F) 为引用 (S) 的 Foreign Key。
Weak Entity Sets(弱实体集)
弱实体是 没有完整主键的实体集,其存在依赖于另一个实体集(称为 identifying entity)。弱实体的标识需要两个部分: (1) Partial Key(部分键):弱实体内部可区分对象的属性集合。 (2) Owner Entity 的 Primary Key。
因此弱实体的主键为: Owner Primary Key ∪ Partial Key。
例如:Employee(emp_id) 与 Dependent(name, age)。Dependent 无法仅通过 name 唯一标识,需要 (emp_id, name) 才能唯一确定一个 dependent。
在 ER 图中弱实体通常用 双矩形表示,identifying relationship 用 双菱形表示,且弱实体必须 total participation 于 identifying relationship。
Representing Schema(ER 到关系模式)
将 ER 模型转换为关系模式通常遵循以下规则: (1) 实体集 → 关系模式:实体的属性成为关系的属性,实体的主键成为关系的 Primary Key。 (2) 弱实体 → 关系模式:加入 owner entity 的主键作为外键并与 partial key 组成复合主键。 (3) 1:N relationship:在 N 端关系中加入 1 端的主键作为 Foreign Key。 (4) M:N relationship:建立新的关系模式,其属性为两个实体主键的并集(再加上关系属性)。 (5) relationship attributes:作为新关系模式中的属性出现。
Part 4 Extended Entity-Relationship Features
EER(Extended ER)模型在普通 ER 的基础上增加了 更复杂的建模能力,核心包括 specialization、generalization 与 aggregation。
Specialization(特化) 指从一个高层实体集派生出多个子实体集,每个子实体集具有额外属性或语义。例如 Person → Student、Employee。子类自动继承父类属性。
Generalization(泛化) 是 specialization 的逆过程,将多个实体抽象为一个更一般的实体。例如 Car 与 Truck 可以泛化为 Vehicle。
在 EER 中还涉及两个重要约束: (1) Disjointness constraint:子类之间是否互斥。若 disjoint,则一个实体最多属于一个子类;若 overlapping,则可属于多个子类。 (2) Completeness constraint:是否 total specialization。若 total,则每个父类实体必须属于某个子类;若 partial,则可以不属于任何子类。
另一个扩展概念是 Aggregation,用于表示“关系之间的关系”。例如 Manager 管理 Works_on 关系(员工参与项目),此时需要把 Works_on 看作一个高层对象参与新的关系。
Part 5 Relational Database Design and Normalization
关系数据库设计的目标是 减少冗余并避免更新异常(update anomaly)。常见异常包括: Insertion anomaly:无法插入某些信息而不引入无关数据; Deletion anomaly:删除数据时丢失其他有用信息; Update anomaly:同一事实需在多处修改。
为解决这些问题,需要进行 Normalization(规范化),即通过分析 Functional Dependencies(函数依赖) 将关系分解为多个结构更合理的关系模式。规范化通常涉及一系列范式(Normal Forms),如 1NF、2NF、3NF、BCNF、4NF 等。数据库设计通常希望达到 BCNF 或至少 3NF,同时保持 lossless join 与 dependency preservation。
Lossless Join 指将关系分解为多个关系后,通过自然连接可以恢复原关系。 Dependency Preservation 指原有函数依赖可以在分解后的关系中被直接验证。
Part 6 Functional Dependency Theory
Functional Dependency(FD) 是规范化理论的核心。
定义:在关系模式 (R) 中,属性集合 (X) 到属性集合 (Y) 的函数依赖记为 (X → Y),若对任意两个元组 (t_1,t_2),只要 (t_1[X] = t_2[X]),则必有 (t_1[Y] = t_2[Y])。
Trivial FD:若 (Y ⊆ X),则 (X → Y) 为平凡函数依赖。
Attribute Closure:给定属性集 (X) 和函数依赖集合 (F),闭包 (X^+) 是在 (F) 下由 (X) 可推出的所有属性集合。计算闭包的算法为:初始化 (X^+=X),若存在 FD (A→B) 且 (A ⊆ X^+),则加入 (B) 到 (X^+),重复直到不再变化。
闭包用于: (1) 判断 Candidate Key(若 (X^+) 包含所有属性,则 (X) 为 superkey); (2) 判断 FD 是否由 (F) 推导。
Armstrong’s Axioms 提供了推导函数依赖的基本规则: Reflexivity:若 (Y ⊆ X),则 (X → Y)。 Augmentation:若 (X → Y),则 (XZ → YZ)。 Transitivity:若 (X → Y) 且 (Y → Z),则 (X → Z)。
从这三条可推导出 union、decomposition、pseudo-transitivity 等规则。
3NF 与 BCNF
BCNF(Boyce–Codd Normal Form)定义:若对所有非平凡 FD (X → Y),都有 (X) 为 superkey,则关系模式在 BCNF 中。
Third Normal Form(3NF)定义:若对每个 FD (X → Y),满足以下之一: (1) (X) 是 superkey; (2) (Y) 为 prime attribute; (3) 该 FD 为 trivial。
BCNF 比 3NF 更严格。BCNF 能消除更多冗余,但分解后可能不保持依赖,而 3NF 保证存在 dependency-preserving decomposition。
Part 7 Multivalued Dependency and Fourth Normal Form
Multivalued Dependency(MVD) 表示为 (X →→ Y)。其含义是:若两个元组在 (X) 上相同,则可以交换它们在 (Y) 上的值而仍属于该关系。直观上表示:对于同一 (X),(Y) 的取值集合与其他属性是独立的。
例如关系 Student(student, skill, language),若学生的技能与语言互不影响,则存在 student →→ skill student →→ language。
这种情况下数据会出现冗余组合,例如若某学生有两种 skill 和两种 language,则需要存储四条记录。
Fourth Normal Form(4NF)定义:若对于每个非平凡 MVD (X →→ Y),都有 (X) 为 superkey,则关系模式在 4NF 中。
当存在违反 4NF 的 MVD 时,需要将关系分解。例如将 Student(student, skill, language) 分解为 StudentSkill(student, skill) 与 StudentLanguage(student, language)。
这种分解既消除了冗余,又保持无损连接。
CSC4001 Software Engineering
L0 前言
教学计划:
- 介绍
- 测试基础
- 变形测试、差分测试、内变形测试
- 模糊测试
- 测试用例约简
- 数据流分析
- 符号执行
- 软件维护
- 软件开发生命周期、需求分析
- 软件建模
软件工程是什么?
软件工程关注软件开发的各个阶段,包括需求、建模、原型设计、设计、实现、生成、分析、验证、测试、评估、维护以及软件系统的复用。
L1 软件测试
软件测试就是,评估和验证一个软件产品应用有没有做到它该做的,的过程。
尽可能找到更多的 bug。
测试阶段
从图中可以看到,从产品需求开始到最实现,都是可以返回来测试的。
推荐读物:Cooperative Software Design
单元测试 Unit Test
测试独立的单元。
目标:确认每个单元都是对的,而且能完成它们的功能。
集成测试 Integration Test
测试一群子系统,最终测试整个系统。
目标:测试多个子系统之间的交互,确保合并起来的整个系统满足需求。
系统测试 System Test
功能测试 Functional Test
目标:测试系统的功能。
验收测试 Acceptance Test
目标:证明系统迎合用户的需求。
- Alpha 测试(内测)
- Beta 测试(公测)
独立测试 Independent Test
由独立测试者来做的测试,尽量破坏软件,找到最深的 bug。