Lecture 1-3 - Prerequisites

Study Tips

Raise the bar on understanding
When do you feel like your studying has succeeded?
- When you get the solutions to the practice final?
- When you can see the solutions without seeing the practice final?
Recall
Half of the battle is remembering what you need, when you need it. (The other half is doing something once you remember the information.)
Spaced Repetition is a class of algorithms that remind you of things on an efficient schedule. It is widely used in flashcard apps like Anki.
Pay attention to the “knowledge tree”

Calculus

Derivative

Rate of change of a function with respect to a variable
Measures how a function changes as its input changes
Notation: $\frac{d}{d x} f (x)$ or $f^{'} (x)$

Common Derivative Rules

Power Rule: $\frac{d}{d x} (x^{n}) = n x^{n - 1}$
Product Rule: $\frac{d}{d x} (f (x) g (x)) = f^{'} (x) g (x) + f (x) g^{'} (x)$
Chain Rule: $\frac{d}{d x} (f (g (x))) = f^{'} (g (x)) \cdot g^{'} (x)$
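
The three rules can be sanity-checked numerically. This is only a sketch: `numeric_derivative` is a hypothetical helper (not from the lecture) that approximates a derivative with a central difference, and the test functions and the point `x = 1.5` are arbitrary choices.

```python
import math

def numeric_derivative(f, x, h=1e-6):
    """Approximate f'(x) with a central difference (hypothetical helper)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 1.5  # arbitrary evaluation point

# Power rule: d/dx x^3 = 3 x^2
power_numeric = numeric_derivative(lambda t: t ** 3, x)
power_exact = 3 * x ** 2

# Product rule: d/dx (x^2 sin x) = 2x sin x + x^2 cos x
product_numeric = numeric_derivative(lambda t: t ** 2 * math.sin(t), x)
product_exact = 2 * x * math.sin(x) + x ** 2 * math.cos(x)

# Chain rule: d/dx sin(x^2) = cos(x^2) * 2x
chain_numeric = numeric_derivative(lambda t: math.sin(t ** 2), x)
chain_exact = math.cos(x ** 2) * 2 * x

print(power_numeric, power_exact)
print(product_numeric, product_exact)
print(chain_numeric, chain_exact)
```

Each printed pair should agree to several decimal places; the small gap is the finite-difference approximation error.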

Partial Derivative

Derivative of a multivariable function with respect to one variable, holding others constant
Notation: $\frac{\partial f}{\partial x}$ or $f_{x}$

Gradient

Vector of all partial derivatives of a multivariable function
Represents the direction of steepest increase of the function
Notation: $\nabla f = (\frac{\partial f}{\partial x _{1}}, \frac{\partial f}{\partial x _{2}}, ..., \frac{\partial f}{\partial x _{n}})$
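
A gradient can be approximated by taking one partial derivative per coordinate while holding the others fixed. The sketch below assumes a made-up function $f(x, y) = x^{2} + 3xy$, whose exact gradient is $(2x + 3y, 3x)$; `numeric_gradient` is a hypothetical helper.

```python
def numeric_gradient(f, point, h=1e-6):
    """Approximate the gradient of f at `point` (a list of floats)."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h   # nudge only coordinate i; the others stay constant
        backward[i] -= h
        grad.append((f(forward) - f(backward)) / (2 * h))
    return grad

def f(p):
    x, y = p
    return x ** 2 + 3 * x * y

grad = numeric_gradient(f, [1.0, 2.0])
print(grad)  # close to the exact gradient (8, 3) at the point (1, 2)
```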

Integral (barely needed)

Opposite of differentiation, finds the accumulation of quantities
Indefinite integral: $\int f (x) d x = F (x) + C$
Definite integral: $\int_{a}^{b} f (x) d x = F (b) - F (a)$
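
As a quick illustration of the definite integral formula, the sketch below compares a trapezoid-rule sum against $F(b) - F(a)$ for the assumed example $f(x) = x^{2}$, $F(x) = x^{3}/3$ on $[0, 3]$. The helper `trapezoid` is hypothetical.

```python
def trapezoid(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] using n trapezoids."""
    width = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * width)
    return total * width

def f(x):
    return x ** 2

def F(x):
    return x ** 3 / 3  # an antiderivative of f

a, b = 0.0, 3.0
numeric = trapezoid(f, a, b)
exact = F(b) - F(a)  # 9.0
print(numeric, exact)
```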

Optimization (Finding minima and maxima)

Process of finding the best solution from all feasible solutions
Applications in machine learning, economics, and engineering

Min and Argmin

Min: The minimum value of a function
Argmin: The input value(s) that result in the minimum of a function
Notation: $\arg\min_{x} f (x)$ is the value of $x$ that minimizes $f (x)$
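
The difference between min and argmin shows up directly in code. A minimal sketch over a small, made-up set of candidate inputs:

```python
def f(x):
    return (x - 2) ** 2 + 1

candidates = [0, 1, 2, 3, 4]  # assumed finite search space

min_value = min(f(x) for x in candidates)  # min: the smallest output
argmin_x = min(candidates, key=f)          # argmin: the input that achieves it

print(min_value, argmin_x)
```

Here the minimum value is 1 and it is reached at the input 2.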

Boolean Logic

Binary variables

Variables that can take on only two possible values: true or false (often represented as 1 or 0)
Used to represent propositions or statements that are either true or false

Logical operators

AND ( $\land$ ): True if both operands are true
OR ( $\lor$ ): True if at least one operand is true
NOT ( $\neg$ ): Negates the truth value of its operand
XOR ( $\oplus$ ): True if exactly one operand is true
IMPLIES ( $\to$ ): True unless the first operand is true and the second is false
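
These operators map onto Python booleans roughly as follows. Python has no built-in implication operator, so `implies` below is a hypothetical helper built from the equivalence $a \to b \equiv \neg a \lor b$.

```python
def implies(a: bool, b: bool) -> bool:
    """a -> b is false only when a is true and b is false."""
    return (not a) or b

a, b = True, False

and_result = a and b            # AND
or_result = a or b              # OR
not_result = not a              # NOT
xor_result = a != b             # XOR: true when the operands differ
implies_result = implies(a, b)  # IMPLIES

print(and_result, or_result, not_result, xor_result, implies_result)
```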

Truth Tables

Tables that show all possible combinations of input values and their corresponding output values for logical operations
Used to define the behavior of logical operators and evaluate complex Boolean expressions

Truth Table Example

The XOR operation returns true if the inputs are different, and false if they are the same.

Truth table for XOR:

A	B	A XOR B
0	0	0
0	1	1
1	0	1
1	1	0

Where:

$A$ and $B$ are the input values
$\oplus$ represents the XOR operation
The result column shows the output for each combination of inputs
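
A truth table like the one above can be generated by enumerating every input combination. In this sketch, `truth_table` is a hypothetical helper, not a library function.

```python
from itertools import product

def truth_table(op, n_inputs=2):
    """Return one row (inputs..., output) per combination of 0/1 inputs."""
    rows = []
    for inputs in product([0, 1], repeat=n_inputs):
        rows.append((*inputs, int(op(*inputs))))
    return rows

xor_table = truth_table(lambda a, b: a != b)

print("A", "B", "A XOR B", sep="\t")
for row in xor_table:
    print(*row, sep="\t")
```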

Predicate

A boolean predicate is a function that returns either true or false
A predicate can be reduced to a truth value if all free variables are given boolean values
Example:

$\mathrm{allThree} (x, y, z) = (x \land y \land z)$
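
In code, the same predicate is a function returning `bool`; once every free variable is given a value, the call reduces to a single truth value. The snake_case name `all_three` is just a Python spelling of the example above.

```python
def all_three(x: bool, y: bool, z: bool) -> bool:
    """True only when all three arguments are true."""
    return x and y and z

print(all_three(True, True, True))
print(all_three(True, False, True))
```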

First Order Logic

Existential and Universal Quantifiers

Existential Quantifier ( $\exists$ ): “There exists” or “for some”
- $\exists x P (x)$ means “There exists an $x$ such that $P (x)$ is true”
Universal Quantifier ( $\forall$ ): “For all” or “for every”
- $\forall x P (x)$ means “For all $x$ , $P (x)$ is true”
These quantifiers allow for more expressive logical statements than Boolean logic alone
They can be combined with logical operators to form complex predicates
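
Over a finite domain, the two quantifiers correspond to Python's built-in `any` and `all`. The predicate `is_even` and the domain 1..5 are made-up examples.

```python
def is_even(x: int) -> bool:
    return x % 2 == 0

domain = range(1, 6)  # the values 1, 2, 3, 4, 5

exists_even = any(is_even(x) for x in domain)  # "there exists x with P(x)"
forall_even = all(is_even(x) for x in domain)  # "for all x, P(x)"

# Negating a universal gives an existential: not (for all x, P(x)) == exists x, not P(x)
negation_matches = (not forall_even) == any(not is_even(x) for x in domain)

print(exists_even, forall_even, negation_matches)
```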

Set Theory

Set

An unordered container of objects.
The objects are unique.
A set is uniquely defined by the elements it contains.
The cardinality of a set is the number of elements it contains, denoted by $∣ A ∣$ for a set $A$ .

If $A = \{1, 2, 3\}$ and $B = \{3, 2, 1\}$ then $A = B$
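
Python's built-in `set` behaves the same way: order and duplicates are ignored, and `len` gives the cardinality.

```python
A = {1, 2, 3}
B = {3, 2, 1}
C = {1, 1, 2, 3}  # the duplicate 1 collapses into a single element

print(A == B)  # same elements, so the sets are equal
print(A == C)
print(len(A))  # cardinality of A
```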

Function

A function’s signature indicates the name of the function, the space of inputs to the function, and the space of outputs of the function.

In code

def f(x: int) -> int: # takes an integer, and returns an integer
    pass

In math notation

$f : Z \to Z$

Linear Algebra

Vectors

A 1D array of elements
Elements are usually real numbers (math) or floating point numbers (programming)

$x \in R^{n}$ a real vector with $n$ elements

With $n = 3$ , $x$ could be $[0.1, 2.2223, 5]$

$R$ is the real numbers.

$\in$ Read as “in”, or “element of”

Matrices

A 2D array of numbers

$A \in R^{m \times n}$ a real matrix with $m$ rows and $n$ columns

Eigenvalues and Eigenvectors

$A \in R^{n \times n}$ (a square matrix)

The eigenvectors of $A$ are the nonzero vectors $v \in R^{n}$ that satisfy:

$A v = λ v$

Where $λ$ are scalar values (the eigenvalues).
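
The defining equation can be checked directly for a small example. The matrix and eigenpairs below are an assumed illustration, and `matvec` and `is_eigenpair` are hypothetical helpers written with plain lists rather than a linear algebra library.

```python
def matvec(A, v):
    """Multiply matrix A (a list of rows) by vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_eigenpair(A, lam, v):
    """Check A v == lam * v for a nonzero vector v."""
    return any(x != 0 for x in v) and matvec(A, v) == [lam * x for x in v]

A = [[2, 1],
     [1, 2]]

print(is_eigenpair(A, 3, [1, 1]))   # A [1, 1]  = [3, 3]
print(is_eigenpair(A, 1, [1, -1]))  # A [1, -1] = [1, -1]
print(is_eigenpair(A, 2, [1, 0]))   # A [1, 0]  = [2, 1], not a multiple of [1, 0]
```

Exact equality works here only because the entries are integers; with floating point values a tolerance would be needed.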

Graph Theory

A graph is a set of nodes and edges

$G = (V, E)$

The nodes are “atomic” – they don’t have any internal structure. We give each node a label.

$V = \{v_{1}, v_{2}, v_{3}, ...\}$

The edges are pairs of nodes.

$E = \{(v_{1}, v_{2}), (v_{2}, v_{3}), ...\}$
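
One direct way to mirror $G = (V, E)$ in code is a set of labels plus a set of pairs. This is a sketch of one possible representation, not the only one; `neighbors` is a hypothetical helper that treats each edge as undirected.

```python
V = {"v1", "v2", "v3"}
E = {("v1", "v2"), ("v2", "v3")}

def neighbors(node, edges):
    """Nodes that share an edge with `node`, ignoring edge direction."""
    result = set()
    for a, b in edges:
        if a == node:
            result.add(b)
        elif b == node:
            result.add(a)
    return result

print(neighbors("v2", E))
print(neighbors("v1", E))
```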

Weighted Graph

A graph with a scalar quantity on each edge

$E = \{(v_{1}, v_{2}, 5.1), (v_{2}, v_{3}, 7.0), ...\}$

Directed Graph

A graph where the relative order of the vertices in each edge matters
Mathematically, for a directed graph

$(v_{1}, v_{2}) \neq (v_{2}, v_{1})$
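
Weighted and directed graphs can share one simple representation: a dictionary from ordered pairs to weights. This is an assumed encoding for illustration; `successors` is a hypothetical helper.

```python
# Each key is an ordered pair (source, target); each value is the edge weight.
weighted_edges = {
    ("v1", "v2"): 5.1,
    ("v2", "v3"): 7.0,
}

def successors(node, edges):
    """Nodes reachable from `node` by following one directed edge."""
    return {target for source, target in edges if source == node}

# Tuples are ordered, so reversing an edge gives a different edge.
print(("v1", "v2") != ("v2", "v1"))
print(successors("v2", weighted_edges))  # only v3: the edge from v1 points into v2
print(weighted_edges[("v1", "v2")])
```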

Cliques (Fully-Connected Graphs)

Every node is connected to every other node.

Number of edges in a clique on $n$ nodes (undirected, no self-loops)

$∣ E ∣ = n (n - 1) /2$
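
The count follows from choosing every unordered pair of distinct nodes. A small check, with `clique_edges` as a hypothetical helper:

```python
from itertools import combinations

def clique_edges(nodes):
    """All unordered pairs of distinct nodes, i.e. the edges of a clique."""
    return list(combinations(nodes, 2))

n = 5
edges = clique_edges(range(n))

print(len(edges), n * (n - 1) // 2)  # both counts are 10 for n = 5
```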

Trees

A connected undirected graph without cycles

A graph has a cycle if there exists a path from a vertex back to itself without repeating any edges.
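
A sketch of a tree check, assuming the usual definition of a tree as a connected, acyclic undirected graph. It relies on the fact that a connected graph on $n$ nodes is acyclic exactly when it has $n - 1$ edges; `is_tree` is a hypothetical helper.

```python
def is_tree(nodes, edges):
    """True if the undirected graph (nodes, edges) is connected and acyclic."""
    if not nodes or len(edges) != len(nodes) - 1:
        return False
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    # Depth-first search from an arbitrary node to test connectivity.
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(nodes)

print(is_tree({1, 2, 3}, [(1, 2), (2, 3)]))             # a path: tree
print(is_tree({1, 2, 3}, [(1, 2), (2, 3), (3, 1)]))     # a triangle: has a cycle
print(is_tree({1, 2, 3, 4}, [(1, 2), (2, 3), (3, 1)]))  # cycle plus an isolated node
```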

DAGs (Directed Acyclic Graphs)

A directed graph without cycles
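
A directed graph is a DAG exactly when its nodes can be listed so that every edge points forward (a topological order). The sketch below uses the standard library's `graphlib` module (Python 3.9+); `topological_order` is a hypothetical wrapper.

```python
from graphlib import CycleError, TopologicalSorter

def topological_order(edges):
    """Return nodes in an order where every edge points forward, or None if there is a cycle."""
    sorter = TopologicalSorter()
    for source, target in edges:
        sorter.add(target, source)  # target must come after source
    try:
        return list(sorter.static_order())
    except CycleError:
        return None

print(topological_order([("a", "b"), ("b", "c")]))  # a chain is a DAG
print(topological_order([("a", "b"), ("b", "a")]))  # a two-node cycle is not
```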

Probability and Statistics

Probability

A measure of the likelihood of an event occurring
Expressed as a number between 0 (impossible) and 1 (certain)
Notation: $P (A)$ represents the probability of event A occurring

Experiment

A process with a well-defined set of possible outcomes
Example: rolling a die, flipping a coin

Outcome

A single result of an experiment
Example: getting a 3 when rolling a die

Event

A set of outcomes from an experiment
Example: rolling an even number on a die

Sample Space

The set of all possible outcomes of an experiment
Denoted by $Ω$ (omega)
Example: for a coin flip, $Ω$ = {heads, tails}

Naive Probability

Assumes all outcomes are equally likely and mutually exclusive
$P (A) = \frac{\text{number of favorable outcomes}}{\text{total number of possible outcomes}}$
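
With equally likely outcomes, the formula is just a ratio of set sizes. A sketch for the event "rolling an even number on a die", using exact fractions; `naive_probability` is a hypothetical helper.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair six-sided die
even = {outcome for outcome in sample_space if outcome % 2 == 0}

def naive_probability(event, sample_space):
    """Favorable outcomes divided by all outcomes (assumes equal likelihood)."""
    return Fraction(len(event & sample_space), len(sample_space))

p_even = naive_probability(even, sample_space)
print(p_even)
```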

Conditional Probability

The probability of an event occurring given that another event has already occurred
Notation: $P (A ∣ B)$ represents the probability of event A occurring given that event B has occurred
Formula: $P (A ∣ B) = \frac{P ( A \text{ and } B )}{P ( B )}$, defined when $P (B) > 0$
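
A worked example on a fair die, with made-up events: $A$ is "the roll is at most 3" and $B$ is "the roll is even". Conditioning on $B$ restricts attention to the outcomes in $B$.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}  # one roll of a fair six-sided die
A = {1, 2, 3}                      # event: roll is at most 3
B = {2, 4, 6}                      # event: roll is even

def prob(event):
    """Naive probability: equally likely outcomes."""
    return Fraction(len(event), len(sample_space))

p_a_given_b = prob(A & B) / prob(B)  # P(A and B) / P(B)
print(p_a_given_b)
```

Only one of the three even outcomes (the 2) is at most 3, so the result is 1/3.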

Bayes’ Rule

A theorem that relates conditional probabilities
Used to update probabilities based on new evidence
Formula: $P (A ∣ B) = \frac{P ( B ∣ A ) \cdot P ( A )}{P ( B )}$
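
A sketch of Bayes' rule with invented numbers: $A$ is "has a condition" and $B$ is "test is positive". The prior and the test's error rates below are purely illustrative assumptions. $P(B)$ is obtained from the law of total probability.

```python
from fractions import Fraction

p_a = Fraction(1, 100)              # P(A): prior (made-up)
p_b_given_a = Fraction(95, 100)     # P(B | A): true positive rate (made-up)
p_b_given_not_a = Fraction(5, 100)  # P(B | not A): false positive rate (made-up)

# Law of total probability: P(B) = P(B | A) P(A) + P(B | not A) P(not A)
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)

# Bayes' rule: P(A | B) = P(B | A) P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(p_a_given_b, float(p_a_given_b))
```

Even with a fairly accurate test, the posterior is only about 16% here because the prior is so small.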

Expected Value

The average outcome of an experiment if it is repeated many times
Denoted by $E (X)$ for a random variable $X$
Calculated by multiplying each possible outcome by its probability and summing the results
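
For a discrete random variable this is $E (X) = \sum_{x} x \cdot P (X = x)$. A minimal sketch for a fair six-sided die, with `expected_value` as a hypothetical helper:

```python
from fractions import Fraction

# Map each outcome to its probability (fair six-sided die).
distribution = {face: Fraction(1, 6) for face in range(1, 7)}

def expected_value(dist):
    """Sum of outcome * probability over all outcomes."""
    return sum(x * p for x, p in dist.items())

ev = expected_value(distribution)
print(ev, float(ev))
```

The result, 7/2, is not itself a possible roll; it is the long-run average.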

Variance

A measure of variability or spread in a set of data
Denoted by $Var (X)$ or $σ^{2}$ for a random variable $X$
Calculated as the average squared deviation from the mean
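
For a discrete random variable this is $Var (X) = \sum_{x} (x - E (X))^{2} \cdot P (X = x)$. The sketch below reuses the fair-die example; both helpers are hypothetical.

```python
from fractions import Fraction

distribution = {face: Fraction(1, 6) for face in range(1, 7)}  # fair die

def expected_value(dist):
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Probability-weighted average of squared deviations from the mean."""
    mean = expected_value(dist)
    return sum(p * (x - mean) ** 2 for x, p in dist.items())

var = variance(distribution)
print(var, float(var))
```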

Hypothesis Testing

A statistical method used to make inferences about a population based on sample data
Involves formulating null and alternative hypotheses

Statistical Significance
- Whether an observed result or relationship would be unlikely to arise by chance alone if the null hypothesis were true
- Often measured using a significance level ( $α$ ), typically set at 0.05 or 0.01
- A result is considered statistically significant if its p-value is less than the chosen significance level
P-value
- The probability of obtaining test results at least as extreme as the observed results, assuming the null hypothesis is true
- Used to determine statistical significance
- A small p-value (typically < 0.05) suggests strong evidence against the null hypothesis
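
A small worked example, under assumed data: a coin lands heads 9 times in 10 flips, and the null hypothesis is that the coin is fair. The one-sided p-value is the probability of seeing 9 or more heads under that null. `binomial_tail` is a hypothetical helper.

```python
from math import comb

def binomial_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))

p_value = binomial_tail(10, 9)  # (C(10, 9) + C(10, 10)) / 2**10 = 11/1024

significant_at_05 = p_value < 0.05
significant_at_01 = p_value < 0.01

print(p_value, significant_at_05, significant_at_01)
```

The p-value is about 0.0107, so the same data is significant at $α = 0.05$ but not at $α = 0.01$; the conclusion depends on the threshold chosen in advance.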

Programming

Functions
Control flow
Object-oriented programming
Environment Management
Third-party libraries
Reading documentation

Look for similarities

Code as data
Data as code

📚 gabe's wiki
