Bradley Woolf

Mathematical Tools

Basic math LEGO bricks I've used while building at the intersection of computers × biology. I keep this page as a reference.

Calculus + Linear Algebra

| Component | Definition | In Biology / Computing | Math Variables |
| --- | --- | --- | --- |
| Derivative | Instantaneous rate of change | Enzyme kinetics, population growth, gradient descent | $\frac{dy}{dx},\ \nabla f(x)$ |
| Partial Derivative | Rate of change w.r.t. one variable | Multi-omics sensitivity, energy landscapes | $\frac{\partial f}{\partial x_i}$ |
| Gradient | Vector of partial derivatives | Backpropagation, optimization | $\nabla f(x) = \left[\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right]$ |
| Jacobian | Matrix of first-order derivatives | Sensitivity analysis, neural nets | $J_{ij} = \frac{\partial f_i}{\partial x_j}$ |
| Hessian | Matrix of second-order derivatives | Curvature in protein folding, optimization | $H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}$ |
| Taylor Expansion | Approximation around a point | Local linearization of pathways, dynamics | $f(x) \approx f(a) + f'(a)(x-a) + \tfrac{1}{2}f''(a)(x-a)^2$ |
| Chain Rule | Derivative of composite functions | Backpropagation in neural nets | $\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}$ |
| Integral | Accumulated quantity | Population size from growth rate, cumulative flux | $\int f(x)\,dx$ |
| Definite Integral | Area under curve / accumulation over an interval | Protein abundance from expression rate | $\int_a^b f(x)\,dx$ |
| Multiple Integral | Integration over multidimensional space | Partition functions, probability densities | $\int \cdots \int f(x_1,\ldots,x_n)\,dx_1 \cdots dx_n$ |
| Divergence | Outflow rate of a vector field | Flux of ions, transport phenomena | $\nabla \cdot \vec{F}$ |
| Curl | Rotation of a vector field | Electromagnetic models in biophysics | $\nabla \times \vec{F}$ |
| Laplacian | Divergence of the gradient | Diffusion models, electrophysiology | $\nabla^2 f = \nabla \cdot (\nabla f)$ |
| Linear Equation | Equation in vector/matrix form | Kinetics, statistical models | $Ax = b$ |
| Matrix Multiplication | Transformation, composition | Feature embeddings, transition systems | $C = AB$ |
| Determinant | Scalar property of a matrix | Volume scaling, invertibility | $\det(A)$ |
| Inverse Matrix | Solves linear systems | Regression, circuit models | $A^{-1}$ |
| Eigenvalue / Eigenvector | Invariant scaling directions | PCA, network stability | $Av = \lambda v$ |
| SVD | Decomposition into orthogonal bases | Dimensionality reduction (scRNA-seq) | $A = U \Sigma V^T$ |
| QR / LU / Cholesky | Matrix factorizations for solving | Numerical solvers for pathway models | $A = QR,\ A = LU,\ A = LL^T$ |
| Projection | Mapping onto a subspace | Embeddings, latent space representations | $P = UU^T$ |
| Inner Product | Measure of similarity | Cosine similarity, kernel methods | $\langle x, y \rangle = x^T y$ |
| Norm | Length / magnitude of a vector | Regularization, error measures | $\lVert x \rVert_p = \left(\sum_{i=1}^{n} \lvert x_i \rvert^p\right)^{1/p}$ |
| Mahalanobis Distance | Distance scaled by covariance | Anomaly detection in cell states | $d(x,y) = \sqrt{(x-y)^T \Sigma^{-1} (x-y)}$ |
| Orthogonality | Perpendicular vectors/bases | PCA axes, Fourier modes | $x^T y = 0$ |
| Rank | Dimension of the column/row space | Degrees of freedom in a system | $\mathrm{rank}(A)$ |
| Trace | Sum of diagonal elements | Invariant measure in statistics | $\mathrm{tr}(A) = \sum_i A_{ii}$ |
| Condition Number | Sensitivity of a linear system | Numerical stability in simulations | $\kappa(A) = \lVert A \rVert\,\lVert A^{-1} \rVert$ |
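
A minimal NumPy sketch of a few of these bricks in one place: SVD as a PCA-style projection, plus the condition number of the resulting Gram matrix. The 100×20 "expression matrix" is synthetic, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 20))        # rows = cells, cols = genes (made up)
X -= X.mean(axis=0)                   # center before a PCA-style projection

U, S, Vt = np.linalg.svd(X, full_matrices=False)   # A = U Σ V^T
Z = X @ Vt[:2].T                      # project onto top-2 right singular vectors
print("top singular values:", S[:2])

A = X.T @ X                           # Gram matrix (symmetric PSD)
print("condition number κ(A):", np.linalg.cond(A))
```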

Probability, Statistics, Information Theory

| Component | Definition | In Biology / Computing | Math Variables |
| --- | --- | --- | --- |
| Probability Distribution | Assigns likelihood to outcomes | Gene expression variability, sequencing errors | $P(X=x)$ |
| Expectation | Average value under a distribution | Mean expression, fitness landscape averages | $\mathbb{E}[X] = \sum_x x\,P(x)$ |
| Variance | Spread around the mean | Expression noise, measurement error | $\mathrm{Var}(X) = \mathbb{E}[(X-\mu)^2]$ |
| Covariance | Joint variability of two variables | Co-expression of genes | $\mathrm{Cov}(X,Y) = \mathbb{E}[(X-\mu_X)(Y-\mu_Y)]$ |
| Correlation | Normalized covariance (−1 to 1) | Gene–gene correlation networks | $\rho_{XY} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}$ |
| Law of Large Numbers | Sample averages converge to the expectation | Replicates reduce noise | $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i \xrightarrow{\text{a.s.}} \mu$ as $n \to \infty$ |
| Central Limit Theorem | Normal distribution emerges from sums | Sequencing depth → Gaussian errors | $\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \to \mathcal{N}(0,1)$ |
| Bernoulli | Single binary outcome | Mutation present/absent | $P(X=1)=p,\ P(X=0)=1-p$ |
| Binomial | Sum of Bernoullis | Count of mutations in reads | $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$ |
| Poisson | Events in a fixed interval | RNA counts, sequencing reads | $P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}$ |
| Negative Binomial | Overdispersed counts | scRNA-seq models | $P(X=k) = \binom{k+r-1}{k}(1-p)^r p^k$ |
| Normal Distribution | Gaussian variability | Expression levels, errors | $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ |
| Log-normal | Multiplicative noise | Protein abundance distributions | $f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\, e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}$ |
| Gamma | Waiting times, skewed distributions | Reaction times, lifetimes | $f(x;\alpha,\beta) = \frac{\beta^\alpha}{\Gamma(\alpha)}\, x^{\alpha-1} e^{-\beta x}$ |
| Beta | Distribution on [0,1] | Allele frequencies, probabilities | $f(x;\alpha,\beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}$ |
| Dirichlet | Multivariate generalization of the Beta | Topic/cell-type mixtures | $f(x;\alpha) = \frac{1}{B(\alpha)} \prod_i x_i^{\alpha_i - 1}$ |
| Multinomial | Multi-class counts | Reads across cell types | $P(X_1=x_1,\ldots,X_k=x_k) = \frac{n!}{\prod_i x_i!}\prod_i p_i^{x_i}$ |
| Maximum Likelihood (MLE) | Best-fit parameters from data | Fit kinetic rates, emission probs | $\hat{\theta}_{\mathrm{MLE}} = \arg\max_\theta L(\theta \mid x) = \arg\max_\theta \sum_{i=1}^{n}\log p(x_i \mid \theta)$ |
| Maximum A Posteriori (MAP) | MLE with a prior | Regularized estimates | $\hat{\theta}_{\mathrm{MAP}} = \arg\max_\theta p(\theta \mid x) = \arg\max_\theta \big[\log p(x \mid \theta) + \log p(\theta)\big]$ |
| Bayesian Inference | Updates belief with data | Model calibration in biology | $p(\theta \mid x) = \frac{p(x \mid \theta)\,p(\theta)}{p(x)},\quad p(x) = \int p(x \mid \theta)\,p(\theta)\,d\theta$ |
| Hypothesis Test | Decision about an effect | Gene expression DE tests | $H_0: \mu_1 = \mu_2$ |
| t-test | Mean-difference test | Differential gene expression | $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$ |
| Chi-square test | Goodness of fit | Contingency tables in genomics | $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$ |
| FDR / BH Procedure | Controls multiple testing | GWAS, RNA-seq DE genes | $q_i = \min\left(\frac{p_i\,m}{i},\ 1\right)$ |
| Bootstrap | Resampling with replacement | CIs for small-sample experiments | $\theta^* = f(X^*)$ |
| Entropy | Uncertainty of a distribution | Cell diversity, motif randomness | $H(p) = -\sum_i p_i \log p_i$ |
| Cross-Entropy | Divergence between true & model | Loss for classifiers | $H(p,q) = -\sum_i p_i \log q_i$ |
| KL Divergence | Relative entropy | Compare distributions (healthy vs. disease) | $D_{KL}(p \,\Vert\, q) = \sum_i p_i \log \frac{p_i}{q_i}$ |
| Jensen–Shannon Divergence | Symmetrized KL | Compare embeddings | $D_{JS}(p \Vert q) = \tfrac{1}{2}D_{KL}(p \Vert m) + \tfrac{1}{2}D_{KL}(q \Vert m),\ m = \tfrac{1}{2}(p+q)$ |
| Mutual Information | Shared information between variables | Regulatory network inference | $I(X;Y) = \sum_{x,y} p(x,y)\log\frac{p(x,y)}{p(x)p(y)}$ |
| Perplexity | Exponential of entropy | Quality of embeddings, models | $\mathrm{Perp}(p) = 2^{H(p)}$ |
| Information Bottleneck | Tradeoff between compression and relevance | Latent representation of cell states | $\min\, I(X;Z) - \beta I(Z;Y)$ |
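
The BH row above is easy to get subtly wrong in code, so here is a small sketch of it: $q_i = \min(p_{(i)}\,m/i,\ 1)$ with the usual monotonicity pass over the sorted p-values. The p-values below are invented, just to show the shape of the output.

```python
import numpy as np

def bh_qvalues(pvals):
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                     # sort p-values ascending
    q = p[order] * m / np.arange(1, m + 1)    # raw BH ratios p_(i) * m / i
    q = np.minimum.accumulate(q[::-1])[::-1]  # enforce monotone non-decreasing q
    q = np.clip(q, 0, 1)
    out = np.empty(m)
    out[order] = q                            # restore original order
    return out

print(bh_qvalues([0.001, 0.008, 0.039, 0.041, 0.20, 0.74]))
```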

Optimization and ML Primitives

| Component | Definition | In Biology / Computing | Math Variables |
| --- | --- | --- | --- |
| Loss Function | Scalar measure of prediction error | Training models for gene expression, structure prediction | $L(\theta) = \frac{1}{N}\sum_i \ell(f_\theta(x_i), y_i)$ |
| Mean Squared Error (MSE) | Average squared error | Regression tasks, kinetics fitting | $L = \frac{1}{n}\sum_i (y_i - \hat{y}_i)^2$ |
| Cross-Entropy Loss | Divergence between distributions | Classification, motif recognition | $L = -\sum_i y_i \log \hat{y}_i$ |
| Hinge Loss | Margin-based loss | Support vector machines, binary classification | $L = \max(0,\ 1 - y f(x))$ |
| Regularization | Penalty term to avoid overfitting | Control model complexity | $L' = L + \lambda \lVert w \rVert_p$ |
| L1 Regularization | Promotes sparsity | Feature selection in omics | $\min_w L(w) + \lambda \lVert w \rVert_1,\quad \lVert w \rVert_1 = \sum_{i=1}^{d} \lvert w_i \rvert$ |
| L2 Regularization | Penalizes large weights | Ridge regression, weight decay | $\lVert w \rVert_2^2 = \sum_i w_i^2$ |
| Gradient Descent | Iterative optimization step | Neural nets, ODE parameter fitting | $\theta_{t+1} = \theta_t - \eta \nabla_\theta L$ |
| Stochastic Gradient Descent (SGD) | Uses minibatches for updates | Large-scale models on bio data | $\theta_{t+1} = \theta_t - \eta \nabla_\theta L(\theta; x_i)$ |
| Momentum | Uses past updates to accelerate | Faster convergence in training | $v_{t+1} = \beta v_t + \nabla_\theta L,\ \theta_{t+1} = \theta_t - \eta v_{t+1}$ |
| Adam Optimizer | Adaptive moment estimation | Standard in DL training | $m_t = \beta_1 m_{t-1} + (1-\beta_1)g_t,\ v_t = \beta_2 v_{t-1} + (1-\beta_2)g_t^2$ |
| L-BFGS | Quasi-Newton optimization | Energy minimization in proteins | Updates use an inverse-Hessian approximation |
| Conjugate Gradient | Iterative quadratic solver | Sparse system solvers in genomics | $x_{k+1} = x_k + \alpha_k p_k$ |
| Coordinate Descent | Optimizes one variable at a time | LASSO, constrained models | $x_i^{(t+1)} = \arg\min f(x_1, \ldots, x_i, \ldots)$ |
| Lagrange Multipliers | Optimization with constraints | Flux balance analysis | $\mathcal{L}(x,\lambda) = f(x) + \lambda g(x)$ |
| KKT Conditions | Optimality for constrained optimization | Biochemical flux solutions | Stationarity, primal/dual feasibility, complementarity |
| Proximal Operator | Handles non-smooth penalties | Sparse regression, TV denoising | $\mathrm{prox}_{\lambda f}(v) = \arg\min_x \big(f(x) + \tfrac{1}{2\lambda}\lVert x - v \rVert^2\big)$ |
| Expectation–Maximization (EM) | Iterative latent-variable inference | Mixture models, cell deconvolution | E-step: compute $Q(\theta)$; M-step: maximize $Q$ |
| k-Means | Clustering by minimizing within-cluster variance | Cell type clustering | $\arg\min \sum_i \lVert x_i - \mu_{c(i)} \rVert^2$ |
| Gaussian Mixture Model (GMM) | Soft clustering with Gaussians | Expression distributions | $p(x) = \sum_{k=1}^{K} \pi_k\,\mathcal{N}(x \mid \mu_k, \Sigma_k)$; responsibilities $\gamma_{ik} = \frac{\pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K}\pi_j\,\mathcal{N}(x_i \mid \mu_j, \Sigma_j)}$ |
| Softmax | Converts scores to probabilities | Classification, attention weights | $\mathrm{softmax}(z_i) = \frac{e^{z_i}}{\sum_j e^{z_j}}$ |
| ReLU | Nonlinear activation | Neural networks | $f(x) = \max(0, x)$ |
| Sigmoid | Squashes to (0,1) | Logistic regression, gating | $\sigma(x) = \frac{1}{1+e^{-x}}$ |
| Tanh | Squashes to (−1,1) | Normalized activations | $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$ |
| Attention Mechanism | Weighted combination of inputs | Protein sequence models, embeddings | $\mathrm{Attn}(Q,K,V) = \mathrm{softmax}\left(\frac{QK^T}{\sqrt{d_k}}\right)V$ |
| Contrastive Loss | Brings similar pairs together | Align cell embeddings, protein–ligand | $L = -\log \frac{e^{\mathrm{sim}(x_i,x_j)/\tau}}{\sum_k e^{\mathrm{sim}(x_i,x_k)/\tau}}$ |
| VAE ELBO | Lower bound for latent-variable models | Generative scRNA-seq | $\mathcal{L}_{\mathrm{ELBO}}(x) = \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big] - D_{\mathrm{KL}}\big(q_\phi(z \mid x)\,\Vert\,p(z)\big)$ |
| Diffusion SDE | Forward corruption process | Generative protein design | $dx = f(x,t)\,dt + g(t)\,dW_t$ |
| Graph Laplacian | Encodes connectivity | Protein–protein interaction networks | $L = D - A$ |
| Message Passing | Node embedding updates | GNNs for molecules | $h_v^{(t+1)} = \sigma\big(W \cdot \mathrm{AGG}\{h_u^{(t)} : u \in N(v)\}\big)$ |
| Equivariance (SE(3)) | Preserves geometric symmetries | Protein structure prediction | $f(Rx) = R f(x)$ |
| Clustering Modularity | Graph-based community detection | Gene co-expression networks | $Q = \frac{1}{2m}\sum_{ij}\big(A_{ij} - \frac{k_i k_j}{2m}\big)\delta(c_i, c_j)$ |
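
A toy sketch combining a few rows above: MSE loss plus L2 regularization, minimized by plain gradient descent. The data, true weights, and hyperparameters are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
w_true = np.array([2.0, -1.0, 0.0, 0.5, 3.0])
y = X @ w_true + 0.1 * rng.normal(size=200)

w, eta, lam = np.zeros(5), 0.05, 0.01
for _ in range(500):
    grad = (2 / len(y)) * X.T @ (X @ w - y) + 2 * lam * w  # ∇(MSE + λ‖w‖²)
    w -= eta * grad                                        # θ ← θ − η∇L
print(np.round(w, 2))  # close to w_true, shrunk slightly by the penalty
```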

Dynamics (ODEs, PDEs, Stochastic Processes, Control/RL)

| Component | Definition | In Biology / Computing | Math Variables |
| --- | --- | --- | --- |
| Ordinary Differential Equation (ODE) | Time evolution of variables | Gene circuits, pharmacokinetics | $\frac{dx}{dt} = f(x,t)$ |
| System of ODEs | Multiple interacting variables | Pathways, population dynamics | $\frac{d\mathbf{x}}{dt} = F(\mathbf{x},t)$ |
| Partial Differential Equation (PDE) | Evolution over space + time | Diffusion of molecules, electrophysiology | $\frac{\partial u}{\partial t} = D \nabla^2 u$ |
| Heat Equation | Diffusion PDE | Ion transport, morphogen gradients | $\frac{\partial u}{\partial t} = \alpha \nabla^2 u$ |
| Wave Equation | Propagation dynamics | Nerve impulses, biomechanics | $\frac{\partial^2 u}{\partial t^2} = c^2 \nabla^2 u$ |
| Poisson Equation | Potential field equation | Electrostatics in proteins | $-\nabla^2 u = f$ |
| Lotka–Volterra | Predator–prey dynamics | Species competition, host–virus | $\frac{dx}{dt} = \alpha x - \beta xy,\ \frac{dy}{dt} = \delta xy - \gamma y$ |
| Michaelis–Menten Kinetics | Enzyme rate law | Metabolic modeling | $v = \frac{V_{\max}[S]}{K_M + [S]}$ |
| Hill Equation | Cooperative binding | Gene regulation, transcription factors | $\theta = \frac{[L]^n}{K_d^n + [L]^n}$ |
| Mass-Action Kinetics | Reaction rate ∝ concentrations | Systems biology, stoichiometric models | $v = k \prod_i [X_i]^{\nu_i}$ |
| Gillespie Algorithm | Stochastic simulation of reactions | Single-cell variability | Generates trajectories via random exponential waiting times |
| Markov Chain (transition + distribution update) | Memoryless state transitions | Mutation models, sequence evolution | $\Pr(X_{t+1}=j \mid X_t=i) = P_{ij},\quad \pi_{t+1}^{\top} = \pi_t^{\top} P$ |
| Hidden Markov Model (HMM) | Latent-state probabilistic model, joint likelihood | Gene finding, protein domains | $p(x_{1:T}, z_{1:T}) = p(z_1)\prod_{t=2}^{T} p(z_t \mid z_{t-1})\prod_{t=1}^{T} p(x_t \mid z_t)$ |
| Stochastic Differential Equation (SDE) | Dynamics with noise | Noisy gene expression, molecular motion | $dx = f(x,t)\,dt + g(x,t)\,dW_t$ |
| Ornstein–Uhlenbeck Process | Mean-reverting SDE | Noise in biophysical systems | $dx = \theta(\mu - x)\,dt + \sigma\,dW_t$ |
| Random Walk | Successive random steps | Diffusion models, genome scans | $X_{t+1} = X_t + \epsilon_t$ |
| Brownian Motion | Continuous random process | Molecular dynamics | $W(t) \sim \mathcal{N}(0, t)$ |
| Control System | Regulates the state of a system | Bioreactor control, feedback | $\dot{x} = Ax + Bu,\ y = Cx$ |
| PID Controller | Proportional–integral–derivative control | Lab robotics, process stabilization | $u(t) = K_p e(t) + K_i \int e(\tau)\,d\tau + K_d \frac{de}{dt}$ |
| Model Predictive Control (MPC) | Optimization-based control | Dynamic bioprocess optimization | $\min_u \sum_{k=0}^{N} \lVert x_k - x^* \rVert_Q^2 + \lVert u_k \rVert_R^2$ |
| Markov Decision Process (MDP) | Sequential decision framework | Adaptive experiment design | $\langle S, A, P, R, \gamma \rangle$ |
| Bellman Equation | Recursive value definition | RL-based model design | $V^*(s) = \max_a \big[r(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\,V^*(s')\big]$ |
| Policy Gradient | Optimizes expected reward | Reinforcement learning in biology | $\nabla_\theta J(\theta) = \mathbb{E}_{\pi_\theta}\big[\sum_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,A_t\big]$ |
| Advantage Estimation (GAE) | Variance-reduced estimator | Efficient policy training | $A_t = \sum_{l=0}^{\infty} (\gamma\lambda)^l \delta_{t+l}$ |
| PPO (Proximal Policy Optimization) | RL with clipped objective | Safe training in bio RL models | $L^{\mathrm{CLIP}} = \mathbb{E}\big[\min\big(r_t(\theta)A_t,\ \mathrm{clip}(r_t(\theta), 1-\epsilon, 1+\epsilon)A_t\big)\big]$ |
| SAC (Soft Actor-Critic) | RL with entropy maximization | Exploration in protein design | $J(\pi) = \sum_t \mathbb{E}_{(s_t,a_t)\sim\pi}\big[r(s_t,a_t) + \alpha\,\mathcal{H}(\pi(\cdot \mid s_t))\big],\ \mathcal{H}(\pi(\cdot \mid s)) = -\int \pi(a \mid s)\log\pi(a \mid s)\,da$ |
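
As a worked example of the ODE rows, here is the Lotka–Volterra system integrated with SciPy's `solve_ivp`. The parameter values and initial populations are illustrative, not fit to any real system.

```python
import numpy as np
from scipy.integrate import solve_ivp

alpha, beta, delta, gamma = 1.1, 0.4, 0.1, 0.4   # arbitrary rate constants

def lotka_volterra(t, z):
    x, y = z                               # prey, predator
    return [alpha * x - beta * x * y,      # dx/dt
            delta * x * y - gamma * y]     # dy/dt

sol = solve_ivp(lotka_volterra, (0, 50), [10.0, 5.0], dense_output=True)
print(sol.y[:, -1])   # prey and predator populations at t = 50
```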

Signal Processing, Numerical Methods & HPC

| Component | Definition | In Biology / Computing | Math Variables |
| --- | --- | --- | --- |
| Convolution | Weighted overlap of functions | Motif scanning, microscopy filtering | $(f*g)(t) = \int f(\tau)g(t-\tau)\,d\tau$ |
| Correlation | Similarity via shifted overlap | Template matching in sequences | $(f \star g)(t) = \int f(\tau)g(t+\tau)\,d\tau$ |
| Fourier Transform | Decomposes into frequencies | MRI k-space, EEG | $\hat{f}(\omega) = \int f(t)e^{-i\omega t}\,dt$ |
| Discrete Fourier Transform (DFT) | Finite-sample version | Sequence periodicity detection | $X_k = \sum_{n=0}^{N-1} x_n e^{-i2\pi kn/N}$ |
| Fast Fourier Transform (FFT) | Efficient DFT algorithm | Bio-signal analysis at scale | $O(N \log N)$ complexity |
| Wavelet Transform | Time–frequency decomposition | Microscopy image denoising | $W(a,b) = \int f(t)\,\psi\!\left(\frac{t-b}{a}\right)dt$ |
| Radon Transform | Line integrals of a function | CT reconstruction | $Rf(\theta,s) = \int f(x,y)\,\delta(s - x\cos\theta - y\sin\theta)\,dx\,dy$ |
| Filter (Low/High-pass) | Signal smoothing or sharpening | Noise reduction in time series | Frequency cutoff: $H(\omega)$ |
| Wiener Filter | Linear MMSE estimator | Signal denoising | $H(\omega) = \frac{S_{xx}(\omega)}{S_{xx}(\omega) + S_{nn}(\omega)}$ |
| Kalman Filter | Recursive state estimator | Tracking cell motion | Predict: $\hat{x}_{k\mid k-1} = A\,\hat{x}_{k-1\mid k-1} + B\,u_k$, $P_{k\mid k-1} = A\,P_{k-1\mid k-1}A^{\top} + Q$. Update: $K_k = P_{k\mid k-1}H^{\top}\big(HP_{k\mid k-1}H^{\top} + R\big)^{-1}$, $\hat{x}_{k\mid k} = \hat{x}_{k\mid k-1} + K_k\big(y_k - H\hat{x}_{k\mid k-1}\big)$, $P_{k\mid k} = (I - K_k H)P_{k\mid k-1}$ |
| Particle Filter | Sequential Monte Carlo | Nonlinear/noisy tracking | Approximates the posterior with weighted particles |
| Total Variation (TV) Denoising | Penalizes gradient magnitude | Microscopy deblurring | $\min_x \lVert x - y \rVert^2 + \lambda \lVert \nabla x \rVert_1$ |
| Compressed Sensing | Recovery from undersampling | Accelerated MRI | $\min_x \lVert x \rVert_1\ \text{s.t.}\ Ax = b$ |
| Finite Difference | Approximates derivatives by discretization | PDE solvers | $\frac{\partial u}{\partial x} \approx \frac{u(x+h) - u(x)}{h}$ |
| Finite Element Method (FEM) | Domain discretization into elements | Biomechanics, electrophysiology | Weak form: $\int_\Omega \nabla u \cdot \nabla v\,dx$ |
| Finite Volume Method | Conserves fluxes per cell | Transport models in tissues | $\frac{d}{dt}\int_\Omega u\,dx + \int_{\partial\Omega} F \cdot n\,ds = 0$ |
| Monte Carlo Integration | Random sampling for integrals | Partition functions, uncertainty | $I \approx \frac{1}{N}\sum_i f(x_i)$ |
| Importance Sampling | Weighted MC estimates | Rare-event modeling | $I \approx \frac{1}{N}\sum_i \frac{f(x_i)}{q(x_i)}$ |
| Quasi-Monte Carlo | Low-discrepancy sequences | Faster convergence for high-dim integrals | Uses Sobol / Halton sequences |
| Automatic Differentiation | Programmatic derivatives | Training ML models | Forward & reverse mode $\frac{dy}{dx}$ |
| Backpropagation | Reverse-mode autodiff | Neural nets in bio | $\frac{\partial L}{\partial w}$ via the chain rule |
| Roofline Model | Performance vs. arithmetic intensity | Kernel optimization | FLOPs/byte tradeoff |
| Amdahl's Law | Parallelism speedup bound | Multi-core scaling limits | $S = \frac{1}{(1-p) + p/N}$ |
| Gustafson's Law | Scaling efficiency with workload | HPC bio pipelines | $S = N - (N-1)(1-p)$ |
| Memory Bandwidth Limit | Bytes/sec bottleneck | GPU genomics kernels | Throughput = min(compute, memory BW) |
| SIMD / GPU Parallelism | Single instruction, multiple data | k-mer counting, alignment | Vector ops per cycle |
| Sparse Matrix Ops | Efficient storage/computation | Genome graphs, scRNA matrices | Formats: CSR, COO |
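
A short sketch of the DFT/FFT rows: recovering a known frequency from a noisy signal with `numpy.fft`. The signal is synthetic, with an arbitrary 7 Hz tone buried in Gaussian noise.

```python
import numpy as np

fs, f0 = 100.0, 7.0                     # sample rate and true frequency (Hz)
t = np.arange(0, 4, 1 / fs)
rng = np.random.default_rng(2)
x = np.sin(2 * np.pi * f0 * t) + 0.5 * rng.normal(size=t.size)

X = np.fft.rfft(x)                      # DFT of a real-valued signal
freqs = np.fft.rfftfreq(t.size, d=1 / fs)
peak = np.argmax(np.abs(X[1:])) + 1     # skip the DC bin
print("peak frequency:", freqs[peak])   # ≈ 7.0 Hz
```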

Bioinformatics and Sequence Mathematics

| Component | Definition | In Biology / Computing | Math Variables |
| --- | --- | --- | --- |
| k-mer | Substring of length $k$ | Genome comparison, sequence hashing | $s[i:i+k]$ |
| Jaccard Index | Set similarity measure | Genome sketching, assembly comparison | $J(A,B) = \frac{\lvert A \cap B \rvert}{\lvert A \cup B \rvert}$ |
| MinHash | Fast Jaccard approximation | Large-scale sequence similarity | Randomized hashing of k-mers |
| Count-Min Sketch | Probabilistic frequency table | Streaming k-mer counts | $\tilde{f}(i) = \min_j C[j, h_j(i)]$ |
| Hamming Distance | Number of differing positions | DNA barcode error correction | $d_H(x,y) = \sum_i [x_i \ne y_i]$ |
| Levenshtein Distance | Edit distance (insert/delete/substitute) | Sequence alignment | Minimum edits to transform $x \to y$ |
| Smith–Waterman | Local alignment DP | Short sequence homology | $F_{i,j} = \max\{0,\ F_{i-1,j-1}+s,\ F_{i-1,j}-d,\ F_{i,j-1}-d\}$ |
| Needleman–Wunsch | Global alignment DP | Genome alignment | $F_{i,j} = \max\{F_{i-1,j-1}+s,\ F_{i-1,j}-d,\ F_{i,j-1}-d\}$ |
| Gotoh Algorithm | Alignment with affine gaps | Realistic indel scoring | Gap cost $g + ke$ |
| Substitution Matrix | Scoring amino acid swaps | BLOSUM, PAM | $S(a,b) = \log \frac{P(a,b)}{P(a)P(b)}$ |
| Position Weight Matrix (PWM) | Motif probability matrix | TF binding site prediction | $P_i(b)$, where $b$ is the base at position $i$ |
| Hidden Markov Model (Profile HMM) | Motif/sequence family model | Protein domains | Transition + emission probabilities |
| Suffix Array | Sorted suffix positions | Fast substring search | $SA$ = sorted indices of suffixes |
| Suffix Tree | Tree of suffixes | Genome indexing | Nodes = substrings |
| Burrows–Wheeler Transform (BWT) | Reversible string transform | Basis of read mappers | Last column of sorted rotations |
| FM-Index | Compressed substring index | Read mapping | Supports $O(m)$ pattern matching |
| De Bruijn Graph | k-mer graph structure | Genome assembly | Nodes = $(k{-}1)$-mers, edges = k-mers |
| Eulerian Path | Traverses each edge once | Assembly from k-mers | Exists if in-degree = out-degree (up to two unbalanced nodes) |
| Hamiltonian Path | Traverses each node once | Overlap–layout assembly | NP-hard |
| Phred Score | Log-scaled error probability | Sequencing quality | $Q = -10\log_{10} p$ |
| Codon Usage Bias | Frequency of synonymous codons | Expression optimization | CAI, tAI formulas |
| Codon Adaptation Index (CAI) | Expression potential metric | Gene design | $\mathrm{CAI} = \left(\prod_i w_i\right)^{1/L}$ |
| tRNA Adaptation Index (tAI) | Translation efficiency score | Synthetic biology | Based on tRNA availability |
| GC Content | Fraction of G+C bases | Genomic stability | $\frac{G+C}{A+T+G+C}$ |
| k-mer Spectrum | Histogram of k-mer counts | Detect heterozygosity, repeats | $f(k)$ distribution |
| Sequence Entropy | Information content of a sequence | Motif conservation | $H = -\sum_b p_b \log p_b$ |
| Motif Scanning (Convolution) | PWM convolution across the genome | Regulatory site finding | $\mathrm{Score} = \sum_i \log P_i(x_i)$ |
| BLAST Scoring | Heuristic local alignment | Homology search | Seed-and-extend + substitution matrices |
| Phylogenetic Tree | Evolutionary tree model | Ancestral inference | Distance- or likelihood-based |
| Felsenstein Pruning Algorithm | Likelihood computation on a tree | Phylogenetic likelihoods | Dynamic programming over nodes |
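
A tiny sketch of the k-mer and Jaccard rows: comparing two short DNA strings by the Jaccard index of their k-mer sets. The sequences and $k$ are toy values; sketching methods like MinHash approximate exactly this quantity at scale.

```python
def kmers(s, k):
    """Return the set of all length-k substrings of s."""
    return {s[i:i + k] for i in range(len(s) - k + 1)}

def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two sets."""
    return len(a & b) / len(a | b)

s1 = "ACGTACGTGACC"
s2 = "ACGTACGAGACC"
print(jaccard(kmers(s1, 4), kmers(s2, 4)))
```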

Systems Biology, Structural Biology & Population Genetics

| Component | Definition | In Biology / Computing | Math Variables |
| --- | --- | --- | --- |
| Stoichiometric Matrix (S) | Encodes the reaction network | Flux balance analysis (FBA) | $S \cdot v = 0$ |
| Flux Balance Analysis (FBA) | Linear optimization on $S$ | Metabolic pathway prediction | $\max c^T v\ \text{s.t.}\ Sv = 0,\ l \le v \le u$ |
| Flux Variability Analysis (FVA) | Range of feasible fluxes | Robustness of metabolism | Optimizes min/max $v_i$ under FBA constraints |
| Parsimonious FBA (pFBA) | Minimizes total flux | Efficient metabolic solutions | $\min_v \sum_i \lvert v_i \rvert\ \text{s.t.}\ Sv = 0,\ v_{\min} \le v \le v_{\max}$ |
| Metabolic Control Analysis | Quantifies control coefficients | Sensitivity in pathways | $C_i^J = \frac{\partial \ln J}{\partial \ln E_i}$ |
| Michaelis–Menten | Enzyme kinetics | Reaction velocity | $v = \frac{V_{\max}[S]}{K_M + [S]}$ |
| Hill Equation | Cooperative binding | TF–DNA regulation | $\theta = \frac{[L]^n}{K_d^n + [L]^n}$ |
| Mass-Action Law | Rate ∝ reactant concentrations | Reaction network modeling | $v = k\prod_i [X_i]^{\nu_i}$ |
| Arrhenius Equation | Temperature dependence of rates | Biochemical kinetics | $k = A e^{-E_a/RT}$ |
| Eyring Equation | Transition state theory | Reaction thermodynamics | $k = \frac{k_B T}{h} e^{-\Delta G^\ddagger/RT}$ |
| Gibbs Free Energy | ΔG predicts spontaneity | Protein folding, binding | $\Delta G = \Delta H - T\Delta S$ |
| Binding Equilibrium | Ligand–receptor affinity | Protein–drug interactions | $K_d = \frac{[P][L]}{[PL]}$ |
| ΔG–Kd Relation | Thermodynamic link | Quantifying binding strength | $\Delta G = -RT \ln K_d$ |
| Force Fields | Energy functions in MD | Protein simulations | $E = E_{\mathrm{bond}} + E_{\mathrm{angle}} + E_{\mathrm{torsion}} + E_{\mathrm{vdW}} + E_{\mathrm{elec}}$ |
| Lennard–Jones Potential | van der Waals model | Molecular packing | $V(r) = 4\epsilon\big[(\sigma/r)^{12} - (\sigma/r)^6\big]$ |
| Coulomb's Law | Electrostatic interactions | Charged biomolecules | $F = \frac{kq_1 q_2}{r^2}$ |
| Ewald/PME Summation | Long-range electrostatics | Protein MD | Splits short- and long-range terms |
| Root Mean Square Deviation (RMSD) | Structure difference metric | Protein structure evaluation | $\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_i \lVert x_i - y_i \rVert^2}$ |
| TM-score | Protein structural similarity | Structure prediction accuracy | $\mathrm{TM} = \max\left(\frac{1}{L}\sum_i \frac{1}{1 + (d_i/d_0(L))^2}\right)$ |
| Contact Map | Binary residue contacts | Folding, docking models | $C_{ij} = [d_{ij} < \delta]$ |
| Ramachandran Plot | φ–ψ torsional space | Protein conformational analysis | Allowed regions of $\phi, \psi$ |
| Rotamer Library | Discrete side-chain conformations | Protein modeling | Probabilities over torsion states |
| Free Energy Perturbation (FEP) | ΔΔG between states | Binding affinity prediction | $\Delta G = -k_B T \ln\langle e^{-\Delta U/k_B T}\rangle$ |
| Thermodynamic Integration (TI) | Computes ΔG via λ interpolation | Drug design | $\Delta G = \int_0^1 \langle \partial U/\partial\lambda \rangle_\lambda\,d\lambda$ |
| MBAR | Multi-state free-energy estimator | Protein/ligand ΔΔG | Weighted combination of samples |
| Hardy–Weinberg Equilibrium | Allele frequency model | Population genetics | $p^2 + 2pq + q^2 = 1$ |
| Wright–Fisher Model | Genetic drift in finite populations | Allele frequency variance | Binomial sampling of alleles each generation |
| Moran Model | Overlapping-generation population model | Drift, fixation | One birth + one death per step |
| Coalescent Theory | Backward-time genealogy | Ancestral allele inference | Distribution of coalescent times |
| Fixation Probability | Probability an allele becomes fixed | Selection vs. drift | $P_{\mathrm{fix}} \approx \frac{1 - e^{-2s}}{1 - e^{-2Ns}}$ |
| Substitution Models | Models of nucleotide change | Phylogenetics | JC69, HKY, GTR matrices |
| Felsenstein Pruning | Likelihood on trees | Phylogenetic inference | Recursive likelihood computation |
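
A sketch of the Wright–Fisher row: genetic drift as binomial resampling of allele copies each generation. The population size, starting frequency, and horizon are arbitrary; $2N$ copies assumes a diploid population.

```python
import numpy as np

rng = np.random.default_rng(3)
N, p, generations = 100, 0.5, 200          # arbitrary toy values

for _ in range(generations):
    p = rng.binomial(2 * N, p) / (2 * N)    # resample 2N allele copies
    if p in (0.0, 1.0):                     # allele lost or fixed
        break
print("final allele frequency:", p)
```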

Evaluation, Scaling & Cryptography

(includes some quantum)

| Component | Definition | In Biology / Computing | Math Variables |
| --- | --- | --- | --- |
| Accuracy | Fraction of correct predictions | Classifier evaluation | $\frac{TP+TN}{TP+FP+FN+TN}$ |
| Precision | Correct positives / predicted positives | Gene variant calling | $\frac{TP}{TP+FP}$ |
| Recall (Sensitivity) | True positives / actual positives | Rare mutation detection | $\frac{TP}{TP+FN}$ |
| Specificity | True negatives / actual negatives | Diagnostic screening | $\frac{TN}{TN+FP}$ |
| F1 Score | Harmonic mean of precision & recall | Balancing bio classifier performance | $2 \cdot \frac{PR}{P+R}$ |
| ROC Curve / AUC | Tradeoff of sensitivity vs. specificity | Diagnostic classifiers | $AUC$ = area under the curve |
| PR Curve / AUC | Precision–recall tradeoff | Imbalanced omics data | Area under the PR curve |
| Matthews Correlation (MCC) | Balanced measure even with imbalance | DNA classification | $\frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$ |
| Brier Score | Calibrated probability error | Probabilistic predictions | $BS = \frac{1}{N}\sum_i (p_i - y_i)^2$ |
| Calibration (Temp Scaling) | Adjusts softmax confidence | Protein function prediction | $q_i = \mathrm{softmax}(z_i/T)$ |
| Conformal Prediction | Distribution-free prediction intervals | Genomic risk scores | $[L(x), U(x)]$ with coverage $1-\alpha$ |
| Aleatoric Uncertainty | Intrinsic randomness | Sequencing noise | Modeled in likelihood variance |
| Epistemic Uncertainty | Model ignorance | Limited training data | Ensembles, Bayesian NNs |
| Learning Curve | Error vs. dataset size | Scaling genomic models | $L(N) \approx aN^{-\alpha} + b$ |
| Power Law (Scaling Law) | Performance vs. compute/data | Deep learning in bio | $L(C) = kC^{-\beta} + \epsilon$ |
| Chinchilla Law | Optimal compute–data balance | Training large bio models | Optimal data (tokens) scales $\propto C^{1/2}$ |
| Wright's Law | Cost falls with production | Sequencing costs | $C(n) = C_0 n^{-\alpha}$ |
| Queueing Model (Little's Law) | Throughput relation | Bio pipeline scheduling | $L = \lambda W$ |
| Sensitivity Analysis | Effect of parameter variation | Bioprocess robustness | $S_i = \frac{\partial y}{\partial \theta_i}$ |
| Cryptographic Hash | One-way function | Genomic data integrity | $h(x)$ |
| Homomorphic Encryption (HE) | Compute on ciphertexts | Privacy-preserving genomics | $E(a+b) = E(a) \cdot E(b)$ |
| Lattice-based Crypto (RLWE) | Hard lattice problem | Secure bio models | $as + e \pmod{q}$ |
| CKKS Scheme | Approximate HE for reals | Encrypted ML inference | Supports $+$, $\times$ on ciphertexts |
| Noise Budget | Error growth in HE ops | Bio AI on encrypted data | Ciphertext validity bound |
| Quantum Operator Algebra | Linear operators on Hilbert space | Quantum chemistry models | $\hat{H}\,\vert\psi\rangle = E\,\vert\psi\rangle$ |
| Spectral Decomposition | Expansion in an eigenbasis | Quantum Hamiltonians, protein folding | $A = \sum_i \lambda_i v_i v_i^T$ |
| Tensor Product | Composite quantum states | Multi-particle biology | $\vert\psi\rangle \otimes \vert\phi\rangle = (\psi_1\phi_1,\ \psi_1\phi_2,\ \psi_2\phi_1,\ \psi_2\phi_2)^T$ for two-dimensional $\vert\psi\rangle, \vert\phi\rangle$ |
| Density Matrix | Mixed-state representation | Open-system biology | $\rho = \sum_i p_i\,\vert\psi_i\rangle\langle\psi_i\vert$ |
| von Neumann Entropy | Entropy of a quantum state | Quantum biology analogs | $S(\rho) = -\mathrm{tr}(\rho \log \rho)$ |
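
A quick sketch of the confusion-matrix rows: precision, recall, F1, and MCC computed directly from TP/FP/FN/TN counts. The counts are invented, just to show the arithmetic.

```python
import math

TP, FP, FN, TN = 80, 10, 20, 890            # made-up confusion-matrix counts

precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)
mcc = (TP * TN - FP * FN) / math.sqrt(
    (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN)
)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f} MCC={mcc:.3f}")
```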