# Mathematical Tools

Basic math LEGO bricks that I have used while building at the intersection of computers x biology. I use this as a reference.

## Calculus + Linear Algebra

| Component | Definition | In Biology / Computing | Formula |
| --- | --- | --- | --- |
| Derivative | Instantaneous rate of change | Enzyme kinetics, population growth, gradient descent | $\frac{dy}{dx},\ \nabla f(x)$ |
| Partial Derivative | Rate of change w.r.t. one variable | Multi-omics sensitivity, energy landscapes | $\frac{\partial f}{\partial x_i}$ |
| Gradient | Vector of partial derivatives | Backpropagation, optimization | $\nabla f(x) = \left[\frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n}\right]$ |
| Jacobian | Matrix of first-order derivatives | Sensitivity analysis, neural nets | $J_{ij} = \frac{\partial f_i}{\partial x_j}$ |
| Hessian | Matrix of second-order derivatives | Curvature in protein folding, optimization | $H_{ij} = \frac{\partial^2 f}{\partial x_i \partial x_j}$ |
| Taylor Expansion | Approximation around a point | Local linearization of pathways, dynamics | $f(x) \approx f(a) + f'(a)(x-a) + \tfrac{1}{2}f''(a)(x-a)^2$ |
| Chain Rule | Derivative of composite functions | Backpropagation in neural nets | $\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}$ |
| Integral | Accumulated quantity | Population size from growth rate, cumulative flux | $\int f(x)\,dx$ |
| Definite Integral | Area under curve / accumulation over an interval | Protein abundance from expression rate | $\int_a^b f(x)\,dx$ |
| Multiple Integral | Integration over multidimensional space | Partition functions, probability densities | $\int \cdots \int f(x_1,\ldots,x_n)\,dx_1 \cdots dx_n$ |
| Divergence | Outflow rate of a vector field | Flux of ions, transport phenomena | $\nabla \cdot \vec{F}$ |
| Curl | Rotation of a vector field | Electromagnetic models in biophysics | $\nabla \times \vec{F}$ |
| Laplacian | Divergence of the gradient | Diffusion models, electrophysiology | $\nabla^2 f = \nabla \cdot (\nabla f)$ |
| Linear Equation | Equation in vector/matrix form | Kinetics, statistical models | $Ax = b$ |
| Matrix Multiplication | Transformation, composition | Feature embeddings, transition systems | $C = AB$ |
| Determinant | Scalar property of a matrix | Volume scaling, invertibility | $\det(A)$ |
| Inverse Matrix | Solves linear systems | Regression, circuit models | $A^{-1}$ |
| Eigenvalue / Eigenvector | Invariant scaling directions | PCA, network stability | $Av = \lambda v$ |
| SVD | Decomposition into orthogonal bases | Dimensionality reduction (scRNA-seq); see the sketch below | $A = U \Sigma V^T$ |
| QR / LU / Cholesky | Matrix factorizations for solving | Numerical solvers for pathway models | $A = QR,\ A = LU,\ A = LL^T$ |
| Projection | Mapping onto a subspace | Embeddings, latent space representations | $P = UU^T$ |
| Inner Product | Measure of similarity | Cosine similarity, kernel methods | $\langle x, y \rangle = x^T y$ |
| Norm | Length / magnitude of a vector | Regularization, error measures | $\lVert x \rVert_p = \left(\sum_{i=1}^{n} \lvert x_i \rvert^p\right)^{1/p}$ |
| Mahalanobis Distance | Distance scaled by covariance | Anomaly detection in cell states | $d(x,y) = \sqrt{(x-y)^T \Sigma^{-1} (x-y)}$ |
| Orthogonality | Perpendicular vectors/bases | PCA axes, Fourier modes | $x^T y = 0$ |
| Rank | Dimension of column/row space | Degrees of freedom in a system | $\operatorname{rank}(A)$ |
| Trace | Sum of diagonal elements | Invariant measure in statistics | $\operatorname{tr}(A) = \sum_i A_{ii}$ |
| Condition Number | Sensitivity of a linear system | Numerical stability in simulations | $\kappa(A) = \Vert A \Vert\,\Vert A^{-1} \Vert$ |
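
To make a few of these rows concrete, here is a minimal NumPy sketch (toy data, illustrative values) tying together SVD, rank, condition number, and projection onto a top-k subspace, PCA-style:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "expression matrix": 100 samples x 20 features, built to have rank 5.
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 20))
X -= X.mean(axis=0)                               # center columns before PCA

U, s, Vt = np.linalg.svd(X, full_matrices=False)  # A = U Σ V^T

print("numerical rank:", np.linalg.matrix_rank(X))  # 5 for this construction
print("condition number of rank-5 block:", s[0] / s[4])  # σ_max / σ_min

k = 2
P = Vt[:k].T @ Vt[:k]                             # projection P = V_k V_k^T
Z = X @ Vt[:k].T                                  # PCA scores in the top-k subspace
print("scores shape:", Z.shape)
```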

## Probability, Statistics, Information Theory

| Component | Definition | In Biology / Computing | Formula |
| --- | --- | --- | --- |
| Probability Distribution | Assigns likelihood to outcomes | Gene expression variability, sequencing errors | $P(X=x)$ |
| Expectation | Average value under a distribution | Mean expression, fitness landscape averages | $\mathbb{E}[X] = \sum_x x\,P(x)$ |
| Variance | Spread around the mean | Expression noise, measurement error | $\mathrm{Var}(X) = \mathbb{E}[(X-\mu)^2]$ |
| Covariance | Joint variability of two variables | Co-expression of genes | $\mathrm{Cov}(X,Y) = \mathbb{E}[(X-\mu_X)(Y-\mu_Y)]$ |
| Correlation | Normalized covariance (−1 to 1) | Gene–gene correlation networks | $\rho_{XY} = \frac{\mathrm{Cov}(X,Y)}{\sigma_X \sigma_Y}$ |
| Law of Large Numbers | Averages converge to the expectation | Replicates reduce noise | $\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i \xrightarrow{\text{a.s.}} \mu$ as $n \to \infty$ |
| Central Limit Theorem | Normal distribution emerges from sums | Sequencing depth → Gaussian errors | $\frac{\bar{X}_n - \mu}{\sigma/\sqrt{n}} \to \mathcal{N}(0,1)$ |
| Bernoulli | Single binary outcome | Mutation present/absent | $P(X=1)=p,\ P(X=0)=1-p$ |
| Binomial | Sum of Bernoullis | Count of mutations in reads | $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$ |
| Poisson | Events in a fixed interval | RNA counts, sequencing reads | $P(X=k) = \frac{\lambda^k e^{-\lambda}}{k!}$ |
| Negative Binomial | Overdispersed counts | scRNA-seq models | $P(X=k) = \binom{k+r-1}{k} (1-p)^r p^k$ |
| Normal Distribution | Gaussian variability | Expression levels, errors | $f(x) = \frac{1}{\sqrt{2\pi\sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ |
| Log-normal | Multiplicative noise | Protein abundance distributions | $f(x) = \frac{1}{x\sigma\sqrt{2\pi}} e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}$ |
| Gamma | Waiting times, skewed distributions | Reaction times, lifetimes | $f(x;\alpha,\beta) = \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}$ |
| Beta | Distribution on [0,1] | Allele frequencies, probabilities | $f(x;\alpha,\beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{B(\alpha,\beta)}$ |
| Dirichlet | Multivariate generalization of the Beta | Topic/cell-type mixtures | $f(x;\alpha) = \frac{1}{B(\alpha)} \prod_i x_i^{\alpha_i - 1}$ |
| Multinomial | Multi-class counts | Reads across cell types | $P(X = x_1,\ldots,x_k) = \frac{n!}{\prod_i x_i!} \prod_i p_i^{x_i}$ |
| Maximum Likelihood (MLE) | Best-fit parameters from data | Fit kinetic rates, emission probs | $\hat{\theta}_{\mathrm{MLE}} = \arg\max_\theta L(\theta \mid x) = \arg\max_\theta \sum_{i=1}^{n} \log p(x_i \mid \theta)$ |
| Maximum A Posteriori (MAP) | MLE with a prior | Regularized estimates | $\hat{\theta}_{\mathrm{MAP}} = \arg\max_\theta p(\theta \mid x) = \arg\max_\theta \big[\log p(x \mid \theta) + \log p(\theta)\big]$ |
| Bayesian Inference | Updates belief with data | Model calibration in biology | $p(\theta \mid x) = \frac{p(x \mid \theta)\,p(\theta)}{p(x)},\quad p(x) = \int p(x \mid \theta)\,p(\theta)\,d\theta$ |
| Hypothesis Test | Decision on an effect | Differential expression tests | $H_0: \mu_1 = \mu_2$ |
| t-test | Mean-difference test | Differential gene expression | $t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$ |
| Chi-square test | Goodness of fit | Contingency tables in genomics | $\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$ |
| FDR / BH Procedure | Controls multiple testing | GWAS, RNA-seq DE genes; see the sketch below | $q_{(i)} = \min\!\left(\frac{p_{(i)}\,m}{i},\,1\right)$ on sorted p-values |
| Bootstrap | Resampling with replacement | CIs for small-sample experiments | $\hat{\theta}^{*} = f(X^{*})$ |
| Entropy | Uncertainty of a distribution | Cell diversity, motif randomness | $H(p) = -\sum_i p_i \log p_i$ |
| Cross-Entropy | Divergence between true & model | Loss for classifiers | $H(p,q) = -\sum_i p_i \log q_i$ |
| KL Divergence | Relative entropy | Compare distributions (healthy vs disease) | $D_{KL}(p \,\Vert\, q) = \sum_i p_i \log \frac{p_i}{q_i}$ |
| Jensen–Shannon Divergence | Symmetrized KL | Compare embeddings | $D_{JS}(p \,\Vert\, q) = \tfrac{1}{2} D_{KL}(p \,\Vert\, m) + \tfrac{1}{2} D_{KL}(q \,\Vert\, m),\ m = \tfrac{1}{2}(p+q)$ |
| Mutual Information | Shared information between variables | Regulatory network inference | $I(X;Y) = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)\,p(y)}$ |
| Perplexity | Exponential of entropy | Quality of embeddings, models | $\mathrm{Perp}(p) = 2^{H(p)}$ (entropy in bits) |
| Information Bottleneck | Tradeoff: compression vs. relevance | Latent representations of cell states | $\min\, I(X;Z) - \beta\, I(Z;Y)$ |
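
The BH procedure above is short enough to implement directly. A minimal NumPy sketch (function name mine; compare against R's `p.adjust(..., method = "BH")`):

```python
import numpy as np

def benjamini_hochberg(pvals):
    """BH-adjusted q-values: q_(i) = min(p_(i) * m / i, 1), monotonicity enforced."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)                      # sort p-values ascending
    q = p[order] * m / np.arange(1, m + 1)     # p_(i) * m / i
    q = np.minimum.accumulate(q[::-1])[::-1]   # enforce monotone non-decreasing q
    out = np.empty(m)
    out[order] = np.clip(q, 0.0, 1.0)          # undo the sort
    return out

pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205]
print(benjamini_hochberg(pvals))
```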

## Optimization and ML Primitives

| Component | Definition | In Biology / Computing | Formula |
| --- | --- | --- | --- |
| Loss Function | Scalar measure of prediction error | Training models for gene expression, structure prediction | $L(\theta) = \frac{1}{N}\sum_i \ell(f_\theta(x_i), y_i)$ |
| Mean Squared Error (MSE) | Average squared error | Regression tasks, kinetics fitting | $L = \frac{1}{n}\sum_i (y_i - \hat{y}_i)^2$ |
| Cross-Entropy Loss | Divergence between distributions | Classification, motif recognition | $L = -\sum_i y_i \log \hat{y}_i$ |
| Hinge Loss | Margin-based loss | Support vector machines, binary classification | $L = \max(0,\ 1 - y f(x))$ |
| Regularization | Penalty term to avoid overfitting | Controls model complexity | $L' = L + \lambda \lVert w \rVert_p$ |
| L1 Regularization | Promotes sparsity | Feature selection in omics | $\min_w L(w) + \lambda \lVert w \rVert_1,\quad \lVert w \rVert_1 = \sum_{i=1}^{d} \lvert w_i \rvert$ |
| L2 Regularization | Penalizes large weights | Ridge regression, weight decay | $\lVert w \rVert_2^2 = \sum_i w_i^2$ |
| Gradient Descent | Iterative optimization step | Neural nets, ODE parameter fitting; see the sketch below | $\theta_{t+1} = \theta_t - \eta \nabla_\theta L$ |
| Stochastic Gradient Descent (SGD) | Minibatch updates | Large-scale models on bio data | $\theta_{t+1} = \theta_t - \eta \nabla_\theta L(\theta; x_i)$ |
| Momentum | Uses past updates to accelerate | Faster convergence in training | $v_{t+1} = \beta v_t + \nabla_\theta L,\ \theta_{t+1} = \theta_t - \eta v_{t+1}$ |
| Adam Optimizer | Adaptive moment estimation | Standard in DL training | $m_t = \beta_1 m_{t-1} + (1-\beta_1) g_t,\ v_t = \beta_2 v_{t-1} + (1-\beta_2) g_t^2$ |
| L-BFGS | Quasi-Newton optimization | Energy minimization in proteins | Updates use an inverse-Hessian approximation |
| Conjugate Gradient | Iterative quadratic solver | Sparse system solvers in genomics | $x_{k+1} = x_k + \alpha_k p_k$ |
| Coordinate Descent | Optimizes one variable at a time | LASSO, constrained models | $x_i^{(t+1)} = \arg\min_{x_i} f(x_1,\dots,x_i,\dots)$ |
| Lagrange Multipliers | Optimization with constraints | Flux balance analysis | $\mathcal{L}(x,\lambda) = f(x) + \lambda g(x)$ |
| KKT Conditions | Optimality for constrained optimization | Biochemical flux solutions | Stationarity, primal/dual feasibility, complementary slackness |
| Proximal Operator | Handles non-smooth penalties | Sparse regression, TV denoising | $\operatorname{prox}_{\lambda f}(v) = \arg\min_x \big(f(x) + \tfrac{1}{2\lambda}\lVert x - v \rVert^2\big)$ |
| Expectation–Maximization (EM) | Iterative latent-variable inference | Mixture models, cell deconvolution | E-step: compute $Q(\theta)$; M-step: maximize $Q$ |
| k-Means | Clustering by minimizing within-cluster variance | Cell type clustering | $\arg\min \sum_i \lVert x_i - \mu_{c(i)} \rVert^2$ |
| Gaussian Mixture Model (GMM) | Soft clustering with Gaussians | Expression distributions | $p(x) = \sum_{k=1}^{K} \pi_k\,\mathcal{N}(x \mid \mu_k, \Sigma_k)$; responsibilities $\gamma_{ik} = \frac{\pi_k\,\mathcal{N}(x_i \mid \mu_k, \Sigma_k)}{\sum_{j=1}^{K} \pi_j\,\mathcal{N}(x_i \mid \mu_j, \Sigma_j)}$ |
| Softmax | Converts scores to probabilities | Classification, attention weights | $\operatorname{softmax}(z_i) = \frac{e^{z_i}}{\sum_j e^{z_j}}$ |
| ReLU | Nonlinear activation | Neural networks | $f(x) = \max(0, x)$ |
| Sigmoid | Squashes to (0,1) | Logistic regression, gating | $\sigma(x) = \frac{1}{1+e^{-x}}$ |
| Tanh | Squashes to (−1,1) | Normalized activations | $\tanh(x) = \frac{e^x - e^{-x}}{e^x + e^{-x}}$ |
| Attention Mechanism | Weighted combination of inputs | Protein sequence models, embeddings | $\operatorname{Attn}(Q,K,V) = \operatorname{softmax}\!\left(\frac{QK^T}{\sqrt{d_k}}\right)V$ |
| Contrastive Loss | Pulls similar pairs together | Aligning cell embeddings, protein–ligand | $L = -\log \frac{e^{\operatorname{sim}(x_i,x_j)/\tau}}{\sum_k e^{\operatorname{sim}(x_i,x_k)/\tau}}$ |
| VAE ELBO | Lower bound for latent-variable models | Generative scRNA-seq | $\mathcal{L}_{\mathrm{ELBO}}(x) = \mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right] - D_{\mathrm{KL}}\!\big(q_\phi(z \mid x) \,\Vert\, p(z)\big)$ |
| Diffusion SDE | Forward corruption process | Generative protein design | $dx = f(x,t)\,dt + g(t)\,dW_t$ |
| Graph Laplacian | Encodes connectivity | Protein–protein interaction networks | $L = D - A$ |
| Message Passing | Node embedding updates | GNNs for molecules | $h_v^{(t+1)} = \sigma\!\big(W \cdot \operatorname{AGG}\{h_u^{(t)} : u \in N(v)\}\big)$ |
| Equivariance (SE(3)) | Preserves geometric symmetries | Protein structure prediction | $f(Rx) = R f(x)$ |
| Clustering Modularity | Graph-based community detection | Gene co-expression networks | $Q = \frac{1}{2m}\sum_{ij}\Big(A_{ij} - \frac{k_i k_j}{2m}\Big)\delta(c_i, c_j)$ |
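
Several of these primitives compose in a few lines. A minimal sketch of batch gradient descent on an L2-regularized least-squares loss (synthetic data; learning rate and penalty are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))                  # design matrix
w_true = rng.normal(size=10)
y = X @ w_true + 0.1 * rng.normal(size=200)     # noisy linear observations

w, eta, lam = np.zeros(10), 0.05, 0.01
for _ in range(500):
    # gradient of (1/n)·MSE + λ‖w‖² with respect to w
    grad = (2 / len(y)) * X.T @ (X @ w - y) + 2 * lam * w
    w -= eta * grad                             # θ ← θ − η ∇L
print("recovery error:", np.linalg.norm(w - w_true))
```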

## Dynamics (ODEs, PDEs, Stochastic Processes, Control/RL)

| Component | Definition | In Biology / Computing | Formula |
| --- | --- | --- | --- |
| Ordinary Differential Equation (ODE) | Time evolution of variables | Gene circuits, pharmacokinetics | $\frac{dx}{dt} = f(x,t)$ |
| System of ODEs | Multiple interacting variables | Pathways, population dynamics | $\frac{d\mathbf{x}}{dt} = F(\mathbf{x},t)$ |
| Partial Differential Equation (PDE) | Evolution over space + time | Diffusion of molecules, electrophysiology | $\frac{\partial u}{\partial t} = D \nabla^2 u$ |
| Heat Equation | Diffusion PDE | Ion transport, morphogen gradients | $\frac{\partial u}{\partial t} = \alpha \nabla^2 u$ |
| Wave Equation | Propagation dynamics | Nerve impulses, biomechanics | $\frac{\partial^2 u}{\partial t^2} = c^2 \nabla^2 u$ |
| Poisson Equation | Potential field equation | Electrostatics in proteins | $-\nabla^2 u = f$ |
| Lotka–Volterra | Predator–prey dynamics | Species competition, host–virus | $\frac{dx}{dt} = \alpha x - \beta xy,\ \frac{dy}{dt} = \delta xy - \gamma y$ |
| Michaelis–Menten Kinetics | Enzyme rate law | Metabolic modeling | $v = \frac{V_{\max}[S]}{K_M + [S]}$ |
| Hill Equation | Cooperative binding | Gene regulation, transcription factors | $\theta = \frac{[L]^n}{K_d^n + [L]^n}$ |
| Mass-Action Kinetics | Reaction rate ∝ concentrations | Systems biology, stoichiometric models | $v = k \prod_i [X_i]^{\nu_i}$ |
| Gillespie Algorithm | Stochastic simulation of reactions | Single-cell variability | Generates trajectories via random exponential waiting times (see the sketch below) |
| Markov Chain | Memoryless state transitions | Mutation models, sequence evolution | $\Pr(X_{t+1}=j \mid X_t=i) = P_{ij},\quad \pi_{t+1}^{\top} = \pi_t^{\top} P$ |
| Hidden Markov Model (HMM) | Latent-state probabilistic model, joint likelihood | Gene finding, protein domains | $p(x_{1:T}, z_{1:T}) = p(z_1) \prod_{t=2}^{T} p(z_t \mid z_{t-1}) \prod_{t=1}^{T} p(x_t \mid z_t)$ |
| Stochastic Differential Equation (SDE) | Dynamics with noise | Noisy gene expression, molecular motion | $dx = f(x,t)\,dt + g(x,t)\,dW_t$ |
| Ornstein–Uhlenbeck Process | Mean-reverting SDE | Noise in biophysical systems | $dx = \theta(\mu - x)\,dt + \sigma\,dW_t$ |
| Random Walk | Successive random steps | Diffusion models, genome scans | $X_{t+1} = X_t + \epsilon_t$ |
| Brownian Motion | Continuous random process | Molecular dynamics | $W(t) \sim \mathcal{N}(0, t)$ |
| Control System | Regulates the state of a system | Bioreactor control, feedback | $\dot{x} = Ax + Bu,\ y = Cx$ |
| PID Controller | Proportional–integral–derivative control | Lab robotics, process stabilization | $u(t) = K_p e(t) + K_i \int e(\tau)\,d\tau + K_d \frac{de}{dt}$ |
| Model Predictive Control (MPC) | Optimization-based control | Dynamic bioprocess optimization | $\min_u \sum_{k=0}^{N} \lVert x_k - x^{*} \rVert_Q^2 + \lVert u_k \rVert_R^2$ |
| Markov Decision Process (MDP) | Sequential decision framework | Adaptive experiment design | $\langle S, A, P, R, \gamma \rangle$ |
| Bellman Equation | Recursive value definition | RL-based model design | $V^{*}(s) = \max_a \big[r(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\,V^{*}(s')\big]$ |
| Policy Gradient | Optimizes expected reward | Reinforcement learning in biology | $\nabla_\theta J(\theta) = \mathbb{E}_{\pi_\theta}\!\big[\sum_t \nabla_\theta \log \pi_\theta(a_t \mid s_t)\,A_t\big]$ |
| Advantage Estimation (GAE) | Variance-reduced estimator | Efficient policy training | $A_t = \sum_{l=0}^{\infty} (\gamma\lambda)^l \delta_{t+l}$ |
| PPO (Proximal Policy Optimization) | RL with a clipped objective | Safe training of bio RL models | $L^{\mathrm{CLIP}} = \mathbb{E}\big[\min\big(r_t(\theta) A_t,\ \operatorname{clip}(r_t(\theta), 1-\epsilon, 1+\epsilon)\,A_t\big)\big]$ |
| SAC (Soft Actor-Critic) | RL with entropy maximization | Exploration in protein design | $J(\pi) = \sum_t \mathbb{E}_{(s_t,a_t)\sim\pi}\big[r(s_t,a_t) + \alpha\,\mathcal{H}(\pi(\cdot \mid s_t))\big],\quad \mathcal{H}(\pi(\cdot \mid s)) = -\int \pi(a \mid s)\log\pi(a \mid s)\,da$ |
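
The Gillespie row above compresses a whole algorithm into one sentence. A minimal sketch for a birth–death process (production at rate `k_on`, degradation at rate `k_off` per molecule; parameter values illustrative):

```python
import numpy as np

def gillespie_birth_death(k_on=5.0, k_off=0.1, x0=0, t_max=200.0, seed=0):
    """Exact stochastic simulation of  ∅ --k_on--> X,  X --k_off·x--> ∅."""
    rng = np.random.default_rng(seed)
    t, x = 0.0, x0
    ts, xs = [t], [x]
    while t < t_max:
        rates = np.array([k_on, k_off * x])       # reaction propensities
        total = rates.sum()
        if total == 0:
            break
        t += rng.exponential(1.0 / total)         # waiting time ~ Exp(total rate)
        x += 1 if rng.random() < rates[0] / total else -1   # pick a reaction
        ts.append(t); xs.append(x)
    return np.array(ts), np.array(xs)

ts, xs = gillespie_birth_death()
print("final copy number:", xs[-1], "(stationary mean is k_on/k_off = 50)")
```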

## Signal Processing, Numerical Methods & HPC

| Component | Definition | In Biology / Computing | Formula |
| --- | --- | --- | --- |
| Convolution | Weighted overlap of functions | Motif scanning, microscopy filtering | $(f*g)(t) = \int f(\tau)\,g(t-\tau)\,d\tau$ |
| Correlation | Similarity via shifted overlap | Template matching in sequences | $(f \star g)(t) = \int f(\tau)\,g(t+\tau)\,d\tau$ |
| Fourier Transform | Decomposes into frequencies | MRI k-space, EEG | $\hat{f}(\omega) = \int f(t)\,e^{-i\omega t}\,dt$ |
| Discrete Fourier Transform (DFT) | Finite-sample version | Sequence periodicity detection | $X_k = \sum_{n=0}^{N-1} x_n e^{-i 2\pi k n / N}$ |
| Fast Fourier Transform (FFT) | Efficient DFT algorithm | Bio-signal analysis at scale | $O(N \log N)$ complexity |
| Wavelet Transform | Time–frequency decomposition | Microscopy image denoising | $W(a,b) = \int f(t)\,\psi\!\left(\frac{t-b}{a}\right)dt$ |
| Radon Transform | Line integrals of a function | CT reconstruction | $Rf(\theta,s) = \iint f(x,y)\,\delta(s - x\cos\theta - y\sin\theta)\,dx\,dy$ |
| Filter (Low/High-pass) | Signal smoothing or sharpening | Noise reduction in time series | Frequency cutoff on $H(\omega)$ |
| Wiener Filter | Linear MMSE estimator | Signal denoising | $H(\omega) = \frac{S_{xx}(\omega)}{S_{xx}(\omega) + S_{nn}(\omega)}$ |
| Kalman Filter | Recursive state estimator | Tracking cell motion; see the sketch below | Predict: $\hat{x}_{k\mid k-1} = A\hat{x}_{k-1\mid k-1} + Bu_k$, $P_{k\mid k-1} = AP_{k-1\mid k-1}A^{\top} + Q$. Update: $K_k = P_{k\mid k-1}H^{\top}(HP_{k\mid k-1}H^{\top} + R)^{-1}$, $\hat{x}_{k\mid k} = \hat{x}_{k\mid k-1} + K_k(y_k - H\hat{x}_{k\mid k-1})$, $P_{k\mid k} = (I - K_k H)P_{k\mid k-1}$ |
| Particle Filter | Sequential Monte Carlo | Nonlinear/noisy tracking | Approximates the posterior with weighted particles |
| Total Variation (TV) Denoising | Penalizes gradient magnitude | Microscopy deblurring | $\min_x \lVert x - y \rVert^2 + \lambda \lVert \nabla x \rVert_1$ |
| Compressed Sensing | Recovery from undersampling | Accelerated MRI | $\min_x \lVert x \rVert_1 \ \text{s.t.}\ Ax = b$ |
| Finite Difference | Approximates derivatives by discretization | PDE solvers | $\frac{\partial u}{\partial x} \approx \frac{u(x+h) - u(x)}{h}$ |
| Finite Element Method (FEM) | Domain discretization into elements | Biomechanics, electrophysiology | Weak form: $\int_\Omega \nabla u \cdot \nabla v\,dx$ |
| Finite Volume Method | Conserves fluxes per cell | Transport models in tissues | $\frac{d}{dt}\int_\Omega u\,dx + \int_{\partial\Omega} F \cdot n\,ds = 0$ |
| Monte Carlo Integration | Random sampling for integrals | Partition functions, uncertainty | $I \approx \frac{1}{N}\sum_i f(x_i)$ |
| Importance Sampling | Weighted MC estimates | Rare-event modeling | $I \approx \frac{1}{N}\sum_i \frac{f(x_i)}{q(x_i)},\ x_i \sim q$ |
| Quasi-Monte Carlo | Low-discrepancy sequences | Faster convergence for high-dim integrals | Uses Sobol / Halton sequences |
| Automatic Differentiation | Programmatic derivatives | Training ML models | Forward & reverse mode $\frac{dy}{dx}$ |
| Backpropagation | Reverse-mode autodiff | Neural nets in bio | $\frac{\partial L}{\partial w}$ via the chain rule |
| Roofline Model | Performance vs. arithmetic intensity | Kernel optimization | FLOPs/byte tradeoff |
| Amdahl's Law | Parallel speedup bound | Multi-core scaling limits | $S = \frac{1}{(1-p) + p/N}$ |
| Gustafson's Law | Scaling efficiency with workload | HPC bio pipelines | $S = N - (N-1)(1-p)$ |
| Memory Bandwidth Limit | Bytes/sec bottleneck | GPU genomics kernels | Throughput = min(compute, memory BW) |
| SIMD / GPU Parallelism | Single instruction, multiple data | k-mer counting, alignment | Vector ops per cycle |
| Sparse Matrix Ops | Efficient storage/computation | Genome graphs, scRNA matrices | Formats: CSR, COO |
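
As a worked example of the Kalman row: in the scalar random-walk case (A = H = 1, no control input) the predict/update cycle collapses to four lines. A minimal sketch with illustrative noise parameters:

```python
import numpy as np

def kalman_1d(ys, q=1e-3, r=0.5):
    """Scalar random-walk Kalman filter: predict, then measurement update."""
    x, p = 0.0, 1.0
    out = []
    for y in ys:
        p = p + q                      # predict:  P = A P Aᵀ + Q   (A = 1)
        k = p / (p + r)                # gain:     K = P Hᵀ (H P Hᵀ + R)⁻¹
        x = x + k * (y - x)            # update:   x = x + K (y − H x)
        p = (1 - k) * p                # P = (I − K H) P
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(2)
truth = np.cumsum(0.05 * rng.normal(size=200))   # slowly drifting signal
ys = truth + 0.7 * rng.normal(size=200)          # noisy observations
xs = kalman_1d(ys)
print("raw MSE:", np.mean((ys - truth) ** 2),
      "filtered MSE:", np.mean((xs - truth) ** 2))
```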

## Bioinformatics and Sequence Mathematics

| Component | Definition | In Biology / Computing | Formula |
| --- | --- | --- | --- |
| k-mer | Substring of length $k$ | Genome comparison, sequence hashing | $s[i:i+k]$ |
| Jaccard Index | Set similarity measure | Genome sketching, assembly comparison | $J(A,B) = \frac{\lvert A \cap B \rvert}{\lvert A \cup B \rvert}$ |
| MinHash | Fast Jaccard approximation | Large-scale sequence similarity | Randomized hashing of k-mers |
| Count-Min Sketch | Probabilistic frequency table | Streaming k-mer counts | $\tilde{f}(i) = \min_j C[j, h_j(i)]$ |
| Hamming Distance | Number of differing positions | DNA barcode error correction | $d_H(x,y) = \sum_i [x_i \ne y_i]$ |
| Levenshtein Distance | Edit distance (insert/delete/substitute) | Sequence alignment | Minimum edits to transform $x \to y$ |
| Smith–Waterman | Local alignment DP | Short sequence homology; see the sketch below | $F_{i,j} = \max\{0,\ F_{i-1,j-1}+s,\ F_{i-1,j}-d,\ F_{i,j-1}-d\}$ |
| Needleman–Wunsch | Global alignment DP | Genome alignment | $F_{i,j} = \max\{F_{i-1,j-1}+s,\ F_{i-1,j}-d,\ F_{i,j-1}-d\}$ |
| Gotoh Algorithm | Alignment with affine gaps | Realistic indel scoring | Gap cost $g + ke$ |
| Substitution Matrix | Scores amino-acid swaps | BLOSUM, PAM | $S(a,b) = \log \frac{P(a,b)}{P(a)P(b)}$ |
| Position Weight Matrix (PWM) | Motif probability matrix | TF binding site prediction | $P_i(b)$, where $b$ is a base |
| Hidden Markov Model (Profile HMM) | Motif/sequence family model | Protein domains | Transition + emission probabilities |
| Suffix Array | Sorted suffix positions | Fast substring search | $SA$ = sorted indices of suffixes |
| Suffix Tree | Tree of suffixes | Genome indexing | Nodes = substrings |
| Burrows–Wheeler Transform (BWT) | Reversible string transform | Basis of read mappers | Last column of sorted rotations |
| FM-Index | Compressed substring index | Read mapping | Supports $O(m)$ pattern matching |
| De Bruijn Graph | k-mer graph structure | Genome assembly | Nodes = (k−1)-mers, edges = k-mers |
| Eulerian Path | Traverses each edge once | Assembly from k-mers | Exists when in-degree = out-degree at every node (connected graph) |
| Hamiltonian Path | Traverses each node once | Overlap–layout assembly | NP-hard |
| Phred Score | Log-scaled error probability | Sequencing quality | $Q = -10 \log_{10} p$ |
| Codon Usage Bias | Frequency of synonymous codons | Expression optimization | CAI, tAI formulas |
| Codon Adaptation Index (CAI) | Expression potential metric | Gene design | $\mathrm{CAI} = \left(\prod_i w_i\right)^{1/L}$ |
| tRNA Adaptation Index (tAI) | Translation efficiency score | Synthetic biology | Based on tRNA availability |
| GC Content | Fraction of G+C bases | Genomic stability | $\frac{G+C}{A+T+G+C}$ |
| k-mer Spectrum | Histogram of k-mer counts | Detect heterozygosity, repeats | $f(k)$ distribution |
| Sequence Entropy | Information content of a sequence | Motif conservation | $H = -\sum_b p_b \log p_b$ |
| Motif Scanning (Convolution) | PWM convolution across the genome | Regulatory site finding | $\mathrm{Score} = \sum_i \log P_i(x_i)$ |
| BLAST Scoring | Heuristic local alignment | Homology search | Seed-and-extend + substitution matrices |
| Phylogenetic Tree | Evolutionary tree model | Ancestral inference | Distance- or likelihood-based |
| Felsenstein Pruning Algorithm | Likelihood computation on a tree | Phylogenetic likelihoods | Dynamic programming over nodes |
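
The Smith–Waterman recurrence above translates directly into code. A minimal sketch returning only the best local score (linear gap penalty; scoring values are illustrative, not BLOSUM):

```python
import numpy as np

def smith_waterman(a, b, match=2, mismatch=-1, gap=2):
    """Local alignment score via F_ij = max{0, diag + s, up − d, left − d}."""
    F = np.zeros((len(a) + 1, len(b) + 1))
    best = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            F[i, j] = max(0,
                          F[i - 1, j - 1] + s,   # align a[i-1] with b[j-1]
                          F[i - 1, j] - gap,     # gap in b
                          F[i, j - 1] - gap)     # gap in a
            best = max(best, F[i, j])
    return best

print(smith_waterman("ACACACTA", "AGCACACA"))
```

Needleman–Wunsch is the same loop without the 0 floor (plus gap-initialized first row/column), which is why the two rows above look nearly identical.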

## Systems Biology, Structural Biology & Population Genetics

| Component | Definition | In Biology / Computing | Formula |
| --- | --- | --- | --- |
| Stoichiometric Matrix (S) | Encodes the reaction network | Flux balance analysis (FBA) | $S \cdot v = 0$ |
| Flux Balance Analysis (FBA) | Linear optimization on $S$ | Metabolic pathway prediction | $\max c^T v \ \text{s.t.}\ Sv = 0,\ l \le v \le u$ |
| Flux Variability Analysis (FVA) | Range of feasible fluxes | Robustness of metabolism | Min/max of each $v_i$ under FBA constraints |
| Parsimonious FBA (pFBA) | Minimizes total flux | Efficient metabolic solutions | $\min_v \sum_i \lvert v_i \rvert \ \text{s.t.}\ Sv = 0,\ v_{\min} \le v \le v_{\max}$ |
| Metabolic Control Analysis | Quantifies control coefficients | Sensitivity in pathways | $C^J_i = \frac{\partial \ln J}{\partial \ln E_i}$ |
| Michaelis–Menten | Enzyme kinetics | Reaction velocity | $v = \frac{V_{\max}[S]}{K_M + [S]}$ |
| Hill Equation | Cooperative binding | TF–DNA regulation | $\theta = \frac{[L]^n}{K_d^n + [L]^n}$ |
| Mass-Action Law | Rate ∝ reactant concentrations | Reaction network modeling | $v = k \prod_i [X_i]^{\nu_i}$ |
| Arrhenius Equation | Temperature dependence of rates | Biochemical kinetics | $k = A e^{-E_a/RT}$ |
| Eyring Equation | Transition-state theory | Reaction thermodynamics | $k = \frac{k_B T}{h} e^{-\Delta G^{\ddagger}/RT}$ |
| Gibbs Free Energy | ΔG predicts spontaneity | Protein folding, binding | $\Delta G = \Delta H - T\Delta S$ |
| Binding Equilibrium | Ligand–receptor affinity | Protein–drug interactions | $K_d = \frac{[P][L]}{[PL]}$ |
| ΔG–Kd Relation | Thermodynamic link | Quantifying binding strength | $\Delta G = -RT \ln K_{\mathrm{eq}}$, with $K_{\mathrm{eq}} = 1/K_d$ for binding |
| Force Fields | Energy functions in MD | Protein simulations | $E = E_{\text{bond}} + E_{\text{angle}} + E_{\text{torsion}} + E_{\text{vdW}} + E_{\text{elec}}$ |
| Lennard–Jones Potential | van der Waals model | Molecular packing | $V(r) = 4\epsilon\big[(\sigma/r)^{12} - (\sigma/r)^6\big]$ |
| Coulomb's Law | Electrostatic interactions | Charged biomolecules | $F = \frac{k q_1 q_2}{r^2}$ |
| Ewald/PME Summation | Long-range electrostatics | Protein MD | Splits short-/long-range terms |
| Root Mean Square Deviation (RMSD) | Structure difference metric | Protein structure evaluation | $\mathrm{RMSD} = \sqrt{\frac{1}{N}\sum_i \lVert x_i - y_i \rVert^2}$ |
| TM-score | Protein structural similarity | Structure prediction accuracy | $\mathrm{TM} = \max\left(\frac{1}{L}\sum_i \frac{1}{1 + (d_i/d_0(L))^2}\right)$ |
| Contact Map | Binary residue contacts | Folding, docking models | $C_{ij} = [d_{ij} < \delta]$ |
| Ramachandran Plot | φ–ψ torsional space | Protein conformational analysis | Allowed regions of $\phi, \psi$ |
| Rotamer Library | Discrete side-chain conformations | Protein modeling | Probabilities over torsion states |
| Free Energy Perturbation (FEP) | ΔΔG between states | Binding affinity prediction | $\Delta G = -k_B T \ln\langle e^{-\Delta U/k_B T}\rangle$ |
| Thermodynamic Integration (TI) | Computes ΔG via λ interpolation | Drug design | $\Delta G = \int_0^1 \langle \partial U/\partial\lambda \rangle_\lambda\,d\lambda$ |
| MBAR | Multi-state free-energy estimator | Protein/ligand ΔΔG | Weighted combination of samples |
| Hardy–Weinberg Equilibrium | Allele frequency model | Population genetics | $p^2 + 2pq + q^2 = 1$ |
| Wright–Fisher Model | Genetic drift in finite populations | Allele frequency variance; see the sketch below | Binomial sampling of alleles each generation |
| Moran Model | Overlapping-generations population model | Drift, fixation | One birth + one death per step |
| Coalescent Theory | Backward-time genealogy | Ancestral allele inference | Distribution of coalescence times |
| Fixation Probability | Probability an allele becomes fixed | Selection vs drift | $P_{\text{fix}} \approx \frac{1 - e^{-2s}}{1 - e^{-2Ns}}$ |
| Substitution Models | Model nucleotide changes | Phylogenetics | JC69, HKY, GTR matrices |
| Felsenstein Pruning | Likelihood on trees | Phylogenetic inference | Recursive likelihood computation |
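
A minimal sketch of Wright–Fisher drift with selection, starting each replicate from a single mutant copy. The printed comparison uses the diploid diffusion approximation $(1 - e^{-2s})/(1 - e^{-4Ns})$, the 2N-copy analogue of the fixation-probability row above; all parameter values are illustrative:

```python
import numpy as np

def wright_fisher(N=500, s=0.02, gens=2000, reps=2000, seed=3):
    """Wright-Fisher drift: binomial resampling of 2N allele copies per generation."""
    rng = np.random.default_rng(seed)
    p = np.full(reps, 1.0 / (2 * N))              # one mutant copy per replicate
    for _ in range(gens):
        w = p * (1 + s) / (p * (1 + s) + (1 - p))  # selection shifts sampling prob
        p = rng.binomial(2 * N, w) / (2 * N)       # drift: binomial resampling
    return p

p = wright_fisher()
approx = (1 - np.exp(-2 * 0.02)) / (1 - np.exp(-4 * 500 * 0.02))
print("simulated P_fix:", (p == 1.0).mean(), "vs diffusion approx:", round(approx, 4))
```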

## Evaluation, Scaling & Cryptography

(includes some quantum)

| Component | Definition | In Biology / Computing | Formula |
| --- | --- | --- | --- |
| Accuracy | Fraction of correct predictions | Classifier evaluation | $\frac{TP+TN}{TP+FP+FN+TN}$ |
| Precision | Correct positives / predicted positives | Gene variant calling | $\frac{TP}{TP+FP}$ |
| Recall (Sensitivity) | True positives / actual positives | Rare mutation detection | $\frac{TP}{TP+FN}$ |
| Specificity | True negatives / actual negatives | Diagnostic screening | $\frac{TN}{TN+FP}$ |
| F1 Score | Harmonic mean of precision & recall | Balancing bio classifier performance | $2 \cdot \frac{PR}{P+R}$ |
| ROC Curve / AUC | Sensitivity–specificity tradeoff | Diagnostic classifiers | AUC = area under the ROC curve |
| PR Curve / AUC | Precision–recall tradeoff | Imbalanced omics data | Area under the PR curve |
| Matthews Correlation (MCC) | Balanced measure even under class imbalance | DNA classification; see the sketch below | $\frac{TP \cdot TN - FP \cdot FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$ |
| Brier Score | Calibrated probability error | Probabilistic predictions | $BS = \frac{1}{N}\sum_i (p_i - y_i)^2$ |
| Calibration (Temp Scaling) | Adjusts softmax confidence | Protein function prediction | $q_i = \operatorname{softmax}(z_i / T)$ |
| Conformal Prediction | Distribution-free prediction intervals | Genomic risk scores | $[L(x), U(x)]$ with coverage $1 - \alpha$ |
| Aleatoric Uncertainty | Intrinsic randomness | Sequencing noise | Modeled in likelihood variance |
| Epistemic Uncertainty | Model ignorance | Limited training data | Ensembles, Bayesian NNs |
| Learning Curve | Error vs. dataset size | Scaling genomic models | $L(N) \approx a N^{-\alpha} + b$ |
| Power Law (Scaling Law) | Performance vs. compute/data | Deep learning in bio | $L(C) = k C^{-\beta} + \epsilon$ |
| Chinchilla Law | Optimal compute–data balance | Training large bio models | Optimal tokens $D^{*} \propto C^{1/2}$ (and parameters $N^{*} \propto C^{1/2}$) |
| Wright's Law | Cost falls with cumulative production | Sequencing costs | $C(n) = C_0 n^{-\alpha}$ |
| Queueing Model (Little's Law) | Throughput relation | Bio pipeline scheduling | $L = \lambda W$ |
| Sensitivity Analysis | Effect of parameter variation | Bioprocess robustness | $S_i = \frac{\partial y}{\partial \theta_i}$ |
| Cryptographic Hash | One-way function | Genomic data integrity | $h(x)$ |
| Homomorphic Encryption (HE) | Computation on ciphertexts | Privacy-preserving genomics | $E(a+b) = E(a) \cdot E(b)$ |
| Lattice-based Crypto (RLWE) | Hard lattice problem | Secure bio models | $b = a s + e \pmod{q}$ |
| CKKS Scheme | Approximate HE for reals | Encrypted ML inference | Supports $+$, $\times$ on ciphertexts |
| Noise Budget | Error growth in HE operations | Bio AI on encrypted data | Ciphertext validity bound |
| Quantum Operator Algebra | Linear operators on Hilbert space | Quantum chemistry models | $\hat{H}\,\lvert\psi\rangle = E\,\lvert\psi\rangle$ |
| Spectral Decomposition | Expansion in an eigenbasis | Quantum Hamiltonians, protein folding | $A = \sum_i \lambda_i v_i v_i^T$ |
| Tensor Product | Composite quantum states | Multi-particle biology | $\lvert\psi\rangle \otimes \lvert\phi\rangle$, with components $(\psi_1\phi_1,\ \psi_1\phi_2,\ \psi_2\phi_1,\ \psi_2\phi_2)^T$ for two qubits |
| Density Matrix | Mixed-state representation | Open-system biology | $\rho = \sum_i p_i\,\lvert\psi_i\rangle\langle\psi_i\rvert$ |
| von Neumann Entropy | Entropy of a quantum state | Quantum biology analogs | $S(\rho) = -\operatorname{tr}(\rho \log \rho)$ |
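
Finally, the confusion-matrix metrics above fit in one small function. A minimal sketch for binary labels (toy inputs; for real work I reach for sklearn, but the formulas are this short):

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Precision, recall, F1, and MCC straight from confusion-matrix counts."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_pred == 1) & (y_true == 1))
    tn = np.sum((y_pred == 0) & (y_true == 0))
    fp = np.sum((y_pred == 1) & (y_true == 0))
    fn = np.sum((y_pred == 0) & (y_true == 1))
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    mcc = (tp * tn - fp * fn) / np.sqrt(
        float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)))
    return dict(precision=precision, recall=recall, f1=f1, mcc=mcc)

print(classification_metrics([1, 0, 1, 1, 0, 0, 1, 0],
                             [1, 0, 0, 1, 0, 1, 1, 0]))
```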