# Spike trains and point-process GLMs > **Goal of this page.** Explain spike trains as *point processes*, the > *conditional intensity function* (CIF) that defines them, and how nSTAT > fits CIFs with *point-process generalized linear models* (GLMs). > > **Glossary jumps:** [point process](glossary.md#point-process) · > [CIF](glossary.md#conditional-intensity-function) · > [GLM](glossary.md#generalized-linear-model) · > [link function](glossary.md#link-function) · > [history / refractory](glossary.md#history-term-refractory-period) · > [AIC / BIC](glossary.md#aic-bic) · > [SSGLM](glossary.md#state-space-glm) ## Spike trains as point processes A **point process** is a probabilistic model for a series of events in time — here, spike times. Instead of predicting a continuous waveform, it predicts *when* discrete events happen. The complete statistical description of a point process is its **conditional intensity function (CIF)**, written $\lambda(t \mid H_t)$: > $\lambda(t \mid H_t) \cdot \Delta$ ≈ the probability of a spike in the small interval > $[t, t+\Delta)$, given the history $H_t$ of everything observed up to time $t$. Intuitively, $\lambda(t \mid H_t)$ is an *instantaneous firing rate* (in spikes/s) that is allowed to depend on the past. The dependence on history $H_t$ is what makes the CIF powerful: it captures refractoriness, bursting, and network coupling that a plain rate $\lambda(t)$ cannot. See [Daley & Vere-Jones (2003)](https://doi.org/10.1007/b97277) for the theory and [Truccolo et al. (2005)](https://pubmed.ncbi.nlm.nih.gov/15356183/) for the neuroscience framing. ![A stimulus-driven conditional intensity function and the spike raster it generates across repeated trials](figures/cif_raster.png) *Top: a conditional intensity function $\lambda(t)$ driven by a stimulus. Bottom: spikes simulated from it on 25 repeated trials — different every time, but denser wherever $\lambda(t)$ is high. The point process is the bridge between the two.* Special cases worth naming: - **Homogeneous Poisson:** $\lambda$ constant. Spikes are "completely random" at a fixed rate; no history dependence. - **Inhomogeneous Poisson:** $\lambda(t)$ varies with time (e.g. driven by a stimulus) but still ignores history. - **History-dependent (renewal and beyond):** $\lambda(t \mid H_t)$ depends on the time since the last spike(s) — the regime real neurons live in. ## The point-process GLM How do we *estimate* $\lambda(t \mid H_t)$ from data? nSTAT uses a **generalized linear model**: model the **log** of the CIF as a linear combination of covariates, $$ \begin{aligned} \log \lambda(t \mid H_t) \;=\; \beta_0 &\;+\; \underbrace{\sum_{k} \beta_{k}\, s_{k}(t)}_{\text{stimulus / extrinsic covariates}} \\ &\;+\; \underbrace{\sum_{j} \gamma_{j}\, n(t - \tau_{j})}_{\text{the neuron's own spike history}} \\ &\;+\; \underbrace{\sum_{i} \eta_{i}\, n_{\text{ensemble},\,i}(t)}_{\text{other neurons (ensemble)}}. \end{aligned} $$ Each term answers a scientific question: - **Stimulus terms $s_k(t)$** — how does an external covariate (a sensory stimulus, a movement, the animal's position) drive firing? Often expanded in a spline or other basis so the tuning can be nonlinear. - **History terms $n(t - \tau_j)$** — how does the neuron's own recent spiking shape its firing? Negative $\gamma$ near zero lag captures the refractory period; positive lobes capture bursting. - **Ensemble terms** — does another neuron's firing predict this one's, beyond the shared stimulus? This is functional coupling. Using the **log link** guarantees $\lambda > 0$ (a rate can't be negative) and makes the model *multiplicative* in the covariates. A central practical fact: the point-process GLM log-likelihood is **concave** ([Paninski 2004](https://pubmed.ncbi.nlm.nih.gov/15600233/)), so fitting has a unique maximum and converges reliably — no local-minima roulette. ## Fitting a GLM in nSTAT The objects involved: - `nspikeTrain` / `nstColl` — the spike data (see [Microelectrode recordings](microelectrode_recordings.md)). - `Covariate` / `CovColl` — the stimulus/covariate signals, sampled on a time grid. - `TrialConfig` — *which* covariates, history windows, and ensemble terms to include, and the basis for each. This is your **model specification**. - `Trial` — bundles spikes + covariates + events for one experiment. - `Analysis` — runs the fit; returns `FitResult` objects. A minimal stimulus + history fit: ```python import numpy as np from nstat import (Trial, TrialConfig, ConfigColl, Covariate, nstColl, nspikeTrain, Analysis) # --- data: one unit, a 1 Hz sinusoidal stimulus over 2 s at 1 kHz --- t = np.arange(0, 2.0, 1e-3) stim = np.sin(2 * np.pi * 1.0 * t).reshape(-1, 1) cov = Covariate(t, stim, name="stim", xlabelval="time", xunitval="s", ylabelval="amp", yunitval="a.u.", dataLabels=["stim"]) st = nspikeTrain(np.sort(np.random.default_rng(0).uniform(0, 2.0, 60)), name="unit01", sampleRate=1000, minTime=0.0, maxTime=2.0) pop = nstColl([st]); pop.setMinTime(0.0); pop.setMaxTime(2.0) trial = Trial(pop, ev=None, covarColl=None, neighbors=None) # --- model specification: baseline + stimulus (spline) + short history --- cfg = TrialConfig( covariate_specs=[("Baseline", "constant"), ("stim", "spline")], sampleRate=1000, history_window_times=[0.001, 0.002, 0.005, 0.010], # refractory + short ensCovHist=[], ) results = Analysis.runAnalysisForAllNeurons(trial, ConfigColl([cfg])) fit = results[0][0] print(f"AIC = {fit.AIC:.1f}") ``` `FitResult` exposes the fitted coefficients, their confidence intervals, the estimated CIF, and goodness-of-fit diagnostics. ## Model comparison: which covariates earn their place? More covariates always fit the training data better, so nSTAT scores models by **penalized likelihood** — Akaike's `AIC` and the Bayesian `BIC` (`fit.AIC`, `fit.BIC`) — which trade fit against the number of parameters. The workflow is: specify several `TrialConfig`s (e.g. *stimulus only* vs. *stimulus + history*), fit all of them, and prefer the lowest AIC/BIC **that also passes goodness-of-fit** (next page). A model can win on AIC and still be wrong — always check the KS plot. ![AIC bars for three nested models, decreasing from constant to stimulus to stimulus+history, with the stimulus+history model green for passing the KS test](figures/model_comparison.png) *Three nested models of a neuron with a stimulus drive **and** a refractory period. AIC falls as terms are added, but only the full **stimulus + history** model passes the KS goodness-of-fit test (green) — the constant and stimulus-only models are rejected despite their lower-but-not-lowest AIC. The runnable [model-comparison tutorial](https://github.com/cajigaslab/nSTAT-python/blob/main/examples/tutorials/model_comparison.py) also reports each coefficient with a 95% confidence interval.* ## Across-trial dynamics: the state-space GLM (SSGLM) The GLM above assumes the tuning is fixed. During *learning*, tuning changes trial to trial. The **state-space GLM** treats the GLM coefficients as a latent state that evolves across trials, estimated by an EM algorithm ([Smith & Brown 2003](https://pubmed.ncbi.nlm.nih.gov/12803953/)). In nSTAT this is `nstColl.ssglm()` / `ssglmFB()` (forward–backward). Paper Example 03 walks through it. ## Check your understanding 1. In words, what does $\lambda(t \mid H_t) \cdot \Delta$ represent? 2. Why does the GLM model $\log \lambda$ (a log link) rather than $\lambda$ directly?
Show answers 1. The **probability of a spike** in the small interval $[t, t+\Delta)$, **given the history** $H_t$ of past spikes and covariates. 2. The log link keeps the rate **positive**, makes covariates act **multiplicatively**, and yields a **concave** log-likelihood ([Paninski 2004](https://pubmed.ncbi.nlm.nih.gov/15600233/)) so the fit has a unique optimum.
> **Applying nSTAT — rhythmic and tremor cells.** A covariate need not be an > external stimulus. A neuron with its own *rhythm* — for example a **tremor > cell** locked to a few-hertz limb tremor in the human subthalamic nucleus > ([Levy et al. 2000](https://pubmed.ncbi.nlm.nih.gov/11027240/)) — is captured > by the *same* GLM with a **periodic covariate** at the rhythm's frequency. > See [Rhythmic firing and the clinical > microelectrode](rhythmic_firing_and_clinical_microelectrode.md). ## See also - Runnable examples: [`examples/fit_poisson_glm.py`](https://github.com/cajigaslab/nSTAT-python/blob/main/examples/fit_poisson_glm.py), Paper Example 02 (stimulus + history) and 03 (SSGLM). - Notebooks: [`AnalysisExamples.ipynb`](https://github.com/cajigaslab/nSTAT-python/blob/main/notebooks/AnalysisExamples.ipynb), [`HistoryExamples.ipynb`](https://github.com/cajigaslab/nSTAT-python/blob/main/notebooks/HistoryExamples.ipynb) - API: `Analysis`, `TrialConfig`, `FitResult`, `CIF`, `fit_poisson_glm` in the [API reference](../api.rst) - Next: [Goodness-of-fit and decoding](goodness_of_fit_and_decoding.md) · [Glossary](glossary.md) · [Bibliography](bibliography.md)