Using Onchain Trades to Identify Regime Changes and Stationarity of Financial Data
This short post explains why onchain trades are a good source for detecting regime changes and checking stationarity, what those concepts mean for your models, and how to wire them into a minimal pipeline so that you can train and evaluate more safely.
Why onchain trades?
Onchain DEX trades give you a direct, timestamped record of who traded what and at what size. Unlike order-book snapshots or aggregated CEX candles, you get actual executed flow: buy vs sell volume, wallet-level balance changes, and the mix of trading vs transfers (e.g. via a flow-to-volume ratio). That makes onchain data a natural candidate not only for alpha signals but for understanding when the market is in a different “regime” and whether your series are stationary enough to train and evaluate models safely.
Regime changes: what and why they hurt
Once you have that flow, the next question is: when does the market behave differently? A regime is a period where key statistical properties (volatility, trend, liquidity, participant mix) are roughly stable. When those properties shift—e.g. from low to high volatility, or from balanced flow to one-sided selling—you get a regime change.
Models trained on one regime often perform poorly in another: the relationship between features (e.g. trade imbalance, volume, log returns) and the target (e.g. next-period direction) can change. So identifying regimes helps you:
- Segment train vs test (e.g. train on “normal” and test on “stress”),
- Avoid training across a break that would mix two different data-generating processes,
- Adapt (e.g. different features or thresholds per regime).
Onchain trade features are well-suited to mark regimes:
- Trade imbalance (e.g. (buy − sell) / total volume) and net balance flow tell you when flow is one-sided.
- Activity intensity (trade count + balance-update count per window) and volume both spike in eventful periods.
- Flow-to-volume ratio (net balance flow / trade volume) helps separate “trading” regimes from “transfers” or large wallet moves.
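As a rough illustration, the three features above can be computed from per-window totals as in the sketch below. The `WindowTotals` record and its field names are assumptions made for this example, not a standard schema, and returning 0.0 for empty windows is one possible convention among several.

```python
from dataclasses import dataclass


@dataclass
class WindowTotals:
    """Per-window aggregates; field names are illustrative, not a standard schema."""
    buy_volume: float          # volume of trades tagged as buys
    sell_volume: float         # volume of trades tagged as sells
    net_balance_flow: float    # signed sum of wallet balance changes
    trade_count: int
    balance_update_count: int


def regime_features(w: WindowTotals) -> dict:
    total = w.buy_volume + w.sell_volume
    # Guard against empty windows: report 0.0 rather than dividing by zero.
    imbalance = (w.buy_volume - w.sell_volume) / total if total > 0 else 0.0
    flow_to_volume = w.net_balance_flow / total if total > 0 else 0.0
    return {
        "trade_imbalance": imbalance,  # lies in [-1, 1]
        "flow_to_volume": flow_to_volume,
        "activity_intensity": w.trade_count + w.balance_update_count,
    }


# Two made-up windows: balanced flow versus heavy one-sided selling.
calm = regime_features(WindowTotals(60.0, 40.0, 5.0, 12, 3))
one_sided = regime_features(WindowTotals(5.0, 95.0, -80.0, 40, 25))
print(calm)
print(one_sided)
```

In the second window the imbalance is strongly negative and the flow-to-volume ratio is large in magnitude, which is the kind of pattern a regime rule could key on.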
Rolling or expanding statistics on these (mean, variance, quantiles) can be used in a simple rule set or fed into a small classifier to label windows as “calm”, “trending”, or “stress”. Regime labels can then drive train/test splits or feature selection.
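A simple rule set of this kind might look like the sketch below. It uses trailing statistics only, so a label never depends on future windows. The function name, the lookback length, and both thresholds are hypothetical choices for illustration, and windows in the warm-up period are marked "unlabeled" rather than guessed.

```python
from collections import deque
from statistics import mean, pstdev


def label_regimes(volumes, imbalances, lookback=20, spike_z=3.0, trend_imb=0.5):
    """Label each window 'calm', 'trending', or 'stress' from trailing stats."""
    labels = []
    vol_hist = deque(maxlen=lookback)
    imb_hist = deque(maxlen=lookback)
    for vol, imb in zip(volumes, imbalances):
        label = "unlabeled"  # not enough history yet
        if len(vol_hist) == lookback:
            mu, sigma = mean(vol_hist), pstdev(vol_hist)
            z = (vol - mu) / sigma if sigma > 0 else 0.0
            if z > spike_z:
                label = "stress"      # volume spike versus recent history
            elif abs(mean(imb_hist)) > trend_imb:
                label = "trending"    # persistently one-sided flow
            else:
                label = "calm"
        labels.append(label)
        # Update history after labeling so the current window is never
        # compared against itself.
        vol_hist.append(vol)
        imb_hist.append(imb)
    return labels


# Synthetic data: steady volume with one spike, then persistent buying.
volumes = [100.0 + (i % 5) for i in range(60)]
volumes[30] = 500.0
imbalances = [0.1 if i % 2 == 0 else -0.1 for i in range(40)] + [0.8] * 20
labels = label_regimes(volumes, imbalances)
print(labels[25], labels[30], labels[59])
```

One side effect worth knowing about: a single spike inflates the trailing standard deviation for the next `lookback` windows, which makes further spikes harder to detect. A robust scale estimate (e.g. based on quantiles) would reduce that effect.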
Stationarity: why it matters for ML
Regime labels tell you when the market changed; stationarity tells you whether your series are safe to model in the first place. A series is weakly stationary when its mean and variance are constant over time and its autocovariance depends only on the lag between observations; strict stationarity requires the entire joint distribution to be invariant to shifts in time. Many models and backtests implicitly assume something close to that: e.g. per-variate normalization (mean and std computed on the training set) only generalizes if the process doesn’t drift. If the mean or variance of returns or volume shifts over time, the same normalization is wrong for new data and performance degrades.
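To make the normalization point concrete, here is a small synthetic sketch. The two Gaussian samples stand in for a "training" period and a later period with a higher level and three times the volatility; the numbers are made up purely for illustration.

```python
import random
from statistics import mean, stdev

rng = random.Random(7)
train = [rng.gauss(0.0, 1.0) for _ in range(2000)]
# Later data from a shifted process: higher mean and three times the std.
live = [rng.gauss(1.0, 3.0) for _ in range(2000)]

# Normalization statistics are fitted on the training set only.
mu, sigma = mean(train), stdev(train)


def zscore(xs):
    return [(x - mu) / sigma for x in xs]


z_train, z_live = zscore(train), zscore(live)
print("train:", round(mean(z_train), 2), round(stdev(z_train), 2))
print("live: ", round(mean(z_live), 2), round(stdev(z_live), 2))
```

The training z-scores are centered with unit spread by construction, while the later z-scores are offset and several times wider, so a model sees inputs on a scale it was never trained on.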
So you want to check whether the key series (e.g. the features you compute) look stationary over the windows you use for training and inference, and then either avoid training on a long history that spans a structural break or re-normalize in a regime-aware way (e.g. rolling stats per regime). Onchain-derived series are no exception: they can have trends (e.g. adoption driving volume up), level shifts (listing events, new pools), and volatility clusters. Practical checks include:
- Rolling mean and std: do they stay roughly constant, or drift?
- Structural break tests (e.g. Chow, CUSUM) on returns or residuals.
- Unit-root and stationarity tests (e.g. ADF, KPSS) on the series you actually feed to the model (e.g. log returns, not raw price). The two have opposite null hypotheses (ADF: unit root; KPSS: stationarity), so they are most informative when read together.
If you find non-stationarity, you can shorten the training window, difference or transform the series, or restrict training to regimes that pass a basic stationarity check.
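The rolling mean and std check can be approximated with a block-wise heuristic like the sketch below. It is not a formal statistical test: the function names, the block size, and both tolerances are assumptions chosen for illustration, and the data is synthetic.

```python
import random
from statistics import mean, median, pstdev


def block_stats(series, block):
    """Mean and std of consecutive non-overlapping blocks (remainder dropped)."""
    n_blocks = len(series) // block
    if n_blocks < 2:
        raise ValueError("need at least two full blocks to compare")
    chunks = [series[i * block:(i + 1) * block] for i in range(n_blocks)]
    return [(mean(c), pstdev(c)) for c in chunks]


def looks_stationary(series, block=200, max_mean_shift=0.5, max_std_ratio=1.5):
    """Heuristic: block means and block stds must stay close to each other."""
    stats = block_stats(series, block)
    means = [m for m, _ in stats]
    stds = [s for _, s in stats]
    if max(stds) == 0:
        # Every block is constant: stationary only if the level never moves.
        return max(means) == min(means)
    if min(stds) == 0:
        return False  # some blocks are flat while others vary
    scale = median(stds)  # typical within-block spread, robust to one odd block
    mean_ok = (max(means) - min(means)) <= max_mean_shift * scale
    std_ok = max(stds) <= max_std_ratio * min(stds)
    return mean_ok and std_ok


rng = random.Random(1)
stable = [rng.gauss(0.0, 1.0) for _ in range(1000)]
level_shift = ([rng.gauss(0.0, 1.0) for _ in range(400)]
               + [rng.gauss(3.0, 1.0) for _ in range(600)])
vol_shift = ([rng.gauss(0.0, 1.0) for _ in range(400)]
             + [rng.gauss(0.0, 4.0) for _ in range(600)])

for name, s in [("stable", stable), ("level_shift", level_shift), ("vol_shift", vol_shift)]:
    print(name, looks_stationary(s))
```

For the formal tests, statsmodels provides `adfuller` and `kpss` in `statsmodels.tsa.stattools`; a heuristic like this one is best treated as a cheap first pass before running them.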
Tying it together in a pipeline
Putting regime detection and stationarity checks in sequence gives you a clear workflow. A minimal pipeline might look like this:
- Ingest onchain DEX trades (and optionally balance updates).
- Aggregate into fixed time windows (e.g. 250 ms or 1 minute) and compute:
  - Log returns, VWAP/last price,
  - Buy/sell volume, trade imbalance,
  - Net/absolute balance flow, flow-to-volume ratio, activity intensity.
- Regime detection: from rolling (or expanding) stats on volume, imbalance, and flow-to-volume, assign a regime label to each window (or segment).
- Stationarity checks: run rolling stats and/or formal tests on returns and key features over training and validation ranges; optionally restrict training to segments that look approximately stationary.
- Train only on data from chosen regimes and/or stationary segments, with normalization (and possibly re-normalization per regime) applied accordingly.
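The ingest and aggregate steps above can be sketched as follows. The `Trade` record is a hypothetical shape (real DEX feeds use different field names), trade sizes are assumed to be positive, and balance updates are left out to keep the example short. Log returns are computed on VWAP here, which is one choice among several.

```python
import math
from collections import defaultdict
from typing import NamedTuple


class Trade(NamedTuple):
    """Hypothetical trade record; real DEX feeds will use different fields."""
    ts_ms: int      # execution timestamp in milliseconds
    price: float
    size: float     # assumed positive
    is_buy: bool    # True when the taker bought


def aggregate(trades, window_ms=60_000):
    """Bucket trades into fixed windows and compute per-window features."""
    buckets = defaultdict(list)
    for t in trades:
        buckets[t.ts_ms // window_ms].append(t)

    rows, prev_vwap = [], None
    for key in sorted(buckets):
        ts = sorted(buckets[key], key=lambda t: t.ts_ms)
        buy = sum(t.size for t in ts if t.is_buy)
        sell = sum(t.size for t in ts if not t.is_buy)
        total = buy + sell
        vwap = sum(t.price * t.size for t in ts) / total
        rows.append({
            "window_start_ms": key * window_ms,
            "vwap": vwap,
            "last_price": ts[-1].price,
            "buy_volume": buy,
            "sell_volume": sell,
            "trade_imbalance": (buy - sell) / total,
            # Log return versus the previous non-empty window; None for the first.
            "log_return": math.log(vwap / prev_vwap) if prev_vwap else None,
        })
        prev_vwap = vwap
    return rows


trades = [
    Trade(1_000, 100.0, 2.0, True),
    Trade(30_000, 102.0, 2.0, False),
    Trade(61_000, 104.0, 3.0, True),
    Trade(90_000, 100.0, 1.0, False),
]
rows = aggregate(trades)
for row in rows:
    print(row)
```

Windows with no trades produce no row in this sketch, so a log return can span a gap. Whether to forward-fill empty windows or leave them out depends on how the downstream regime and stationarity checks treat time.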
That way, onchain trades don’t only feed the model—they also help you see when the market has changed and when your data is safe to use for training and evaluation.
Inspired by this video by Howard on Stationarity.