Part IV · Chapter 15
1 · Naive symmetric quoting
FIT: teaching baseline only
Quote mid ± δ with fixed δ, forever. The fruit vendor who never adjusts. Every weakness this book has discussed, in one strategy: inventory random-walks unbounded, adverse selection unpriced, fees unconsidered. You already watched it bleed — the Chapter 3 simulator runs exactly this strategy.
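To see the unbounded inventory walk for yourself, here is a minimal sketch. The fill model (each side hit independently with a fixed probability per step) and the names naive_quotes and simulate_inventory are assumptions for illustration; this is not the Chapter 3 simulator.

```python
import random


def naive_quotes(mid: float, delta: float) -> tuple[float, float]:
    """Return (bid, ask) at mid -/+ delta. Inventory is ignored on purpose."""
    return mid - delta, mid + delta


def simulate_inventory(steps: int, fill_prob: float = 0.3, seed: int = 0) -> list[int]:
    """Toy flow model (an assumption): on each step the bid and the ask are
    each hit independently with probability fill_prob."""
    rng = random.Random(seed)
    q, path = 0, []
    for _ in range(steps):
        if rng.random() < fill_prob:  # bid filled: we bought one unit
            q += 1
        if rng.random() < fill_prob:  # ask filled: we sold one unit
            q -= 1
        path.append(q)
    return path


path = simulate_inventory(10_000)
print("quotes:", naive_quotes(100.0, 0.5))
print("final inventory:", path[-1], "| max |q|:", max(abs(q) for q in path))
```

Nothing in the quoting rule pulls q back toward zero, so the largest |q| tends to keep growing as you lengthen the run.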
+ Trivial; the control group for every backtest you run.
− Run over by trends and informed flow; bleeds in jumps.
2 · Avellaneda–Stoikov
FIT: liquid, diffusive markets (crypto majors, equities)
The canonical inventory-aware quoter, derived gently in Chapter 6. Its whole logic compresses into two lines:
The AS engine in two lines
r = s − q·γ·σ²·(T−t)
δ_total = γ·σ²·(T−t) + (2/γ)·ln(1 + γ/k)
What the symbols mean
All symbols exactly as in Chapter 6:
r: your private fair value, shifted off the mid s by inventory q
γ: risk aversion
σ²: price variance
T−t: time left
k: how fast fills die off with distance
Quotes go at r ± δ_total/2. In words: center on what the position makes it worth to you, and open the spread with risk and close it with competition.
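The two lines translate almost directly into code. A minimal sketch: the function name as_quotes and the sample parameter values are illustrative assumptions, while the formulas are the ones stated above.

```python
import math


def as_quotes(s: float, q: float, gamma: float, sigma2: float,
              tau: float, k: float) -> tuple[float, float, float, float]:
    """Return (bid, ask, reservation price, total spread).

    s: mid price, q: inventory, gamma: risk aversion,
    sigma2: price variance, tau: time left (T - t), k: fill decay rate.
    """
    r = s - q * gamma * sigma2 * tau
    spread = gamma * sigma2 * tau + (2.0 / gamma) * math.log(1.0 + gamma / k)
    return r - spread / 2.0, r + spread / 2.0, r, spread


# Illustrative numbers only: compare a flat book with a long position.
flat = as_quotes(s=100.0, q=0, gamma=0.1, sigma2=4.0, tau=1.0, k=1.5)
long_ = as_quotes(s=100.0, q=2, gamma=0.1, sigma2=4.0, tau=1.0, k=1.5)
print("flat (bid, ask, r, spread):", flat)
print("long (bid, ask, r, spread):", long_)
```

Note that inventory moves only the center r; the total spread is the same in both calls, which is exactly the "skew, don't widen" behavior the formulas describe.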
Hummingbot ships a ready-made lab for parameter intuition: the legacy avellaneda_market_making strategy exposes risk_factor (γ) and an order-shape factor (η), and — since a 2022 rework — estimates k and σ live from trading intensity and volatility buffers rather than asking you for them. (Hummingbot now recommends its V2 controllers — pmm_simple / pmm_dynamic — for new bots; the AS math lives on inside them.)
+ Principled, closed-form, the industry's shared language.
− Brownian assumption; no adverse-selection term; unbounded inventory; constant parameters.
3 · Guéant–Lehalle–Fernandez-Tapia (GLFT)
FIT: production AS — anywhere AS fits, with real risk limits
AS re-solved with hard inventory bounds |q| ≤ Q (Q = the most contracts you allow yourself to hold); the mathematics collapses to a system of linear ordinary differential equations with cheap closed-form approximations. The behavioral difference is easiest to see:
[Interactive: AS vs GLFT, what a hard inventory bound does. Readouts: AS quotes, GLFT quotes, GLFT status.]
Drag q toward the bound Q. Both engines skew their quotes down as you get longer (blue = AS, amber = GLFT) — but AS keeps bidding forever, letting inventory run past any limit, while GLFT's bid simply switches off at q = +Q (and its ask switches off at q = −Q). One side dark = the algorithm refusing to dig the hole deeper. That refusal is the entire practical difference — and why GLFT is the version you deploy.
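The refusal itself is a one-line gate. The sketch below shows only that gating, applied to whatever bid and ask your engine produced; it does not compute GLFT's own quote distances, and the name bounded_quotes is a hypothetical helper.

```python
from typing import Optional


def bounded_quotes(bid: float, ask: float, q: int,
                   Q: int) -> tuple[Optional[float], Optional[float]]:
    """Apply the hard inventory bound |q| <= Q to a pair of quotes.

    At q = +Q the bid is switched off (None): buying more would breach the bound.
    At q = -Q the ask is switched off for the same reason on the short side.
    """
    gated_bid = bid if q < Q else None
    gated_ask = ask if q > -Q else None
    return gated_bid, gated_ask


for q in (-5, 0, 5):
    print(q, bounded_quotes(99.0, 101.0, q=q, Q=5))
```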
+ Bounded risk; stable; computationally trivial; extends to multi-asset.
− Still parametric and diffusion-based; calibration drift remains your problem.
4 · Grid trading
FIT: ranging crypto; explicitly NOT trends
A ladder of buys below and sells above a center price. When price dips a rung, you buy; when it climbs back, you sell — each completed round trip banks one rung:
Profit per completed rung
rung P&L = grid step − 2·fees
What the symbols mean
grid step: the fixed price distance between neighboring rungs of the ladder (e.g. one rung every 50¢)
2·fees: you pay the venue twice per round trip, once for the buy and once for the sell
A vending machine for mean reversion, with no model of anything. The catch is what happens when price doesn't come back:
[Simulation: the grid, vending machine or falling knife. Readouts: rungs banked, rung P&L, open inventory, inventory MTM, total.]
Green rungs are resting buys, red rungs resting sells; each completed buy-then-sell flashes and banks one rung of profit. At drift = 0 (ranging) the machine hums: rung P&L climbs, inventory oscillates near zero. Now drag the drift negative and watch the pathology: price slides down the ladder filling every buy on the way, no sell ever completes, and the "profit per rung" line is dwarfed by the red mark-to-market of a growing long position. That is the Martingale trap — the grid quietly converts into a maximum-size bet that the trend will reverse. This is why every serious grid runs a hard stop beyond the last rung.
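The same contrast fits in a few lines of code. This is a minimal sketch of the buy side of a grid on two hand-written price paths; run_grid, the one-unit-per-rung sizing, and the flat per-side fee are assumptions for illustration, and there is deliberately no stop-loss so the pathology stays visible.

```python
def run_grid(prices: list[float], center: float, step: float,
             n_rungs: int, fee: float = 0.0) -> dict[str, float]:
    """Replay a price path through the buy rungs below `center`.

    Each rung buys one unit when price touches it and sells that unit
    one rung higher; `fee` is charged per side, in price units.
    """
    buy_levels = [center - i * step for i in range(1, n_rungs + 1)]
    held: set[float] = set()  # rungs whose buy filled and is still open
    banked = 0                # completed buy-then-sell round trips
    for p in prices:
        for level in buy_levels:
            if level not in held and p <= level:
                held.add(level)       # resting buy filled
            elif level in held and p >= level + step:
                held.remove(level)    # paired sell filled one rung up
                banked += 1
    last = prices[-1]
    rung_pnl = banked * (step - 2 * fee)
    # Mark open units to the last price; each has paid one buy-side fee.
    mtm = sum(last - level for level in held) - fee * len(held)
    return {"banked": banked, "rung_pnl": rung_pnl,
            "open_inventory": len(held), "mtm": mtm, "total": rung_pnl + mtm}


ranging = run_grid([100, 99, 100, 99, 100, 99, 100], center=100, step=1, n_rungs=5, fee=0.1)
trending = run_grid([100, 99, 98, 97, 96, 95, 94], center=100, step=1, n_rungs=5, fee=0.1)
print("ranging: ", ranging)
print("trending:", trending)
```

On the ranging path every buy is matched by a sell and the open inventory ends at zero; on the trending path no rung completes and the whole ladder is left holding inventory marked below its cost.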
+ Dead simple; shines in sideways chop; psychologically easy to operate.
− In a trend it Martingales into the falling side; the "profit per rung" hides a position that is effectively short the trend. Needs a hard stop-loss beyond the grid.
5 · Cross-exchange hedged MM
FIT: crypto long tail; any quote-here-hedge-there pair of venues
Quote where the spread is wide; on every fill, hedge instantly where the book is deep. Inventory half-life drops from hours to seconds, and the P&L of each round trip becomes an arithmetic check you can audit daily:
Cross-exchange round trip
net = quoted spread − maker fee − taker fee − hedge slippage
What the symbols mean
quoted spread: the wide spread you capture on the quiet venue
maker fee: what the quiet venue charges your resting quote
taker fee: what the deep venue charges the hedge order (it crosses the spread, so it takes)
hedge slippage: how far the deep venue's price moves against you between your fill and your hedge landing; it grows with your latency
If net > 0 across your fills, the business exists.
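The daily audit is a subtraction. A minimal sketch with made-up numbers, all in price units per unit traded; the linear latency-to-slippage model (slippage_from_latency and its drift_per_ms parameter) is a simplifying assumption, not something measured.

```python
def round_trip_net(quoted_spread: float, maker_fee: float,
                   taker_fee: float, hedge_slippage: float) -> float:
    """Net P&L of one quote-here-hedge-there round trip."""
    return quoted_spread - maker_fee - taker_fee - hedge_slippage


def slippage_from_latency(latency_ms: float, drift_per_ms: float) -> float:
    """Assumed model: the deep venue moves against you linearly in latency."""
    return latency_ms * drift_per_ms


nets = {}
for latency_ms in (5, 50, 500):
    slip = slippage_from_latency(latency_ms, drift_per_ms=0.001)
    nets[latency_ms] = round_trip_net(quoted_spread=0.40, maker_fee=0.02,
                                      taker_fee=0.05, hedge_slippage=slip)
    print(f"latency {latency_ms:>3} ms -> net {nets[latency_ms]:+.3f}")
```

With these illustrative inputs the round trip stays profitable at 5 ms and 50 ms and turns negative at 500 ms, because slippage alone then exceeds the spread.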
[Animation: fill here, hedge there. Readouts: round trips, avg spread captured, avg slippage + fees, net P&L.]
Left book: the quiet long-tail venue where you quote a wide spread. Right book: the deep venue you hedge into. When a trader hits your quote (flash), a hedge order streaks across to the deep book after your latency delay — and the deep price keeps moving while it flies. At low latency the hedge lands near your fill price and nearly all the spread survives as profit. Drag latency up and watch the slippage bar eat the spread: the inventory risk you engineered away has been traded for latency risk, and this slider is its price.
+ Near-market-neutral; inventory risk mostly engineered away.
− Latency between venues is the new risk; balance/transfer management; doubles your fee load.
Reference implementation: Hummingbot's legacy cross_exchange_market_making strategy (V2 controller: xemm_multiple_levels) — set the maker venue, the taker venue, and min_profitability, and it enforces the inequality above for you.
6 · Microprice-anchored quoting
FIT: any CLOB with trustworthy L2 depth
Replace the mid with Stoikov's microprice (or weighted mid) as the quoting anchor, leaning away from the side the book says is about to break. Cheap, measurable adverse-selection reduction; combines with #2/#3 (anchor swap) rather than competing with them.
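The simpler of the two anchors, the size-weighted mid, is a one-liner. The sketch below implements only that; Stoikov's microprice proper is estimated from historical book data and is not reproduced here. The function name weighted_mid and the sample book are illustrative.

```python
def weighted_mid(bid: float, ask: float, bid_size: float, ask_size: float) -> float:
    """Size-weighted mid: leans toward the ask when the bid side is heavier.

    A heavy bid queue suggests the ask is the side more likely to break,
    so the anchor moves up, and vice versa.
    """
    imbalance = bid_size / (bid_size + ask_size)  # 0..1, share of size on the bid
    return imbalance * ask + (1.0 - imbalance) * bid


plain_mid = (99.0 + 101.0) / 2
anchor = weighted_mid(bid=99.0, ask=101.0, bid_size=300, ask_size=100)
print("mid:", plain_mid, "| weighted mid:", anchor)
```

Feeding anchor in place of the mid s is the "anchor swap" for entries #2 and #3: the rest of the quoting logic stays unchanged.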
+ Free markout improvement; trivially composable.
− Imbalance can be spoofed; needs clean, fast book data.
7 · Reinforcement-learning MM
FIT: research frontier; safest as a parameter-tuner
Frame quoting as a game the machine plays millions of times — a Markov decision process:
The RL loop, in words
state → action → reward → (better policy) → state → …
What the pieces mean
state: what the bot can see right now (its inventory, the book imbalance, recent volatility, any signals)
action: what it can do, i.e. choose a spread and a skew (or, safer, choose the γ and k of an AS/GLFT engine)
reward: the score it's trying to maximize, namely risk-adjusted P&L, typically profit minus a penalty on held inventory
policy: the learned rulebook mapping every state to an action
Training = replaying markets until the rulebook stops improving.
What does a trained policy actually look like? Not magic — a lookup table with structure. Watch one emerge:
[Interactive: watch a quoting policy get learned. Readouts: policy at (long, buyers queuing), reading.]
Each cell is a market state — inventory across, book imbalance down — and its color is the action the policy has learned there (green = skew quotes up / lean to buy, red = skew down / lean to sell, pale = quote symmetric). At 0% training the map is noise: the agent acts randomly. Drag training forward and structure crystallizes: long inventory ⇒ skew down to shed it; heavy bid-side imbalance ⇒ lean with the coming move. By 100% the machine has rediscovered, cell by cell, roughly what Avellaneda–Stoikov wrote as one formula — which is the deepest lesson of RL market making: when it works, it mostly re-learns the textbook, plus nonlinear corrections at the edges.
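You can watch a one-dimensional version of that map emerge with tabular Q-learning. Everything about the environment below is a toy assumption: the state is inventory only (no book imbalance), a skew deterministically decides which side fills, and the reward is one unit of spread per fill minus a quadratic inventory penalty. It is a sketch of the loop, not a trading model.

```python
import random

INVENTORY = range(-2, 3)  # states: inventory from -2 (short) to +2 (long)
ACTIONS = (-1, 0, 1)      # skew down, quote symmetric, skew up


def step(q: int, action: int, rng: random.Random, phi: float = 0.5) -> tuple[int, float]:
    """Toy market: skew up fills the bid (+1), skew down fills the ask (-1),
    symmetric quotes fill either side at random. Inventory is capped at +/-2."""
    dq = rng.choice((-1, 1)) if action == 0 else action
    q_next = max(-2, min(2, q + dq))
    filled = q_next != q                      # at the cap, no fill happens
    reward = (1.0 if filled else 0.0) - phi * q_next ** 2
    return q_next, reward


def train(steps: int = 50_000, alpha: float = 0.05, discount: float = 0.9,
          eps: float = 0.2, seed: int = 0) -> dict[int, int]:
    """Epsilon-greedy tabular Q-learning; returns the greedy action per state."""
    rng = random.Random(seed)
    values = {(s, a): 0.0 for s in INVENTORY for a in ACTIONS}
    q = 0
    for _ in range(steps):
        if rng.random() < eps:
            a = rng.choice(ACTIONS)                               # explore
        else:
            a = max(ACTIONS, key=lambda act: values[(q, act)])    # exploit
        q_next, r = step(q, a, rng)
        best_next = max(values[(q_next, act)] for act in ACTIONS)
        values[(q, a)] += alpha * (r + discount * best_next - values[(q, a)])
        q = q_next
    return {s: max(ACTIONS, key=lambda act: values[(s, act)]) for s in INVENTORY}


policy = train()
for s in INVENTORY:
    print(f"inventory {s:+d} -> skew {policy[s]:+d}")
```

In this toy setting the learned table should skew down when long and up when short, which is the inventory half of the Avellaneda–Stoikov rule; the action at zero inventory is a tie and carries no meaning.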
The honest reading of the literature, mid-2026: the classic pattern is Alpha-AS (Falces Marin et al. 2022) — let RL choose the parameters of an analytical AS engine rather than raw quotes, keeping a safety skeleton. Since then the frontier has moved on three fronts: robustness (adversarial training — Spooner & Savani 2020 — and non-stationary, clustered order-flow models), better simulators (generative "world models" of the order book — token-level autoregressive flow models, diffusion engines like TRADES, GPU-parallel books like JAX-LOB — because an RL agent is only as honest as the market it trained in), and theory (online learning of AS's fill parameters with provable logarithmic regret, Cao et al. 2024). Most results still live in simulators; regime shift and your own market impact remain the unsolved parts. There is, so far, no "GPT moment" for market making.
+ Adapts nonlinearly; ingests arbitrary signals; Alpha-AS pattern keeps an analytical safety skeleton.
− Sample-hungry; simulator-to-live gap; uninterpretable failures. Never deploy unbounded.
8 · Alpha-anchored PMM (signal-driven reference price)
FIT: crypto perps and any venue where you can mine your own signal
Pure market making with one upgrade: the reference price you quote around is shifted by a short-horizon predictive signal, so the whole quote ladder leans toward where price is about to go. Hummingbot's pmm_dynamic controller does this with MACD; Stoikov's Cornell group showed the anchor is nearly interchangeable — and that a one-line candlestick feature they call Bar Portion beat MACD in live trading (Chapter 17 turns their pipeline into a full workflow). It's the retail-scale cousin of entry #6: pick a better center, and every fill gets a little less toxic.
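Mechanically, the upgrade is a shifted center. A minimal sketch, assuming the signal has already been scaled to the range −1 to +1 (how you get there from MACD or any other feature is outside this snippet); alpha_anchored_quotes and max_shift_bps are hypothetical names, not Hummingbot parameters.

```python
def alpha_anchored_quotes(mid: float, signal: float, half_spread: float,
                          max_shift_bps: float = 5.0) -> tuple[float, float]:
    """Quote symmetrically around a reference price shifted by a signal.

    signal: short-horizon prediction, clamped to [-1, 1] (+1 = expect up).
    max_shift_bps: largest allowed shift of the reference, in basis points.
    """
    signal = max(-1.0, min(1.0, signal))  # a wild signal cannot move the anchor far
    ref = mid * (1.0 + signal * max_shift_bps / 10_000)
    return ref - half_spread, ref + half_spread


neutral = alpha_anchored_quotes(100.0, signal=0.0, half_spread=0.10)
bullish = alpha_anchored_quotes(100.0, signal=1.0, half_spread=0.10)
print("neutral:", neutral)
print("bullish:", bullish)
```

The clamp is the design choice worth keeping: it caps how far a wrong signal can lean you into the move.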
+ Simple to bolt onto any PMM; signal quality is measurable offline (quintile analysis) before risking a cent.
− Signals decay as others find them; a wrong signal is worse than none — it leans you into the move.
The composition insight
These aren't eight rivals; they're modules. Microprice or alpha anchor (6, 8) + GLFT skeleton (3) + cross-exchange hedge (5) + RL-tuned parameters (7) is one coherent engine, and is in fact Proposal F.