Dimensionless.quant
MONITOR · Machine Learning Research Lead · Apr 12, 2026

ML Order-Flow Predictor

Gradient boosting model trained on order flow imbalance, quote updates, and trade intensity to predict 5-minute forward returns in high-volume equities.

Strategy Score
71

Strong in-sample, but IR drops 38% out-of-sample. Approved for paper trading at $10,000 notional.

Performance Metrics
WF Sharpe: 1.42
OOS Sharpe: 0.88
DSR: 0.42
Regimes: 2/5

How It Works

Supervised learning model ingests Level 2 order book data and predicts short-term price movements using ensemble methods.

Mechanics

  1. Collect real-time Level 2 data (bid/ask depth, order sizes, cancellations)
  2. Engineer features: order flow imbalance, quote velocity, aggressive buy/sell ratios
  3. Train XGBoost model on rolling 3-month window (retrain weekly)
  4. Generate predictions every 30 seconds for top 50 liquid equities
  5. Execute trades when prediction confidence exceeds 75% threshold
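
The threshold rule in step 5 can be sketched as follows. This is a minimal illustration, not the production logic: it assumes the model outputs a probability that the 5-minute forward return is positive, and that "confidence" means the larger of the two class probabilities. The names `Signal`, `decide`, and `prob_up` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.75  # the 75% threshold from step 5


@dataclass
class Signal:
    symbol: str
    side: str          # "BUY" or "SELL"
    confidence: float


def decide(symbol: str, prob_up: float,
           threshold: float = CONFIDENCE_THRESHOLD) -> Optional[Signal]:
    """Return a Signal if the prediction is confident enough, else None.

    Assumption: prob_up is the estimated probability that the 5-minute
    forward return is positive (hypothetical model output format).
    """
    if not 0.0 <= prob_up <= 1.0:
        raise ValueError("prob_up must be in [0, 1]")
    # Confidence is taken as the probability of the more likely direction.
    confidence = max(prob_up, 1.0 - prob_up)
    if confidence <= threshold:
        return None  # "exceeds" is read as strictly greater than
    side = "BUY" if prob_up > 0.5 else "SELL"
    return Signal(symbol, side, confidence)
```

Under these assumptions, `decide("AAPL", 0.80)` yields a buy signal, while a prediction of 0.60 yields no trade.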

Signals

  - Order flow imbalance (buy vs sell volume)
  - Bid-ask spread dynamics
  - Order book depth imbalance
  - Trade aggressor classification (market buy/sell intensity)
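
Two of these signals can be illustrated with a short sketch. The page does not give exact feature definitions, so the normalized difference used here is one common convention and an assumption, as are the function names, the `"BUY"`/`"SELL"` labels, and the choice of five book levels.

```python
from typing import Iterable, List, Tuple


def signed_imbalance(a: float, b: float) -> float:
    """Normalized difference in [-1, 1]; 0.0 when both sides are empty."""
    total = a + b
    return 0.0 if total == 0 else (a - b) / total


def order_flow_imbalance(trades: Iterable[Tuple[str, float]]) -> float:
    """Buy vs sell volume over a window of (aggressor_side, size) trades."""
    trades = list(trades)
    buy_volume = sum(size for side, size in trades if side == "BUY")
    sell_volume = sum(size for side, size in trades if side == "SELL")
    return signed_imbalance(buy_volume, sell_volume)


def depth_imbalance(bids: List[Tuple[float, float]],
                    asks: List[Tuple[float, float]],
                    levels: int = 5) -> float:
    """Resting bid vs ask size over the top `levels` (price, size) levels."""
    bid_depth = sum(size for _price, size in bids[:levels])
    ask_depth = sum(size for _price, size in asks[:levels])
    return signed_imbalance(bid_depth, ask_depth)
```

A window with 300 shares of aggressive buying and 100 of aggressive selling gives an order flow imbalance of 0.5; positive values indicate net buying pressure under this convention.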

Performance Results

Sharpe Ratio
Walk-Forward: 1.42
Out-of-Sample: 0.88
In-Sample: 2.31

Returns
Annualized: 11.2%
Max Drawdown: -14.8%
Calmar Ratio: 0.76

Consistency
Win Rate: 53%
Profit Factor: 1.4
Avg Win/Loss: 0.3% / -0.4%

Capacity Analysis
Max Capacity: $5M
Current Slippage: 4 bps
At Capacity: 15 bps
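
A quick arithmetic check on the figures above, as a sketch. It assumes the Calmar ratio is annualized return divided by the absolute maximum drawdown, and that the 38% drop quoted in the summary compares the walk-forward Sharpe to the out-of-sample Sharpe; the page does not state either definition. The helper names are hypothetical.

```python
def calmar_ratio(annualized_return_pct: float, max_drawdown_pct: float) -> float:
    """Annualized return over the magnitude of the maximum drawdown."""
    return annualized_return_pct / abs(max_drawdown_pct)


def relative_decay(reference: float, realized: float) -> float:
    """Fractional drop from a reference metric to a realized one."""
    return 1.0 - realized / reference


calmar = calmar_ratio(11.2, -14.8)        # rounds to 0.76
wf_to_oos = relative_decay(1.42, 0.88)    # rounds to 38%
is_to_oos = relative_decay(2.31, 0.88)    # rounds to 62%

print(f"Calmar: {calmar:.2f}")
print(f"Walk-forward to OOS Sharpe decay: {wf_to_oos:.0%}")
print(f"In-sample to OOS Sharpe decay: {is_to_oos:.0%}")
```

Under these definitions the reported Calmar ratio and the 38% figure are both reproduced; the decay measured from the in-sample Sharpe is considerably larger.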

Implementation

Instruments

Top 50 S&P 500 constituents (AAPL, MSFT, TSLA, etc.)

Execution

Direct market access (DMA), aggressive IOC orders for speed

Rebalancing

Continuous (predictions every 30s, trades on threshold breach)

Risk Limits

Max 1% per position, -10% daily stop loss, max 20 simultaneous positions
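
These limits could be enforced with a pre-trade check along the following lines. This is a sketch under assumptions: the percentages are read as fractions of start-of-day account equity, and `AccountState` and `can_open` are hypothetical names rather than part of the actual engine.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_POSITION_FRACTION = 0.01   # max 1% per position
DAILY_STOP_LOSS = -0.10        # -10% daily stop loss
MAX_OPEN_POSITIONS = 20        # max 20 simultaneous positions


@dataclass
class AccountState:
    start_equity: float    # equity at the start of the trading day
    daily_pnl: float       # realized plus unrealized P&L today
    open_positions: int


def can_open(state: AccountState, order_notional: float) -> Tuple[bool, str]:
    """Return (allowed, reason) for a proposed new position."""
    if state.daily_pnl / state.start_equity <= DAILY_STOP_LOSS:
        return False, "daily stop loss reached"
    if state.open_positions >= MAX_OPEN_POSITIONS:
        return False, "position count limit reached"
    if order_notional > MAX_POSITION_FRACTION * state.start_equity:
        return False, "position size exceeds 1% of equity"
    return True, "ok"
```

The stop-loss check runs first so that a breached daily limit blocks all new positions regardless of size.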

Technology

C++ low-latency engine, GPU model inference, FIX protocol for execution

Risk Analysis

Overfitting

High Impact

Walk-forward testing, k-fold cross-validation, regularization

Regime Shift

High Impact

Monthly model retraining, performance monitoring triggers

Latency

Medium Impact

Co-located servers, optimized execution logic

Data Quality

Medium Impact

Redundant data feeds, anomaly detection filters
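
The walk-forward testing listed under Overfitting can be illustrated by generating the train/test windows. This sketch uses the rolling 3-month training window and weekly retrain from the Mechanics list, approximates 3 months as 91 days, and uses half-open date intervals; those choices and the function name are assumptions.

```python
from datetime import date, timedelta
from typing import List, Tuple

Window = Tuple[date, date, date]  # (train_start, train_end, test_end)


def walk_forward_windows(start: date, end: date,
                         train_days: int = 91,
                         step_days: int = 7) -> List[Window]:
    """Rolling windows: train on [train_start, train_end),
    then test on [train_end, test_end), and slide forward by step_days."""
    windows: List[Window] = []
    train_start = start
    while True:
        train_end = train_start + timedelta(days=train_days)
        test_end = train_end + timedelta(days=step_days)
        if test_end > end:
            break  # not enough data left for a full test week
        windows.append((train_start, train_end, test_end))
        train_start += timedelta(days=step_days)
    return windows


windows = walk_forward_windows(date(2021, 1, 1), date(2021, 5, 1))
for train_start, train_end, test_end in windows:
    print(f"train {train_start} to {train_end}, test to {test_end}")
```

Each test week begins exactly where the previous one ended, so every out-of-sample observation is predicted by a model that never saw it during training.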

Backtest Results

Period: Jan 2021 - Mar 2026
Total Trades: 18,402
Avg Holding: 8 minutes
Best / Worst Year: +18.3% (2021) / +3.1% (2023)

Stress Period Analysis

2022 Vol Spike: +5.4%

Model struggled with regime change

2023 Low Vol: +3.1%

Reduced signal strength in calm markets

2024 Recovery: +12.8%

Performance improved post-retraining

Deployment Status

Status

Paper trading since Apr 15, 2026

Allocation

$5,000 notional (paper only)

Brokers

Paper trading account (Alpaca)

Monitoring

Real-time slippage analysis, model drift detection, daily review
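
A minimal sketch of the slippage measurement behind this monitoring, assuming slippage is the signed difference between the fill price and a reference price at decision time, expressed in basis points so that a positive number is always a cost. The reference-price choice and the function names are assumptions.

```python
from typing import Iterable, Tuple


def slippage_bps(side: str, reference_price: float, fill_price: float) -> float:
    """Per-fill slippage in basis points; positive means a worse fill."""
    if side not in ("BUY", "SELL"):
        raise ValueError("side must be 'BUY' or 'SELL'")
    if reference_price <= 0:
        raise ValueError("reference_price must be positive")
    sign = 1.0 if side == "BUY" else -1.0
    return sign * (fill_price - reference_price) / reference_price * 1e4


def mean_slippage_bps(fills: Iterable[Tuple[str, float, float]]) -> float:
    """Average slippage over (side, reference_price, fill_price) fills."""
    values = [slippage_bps(side, ref, fill) for side, ref, fill in fills]
    if not values:
        raise ValueError("no fills to average")
    return sum(values) / len(values)
```

A buy filled at 100.04 against a reference of 100.00 costs roughly 4 bps, which can be compared with the 4 bps current and 15 bps at-capacity figures from the capacity analysis.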