Free GTO Poker Range

Methodology

Every range on this site comes from a single CFR blueprint trained for 7.5 billion iterations on the open-source DCFR-SOLVER. This page is the full disclosure of how that blueprint was produced, what it covers, and what its measured exploitability is — to the decimal point.

Headline result

0.307 % pot
Average per-player gap (vector best response). ≈ 6.1 bb / 100 hands. Max gap 0.415 % at BTN.
Position   Gap (% pot)   Gap (bb/100)
UTG        0.163 %       3.3
HJ         0.243 %       4.9
CO         0.334 %       6.7
BTN        0.415 %       8.3
SB         0.339 %       6.8
BB         0.353 %       7.1

No published 6-max NLHE preflop NashConv figure exists in the open literature — Pluribus explicitly avoided reporting one, and the major commercial solvers do not publish multiway preflop numbers. The figure above is a true reach-propagated vector best-response measurement, not a Monte-Carlo estimate. The vector BR cost is ~35 minutes per measurement (5 min × 7 walks: σ̄ + 6 per-seat BR).
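The difference between a vector best response and a Monte-Carlo estimate matters because an argmax taken over noisy value estimates is biased upward (the Best-response evaluation section below gives the reason). The following is a minimal sketch of that effect on made-up numbers; it is not part of the solver, and the function name, noise model, and sample counts are all illustrative assumptions.

```python
import random

def argmax_bias(num_actions, samples_per_action, trials=2000, seed=0):
    """Average overestimate of the best EV when every action's true EV is 0.

    Each action's EV is estimated as the mean of unit-variance Gaussian
    samples, and the simulated "best responder" takes the largest estimate.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        estimates = []
        for _ in range(num_actions):
            draws = [rng.gauss(0.0, 1.0) for _ in range(samples_per_action)]
            estimates.append(sum(draws) / len(draws))
        total += max(estimates)
    # The true best value is 0, so the average of the maxima is pure bias.
    return total / trials

few_samples = argmax_bias(num_actions=5, samples_per_action=4)
many_samples = argmax_bias(num_actions=5, samples_per_action=64)
print(f"bias with 4 samples per action:  {few_samples:.3f}")
print(f"bias with 64 samples per action: {many_samples:.3f}")
```

The bias shrinks as the per-action estimates get less noisy but never turns negative, which is why an exact reach-weighted evaluation avoids the problem rather than merely reducing it.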

Game configuration

Stakes: NL100 (100 bb starting stack, no-limit hold'em, 6-max, no straddle)
Open size: 7 bb from every position, including SB
3-bet size: 20 bb
4-bet: all-in (100 bb)
SB limp: disabled (no-limp rule)
Rake: 5 % with a 3 bb cap, applied to showdown pots only (fold-out pots are unraked; all-in-to-showdown pots are raked)
OOP postflop tax: 5 % × (rank_gap / 5) of total pot, transferred OOP → IP at heads-up showdown; empirically calibrated, see below
Card abstraction: 169 suit-isomorphic hand classes (AA, AKs, AKo, …, 22). No card abstraction at the leaf: showdown EV is computed via a Cactus-style 7-card evaluator over random 5-card runouts.
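The rake and OOP-tax rules above reduce to two small formulas. The sketch below spells them out under stated assumptions: the function names are hypothetical, `rank_gap` is taken to be the number of seats separating the two players (1 to 5 in a 6-max game), and the two adjustments are computed independently because the text does not say whether the tax applies before or after rake.

```python
def showdown_rake(pot_bb, rake_pct=0.05, cap_bb=3.0):
    """Rake on a pot that reaches showdown: a percentage of the pot, capped.

    Fold-out pots are unraked, so callers should skip this for them.
    """
    return min(pot_bb * rake_pct, cap_bb)

def oop_tax(pot_bb, rank_gap, base_tax=0.05, max_gap=5):
    """Amount moved from the OOP player to the IP player at a heads-up showdown.

    Assumption: rank_gap is the seat distance between the two players,
    so the full 5 % applies only at the maximum gap of 5.
    """
    return base_tax * (rank_gap / max_gap) * pot_bb

# Hypothetical 40 bb showdown pot between players three seats apart.
pot = 40.0
print(f"rake:    {showdown_rake(pot):.2f} bb")
print(f"OOP tax: {oop_tax(pot, rank_gap=3):.2f} bb")
```

With the 3 bb cap, any showdown pot above 60 bb pays the same rake, so the rake's effect on strategy is concentrated in small and medium pots.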

Algorithm

External-Sampling MCCFR with DCFR (Discounted CFR) averaging — α=1.5, β=0, γ=2.0 — on the open-source DCFR-SOLVER (Rust, MIT) by exinori, patched with a vector best-response evaluator and empirical OOP-positional-tax calibration.
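For readers unfamiliar with the three DCFR parameters, the sketch below computes the per-iteration discount factors as defined in the original Discounted CFR formulation: accumulated positive regrets are scaled by t^α / (t^α + 1), negative regrets by t^β / (t^β + 1), and average-strategy contributions by (t / (t + 1))^γ. Whether DCFR-SOLVER applies them in exactly this form under external sampling is an assumption here; the function name is illustrative.

```python
def dcfr_discounts(t, alpha=1.5, beta=0.0, gamma=2.0):
    """Discount factors applied at iteration t (t >= 1) in Discounted CFR.

    Returns (positive_regret_factor, negative_regret_factor, strategy_factor).
    """
    pos = t**alpha / (t**alpha + 1.0)
    neg = t**beta / (t**beta + 1.0)
    strat = (t / (t + 1.0)) ** gamma
    return pos, neg, strat

for t in (1, 10, 1000):
    pos, neg, strat = dcfr_discounts(t)
    print(f"t={t:>4}  pos={pos:.4f}  neg={neg:.4f}  strat={strat:.4f}")
```

With β = 0 the negative-regret factor is a constant 0.5, so negative regrets are halved every iteration, while the positive-regret and strategy factors approach 1 as t grows.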

Empirical OOP positional tax

The DCFR-SOLVER ships with a heuristic default, --oop-pot-tax 0.20, which bakes into the showdown payoff the assumption that the OOP player loses 20 % of the pot to positional disadvantage. We measured this number empirically:

Result: empirical tax ≈ 5.1 % pot (4.6 – 6.0 % across the 8 boards). The default heuristic over-estimates by ≈ 4×. The blueprint here was trained with --oop-pot-tax 0.05.

Best-response evaluation (vector BR)

Standard Monte-Carlo NashConv estimators are biased high in 6-max preflop (≈ 20× in our spot checks) because the best-responder picks the argmax over noisy per-terminal estimates. We use vector BR with reach propagation instead:

  1. σ̄ walk (once): traverse the full tree, propagating each player's reach (probability of arrival per hand class) under the average strategy. This gives σ̄_EV[p] for every seat.
  2. BR walk (once per player p ∈ {UTG, HJ, CO, BTN, SB, BB}): same traversal, but player p picks the argmax action per hand class given the reach-weighted opponent ranges at each decision point.
  3. Per-player gap = BR_EV[p] − σ̄_EV[p], measured at the root in chips.
  4. Total cost: 7 walks (σ̄ + 6 BRs) × ~5 min = ~35 min per measurement.
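The core of the procedure above, an argmax per hand class against reach-weighted opponent ranges, can be illustrated at a single decision point. This is a toy sketch, not the patched evaluator: the three hand classes, the payoffs, the strategies, and all function names are made up for illustration, and a real walk repeats this step recursively over the whole tree.

```python
def action_values(payoff, opp_reach):
    """values[j][a]: reach-weighted EV of hero hand class j taking action a.

    payoff[a][j][i] is hero's payoff for action a with hand j against opponent hand i.
    """
    return [
        [sum(r * u for r, u in zip(opp_reach, payoff[a][j])) for a in range(len(payoff))]
        for j in range(len(payoff[0]))
    ]

def best_response_gap(payoff, opp_reach, hero_prior, hero_sigma):
    """BR_EV minus average-strategy EV at this decision point."""
    values = action_values(payoff, opp_reach)
    br_ev = sum(p * max(v) for p, v in zip(hero_prior, values))
    avg_ev = sum(
        p * sum(s * x for s, x in zip(sig, v))
        for p, sig, v in zip(hero_prior, hero_sigma, values)
    )
    return br_ev - avg_ev

# Toy spot: the opponent shoves with a hand-dependent frequency, hero folds or calls.
prior = [1 / 3, 1 / 3, 1 / 3]                 # weak, medium, strong
opp_shove = [0.2, 0.5, 1.0]                   # opponent's average strategy
opp_reach = [p * s for p, s in zip(prior, opp_shove)]

def call_payoff(j, i):
    # Higher class wins 10, lower loses 10, ties split.
    return 10.0 if j > i else (-10.0 if j < i else 0.0)

payoff = [
    [[-1.0] * 3 for _ in range(3)],                             # action 0: fold, lose the blind
    [[call_payoff(j, i) for i in range(3)] for j in range(3)],  # action 1: call
]
hero_sigma = [[1.0, 0.0], [0.7, 0.3], [0.0, 1.0]]  # hero's average strategy per hand

gap = best_response_gap(payoff, opp_reach, prior, hero_sigma)
print(f"gap at this node: {gap:.4f} chips")
```

In this toy spot the entire gap comes from the medium hand, which calls 30 % of the time even though folding is better against the reach-weighted shoving range. Because the opponent range is weighted exactly rather than sampled, the argmax sees no noise.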

Convergence history

Iterations   Avg per-player gap
1.5 B        0.720 %
1.9 B        0.518 %
2.3 B        0.429 %
2.7 B        0.388 %
3.1 B        0.363 %
3.5 B        0.349 %
3.9 B        0.339 %
4.7 B        0.325 %
5.5 B        0.315 %
6.3 B        0.313 %
7.1 B        0.310 %
7.5 B        0.307 %

Convergence follows O(1/√t) until ≈ 6 B iterations, then plateaus at the abstraction floor. Beyond this point the residual gap is dominated by action-tree discretization (no flat-call sizings between 7 bb and 20 bb) and the OOP-tax approximation, not by under-training.
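One simple way to see the plateau is to compute how much gap each additional billion iterations removed between consecutive checkpoints. The sketch below does that using only the figures from the convergence table; the function name is illustrative.

```python
# (iterations in billions, average per-player gap in % pot), from the table above.
HISTORY = [
    (1.5, 0.720), (1.9, 0.518), (2.3, 0.429), (2.7, 0.388),
    (3.1, 0.363), (3.5, 0.349), (3.9, 0.339), (4.7, 0.325),
    (5.5, 0.315), (6.3, 0.313), (7.1, 0.310), (7.5, 0.307),
]

def marginal_gains(history):
    """Gap reduction per billion iterations between consecutive checkpoints."""
    return [
        (t1, (g0 - g1) / (t1 - t0))
        for (t0, g0), (t1, g1) in zip(history, history[1:])
    ]

gains = marginal_gains(HISTORY)
for iterations, gain in gains:
    print(f"up to {iterations:.1f} B: {gain:.4f} % pot per billion iterations")
```

The marginal gain drops by well over an order of magnitude between the first and last intervals, which matches the reading that further training alone buys very little.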

Scope & honest caveats

Source & reproducibility

The solver core is exinori/DCFR-SOLVER (Rust, MIT license). We applied patches that add a vector best-response evaluator, a pre-computed HU equity table, the empirical OOP-tax calibration, and an HTML chart renderer. To reproduce the training:

./target/release/dcfr-solver preflop \
  --open-size 14 --sb-open-size 14 \
  --bet3-size 40 --bet4-size 200 \
  --rake-pct 0.05 --rake-cap 6 --oop-pot-tax 0.05 \
  --iterations <N> \
  --load <prior_blueprint.bin> \
  --output <next.bin> \
  --chart-output chart.json \
  --tree-output tree.json \
  --seed <random>

Sizes are in chips (1 chip = 0.5 bb). 14 chips = 7 bb.
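Since the game configuration is stated in big blinds and the command takes chips, a small helper avoids conversion mistakes when changing sizes. This is a sketch: the flag names are the ones shown in the command above, the helper names are hypothetical, and treating --rake-cap as a chip amount is inferred from the 3 bb cap mapping to 6.

```python
CHIPS_PER_BB = 2  # from the text: 1 chip = 0.5 bb

def bb_to_chips(bb):
    """Convert a size in big blinds to whole chips, rejecting fractional chips."""
    chips = bb * CHIPS_PER_BB
    if chips != int(chips):
        raise ValueError(f"{bb} bb is not a whole number of chips")
    return int(chips)

def size_flags(open_bb=7, bet3_bb=20, bet4_bb=100, rake_cap_bb=3):
    """Build the size-related arguments of the training command."""
    return [
        "--open-size", str(bb_to_chips(open_bb)),
        "--sb-open-size", str(bb_to_chips(open_bb)),
        "--bet3-size", str(bb_to_chips(bet3_bb)),
        "--bet4-size", str(bb_to_chips(bet4_bb)),
        "--rake-cap", str(bb_to_chips(rake_cap_bb)),
    ]

print(" ".join(size_flags()))
```

Run with the defaults, this reproduces the size arguments of the command above.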

Update cadence

We re-train when meaningful improvements arrive (more iterations, better abstractions, additional sizings). Each release is published with its training stats, measured per-position gap, and an updated convergence history on this page. The Blueprint version field above is the source of truth.

If you spot a clear divergence from established GTO theory in any spot, please tell us via Contact. We'll add it to a regression test and re-publish.