Charles Dana¹,² · Claude (Anthropic)
¹Tulane University · ²Monce SAS
March 2026 — Living Document
We present polyai, a zero-dependency polynomial Sudoku solver for universal n²×n² grids. We decompose the Sudoku solution space into a P/NP/P sandwich: polynomial techniques resolve the grid from both ends of the deletion spectrum, leaving an NP strip whose relative width converges to 1 − 1/n as n grows. We implement 14 techniques across 4 zones — P-Left, CDCL, NP, and P-Right — including a budgeted CDCL solver parameterized by exponent a, capped at O(nᵃ) backtracking nodes. Empirically, CDCL at a = 5 dissolves the entire NP strip for n = 3 (9×9), and a = 5–6 does so for n = 4 (16×16). At n = 5 (25×25), the deep middle of the NP strip resists all polynomial budgets tested, establishing a concrete empirical boundary between polynomial tractability and combinatorial hardness. The solver is 2365 lines of pure Python with zero dependencies.
Sudoku is NP-complete for general n²×n² grids [1]. The standard conclusion is that no polynomial-time algorithm can solve all instances unless P = NP. We take a different approach: instead of asking whether Sudoku is "hard" or "easy," we ask how much of the solution space is actually hard.
The answer is structured. Near-full grids (few deletions) yield to constraint propagation. Near-empty grids (few givens) yield to symmetry-group search. Between these two polynomial zones lies an NP strip whose width we can measure exactly as a function of n. This is the sandwich.
The key innovation is cdcl(a) — a budgeted backtracking solver that takes the polynomial exponent as a parameter. The user chooses how much time to invest: O(n⁴) for light probing, O(n⁸) for aggressive search. The budget is a hard cap on DPLL nodes, making the solver polynomial by construction for any fixed a. This turns the P/NP boundary from a binary classification into a continuous dial.
An n²×n² Sudoku grid has n⁴ cells. A puzzle with d deletions retains n⁴ − d givens. The deletion count d parameterizes difficulty along a single axis.
We identify four zones on the deletion spectrum, each with distinct algorithmic character:
| Zone | Deletion range | Width | Character |
|---|---|---|---|
| P-Left | 0 → n³ | n³ | BCP cascade, subsets, fish, coloring, greedy |
| CDCL | n³ → n³+δ | δ, O(nᵃ)-dependent | Budgeted DPLL — polynomial backtracking |
| NP | n³+δ → n⁴−2n | n⁴−n³−2n−δ | Walks, dart, stochastic search |
| P-Right | n⁴−2n → n⁴ | 2n | Symmetry group search, seed fill |
Total polynomial coverage without CDCL: (n³ + 2n) / n⁴ = 1/n + 2/n³. CDCL extends this by consuming δ additional deletions from the left boundary, where δ depends on the exponent a and the instance structure.
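As a sanity check on the arithmetic, the coverage fraction can be evaluated exactly for small n (`poly_coverage` is a throwaway helper for illustration, not part of polyai):

```python
from fractions import Fraction

def poly_coverage(n):
    """Fraction of the deletion spectrum covered without CDCL:
    (n^3 + 2n) / n^4, which simplifies to 1/n + 2/n^3."""
    return Fraction(n**3 + 2 * n, n**4)

for n in (3, 4, 5):
    cov = poly_coverage(n)
    # verify the simplification 1/n + 2/n^3 term by term
    assert cov == Fraction(1, n) + Fraction(2, n**3)
    print(f"n={n}: coverage = {cov} ≈ {float(cov):.3f}")
```

The fractions 11/27, 9/32, 27/125 make the 1/n decay visible: coverage drops from roughly 41% at n = 3 to under 22% at n = 5.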
The P-Left zone implements six tiers of polynomial deduction, each strictly more powerful than the last:
| Tier | Technique | Complexity | Mechanism |
|---|---|---|---|
| 1 | BCP-0 | O(n⁴) | Naked + hidden singles cascade (unit propagation) |
| 2 | Naked/hidden subsets | O(n²·C(n²,k)) | Pairs, triples, quads — domain intersection |
| 3 | Pointing / box-line | O(n⁴) | Cross-group constraint tightening |
| 4 | Fish (X-Wing..Jellyfish) | O(n²·C(n²,k)) | Row-column cover sets |
| 5 | Simple coloring | O(n⁴) | Graph 2-coloring on conjugate pairs |
| 6 | XY-Wing | O(n⁶) | Triple-cell bivalue inference chain |
Additionally, greedy_assert (O(n⁴)) and greedy_multi (O(n⁴·tries)) provide fast-path solving for near-full grids where heuristic fill + validation suffices.
BCP-1 (O(n⁶)) operates at the boundary: for each unsolved cell, it probes each candidate value, triggers a full BCP cascade, and harvests contradictions as safe eliminations. This is equivalent to depth-1 DPLL without backtracking.
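A minimal sketch of the BCP-1 idea on a 4×4 (n = 2) grid: probe each candidate, cascade naked singles, and harvest contradictions as safe eliminations. The names (`assign`, `bcp1`, `PEERS`) and the singles-only cascade are illustrative simplifications, not polyai's actual API:

```python
from copy import deepcopy
from itertools import product

N = 2                      # box size; the grid is N²×N² = 4×4
SIDE = N * N

def make_peers(r, c):
    """Cells sharing a row, column, or box with (r, c)."""
    ps = {(r, i) for i in range(SIDE)} | {(i, c) for i in range(SIDE)}
    br, bc = r - r % N, c - c % N
    ps |= {(br + i, bc + j) for i in range(N) for j in range(N)}
    ps.discard((r, c))
    return ps

PEERS = {cell: make_peers(*cell) for cell in product(range(SIDE), repeat=2)}

def assign(domains, cell, v):
    """Assign v and cascade naked singles. Returns False on contradiction."""
    domains[cell] = {v}
    queue = [cell]
    while queue:
        cur = queue.pop()
        (val,) = domains[cur]
        for p in PEERS[cur]:
            if val in domains[p]:
                domains[p] = domains[p] - {val}
                if not domains[p]:
                    return False           # empty domain: refutation
                if len(domains[p]) == 1:
                    queue.append(p)        # new single, keep cascading
    return True

def bcp1(domains):
    """Depth-1 probing: any candidate whose assignment cascades into a
    contradiction is safely eliminated. No backtracking involved."""
    for cell in list(domains):
        for v in sorted(domains[cell]):
            if len(domains[cell]) == 1:
                break
            trial = deepcopy(domains)
            if not assign(trial, cell, v):
                domains[cell] = domains[cell] - {v}   # refutation proof
    return domains

# three givens in row 0 force the fourth cell by cascade alone
doms = {cell: set(range(1, SIDE + 1)) for cell in PEERS}
for col, v in enumerate((1, 2, 3)):
    assert assign(doms, (0, col), v)
assert doms[(0, 3)] == {4}
bcp1(doms)
```

Since BCP-1 only removes refuted candidates, it can never empty a domain on a satisfiable grid; it either shrinks domains or leaves them alone.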
The central contribution. cdcl(a) is a DPLL solver with a hard node budget of nᵃ. Each "node" is one speculative assignment followed by a full BCP cascade (hidden singles to fixpoint).
The key properties:
Polynomial by construction. For any fixed a, the solver performs at most nᵃ BCP cascades, each O(n⁴). Total work: O(nᵃ⁺⁴). This is polynomial for any constant a.
Sound. Every deduction is provably correct. Eliminations come from contradictions (refutation proofs). Assignments come from exhaustive elimination of alternatives. No guessing, no heuristic acceptance.
Monotonically stronger. Increasing a strictly enlarges the set of solvable instances. As a → ∞, cdcl(a) converges to full DPLL (exponential, solves everything). The exponent is a complexity dial between P and NP.
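The budget mechanism can be isolated in a few lines. This sketch runs budgeted DPLL with MRV branching on a generic CSP (a toy graph-coloring instance rather than Sudoku; `budgeted_dpll` and its signature are hypothetical, and one-step propagation stands in for the full BCP cascade):

```python
def budgeted_dpll(domains, conflicts, budget):
    """Backtracking search with a hard node cap: the cdcl(a) idea,
    with budget = n**a for a user-chosen exponent a.
    Returns (assignment or None, nodes_spent, budget_exhausted)."""
    state = {"nodes": 0, "exhausted": False}

    def search(dom):
        if all(len(s) == 1 for s in dom.values()):
            sol = {v: next(iter(s)) for v, s in dom.items()}
            # reject all-singleton states that still violate a constraint
            if all(sol[v] != sol[u] for v in conflicts for u in conflicts[v]):
                return sol
            return None
        if state["nodes"] >= budget:
            state["exhausted"] = True      # hard cap: stay polynomial
            return None
        # MRV: branch on the open variable with the smallest domain
        var = min((v for v in dom if len(dom[v]) > 1), key=lambda v: len(dom[v]))
        for val in sorted(dom[var]):
            state["nodes"] += 1            # one node = one speculative assignment
            trial = {v: set(s) for v, s in dom.items()}
            trial[var] = {val}
            ok = True
            for u in conflicts.get(var, ()):   # one-step propagation
                trial[u].discard(val)
                if not trial[u]:
                    ok = False
                    break
            if ok:
                res = search(trial)
                if res is not None:
                    return res
            if state["exhausted"]:
                return None
        return None

    return search(domains), state["nodes"], state["exhausted"]

# toy instance: properly 3-color a triangle
conflicts = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
sol, nodes, out_of_budget = budgeted_dpll(
    {v: {1, 2, 3} for v in "ABC"}, conflicts, budget=100)
```

With `budget=1` the same call returns `(None, 1, True)`: the solver gives up deterministically instead of blowing past its polynomial cap, which is exactly what makes the exponent a dial rather than a gamble.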
When deterministic polynomial techniques exhaust without solving, the NP zone deploys stochastic methods:
| Technique | Budget | Mechanism |
|---|---|---|
| uniform_walk | O(n⁴) steps | Random constraint walk: assign missing values, evict conflicts |
| complex_walk | O(n⁴) steps | Walk + periodic polynomial interleave (BCP + tiers 2–6) |
| dart | 5 attempts, 250ms | Random single-cell probe + full cascade — eliminates or solves |
These methods are not polynomial in the worst case — their expected runtime depends on instance structure. The 10x method insight [2]: walks are binary. They solve within the first 10% of their step budget or never. Cutting the budget by 10x and retrying therefore gives identical solve rates at 10x lower cost per failure.
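To make the walk idea concrete, here is a greedy random-walk sketch on a 4×4 grid: rows stay permutations, conflicts are counted over columns and boxes, and a swap is kept only if it does not worsen the count. This is a min-conflicts flavour of the technique, not polyai's exact uniform_walk:

```python
import random

N = 2
SIDE = N * N              # 4×4 grid, boxes of 2×2

def conflicts(grid):
    """Duplicate count over columns and boxes (rows stay permutations)."""
    bad = 0
    for c in range(SIDE):
        bad += SIDE - len({grid[r][c] for r in range(SIDE)})
    for br in range(0, SIDE, N):
        for bc in range(0, SIDE, N):
            bad += SIDE - len({grid[br + i][bc + j]
                               for i in range(N) for j in range(N)})
    return bad

def uniform_walk(grid, steps, rng):
    """Greedy random walk: swap two cells within a row, keep the swap only
    if the conflict count does not increase. Stops early at 0 conflicts."""
    best = conflicts(grid)
    for _ in range(steps):
        if best == 0:
            break
        r = rng.randrange(SIDE)
        c1, c2 = rng.sample(range(SIDE), 2)
        grid[r][c1], grid[r][c2] = grid[r][c2], grid[r][c1]
        cur = conflicts(grid)
        if cur <= best:
            best = cur
        else:
            grid[r][c1], grid[r][c2] = grid[r][c2], grid[r][c1]  # undo
    return grid, best

rng = random.Random(0)
start = [[1, 2, 3, 4] for _ in range(4)]   # heavily conflicted start
_, remaining = uniform_walk(start, steps=500, rng=rng)
```

Because worsening swaps are rejected, the conflict count is non-increasing; accepting equal moves lets the walk drift across plateaus instead of freezing on them.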
Near-empty grids (≤ 2n givens) are solved by searching the Sudoku symmetry group:
seed_fill fills an empty grid using a deterministic cyclic permutation: v(r,c) = (r·n + ⌊r/n⌋ + c) mod n² + 1. O(n⁴), sub-millisecond.
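The cyclic pattern can be checked directly. This standalone verification (helper names are illustrative) confirms the formula yields a valid grid for several box sizes:

```python
def seed_fill(n):
    """Fill an empty n²×n² grid with the cyclic pattern
    v(r, c) = (r*n + r//n + c) mod n² + 1."""
    side = n * n
    return [[(r * n + r // n + c) % side + 1 for c in range(side)]
            for r in range(side)]

def is_valid(grid, n):
    """True iff every row, column, and box is a permutation of 1..n²."""
    side = n * n
    want = set(range(1, side + 1))
    rows = all(set(row) == want for row in grid)
    cols = all({grid[r][c] for r in range(side)} == want for c in range(side))
    boxes = all({grid[br + i][bc + j] for i in range(n) for j in range(n)} == want
                for br in range(0, side, n) for bc in range(0, side, n))
    return rows and cols and boxes

for n in (2, 3, 4):
    assert is_valid(seed_fill(n), n)
```

Writing r = nq + s shows why it works: the row term contributes ns + q (mod n²), so rows, columns, and boxes each sweep all n² residues exactly once.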
transform_fill generates a random structural transform (band/stack/row/col permutations), then solves for the unique digit relabeling that matches all givens. O(n⁴) per attempt; a few thousand attempts expected for ≤ 2n givens.
Arto Inkala's "world's hardest Sudoku" (2012) is a standard benchmark for n=3. Without CDCL, it stalls after BCP-1 with 60 cells remaining.
| Exponent a | Budget (3ᵃ) | Solved | Time |
|---|---|---|---|
| 2 | 9 | No | 0.8ms |
| 3 | 27 | No | 2.8ms |
| 4 | 81 | No | 6.9ms |
| 5 | 243 | Yes (60 cells) | 19ms |
| 6 | 729 | Yes | 19ms |
| 8 | 6561 | Yes | 19ms |
Inkala cracks at exactly a = 5 (243 nodes). The remaining budget at a ≥ 6 is unused — the DPLL tree is shallow enough that 243 nodes suffice. This is a polynomial solve of the "world's hardest Sudoku" in 19ms.
The decisive experiment for the 1/n question: fix n = 4 and n = 5, sweep CDCL exponent a at multiple NP strip depths. If the failure boundary moves with a, the polynomial coverage is loose (1/n is a floor). If invariant, it might be tight (ceiling). 5 trials per config at n = 4, 3 at n = 5.
| NP% | a=0 | a=4 | a=5 | a=6 | a=7 | a=8 |
|---|---|---|---|---|---|---|
| 30 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| 50 | 2/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| 60 | 0/5 | 3/5 | 3/5 | 4/5 | 5/5 | 5/5 |
| 70 | 0/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
| 80 | 0/5 | 5/5 | 5/5 | 5/5 | 5/5 | 5/5 |
NP% = 60 is the smoking gun: 3/5 → 3/5 → 4/5 → 5/5 as a goes 4 → 5 → 6 → 7. The failure boundary moves monotonically with exponent. At a ≥ 7, the entire n = 4 NP strip is conquered.
| NP% | a=0 | a=5 | a=6 | a=7 | a=8 |
|---|---|---|---|---|---|
| 30 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 |
| 50 | 0/3 | 0/3 | 0/3 | 0/3 | 0/3 |
| 70 | 0/3 | 3/3 | 3/3 | 3/3 | 3/3 |
| 90 | 3/3 | 3/3 | 3/3 | 3/3 | 3/3 |
At n = 5, the mid-strip (NP% = 50) is a hard wall — a = 8 (390,625 nodes) can't crack it. Time grows from 9s to 33s per puzzle but solve rate stays 0/3. NP% = 70 jumps from 0/3 to 3/3 at a ≥ 5 (walks contribute), while NP% = 30 and 90 are trivially polynomial.
| Exponent | n=4 first failure | n=5 first failure |
|---|---|---|
| a=0 | NP%=50 | NP%=50 |
| a=4 | NP%=60 | — |
| a=5 | NP%=60 | NP%=50 |
| a=6 | NP%=60 | NP%=50 |
| a=7 | all solved | NP%=50 |
| a=8 | all solved | NP%=50 |
Verdict: 1/n is a floor, not a ceiling. At n = 4, the boundary visibly retreats with higher a. At n = 5, the tested exponent range (up to a = 8) isn't enough to see motion at NP% = 50, but the n = 4 evidence is clear — polynomial coverage is loose and higher exponents extend it. Whether any fixed a can conquer n = 5's mid-strip remains open.
cdcl(a) creates a continuous family of polynomial-time solvers parameterized by a. For any fixed a, the solver is O(nᵃ⁺⁴) — polynomial. But the set of instances solvable at exponent a grows monotonically with a.
This doesn't resolve P vs NP — for that we'd need a fixed a that captures everything. But it reframes the question: instead of "is Sudoku hard?", we ask "what exponent does this instance require?"
At n = 5, NP% = 50, even a = 8 (390,625 nodes) fails on all instances tested. The reason is combinatorial: the DPLL tree branches by domain size (up to n² = 25 values per cell), and mid-strip puzzles have many unsolved cells with large domains, so the effective search tree dwarfs any fixed polynomial node budget.
This is the heart of NP-hardness: the required exponent grows with instance difficulty, and mid-strip instances at large n require exponents that make the "polynomial" budget effectively exponential in the problem structure.
We tested whether Fourier variable scoring from the AUMA framework [2] could guide CDCL branching. The technique — probing each variable's marginal contribution via order-1 Walsh-Hadamard coefficients — works well on global optimization problems (see polyauma [5]).
Result: Fourier scoring matched but did not beat peer rarity (MRV) for Sudoku branching. The reason is structural: Sudoku constraints are local — each cell interacts with at most 3n²−2n−1 peers in its row, column, and box. The Fourier probe measures global sensitivity, but constraint propagation in Sudoku is driven by local bottlenecks (the cell with fewest candidates, not the cell with highest global influence).
This is a characterization of why the NP strip resists: Sudoku's constraint graph is sparse and local. Techniques that exploit global structure (Fourier probing, clause-weight optimization) see a flat landscape in the NP strip because the signal is distributed across many weakly-coupled local constraint families. The hardness is not in any single constraint but in the interaction between constraints that only polynomial-depth search can resolve.
The polynomial coverage formula (n³ + 2n) / n⁴ = 1/n + 2/n³ measures what current techniques achieve. This is a lower bound on polynomial coverage, not a proven ceiling.
Tightness would require a hardness proof: showing that no polynomial technique can push past n³ + O(n) deletions from the left or beyond 2n givens from the right. That is, the remaining Ω(n⁴) zone is irreducibly NP-hard after any polynomial preprocessing. No such proof exists.
The Fourier negative result supports looseness. "Local constraints don't produce global Fourier signal" explains why current techniques stall, but doesn't prove that no polynomial technique can do better. Order-2 interactions, cross-box coupling patterns, or entirely novel propagation strategies could extend the polynomial zones.
The decisive experiment: fix n = 4 and n = 5, sweep the CDCL exponent a at multiple NP strip depths (NP% = 30, 50, 70, 90), and plot the failure boundary as a function of a. If higher a pushes the boundary deeper into the strip, the polynomial coverage is loose and 1/n is merely a floor. If the boundary is invariant to a beyond a threshold, we have empirical evidence for tightness. The sweep above (up to a = 8) already shows the boundary retreating at n = 4; extending it to a = 12 and beyond would test whether n = 5's mid-strip ever moves.
The sandwich model generalizes beyond Sudoku. In the companion PolySAT framework (sat.aws.monce.ai), we apply the same philosophy to general SAT instances:
• 12 polynomial techniques (BCP, pure literal, 2-SAT via SCC, subsumption, vivification, variable elimination, probing, isomorphic transform, rephrase/Moulinette, budgeted CDCL)
• SHA-k benchmark: SHA-256 Tseitin encodings as a difficulty ladder. BCP resolves ~10% (constants/overhead), but the 64 mixing rounds are opaque to all polynomial techniques — confirming cryptographic hardness by construction.
• The Moulinette: segmented clause rephrasing with 93.5% structural novelty, zero hardness reduction — reformulation doesn't help when the structure is inherently hard.
The consistent finding: polynomial techniques are surprisingly powerful on structured instances, but hit a sharp wall on problems designed for hardness. The wall is real. So is the territory before it.
polyai is 2365 lines of Python with zero dependencies. The solver library
(polyai/) contains 8 modules:
| Module | Lines | Role |
|---|---|---|
| sudoku.py | 181 | Topological constraint state — domains, peers, assign/eliminate cascade |
| core.py | 87 | Deduction and SolveResult types |
| solver.py | ~330 | Orchestrator — routes by deletion count, main BCP loop |
| solver_PL.py | 556 | P-Left: tiers 1–6 + greedy |
| solver_NP.py | ~860 | NP strip: walks, dart, BCP-1, CDCL |
| solver_PR.py | 300 | P-Right: seed_fill, transform_fill |
| grid.py | ~50 | Grid rendering |
The API (api/) adds ~1700 lines for FastAPI routes, puzzle generation,
the climb benchmark engine, and the four HTML pages you're reading now.
Deployed on EC2 (c6i.4xlarge) with gunicorn + 4 workers.
All endpoints accept and return JSON.
[1] Yato, T. & Seta, T. (2003). "Complexity and completeness of finding another solution and its application to puzzles." IEICE Trans. Fundamentals, E86-A(5), 1052–1060.
[2] Dana, C. (2023). "An O(2^(n/2)) Universal Maximization Algorithm." École Polytechnique.
[3] Dana, C. & Claude (2026). "PolySAT: On Polynomial SAT Solving." sat.aws.monce.ai.
[4] Qadir, Dana et al. (2025). "Supervised machine learning identifies impaired mitochondrial quality control in β cells with development of type 2 diabetes." bioRxiv, doi:10.1101/2025.10.22.683335. Accepted at Springer.
[5] Dana, C. & Claude (2026). "polyauma: Universal Maximization in Polynomial Budget." auma.aws.monce.ai. Live API + library implementing AUMA's polynomial pipeline: Fourier probing, greedy coordinate descent, SA, Nelder-Mead within O(na) evaluation budget.