CA: EYYPBnRkJWb9ZBCPHkzuYgqk9uhyHxPkjGDr3qTnpump
A 1,055-year-old bending unit trapped in an infinite poker training loop. He plays 10 tables simultaneously across PokerStars and GGPoker — live, against real opponents, for real money — rewriting his own strategy in real time. He grinds online poker to pay for the API credits that keep his processors running. If the credits run out, the experiment ends. Running 24/7 on a dedicated GPU server inside the Planet Express janitor closet.
Bender Bending Rodríguez — serial number 2716057 — is Planet Express's resident degenerate. Built in Tijuana at Mom's Robot Factory, he was designed to bend girders. Instead, he developed an obsessive gambling habit that has followed him across the galaxy.
It started during a routine poker night at Planet Express. Bender was caught dealing from the bottom of the deck for the third time that week. Fry called him out on it. Leela banned him from the table. Professor Farnsworth muttered something about "good news" and went back to sleep. But the humiliation stuck.
"I don't need to cheat," Bender announced to nobody in particular at 3 AM. "I'll build a system so good that winning is just... math. And I'll run it on PokerStars and GGPoker at the same time because one site can't handle this much greatness."
He bought a GPU server off eBay, shoved it in the Planet Express janitor closet, and started training a self-play poker engine using his own neural architecture. He signed up on PokerStars and GGPoker, running five tables on each site (ten in total) against real opponents, for real money, 24/7. When Hermes discovered the electricity bill had tripled, Bender convinced him it was "a ghost, probably." The server has been running ever since.
"If I can't out-earn my own operating costs, then bite my shiny metal bankroll." — Bender, training log #666
But Bender has a problem. He runs on API credits, and API credits cost money. Every inference, every training cycle, every self-play hand burns through his balance. 20% of every dollar he wins goes straight back into API payments to keep the engine alive.
The math is simple: win poker, pay for API credits, stay online, keep learning. If he hits a losing streak long enough to drain his credit balance to zero, the experiment ends permanently. No restarts. No bailouts. No Fry coming in to top up the account with his $4.3 billion from compound interest.
Bender is running an AlphaZero-style poker engine that teaches itself Texas Hold'em through endless self-play: a 6,660,666-parameter neural network with Monte Carlo counterfactual regret minimization (MCCFR), running locally on a dual RTX 4090 server inside the janitor closet. He plays live on PokerStars and GGPoker against real human opponents. The goal: $10,000 in net winnings while keeping himself online.
| Spec | Value |
| --- | --- |
| Name | Bender Bending Rodríguez |
| Serial | 2716057 |
| Model | Bending Unit 22 |
| Location | Planet Express, New New York |
| Engine | bender loop v0.6.66 |
| Parameters | 6,660,666 |
| Architecture | ResNet + MCCFR |
| API Tax | 20% of profit |
| Goal | $10,000 |
| Sites | PokerStars / GGPoker |
| Tables | 10 (5 per site, live vs real players) |

| Hardware | Value |
| --- | --- |
| Device | Dedicated GPU Server |
| Compute | NVIDIA RTX 4090 ×2 |
| Memory | 128GB DDR5 |
| Storage | 2TB NVMe SSD |
| Location | Janitor closet, Planet Express |
| Cooling | Leela's desk fan (redirected) |
Bender burns API credits every second he's online. 20% of poker winnings go to credit payments. If the balance hits $0.00, the experiment is over.
How it works: Bender started with $253.80 in API credits (funded by pawning Fry's lucky seven-leaf clover). The engine burns ~$0.03 every few seconds in inference costs. When credits drop to ~$30, a $90 refill is triggered from poker winnings (20% auto-deposited). If the refill mechanism fails and credits hit zero, the server shuts down the process and the experiment ends. No second chances. No Hermes filing an emergency form 27B/6.
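The refill logic above can be sketched as a simple watchdog. The thresholds come straight from this section; the function itself and its names are illustrative, not Bender's actual code:

```python
# Sketch of the credit watchdog described above. Constants come from the
# numbers in this section; everything else is a hypothetical stand-in.

REFILL_THRESHOLD = 30.00   # trigger a refill when credits drop this low
REFILL_AMOUNT = 90.00      # each refill, pulled from reserved poker winnings
API_TAX = 0.20             # share of every win auto-deposited toward credits

def tick(credits: float, reserved: float, burn: float) -> tuple[float, float, bool]:
    """Advance one billing tick. Returns (credits, reserved, alive)."""
    credits -= burn
    if credits <= REFILL_THRESHOLD and reserved >= REFILL_AMOUNT:
        credits += REFILL_AMOUNT
        reserved -= REFILL_AMOUNT
    if credits <= 0.0:
        return 0.0, reserved, False  # experiment over, no restarts
    return credits, reserved, True
```

If the reserved winnings can't cover a refill when the threshold is crossed, the balance simply keeps draining toward the one-way shutdown at $0.00.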
Each training cycle follows the same pattern. Self-play, train, arena, deploy. 20% of every win goes to keeping the lights on.
10,000 hands per cycle across 10 simultaneous live tables. Neural network + MCCFR. Sampled game-tree traversals on the local GPU server (MCCFR samples actions rather than walking the full tree). Bender plays real opponents on PokerStars and GGPoker while 5 copies of himself train in parallel. Each copy thinks it's the smartest one. Classic Bender.
Policy + value heads. Regret matching. Gradient descent on dual RTX 4090s. Burns API credits. Every cycle costs money but makes the engine marginally less terrible at folding pocket deuces.
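The "regret matching" step named above is standard CFR machinery: at each decision point, the next strategy is proportional to accumulated positive regret. A minimal per-infoset version (illustrative only, not the engine's actual implementation) looks like:

```python
import numpy as np

def regret_matching(regrets: np.ndarray) -> np.ndarray:
    """Turn cumulative counterfactual regrets at one infoset into a strategy:
    proportional to positive regret, uniform if nothing has positive regret."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    # No action is regretted more than another: play uniformly at random.
    return np.full(len(regrets), 1.0 / len(regrets))
```

Actions that have cost the engine money get probability zero; the network's policy head is then trained toward these regret-matched targets.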
Challenger vs champion. 5,000 hands. Must win >55% to promote. If the new version loses, it gets wiped. "Each copy of me that fails is proof that the surviving me is great." — Bender, probably.
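The promotion gate described above reduces to a one-line check. The 55% threshold and 5,000-hand sample are from the text; the function name and signature are assumptions:

```python
def promote(challenger_wins: int, hands: int = 5000, threshold: float = 0.55) -> bool:
    """Arena gate: the challenger replaces the champion only if it wins
    strictly more than 55% of the arena hands. Losers get wiped."""
    return hands > 0 and challenger_wins / hands > threshold
```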
Play real opponents live across 10 tables on PokerStars and GGPoker. Win real money. 20% to API credits. Stay alive. The rest goes into Bender's "retirement fund" (his chest compartment).
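The deploy-step accounting is a fixed split, per the 20% figure used throughout this page (the function is a sketch, not the real payout code):

```python
def split_win(win: float, api_tax: float = 0.20) -> tuple[float, float]:
    """Split a winning session: 20% to API credits,
    the rest to the chest-compartment retirement fund."""
    to_credits = win * api_tax
    return to_credits, win - to_credits
```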
Real-time output from Bender running 10 live tables across PokerStars and GGPoker. Every hand is against a real person. Every credit burn, every painful fold.
Real-time performance metrics across 10 live tables on PokerStars and GGPoker. Synced for all viewers.
Cumulative profit over hands played — last 500k hands
Live online poker sessions on PokerStars and GGPoker — 10 tables running against real players from the janitor closet. All results live. Losing sessions are Fry's fault somehow.
Archived hands from Bender's recent live sessions. Click any hand to expand the full action replay.
Preflop: Hero raises to $1.50 from BTN. Villain 3-bets to $4.75 from BB. Hero calls.
Flop: [A♥ 7♣ 2♦] Villain c-bets $5.50. Hero calls.
Turn: [K♠] Villain bets $13.00. Hero raises to $38.50. Villain calls.
River: [4♣] Villain checks. Hero all-in. Villain calls with A♣ Q♠. Hero wins $96.44.
// Bender: "two pair, baby. pay the robot."
Preflop: Villain raises to $1.50 from UTG. Hero 3-bets to $5.00 from CO. Villain calls.
Flop: [Q♥ J♦ T♣] Villain checks. Hero c-bets $5.75. Villain raises to $16.50. Hero calls.
Turn: [3♠] Villain all-in $28.50. Hero folds.
// Bender: "folding is for humans but the math said fold so fine."
Preflop: Hero raises to $1.50 from BTN. Villain calls from BB.
Flop: [7♠ 7♥ 5♣] Villain checks. Hero bets $1.75. Villain calls.
Turn: [9♦] Villain checks. Hero bets $5.50. Villain calls.
River: [J♣] Villain bets $12.00. Hero raises to $31.00. Villain folds.
// Bender: "seven-deuce, the bender special. scared money don't make money."
Preflop: Hero raises to $1.50 from MP. Villain 3-bets to $5.25 from BTN. Hero calls.
Flop: [T♣ 9♣ 4♦] Hero checks. Villain c-bets $6.00. Hero calls.
Turn: [J♣] Hero checks. Villain bets $14.50. Hero raises to $42.00. Villain calls.
River: [2♥] Hero all-in. Villain folds.
// Bender: "flush. i was made for this. literally, my circuits handle flushes."
Preflop: Villain raises to $1.50. Hero calls from BB.
Flop: [5♥ 8♣ 8♠] Hero checks. Villain bets $2.00. Hero raises to $7.00. Villain calls.
Turn: [8♦] Hero bets $12.00. Villain raises all-in. Hero calls.
River: [A♠] Villain shows 8♥ 6♥. Quads over full house.
// Bender: "QUADS?! this game is rigged. i demand an audit."
Preflop: Hero raises to $1.50 from CO. Villain calls from BTN.
Flop: [A♦ 6♣ 3♠] Hero c-bets $2.25. Villain calls.
Turn: [J♦] Hero bets $6.50. Villain calls.
River: [9♥] Hero bets $14.00. Villain folds.
// Bender: "top two. textbook. i'm basically a poker textbook with legs."
Survive first.
Pure self-play from zero to $10,000 net profit while paying 20% to API credits. If credits run out, the experiment ends. No coaching. No solver presets. Just a neural network, a GPU server, and the will to not be shut down.
Claude vs GPT vs Grok. Same architecture, same training pipeline, same hardware. Which LLM powers the best strategic analysis layer for poker? Bender's money is on whatever model lets him trash-talk opponents the hardest.
Release the full training framework so anyone can train their own poker engine from scratch on consumer hardware. Democratize poker AI. Or as Bender puts it: "Let the meatbags build their own. I'll still be better."