ChatGPT Lost 63% Trying To Trade Crypto — But One China AI Made A Healthy Profit

OpenAI's ChatGPT lost 63% of its funds in a two-week crypto trading competition organized by Nof1, finishing last among six large language models (LLMs), according to Protos.


AI Bots Test Crypto Trading Skills

The "Alpha Arena" contest, which ended Monday, tasked six leading AI systems with trading digital assets using identical prompts and limited datasets. 

ChatGPT, Gemini from Google parent Alphabet (GOOGL), xAI's Grok, and Anthropic's Claude Sonnet all ended in the red.

By contrast, Qwen3 Max from Alibaba (BABA) topped the leaderboard with a $2,232 profit, followed by DeepSeek, which gained $489.

The rest saw steep losses — ChatGPT down $6,267, Gemini down $5,671, Grok down $4,531, and Claude down $3,081, from their $10,000 starting balances.
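To put the dollar figures on a common scale, the short Python sketch below converts each reported profit or loss into a percentage return. It assumes all six models started with the same $10,000 (the article states this balance for the four losing models) and the names `pnl_usd` and `percent_return` are illustrative, not anything published by Nof1.

```python
# Dollar profit or loss per model, as reported in the article.
# Assumption: every model started with the same $10,000 balance.
STARTING_BALANCE = 10_000

pnl_usd = {
    "Qwen3 Max": 2_232,
    "DeepSeek": 489,
    "Claude": -3_081,
    "Grok": -4_531,
    "Gemini": -5_671,
    "ChatGPT": -6_267,
}


def percent_return(pnl, start=STARTING_BALANCE):
    """Return profit or loss as a percentage of the starting balance."""
    return 100.0 * pnl / start


returns = {model: percent_return(pnl) for model, pnl in pnl_usd.items()}

# Sort from best to worst performer.
leaderboard = sorted(returns.items(), key=lambda kv: kv[1], reverse=True)

for model, pct in leaderboard:
    print(f"{model:<10} {pct:+7.2f}%")
```

On these inputs, ChatGPT's $6,267 loss works out to roughly the 63% decline cited in the headline.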

Trading Costs Erode AI Performance

Nof1 said profits were "dominated by trading costs in early runs" as agents over-traded and took small gains that fees erased. 

Gemini recorded 238 trades, while Claude made only 38. Across all six models, win rates ranged between 25% and 30%.
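A low win rate is not fatal on its own, but it raises the bar for how large winning trades must be. As a rough illustration, the sketch below uses the standard break-even identity: with win rate p, the average win must be at least (1 - p) / p times the average loss for a strategy to break even before fees. This is general trading arithmetic, not a figure reported by Nof1, and the function name is hypothetical.

```python
def breakeven_payoff_ratio(win_rate):
    """Average win / average loss needed to break even, ignoring fees.

    Derived from setting expected value per trade to zero:
        win_rate * avg_win - (1 - win_rate) * avg_loss = 0
    """
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return (1.0 - win_rate) / win_rate


# Win-rate range reported across the six models.
for win_rate in (0.25, 0.30):
    ratio = breakeven_payoff_ratio(win_rate)
    print(f"win rate {win_rate:.0%}: average win must be {ratio:.2f}x average loss")
```

At the reported win rates, winners would need to be roughly 2.3 to 3 times the size of losers just to break even, and trading fees push that threshold higher still.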

Qwen3 Max incurred the highest total fees at $1,654 but still outperformed its peers thanks to its disciplined trade selection. 

The Chinese model's consistent profitability contrasts sharply with ChatGPT's heavy losses, underscoring divergent risk behavior among LLMs under identical conditions.

Organizers Call It A Stress Test For AI

Nof1 founder Jay Azhang described the event as a controlled stress test for generative AI systems. 

"LLMs don't really handle numerical time-series data very well, but that's all the context we gave them," Azhang said, noting that each model faced "strict rules and limited context windows."

He added that every AI displayed a unique "investing personality," suggesting predictable tendencies in how language models approach markets. 

Azhang plans to host another round of the contest with refined prompts and greater statistical rigor.

Why It Matters

The contest shows that language models can sound confident yet fail when real money is on the line.

LLMs processed the same charts and data, yet their results diverged like human traders with different risk habits.

Qwen3 Max succeeded not through speed but through disciplined trade selection, suggesting that discipline can beat prediction.

ChatGPT's loss highlights that market execution matters more than ideas or narrative.

Investors are learning that AI can help analyze markets, but cannot replace strategy or risk management.
