Which LLM to Choose?
I broke down which one you should choose for each task.
We have officially reached “subscription fatigue.”
Every week a new model drops, claiming to be the “GPT-Killer.” You cannot subscribe to all of them. Nor should you.
I’ve spent the last month running the same prompts across every major frontier model to answer one question: Which one is actually worth the money?
The results were surprising. The gap between “good” and “great” is widening, and for the first time, OpenAI isn’t sitting alone at the top.
Below is the definitive ranking of the 8 major models, scored out of 80 across eight categories rated out of 10 each: coding, reasoning, math, speed, cost, context, web search, and ecosystem.
The Leaderboard
1. Gemini 3 Pro — 71/80
Best reasoning model available. First to break 1500 on the LMArena leaderboard. Wins most benchmark tests. Handles text, images, video, and audio together. Massive 1M-token context window.
Coding: █████████░ 9/10
Reasoning: ██████████ 10/10
Math: █████████░ 9/10
Speed: █████████░ 9/10
Cost: ███████░░░ 7/10
Context: ██████████ 10/10
Web Search: █████████░ 9/10
Ecosystem: ████████░░ 8/10
2. Claude Sonnet 4.5 — 63/80
World’s best coding model. Fixes real GitHub bugs better than any competitor. Runs autonomous tasks for 30+ hours straight. Zero errors on code editing tests.
Coding: ██████████ 10/10
Reasoning: █████████░ 9/10
Math: ███████░░░ 7/10
Speed: ███████░░░ 7/10
Cost: ██████░░░░ 6/10
Context: ██████████ 10/10
Web Search: ██████░░░░ 6/10
Ecosystem: ████████░░ 8/10
3. GPT-5 — 63/80
Best developer tools and integrations. Automatically switches between fast mode and thinking mode. Biggest ecosystem with most third-party support. Works everywhere.
Coding: ██████████ 10/10
Reasoning: ██████████ 10/10
Math: █████████░ 9/10
Speed: ████████░░ 8/10
Cost: ████░░░░░░ 4/10
Context: ██████░░░░ 6/10
Web Search: ██████░░░░ 6/10
Ecosystem: ██████████ 10/10
4. Perplexity Pro — 58/80
One subscription gets you GPT-5, Claude, Gemini and more. Best web search with live citations. Perfect for research. No need to pick models yourself.
Coding: ████████░░ 8/10
Reasoning: ████████░░ 8/10
Math: ████████░░ 8/10
Speed: ███████░░░ 7/10
Cost: ████░░░░░░ 4/10
Context: ███████░░░ 7/10
Web Search: ██████████ 10/10
Ecosystem: ██████░░░░ 6/10
5. Grok 4.1 — 55/80
Most human-like conversations. Ranks #1 for personality and creativity. Plugged into X for real-time info. Reduced mistakes by 66%. Best creative writing.
Coding: ████████░░ 8/10
Reasoning: ███████░░░ 7/10
Math: ███████░░░ 7/10
Speed: ████████░░ 8/10
Cost: ██████░░░░ 6/10
Context: █████░░░░░ 5/10
Web Search: █████████░ 9/10
Ecosystem: █████░░░░░ 5/10
6. Meta AI — 54/80
Llama 4 powers Facebook, Instagram, WhatsApp. Handles 1M tokens at once. Open source means you can customize everything. Free but underperforms in real-world use.
Coding: █████░░░░░ 5/10
Reasoning: █████░░░░░ 5/10
Math: ███████░░░ 7/10
Speed: ███████░░░ 7/10
Cost: █████████░ 9/10
Context: ██████████ 10/10
Web Search: ████░░░░░░ 4/10
Ecosystem: ███████░░░ 7/10
7. DeepSeek V3.2 — 51/80
Dominated math and programming competitions. Gold-medal-level results at IMO, IOI, ICPC, CMO. Beats GPT-5 at pure math. 10x cheaper than competitors. Open source and free to modify.
Coding: █████████░ 9/10
Reasoning: █████████░ 9/10
Math: ██████████ 10/10
Speed: ███░░░░░░░ 3/10
Cost: ██████████ 10/10
Context: █████░░░░░ 5/10
Web Search: █░░░░░░░░░ 1/10
Ecosystem: ████░░░░░░ 4/10
8. Copilot — 49/80
GPT-5 but slower and more restricted. Needs Microsoft 365 for best features. Only searches your OneDrive files. Good for enterprises already using Microsoft.
Coding: ████████░░ 8/10
Reasoning: ████████░░ 8/10
Math: ████████░░ 8/10
Speed: ██████░░░░ 6/10
Cost: ███░░░░░░░ 3/10
Context: █████░░░░░ 5/10
Web Search: █████░░░░░ 5/10
Ecosystem: ██████░░░░ 6/10
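Each total above is simply the sum of that model's eight category scores. As a quick sanity check, here is a minimal Python sketch that recomputes the totals from the scorecards. The names CATEGORIES, SCORES, and rank are my own illustrative choices, not part of any tool, and the optional category filter is an assumption about how you might re-rank on only the things you care about.

```python
# Category order used in every scorecard above.
CATEGORIES = ("Coding", "Reasoning", "Math", "Speed",
              "Cost", "Context", "Web Search", "Ecosystem")

# Scores copied from the leaderboard, listed in CATEGORIES order.
SCORES = {
    "Gemini 3 Pro":      (9, 10, 9, 9, 7, 10, 9, 8),
    "Claude Sonnet 4.5": (10, 9, 7, 7, 6, 10, 6, 8),
    "GPT-5":             (10, 10, 9, 8, 4, 6, 6, 10),
    "Perplexity Pro":    (8, 8, 8, 7, 4, 7, 10, 6),
    "Grok 4.1":          (8, 7, 7, 8, 6, 5, 9, 5),
    "Meta AI":           (5, 5, 7, 7, 9, 10, 4, 7),
    "DeepSeek V3.2":     (9, 9, 10, 3, 10, 5, 1, 4),
    "Copilot":           (8, 8, 8, 6, 3, 5, 5, 6),
}


def rank(categories=CATEGORIES):
    """Return (model, score) pairs, highest first, summing only the given categories."""
    idx = [CATEGORIES.index(c) for c in categories]
    totals = {model: sum(s[i] for i in idx) for model, s in SCORES.items()}
    # sorted() is stable, so tied models keep their leaderboard order.
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for model, total in rank():
        print(f"{model}: {total}/80")
```

Calling rank() with no arguments reproduces the leaderboard order. Passing a subset, for example rank(("Coding", "Context")), re-ranks on just those categories. It is an unweighted sum, so treat it as a rough filter and not a verdict.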
If you can only pay for one subscription:
Get Perplexity Pro. It gives you “good enough” access to the top models (GPT-5 and Claude) while providing the best web search experience on the planet.
If you are a Developer:
Get Claude Sonnet 4.5. The coding capabilities and the “Projects” feature for organising massive codebases are indispensable.
If you need reasoning and multimodal (video/audio):
Get Gemini 3 Pro. It is currently the smartest model I tested, with the top overall score (71/80), a 10/10 reasoning score matched only by GPT-5, and a 1M-token context window.
I’m using Gemini 3 Pro for almost all my tasks now. I actually can’t believe the day has come that another AI has dethroned ChatGPT for me.
But if you’re like me - a professional who subscribes to multiple frontier models because each one offers a unique, task-smashing feature - then a different strategy is required. The true edge isn’t in having one “GPT-Killer,” but in intelligently combining the unique strengths of the top players to build a hyper-efficient, multi-model workflow:



