Blog
CoT Reasoning Models – Which One Reigns Supreme in 2025?
A comprehensive analysis of o3-mini-high vs. Claude 3.7 Sonnet Thinking vs. Grok 3 Think vs. DeepSeek R1 on multiple reasoning, math,…
March 8, 2025
OpenAI GPT-4.5 vs. Claude 3.7 Sonnet
After a long wait, OpenAI finally unveiled GPT-4.5, its biggest-ever base model. The initial vibe checks from taste testers have been outstanding. The…
March 6, 2025
Claude 3.7 Sonnet thinking vs. Deepseek r1
So, Anthropic finally broke the silence and released Claude 3.7 Sonnet, a hybrid model that can think step-by-step like a thinking model…
March 1, 2025
Claude 3.7 Sonnet vs. Grok 3 vs. o3-mini-high
Just a week after Grok’s release, we now have Claude 3.7 Sonnet, which has certainly eaten into Grok’s hype pie. Grok was definitely…
February 27, 2025
Grok 3 vs. Deepseek r1
After much anticipation, xAI has finally released the third iteration of Grok. It is apparently the smartest LLM in the world, scoring…
February 21, 2025
OpenAI o3-mini vs o1 vs Deepseek r1
OpenAI launched its latest model, o3-mini, last Friday. It is the first member of the o3 family of models. There are…
February 8, 2025
Notes on the new Deepseek r1
Before anything, let’s bow to Richard Sutton; he was so early for this. Pure RL, neither Monte-Carlo tree search (MCTS) nor Process…
January 24, 2025
Notes on the new Deepseek v3
DeepSeek released its flagship model, v3, a 671B mixture-of-experts model with 37B active parameters. Currently, it is the best open-source model, beating…
January 1, 2025
How 11x added $4.2M in enterprise deals and saved 380 engineering hours with Composio
$4.2M increase in ARR pipeline; 34% ↑ improvement…
December 30, 2024
How Composio Helped Gorilla from UC Berkeley Ship AgentArena
Saving 100s of engineering hours; 80% ↑ increase in tool-calling accuracy; 100+ development…
December 17, 2024
Gemini 2.0 vs Flash vs OpenAI o1 and Claude 3.5 Sonnet
Google has finally woken up and decided to drop the bombshell Gemini 2.0, completing the AI trifecta. Google has launched two new…
December 17, 2024
OpenAI o1 vs Claude 3.5 Sonnet: Which One’s Really Worth Your $20?
It’s been a week since OpenAI o1 was out of preview, and along with that, OpenAI has also introduced a new tier,…
December 12, 2024