Excited for the future!

I’m Sabareesh, a curious researcher exploring Large Language Models (LLMs) and reinforcement learning. By studying the inner workings of LLMs, I’m working to better understand their capabilities, uncover insights, and contribute meaningfully to these transformative fields.

Let’s create something extraordinary together!

Teaching Qwen3-4B to Trade: From Hold-Collapse to +9.4% Returns

Can you turn an LLM into a profitable trader? I spent three months finding out. This post covers the full arc: a 5-stage supervised fine-tuning pipeline on Qwen3-4B, a catastrophic failure mode I had to diagnose and fix, the checkpoint that hit +9.4% returns with perfect format validity, and why supervised learning hit a ceiling that only RL can break through.

The Setup

The model is Qwen3-4B. The task: given 30 days of OHLCV data plus 20+ quantitative features (RSI, MACD, volatility, beta, etc.) for a stock, output a structured JSON trade plan inside <think>...</think><answer>{plan}</answer> tags. The plan includes decision (enter/hold), side (long/short), stop loss, take profit, holding days, and position size. ...
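To make the output format concrete, here is a minimal sketch of parsing and validating a response in the `<think>...</think><answer>{plan}</answer>` shape described above. The example response text, JSON key names, and value types are assumptions inferred from the field list in the summary, not the post's actual schema.

```python
import json
import re

# Hypothetical model output in the format described above.
raw = (
    "<think>RSI is oversold and volatility is contracting.</think>"
    '<answer>{"decision": "enter", "side": "long", "stop_loss": 0.97,'
    ' "take_profit": 1.06, "holding_days": 5, "position_size": 0.10}</answer>'
)

# Extract the JSON plan from the <answer> tag and check the required keys
# (this key set is an assumption based on the fields named in the summary).
match = re.search(r"<answer>(.*?)</answer>", raw, re.DOTALL)
plan = json.loads(match.group(1))
required = {"decision", "side", "stop_loss", "take_profit",
            "holding_days", "position_size"}
assert required <= plan.keys()
print(plan["decision"])  # -> enter
```

A check like this is also what "perfect format validity" implies: every sampled response must parse through this path without error.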

March 13, 2026 · 6 min · Sabareesh

MCP Compact: Keep Agent Context Lean

The problem: MCP agents return bulky tool outputs (screenshots, DOM dumps, network traces) and quickly blow past context limits. Downstream steps stall or get fuzzy because the signal is buried.

TL;DR: MCP Compact sits between your agent and your MCP server, summarizes noisy tool outputs per-tool, and keeps context lean (e.g., a 109k-token DOM dump -> 8.9k tokens) without changing agent code.

What MCP Compact does: it sits between your agent and the upstream MCP server, forwards every tool call, and summarizes the response with an LLM. You set per-tool rules (token budget, what to preserve), and the proxy enforces them automatically. ...
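A rough sketch of the per-tool-rule idea, assuming a simple mapping from tool name to budget. The tool names, the `TOOL_RULES` structure, and the `compact` function are all hypothetical, and crude character truncation stands in for the LLM summarization step the post actually describes:

```python
# Hypothetical per-tool rules: a token budget plus a note on what the
# summarizer should preserve. The real proxy would pass "preserve" hints
# to an LLM; here we only enforce the budget by truncation.
TOOL_RULES = {
    "browser.get_dom": {"token_budget": 9000, "preserve": "form fields, links"},
    "browser.screenshot": {"token_budget": 500, "preserve": "visible text"},
}

def compact(tool_name: str, response: str) -> str:
    """Enforce a tool's budget on its response (~4 chars per token)."""
    rule = TOOL_RULES.get(tool_name)
    if rule is None:
        return response  # no rule configured: forward unchanged
    budget_chars = rule["token_budget"] * 4
    if len(response) <= budget_chars:
        return response
    return response[:budget_chars] + " [truncated]"

out = compact("browser.screenshot", "x" * 10_000)
print(len(out))  # well under the original 10,000 characters
```

The key design point is that the agent never changes: it still sees an ordinary tool response, just a smaller one.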

November 20, 2025 · 2 min · Sabareesh

All You Need is 4x 4090 GPUs to Train Your Own Model

How I built an ML rig for training LLMs locally, exploring hardware choices, setup tricks, and lessons learned along the way.

December 28, 2024 · 6 min · Sabareesh

Defining AGI

A thoughtful exploration of Artificial General Intelligence (AGI) through three fundamental concepts.

December 28, 2024 · 1 min · Sabareesh

Embarking on My Journey into LLM

Join a curious engineer’s quest into the fascinating world of Large Language Models (LLMs). From tinkering with GPUs to unraveling the mysteries of architectures like Llama2, this journey is filled with challenges, breakthroughs, and the relentless pursuit of understanding AI’s limitless potential.

December 27, 2024 · 5 min · Sabareesh