News
500 Blog Posts To Learn About LLMs
Are LLMs a Higher Level of Abstraction? No, And Here's Why
Specifically, I am seeing the claim that LLMs are the next step in the abstractions we had, going from programming in binary to pr...
The LLM Stack Decision Nobody Makes Cleanly
There is a meeting that happens in almost every team building with LLMs. Someone puts four boxes on a whiteboard: prompt engineeri...
From Prompts to Harnesses: How AI Engineering Has Grown Up
AI engineering has gone through three stages, and knowing where you are in that progression tells you where to put your energy nex...
How to Teach the LLM to Think With Your Data
This approach misses the real strength of LLMs. Instead of exposing raw RAG output, we should feed the retrieval knowledge back in...
How I Built a Self-Maintaining Knowledge Base for 6 Projects Using Claude Code &...
How I got Claude Code to maintain a self-updating wiki across 6 projects — 192 pages bootstrapped in ~2 hours — so every new sess...
How Frontier Labs Use FP8 to Train Faster and Spend Less
A practical look at FP8 in LLM pretraining: how it works, where to apply it, what to watch out for, and what speedups you can real...
The Intelligence Paradox: Why We're Building LLMs Wrong (And How to Fix It)
LLMs aren’t failing because they’re small—they’re failing because scale is mistaken for intelligence. Benchmarks don’t reflect rea...
You Should Stop Fine-Tuning Blindly: What to Do Instead
Fine-tuning is not one thing. You’re choosing a point on a spectrum: Full FT → PEFT (Adapters/Prompt Tuning/LoRA) → QLoRA → Prefer...
AI Doesn’t Lie - It Reflects: How Fragmented Signals Distort What LLMs Think Your...
AI systems don’t “understand” your company—they reconstruct it from public signals. When those signals are fragmented, outdated,...
I Built a RAG System for Our Analytics Team. It Worked Great Until We Added Real...
RAG was supposed to help analysts and stakeholders get answers about our data. But the system was wrong in ways that would have co...
Building Composable Safety and Performance Layers for Agents in Rust
The LLM Pipelines architecture is a middleware-based pipeline architecture for AI agents. It was inspired by how web frameworks so...
