I hate how he called out terminal user interfaces as shit… then proved web interfaces to be superior. Damn him. I love working from my terminal, but having AI prove itself through HTML reports including video, images, metrics, charts, and text is goated. Rethinking yourself as the bottleneck, not the orchestrator, feels real. Validating the work is hard; there’s a shift right now and everyone is trying to figure it out. Lucas’s technique is a little bit of “be lazy and tell it to prove itself to you,” so as you juggle your 15 agents you have a nice report to read.
Posts tagged: agents
All posts with the tag "agents"
This is a really good guide, with quite a few good nuggets. I need to try deleting my AGENTS.md and rebuilding it from scratch more often. I liked how he talked about having agents prove their work and telling them up front how they will be judged. What I didn’t care for so much was the feeling that a lot of the rules go in markdown. That’s not a rule, that’s a suggestion. Rules should be deterministic. They should be tests and linters that ensure they are followed. Suggestions are good, but don’t trust the agents to always follow them. And don’t trust that they won’t change your rules; keep them honest.
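A minimal sketch of what “rules as tests” could look like, assuming a made-up rule (no bare `except:` clauses) and a pinned hash of the rules file. The names `find_violations`, `fingerprint`, and `rules_unchanged` are hypothetical and not from the guide; a real project would more likely lean on an existing linter.

```python
import hashlib
import re

# Hypothetical rule for illustration: bare `except:` clauses are not allowed.
BARE_EXCEPT = re.compile(r"\s*except\s*:")


def find_violations(source: str) -> list[int]:
    """Return the 1-based line numbers that break the rule."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if BARE_EXCEPT.match(line)
    ]


def fingerprint(rules_text: str) -> str:
    """Hash the rules text so an unreviewed edit becomes detectable."""
    return hashlib.sha256(rules_text.encode("utf-8")).hexdigest()


def rules_unchanged(rules_text: str, expected: str) -> bool:
    """Compare the current rules against a fingerprint pinned by a human."""
    return fingerprint(rules_text) == expected


if __name__ == "__main__":
    sample = "try:\n    risky()\nexcept:\n    pass\n"
    print(find_violations(sample))  # prints [3]
```

Run as a test in CI, a check like this either passes or fails, which is the difference between a rule and a suggestion. The pinned fingerprint is one way to notice when an agent quietly rewrites the rules themselves.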
Agents Are Here
🌱 This post is still growing
Late last year I started writing I'm Out On Agents. Agents sucked. The models were good, but there was still something missing between the harnesses and the models. They could write good code, they could do some debugging and exploring, but they were too good at fucking up the whole project to be useful. They could crank out greenfield POCs like nobody’s business, but they created so much mess in brownfield projects that it was easier to chat and edit yourself.
The Beautiful Glitch - Gemini
...
Thinking about ai productivity again
Such a good interview. @lexfridman is such a talented interviewer. It’s so cool to see the other side of this. For weeks we’ve heard the story of the name change, we’ve seen everyone shitting on the security model, buying up all the Mac minis in existence, and fearmongering not to install this thing. @steipete.me has such a cool story from the beginning, talking about making this thing fun and exciting. Giving it a personality that is not “You are absolutely right”. The story of changing the name twice, getting pwnd on every step the first time and nailing it the second time, is incredible. Dude is having fun trying to make the thing he wants in the world exist.
Pm Not Babysitter
Stop babysitting your agents, treat them like a real team and they will reward you.
Back in December I saw Theo make a comment that code is now cheap; it’s the run rate of models. He quoted a study, not sure that he even fully believed it, but it claimed that the average developer, after all the meetings, training, emails, planning, and extra shit in their day, averages out to 10 well-tested lines of code per day. Opus 3.5 made him 10k LOC (lines of code) that day.
We have all agreed for decades that lines of code is not a proxy for productivity or quality. Often more code means more risk, more review, more infrastructure. This has become MUCH different. Lines of code are still far from any sort of good metric. That aside, your agents are not doing 10k lines with you babysitting them, and in fact it’s very likely that the product quality is MUCH worse as you babysit them.
...
Agent Management Is Exhausting
The state of development in early 2026 is all wrapped around learning how to manage many agents running in parallel. Everyone’s trying to figure out the workflow.
The secret I’ve discovered is a good, well-defined plan. This could be a markdown file or a GitHub issue. Agents are actually great at writing these for you. They’ll include reproduction steps, outline changes needed, and structure the work.
This is your opportunity to step in. Read the plan. Look for hallucinations. Spot where it’s going off track. Edit the plan before the agent starts coding.
...
Context Is King
A new approach to agentic workflows.
This is probably news to no one else; I’m sure I’m behind on this one. You can’t write a one-sentence prompt and expect to get what you want.
I'm In On Agents
It’s the start of 2026 and agents are getting a lot better than they were. I’m using opencode at home, free mode with Zen and big pickle. At work I have access to a wider variety of models, including what seems to be the gold standard three from Anthropic: Opus, Sonnet, and Haiku.
Around Aug 2025 I wrote I'm Out On Agents. I saw others in the space having such great success that I gave it a solid shot, but found it to egregiously edit more than I asked, make massive unneeded changes, and make more small bugs hidden in the details than was...
...
I'm Out On Agents
It’s the year 2025 and we are only a few years into having 6 months to live before AI takes our jobs, and the big push right now is agents, managing agents. I will fully concede that I’m not doing it right, or that a future state gets better than where we are right now, but right now they kinda suck.
Chat is what really kicked off AI use, and it goes back about as far as computers, but it always sucked. Then ChatGPT rocked the world with the biggest launch day in history and showed us that it could actually be pretty good. Unethically trained on everything they could get their hands on, burning cities’ worth of electricity to train, and to keep training to stay ahead of the competition. It does a damn good job. There are tells, and if you see enough of it a lot of it turns to slop, but if you had never seen it before, there is no way you would assume that it was a computer.
It does a damn good job at being average, it can do what seems like everything not related to security and authentication...
...