<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>AI on CodeGoalie</title><link>https://codegoalie.com/categories/ai/</link><description>Recent content in AI on CodeGoalie</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Sat, 18 Apr 2026 05:54:21 -0400</lastBuildDate><atom:link href="https://codegoalie.com/categories/ai/index.xml" rel="self" type="application/rss+xml"/><item><title>Agents for System Design</title><link>https://codegoalie.com/posts/agents-for-system-design/</link><pubDate>Sat, 18 Apr 2026 05:54:21 -0400</pubDate><guid>https://codegoalie.com/posts/agents-for-system-design/</guid><description>&lt;p>System Design used to be reserved for the senior folks. They&amp;rsquo;d hand down a
document showcasing the new architecture or new service. Agents are making it so
everyone can do some system design because even the smallest features can
benefit from more intentional design.&lt;/p>
&lt;p>Over the past couple months, I&amp;rsquo;ve been tasked with leading a very large
cross-team project to essentially replace the core data model in our business.
This has involved designing new services, new and novel interactions between
services, and adding new capabilities to existing services. I&amp;rsquo;ve leaned heavily
on agents to help produce and communicate my intent to other people (and other
agents). Here is advice I&amp;rsquo;ve picked up along this journey.&lt;/p>
&lt;h2 id="know-what-you-want">Know what you want&lt;/h2>
&lt;p>This is a general principle for nearly every interaction with an agent or
LLM. It&amp;rsquo;s not a magic oracle, even if it genuinely feels like it
sometimes. It can&amp;rsquo;t read your mind. It doesn&amp;rsquo;t have the context of your company,
team, feature, past mistakes, etc. It wasn&amp;rsquo;t at the last all-hands meeting to
absorb some of that new initiative from the leadership team. It wasn&amp;rsquo;t a part of
that ad-hoc conversation last week where we got clarity on an upcoming API
interaction.&lt;/p>
&lt;p>It doesn&amp;rsquo;t need to know that. It shouldn&amp;rsquo;t know that. It&amp;rsquo;s your job to determine
what it should know, and in order to do that, you need to determine what you want
it to do first. Every time you open up your agent, you should know what you want
from it and when you&amp;rsquo;ll be able to exit it (ctrl+d ctrl+d).&lt;/p>
&lt;p>As a principal engineer at PrizePicks, a lot of my work is designing new
systems, adding new capabilities to existing systems, or integrating two systems
together. The first step that I take is to create a proposal doc. There is a lot
of formatting and process around this typically in companies. I ignore most of
that here. I want the agent to produce a document which describes at a high
level how this new thing is going to work.&lt;/p>
&lt;p>That last sentence is pretty much the functional part of my initial prompt. The key
here is to know what you want before you start:&lt;/p>
&lt;ul>
&lt;li>A document defining something in a 3rd party system (Notion, Linear docs,
etc.)&lt;/li>
&lt;li>A series of steps for me to execute some task&lt;/li>
&lt;li>JSON for a Grafana chart&lt;/li>
&lt;li>Mermaid diagram of this flow/architecture&lt;/li>
&lt;li>etc.&lt;/li>
&lt;/ul>
&lt;p>Have an end in mind before you start. Agents are a tool. Waving a shovel around
wildly in the air is not the best way to dig a hole.&lt;/p>
&lt;h2 id="provide-the-load-bearing-walls">Provide the load-bearing walls&lt;/h2>
&lt;p>This is something I realized I&amp;rsquo;d been doing naturally. I only articulated it
to a colleague recently, and that conversation inspired this whole post. The
5-second version is that if you&amp;rsquo;re building a tent with the help of an agent,
you need to bring the tent poles. It can build the waterproof and functional
covering. It can put in a door with a two-sided zipper. But if you don&amp;rsquo;t bring
the poles, it&amp;rsquo;ll just be a pile of fabric on the floor.&lt;/p>
&lt;p>Conversely, you don&amp;rsquo;t need to build the whole tent out. It knows what tents are
and how, generally, they work. It&amp;rsquo;s your job to, first, communicate that you&amp;rsquo;re
building a tent (these are general purpose models). And then articulate what
makes your tent yours.&lt;/p>
&lt;p>A good recipe for this is to start with what you have. Talk about what you need.
Then be explicit about what you want. A while back, I had an agent research the
Epcot Flower &amp;amp; Garden festival menus for this year, producing first a markdown
file, then &lt;a href="https://codegoalie.com/flower-2026.html">a static HTML file&lt;/a>. My wife
and I went through the menus and picked the items we wanted to be sure to try.
Now, I wanted to combine the two &amp;ndash; Well, wait. Here&amp;rsquo;s the exact prompt I used:&lt;/p>
&lt;pre>&lt;code>❯ I have a full overview of the 2026 Epcot Flower &amp;amp; Garden festival in the
2026.[md|html] files. I've gone through them and noted my favorites in
2026-notes.md. I'd like to come up with a useful and intuitive guide to use
while I'm in Epcot to make sure I don't miss these choices. We'll be going
around the world showcase in one order or another so put the booths in the same
order as the world showcase starting with Mexico. I also would love to be able
to hide stuff I've already tried.
&lt;/code>&lt;/pre>&lt;p>This was in plan mode in Claude Code so it generated a
&lt;a href="https://codegoalie.com/agents-for-system-design-plan-example.md">plan&lt;/a>.&lt;/p>
&lt;h2 id="meticulously-review">Meticulously review&lt;/h2>
&lt;p>Once you have a plan or document, it &lt;em>needs&lt;/em> your review. You&amp;rsquo;ve provided a few
sentences and it&amp;rsquo;s generated a whole document. This upscaling of information is
lossy, &lt;strong>by definition&lt;/strong>. In fact, each step of the way, you&amp;rsquo;re going from lower
specificity to higher specificity and review is necessary.&lt;/p>
&lt;p>The earlier steps need more review as they&amp;rsquo;re setting the stage for everything
afterwards. A small misalignment in the initial plan will cascade and amplify
into the final product.&lt;/p>
&lt;p>I once missed that a message topic needed a version bump. The agent used
&lt;code>v1&lt;/code> in the plan, but we actually wanted &lt;code>v2&lt;/code>. Even after I manually fixed
the version numbers, the copilot in code review kept complaining that &amp;ldquo;it
should be &lt;code>v1&lt;/code> here.&amp;rdquo; 😩&lt;/p>
&lt;p>This ties directly into the first point in this post. Know what you want
beforehand so that you can evaluate it when the agent produces the artifact.&lt;/p>
&lt;h2 id="cutthroat-revisions">Cutthroat revisions&lt;/h2>
&lt;p>Since each phase feeds directly into another increase in specificity, be
ruthless in your revisions earlier rather than later. I use the same working model as I
do for code reviews: if this isn&amp;rsquo;t what you would have done, say something. It
will only get worse from here. The models are very good at making things sound
plausible and correct. That&amp;rsquo;s exactly what they do, in fact. &amp;ldquo;The new
multiplication function can live in the addition package since they are both
math.&amp;rdquo; Nope. Maybe you want to rename that package to newmath, or put
multiplication in its own package. You need to make the call now and update the
plan. It&amp;rsquo;s a one sentence change in the plan, but from here on out it gets more
ingrained (maybe AGENTS.md updates to include something like &amp;ldquo;All math features
are implemented in the &lt;code>addition&lt;/code> package&amp;rdquo; 😱)&lt;/p>
&lt;p>Functionally, I usually like to have the agent make the changes in the same
conversation where the original plan was created. However, since you have an
artifact, you can start a new conversation if you want. That&amp;rsquo;s the beauty of
working with artifacts outside of the harness.&lt;/p>
&lt;hr>
&lt;p>I find the system design part of this new Agentic Engineering paradigm to be the
most interactive. Some days I like that. Other days I appreciate being able to
hit a button and get a PR (more on that later). But the PR button only works
when the plan is solid. I used to spend days hand-writing a technical design doc
and it was often just OK. With agents, I get really great docs, often in less
than an hour of focused work after days of
&lt;a href="https://codegoalie.com/posts/2025-05-24-use-your-noodler">noodling&lt;/a>.&lt;/p>
&lt;p>Happy designing!&lt;/p>
&lt;p>&amp;ndash; Chris&lt;/p></description></item><item><title>Agents Calm Down</title><link>https://codegoalie.com/posts/agents-calm-down/</link><pubDate>Mon, 30 Mar 2026 07:21:16 -0400</pubDate><guid>https://codegoalie.com/posts/agents-calm-down/</guid><description>&lt;p>Agentic coding harnesses are violating the Unix philosophy and trying to become
the &lt;em>everything app&lt;/em> for coding (and work and personal assistant and so on).&lt;/p>
&lt;h2 id="why-not-build-everything">Why not build everything?&lt;/h2>
&lt;p>Many of the articles I&amp;rsquo;m reading are saying what we&amp;rsquo;ve been saying all along,
&amp;ldquo;writing the code hasn&amp;rsquo;t been the bottleneck.&amp;rdquo; No project was late because
engineers didn&amp;rsquo;t spend enough time sitting at their desks typing code. And now
that the cost of writing the code is approaching zero, I think it&amp;rsquo;s more important than ever
to carefully, methodically consider every project and initiative.&lt;/p>
&lt;blockquote>
&lt;p>&amp;ldquo;This feature is not a good fit for coding agents so we aren&amp;rsquo;t going to put
it into our harness.&amp;rdquo;&lt;/p>
&lt;ul>
&lt;li>No AI company ever&lt;/li>
&lt;/ul>
&lt;/blockquote>
&lt;p>Non-engineering folks have long enjoyed the luxury of engineering being the
bottleneck (not writing code, but review, deploy, integration, etc.) and have
been able to change requirements, goals and targets while we were working. When
we can take a set of requirements and later that afternoon be dev complete&amp;hellip;
Just think about what would have happened if we&amp;rsquo;d implemented some of those
first-draft project proposals verbatim.&lt;/p>
&lt;p>I&amp;rsquo;m working on a personal project to upgrade my personal daily task list from a
plain markdown file to use &lt;a href="https://www.rememberthemilk.com/">Remember The Milk&lt;/a>
(the only task tracking app which gets recurring items right, but I digress). I
have a pretty decent idea of what I want. I&amp;rsquo;ve been working in this flow for
over 8 years after all. I&amp;rsquo;m, of course, having an agent write all the code. But
I&amp;rsquo;m going slow. I&amp;rsquo;m carefully considering each feature and more precisely how it
should work than I ever have &lt;em>ahead of time&lt;/em> before. I feel this great
responsibility to move intentionally &lt;strong>because&lt;/strong> I have this great power to push
a button, fold 5 pieces of clothes, turn back to my laptop and have a working
feature.&lt;/p>
&lt;p>To me, LLMs and agents and all of it, from the start, has never been about
getting faster. It&amp;rsquo;s about reducing the &amp;ldquo;manual&amp;rdquo; effort of willing a (software)
thing into existence. As I&amp;rsquo;ve been gaining experience with agents et al., I keep
coming back time and time again to the best practices of the industry: &amp;ldquo;have a
really good and detailed scope doc,&amp;rdquo; &amp;ldquo;compare that to the code and form a
focused implementation plan,&amp;rdquo; &amp;ldquo;break that down into units of work that depend on
each other or can be done in parallel.&amp;rdquo; We always wanted that but it was too
time consuming to do and we relied on each other&amp;rsquo;s existing context. Many
tickets don&amp;rsquo;t have any body description. We just talked about the project scope
in the kick off, but didn&amp;rsquo;t formalize a document. Ironically, the machine is
forcing (and enabling) us to do the things that would have been best all along,
even for humans.&lt;/p>
&lt;h2 id="do-one-thing-well">Do one thing well&lt;/h2>
&lt;p>Time and again the best software is focused and does one thing really well.
That&amp;rsquo;s the Unix philosophy. Composability over batteries included. Make a simple
and powerful tool to let those using it combine it with other simple and
powerful tools to make something greater than the sum of the tools themselves.&lt;/p>
&lt;p>These new releases in the agent harnesses are hard to reason about because
many of the new features feel half-baked and not rooted in solving a real
problem. I end up feeling obligated to find use for them. It&amp;rsquo;s FOMO a bit.
&amp;ldquo;Well, if they implemented it and released it, it must be something good. I must
be missing something if I can&amp;rsquo;t find the use case.&amp;rdquo;&lt;/p>
&lt;p>It&amp;rsquo;s hard to face this down. The company worth tens of billions of dollars is
churning out features on the most talked about software of the moment and my gut
is telling me to ignore most of it. I got bored and frustrated really quickly
babysitting the agent. &amp;ldquo;Yes&amp;rdquo; &amp;ldquo;Approve&amp;rdquo; over and over. Same prompt. Same slash
command. Same skill. Tweak it a bit here and there. Now, I&amp;rsquo;m automating it and
finding myself longing for a simple, focused tool that I can work &lt;em>with&lt;/em>,
not live inside of indefinitely.&lt;/p>
&lt;h2 id="do-less">Do less&lt;/h2>
&lt;p>I don&amp;rsquo;t want to leave on a down note so let&amp;rsquo;s talk solutions. Do less. Make
more. Let&amp;rsquo;s build an ecosystem around this tool. Let&amp;rsquo;s solve the rest of the
SDLC competently. LLMs are a versatile technology and are made even more
powerful with agent harnesses. They aren&amp;rsquo;t a silver bullet either. I love that
we&amp;rsquo;re pushing the envelope of what these things can do. I love that we don&amp;rsquo;t
know yet. It&amp;rsquo;s exciting to be a part of.&lt;/p>
&lt;p>The more I work with agents the more I see how important the &amp;ldquo;old&amp;rdquo; best
practices are. The more pre-planning matters. The more detailed specs matter.
The more nailing down inter-service interactions matters. Start at the
boundaries and work inward on both sides. Hand offs. Metrics. Automated
end-to-end testing. There&amp;rsquo;s nothing new here. It&amp;rsquo;s just compressed and
accelerated, and it can be exhausting at times. In this age where we can build
anything, the taste and discipline to not build something matters. I&amp;rsquo;d love to
see that in the tools themselves.&lt;/p>
&lt;p>Happy shipping!&lt;/p>
&lt;p>&amp;ndash; Chris&lt;/p></description></item></channel></rss>