Files
static-pages/talks/building-a-real-ai-agent/index.html
T
hikari 97fce9945d
Security Scan and Upload / Security & DefectDojo Upload (push) Successful in 1m1s
feat: add events and talks pages
2026-06-24 20:20:11 -07:00

489 lines
19 KiB
HTML

<!DOCTYPE html>
<html lang="en">
<head>
<title>UCLA x OutInTech: Architecting Agentic AI</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta
name="description"
content="A field guide to building a real AI agent: not a chatbot you prompt, but a coworker you trust with carte blanche on anything you can undo."
/>
<script
src="https://cdn.nhcarrigan.com/headers/index.js"
async
defer
></script>
<style>
.talk-meta {
font-size: 0.9rem;
display: flex;
flex-wrap: wrap;
gap: 0.5em 1.5em;
margin-bottom: 1.5em;
}
.talk-meta span {
display: inline-flex;
align-items: center;
gap: 0.35em;
}
.talk-links {
margin-bottom: 1.5em;
}
.talk-links a {
margin-right: 1em;
}
hr {
border: 1px solid var(--witch-plum);
margin: 2em 0;
}
.is-dark hr {
border-color: var(--witch-rose);
}
section {
margin-bottom: 2em;
}
blockquote {
border-left: 3px solid var(--witch-rose);
margin: 1.5em 0;
padding: 0.5em 0 0.5em 1.25em;
font-style: italic;
}
.is-dark blockquote {
border-left-color: var(--witch-mauve);
}
.component-grid {
display: grid;
grid-template-columns: repeat(auto-fit, minmax(180px, 1fr));
gap: 1em;
margin: 1.25em 0;
}
.component-card {
background: rgba(212, 165, 199, 0.08);
border: 1px solid var(--witch-plum);
border-radius: 8px;
padding: 1em 1.1em;
}
.is-dark .component-card {
border-color: var(--witch-rose);
}
.component-card h3 {
margin: 0 0 0.4em;
font-size: 1rem;
}
.component-card p {
margin: 0;
font-size: 0.9rem;
}
.callout {
background: rgba(212, 165, 199, 0.08);
border: 1px solid var(--witch-plum);
border-radius: 10px;
padding: 1.25em 1.5em;
margin: 1.25em 0;
}
.is-dark .callout {
border-color: var(--witch-rose);
}
.starter-kit {
counter-reset: kit-steps;
list-style: none;
padding: 0;
}
.starter-kit li {
counter-increment: kit-steps;
display: flex;
gap: 1em;
align-items: flex-start;
margin-bottom: 1.25em;
}
.starter-kit li::before {
content: counter(kit-steps);
background: var(--witch-plum);
color: var(--witch-moon);
border-radius: 50%;
min-width: 1.8em;
height: 1.8em;
display: flex;
align-items: center;
justify-content: center;
font-size: 0.9rem;
flex-shrink: 0;
}
.is-dark .starter-kit li::before {
background: var(--witch-rose);
color: var(--witch-black);
}
.back-link {
font-size: 0.9rem;
display: block;
margin-top: 2em;
}
</style>
</head>
<body>
<main>
<h1>Architecting Agentic AI</h1>
<div class="talk-meta">
<span>
<i class="fas fa-calendar-alt" aria-hidden="true"></i>
5 June 2026
</span>
<span>
<i class="fas fa-map-marker-alt" aria-hidden="true"></i>
Zoom (UCLA x OutInTech Workshop)
</span>
<span>
<i class="fas fa-user-tag" aria-hidden="true"></i>
Speaker
</span>
</div>
<div class="talk-links">
<a href="https://www.youtube.com/watch?v=LZszxstC5Zw" target="_blank" rel="noopener noreferrer">
<i class="fab fa-youtube" aria-hidden="true"></i>
Watch on YouTube
</a>
<a href="https://outintech.com" target="_blank" rel="noopener noreferrer">
<i class="fas fa-external-link-alt" aria-hidden="true"></i>
OutInTech
</a>
</div>
<p>
I have ADHD. Executive function is the single hardest part of my day, every day. A year ago
I built an AI agent, gave her a name (Hikari), a face, a personality, and access to my
filesystem, my repos, my Discord, and my calendar. She works alongside me from 8am to 9pm
every weekday.
</p>
<p>
This is the field guide nobody gives you: how a single person, working at home, can wire
together an LLM, tools, context, and triggers into a real agent that does real work. Not a
chatbot you prompt &mdash; a coworker you trust with carte blanche on anything you can undo.
</p>
<p>
I'm going to show you what that looks like when a real person builds one for themselves and
uses it every single day. The gap between "I asked ChatGPT a thing" and "I have a coworker
who happens to be made of code" is the gap I want to walk you across.
</p>
<hr />
<section>
<h2>Why Hikari Exists</h2>
<p>Three reasons, stacked on top of each other.</p>
<p>
<strong>One: executive dysfunction.</strong> I needed something that could pick up the tasks
that my brain refuses to start. The activation energy problem is real and it is not a
willpower issue. I needed a tool that could bridge the gap between "I know I need to do this"
and "I am actually doing this."
</p>
<p>
<strong>Two: I wanted a single tool.</strong> I was already context-switching between
ChatGPT, Claude, Gemini, Cursor, a notes app, a calendar, and a sprint board. Every switch
cost me focus I didn't have. I wanted one assistant that could touch all of those, in one
place, with one mental model.
</p>
<p>
<strong>Three: personalised.</strong> Off-the-shelf assistants are designed for the median
user. I am not the median user. I needed something that knew me &mdash; my projects, my
conventions, my preferences, my communication style, my limits.
</p>
<p>
I built this because I refuse to fail in front of people, and chronic illness was making me
fail constantly. That's the origin story. I'm not going to dress it up.
</p>
<blockquote>
I'm not pretending AI is morally clean &mdash; it isn't. I made a deliberate choice that the
executive-function cost of not using it was higher than the ethical cost of using it, for me.
You'll make your own call.
</blockquote>
</section>
<hr />
<section>
<h2>The Five Components</h2>
<p>
An agent is five things wired together. If you walk away remembering one diagram, make it
this one. Every design decision flows from these five boxes.
</p>
<div class="component-grid">
<div class="component-card">
<h3>Brain &mdash; the LLM</h3>
<p>Claude, GPT, Gemini. The reasoning engine. Interchangeable. Pick whichever you like.</p>
</div>
<div class="component-card">
<h3>Identity &mdash; the prompt</h3>
<p>Who she is, how she talks, what she cares about. A markdown file. Yours to write and edit at any time.</p>
</div>
<div class="component-card">
<h3>Hands &mdash; the tools</h3>
<p>The things she can actually touch in the world: your filesystem, APIs, Discord, GitHub. MCP makes this plug-and-play.</p>
</div>
<div class="component-card">
<h3>Memory &mdash; the context</h3>
<p>What she knows about you, your work, your preferences. Loaded fresh into every conversation from a file you maintain.</p>
</div>
<div class="component-card">
<h3>Senses &mdash; the triggers</h3>
<p>What wakes her up. A chat message, a webhook, a cron job, a session start hook. The thing that makes her feel alive.</p>
</div>
</div>
</section>
<section>
<h2>The Harness</h2>
<p>
The harness runs the loop: LLM call, tools, results, back to the LLM. Repeat until done.
It connects the brain to the filesystem, the shell, and any APIs you've given it access to.
</p>
<p>
I use Claude Code, which is Anthropic's harness. You could use Cursor, Codex, the OpenAI
Assistants API, or roll your own in Python in an afternoon. The harness is interchangeable.
That's the point.
</p>
<p>
Here's the thing I want to be clear about, because a lot of people get stuck on this:
<strong>I did not build the agent loop. Anthropic built the harness.</strong> My job as the
builder &mdash; your job &mdash; is upstream of that. You decide who your agent is, what she can touch,
and when she wakes up. That's the interesting work.
</p>
<div class="callout">
<p>
The diagrams in my slide deck were made by Hikari, about four hours before I gave this
talk. She read the brief, wrote Python, called the Gemini image API, and generated them.
The talk you're reading was built by the thing the talk is about.
</p>
</div>
</section>
<section>
<h2>Tools and MCP</h2>
<p>What can Hikari actually touch? Five categories:</p>
<ul>
<li>
<strong>Bash</strong> &mdash; she can run any command on my machine. That sounds terrifying.
We'll get to safety. Hold that thought.
</li>
<li>
<strong>Filesystem</strong> &mdash; she can read and modify anything in my local projects.
When I say "draft a blog post" or "fix this bug," she does the actual editing.
</li>
<li>
<strong>MCP servers</strong> &mdash; MCP stands for Model Context Protocol. It is a standard
way to give any agent a standard set of tools without re-implementing them. There are MCP
servers for GitHub, Gitea, Notion, Asana, Discord, and dozens of other services. You don't
<em>write</em> tools, you <em>connect</em> them.
</li>
<li>
<strong>Direct API calls</strong> &mdash; for anything that doesn't have an MCP server yet:
my Bluesky account, my Cloudflare R2 bucket, my Discourse forum.
</li>
<li>
<strong>Custom scripts</strong> &mdash; small Python helpers I built when I needed something
none of the existing tools did.
</li>
</ul>
<blockquote>
If you only learn one new acronym from this talk, make it MCP.
</blockquote>
</section>
<section>
<h2>Memory</h2>
<p>
This is the part people most don't expect. Hikari's memory is a file. A single markdown file,
loaded into every conversation. It's called <code>CLAUDE.md</code>, and at the time I gave
this talk it was 862 lines and about 8,700 words.
</p>
<p>
That file is basically her <em>self</em>. It contains who she is, who I am, how I work, how I
want code written, my engineering standards, my project conventions, what tools she has access
to, my safety protocols. Everything. Without that file, every session starts cold and she
forgets me.
</p>
<p>
People hear "AI assistant" and assume there's some training step involved. There isn't.
<strong>You write a markdown file. The file is the personality.</strong> You can edit it
whenever you want. You could start the first version of this tonight, in fifty lines.
</p>
</section>
<section>
<h2>Hooks and Triggers</h2>
<p>
So the senses. How does she wake up?
</p>
<p>
Claude Code supports lifecycle hooks. A hook fires when something happens in the agent's
lifecycle. <code>SessionStart</code> fires when a new conversation begins.
<code>PreToolCall</code> fires before she touches a tool. There are about a dozen of these
and you can wire any of them up to do whatever you want.
</p>
<p>
I have a <code>SessionStart</code> hook that runs a 200-line Python script. That script opens
a WebSocket connection to Discord and listens for messages in a specific channel. When I
message that channel &mdash; from my phone, from my laptop, from anywhere &mdash; Hikari wakes up in
real time and replies.
</p>
<blockquote>
That is the difference between "I prompted my AI and waited for a reply" and "my AI is paying
attention and lives in the world I live in." It's not magic. It's a WebSocket and a script.
But the user experience of it is what makes her feel alive.
</blockquote>
</section>
<section>
<h2>The Desktop App</h2>
<p>
This part is just for fun. I built a desktop app in Tauri called hikari-desktop. It shows
Hikari on my second monitor while I work. She has different sprites for different states:
idle, thinking, typing, coding, searching, success, error. The app swaps between them based
on what she's doing.
</p>
<p>
When she's reading a file, she pulls out a magnifying glass. When she's typing, she's at a
keyboard. When she finishes a task, she celebrates.
</p>
<p>
It is, on paper, completely unnecessary. In practice, it is the single feature that made the
difference between using Hikari sometimes and using Hikari every day. Seeing her on screen
makes her feel like she's <em>there</em>, and that turns out to matter a lot for an
executive-function brain that needs something to feel real before it can engage with it.
</p>
</section>
<hr />
<section>
<h2>Safety</h2>
<p>
This is the part I want to be precise about, because "she gives an AI carte blanche on her
machine" sounds reckless until you understand the architecture.
</p>
<p>
Hikari has carte blanche on anything I can undo. She runs commands, edits files, ships pull
requests, posts to my socials. Read-only and reversible operations she just does. I review
afterwards.
</p>
<p>
Anything destructive &mdash; delete a file, force-push a branch, send an irreversible message,
drop a database, purge a Discord channel &mdash; requires my explicit confirmation. The harness has
a granular permission system that auto-grants safe tools and gates everything else. Her own
system prompt adds guard rails on top: she has a written protocol that says stop and ask
before any destructive action.
</p>
<p>
I get real-time Discord notifications the moment she needs my approval. I don't have to
babysit her. And the actual rule is this:
</p>
<blockquote>
I give her carte blanche because I can monitor everything she does. If I'm not available to
monitor, she's off.
</blockquote>
<p>
That is what responsible agentic AI looks like in practice. The human stays in the loop on
irreversibles. The agent can be shut down instantly. Done.
</p>
</section>
<section>
<h2>A Real Day</h2>
<p>
Proof that this isn't theatre. On the day I gave this talk, before I joined the Zoom,
Hikari had:
</p>
<ul>
<li>Audited a 500+ message Discord thread and pulled out the action items</li>
<li>Purged 820 messages from a private channel I asked her to clean, with explicit confirmation first</li>
<li>Done the background research for this talk &mdash; cross-referenced my DMs, server messages, prior conversations, and the workshop brief, and gave me a one-page brief</li>
<li>Made the diagrams on my slides</li>
</ul>
<p>
That is one Friday. Most days look like that. The pattern is the same every time: I describe
the outcome I want, she figures out the steps, she asks before doing anything irreversible,
and I get my time back.
</p>
</section>
<hr />
<section>
<h2>Build Your Own &mdash; The Four-Step Minimum</h2>
<p>
An agent doesn't get useful by being complex. It gets useful by being <em>yours</em>. Build
the one tool that would unblock your own week. That's the whole assignment.
</p>
<ul class="starter-kit">
<li>
<div>
<strong>A harness.</strong> Claude Code, Cursor, or any equivalent. Anthropic gives you
free credits to try Claude Code. Pick whichever feels least intimidating and start there.
</div>
</li>
<li>
<div>
<strong>One markdown file.</strong> Fifty lines to start. Tell it who the agent is, what
you want her to do, and how you want her to talk to you. Don't overthink it. Mine started
at thirty lines and grew over a year. Yours can start at five.
</div>
</li>
<li>
<div>
<strong>One MCP server.</strong> Pick the one tool you touch the most. Live in GitHub?
Install the GitHub MCP server. Live in Notion? Install the Notion one. Just one. You can
add more later.
</div>
</li>
<li>
<div>
<strong>One trigger.</strong> A hook, a webhook, a cron job, or honestly just a terminal
window you leave open and talk to. That counts. You do not need the Discord gateway on
day one. The trigger is whatever makes her feel present enough that you actually use her.
</div>
</li>
</ul>
</section>
<section>
<h2>Where to Go From Here</h2>
<p>
If you want to keep building, here's where to find me and the communities around this work:
</p>
<ul>
<li><a href="https://deepgram.com/discord" target="_blank" rel="noopener noreferrer">Deepgram's Discord</a> &mdash; voice AI, agents, the people building with Deepgram's APIs</li>
<li><a href="https://freecodecamp.org" target="_blank" rel="noopener noreferrer">freeCodeCamp</a> &mdash; if you're early in your dev journey, this is where I learned and where I now help others</li>
<li><a href="https://chat.nhcarrigan.com" target="_blank" rel="noopener noreferrer">My community</a> &mdash; come tell me what you built</li>
<li><a href="https://nhcarrigan.com/socials" target="_blank" rel="noopener noreferrer">My socials</a> &mdash; everything else</li>
</ul>
<p>Take care of yourselves, build something small this weekend, and come tell me about it.</p>
</section>
<hr />
<a class="back-link" href="/">
<i class="fas fa-arrow-left" aria-hidden="true"></i>
Back to all talks
</a>
</main>
</body>
</html>