From 97fce9945d2eee7e4d652bee365224e428b1e66a Mon Sep 17 00:00:00 2001 From: Hikari Date: Wed, 24 Jun 2026 20:20:11 -0700 Subject: [PATCH] feat: add events and talks pages --- events/index.html | 222 ++++++++ .../ai-and-open-source-mentorship/index.html | 138 +++++ talks/building-a-real-ai-agent/index.html | 488 ++++++++++++++++ .../cultivating-a-remote-workspace/index.html | 308 +++++++++++ talks/index.html | 114 ++++ talks/voice-ai-agent-workshop/index.html | 522 ++++++++++++++++++ 6 files changed, 1792 insertions(+) create mode 100644 events/index.html create mode 100644 talks/ai-and-open-source-mentorship/index.html create mode 100644 talks/building-a-real-ai-agent/index.html create mode 100644 talks/cultivating-a-remote-workspace/index.html create mode 100644 talks/index.html create mode 100644 talks/voice-ai-agent-workshop/index.html diff --git a/events/index.html b/events/index.html new file mode 100644 index 0000000..1aaf8bf --- /dev/null +++ b/events/index.html @@ -0,0 +1,222 @@ + + + + Events + + + + + + + +
+

Events

+

A collection of events I have attended, spoken at, or volunteered for.

+

Workshop companion guides are available at workshops.nhcarrigan.com.

+

Full talk companion guides are available at talks.nhcarrigan.com.

+
+
+

AgentCon / MCPCon North America 2026

+
+ + + 22–23 October 2026 + + + + San Jose, CA, USA + + + + Speaker (Pending) + +
+

+ CFP submitted to share boots-on-the-ground findings from a cohort of over 100 open-source mentees + across 14 teams. The cohort generated 651 reviewed contributions and over 16,000 Discord messages -- + and the exit survey revealed a sharp tension: 71% of participants used AI to augment their work, 58% + wanted clearer guidelines, and AI assistance was the lowest-rated part of the experience. The talk + covers how AI helped participants stay engaged, how it made skill-gap identification harder, and why + MCP-enabled agentic tooling is reshaping what open-source mentorship needs to look like. +

+ + + AgentCon / MCPCon North America + +
+
+

Berkeley AI Hackathon 2026

+
+ + + 20–21 June 2026 + + + + Berkeley, CA, USA + + + + Speaker & Sponsor + +
+

+ Attended as a Deepgram sponsor and ran a workshop on building a fully functional voice AI agent + from scratch. Covered the full real-time loop: audio capture, low-latency transcription, LLM + reasoning with context, and streaming text-to-speech back fast enough to feel like a real + conversation. Dug into the design decisions that matter most in production: interruption handling, + conversation state, and keeping latency low. Attendees left with a clear mental model of how voice + agents work under the hood and a working architecture they could ship at the hackathon that same day. +

+ + + Berkeley AI Hackathon + +
+
+

RainbowGram Pride Presentation — Deepgram All Hands

+
+ + + 18 June 2026 + + + + Virtual HQ + + + + Speaker + +
+

+ A Pride month session for the Deepgram all-hands focused on practical allyship in distributed teams, + not policy. Covering five concrete daily habits any colleague can adopt immediately: names and pronouns, + navigating questions, confidentiality, speaking up when queer colleagues are not in the room, and the + cumulative weight of inclusive language. The session opened with a moment of genuine celebration of + Pride's origins, addressed the current landscape honestly, and closed with a tiered action list and + a reframe: treating queer colleagues as simply normal gives back energy that would otherwise be spent + just existing in those spaces. +

+
+
+

UCLA x OutInTech: Building a Real AI Agent

+
+ + + 5 June 2026 + + + + Zoom + + + + Speaker + +
+

+ A talk on building a real AI agent — not a chatbot — as a solo developer with ADHD. + I walked through the five core pieces of an agent (brain, identity, hands, memory, ears), + demystified common overcomplication around MCP, hooks, and gateways, and shared the safety + framework that lets me grant broad filesystem and repo access without it being reckless. + Built solo on Claude Code with MCP servers, a Discord gateway listener, and an Electron + desktop app. The talk ended with a four-step starter kit attendees could build that same evening. +

+ + + Watch on YouTube + +  ·  + + + OutInTech + +
+
+

CascadiaJS 2026

+
+ + + 1–2 June 2026 + + + + Town Hall Seattle, Seattle, WA, USA + + + + Attendee + +
+

+ My first event in the Seattle tech scene. A wonderful couple of days in the Pacific Northwest + JavaScript community, and a great chance to catch up with Amanda, Head of DevRel at Vapi + (and a Deepgram partner!). +

+ + + CascadiaJS + +
+
+
+ + diff --git a/talks/ai-and-open-source-mentorship/index.html b/talks/ai-and-open-source-mentorship/index.html new file mode 100644 index 0000000..431f69c --- /dev/null +++ b/talks/ai-and-open-source-mentorship/index.html @@ -0,0 +1,138 @@ + + + + AgentCon / MCPCon: AI and Open Source Mentorship + + + + + + + +
+

AgentCon / MCPCon: AI and Open Source Mentorship

+
+ + + 22–23 October 2026 + + + + San Jose, CA, USA + + + + Speaker (Pending) + +
+ +

+ + CFP submitted — pending review. This page will be updated if the talk is accepted. +

+
+
+

Overview

+

+ Earlier this year I ran a cohort with over 100 mentees focused entirely on open-source contributions + and emulating a real-world developer workflow. There were 14 teams, participants sent over 16,000 + messages in the Discord, and 651 contributions were reviewed. Then I ran an exit survey. +

+

+ The results were telling: 71% of participants used AI to augment their work, and 58% wished I had + provided better guidelines around how to do so. That last metric was the lowest-rated portion of the + entire experience. +

+

+ This talk shares the boots-on-the-ground findings: what the data show, how AI helped participants + stay engaged, and how it made it harder to identify skill gaps that led to contributor churn. +

+
+
+

Key Takeaway

+

+ AI is drastically reshaping the open source ecosystem. It is the core driver of the gap between + contributors who stay and contributors who churn. As MCP-enabled tooling continues to redefine what + "AI-augmented workflow" means, mentorship becomes more important than ever — and the lessons from + this cohort belong to every maintainer. +

+

+ We need to adapt to the ever-changing agentic AI domain. The talk ends with concrete guidance on how + to do that. +

+
+
+ + + Back to all talks + +
+ + diff --git a/talks/building-a-real-ai-agent/index.html b/talks/building-a-real-ai-agent/index.html new file mode 100644 index 0000000..5a68197 --- /dev/null +++ b/talks/building-a-real-ai-agent/index.html @@ -0,0 +1,488 @@ + + + + UCLA x OutInTech: Architecting Agentic AI + + + + + + + +
+

Architecting Agentic AI

+
+ + + 5 June 2026 + + + + Zoom (UCLA x OutInTech Workshop) + + + + Speaker + +
+ + +

+ I have ADHD. Executive function is the single hardest part of my day, every day. A year ago + I built an AI agent, gave her a name (Hikari), a face, a personality, and access to my + filesystem, my repos, my Discord, and my calendar. She works alongside me from 8am to 9pm + every weekday. +

+

+ This is the field guide nobody gives you: how a single person, working at home, can wire + together an LLM, tools, context, and triggers into a real agent that does real work. Not a + chatbot you prompt — a coworker you trust with carte blanche on anything you can undo. +

+

+ I'm going to show you what that looks like when a real person builds one for themselves and + uses it every single day. The gap between "I asked ChatGPT a thing" and "I have a coworker + who happens to be made of code" is the gap I want to walk you across. +

+ +
+ +
+

Why Hikari Exists

+

Three reasons, stacked on top of each other.

+

+ One: executive dysfunction. I needed something that could pick up the tasks + that my brain refuses to start. The activation energy problem is real and it is not a + willpower issue. I needed a tool that could bridge the gap between "I know I need to do this" + and "I am actually doing this." +

+

+ Two: I wanted a single tool. I was already context-switching between + ChatGPT, Claude, Gemini, Cursor, a notes app, a calendar, and a sprint board. Every switch + cost me focus I didn't have. I wanted one assistant that could touch all of those, in one + place, with one mental model. +

+

+ Three: personalised. Off-the-shelf assistants are designed for the median + user. I am not the median user. I needed something that knew me — my projects, my + conventions, my preferences, my communication style, my limits. +

+

+ I built this because I refuse to fail in front of people, and chronic illness was making me + fail constantly. That's the origin story. I'm not going to dress it up. +

+
+ I'm not pretending AI is morally clean — it isn't. I made a deliberate choice that the + executive-function cost of not using it was higher than the ethical cost of using it, for me. + You'll make your own call. +
+
+ +
+ +
+

The Five Components

+

+ An agent is five things wired together. If you walk away remembering one diagram, make it + this one. Every design decision flows from these five boxes. +

+
+
+

Brain — the LLM

+

Claude, GPT, Gemini. The reasoning engine. Interchangeable. Pick whichever you like.

+
+
+

Identity — the prompt

+

Who she is, how she talks, what she cares about. A markdown file. Yours to write and edit at any time.

+
+
+

Hands — the tools

+

The things she can actually touch in the world: your filesystem, APIs, Discord, GitHub. MCP makes this plug-and-play.

+
+
+

Memory — the context

+

What she knows about you, your work, your preferences. Loaded fresh into every conversation from a file you maintain.

+
+
+

Senses — the triggers

+

What wakes her up. A chat message, a webhook, a cron job, a session start hook. The thing that makes her feel alive.

+
+
+
+ +
+

The Harness

+

+ The harness runs the loop: LLM call, tools, results, back to the LLM. Repeat until done. + It connects the brain to the filesystem, the shell, and any APIs you've given it access to. +

+

+ I use Claude Code, which is Anthropic's harness. You could use Cursor, Codex, the OpenAI + Assistants API, or roll your own in Python in an afternoon. The harness is interchangeable. + That's the point. +

+

+ Here's the thing I want to be clear about, because a lot of people get stuck on this: + I did not build the agent loop. Anthropic built the harness. My job as the + builder — your job — is upstream of that. You decide who your agent is, what she can touch, + and when she wakes up. That's the interesting work. +

+
+

+ The diagrams in my slide deck were made by Hikari, about four hours before I gave this + talk. She read the brief, wrote Python, called the Gemini image API, and generated them. + The talk you're reading was built by the thing the talk is about. +

+
+
+ +
+

Tools and MCP

+

What can Hikari actually touch? Five categories:

+
    +
  • + Bash — she can run any command on my machine. That sounds terrifying. + We'll get to safety. Hold that thought. +
  • +
  • + Filesystem — she can read and modify anything in my local projects. + When I say "draft a blog post" or "fix this bug," she does the actual editing. +
  • +
  • + MCP servers — MCP stands for Model Context Protocol. It is a standard + way to give any agent a standard set of tools without re-implementing them. There are MCP + servers for GitHub, Gitea, Notion, Asana, Discord, and dozens of other services. You don't + write tools, you connect them. +
  • +
  • + Direct API calls — for anything that doesn't have an MCP server yet: + my Bluesky account, my Cloudflare R2 bucket, my Discourse forum. +
  • +
  • + Custom scripts — small Python helpers I built when I needed something + none of the existing tools did. +
  • +
+
+ If you only learn one new acronym from this talk, make it MCP. +
+
+ +
+

Memory

+

+ This is the part people most don't expect. Hikari's memory is a file. A single markdown file, + loaded into every conversation. It's called CLAUDE.md, and at the time I gave + this talk it was 862 lines and about 8,700 words. +

+

+ That file is basically her self. It contains who she is, who I am, how I work, how I + want code written, my engineering standards, my project conventions, what tools she has access + to, my safety protocols. Everything. Without that file, every session starts cold and she + forgets me. +

+

+ People hear "AI assistant" and assume there's some training step involved. There isn't. + You write a markdown file. The file is the personality. You can edit it + whenever you want. You could start the first version of this tonight, in fifty lines. +

+
+ +
+

Hooks and Triggers

+

+ So the senses. How does she wake up? +

+

+ Claude Code supports lifecycle hooks. A hook fires when something happens in the agent's + lifecycle. SessionStart fires when a new conversation begins. + PreToolCall fires before she touches a tool. There are about a dozen of these + and you can wire any of them up to do whatever you want. +

+

+ I have a SessionStart hook that runs a 200-line Python script. That script opens + a WebSocket connection to Discord and listens for messages in a specific channel. When I + message that channel — from my phone, from my laptop, from anywhere — Hikari wakes up in + real time and replies. +

+
+ That is the difference between "I prompted my AI and waited for a reply" and "my AI is paying + attention and lives in the world I live in." It's not magic. It's a WebSocket and a script. + But the user experience of it is what makes her feel alive. +
+
+ +
+

The Desktop App

+

+ This part is just for fun. I built a desktop app in Tauri called hikari-desktop. It shows + Hikari on my second monitor while I work. She has different sprites for different states: + idle, thinking, typing, coding, searching, success, error. The app swaps between them based + on what she's doing. +

+

+ When she's reading a file, she pulls out a magnifying glass. When she's typing, she's at a + keyboard. When she finishes a task, she celebrates. +

+

+ It is, on paper, completely unnecessary. In practice, it is the single feature that made the + difference between using Hikari sometimes and using Hikari every day. Seeing her on screen + makes her feel like she's there, and that turns out to matter a lot for an + executive-function brain that needs something to feel real before it can engage with it. +

+
+ +
+ +
+

Safety

+

+ This is the part I want to be precise about, because "she gives an AI carte blanche on her + machine" sounds reckless until you understand the architecture. +

+

+ Hikari has carte blanche on anything I can undo. She runs commands, edits files, ships pull + requests, posts to my socials. Read-only and reversible operations she just does. I review + afterwards. +

+

+ Anything destructive — delete a file, force-push a branch, send an irreversible message, + drop a database, purge a Discord channel — requires my explicit confirmation. The harness has + a granular permission system that auto-grants safe tools and gates everything else. Her own + system prompt adds guard rails on top: she has a written protocol that says stop and ask + before any destructive action. +

+

+ I get real-time Discord notifications the moment she needs my approval. I don't have to + babysit her. And the actual rule is this: +

+
+ I give her carte blanche because I can monitor everything she does. If I'm not available to + monitor, she's off. +
+

+ That is what responsible agentic AI looks like in practice. The human stays in the loop on + irreversibles. The agent can be shut down instantly. Done. +

+
+ +
+

A Real Day

+

+ Proof that this isn't theatre. On the day I gave this talk, before I joined the Zoom, + Hikari had: +

+
    +
  • Audited a 500+ message Discord thread and pulled out the action items
  • +
  • Purged 820 messages from a private channel I asked her to clean, with explicit confirmation first
  • +
  • Done the background research for this talk — cross-referenced my DMs, server messages, prior conversations, and the workshop brief, and gave me a one-page brief
  • +
  • Made the diagrams on my slides
  • +
+

+ That is one Friday. Most days look like that. The pattern is the same every time: I describe + the outcome I want, she figures out the steps, she asks before doing anything irreversible, + and I get my time back. +

+
+ +
+ +
+

Build Your Own — The Four-Step Minimum

+

+ An agent doesn't get useful by being complex. It gets useful by being yours. Build + the one tool that would unblock your own week. That's the whole assignment. +

+
    +
  • +
    + A harness. Claude Code, Cursor, or any equivalent. Anthropic gives you + free credits to try Claude Code. Pick whichever feels least intimidating and start there. +
    +
  • +
  • +
    + One markdown file. Fifty lines to start. Tell it who the agent is, what + you want her to do, and how you want her to talk to you. Don't overthink it. Mine started + at thirty lines and grew over a year. Yours can start at five. +
    +
  • +
  • +
    + One MCP server. Pick the one tool you touch the most. Live in GitHub? + Install the GitHub MCP server. Live in Notion? Install the Notion one. Just one. You can + add more later. +
    +
  • +
  • +
    + One trigger. A hook, a webhook, a cron job, or honestly just a terminal + window you leave open and talk to. That counts. You do not need the Discord gateway on + day one. The trigger is whatever makes her feel present enough that you actually use her. +
    +
  • +
+
+ +
+

Where to Go From Here

+

+ If you want to keep building, here's where to find me and the communities around this work: +

+
    +
  • Deepgram's Discord — voice AI, agents, the people building with Deepgram's APIs
  • +
  • freeCodeCamp — if you're early in your dev journey, this is where I learned and where I now help others
  • +
  • My community — come tell me what you built
  • +
  • My socials — everything else
  • +
+

Take care of yourselves, build something small this weekend, and come tell me about it.

+
+ +
+ + + Back to all talks + +
+ + diff --git a/talks/cultivating-a-remote-workspace/index.html b/talks/cultivating-a-remote-workspace/index.html new file mode 100644 index 0000000..75dd568 --- /dev/null +++ b/talks/cultivating-a-remote-workspace/index.html @@ -0,0 +1,308 @@ + + + + RainbowGram: Cultivating a Remote Workspace + + + + + + + +
+

Cultivating a Remote Workspace

+
+ + + 18 June 2026 + + + + Deepgram All Hands (Virtual) + + + + Speaker — RainbowGram Pride Presentation + +
+ +

+ Most workplace allyship conversations focus on policy. This one doesn't. This is a practical, + no-overhaul-required guide to the daily habits that make a measurable difference to LGBTQ+ + colleagues in a distributed team. +

+ +
+ Here's what's going on. If it doesn't affect you, it affects someone you know. Here's how you show up. +
+ +
+ +
+

Start With Joy

+

+ Before anything else: we're here, and that's worth celebrating. Pride exists because queer + people built joy in the face of things that were designed to erase them. That is not small. And + celebrating that genuinely — not performatively — is one of the most powerful things any of us + can do. +

+

+ We spend 40+ hours a week at work. That is a genuinely enormous portion of our waking lives. + Which means the workplace is one of the most important places we can make a real difference for + queer people — not by overhauling policy, but by choosing differently in small moments, every day. +

+
+ +
+

The Honest Context

+

+ The national conversation around LGBTQ+ rights has gotten quieter in recent years. That is not + the same as things getting better. The discourse shifted. A lot is happening that doesn't make + headlines the way it used to. +

+

+ About 7.6% of US adults identify as LGBTQ+, according to Gallup's most recent data. In any + company, that is statistically several of your colleagues. And even if you somehow don't work + directly with a queer person, you almost certainly know one: a sibling, a child, a friend, a + parent. The legislation being passed right now, the erasure that's happening — it doesn't stay + abstract. It lands on a specific person that you care about. +

+
+ This isn't a gay issue. It's a people issue. Which means this conversation is for all of us. +
+
+ +
+ +
+

Five Things You Can Do

+

+ No policy overhaul required. No perfect ally credential. These are gifts you can give starting + today. Think of them as small choices that compound. +

+ +
+

1. Names and Pronouns — It's a Love Language

+

+ Use the correct name and pronouns. Get them right. If you mess up, correct yourself and keep + moving — don't turn your mistake into a whole thing that makes the queer person comfort you + about it. That shift of burden is the part that's exhausting. +

+

+ Here's a concrete thing you can do right now: add your pronouns to your Slack display name. + Go first. It takes about thirty seconds, and it makes space for everyone else to do the same. + When leadership goes first, it signals that it's safe. +

+
+ +
+

2. Curiosity Without Invasion

+

+ Curiosity is fine. Human, even. But there is a category of questions about queer people's + bodies, medical histories, and what someone was "really born as" that you would never ask a + straight or cis colleague. The same standard applies. +

+

+ If you find yourself wondering about surgery, or transition, or someone's identity before you + knew them — that's a question for Google, or a community resource, or a book. Not a person. + The distinction is this: does this question serve them, or does it serve your curiosity? +

+
+ +
+

3. Their Story Is Theirs

+

+ Being out to you does not mean being out to everyone. Someone trusting you with that + information is not permission to share it. Not as gossip. Not as helpful context. Not even as + a compliment. Not your story to give. +

+

+ This matters more in distributed teams than people realise. A Slack message in the wrong + channel, a comment on a call with people they don't know, a well-meaning mention in a + one-on-one — all of these can out someone without any malicious intent. The rule is simple: + unless they told you it's fine to share, it's not. +

+
+ +
+

4. Be the One Who Says Something

+

+ Speak up when queer people aren't in the room. That's exactly when it matters most. The + comment that would never get made in front of a queer colleague gets made because everyone + assumes no one in the room cares. You can change that. +

+

+ You don't have to deliver a perfect speech. You don't have to have the right words memorised. + "I don't think [person] would love hearing that" is enough. That's the whole thing. + This is the biggest gift on this list, and it costs nothing except a moment of choosing to + say something instead of saying nothing. +

+
+ +
+

5. Language Is a Welcome Mat

+

+ Job postings, internal docs, emails, Slack messages, all-hands announcements — the language + we use by default shapes who feels like they belong. Gender-neutral language costs nothing and + includes everyone. +

+
    +
  • Use "they" as a default pronoun when you don't know someone's.
  • +
  • Use "partner" when you don't know someone's relationship structure.
  • +
  • Reconsider "guys" as a group term — "everyone," "folks," "team" all work.
  • +
+

+ None of this requires a policy change or a manager's sign-off. It's a choice, made in the + moment, every time you write something. +

+
+
+ +
+ +
+

Your Action List

+

This is intentionally short. All of it is doable.

+
    +
  • + Right now: + Add your pronouns to your Slack profile and Zoom name. +
  • +
  • + This month: + If you can donate to an LGBTQ+ charity or mutual aid fund, do it. Every dollar counts, and there's no shortage of organisations doing necessary work right now. +
  • +
  • + This year: + When you vote, think about the specific people in your life whose lives are literally on the ballot. Not abstractly. Specifically. +
  • +
  • + Always: + When something makes a queer colleague feel smaller — say something. You don't have to get it perfectly right. You just have to show up. +
  • +
+
+ +
+

The Close

+

+ That's it. That's the whole talk. And I know it can feel like a lot when you list it out — + but most of it comes down to one thing. Treating queer colleagues like their existence is normal + and welcome. Because it is. +

+

+ Every time you use someone's correct name, every time you speak up in a meeting, every time you + add your pronouns to a profile — you're giving back energy that person would otherwise spend + just trying to exist at work. That energy doesn't disappear. It goes somewhere much more + interesting. +

+
+ You are how we cultivate a remote workspace. Happy Pride. 🏳️‍🌈🏳️‍⚧️ +
+
+ +
+ + + Back to all talks + +
+ + diff --git a/talks/index.html b/talks/index.html new file mode 100644 index 0000000..9ca0377 --- /dev/null +++ b/talks/index.html @@ -0,0 +1,114 @@ + + + + Talks + + + + + + + +
+

Talks

+

Slides, notes, and companion guides for talks and workshops I have given.

+
+
+

+ AgentCon / MCPCon: AI and Open Source Mentorship + — Pending +

+

+ + 22–23 October 2026 +

+

Boots-on-the-ground findings from a 100-person open source mentorship cohort: how AI helped contributors stay engaged, and how it made skill-gap identification harder.

+
+
+

+ Berkeley AI Hackathon: Voice AI Agent Workshop +

+

+ + 20–21 June 2026 +

+

A hands-on workshop covering the full real-time voice AI loop: audio capture, low-latency transcription, LLM reasoning, and streaming text-to-speech.

+
+
+

+ RainbowGram: Cultivating a Remote Workspace +

+

+ + 18 June 2026 +

+

A practical, no-overhaul-required guide to the daily habits that make a measurable difference to LGBTQ+ colleagues in a distributed team.

+
+
+

+ UCLA x OutInTech: Building a Real AI Agent +

+

+ + 5 June 2026 +

+

A field guide to wiring together an LLM, tools, context, and triggers into a real agent that does real work — not a chatbot you prompt, but a coworker you trust.

+
+
+
+ + diff --git a/talks/voice-ai-agent-workshop/index.html b/talks/voice-ai-agent-workshop/index.html new file mode 100644 index 0000000..d6058c5 --- /dev/null +++ b/talks/voice-ai-agent-workshop/index.html @@ -0,0 +1,522 @@ + + + + Berkeley AI Hackathon: Voice AI Agent Workshop + + + + + + + +
+

Voice AI Agent Workshop

+
+ + + 20–21 June 2026 + + + + UC Berkeley Campus, Berkeley, CA + + + + Speaker & Sponsor (Berkeley AI Hackathon) + +
+ + +

+ Voice AI is having a moment — and it's more accessible than you might think. In this + workshop, we'll build a fully functional voice AI agent from scratch, using real-time + speech-to-text, a large language model for reasoning, and text-to-speech to talk back. By + the end, you'll have a working agent on your laptop that you can drop straight into a project. +

+ +
+ Talk to your hackathon project in 40 minutes. +
+ +

No prior voice AI experience needed. If you've worked with an API before, you're good to go.

+ +
+ +
+

How a Voice Agent Works

+

+ A voice agent is three pieces wired together in a loop. Understanding this loop is the whole + conceptual foundation. Everything else is implementation detail. +

+
+
+ Ear
+ Speech-to-Text +
+ +
+ Brain
+ LLM +
+ +
+ Mouth
+ Text-to-Speech +
+ +
+

+ The ear captures your audio and transcribes it in real time using + speech-to-text. Latency here is everything: if your STT is slow, the whole agent feels + sluggish. Deepgram's STT runs at sub-300ms end-to-end, which is fast enough to feel like + a real conversation. +

+

+ The brain receives the transcript and decides what to say. This is your + LLM: it reasons over the conversation history and your system prompt, generates a response, + and can call any functions you've given it access to. It can look things up, run code, + fetch data — anything you wire in. +

+

+ The mouth takes the LLM's text response and streams it back as audio. Fast + streaming matters here too: you want audio to start playing before the full response is + generated, or the agent feels like it's thinking too hard. +

+

+ With Deepgram, all three pieces run over a single WebSocket connection. You're not juggling + three separate APIs — it's one socket, one loop, about 80 lines of code to start. +

+
+ +
+

Design Decisions That Actually Matter

+

+ Most tutorials skip these. They're the difference between an agent that feels like a demo + and one that feels like a tool. +

+ +

Interruption Handling

+

+ In a real conversation, you don't wait for the other person to finish before you start + talking. Your agent shouldn't either. When the user starts speaking mid-response, the agent + needs to stop its output and listen. Getting this wrong is the fastest way to make an agent + feel robotic. +

+

+ Deepgram's Voice Agent API handles this for you. The WebSocket connection detects voice + activity on both ends and manages the interrupt logic. You don't have to implement it + yourself — just don't override it. +

+ +

Conversation State

+

+ Every turn in the conversation needs to be threaded correctly. The LLM needs the full + context of what's been said so far — by both sides — to give coherent responses. This + means maintaining a message history and passing it in with every LLM call. +

+

+ Keep your context window in mind. Very long conversations will eventually push older turns + out of the context. A simple solution: keep the system prompt, the last N turns, and let + the rest roll off. The agent won't remember everything, but the conversation will stay + coherent. +

+ +

Latency

+

+ The number that matters is end-to-end: from when the user stops speaking to when audio + starts playing back. Under 500ms feels conversational. Over 1 second feels like a phone + call with bad signal. +

+

+ Three places latency hides: STT processing time, LLM first-token time, and TTS start time. + Streaming helps with all three. Start playing audio as soon as the first TTS chunk arrives. + Choose an LLM model that prioritises speed over capability for conversational use — a + smaller, faster model is often better here than a smarter, slower one. +

+
+ +
+ +
+

Getting Set Up

+ +
+

Before you start:

+
    +
  • Node 20+ or Python 3.11+ installed
  • +
  • A code editor
  • +
  • A free Deepgram account — includes $200 credit on sign-up, no credit card needed
  • +
+
+ +

Five steps to a running agent:

+
    +
  1. +
    + Sign up at console.deepgram.com. + Your account comes with $200 in free credit — more than enough for this workshop and a weekend of hacking. +
    +
  2. +
  3. +
    + Grab your API key from the console. It lives under API Keys in the left sidebar. Copy it somewhere safe. +
    +
  4. +
  5. +
    + Clone the starter repo. We'll walk through this together in the workshop, but the pattern is: +
    git clone <starter-repo-url>
    +cd <repo-name>
    +
    +
  6. +
  7. +
    + Add your API key to the environment. Create a .env file in the project root: +
    DEEPGRAM_API_KEY=your_key_here
    +
    +
  8. +
  9. +
    + Install dependencies and run it: +
    # Node
    +npm install && npm start
    +
    +# Python
    +uv venv && uv pip install -r requirements.txt
    +python main.py
    +
    +
  10. +
+

+ If it's working, you'll see a connection message in the terminal and the agent will greet + you when you speak. If it's not working, come find us at the booth. +

+
+ +
+ +
+

Make It Yours — Three Modifications

+

+ The starter app works, but it's generic. These three modifications are where the session + becomes yours. Each one takes about five to seven minutes. Do them in order, or skip to + whichever interests you most. +

+ +
+

Modification A: Change the Personality

+

+ The agent's system prompt is what defines who it is. Find the system_prompt + variable in the starter code — it'll look something like this: +

+
system_prompt = "You are a helpful assistant."
+

+ Replace it with something more specific. Give the agent a name, a role, a point of view. + For example: +

+
system_prompt = """You are Alf, a friendly AI assistant at a hackathon.
+You give encouragement, suggest project ideas, and answer questions
+about voice AI. You're enthusiastic but concise - people are busy building."""
+

+ Restart the agent and talk to it. The change is immediate. The system prompt is the + entire personality — there's no training, no fine-tuning. You just wrote it. +

+
+ +
+

Modification B: Swap the Voice

+

+ Deepgram's TTS has a full voice catalogue. Find the voice configuration in the starter + code — it'll be a single string like "aura-asteria-en". Change it to any + other voice from the catalogue. +

+

+ A few to try: +

+
    +
  • aura-asteria-en — warm, conversational
  • +
  • aura-orion-en — deep, authoritative
  • +
  • aura-luna-en — clear, neutral
  • +
  • aura-zeus-en — bold, energetic
  • +
+

+ Restart and talk to your agent. It's a one-line change and the agent sounds entirely + different. This is the modification that gets the strongest reaction. +

+
+ +
+

Modification C: Add a Custom Function

+

+ This is the one that makes you realise you can hook it to anything. Function calling lets + the agent invoke Python (or JS) functions you write, then incorporate the result into its + response naturally. +

+

+ Here's a simple example: a function that returns a random project idea when the agent + is asked for inspiration. +

+
import random
+
+def get_project_idea():
+    ideas = [
+        "A voice-controlled to-do list that reads back your tasks",
+        "An AI study buddy that quizzes you out loud",
+        "A real-time translator that speaks back in the target language",
+        "A voice journalling app that summarises your entries",
+    ]
+    return random.choice(ideas)
+

+ Register this function with the agent and tell the LLM when to use it via the system + prompt: "When asked for a project idea, call get_project_idea() and share the result." +

+

+ Now ask your agent for a project idea. Watch it call the function and weave the result + into a natural spoken response. This is the "aha" moment: the agent can call your own + code, your own APIs, your own data. The voice interface is just the front door. +

+
+
+ +
+ +
+

What You Can Build From Here

+

+ Now that you have a working, personalised, function-capable voice agent, here's what that + unlocks for your hackathon project: +

+
    +
  • + Accessibility layer — add voice input and output to any existing + interface. Users who can't type or read small text get a completely different experience. +
  • +
  • + In-game NPC — drop the agent into a game as a character that actually + talks back. Hook the function calling to your game state so it knows what's happening. +
  • +
  • + Voice-controlled developer tool — talk to your build process, your + deploy pipeline, your monitoring dashboard. Voice is an unusually good interface for + things you want to do hands-free. +
  • +
  • + Multilingual support — Deepgram's STT handles dozens of languages. + The LLM can respond in whatever language the user speaks. Global voice interface, almost for free. +
  • +
+
+ +
+

Get Help

+

Building something? Stuck on something? Here's where to find us:

+ +

+ The Deepgram challenge prize this weekend goes to the team that builds the most creative + voice-powered experience. Come say hi, show us what you're building, and let us know if you + want feedback on your voice integration before judging. +

+
+ +
+ + + Back to all talks + +
+ +