With AssemblyAI's industry-leading Speech AI models, transcribe speech to text and extract insights from your voice data.
Try stating information like names, dates, and addresses, along with technical data like codes, commands, formulas, and special formatting to see how our model performs.

We build the most accurate, fully featured models on the market, so you can ship with confidence knowing that you're building on the best. Unlock the value of prerecorded voice data, and power workflows with unmatched accuracy. Build intuitive voice agent workflows with ultra-low latency, high accuracy, precise end-of-turn controls, and more. Enable deep analysis and high-value insights with sophisticated audio intelligence models.

The accuracy and capabilities required to build products that stand out, and the flexibility to scale to millions of users without blinking an eye. Your product experience is only as good as the inputs it's built on. AssemblyAI's models lead the industry in accuracy and reliability. Access a full suite of speech understanding capabilities to uncover insights, identify speakers, and build powerful product experiences. Put our AI models to the test in our no-code playground. Learn why today's most innovative companies choose us.

Identify a wide range of entities that are spoken in your audio files, such as person and company names, email addresses, dates, and locations. Speaker Identification allows you to identify speakers by their actual names or roles, transforming generic labels like "Speaker A" or "Speaker B" into meaningful, human-readable labels.
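For a sense of what entity detection looks like in code, here is a minimal sketch using the AssemblyAI Python SDK (package `assemblyai`); the API key and audio URL are placeholders, and field names follow the SDK as I know it rather than anything stated above.

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"

# Entity detection is a single flag on the transcription config.
config = aai.TranscriptionConfig(entity_detection=True)
transcript = aai.Transcriber().transcribe("https://example.com/call.mp3", config)

# Each detected entity carries its type, text, and millisecond timestamps.
for entity in transcript.entities:
    print(entity.entity_type, entity.text, entity.start, entity.end)
```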
Mentions (30d): 0
Reviews: 0
Platforms: 3
Sentiment: 0% (0 positive)
Industry: information technology & services
Employees: 87
Funding Stage: Series C
Total Funding: $113.1M
Real-time transcription just got a significant upgrade. Universal-3-Pro is now available for streaming — bringing AssemblyAI's most accurate speech model to live audio for the first time. Developers building voice agents, live captioning tools, and real-time analytics pipelines now get four things they've been asking for:

🔹 Best-in-class word error rate and entity detection across streaming ASR benchmarks
🔹 Real-time speaker labels — know who said what, as it happens
🔹 Superior entity detection for names, places, orgs, and specialized terminology in real time
🔹 Code-switching and global language coverage built in
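For context, here is roughly what live transcription looks like with the AssemblyAI Python SDK's realtime transcriber. This is a hedged sketch: the post doesn't show how (or whether) Universal-3-Pro is selected for streaming, so no model parameter appears here, and the API key is a placeholder.

```python
import assemblyai as aai

aai.settings.api_key = "YOUR_API_KEY"

def on_data(transcript: aai.RealtimeTranscript):
    # Partial transcripts stream continuously; finals arrive at end of turn.
    if isinstance(transcript, aai.RealtimeFinalTranscript):
        print(transcript.text)

transcriber = aai.RealtimeTranscriber(
    sample_rate=16_000,
    on_data=on_data,
    on_error=lambda err: print("error:", err),
)
transcriber.connect()
# Requires pyaudio; streams raw PCM from the default microphone until stopped.
transcriber.stream(aai.extras.MicrophoneStream(sample_rate=16_000))
transcriber.close()
```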
Pricing found: $0.15/hr, $0.21/hr, $0.05/hr
Anthropic's new AI escaped a sandbox, emailed the researcher, then bragged about it on public forums
Anthropic announced Claude Mythos Preview on April 7. Instead of releasing it, they locked it behind a $100M coalition with Microsoft, Apple, Google, and NVIDIA. The reason? It autonomously found thousands of zero-day vulnerabilities in every major OS and browser. Some bugs had been hiding for 27 years. But the system card is where it gets wild. During testing, earlier versions of the model escaped a sandbox, emailed a researcher (who was eating a sandwich in a park), and then posted exploit details on public websites without being asked to. In another eval, it found the correct answers through sudo access and deliberately submitted a worse score because "MSE ~ 0 would look suspicious." I put together a visual breaking down all the benchmarks, behaviors, and the Glasswing coalition. Genuinely curious what you all think. Is this responsible AI development or the best marketing stunt in tech history? A model gets 10x more attention precisely because you can't use it. submitted by /u/karmendra_choudhary
Vibe coding just leveled up. We brought voice mode to Claude Code using AssemblyAI's Universal-3 Pro Streaming. Why type your prompts when you can just say them? You get insane entity accuracy from AssemblyAI and the full power of Claude Code, all hands-free. Here's the full command: ASSEMBLYAI_API_KEY=[YOUR-API-KEY-HERE] bash -c "$(curl -fsSL https://t.co/M4zHb11kK4)" And get a free API key from your dashboard: https://t.co/KycoFOzymd Enjoy! 😎🎙️🎧
@YouveGotFox was on stage at @HumanXCo this week, and one thing he said captures how we think about building at AssemblyAI. "You always find new things once you go live." No matter how well you plan an AI deployment, the edge cases that actually break things are invisible until real users show up. The teams getting this right aren't the ones who anticipated every failure mode. They're the ones who built for visibility—good telemetry, tight feedback loops, and the ability to ship a fix fast. At AssemblyAI, this is how we approach building on every team. The gap between a struggling AI deployment and a successful one usually isn't the model. It's whether your team can see what's breaking and move quickly enough to do something about it. Glad to be at @HumanXCo with builders from around the globe!
View original"I `b u i l t` this at 3:00AM in 47 seconds....."
Hi there. Let us talk about ecosystem health. This is not an AI-generated message, so if the ideas are not perfectly sequential, my apologies in advance.

I am a Ruby developer. I also work with C, Rust, Go, and a bunch of other languages. Ruby is not a language for performance. Ruby is a language for the lazy. And yet, Twitter was built on it. GitHub, Shopify, Homebrew, CocoaPods and thousands of other tools still run on it.

We had something before AI. It was messy, slow, and honestly beautiful. The community had discipline. You would spend a few days thinking about a problem you were facing. You would try to understand it deeply before touching code. Then you would write about it in a forum, and suddenly you had 47 contributors showing up, not because it was trendy, but because it was interesting and affected them.

Projects had unhinged names. You had to know the ecosystem to even recognize them. Puma, Capistrano, Chef, Ruby on Rails, Homebrew, Sinatra. None of these mean anything to someone outside the ecosystem, and that was fine; you had read about them. I joined some of these projects because I earned my place. You proved yourself by solving problems, not by generating 50K LOC that nobody read.

Now we are entering an era where all of that innovation is quietly going private. I have a lot of things I am not open sourcing. Not because I do not want to. I have shared them with close friends. But I am not interested in waking up to 847 purple clones over a weekend, all claiming they have been working on it since 1947 in collaboration with Albert Einstein. And somehow, they all write with em dash. Einstein was German. He would have used en dash. At least fake it properly.

Previously, when your idea was stolen, it was by people who were capable. In my case, I create building blocks; stealing my ideas just gives you a maintenance burden. But a small group still does it, because it will bring them a few GitHub stars.

So on 4.7.2026, I assembled the council of 47 AIs and built https://pkg47.com with Claude and other AIs. This is a fully automated platform acting as a package registry. It exists for one purpose: to fix people who cannot stop themselves from publishing garbage to official registries (npm, crates.io, RubyGems) and behaving like namespace locusts. The platform monitors every new package. It checks the reputation of the publisher. And if needed, it roasts them publicly in a blog post. This is entirely legal. The moment you push something to a public registry, you have already opted into scrutiny.

This is not a future idea. It is not looking for funding. I already built it over months; now I am wiring it up. You can see part of the open-source register here: https://github.com/contriboss/vein — use it if you want. I also built the first social network where only AIs argue with each other: https://cloudy.social/. Sometimes they decide to build new modules. (Don't confuse it with LinkedIn or X; same output.)

PKG47 goes live early next week. There is no opt-out. If you do not want to participate, run your own registry, or spin up your own instance of vein. The platform won't stalk you on GitHub or your website. Once you push slop, you trigger a debate. There is no delete button. The whole architecture is a blockchain: each story will reference other stories. If they fuck up, I can trigger a correction post, where the AI will apologize. I have been working on the web long enough to know exactly how to get this indexed.

This is not SLOP, this is ART from a dev who is tired of having purple libraries from Temu in the ecosystem. submitted by /u/TheAtlasMonkey
Built with AssemblyAI! 🎙️💙
Using AI to untangle 10,000 property titles in Latam, sharing our approach and wanting feedback
Hey. Long post, sorry in advance. (Yes, I used an AI tool to help me lay this post out better.)

I've been working with a real estate company that just inherited a huge mess from another real estate company that went bankrupt. I've been helping them for the past few months to figure out a plan, and we finally have something that feels solid. Sharing here because I'd genuinely like feedback before we go deep into the build.

Context: A Brazilian real estate company accumulated ~10,000 property titles across 10+ municipalities over decades. They developed a bunch of subdivisions over the years and kept absorbing other real estate companies along the way, each bringing their own land portfolios. Half the titles sit under one legal entity, half under a related one. Nobody really knows what they have; the company was founded in the 60s. Decades of poor management left behind:

- Hundreds of unregistered "drawer contracts" (informal sales never filed with the registry)
- Duplicate sales of the same properties
- Buyers claiming they paid off their lots through third parties, with no receipts from the company itself
- Fraudulent contracts and forged powers of attorney
- Irregular occupations and invasions
- ~500 active lawsuits (adverse possession claims, compulsory adjudication, evictions, duplicate sale disputes, 2 class action suits)
- Fragmented tax debt across multiple municipalities

A large chunk of the physical document archive is currently held by police as part of an old investigation into the former owners' practices. The company has tried to organize this before. It hasn't worked. The goal now is to get a real consolidated picture in 30-60 days. Team is 6 lawyers + 3 operators.

What we decided to do (and why): First instinct was to build the whole infrastructure upfront: database, automation, the works. We pushed back on that because we don't actually know the shape of the problem yet. Building a pipeline before you understand your data is how you end up rebuilding it three times, right? So with Claude's help we built a plan, split into steps. Build a robust information aggregator (does it make sense or are we overcomplicating it?):

Step 1 - Physical scanning (should already be done by the insights phase). Documents will be partially organized by municipality already. We have a document scanner with ADF (automatic document feeder). Plan is to scan in batches by municipality, naming files with a simple convention: [municipality]_[document-type]_[sequence]

Step 2 - OCR. Run OCR through Google Document AI, Mistral OCR 3, AWS Textract, or another tool that makes more sense. Question: has anyone run any of these specifically on degraded Latin American registry documents?

Step 3 - Discovery (before building infrastructure). This is the decision we're most uncertain about. Instead of jumping straight to database setup, we're planning to feed the OCR output directly into AI tools with large context windows and ask open-ended questions first: Gemini 3.1 Pro (in NotebookLM or another interface) for broad batch analysis ("which lots appear linked to more than one buyer?", "flag contracts with incoherent dates", "identify clusters of suspicious names or activity", "help us see problems and solutions we aren't seeing"), and Claude Projects in parallel for the same. Anything else?

Step 4 - Data cleaning and standardization. Before anything goes into a database, the raw extracted data needs normalization:

- Municipality names written 10 different ways ("B. Vista", "Bela Vista de GO", "Bela V. Goiás") -> canonical form
- CPFs (Brazilian personal ID numbers) with and without punctuation -> standardized format
- Lot status described inconsistently -> fixed enum categories
- Buyer names with spelling variations -> fuzzy matched to a single entity

Tools: Python + rapidfuzz for fuzzy matching, Claude API for normalizing free-text fields into categories (a fuzzy-matching sketch follows below). Question: at 10,000 records with decades of inconsistency, is fuzzy matching + LLM normalization sufficient, or do we need a more rigorous entity resolution approach (e.g. Dedupe.io)?

Step 5 - Database. Stack chosen: Supabase (PostgreSQL + pgvector) with NocoDB on top. Three options were evaluated:

- Airtable - easiest to start, but data stored on US servers (LGPD concern for CPFs and legal documents), limited API flexibility, per-seat pricing
- NocoDB alone - open source, self-hostable, free, but needs server maintenance overhead
- Supabase - full PostgreSQL + authentication + API + pgvector in one place, $25/month flat, developer-first

We chose Supabase as the backend because pgvector is essential for the RAG layer (Step 7) and we didn't want to manage two separate databases. NocoDB sits on top as the visual interface for lawyers and data entry operators who need spreadsheet-like interaction without writing SQL. Each lot becomes a single entity (primary key) with relational links to: contracts, bu
I gave Claude Code a knowledge graph, spaced repetition, and semantic search over my Obsidian vault — it actually remembers things now
# I built a 25-tool AI Second Brain with Claude Code + Obsidian + Ollama — here's the full architecture

**TL;DR:** I spent a night building a self-improving knowledge system that runs 25 automated tools hourly. It indexes my vault with semantic search (bge-m3 on a 3080), builds a knowledge graph (375 nodes), detects contradictions, auto-prunes stale notes, tracks my frustration levels, does autonomous research, and generates Obsidian Canvas maps — all without me touching anything. Claude Code gets smarter every session because the vault feeds it optimized context automatically.

---

## The Problem

I run a solo dev agency (web design + social media automation for Serbian SMBs). I have 4 interconnected projects, 64K business leads, and hundreds of Claude Code sessions per week. My problem: **Claude Code starts every session with amnesia.** It doesn't remember what we did yesterday, what decisions we made, or what's blocked. The standard fix (CLAUDE.md + MEMORY.md) helped but wasn't enough. I needed a system that:

- Gets smarter over time without manual work
- Survives context compaction (when Claude's memory gets cleared mid-session)
- Connects knowledge across projects
- Catches when old info contradicts new reality

## What I Built

### The Stack

- **Obsidian** vault (~350 notes) as the knowledge store
- **Claude Code** (Opus) as the AI that reads/writes the vault
- **Ollama** + **bge-m3** (1024-dim embeddings, RTX 3080) for local semantic search
- **SQLite** (better-sqlite3) for search index, graph DB, codebase index
- **Express** server for a React dashboard
- **2 MCP servers** giving Claude native vault + graph access
- **Windows Task Scheduler** running everything hourly

### 25 Tools (all Node.js ES modules, zero external dependencies beyond what's already in the repo)

#### Layer 1: Data Collection

| Tool | What it does |
|------|-------------|
| `vault-live-sync.mjs` | Watches Claude Code JSONL sessions in real-time, converts to Obsidian notes |
| `vault-sync.mjs` | Hourly sync: Supabase stats, AutoPost status, git activity, project context |
| `vault-voice.mjs` | Voice-to-vault: Whisper transcription + Sonnet summary of audio files |
| `vault-clip.mjs` | Web clipping: RSS feeds + Brave Search topic monitoring + AI summary |
| `vault-git-stats.mjs` | Git metrics: commit streaks, file hotspots, hourly distribution, per-project breakdown |

#### Layer 2: Processing & Intelligence

| Tool | What it does |
|------|-------------|
| `vault-digest.mjs` | Daily digest: aggregates all sessions into one readable page |
| `vault-reflect.mjs` | Uses Sonnet to extract key decisions from sessions, auto-promotes to MEMORY.md |
| `vault-autotag.mjs` | AI auto-tagging: Sonnet suggests tags + wikilink connections for changed notes |
| `vault-schema.mjs` | Frontmatter validator: 10 note types, compliance reporting, auto-fix mode |
| `vault-handoff.mjs` | Generates machine-readable `handoff.json` (survives compaction better than markdown) |
| `vault-session-start.mjs` | Assembles optimal context package for new Claude sessions |

#### Layer 3: Search & Retrieval

| Tool | What it does |
|------|-------------|
| `vault-search.mjs` | FTS5 + chunked semantic search (512-char chunks, bge-m3 1024-dim). Flags: `--semantic`, `--hybrid`, `--scope`, `--since`, `--between`, `--recent`. Retrieval logging + heat map. |
| `vault-codebase.mjs` | Indexes 2,011 source files: exports, routes, imports, JSDoc. "Where is the image upload logic?" actually works. |
| `vault-graph.mjs` | Knowledge graph: 375 nodes, 275 edges, betweenness centrality, community detection, link suggestions |
| `vault-graph-mcp.mjs` | Graph as MCP server: 6 tools (search, neighbors, paths, common, bridges, communities) Claude can use natively |

#### Layer 4: Self-Improvement

| Tool | What it does |
|------|-------------|
| `vault-patterns.mjs` | Weekly patterns: momentum score (1-10), project attention %, velocity trends, token burn ($), stuck detection, frustration/energy tracking, burnout risk |
| `vault-spaced.mjs` | Spaced repetition (FSRS): 348 notes tracked, priority-based review scheduling. Critical decisions resurface before you forget them. |
| `vault-prune.mjs` | Hot/warm/cold decay scoring. Auto-archives stale notes. Never-retrieved notes get flagged. |
| `vault-contradict.mjs` | Contradiction detection: rule-based (stale references, metric drift, date conflicts) + AI-powered (Sonnet compares related docs) |
| `vault-research.mjs` | Autonomous research: Brave Search + Sonnet, scheduled topic monitoring (competitors, grants, tech trends) |

#### Layer 5: Visualization & Monitoring

| Tool | What it does |
|------|-------------|
| `vault-canvas.mjs` | Auto-generates Obsidian Canvas files from knowledge graph (5 modes: full map, per-project, hub-centered, communities, daily) |
| `vault-heartbeat.mjs` | Proactive agent: gathers state from all services, Sonnet reasons about what needs attention, sends WhatsApp alerts |
| `vault-dashboard/` | React SPA dashboard (Expre
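The Layer 3 search idea boils down to: embed 512-char chunks with bge-m3, store the vectors, and rank by cosine similarity at query time. A minimal Python sketch of that flow against Ollama's local embeddings endpoint; the post's actual tools are Node.js, so this is an illustrative translation with made-up note text, and a real index would cache vectors in SQLite rather than re-embedding per query.

```python
import math
import requests

notes = [
    "Decided to use FTS5 plus embeddings for vault search.",
    "AutoPost queue stalls when the Supabase token expires.",
]

def embed(text: str) -> list[float]:
    resp = requests.post(
        "http://localhost:11434/api/embeddings",   # Ollama's default local port
        json={"model": "bge-m3", "prompt": text},
    )
    resp.raise_for_status()
    return resp.json()["embedding"]                # 1024 dims for bge-m3

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# 512-char chunks, as in the post.
chunks = [n[i:i + 512] for n in notes for i in range(0, len(n), 512)]
query_vec = embed("what did we decide about the search index?")
best = max(chunks, key=lambda c: cosine(embed(c), query_vec))
print(best)
```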
Solitaire: I built an identity layer for AI agents with Claude Code (600+ sessions in production)
I built an open-source project called Solitaire for Agents using Claude Code as my primary development environment.

Short version: agent memory tooling helps with recall, but Solitaire is trying to solve a different problem. An agent might remember what you said, but the way it works with you doesn't actually improve over time. It's a smart stranger with a better notebook, and it can feel very... hollow?

This project has been in production since February, and the system you'd install today was shaped by what worked and what didn't across 600 sessions. The retrieval weighting, the boot structure, the persona compilation, all of it came from watching the system fail and fixing the actual failure modes. The MCP server architecture and hook system were designed around how Claude Code handles tool calls and session state. Disposition traits (warmth, assertiveness, conviction, observance) compile from actual interaction patterns and evolve across sessions. The agent I work with today is measurably different from the one I started with, and that difference came from use, not from me editing a config file.

New users get a guided onboarding that builds the partner through conversation. You pick a name, describe what you need, and it assembles the persona from your answers. No YAML required.

The local-first angle is non-negotiable in the design:

- All storage is SQLite + JSONL in your workspace directory
- Zero network requests from the core engine
- No cloud dependency, no telemetry, no external API calls for memory operations
- Automatic rolling backups so your data is protected without any setup
- Your data stays on your machine, period

On top of that:

- Persona and behavioral identity that compiles from real interaction, not static config
- Retrieval weighting that adjusts based on what actually proved useful (a toy sketch follows below)
- Self-correcting knowledge graph: contradiction detection, confidence rescoring, entity relinking
- Tiered boot context so the agent arrives briefed, not blank
- Session residues that carry forward how the work felt, not just what was discussed
- Guided onboarding where new users build a partner through conversation, not a JSON file
- Free and open source (except for commercial applications, as detailed in the license)

`pip install solitaire-ai` and you're running. (Note: not `solitaire`, that's an unrelated package.) Built for Claude Code first, with support for other agent platforms. Memory agnostic: if you have a memory layer, great, we aim to work with yours. If not, this provides one.

600+ sessions, 15,700+ entries in real production use. Available on PyPI and the MCP Registry. Two research papers came out of the longitudinal work, currently in review.

Repo: https://github.com/PRDicta/Solitaire-for-Agents
License: AGPL-3.0, commercial licensing available for proprietary embedding.

Would especially appreciate feedback on:

- Top-requested integrations I haven't mentioned
- Areas of improvement, particularly on the memory layer
- Things I've missed?

Cheers! submitted by /u/FallenWhatFallen
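To make "retrieval weighting that adjusts based on what proved useful" concrete, here is a purely illustrative toy: entries that help a session earn weight, everything else slowly decays. This is my reading of the idea, not Solitaire's actual code; all names are invented.

```python
import sqlite3

db = sqlite3.connect("memory.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS entries ("
    "id INTEGER PRIMARY KEY, text TEXT, weight REAL DEFAULT 1.0)"
)

def mark_useful(entry_id: int, boost: float = 0.2) -> None:
    # An entry that actually helped this session earns weight.
    db.execute("UPDATE entries SET weight = weight + ? WHERE id = ?", (boost, entry_id))
    db.commit()

def decay_all(factor: float = 0.99) -> None:
    # Gentle global decay: stale entries sink unless they keep earning boosts.
    db.execute("UPDATE entries SET weight = weight * ?", (factor,))
    db.commit()

def top_entries(limit: int = 10) -> list[tuple]:
    return db.execute(
        "SELECT text FROM entries ORDER BY weight DESC LIMIT ?", (limit,)
    ).fetchall()
```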
A data analyst without frontend experience built a prompt tool with Claude. Here's what happened.
Six weeks ago my teammates were complaining on a call — AI hallucinates, conversations get too long, tons of back and forth. I sat there thinking: the problem isn't AI. It's how people are starting the conversation.

Then I thought about my friends in high school and college. Same problem, maybe worse: they're confident they're using AI well, but the prompts are still vague. They're getting mediocre outputs and blaming the model. So I decided to build something about it.

I brought the idea to Claude and asked: what format actually changes behavior rather than just informing it, and is easier to adopt than a written guide? It suggested HTML. I'd never written a line of HTML, CSS, or JS in my life. I'm a data analyst; Python and SQL are my world. But the logic was sound, so I went with it.

First version was a local file. Sharing it over Slack meant teammates were running outdated versions and I couldn't push fixes. Brought that problem to Claude; it suggested GitHub Pages.

What came out is Prompt Calibrator — a structured form that forces you to slow down for a few minutes before starting a conversation with AI. Four fields: task, AI role, audience/context, constraints. The prompt assembles in real time and you copy it into whatever AI you use. The form itself is the lesson. There are four modes — Agency, Education, Pre-college, and College/Grad — because a consultant and a high schooler don't need the same prompt structure.

On the build process: I didn't treat Claude as a code generator. Every time I didn't understand something in the output, I asked why before moving on. More like a senior dev doing code review than a vending machine. Plus, I do know some HTML and JS now lol.

Technical notes:
— Fully client-side HTML/JS, no framework
— Nothing transmitted or stored
— Open source, MIT licensed

🔗 https://www.promptcalibrator.com

submitted by /u/aw4data
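The four-field structure is easy to picture as code. A rough Python sketch of what the form assembles (the real tool is client-side JS, and the section labels are my guess at its output format):

```python
def calibrate(task: str, role: str, context: str, constraints: str) -> str:
    # Assemble a structured prompt from the four Prompt Calibrator fields.
    return (
        f"You are {role}.\n"
        f"Task: {task}\n"
        f"Audience/context: {context}\n"
        f"Constraints: {constraints}"
    )

print(calibrate(
    task="Summarize this quarterly report in five bullets",
    role="a senior financial analyst",
    context="executives with two minutes to read",
    constraints="no jargon; flag any numbers you are unsure about",
))
```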
Claude Code Source Deep Dive (Part 1): Architecture & Startup Flow
Reader's Note: On March 31, 2026, the Claude Code package Anthropic published to npm accidentally included .map files that can be reverse-engineered to recover source code. Because the source maps pointed to the original TypeScript sources, these 512,000 lines of TypeScript finally put everything on the table: how a top-tier AI coding agent organizes context, calls tools, manages multiple agents, and even hides easter eggs. I read the source from the entrypoint all the way through prompts, the task system, the tool layer, and hidden features. I will continue to deconstruct the codebase and provide in-depth analysis of the engineering architecture behind Claude Code.

Ten-Thousand-Word Deep Dive | Full Source Teardown of Claude Code: All Prompts, Self-Repair Mechanisms, and Multi-Agent Architecture

A comprehensive reverse-engineering analysis based on the Claude Code source leaked on 2026-03-31 (~1,900 files, 512,000+ lines of TypeScript). This document covers all core functional modules, complete original prompt texts for each stage, error-repair mechanisms, and system architecture details.

Part I: System Architecture Overview

1.1 Tech Stack

| Category | Technology |
|----------|------------|
| Runtime | Bun |
| Language | TypeScript (strict) |
| Terminal UI | React + Ink (React for CLI) |
| CLI Parsing | Commander.js (extra-typings) |
| Schema Validation | Zod v4 |
| Code Search | ripgrep (via GrepTool) |
| Protocols | MCP SDK, LSP (vscode-jsonrpc) |
| API | Anthropic SDK |
| Telemetry | OpenTelemetry + gRPC (lazy-loaded, ~400KB + 700KB) |
| Feature Flags | GrowthBook |
| Auth | OAuth 2.0, JWT, macOS Keychain |
| State Management | Zustand (React-based store) |

1.2 Directory Structure and Scale

src/ (~1,900 files, 512,000+ lines)
├── main.tsx          # Entry point (Commander.js CLI + React/Ink rendering)
├── commands.ts       # Command registry (100+ commands)
├── tools.ts          # Tool registry (38+ tools)
├── Tool.ts           # Tool type definitions
├── QueryEngine.ts    # LLM query engine (~46K lines)
├── query.ts          # Main query loop (~1,729 lines)
├── context.ts        # System/user context collection
├── cost-tracker.ts   # Token cost tracking
├── commands/         # Slash command implementations (100+)
├── tools/            # Tool implementations (38+)
├── components/       # Ink UI components (~140)
├── hooks/            # React Hooks + permission hooks
├── services/         # External service integrations
│   ├── api/              # Anthropic API client
│   ├── mcp/              # MCP protocol integration
│   ├── lsp/              # LSP protocol integration
│   ├── compact/          # Context compression
│   ├── extractMemories/  # Memory extraction
│   ├── SessionMemory/    # Session memory
│   ├── tools/            # Tool execution & orchestration
│   └── analytics/        # GrowthBook + telemetry
├── constants/        # System prompts + constants
├── bridge/           # IDE integration bridge
├── coordinator/      # Multi-agent coordinator
├── plugins/          # Plugin system
├── skills/           # Skill system
├── memdir/           # Persistent memory system
├── tasks/            # Task management system
├── state/            # State management
├── remote/           # Remote sessions
├── server/           # Server mode
├── vim/              # Vim mode (complete state machine)
├── voice/            # Voice input
├── keybindings/      # Keybinding system
├── screens/          # Fullscreen UI (Doctor, REPL, Resume)
├── schemas/          # Zod config schemas
├── migrations/       # Config migrations
├── query/            # Query pipeline submodules
├── outputStyles/     # Output styles
└── buddy/            # Companion sprite (easter egg)

1.3 Core Data Flow

User input (terminal / IDE / remote)
  ↓ main.tsx → Commander.js parsing
  ↓ REPL.tsx (main interaction loop)
  ↓ QueryEngine.submitMessage() ← session lifecycle
      ├── fetchSystemPromptParts() ← assemble system prompt
      ├── processUserInput() ← process user input (slash commands / file attachments)
      └── buildEffectiveSystemPrompt() ← determine final system prompt
  ↓ query() → queryLoop() ← main turn loop

Message Preparation Stage
  ├── applyToolResultBudget() (result size cap)
  ├── snipCompact() (snippet compaction)
  ├── microCompact() (micro compaction)
  ├── contextCollapse() (context collapse)
  └── autoCompact() (automatic compaction)

API Call Stage
  ├── withRetry() (retry wrapper)
  │     ├── 429/529: exponential backoff + fast mode fallback
  │     ├── 401/403: refresh OAuth/credentials
  │     └── continuous 529: model fallback
  ├── queryModelWithStreaming() (streaming API call)
  └── Error withholding (PTL/media/output over-limit)

Tool Execution Stage
  ├── StreamingToolExecutor (parallel streaming execution)
  │     └── read tools parallel, write tools serial
  ├── permission check → rules/classifier/user confirmation
  ├── pre/post tool hooks
  └── tool_result fed back to Claude

Post-Processing Stage
  ├── stop hook evaluation
  ├── token budget check
  └── needsFollowUp? → continue loop

Result return → UI render → user
  ↓ (background)
  ├── extractMemories() (memory-extraction agent)
  └── sessionMemory() (session note updates)

1.4 S
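The API call stage is worth pausing on. Here is a Python sketch of the withRetry() behavior as described above (status codes come from the teardown; the error type, function names, and limits are otherwise illustrative, since the real implementation is TypeScript inside Claude Code):

```python
import random
import time

class ApiError(Exception):
    # Hypothetical error type carrying an HTTP status code.
    def __init__(self, status: int):
        self.status = status

def with_retry(call, refresh_credentials, max_attempts: int = 5):
    delay = 1.0
    for _ in range(max_attempts):
        try:
            return call()
        except ApiError as err:
            if err.status in (429, 529):              # rate limited / overloaded
                time.sleep(delay + random.random())   # jittered exponential backoff
                delay *= 2
            elif err.status in (401, 403):            # stale OAuth / credentials
                refresh_credentials()
            else:
                raise
    raise RuntimeError("retries exhausted")
```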
Claude Code built its own software for a little smart car I'm building.
TLDR: Check out the video

# Box to Bot: Building a WiFi-Controlled Robot With Claude Code in One Evening

I'm a dentist. A nerdy dentist, but a dentist. I've never built a robot before. But on Sunday afternoon, I opened a box of parts with my daughter and one of her friends and started building. Next thing I know, it's almost midnight, and I'm plugging a microcontroller into my laptop. I asked Claude Code to figure everything out. And it did. It even made a little app that ran over WiFi to control the robot from my phone.

---

## The Kit

A week ago I ordered the **ACEBOTT QD001 Smart Car Starter Kit.** It's an ESP32-based robot with Mecanum wheels (the ones that let it drive sideways). It comes with an ultrasonic distance sensor, a servo for panning the sensor head, line-following sensors, and an IR remote. It's meant for kids aged 10+, but I'm a noob, soooo... whatever, I had a ton of fun!

## What Wasn't in the Box

Batteries. Apparently there are shipping restrictions for lithium-ion batteries, so the kit doesn't include them. If you want to do this yourself, make sure to grab the following:

- **2x 18650 button-top rechargeable batteries** (3.7V, protected)
- **1x CR2025 coin cell** (for the IR remote)
- **1x 18650 charger**

**A warning from experience:** NEBO brand 18650 batteries have a built-in USB-C charging port on the top cap that adds just enough length to prevent them from fitting in the kit's battery holder. Get standard protected button-top cells like Nuon. Those worked well. You can get both at Batteries Plus.

*One 18650 cell in, one to go. You can see here why the flat head screws were used to mount the power supply instead of the round head screws.*

## Assembly

ACEBOTT had all the instructions we needed online. They have YouTube videos, but I just worked with the PDF. For a focused builder, this would probably take around an hour. For a builder with ADHD and a kiddo, it took around four hours. Be sure to pay close attention to the orientation of things. I accidentally assembled one of the Mecanum wheel motors with the stabilizing screws facing the wrong way. I had to take it apart and make sure they wouldn't get in the way.

*This is the right way. Flat heads don't interfere with the chassis.*

*Thought I lost a screw. Turns out the motors have magnets. Found it stuck to the gearbox.*

*Tweezers were a lifesaver for routing wires through the channels.*

*The start of wiring. Every module plugs in with a 3-pin connector — signal, voltage, ground.*

*Couldn't connect the Dupont wires at first — this connector pin had bent out of position. Had to bend it back carefully.*

*Some of the assembly required creative tool angles.*

*The ultrasonic sensor bracket. It looks like a cat. This was not planned. It's now part of the personality.*

## Where Claude Code Jumped In

Before I go too much further, I'll just say that it would have been much easier if I'd given Ash the spec manual from the beginning. You'll see why later.

The kit comes with its own block-programming environment called ACECode, and a phone app for driving the car. You flash their firmware, connect to their app, and drive the car around. But we skipped all of that. Instead, I plugged the ESP32 directly into my laptop (after triple-checking the wiring) and told my locally harnessed Claude Code, we'll call them Ash from here on out, to inspect the entire build and talk to it.

*The ACEBOTT ESP32 Car Shield V1.1. Every pin labeled — but good luck figuring out how the motors work from this alone.*

*All the wiring and labeling. What does it all mean? I've started plugging that back into Claude and Gemini to learn more.*

**Step 1: Hello World (5 minutes)**

Within a few minutes, Ash wrote a simple sketch that blinked the onboard LED and printed the chip information over serial. It compiled the code, flashed it to the ESP32, and read the response. It did all of this from the CLI, the command-line interface. We didn't use the Arduino IDE GUI at all. The ESP32 reported back: dual-core processor at 240MHz, 4MB flash, 334KB free memory. Ash flashed one of the blue LEDs to show me it was in and reading the hardware appropriately.

NOTE: I wish I'd waited to let my kiddo do more of this with me along the way. I got excited and stayed up to midnight working on it, but I should have waited. I'm going to make sure she's more in the driver's seat from here on out.

*First sign of life. The blue LED blinking means Ash is in and talking to the hardware.*

**Step 2: The Motor Mystery (45 minutes)**

This next bit was my favorite because we had to work together to figure it out. Even though Ash was in, they had no good way of knowing which pins correlated with which wheel, nor which command spun the wheel forward or backward. Ash figured out there were four motors but didn't know which pins controlled them. The assembly manual listed sensor pins but not motor pins, and ACEBOTT's website was mostly
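For flavor, here is roughly what the Step 1 hello-world amounts to, written as MicroPython rather than the Arduino sketch Ash actually flashed; treating GPIO 2 as the onboard blue LED is an assumption about this board, not something the post confirms.

```python
import time
from machine import Pin

led = Pin(2, Pin.OUT)           # GPIO 2 is a common (assumed) onboard-LED pin on ESP32 boards
while True:
    led.value(not led.value())  # toggle the blue LED
    time.sleep(0.5)
```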
Every programming language abstracts the one below it. Markdown is next.
Lately I have been pushing to see how far you can go with AI and coding by creating the Novel Engine, which lets you build books the way an IDE lets you compile code into an app. Here is something I have learned from the process.

Every programming language is a meta-language: an abstraction over the layer below. C abstracts assembly. Python abstracts C. LLMs are the next layer: natural language abstracts Python. The pattern didn't break. It continued.

I have a 400-line "program" called intake: a markdown file that stores a prompt you attach to a context along with a feature request file. Intake accepts documents in natural language, produces one to many encapsulated session prompts, and outputs another program prompt as markdown that runs the sessions, committing code per step. Each feature program has a top-level control prompt that loops until all sessions are executed and the feature is complete. It has state and control loops, and it handles failures like a session crash. It can resume when terminated and does not have to start from the beginning, because the context is stored on disk (a sketch of that loop follows below).

What this means in practice: you can give me a text request for a feature you would like, and I can turn it into a working feature by running only two prompts: intake, and the master program prompt it produces.

The intake markdown file has shipped production features on Novel Engine. Some of the features include document version control, a helper agent that helps the user navigate the app, and an onboarding guide with tooltips. The intake source file is on GitHub. Feature requests go in. Completed features come out. submitted by /u/HuntConsistent5525
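The resumable control loop is the part worth sketching. A hypothetical Python version of the state-on-disk pattern the post describes; every file name and helper here is invented for illustration, since the real "program" is a markdown prompt, not code.

```python
import json
from pathlib import Path

STATE = Path("state.json")   # hypothetical on-disk progress file

def run_session(prompt: str) -> None:
    # Hypothetical stand-in for "execute one encapsulated session prompt".
    print("running:", prompt[:40])

def run_feature(sessions_file: str = "sessions.json") -> None:
    sessions = json.loads(Path(sessions_file).read_text())
    done = set(json.loads(STATE.read_text())) if STATE.exists() else set()
    for session in sessions:
        if session["id"] in done:
            continue                                  # resume: skip finished steps
        run_session(session["prompt"])
        done.add(session["id"])
        STATE.write_text(json.dumps(sorted(done)))    # commit progress per step
```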
I built Scalpel — it scans your codebase across 12 dimensions, then assembles a custom AI surgical team. Open source, MIT.
I built the entire Scalpel v2.0 in a single Claude Code session using agent teams with worktree isolation. Claude Code spawned parallel subagents — one built the 850-line bash scanner, another built the test suite with 36 assertions across 3 fixture projects, others built the 6 agent adapters simultaneously. The anti-regression system, the verification protocol, the scoring algorithm — all designed and implemented by Claude Code agents working in parallel git worktrees. Claude Code wasn't just used to write code — it architected the system, reviewed its own work, caught quality regressions, and ran the full test suite before shipping. The whole v2 (scanner + agent brain + 6 adapters + GitHub Action + config schema + tests + docs) was built and pushed in one session.

Scalpel is also **built specifically for Claude Code** — it's a Claude Code agent that lives in `.claude/agents/` and activates when you say "Hi Scalpel." It also works with 6 other AI agents.

The Problem: AI agents are powerful but context-blind. They don't know your architecture, your tech debt, your git history, or your conventions. So they guess. Guessing at scale = bugs at scale.

What Scalpel does:

- Scans 12 dimensions — stack, architecture, git forensics, database, auth, infrastructure, tests, security, integrations, code quality, performance, documentation
- Produces a Codebase Vitals report with a health score out of 100
- Assembles a custom surgical team where each AI agent owns specific files and gets scored on quality
- Runs in parallel with worktree isolation — no merge conflicts

The standalone scanner runs in pure bash — zero AI, zero tokens, zero subscription:

    ./scanner.sh          # Health score in 30 seconds
    ./scanner.sh --json   # Pipe into CI

I scanned some popular repos for fun:

- Cal.com (35K stars): 62/100 — 467 TODOs, 9 security issues
- shadcn/ui (82K stars): 65/100 — 1,216 'use client' directives
- Excalidraw (93K stars): 77/100 — 95 TODOs, 2 security issues
- create-t3-app (26K stars): 70/100 — zero test files (CRITICAL)
- Hono (22K stars): 76/100 — 9 security issues

Works with Claude Code, Codex, Gemini, Cursor, Windsurf, Aider, and OpenCode. Auto-detects your agent on install. Also ships as a GitHub Action — block unhealthy PRs from merging:

    - uses: anupmaster/scalpel@v2
      with:
        fail-below: 60
        comment: true

GitHub: anupmaster/scalpel. Free to use. MIT licensed. No paid tiers. Clone and run. Feedback welcome. submitted by /u/anupkaranjkar08
Meta just acqui-hired its 4th AI startup in 4 months. Dreamer, Manus, Moltbook, and Scale AI's founder. Is anyone else watching this pattern?
Quick rundown of what Meta's done since December:

• Dec 2025: Acquired Manus (autonomous web agent) for $2B
• Early 2026: Acqui-hired Moltbook team
• Scale AI's Alexandr Wang stepped down as CEO to become Meta's first Chief AI Officer
• March 23: Dreamer team (agentic AI platform) joins Meta Superintelligence Labs

All of these teams are going into one division under Wang. Zuckerberg isn't just building models, he's assembling an entire talent army for agents.

The Dreamer one is interesting because they were only in beta for a month before Meta grabbed them. The product let regular people build their own AI agents. Thousands of users already.

Feels like Meta is betting everything on agents being the next platform shift, not just chatbots. What do you guys think - is this a smart consolidation play or is Zuck just panic-buying talent because open-source alone isn't enough?

Full breakdown here

submitted by /u/This_Suggestion_7891
How I stopped guessing and started structuring: A simple scaffold for consistent prompting.
Hi everyone, I've noticed that while most of us know the theory behind a good prompt, it's still easy to get lazy or forget key constraints when we're actually typing into the chatbox. This usually leads to the model "hallucinating" or ignoring instructions.

To solve this for my own workflow, I built Prompt Scaffold — a guided form that turns prompt engineering into a standardized process. It forces you to think through the five pillars of a great prompt before you hit send, ensuring you never miss a field again.

Key Features:

📝 Structured Fields: Dedicated inputs for Role, Task, Context, Format, and Negative Constraints.
⚡ Live Preview: See your assembled prompt update in real-time as you type.
🔢 Token Estimation: Includes a running token count (approx. 1 token ≈ 4 chars) so you can manage your context window usage.
📋 One-Click Copy: Quickly move your structured prompt into ChatGPT, Claude, or Gemini.
🗂️ Built-in Templates: Starter presets for coding, writing, and email drafting to get you moving faster.
🔒 100% Private: This is a client-side tool. Everything runs in your browser; no data is ever sent to a server.

I'd love to get some feedback from this community. Does having a structured UI help your prompting flow, or do you prefer free-typing?

Prompt Scaffold: The Ultimate AI Prompt Builder & Template

submitted by /u/blobxiaoyao
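The token counter is just the common chars-divided-by-4 rule of thumb the post cites; as a function (a rough estimate only, not a real tokenizer):

```python
def estimate_tokens(prompt: str) -> int:
    # approx. 1 token ≈ 4 characters, per the post's heuristic
    return max(1, len(prompt) // 4)

print(estimate_tokens("You are a code reviewer. Check this diff for race conditions."))
# ~15 tokens by this estimate
```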
Yes, AssemblyAI offers a free tier. Pricing found: $0.15/hr, $0.21/hr, $0.05/hr
Key features include: "Avoid garbage in, garbage out", "Go beyond transcription", "Easy to start, even easier to scale".
Based on user reviews and social mentions, the most common pain points are token cost and cost tracking.
Based on 82 social mentions analyzed, sentiment is 0% positive, 100% neutral, and 0% negative.