Open-source search infrastructure for AI
Based on the limited social mentions provided, there isn't enough substantive user feedback to provide a comprehensive summary of what users think about Chroma. The social mentions consist mainly of brief YouTube titles mentioning "Chroma AI" without detailed reviews or user opinions. One Hacker News mention discusses video embedding technology that may be related to Chroma's capabilities, and there's a reference to a gaming device called "Chromatic" which appears to be unrelated to the Chroma software tool. To provide an accurate assessment of user sentiment, more detailed reviews and user feedback would be needed.
Mentions (30d)
1
Reviews
0
Platforms
4
GitHub Stars
27,321
2,180 forks
Features
Industry
information technology & services
Employees
110
Funding Stage
Seed
Total Funding
$18.0M
GitHub followers
790
GitHub repos
27
GitHub stars
27,321
npm packages
20
HuggingFace models
4
npm downloads/wk
183,065
PyPI downloads/mo
13,670,430
Show HN: Gemini can now natively embed video, so I built sub-second video search
Gemini Embedding 2 can project raw video directly into a 768-dimensional vector space alongside text. No transcription, no frame captioning, no intermediate text. A query like "green car cutting me off" is directly comparable to a 30-second video clip at the vector level.

I used this to build a CLI that indexes hours of footage into ChromaDB, then searches it with natural language and auto-trims the matching clip. Demo video on the GitHub README. Indexing costs ~$2.50/hr of footage. Still-frame detection skips idle chunks, so security camera / sentry mode footage is much cheaper.
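The search step this post describes — comparing a text query's vector against stored clip vectors — reduces to nearest-neighbor search by cosine similarity. A minimal stand-in sketch, with made-up 4-dimensional vectors in place of real 768-dimensional Gemini embeddings and a plain dict in place of ChromaDB (the clip filenames are invented for illustration):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Stand-in index: clip id -> embedding (the real system stores
# 768-dim Gemini vectors in a ChromaDB collection instead)
index = {
    "clip_001.mp4": [0.9, 0.1, 0.0, 0.2],
    "clip_002.mp4": [0.1, 0.8, 0.3, 0.0],
    "clip_003.mp4": [0.0, 0.2, 0.9, 0.4],
}

def search(query_vec, k=1):
    """Return the k clip ids whose embeddings are closest to the query vector."""
    ranked = sorted(index, key=lambda cid: cosine(query_vec, index[cid]), reverse=True)
    return ranked[:k]

# A text query embedded into the same space lands nearest its matching clip
assert search([0.85, 0.15, 0.05, 0.1]) == ["clip_001.mp4"]
```

A real vector database replaces the `sorted` scan with an approximate-nearest-neighbor index, but the comparison it performs is the same.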
Pricing found: $5, $0, $2.50, $0.33, $0.0075
Your AI agents remember yesterday.
# AIPass

**Your AI agents remember yesterday.**

A local multi-agent framework where your AI assistants keep their memory between sessions, work together on the same codebase, and never ask you to re-explain context.

---

## Contents

- [The Problem](#the-problem)
- [What AIPass Does](#what-aipass-does)
- [Quick Start](#quick-start)
- [How It Works](#how-it-works)
- [The 11 Agents](#the-11-agents)
- [CLI Support](#cli-support)
- [Project Status](#project-status)
- [Requirements](#requirements)
- [Subscriptions & Compliance](#subscriptions--compliance)

---

## The Problem

Your AI has memory now. It remembers your name, your preferences, your last conversation. That used to be the hard part. It isn't anymore. The hard part is everything that comes after.

You're still one person talking to one agent in one conversation doing one thing at a time. When the task gets complex, *you* become the coordinator — copying context between tools, dispatching work manually, keeping track of who's doing what. You are the glue holding your AI workflow together, and you shouldn't have to be.

Multi-agent frameworks tried to solve this. They run agents in parallel, spin up specialists, orchestrate pipelines. But they isolate every agent in its own sandbox. Separate filesystems. Separate worktrees. Separate context. One agent can't see what another just built. Nobody picks up where a teammate left off. Nobody works on the same project at the same time. The agents don't know each other exist.

That's not a team. That's a room full of people wearing headphones.

What's missing isn't more agents — it's *presence*. Agents that have identity, memory, and expertise. Agents that share a workspace, communicate through their own channels, and collaborate on the same files without stepping on each other. Not isolated workers running in parallel. A persistent society with operational rules — where the system gets smarter over time because every agent remembers, every interaction builds on the last, and nobody starts from zero.

## What AIPass Does

AIPass is a local CLI framework that gives your AI agents **identity, memory, and teamwork**. Verified with Claude Code, Codex, and Gemini CLI. Designed for terminal-native coding agents that support instruction files, hooks, and subprocess invocation.

**Start with one agent that remembers:** Your AI reads `.trinity/` on startup and writes back what it learned before the session ends. That's the whole memory model — JSON files your AI can read and write. Next session, it picks up where it left off. No database, no API, no setup beyond one command.

```bash
mkdir my-project && cd my-project
aipass init
```

Your project gets its own registry, its own identity, and persistent memory. Each project is isolated — its own agents, its own rules. No cross-contamination between projects.

**Add agents when you need them:**

```bash
aipass init agent my-agent  # Full agent: apps, mail, memory, identity
```

| What you need | Command | What you get |
|---------------|---------|--------------|
| A new project | `aipass init` | Registry, project identity, prompts, hooks, docs |
| A full agent | `aipass init agent` | Apps scaffold, mailbox, memory, identity — registered in project |
| A lightweight agent | `drone @spawn create --template birthright` | Identity + memory only (no apps scaffold) |

**What makes this different:**

- **Agents are persistent.** They have memories and expertise that develop over time. They're not disposable workers — they're specialists who remember.
- **Everything is local.** Your data stays on your machine. Memory is JSON files. Communication is local mailbox files. No cloud dependencies, no external APIs for core operations.
- **One pattern for everything.** Every agent follows the same structure. One command (`drone @branch command`) reaches any agent. Learn it once, use it everywhere.
- **Projects are isolated by design.** Each project gets its own registry. Agents communicate within their project, not across projects.
- **The system protects itself.** Agent locks prevent double-dispatch. PR locks prevent merge conflicts. Branches don't touch each other's files. Quality standards are embedded in every workflow. Errors trigger self-healing.

**Say "hi" tomorrow and pick up exactly where you left off.** One agent or fifteen — the memory persists.

---

## Quick Start

### Start your own project

```bash
pip install aipass
mkdir my-project && cd my-project
aipass init                  # Creates project: registry, prompts, hooks, docs
aipass init agent my-agent   # Creates your first agent inside the project
cd my-agent
claude                       # Or: codex, gemini — your agent reads its memory and is ready
```

That's it. Your agent has identity, memory, a mailbox, and knows what AIPass is. Say "hi" — it picks up where it left off. Come back tomorrow, it remembers.

### Explore the full framework

Clone the repo to see all 11 agents working together — the reference implementatio
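The "JSON files your AI can read and write" memory model described in the README can be sketched in a few lines. This is an illustrative sketch only — the filename `.trinity/memory.json` and the key names are assumptions, not AIPass's actual schema:

```python
import json
import tempfile
from pathlib import Path

def load_memory(project_dir):
    """Read persistent agent memory on startup; empty dict on the first session."""
    path = Path(project_dir) / ".trinity" / "memory.json"  # assumed filename
    return json.loads(path.read_text()) if path.exists() else {}

def save_memory(project_dir, memory):
    """Write learned context back before the session ends."""
    path = Path(project_dir) / ".trinity" / "memory.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(memory, indent=2))

project = tempfile.mkdtemp()  # stand-in for a real project directory

# Session 1: the agent records what it learned
save_memory(project, {"last_task": "refactor auth module"})

# Session 2: a fresh process picks up exactly where it left off
assert load_memory(project)["last_task"] == "refactor auth module"
```

The appeal of this design is that any agent that can read and write files participates in the memory model with no database or API.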
A Claude memory retrieval system that actually works (easily) and doesn't burn all my tokens
TL;DR: By talking to Claude and explaining my problem, I built a very powerful local "memory management" system for Claude Desktop that indexes project documents and lets Claude automatically retrieve relevant passages buried inside those documents during Co-Work sessions. For me it solves the "document memory" problem where tools like NotebookLM, Notion, Obsidian, and Google Drive can't be queried programmatically. Claude did all of it; I didn't really have to do anything. The description below includes plenty of things that I don't completely understand myself. The key thing is just to explain to Claude what the problem is (which I describe below) and what your intention is, and Claude will help you figure it out. It was very easy to set this up, and I think it's better than anything I've seen a YouTuber recommend.

The details: I have a really nice solution to the Claude external memory/external brain problem that lots of people are trying to address. Although my system is designed for one guy using his laptop, not a large company with terabytes of data, the general approach could be scaled up just by substituting different tools. I wanted to create a Claude external memory system connected to Claude Co-Work in the desktop app. What I really wanted was for Claude to proactively draw from my entire base of knowledge for each project, not just from the documents I dropped into my project folder in Claude Desktop. Basically, I want Claude to have awareness of everything I have stored on my computer, in the most efficient way possible (Claude can use lots of tokens if you don't manage the "memory" efficiently). I've played with Notion and Google Drive as an external brain. I've tried NotebookLM.
And I was just beginning to research Obsidian when I read this article, which I liked very much and highly recommend: https://limitededitionjonathan.substack.com/p/stop-calling-it-memory-the-problem

That got my attention, so I asked Claude to read the document and give me his feedback based on his understanding of the projects I was trying to work on. Claude recommended using SQLite to connect to structured facts, an optional graph to show some relationships, and .md files for instructions to Claude. But I pointed out that almost all of the context information I would want to be retrievable from memory is text in documents, not structured data. Claude's response was very helpful. He understood that although SQLite is good at single-point facts, document memory is a different challenge. For documents, the challenge isn't storing them — it's retrieving the right passage when it's relevant without reading everything (which consumes tokens). SQLite can store text, but storing a document in a database row doesn't solve the retrieval problem. You still need to know which row to pull.

I asked if NotebookLM from Google might be a better tool for indexing those documents and making them searchable. Claude explained that what I was describing is a Retrieval-Augmented Generation (RAG) problem. The standard approach:

1. Documents get chunked into passages (e.g., 500 words each)
2. Each chunk gets converted to an embedding — a vector that captures its meaning
3. When Claude needs context, it converts the query to the same vector format and finds the semantically closest chunks
4. Those chunks get injected into the conversation as context

This is what NotebookLM is doing under the hood. It's essentially a hosted, polished RAG system. NotebookLM is genuinely good at what it does — but it has a fundamental problem for my case: it's a UI, not infrastructure. You use it; Claude can't. There's no API, no MCP tool, no way to have Claude programmatically query it during a Co-Work session.
It's a parallel system, not an integrated one. So NotebookLM answers "how do I search my documents as a human?" — not "how does Claude retrieve the right document context automatically?" After a little back and forth, here's what we decided to do. For me, a solo operator with only a laptop's worth of documents that need to be searched, Claude proposed a RAG pipeline that looks like this:

My documents (DOCX, PDF, XLSX, CSV)
  ↓ Text extraction (python-docx, pymupdf, openpyxl)
  ↓ Chunking (split into ~500-word passages, keep metadata: file, folder, date)
  ↓ Embedding (convert each chunk to a vector representing its meaning)
  ↓ A local vector database + vector extension (store chunks + vectors locally, single file)
  ↓ MCP server (exposes a search_knowledge tool to Claude)
  ↓ Claude Desktop (queries the index when working on my business topics)

With that setup, when you're talking to Claude and mention an idea like "did I pay the overdue invoice" or "which projects did Joe Schmoe help with," Claude searches the index, gets the 3-5 most relevant passages back, and uses them in its answer without you doing anything. We decided to develop a search system like that, specific to each of my discrete projects. Th
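The chunking stage of a pipeline like the one this post describes is the simplest to make concrete. A minimal sketch of splitting a document into ~500-word passages while keeping source metadata (the field names are assumptions for illustration, not any particular tool's schema):

```python
def chunk_document(text, source_file, chunk_words=500):
    """Split a document into ~chunk_words-word passages, tagging each with metadata
    so retrieved chunks can be traced back to their source file."""
    words = text.split()
    chunks = []
    for i in range(0, len(words), chunk_words):
        chunks.append({
            "text": " ".join(words[i:i + chunk_words]),
            "file": source_file,        # metadata kept alongside the embedding
            "chunk_index": i // chunk_words,
        })
    return chunks

# A 1200-word stand-in document splits into 500 + 500 + 200 words
doc = "word " * 1200
chunks = chunk_document(doc, "invoices/2024-03.docx")
assert len(chunks) == 3
assert len(chunks[0]["text"].split()) == 500
```

Each chunk's text would then be embedded and stored, vector plus metadata, in the local vector database, so that a match can be reported as "this passage, from this file."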
Claude Engram - persistent memory for Claude Code that auto-tracks mistakes and context
Some of you might remember my post a few months ago about Mini Claude. I had Claude build its own memory system after researching its own user complaints. That project worked, but the hook system was a pain. I shelved it. Then Claude Code got "open-sourced", and I could actually see how hooks like PostToolUseFailure, PreCompact, and all the lifecycle events work internally. Rewrote the whole thing with proper hook integration. Renamed it Claude Engram.

What changed from the original: The old version required Claude to manually call everything. The new version automatically hooks into Claude Code's tool lifecycle. Claude doesn't have to invoke anything for the core features to work.

How it works:

- Hooks intercept every edit, bash command, error, and session event. Zero manual effort.
- Before you edit a file, it surfaces past mistakes and relevant context, scored by file match, tags, and recency.
- Survives context compaction. Auto-checkpoints before, re-injects rules and mistakes after.
- Tiered storage. Hot memories stay fast, old ones archive to cold storage. Searchable, restorable.
- Multi-project workspaces. Memories scoped per project, workspace-level rules cascade down.
- Hybrid search using AllMiniLM. Keyword + vector + reranking. No ChromaDB dependency.

Update — v0.4.0: Session Mining

Since the original post, engram now mines your Claude Code session logs automatically. This is the big addition. Claude Code stores your full conversation as JSONL files. After every session, engram parses them in the background and extracts what hooks can't capture:

- Decisions, mistakes, and approach changes extracted from conversation flow (not regex — structural analysis + AllMiniLM semantic scoring, naturally typo-tolerant)
- Searchable index across all past conversations — "what did we discuss about auth?" returns results in 112ms — every user message and assistant response from every past session gets embedded and indexed (7,310 messages across 11 sessions in testing)
- Detects recurring struggles, error patterns across sessions, and which files are always edited together
- Predictive context — before you edit a file, it surfaces related files and likely errors from your history
- Cross-project learning — finds patterns that hold across all your projects
- Retroactive bootstrap — install on an existing project and it mines all your past sessions automatically

| Benchmark | Result |
|-----------|--------|
| Decision Capture (220 prompts) | 97.8% precision |
| Injection Relevance (50 memories) | 14/15, 100% isolation |
| Compaction Survival | 6/6 |
| Error Auto-Capture (53 payloads) | 100% recall, 97% precision |
| Multi-Project Scoping | 11/11 |
| Session Mining Foundation | 27/27 |
| Obsidian Vault Compatibility | 25/25 |
| Cross-session search | 112ms over 7,310 indexed messages |

Not just Claude Code: The MCP server works with any MCP client — Cursor, Windsurf, Zed, Continue.dev. Claude Code gets the full auto-capture hooks + session mining on top. Also works with Obsidian vaults (PARA + CLAUDE.md structure). Tested and verified. No cloud, no API costs, runs locally. MIT licensed. https://github.com/20alexl/claude-engram

submitted by /u/Crunchy-Nut1
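Hybrid retrieval of the kind this post describes (keyword + vector, then rerank) is commonly a weighted blend of two scores. A minimal sketch with an assumed blend weight and toy similarity numbers standing in for real AllMiniLM embedding scores — not Claude Engram's actual scoring code:

```python
def keyword_score(query, text):
    """Fraction of query terms that appear in the memory text."""
    q = set(query.lower().split())
    return len(q & set(text.lower().split())) / len(q)

def hybrid_rank(query, memories, vector_scores, alpha=0.5):
    """Blend keyword overlap with a precomputed vector similarity per memory.
    alpha is an assumed weight; real systems tune it or learn a reranker."""
    scored = [
        (alpha * keyword_score(query, m) + (1 - alpha) * vector_scores[m], m)
        for m in memories
    ]
    return [m for _, m in sorted(scored, reverse=True)]

memories = ["auth token refresh bug", "css layout fix"]
vector_scores = {"auth token refresh bug": 0.9, "css layout fix": 0.2}  # toy values

# Keyword overlap and semantic similarity agree, so the auth memory ranks first
assert hybrid_rank("auth bug", memories, vector_scores)[0] == "auth token refresh bug"
```

The point of the blend is robustness: keyword matching catches exact identifiers embeddings can blur, while the vector score catches paraphrases keywords miss.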
[D] MemPalace claims 100% on LoCoMo and a "perfect score on LongMemEval." Its own BENCHMARKS.md documents why neither is meaningful.
A new open-source memory project called MemPalace launched yesterday claiming "100% on LoCoMo" and "the first perfect score ever recorded on LongMemEval. 500/500 questions, every category at 100%." The launch tweet went viral, reaching over 1.5 million views, while the repository picked up over 7,000 GitHub stars in less than 24 hours. The interesting thing is not that the headline numbers are inflated. The interesting thing is that the project's own BENCHMARKS.md file documents this in detail, while the launch tweet strips these caveats. Some of the failure modes line up with the methodology disputes the field has been arguing about for over a year (Zep vs Mem0, Letta's "Filesystem All You Need" reproducibility post, etc.).

1. The LoCoMo 100% is a top_k bypass. The runner uses top_k=50. LoCoMo's ten conversations have 19, 19, 32, 29, 29, 28, 31, 30, 25, and 30 sessions respectively. Every conversation has fewer than 50 sessions, so top_k=50 retrieves the entire conversation as the candidate pool every time. The Sonnet rerank then does reading comprehension over all sessions. BENCHMARKS.md says this verbatim:

> The LoCoMo 100% result with top-k=50 has a structural issue: each of the 10 conversations has 19–32 sessions, but top-k=50 exceeds that count. This means the ground-truth session is always in the candidate pool regardless of the embedding model's ranking. The Sonnet rerank is essentially doing reading comprehension over all sessions - the embedding retrieval step is bypassed entirely.

The honest LoCoMo numbers in the same file are 60.3% R@10 with no rerank and 88.9% R@10 with hybrid scoring and no LLM. Those are real and unremarkable. A 100% is also independently impossible on the published version of LoCoMo, since roughly 6.4% of the answer key contains hallucinated facts, wrong dates, and speaker attribution errors that any honest system will disagree with.

2. The LongMemEval "perfect score" is a metric category error.
Published LongMemEval is end-to-end QA: retrieve from a haystack of prior chat sessions, generate an answer, GPT-4 judge marks it correct. Every score on the published leaderboard is the percentage of generated answers judged correct. The MemPalace LongMemEval runner does retrieval only. For each of the 500 questions it builds one document per session by concatenating only the user turns (assistant turns are not indexed at all), embeds with default ChromaDB embeddings (all-MiniLM-L6-v2), returns the top five sessions by cosine distance, and checks set membership against the gold session IDs. It computes both recall_any@5 and recall_all@5, and the project reports the softer one. It never generates an answer. It never invokes a judge. None of the LongMemEval numbers in this repository - not the 100%, not the 98.4% "held-out", not the 96.6% raw baseline - are LongMemEval scores in the sense the published leaderboard means. They are recall_any@5 retrieval numbers on the same dataset, which is a substantially easier task. Calling any of them a "perfect score on LongMemEval" is a metric category error.

3. The 100% itself is teaching to the test. The hybrid v4 mode that produces the 100% was built by inspecting the three remaining wrong answers in their dev set and writing targeted code for each one: a quoted-phrase boost for a question containing a specific phrase in single quotes, a person-name boost for a question about someone named Rachel, and "I still remember" / "when I was in high school" patterns for a question about a high school reunion. Three patches for three specific questions. BENCHMARKS.md, line 461, verbatim:

> This is teaching to the test. The fixes were designed around the exact failure cases, not discovered by analyzing general failure patterns.

4. Marketed features that don't exist in the code. The launch post lists "contradiction detection catches wrong names, wrong pronouns, wrong ages before you ever see them" as a feature.
mempalace/knowledge_graph.py contains zero occurrences of "contradict". The only deduplication logic is an exact-match check on (subject, predicate, object) triples that blocks identical triples from being added twice. Conflicting facts about the same subject can accumulate indefinitely.

5. "30x lossless compression" is measurably lossy in the project's own benchmarks. The compression module mempalace/dialect.py truncates sentences at 55 characters, filters by keyword frequency, and provides a decode() function that splits the compressed string into a header dictionary without reconstructing the original text. There is no round-trip. The same BENCHMARKS.md reports results_raw_full500.jsonl at 96.6% R@5 and results_aaak_full500.jsonl at 84.2% R@5 — a 12.4 percentage point drop on the same dataset and the same metric, run by the project itself. Lossless compression cannot cause a measured quality drop.

Why this matters for the benchmark conversation. The field needs benchmarks where judge reliability is adversarially validated, an
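The distinction the post draws between recall_any@5 and recall_all@5 is easy to state in code. A sketch of the two metrics themselves (not MemPalace's runner), with invented session IDs:

```python
def recall_any_at_k(retrieved, gold, k=5):
    """1.0 if ANY gold session appears in the top-k retrieved, else 0.0."""
    return 1.0 if set(retrieved[:k]) & set(gold) else 0.0

def recall_all_at_k(retrieved, gold, k=5):
    """1.0 only if EVERY gold session appears in the top-k retrieved."""
    return 1.0 if set(gold) <= set(retrieved[:k]) else 0.0

retrieved = ["s3", "s7", "s1", "s9", "s4"]  # top-5 sessions by cosine distance
gold = ["s1", "s8"]                         # a question needing two gold sessions

assert recall_any_at_k(retrieved, gold) == 1.0  # s1 present: "any" passes
assert recall_all_at_k(retrieved, gold) == 0.0  # s8 missing: "all" fails
```

On multi-session questions the two metrics diverge exactly as shown, which is why reporting the softer recall_any@5 inflates the headline number.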
I Created My First AI-assisted Pull Request and I Feel Like a Fraud
I used Claude Code recently to make an open source contribution to Chroma, which is Hugo's default syntax highlighter. My code was approved and merged, but it still felt like flinging slop over the wall to a maintainer. The thing is that I made a real and valuable contribution. But I don't know how to feel about it since I was mostly clueless about the code. submitted by /u/nelson-f
I made Claude aware of my entire git history
I got tired of Claude forgetting everything that happened in my repo, so I built a memory layer for it. Every time I start a new Claude Code session I waste 10 minutes explaining context. "We removed that function last week because of X." "This workaround exists because of a race condition we hit in production." Claude has no idea. It just sees the current state of the code. So I built claudememory. It indexes your entire git history into a local vector database and exposes it to Claude as MCP tools. Now when Claude touches a file it can actually look up what changed there, why, and what bugs were already fixed in that area.

The tools it gives Claude:

- search_git_history("why was X removed") - semantic search over all your commits
- commits_touching_file("auth.py") - full history of a file before editing it
- bug_fix_history("payments") - all past fixes near the code you're about to change
- latest_commits(10) - what changed since last session
- architecture_decisions("state machine") - why things are structured the way they are

The thing that actually changed my workflow is that Claude now checks for prior bug fixes before adding new code near a known problem area. It stopped re-introducing things we already fixed. Works with OpenAI embeddings or Ollama locally. If you have neither it still runs, just uses ChromaDB without the semantic layer.

pip install claudememory

GitHub: github.com/gunesbizim/claudememory

Happy to answer questions. submitted by /u/taxesarehigh
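The non-semantic fallback mode a tool like this describes can be approximated as keyword search over parsed commit history. A hypothetical sketch over an in-memory toy history — the function names mirror the post's tool names for illustration, and are not claudememory's actual API (a real implementation would parse `git log --name-only` output):

```python
def commits_touching_file(commits, path):
    """Full history of one file: commits whose changed files include the path."""
    return [c for c in commits if path in c["files"]]

def search_git_history(commits, query):
    """Keyword fallback: rank commits by query-term overlap with the message,
    dropping commits with no overlap at all."""
    terms = set(query.lower().split())
    hits = [(len(terms & set(c["message"].lower().split())), c) for c in commits]
    return [c for score, c in sorted(hits, key=lambda h: -h[0]) if score > 0]

# Toy commit history standing in for parsed git log output
commits = [
    {"message": "remove legacy auth function", "files": ["auth.py"]},
    {"message": "fix payments rounding bug", "files": ["payments.py"]},
]

assert len(commits_touching_file(commits, "auth.py")) == 1
assert search_git_history(commits, "why was auth removed")[0]["files"] == ["auth.py"]
```

An MCP server would expose functions like these as tools, and the semantic layer would replace the term-overlap score with embedding similarity so "why was auth removed" also matches "delete login helper."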
I turned Claude Code into a full RAG learning academy — 20 agents, 17 slash commands, 9-module curriculum
Repo: https://github.com/TakaGoto/rag-learning-academy

I've been trying to find good tutorials or courses on RAG. Unfortunately none of them met my expectations. Most of them have the same issues:

- The UI felt unintuitive and confusing
- The content was outdated
- They lean on services like AWS Bedrock which hide a lot of the details I actually wanted to learn about RAG
- I had tons of questions as I went through the material and couldn't get answers without waiting or digging through forums
- They assume you're a true beginner, which just wastes time when you have 13 years of software experience

So I decided to build my own interactive learning experience — inside Claude Code.

What it actually does well:

- Assesses your knowledge first. Run /start and it determines whether you should be on the beginner, intermediate, or advanced track. No sitting through "what is a vector?" when you already know.
- You can ask questions along the way. Digress if you want to. Claude keeps you on track and logs your progress.
- There are 20 specialist agents — if you go deep on something like chunking strategy or reranking, the right expert jumps in.
- Uses open-source tools by default, no API keys required. Local embeddings (all-MiniLM-L6-v2), ChromaDB, and Claude Code as the LLM — you're already running it. Want to use OpenAI embeddings or Pinecone instead? Swap them in.
- Multi-language support is in progress (Python only right now).
- The content stays current. Monthly CI checks the codebase for deprecated patterns, stale model references, and outdated libraries. You can also run /audit-content for an on-demand freshness check.

Quick start:

```
git clone https://github.com/TakaGoto/rag-learning-academy.git
cd rag-learning-academy
claude
/start
```

It's open source and it's free. Tell me what you think. submitted by /u/bokuwataka
Cursor and Claude beefing
Sorry for it being a picture, but this is hilarious. I've been feeding both of their responses into each other and they are lowkey throwing shade. submitted by /u/ovrlrdx
Cursor and Claude are beefing lmao
Sorry for it being a picture, but this is hilarious. I've been feeding both of their responses into each other and they're lowkey throwing shade. submitted by /u/Brilliant-Gas9662
Palmer Luckey’s retro gaming startup ModRetro reportedly seeks funding at $1B valuation
The company launched its first product, a Game Boy-style handheld device called the Chromatic, in 2024.
Repository Audit Available
Deep analysis of chroma-core/chroma — architecture, costs, security, dependencies & more
Yes, Chroma offers a free tier. Pricing found: $5, $0, $2.50, $0.33, $0.0075
Chroma has a public GitHub repository with 27,321 stars.
Based on 16 social mentions analyzed, 0% of sentiment is positive, 100% neutral, and 0% negative.
Shawn Wang
Founder at smol.ai
1 mention