TL;DR · 10 seconds
A live RAG pipeline over a fictional consultant's 16-doc archive — Notion, Gmail, Drive, Slack — with ranked retrieval, citations, and Claude synthesis.
- Click a suggested chip or type your own question.
- Toggle “Synthesize answer” — Claude composes a coherent answer from the top hits with inline source citations.
- Filter by source surface (Notion / Gmail / Drive / Slack) to watch cross-source retrieval work.
Sprint 02 · Your Own Leverage · Claude Code Tier · Advanced
Personal Knowledge Agent.
A Bun + TypeScript RAG pipeline over a fictional management consultant's 8-year personal corpus — 16 sample documents spanning Notion engagement notes, Gmail client threads, Drive methodology docs, and Slack workspace exports. CLI: bun src/cli.ts ask "<question>". Private. Local. Yours. Real working pipeline. Public repo. Top-1 retrieval correct on all 8 canonical queries.
Tier: Code · Advanced
Build time: ~3-5 hours
Persona: Solo Consultant
Output: Live demo + repo
priya-rag (the CLI version, the build path you replicate with your own corpus — Notion / Gmail / Drive / Slack export, same pipeline) and priya-rag-web (the Vercel-deployed Next.js frontend you're clicking through on the live demo).
Section I
What the pipeline does.
Priya Banerjee is a fictional independent management consultant — 8 years solo, focused on operations and process improvement for mid-market manufacturers ($30M-$200M revenue). She has 6 client engagements documented across her personal corpus, with cross-references between engagement notes, methodology docs, and the email/Slack threads where decisions were made.
When a new RFP comes in (e.g., a regional 3PL wants help on lane-utilization optimization), Priya wants to answer questions like: "What did I conclude about lean inventory in the Carrington engagement?" The RAG agent retrieves the most relevant snippets from her corpus, with citations to the original source files.
The use case: not a chatbot. Not a workflow tool. A searchable second-brain for one knowledge worker that respects "Private. Local. Yours." — the corpus is Priya's, the embedding runs against the OpenAI API or fully local with @xenova/transformers, and the retrieval runs on her laptop in milliseconds.
Section II
Sample queries (real outputs).
5 of the 8 canonical queries shown · top-1 correct on all 8
Each row below is a real query run against the live pipeline. Top hit listed with cosine similarity score; takeaway describes why the retrieval landed where it did. Full 8-query test suite at samples/queries.md.
Q1
“What did I conclude about lean inventory in the Carrington engagement?”
Engagement notes surface as top hit. Methodology doc (drive/) ranks #2 — multi-source retrieval working as designed.
Q2
“Which client engagements involved a TMS migration?”
Top 2 are both Hartwell (notes + Slack channel from the engagement). Westmont correctly suppressed — Westmont was an HR system migration, not TMS.
Q3
“What did the project team flag as the top three operational risks for Westmont?”
Engagement notes contain the explicit 'top three operational risks identified' section. Workday TL go-live email (#2) references the one-pager Carl forwarded to the board.
Q4
“How did the Northbridge ERP selection ultimately go?”
Captures the recommendation (Infor LN), the override (NetSuite), and the off-the-record analysis. Tom Akerly's January 2025 follow-up email retrievable on a more specific query.
Q5
“What was the deadhead-reduction opportunity at Hartwell?”
Drive methodology doc ranks above the engagement notes — it has the lane-by-lane numbers (IND-CHI 64% vs 82% cohort, $340k annualized opportunity).
Section III
Sample corpus inputs.
16 fictional documents across 4 source surfaces. All live in the repo at data/ — open and read them for the full picture (engagement narratives are threaded across files, which is what makes multi-source retrieval work).
Notion engagement notes (6 files)
Full project debriefs from 6 client engagements: Carrington, Hartwell, Westmont, Northbridge, MidStates, Bayfield. Each ~2-5 pages with findings, stakeholders, what-didn't-go-well, and quantified outcomes.
Gmail threads (5 files)
Client correspondence threads — Tom Akerly's NetSuite-implementation follow-up, Lena Park's McLeod-selection final read, Pete DiMarco's RFP-results pushback, Dana Reyes's Workday TL go-live punch list, Sara Chen's Phoenix-route conversation.
Drive methodology docs (3 files)
Reusable methodology references — Hartwell deadhead-analysis methodology + Python script, MidStates spend-cube methodology, Carrington S&OP cadence redesign methodology.
Slack workspace exports (2 files)
Channel exports from active engagements — #tms-evaluation (Hartwell, Sep-Nov 2024), #procurement-rfp (MidStates, Oct-Dec 2023). Real-time decision-making context that the polished engagement notes don't capture.
Section IV
Build path · 5 steps.
How this build came together with Claude Code. ~3-5 hours of focused work. If you bring your own corpus (Notion export, Gmail takeout, Drive folder, Slack export), you replicate steps 1, 3, 4, 5 — same shape, your data.
01
Scaffold the Bun + TypeScript repo
claude-code → 'create a Bun TypeScript repo for a RAG pipeline over a personal knowledge corpus.' Sets up package.json, tsconfig.json, src/ directory, .env.example, and the directory structure for data/{notion,gmail,drive,slack}/.
02
Generate the sample corpus
Hand-write or claude-generate ~16 fictional documents threaded across the 4 source surfaces. Persona consistency matters — same engagements referenced across notion + drive + gmail keeps retrieval signal strong.
03
Build the embedding + ingest pipeline
src/embed.ts wraps OpenAI text-embedding-3-small (annotated with the @xenova/transformers swap for fully-local embedding). src/ingest.ts walks data/, embeds each file, writes embeddings/corpus.json. ~30 seconds + ~$0.02 against your OpenAI account for the 16-doc sample. A minimal code sketch of this step follows step 05.
04
Build the retrieval CLI
src/search.ts loads corpus.json, computes cosine similarity, returns top-k hits with snippets. src/cli.ts is the user-facing wrapper: 'bun src/cli.ts ask "<question>"' → embed query → search → print top 5 hits with source / path / score / snippet.
05
Validate retrieval quality + ship
Run the 8 canonical queries (samples/queries.md). Confirm top-1 is correct on each and that score-gap discipline holds (5+ points of typical separation between the top hit and #2). Push to public GitHub. Forks happen; modifications happen; the pattern is portable.
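For orientation, here is roughly what step 03's ingest path looks like. A minimal sketch only, assuming the official openai npm client and a flat { path, source, text, embedding } record shape; the repo's actual src/embed.ts and src/ingest.ts may chunk documents and carry extra metadata.

```ts
// Sketch of the ingest path (hypothetical shapes; the repo's real files may differ).
import OpenAI from "openai";
import { readdir } from "node:fs/promises";
import { join } from "node:path";

const openai = new OpenAI(); // picks up OPENAI_API_KEY from the environment

type CorpusEntry = {
  path: string;        // e.g. data/notion/<file>
  source: string;      // notion | gmail | drive | slack
  text: string;
  embedding: number[]; // 1536-dim vector from text-embedding-3-small
};

async function embed(text: string): Promise<number[]> {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: text,
  });
  return res.data[0].embedding;
}

const entries: CorpusEntry[] = [];
for (const source of ["notion", "gmail", "drive", "slack"]) {
  for (const file of await readdir(join("data", source))) {
    const path = join("data", source, file);
    const text = await Bun.file(path).text();
    entries.push({ path, source, text, embedding: await embed(text) });
  }
}
await Bun.write("embeddings/corpus.json", JSON.stringify(entries, null, 2));
```

Run it once with bun src/ingest.ts and re-run whenever the corpus changes.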
Section V
Architecture (one paragraph).
Bun + TypeScript runtime. OpenAI text-embedding-3-small for embeddings (1536-dim vectors, $0.02/1M tokens). In-memory cosine similarity for retrieval — sufficient for sub-100-doc personal corpora.
Pipeline: bun src/ingest.ts walks data/, embeds each file, writes embeddings/corpus.json. bun src/cli.ts ask "..." embeds the query and returns top-5 ranked hits with snippets + source paths + cosine scores.
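The retrieval core is small enough to sketch inline. Same assumptions as the ingest sketch above (flat corpus.json records); the repo's src/search.ts may differ in snippet extraction and tie-breaking.

```ts
// Sketch of the retrieval core (assumes the corpus.json shape from the ingest sketch).
type CorpusEntry = { path: string; source: string; text: string; embedding: number[] };

function cosine(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

export async function search(queryEmbedding: number[], k = 5) {
  const corpus: CorpusEntry[] = await Bun.file("embeddings/corpus.json").json();
  return corpus
    .map((entry) => ({
      path: entry.path,
      source: entry.source,
      score: cosine(queryEmbedding, entry.embedding),
      snippet: entry.text.slice(0, 300), // naive snippet; the real CLI likely does better
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

src/cli.ts is then a thin wrapper: embed the question with the same model, call search(), print the five hits.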
Production swaps: SQLite + sqlite-vss for 10k+ docs; @xenova/transformers running Xenova/all-MiniLM-L6-v2 for fully-local embeddings (no API key, ~50MB model download cached after first run). Both swap-points annotated in the source.
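The fully-local swap point looks roughly like this (a sketch assuming @xenova/transformers' feature-extraction pipeline). Note that all-MiniLM-L6-v2 produces 384-dim vectors rather than 1536, so the corpus and the query must be embedded with the same model.

```ts
// Sketch of the fully-local embedding swap (assumes @xenova/transformers' pipeline API).
import { pipeline } from "@xenova/transformers";

// Downloads and caches the model on first run; no API key needed.
const extractor = await pipeline("feature-extraction", "Xenova/all-MiniLM-L6-v2");

export async function embedLocal(text: string): Promise<number[]> {
  // Mean-pool token embeddings and normalize: the standard sentence-embedding recipe.
  const output = await extractor(text, { pooling: "mean", normalize: true });
  return Array.from(output.data as Float32Array);
}
```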
End of Specimen No. 05
Now build yours.
Priya is sample data. The pipeline is portable. Notion export, Gmail takeout, Drive folder, Slack export — your corpus, your laptop, your retrieval. The Sprint 02 / Code / Advanced build is this exact shape with your inputs.
One real corpus. One working CLI. One cohort.