aRealGem/Memorialbot

MemorialBot

A memorial chatbot framework that emulates a loved one's texting voice from curated iMessage history. It runs retrieval-augmented generation (RAG) with ChromaDB over curated episodes and conversation chunks, and an LLM generates responses in the persona's style.

The repo ships with example persona ids (babybearbot, davidbot) and template soul / persona files. Replace them with your own content (see PERSONA_SETUP.md). If you are the original operator of this fork, restore prompts and YAML from LOCAL_PERSONA_NOTES.md (gitignored; create from your private backup if missing).

Architecture

iMessage corpus  -->  pipeline/  -->  workbench/<persona>/  -->  bot/
  (raw .txt)          (parse,          (analytics DB,              (chunker, indexer,
                       candidates,      candidates,                 retriever, prompt
                       stats)           style_notes,                builder, Telegram)
                                        biography,
                                        eval/)

Runtime content lives in bot/content/<persona>/:

  • soul.md — persona identity and boundaries
  • memory.md — curated facts (tagged by source), when used
  • style_examples.md — style examples for the LLM, when used

The PERSONA environment variable (default in this tree: davidbot) selects which persona's content, data, and episodes are used at every layer. Set PERSONA=babybearbot for the other example persona.
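As a rough illustration of the selection described above, the sketch below resolves the persona id from PERSONA and reads the runtime content files. The function names are hypothetical and not taken from the repo, and it assumes soul.md is always present while memory.md and style_examples.md are optional.

```python
import os
from pathlib import Path

# Assumption: content lives under bot/content/<persona>/ as described above.
CONTENT_ROOT = Path("bot/content")
DEFAULT_PERSONA = "davidbot"  # default persona id named in this README


def resolve_persona(env=None):
    """Return the persona id from PERSONA, falling back to the default."""
    env = os.environ if env is None else env
    return env.get("PERSONA", "").strip() or DEFAULT_PERSONA


def load_persona_content(persona, root=CONTENT_ROOT):
    """Read soul.md (assumed required) plus the optional content files."""
    base = Path(root) / persona
    soul = base / "soul.md"
    if not soul.is_file():
        raise FileNotFoundError(f"missing required file: {soul}")
    content = {"soul": soul.read_text(encoding="utf-8")}
    # memory.md and style_examples.md are only present "when used".
    for name in ("memory", "style_examples"):
        path = base / f"{name}.md"
        content[name] = path.read_text(encoding="utf-8") if path.is_file() else None
    return content
```

Calling `load_persona_content(resolve_persona())` from the repo root would then return a dict with `None` for any optional file that is absent.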

Quick start

# 1. Parse corpus (optional — requires your own export under iMessageCorpusBeforeCuration/)
cd pipeline && make parse

# 2. Generate analytics and candidates
make analyst-all

# 3. Build bot indexes
cd ../bot
python3 chunker.py
python3 indexer.py --reindex --collection both

# 4. Configure and run
cp .env.example .env        # fill in TELEGRAM_TOKEN, ANTHROPIC_API_KEY, CONV_FILTER as needed
python3 bot.py

# 5. Evaluate (example persona id)
python3 ../pipeline/src/evaluate.py --persona babybearbot
python3 eval_runner.py --limit 5 --delay 2
python3 eval_scorer.py
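
Before step 4, it can help to confirm that .env is filled in. The sketch below is a hypothetical helper, not part of the repo. It assumes simple KEY=VALUE lines, and it assumes TELEGRAM_TOKEN and ANTHROPIC_API_KEY are mandatory while CONV_FILTER is optional.

```python
from pathlib import Path

# Assumption: the two credentials are mandatory and CONV_FILTER is optional.
REQUIRED_KEYS = ("TELEGRAM_TOKEN", "ANTHROPIC_API_KEY")


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding quotes; trailing inline comments are not handled.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]
```

Running `missing_keys(read_env_file("bot/.env"))` from the repo root returns an empty list once both credentials are set.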

Structure

  • bot/ — Runtime: Telegram bot, retriever, prompt builder, content helpers
  • bot/content/<persona>/ — Content consumed by the bot at runtime
  • pipeline/ — Corpus parsing, candidate generation, analytics, evaluation
  • workbench/<persona>/ — Analytics DB, candidates, stats, eval (generated CSV/JSON often gitignored)
  • staging/<persona>/ — Curated memory episodes
  • tools/ — Utilities (gap finder, export scripts)
  • archive/openclaw/ — Legacy deployment scripts (preserved, not active)

Evaluation

The eval harness generates test cases from the corpus, runs the bot headless, and scores outputs (style, grounding, fabrication, AI leak). See pipeline/ and bot/eval_*.py for details.
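
To make the "AI leak" dimension concrete, here is a minimal sketch of one way such a check could work: flag replies in which the bot describes itself as an AI. The phrase list and function names are hypothetical. The actual rules in bot/eval_scorer.py are not described in this README and may differ.

```python
import re

# Hypothetical phrase list; the real scorer's rules are not documented here.
AI_LEAK_PATTERNS = [
    r"\bas an ai\b",
    r"\blanguage model\b",
    r"\bi am an? (ai|assistant|chatbot)\b",
    r"\bi'm an? (ai|assistant|chatbot)\b",
]


def has_ai_leak(reply):
    """Return True if the reply contains any self-identifying AI phrase."""
    text = reply.lower()
    return any(re.search(pattern, text) for pattern in AI_LEAK_PATTERNS)


def ai_leak_rate(replies):
    """Fraction of replies flagged as leaks; 0.0 for an empty batch."""
    replies = list(replies)
    if not replies:
        return 0.0
    return sum(has_ai_leak(reply) for reply in replies) / len(replies)
```

A phrase list like this only catches explicit self-identification; style, grounding, and fabrication need comparison against the corpus and are out of scope for this sketch.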

Adding a new persona

See PERSONA_SETUP.md.

Public / fork hygiene

Some files under workbench/*/eval/experiments/ may still contain verbatim chat excerpts copied from a private corpus (used as few-shot or length-rule examples). Before publishing a repo broadly, review those YAMLs or exclude them from the remote. Markdown score reports under workbench/*/eval/results/ are gitignored by default because they can embed message text.
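
One way to start that review is to list the candidate files first. The helper below is a hypothetical sketch, not a tool shipped in the repo. It assumes experiment files use a .yaml or .yml extension and only enumerates them; deciding what counts as a private excerpt remains a manual step.

```python
from pathlib import Path


def files_to_review(repo_root="."):
    """List YAML files under workbench/*/eval/experiments/, as repo-relative paths."""
    root = Path(repo_root)
    hits = []
    for pattern in ("*.yaml", "*.yml"):
        # "**" also matches zero directories, so files directly in experiments/ are included.
        hits.extend(root.glob(f"workbench/*/eval/experiments/**/{pattern}"))
    return sorted(p.relative_to(root).as_posix() for p in hits if p.is_file())
```

Printing each entry of `files_to_review()` from the repo root gives a checklist to go through before pushing to a public remote.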

About

A pipeline for building memorial chatbots of departed loved ones, based on their text messages and shared attachments and photos.
