Lyra: Private LLM with Memory

Set up a private LAN LLM with Ollama, build a FastAPI hub that ingests data into SQLite/FTS5, and prepend contextual notes to chats securely.

Overview

I’ll show a tiny but real home-network LLM that understands my household context—no cloud, no slides. We’ll start with Ollama on Ubuntu, bake a system prompt via a Modelfile, bind safely to LAN, and hit it with curl. Then I’ll add a 60-line FastAPI “Lyra Hub” that ingests real-world signals, stores them in SQLite/FTS5, and augments chat by prepending concise household notes to the prompt, turning a stateless model into a proactive assistant.

We’ll show:
1) Modelfile + build:
FROM tinyllama:1.1b-chat
SYSTEM “"”You are Lyra… private LAN… family-safe… Czech by default…”””

# ollama create lyra-assistant -f Modelfile

2) Minimal hub (core idea):
@app.post(“/ingest”) -> save to SQLite + FTS
@app.post(“/chat”) -> notes = recent() + search(); call /api/chat with [system(notes)] + messages

3) Safe LAN exposure: UFW + optional Caddy (internal TLS + Bearer).
Everything is reproducible, copy-pasteable, and runs on a 4GB refurb PC. Fellow tinkerers can clone the pattern and drop in their own connectors in minutes.

Tech stack