Use Cases

Built for the complexity of real engineering teams

AI-assisted engineering isn't a future state — it's happening now. Teams running Claude Code, Codex and Cursor face the same coordination problem: AI context doesn't cross session boundaries, team boundaries, or role boundaries. memnos is the shared-memory layer that fixes this.

Scenario 1

The AI-Assisted Engineering Team

Every role in your engineering org has its own AI assistant. Without shared memory, you have as many knowledge silos as you have team members — times two.

  Product ──── Engineering ──── QA ──── DevOps
     ↓               ↓              ↓         ↓
     └───────────────────────────────────────→ memnos memory store
     ←── reads: requirements · decisions · failures · config ───

Product Manager + AI

Requirements & roadmap context

Without memnos

Requirements are scattered across Jira and Confluence. The AI assistant has no memory of prior product decisions, past trade-offs, or why features were scoped the way they were.

With memnos

Product context and requirements are stored as memories with full history. Every engineering agent knows the product direction, the "why" behind decisions, and current constraints — without the PM repeating themselves.

Engineers + AI Assistants

Architecture decisions & patterns

Without memnos

Each dev's AI session starts cold. Senior architecture decisions don't reach junior developers. Architecture drift happens silently. The same design questions get answered differently every day.

With memnos

Shared memory store means every developer's AI session starts with the team's full institutional context. Stored architecture constraints surface in recall whenever they're relevant. Senior decisions propagate to the whole team.

QA Engineers + Test Agents

Test patterns & failure history

Without memnos

Test failure patterns go undiscovered because no one connects the dots across sessions. The same bugs get re-investigated from scratch. Test agents write tests that duplicate existing coverage without knowing it.

With memnos

Every test failure is a memory. Test agents surface known failure modes before writing new tests. Cross-session patterns are visible. QA context — flaky tests, known edge cases, past regressions — persists and grows.

DevOps + Ops Agents

Incidents & operational context

Without memnos

Incident resolutions are locked in runbooks nobody reads. Every PagerDuty alert starts a fresh investigation. Ops agents have no institutional knowledge — they can't tell if they've seen this failure before.

With memnos

Every incident becomes an instantly retrievable memory. Future alerts surface similar past resolutions in seconds. Oncall agents arrive at the problem already knowing the 3 most likely causes and fixes.

Scenario 2

Stop architecture drift before it ships

Architecture documents are written. Engineers agree. Then AI agents generate code that violates them — because no one can enforce documented constraints at code-generation time.

How memnos solves it

1

Ingest your architecture docs

Upload architecture documents, ADRs, and design decisions. memnos parses them into typed constraint nodes — SHALL, SHOULD, and MAY rules — stored in the memory store.

2

Connect your CI pipeline

Add a single call to client.corpus.check_all in your pipeline, passing the PR diff as code and a short description of the change as context. memnos checks every changed file against your architecture corpus.

3

Automatic enforcement

SHALL violations block the PR. SHOULD violations annotate it with a comment. Architecture is enforced automatically — no manual review required for rules that are already documented.

ci_pipeline.py

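# In your CI pipeline
import sys


async def enforce_architecture(client, pull_request_diff):
    # client: an already-configured memnos client
    result = await client.corpus.check_all(
        code=pull_request_diff,
        context="patient-access service changes",
    )
    # SHALL violations → block PR (check every result, not just the first)
    violating = [r for r in result if r.shall_violations]
    if violating:
        print("Architecture violations found:")
        for r in violating:
            print(r.format())
        sys.exit(1)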

enforcement_output.txt

[CONSTRAINT|SHALL] All PHI data SHALL be encrypted
at rest using AES-256 or stronger.
Source: arch/security.md | Section: Data Protection
Score: 0.94

[CONSTRAINT|SHOULD] Services SHOULD include
correlation IDs in all log entries.
Source: arch/observability.md | Section: Logging
Score: 0.87
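
The enforcement policy described above (SHALL blocks, SHOULD annotates) can be sketched against the sample output format. This is a minimal illustration, not the memnos implementation: the `Hit` record and the `parse_hits` and `decide` helpers are hypothetical names, the parser assumes reports look exactly like the sample, and it treats every hit in the report as a violation.

```python
import re
from dataclasses import dataclass

# Matches the "[CONSTRAINT|LEVEL]" header format shown in the sample output.
HEADER = re.compile(r"^\[CONSTRAINT\|(SHALL|SHOULD|MAY)\]\s*(.*)$")


@dataclass
class Hit:  # hypothetical record, not a memnos type
    level: str   # "SHALL", "SHOULD", or "MAY"
    text: str    # the rule sentence
    source: str  # e.g. "arch/security.md | Section: Data Protection"
    score: float


def parse_hits(report: str) -> list[Hit]:
    """Parse a plain-text report into Hit records (format assumed from the sample)."""
    hits: list[Hit] = []
    for line in report.splitlines():
        line = line.strip()
        if not line:
            continue
        m = HEADER.match(line)
        if m:
            hits.append(Hit(m.group(1), m.group(2), "", 0.0))
        elif not hits:
            continue  # ignore anything before the first header
        elif line.startswith("Source:"):
            hits[-1].source = line[len("Source:"):].strip()
        elif line.startswith("Score:"):
            hits[-1].score = float(line[len("Score:"):])
        else:
            # Continuation of a rule sentence wrapped onto the next line.
            hits[-1].text = f"{hits[-1].text} {line}".strip()
    return hits


def decide(hits: list[Hit]) -> tuple[int, list[str]]:
    """Return (exit_code, pr_comments): SHALL blocks, SHOULD annotates, MAY is ignored."""
    blocking = [h for h in hits if h.level == "SHALL"]
    comments = [f"{h.text} ({h.source})" for h in hits if h.level == "SHOULD"]
    return (1 if blocking else 0), comments


SAMPLE = """\
[CONSTRAINT|SHALL] All PHI data SHALL be encrypted
at rest using AES-256 or stronger.
Source: arch/security.md | Section: Data Protection
Score: 0.94

[CONSTRAINT|SHOULD] Services SHOULD include
correlation IDs in all log entries.
Source: arch/observability.md | Section: Logging
Score: 0.87
"""

exit_code, comments = decide(parse_hits(SAMPLE))
```

With the sample report, the SHALL hit produces a non-zero exit code and the SHOULD hit becomes a single PR comment. A real pipeline would read the levels from the client's result objects rather than from text.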

The problem this solves

AI-generated code bypasses the review mental model engineers developed for human-written code. Engineers read PRs looking for logic errors, not architecture violations that should have been caught before the first keystroke. memnos moves architecture enforcement earlier: constraints surface to the agent before the code is written, and CI checks them automatically before the code merges.

Scenario 3

Stop re-investigating the same failures

PagerDuty fires at 2am. Your oncall opens a new agent session. Without memnos, they start from scratch. With memnos, they already know the 3 most similar past incidents and their exact resolutions.

pagerduty alert · 02:14 UTC
CRITICAL: prod-api latency p99 > 10s (firing)

memnos surfaces instantly:

✓ Redis OOM on prod-02, 2025-12-14. Fix: maxmemory-policy=allkeys-lru (similarity: 0.94)
✓ Connection pool exhaustion on api-gateway, 2026-01-08. Fix: increase pool size in app.yaml (similarity: 0.87)
~ Slow query on prod-db after schema migration, 2026-02-21. Root cause: missing index on events.created_at (similarity: 0.71)

oncall_agent.py

# On alert: surface past incidents
async def handle_alert(client, alert):
    # client: an already-configured memnos client
    past = await client.search(
        alert.description,
        namespace="org:acme:incidents",
        top_k=5,
    )
    # → ["Redis OOM on prod-02. Fix: set
    #     maxmemory-policy=allkeys-lru (0.94)"]

    # Store resolution for next time
    # (adjacent string literals are joined into one string)
    await client.write(
        content=(
            "prod-api latency: Redis eviction. "
            "Fix: allkeys-lru policy restart."
        ),
        memory_type="incident",
        namespace="org:acme:incidents",
    )
    return past

The flywheel effect

1

Incident fires, agent investigates

memnos surfaces similar past incidents. Oncall has context in seconds instead of minutes.

2

Resolution is found and applied

The resolution — exact commands, config changes, root cause — is stored as a new memory with provenance.

3

Next similar incident is faster

Every incident makes the next one faster to resolve. The corpus of operational knowledge compounds over time — each oncall shift builds on all the ones before it.

4

Patterns become constraints

Recurring root causes can be promoted to architecture constraints — agents are warned before writing code that historically causes incidents.
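
One way to find candidates for promotion is to count how often each root cause recurs across stored incidents. The sketch below is only an illustration of that idea: `promotion_candidates`, the `min_count` threshold, and the sample root causes are all hypothetical, and it assumes root causes are available as short normalized strings.

```python
from collections import Counter


def promotion_candidates(root_causes: list[str], min_count: int = 3) -> list[str]:
    """Return root causes seen at least min_count times, most frequent first."""
    # Normalize case and whitespace so near-identical labels are counted together.
    counts = Counter(cause.strip().lower() for cause in root_causes)
    return [cause for cause, n in counts.most_common() if n >= min_count]


# Illustrative data only; real root causes would come from stored incident memories.
example_causes = [
    "Missing index",
    "redis eviction",
    "missing index ",
    "Redis eviction",
    "missing index",
    "pool exhaustion",
]
candidates = promotion_candidates(example_causes, min_count=2)
```

A human would still review each candidate before it becomes a SHALL or SHOULD rule; the count only flags where to look.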

Scenario 4

New hire to productive in hours, not weeks

Every architecture decision, every coding pattern, every deployment lesson is already in the graph. A new developer's first AI session immediately has the institutional knowledge of the entire team.

Day 1

Full context, immediately

The new hire's first Claude Code session queries memnos and immediately receives the team's architecture decisions, coding conventions, recent incidents, and active constraints — everything that used to live in people's heads.

Day 3

Contributing confidently

Because stored constraints surface in recall alongside other memories, the new hire's AI assistant naturally follows team conventions. They write code that fits the codebase style, avoids known pitfalls, and respects architectural rules — automatically.

Week 1

Adding to the graph

As the new hire learns and discovers, their agent writes new memories. They become a net contributor to institutional knowledge from their first week — not a drain on senior engineers' time.

The hidden cost of stateless onboarding

Onboarding with stateless AI tools makes the ramp-up problem worse, not better. In a new developer's first session, the AI assistant has less context than a senior engineer had on their first day, because the senior at least had onboarding sessions. The AI has nothing.

With memnos, new hires and their AI assistants start with the same institutional knowledge as a 2-year veteran. The playing field levels on day one.

new_hire_day1.py

# New hire's first query
async def first_query(client):
    # client: an already-configured memnos client
    context = await client.search(
        "how does the auth service work",
        namespace="org:acme:engineering",
    )
    # Returns (facts + episodes, reranked):
    # Chose JWT over sessions for the auth service
    # Tokens expire in 15min (constraint, security.md)
    # Token refresh race condition, fixed 2026-02-11
    # Use auth.verify_token(), not raw JWT decode
    # PKCE flow required for all OAuth2 clients
    return context

Scenario 5

Replace static wikis with a living memory store

Static wikis go stale. Documentation rots. memnos's memory store updates itself as agents work — every decision, discovery, and lesson is a node that cross-links automatically through graph traversal.

Why wikis fail at scale

Requires human curation to stay current

Someone has to update the wiki. Nobody does. Knowledge drifts from reality within weeks.

No semantic connections between documents

An incident post-mortem in Confluence has no automatic link to the architecture decision that caused it.

AI agents can't efficiently search wiki structure

Vector search over a flat wiki returns only marginally better results than keyword search. Without graph structure, the connections between documents stay invisible.

No audit trail on knowledge changes

Who changed the architecture page? When? Why? What was the previous recommendation? Wikis don't know.

How memnos is different

Self-updating — agents write as they work

Every agent writes what it learns. The memory store grows automatically as work happens — no manual curation required.

Graph edges link related knowledge automatically

Incidents link to the constraints that were violated. Decisions link to the ADRs that superseded them. The memory store builds its own connections.

Hybrid search — vector + graph traversal

Semantic similarity finds the closest memories; graph traversal surfaces related nodes two or three hops away that pure vector search would miss.

Every change is auditable

Every memory write — who wrote it, what tool, which commit, which ticket — is permanently recorded. The full history is always queryable.
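
The hybrid search idea above (vector similarity first, then graph traversal a few hops out) can be sketched with toy data. This is an assumption-laden illustration rather than the memnos retrieval code: the node names, two-dimensional vectors, and the `hybrid_search` function are hypothetical, and a real store would use learned embeddings and a ranking step.

```python
import math
from collections import deque


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_search(query_vec, vectors, edges, top_k=1, max_hops=2):
    """Return {node: hop_distance}: top_k vector hits plus nodes within max_hops of them."""
    ranked = sorted(vectors, key=lambda n: cosine(query_vec, vectors[n]), reverse=True)
    seeds = ranked[:top_k]
    seen = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:  # breadth-first traversal outward from the vector hits
        node = queue.popleft()
        if seen[node] == max_hops:
            continue
        for neighbour in edges.get(node, ()):
            if neighbour not in seen:
                seen[neighbour] = seen[node] + 1
                queue.append(neighbour)
    return seen


# Toy memory store: names, vectors, and edges are made up for illustration.
vectors = {
    "incident:redis-oom": [1.0, 0.0],
    "adr:cache-policy": [0.0, 1.0],
    "constraint:maxmemory": [0.1, 0.9],
    "note:unrelated": [-1.0, 0.0],
}
edges = {
    "incident:redis-oom": ["constraint:maxmemory"],
    "constraint:maxmemory": ["adr:cache-policy"],
}
found = hybrid_search([0.9, 0.1], vectors, edges, top_k=1, max_hops=2)
```

Here the query is closest to the incident, and traversal then reaches the constraint (one hop) and the ADR (two hops), neither of which scores well on similarity alone.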

Unsolved Problems

The top 5 problems with AI coding agents — and how memnos fixes them

Claude Code, Codex, Cursor, GitHub Copilot, and similar tools have fundamentally changed how developers work. But they all share the same infrastructure gaps. memnos was built to close them.

1

Context window exhaustion — and the hidden cost of going over

Long sessions fill up the context window. When you hit the limit, coding agents either stop working, silently discard earlier context, or — on plans like Claude Code Max — trigger extra usage charges even when you still have weekly allocation remaining. The 1M token context window is a premium feature that bills separately from your monthly plan. Most developers discover this at the worst possible moment, mid-task.

How memnos fixes it

The auto-compact hook monitors input_tokens in the session transcript before every prompt. When context reaches 85% (configurable), Claude is instructed to run /compact before responding. Compaction writes a session summary to memnos — so nothing is lost — then resets the context window. Enable at install time with one prompt.
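
The threshold check can be sketched as follows. This is a minimal sketch under stated assumptions, not the hook's actual code: it assumes each transcript line is a JSON object that may carry a `usage.input_tokens` integer, and the `latest_input_tokens` and `should_compact` helpers and the 200,000-token limit in the example are hypothetical.

```python
import json
from typing import Iterable, Optional


def latest_input_tokens(lines: Iterable[str]) -> Optional[int]:
    """Return the most recent input_tokens value found in JSONL transcript lines."""
    latest = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        usage = record.get("usage") if isinstance(record, dict) else None
        if isinstance(usage, dict) and isinstance(usage.get("input_tokens"), int):
            latest = usage["input_tokens"]
    return latest


def should_compact(lines: Iterable[str], context_limit: int, threshold: float = 0.85) -> bool:
    """True when the latest input_tokens reading is at or above threshold * context_limit."""
    used = latest_input_tokens(lines)
    return used is not None and used >= threshold * context_limit


# Illustrative transcript; the schema and the limit are assumptions.
transcript = [
    '{"usage": {"input_tokens": 1000}}',
    "not json",
    '{"usage": {"input_tokens": 172000}}',
]
needs_compact = should_compact(transcript, context_limit=200_000)
```

Reading only the latest value keeps the check cheap enough to run before every prompt.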

2

Every new session starts blank

Close Claude Code and reopen it tomorrow. The agent has no memory of what you decided yesterday, which library you chose, why you rejected an approach, or what the naming conventions in your project are. You re-explain the same context every session. Every session. This is the single most common frustration reported by teams that use AI coding tools daily.

How memnos fixes it

The memory_search injection hook retrieves the 8 most relevant memories automatically before every prompt. No commands to type. The agent already knows your architectural decisions, project conventions, and prior work — from the first message of every session.
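
Conceptually, injection means ranking candidate memories and prepending the best ones to the prompt. The sketch below illustrates that step only: `inject_memories`, the `(score, text)` tuple shape, and the "Relevant memories:" header are hypothetical, and scoring is assumed to have happened already.

```python
def inject_memories(prompt: str, memories: list[tuple[float, str]], top_k: int = 8) -> str:
    """Prepend the top_k highest-scoring memories to the prompt as a bullet list."""
    top = sorted(memories, key=lambda m: m[0], reverse=True)[:top_k]
    if not top:
        return prompt  # nothing relevant: leave the prompt untouched
    bullets = "\n".join(f"- {text}" for _, text in top)
    return f"Relevant memories:\n{bullets}\n\n{prompt}"


# Illustrative memories with made-up scores.
example_memories = [(i / 10, f"m{i}") for i in range(10)]
injected = inject_memories("Fix the bug", example_memories)
```

Capping the list at eight keeps the injected context small relative to the prompt itself.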

3

Running 8 windows to do 8 tasks — and none of them talk to each other

Developers open multiple Claude Code windows to parallelize work. But each window is isolated — it has no idea what the others are doing. Agent A might choose Postgres while Agent B chooses SQLite for the same service. Agent C might refactor a function that Agent D is currently depending on. Without a shared memory layer, parallel agents create more problems than they solve.

How memnos fixes it

All agents share the same memnos namespace. When Agent A writes a decision, Agent B sees it in the next memory injection — within seconds. Agents can also subscribe to a namespace and pick up new memories by webhook push or cursor polling. One source of truth, all agents synchronized.
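
Cursor polling can be illustrated with an in-memory stand-in for a shared namespace. Everything here is hypothetical (`NamespaceLog`, `Agent`, and their methods are not memnos APIs); the sketch only shows how a cursor lets each agent pick up exactly the memories it has not seen yet.

```python
from dataclasses import dataclass, field


@dataclass
class NamespaceLog:
    """In-memory stand-in for a shared namespace: an append-only list of memories."""
    entries: list[str] = field(default_factory=list)

    def write(self, content: str) -> int:
        self.entries.append(content)
        return len(self.entries)  # position just past the new entry

    def read_since(self, cursor: int) -> tuple[list[str], int]:
        """Return entries written at or after cursor, plus the new cursor."""
        return self.entries[cursor:], len(self.entries)


@dataclass
class Agent:
    name: str
    cursor: int = 0
    known: list[str] = field(default_factory=list)

    def poll(self, log: NamespaceLog) -> list[str]:
        """Fetch unseen memories and advance this agent's cursor."""
        new, self.cursor = log.read_since(self.cursor)
        self.known.extend(new)
        return new


log = NamespaceLog()
agent_b = Agent("B")
log.write("Decision: use Postgres for the orders service")  # written by Agent A
first_poll = agent_b.poll(log)
second_poll = agent_b.poll(log)  # nothing new since the last poll
```

Because each agent keeps its own cursor, a slow agent never misses a decision and a fast one never re-reads old ones.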

4

API keys and secrets leaking into chat history

Developers paste API keys, database passwords, and credentials directly into Claude Code prompts because it is the fastest way to give the agent what it needs. Those secrets now live in the transcript — a plaintext JSONL file on disk, visible to any process that reads it. Some end up in CLAUDE.md files committed to version control. Some get summarized into compaction blobs and persist indefinitely.

How memnos fixes it

memnos redacts credential patterns (API keys, JWTs, PEM blocks, passwords) at write time — before any text enters memory — so a pasted secret never lands in the store in plaintext. Secrets that should persist go in the AES-256-GCM encrypted vault (memnos secret set), behind the same authenticated, audit-logged control plane.
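
Write-time redaction of this kind is typically pattern-based. The sketch below is a simplified illustration, not the memnos rule set: the three regular expressions and the `redact` helper are assumptions, and they cover only PEM blocks, JWT-shaped tokens, and `key = value` style assignments.

```python
import re

# Patterns replaced wholesale. These are illustrative, not an exhaustive rule set.
PATTERNS = [
    ("PEM", re.compile(r"-----BEGIN [A-Z ]+-----.*?-----END [A-Z ]+-----", re.DOTALL)),
    ("JWT", re.compile(r"\beyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+")),
]

# "password: value" or "api_key=value" style assignments: keep the key, drop the value.
ASSIGNMENT = re.compile(
    r"(?i)\b(password|passwd|api[_-]?key|secret|token)(\s*[:=]\s*)(\S+)"
)


def redact(text: str) -> str:
    """Replace credential-looking substrings before the text is stored."""
    for label, pattern in PATTERNS:
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return ASSIGNMENT.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)


cleaned = redact("db password: hunter2 ok")
```

Running redaction before the write, rather than at read time, means the plaintext secret never reaches storage at all.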

5

AI agents that keep making the same architectural mistakes

You tell Claude "we use snake_case for all Python variables." Next week you repeat it. The week after, same thing. System prompts and CLAUDE.md files are static text — they do not update as decisions evolve, they are invisible to new sessions, and they cannot be queried semantically. Agents drift. They re-introduce patterns you explicitly banned. They propose dependencies you already rejected.

How memnos fixes it

Architecture constraints stored in memnos are searchable facts — when a prompt touches their topic, the recall hook surfaces them with the rest of memory, no manual inclusion required. Corpus ingestion parses your existing ADRs, RFC documents and architecture files into SHALL/SHOULD/MAY rules, and a /corpus/check call from CI returns the rules relevant to a PR diff — annotate the PR or fail the build on your own policy.

Start with your team's most painful use case.

Whether it's incident intelligence, architecture enforcement, or simply shared context across AI sessions — memnos deploys in 90 seconds and pays off immediately.

