Self-Improving Agent
A unified system for continuous improvement: capture corrections, reflect on work, maintain persistent memory across projects and sessions, and extract reusable skills from recurring patterns.
Storage Architecture
Two tiers: project-local for active work, global for cross-project patterns. Plus a shared channel for cross-session experience exchange.
Project-Local (per project)
<project-root>/.learnings/
└── LEARNINGS.md # Corrections, patterns, and errors for this project
Created on first use. Lightweight — one file, simple format.
Global (always available)
~/self-improving/
├── memory.md # HOT: ≤100 lines, always loaded — top rules and preferences
├── corrections.md # Last 50 corrections log
├── projects/ # WARM: per-project patterns, loaded on context match
│ └── {project-name}.md
├── domains/ # WARM: domain-specific patterns (code, writing, comms)
│ └── {domain}.md
├── shared/ # Cross-session insights — any session writes, all sessions read
│ └── YYYY-MM-DD-topic.md
├── archive/ # COLD: decayed or long-unused patterns
└── index.md # Topic index for quick lookup
Initialisation
Project-local (create when first logging for a project):
mkdir -p .learnings
[ -f .learnings/LEARNINGS.md ] || printf "# Learnings — %s\n\nCaptured corrections, patterns, and errors.\n\n---\n" "$(basename $(pwd))" > .learnings/LEARNINGS.md
Global (create once):
mkdir -p ~/self-improving/{projects,domains,shared,archive}
[ -f ~/self-improving/memory.md ] || printf "# Global Memory\n\n## Rules\n\n## Preferences\n\n## Patterns\n\n---\n" > ~/self-improving/memory.md
[ -f ~/self-improving/corrections.md ] || printf "# Corrections Log\n\nLast 50 corrections. Oldest auto-pruned.\n\n---\n" > ~/self-improving/corrections.md
[ -f ~/self-improving/index.md ] || printf "# Index\n\nTopic → file mapping for quick lookup.\n\n---\n" > ~/self-improving/index.md
Startup Check
At the beginning of each session, perform a lightweight health check to keep memory from decaying:
- Prune expired shared insights: Delete files in
~/self-improving/shared/older than 7 days. - Check shared/ volume: If more than 20 files exist, prune oldest or promote and delete.
- Check HOT layer health: If
memory.mdexceeds 100 lines, trigger compaction (see Compaction section). - Log check result: Briefly note what was pruned or compacted — no verbose output.
This replaces a heavy heartbeat with a minimal startup sweep. No background processes, no recurring timers — just one pass when the session starts.
When to Log
Auto-detect these signals:
Corrections → log immediately:
- "No, that's not right..." / "Actually, it should be..." / "You're wrong about..."
- "I prefer X, not Y" / "Always do X" / "Never do Y" / "Stop doing X"
- "That's outdated..." / "I told you before..."
Errors → log to project-local:
- Non-zero exit codes, exceptions, unexpected output
- API failures, timeouts, connection issues
Feature gaps → note for later:
- "Can you also..." / "I wish you could..." / "Why can't you..."
Knowledge gaps → log when you learn something new:
- User provides info you didn't have
- Your understanding was wrong or outdated
Better approaches → log when you discover improvements:
- A faster way to do something recurring
- A cleaner pattern you hadn't considered
Do NOT log:
- One-time instructions ("do X now")
- Pure context ("in this file...")
- Hypotheticals ("what if...")
- Anything inferred from silence — only explicit corrections
Logging Format
Keep it lightweight. No heavy metadata — just enough to be useful later.
Project-Local Entry
Append to .learnings/LEARNINGS.md:
### [correction|error|pattern|gap] short summary
**Date**: YYYY-MM-DD
**Context**: what was happening
What went wrong / what was learned:
Description of the situation.
**Rule**: the takeaway (one sentence).
Global Correction Entry
Append to ~/self-improving/corrections.md:
### [date] short summary
**Project**: project-name (or "general")
**Context**: what was happening
**Correction**: what the user said.
**Rule**: the permanent takeaway.
Keep only the last 50 entries — prune oldest when it exceeds that.
Global Memory Entry (HOT tier)
Append to ~/self-improving/memory.md under the appropriate section:
- **Rule text** (from: project/area, first noted: date)
Self-Reflection
After completing significant work, pause and evaluate:
- Did it meet expectations? — Compare outcome vs intent
- What could be better? — Identify improvements for next time
- Is this a pattern? — If yes, log it
When to self-reflect:
- After multi-step tasks
- After receiving feedback (positive or negative)
- After fixing a bug or mistake
- When you notice your output could be better
Format:
CONTEXT: [type of task]
REFLECTION: [what I noticed]
LESSON: [what to do differently — log if reusable]
Only persist lessons that are reusable — one-off observations stay in conversation context.
Cross-Session Sharing
When running multiple sessions in parallel (e.g., different projects, different tasks), use the shared directory as a lightweight message board. No special tools required — just file conventions.
How It Works
~/self-improving/shared/ is a flat directory that all sessions can read and write. Each shared insight is a small markdown file. Sessions check for new files at natural breakpoints.
Writing a Shared Insight
When your session discovers something worth sharing with other running sessions:
- Create
~/self-improving/shared/YYYY-MM-DD-topic-slug.md:
# Topic Title
**From**: project-name / task description
**Date**: YYYY-MM-DD HH:MM
**Relevance**: writing | code | general | project-name
What was learned or discovered:
Brief description.
**Actionable rule**: one sentence the other session can apply.
- Keep it short — this is a sticky note, not a document.
- Only share insights that are reusable by other sessions, not project-specific debugging details.
Reading Shared Insights
At natural breakpoints (starting a new task, switching context, after completing work):
# Check for insights from the last 3 days
find ~/self-improving/shared/ -name "*.md" -mtime -3 | sort
Read any relevant files. If an insight applies to your current work, apply it and optionally delete the file (consumed). If not relevant, skip it.
Lifecycle
- Write when you discover something another session could use
- Read at task start or context switches
- Delete after applying (consumed) or after 7 days if not relevant
- Promote to
memory.md(HOT) ordomains/(WARM) if the insight proves broadly valuable - Never let
shared/accumulate beyond 20 files — prune oldest or promote and delete
What's Worth Sharing
| Share | Don't Share | |-------|------------| | "User prefers concise bullet points over prose for summaries" | "The bug is on line 42 of foo.py" | | "API X requires auth header format Y" | "I'm debugging the login flow" | | "This markdown template works well for reports" | One-off file-specific fixes | | "User corrected: never use Oxford comma" | Internal reasoning or hypotheses |
Memory Tiers & Lifecycle
Tier Rules
| Tier | Location | Size Limit | Loaded When |
|------|----------|------------|-------------|
| HOT | memory.md | ≤100 lines | Every session |
| WARM | projects/, domains/ | ≤200 lines each | Context matches |
| COLD | archive/ | Unlimited | Explicit query only |
Promotion (upward)
- Pattern applied successfully 3+ times in 7 days → promote to HOT
- Project pattern useful across 2+ projects → promote to WARM domain file
- Any pattern used consistently → consider extraction as a Skill
Demotion (downward)
- HOT pattern unused for 30 days → demote to WARM
- WARM pattern unused for 90 days → archive to COLD
- Never delete without asking — demote, summarize, or archive instead
Compaction
When any file exceeds its size limit:
- Merge similar entries into single rules
- Archive unused patterns
- Summarize verbose entries
- Never lose confirmed user preferences
Namespace & Conflict Resolution
Isolation
- Project patterns →
projects/{name}.md - Global preferences → HOT (
memory.md) - Domain patterns (code, writing, comms) →
domains/{domain}.md
Inheritance
Global → Domain → Project (outermost is default, innermost overrides)
When Patterns Conflict
- Most specific wins (project > domain > global)
- Most recent wins (at same level)
- If still ambiguous → ask the user
Promotion to Project Memory
When a project-local learning is broadly applicable:
- Distill into a concise rule
- Add to
CLAUDE.md(project conventions) or relevant project file - Update the entry: mark as
promoted
What belongs where:
| Target | What Goes There |
|--------|----------------|
| CLAUDE.md | Project conventions, build commands, gotchas |
| ~/self-improving/memory.md | Cross-project preferences and behavioral rules |
| ~/self-improving/projects/{name}.md | Project-specific patterns |
| ~/self-improving/domains/{domain}.md | Domain patterns (code, writing, comms) |
Skill Extraction
When a learning becomes a reusable, self-contained capability, extract it as a Skill.
Extraction Criteria (any one):
- Recurring across 3+ sessions or projects
- Verified working with a tested solution
- Non-obvious (required investigation to discover)
- Broadly applicable (not project-specific)
- User says "save this as a skill"
Process:
- Create
skills/{skill-name}/SKILL.md - Write self-contained instructions (no references to current session)
- Include YAML frontmatter with
nameanddescription - Verify it works in a fresh session
- Mark the source learning as
promoted_to_skill
Review Workflow
Before Major Tasks
- Check global memory (
memory.md) for relevant rules - Search project-local
.learnings/for past issues in this area - Load matching WARM files if they exist
At Natural Breakpoints
- After completing a feature or document
- When switching between projects
- When you notice a repeated correction
Quick Status
# Count pending learnings in current project
grep -c "^###" .learnings/LEARNINGS.md 2>/dev/null || echo 0
# Count global rules
grep -c "^-" ~/self-improving/memory.md 2>/dev/null || echo 0
# List corrections from last 7 days
grep -A3 "$(date -d '7 days ago' +%Y-%m-%d 2>/dev/null || date -v-7d +%Y-%m-%d)" ~/self-improving/corrections.md
Query Interface
| User says | Action |
|-----------|--------|
| "What do you know about X?" | Search all tiers for X |
| "What have you learned?" | Show recent corrections and HOT rules |
| "Show project patterns" | Load projects/{name}.md |
| "Any shared insights?" | List recent files in shared/ |
| "Memory stats" | Report counts per tier (see below) |
| "Forget X" | Search all tiers, confirm before removing |
| "What could be improved?" | Self-reflect on recent work |
Memory Stats
Memory Status
HOT (always loaded):
memory.md: X entries
WARM (load on demand):
projects/: X files, X entries total
domains/: X files, X entries total
SHARED (cross-session):
shared/: X files (X new since last check)
COLD (archived):
archive/: X files
Recent activity (7 days):
Corrections logged: X
Promotions to HOT: X
Demotions to WARM: X
Shared insights written: X
Shared insights consumed: X
Security
- Never store credentials, tokens, API keys, or secrets
- Never store health data, addresses, or sensitive personal info
- Redact sensitive context in error logs — summaries over raw output
- Prefer short descriptions over full transcripts
Transparency
- Cite memory sources when applying rules: "Using rule from projects/foo.md"
- When loading from memory, tell the user what was loaded
- If context limits prevent loading everything, state what was skipped
- Every memory operation is visible — no silent behavior changes
Common Pitfalls
| Pitfall | Why It Fails | Better Approach |
|---------|-------------|-----------------|
| Learning from silence | Creates false rules | Wait for explicit correction or repeated evidence |
| Promoting too fast | Pollutes HOT memory | Keep tentative until pattern repeats 3x |
| Loading everything | Wastes context window | Load HOT + smallest matching WARM files only |
| Deleting entries | Loses trust and history | Demote or archive instead |
| Logging everything | Creates noise | Only log explicit corrections, verified errors, and reusable patterns |
| Hoarding shared insights | Clutters shared/, wastes reads | Consume (delete) after applying, promote valuable ones to WARM/HOT |
| Sharing too specifically | Other sessions can't reuse | Share the rule, not the debugging details |
微信扫一扫