Category: Other · No API key required

Meta Knowledge Base

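An AI-driven knowledge base builder that automatically captures, organizes, and retrieves information. Through conversations, documents, and interactive learning...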

Author: jason-aka-chen (clawhub)

A self-building knowledge management system that learns and grows automatically.

Features

1. Auto-Capture

  • Conversation Learning: Extract key information from chats
  • Document Parsing: Extract from PDFs, docs, emails
  • Web Scraping: Learn from visited pages
  • File Watch: Monitor folders for new content

2. Knowledge Organization

  • Auto-Tagging: Automatic topic categorization
  • Entity Extraction: People, companies, concepts
  • Relationship Mapping: Connect related ideas
  • Version History: Track knowledge evolution

3. Semantic Search

  • Vector Embeddings: Semantic similarity search
  • Hybrid Search: Combine keyword + semantic
  • Filtering: Filter by date, tags, source
  • Ranking: Relevance-based results

4. Intelligent Q&A

  • RAG Pipeline: Retrieve + Generate answers
  • Context-Aware: Understand conversation context
  • Citing Sources: Reference original knowledge
  • Confidence Scoring: Show answer confidence

5. Continuous Learning

  • User Feedback: Learn from corrections
  • Implicit Learning: Learn from interactions
  • Knowledge Updates: Keep information fresh
  • Gap Identification: Find missing knowledge

Installation

pip install numpy faiss-cpu sentence-transformers

Usage

Initialize Knowledge Base

from meta_knowledge import KnowledgeBase

kb = KnowledgeBase(
    name="my_knowledge",
    embedding_model="paraphrase-multilingual-MiniLM-L12-v2"
)

Add Knowledge

# From text
kb.add(
    content="Python is a high-level programming language...",
    tags=["programming", "python"],
    metadata={"source": "user", "date": "2026-03-22"}
)

# From document
kb.add_from_file("document.pdf", tags=["research"])

# From URL
kb.add_from_url("https://example.com/article", tags=["news"])
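
The `tags` and `metadata` supplied to `add` are presumably what filtered searches match against. The internals are not documented here, so the following is only a minimal sketch of tag, source, and date filtering over plain dictionaries. The `matches` helper and the item layout are hypothetical, not part of `meta_knowledge`.

```python
from datetime import date

def matches(item, tags=None, source=None, since=None):
    """True if the item satisfies every filter that was provided."""
    # All requested tags must be present on the item.
    if tags and not set(tags) <= set(item["tags"]):
        return False
    if source and item["metadata"].get("source") != source:
        return False
    # Dates are ISO strings, as in the add() example above.
    if since and date.fromisoformat(item["metadata"]["date"]) < since:
        return False
    return True

# Hypothetical stored items mirroring the arguments passed to kb.add().
items = [
    {"content": "Python is a high-level programming language...",
     "tags": ["programming", "python"],
     "metadata": {"source": "user", "date": "2026-03-22"}},
    {"content": "Old meeting notes",
     "tags": ["notes"],
     "metadata": {"source": "email", "date": "2025-11-02"}},
]

recent_python = [i for i in items
                 if matches(i, tags=["python"], since=date(2026, 1, 1))]
print(len(recent_python))
```

Storing dates in ISO format keeps them parseable with the standard library alone.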

Search

# Semantic search
results = kb.search(
    query="What is machine learning?",
    top_k=5
)

for r in results:
    print(f"{r.score:.2f} | {r.content[:100]}...")
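
How `search` ranks items internally is not shown, so the following is only a minimal sketch of one common approach: cosine similarity between a query vector and the stored vectors, keeping the `top_k` best. The toy 3-dimensional vectors and the `cosine` and `top_k_search` helpers are hypothetical stand-ins, not the library's API.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k_search(query_vec, items, top_k=5):
    """Return (score, content) pairs sorted by descending similarity."""
    scored = [(cosine(query_vec, vec), content) for content, vec in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Toy embeddings standing in for real model output (made-up values).
items = [
    ("Machine learning fits models to data", [0.9, 0.1, 0.0]),
    ("Python is a programming language", [0.1, 0.9, 0.0]),
    ("Cooking pasta al dente", [0.0, 0.1, 0.9]),
]

results = top_k_search([1.0, 0.2, 0.0], items, top_k=2)
for score, content in results:
    print(f"{score:.2f} | {content}")
```

A real deployment would replace the linear scan with a vector index such as the one `faiss-cpu` provides.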

Q&A

# Ask questions
answer = kb.ask(
    question="What do I know about AI?",
    include_sources=True
)

print(answer['answer'])
print("Sources:", answer['sources'])
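
The retrieve-then-generate step behind `ask` can be pictured as assembling the retrieved passages into a prompt that numbers each source, so the model can cite them. This is a sketch under assumptions: the `build_prompt` helper and the passage layout are hypothetical, and the actual LLM call is left out.

```python
def build_prompt(question, passages):
    """Assemble a prompt that numbers each retrieved passage for citation."""
    lines = ["Answer using only the sources below. Cite them as [n].", ""]
    for i, passage in enumerate(passages, start=1):
        lines.append(f"[{i}] ({passage['source']}) {passage['content']}")
    lines.append("")
    lines.append(f"Question: {question}")
    return "\n".join(lines)

# Hypothetical retrieval results; a real pipeline would get these from search.
passages = [
    {"source": "notes.md", "content": "AI covers machine learning and reasoning."},
    {"source": "paper.pdf", "content": "Transformers are a neural architecture."},
]

prompt = build_prompt("What do I know about AI?", passages)
print(prompt)
```

Numbering the passages lets the `[n]` markers in a generated answer be mapped back to source records.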

Knowledge Graph

# Get entity relationships
graph = kb.get_knowledge_graph()

# Find related concepts
related = kb.find_related("Python", depth=2)
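
The `depth` argument suggests a bounded graph traversal. A minimal sketch, assuming the graph is an adjacency mapping and that `depth` counts hops from the starting entity, is a breadth-first search. The standalone `find_related` function and the sample graph below are illustrative only and may differ from the library's behavior.

```python
from collections import deque

def find_related(graph, start, depth=2):
    """Breadth-first search returning (node, hops) within `depth` of `start`."""
    seen = {start}
    queue = deque([(start, 0)])
    related = []
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand beyond the depth limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                related.append((neighbor, dist + 1))
                queue.append((neighbor, dist + 1))
    return related

# Hypothetical entity graph as an adjacency mapping.
graph = {
    "Python": ["Programming", "NumPy"],
    "NumPy": ["Linear algebra"],
    "Linear algebra": ["Eigenvalues"],
}

related = find_related(graph, "Python", depth=2)
print(related)
```

With `depth=2`, "Eigenvalues" is excluded because it sits three hops from "Python".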

API Reference

Adding Knowledge

| Method | Description |
|--------|-------------|
| add(content, ...) | Add single piece of knowledge |
| add_batch(contents) | Add multiple items |
| add_from_file(path) | Parse and add file |
| add_from_url(url) | Fetch and add web content |
| add_from_email(email) | Parse email content |

Searching

| Method | Description |
|--------|-------------|
| search(query, top_k) | Semantic search |
| hybrid_search(query, ...) | Keyword + semantic |
| filter_search(query, filters) | Search with filters |
| find_similar(content) | Find similar items |
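
How `hybrid_search` combines keyword and semantic signals is not specified. One common approach, shown here purely as an assumption, is a weighted sum of a keyword-overlap score and a semantic similarity score. The `keyword_score` and `hybrid_score` helpers and the `alpha` weight are hypothetical names, and the semantic scores are made-up placeholders.

```python
def keyword_score(query, text):
    """Fraction of query terms that appear in the text (case-insensitive)."""
    terms = set(query.lower().split())
    words = set(text.lower().split())
    if not terms:
        return 0.0
    return len(terms & words) / len(terms)

def hybrid_score(query, text, semantic, alpha=0.5):
    """Blend a semantic score in [0, 1] with keyword overlap.

    alpha=1.0 uses only the semantic score, alpha=0.0 only keywords.
    """
    return alpha * semantic + (1 - alpha) * keyword_score(query, text)

# (text, semantic score) pairs; the scores stand in for embedding similarity.
docs = [
    ("faiss index tuning guide", 0.40),
    ("notes on vector databases", 0.70),
]
query = "faiss index"

ranked = sorted(docs, key=lambda d: hybrid_score(query, d[0], d[1]), reverse=True)
print([text for text, _ in ranked])
```

Here the exact keyword match outranks the document with the higher semantic score, which is the kind of case hybrid retrieval is meant to handle.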

Q&A

| Method | Description |
|--------|-------------|
| ask(question, ...) | Get answer with RAG |
| get_context(question) | Get relevant context |
| generate_summary(topic) | Generate topic summary |

Management

| Method | Description |
|--------|-------------|
| get_knowledge_graph() | Get entity relationships |
| list_tags() | List all tags |
| export(format) | Export knowledge |
| import_(data) | Import knowledge |

Architecture

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Sources   │────▶│  Ingestion  │────▶│   Storage    │
│ - Chat      │     │ - Parser    │     │ - Vector DB  │
│ - Docs      │     │ - Embedder  │     │ - Graph DB   │
│ - Web       │     │ - Indexer   │     │ - Document   │
└─────────────┘     └─────────────┘     └─────────────┘
                                           │
                    ┌──────────────────────┘
                    ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Query     │────▶│   Retrieve  │────▶│   Generate  │
│ - Search    │     │ - Vector    │     │ - LLM       │
│ - Ask       │     │ - Graph     │     │ - Cite      │
└─────────────┘     └─────────────┘     └─────────────┘

Embedding Models

| Model | Dimensions | Languages | Use Case |
|-------|------------|-----------|----------|
| paraphrase-multilingual-MiniLM-L12-v2 | 384 | 50+ | General |
| bge-small-zh-v1.5 | 512 | Chinese | Chinese |
| text-embedding-ada-002 | 1536 | EN | Production |
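
The dimension column translates directly into storage cost. As a rough estimate, assuming vectors are stored uncompressed as 32-bit floats and ignoring any index overhead, the raw size is the number of vectors times the dimensions times 4 bytes. The `flat_index_bytes` helper is a hypothetical name used only for this calculation.

```python
def flat_index_bytes(num_vectors, dimensions, bytes_per_value=4):
    """Raw size of uncompressed float32 vectors (no index overhead)."""
    return num_vectors * dimensions * bytes_per_value

# Model names and dimensions are taken from the table above.
models = [
    ("paraphrase-multilingual-MiniLM-L12-v2", 384),
    ("bge-small-zh-v1.5", 512),
    ("text-embedding-ada-002", 1536),
]

for name, dims in models:
    mib = flat_index_bytes(100_000, dims) / 2**20
    print(f"{name}: {mib:.1f} MiB per 100k vectors")
```

Under these assumptions, the 1536-dimension model needs four times the vector storage of the 384-dimension one for the same number of items.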

Use Cases

  • Personal Assistant: Remember everything
  • Team Wiki: Shared knowledge base
  • Customer Support: Q&A automation
  • Research: Paper search & summarization
  • Codebase: Documentation search

Best Practices

  1. Regular Updates: Keep knowledge fresh
  2. Quality over Quantity: Clean data matters
  3. Use Tags: Organize for better retrieval
  4. User Feedback: Improve with corrections
  5. Backup: Export regularly

Integration

With OpenClaw

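# Auto-capture from conversations
@hookimpl
def after_message(message, response):
    # Assumes extract_topics() returns a list of topic strings;
    # call it once and reuse the result.
    topics = extract_topics(message)
    kb.add(
        content=f"User asked about: {', '.join(topics)}",
        # Unpack the topics so tags stays a flat list of strings
        tags=["conversation", *topics]
    )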

With Skills

# Use knowledge in skills
def my_skill(query):
    context = kb.search(query, top_k=3)
    return generate_response(query, context)

Future Capabilities

  • Multi-modal knowledge (images, audio)
  • Real-time sync across devices
  • Collaborative knowledge base
  • Automatic knowledge validation