
ai-dev-guidelines

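A comprehensive AI/ML development guide for LangChain, LangGraph, and ML model integration in FastAPI. Use it when building LLM applications, agents, RAG systems, sentiment analysis, aspect-based analysis, chain orchestration, prompt engineering, vector stores, or embeddings, or when integrating ML models with FastAPI endpoints. It covers LangChain patterns, LangGraph state machines, model deployment, API integration, streaming, error handling, and best practices.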

Author: jakexiaohub (GitHub)

AI/ML Development Guidelines (LangChain/LangGraph/FastAPI)

Purpose

Establish best practices for integrating AI/ML capabilities into FastAPI applications, with focus on LangChain, LangGraph, and ML model deployment.

When to Use This Skill

Automatically activates when working on:

  • LangChain chains, agents, tools
  • LangGraph workflows and state machines
  • RAG (Retrieval Augmented Generation)
  • Prompt engineering and templates
  • Vector stores and embeddings
  • ML model integration (sentiment analysis, aspect-based analysis, etc.)
  • Streaming LLM responses
  • AI service layers
  • Model deployment and optimization

Quick Start

New AI Feature Checklist

  • [ ] Model Selection: Choose appropriate model/API (OpenAI, Anthropic, local)
  • [ ] Prompt Design: Create prompt templates
  • [ ] Chain/Graph: Build LangChain chain or LangGraph workflow
  • [ ] API Endpoint: FastAPI route with streaming support
  • [ ] Error Handling: Retry logic, fallbacks, timeouts
  • [ ] Validation: Input/output validation with Pydantic
  • [ ] Monitoring: Log tokens, latency, errors
  • [ ] Testing: Unit tests with mocked LLM responses
  • [ ] Documentation: Document prompts and expected behavior
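The "Testing" item above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual test suite: `classify_sentiment`, `PROMPT_TEMPLATE`, and the fake LLM are hypothetical names, and the only assumption about the model object is that it exposes an `invoke()` method.

```python
from unittest.mock import MagicMock

# Hypothetical prompt template, kept outside the function body
PROMPT_TEMPLATE = "Classify the sentiment of: {text}"
ALLOWED_LABELS = {"positive", "neutral", "negative"}


def classify_sentiment(llm, text: str) -> str:
    """Hypothetical service function: ask the LLM, then normalize its answer."""
    raw = llm.invoke(PROMPT_TEMPLATE.format(text=text))
    label = str(raw).strip().lower()
    # Guard against unexpected model output
    return label if label in ALLOWED_LABELS else "neutral"


def test_classify_sentiment_normalizes_output():
    fake_llm = MagicMock()
    fake_llm.invoke.return_value = "  Positive\n"

    assert classify_sentiment(fake_llm, "I love this product!") == "positive"
    fake_llm.invoke.assert_called_once()


def test_classify_sentiment_falls_back_on_unexpected_output():
    fake_llm = MagicMock()
    fake_llm.invoke.return_value = "I cannot answer that."

    assert classify_sentiment(fake_llm, "???") == "neutral"
```

Because the model is passed in as an argument, the tests run without network access or API keys.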

Architecture for AI Features

Layered AI Architecture

HTTP Request
    ↓
FastAPI Route (streaming setup)
    ↓
AI Service (chain/graph orchestration)
    ↓
LangChain/LangGraph (LLM calls)
    ↓
Model/API (OpenAI, Anthropic, HuggingFace, local)

Current Project ML Structure:

backend/app/
├── routes/
│   └── ML_Routes.py        # ML API endpoints
├── machine_learning/
│   └── mlendpoint.py       # Sentiment analysis functions
└── services/               # TO BE CREATED for LangChain/LangGraph
    └── ai_services/
        ├── sentiment.py    # Refactored sentiment analysis
        ├── chains.py       # LangChain chains
        ├── graphs.py       # LangGraph workflows
        └── prompts.py      # Prompt templates

Package Management

This project uses uv for Python package management.

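# Add LangChain dependencies (langchain-community provides FAISS and get_openai_callback)
uv add langchain langchain-community langchain-openai langchain-anthropic langgraph

# Add vector store dependencies (when needed)
uv add faiss-cpu chromadb

# Add other AI dependencies (tenacity and pydantic-settings are used in the examples below)
uv add tiktoken sentence-transformers tenacity pydantic-settings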

# Run Python scripts with uv
uv run python script.py

❌ NEVER use pip install - Always use uv add instead.


Core Principles for AI Development

1. Separate Prompts from Code

# ❌ NEVER: Hardcoded prompts in code
def analyze_text(text: str):
    response = llm.invoke(f"Analyze sentiment of: {text}")
    return response

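# ✅ ALWAYS: Templated prompts
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4")
output_parser = StrOutputParser()

SENTIMENT_PROMPT = ChatPromptTemplate.from_messages([
    ("system", "You are a sentiment analysis expert."),
    ("user", "Analyze the sentiment of the following text:\n\n{text}\n\nProvide: sentiment (positive/neutral/negative) and confidence (0-1).")
])

def analyze_text(text: str) -> str:
    # The prompt text lives in one place; the function only wires the chain together
    chain = SENTIMENT_PROMPT | llm | output_parser
    return chain.invoke({"text": text})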

2. Use Pydantic for LLM Output Validation

from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate

class SentimentResult(BaseModel):
    sentiment: str = Field(description="positive, neutral, or negative")
    confidence: float = Field(ge=0.0, le=1.0)
    reasoning: str = Field(description="Brief explanation")

# Create parser
parser = PydanticOutputParser(pydantic_object=SentimentResult)

# Add format instructions to prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a sentiment expert."),
    ("user", "{text}\n\n{format_instructions}")
])

# Use in chain
chain = prompt | llm | parser

result: SentimentResult = chain.invoke({
    "text": "I love this product!",
    "format_instructions": parser.get_format_instructions()
})

3. Handle Streaming for Better UX

from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel

app = FastAPI()

class GenerateRequest(BaseModel):
    text: str

# Setup streaming LLM; StrOutputParser makes astream() yield plain text chunks
llm = ChatOpenAI(model="gpt-4", streaming=True)
prompt = ChatPromptTemplate.from_messages([("user", "{input}")])
chain = prompt | llm | StrOutputParser()

@app.post("/ai/generate-stream")
async def generate_stream(request: GenerateRequest):
    """Stream LLM responses for real-time feedback"""

    async def event_generator():
        async for chunk in chain.astream({"input": request.text}):
            if chunk:
                yield f"data: {chunk}\n\n"

    return StreamingResponse(
        event_generator(),
        media_type="text/event-stream"
    )
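One detail the streaming endpoint above does not handle is a chunk that contains a newline, which would break the `data: ...` framing of Server-Sent Events. The helper below is a small sketch of one way to encode such chunks; `format_sse` and `format_sse_json` are hypothetical names, not part of FastAPI or LangChain.

```python
import json
from typing import Optional


def format_sse(data: str, event: Optional[str] = None) -> str:
    """Encode one Server-Sent Events message.

    Each line of the payload gets its own "data:" field and a blank line
    terminates the message, so embedded newlines survive the transport.
    """
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # split("\n") keeps empty segments, so a chunk that is just "\n" is preserved
    for line in data.split("\n"):
        lines.append(f"data: {line}")
    return "\n".join(lines) + "\n\n"


def format_sse_json(payload: dict) -> str:
    """Alternative: JSON-encode the chunk so it always fits on one data line."""
    return format_sse(json.dumps(payload))
```

In the generator, `yield format_sse(chunk)` would take the place of the f-string. The JSON variant is convenient when the client also needs metadata next to the text.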

4. Implement Retry Logic and Fallbacks

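import logging

from tenacity import retry, stop_after_attempt, wait_exponential
from langchain_anthropic import ChatAnthropic
from langchain_openai import ChatOpenAI

logger = logging.getLogger(__name__)

# primary_llm and fallback_llm are assumed to be configured elsewhere,
# e.g. as ChatOpenAI and ChatAnthropic instances.

# The retry wraps the whole function, so a new attempt starts only
# when both the primary and the fallback call have failed.
@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10)
)
async def call_llm_with_retry(prompt: str):
    try:
        # Try primary model
        return await primary_llm.ainvoke(prompt)
    except Exception as e:
        # Log and fallback
        logger.error(f"Primary LLM failed: {e}")
        # Try fallback model
        return await fallback_llm.ainvoke(prompt)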
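An alternative ordering is to retry the primary model first and switch to the fallback only once those retries are exhausted. The sketch below shows that ordering with plain `asyncio`, without tenacity; `call_with_retry_then_fallback` and its parameters are hypothetical, and the two model calls are passed in as plain async callables.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger(__name__)

# Any async function that takes a prompt and returns text
LLMCall = Callable[[str], Awaitable[str]]


async def call_with_retry_then_fallback(
    prompt: str,
    primary: LLMCall,
    fallback: LLMCall,
    attempts: int = 3,
    base_delay: float = 2.0,
    max_delay: float = 10.0,
) -> str:
    """Retry the primary model with exponential backoff, then fall back once."""
    for attempt in range(1, attempts + 1):
        try:
            return await primary(prompt)
        except Exception as exc:  # narrow this to transient errors in real code
            logger.warning("Primary attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt < attempts:
                # Delay doubles each attempt (2s, 4s, ...) and is capped at max_delay
                await asyncio.sleep(min(base_delay * 2 ** (attempt - 1), max_delay))
    logger.error("Primary LLM exhausted its retries; using fallback")
    return await fallback(prompt)
```

With LangChain models, the callables could be thin wrappers around `ainvoke`. Whether to retry before or after falling back depends on which failure mode is more common in practice.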

5. Use Environment Variables for API Keys

# ❌ NEVER: Hardcoded API keys
llm = ChatOpenAI(api_key="sk-...")

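# ✅ ALWAYS: From environment
from langchain_openai import ChatOpenAI
from pydantic_settings import BaseSettings, SettingsConfigDict

class AISettings(BaseSettings):
    # Pydantic v2 style config; clearing protected_namespaces avoids the
    # warning that the "model_" prefix of model_name would otherwise trigger
    model_config = SettingsConfigDict(env_file=".env", protected_namespaces=())

    openai_api_key: str
    anthropic_api_key: str
    model_name: str = "gpt-4"
    max_tokens: int = 1000
    temperature: float = 0.7

settings = AISettings()

llm = ChatOpenAI(
    api_key=settings.openai_api_key,
    model=settings.model_name,
    max_tokens=settings.max_tokens,
    temperature=settings.temperature
)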

6. Track Token Usage and Costs

from langchain_community.callbacks import get_openai_callback

@app.post("/ai/analyze")
async def analyze_with_tracking(text: str):
    with get_openai_callback() as cb:
        # Use the async API so the event loop is not blocked during the LLM call
        result = await chain.ainvoke({"text": text})

    # Log metrics (this callback only tracks usage and prices for OpenAI models)
    logger.info(
        "Tokens used: %s | prompt: %s | completion: %s | total cost: $%s",
        cb.total_tokens, cb.prompt_tokens, cb.completion_tokens, cb.total_cost,
    )

    return {
        "result": result,
        "metadata": {
            "tokens": cb.total_tokens,
            "cost": cb.total_cost
        }
    }
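For providers without a built-in usage callback, the same bookkeeping can be done by hand. The following is a rough sketch under stated assumptions: `UsageTracker` is a hypothetical helper, and the prices in `PRICE_PER_1K` are placeholders, not real rates for any model.

```python
from dataclasses import dataclass

# Placeholder prices in USD per 1,000 tokens; replace with your provider's current rates
PRICE_PER_1K = {
    "example-model": {"prompt": 0.01, "completion": 0.03},
}


@dataclass
class UsageTracker:
    """Accumulates token counts for one model across several calls."""

    model: str
    prompt_tokens: int = 0
    completion_tokens: int = 0

    def record(self, prompt_tokens: int, completion_tokens: int) -> None:
        self.prompt_tokens += prompt_tokens
        self.completion_tokens += completion_tokens

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    @property
    def total_cost(self) -> float:
        price = PRICE_PER_1K[self.model]
        return (
            self.prompt_tokens * price["prompt"]
            + self.completion_tokens * price["completion"]
        ) / 1000
```

The token counts themselves would come from the usage metadata each provider returns with its response.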

7. Implement Proper Error Boundaries

from fastapi import HTTPException
# openai>=1.0 exposes its exceptions at the package top level (openai.error was removed)
from openai import APIError, RateLimitError

@app.post("/ai/process")
async def process_with_ai(request: AIRequest):
    try:
        result = await ai_service.process(request.text)
        return result

    except RateLimitError:
        raise HTTPException(
            status_code=429,
            detail="Rate limit exceeded. Please try again later."
        )

    except APIError as e:
        logger.error(f"LLM API error: {e}")
        raise HTTPException(
            status_code=503,
            detail="AI service temporarily unavailable"
        )

    except Exception as e:
        logger.error(f"Unexpected error in AI processing: {e}")
        raise HTTPException(
            status_code=500,
            detail="Error processing request"
        )
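The checklist also mentions timeouts, which the error boundary above does not cover. A minimal sketch, assuming the AI call is an awaitable: wrap it in `asyncio.wait_for` and convert the timeout into a dedicated exception that the route can map to a 504 response. `LLMTimeoutError` and `with_timeout` are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable


class LLMTimeoutError(Exception):
    """Hypothetical error raised when a model call exceeds its time budget."""


async def with_timeout(awaitable: Awaitable[Any], seconds: float) -> Any:
    """Await the call, cancelling it if it runs longer than `seconds`."""
    try:
        return await asyncio.wait_for(awaitable, timeout=seconds)
    except asyncio.TimeoutError as exc:
        raise LLMTimeoutError(f"LLM call exceeded {seconds}s") from exc
```

In the route, `await with_timeout(ai_service.process(request.text), 30)` plus an `except LLMTimeoutError` branch would complete the picture; the 30-second budget is only an example value.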

LangChain Patterns

Basic Chain Pattern

from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Components
llm = ChatOpenAI(model="gpt-4")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}")
])
output_parser = StrOutputParser()

# Chain
chain = prompt | llm | output_parser

# Invoke
result = chain.invoke({"input": "Hello!"})

Chain with Memory

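# Note: ConversationChain and ConversationBufferMemory are legacy APIs, deprecated in
# recent LangChain releases in favor of RunnableWithMessageHistory or LangGraph persistence.
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

memory = ConversationBufferMemory()

conversation = ConversationChain(
    llm=llm,
    memory=memory,
    verbose=True
)

# First interaction
response1 = conversation.predict(input="Hi, I'm Aaron")

# Memory persists across calls on the same chain
response2 = conversation.predict(input="What's my name?")
# Expected to respond with something like: "Your name is Aaron"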

RAG Chain Pattern

from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain.chains import RetrievalQA

# 1. Prepare documents ("documents" is a list of Document objects loaded elsewhere)
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=200
)
texts = text_splitter.split_documents(documents)

# 2. Create vector store
embeddings = OpenAIEmbeddings()
vectorstore = FAISS.from_documents(texts, embeddings)

# 3. Create retrieval chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 3})
)

# 4. Query (the answer text is returned under the "result" key)
result = qa_chain.invoke({"query": "What are the main findings?"})
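To make the retrieval step less of a black box, here is a toy version of what a top-k retriever does conceptually: score every stored vector against the query vector and keep the best `k`. This is an illustration only. It uses cosine similarity and hand-written 2-D vectors, whereas the real vector store may use a different distance metric and an optimized index; `cosine_similarity` and `top_k` are hypothetical helpers.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    indexed: List[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[str]:
    """Return the texts of the k stored chunks most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in indexed]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy index: (chunk text, embedding) pairs with made-up 2-D embeddings
index = [
    ("chunk about budgets", [1.0, 0.0]),
    ("chunk about parks", [0.0, 1.0]),
    ("chunk about both", [1.0, 1.0]),
]
best = top_k([1.0, 0.1], index, k=2)
```

The retrieved chunks are then inserted into the prompt, which is what the "stuff" chain type refers to.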

Tool-Using Agent

from langchain.agents import create_openai_functions_agent, AgentExecutor
from langchain.tools import Tool

# Define tools
def search_projects(query: str) -> str:
    """Search projects in database"""
    # Your search logic
    return f"Found 3 projects matching {query}"

def get_sentiment(text: str) -> str:
    """Analyze sentiment of text"""
    # Your sentiment analysis
    return "positive"

tools = [
    Tool(
        name="search_projects",
        func=search_projects,
        description="Search for urban planning projects by name or city"
    ),
    Tool(
        name="analyze_sentiment",
        func=get_sentiment,
        description="Analyze sentiment of text (positive/neutral/negative)"
    )
]

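# Create agent. The prompt must include an "agent_scratchpad" placeholder,
# which holds the intermediate tool calls and their results.
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

agent_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    ("user", "{input}"),
    MessagesPlaceholder(variable_name="agent_scratchpad"),
])
agent = create_openai_functions_agent(llm, tools, agent_prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools)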

# Execute
result = agent_executor.invoke({
    "input": "Find projects in Mexico City and analyze sentiment"
})

LangGraph Patterns

Simple State Machine

from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
from operator import add

# Define state
class AgentState(TypedDict):
    input: str
    analysis: str
    sentiment: str
    final_output: str

# Define nodes
def analyze_node(state: AgentState):
    # Perform analysis; chat models return a message object, so keep only its text
    response = llm.invoke(f"Analyze: {state['input']}")
    return {"analysis": response.content}

def sentiment_node(state: AgentState):
    # Extract sentiment (extract_sentiment is a project helper that is not shown here)
    sentiment = extract_sentiment(state['analysis'])
    return {"sentiment": sentiment}

def format_node(state: AgentState):
    # Format output
    output = f"Analysis: {state['analysis']}\nSentiment: {state['sentiment']}"
    return {"final_output": output}

# Build graph
workflow = StateGraph(AgentState)

workflow.add_node("analyze", analyze_node)
workflow.add_node("sentiment", sentiment_node)
workflow.add_node("format", format_node)

workflow.set_entry_point("analyze")
workflow.add_edge("analyze", "sentiment")
workflow.add_edge("sentiment", "format")
workflow.add_edge("format", END)

# Named "graph" rather than "app" to avoid shadowing the FastAPI application object
graph = workflow.compile()

# Run
result = graph.invoke({"input": "This project is amazing!"})
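Each node above returns only the keys it changes, and those partial updates are merged into the shared state. The stand-alone sketch below imitates that merge for a linear pipeline so the data flow can be followed without LangGraph or an LLM. It is not how LangGraph is implemented; `run_linear`, `DemoState`, and the stub nodes are hypothetical.

```python
from typing import Callable, Dict, List, TypedDict


class DemoState(TypedDict, total=False):
    input: str
    analysis: str
    sentiment: str
    final_output: str


def run_linear(nodes: List[Callable[[Dict], Dict]], state: Dict) -> Dict:
    """Run nodes in order, merging each partial update into the state."""
    current = dict(state)
    for node in nodes:
        current.update(node(current))  # keys returned by the node overwrite old values
    return current


# Stub nodes standing in for the LLM-backed ones
def analyze_stub(state: Dict) -> Dict:
    return {"analysis": f"Text mentions: {state['input']}"}


def sentiment_stub(state: Dict) -> Dict:
    label = "positive" if "amazing" in state["analysis"] else "neutral"
    return {"sentiment": label}


def format_stub(state: Dict) -> Dict:
    output = f"Analysis: {state['analysis']}\nSentiment: {state['sentiment']}"
    return {"final_output": output}


demo_result = run_linear(
    [analyze_stub, sentiment_stub, format_stub],
    {"input": "This project is amazing!"},
)
```

The final state contains the original `input` plus every key the nodes added along the way.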

Conditional Routing

def route_by_sentiment(state: AgentState):
    """Route based on sentiment"""
    if state["sentiment"] == "negative":
        return "handle_negative"
    elif state["sentiment"] == "positive":
        return "handle_positive"
    else:
        return "handle_neutral"

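# Add conditional edges (use these in place of the fixed "sentiment" -> "format" edge;
# the handler nodes must first be registered with workflow.add_node)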
workflow.add_conditional_edges(
    "sentiment",
    route_by_sentiment,
    {
        "handle_negative": "negative_handler",
        "handle_positive": "positive_handler",
        "handle_neutral": "neutral_handler"
    }
)
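Because the routing function is pure, it can be checked without compiling a graph. The sketch below repeats the function and the label-to-node mapping from above and resolves the next node by hand; `ROUTES` and `next_node` are hypothetical names used only for this illustration.

```python
from typing import Dict

# Same mapping as passed to add_conditional_edges: routing label -> node name
ROUTES = {
    "handle_negative": "negative_handler",
    "handle_positive": "positive_handler",
    "handle_neutral": "neutral_handler",
}


def route_by_sentiment(state: Dict) -> str:
    """Route based on sentiment; anything unrecognized is treated as neutral."""
    if state["sentiment"] == "negative":
        return "handle_negative"
    elif state["sentiment"] == "positive":
        return "handle_positive"
    else:
        return "handle_neutral"


def next_node(state: Dict) -> str:
    """Resolve the node that would run next for a given state."""
    return ROUTES[route_by_sentiment(state)]
```

A table of `(sentiment, expected node)` pairs is usually enough to cover every branch, including the catch-all.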

Detailed Guides

LangChain Patterns

  • Chain composition
  • Memory and context
  • Tools and agents
  • RAG implementation

LangGraph Workflows

  • State machines
  • Conditional routing
  • Multi-agent systems
  • Complex orchestration

Prompt Engineering

  • Effective prompt design
  • Few-shot learning
  • Chain-of-thought prompting
  • Prompt templates

Model Deployment

  • Local model serving
  • API integration
  • Optimization and caching
  • Cost management

Testing AI Systems

  • Unit testing with mocks
  • Integration testing
  • Prompt testing
  • Evaluation metrics

Quick Reference

Common LangChain Imports

from langchain_openai import ChatOpenAI, OpenAI
from langchain_anthropic import ChatAnthropic
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.output_parsers import PydanticOutputParser, StrOutputParser
from langchain.chains import LLMChain, SequentialChain  # legacy chain classes
from langchain.memory import ConversationBufferMemory  # legacy memory class
from langchain.agents import create_openai_functions_agent, AgentExecutor
from langchain_core.tools import Tool
from langchain_community.callbacks import get_openai_callback

Common LangGraph Imports

from langgraph.graph import StateGraph, END
from typing import TypedDict, Annotated
from operator import add

FastAPI + Streaming

from fastapi.responses import StreamingResponse

async def stream_response(inputs: dict):
    async for chunk in chain.astream(inputs):
        yield f"data: {chunk}\n\n"

# Inside a route handler:
#     return StreamingResponse(stream_response(inputs), media_type="text/event-stream")


Remember: AI features require careful prompt design, error handling, and monitoring. Always validate LLM outputs, implement retry logic, and track costs!