返回 Skill 列表
extension
分类: 内容与媒体无需 API Key

glm-47-prompting

有效提示GLM 4.7的最佳实践。当用户询问关于GLM提示、GLM 4.7最佳实践、Cerebras API设置、Ollama GLM设置、优化GLM提示或使用Z.AI的开源编码模型时使用。涵盖指令放置、推理控制、多代理模式以及部署(云端/本地)。

person作者: jakexiaohubgithub

GLM 4.7 Prompting Best Practices

GLM 4.7 is Z.AI's strongest open-source coding model (~358B params, ~32B active per token via MoE). This skill covers 10 rules for optimal prompting.

Deployment Options


Prompt Structure

Rule 1: Front-load instructions

GLM 4.7 has a strong bias toward the beginning of the prompt (more so than other models). Place all mandatory instructions at the absolute start of your system prompt.

  • Output quality degrades at extreme context lengths
  • Think tags reinforce earlier instructions
  • Critical directives MUST appear first

Rule 2: Use firm, direct language

GLM 4.7 responds best to firm language that removes ambiguity. Use strong directives.

Do:

Before writing any code, you MUST first read and fully comprehend the architecture.md file. All code you generate must STRICTLY conform to the existing patterns.

Don't:

Please read and follow my architecture.md...

Keywords that work: MUST, STRICTLY, NEVER, ALWAYS, REQUIRED

Rule 3: Specify a default language

GLM 4.7 is multilingual and can switch languages unexpectedly. Add to your system prompt:

Always respond in English.

Without this, the model may output reasoning traces in Chinese on the first turn.


Task Design

Rule 4: Leverage role-play

GLM 4.7 excels at maintaining roles and personas. Its internal thinking blocks mirror role prompts closely.

Example:

You are acting as a senior security engineer conducting a code review. You MUST identify all potential vulnerabilities, focusing on injection attacks, authentication bypasses, and data exposure risks.

Use explicit personas or create multi-agent systems with distinct roles.

Rule 5: Break up the task

GLM 4.7 does NOT support interleaved thinking (unlike Claude/GPT). It performs a single reasoning pass per prompt before acting.

Instead of: "Refactor this codebase"

Do:

  1. List all dependencies and their versions
  2. Propose the new directory structure
  3. Generate migration scripts for each module
  4. Verify each migration compiles

This incremental approach matches GLM's execution-first tendencies.


Reasoning Control

Rule 6: Disable reasoning for simple tasks

GLM 4.7 often includes verbose internal thought blocks. For straightforward tasks, this slows down responses.

Methods to minimize:

  • API: Set disable_reasoning: true
  • Prompt: "Skip reasoning for straightforward tasks"
  • Use structured outputs (JSON, lists) that discourage verbose reasoning
  • Set clear_thinking: true to remove internal state between turns
  • Set appropriate max_completion_tokens limits

Rule 7: Enable reasoning for complex tasks

For complex problem-solving, reasoning becomes valuable.

Methods to enhance:

  • API: Ensure disable_reasoning: false (or omit)
  • Prompt: "Think step by step before answering"
  • Use chain-of-thought examples showing the reasoning process

Rule 8: Use clear_thinking to control memory

Controls internal thinking state across calls:

| Setting | Use Case | |---------|----------| | clear_thinking: false | Agent loops, multi-step plans, coding sessions | | clear_thinking: true | One-off calls, batch jobs, when seeing drift |


Multi-Agent Patterns

Rule 9: Use critic agents

Employ specialized sub-agents to review outputs before advancing. Decouple generation from validation.

Critic types:

| Agent | Focus | |-------|-------| | Code Review | SOLID/DRY/YAGNI principles, maintainability | | QA Expert | User flows, edge cases, integration points | | Security Review | Vulnerabilities, unsafe patterns, compliance | | Performance Audit | Bottlenecks, inefficient algorithms, resource leaks |

Each critic gets a focused persona (leverages Rule 4).

Rule 10: Pair with frontier models

GLM 4.7 may fall short on the toughest 10% of use cases. Three patterns:

  1. Router: Route simple tasks to GLM 4.7, fall back to slower models for complex queries
  2. Backbone: Use GLM 4.7 as fast backbone, loop in frontier models only when needed
  3. Plan-execute: Use Claude/GPT to create plan, execute rapidly with GLM 4.7

Leverage GLM's 17x speed advantage for the majority of tasks.