Category: Other; requires API Key

Codex PPT

Generate visually unified, image-based PPT/PPTX decks from articles, reports, papers, notes, or outlines.

Author: ningzimu (clawhub)

Overview

This skill creates image-based PowerPoint decks from source material. Each slide is a complete 16:9 generated image. Final images are assembled into .pptx with scripts/assemble_ppt.py.
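A minimal sketch of how a finished slide image could be checked for the 16:9 shape before assembly. It reads only the PNG header, so it needs no imaging library. The helper names (`png_size`, `is_16_by_9`, `fake_png_header`) and the 1% tolerance are assumptions made for illustration; they are not part of scripts/assemble_ppt.py.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_size(data: bytes) -> tuple[int, int]:
    """Return (width, height) read from the IHDR chunk of PNG bytes."""
    if len(data) < 24 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # Width and height are big-endian 32-bit integers at offsets 16 and 20.
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def is_16_by_9(width: int, height: int, tolerance: float = 0.01) -> bool:
    """True when width/height is within `tolerance` of 16/9 (assumed tolerance)."""
    if height <= 0:
        return False
    return abs(width / height - 16 / 9) <= tolerance


def fake_png_header(width: int, height: int) -> bytes:
    """Build just enough PNG bytes (signature plus IHDR start) for a demo."""
    return (
        PNG_SIGNATURE
        + struct.pack(">I", 13)  # IHDR data length is always 13 bytes
        + b"IHDR"
        + struct.pack(">II", width, height)
    )


if __name__ == "__main__":
    w, h = png_size(fake_png_header(1920, 1080))
    print(w, h, is_16_by_9(w, h))
```

For a real slide, pass the first 24 bytes of origin_image/slide_XX.png to `png_size` instead of the demo header.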

Use this when the user wants a visually unified presentation and accepts full-slide image pages. Do not use it when every textbox, chart, or shape must remain separately editable.

Prefer the built-in image generation/editing tool. Use scripts/image_gen.py only when the built-in backend is unavailable, lacks a required capability, or the user explicitly asks for API/CLI mode.
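The preference rule above can be written as a small decision function. This is a sketch only: `BackendCheck`, `choose_backend`, and the labels "builtin" and "cli" are hypothetical names, and the real decision rules and confirmation text live in docs/backend-selection.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackendCheck:
    """Facts gathered before asking the user to confirm a backend (assumed fields)."""

    builtin_callable: bool        # the built-in image tool can be called
    builtin_has_capability: bool  # it supports what this deck needs, e.g. editing
    user_requested_cli: bool      # the user explicitly asked for API/CLI mode


def choose_backend(check: BackendCheck) -> tuple[str, str]:
    """Return (backend_label, reason). Labels are illustrative, not official."""
    if check.user_requested_cli:
        return "cli", "user explicitly asked for API/CLI mode"
    if not check.builtin_callable:
        return "cli", "built-in backend is unavailable"
    if not check.builtin_has_capability:
        return "cli", "built-in backend lacks a required capability"
    return "builtin", "built-in image tool is available and sufficient"


if __name__ == "__main__":
    print(choose_backend(BackendCheck(True, True, False)))
```

Returning the reason alongside the label keeps the choice explainable when the backend is presented to the user for confirmation.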

Hard Constraints

  • Read the relevant Reference Map files before each phase. This file is the orchestration contract; detailed rules live in docs/ and worker prompts in prompts/.
  • Respect approval gates. Do not create final deck_spec.json, speech.md, prompt jobs, slide images, or .pptx before the approvals in docs/workflow-gates-and-progress.md.
  • After the user approves the sample slide and authorizes full-deck generation, every remaining slide image job must be dispatched to a slide subagent whenever subagents are available.
  • The main agent owns orchestration, prompt jobs, state recording, QA, speaker notes, and assembly. Do not silently replace available slide subagents with sequential production.
  • Every final origin_image/slide_XX.png must be generated by the selected image backend: built-in image generation/editing tool or scripts/image_gen.py.
  • Local drawing, Pillow, SVG, HTML/CSS/canvas screenshots, python-pptx/PptxGenJS layouts, and manual overlays are failure modes, not fallbacks.
  • The selected image backend must stay fixed after backend confirmation. Do not let subagents switch backend for convenience.
  • After sample approval, record how the approved sample was generated and pass that exact method to every slide subagent.
  • Slide dispatch and result state must be recorded with the bundled scripts. Chat messages alone do not make a slide dispatched or complete.
  • If a required subagent, image backend, or required-image path is unavailable, stop and report a blocker with the slide id and evidence. Do not create a lower-quality replacement.

Visible Progress

For non-trivial decks, keep a user-visible checklist with one active step. Canonical completion evidence is in docs/workflow-gates-and-progress.md.

Default visible steps:

  1. Prepare source, outline, style, and backend decisions.
  2. Generate and approve one sample slide.
  3. Prepare slide jobs and slide state.
  4. Dispatch slide subagents.
  5. Record generated slide results.
  6. QA, repair, notes, and PPT assembly.

Do not mark a step complete from chat alone; use real files or script-recorded state.
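As an illustration of "real files, not chat", the sketch below marks a step complete only when its evidence files exist, and shows exactly one active step. The function name `render_checklist` and the step-to-file mapping passed to it are assumptions; the canonical completion evidence is defined in docs/workflow-gates-and-progress.md.

```python
from pathlib import Path


def render_checklist(project: Path, steps: list[tuple[str, list[str]]]) -> list[str]:
    """Render steps as '[x]' done, '[>]' active, '[ ]' pending.

    `steps` pairs a step title with the files (relative to `project`) that
    count as its evidence. The mapping is hypothetical and supplied by the caller.
    """
    lines = []
    active_found = False
    for title, files in steps:
        # A step with no evidence files is never treated as complete.
        complete = bool(files) and all((project / f).is_file() for f in files)
        if complete and not active_found:
            mark = "[x]"
        elif not active_found:
            mark = "[>]"  # first incomplete step is the single active one
            active_found = True
        else:
            mark = "[ ]"  # later steps stay pending, even if files exist
        lines.append(f"{mark} {title}")
    return lines


if __name__ == "__main__":
    demo_steps = [
        ("Prepare source, outline, style, and backend decisions", ["outline.md"]),
        ("Prepare slide jobs and slide state", ["slide_jobs.json"]),
    ]
    print("\n".join(render_checklist(Path("."), demo_steps)))
```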

Default Workflow

  1. Understand the source content.

    • Identify topic, audience, goal, page count, style/brand constraints, and sections to include or exclude.
    • If no page count is specified, choose a practical count. Typical decks are 8-12 slides.
  2. Plan the deck outline.

    • Before writing or updating outline.md, read docs/workflow-gates-and-progress.md and docs/outline-style-and-sample.md.
    • Draft slide roles and required source images. Ask for confirmation, then stop before style, backend, sample, or downstream artifacts until approved.
  3. Confirm a unified visual style.

    • Before offering style options or using files from references/, read docs/outline-style-and-sample.md.
    • Offer 2-3 concrete style directions, recommend one, wait for confirmation, then keep one visual identity while varying layouts by page role.
  4. Confirm the image backend.

    • Before generating any slide image, read docs/backend-selection.md.
    • Check whether a built-in image tool is callable, state what you checked, name the backend, explain fallback status, and wait for confirmation.
    • If CLI/API fallback is selected, read docs/cli-api-fallback.md. Read docs/image-model-configuration.md only after config errors or explicit API-setting requests.
  5. Generate one sample slide for approval.

    • Before generating or approving the sample slide, read docs/outline-style-and-sample.md.
    • Generate exactly one representative sample after outline, style, and backend are confirmed. Do not generate the full deck until approved.
    • After approval, record sample_generation_method in deck_spec.json so jobs and subagents inherit the same path.
  6. Create the project directory.

    • Before initializing folders or assembling files, read docs/project-assembly-and-reporting.md.
    • If no destination is specified, use the current working directory or the source file directory.
  7. Prepare user-supplied assets.

    • Before using paper figures, charts, screenshots, logos, or other required assets, read docs/user-supplied-assets.md.
    • Treat required assets as strict inputs and confirm slide-to-asset mapping before generation.
  8. Generate all slide images.

    • Before full-deck image generation, read docs/slide-generation-and-subagents.md.
    • Create per-slide jobs with scripts/prepare_slide_prompts.py or saved prompts/slide_XX.json files.
    • Every final image must come from the selected backend and be recorded with bundled state scripts.
  9. Dispatch slide subagents.

    • Before dispatching or replacing slide workers, read docs/slide-generation-and-subagents.md and prompts/slide-worker.md.
    • Use one subagent per remaining slide job whenever possible. If required subagents cannot be spawned, stop and report a blocker unless the user changes the workflow.
  10. Quality check and repair.

    • Before QA or assembly, read docs/project-assembly-and-reporting.md.
    • Inspect every slide before assembly: text, outline match, truncation, style, unwanted page numbers, overlaps, and required assets.
    • Regenerate severe failures with a tighter prompt. Use backend editing for localized issues when available.
    • For CLI/API fallback edit commands, read docs/cli-api-fallback.md. Replace the final slide only after validating the edited output.
  11. Write speaker notes and assemble the PPT.

    • Before writing speech.md or running assembly, read docs/project-assembly-and-reporting.md.
    • Make sure outline.md reflects the final confirmed deck outline. Use speech.md headings that map to Slide N.
    • Before assembly, ensure slide_jobs.json shows every generated slide as recorded and every approved sample as accepted. If any slide is still pending, dispatched, or blocked, stop.
  12. Report the result.

    • Use the final report checklist in docs/project-assembly-and-reporting.md.
    • Include paths, slide count, backend used, recorded-result status, and any limitations or blockers.
  13. Save reusable styles when requested.

    • If asked to save the current deck style or a supplied image/PDF/PPT/PPTX style, read docs/style-library.md.
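The pre-assembly gate in step 11 could be sketched as follows. The JSON layout used here (a top-level "slides" list whose entries carry "id" and "status") is an assumption for illustration; the actual schema of slide_jobs.json is defined by the bundled scripts. Only the status words themselves (pending, dispatched, blocked, recorded, accepted) come from this document.

```python
import json

# Statuses that allow assembly, per the workflow above.
READY_STATUSES = {"recorded", "accepted"}


def assembly_blockers(jobs_json: str) -> list[str]:
    """List 'slide_id: status' for every slide that is not ready for assembly.

    Assumes a hypothetical layout: {"slides": [{"id": ..., "status": ...}]}.
    """
    jobs = json.loads(jobs_json)
    return [
        f"{slide['id']}: {slide.get('status', 'missing')}"
        for slide in jobs["slides"]
        if slide.get("status") not in READY_STATUSES
    ]


if __name__ == "__main__":
    sample = json.dumps(
        {
            "slides": [
                {"id": "slide_01", "status": "accepted"},
                {"id": "slide_02", "status": "recorded"},
                {"id": "slide_03", "status": "dispatched"},
            ]
        }
    )
    blockers = assembly_blockers(sample)
    print("stop:" if blockers else "ok to assemble", blockers)
```

An empty list means assembly may proceed; any entry is a reason to stop and report.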

Subagent Dispatch

Slide subagents are mandatory after sample approval whenever the runtime can spawn them. The main agent prepares jobs and records state; each worker handles exactly one prompts/slide_XX.json job and returns only the selected image path, the backend used, and a QA note.

Use docs/slide-generation-and-subagents.md for dispatch, commands, result recording, blockers, and backend provenance. Use prompts/slide-worker.md as the handoff template.

Subagents must not edit outline.md, deck_spec.json, other slide jobs, origin_image/, speech.md, or the final .pptx. The parent records outputs and assembles.
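One way the parent might check a worker's return value before recording it is sketched below. The field names `image_path`, `backend`, and `qa_note` and the function `validate_worker_result` are hypothetical; the real handoff format is given in prompts/slide-worker.md.

```python
# Assumed field names for the three items a worker is allowed to return.
ALLOWED_KEYS = {"image_path", "backend", "qa_note"}


def validate_worker_result(result: dict, confirmed_backend: str) -> list[str]:
    """Return a list of problems; an empty list means the result can be recorded."""
    problems = []
    extra = sorted(set(result) - ALLOWED_KEYS)
    missing = sorted(ALLOWED_KEYS - set(result))
    if extra:
        problems.append(f"unexpected fields: {', '.join(extra)}")
    if missing:
        problems.append(f"missing fields: {', '.join(missing)}")
    # The backend is fixed after confirmation, so a mismatch is a problem.
    if "backend" in result and result["backend"] != confirmed_backend:
        problems.append(
            f"backend '{result['backend']}' differs from confirmed '{confirmed_backend}'"
        )
    return problems


if __name__ == "__main__":
    good = {"image_path": "work/slide_03.png", "backend": "builtin", "qa_note": "text ok"}
    print(validate_worker_result(good, "builtin"))
```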

Acceptance Criteria

  • Output is a valid .pptx.
  • Each expected final slide image exists under origin_image/slide_XX.png.
  • Every final slide image was generated by the confirmed backend and recorded through record_slide_result.py, except for an approved sample that the run state marks as accepted.
  • outline.md reflects the approved deck outline.
  • speech.md exists when speaker notes are expected, and assembly writes those notes into the PPT.
  • slide_jobs.json and slide_run_state.json reflect the final state.
  • Required source images are visibly represented, or a blocker is reported.
  • If blocked, the final response identifies phase, slide id, evidence path, and unfinished reason; do not call the deck complete.
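Two of the criteria above lend themselves to a quick check: that every expected slide image exists, and that a blocker message names the phase, slide id, evidence path, and reason. The sketch assumes two-digit zero-padded slide ids (slide_01, slide_02, ...) as one reading of the slide_XX pattern; the function names and the report format are illustrative only.

```python
from pathlib import Path


def missing_slide_images(project: Path, slide_count: int) -> list[str]:
    """Return ids of slides whose origin_image/slide_XX.png is absent.

    Assumes XX is a two-digit, zero-padded, 1-based slide number.
    """
    expected = [f"slide_{n:02d}" for n in range(1, slide_count + 1)]
    return [
        slide_id
        for slide_id in expected
        if not (project / "origin_image" / f"{slide_id}.png").is_file()
    ]


def blocker_report(phase: str, slide_id: str, evidence_path: str, reason: str) -> str:
    """Format the four facts a blocked run must report (format is hypothetical)."""
    return (
        f"BLOCKED phase={phase} slide={slide_id} "
        f"evidence={evidence_path} reason={reason}"
    )


if __name__ == "__main__":
    for slide_id in missing_slide_images(Path("."), 3):
        print(blocker_report("assembly", slide_id, "slide_jobs.json", "image not found"))
```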

Reference Map

  • docs/workflow-gates-and-progress.md: approval gates, progress, completion evidence.
  • docs/backend-selection.md: backend decision rules and confirmation text.
  • docs/outline-style-and-sample.md: outline, style, sample rules, prompt examples.
  • docs/user-supplied-assets.md: strict handling for required source assets.
  • docs/slide-generation-and-subagents.md: jobs, dispatch, result recording, blockers, provenance.
  • docs/cli-api-fallback.md: fallback runtime, generation/edit commands, image limits, troubleshooting.
  • docs/image-model-configuration.md: API key, base URL, model, .env; read only when config is needed.
  • docs/project-assembly-and-reporting.md: project directory, notes, assembly, final report, prompting principles.
  • prompts/slide-worker.md: slide subagent handoff template.
  • references/*.md: visual style references.

Documentation and Updates

For source, docs, install, config, and examples, see ningzimu/codex-ppt-skill.