返回 Skill 列表
extension
分类: 其它需要 API Key

Search Console Report

根据Google Search Console数据生成全面的SEO分析报告,支持PDF导出。适用于用户需要分析搜索表现、获取SEO洞察时使用。

person作者: lgx-00hubclawhub

Search Console SEO Report Generator

Generate professional, chart-rich PDF reports from Google Search Console data. Covers traffic trends, top pages, top keywords, country/device distribution, growth analysis, and actionable SEO recommendations.

Prerequisites

Before running this skill, verify these requirements:

1. Service Account Key File

You need a Google Cloud Service Account JSON key file with access to the Search Console properties. The file looks like:

{
  "type": "service_account",
  "project_id": "...",
  "private_key_id": "...",
  "private_key": "-----BEGIN PRIVATE KEY-----\n...",
  "client_email": "xxx@project.iam.gserviceaccount.com",
  "token_uri": "https://oauth2.googleapis.com/token",
  ...
}

Ask the user for the path to their key file. Common locations: ~/Downloads/*.json, project directory.

If the user doesn't have one yet, guide them through:

  1. Create a Service Account in Google Cloud Console (IAM & Admin > Service Accounts)
  2. Create a JSON key for it (Keys tab > Add Key > JSON)
  3. Add the service account email as a user in Search Console (Settings > Users and permissions > Add user, "Restricted" permission is sufficient)
  4. Enable the "Google Search Console API" in the project's API Library

2. Python Environment

The script requires these packages: pyjwt, cryptography, requests, matplotlib, pandas, reportlab.

Set up a virtual environment to avoid system conflicts:

python3 -m venv /tmp/sc-env
/tmp/sc-env/bin/pip install pyjwt cryptography requests matplotlib pandas reportlab

IMPORTANT: Always use /tmp/sc-env/bin/python to run scripts, not the system Python.

Timeout warning: Package installation and first matplotlib import can be slow (60-120s). Set bash timeout to 180000ms for these operations.

3. Chinese Font for PDF (macOS)

The PDF uses STHeiti for proper CJK + Latin + symbol rendering. Register it like this:

from reportlab.pdfbase import pdfmetrics
from reportlab.pdfbase.ttfonts import TTFont

pdfmetrics.registerFont(TTFont('CNFont', '/System/Library/Fonts/STHeiti Medium.ttc', subfontIndex=0))
pdfmetrics.registerFont(TTFont('CNFontLight', '/System/Library/Fonts/STHeiti Light.ttc', subfontIndex=0))
pdfmetrics.registerFontFamily('CNFont', normal='CNFontLight', bold='CNFont')

CRITICAL font rules:

  • Do NOT use UnicodeCIDFont('STSong-Light') — it causes English letter spacing to be too narrow and Unicode symbols like (U+2022) to render as garbage characters (e.g. "煉").
  • Always use TrueType fonts registered via TTFont for proper mixed CJK/Latin rendering.
  • On non-macOS systems, find an available CJK TTF font: fc-list :lang=zh file or look for Noto Sans CJK / WenQuanYi.

4. Matplotlib Font for Charts (macOS)

CRITICAL: STHeiti alone does NOT cover Korean glyphs — keywords or country names in Korean will show as hollow rectangles (□). Use Arial Unicode MS instead, which covers CJK + Korean + most other scripts:

import matplotlib.font_manager as fm

matplotlib.rcParams['font.family'] = 'sans-serif'
matplotlib.rcParams['axes.unicode_minus'] = False

_unicode_font_path = '/Library/Fonts/Arial Unicode.ttf'
if not os.path.exists(_unicode_font_path):
    _unicode_font_path = '/System/Library/Fonts/STHeiti Medium.ttc'  # fallback
fm.fontManager.addfont(_unicode_font_path)
_ufname = fm.FontProperties(fname=_unicode_font_path).get_name()
matplotlib.rcParams['font.sans-serif'] = [_ufname, 'DejaVu Sans']

Do NOT try to combine STHeiti + AppleSDGothicNeo via font.sans-serif list — matplotlib uses a single font per render pass and does not do per-glyph fallback, so the list only helps if glyphs exist in the first font.

Step-by-Step Instructions

Step 1: Gather Input from User

Ask for or determine:

  • Key file path: Path to the Service Account JSON key file
  • Site URLs: One or more Search Console property URLs (format: https://www.example.com/)
  • Date range: Default to last 90 days. The user may request a custom range.
  • Output path: Where to save the PDF and data files. Default to project directory.
  • Language: Report can be in Chinese (default) or English — match the user's language.

Step 2: Authenticate with Google API

Use JWT-based Service Account authentication. Here is the exact authentication code:

import json, time, jwt, requests

def get_access_token(key_file):
    with open(key_file) as f:
        creds = json.load(f)
    
    now = int(time.time())
    payload = {
        "iss": creds["client_email"],
        "scope": "https://www.googleapis.com/auth/webmasters.readonly",
        "aud": creds["token_uri"],
        "iat": now,
        "exp": now + 3600,
    }
    signed_jwt = jwt.encode(payload, creds["private_key"], algorithm="RS256")
    
    resp = requests.post(creds["token_uri"], data={
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "assertion": signed_jwt,
    })
    resp.raise_for_status()
    return resp.json()["access_token"]

Error handling: If authentication fails with 403, the API may not be enabled or the service account may not have Search Console access. Tell the user which to check.

Step 3: Fetch Data from Search Console API

Use the Search Analytics API endpoint for each site. The base query function:

import datetime

END_DATE = datetime.date.today() - datetime.timedelta(days=3)  # Data has ~3 day lag
START_DATE = END_DATE - datetime.timedelta(days=89)

def query_sc(token, site_url, dimensions, start=None, end=None, row_limit=100):
    """Query Search Console Search Analytics API.
    
    Args:
        token: OAuth2 access token
        site_url: Full property URL, e.g. "https://www.example.com/"
        dimensions: List of dimensions. Valid values:
            - "date"    — daily breakdown
            - "query"   — search keywords
            - "page"    — page URLs
            - "country" — ISO 3166-1 alpha-3 country codes (lowercase)
            - "device"  — "DESKTOP", "MOBILE", "TABLET"
            - "searchAppearance" — rich result types
            Can combine: ["query", "page"] for keyword-page matrix
        start: Start date (datetime.date). Defaults to START_DATE.
        end: End date (datetime.date). Defaults to END_DATE.
        row_limit: Max rows (max 25000).
    
    Returns:
        List of row dicts with keys: "keys" (list), "clicks", "impressions", "ctr", "position"
        Note: "ctr" is a decimal (0.05 = 5%), multiply by 100 for display.
    """
    url = f"https://www.googleapis.com/webmasters/v3/sites/{requests.utils.quote(site_url, safe='')}/searchAnalytics/query"
    body = {
        "startDate": (start or START_DATE).isoformat(),
        "endDate": (end or END_DATE).isoformat(),
        "dimensions": dimensions,
        "rowLimit": row_limit,
    }
    resp = requests.post(url, headers={"Authorization": f"Bearer {token}"}, json=body)
    resp.raise_for_status()
    return resp.json().get("rows", [])

For each site, fetch ALL of the following data (this is the complete list — do not skip any):

| # | Query | Dimensions | row_limit | Purpose | |---|-------|-----------|-----------|---------| | 1 | Daily traffic trend | ["date"] | 25000 | Time series for charts | | 2 | Top pages | ["page"] | 50 | Most visited pages | | 3 | Top queries | ["query"] | 50 | Most searched keywords | | 4 | Country distribution | ["country"] | 30 | Geographic breakdown | | 5 | Device distribution | ["device"] | 10 | Desktop/Mobile/Tablet split | | 6 | Search appearance | ["searchAppearance"] | 20 | Rich result types | | 7 | Query-page combos | ["query", "page"] | 100 | Which keywords drive which pages | | 8 | Period comparison (first half) | ["page"] with first-half dates | 500 | Growth analysis | | 9 | Period comparison (second half) | ["page"] with second-half dates | 500 | Growth analysis |

Period comparison logic: Split the date range in half. For each page URL, compare clicks between the two halves. Categorize pages as:

  • Growing: clicks increased (sort by change descending)
  • Declining: clicks decreased (sort by change ascending)
  • New: appeared only in the second half
  • Lost: appeared only in the first half

Step 4: Calculate Summary Statistics

For each site, compute:

daily = site_data["daily_trend"]
total_clicks = sum(d["clicks"] for d in daily)
total_impressions = sum(d["impressions"] for d in daily)
avg_ctr = sum(d["ctr"] for d in daily) / len(daily)  # Already *100 if you stored it that way
avg_position = sum(d["position"] for d in daily) / len(daily)

# Trend: compare last 30 days vs first 30 days
if len(daily) >= 60:
    first_30_clicks = sum(d["clicks"] for d in daily[:30])
    last_30_clicks = sum(d["clicks"] for d in daily[-30:])
    click_trend_pct = ((last_30_clicks - first_30_clicks) / max(first_30_clicks, 1)) * 100
    # Same for impressions

Step 5: Save Raw Data as JSON

Save all fetched data to sc_detailed_data.json for reproducibility:

with open(f"{output_dir}/sc_detailed_data.json", "w") as f:
    json.dump(all_data, f, ensure_ascii=False, indent=2)

Step 6: Generate Charts with Matplotlib

IMPORTANT: Always set matplotlib.use('Agg') BEFORE importing pyplot (no display server available).

Generate these charts (save as PNG, dpi=150):

Chart 1: Combined Traffic Trend (all sites)

  • 2-row subplot: top = daily clicks, bottom = daily impressions
  • One line per site, color-coded
  • X-axis: dates formatted as %m-%d, rotated 45 degrees
  • Legend in upper-left

Chart 2: Per-site Detail (one per site with enough data)

  • 2-row subplot: top = daily clicks with 7-day moving average, bottom = average position (inverted Y-axis — lower is better)
  • Fill area under clicks line with alpha=0.3
# 7-day moving average calculation
if len(clicks) >= 7:
    ma7 = [sum(clicks[max(0,i-6):i+1]) / min(7, i+1) for i in range(len(clicks))]

Chart 3: Device Distribution

  • 1-row, N-column pie charts (one per site)
  • Show percentage labels

Chart 4: Country Distribution (pie chart per site)

  • Use figsize=(10, 7) and radius=0.9 — smaller sizes make labels illegible
  • Place on its own full-width row in the PDF (do NOT put side-by-side with keyword chart)
  • In PDF, render at width=16*cm
  • Top 8 countries by clicks
  • Map country codes to readable names using this mapping:
COUNTRY_NAMES = {
    'idn': 'Indonesia', 'hkg': 'Hong Kong', 'mac': 'Macau', 'kor': 'South Korea',
    'usa': 'United States', 'jpn': 'Japan', 'sgp': 'Singapore', 'mys': 'Malaysia',
    'twn': 'Taiwan', 'tha': 'Thailand', 'phl': 'Philippines', 'ind': 'India',
    'vnm': 'Vietnam', 'gbr': 'United Kingdom', 'deu': 'Germany', 'fra': 'France',
    'aus': 'Australia', 'can': 'Canada', 'bra': 'Brazil', 'mex': 'Mexico',
    'chn': 'China', 'pak': 'Pakistan', 'bgd': 'Bangladesh', 'lka': 'Sri Lanka',
    'mmr': 'Myanmar', 'khm': 'Cambodia', 'npl': 'Nepal', 'are': 'UAE',
    'sau': 'Saudi Arabia', 'tur': 'Turkey', 'egy': 'Egypt', 'nga': 'Nigeria',
    'ken': 'Kenya', 'zaf': 'South Africa', 'col': 'Colombia', 'arg': 'Argentina',
    'per': 'Peru', 'chl': 'Chile', 'nzl': 'New Zealand', 'ita': 'Italy',
    'esp': 'Spain', 'nld': 'Netherlands', 'rus': 'Russia', 'pol': 'Poland',
}

Step 7: Generate PDF Report

Use reportlab with A4 page size. The report has 7 sections:

PDF Structure

Cover Page
  - Report title (in user's language)
  - Subtitle: "Google Search Console Data Analysis & Recommendations"
  - Report date, data range, data source, covered sites

Section 1: Executive Summary
  - Summary table (all sites: clicks, impressions, avg CTR, avg position, trends)
  - Key findings (5-6 bullet points highlighting most important insights)

Section 2: Traffic Trends
  - Combined traffic trend chart (all sites)
  - Per-site detail charts (clicks + position)

Section 3: Top Pages (TOP 10)
  - Table per site: rank, page path, clicks, impressions, CTR, position
  - Shorten long URLs: if > 45 chars, truncate with "..."

Section 4: Top Keywords (TOP 15)
  - Table per site: rank, keyword, clicks, impressions, CTR, position

Section 5: Country & Device Distribution
  - Device pie charts
  - Country bar charts (for major sites)
  - Country tables (all sites, top 10)

Section 6: Growth Analysis
  - Per site: growing pages table (green header), declining pages table (red header)
  - New page count, lost page count

Section 7: Recommendations & Action Plan
  - AI-generated recommendations based on the data (see analysis guidelines below)
  - Priority action table (P0/P1/P2)

PDF Style Configuration

Use 1.2cm margins (not the default 2cm) to maximize content area. Usable width = 21cm - 2×1.2cm = 18.6cm. Set all chart/table widths to 18.6cm accordingly.

from reportlab.lib.pagesizes import A4
from reportlab.lib.units import mm, cm
from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
from reportlab.lib.enums import TA_CENTER
from reportlab.platypus import (SimpleDocTemplate, Paragraph, Spacer, Table,
                                 TableStyle, Image, PageBreak, HRFlowable)
from reportlab.lib import colors as rl_colors

doc = SimpleDocTemplate(pdf_path, pagesize=A4,
                        leftMargin=1.2*cm, rightMargin=1.2*cm,
                        topMargin=1.2*cm, bottomMargin=1.2*cm)

# Define styles — use 'CNFont' (the registered STHeiti font)
styles = getSampleStyleSheet()
styles.add(ParagraphStyle(name='CNTitle', fontName='CNFont', fontSize=22,
                          alignment=TA_CENTER, spaceAfter=6*mm, leading=28))
styles.add(ParagraphStyle(name='CNSubtitle', fontName='CNFont', fontSize=12,
                          alignment=TA_CENTER, textColor=rl_colors.grey, spaceAfter=10*mm))
styles.add(ParagraphStyle(name='CNHeading1', fontName='CNFont', fontSize=16,
                          spaceAfter=4*mm, spaceBefore=8*mm, leading=22,
                          textColor=rl_colors.HexColor('#1a73e8')))
styles.add(ParagraphStyle(name='CNHeading2', fontName='CNFont', fontSize=13,
                          spaceAfter=3*mm, spaceBefore=5*mm, leading=18,
                          textColor=rl_colors.HexColor('#333333')))
styles.add(ParagraphStyle(name='CNBody', fontName='CNFont', fontSize=10,
                          spaceAfter=2*mm, leading=16))
styles.add(ParagraphStyle(name='CNSmall', fontName='CNFont', fontSize=8,
                          textColor=rl_colors.grey, leading=12))
styles.add(ParagraphStyle(name='CNBullet', fontName='CNFont', fontSize=10,
                          spaceAfter=1.5*mm, leading=16, leftIndent=10*mm,
                          bulletIndent=5*mm))

Table Style Template

Use this consistent style for all data tables:

table_style = TableStyle([
    ('FONTNAME', (0,0), (-1,-1), 'CNFont'),
    ('FONTSIZE', (0,0), (-1,-1), 7),        # Small font for dense data
    ('BACKGROUND', (0,0), (-1,0), rl_colors.HexColor('#1a73e8')),  # Blue header
    ('TEXTCOLOR', (0,0), (-1,0), rl_colors.white),
    ('ALIGN', (2,0), (-1,-1), 'RIGHT'),     # Numbers right-aligned
    ('ALIGN', (0,0), (0,-1), 'CENTER'),     # Rank column centered
    ('GRID', (0,0), (-1,-1), 0.5, rl_colors.HexColor('#dddddd')),
    ('ROWBACKGROUNDS', (0,1), (-1,-1), [rl_colors.white, rl_colors.HexColor('#f8f9fa')]),
    ('TOPPADDING', (0,0), (-1,-1), 2),
    ('BOTTOMPADDING', (0,0), (-1,-1), 2),
])

Bullet Points

Use Unicode bullet character \u2022 (•) for list items:

story.append(Paragraph(f"\u2022 {text}", styles['CNBullet']))

This renders correctly with STHeiti font. Do NOT use other bullet approaches.

Step 8: Generate SEO Recommendations

Analyze the data and generate recommendations following these guidelines:

Analysis Framework

  1. CTR Analysis: If average CTR < 5%, recommend Title/Description optimization.
  2. Position Opportunities: Find keywords ranking 5-15 (page 1-2 boundary) — these are low-hanging fruit for optimization.
  3. Country Focus: Identify the #1 traffic country and recommend localized content.
  4. Growth Momentum: Sites with click growth > 100% are in "breakout" phase — recommend increasing content investment.
  5. New Sites: Sites with < 30 days of data need basic SEO foundations (Sitemap submission, internal linking).
  6. Declining Pages: If many pages have declining clicks, recommend content quality audit.
  7. Device Split: If mobile > 60%, emphasize mobile optimization and Core Web Vitals.
  8. Technical SEO: Always recommend hreflang for multi-region sites, 404 fixes, and page speed optimization.

Priority Classification

  • P0 (Do immediately): CTR optimization, fixing unindexed pages
  • P1 (Do this month): Content localization, keyword optimization for positions 5-15
  • P2 (Plan for next quarter): hreflang implementation, Core Web Vitals, Sitemap tuning

Step 9: Present Results

After generating the PDF:

  1. Confirm the PDF file path to the user
  2. Provide a summary in chat covering:
    • Report structure (sections and page count)
    • Key highlights per site (1-2 sentences each)
    • Top 3 priority recommendations
  3. Mention the raw data JSON file path for further analysis
  4. Offer next steps (e.g., "Do you want me to analyze a specific page or keyword in more detail?")

Common Errors and Solutions

| Error | Cause | Solution | |-------|-------|----------| | 403 Forbidden on API call | Service account not added to Search Console | Add the service account email as a user in Search Console settings | | 403 Google Search Console API has not been enabled | API not enabled | Enable it at https://console.cloud.google.com/apis/library/searchconsole.googleapis.com | | Empty rows in response | No data for that site/date range | Check if the site URL exactly matches the Search Console property (trailing slash matters!) | | jwt.encode error | Missing cryptography package | pip install cryptography | | PDF shows garbled Chinese | Wrong font | Use TTFont with STHeiti, NOT UnicodeCIDFont with STSong-Light | | Matplotlib timeout on first run | Building font cache | Set bash timeout to 180000ms; this only happens once | | MPLCONFIGDIR warning | No write access to ~/.matplotlib | Harmless; matplotlib creates a temp cache automatically |

Example Usage

User: "Help me generate an SEO report for my websites using Search Console" → Ask for key file path and site URLs, then run the full pipeline.

User: "Analyze search performance for example.com over the last 90 days and export to PDF" → Run with default 90-day range, generate full report.

User: "Compare search traffic between my 3 sites" → Run for all 3 sites, emphasize the comparison aspects in the summary table and trends chart.

Layout Rules (Lessons Learned)

Top Pages Table — Use Paragraph for Word Wrap

CRITICAL: The Page column contains long URLs that WILL overflow into adjacent columns if you use plain strings. Always wrap ALL table cells in Paragraph() objects:

style_cell = ParagraphStyle('Cell', fontName='CNFontLight', fontSize=9, leading=13, wordWrap='CJK')
style_cell_hd = ParagraphStyle('CellHd', fontName='CNFont', fontSize=9, leading=13, textColor=colors.white)

page_rows = [[Paragraph("Page", style_cell_hd), Paragraph("Clicks", style_cell_hd), ...]]
for r in rows:
    page_rows.append([Paragraph(url, style_cell), Paragraph(f"{clicks:,}", style_cell), ...])

Country Chart — Keep on Its Own Row

Do NOT put the country pie chart side-by-side with the keyword chart. The pie becomes too small to read. Always put it on a separate row at width=16*cm.

File Naming — Always Include Date

Output filename must include today's date to avoid overwriting previous reports:

OUTPUT = f"search_console_report_{datetime.date.today()}.pdf"

Per-Site Layout Order (One Page Per Site)

  1. KPI metrics row (Clicks / Impressions / CTR / Avg. Position)
  2. Top Keywords chart (full width, 18.6cm)
  3. Country pie chart (own row, 16cm)
  4. Top Pages table (with Paragraph cells)
  5. PageBreak()

Reference Script

See gen_report.py in this skill directory for the complete, production-ready implementation with all fixes applied.

Output Files

| File | Description | |------|-------------| | sc_detailed_data.json | Raw API data for all sites (reproducible) | | report_charts/*.png | Generated chart images (temp, /tmp/) | | project/reports/search_console_report_YYYY-MM-DD.pdf | Final PDF report (date-stamped, never overwritten) |