first-commit

2026-05-04 14:58:14 -04:00
commit a46764fb1b
1210 changed files with 233231 additions and 0 deletions
@@ -0,0 +1,254 @@
+---
+name: pptx-html-fidelity-audit
+description: Audit a python-pptx export against its source HTML deck, identify layout/content drift (footer overflow, cropped content, missing italic/em, lost styling, off-rhythm spacing), and re-export with strict footer-rail + cursor-flow layout discipline. Use this skill whenever the user has a .pptx that was generated from an HTML slide deck and asks to compare/audit/verify/fix the export, including phrases like "compare ppt with html", "fidelity audit", "fix the pptx", "ppt is cut off", "footer overlap", "italic missing in pptx", "re-export the deck", "pptx-html-fidelity-audit", or any case where an HTML → python-pptx export needs verification or repair. Also trigger when the user shows you a deck.html and a deck.pptx side by side and is debugging visual differences.
+triggers:
+  - "pptx fidelity"
+  - "pptx audit"
+  - "ppt 跑掉"
+  - "字型不對"
+  - "footer overlap"
+  - "verify pptx"
+  - "html to pptx"
+od:
+  mode: utility
+  scenario: engineering
+---
+
+# PPTX ↔ HTML Fidelity Audit
+
+A repeatable workflow for catching the ways a `python-pptx` export silently drifts from its HTML source — and fixing them with a layout discipline that prevents the same regressions on the next pass.
+
+## When this skill applies
+
+The user has:
+
+- A source HTML slide deck (typically a single-file deck with `<section class="slide">` blocks):
+
+  ```html
+  <section class="slide light">
+    <div class="chrome">2026 · Q2 review</div>
+    <span class="kicker">Pillar 03</span>
+    <h2 class="h-xl">Shipping <em>velocity</em> doubled</h2>
+    <p class="lead">…</p>
+    <div class="foot">page 5 / 14</div>
+  </section>
+  ```
+
+- A PPTX file generated from that deck via python-pptx (or similar).
+- A suspicion (or visible evidence) that the PPTX doesn't match the HTML — text bleeding into the footer, italic words gone flat, hero slides not centered, sections cropped, tag styling lost.
+
+If the user only has *one* of those two artifacts, this skill doesn't apply yet — first generate the missing one, or ask the user to provide it.
+
+## Why this is hard (and why a skill helps)
+
+PPTX is a fixed-canvas, absolute-positioned medium. HTML is a fluid, flow-based medium. A naive python-pptx export pins each block at hand-picked `(top, left)` coordinates, which works for the *first slide it was tested on* and silently fails for every other slide whose content has a different intrinsic height. The result is a handful of recurring drift modes:
+
+1. **Footer overflow** — content's `top + height` crosses into the footer row.
+2. **Off-canvas content** — bottom of last block exceeds `7.5"` (16:9 canvas).
+3. **Italic loss** — `<em>` in HTML never gets `run.font.italic = True`.
+4. **Hero slides not centered** — vertical-stack slides use `MARGIN_TOP` instead of computing center.
+5. **Box bounds intruding** — the text fits, but the *shape's bounding box* is oversized and visually crosses the rail.
+6. **Tag/styling loss** — colored chrome rows, kicker uppercase tracking, mono-vs-serif assignments quietly fall back to defaults.
+
+Every one of these is a *layout discipline* problem, not a content problem. Once you adopt the discipline, they stop happening.
+
+---
+
+## Workflow
+
+The audit is five steps. Don't skip any of them — the discipline only works if the audit produces a real list of issues to drive the re-export. A fix-without-audit pass tends to leave half the issues alive.
+
+### Step 1 — Extract ground truth from the PPTX
+
+Run `scripts/extract_pptx.py <path-to.pptx> > pptx_dump.json`. The script walks every shape on every slide and dumps text, position (`top` / `left`), size (`width` / `height`), and per-run typography (font name, size pt, bold, italic, color). This is the *actual* state of the export — don't trust the export script's intent, trust the dump.
+
+For 14-slide decks, the dump is ~30–60 KB and human-readable.
+
+### Step 2 — Walk the HTML structure
+
+Read the source HTML and enumerate `<section class="slide">` blocks. For each, note:
+
+- The slide's theme (`light` / `dark` / `hero light` / `hero dark`).
+- The `chrome` row text (top metadata).
+- The `kicker` (small uppercase eyebrow above the headline).
+- The headline (h-hero / h-xl / etc.) and any sub-head.
+- The body copy and any structured blocks (pipeline steps, cards, pillars, observation cards).
+- The `foot` row (bottom metadata).
+- Any `<em>` or italic-styled spans — italic is the silent regression.
+
+Map each HTML slide to a PPTX slide index. For decks following the convention "slide 1 = cover, slide N = closing", the mapping is positional.
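+
+A minimal sketch of that walk, using only the standard library's `html.parser`. The `SlideOutline` class and its dict layout are illustrative names, not part of the bundled scripts, and the sketch assumes each slide is a `<section>` whose class list contains `slide` and whose children carry their role in their first CSS class:
+
+```python
+from html.parser import HTMLParser
+
+
+class SlideOutline(HTMLParser):
+    """Collect theme, per-role text, and <em>/<i> text for each slide section."""
+
+    def __init__(self):
+        super().__init__()
+        self.slides = []   # one dict per slide, in document order
+        self._open = []    # (tag, css_classes) for elements open inside the current slide
+
+    def handle_starttag(self, tag, attrs):
+        classes = (dict(attrs).get("class") or "").split()
+        if tag == "section" and "slide" in classes:
+            theme = " ".join(c for c in classes if c != "slide")
+            self.slides.append({"theme": theme, "text": {}, "em": []})
+            self._open = [(tag, classes)]
+        elif self._open:
+            self._open.append((tag, classes))
+
+    def handle_endtag(self, tag):
+        # Pop back to the matching open tag; this tolerates void elements like <br>.
+        for i in range(len(self._open) - 1, -1, -1):
+            if self._open[i][0] == tag:
+                del self._open[i:]
+                break
+
+    def handle_data(self, data):
+        if len(self._open) < 2 or not data.strip():
+            return
+        slide = self.slides[-1]
+        # Role = first class of the nearest ancestor that has one (chrome, kicker, h-xl, foot).
+        role = next((cls[0] for _, cls in reversed(self._open[1:]) if cls), None)
+        if role:
+            slide["text"][role] = slide["text"].get(role, "") + data
+        if any(t in ("em", "i") for t, _ in self._open):
+            slide["em"].append(data.strip())
+
+
+SAMPLE = """
+<section class="slide light">
+  <div class="chrome">2026 · Q2 review</div>
+  <h2 class="h-xl">Shipping <em>velocity</em> doubled</h2>
+  <div class="foot">page 5 / 14</div>
+</section>
+"""
+
+outline = SlideOutline()
+outline.feed(SAMPLE)
+outline.close()
+for n, slide in enumerate(outline.slides, start=1):
+    print(n, slide["theme"], slide["em"], sorted(slide["text"]))
+```
+
+Comparing `len(outline.slides)` with the slide count in `pptx_dump.json` is a cheap first check that the positional mapping holds.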
+
+### Step 3 — Build the audit table
+
+For each slide, walk shapes from the dump and check against expected layout rules. Use this exact table format — the severity column is what drives the fix priority:
+
+```
+| Slide | Issue | Severity |
+|---|---|---|
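+| 1 cover | meta-row bottom at 6.95" overlaps footer (6.7") | 🔴 |
+| 5 checklist | row B step description bottom at 7.2" cuts into footer | 🔴 |
+| 8 3E | closing paragraph sits directly on the footer's top edge | 🔴 |
+| 9 on-day | step description bottom just touches footer, no safety margin | 🟠 |
+| multiple | em (Playfair italic) not preserved | 🟡 |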
+```
+
+Severity rubric:
+
+- 🔴 **critical** — content cropped, text invisible, footer overlap, off-canvas. Must fix.
+- 🟠 **high** — content visible but visual hierarchy broken, no breathing room, hero not centered. Should fix.
+- 🟡 **medium** — italic/em missing, font fallback wrong, color drift. Fix in this pass.
+- 🟢 **low** — minor spacing/alignment, sub-pixel offsets. Note but don't block.
+
+After the table, write a short root-cause section: most of the issues usually come from 2–3 systemic causes (e.g. "no footer rail enforced", "hero stacks pinned to MARGIN_TOP instead of centered", "italic never propagated"). Naming the systemic causes makes the re-export script much smaller and more correct.
+
+### Step 4 — Re-export with footer-rail + cursor-flow layout discipline
+
+This is the load-bearing technique. See `references/layout-discipline.md` for the full rules; the summary:
+
+**Define the rails up front, once, for the whole deck:**
+
+```python
+from pptx.util import Inches
+
+CANVAS_W       = Inches(13.333)   # 16:9
+CANVAS_H       = Inches(7.5)
+MARGIN_X       = Inches(0.6)
+MARGIN_TOP     = Inches(0.5)
+CONTENT_MAX_Y  = Inches(6.70)     # NOTHING in content area may cross this
+FOOTER_TOP     = Inches(6.85)     # footer row pinned here, edge-to-edge
+```
+
+> **Customizing the rails.** The defaults above suit a 16:9 canvas with a slim footer. If your design system uses a wider footer or a 4:3 canvas, override these constants in your export script and pass the same values to `verify_layout.py` via `--content-max-y` / `--canvas-h` / `--canvas-w`. See `references/layout-discipline.md` §1 for the full constant table.
+
+
+**Use a cursor for content blocks instead of pinning each block at an absolute y:**
+
+```python
+class Cursor:
+    """Advances down the slide; refuses to cross the footer rail."""
+    def __init__(self, y_start, cap=CONTENT_MAX_Y):
+        self.y = y_start
+        self.cap = cap
+    def take(self, h, gap=Inches(0.12)):  # ~1 line of whitespace at 14pt; tighten/loosen per design system
+        top = self.y
+        bottom = top + h
+        # Check the block's own bottom edge; the trailing gap is whitespace, not content.
+        if bottom > self.cap:
+            raise OverflowError(
+                f"block bottom at {bottom} EMU exceeds footer rail {self.cap} EMU; "
+                f"reduce block height or split slide"
+            )
+        self.y = bottom + gap  # advance only once the block is known to fit
+        return top
+
+For each slide, instantiate `Cursor(MARGIN_TOP)` and `take(height)` each block in reading order. The slide refuses to render if any block would cross the rail, so overflows become loud build errors instead of silent visual bugs.
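+
+To see the cursor in numbers, here is a standalone sketch over plain EMU integers (914,400 EMU per inch), so it runs without `python-pptx`. `flow_tops` and `inches` are illustrative helpers, not part of the skill's scripts, and this variant checks each block's bottom edge against the rail:
+
+```python
+EMU_PER_INCH = 914_400  # English Metric Units per inch, the unit PowerPoint stores
+
+
+def inches(x: float) -> int:
+    """Convert inches to integer EMU."""
+    return round(x * EMU_PER_INCH)
+
+
+CONTENT_MAX_Y = inches(6.70)
+
+
+def flow_tops(heights, y_start, gap=inches(0.12), cap=CONTENT_MAX_Y):
+    """Return the top of each block in reading order; raise if a block crosses the rail."""
+    tops, y = [], y_start
+    for i, h in enumerate(heights):
+        if y + h > cap:
+            raise OverflowError(
+                f"block {i}: bottom {(y + h) / EMU_PER_INCH:.2f} in "
+                f"crosses rail {cap / EMU_PER_INCH:.2f} in"
+            )
+        tops.append(y)
+        y += h + gap
+    return tops
+
+
+# Three blocks (1.0", 2.0", 1.5") flowing down from a 0.5" top margin.
+tops = flow_tops([inches(1.0), inches(2.0), inches(1.5)], y_start=inches(0.5))
+print([round(t / EMU_PER_INCH, 2) for t in tops])  # [0.5, 1.62, 3.74]
+
+# A 3.5" block after a 3.0" block would end at 7.12", so the build fails loudly.
+try:
+    flow_tops([inches(3.0), inches(3.5)], y_start=inches(0.5))
+except OverflowError as exc:
+    print("refused:", exc)
+```
+
+The second call is the useful one: the overflow surfaces as an exception at build time, with the offending block index in the message.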
+
+**Hero (vertically-centered) slides use a budget instead of a cursor:**
+
+```python
+def hero_layout(blocks):
+    """blocks = list of (height, gap_after) tuples in reading order."""
+    total = sum(h + g for h, g in blocks)
+    # Floor division keeps y_start an integer EMU value, like every other Length.
+    y_start = (CANVAS_H - total) // 2
+    return Cursor(y_start)
+```
+
+That single change kills "hero slide content sticks to top" — the most common hero defect.
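+
+A worked example of the budget arithmetic, again over plain EMU integers so it runs standalone. The block heights are made-up values for a kicker, headline, and lead, and `hero_start` / `emu` are illustrative names; the last block's `gap_after` is set to zero so a trailing gap does not skew the centering:
+
+```python
+EMU_PER_INCH = 914_400
+CANVAS_H = round(7.5 * EMU_PER_INCH)
+
+
+def emu(x: float) -> int:
+    """Convert inches to integer EMU."""
+    return round(x * EMU_PER_INCH)
+
+
+def hero_start(blocks):
+    """blocks = [(height, gap_after), ...] in EMU; returns the top of the first block."""
+    total = sum(h + g for h, g in blocks)
+    return (CANVAS_H - total) // 2
+
+
+# kicker 0.4" + 0.15" gap, headline 1.6" + 0.25" gap, lead 0.6" with no trailing gap
+blocks = [(emu(0.4), emu(0.15)), (emu(1.6), emu(0.25)), (emu(0.6), 0)]
+
+y = hero_start(blocks)
+top_margin = y
+for h, g in blocks:
+    y += h + g
+bottom_margin = CANVAS_H - y  # y now sits at the bottom edge of the last block
+
+print(top_margin / EMU_PER_INCH, bottom_margin / EMU_PER_INCH)
+```
+
+The stack is 3.0" tall on a 7.5" canvas, so both margins come out equal, which is the property to check when a hero slide looks top-heavy.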
+
+**Tighten box height to fit text + minimal padding.** PowerPoint reveals shape bounds when they overlap (selection halos, Z-order conflicts), and an oversized box can visually cross the footer rail even when the text inside doesn't. Compute box height from text metrics + ~0.05" pad, not from generous wrappers.
+
+**Preserve italic / em explicitly:**
+
+```python
+from pptx.util import Pt
+
+def add_run(p, text, font, size_pt, italic=False, bold=False, color=None):
+    r = p.add_run()
+    r.text = text
+    r.font.name = font
+    r.font.size = Pt(size_pt)
+    r.font.italic = italic
+    r.font.bold = bold
+    if color:
+        r.font.color.rgb = color
+    return r
+```
+
+When walking HTML, detect `<em>` / `<i>` / inline style `font-style: italic` and pass `italic=True`. Use the EN serif face (Playfair Display, Source Serif, or fallback Georgia) for italic display copy — the CJK serif typically has no italic and looks broken if you try to italicize it.
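+
+The detection step can be sketched with the standard library's `html.parser`. `RunSplitter` and `split_runs` are illustrative names, and the inline-style check is a plain substring match, so it will miss italics applied through CSS classes:
+
+```python
+from html.parser import HTMLParser
+
+VOID_TAGS = {"br", "img", "hr", "wbr"}  # elements with no closing tag
+
+
+class RunSplitter(HTMLParser):
+    """Split an inline HTML fragment into (text, italic) runs."""
+
+    def __init__(self):
+        super().__init__()
+        self.runs = []
+        self._italic = []  # one flag per open element: does it switch italic on?
+
+    def handle_starttag(self, tag, attrs):
+        if tag in VOID_TAGS:
+            return
+        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
+        self._italic.append(tag in ("em", "i") or "font-style:italic" in style)
+
+    def handle_endtag(self, tag):
+        if tag not in VOID_TAGS and self._italic:
+            self._italic.pop()
+
+    def handle_data(self, data):
+        # Italic if any enclosing element turned it on.
+        self.runs.append((data, any(self._italic)))
+
+
+def split_runs(fragment: str):
+    parser = RunSplitter()
+    parser.feed(fragment)
+    parser.close()  # flushes any buffered trailing text
+    return parser.runs
+
+
+print(split_runs("Shipping <em>velocity</em> doubled"))
+```
+
+Each `(text, italic)` pair then maps onto one `add_run(...)` call, with the italic flag passed straight through.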
+
+For deeper font issues that the layout rails can't catch — variable-font traps where PowerPoint silently swaps to Calibri / Microsoft JhengHei, missing `<a:ea>` slot causing CJK runs to fall back, fake-italic on Han characters — read `references/font-discipline.md`. The five layers there cover everything `verify_layout.py` can't see.
+
+### Step 5 — Verify post-export
+
+After writing the new `.pptx`, run `scripts/verify_layout.py <path-to.pptx>`. The script:
+
+- Walks every shape on every slide.
+- Asserts `top + height ≤ CONTENT_MAX_Y` for content shapes (footer/page-number shapes are allowed below the rail).
+- Asserts `top + height ≤ CANVAS_H` for all shapes (no off-canvas).
+- Asserts `left + width ≤ CANVAS_W` and `left ≥ 0`.
+- Reports violations as a single block: slide index, shape name, observed bottom, rail.
+
+Zero violations is the gate for "this re-export is shippable". Don't claim the audit is fixed without running the verifier — the human eye misses 1–2 mm overflow at zoom-out, the script doesn't.
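+
+The four assertions can be sketched over plain records. The record shape here (`slide`, `name`, `role`, and `left` / `top` / `width` / `height` in inches) is an assumption for illustration; the real `verify_layout.py` and the `extract_pptx.py` dump may use different field names and units:
+
+```python
+CANVAS_W, CANVAS_H, CONTENT_MAX_Y = 13.333, 7.5, 6.70  # inches, same rails as Step 4
+EPS = 1e-6  # tolerance for float rounding
+
+
+def rail_violations(shapes):
+    """Return (slide, name, rule, observed, limit) for every broken assertion."""
+    found = []
+    for s in shapes:
+        bottom = s["top"] + s["height"]
+        right = s["left"] + s["width"]
+        checks = [
+            # Footer shapes are allowed below the content rail.
+            ("content rail", bottom, CONTENT_MAX_Y, s.get("role") != "footer"),
+            ("canvas bottom", bottom, CANVAS_H, True),
+            ("canvas right", right, CANVAS_W, True),
+        ]
+        for rule, observed, limit, applies in checks:
+            if applies and observed > limit + EPS:
+                found.append((s["slide"], s["name"], rule, round(observed, 3), limit))
+        if s["left"] < -EPS:
+            found.append((s["slide"], s["name"], "canvas left", round(s["left"], 3), 0.0))
+    return found
+
+
+shapes = [
+    {"slide": 1, "name": "headline", "role": "content",
+     "left": 0.6, "top": 2.0, "width": 12.0, "height": 1.6},
+    {"slide": 1, "name": "meta-row", "role": "content",
+     "left": 0.6, "top": 6.45, "width": 12.0, "height": 0.5},
+    {"slide": 1, "name": "foot", "role": "footer",
+     "left": 0.0, "top": 6.85, "width": 13.333, "height": 0.4},
+]
+for violation in rail_violations(shapes):
+    print(violation)
+```
+
+A command-line wrapper would exit nonzero whenever the returned list is non-empty, which is what lets the check gate a CI job.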
+
+---
+
+## Output to the user
+
+After Step 5 passes, report:
+
+1. **Audit table** — the table from Step 3.
+2. **Root causes** — 1-paragraph systemic explanation.
+3. **Fix list** — terse list of what was changed and why (e.g. "hero slides switched to budget centering", "all content blocks routed through Cursor", "em runs explicitly italic").
+4. **Verification** — "0 rail violations across N slides, file size X KB".
+5. **Path** — absolute path to the re-exported `.pptx`.
+
+The user is reading for two reasons: confirming the visible bugs are fixed, and trusting the systemic fix is right. Cover both.
+
+---
+
+## Bundled resources
+
+- `scripts/extract_pptx.py` — dump every shape on every slide as JSON. Run before the audit. **Important:** also run on the *original* export to compare, and on the *re-exported* one to confirm.
+- `scripts/verify_layout.py` — post-export rail checker. Returns nonzero exit code on violations so it slots into a CI pipeline if needed.
+- `references/layout-discipline.md` — the full footer-rail + cursor-flow rule set with code snippets for each common slide type (hero, content, pipeline, two-column, observation grid).
+- `references/font-discipline.md` — five-layer font audit: mapping, presence, variable-vs-static traps, the three XML language slots (`latin` / `ea` / `cs`), CJK + Latin italic interaction.
+- `references/audit-table-template.md` — copy-pasteable table template with severity legend.
+
+Read the references when:
+
+- The deck has slide types beyond what the SKILL.md covers (multi-column dashboards, embedded images, charts) → `layout-discipline.md`.
+- The audit shows 🟡 typography issues — italic missing, CJK falling back, unexpected `Calibri` / `Microsoft JhengHei` in the XML → `font-discipline.md`.
+- You want to drop the audit table directly into a report or markdown deliverable → `audit-table-template.md`.
+
+---
+
+## Anti-patterns to avoid
+
+- **Patching individual slides without naming the systemic cause.** If you fix slide 5 by lowering its block by 0.2", you'll be back fixing slide 9, 11, and 14 next. Find the rule that produced all four problems.
+- **Trusting the original export script's intent.** Always run the extractor against the actual file. Drift between intent and reality is the bug.
+- **Skipping verification because "it looked fine in PowerPoint preview".** Preview anti-aliasing hides 1–2 mm overflows. The script doesn't.
+- **Italicizing scripts that have no italic tradition.** CJK, Arabic, Hebrew, Devanagari, Thai, and Khmer all produce a synthesized slant when forced into `italic=True`, and the result looks mechanically deformed. Italicize *only* runs whose primary script supports italic — Latin, Cyrillic, Greek. See `references/font-discipline.md` Layer 5 for the implementation pattern.
+- **Using `MARGIN_TOP` for hero slides.** Hero slides need *budget centering*, not top-anchored. This is the most common hero defect and the cheapest to fix.
+
+---
+
+## Why geometry-based verification, not visual diff
+
+An earlier iteration of this skill leaned on visual diffing — render the
+.pptx through Keynote → PDF → PNG, screenshot the HTML through Chrome
+headless, stitch them side-by-side with `magick`. It worked, but with
+three sharp drawbacks:
+
+- **Platform lock-in.** Keynote AppleScript is macOS-only; `magick` and
+  font-discovery commands vary across OSes; CI pipelines on Linux can't
+  reproduce the chain.
+- **Imprecision.** A 1-2 mm overflow gets anti-aliased away in a PNG
+  preview. The human eye misses it; the script catches it as a hard
+  numeric violation.
+- **Setup cost.** Every contributor needs the full graphics toolchain
+  installed before they can audit. Geometry checks need only
+  `python-pptx`.
+
+Geometry-based verification gives up one thing the visual diff is good
+at: catching cases where shape positions are correct but the rendered
+glyph looks wrong (font fallback, kerning bugs, missing weight). When
+that case appears, fall back to a manual screenshot review — the
+five-layer audit in `references/font-discipline.md` covers most of the
+underlying causes.
@@ -0,0 +1,58 @@
+# Audit Table Template
+
+Drop-in markdown template for the Step-3 audit deliverable. Keep the column order and severity legend stable across audits — readers learn to scan for 🔴 first.
+
+## Template
+
+```markdown
+**Fidelity audit · `<deck-name>` · <date>**
+
+| Slide | Issue | Severity |
+|---|---|---|
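+| 1 cover     | meta-row bottom at 6.95" overlaps footer (6.7")               | 🔴 |
+| 2 principle | meta-row overlaps footer                                       | 🔴 |
+| 5 checklist | row B step description bottom at 7.2" cuts into footer         | 🔴 |
+| 8 3E        | closing paragraph sits directly on the footer's top edge       | 🔴 |
+| 9 on-day    | step description bottom just touches footer, no safety margin  | 🟠 |
+| 10 obs      | row 2 obs-card bottom at 6.95" cuts into footer                | 🔴 |
+| 11 P&D      | Note paragraph bottom at 7.34" lies entirely below the footer  | 🔴 |
+| 13 deliv.   | pipeline description bottom at 7.05" cuts into footer          | 🔴 |
+| 14 closing  | meta-row bottom at 7.24" pushed past the footer                | 🔴 |
+| multiple    | em (Playfair italic) and special type-size contrast not preserved | 🟡 |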
+
+**Root causes**
+
+1. **No footer rail enforced.** Content blocks pinned at hand-picked y-coordinates; the script had no `CONTENT_MAX_Y` invariant, so `top + height` silently crossed `6.7"` whenever the content was taller than the test slide.
+2. **Hero slides anchored at `MARGIN_TOP`.** Vertical centering was done by eye; cover and chapter-intro slides drift down as block heights vary.
+3. **Italic propagation skipped.** `<em>` spans in HTML mapped to plain runs; the EN serif italic identity was lost across all hero slides.
+
+**Fix plan**
+
+- Introduce `CONTENT_MAX_Y = 6.70"` and `FOOTER_TOP = 6.85"` as module-level constants.
+- Route all content blocks through a `Cursor` that refuses to cross the rail.
+- Switch hero slides to `hero_layout(blocks)` — compute total stack height, center on canvas.
+- Tighten `desc_h` (pipeline `0.85"`, checklist `0.65"`) to fit text + 0.05" pad.
+- Add `italic=True` path in `add_run()` that swaps to EN serif for italic Latin runs; skip italic for CJK.
+- Add post-export `verify_layout.py` step; require zero rail violations.
+```
+
+## Severity legend (reproduce inline in reports)
+
+```markdown
+- 🔴 **critical** — content cropped, text invisible, footer overlap, off-canvas. Must fix.
+- 🟠 **high** — content visible but visual hierarchy broken, no breathing room. Should fix.
+- 🟡 **medium** — italic/em missing, font fallback wrong, color drift. Fix in this pass.
+- 🟢 **low** — minor spacing/alignment, sub-pixel offsets. Note but don't block.
+```
+
+## Verification footer (append after re-export)
+
+```markdown
+**Verification**
+
+- ✅ 0 rail violations across 14 slides
+- ✅ All shapes within canvas (`top + height ≤ 7.5"`, `left + width ≤ 13.333"`)
+- ✅ Italic preserved on all `<em>` runs (EN serif), skipped on CJK runs
+- ✅ Hero slides centered (cover, 03 act-i, 06 act-ii, 11 act-iii, 13 closing)
+- File: `<absolute-path>.pptx` · 54.7 KB
+```
@@ -0,0 +1,363 @@
+# Font Discipline for PPTX Exports
+
+Companion to `layout-discipline.md`. The rail / cursor primitives in that
+file catch geometric drift; this file catches the typography drift that
+geometry can't see — variable-font traps, missing CJK slots, fake italic
+on Han characters. These are the bugs that pass `verify_layout.py` and
+still look wrong.
+
+Read this when:
+
+- The audit table has 🟡 entries about italic / em / font fallback.
+- PowerPoint silently swaps to Calibri / Arial / Microsoft JhengHei /
+  Georgia after you specified a different family.
+- `unzip pptx | grep typeface` shows a face that isn't in your design system.
+
+## Layer 1 — Font mapping in the export script
+
+Walk each CSS class used by the source HTML and confirm the export
+script maps it to the **same** font family.
+
+⚠️ **Trap:** the visual category your eye reads is not always the
+class's semantic category. Editorial decks routinely bind `.lead`,
+`.callout`, or `.q-big` to a serif face, not the sans-serif you'd guess
+from "lead". Open the HTML's CSS, read the `font-family` declaration
+for each class, and copy the literal family name into the export's
+font table.
+
+Don't rely on visual intuition; rely on grep.
+
+> **Coverage gap for Latin-slot scripts (Cyrillic / Greek / Vietnamese).**
+> Russian / Ukrainian / Greek runs go through `<a:latin>`, not `<a:ea>` —
+> they use the Latin slot. Many display fonts (Playfair Display, Source
+> Serif 4) ship with weak or missing Cyrillic / Greek glyphs, and most
+> drop Vietnamese Extended diacritics (ếẫỡỗ). PowerPoint silently falls
+> back to Calibri / Times New Roman per missing glyph, producing
+> mid-paragraph face shifts that look like a styling bug.
+>
+> When mapping a CSS class to a Latin font, check the font actually
+> covers your scripts:
+>
+> ```bash
+> # macOS / Linux: list the unicode blocks a font supports
+> fc-query -f '%{charset}\n' "$(fc-match -f '%{file}\n' 'Playfair Display')" | head
+> ```
+>
+> ```powershell
+> # Windows: confirm the family is registered (System.Drawing lists installed families)
+> Add-Type -AssemblyName System.Drawing
+> (New-Object System.Drawing.Text.InstalledFontCollection).Families |
+>   Where-Object { $_.Name -like '*Playfair*' }
+> ```
+>
+> The PowerShell check only confirms the family is installed. For
+> coverage detail (Unicode ranges) on any platform, open the font in
+> fontforge: File → Open → pick the .ttf / .otf → Element → Font Info →
+> OS/2 → Unicode Ranges.
+>
+> If coverage is missing, either swap to a face that has it (e.g.
+> Inter / IBM Plex Sans for Cyrillic; Be Vietnam Pro for Vietnamese) or
+> set a different `<a:latin>` per language run.
+
+## Layer 2 — Font presence on the rendering machine
+
+PowerPoint uses the OS font cache. If the family name in your XML isn't
+installed, PowerPoint silently falls back. Check:
+
+```bash
+fc-list | grep -i "noto serif"            # Linux / WSL
+mdfind "kMDItemFSName == '*NotoSerif*'"   # macOS
+```
+
+```powershell
+# Windows (PowerShell)
+Get-ChildItem -Path "$env:WINDIR\Fonts","$env:LOCALAPPDATA\Microsoft\Windows\Fonts" `
+  -Filter "*NotoSerif*" -ErrorAction SilentlyContinue
+```
+
+Install missing families (macOS, via Homebrew casks):
+
+```bash
+brew install --cask \
+  font-noto-serif-tc \
+  font-playfair-display \
+  font-source-serif-4 \
+  font-ibm-plex-mono
+```
+
+The `verify_layout.py` script can't see this — it only checks
+geometry. A standalone font audit step is required.
+
+## Layer 3 — Variable fonts vs. static families ← most common trap
+
+Modern fonts often ship as a **single variable file** containing all
+weights (`NotoSerifTC[wght].ttf`). It looks elegant, but PowerPoint on
+macOS and Windows has spotty support:
+
+- macOS reports the variable font's family name as its **default static
+  instance** — usually ExtraLight or Regular.
+- PowerPoint asks the OS for "Noto Serif TC, weight 700"; the OS
+  reports the family as `Noto Serif TC ExtraLight`; PowerPoint can't
+  match → falls back to a system serif.
+
+Diagnose:
+
+```bash
+ls -la ~/Library/Fonts/ | grep -i NotoSerif
+```
+
+| What you see                           | Verdict                                 |
+| -------------------------------------- | --------------------------------------- |
+| One `*[wght].ttf` file                 | Variable. PowerPoint may not match.     |
+| Multiple `*-Regular.otf`, `*-Bold.otf` | Static family. Safe.                    |
+
+Fix by using the static family equivalent:
+
+| Don't use (variable)        | Use instead (static)              |
+| --------------------------- | --------------------------------- |
+| `Noto Serif TC` (variable)  | `Noto Serif CJK TC`               |
+| `Source Serif 4` (variable) | `Source Serif Pro` / `Source Serif 4` static instances |
+| `Inter` (variable)          | Per-weight `Inter Regular` / `Inter Bold` |
+
+After fixing the export, re-run `extract_pptx.py` and confirm the
+`font` field matches the static name.
+
+## Layer 4 — PPTX XML's three-language slots
+
+PowerPoint chooses a typeface per run by script. Each run can declare
+three typeface slots, each a child element of `<a:rPr>`:
+
+| Element                 | Used for                         |
+| ----------------------- | -------------------------------- |
+| `<a:latin typeface=…>`  | Latin script (a-z, A-Z, digits)  |
+| `<a:ea typeface=…>`     | East Asian (CJK) — **Chinese / Japanese / Korean go here** |
+| `<a:cs typeface=…>`     | Complex script (Arabic, Hebrew, Thai) |
+
+Audit a file:
+
+```bash
+unzip -o /path/to/deck.pptx -d /tmp/audit
+grep -h -oE 'typeface="[^"]+"' /tmp/audit/ppt/slides/slide*.xml | sort -u
+```
+
+Expected output: only the design-system fonts. If you see
+`Microsoft JhengHei`, `Calibri`, `Arial`, `Georgia`, `Consolas`,
+something has fallen back.
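+
+Where `unzip` and `grep` are not available (for example on a stock
+Windows machine), the same audit can be sketched with the standard
+library. `typefaces_in_pptx` and `off_system_faces` are illustrative
+names, and the allow-list is whatever your design system declares:
+
+```python
+import io
+import re
+import zipfile
+
+TYPEFACE_RE = re.compile(r'typeface="([^"]+)"')
+
+
+def typefaces_in_pptx(pptx):
+    """pptx: a path or binary file object. Returns sorted unique typeface values in slide XML."""
+    found = set()
+    with zipfile.ZipFile(pptx) as archive:
+        for name in archive.namelist():
+            if name.startswith("ppt/slides/slide") and name.endswith(".xml"):
+                xml = archive.read(name).decode("utf-8")
+                found.update(TYPEFACE_RE.findall(xml))
+    return sorted(found)
+
+
+def off_system_faces(pptx, allowed):
+    """Typefaces present in the file but absent from the design system's allow-list."""
+    return [face for face in typefaces_in_pptx(pptx) if face not in allowed]
+
+
+# Demo on an in-memory archive with one slide part; a real call passes the .pptx path.
+buf = io.BytesIO()
+with zipfile.ZipFile(buf, "w") as z:
+    z.writestr(
+        "ppt/slides/slide1.xml",
+        '<a:latin typeface="Playfair Display"/><a:ea typeface="Microsoft JhengHei"/>',
+    )
+buf.seek(0)
+print(off_system_faces(buf, allowed={"Playfair Display", "Noto Serif CJK TC"}))
+```
+
+Like the `grep`, this also reports theme placeholders such as `+mn-lt`
+if the slides use them, so add those to the allow-list when they are
+intentional.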
+
+**Common defect:** export script writes `<a:latin>` only. Chinese runs
+have no `<a:ea>` directive → PowerPoint picks the OS default
+(Microsoft JhengHei on Windows, Hiragino Sans on Mac). Result: Chinese
+characters in the wrong serif/sans family.
+
+Fix: when adding a run with mixed-language content, set all three
+attributes that apply.
+
+```python
+from pptx.oxml.ns import qn
+
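+# Children of <a:rPr> follow a fixed schema order (... latin, ea, cs, sym,
+# hlinkClick, hlinkMouseOver, rtl, extLst); a blind append() can land a
+# slot after one of its successors.
+_RPR_TAIL = ('a:latin', 'a:ea', 'a:cs', 'a:sym', 'a:hlinkClick',
+             'a:hlinkMouseOver', 'a:rtl', 'a:extLst')
+
+def set_run_fonts(run, latin: str | None = None, ea: str | None = None, cs: str | None = None):
+    rPr = run._r.get_or_add_rPr()
+    for idx, face in enumerate((latin, ea, cs)):
+        if not face:
+            continue  # slot not requested; leave any existing element untouched
+        tag = _RPR_TAIL[idx]
+        el = rPr.find(qn(tag))
+        if el is None:
+            el = rPr.makeelement(qn(tag), {})
+            # python-pptx's oxml helper: insert ahead of the first successor
+            # already present, or append when none is.
+            rPr.insert_element_before(el, *_RPR_TAIL[idx + 1:])
+        el.set('typeface', face)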
+```
+
+PptxGenJS sets all three by default; raw XML injection or python-pptx
+without explicit `ea` slot does not.
+
+## Layer 5 — Italic + script interaction
+
+🚨 **`italic=True` only suits scripts with an italic tradition.** Apply
+it only to runs whose characters belong to those scripts (Latin,
+Cyrillic, Greek). For everything else (CJK, Arabic, Hebrew, Devanagari,
+Thai, Khmer) PowerPoint synthesizes a slant that looks mechanically
+deformed. The chain of failures, using CJK as the canonical example:
+
+1. `<a:latin>` slot has Playfair Display Italic (a Latin-only font).
+2. The CJK characters in the run have no glyph in Playfair → PowerPoint
+   substitutes a system CJK font.
+3. The substituted CJK font is forced into `italic=True` → since no
+   real CJK italic exists, PowerPoint synthesizes a slanted bitmap →
+   characters look mechanically deformed.
+
+The same pattern triggers for Arabic, Hebrew, Devanagari, and Thai —
+none of these scripts has an italic tradition, and faking it produces
+a slant that's visually broken.
+
+**Rule:** italic only applies to runs whose primary script supports it
+(Latin / Cyrillic / Greek). Indicate emphasis on other scripts via:
+
+- color tone (`COLOR_INK_60` for muted, full ink for emphasis)
+- weight contrast (Regular 400 vs. Bold 700)
+- a script-native italic variant **only if one actually ships** — most
+  don't
+
+Practical implementation:
+
+```python
+from pptx.util import Pt  # set_run_fonts() is the Layer 4 helper above
+
+# Unicode ranges where italic should be suppressed.
+# Principle: include scripts whose writing tradition has no italic style.
+# Synthesized italic on these scripts produces a slant that looks
+# mechanically deformed.
+NO_ITALIC_RANGES = (
+    (0x3400, 0x9FFF),    # CJK Extension A through CJK Unified Ideographs
+    (0xF900, 0xFAFF),    # CJK Compatibility Ideographs
+    (0x3040, 0x30FF),    # Hiragana + Katakana
+    (0xAC00, 0xD7AF),    # Hangul Syllables
+    (0x0590, 0x05FF),    # Hebrew
+    (0x0600, 0x06FF),    # Arabic
+    (0x0750, 0x077F),    # Arabic Supplement
+    # Indic scripts — none have an italic tradition; PowerPoint synthesizes
+    # a fake slant on all of them. Add new ranges here when the deck mixes
+    # in additional scripts (e.g. Sinhala U+0D80–U+0DFF).
+    (0x0900, 0x097F),    # Devanagari (Hindi, Marathi, Sanskrit)
+    (0x0980, 0x09FF),    # Bengali
+    (0x0A00, 0x0A7F),    # Gurmukhi (Punjabi)
+    (0x0A80, 0x0AFF),    # Gujarati
+    (0x0B00, 0x0B7F),    # Oriya
+    (0x0B80, 0x0BFF),    # Tamil
+    (0x0C00, 0x0C7F),    # Telugu
+    (0x0C80, 0x0CFF),    # Kannada
+    (0x0D00, 0x0D7F),    # Malayalam
+    # Southeast Asian
+    (0x0E00, 0x0E7F),    # Thai
+    (0x0E80, 0x0EFF),    # Lao
+    (0x1780, 0x17FF),    # Khmer
+)
+
+
+def has_no_italic_script(text: str) -> bool:
+    return any(
+        any(lo <= ord(c) <= hi for lo, hi in NO_ITALIC_RANGES)
+        for c in text
+    )
+
+
+def add_run_with_italic_safety(p, text, *, latin_face: str, ea_face: str,
+                               cs_face: str | None, size_pt: int,
+                               italic: bool, **kwargs):
+    """Drop italic if the run contains characters from scripts without italic tradition.
+
+    Args:
+        latin_face: Font for Latin / Cyrillic / Greek runs (a:latin slot).
+        ea_face: Font for CJK runs (a:ea slot).
+        cs_face: Font for complex scripts — Arabic, Hebrew, Devanagari,
+            Thai, etc. (a:cs slot). Pass None when the run contains no
+            complex-script characters; set_run_fonts skips the slot.
+    """
+    r = p.add_run()
+    r.text = text
+    r.font.size = Pt(size_pt)
+    r.font.italic = italic and not has_no_italic_script(text)
+    set_run_fonts(r, latin=latin_face, ea=ea_face, cs=cs_face)
+    return r
+```
+
+For mixed-script runs (e.g. `"In <em>2026</em> 開始"`), split into
+multiple runs at language boundaries so the italic attribute can apply
+to the Latin run only.
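+
+That split can be sketched with `itertools.groupby`. `split_for_italic`
+is an illustrative name, and the range table is an abbreviated copy of
+`NO_ITALIC_RANGES` so the block runs on its own; spaces, digits, and
+punctuation fall outside the ranges and therefore stay with the
+italic-capable chunks:
+
+```python
+from itertools import groupby
+
+# Abbreviated copy of NO_ITALIC_RANGES, enough for the demo below.
+NO_ITALIC_RANGES = (
+    (0x3040, 0x30FF),  # Hiragana + Katakana
+    (0x3400, 0x9FFF),  # CJK ideographs
+    (0xAC00, 0xD7AF),  # Hangul Syllables
+    (0x0590, 0x06FF),  # Hebrew + Arabic
+    (0x0E00, 0x0E7F),  # Thai
+)
+
+
+def italic_ok(ch: str) -> bool:
+    """True when the character sits outside every no-italic range."""
+    return not any(lo <= ord(ch) <= hi for lo, hi in NO_ITALIC_RANGES)
+
+
+def split_for_italic(text: str):
+    """Split text into (chunk, italic_ok) pieces at script boundaries."""
+    return [("".join(chars), ok) for ok, chars in groupby(text, key=italic_ok)]
+
+
+for chunk, ok in split_for_italic("velocity 速度 doubled"):
+    print(repr(chunk), "italic" if ok else "upright")
+```
+
+Each chunk then becomes its own run, with `italic` passed through only
+where the second element is true.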
+
+## Beyond CJK — other scripts
+
+The five layers above use CJK examples because that's the
+most common pairing in Open Design today, but the same machinery
+applies to other scripts. Quick reference:
+
+| Script family            | XML slot   | Italic OK? | Most common defect                                                                  | Recommended faces                                |
+| ------------------------ | ---------- | ---------- | ----------------------------------------------------------------------------------- | ------------------------------------------------ |
+| Latin (en, de, es, vi…)  | `a:latin`  | ✅          | Vietnamese Extended diacritics dropped → fallback Calibri mid-paragraph             | Be Vietnam Pro, IBM Plex Sans, Source Sans 3     |
+| Cyrillic (ru, uk, bg)    | `a:latin`  | ✅          | Display fonts (Playfair, Source Serif) lack Cyrillic → fallback Calibri             | Inter, IBM Plex Sans, Roboto                     |
+| Greek (el)               | `a:latin`  | ✅          | Same as Cyrillic — display faces missing Greek → fallback                           | Inter, IBM Plex Sans                             |
+| CJK (zh, ja, ko)         | `a:ea`     | ❌          | Variable-font trap (Layer 3); missing `a:ea` slot → fallback Microsoft JhengHei     | Noto Sans CJK *, Source Han Sans, IBM Plex Sans JP |
+| Arabic / Hebrew / Persian | `a:cs`    | ❌          | `<a:rtl val="1"/>` not set → text direction breaks; kashida changes width           | Noto Naskh Arabic, IBM Plex Sans Arabic, Amiri   |
+| Devanagari / Bengali     | `a:cs`     | ❌          | PowerPoint defaults to Mangal/Vrinda (low fidelity); cluster shaping bumps line height | Noto Sans Devanagari, Mukta, Hind             |
+| Thai / Lao / Khmer       | `a:cs`     | ❌          | No inter-word spaces → PowerPoint's break engine produces poor wraps; tone marks bump line height | Noto Sans Thai, Sarabun, Noto Sans Khmer  |
+
+For RTL scripts (Arabic / Hebrew / Persian), set both `<a:cs typeface=…>`
+and `<a:rtl val="1"/>` on the run's `rPr`. Right-alignment, bidi text
+flow, and chrome / footer mirroring are out of scope for `verify_layout.py`
+today and need manual review; see the RTL discipline scope note below
+for the planned Tier 2 follow-up.
+
+> **RTL discipline scope.** Full RTL support is roughly 15–20% of the
+> font + layout discipline surface area: Unicode TR9 bidi resolution,
+> chrome / footer / page-number mirroring, kashida (Arabic
+> elongation) interaction with line-fill, and right-anchored
+> alignment. This skill covers the typeface + slot mechanics only;
+> bidi and mirroring are flagged for a Tier 2 `rtl-discipline.md`
+> follow-up when fa / ar / he usage volume justifies the investment.
+
+## Line height per script
+
+The `Cursor.take(gap=Inches(0.12))` default suits 14pt Latin body copy.
+Other scripts need more vertical headroom because of stacked diacritics,
+matras, or tone marks:
+
+| Script                                   | Recommended `gap` at 14pt body |
+| ---------------------------------------- | ------------------------------ |
+| Latin (no Vietnamese Extended)           | `Inches(0.12)` (default)       |
+| Latin (with Vietnamese Extended ếẫỗ)     | `Inches(0.14)`                 |
+| CJK                                      | `Inches(0.14–0.16)`            |
+| Devanagari / Bengali (matras / conjuncts)| `Inches(0.16–0.18)`            |
+| Thai / Lao / Khmer (tone marks above)    | `Inches(0.16–0.18)`            |
+| Arabic / Hebrew                          | `Inches(0.13)`                 |
+
+When the deck mixes scripts, take the max. Line breathing-room is
+judged visually: an under-spaced Thai run in an otherwise Latin deck
+reads as "the Thai slide is broken".
+
+> **Source for these numbers.** Measured against Noto Sans / Noto
+> Serif / IBM Plex line-height at 14pt body with full diacritic stacks
+> (e.g. Devanagari conjuncts ष्ट्र, Thai 4-mark sequences ก़ํ้, stacked
+> Vietnamese ỗ). Adjust downward for condensed faces (Inter Condensed,
+> Noto Sans Condensed) and upward for display sizes ≥ 24pt where
+> diacritic ratios grow.
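
The "take the max" rule can be sketched as a small lookup over Unicode blocks. This is a minimal illustration, not part of the skill's scripts: `gap_for_text`, `gap_for_deck`, and the `SCRIPT_GAPS_IN` table are hypothetical names, the ranges are standard Unicode blocks chosen for the example, and the gap values are the upper bounds from the table above, expressed as plain inches.

```python
# Hypothetical helper: choose a Cursor gap (inches) from the scripts found in text.
# Ranges are Unicode blocks; values are the upper bounds of the table above.
SCRIPT_GAPS_IN = [
    ((0x0900, 0x09FF), 0.18),  # Devanagari + Bengali
    ((0x0E00, 0x0EFF), 0.18),  # Thai + Lao
    ((0x1780, 0x17FF), 0.18),  # Khmer
    ((0x4E00, 0x9FFF), 0.16),  # CJK Unified Ideographs
    ((0x3040, 0x30FF), 0.16),  # Hiragana + Katakana
    ((0x1E00, 0x1EFF), 0.14),  # Latin Extended Additional (Vietnamese)
    ((0x0590, 0x06FF), 0.13),  # Hebrew + Arabic
]
DEFAULT_GAP_IN = 0.12  # Latin default from the table


def gap_for_text(text):
    """Return the largest recommended gap among the scripts present in text."""
    gap = DEFAULT_GAP_IN
    for ch in text:
        cp = ord(ch)
        for (lo, hi), script_gap in SCRIPT_GAPS_IN:
            if lo <= cp <= hi:
                gap = max(gap, script_gap)
                break
    return gap


def gap_for_deck(slide_texts):
    """Deck-wide setting: the max over every slide's text."""
    return max((gap_for_text(t) for t in slide_texts), default=DEFAULT_GAP_IN)
```

Multiply the result by 914400 (or wrap it in `Inches(...)`) before handing it to the cursor. Extend the table if the deck uses blocks not listed here, such as CJK Extension A or Hangul.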
+
+## Audit checklist
+
+After re-export, confirm all five layers:
+
+- [ ] Layer 1: Each CSS class in the HTML maps to the intended family
+      in the export script's font table.
+- [ ] Layer 2: All declared families exist on the rendering machine
+      (`fc-list | grep`).
+- [ ] Layer 3: No variable-font filename pretending to be a static
+      family. `~/Library/Fonts/` shows multi-file static families for
+      every face used.
+- [ ] Layer 4: `unzip + grep typeface` returns only the design-system
+      fonts. No `Microsoft JhengHei` / `Calibri` / `Arial` / `Georgia`
+      / `Consolas` residue.
+- [ ] Layer 5: No run from a no-italic script (CJK / Arabic / Hebrew /
+      Devanagari / Thai) has `italic=True` set with a Latin italic
+      face in the `<a:latin>` slot.
+- [ ] **Beyond CJK:** RTL slides set `<a:rtl val="1"/>` on each RTL
+      run's `rPr`. Verify with:
+
+      ```bash
+      unzip -o deck.pptx -d /tmp/audit
+      grep -h '<a:rtl' /tmp/audit/ppt/slides/*.xml | sort -u
+      # Expect a hit for every fa / ar / he slide; empty output on
+      # an RTL deck means the directionality wasn't propagated.
+      ```
+
+      Cursor `gap` is bumped per the line-height table above when the
+      deck includes Vietnamese, Devanagari, Thai, or Khmer content.
+
+If all five pass and the user still reports "the type looks wrong",
+ask for a screenshot pointing at the specific glyph or word — the
+remaining bugs are usually license-restricted fonts not embedded into
+the file (see `SKILL.md` Step 5 verification).
@@ -0,0 +1,371 @@
+# Footer-Rail + Cursor-Flow Layout Discipline
+
+The full rule set referenced from `SKILL.md` Step 4. Read this when the deck has slide types beyond simple title-plus-body or when you're building the re-export script from scratch.
+
+> **How to use this file.** Skim §1-3 once to internalize the rules
+> (constants, `Cursor`, hero budget centering). Then jump to the slide-type
+> snippet that matches what you're building — pipeline, two-column,
+> observation grid, etc. — and adapt. The file is meant to be navigated,
+> not read end-to-end.
+
+## 1. Constants — define once at the top of the export script
+
+```python
+from pptx.util import Inches, Pt, Emu
+from pptx.dml.color import RGBColor
+
+# Canvas (16:9). Override only if the deck explicitly targets 4:3 or 1:1.
+CANVAS_W       = Inches(13.333)
+CANVAS_H       = Inches(7.5)
+
+# Margins
+MARGIN_X       = Inches(0.6)            # left / right symmetric
+MARGIN_TOP     = Inches(0.5)            # below the chrome row
+CONTENT_LEFT   = MARGIN_X
+CONTENT_RIGHT  = CANVAS_W - MARGIN_X
+CONTENT_W      = CONTENT_RIGHT - CONTENT_LEFT
+
+# Vertical rails — the load-bearing pair
+CHROME_TOP     = Inches(0.32)           # top metadata row
+CHROME_H       = Inches(0.20)
+CONTENT_TOP    = MARGIN_TOP             # cursor starts here on content slides
+CONTENT_MAX_Y  = Inches(6.70)           # NOTHING in content area may cross
+FOOTER_TOP     = Inches(6.85)           # foot row pinned here
+FOOTER_H       = Inches(0.22)
+
+# Theme colors — derive from the HTML :root block, do not invent
+COLOR_INK      = RGBColor(0x0a, 0x1f, 0x3d)   # dark-theme background / text color on the light theme
+COLOR_PAPER    = RGBColor(0xf1, 0xf3, 0xf5)   # light-theme background / text color on the dark theme
+COLOR_INK_60   = RGBColor(0x68, 0x77, 0x8e)   # 60 % opacity ink (precomputed)
+COLOR_PAPER_60 = RGBColor(0x9b, 0xa0, 0xa6)   # 60 % opacity paper
+
+# Typography stacks. EN italic uses serif-en; CJK never italicizes.
+FONT_SERIF_EN  = "Playfair Display"
+FONT_SERIF_FB  = "Source Serif 4"
+FONT_SERIF_ZH  = "Noto Serif TC"
+FONT_SANS_ZH   = "Noto Sans TC"
+FONT_MONO      = "IBM Plex Mono"
+```
+
+## 2. The Cursor primitive
+
+Used on all non-hero slides. The cursor advances down the slide and refuses to cross `CONTENT_MAX_Y`.
+
+```python
+class Cursor:
+    def __init__(self, y_start=CONTENT_TOP, cap=CONTENT_MAX_Y):
+        self.y = y_start
+        self.cap = cap
+        self.history = []   # list of (top, height, label) for debugging
+
+    def take(self, h, gap=Inches(0.12), label=""):
+        top = self.y
+        self.y = top + h + gap
+        self.history.append((top, h, label))
+        if self.y > self.cap:
+            raise OverflowError(
+                f"Cursor exceeded rail at '{label}': "
+                f"y={self.y} cap={self.cap}; "
+                f"history={self.history}"
+            )
+        return top
+
+    def remaining(self):
+        return self.cap - self.y
+```
+
+Usage:
+
+```python
+c = Cursor()
+add_kicker(slide, top=c.take(Inches(0.18), label="kicker"))
+add_h_xl(slide,   top=c.take(Inches(1.0),  label="h-xl"))
+add_lead(slide,   top=c.take(Inches(0.8),  label="lead"))
+add_pipeline(slide, top=c.take(Inches(2.6), label="pipeline"))
+```
+
+> **Per-script `gap` tuning.** The default `Inches(0.12)` matches 14pt
+> Latin body copy. Decks that include CJK, Devanagari, Thai, or
+> Khmer need more breathing room — line clusters and stacked tone
+> marks bump the rendered line height. Pass an explicit `gap=` per
+> block, or override the `Cursor` default at the top of your export.
+> The full per-script table is in
+> [`font-discipline.md` § Line height per script](font-discipline.md).
+>
+> **Detecting the highest-demand script in a mixed deck.** A deck
+> can mix `en` slides with `th` slides — locale alone isn't the
+> signal. Scan each slide's text against the Unicode ranges in
+> `font-discipline.md` Layer 5's `NO_ITALIC_RANGES` (extend with the
+> Vietnamese Extended block U+1E00–U+1EFF for ếẫỗ), record the
+> per-slide max-gap, and instantiate the slide's `Cursor` with that
+> value. For a uniform deck-wide setting, take the max across all
+> slides.
+
+If a slide raises `OverflowError`, fix one of three things:
+
+1. **Reduce block height** — the box was generously sized; tighten to actual text height.
+2. **Reduce gap** — the inter-block gap is excessive; trim from `0.18"` to `0.10"`.
+3. **Split the slide** — the content genuinely doesn't fit; this is a design problem, not a layout problem.
+
+Don't "solve" it by raising `CONTENT_MAX_Y`. The rail exists for a reason — content that crosses it will overlap the footer at full-screen presentation.
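
Before trimming a failing slide, it can help to see how far past the rail the planned stack lands. The following is a minimal dry-run sketch under stated assumptions: it works in plain inch floats rather than EMU, `plan_stack` is a hypothetical helper that is not part of the skill's scripts, and the two constants simply mirror `MARGIN_TOP` and `CONTENT_MAX_Y` from §1.

```python
# Hypothetical dry-run: walk a slide's planned blocks without creating shapes.
CONTENT_TOP_IN = 0.5      # mirrors MARGIN_TOP / CONTENT_TOP in section 1
CONTENT_MAX_Y_IN = 6.70   # mirrors CONTENT_MAX_Y in section 1


def plan_stack(blocks, y_start=CONTENT_TOP_IN, cap=CONTENT_MAX_Y_IN, gap=0.12):
    """blocks: list of (label, height_in).

    Returns (rows, overflow_in) where rows are (label, top, bottom) in inches
    and overflow_in is how far the cursor would sit past the rail (0.0 if it fits).
    """
    rows = []
    y = y_start
    for label, h in blocks:
        rows.append((label, round(y, 3), round(y + h, 3)))
        y += h + gap  # same advance rule as Cursor.take(): height plus gap
    overflow = round(max(0.0, y - cap), 3)
    return rows, overflow


if __name__ == "__main__":
    rows, overflow = plan_stack(
        [("kicker", 0.18), ("h-xl", 1.0), ("lead", 0.8), ("pipeline", 2.6)]
    )
    for label, top, bottom in rows:
        print(f"{label:<10} top={top:.2f} bottom={bottom:.2f}")
    print(f"overflow: {overflow:.2f} in")
```

Like `Cursor.take()`, the sketch counts the trailing gap after the last block, so a reported overflow smaller than that gap means the block itself still fits and only the gap needs trimming.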
+
+## 3. Hero slides — budget centering, not cursor flow
+
+Hero slides (cover, chapter intros, big-quote pages) are vertically centered. The cursor model would put them at the top with empty space below — visually wrong.
+
+```python
+def hero_layout(blocks):
+    """
+    blocks: list of (height, gap_after) tuples in top-to-bottom reading order.
+    Returns a Cursor whose y_start is computed so the stack is centered.
+    """
+    total_h = sum(h + g for h, g in blocks)
+    y_start = (CANVAS_H - total_h) // 2   # floor division keeps the EMU value an int
+    # Cap at the content rail so hero blocks cannot run into the footer.
+    return Cursor(y_start=y_start, cap=CONTENT_MAX_Y)
+```
+
+Hero usage:
+
+```python
+# Plan the stack first.
+HERO_BLOCKS = [
+    (Inches(0.18), Inches(0.30)),   # kicker
+    (Inches(1.50), Inches(0.20)),   # h-hero
+    (Inches(0.45), Inches(0.40)),   # h-sub
+    (Inches(0.70), Inches(0.30)),   # lead
+    (Inches(0.20), Inches(0.00)),   # meta-row
+]
+c = hero_layout(HERO_BLOCKS)
+for (h, g), block_fn in zip(HERO_BLOCKS, [k_kicker, k_hero, k_sub, k_lead, k_meta]):
+    block_fn(slide, top=c.take(h, gap=g))
+```
+
+The pattern reads as: "list each block's actual height, then center the entire stack". One source of truth, no manual `MARGIN_TOP`.
+
+## 4. Footer is always pinned, never advanced
+
+Don't route the footer through the cursor — it has its own rail.
+
+```python
+def add_footer(slide, left_text, right_text, theme="dark"):
+    color = COLOR_PAPER_60 if theme == "dark" else COLOR_INK_60
+    # Floor division keeps EMU lengths integral, which python-pptx expects.
+    add_text(slide,
+        left=CONTENT_LEFT, top=FOOTER_TOP,
+        width=CONTENT_W // 2, height=FOOTER_H,
+        text=left_text, font=FONT_MONO, size_pt=9,
+        color=color, align="left", letter_spacing=2.0)
+    add_text(slide,
+        left=CANVAS_W // 2, top=FOOTER_TOP,
+        width=CONTENT_W // 2, height=FOOTER_H,
+        text=right_text, font=FONT_MONO, size_pt=9,
+        color=color, align="right", letter_spacing=2.0)
+```
+
+`add_chrome` is the same idea pinned at `CHROME_TOP`. Both rails sit *outside* the content area, so they never collide with the cursor.
+
+## 5. Box height ≠ text height — but tight is better than loose
+
+PowerPoint draws shape bounds visibly when:
+
+- Two shapes overlap (selection halos in editor, faint anti-alias seam in presentation mode).
+- A shape with a fill or border crosses the rail.
+- Z-order conflicts cause one shape to clip another.
+
+So even when the *text* fits within the content area, an oversized *box* can intrude. Tighten box height to:
+
+```
+box_h = (n_lines * line_height_pt + 2 * pad_pt) / 72
+```
+
+where `pad_pt` is 2–4 pt (≈ 0.03–0.05"). For multi-line text frames, set `text_frame.word_wrap = True` and don't pad vertically — let the text frame's intrinsic metrics size itself.
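
As a worked instance of the formula, here is a small sketch. The helper names `box_height_in` and `to_emu` are illustrative and not part of the export script; the 914400 EMU-per-inch factor is the same one `extract_pptx.py` uses.

```python
EMU_PER_INCH = 914400


def box_height_in(n_lines, line_height_pt, pad_pt=3.0):
    """Box height in inches: text lines plus top and bottom padding, 72 pt per inch."""
    return (n_lines * line_height_pt + 2 * pad_pt) / 72


def to_emu(inches):
    """Convert inches to whole EMU, the integer unit python-pptx stores."""
    return round(inches * EMU_PER_INCH)


# A two-line headline set at 40pt with a 1.1 line-height is 44pt per line:
# (2 * 44 + 2 * 3) / 72 = 94 / 72 inches.
headline_h = box_height_in(2, 44, pad_pt=3)
headline_h_emu = to_emu(headline_h)
```

Pass the EMU value as the block height to `Cursor.take()` so the cursor advances by the tightened box rather than a generous guess.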
+
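+For headline blocks with a known line count, you can also set:
+
+```python
+from pptx.enum.text import MSO_AUTO_SIZE
+
+tf = shape.text_frame
+tf.auto_size = MSO_AUTO_SIZE.SHAPE_TO_FIT_TEXT
+```
+
+python-pptx only records this setting in the file; it does not measure text, so `shape.height` still returns the height you assigned. The resize is applied by PowerPoint, not at export time. Keep feeding the cursor the estimated `box_h` from the formula above, and treat auto-size as a safety net for the rendered deck.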
+
+## 6. Italic preservation — only EN serif, never CJK
+
+The single most common silent regression. HTML `<em>`, `<i>`, and inline `font-style: italic` should all map to `run.font.italic = True`. But:
+
+- **EN/Latin display copy** (Playfair Display, Source Serif) has a real italic. Use it.
+- **CJK display copy** (Noto Serif TC, Source Han Serif) has no italic. Synthesizing one produces a mechanically slanted faux-oblique that looks broken. Skip italic for CJK runs even if the HTML wraps the CJK text in `<em>`.
+- **EN body copy** can use sans italic if the body family supports it; if not, swap to serif italic for the duration of the run.
+
+```python
+def add_run(p, text, *, font, size_pt, italic=False, bold=False, color=None):
+    r = p.add_run()
+    r.text = text
+    # If italic is requested, force an EN serif that supports it.
+    if italic:
+        r.font.name = FONT_SERIF_EN if not _is_cjk(text) else font
+        r.font.italic = not _is_cjk(text)
+    else:
+        r.font.name = font
+        r.font.italic = False
+    r.font.size = Pt(size_pt)
+    r.font.bold = bool(bold)
+    if color is not None:
+        r.font.color.rgb = color
+    return r
+
+def _is_cjk(s):
+    return any('\u4e00' <= c <= '\u9fff' or '\u3040' <= c <= '\u30ff' for c in s)
+```
+
+When walking HTML, detect italic spans:
+
+```python
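+from html.parser import HTMLParser
+
+class ItalicSpans(HTMLParser):
+    def __init__(self):
+        super().__init__()
+        self.italic_depth = 0
+        self.runs = []          # list of (text, italic_bool)
+        self._buf = []
+        self._span_stack = []   # one bool per open <span>: did it switch italic on?
+
+    def handle_starttag(self, tag, attrs):
+        if tag in ("em", "i"):
+            self._enter_italic()
+        elif tag == "span":
+            # A valueless attribute yields None, so fall back to "".
+            style = dict(attrs).get("style") or ""
+            is_italic = "italic" in style
+            self._span_stack.append(is_italic)
+            if is_italic:
+                self._enter_italic()
+
+    def handle_endtag(self, tag):
+        if tag in ("em", "i"):
+            self._leave_italic()
+        elif tag == "span" and self._span_stack:
+            # Only an italic <span> closes an italic level; plain spans don't.
+            if self._span_stack.pop():
+                self._leave_italic()
+
+    def handle_data(self, data):
+        self._buf.append(data)
+
+    def close(self):
+        # Call after feed() so trailing text is emitted as a final run.
+        super().close()
+        self._flush()
+
+    def _enter_italic(self):
+        self._flush()
+        self.italic_depth += 1
+
+    def _leave_italic(self):
+        if self.italic_depth > 0:
+            self._flush()
+            self.italic_depth -= 1
+
+    def _flush(self):
+        if self._buf:
+            self.runs.append(("".join(self._buf), self.italic_depth > 0))
+            self._buf = []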
+```
+
+## 7. Slide-type recipes
+
+### 7.1 Cover / hero with vertical center
+
+```python
+def slide_cover(prs, *, title, subtitle, lead, meta, chrome_l, chrome_r):
+    slide = prs.slides.add_slide(blank_layout)
+    paint_bg(slide, COLOR_INK)
+    add_chrome(slide, chrome_l, chrome_r, theme="dark")
+
+    blocks = [
+        (Inches(0.18), Inches(0.32)),   # kicker
+        (Inches(1.50), Inches(0.18)),   # h-hero
+        (Inches(0.45), Inches(0.36)),   # h-sub
+        (Inches(0.70), Inches(0.30)),   # lead
+        (Inches(0.20), Inches(0.00)),   # meta
+    ]
+    c = hero_layout(blocks)
+    add_kicker(slide, top=c.take(*blocks[0]), text="SOP · Coach Edition")
+    add_h_hero(slide, top=c.take(*blocks[1]), text=title)
+    add_h_sub(slide,  top=c.take(*blocks[2]), text=subtitle)
+    add_lead(slide,   top=c.take(*blocks[3]), text=lead)
+    add_meta_row(slide, top=c.take(*blocks[4]), items=meta)
+
+    add_footer(slide, "主責教練 SOP", "— 2026 —", theme="dark")
+```
+
+### 7.2 Content with pipeline (4–5 step horizontal flow)
+
+```python
+def slide_pipeline(prs, *, kicker, headline, intro, label, steps):
+    slide = prs.slides.add_slide(blank_layout)
+    paint_bg(slide, COLOR_PAPER)
+    add_chrome(slide, "On-Day · Coach Actions", "08 / 14", theme="light")
+
+    c = Cursor()
+    add_kicker(slide, top=c.take(Inches(0.18), label="kicker"), text=kicker)
+    add_h_xl(slide,   top=c.take(Inches(0.95), label="h-xl"), text=headline)
+    add_lead(slide,   top=c.take(Inches(0.65), label="lead"), text=intro)
+    add_pipeline(slide,
+        top=c.take(Inches(2.30), label="pipeline"),
+        section_label=label,
+        steps=steps,
+        n_cols=len(steps))
+
+    add_footer(slide, "Page 08 · 教練當天行動", "Witness, don't intervene", theme="light")
+```
+
+`add_pipeline` internally lays out N step cards across `CONTENT_W` with `step_h` derived from the longest step's text height. Don't fix `step_h` to a constant — let it grow to fit, and let the cursor's overflow guard catch problems.
+
+### 7.3 Two-column comparison / concern cards
+
+```python
+def slide_two_col(prs, *, kicker, headline, intro, left, right):
+    slide = prs.slides.add_slide(blank_layout)
+    paint_bg(slide, COLOR_INK)
+    add_chrome(slide, "First-Time Caveats · 首辦提醒", "05 / 14", theme="dark")
+
+    c = Cursor()
+    add_kicker(slide,  top=c.take(Inches(0.18)), text=kicker)
+    add_h_xl(slide,    top=c.take(Inches(0.95)), text=headline)
+    add_lead(slide,    top=c.take(Inches(0.55)), text=intro)
+    pair_top = c.take(Inches(3.00), label="pair")
+    col_w = (CONTENT_W - Inches(0.4)) // 2   # integer EMU
+    add_concern_card(slide, left=CONTENT_LEFT,            top=pair_top, w=col_w, h=Inches(2.9), data=left)
+    add_concern_card(slide, left=CONTENT_LEFT + col_w + Inches(0.4), top=pair_top, w=col_w, h=Inches(2.9), data=right)
+
+    add_footer(slide, "Page 05 · 首次辦理特別提醒", "典禮 ≠ 領導日", theme="dark")
+```
+
+Notice the pattern: `c.take(Inches(3.00), label="pair")` reserves 3.0" of vertical space for *the whole pair row*; then the two columns are placed side-by-side at that `top`. The cursor doesn't know about columns, only about row heights.
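
The column arithmetic in this recipe, and in the 3 × 2 grid below, follows one rule: subtract the gutters from the content width, divide by the column count, and step each column by width plus gutter. A small sketch in plain inches, where `column_lefts` is a hypothetical helper rather than something the export script defines:

```python
def column_lefts(content_left, content_w, n_cols, gutter):
    """Return (col_w, lefts) for n_cols equal columns separated by gutter.

    All values share one unit (inches here); the last column's right edge
    lands on content_left + content_w.
    """
    col_w = (content_w - gutter * (n_cols - 1)) / n_cols
    lefts = [content_left + i * (col_w + gutter) for i in range(n_cols)]
    return col_w, lefts


# Two columns across the 12.133" content width with a 0.4" gutter.
col_w, lefts = column_lefts(0.6, 12.133, 2, 0.4)
right_edge = lefts[-1] + col_w  # equals CONTENT_RIGHT (12.733") up to float error
```

When working in EMU instead of inches, use floor division for `col_w` so every value passed to python-pptx stays an integer.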
+
+### 7.4 Observation grid (3 × 2 cards)
+
+```python
+def slide_obs_grid(prs, *, kicker, headline, intro, cards):
+    assert len(cards) == 6
+    slide = prs.slides.add_slide(blank_layout)
+    paint_bg(slide, COLOR_PAPER)
+    add_chrome(slide, "Observation · 觀察筆記", "09 / 14", theme="light")
+
+    c = Cursor()
+    add_kicker(slide, top=c.take(Inches(0.18)), text=kicker)
+    add_h_xl(slide,   top=c.take(Inches(0.95)), text=headline)
+    add_lead(slide,   top=c.take(Inches(0.55)), text=intro)
+    grid_top = c.take(Inches(2.40), label="3x2 grid")
+
+    col_w = (CONTENT_W - Inches(0.6)) // 3   # integer EMU
+    row_h = Inches(1.10)
+    for i, card in enumerate(cards):
+        col = i % 3
+        row = i // 3
+        x = CONTENT_LEFT + col * (col_w + Inches(0.3))
+        y = grid_top + row * (row_h + Inches(0.20))
+        add_obs_card(slide, left=x, top=y, w=col_w, h=row_h, data=card)
+
+    add_footer(slide, "Page 09 · 觀察筆記六項指標", "記錄用 · 不當場評分", theme="light")
+```
+
+## 8. Common pitfalls and how the discipline catches them
+
+| Pitfall | How the discipline catches it |
+|---|---|
+| Hero slide stuck to top | `hero_layout(blocks)` budgets total height and centers automatically |
+| Last content block crosses footer | `Cursor.take()` raises `OverflowError` before render |
+| Box bounds intrude on rail | tighten `box_h` to text height + 0.05" pad; verifier flags violations |
+| Italic gone flat | `add_run(..., italic=True)` swaps to EN serif; CJK skipped |
+| Footer text overlaps content | footer pinned at `FOOTER_TOP`, never routed through cursor |
+| Chrome row drifts down on long titles | chrome pinned at `CHROME_TOP`, never advanced |
+| Off-canvas content | `verify_layout.py` asserts `top + height ≤ CANVAS_H` |
+| Mixed font fallback | always pass `font=FONT_*` constant; never let python-pptx pick |
@@ -0,0 +1,2 @@
+__pycache__/
+*.pyc
@@ -0,0 +1,134 @@
+#!/usr/bin/env python3
+"""
+Extract every shape on every slide of a .pptx into a JSON dump.
+
+Usage:
+    python extract_pptx.py <path/to/deck.pptx>            # prints to stdout
+    python extract_pptx.py <path/to/deck.pptx> -o dump.json
+
+The dump captures the *actual* state of the export — text content, position,
+size, and per-run typography (font name, size, bold, italic, color). Use this
+as the ground truth for the fidelity audit; do not trust the export script's
+intent.
+
+Coordinates are reported in inches (rounded to 3 decimals) so they're
+human-readable when comparing against rails like CONTENT_MAX_Y = 6.70".
+"""
+from __future__ import annotations
+
+import argparse
+import json
+import sys
+from pathlib import Path
+
+try:
+    from pptx import Presentation
+    from pptx.util import Emu
+except ImportError:
+    sys.stderr.write(
+        "python-pptx is required. Install with: pip install python-pptx\n"
+    )
+    sys.exit(2)
+
+
+def emu_to_in(emu: int | None) -> float | None:
+    if emu is None:
+        return None
+    return round(emu / 914400, 3)
+
+
+def color_repr(color) -> str | None:
+    """Best-effort color extraction. Returns hex string or None."""
+    if color is None:
+        return None
+    try:
+        # ColorFormat.type may be None when no explicit color is set.
+        if color.type is None:
+            return None
+        rgb = color.rgb
+        if rgb is None:
+            return None
+        return f"#{str(rgb).lower()}"
+    except (AttributeError, ValueError, TypeError):
+        return None
+
+
+def extract_runs(text_frame) -> list[dict]:
+    runs = []
+    for para in text_frame.paragraphs:
+        for run in para.runs:
+            font = run.font
+            runs.append({
+                "text": run.text,
+                "font": font.name,
+                "size_pt": float(font.size.pt) if font.size is not None else None,
+                "bold": bool(font.bold) if font.bold is not None else None,
+                "italic": bool(font.italic) if font.italic is not None else None,
+                # Color is independent of font name/size: a run can inherit
+                # font from the theme yet set its own color. Color drift is
+                # one of the things this audit needs to catch, so don't gate
+                # the extraction on unrelated font attributes.
+                "color": color_repr(font.color),
+            })
+    return runs
+
+
+def extract_shape(shape) -> dict:
+    data = {
+        "name": shape.name,
+        "shape_type": str(shape.shape_type) if shape.shape_type is not None else None,
+        "left_in": emu_to_in(shape.left),
+        "top_in": emu_to_in(shape.top),
+        "width_in": emu_to_in(shape.width),
+        "height_in": emu_to_in(shape.height),
+    }
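+    # Guard each pair separately: width can be None even when left is set.
+    if shape.top is not None and shape.height is not None:
+        data["bottom_in"] = emu_to_in(shape.top + shape.height)
+    if shape.left is not None and shape.width is not None:
+        data["right_in"] = emu_to_in(shape.left + shape.width)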
+    if shape.has_text_frame:
+        tf = shape.text_frame
+        data["text"] = tf.text
+        data["runs"] = extract_runs(tf)
+    return data
+
+
+def extract_pptx(path: Path) -> dict:
+    prs = Presentation(str(path))
+    canvas = {
+        "width_in": emu_to_in(prs.slide_width),
+        "height_in": emu_to_in(prs.slide_height),
+    }
+    slides = []
+    for i, slide in enumerate(prs.slides, 1):
+        shapes = [extract_shape(s) for s in slide.shapes]
+        slides.append({"index": i, "shapes": shapes})
+    return {
+        "source": str(path),
+        "canvas": canvas,
+        "slide_count": len(slides),
+        "slides": slides,
+    }
+
+
+def main() -> int:
+    ap = argparse.ArgumentParser(description=__doc__.split("\n\n")[0])
+    ap.add_argument("path", type=Path, help=".pptx file to extract")
+    ap.add_argument("-o", "--output", type=Path, help="write JSON to this path; default stdout")
+    args = ap.parse_args()
+
+    if not args.path.exists():
+        ap.error(f"file not found: {args.path}")
+
+    data = extract_pptx(args.path)
+    payload = json.dumps(data, ensure_ascii=False, indent=2)
+    if args.output:
+        args.output.write_text(payload, encoding="utf-8")
+        sys.stderr.write(f"wrote {args.output} ({len(payload)} bytes, {data['slide_count']} slides)\n")
+    else:
+        sys.stdout.write(payload)
+        sys.stdout.write("\n")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
@@ -0,0 +1,144 @@
+#!/usr/bin/env python3
+"""
+Verify a re-exported .pptx against footer-rail + canvas-bound invariants.
+
+Usage:
+    python verify_layout.py <path/to/deck.pptx>
+    python verify_layout.py <path/to/deck.pptx> --content-max-y 6.70 --canvas-h 7.5
+
+Exits 0 on no violations, 1 on any violation. Prints a single block of
+violations sorted by slide index, one per line:
+
+    slide 5  shape 'desc-row-B-1'  bottom 7.214" crosses footer rail 6.70"
+    slide 11 shape 'note-paragraph' bottom 7.642" exceeds canvas 7.50"
+
+Use this as the gate for "this re-export is shippable". Don't claim the audit
+is fixed without running this script — the human eye misses 1–2 mm overflow
+at zoom-out, the script doesn't.
+
+Footer / chrome shapes are exempt from the content rail. Two heuristics
+identify them, in this order:
+
+1. **By name** — any shape whose name contains "footer", "foot", "chrome",
+   "page", or "pagination" (case-insensitive). Use semantic names in your
+   export script if you can.
+2. **By position** — any shape whose `top` is at or below the footer-zone
+   threshold (default `--footer-zone-top 6.80`). This catches python-pptx's
+   auto-generated names like "TextBox 3" when the export script didn't name
+   them. The threshold sits ~0.05" above FOOTER_TOP so footer rows pinned
+   exactly at FOOTER_TOP are still recognized.
+"""
+from __future__ import annotations
+
+import argparse
+import sys
+from pathlib import Path
+
+try:
+    from pptx import Presentation
+except ImportError:
+    sys.stderr.write(
+        "python-pptx is required. Install with: pip install python-pptx\n"
+    )
+    sys.exit(2)
+
+
+FOOTER_NAME_HINTS = ("footer", "foot", "chrome", "page", "pagination")
+EPS_IN = 0.005   # ignore sub-pixel overflows (~0.13mm)
+
+
+def is_footer_by_name(name: str) -> bool:
+    n = (name or "").lower()
+    return any(hint in n for hint in FOOTER_NAME_HINTS)
+
+
+def emu_to_in(emu: int | None) -> float:
+    return (emu or 0) / 914400
+
+
+def verify(path: Path, content_max_y: float, canvas_w: float, canvas_h: float,
+           footer_zone_top: float) -> list[str]:
+    prs = Presentation(str(path))
+    violations: list[str] = []
+
+    actual_w = emu_to_in(prs.slide_width)
+    actual_h = emu_to_in(prs.slide_height)
+    if abs(actual_w - canvas_w) > EPS_IN or abs(actual_h - canvas_h) > EPS_IN:
+        violations.append(
+            f"canvas mismatch: file is {actual_w:.3f}\" x {actual_h:.3f}\", "
+            f"expected {canvas_w}\" x {canvas_h}\""
+        )
+
+    for i, slide in enumerate(prs.slides, 1):
+        for shape in slide.shapes:
+            if shape.top is None or shape.height is None:
+                continue
+            top = emu_to_in(shape.top)
+            left = emu_to_in(shape.left)
+            bottom = top + emu_to_in(shape.height)
+            right = left + emu_to_in(shape.width)
+            name = shape.name or "<unnamed>"
+
+            # Off-canvas (hard fail for any shape).
+            if bottom > canvas_h + EPS_IN:
+                violations.append(
+                    f"slide {i:<2} shape '{name}' bottom {bottom:.3f}\" "
+                    f"exceeds canvas {canvas_h}\""
+                )
+            if right > canvas_w + EPS_IN:
+                violations.append(
+                    f"slide {i:<2} shape '{name}' right {right:.3f}\" "
+                    f"exceeds canvas width {canvas_w}\""
+                )
+            if top < -EPS_IN:
+                violations.append(
+                    f"slide {i:<2} shape '{name}' top {top:.3f}\" is negative"
+                )
+            if left < -EPS_IN:
+                violations.append(
+                    f"slide {i:<2} shape '{name}' left {left:.3f}\" is negative"
+                )
+
+            # Footer rail (only enforced on content shapes).
+            # Shape is exempt if (a) named like a footer, or
+            # (b) pinned at-or-below the footer zone threshold.
+            if is_footer_by_name(name) or top >= footer_zone_top - EPS_IN:
+                continue
+            if bottom > content_max_y + EPS_IN:
+                violations.append(
+                    f"slide {i:<2} shape '{name}' bottom {bottom:.3f}\" "
+                    f"crosses footer rail {content_max_y}\""
+                )
+
+    return violations
+
+
+def main() -> int:
+    ap = argparse.ArgumentParser(description=__doc__.split("\n\n")[0])
+    ap.add_argument("path", type=Path, help=".pptx file to verify")
+    ap.add_argument("--content-max-y", type=float, default=6.70,
+                    help="content rail in inches; nothing in content area may cross (default 6.70)")
+    ap.add_argument("--canvas-w", type=float, default=13.333,
+                    help="expected canvas width in inches (default 13.333 = 16:9)")
+    ap.add_argument("--canvas-h", type=float, default=7.5,
+                    help="expected canvas height in inches (default 7.5 = 16:9)")
+    ap.add_argument("--footer-zone-top", type=float, default=6.80,
+                    help="any shape with top >= this is treated as footer/chrome "
+                         "(default 6.80; sits 0.05\" above the typical FOOTER_TOP=6.85\")")
+    args = ap.parse_args()
+
+    if not args.path.exists():
+        ap.error(f"file not found: {args.path}")
+
+    violations = verify(args.path, args.content_max_y, args.canvas_w, args.canvas_h,
+                        args.footer_zone_top)
+    if violations:
+        sys.stderr.write("\n".join(violations) + "\n")
+        sys.stderr.write(f"\n{len(violations)} violation(s) found in {args.path}\n")
+        return 1
+    sys.stderr.write(f"OK: 0 violations across all slides in {args.path}\n")
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())