AI PPT Generation Engine Design
Lingque (灵阙) Teaching and Research Team
Updated 2026-02-28
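An engineering architecture for the template system, layout algorithms, content-aware design, image placement, and the export pipeline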
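1. The Core Challenge of PPT Generation
PPT generation is not a matter of "putting text on slides". It is a constrained layout optimization problem: within a limited canvas, text, images, charts, and other elements must be arranged into a layout that is visually harmonious and communicates clearly.
The problem is hard because it spans three domains at once:
- Content understanding: extracting structured information (titles, key points, data) from the input text
- Visual design: mapping that information to visual elements (typography, color, hierarchy)
- Format engineering: emitting the design in standard file formats (PPTX, PDF, images)
System architecture overview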
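User input (text / outline / file)
|
v
[Content parser]
|
v
[Slide planner] -- decides page count, the type of each page, and content allocation
|
v
[Layout engine] -- picks the best layout from template + content
|
v
[Image generation] -- AI-generated illustrations / chart rendering
|
v
[Style engine] -- applies color scheme, fonts, spacing
|
v
[Render and export] -- PPTX / PDF / PNG
|
v
Final file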
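The stages in the diagram can be read as a chain of asynchronous transformations. The following is a minimal sketch under that assumption. `Stage`, `pipe`, `parseContent`, and `planSlides` are hypothetical names used only for illustration, and the two stand-in stages cover just the first two boxes of the diagram.

```typescript
// Hypothetical sketch: each stage is an async function from one intermediate
// representation to the next. These names are not the engine's real API.
type Stage<I, O> = (input: I) => Promise<O>;

// Compose two stages into one; repeated composition builds the full pipeline.
function pipe<A, B, C>(first: Stage<A, B>, second: Stage<B, C>): Stage<A, C> {
  return async (input: A) => second(await first(input));
}

interface Outline {
  pages: { title: string; bullets: string[] }[];
}

interface PlannedDeck {
  slides: { title: string; bullets: string[]; layout: string }[];
}

// Stand-in for the content parser: one page per non-empty input line.
const parseContent: Stage<string, Outline> = async (text) => ({
  pages: text
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => ({ title: line.trim(), bullets: [] })),
});

// Stand-in for the slide planner: the first page becomes the title slide.
const planSlides: Stage<Outline, PlannedDeck> = async (outline) => ({
  slides: outline.pages.map((page, index) => ({
    ...page,
    layout: index === 0 ? 'title' : 'content',
  })),
});

const parseAndPlan = pipe(parseContent, planSlides);
```

Keeping every stage behind the same signature makes it straightforward to test stages in isolation or to swap one implementation for another.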
2. Template System Design
2.1 Template Schema
A template is not a fixed PPTX file but a set of declarative layout rules:
// types/template.ts
interface PPTTemplate {
  id: string;
  name: string;
  description: string;
  category: 'business' | 'education' | 'creative' | 'minimal';

  // Visual identity
  colorScheme: ColorScheme;
  typography: TypographyConfig;

  // Layout rules
  layouts: SlideLayout[];

  // Style generation prompt (for AI image generation)
  stylePrompt: string;
  keywords: string[];

  // Dimensions
  width: number;   // pixels (default: 1920)
  height: number;  // pixels (default: 1080)
}

interface ColorScheme {
  primary: string;     // Main brand color
  secondary: string;   // Accent color
  background: string;  // Slide background
  text: string;        // Body text
  heading: string;     // Heading text
  accent: string;      // Highlights, links
  gradient?: {
    from: string;
    to: string;
    angle: number;
  };
}

interface TypographyConfig {
  headingFont: string;
  bodyFont: string;
  headingSize: number;    // px
  bodySize: number;       // px
  lineHeight: number;     // multiplier
  headingWeight: number;  // 400-900
}

interface SlideLayout {
  type: 'title' | 'content' | 'two-column' | 'image-left'
    | 'image-right' | 'full-image' | 'comparison' | 'data'
    | 'quote' | 'closing';
  zones: LayoutZone[];
  padding: { top: number; right: number; bottom: number; left: number };
}

interface LayoutZone {
  id: string;
  role: 'title' | 'subtitle' | 'body' | 'image' | 'chart' | 'icon';
  bounds: { x: number; y: number; width: number; height: number };  // 0-1 normalized
  style?: Record<string, string>;
  optional?: boolean;
}
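Because zone bounds are normalized to the 0-1 range, a malformed template is easy to catch before any rendering happens. The sketch below assumes a hypothetical `validateZones` helper and uses trimmed-down local copies of the zone types so that it stands alone. The sample `image-left` zones are invented for illustration.

```typescript
// Minimal local copies of the zone types so the sketch is self-contained.
interface Bounds { x: number; y: number; width: number; height: number }
interface Zone { id: string; role: string; bounds: Bounds; optional?: boolean }

// Hypothetical helper: returns human-readable problems found in a layout's zones.
function validateZones(zones: Zone[]): string[] {
  const problems: string[] = [];
  const seen = new Set<string>();
  // A small epsilon absorbs floating-point error in sums such as 0.25 + 0.75.
  const eps = 1e-9;

  for (const zone of zones) {
    const { x, y, width, height } = zone.bounds;
    if (seen.has(zone.id)) problems.push(`duplicate zone id: ${zone.id}`);
    seen.add(zone.id);
    if (width <= 0 || height <= 0) problems.push(`${zone.id}: empty bounds`);
    // Normalized bounds must stay inside the unit square.
    if (x < 0 || y < 0 || x + width > 1 + eps || y + height > 1 + eps) {
      problems.push(`${zone.id}: bounds leave the 0-1 range`);
    }
  }
  return problems;
}

// Illustrative 'image-left' layout: image on the left 45%, text on the right half.
const imageLeftZones: Zone[] = [
  { id: 'img', role: 'image', bounds: { x: 0, y: 0, width: 0.45, height: 1 } },
  { id: 'title', role: 'title', bounds: { x: 0.5, y: 0, width: 0.5, height: 0.2 } },
  { id: 'body', role: 'body', bounds: { x: 0.5, y: 0.25, width: 0.5, height: 0.75 } },
];
```

Running such a check when templates are loaded, rather than at render time, surfaces authoring mistakes once instead of on every generated deck.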
2.2 Template Resolution and Application
// template-engine.ts
class TemplateEngine {
  private templates: Map<string, PPTTemplate>;

  constructor(templates: PPTTemplate[]) {
    this.templates = new Map(
      templates.map((t): [string, PPTTemplate] => [t.id, t]),
    );
  }

  resolveTemplate(templateId: string): PPTTemplate {
    const template = this.templates.get(templateId);
    if (!template) {
      throw new Error(`Template not found: ${templateId}`);
    }
    return template;
  }

  selectLayout(
    template: PPTTemplate,
    slideContent: SlideContent,
  ): SlideLayout {
    /**
     * Content-aware layout selection.
     * Choose the best layout based on what content is available.
     * Rules are checked in priority order: title, chart, image, bullet count.
     */
    const hasImage = !!slideContent.image;
    const hasChart = !!slideContent.chartData;
    const bulletCount = slideContent.bullets?.length ?? 0;
    const isTitle = slideContent.slideType === 'title';

    if (isTitle) {
      return this.findLayout(template, 'title');
    }
    if (hasChart) {
      return this.findLayout(template, 'data');
    }
    if (hasImage && bulletCount > 0) {
      // Alternate image position for visual rhythm
      return this.findLayout(
        template,
        slideContent.index % 2 === 0 ? 'image-left' : 'image-right',
      );
    }
    if (hasImage && bulletCount === 0) {
      return this.findLayout(template, 'full-image');
    }
    if (bulletCount > 4) {
      return this.findLayout(template, 'two-column');
    }
    return this.findLayout(template, 'content');
  }

  private findLayout(template: PPTTemplate, type: SlideLayout['type']): SlideLayout {
    // Fall back to the generic 'content' layout when the template lacks this type
    const layout = template.layouts.find(l => l.type === type)
      ?? template.layouts.find(l => l.type === 'content');
    if (!layout) {
      // Fail loudly instead of returning undefined through a non-null assertion
      throw new Error(`Template ${template.id} has no '${type}' or 'content' layout`);
    }
    return layout;
  }

  applyColorScheme(
    baseColors: ColorScheme,
    overrides?: Partial<ColorScheme>,
  ): ColorScheme {
    /**
     * Apply color overrides, filtering out undefined values.
     * This prevents the { ...defaults, ...overrides } trap
     * where undefined clobbers defaults.
     */
    if (!overrides) return baseColors;
    const filtered = Object.fromEntries(
      Object.entries(overrides).filter(([_, v]) => v !== undefined),
    );
    return { ...baseColors, ...filtered };
  }
}
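3. Layout Algorithms
3.1 Constraint-Satisfaction Layout
Slide layout is essentially a constraint satisfaction problem (CSP): every element has position and size constraints, elements must not overlap, and the slide as a whole needs visual balance.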
// layout-solver.ts
interface LayoutConstraint {
  element: string;
  type: 'position' | 'size' | 'alignment' | 'spacing';
  value: unknown;
}

type Padding = { top: number; right: number; bottom: number; left: number };

class LayoutSolver {
  constructor(
    private readonly canvasWidth: number,
    private readonly canvasHeight: number,
    private readonly padding: Padding,
  ) {}

  solve(zones: LayoutZone[], content: SlideContent): ResolvedElement[] {
    const elements: ResolvedElement[] = [];
    const usableWidth = this.canvasWidth - this.padding.left - this.padding.right;
    const usableHeight = this.canvasHeight - this.padding.top - this.padding.bottom;

    for (const zone of zones) {
      // Skip optional zones with no content
      if (zone.optional && !this.hasContentForZone(zone, content)) {
        continue;
      }

      // Map normalized (0-1) bounds onto the padded canvas, in pixels
      const resolved: ResolvedElement = {
        id: zone.id,
        role: zone.role,
        x: this.padding.left + zone.bounds.x * usableWidth,
        y: this.padding.top + zone.bounds.y * usableHeight,
        width: zone.bounds.width * usableWidth,
        height: zone.bounds.height * usableHeight,
        content: this.getContentForZone(zone, content),
        style: zone.style ?? {},
      };

      // Auto-adjust text size to fit
      if (zone.role === 'body' || zone.role === 'title') {
        resolved.fontSize = this.calculateFontSize(
          resolved.content as string,
          resolved.width,
          resolved.height,
          zone.role === 'title' ? 48 : 24,
        );
      }
      elements.push(resolved);
    }
    return elements;
  }

  private calculateFontSize(
    text: string,
    maxWidth: number,
    maxHeight: number,
    idealSize: number,
  ): number {
    /**
     * Step down from the ideal size (2px at a time) until the text fits.
     * Width is approximate: 0.6em per character, 1em for full-width CJK.
     */
    const MIN_SIZE = 12;  // minimum readable size
    const lineWidthsEm = text.split('\n').map(line =>
      [...line].reduce(
        (sum, ch) => sum + (/[\u2e80-\u9fff\uff00-\uffef]/.test(ch) ? 1 : 0.6),
        0,
      ),
    );

    for (let fontSize = idealSize; fontSize > MIN_SIZE; fontSize -= 2) {
      const lineHeight = fontSize * 1.5;
      // An empty line still occupies one row
      const totalLines = lineWidthsEm.reduce(
        (sum, em) => sum + Math.max(1, Math.ceil((em * fontSize) / maxWidth)),
        0,
      );
      if (totalLines * lineHeight <= maxHeight) {
        return fontSize;
      }
    }
    return MIN_SIZE;
  }

  private hasContentForZone(zone: LayoutZone, content: SlideContent): boolean {
    switch (zone.role) {
      case 'image': return !!content.image;
      case 'chart': return !!content.chartData;
      case 'subtitle': return !!content.subtitle;
      default: return true;
    }
  }

  private getContentForZone(zone: LayoutZone, content: SlideContent): unknown {
    switch (zone.role) {
      case 'title': return content.title;
      case 'subtitle': return content.subtitle;
      case 'body': return content.bullets?.join('\n') ?? content.bodyText ?? '';
      case 'image': return content.image;
      case 'chart': return content.chartData;
      default: return '';
    }
  }
}
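The solver maps zones onto the canvas but relies on the template author to keep zones from colliding. One way to enforce the "elements must not overlap" constraint is a pairwise check on the resolved boxes. This is a minimal sketch, assuming axis-aligned rectangles in pixels. `Rect`, `rectsOverlap`, and `findOverlaps` are hypothetical names, not part of the solver.

```typescript
interface Rect { id: string; x: number; y: number; width: number; height: number }

// Two axis-aligned rectangles overlap when they intersect on both axes.
// Rectangles that merely share an edge are not counted as overlapping.
function rectsOverlap(a: Rect, b: Rect): boolean {
  return (
    a.x < b.x + b.width &&
    b.x < a.x + a.width &&
    a.y < b.y + b.height &&
    b.y < a.y + a.height
  );
}

// Returns every pair of element ids whose boxes overlap.
// O(n^2) comparisons, which is fine for the handful of elements on a slide.
function findOverlaps(elements: Rect[]): [string, string][] {
  const pairs: [string, string][] = [];
  for (let i = 0; i < elements.length; i++) {
    for (let j = i + 1; j < elements.length; j++) {
      const a = elements[i];
      const b = elements[j];
      if (a && b && rectsOverlap(a, b)) pairs.push([a.id, b.id]);
    }
  }
  return pairs;
}
```

A non-empty result could either reject the layout or trigger a fallback to a simpler one. Which policy fits depends on how much the pipeline trusts its templates.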
4. Color Scheme Generation
4.1 Topic-Based Automatic Color Schemes
// color-generator.ts
interface ColorPalette {
  primary: string;
  secondary: string;
  accent: string;
  background: string;
  text: string;
  heading: string;
}

function generateColorScheme(
  topic: string,
  mood: 'professional' | 'creative' | 'warm' | 'cool' | 'dark',
): ColorPalette {
  /**
   * Pick a predefined palette for the requested mood.
   * Note: `topic` is currently unused; selection is by mood only.
   */
  const palettes: Record<string, ColorPalette> = {
    professional: {
      primary: '#1a365d',
      secondary: '#2b6cb0',
      accent: '#3182ce',
      background: '#ffffff',
      text: '#2d3748',
      heading: '#1a202c',
    },
    creative: {
      primary: '#6b21a8',
      secondary: '#a855f7',
      accent: '#f59e0b',
      background: '#faf5ff',
      text: '#374151',
      heading: '#1f2937',
    },
    warm: {
      primary: '#c2410c',
      secondary: '#ea580c',
      accent: '#f97316',
      background: '#fffbeb',
      text: '#451a03',
      heading: '#7c2d12',
    },
    cool: {
      primary: '#0e7490',
      secondary: '#06b6d4',
      accent: '#22d3ee',
      background: '#ecfeff',
      text: '#164e63',
      heading: '#155e75',
    },
    dark: {
      primary: '#f8fafc',
      secondary: '#94a3b8',
      accent: '#3b82f6',
      background: '#0f172a',
      text: '#cbd5e1',
      heading: '#f1f5f9',
    },
  };
  // Fall back to the professional palette for an unknown mood
  return palettes[mood] ?? palettes.professional;
}
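function ensureContrast(
  foreground: string,
  background: string,
  minRatio: number = 4.5,
): string {
  /**
   * WCAG contrast check.
   * If contrast is insufficient, adjust the foreground color.
   * calculateContrastRatio, relativeLuminance, darken and lighten are
   * helpers defined elsewhere.
   */
  let adjusted = foreground;
  let ratio = calculateContrastRatio(adjusted, background);

  // Darken on light backgrounds, lighten on dark ones
  const adjust = relativeLuminance(background) > 0.5 ? darken : lighten;

  // A single adjustment may not reach minRatio, so re-check after each step.
  // The attempt cap guards against looping forever on an unreachable target.
  for (let attempt = 0; attempt < 8 && ratio < minRatio; attempt++) {
    adjusted = adjust(adjusted, (minRatio - ratio) * 10);
    ratio = calculateContrastRatio(adjusted, background);
  }
  return adjusted;
}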
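`ensureContrast` depends on `calculateContrastRatio` and `relativeLuminance`, which are not shown. A possible implementation, following the WCAG 2.x definitions of relative luminance and contrast ratio, is sketched below. `parseHex` and `linearize` are hypothetical helper names, and only 6-digit hex colors are handled.

```typescript
// Parse '#rrggbb' into [r, g, b] in 0-255. Only the 6-digit form is handled.
function parseHex(hex: string): [number, number, number] {
  const digits = /^#?([0-9a-f]{6})$/i.exec(hex.trim())?.[1];
  if (digits === undefined) throw new Error(`Unsupported color: ${hex}`);
  const value = parseInt(digits, 16);
  return [(value >> 16) & 0xff, (value >> 8) & 0xff, value & 0xff];
}

// Convert one 8-bit sRGB channel to its linear-light value (WCAG 2.x formula).
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance: 0 for black, 1 for white.
function relativeLuminance(hex: string): number {
  const [r, g, b] = parseHex(hex);
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio (L_lighter + 0.05) / (L_darker + 0.05), ranging from 1 to 21.
function calculateContrastRatio(foreground: string, background: string): number {
  const l1 = relativeLuminance(foreground);
  const l2 = relativeLuminance(background);
  return (Math.max(l1, l2) + 0.05) / (Math.min(l1, l2) + 0.05);
}
```

With these in place, each palette's `text` and `heading` colors can be checked against its `background` at load time. The 4.5 default used by `ensureContrast` matches the WCAG AA threshold for normal-size text.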
5. Image Placement and AI Image Generation Integration
5.1 Content-Aware Image Placement
// image-placement.ts
interface ImagePlacement {
  x: number;
  y: number;
  width: number;
  height: number;
  objectFit: 'cover' | 'contain' | 'fill';
  mask?: 'none' | 'rounded' | 'circle' | 'blob';
}

function calculateImagePlacement(
  zoneBounds: { x: number; y: number; width: number; height: number },
  imageAspect: number,  // width / height
  layoutType: string,
): ImagePlacement {
  const zoneAspect = zoneBounds.width / zoneBounds.height;

  if (layoutType === 'full-image') {
    // Full bleed: cover the entire zone
    return {
      ...zoneBounds,
      objectFit: 'cover',
      mask: 'none',
    };
  }

  if (layoutType === 'image-left' || layoutType === 'image-right') {
    // Side image: contain within zone, centered on the axis with spare room
    if (imageAspect > zoneAspect) {
      // Image is wider than zone: fill the width, center vertically
      const height = zoneBounds.width / imageAspect;
      const yOffset = (zoneBounds.height - height) / 2;
      return {
        x: zoneBounds.x,
        y: zoneBounds.y + yOffset,
        width: zoneBounds.width,
        height,
        objectFit: 'contain',
        mask: 'rounded',
      };
    } else {
      // Image is taller than zone: fill the height, center horizontally
      const width = zoneBounds.height * imageAspect;
      const xOffset = (zoneBounds.width - width) / 2;
      return {
        x: zoneBounds.x + xOffset,
        y: zoneBounds.y,
        width,
        height: zoneBounds.height,
        objectFit: 'contain',
        mask: 'rounded',
      };
    }
  }

  // Default: contain with center alignment
  return {
    ...zoneBounds,
    objectFit: 'contain',
    mask: 'rounded',
  };
}
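To make the two side-image branches concrete, the sketch below isolates the same contain-fit arithmetic in a hypothetical `containFit` helper and applies it to an illustrative 800 x 900 px zone. The zone position and the image aspect ratios are made-up example values.

```typescript
interface Box { x: number; y: number; width: number; height: number }

// Largest box with the given aspect ratio (width / height) that fits inside
// `zone`, centered on whichever axis has spare room.
function containFit(zone: Box, imageAspect: number): Box {
  const zoneAspect = zone.width / zone.height;
  if (imageAspect > zoneAspect) {
    // Wider than the zone: fill the width, center vertically.
    const height = zone.width / imageAspect;
    return { x: zone.x, y: zone.y + (zone.height - height) / 2, width: zone.width, height };
  }
  // Taller than (or equal to) the zone: fill the height, center horizontally.
  const width = zone.height * imageAspect;
  return { x: zone.x + (zone.width - width) / 2, y: zone.y, width, height: zone.height };
}

const sideZone: Box = { x: 100, y: 90, width: 800, height: 900 };

// A 16:9 image is wider than the zone, so it becomes about 800 x 450,
// pushed down by roughly (900 - 450) / 2 = 225 px.
const landscape = containFit(sideZone, 16 / 9);

// A 2:3 portrait image is taller than the zone, so it becomes about 600 x 900,
// pushed right by roughly (800 - 600) / 2 = 100 px.
const portrait = containFit(sideZone, 2 / 3);
```

In both cases the resulting box already has the image's aspect ratio, so `objectFit: 'contain'` leaves no letterboxing inside it.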
5.2 AI Image Generation Integration
// slide-image-generator.ts
async function generateSlideImage(
  content: SlideContent,
  template: PPTTemplate,
  quality: '2k' | '4k' = '2k',
): Promise<string> {
  /**
   * Generate an image that fits the slide's visual context.
   * The prompt incorporates template style for consistency.
   */
  const sizeMap = {
    '2k': { width: 1920, height: 1080 },
    '4k': { width: 3840, height: 2160 },
  };
  const size = sizeMap[quality];
  const prompt = buildImagePrompt(content, template);

  // Try the primary provider; fall back to the secondary if it fails
  try {
    return await generateWithGoogle(prompt, size);
  } catch (primaryError) {
    // Log rather than silently swallow the primary failure
    console.warn('Primary image provider failed, falling back:', primaryError);
    return await generateWithPoe(prompt, size);
  }
}

function buildImagePrompt(
  content: SlideContent,
  template: PPTTemplate,
): string {
  /**
   * Construct image generation prompt that maintains
   * visual consistency with the template style.
   */
  const parts = [
    template.stylePrompt,
    `Subject: ${content.title}`,
    content.imageHint ? `Visual: ${content.imageHint}` : '',
    `Color palette: ${template.colorScheme.primary}, ${template.colorScheme.secondary}`,
    template.keywords.join(', '),
    'Professional quality, clean composition, no text overlay',
  ];
  // Drop empty parts so the prompt has no dangling separators
  return parts.filter(Boolean).join('. ');
}
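The primary/secondary fallback generalizes to an ordered list of providers. The sketch below is one way to write that, assuming each provider is just an async function from prompt to image URL. `ImageProvider`, `generateWithFallback`, and the two stand-in providers are hypothetical and call no real image API.

```typescript
type ImageProvider = (prompt: string) => Promise<string>;

// Try providers in order and return the first success. If all of them fail,
// throw an error listing every failure so no provider error is lost.
async function generateWithFallback(
  prompt: string,
  providers: ImageProvider[],
): Promise<string> {
  const failures: string[] = [];
  for (const provider of providers) {
    try {
      return await provider(prompt);
    } catch (error) {
      failures.push(error instanceof Error ? error.message : String(error));
    }
  }
  throw new Error(`All image providers failed: ${failures.join('; ')}`);
}

// Stand-in providers for illustration only.
const flakyPrimary: ImageProvider = async () => {
  throw new Error('primary: quota exceeded');
};
const stableSecondary: ImageProvider = async (prompt) =>
  `https://example.invalid/img?p=${encodeURIComponent(prompt)}`;
```

Collecting the failure messages matters in practice: when both providers fail, the error from the first one is often the more informative of the two.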
6. Export Pipeline
6.1 PPTX Generation (with python-pptx)
# pptx_exporter.py
from io import BytesIO

import requests
from pptx import Presentation
from pptx.dml.color import RGBColor
from pptx.enum.text import PP_ALIGN
from pptx.util import Emu, Pt

# Layout coordinates are pixels at 96 DPI:
# 914400 EMU per inch / 96 px per inch = 9525 EMU per pixel.
EMU_PER_PX = 914400 // 96


def px_to_emu(px: float) -> Emu:
    """Convert a pixel measurement (96 DPI) to EMU."""
    return Emu(int(px * EMU_PER_PX))


def px_to_pt(px: float) -> Pt:
    """Convert a pixel font size (96 DPI) to points (72 per inch)."""
    return Pt(px * 72 / 96)


def hex_to_rgb(color: str) -> RGBColor:
    """Convert '#rrggbb' to an RGBColor."""
    return RGBColor.from_string(color.lstrip('#'))


def element_box(element: dict) -> tuple:
    """Return (left, top, width, height) of a resolved element in EMU."""
    return tuple(px_to_emu(element[key]) for key in ('x', 'y', 'width', 'height'))


def export_pptx(
    slides_data: list[dict],
    template_config: dict,
    output_path: str,
) -> str:
    """Export resolved slides to PPTX file."""
    prs = Presentation()
    prs.slide_width = px_to_emu(template_config['width'])
    prs.slide_height = px_to_emu(template_config['height'])
    colors = template_config['colorScheme']

    for slide_data in slides_data:
        slide_layout = prs.slide_layouts[6]  # Blank layout
        slide = prs.slides.add_slide(slide_layout)

        # Set background
        fill = slide.background.fill
        fill.solid()
        fill.fore_color.rgb = hex_to_rgb(colors['background'])

        # Add elements
        for element in slide_data['elements']:
            if element['role'] in ('title', 'subtitle', 'body'):
                add_text_element(slide, element, colors, template_config)
            elif element['role'] == 'image':
                add_image_element(slide, element)

    prs.save(output_path)
    return output_path


def add_text_element(
    slide, element: dict, colors: dict, config: dict,
) -> None:
    """Add a text box to the slide."""
    text_box = slide.shapes.add_textbox(*element_box(element))
    text_frame = text_box.text_frame
    text_frame.word_wrap = True

    # Per-role defaults: (fallback size in px, color key, bold, font key).
    # Titles use the heading font; everything else uses the body font.
    role_styles = {
        'title': (48, 'heading', True, 'headingFont'),
        'subtitle': (24, 'text', False, 'bodyFont'),
        'body': (18, 'text', False, 'bodyFont'),
    }
    default_px, color_key, bold, font_key = role_styles[element['role']]
    # The layout solver works in pixels, so convert to points for PowerPoint
    font_size = px_to_pt(element.get('fontSize', default_px))

    # One paragraph per line of text
    text = str(element.get('content', ''))
    for i, line in enumerate(text.split('\n')):
        paragraph = text_frame.paragraphs[0] if i == 0 else text_frame.add_paragraph()
        paragraph.text = line
        paragraph.alignment = PP_ALIGN.LEFT
        paragraph.font.size = font_size
        paragraph.font.bold = bold
        paragraph.font.color.rgb = hex_to_rgb(colors[color_key])
        paragraph.font.name = config['typography'][font_key]


def add_image_element(slide, element: dict) -> None:
    """Add an image to the slide."""
    image_url = element.get('content')
    if not image_url:
        return

    # Download image; fail early on HTTP errors instead of embedding an error page
    response = requests.get(image_url, timeout=30)
    response.raise_for_status()
    image_stream = BytesIO(response.content)

    slide.shapes.add_picture(image_stream, *element_box(element))
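The factor `914400 / 96` that recurs in the exporter is the pixel-to-EMU conversion: PPTX measures lengths in English Metric Units, 914,400 to the inch, and the exporter treats layout pixels as 96 per inch. The arithmetic is worked through below. `pxToEmu` and `emuToInches` are hypothetical helper names, and the 96 DPI figure is the exporter's own assumption rather than a property of the format.

```typescript
const EMU_PER_INCH = 914400;
const PX_PER_INCH = 96; // the exporter's assumption
const EMU_PER_PX = EMU_PER_INCH / PX_PER_INCH; // 9525, an exact integer

function pxToEmu(px: number): number {
  // Round so that fractional pixel positions still give whole EMU values.
  return Math.round(px * EMU_PER_PX);
}

function emuToInches(emu: number): number {
  return emu / EMU_PER_INCH;
}

// A 1920 x 1080 px canvas:
const slideWidthEmu = pxToEmu(1920);  // 18288000 EMU = 20 in
const slideHeightEmu = pxToEmu(1080); // 10287000 EMU = 11.25 in
```

Because 9525 is an integer, whole-pixel coordinates convert without rounding error, so adjacent zones that touch in pixel space also touch exactly in the exported file.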
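6.2 Multi-Format Export
| Format | Tool | Use case | Quality |
|---|---|---|---|
| PPTX | python-pptx | Editable presentation | Native vector |
| PDF | LibreOffice headless | Non-editable distribution | High |
| PNG/JPG | Puppeteer / wkhtmltoimage | Social media thumbnails | Depends on resolution |
| HTML | Custom renderer | Web preview | Pixel-accurate |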
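For the PDF row, LibreOffice can convert a PPTX from the command line with `--headless --convert-to pdf`. The sketch below only builds the argument list and the expected output path, so it runs without LibreOffice installed. The helper names are hypothetical, and the binary name (usually `soffice`) and the directories are assumptions about the deployment environment.

```typescript
// Build the argument list for a headless LibreOffice PPTX -> PDF conversion.
function buildPdfConvertArgs(pptxPath: string, outDir: string): string[] {
  return ['--headless', '--convert-to', 'pdf', '--outdir', outDir, pptxPath];
}

// LibreOffice names the output after the input file, swapping the extension.
function expectedPdfPath(pptxPath: string, outDir: string): string {
  const fileName = pptxPath.split('/').pop() ?? pptxPath;
  const stem = fileName.replace(/\.pptx$/i, '');
  return `${outDir.replace(/\/$/, '')}/${stem}.pdf`;
}

const convertArgs = buildPdfConvertArgs('/tmp/decks/ai-trends.pptx', '/tmp/out');
const pdfPath = expectedPdfPath('/tmp/decks/ai-trends.pptx', '/tmp/out/');
```

The arguments can then be passed to a process launcher such as Node's `child_process.execFile('soffice', convertArgs)`. The rendering machine needs the deck's fonts installed, otherwise LibreOffice substitutes them and the PDF layout shifts.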
7. End-to-End Flow Example
Input: "Make me a PPT about 2026 AI trends, 10 pages, business style"
Step 1 - Content parsing:
  The LLM generates a 10-page outline (title + bullets for each page)
Step 2 - Template selection:
  Match the "business" category -> "corporate-blue" template
Step 3 - Layout planning:
  Page 1: title layout
  Pages 2-8: content / two-column / image-left (auto-selected)
  Page 9: data layout (with chart)
  Page 10: closing layout
Step 4 - Image generation:
  Generate AI images for the 5 pages that need illustrations (parallel, 2 at a time)
Step 5 - Style application:
  Apply the corporate-blue color scheme + typography
Step 6 - Render and export:
  Generate PPTX file -> Upload to R2 -> Return download URL
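Step 4 runs image generation in parallel but caps it at two requests at a time. A small worker-lane helper is one way to get that behavior. This is a sketch, not the engine's actual scheduler, and `mapWithConcurrency` is a hypothetical name.

```typescript
// Run `worker` over `items` with at most `limit` tasks in flight.
// Results are returned in input order regardless of completion order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;

  // Each lane repeatedly claims the next unprocessed index until none remain.
  // JavaScript is single-threaded, so `next++` cannot race between lanes.
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index] as T, index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

For the flow above, the call would look like `mapWithConcurrency(pagesNeedingImages, 2, generateOne)`, where both argument names are placeholders. If any worker rejects, the returned promise rejects as well, so per-page error handling belongs inside the worker.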
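8. Common Pitfalls and Lessons Learned
The undefined trap in template resolution
When the frontend sends { id: "template-id" } rather than the full template data, the backend must resolve the complete template through findTemplateById(). Merging with a spread directly leaves fields such as stylePrompt and colors as undefined, and the generated PPT loses all of its styling.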
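The spread behavior behind this pitfall is easy to reproduce in isolation: a key that is present with the value `undefined` still overwrites the default. The sketch below uses hypothetical `StyleFields` data and a hypothetical `dropUndefined` helper to contrast the naive merge with a filtered one.

```typescript
type StyleFields = { stylePrompt: string; primary: string };
type StylePatch = { stylePrompt?: string | undefined; primary?: string | undefined };

const defaults: StyleFields = {
  stylePrompt: 'flat corporate illustration',
  primary: '#1a365d',
};

// A patch where one key is present but undefined, which is what a partially
// populated request payload tends to look like after destructuring.
const overrides: StylePatch = { stylePrompt: undefined, primary: '#0e7490' };

// Naive merge: the explicit undefined overwrites the default.
const naive = { ...defaults, ...overrides };

// Safer merge: drop undefined entries before spreading.
function dropUndefined(patch: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(patch)) {
    if (value !== undefined) out[key] = value;
  }
  return out;
}

const safe = { ...defaults, ...dropUndefined(overrides) } as StyleFields;
```

Here `naive.stylePrompt` is `undefined` while `safe.stylePrompt` keeps the default, and both pick up the overridden `primary`.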
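Chinese font support
When a PPTX uses Chinese fonts:
- The corresponding font (for example Source Han Sans, 思源黑体) must be installed on the system
- Alternatively, embed the font, for example as a web font
- The same font may be registered under different names on different operating systems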
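Image resolution
Generated images must meet the resolution requirement of the final output:
- Standard mode (1920x1080): images must be at least 2K
- High-definition mode (3840x2160): images must be at least 4K
- Images below the requirement must be rejected and must not enter the rendering stage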
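The rejection rule can be enforced with a small gate in front of the renderer. The sketch below reuses the 2K and 4K pixel sizes from the image generator's size map. `ImageMeta`, `findUndersizedImages`, and `assertRenderable` are hypothetical names, and the gate assumes image dimensions are already known, for example from provider metadata.

```typescript
type Quality = '2k' | '4k';

const MIN_SIZE: Record<Quality, { width: number; height: number }> = {
  '2k': { width: 1920, height: 1080 },
  '4k': { width: 3840, height: 2160 },
};

interface ImageMeta { url: string; width: number; height: number }

// Returns the images that are too small for the requested output quality.
function findUndersizedImages(images: ImageMeta[], quality: Quality): ImageMeta[] {
  const min = MIN_SIZE[quality];
  return images.filter((img) => img.width < min.width || img.height < min.height);
}

// Gate to call before rendering: throws instead of letting a blurry image through.
function assertRenderable(images: ImageMeta[], quality: Quality): void {
  const undersized = findUndersizedImages(images, quality);
  if (undersized.length > 0) {
    const details = undersized
      .map((img) => `${img.url} (${img.width}x${img.height})`)
      .join(', ');
    throw new Error(`Images below ${quality} minimum: ${details}`);
  }
}
```

Throwing here keeps the failure close to its cause. The caller can then decide whether to regenerate the image at a higher quality or drop it and fall back to a text-only layout.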
Maurice | maurice_wen@proton.me