Presentation Automation: From Text to Slides
Lingque (灵阙) Teaching and Research Team
Updated 2026-02-28
Content parsing, slide structure planning, LLM-driven layout, image generation integration, and style consistency
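1. The Core Problem of Automation
Turning a piece of text into a presentation is, at heart, an "information restructuring" problem:
- Content restructuring: extract a hierarchy from continuous text (title -> key points -> details)
- Spatial restructuring: map linear information onto a two-dimensional canvas
- Visual restructuring: choose a fitting visual form for abstract information
This is not a simple format conversion. When a 2,000-character text becomes a 10-slide deck, each slide can carry only about 50-80 effective characters, so the whole deck holds roughly 500-800 characters, at most 40% of the source. Deciding what to show and what to hide matters more than layout.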
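The arithmetic behind that claim can be sketched as follows. This is only an illustration using the figures quoted above (2,000 source characters, 10 slides, 50-80 characters per slide); `slideBudget` is a hypothetical helper, not part of the article's pipeline.

```typescript
// Hypothetical helper: how much of the source text can a deck actually carry?
interface Budget {
  minChars: number;     // deck capacity at the low end of the per-slide range
  maxChars: number;     // deck capacity at the high end of the per-slide range
  maxRetention: number; // best-case fraction of the source that fits
}

function slideBudget(
  sourceChars: number,
  slides: number,
  perSlide: [number, number] = [50, 80], // range taken from the text above
): Budget {
  const minChars = slides * perSlide[0];
  const maxChars = slides * perSlide[1];
  return {
    minChars,
    maxChars,
    maxRetention: Math.min(1, maxChars / sourceChars),
  };
}

const budget = slideBudget(2000, 10);
console.log(budget); // { minChars: 500, maxChars: 800, maxRetention: 0.4 }
```

Even in the best case, 60% of the source has to be cut, which is why selection dominates layout.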
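Processing stages from text to slides
Raw text / outline / document
|
v
[Stage 1: Content parsing]
Extract topics, hierarchy, and key information
|
v
[Stage 2: Slide planning]
Decide page count, slide types, and content allocation
|
v
[Stage 3: LLM layout]
Choose the best layout and visual elements for each slide
|
v
[Stage 4: Image integration]
AI-generated illustrations / charts / icons
|
v
[Stage 5: Style unification]
Keep the whole deck visually consistent
|
v
Final presentation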
2. Stage 1: Content Parsing
2.1 LLM-Driven Structured Extraction
// content-parser.ts
// NOTE: `llm` is the article's placeholder for a chat-completion client;
// it is not defined in this excerpt.

// Shape taken from the "dataPoints" sample in the prompt below
interface DataPoint {
  label: string;
  value: number;
  unit?: string;
}

interface ParsedContent {
  title: string;
  subtitle?: string;
  audience: string;
  keyThemes: string[];
  sections: ContentSection[];
  suggestedPageCount: number;
}

interface ContentSection {
  heading: string;
  level: 1 | 2 | 3;
  points: string[];
  dataPoints?: DataPoint[];
  suggestedVisual?: 'chart' | 'image' | 'diagram' | 'icon' | 'none';
}

async function parseContent(
  rawText: string,
  targetPages: number = 10,
  language: string = 'zh',
): Promise<ParsedContent> {
  const prompt = `Analyze the following text and extract structured presentation content.
Target: ${targetPages} slides for a professional presentation.
Language: ${language}

Input text:
${rawText}

Output as JSON with this structure:
{
  "title": "main presentation title",
  "subtitle": "optional subtitle",
  "audience": "who is this for",
  "keyThemes": ["theme1", "theme2"],
  "sections": [
    {
      "heading": "section title",
      "level": 1,
      "points": ["bullet point 1", "bullet point 2"],
      "dataPoints": [{"label": "...", "value": 42, "unit": "%"}],
      "suggestedVisual": "chart"
    }
  ],
  "suggestedPageCount": ${targetPages}
}

Rules:
- Each bullet point should be concise (under 15 words)
- Identify data that could be visualized as charts
- Suggest visual types for each section
- Maintain the logical flow of the original text`;

  const response = await llm.chat({
    model: 'gemini-2.5-flash',
    messages: [{ role: 'user', content: prompt }],
    responseFormat: { type: 'json_object' },
    temperature: 0.2,
  });

  // The cast is compile-time only; the JSON is not validated at runtime here
  return JSON.parse(response.content) as ParsedContent;
}
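Because `JSON.parse(...) as ParsedContent` only asserts a type at compile time, a runtime check is worth considering before the result enters later stages. The following is a minimal sketch, assuming the field names from the interfaces above; `isParsedContent` and `parseLlmJson` are hypothetical helpers, and the types are repeated in reduced form so the block stands alone.

```typescript
// Reduced copies of the article's types, so this sketch is self-contained.
interface SectionLike { heading: string; level: 1 | 2 | 3; points: string[] }
interface ParsedLike { title: string; sections: SectionLike[] }

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === 'string');
}

function isSection(v: unknown): v is SectionLike {
  if (typeof v !== 'object' || v === null) return false;
  const s = v as Record<string, unknown>;
  return (
    typeof s.heading === 'string' &&
    (s.level === 1 || s.level === 2 || s.level === 3) &&
    isStringArray(s.points)
  );
}

function isParsedContent(v: unknown): v is ParsedLike {
  if (typeof v !== 'object' || v === null) return false;
  const p = v as Record<string, unknown>;
  return (
    typeof p.title === 'string' &&
    Array.isArray(p.sections) &&
    p.sections.every(isSection)
  );
}

// Parse raw model output; return null instead of throwing on bad input.
function parseLlmJson(raw: string): ParsedLike | null {
  try {
    const data: unknown = JSON.parse(raw);
    return isParsedContent(data) ? data : null;
  } catch {
    return null; // not valid JSON at all
  }
}
```

Returning `null` lets the caller decide whether to retry the model call or abort; throwing would work equally well if the pipeline already has central error handling.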
2.2 Content Density Assessment
After parsing, assess the information density of each section to decide whether it should be split or merged:
// Sections flagged for a possible merge carry a temporary marker
type MarkedSection = ContentSection & { _sparse?: boolean };

/**
 * Ensure each section fits well into 1-2 slides.
 * Split dense sections, merge sparse ones.
 */
function assessContentDensity(
  sections: ContentSection[],
  targetPages: number,
): ContentSection[] {
  const result: MarkedSection[] = [];

  for (const section of sections) {
    // Total characters across all bullets (string length, not a word count)
    const charCount = section.points.reduce(
      (sum, p) => sum + p.length, 0
    );
    const hasData = (section.dataPoints?.length ?? 0) > 0;

    if (charCount > 200 || section.points.length > 6) {
      // Split: too dense for one slide
      const mid = Math.ceil(section.points.length / 2);
      result.push({
        ...section,
        heading: section.heading + ' (1/2)',
        points: section.points.slice(0, mid),
        dataPoints: undefined, // data travels with the second half only
      });
      result.push({
        ...section,
        heading: section.heading + ' (2/2)',
        points: section.points.slice(mid),
        dataPoints: section.dataPoints,
        suggestedVisual: hasData ? 'chart' : section.suggestedVisual,
      });
    } else if (charCount < 30 && section.points.length < 2) {
      // Mark for potential merge with adjacent section
      result.push({ ...section, _sparse: true });
    } else {
      result.push(section);
    }
  }

  // mergeSparse is not shown in this article
  return mergeSparse(result, targetPages);
}
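`mergeSparse` is called above but never shown. One possible shape, offered only as a sketch: fold each section flagged `_sparse` into the section before it, as long as the combined slide stays within the six-point ceiling and the deck still has more sections than `targetPages`. Both the merge policy and the reduced `Section` type are assumptions made for illustration.

```typescript
// Reduced stand-in for ContentSection plus the temporary marker.
interface Section {
  heading: string;
  points: string[];
  _sparse?: boolean;
}

const MAX_POINTS = 6; // same per-slide ceiling the density check uses

function mergeSparse(sections: Section[], targetPages: number): Section[] {
  const out: Section[] = [];
  let remaining = sections.length; // section count if merging stopped now

  for (const section of sections) {
    const { _sparse, ...clean } = section; // drop the marker from the output
    const prev = out.length > 0 ? out[out.length - 1] : undefined;

    if (
      _sparse &&
      prev !== undefined &&
      remaining > targetPages &&
      prev.points.length + clean.points.length <= MAX_POINTS
    ) {
      // Fold the sparse section's points into its predecessor
      out[out.length - 1] = {
        ...prev,
        points: [...prev.points, ...clean.points],
      };
      remaining--;
    } else {
      out.push(clean);
    }
  }
  return out;
}
```

With this policy a sparse section survives as its own slide whenever the deck is already at or under budget, which keeps short but deliberate slides (a single key statement, for example) intact.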
3. Stage 2: Slide Planning
3.1 Slide Type Assignment
// slide-planner.ts
type SlideType =
  | 'cover'            // Cover
  | 'agenda'           // Table of contents / agenda
  | 'section-divider'  // Section divider
  | 'content'          // Content slide (text-led)
  | 'data'             // Data slide (chart-led)
  | 'image-feature'    // Image combined with text
  | 'comparison'       // Comparison
  | 'quote'            // Quote
  | 'summary'          // Summary
  | 'closing';         // Closing

// Inferred from how planSlides fills it in; the original definition is not shown
interface SlideContent {
  title: string;
  subtitle?: string;
  bullets?: string[];
  chartData?: DataPoint[];
  imageHint?: string;
}

interface SlidePlan {
  index: number;
  type: SlideType;
  title: string;
  content: SlideContent;
  notes?: string; // Speaker notes
}

function planSlides(
  parsed: ParsedContent,
  targetPages: number,
): SlidePlan[] {
  const plans: SlidePlan[] = [];

  // Page 1: Cover
  plans.push({
    index: 0,
    type: 'cover',
    title: parsed.title,
    content: {
      title: parsed.title,
      subtitle: parsed.subtitle ?? parsed.audience,
    },
  });

  // Page 2: Agenda (if >= 8 pages)
  if (targetPages >= 8) {
    plans.push({
      index: 1,
      type: 'agenda',
      title: 'Agenda',
      content: {
        title: 'Agenda',
        bullets: parsed.sections
          .filter(s => s.level === 1)
          .map(s => s.heading),
      },
    });
  }

  // Content pages. targetPages only gates the agenda above; the final
  // slide count follows the number of sections.
  let pageIndex = plans.length;
  for (const section of parsed.sections) {
    // Add section divider for level-1 sections
    if (section.level === 1 && pageIndex > 2) {
      plans.push({
        index: pageIndex++,
        type: 'section-divider',
        title: section.heading,
        content: { title: section.heading },
      });
    }

    // Content page
    const slideType = determineSlideType(section);
    plans.push({
      index: pageIndex++,
      type: slideType,
      title: section.heading,
      content: {
        title: section.heading,
        bullets: section.points,
        chartData: section.dataPoints,
        imageHint: section.suggestedVisual === 'image'
          ? section.heading
          : undefined,
      },
    });
  }

  // Last page: Closing
  plans.push({
    index: pageIndex,
    type: 'closing',
    title: 'Thank You',
    content: {
      title: 'Thank You',
      subtitle: 'Questions welcome',
    },
  });

  return plans;
}

function determineSlideType(section: ContentSection): SlideType {
  if (section.dataPoints && section.dataPoints.length > 0) return 'data';
  if (section.suggestedVisual === 'image') return 'image-feature';
  if (section.suggestedVisual === 'diagram') return 'comparison';
  return 'content';
}
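The planner above emits one slide per section plus dividers, so the result can exceed `targetPages`. One way to reconcile the two, sketched here under stated assumptions: drop optional slides (dividers first, then the agenda) until the plan fits, then renumber. `fitToBudget` and the drop order are hypothetical, and the `Plan` type is a reduced stand-in for `SlidePlan`.

```typescript
type PlanType =
  | 'cover' | 'agenda' | 'section-divider'
  | 'content' | 'data' | 'closing';

interface Plan { index: number; type: PlanType; title: string }

// Drop optional slides until the plan fits the page budget.
function fitToBudget(plans: Plan[], targetPages: number): Plan[] {
  const droppable: PlanType[] = ['section-divider', 'agenda'];
  const kept = plans.slice();

  for (const type of droppable) {
    // Walk backwards so removals do not disturb the indexes still to visit
    for (let i = kept.length - 1; i >= 0 && kept.length > targetPages; i--) {
      if (kept[i].type === type) kept.splice(i, 1);
    }
  }

  // Renumber so `index` stays contiguous after removals
  return kept.map((p, i) => ({ ...p, index: i }));
}
```

Content slides are never dropped in this sketch; if the plan is still over budget afterwards, that is a signal to merge sections in Stage 1 rather than to delete material here.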
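4. Stage 3: LLM-Driven Smart Layout
4.1 Content-Aware Layout Selection
Different kinds of content call for different layout strategies. The key insight: layout selection should be neither random nor sequential; it should be driven by the content.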
// layout-intelligence.ts
interface LayoutDecision {
  layoutType: string;
  reasoning: string;
  emphasisElement: string; // What should draw attention first
  visualBalance: 'left' | 'right' | 'center' | 'split';
}

/**
 * Make layout decisions from content signals.
 * Considers: content type, previous slide's layout (avoid repetition),
 * position in sequence (rhythm).
 * The rules below are the fast path for obvious cases; where an LLM call
 * for nuanced cases would go, this excerpt falls back to a standard layout.
 */
async function intelligentLayout(
  slidePlan: SlidePlan,
  previousLayout?: string,
  sequencePosition: number = 0, // not used by the rule-based path
): Promise<LayoutDecision> {
  const bulletCount = slidePlan.content.bullets?.length ?? 0;
  const hasImage = !!slidePlan.content.imageHint;
  // An empty array is truthy, so test the length rather than the reference
  const hasChart = (slidePlan.content.chartData?.length ?? 0) > 0;
  // Not used by the rule-based path; available as an LLM prompt signal
  const textLength = (slidePlan.content.bullets ?? [])
    .reduce((s, b) => s + b.length, 0);

  // Rule-based fast path for obvious cases
  if (slidePlan.type === 'cover') {
    return {
      layoutType: 'title',
      reasoning: 'Cover slide always uses title layout',
      emphasisElement: 'title',
      visualBalance: 'center',
    };
  }

  if (hasChart) {
    return {
      layoutType: 'data',
      reasoning: 'Data points present, using chart-focused layout',
      emphasisElement: 'chart',
      visualBalance: 'split',
    };
  }

  // For content slides: alternate image position for rhythm
  if (hasImage) {
    const side = previousLayout === 'image-left' ? 'image-right' : 'image-left';
    return {
      layoutType: side,
      reasoning: `Alternating image position from previous: ${previousLayout}`,
      emphasisElement: 'image',
      visualBalance: side === 'image-left' ? 'left' : 'right',
    };
  }

  // Many bullets: use two-column
  if (bulletCount > 4) {
    return {
      layoutType: 'two-column',
      reasoning: `${bulletCount} bullets split across two columns`,
      emphasisElement: 'body',
      visualBalance: 'split',
    };
  }

  return {
    layoutType: 'content',
    reasoning: 'Standard content layout',
    emphasisElement: 'title',
    visualBalance: 'left',
  };
}
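Since each decision depends on the previous slide's layout, the calls have to run in order. The sketch below shows one way to thread `previousLayout` through a deck; `chooseLayout` is a reduced stand-in for `intelligentLayout` that keeps only the image-alternation rule, and `layoutDeck` is a hypothetical driver.

```typescript
interface Decision { layoutType: string }

type Chooser = (
  hasImage: boolean,
  previousLayout?: string,
) => Promise<Decision>;

// Reduced stand-in for intelligentLayout: only the image-alternation rule.
const chooseLayout: Chooser = async (hasImage, previousLayout) => {
  if (!hasImage) return { layoutType: 'content' };
  return {
    layoutType: previousLayout === 'image-left' ? 'image-right' : 'image-left',
  };
};

// Decisions are awaited one by one: each call needs the previous result,
// so running all slides through Promise.all would lose the alternation.
async function layoutDeck(
  slidesHaveImage: boolean[],
  choose: Chooser,
): Promise<Decision[]> {
  const decisions: Decision[] = [];
  let previous: string | undefined;
  for (const hasImage of slidesHaveImage) {
    const decision = await choose(hasImage, previous);
    decisions.push(decision);
    previous = decision.layoutType;
  }
  return decisions;
}
```

Note that the rule only looks one slide back, so an image slide that follows a plain content slide starts again at `image-left`.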
4.2 Visual Rhythm
A good deck does not reuse one layout on every slide; it has a rhythm:
Cover -> Agenda -> Divider -> Content -> Image -> Content
| | | | | |
center center center left right left
(strong) (medium) (strong) (standard) (visual) (standard)
visual rhythm: strong-medium-strong-standard-visual-standard
/**
 * Post-process layout decisions to ensure visual rhythm.
 * Avoid: 3+ consecutive same-type layouts.
 * Ensure: visual variety every 2-3 slides.
 * Note: the `layouts` array is modified in place and returned.
 */
function applyVisualRhythm(
  plans: SlidePlan[],
  layouts: LayoutDecision[],
): LayoutDecision[] {
  for (let i = 2; i < layouts.length; i++) {
    const prev2 = layouts[i - 2].layoutType;
    const prev1 = layouts[i - 1].layoutType;
    const current = layouts[i].layoutType;

    if (prev2 === prev1 && prev1 === current && current === 'content') {
      // Three consecutive content layouts -> force variety
      if ((plans[i].content.bullets?.length ?? 0) >= 2) {
        layouts[i] = {
          ...layouts[i],
          layoutType: 'two-column',
          reasoning: 'Forced variety to avoid 3 consecutive content layouts',
        };
      }
    }
  }
  return layouts;
}
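A quick way to check the effect of this pass is to measure the longest streak of identical layouts before and after. `longestRun` is a hypothetical helper; the `after` array is what the rule above yields for `before`, assuming every slide has at least two bullets.

```typescript
// Length of the longest streak of identical consecutive layout types.
function longestRun(layoutTypes: string[]): number {
  let best = 0;
  let current = 0;
  for (let i = 0; i < layoutTypes.length; i++) {
    const sameAsPrevious = i > 0 && layoutTypes[i] === layoutTypes[i - 1];
    current = sameAsPrevious ? current + 1 : 1;
    best = Math.max(best, current);
  }
  return best;
}

const before = ['title', 'content', 'content', 'content', 'content', 'data'];
// Slide 3 is the third 'content' in a row, so the pass switches it
const after = ['title', 'content', 'content', 'two-column', 'content', 'data'];

console.log(longestRun(before)); // 4
console.log(longestRun(after));  // 2
```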
5. Stage 4: Image Generation Integration
5.1 On-Demand Generation Strategy
Not every slide needs an AI image. Too many images distract from the message:
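| Slide type | Image strategy | Preferred source |
|---|---|---|
| Cover | Required | AI-generated (matched to the topic) |
| Section divider | Optional | Solid or gradient background |
| Content | Only with the image-feature layout | AI-generated |
| Data | Chart instead of image | Chart.js / Mermaid |
| Closing | Optional | Brand image / solid color |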
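The table can be expressed as a lookup so that the policy lives in data rather than in scattered conditionals. A minimal sketch: the policy names, the `IMAGE_RULES` constant, and `needsAiImage` are all illustrative, and slide types missing from the table are assumed to get no image.

```typescript
type ImagePolicy = 'required' | 'optional' | 'image-feature-only' | 'chart-instead';

interface ImageRule { policy: ImagePolicy; source: string }

// One entry per row of the table; keys follow the article's slide type names.
const IMAGE_RULES: Record<string, ImageRule> = {
  'cover': { policy: 'required', source: 'ai' },
  'section-divider': { policy: 'optional', source: 'solid-or-gradient' },
  'content': { policy: 'image-feature-only', source: 'ai' },
  'data': { policy: 'chart-instead', source: 'chart' },
  'closing': { policy: 'optional', source: 'brand-or-solid' },
};

// Decide whether an AI image should be generated for a slide.
function needsAiImage(slideType: string, usesImageFeature: boolean): boolean {
  const rule = IMAGE_RULES[slideType];
  if (!rule) return false; // types outside the table: no image (assumption)
  if (rule.policy === 'required') return true;
  if (rule.policy === 'image-feature-only') return usesImageFeature;
  return false; // optional and chart-instead rows never trigger AI generation
}
```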
5.2 Style-Consistent Batch Generation
// batch-image-gen.ts
// NOTE: PPTTemplate and generateImage are defined elsewhere; they are not
// shown in this excerpt.

/**
 * Generate images for slides that need them.
 * Ensure style consistency across all images using template's stylePrompt.
 * Returns a map from slide index to image URL.
 */
async function generateSlideImages(
  plans: SlidePlan[],
  template: PPTTemplate,
  concurrency: number = 2,
): Promise<Map<number, string>> {
  const imageMap = new Map<number, string>();

  // Filter slides that need images
  const needsImage = plans.filter(p =>
    p.type === 'cover' ||
    p.type === 'image-feature' ||
    (p.content.imageHint && p.type !== 'data')
  );

  // Generate in batches of `concurrency` requests at a time
  for (let i = 0; i < needsImage.length; i += concurrency) {
    const batch = needsImage.slice(i, i + concurrency);
    const results = await Promise.allSettled(
      batch.map(async (plan) => {
        // Every prompt shares the template's style and colors
        const prompt = [
          template.stylePrompt,
          `Topic: ${plan.title}`,
          plan.content.imageHint || plan.title,
          `Colors: ${template.colorScheme.primary}, ${template.colorScheme.secondary}`,
          'Clean professional illustration, no text, high quality',
        ].join('. ');

        const imageUrl = await generateImage(prompt, {
          width: 1920,
          height: 1080,
        });
        return { index: plan.index, url: imageUrl };
      })
    );

    // Rejected generations are skipped; those slides simply get no image
    for (const result of results) {
      if (result.status === 'fulfilled') {
        imageMap.set(result.value.index, result.value.url);
      }
    }
  }
  return imageMap;
}
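The function above skips failed generations without recording them. If the caller needs to retry or fall back to a solid background, the batching loop can report failures explicitly. The sketch below is one way to do that; `runInBatches` is a hypothetical generic helper and the task passed to it stands in for the image call.

```typescript
interface BatchResult<T> {
  ok: Map<number, T>; // results keyed by slide index
  failed: number[];   // slide indexes whose task threw
}

async function runInBatches<T>(
  keys: number[],
  task: (key: number) => Promise<T>,
  concurrency: number,
): Promise<BatchResult<T>> {
  const ok = new Map<number, T>();
  const failed: number[] = [];
  const step = Math.max(1, Math.floor(concurrency)); // guard against 0 or fractions

  for (let i = 0; i < keys.length; i += step) {
    const batch = keys.slice(i, i + step);
    // Each task catches its own error, so one failure never rejects the batch
    await Promise.all(
      batch.map(async (key) => {
        try {
          ok.set(key, await task(key));
        } catch {
          failed.push(key);
        }
      }),
    );
  }
  return { ok, failed };
}
```

A caller could then loop over `failed` once more with a simpler prompt, or assign those slides the solid or gradient background listed in the table in 5.1.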
6. Stage 5: Ensuring Style Consistency
6.1 Design Token System
// design-tokens.ts
interface DesignTokens {
  // Spacing scale (8pt grid)
  spacing: {
    xs: 8;
    sm: 16;
    md: 24;
    lg: 32;
    xl: 48;
    xxl: 64;
  };
  // Border radius
  radius: {
    none: 0;
    sm: 4;
    md: 8;
    lg: 16;
    full: 9999;
  };
  // Shadow
  shadow: {
    none: string;
    sm: string;
    md: string;
    lg: string;
  };
  // Color (from template)
  color: ColorScheme;
  // Typography (from template)
  typography: TypographyConfig;
}

function createDesignTokens(template: PPTTemplate): DesignTokens {
  return {
    spacing: { xs: 8, sm: 16, md: 24, lg: 32, xl: 48, xxl: 64 },
    radius: { none: 0, sm: 4, md: 8, lg: 16, full: 9999 },
    shadow: {
      none: 'none',
      sm: '0 1px 2px rgba(0,0,0,0.05)',
      md: '0 4px 6px rgba(0,0,0,0.07)',
      lg: '0 10px 15px rgba(0,0,0,0.1)',
    },
    color: template.colorScheme,
    typography: template.typography,
  };
}
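An 8pt grid only helps if element positions actually land on it. As a small illustration, the helpers below snap arbitrary values to the grid and verify that a spacing scale stays on it; `snapToGrid` and `isOnGrid` are hypothetical names, not part of the article's token module.

```typescript
const GRID = 8; // base unit of the spacing scale above

// Round a coordinate or size to the nearest multiple of the grid unit.
function snapToGrid(value: number, grid: number = GRID): number {
  return Math.round(value / grid) * grid;
}

// True when every spacing token is a multiple of the grid unit.
function isOnGrid(spacing: Record<string, number>, grid: number = GRID): boolean {
  return Object.keys(spacing).every((key) => spacing[key] % grid === 0);
}

const scale = { xs: 8, sm: 16, md: 24, lg: 32, xl: 48, xxl: 64 };
console.log(snapToGrid(21));  // 24
console.log(isOnGrid(scale)); // true
```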
6.2 Consistency Validation
/**
 * Check that all slides maintain visual consistency.
 * Returns list of warnings.
 */
function validateConsistency(slides: ResolvedSlide[]): string[] {
  const warnings: string[] = [];
  const fonts = new Set<string>();
  const colors = new Set<string>();

  for (const slide of slides) {
    for (const element of slide.elements) {
      if (element.fontFamily) fonts.add(element.fontFamily);
      if (element.color) colors.add(element.color);
    }
  }

  // Too many fonts
  if (fonts.size > 3) {
    warnings.push(
      `Too many fonts (${fonts.size}). Limit to 2-3 for consistency.`
    );
  }

  // Too many colors
  if (colors.size > 6) {
    warnings.push(
      `Too many colors (${colors.size}). Stick to template palette.`
    );
  }

  // Check alignment consistency (x positions bucketed to the nearest 10)
  const titlePositions = slides
    .map(s => s.elements.find(e => e.role === 'title'))
    .filter(Boolean)
    .map(e => ({ x: e!.x, y: e!.y }));
  const uniqueX = new Set(titlePositions.map(p => Math.round(p.x / 10) * 10));
  if (uniqueX.size > 2) {
    warnings.push(
      'Title positions are inconsistent across slides. Align to grid.'
    );
  }

  return warnings;
}
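Because the color check counts raw strings, `#FFF` and `#ffffff` would register as two different colors. Normalizing before counting avoids that false positive. A minimal sketch, assuming colors arrive as hex strings; `normalizeHex` and `countDistinctColors` are hypothetical helpers, and other formats such as `rgb(...)` are passed through unchanged.

```typescript
// Normalize '#ABC', '#aabbcc', or 'AABBCC' to '#aabbcc'; null for anything else.
function normalizeHex(color: string): string | null {
  const match = /^#?([0-9a-f]{3}|[0-9a-f]{6})$/i.exec(color.trim());
  if (!match) return null;
  let hex = match[1].toLowerCase();
  if (hex.length === 3) {
    // Expand shorthand: 'abc' -> 'aabbcc'
    hex = hex.split('').map((c) => c + c).join('');
  }
  return '#' + hex;
}

function countDistinctColors(colors: string[]): number {
  const seen = new Set<string>();
  for (const color of colors) {
    seen.add(normalizeHex(color) ?? color); // keep unrecognized formats as-is
  }
  return seen.size;
}

console.log(countDistinctColors(['#FFF', '#ffffff', 'FFFFFF', '#1a73e8'])); // 2
```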
7. Complete Example: End-to-End Generation
Input: a 1,500-character product launch article
-> Stage 1 (content parsing, ~5s):
   Extract 5 themes + 12 key points + 3 data sets
-> Stage 2 (slide planning, ~1s):
   Plan 10 slides: cover + agenda + 2 dividers + 5 content + closing
-> Stage 3 (layout decisions, ~2s):
   cover(center) -> agenda(center) -> divider(center)
   -> content(left) -> image-right -> two-column -> data(split)
   -> divider(center) -> content(left) -> closing(center)
-> Stage 4 (image generation, ~30-60s):
   Generate 3 AI images (cover + 2 content pages)
-> Stage 5 (style unification, ~1s):
   Apply the corporate-blue template
   Validation: 2 fonts, 5 colors, titles aligned -> PASS
-> Export PPTX (~2s)
Total time: ~40-70s
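8. Key Decisions for Higher Quality
Content selection > layout technique
The core value of a slide deck is not that it looks good but that it is clear. The most important decisions in automated generation are:
- Convey only one core message per slide
- Use no more than 5 bullet points, the conservative lower bound of Miller's Law (7 +/- 2)
- Present data as charts rather than text
- Put detailed content in Speaker Notes, not on the slide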
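Two of these rules combine naturally: cap the bullets at five and move the overflow into the speaker notes instead of discarding it. The following is a sketch under that assumption; `capBullets` and the `SlideText` shape are hypothetical.

```typescript
interface SlideText {
  bullets: string[];
  notes: string;
}

// Keep at most `max` bullets on the slide; move the rest into speaker notes.
function capBullets(
  bullets: string[],
  max: number = 5,
  notes: string = '',
): SlideText {
  const overflow = bullets.slice(max);
  if (overflow.length === 0) {
    return { bullets: bullets.slice(), notes };
  }
  const extra = overflow.map((b) => '- ' + b).join('\n');
  return {
    bullets: bullets.slice(0, max),
    notes: notes ? notes + '\n' + extra : extra,
  };
}

const capped = capBullets(['a', 'b', 'c', 'd', 'e', 'f', 'g']);
console.log(capped.bullets.length); // 5
console.log(capped.notes);          // "- f\n- g"
```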
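The LLM's Role in the Pipeline
The LLM should not generate the PPT file directly. It is best used for:
- Content parsing: extracting structure from unstructured text
- Layout suggestions: choosing layouts based on content semantics
- Copy refinement: compressing long sentences into bullet points
- Image prompts: writing the prompts for image generation
Formatting, layout, and file generation should be done by deterministic code rather than relying on the LLM's non-deterministic output.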
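One concrete form of this division of labor: treat whatever layout name the model proposes as a suggestion and let code accept it only if it is on a known list. The sketch below uses the layout names that appear in the article's layout code; `sanitizeLayout` and the fallback choice are assumptions.

```typescript
// Layout names used by the rule-based layout code in section 4.1
const ALLOWED_LAYOUTS = [
  'title',
  'content',
  'two-column',
  'image-left',
  'image-right',
  'data',
] as const;

type Layout = typeof ALLOWED_LAYOUTS[number];

// Accept a model-suggested layout only if it is on the allow-list.
function sanitizeLayout(suggested: unknown, fallback: Layout = 'content'): Layout {
  const match = ALLOWED_LAYOUTS.find((layout) => layout === suggested);
  return match ?? fallback;
}

console.log(sanitizeLayout('two-column'));  // two-column
console.log(sanitizeLayout('hero-banner')); // content
```

The renderer then only ever sees values it knows how to draw, regardless of what the model returns.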
Maurice | maurice_wen@proton.me