Transcribe and Generate Chaptered Documents from Audio
Long recordings (lectures, interviews, podcasts, training sessions) produce walls of text that are difficult to navigate. What readers need is a structured Markdown document with titled chapters that reflect the natural topic flow of the conversation. LM-Kit.NET lets you chain speech-to-text with LLM-powered content analysis to automatically segment a recording into logical chapters, generate descriptive titles for each, and produce a table of contents. The entire pipeline runs locally with no cloud dependency. This tutorial builds an automated chapter generator that converts long audio files into navigable, chaptered Markdown documents.
Why Chaptered Documents Matter
Two enterprise problems that chaptered audio transcription solves:
- Training and onboarding content. Organizations record hours of training sessions, product walkthroughs, and onboarding presentations. New employees cannot efficiently consume hour-long recordings. A chaptered document lets them jump directly to "Setting Up the Development Environment" or "Submitting Your First Pull Request" without scrubbing through video. Combined with timestamps, chapters turn a monolithic recording into a navigable reference manual.
- Podcast and media production. Content creators publish episode transcripts for SEO, accessibility, and audience reference. A flat transcript is unusable for readers who want to find the segment about a specific topic. Chaptered transcripts with a table of contents let readers (and search engines) find and link to specific discussions, increasing content discoverability and engagement.
Prerequisites
| Requirement | Minimum |
|---|---|
| .NET SDK | 8.0+ |
| VRAM | ~4.5 GB for Whisper plus a 4B chat model; ~7 GB for Whisper plus the qwen3:8b model used in the code below |
| Disk | ~4 GB free for model downloads |
| Audio file | A .wav file (16-bit PCM, any sample rate) |
Step 1: Create the Project
dotnet new console -n ChapteredTranscription
cd ChapteredTranscription
dotnet add package LM-Kit.NET
Step 2: Understand the Pipeline
Long audio recording (.wav)
          │
          ▼
┌─────────────────────┐
│  SpeechToText       │   Whisper model
│  (transcribe)       │   Timestamped segments
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│  LLM Analysis       │   SingleTurnConversation
│  (identify topics)  │   Find topic boundaries
└─────────┬───────────┘
          │
          ▼
┌─────────────────────┐
│  LLM Formatting     │   SingleTurnConversation
│  (generate chapters)│   Title each chapter
└─────────┬───────────┘
          │
          ▼
Chaptered Markdown document
├── Table of Contents
├── Chapter 1: [Title]
│   └── Content with timestamps
├── Chapter 2: [Title]
│   └── Content with timestamps
└── ...
Step 3: Transcribe and Generate Chapters
using System.Text;
using LMKit.Media.Audio;
using LMKit.Model;
using LMKit.Speech;
using LMKit.TextGeneration;
using LMKit.TextGeneration.Chat;
LMKit.Licensing.LicenseManager.SetLicenseKey("");
Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;
// ──────────────────────────────────────
// 1. Load Whisper model
// ──────────────────────────────────────
Console.WriteLine("Loading Whisper model...");
using LM whisperModel = LM.LoadFromModelID("whisper-large-turbo3",
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r Downloading: {(double)read / len.Value * 100:F1}% ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r Loading: {p * 100:F0}% "); return true; });
Console.WriteLine("\n");
// ──────────────────────────────────────
// 2. Load chat model for chapter generation
// ──────────────────────────────────────
Console.WriteLine("Loading chat model...");
using LM chatModel = LM.LoadFromModelID("qwen3:8b",
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r Downloading: {(double)read / len.Value * 100:F1}% ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r Loading: {p * 100:F0}% "); return true; });
Console.WriteLine("\n");
// ──────────────────────────────────────
// 3. Transcribe the audio
// ──────────────────────────────────────
string audioPath = "lecture.wav";
if (!File.Exists(audioPath))
{
    Console.WriteLine($"Place a WAV file at '{audioPath}' and run again.");
    return;
}
var stt = new SpeechToText(whisperModel)
{
    EnableVoiceActivityDetection = true,
    SuppressNonSpeechTokens = true,
    SuppressHallucinations = true
};
Console.WriteLine($"Transcribing {audioPath}...");
using var audio = new WaveFile(audioPath);
Console.WriteLine($" Duration: {audio.Duration:hh\\:mm\\:ss}\n");
var transcription = stt.Transcribe(audio);
Console.WriteLine($" Segments: {transcription.Segments.Count}\n");
// Build a timestamped transcript
var timestampedTranscript = new StringBuilder();
foreach (var seg in transcription.Segments)
{
    timestampedTranscript.AppendLine($"[{seg.Start:hh\\:mm\\:ss}] {seg.Text}");
}
string fullTimestamped = timestampedTranscript.ToString();
// ──────────────────────────────────────
// 4. Generate chaptered document
// ──────────────────────────────────────
Console.WriteLine("Generating chaptered document...\n");
var chapterGenerator = new SingleTurnConversation(chatModel)
{
    SystemPrompt = "You are a document structuring assistant. Your task is to convert a " +
        "timestamped transcript into a well-organized Markdown document with chapters.\n\n" +
        "Instructions:\n" +
        "1. Analyze the transcript to identify distinct topics or subject changes.\n" +
        "2. Group consecutive segments that discuss the same topic into chapters.\n" +
        "3. Give each chapter a clear, descriptive title that summarizes its content.\n" +
        "4. Include the timestamp range for each chapter.\n" +
        "5. Clean up the text within each chapter: fix grammar, remove filler words, " +
        "and organize into readable paragraphs.\n" +
        "6. Generate a Table of Contents at the top with links to each chapter.\n\n" +
        "Output format:\n" +
        "# [Document Title]\n\n" +
        "## Table of Contents\n" +
        "1. [Chapter Title 1](#chapter-1-anchor) (HH:MM:SS)\n" +
        "2. [Chapter Title 2](#chapter-2-anchor) (HH:MM:SS)\n\n" +
        "---\n\n" +
        "## Chapter 1: [Title]\n" +
        "*Timestamp: HH:MM:SS - HH:MM:SS*\n\n" +
        "[Cleaned-up content organized into paragraphs]\n\n" +
        "---\n\n" +
        "## Chapter 2: [Title]\n" +
        "*Timestamp: HH:MM:SS - HH:MM:SS*\n\n" +
        "[Content]\n\n" +
        "Rules:\n" +
        "- Create between 3 and 15 chapters depending on recording length and topic variety.\n" +
        "- Each chapter should cover a coherent topic.\n" +
        "- Do not skip or omit any content from the transcript.\n" +
        "- Preserve the speaker's key points and details.\n" +
        "- Output only the Markdown document.",
    MaximumCompletionTokens = 8192
};
var document = new StringBuilder();
chapterGenerator.AfterTextCompletion += (_, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
    {
        document.Append(e.Text);
        Console.Write(e.Text);
    }
};
chapterGenerator.Submit(
    $"Create a chaptered document from this timestamped transcript:\n\n{fullTimestamped}");
Console.WriteLine("\n");
// Save the chaptered document
string outputPath = Path.ChangeExtension(audioPath, ".md");
File.WriteAllText(outputPath, document.ToString());
Console.ForegroundColor = ConsoleColor.Cyan;
Console.WriteLine($"Chaptered document saved: {outputPath}");
Console.ResetColor();
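Whether this single-pass approach works depends on how large the transcript is relative to the chat model's context window. The sketch below is one rough way to check before calling Submit. It assumes about four characters per token, which is a common rule of thumb for English text and not a property of any particular tokenizer. The FitsInSinglePass helper, the 16,384-token context size, and the 600-token prompt overhead are all hypothetical values for illustration; substitute the real limits of the model you load.

```csharp
using System;

// Rough pre-flight check: do prompt + transcript + expected output fit in
// the context window? Every number here is an illustrative assumption.
bool FitsInSinglePass(
    int transcriptChars,
    int contextTokens,                 // assumed context size of the chat model
    int completionTokens,              // budget reserved for the generated document
    int promptOverheadTokens = 600,    // assumed size of the system prompt
    double charsPerToken = 4.0)        // heuristic, not tokenizer-accurate
{
    int transcriptTokens = (int)Math.Ceiling(transcriptChars / charsPerToken);
    return transcriptTokens + completionTokens + promptOverheadTokens <= contextTokens;
}

// Example: a 20,000-character transcript with the 8192-token completion
// budget used above, against a hypothetical 16,384-token context window.
bool singlePass = FitsInSinglePass(20_000, contextTokens: 16_384, completionTokens: 8_192);
Console.WriteLine(singlePass
    ? "Transcript should fit in one pass."
    : "Transcript is too large; use chunk-based processing.");
```

In the Step 3 program, fullTimestamped.Length would be the natural value to pass as transcriptChars.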
Step 4: Process Long Recordings in Chunks
For recordings longer than about 30 minutes, process the transcript in chunks so that each request stays within the model's context window. The code below continues the Step 3 program: append it after the Step 3 code, because it reuses transcription, chatModel, and audioPath. It splits the segments into chunks of roughly 10 minutes, generates chapters for each chunk, and then assembles a single document with a table of contents:
Console.WriteLine("\n=== Long Recording: Chunk-Based Chapter Generation ===\n");

// Split segments into chunks of roughly 10 minutes each
var chunks = new List<List<AudioSegment>>();
var currentChunk = new List<AudioSegment>();
TimeSpan chunkDuration = TimeSpan.FromMinutes(10);
TimeSpan chunkStart = TimeSpan.Zero;
foreach (var seg in transcription.Segments)
{
    currentChunk.Add(seg);
    // Close the chunk once it spans at least chunkDuration
    if (seg.End - chunkStart >= chunkDuration)
    {
        chunks.Add(currentChunk);
        currentChunk = new List<AudioSegment>();
        chunkStart = seg.End;
    }
}
if (currentChunk.Count > 0)
    chunks.Add(currentChunk);
Console.WriteLine($"Split into {chunks.Count} chunks\n");

// Generate chapters for each chunk
var chunkChapters = new List<string>();
var chunkProcessor = new SingleTurnConversation(chatModel)
{
    SystemPrompt = "You are a document structuring assistant. Analyze this section of a " +
        "timestamped transcript and identify the distinct topics discussed. " +
        "For each topic, output a chapter in this format:\n\n" +
        "## [Chapter Title]\n" +
        "*Timestamp: HH:MM:SS - HH:MM:SS*\n\n" +
        "[Cleaned-up content in paragraphs]\n\n---\n\n" +
        "Rules:\n" +
        "- Create 1-5 chapters per section based on topic changes.\n" +
        "- Clean the text: fix grammar, remove filler words.\n" +
        "- Preserve all content and details.\n" +
        "- Output only the Markdown.",
    MaximumCompletionTokens = 4096
};

// Subscribe once. Subscribing inside the loop would stack one more handler per chunk.
var chunkOutput = new StringBuilder();
chunkProcessor.AfterTextCompletion += (_, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        chunkOutput.Append(e.Text);
};

for (int i = 0; i < chunks.Count; i++)
{
    Console.Write($" Processing chunk {i + 1}/{chunks.Count}... ");
    var chunkText = new StringBuilder();
    foreach (var seg in chunks[i])
    {
        chunkText.AppendLine($"[{seg.Start:hh\\:mm\\:ss}] {seg.Text}");
    }

    chunkOutput.Clear(); // start each chunk with an empty buffer
    chunkProcessor.Submit($"Structure this transcript section into chapters:\n\n{chunkText}");
    chunkChapters.Add(chunkOutput.ToString());

    Console.ForegroundColor = ConsoleColor.Green;
    Console.WriteLine("done");
    Console.ResetColor();
}

// Assemble the full document with table of contents
var fullDocument = new StringBuilder();
fullDocument.AppendLine($"# {Path.GetFileNameWithoutExtension(audioPath)}\n");
fullDocument.AppendLine("## Table of Contents\n");

// Extract chapter titles for TOC
int chapterNum = 1;
foreach (string chapterBlock in chunkChapters)
{
    foreach (string line in chapterBlock.Split('\n'))
    {
        if (line.StartsWith("## ") && !line.StartsWith("## Table"))
        {
            string title = line[3..].Trim();
            fullDocument.AppendLine($"{chapterNum}. {title}");
            chapterNum++;
        }
    }
}
fullDocument.AppendLine("\n---\n");

// Renumber chapters sequentially, rewriting heading lines one at a time
// so that two chapters with the same title are still numbered separately
chapterNum = 1;
foreach (string chapterBlock in chunkChapters)
{
    foreach (string rawLine in chapterBlock.Split('\n'))
    {
        string line = rawLine.TrimEnd('\r');
        if (line.StartsWith("## ") && !line.StartsWith("## Table"))
        {
            string title = line[3..].Trim();
            fullDocument.AppendLine($"## Chapter {chapterNum}: {title}");
            chapterNum++;
        }
        else
        {
            fullDocument.AppendLine(line);
        }
    }
}

string longOutputPath = Path.ChangeExtension(audioPath, ".chapters.md");
File.WriteAllText(longOutputPath, fullDocument.ToString());
Console.WriteLine($"\nFull chaptered document: {longOutputPath}");
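The chunking above cuts at fixed boundaries, so a topic that straddles two chunks can end up split or repeated. One mitigation is to carry the last few segments of each chunk into the next one as context. The sketch below shows that idea on plain tuples rather than the LM-Kit segment type, so it runs on its own; SplitWithOverlap and its parameter names are hypothetical, not part of LM-Kit.NET.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Splits segments into chunks spanning at least `chunkDuration`, then seeds
// each following chunk with the last `overlap` segments of the previous one.
List<List<(TimeSpan Start, TimeSpan End, string Text)>> SplitWithOverlap(
    IReadOnlyList<(TimeSpan Start, TimeSpan End, string Text)> segments,
    TimeSpan chunkDuration,
    int overlap)
{
    var chunks = new List<List<(TimeSpan Start, TimeSpan End, string Text)>>();
    var current = new List<(TimeSpan Start, TimeSpan End, string Text)>();
    TimeSpan chunkStart = TimeSpan.Zero;
    int fresh = 0; // segments in `current` that are not carried-over context

    foreach (var seg in segments)
    {
        current.Add(seg);
        fresh++;

        if (seg.End - chunkStart >= chunkDuration)
        {
            chunks.Add(current);
            // Seed the next chunk with the tail of this one.
            current = current.Skip(Math.Max(0, current.Count - overlap)).ToList();
            chunkStart = seg.End;
            fresh = 0;
        }
    }

    // Keep the final chunk only if it holds something beyond the overlap.
    if (fresh > 0)
        chunks.Add(current);

    return chunks;
}

// Demo: six 5-minute segments, 10-minute chunks, one segment of overlap.
var demo = Enumerable.Range(0, 6)
    .Select(i => (Start: TimeSpan.FromMinutes(5 * i),
                  End: TimeSpan.FromMinutes(5 * (i + 1)),
                  Text: $"segment {i}"))
    .ToList();

var overlapped = SplitWithOverlap(demo, TimeSpan.FromMinutes(10), overlap: 1);
foreach (var chunk in overlapped)
    Console.WriteLine($"{chunk[0].Start:hh\\:mm} -> {chunk[^1].End:hh\\:mm} ({chunk.Count} segments)");
```

If you adopt overlap, it probably helps to tell the model in the prompt that the first few lines repeat the end of the previous section, so it uses them as context without writing them up a second time.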
Step 5: Domain-Specific Chapter Styles
Customize chapter generation for different content types by changing the system prompt. The snippets below continue the Step 3 program and reuse chatModel. Each processor is used the same way as chapterGenerator: attach an AfterTextCompletion handler, then call Submit with the timestamped transcript:
// Lecture/course content: structured as learning modules
var lectureProcessor = new SingleTurnConversation(chatModel)
{
    SystemPrompt = "Convert this transcript into a structured lecture document. " +
        "Chapters should be titled as learning topics (e.g., " +
        "'Introduction to Neural Networks', 'Backpropagation Algorithm'). " +
        "Within each chapter, organize content as:\n" +
        "- Key concepts (bold)\n" +
        "- Explanations\n" +
        "- Examples mentioned by the speaker\n" +
        "Include timestamps. Output only Markdown.",
    MaximumCompletionTokens = 8192
};

// Interview/podcast: structured by question/topic
var interviewProcessor = new SingleTurnConversation(chatModel)
{
    SystemPrompt = "Convert this transcript into a chaptered interview document. " +
        "Each chapter should represent a distinct question or topic " +
        "discussed. Title chapters as the topic or question being addressed " +
        "(e.g., 'On Building the Company', 'Advice for Young Entrepreneurs'). " +
        "Clean up the dialogue while preserving the conversational tone. " +
        "Include timestamps. Output only Markdown.",
    MaximumCompletionTokens = 8192
};

// Conference talk: structured as presentation sections
var talkProcessor = new SingleTurnConversation(chatModel)
{
    SystemPrompt = "Convert this conference talk transcript into a structured document. " +
        "Identify the natural sections of the presentation " +
        "(introduction, problem statement, approach, results, Q&A). " +
        "Title each chapter to reflect the presentation flow. " +
        "Preserve technical details, data points, and quoted findings. " +
        "Include timestamps. Output only Markdown.",
    MaximumCompletionTokens = 8192
};
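The three processors differ only in their system prompt, so one way to wire them into a tool is to pick the prompt from a content-type argument and build a single conversation from it. The sketch below shows only the selection logic. The dictionary keys, the SelectStylePrompt helper, and the abbreviated prompt strings are illustrative placeholders; in practice you would store the full prompts shown above.

```csharp
using System;
using System.Collections.Generic;

// Maps a content type to a system prompt. The prompt strings are shortened
// placeholders standing in for the full prompts.
var stylePrompts = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
{
    ["lecture"] = "Convert this transcript into a structured lecture document.",
    ["interview"] = "Convert this transcript into a chaptered interview document.",
    ["talk"] = "Convert this conference talk transcript into a structured document."
};

// Falls back to the lecture style when the type is missing or unknown.
string SelectStylePrompt(string? contentType) =>
    contentType is not null && stylePrompts.TryGetValue(contentType, out var prompt)
        ? prompt
        : stylePrompts["lecture"];

// Hypothetical usage: dotnet run -- interview
string style = args.Length > 0 ? args[0] : "lecture";
Console.WriteLine($"Using style prompt: {SelectStylePrompt(style)}");
```

The selected string would then be assigned to SystemPrompt when constructing the SingleTurnConversation, exactly as in the three processors above.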
Step 6: Batch Process Multiple Recordings
Generate chaptered documents for an entire content library. The code below continues the Step 3 program and reuses stt, chatModel, and chapterGenerator. It transcribes every WAV file in a folder, writes one chaptered Markdown file per recording, and builds an index of the results:
Console.WriteLine("\n=== Batch Chapter Generation ===\n");
string inputDir = "recordings";
string outputDir = "chaptered_docs";
if (!Directory.Exists(inputDir))
{
    Console.WriteLine($"Create a '{inputDir}' folder with WAV files, then run again.");
    return;
}
Directory.CreateDirectory(outputDir);
string[] wavFiles = Directory.GetFiles(inputDir, "*.wav");
Console.WriteLine($"Found {wavFiles.Length} recording(s)\n");

var libraryIndex = new StringBuilder();
libraryIndex.AppendLine("# Content Library\n");

// Use a separate conversation for the batch so the Step 3 handler
// (which echoes every token to the console) does not fire for each file.
var batchGenerator = new SingleTurnConversation(chatModel)
{
    SystemPrompt = chapterGenerator.SystemPrompt,
    MaximumCompletionTokens = 8192
};

// Subscribe once. Subscribing inside the loop would stack one more handler per file.
var chapDoc = new StringBuilder();
batchGenerator.AfterTextCompletion += (_, e) =>
{
    if (e.SegmentType == TextSegmentType.UserVisible)
        chapDoc.Append(e.Text);
};

foreach (string wavPath in wavFiles)
{
    string fileName = Path.GetFileNameWithoutExtension(wavPath);
    Console.Write($" {Path.GetFileName(wavPath)}: ");
    try
    {
        using var wav = new WaveFile(wavPath);
        var result = stt.Transcribe(wav);

        // Build timestamped transcript
        var ts = new StringBuilder();
        foreach (var seg in result.Segments)
            ts.AppendLine($"[{seg.Start:hh\\:mm\\:ss}] {seg.Text}");

        // Generate chapters
        chapDoc.Clear(); // start each file with an empty buffer
        batchGenerator.Submit(
            $"Create a chaptered document from this timestamped transcript:\n\n{ts}");

        string outPath = Path.Combine(outputDir, $"{fileName}.md");
        File.WriteAllText(outPath, chapDoc.ToString());
        libraryIndex.AppendLine($"- [{fileName}]({fileName}.md) ({wav.Duration:hh\\:mm\\:ss})");

        Console.ForegroundColor = ConsoleColor.Green;
        Console.WriteLine($"done → {outPath}");
        Console.ResetColor();
    }
    catch (Exception ex)
    {
        Console.ForegroundColor = ConsoleColor.Red;
        Console.WriteLine($"failed: {ex.Message}");
        Console.ResetColor();
    }
}

string indexPath = Path.Combine(outputDir, "INDEX.md");
File.WriteAllText(indexPath, libraryIndex.ToString());
Console.WriteLine($"\nLibrary index: {indexPath}");
Console.WriteLine($"All documents: {Path.GetFullPath(outputDir)}");
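A batch over a large library can take hours, so it is often useful to make the run resumable. Because the loop above writes each .md file only after generation succeeds, the presence of that file can serve as a "done" marker. The sketch below lists the recordings that still need processing; PendingRecordings is a hypothetical helper, and the demo uses a throwaway temp folder with empty placeholder files so it runs without real audio.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Returns the WAV files in inputDir that have no matching .md file in outputDir.
List<string> PendingRecordings(string inputDir, string outputDir)
{
    return Directory.GetFiles(inputDir, "*.wav")
        .Where(wav => !File.Exists(
            Path.Combine(outputDir, Path.GetFileNameWithoutExtension(wav) + ".md")))
        .OrderBy(wav => wav, StringComparer.Ordinal)
        .ToList();
}

// Demo on a temporary directory tree (placeholder files, not real audio).
string root = Path.Combine(Path.GetTempPath(), "chapters-demo-" + Guid.NewGuid().ToString("N"));
string demoIn = Path.Combine(root, "recordings");
string demoOut = Path.Combine(root, "chaptered_docs");
Directory.CreateDirectory(demoIn);
Directory.CreateDirectory(demoOut);

File.WriteAllText(Path.Combine(demoIn, "a.wav"), "");
File.WriteAllText(Path.Combine(demoIn, "b.wav"), "");
File.WriteAllText(Path.Combine(demoOut, "a.md"), "# already processed");

List<string> pending = PendingRecordings(demoIn, demoOut);
foreach (string path in pending)
    Console.WriteLine($"Pending: {Path.GetFileName(path)}");

Directory.Delete(root, recursive: true);
```

In the batch loop, iterating over PendingRecordings(inputDir, outputDir) instead of wavFiles would skip recordings that already have a chaptered document.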
Model Selection
Whisper Models (Transcription)
| Model ID | VRAM | Speed | Best For |
|---|---|---|---|
| whisper-large-turbo3 | ~870 MB | Moderate | Best accuracy (recommended) |
| whisper-small | ~260 MB | Fast | Large backlogs, draft chapters |
Chat Models (Chapter Generation)
| Model ID | VRAM | Quality | Best For |
|---|---|---|---|
| gemma3:4b | ~3.5 GB | Good | Short recordings (<30 min), simple topics |
| qwen3:8b | ~6 GB | Very good | Long recordings, complex topic segmentation (recommended) |
| gemma3:12b | ~8 GB | Excellent | Dense technical content, nuanced topic boundaries |
Chapter generation benefits from larger models because identifying topic boundaries requires understanding the semantic flow of the conversation. Use qwen3:8b or larger for best results.
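If the tool has to run on machines with different GPUs, the choice can be made from the available VRAM. The sketch below encodes the approximate figures from the two tables above. PickChatModel is a hypothetical helper, the 1 GB headroom is an assumption to leave room for the KV cache and other allocations, and how you obtain the available VRAM figure is left open.

```csharp
using System;

// VRAM figures (GB) come from the tables above and are approximate.
string PickChatModel(double availableVramGb, double headroomGb = 1.0)
{
    const double whisperGb = 0.87; // whisper-large-turbo3, loaded alongside the chat model
    double budget = availableVramGb - whisperGb - headroomGb;

    if (budget >= 8.0) return "gemma3:12b";
    if (budget >= 6.0) return "qwen3:8b";
    if (budget >= 3.5) return "gemma3:4b";
    return "none"; // not enough VRAM for any listed chat model next to Whisper
}

foreach (double vram in new[] { 6.0, 8.0, 12.0 })
    Console.WriteLine($"{vram} GB -> {PickChatModel(vram)}");
```

The returned ID would be passed to LM.LoadFromModelID in place of the hard-coded "qwen3:8b".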
Common Issues
| Problem | Cause | Fix |
|---|---|---|
| Too few chapters (everything in one) | Recording stays on one topic; or model did not segment properly | Use a larger model; add guidance about expected number of chapters |
| Too many chapters | Model over-segments on minor topic shifts | Add "Create between 3 and 10 chapters" to the system prompt |
| Chapters overlap or repeat content | Processing chunks independently without context | Include the last 2-3 segments of the previous chunk as overlap |
| Table of contents links broken | Anchor format mismatch | Use consistent heading format; some Markdown renderers need lowercase anchors |
| Output truncated on long recordings | MaximumCompletionTokens too low or transcript exceeds context | Increase tokens; use chunk-based processing (Step 4) |
| Timestamps missing in chapters | LLM dropped timestamps during reformatting | Strengthen system prompt: "You MUST include the timestamp range for every chapter" |
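For the broken table-of-contents links, one option is to generate the anchors in code instead of trusting the model to write them. The sketch below approximates GitHub-style heading anchors (lowercase, punctuation dropped, spaces turned into hyphens). Other Markdown renderers may use different rules, so treat ToAnchor and TocEntry as hypothetical helpers to adapt to your renderer.

```csharp
using System;
using System.Text;

// Approximates GitHub-style heading anchors. Other renderers may differ.
string ToAnchor(string heading)
{
    var sb = new StringBuilder();
    foreach (char c in heading.Trim().ToLowerInvariant())
    {
        if (char.IsLetterOrDigit(c) || c == '-' || c == '_')
            sb.Append(c);
        else if (c == ' ')
            sb.Append('-');
        // everything else (punctuation, symbols) is dropped
    }
    return sb.ToString();
}

// Builds one numbered table-of-contents line that links to the heading.
string TocEntry(int number, string heading) =>
    $"{number}. [{heading}](#{ToAnchor(heading)})";

Console.WriteLine(TocEntry(1, "Chapter 1: Setting Up the Development Environment"));
// prints: 1. [Chapter 1: Setting Up the Development Environment](#chapter-1-setting-up-the-development-environment)
```

Note that the anchor must be computed from the final heading text, including any "Chapter N:" prefix added during renumbering.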
Next Steps
- Transcribe Audio with Local Speech-to-Text: foundational transcription guide.
- Generate Structured Meeting Notes from Audio Recordings: meeting-specific note format.
- Transcribe and Reformat Audio with LLM Post-Processing: clean up transcripts before chaptering.
- Build a Document Summarization Pipeline for Large Archives: summarize chaptered documents for cataloging.