Automate Contract and Compliance Document Review
Legal teams, procurement departments, and compliance officers review thousands of contracts, regulatory filings, and policy documents annually. Each document must be classified by type, key clauses extracted, dates and obligations identified, and risk indicators flagged. LM-Kit.NET combines document classification, structured extraction, and multi-turn conversation to build an automated contract review pipeline that runs entirely on-premises, keeping sensitive legal documents off external servers.
Why Local Contract Review Matters
On-device contract review solves two enterprise problems:
- Confidentiality requirements. Contracts contain trade secrets, financial terms, and strategic commitments. Uploading them to cloud AI services creates data exposure risk and may violate NDA terms. Local processing ensures that no contract text leaves the organization's infrastructure.
- Volume and consistency. A mid-size company processes hundreds of vendor agreements, NDAs, and service contracts per quarter. Manual review is slow and inconsistent. An automated pipeline applies the same extraction rules to every document, reducing missed clauses and standardizing the output format.
Prerequisites
| Requirement | Minimum |
|---|---|
| .NET SDK | 8.0+ |
| VRAM | 4+ GB (about 6 GB for the qwen3:8b model used in Step 3; see Model Selection) |
| Disk | ~3 GB free for model download |
Step 1: Create the Project
dotnet new console -n ContractReview
cd ContractReview
dotnet add package LM-Kit.NET
Step 2: Understand the Pipeline
Incoming      ┌───────────────┐      ┌───────────────┐      ┌───────────────┐
contract ───► │  1. Classify  │ ───► │  2. Extract   │ ───► │  3. Validate  │
              │     type      │      │  key fields   │      │    & flag     │
              └───────────────┘      └───────────────┘      └───────────────┘
                                                                    │
                                                                    ▼
                                                            Structured report
                                                             with risk flags
The pipeline has three stages. The first two use LM-Kit.NET capabilities; the third is plain C# logic:
- Categorization to identify the contract type (NDA, MSA, SOW, employment, lease)
- TextExtraction to pull key fields using a type-specific schema
- Validation logic to flag missing fields, expired dates, and risk indicators
Step 3: The Complete Contract Review Pipeline
using System.Text;
using System.Text.Json;
using LMKit.Data;
using LMKit.Extraction;
using LMKit.Model;
using LMKit.TextAnalysis;
LMKit.Licensing.LicenseManager.SetLicenseKey("");
Console.InputEncoding = Encoding.UTF8;
Console.OutputEncoding = Encoding.UTF8;
// ──────────────────────────────────────
// 1. Load model
// ──────────────────────────────────────
Console.WriteLine("Loading model...");
using LM model = LM.LoadFromModelID("qwen3:8b",
    downloadingProgress: (_, len, read) =>
    {
        if (len.HasValue) Console.Write($"\r Downloading: {(double)read / len.Value * 100:F1}% ");
        return true;
    },
    loadingProgress: p => { Console.Write($"\r Loading: {p * 100:F0}% "); return true; });
Console.WriteLine("\n");
// ──────────────────────────────────────
// 2. Define contract categories
// ──────────────────────────────────────
string[] contractTypes = { "nda", "master_service_agreement", "statement_of_work", "employment_agreement", "lease_agreement" };
string[] contractDescriptions =
{
    "A non-disclosure or confidentiality agreement protecting shared information between parties",
    "A master service agreement defining overall terms for an ongoing business relationship",
    "A statement of work specifying deliverables, timelines, and costs for a specific project",
    "An employment contract defining terms of hire, compensation, benefits, and termination",
    "A lease or rental agreement for property, equipment, or other assets"
};
var categorizer = new Categorization(model)
{
    AllowUnknownCategory = true
};
// ──────────────────────────────────────
// 3. Define extraction schemas per type
// ──────────────────────────────────────
var ndaSchema = new List<TextExtractionElement>
{
    new("disclosing_party", TextExtractionElement.ElementType.String,
        "The party sharing confidential information", isRequired: true),
    new("receiving_party", TextExtractionElement.ElementType.String,
        "The party receiving confidential information", isRequired: true),
    new("effective_date", TextExtractionElement.ElementType.String,
        "The date the NDA takes effect (YYYY-MM-DD format)"),
    new("expiration_date", TextExtractionElement.ElementType.String,
        "The date the NDA expires or the confidentiality period ends (YYYY-MM-DD format)"),
    new("confidentiality_period_years", TextExtractionElement.ElementType.Integer,
        "Number of years confidential information must be protected"),
    new("governing_law", TextExtractionElement.ElementType.String,
        "The jurisdiction whose laws govern the agreement"),
    new("permitted_disclosures", TextExtractionElement.ElementType.StringArray,
        "Exceptions to confidentiality (e.g., court orders, regulatory requirements)"),
    new("non_solicitation_clause", TextExtractionElement.ElementType.Bool,
        "Whether the NDA includes a non-solicitation provision")
};
var msaSchema = new List<TextExtractionElement>
{
    new("party_a", TextExtractionElement.ElementType.String,
        "First contracting party", isRequired: true),
    new("party_b", TextExtractionElement.ElementType.String,
        "Second contracting party", isRequired: true),
    new("effective_date", TextExtractionElement.ElementType.String,
        "Contract start date (YYYY-MM-DD format)", isRequired: true),
    new("termination_date", TextExtractionElement.ElementType.String,
        "Contract end date or renewal date (YYYY-MM-DD format)"),
    new("auto_renewal", TextExtractionElement.ElementType.Bool,
        "Whether the contract renews automatically"),
    new("termination_notice_days", TextExtractionElement.ElementType.Integer,
        "Number of days notice required for termination"),
    new("liability_cap", TextExtractionElement.ElementType.String,
        "Maximum liability amount or formula"),
    new("indemnification_clause", TextExtractionElement.ElementType.Bool,
        "Whether indemnification provisions exist"),
    new("governing_law", TextExtractionElement.ElementType.String,
        "Governing jurisdiction"),
    new("payment_terms", TextExtractionElement.ElementType.String,
        "Payment schedule and terms (e.g., Net 30, Net 60)")
};
var sowSchema = new List<TextExtractionElement>
{
    new("project_name", TextExtractionElement.ElementType.String,
        "Name or title of the project", isRequired: true),
    new("client", TextExtractionElement.ElementType.String,
        "The client or customer party", isRequired: true),
    new("provider", TextExtractionElement.ElementType.String,
        "The service provider party", isRequired: true),
    new("start_date", TextExtractionElement.ElementType.String,
        "Project start date (YYYY-MM-DD format)"),
    new("end_date", TextExtractionElement.ElementType.String,
        "Project completion or deadline date (YYYY-MM-DD format)"),
    new("total_value", TextExtractionElement.ElementType.Number,
        "Total contract value or project budget"),
    new("currency", TextExtractionElement.ElementType.String,
        "Currency code (e.g., USD, EUR)"),
    new("deliverables", new List<TextExtractionElement>
    {
        new("name", TextExtractionElement.ElementType.String, "Deliverable name"),
        new("due_date", TextExtractionElement.ElementType.String, "Due date for this deliverable"),
        new("acceptance_criteria", TextExtractionElement.ElementType.String, "How the deliverable is accepted")
    }, isArray: true, description: "List of project deliverables with dates"),
    new("payment_milestones", new List<TextExtractionElement>
    {
        new("milestone", TextExtractionElement.ElementType.String, "Milestone description"),
        new("amount", TextExtractionElement.ElementType.Number, "Payment amount for this milestone"),
        new("trigger", TextExtractionElement.ElementType.String, "Condition that triggers payment")
    }, isArray: true, description: "Payment schedule tied to project milestones")
};
// Map contract type to schema.
// Employment and lease agreements have no schema here, so the pipeline classifies them but skips extraction.
Dictionary<string, List<TextExtractionElement>> schemas = new()
{
    ["nda"] = ndaSchema,
    ["master_service_agreement"] = msaSchema,
    ["statement_of_work"] = sowSchema
};
// ──────────────────────────────────────
// 4. Process contracts
// ──────────────────────────────────────
string[] sampleContracts =
{
    "MUTUAL NON-DISCLOSURE AGREEMENT\n\n" +
    "This Mutual Non-Disclosure Agreement (the \"Agreement\") is entered into as of January 15, 2025 " +
    "(the \"Effective Date\") by and between Apex Technologies Inc., a Delaware corporation " +
    "(\"Disclosing Party\"), and Meridian Solutions LLC, a California limited liability company " +
    "(\"Receiving Party\").\n\n" +
    "The parties agree to protect all Confidential Information disclosed under this Agreement " +
    "for a period of three (3) years from the date of disclosure. This Agreement shall remain " +
    "in effect until December 31, 2027.\n\n" +
    "Permitted disclosures include: (a) information required by court order; (b) information " +
    "required by regulatory authorities; (c) information already in the public domain.\n\n" +
    "This Agreement includes a mutual non-solicitation provision for a period of twelve (12) months.\n\n" +
    "This Agreement shall be governed by the laws of the State of Delaware.",

    "MASTER SERVICE AGREEMENT\n\n" +
    "This Master Service Agreement (\"Agreement\") is entered into effective February 1, 2025, " +
    "by and between CloudStack Corp. (\"Provider\") and RetailMax Inc. (\"Client\").\n\n" +
    "Term: This Agreement shall commence on the Effective Date and continue for an initial term " +
    "of twenty-four (24) months, terminating on January 31, 2027. The Agreement shall automatically " +
    "renew for successive twelve-month periods unless either party provides ninety (90) days " +
    "written notice of non-renewal.\n\n" +
    "Liability: Provider's total aggregate liability shall not exceed the total fees paid by Client " +
    "in the twelve (12) months preceding the claim. Each party shall indemnify the other against " +
    "third-party claims arising from breach of this Agreement.\n\n" +
    "Payment: All invoices are due Net 45 from date of receipt.\n\n" +
    "Governing Law: This Agreement shall be governed by the laws of the State of New York."
};
var extractor = new TextExtraction(model)
{
    NullOnDoubt = true
};
Console.WriteLine("=== Contract Review Pipeline ===\n");
foreach (string contract in sampleContracts)
{
    // Step A: Classify contract type
    int typeIndex = categorizer.GetBestCategory(contractTypes, contractDescriptions, contract);
    string contractType = typeIndex >= 0 ? contractTypes[typeIndex] : "unknown";
    float classifyConfidence = categorizer.Confidence;
    Console.ForegroundColor = ConsoleColor.Yellow;
    Console.WriteLine($"Contract Type: {contractType.Replace('_', ' ').ToUpper()} ({classifyConfidence:P0})");
    Console.ResetColor();
    if (!schemas.TryGetValue(contractType, out var schema))
    {
        Console.WriteLine(" No extraction schema defined for this type. Skipping.\n");
        continue;
    }

    // Step B: Extract key fields
    extractor.Elements = schema;
    extractor.SetContent(contract);
    TextExtractionResult result = extractor.Parse();
    Console.ForegroundColor = ConsoleColor.Cyan;
    Console.WriteLine($" Extraction confidence: {result.Confidence:P0}");
    Console.ResetColor();

    // Step C: Validate and flag risks
    var flags = new List<string>();

    // Check for a missing or past end date (the field name depends on the schema)
    string? expirationDate = result.GetValue<string>("expiration_date")
        ?? result.GetValue<string>("termination_date")
        ?? result.GetValue<string>("end_date");
    if (string.IsNullOrEmpty(expirationDate))
    {
        flags.Add("WARNING: No expiration or termination date found");
    }
    // Parse with the invariant culture so the result does not depend on the machine's locale
    else if (DateTime.TryParse(expirationDate, System.Globalization.CultureInfo.InvariantCulture,
                 System.Globalization.DateTimeStyles.None, out DateTime expDate)
             && expDate < DateTime.Today)
    {
        flags.Add($"EXPIRED: Contract expired on {expirationDate}");
    }

    // Check for auto-renewal risk
    bool? autoRenewal = result.GetValue<bool?>("auto_renewal");
    if (autoRenewal == true)
    {
        int? noticeDays = result.GetValue<int?>("termination_notice_days");
        string notice = noticeDays.HasValue ? $"{noticeDays.Value} days" : "not specified";
        flags.Add($"AUTO-RENEWAL: Contract auto-renews. Notice required: {notice}");
    }

    // Check for liability cap
    string? liabilityCap = result.GetValue<string>("liability_cap");
    if (contractType == "master_service_agreement" && string.IsNullOrEmpty(liabilityCap))
    {
        flags.Add("RISK: No liability cap specified in MSA");
    }

    // Print extracted fields
    Console.WriteLine(" Extracted data:");
    Console.ForegroundColor = ConsoleColor.DarkGray;
    Console.WriteLine($" {result.Json}");
    Console.ResetColor();

    // Print risk flags
    if (flags.Count > 0)
    {
        Console.ForegroundColor = ConsoleColor.Red;
        Console.WriteLine("\n Risk Flags:");
        foreach (string flag in flags)
            Console.WriteLine($" {flag}");
        Console.ResetColor();
    }
    else
    {
        Console.ForegroundColor = ConsoleColor.Green;
        Console.WriteLine("\n No risk flags.");
        Console.ResetColor();
    }
    Console.WriteLine();
}
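The date check in Step C can be factored into a small helper that parses strictly in the YYYY-MM-DD form the schemas request and also warns ahead of expiry. The following is a minimal sketch under stated assumptions: `CheckExpiration` and the 90-day warning window are hypothetical and not part of LM-Kit.NET, and the reference day is passed in so the logic stays testable.

```csharp
using System;
using System.Globalization;

// Hypothetical helper (not an LM-Kit.NET API): turn an extracted date string into a risk flag.
// Returns null when no flag is needed.
string? CheckExpiration(string? isoDate, DateTime referenceDay, int warnWithinDays = 90)
{
    if (string.IsNullOrWhiteSpace(isoDate))
        return "WARNING: No expiration or termination date found";

    // Strict, culture-independent parse of the YYYY-MM-DD format requested in the schemas.
    if (!DateTime.TryParseExact(isoDate.Trim(), "yyyy-MM-dd", CultureInfo.InvariantCulture,
            DateTimeStyles.None, out DateTime expires))
        return $"WARNING: Unparseable date '{isoDate}'";

    if (expires < referenceDay.Date)
        return $"EXPIRED: Contract expired on {isoDate}";

    int daysLeft = (expires - referenceDay.Date).Days;
    return daysLeft <= warnWithinDays
        ? $"EXPIRING: Contract ends in {daysLeft} days ({isoDate})"
        : null;
}

// Usage with a fixed reference day; in the pipeline this would be DateTime.Today.
var sampleDay = new DateTime(2026, 11, 15);
Console.WriteLine(CheckExpiration("2027-01-31", sampleDay)); // EXPIRING: Contract ends in 77 days (2027-01-31)
Console.WriteLine(CheckExpiration("2025-01-01", sampleDay)); // EXPIRED: Contract expired on 2025-01-01
```

Reporting an unparseable date as its own warning keeps a model formatting slip from silently passing the expiry check.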
Step 4: Processing PDF Contracts
The same pipeline works with PDF files:
// Process a folder of contract PDFs
string contractsFolder = "contracts";
string[] contractFiles = Directory.GetFiles(contractsFolder, "*.pdf");
var csvOutput = new List<string>();
csvOutput.Add("file,type,confidence,party_a,party_b,effective_date,expiration_date,flags");
foreach (string file in contractFiles)
{
    string fileName = Path.GetFileName(file);
    Console.Write($" {fileName}... ");
    var attachment = new Attachment(file);

    // Classify
    int typeIndex = categorizer.GetBestCategory(contractTypes, contractDescriptions, attachment);
    string contractType = typeIndex >= 0 ? contractTypes[typeIndex] : "unknown";
    if (!schemas.TryGetValue(contractType, out var schema))
    {
        Console.WriteLine($"[{contractType}] - no schema");
        continue;
    }

    // Extract
    extractor.Elements = schema;
    extractor.SetContent(attachment);
    TextExtractionResult result = extractor.Parse();
    Console.WriteLine($"[{contractType}] confidence={result.Confidence:P0}");

    // Export to CSV (party and date field names differ per schema, so try each in turn)
    string partyA = result.GetValue<string>("party_a")
        ?? result.GetValue<string>("disclosing_party")
        ?? result.GetValue<string>("client") ?? "";
    string partyB = result.GetValue<string>("party_b")
        ?? result.GetValue<string>("receiving_party")
        ?? result.GetValue<string>("provider") ?? "";
    string effectiveDate = result.GetValue<string>("effective_date")
        ?? result.GetValue<string>("start_date") ?? "";
    string expirationDate = result.GetValue<string>("expiration_date")
        ?? result.GetValue<string>("termination_date")
        ?? result.GetValue<string>("end_date") ?? "";
    csvOutput.Add($"\"{fileName}\",\"{contractType}\",{result.Confidence:F2}," +
        $"\"{partyA}\",\"{partyB}\",\"{effectiveDate}\",\"{expirationDate}\",\"\"");
}
File.WriteAllLines("contract_review_results.csv", csvOutput);
Console.WriteLine($"\nExported results to contract_review_results.csv");
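The export above places extracted values between quotes verbatim and always writes an empty flags column. A party name that itself contains a double quote would break the row, and a locale that uses a decimal comma would split the confidence value into two cells. The sketch below shows one way to harden the row building. `CsvField` and `CsvRow` are hypothetical helper names, and the quoting follows the common RFC 4180 convention of doubling embedded quotes.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper: quote one CSV field, doubling any embedded double quote.
string CsvField(string? value) =>
    "\"" + (value ?? "").Replace("\"", "\"\"") + "\"";

// Hypothetical helper: build one row in the same column order as the header in Step 4.
string CsvRow(string file, string type, float confidence,
              string? partyA, string? partyB,
              string? effective, string? expiration,
              IEnumerable<string> flags)
{
    string[] cells =
    {
        CsvField(file),
        CsvField(type),
        // Invariant culture keeps '.' as the decimal separator on every machine.
        confidence.ToString("F2", CultureInfo.InvariantCulture),
        CsvField(partyA),
        CsvField(partyB),
        CsvField(effective),
        CsvField(expiration),
        // Flags share one cell, separated by "; ".
        CsvField(string.Join("; ", flags))
    };
    return string.Join(",", cells);
}

// Usage with made-up values; in the pipeline these come from the extraction result.
string row = CsvRow("acme.pdf", "nda", 0.9f,
    "Acme \"West\" Inc.", "Meridian Solutions LLC",
    "2025-01-15", "2027-12-31",
    new[] { "AUTO-RENEWAL: notice 90 days" });
Console.WriteLine(row);
```

Passing the flag list collected during validation fills the last column, so the spreadsheet carries the risk indicators alongside the extracted fields.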
Step 5: Adding Compliance Checks
Extend the pipeline with domain-specific compliance validation:
static List<string> RunComplianceChecks(string contractType, TextExtractionResult result)
{
    var issues = new List<string>();

    // Universal checks
    string? governingLaw = result.GetValue<string>("governing_law");
    if (string.IsNullOrEmpty(governingLaw))
        issues.Add("COMPLIANCE: No governing law specified");

    // NDA-specific checks
    if (contractType == "nda")
    {
        int? confidentialityYears = result.GetValue<int?>("confidentiality_period_years");
        if (confidentialityYears.HasValue && confidentialityYears > 5)
            issues.Add($"REVIEW: Unusually long confidentiality period ({confidentialityYears} years)");
        bool? nonSolicitation = result.GetValue<bool?>("non_solicitation_clause");
        if (nonSolicitation == true)
            issues.Add("NOTE: Non-solicitation clause present (may restrict future hiring)");
    }

    // MSA-specific checks
    if (contractType == "master_service_agreement")
    {
        string? paymentTerms = result.GetValue<string>("payment_terms");
        if (paymentTerms != null && paymentTerms.Contains("Net 60", StringComparison.OrdinalIgnoreCase))
            issues.Add("REVIEW: Extended payment terms (Net 60). Company standard is Net 30.");
        // A null result (clause not found or model unsure) is flagged as well
        bool? indemnification = result.GetValue<bool?>("indemnification_clause");
        if (indemnification != true)
            issues.Add("RISK: No indemnification clause found in MSA");
    }

    // SOW-specific checks
    if (contractType == "statement_of_work")
    {
        // Read as nullable, like the other numeric fields, so a missing value is not treated as a number
        double? totalValue = result.GetValue<double?>("total_value");
        if (totalValue.HasValue && totalValue.Value > 100000)
            issues.Add($"APPROVAL: Contract value (${totalValue.Value:N0}) exceeds $100K threshold. VP approval required.");
    }
    return issues;
}
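The MSA check above matches only the literal text "Net 60", so terms such as Net 45 or Net 90 pass unflagged even though the stated company standard is Net 30. One possible refinement is to parse the day count and compare it with the standard. This is a sketch only: `ParseNetDays` and `ReviewPaymentTerms` are hypothetical helpers, and the regular expression covers only simple "Net N" phrasings.

```csharp
using System;
using System.Text.RegularExpressions;

// The "company standard" named in the MSA check above.
const int StandardNetDays = 30;

// Hypothetical helper: pull the day count out of terms such as "Net 45" or "net-60 days".
// Returns null when no "Net N" pattern is present.
int? ParseNetDays(string? paymentTerms)
{
    if (string.IsNullOrWhiteSpace(paymentTerms)) return null;
    Match m = Regex.Match(paymentTerms, @"\bnet[\s-]*(\d{1,3})\b", RegexOptions.IgnoreCase);
    return m.Success ? int.Parse(m.Groups[1].Value) : null;
}

// Hypothetical helper: produce a review flag when the terms exceed the standard.
string? ReviewPaymentTerms(string? paymentTerms)
{
    int? days = ParseNetDays(paymentTerms);
    // Comparing a null day count yields false, so unknown terms produce no flag here.
    return days > StandardNetDays
        ? $"REVIEW: Extended payment terms (Net {days}). Company standard is Net {StandardNetDays}."
        : null;
}

// Usage with the payment clause from the sample MSA in Step 3.
Console.WriteLine(ReviewPaymentTerms("All invoices are due Net 45 from date of receipt."));
```

Terms that do not follow a "Net N" pattern (for example "due on receipt") return null from `ParseNetDays`. Whether those should raise their own flag is a policy decision left to the caller.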
Model Selection
| Model ID | VRAM | Accuracy | Best For |
|---|---|---|---|
| gemma3:4b | ~3.5 GB | Good | Standard contracts, high throughput |
| qwen3:8b | ~6 GB | Very good | Complex clauses, multilingual contracts |
| gemma3:12b | ~8 GB | Excellent | Dense legal language, nested obligations |
| qwen3:14b | ~10 GB | Excellent | Highest accuracy for critical review |
For contract review, qwen3:8b provides the best balance of accuracy and speed. Its strong reasoning capabilities handle nested legal clauses and cross-references well. Use gemma3:12b or larger for high-stakes contracts where accuracy is paramount.
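When the same code has to run on machines with different GPUs, the table can be turned into a lookup that picks the largest listed model fitting the available memory. This is an illustrative sketch: `PickModel` and the half-gigabyte headroom are assumptions, the VRAM figures are the approximate values from the table, and the available memory is supplied by the caller rather than detected.

```csharp
using System;
using System.Linq;

// Approximate VRAM figures copied from the Model Selection table.
(string Id, double VramGb)[] models =
{
    ("gemma3:4b", 3.5),
    ("qwen3:8b", 6.0),
    ("gemma3:12b", 8.0),
    ("qwen3:14b", 10.0)
};

// Hypothetical helper: largest listed model that fits with some headroom to spare.
// Returns null when nothing fits.
string? PickModel(double availableVramGb, double headroomGb = 0.5) =>
    models.Where(m => m.VramGb + headroomGb <= availableVramGb)
          .OrderByDescending(m => m.VramGb)
          .Select(m => m.Id)
          .FirstOrDefault();

// Usage: the returned ID could then be passed to LM.LoadFromModelID as in Step 3.
Console.WriteLine(PickModel(8.0));  // qwen3:8b
Console.WriteLine(PickModel(12.0)); // qwen3:14b
```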
Common Issues
| Problem | Cause | Fix |
|---|---|---|
| Wrong contract type assigned | Overlapping categories | Add more specific descriptions; use Guidance on the categorizer |
| Dates in wrong format | Regional date formats (DD/MM vs MM/DD) | Add format hint in Guidance: "European dates use DD/MM/YYYY format" |
| Missing clauses from multi-page PDFs | Extraction limited to first page | Use SetContent(attachment) without page limits for full document |
| Low confidence on scanned contracts | No OCR configured | Set extractor.OcrEngine to a VlmOcr instance for scanned documents |
| Nested obligation references missed | Model struggles with cross-references | Use a larger model; add Guidance describing the document structure |
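For the regional date format issue, a post-extraction normalizer can complement the Guidance hint. The sketch below converts a few common day-first or month-first formats to YYYY-MM-DD. `NormalizeDate` and its format lists are hypothetical and deliberately short. The day-first decision has to come from context, such as the governing law, because a string like 03/04/2025 is ambiguous on its own.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: normalize a regional date string to YYYY-MM-DD.
// Returns null when the input matches none of the expected formats.
string? NormalizeDate(string? raw, bool dayFirst)
{
    if (string.IsNullOrWhiteSpace(raw)) return null;

    // ISO input is accepted either way; the remaining formats depend on the regional convention.
    string[] formats = dayFirst
        ? new[] { "yyyy-MM-dd", "d/M/yyyy", "d.M.yyyy", "d MMMM yyyy" }
        : new[] { "yyyy-MM-dd", "M/d/yyyy", "MMMM d, yyyy" };

    return DateTime.TryParseExact(raw.Trim(), formats, CultureInfo.InvariantCulture,
               DateTimeStyles.None, out DateTime parsed)
        ? parsed.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture)
        : null;
}

// Usage: the same text yields different dates depending on the convention.
Console.WriteLine(NormalizeDate("03/04/2025", dayFirst: true));        // 2025-04-03
Console.WriteLine(NormalizeDate("03/04/2025", dayFirst: false));       // 2025-03-04
Console.WriteLine(NormalizeDate("January 15, 2025", dayFirst: false)); // 2025-01-15
```

A null result is a useful signal in itself: it marks a value worth routing to manual review instead of guessing.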
Next Steps
- Extract Structured Data from Unstructured Text: deep dive into schema-driven extraction.
- Build a Classification and Extraction Pipeline: general-purpose classify-then-extract pattern.
- Build a Self-Healing Extraction Pipeline with Fallbacks: add retry and fallback strategies.
- Chat with PDF Documents: interactive Q&A for deep contract analysis.