AI Language Detection from Image for C# .NET Applications


🎯 Purpose of the Sample

The Language Detection from Image Demo shows how to leverage the LM-Kit.NET SDK to automatically detect the language of text embedded in images. By combining vision-capable language models with integrated OCR, this solution streamlines the process of identifying the language in visual content, minimizing manual intervention and enabling seamless multilingual operations.
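In code, the flow boils down to loading a vision-enabled model, wrapping the image, and asking the SDK to identify the language. The C# sketch below is a minimal illustration only: the type and member names (LM, Attachment, LanguageDetection.DetectLanguage), the namespaces, and the placeholder model URI and image path are assumptions modeled on other LM-Kit samples, so check the SDK documentation for the exact API.

    using System;
    using LMKit.Model;          // assumed namespace for the model wrapper
    using LMKit.Data;           // assumed namespace for the Attachment type
    using LMKit.TextAnalysis;   // assumed namespace for LanguageDetection

    class Program
    {
        static void Main()
        {
            // Load a vision-enabled SLM (placeholder URI, not a real model link).
            var model = new LM(new Uri("https://example.com/path/to/vision-model.gguf"));

            // Wrap the image so it can be passed to the detector (assumed Attachment API).
            var image = new Attachment(@"C:\samples\menu_photo.jpg");

            // Run OCR plus language identification in one call (assumed method signature).
            var detector = new LanguageDetection(model);
            var language = detector.DetectLanguage(image);

            Console.WriteLine($"Detected language: {language}");
        }
    }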


👥 Industry Target Audience

This demo is particularly beneficial for developers and organizations involved in:

  • Global Enterprises: Enhance cross-lingual communication by automatically detecting languages in multimedia content.
  • Mobile & Web App Development: Integrate image-based language detection into applications to deliver localized experiences.
  • Media & Publishing: Automatically tag and index images by language, simplifying content management and searchability.
  • Intelligent Agent Solutions: Empower autonomous systems to interpret multilingual visual inputs and make real-time decisions without human oversight.

🚀 Problem Solved

Manually determining the language of text within images is inefficient and error-prone, especially in scenarios requiring rapid, autonomous decision-making. This demo addresses the challenge by automating language detection from image content—an essential capability for agentic solutions. Additionally, it offers a selection of vision-enabled Small Language Models (SLMs) that can run efficiently on CPUs, making the solution accessible and cost-effective without the need for specialized GPU hardware.


💻 Sample Application Description

The Language Detection from Image Demo is a console application that allows users to select a vision-enabled SLM and specify an image file. The application processes the image to detect the language of the embedded text and displays the detected language along with the processing time.

✨ Key Features

  • Model Selection: Choose from predefined vision-enabled SLMs that run efficiently on CPUs, or provide a custom model URI.
  • Progress Feedback: Visual indicators display model download and loading progress (see the loading sketch after this list).
  • Image-Based Language Detection: Automatically detects the language in images by leveraging integrated OCR and advanced language models.
  • Performance Metrics: Displays processing time to evaluate efficiency.
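Model selection and progress feedback are typically wired up when the model is constructed. The sketch below follows the callback pattern used in other LM-Kit console samples; the downloadingProgress and loadingProgress parameter names and the callback signatures are assumptions, so verify them against the SDK before reusing this code.

    using System;
    using LMKit.Model; // assumed namespace

    static class ModelLoader
    {
        // Assumed callback shape: return true to continue the download.
        static bool OnDownload(string path, long? contentLength, long bytesRead)
        {
            if (contentLength.HasValue)
            {
                double pct = 100.0 * bytesRead / contentLength.Value;
                Console.Write($"\rDownloading model: {pct:F1}%");
            }
            return true;
        }

        // Assumed callback shape: progress is reported between 0 and 1.
        static bool OnLoad(float progress)
        {
            Console.Write($"\rLoading model: {progress * 100:F0}%");
            return true;
        }

        public static LM Load(string modelUri)
        {
            // Parameter names are assumptions based on other LM-Kit samples.
            return new LM(new Uri(modelUri),
                          downloadingProgress: OnDownload,
                          loadingProgress: OnLoad);
        }
    }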

🤖 Benefits for Agentic Solutions

Integrating image-based language detection into autonomous agents offers significant advantages:

  • Real-Time Decision Making: Agents can immediately identify the language of visual content, enabling prompt localization or routing of information.
  • Enhanced User Experience: Accurate language detection allows agents to deliver more relevant and personalized interactions.
  • Scalability on Standard Hardware: With support for CPU-friendly SLMs, solutions can be deployed cost-effectively in environments without specialized GPU hardware.
  • Improved Context Awareness: Agents equipped with language detection capabilities can better understand and respond to diverse user inputs, driving smarter automation and more efficient service delivery.

🛠️ Getting Started

📋 Prerequisites

  • .NET Framework 4.6.2 or .NET 6.0

📥 Download the Project

▶️ Running the Application

  1. Clone the repository:

    git clone https://github.com/LM-Kit/lm-kit-net-samples.git
    
  2. Navigate to the project directory:

    cd lm-kit-net-samples/console_framework_4.62/language_detection_from_image
    

    or

    cd lm-kit-net-samples/console_net6/language_detection_from_image
    
  3. Build and run the application:

    dotnet build
    dotnet run
    
  4. Follow the on-screen prompts to select a model and provide the path to an image file for language detection.


💡 Example Usage

  1. Select a Model: Choose a vision-enabled SLM that runs on CPU or enter a custom model URI.
  2. Provide Image Path: Input the file path to an image containing text.
  3. Language Detection: The application processes the image, extracts text via integrated OCR, and detects the language.
  4. Review Results: The detected language and processing time are displayed.
  5. Repeat or Exit: Continue with additional images or exit the application by submitting an empty input (a minimal sketch of this loop follows the list).
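The surrounding console loop is plain .NET: read a path, time the call, print the result, and stop on empty input. A minimal sketch follows; the detect delegate stands in for the SDK call sketched earlier and is a hypothetical placeholder.

    using System;
    using System.Diagnostics;

    class DetectionLoop
    {
        // detect: a delegate wrapping the SDK detection call (hypothetical placeholder).
        public static void Run(Func<string, string> detect)
        {
            while (true)
            {
                Console.Write("Enter an image path (empty to exit): ");
                string path = Console.ReadLine();

                if (string.IsNullOrWhiteSpace(path))
                    break; // empty input ends the session

                var stopwatch = Stopwatch.StartNew();
                string language = detect(path);
                stopwatch.Stop();

                Console.WriteLine($"Detected language: {language}");
                Console.WriteLine($"Processing time: {stopwatch.Elapsed.TotalSeconds:F2} s");
            }
        }
    }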

By incorporating this demo into your projects, you can empower autonomous agents to intelligently process multilingual visual content using CPU-friendly models, driving efficiency and enhancing user interactions in complex, dynamic environments.