AI-powered ID Information Extraction for a Forex Trading Platform

About the Client

This article describes a real-life project. However, we cannot disclose our client’s name or certain project details for confidentiality reasons.

Our client is a fintech startup operating a forex trading platform. While headquartered in Singapore, the company has Japanese roots and primarily serves users across Asia. Operating in the financial sector, the client faces strict compliance requirements for user verification (KYC/AML) and needs a secure, scalable, and multilingual solution to streamline its onboarding process.

Project Overview

The project focuses on developing an OCR-based system capable of automatically extracting personal information from ID documents.

Industry

Financial Services

Technology

Computer Vision; Small Language Model (Generative AI-based); Azure Cognitive Services

Country

Japan

Timeline

03/2023 - 08/2023

Challenges

The client faces several key challenges in their onboarding and KYC processes:

Diverse document formats

The platform needs to validate more than 300 types of ID documents, whose formats vary by issuing country and change frequently.

Multilingual complexity

Serving users across Asia requires robust support for multiple languages, including Japanese, English, Chinese, Korean, Vietnamese, Malay, and Filipino.

Poor image quality

User-uploaded documents are often tilted, blurry, or poorly cropped, making it difficult to achieve accurate OCR results.

High operational costs

Manual data entry and verification through BPO services are costly, time-consuming, and prone to human error. 

Solution

RikkeiSoft delivers an AI-powered ID Information Extractor that integrates Computer Vision, OCR, and Natural Language Processing into one streamlined workflow:

Computer Vision Pre-processing

The system automatically rotates, resizes, and enhances ID images to ensure clarity and consistency, which improves downstream recognition accuracy.
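
The pre-processing step can be illustrated with a minimal sketch. It is not the project's implementation: it assumes a grayscale image held as a nested list of 0-255 values, rotation in quarter turns only, nearest-neighbour resizing, and a linear contrast stretch as the "enhancement". All function names are hypothetical, and a production system would more likely rely on an image library and handle arbitrary tilt angles.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels (0-255), row-major; an assumed format


def rotate_90_clockwise(img: Image) -> Image:
    """Rotate the image a quarter turn clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def resize_nearest(img: Image, new_w: int, new_h: int) -> Image:
    """Resize with nearest-neighbour sampling."""
    old_h, old_w = len(img), len(img[0])
    return [
        [img[y * old_h // new_h][x * old_w // new_w] for x in range(new_w)]
        for y in range(new_h)
    ]


def stretch_contrast(img: Image) -> Image:
    """Linearly rescale pixel values so they span the full 0-255 range."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [list(row) for row in img]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]


def preprocess(img: Image, quarter_turns: int, width: int, height: int) -> Image:
    """Rotate upright, bring to a fixed size, then enhance contrast."""
    for _ in range(quarter_turns % 4):
        img = rotate_90_clockwise(img)
    return stretch_contrast(resize_nearest(img, width, height))
```

Normalising every upload to the same orientation and size before recognition means the later stages see consistent input regardless of how the photo was taken.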

Text Detection & OCR

Advanced text region detection identifies relevant areas of the ID card, while Azure OCR services securely extract the raw text from these regions.
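
As a rough sketch of how detected regions might be handed to OCR, the example below sorts bounding boxes into reading order and passes each one to an OCR function. The `Region` type, the `line_tolerance` value, and the `ocr` callable are all assumptions made for illustration; the callable stands in for the OCR service and does not represent the Azure API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Region:
    """Bounding box of a detected text region, in pixels (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def reading_order(regions: List[Region], line_tolerance: int = 10) -> List[Region]:
    """Sort regions top-to-bottom, then left-to-right within each text line."""
    lines: List[List[Region]] = []
    for region in sorted(regions, key=lambda r: r.y):
        # Regions whose top edges are close together are treated as one line.
        if lines and abs(region.y - lines[-1][0].y) <= line_tolerance:
            lines[-1].append(region)
        else:
            lines.append([region])
    return [r for line in lines for r in sorted(line, key=lambda r: r.x)]


def extract_text(regions: List[Region], ocr: Callable[[Region], str]) -> str:
    """Run the supplied OCR callable on each region, in reading order."""
    return "\n".join(ocr(region) for region in reading_order(regions))
```

Keeping the OCR call behind a plain callable makes it easy to swap the recognition backend or to test the ordering logic without network access.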

Small Language Model (Generative AI-based)

A fine-tuned small language model parses OCR outputs and accurately extracts structured personal data fields such as full name, date of birth, nationality, and ID number. This step reduces errors caused by variations in ID formats.
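
The sketch below shows one way the model's output could be turned into structured fields. It assumes, purely for illustration, that the model is prompted to answer with a JSON object; the field names, the prompt wording, and the helper names are hypothetical, and the model call itself is left out.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical field names mirroring the fields named in the text.
FIELDS = ("full_name", "date_of_birth", "nationality", "id_number")


@dataclass
class IdFields:
    full_name: Optional[str] = None
    date_of_birth: Optional[str] = None
    nationality: Optional[str] = None
    id_number: Optional[str] = None


def build_prompt(ocr_text: str) -> str:
    """Assemble an illustrative extraction prompt for the language model."""
    keys = ", ".join(FIELDS)
    return (
        f"Extract the following fields as a JSON object with keys {keys}. "
        f"Use null for anything not present.\n\nOCR text:\n{ocr_text}"
    )


def parse_model_output(raw: str) -> IdFields:
    """Parse the model's reply, tolerating extra text around the JSON object."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end <= start:
        return IdFields()
    try:
        data = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return IdFields()
    if not isinstance(data, dict):
        return IdFields()
    cleaned = {}
    for name in FIELDS:
        value = data.get(name)
        # Keep only non-empty string values; unknown keys are ignored.
        if isinstance(value, str) and value.strip():
            cleaned[name] = value.strip()
    return IdFields(**cleaned)
```

Parsing defensively matters here because a generative model can return malformed or partial output, and an empty field is safer to pass downstream than a guessed one.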

Validation & Error Handling

The extractor applies rule-based validation (e.g., date format checks, numeric field verification) and confidence scoring to flag uncertain cases for manual review, ensuring both speed and reliability.
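
A minimal sketch of this validation step might look as follows. The specific rules (an ISO `YYYY-MM-DD` birth date that is not in the future, a digits-only ID number), the 0.9 confidence threshold, and the function names are assumptions chosen for illustration; real rules would differ per document type, since many ID numbers contain letters.

```python
from datetime import date, datetime
from typing import Dict, List


def validate_fields(fields: Dict[str, str], today: date) -> List[str]:
    """Return a list of rule violations; an empty list means all checks passed."""
    errors: List[str] = []

    # Date format check: must parse as YYYY-MM-DD and not lie in the future.
    try:
        dob = datetime.strptime(fields.get("date_of_birth", ""), "%Y-%m-%d").date()
        if dob > today:
            errors.append("date_of_birth is in the future")
    except ValueError:
        errors.append("date_of_birth is not a valid YYYY-MM-DD date")

    # Numeric field check (assumed rule: this ID type is digits only).
    id_number = fields.get("id_number", "")
    if not (id_number.isascii() and id_number.isdigit()):
        errors.append("id_number must contain digits only")

    return errors


def needs_manual_review(
    fields: Dict[str, str],
    confidences: Dict[str, float],
    today: date,
    threshold: float = 0.9,
) -> bool:
    """Flag a document if any rule fails or any field's confidence is low."""
    if validate_fields(fields, today):
        return True
    # A field with no reported confidence is treated as uncertain.
    return any(confidences.get(name, 0.0) < threshold for name in fields)
```

For example, a document whose fields pass every rule but whose ID number was read with a confidence of 0.6 would still be routed to a human reviewer under this sketch.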

Scalable & Secure Design

The solution is architected to scale seamlessly for different ID formats across new markets, while maintaining strict compliance with financial data privacy and security standards.

Results

80%

Time Reduction

70%

Cost Reduction

The implementation delivers measurable improvements in both efficiency and customer experience. The system now supports over 300 types of ID documents across multiple Asian countries and provides robust multilingual recognition. Document processing time drops by 80%, significantly accelerating customer onboarding, while operational costs fall by 70% thanks to the elimination of manual data entry. Most importantly, the client achieves a more secure, reliable, and user-friendly verification process, strengthening customer trust and improving overall satisfaction.