Optical Character Recognition (OCR)

CDMR incorporates only the latest state-of-the-art OCR engines into the DocControl system. Depending on individual customer requirements, the OCR engine integrated with DocControl provides high accuracy levels, artificial intelligence, and other specific OCR features such as zoning, text verification and visual references. Although OCR is extremely useful for some organizations, it may be unnecessary for others, which is why CDMR has made its OCR module an optional extension. Our consultants can advise you on whether your DMS requires OCR and how best to take advantage of it.

These are some of the common OCR Module features:

  • High recognition accuracy

  • Artificial intelligence

  • Built-in and user-defined lexicons

  • Text verification and correction during the recognition process

  • Zone designation

  • Different fonts and font sizes

  • Pre-processing for fax, dot matrix and degraded documents to improve recognition results

  • Recognized text can be exported in many different formats such as MS Word & MS Excel

  • Process documents in two-page mode for open-faced books and magazines

  • Many more OCR properties and functions

What is OCR?

OCR is the process of converting scanned images of machine-printed or handwritten text (numerals, letters, and symbols) into a computer-processable format (such as ASCII). Recognizing characters from a document image can be divided into three operational steps: document analysis, character recognition, and contextual processing.
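To make the last of these steps concrete, here is a minimal sketch of contextual processing, assuming it takes the form of a lexicon lookup (one common approach, not necessarily what the DocControl OCR engine does). The function names, the sample lexicon and the similarity cutoff are all hypothetical and chosen only for illustration.

```python
import difflib
from typing import Iterable, List


def correct_word(word: str, lexicon: Iterable[str], cutoff: float = 0.8) -> str:
    """Return the closest lexicon entry, or the word unchanged if none is close.

    Assumption: the recognizer emits one string per word and the lexicon
    holds lowercase entries. The cutoff of 0.8 is an arbitrary choice.
    """
    candidates = difflib.get_close_matches(
        word.lower(), list(lexicon), n=1, cutoff=cutoff
    )
    return candidates[0] if candidates else word


def contextual_pass(words: List[str], lexicon: Iterable[str]) -> List[str]:
    """Apply lexicon-based correction to every recognized word."""
    entries = list(lexicon)
    return [correct_word(w, entries) for w in words]


# Hypothetical lexicon and raw recognizer output with typical confusions
# such as the digit 0 read in place of the letter o.
LEXICON = ["optical", "character", "recognition", "document"]
RAW_WORDS = ["0ptical", "charaeter", "rec0gnition", "xyz"]

if __name__ == "__main__":
    print(contextual_pass(RAW_WORDS, LEXICON))
```

Words with no sufficiently similar lexicon entry are passed through unchanged, so unknown terms such as names or codes are not forced onto the nearest dictionary word.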

What is OCR used for?

Commercial OCR systems can largely be grouped into two categories - general purpose page readers and task-specific readers.

General purpose page readers are designed to handle a broad range of documents such as business letters, technical reports and newspapers. These systems capture an image of a document page and separate the page into text regions and non-text regions. Non-text regions such as graphics and line drawings are often saved separately from the text and its associated recognition results. Text regions are segmented into lines, words, and characters, and the characters are passed to the recognizer. Recognition results are output in a format that can be post-processed by application software.
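The segmentation of a text region into lines and characters can be sketched with projection profiles, a classic technique that is assumed here for illustration and is not confirmed to be the method any particular page reader uses. The page is modelled as a toy bitmap of strings ('#' for ink, '.' for background), and all names are hypothetical.

```python
from typing import List, Tuple

Bitmap = List[str]        # each row is a string: '#' is ink, '.' is background
Span = Tuple[int, int]    # (start, end) with end exclusive


def runs_of_ink(profile: List[int]) -> List[Span]:
    """Return the index ranges where the profile is non-zero."""
    runs: List[Span] = []
    start = None
    for i, count in enumerate(profile):
        if count > 0 and start is None:
            start = i                      # a run of ink begins
        elif count == 0 and start is not None:
            runs.append((start, i))        # the run ends at a blank row/column
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def segment_lines(page: Bitmap) -> List[Span]:
    """Find text lines from the horizontal projection (ink count per row)."""
    return runs_of_ink([row.count("#") for row in page])


def segment_characters(page: Bitmap, line: Span) -> List[Span]:
    """Find characters in one line from the vertical projection (ink per column)."""
    top, bottom = line
    width = len(page[0]) if page else 0
    profile = [
        sum(1 for r in range(top, bottom) if page[r][c] == "#")
        for c in range(width)
    ]
    return runs_of_ink(profile)


# Toy page: two text lines separated by a blank row.
PAGE = [
    "..........",
    ".##..#.##.",
    ".##..#.##.",
    "..........",
    ".#..###...",
    "..........",
]

if __name__ == "__main__":
    for line in segment_lines(PAGE):
        print(line, segment_characters(PAGE, line))
```

This sketch assumes a clean, unskewed page with gaps between characters. Grouping characters into words would additionally require a threshold on gap width, which is omitted here.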

A task-specific reader handles only specific document types such as pre-specified forms, cheques or survey letters. These readers usually capture only a few predefined document regions. Such systems emphasize high throughput and low error rates within those regions.
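The idea of capturing only predefined regions can be illustrated with a small zone template. This is a sketch under stated assumptions: the Zone type, the cheque template and its coordinates are hypothetical, and the page is again a toy bitmap of strings rather than a real scanned image.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Zone:
    """A named rectangular region of the page (bottom and right are exclusive)."""
    name: str
    top: int
    left: int
    bottom: int
    right: int


def crop(page: List[str], zone: Zone) -> List[str]:
    """Cut one zone out of the page bitmap."""
    return [row[zone.left:zone.right] for row in page[zone.top:zone.bottom]]


def extract_zones(page: List[str], zones: List[Zone]) -> Dict[str, List[str]]:
    """Return only the predefined regions; the rest of the page is ignored."""
    return {zone.name: crop(page, zone) for zone in zones}


# Hypothetical template for a cheque-like form: only two regions are read.
CHEQUE_TEMPLATE = [
    Zone("amount", top=0, left=6, bottom=2, right=10),
    Zone("account", top=3, left=0, bottom=4, right=5),
]

FORM = [
    "......##.#",
    "......#.##",
    "..........",
    "#.##......",
]

if __name__ == "__main__":
    for name, region in extract_zones(FORM, CHEQUE_TEMPLATE).items():
        print(name, region)
```

Each cropped region would then be handed to a recognizer. Because everything outside the zones is skipped, less of each page has to be processed.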


   Telephone: (603) 7710-1563/2/1      Email: sales@cdmr.com

Copyright 1997-2002 Virtual Studio Sdn Bhd. All Rights Reserved.