Optical Character Recognition (OCR)
CDMR incorporates only the latest state-of-the-art OCR engines into the
DocControl system. Depending on individual customer requirements, the
OCR engine integrated with DocControl provides high accuracy levels,
artificial intelligence and other specific OCR features such as zoning,
text verification and visual references. Although the OCR feature is
extremely useful for some organizations, it may be entirely redundant
for others. That is why CDMR has made its OCR module an optional
extension. Our consultants can advise you on whether your DMS requires
the OCR feature and how best to take advantage of it.
These are some of the common OCR Module features:
- High recognition accuracy
- Artificial intelligence
- Built-in and user-defined lexicons
- Text verification and correction during the recognition process
- Zone designation
- Different fonts and font sizes
- Pre-processing for fax, dot matrix and degraded documents to improve
  recognition results
- Recognized text can be exported in many different formats such as
  MS Word & MS Excel
- Process documents in two-page mode for open-faced books and magazines
- Many more OCR properties and functions …
What is OCR?
OCR is the process of converting scanned images of machine-printed or
handwritten text (numerals, letters, and symbols) into a
computer-processable format (such as ASCII). The character recognition
process for a document image can be divided into three operational
steps: document analysis, character recognition and contextual
processing.
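The three steps can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not how any particular OCR engine works: the 3x3 glyph templates, the function names and the sample lexicon are all hypothetical, document analysis is reduced to splitting one text line at blank columns, recognition is plain template matching, and contextual processing is a lexicon lookup using the standard library's `difflib.get_close_matches`.

```python
from difflib import get_close_matches

# Hypothetical 3x3 glyph templates ('#' = ink, '.' = background).
TEMPLATES = {
    "I": ("###", ".#.", "###"),
    "L": ("#..", "#..", "###"),
    "T": ("###", ".#.", ".#."),
    "O": ("###", "#.#", "###"),
}


def analyze_document(image):
    """Step 1 (document analysis): split one text line into glyphs at blank columns."""
    width = len(image[0])
    glyphs, start = [], None
    for x in range(width + 1):
        blank = x == width or all(row[x] == "." for row in image)
        if not blank and start is None:
            start = x                      # a glyph begins here
        elif blank and start is not None:
            glyphs.append(tuple(row[start:x] for row in image))
            start = None                   # the glyph has ended
    return glyphs


def recognize(glyph):
    """Step 2 (character recognition): pick the template with the fewest differing pixels."""
    def distance(template):
        return sum(a != b
                   for t_row, g_row in zip(template, glyph)
                   for a, b in zip(t_row, g_row))
    return min(TEMPLATES, key=lambda ch: distance(TEMPLATES[ch]))


def apply_context(word, lexicon):
    """Step 3 (contextual processing): snap the raw result to the closest lexicon entry."""
    matches = get_close_matches(word, lexicon, n=1, cutoff=0.6)
    return matches[0] if matches else word


def read_line(image, lexicon):
    raw = "".join(recognize(g) for g in analyze_document(image))
    return raw, apply_context(raw, lexicon)


# A scan of "TOLL" in which the foot of the T is damaged.
image = [
    "###.###.#...#..",
    ".#..#.#.#...#..",
    "#.#.###.###.###",
]
raw, corrected = read_line(image, lexicon=["TOLL", "TOOL", "LOT"])
print(raw, "->", corrected)
```

In this example the damaged T is closer to the I template, so the raw result is IOLL, and the lexicon step restores TOLL. This is the kind of correction that built-in and user-defined lexicons make possible.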
What is OCR used for?
Commercial OCR systems can largely be grouped into two categories:
general purpose page readers and task-specific readers.
General purpose page readers are designed to handle a broad range of
documents such as business letters, technical reports and newspapers.
These systems capture an image of a document page and separate the page
into text regions and non-text regions. Non-text regions such as
graphics and line drawings are often saved separately from the text and
associated recognition results. Text regions are segmented into lines,
words, and characters, and the characters are passed to the recognizer.
Recognition results are output in a format that can be post-processed by
application software.
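The segmentation of a text region into lines and words can be sketched with projection profiles, a common textbook technique. This is only an illustration: nothing here is taken from DocControl, the names `ink_runs` and `segment_page` are hypothetical, and the page is assumed to be a clean, unskewed bitmap given as strings of '#' (ink) and '.' (background).

```python
def ink_runs(profile, max_gap=0):
    """Return (start, end) pairs for runs of ink, bridging blank gaps of up to max_gap."""
    runs, start, gap = [], None, 0
    for i, has_ink in enumerate(profile):
        if has_ink:
            if start is None:
                start = i
            gap = 0
            end = i + 1                    # end is exclusive
        elif start is not None:
            gap += 1
            if gap > max_gap:              # gap too wide: close the current run
                runs.append((start, end))
                start = None
    if start is not None:
        runs.append((start, end))
    return runs


def segment_page(page, word_gap=2):
    """Split a page into lines, then each line into word boxes.

    Returns one list per line; each box is (top, bottom, left, right).
    Blank gaps narrower than word_gap columns are treated as character spacing.
    """
    row_profile = ["#" in row for row in page]
    lines = []
    for top, bottom in ink_runs(row_profile):
        line = page[top:bottom]
        col_profile = [any(row[x] == "#" for row in line)
                       for x in range(len(page[0]))]
        words = ink_runs(col_profile, max_gap=word_gap - 1)
        lines.append([(top, bottom, left, right) for left, right in words])
    return lines


page = [
    "##.##...###",
    "#..#....#.#",
    "...........",
    "###........",
]
boxes = segment_page(page)
print(boxes)
```

For the sample page this yields two lines: the first holds two word boxes (the one-column gap is bridged, the three-column gap is not) and the second holds one. Each box could then be split into characters and passed to a recognizer.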
A task-specific reader handles only specific document types such as
pre-specified forms, cheques or survey letters. These readers usually
capture only a few predefined document regions. Such systems emphasize
high throughput rates and low error rates for those regions.
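Capturing only predefined regions can be pictured as cropping fixed zones out of each page before recognition. The sketch below assumes a hypothetical `Zone` record and made-up zone names and coordinates for a cheque-like form; it shows the idea only and does not reflect the zoning interface of any specific product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    """A named rectangular region of a form (hypothetical layout record)."""
    name: str
    top: int
    left: int
    height: int
    width: int


def crop(page, zone):
    """Cut one zone out of a page given as a list of equal-length strings."""
    rows = page[zone.top:zone.top + zone.height]
    return [row[zone.left:zone.left + zone.width] for row in rows]


def extract_zones(page, zones):
    """Return only the predefined regions; the rest of the page is ignored."""
    return {zone.name: crop(page, zone) for zone in zones}


# Made-up layout: two fields at fixed positions on every page of this form type.
FORM_ZONES = [
    Zone("amount", top=1, left=1, height=2, width=2),
    Zone("date", top=1, left=7, height=2, width=1),
]

page = [
    "..........",
    ".##....#..",
    ".##....#..",
    "..........",
]
fields = extract_zones(page, FORM_ZONES)
print(fields)
```

Because only the cropped zones are handed to the recognizer, far less of each page has to be processed, which is one way such readers can reach high throughput.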