CuneiForm (software)
Original author(s) | Cognitive Technologies |
---|---|
Developer(s) | Cognitive Technologies |
Stable release | 12 / December 12, 2007 |
Preview release | sources / April 2, 2008 |
Written in | C and C++ |
Operating system | Cross-platform |
Type | Optical character recognition |
License | Freeware/BSD licenses |
Website | http://openocr.org |
CuneiForm is an optical character recognition (OCR) tool. It was originally developed by Cognitive Technologies and, after a few years without development, was released as freeware on December 12, 2007. The kernel of the OCR engine was released under the open-source BSD license at the beginning of April 2008.[1]
Features
CuneiForm is an omnifont system. Its algorithms are derived from the rules by which letters are written and from their topology, and they do not require pattern definitions or training. CuneiForm recognizes any printed font (scanned books, newspapers, magazines, output from laser and dot-matrix printers, typewritten text, etc.). It does not recognize handwritten or pseudo-handwritten text, nor does it recognize decorative fonts (e.g. Gothic). It has special settings for recognizing text from dot-matrix printers and from faxes at 200x100 DPI resolution.
CuneiForm can preserve text formatting and recognizes complicated tables of any structure.
It recognizes Bulgarian, Croatian, Czech, Danish, Dutch, English, Estonian, French, German, Hungarian, Italian, Latvian, Lithuanian, Polish, Portuguese, Romanian, Russian, Russian-English bilingual, Serbian, Slovene, Spanish, Swedish, Turkish and Ukrainian text.
CuneiForm can save recognized text in RTF, HTML or plain text format. It can also pass text to Microsoft Word or Microsoft Excel.
On-line recognition service
In June 2008, Cognitive Technologies launched a free online recognition service on OpenOCR.org. Before the launch, the company planned for the service to reach 10,000 recognitions per day by the end of 2008.
History
Once the leading OCR software in Russia, CuneiForm competed with ABBYY FineReader.
In 1993, Cognitive Technologies signed an OEM contract with Corel Corporation, which allowed the Cognitive recognition library to be built into the popular publishing package CorelDRAW 3.0 (and subsequent versions).
In 1996, OCR CuneiForm'96 was released; it was the first OCR package to include the adaptive recognition method. This method is based on a combination of two types of printed-character recognition algorithms: multifont and omnifont. The self-learning system can recognize poorly printed symbols by creating an internal font from those symbols that were printed well enough to be recognized. In this way it adjusts (adapts) dynamically to the specific input characters.
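The adaptive idea described above can be illustrated with a toy two-pass sketch. This is not CuneiForm's actual code: the glyph representation (flattened 0/1 bitmaps of equal size), the `first_pass` list of (label, confidence) pairs standing in for the output of an omnifont classifier, the 0.9 threshold, and all function names are assumptions made for illustration only.

```python
from collections import defaultdict


def hamming(a, b):
    """Count the pixels in which two equally sized bitmaps differ."""
    return sum(x != y for x, y in zip(a, b))


def build_internal_font(glyphs, first_pass, threshold=0.9):
    """Build one template per character from confidently recognized glyphs.

    first_pass is a hypothetical omnifont result: one (label, confidence)
    pair per glyph. Each template pixel is a majority vote over the samples.
    """
    samples = defaultdict(list)
    for glyph, (label, confidence) in zip(glyphs, first_pass):
        if confidence >= threshold:
            samples[label].append(glyph)
    font = {}
    for label, group in samples.items():
        font[label] = tuple(
            1 if 2 * sum(pixels) >= len(group) else 0
            for pixels in zip(*group)
        )
    return font


def adaptive_recognize(glyphs, first_pass, threshold=0.9):
    """Re-label low-confidence glyphs using the nearest internal-font template."""
    font = build_internal_font(glyphs, first_pass, threshold)
    result = []
    for glyph, (label, confidence) in zip(glyphs, first_pass):
        if confidence < threshold and font:
            label = min(font, key=lambda c: hamming(glyph, font[c]))
        result.append(label)
    return "".join(result)


# 3x3 bitmaps, flattened row by row.
GLYPH_I = (0, 1, 0, 0, 1, 0, 0, 1, 0)
GLYPH_L = (1, 0, 0, 1, 0, 0, 1, 1, 1)
NOISY_L = (1, 0, 0, 1, 0, 0, 1, 1, 0)  # poorly printed "L", one pixel missing

glyphs = [GLYPH_I, GLYPH_L, NOISY_L]
# Assumed first-pass output: the noisy glyph is misread with low confidence.
first_pass = [("I", 0.95), ("L", 0.97), ("I", 0.4)]

text = adaptive_recognize(glyphs, first_pass)
```

In this sketch the well-printed "I" and "L" form the internal font, and the poorly printed third glyph is matched against that font instead of keeping its low-confidence first-pass label.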
Future
Cognitive Technologies has started a program to make OCR available to all users. Its first step was releasing CuneiForm as freeware.
Cognitive Technologies plans to begin development of a new version of the software, acting as investor and coordinator of the project. The developers chose the BSD license for the release to take into account all legal and technical nuances, but the whole program or its separate modules may later be released under the GPL.[2]
In September 2008, part of CuneiForm was released as open-source software. One of the missing parts is table analysis; however, Cognitive has promised to release this component in the future.
CuneiForm is being ported to Linux, BSD and Mac OS X.[3] This branch of the code will eventually be merged with the Cognitive codebase.