Optical character recognition (OCR) has come a long way in the last decade. The technology provides a complete solution for form processing and document capture. However, the capture process can introduce distortions that leave you with poorly scanned photo, text-photo, and natural images, which makes OCR unreliable. To address this shortcoming, several methods have evolved that correct or remove image distortions and noticeably improve OCR accuracy.
Now that we have seen the need for image processing, you may be glad to know that several open source libraries can help you improve OCR accuracy. The JAI media APIs, JMagick, ImageJ, AForge.NET, OpenCV, and ImageMagick are a few of the well-known open source libraries capable of processing images to suit your needs.
Our Experience with ImageMagick:
We tested the ImageMagick open source library while working on a 3D inspection-based application, and the results were phenomenal, particularly for selecting and processing an object from a huge inspection file.
ImageMagick allows users to create, edit, or convert images, with support for over 200 file types. Users can resize, flip, mirror, distort, rotate, and transform images, along with a dozen other operations. We used the program to improve the legibility of the characters in the huge data files.
The crop function was highly useful in improving the accuracy of the recognized text, because it limits processing to the region that contains the characters. Coupled with the sharpen function, which sharpens edges, it greatly improved the quality of the picture.
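The crop-then-sharpen step can be sketched as a small Python helper that only builds the ImageMagick arguments. This is a minimal sketch, not the exact command we ran: the function name `crop_and_sharpen_args`, the region values, and the default sigma are illustrative assumptions. The `-crop`, `+repage`, and `-sharpen` options themselves are standard ImageMagick options.

```python
def crop_and_sharpen_args(width, height, x, y, sigma=1.0):
    """Build ImageMagick arguments that crop a region and then sharpen it.

    Hypothetical helper for illustration; the values passed in are assumptions.
    """
    if width <= 0 or height <= 0:
        raise ValueError("crop width and height must be positive")
    if x < 0 or y < 0:
        raise ValueError("this sketch only handles non-negative offsets")

    # ImageMagick geometry syntax: WIDTHxHEIGHT+X+Y
    geometry = f"{width}x{height}+{x}+{y}"

    return [
        "-crop", geometry,          # keep only the region that holds the text
        "+repage",                  # reset the canvas offset left by the crop
        "-sharpen", f"0x{sigma}",   # radius 0 lets ImageMagick pick the radius
    ]


# Example: a 200x50 region whose top-left corner is at (10, 20).
print(crop_and_sharpen_args(200, 50, 10, 20))
```

The resulting list can be placed between the input and output file names of an ImageMagick command, for example `magick input.png ... output.png` on ImageMagick 7.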
One of the unique features of ImageMagick was the sampling tool. It takes samples from the image and uses them to adjust for variations caused by noise in the picture. This gave us better image quality with less noise.
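To illustrate the general idea of sampling neighbouring pixels to suppress noise, here is a minimal pure-Python median filter over a grayscale grid. It is a generic sketch and not ImageMagick's own implementation; the function name `median_filter` and the sample grid are assumptions made for the example. ImageMagick offers comparable built-in noise reduction, such as its `-despeckle` option.

```python
from statistics import median_low


def median_filter(pixels, radius=1):
    """Return a new 2D grid where each pixel is the median of its neighbourhood.

    `pixels` is a list of rows of grayscale values. The window is clipped at
    the image borders, so edge pixels use fewer samples.
    """
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Sample every pixel inside the (2*radius+1) square around (x, y).
            window = [
                pixels[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            # median_low always returns one of the sampled values.
            row.append(median_low(window))
        result.append(row)
    return result


# A flat gray patch with a single bright noise pixel in the middle.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
cleaned = median_filter(noisy)
print(cleaned[1][1])  # the isolated outlier is replaced by its neighbours' value
```

A median is used here rather than a mean because an isolated outlier does not pull the result toward itself, which helps keep character edges crisp.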
Fig 1: Before and After image processing effects
Users also get additional features such as generalized pixel distortion, noise and color reduction, transforms, and special effects. These features can bring a considerable improvement in the quality of the image. While they can be used from the command line, you can also call them from programs written in various languages such as C, C++, Pascal, Python, and PHP. The API and ABI also appear to be stable, which reduces the risk that an upgrade breaks an existing integration. ImageMagick runs on Linux, Windows, Mac OS X, iOS, Android, and other platforms.
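As a rough illustration of driving the command-line tool from a program, the sketch below uses Python's standard `subprocess` module. The helper names (`find_imagemagick`, `build_command`, `run_imagemagick`) and the file names are hypothetical. It assumes that ImageMagick 7 installs an executable named `magick` and that ImageMagick 6 installs `convert`; check this against your own installation.

```python
import shutil
import subprocess


def find_imagemagick():
    """Return the path of an ImageMagick executable, or None if none is found.

    Assumption: version 7 provides `magick`, version 6 provides `convert`.
    Note that Windows ships an unrelated `convert` tool, so prefer `magick`.
    """
    for name in ("magick", "convert"):
        path = shutil.which(name)
        if path:
            return path
    return None


def build_command(executable, src, dst, operations):
    """Assemble the argument list: executable, input, operations, output."""
    return [executable, src, *operations, dst]


def run_imagemagick(src, dst, operations):
    """Run ImageMagick on `src` and write the result to `dst`."""
    executable = find_imagemagick()
    if executable is None:
        raise FileNotFoundError("ImageMagick executable not found on PATH")
    command = build_command(executable, src, dst, operations)
    # check=True raises CalledProcessError if ImageMagick reports a failure.
    subprocess.run(command, check=True)
    return command


# Example argument list: rotate a scan by 90 degrees and double its size.
print(build_command("magick", "scan.png", "scan_fixed.png",
                    ["-rotate", "90", "-resize", "200%"]))
```

Passing the arguments as a list, rather than as one shell string, avoids quoting problems with file names that contain spaces.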
While not everyone agrees on the benefits offered by image processing tools, practical use hints at tremendous advantages. One of the most common benefits is a lower chance of mistakes or of capturing the wrong data. Users can also shorten OCR time and reduce the effort that would otherwise go into correcting the extracted data. Besides, processing the image before OCR helps ensure that words, text, tables, and data are identified according to the pre-set criteria of the software. This leads to better categorization of data and graphs, and an enhanced final output.
At ProtoTech, we have acquired a decade of experience in designing, developing, and implementing bespoke CAD/CAE software solutions and applications for our global clientele.
Get in touch with our experts today to learn more about this topic or discuss ideas using other tools.