Optical Character Recognition (ABBYY FineReader OCR)

Discussion of computer hardware and software.
Also feel free to discuss your audio/visual setups, or get help here!
Need help with computer hardware or software? Ask here!
Feel free to share and/or discuss "Freeware" or "Shareware" software here as well.

Moderators: Moderator5, Moderator3, FECC-Moderator, Site Mechanic

Forum rules
Discussion or posting of serials/cracks/keygens/etc as well as links to them will not be permitted.
Discussion of "Warez" or Pirated Software is not permitted.



Topic author
JimmyCool
Posts: 5716
Registered for: 16 years 3 months
Location: Viña del Mar, Chile
Mood:
Has thanked: 1010 times
Been thanked: 2070 times
Contact:

Optical Character Recognition (ABBYY FineReader OCR)

#1129039

Post by JimmyCool »

What is OCR?

Suppose you wanted to digitize a magazine article or a printed contract. You could spend hours retyping and then correcting misprints. Or you could convert all the required materials into digital format in several minutes using a scanner (or a digital camera) and Optical Character Recognition software.

What exactly is meant by OCR?

Optical Character Recognition, or OCR, is a technology that enables you to convert different types of documents, such as scanned paper documents, PDF files or images captured by a digital camera into editable and searchable data.

Imagine you have a paper document, for example a magazine article or a brochure, or a PDF contract your partner sent you by email. Obviously, a scanner alone is not enough to make this information available for editing, say in Microsoft Word. All a scanner can do is create an image or snapshot of the document that is nothing more than a collection of black-and-white or colour dots, known as a raster image. To extract and repurpose data from scanned documents, camera images or image-only PDFs, you need OCR software that singles out the letters in the image, puts them into words and then puts the words into sentences, thus enabling you to access and edit the content of the original document.

What Technology lies behind OCR?

The exact mechanisms that allow humans to recognize objects are yet to be understood, but the three basic principles are already well known by scientists – integrity, purposefulness and adaptability (IPA). These principles constitute the core of ABBYY FineReader OCR allowing it to replicate natural or human-like recognition.

Let’s take a look at how FineReader OCR recognizes text. First, the program analyzes the structure of the document image. It divides the page into elements such as blocks of text, tables, images, etc. The lines are divided into words and then into characters. Once the characters have been singled out, the program compares them with a set of pattern images. It advances numerous hypotheses about what each character is. Based on these hypotheses, the program analyzes different ways of breaking lines into words and words into characters. After processing a huge number of such probabilistic hypotheses, the program finally makes its decision and presents you with the recognized text.
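To make the "compare with pattern images and advance hypotheses" step concrete, here is a minimal Python sketch. It is not how FineReader is actually implemented. The 3x3 bitmaps, the PATTERNS table and the function names are all made up for illustration, and real engines use far richer features than pixel agreement.

```python
# Hypothetical 3x3 pattern images: "1" is a dark pixel, "0" is a light pixel.
PATTERNS = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}


def similarity(glyph, pattern):
    """Fraction of pixels on which the glyph and the pattern agree."""
    total = 0
    same = 0
    for glyph_row, pattern_row in zip(glyph, pattern):
        for g, p in zip(glyph_row, pattern_row):
            total += 1
            same += g == p
    return same / total


def hypotheses(glyph):
    """Return (character, score) hypotheses, best match first."""
    scored = [(ch, similarity(glyph, pat)) for ch, pat in PATTERNS.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# A noisy "T": one stray dark pixel in the bottom-right corner.
noisy_t = ("111", "010", "011")
ranked = hypotheses(noisy_t)
print(ranked)
```

Every pattern gets a score rather than a yes/no answer, so later stages can still fall back on the second or third hypothesis if the first one does not fit the word.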

In addition, ABBYY FineReader provides dictionary support for 36 languages. This enables secondary analysis of the text elements on word level. With dictionary support, the program ensures even more accurate analysis and recognition of documents and simplifies further verification of recognition results.
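The word-level check can be sketched roughly like this in Python. Again this is only an illustration under assumptions: the tiny DICTIONARY, the scores and the best_word function are invented for the example and say nothing about ABBYY's real dictionaries or scoring.

```python
from itertools import product

# Hypothetical tiny word list standing in for a real language dictionary.
DICTIONARY = {"cat", "cot", "eat"}


def best_word(char_hypotheses, dictionary=DICTIONARY):
    """Pick a word from per-character lists of (character, score) hypotheses.

    Words found in the dictionary are preferred; if no combination is a
    dictionary word, the highest-scoring raw combination is returned.
    """
    candidates = []
    for combo in product(*char_hypotheses):
        word = "".join(ch for ch, _ in combo)
        score = 1.0
        for _, char_score in combo:
            score *= char_score
        candidates.append((word, score))
    in_dictionary = [c for c in candidates if c[0] in dictionary]
    pool = in_dictionary or candidates
    return max(pool, key=lambda c: c[1])[0]


# Per-character hypotheses for a three-letter word (made-up scores).
word_hypotheses = [
    [("c", 0.6), ("e", 0.4)],
    [("a", 0.9)],
    [("l", 0.55), ("t", 0.45)],
]
print(best_word(word_hypotheses))
```

Going by the image scores alone, the top guess would be "cal"; the dictionary pass prefers the slightly lower-scoring "cat" because it is a real word.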

How to use OCR Software?

Using ABBYY FineReader OCR is easy. The process generally consists of three stages: Open (Scan) the document, Recognize it, and then Save it in a convenient format (DOC, RTF, XLS, PDF, HTML, TXT, etc.) or export the data directly to an application such as Microsoft Word, Microsoft Excel or Adobe Acrobat.

In addition, the latest version of ABBYY FineReader supports an Automated Tasks mode, which is essential when you deal with routine tasks regularly. With this feature, recognition tasks run automatically, without you having to manually execute all of the above-mentioned steps.

What benefits does OCR bring to You?

With FineReader OCR, the recognized document looks just like the original. Advanced, powerful OCR software allows you to save a lot of time and effort when creating, processing and repurposing various documents. With ABBYY FineReader OCR, you can scan paper documents for further editing and sharing with your colleagues and partners. You can extract quotes from books and magazines and use them in your coursework and papers without retyping. With a digital camera and FineReader OCR, you can capture text outdoors from banners, posters and timetables and then use the captured information for your own purposes. In the same way, you can capture information from paper documents and books, for example if there is no scanner close at hand or you cannot use it. In addition, you can use OCR software to create searchable PDF archives.

The entire process of data conversion from original paper document, image or PDF takes less than a minute, and the final recognized document looks just like the original!

ABBYY FineReader 11 Professional Edition


BUY ($99 - $229)
Try for free (15 days use)

Source: ABBYY

More on OCR on Wikipedia:
http://en.wikipedia.org/wiki/Optical_character_recognition

It can really save you a looooooot of time!
You can find cheaper software based on the OCR technology too.


Note: This post is intended solely for FECC forum readers. Unauthorized reproduction or distribution of this post is strictly prohibited.


rjm
Posts: 11323
Registered for: 13 years 2 months
Location: Cali
Has thanked: 2717 times
Been thanked: 1200 times
Contact:

Re: Optical Character Recognition (ABBYY FineReader OCR)

#1129195

Post by rjm »

Does anyone have any idea when they'll be able to OCR real "handwritten" words, words that were written on paper, in cursive, much as speech recognition works?

I would think if speech recognition can be done, a computer ought to be able to "learn" someone's cursive, especially since it can be done with specially connected pens. Why can't you scan in paper with cursive writing, and get editable text? This is like "the final frontier": it seems they still can't do it.

I usually like to write in longhand, and then word-process it. The only way to do that, from paper, is to read it into the computer - so far. Yes, you can use special equipment, but I prefer a fountain pen.

Well, maybe they ARE getting there! http://www.a2ia.com//Web_Bao/Historic_Document_Conversion.aspx

rjm


"And even in our sleep pain that cannot forget falls drop by drop upon the heart, and in our own despair, against our will, comes wisdom to us by the awful grace of God."
Aeschylus

"Treat me mean and cruel, treat me like a fool, but love me!"

My Tumblr blog: https://robinmark64.tumblr.com/

https://www.youtube.com/user/robinmark64


promiseland
Posts: 10091
Registered for: 12 years 11 months
Has thanked: 760 times
Been thanked: 1261 times

Re: Optical Character Recognition (ABBYY FineReader OCR)

#1129206

Post by promiseland »

rjm wrote:Does anyone have any idea when they'll be able to OCR real "handwritten" words, words that were written on paper, in cursive, much as speech recognition works?

I would think if speech recognition can be done, a computer ought to be able to "learn" someone's cursive, especially since it can be done with specially connected pens. Why can't you scan in paper with cursive writing, and get editable text? This is like "the final frontier": it seems they still can't do it.

I usually like to write in longhand, and then word-process it. The only way to do that, from paper, is to read it into the computer - so far. Yes, you can use special equipment, but I prefer a fountain pen.

Well, maybe they ARE getting there! http://www.a2ia.com//Web_Bao/Historic_Document_Conversion.aspx

rjm
That would be nice, but handwriting recognition is available only for Tablet PCs, and the software has to learn your style, the same as Dragon has to learn your individual voice. And since everyone's handwriting is as unique as a fingerprint, it would be almost impossible to develop software that recognizes everyone's handwriting with "on the fly" deciphering, because errors would be inevitable.




rjm
Posts: 11323
Registered for: 13 years 2 months
Location: Cali
Has thanked: 2717 times
Been thanked: 1200 times
Contact:

Re: Optical Character Recognition (ABBYY FineReader OCR)

#1149966

Post by rjm »

NEW OCR QUESTION!

I have OmniPage Pro 14 on my machine. The upgrade to 18 is 200 bucks. ABBYY looks a little more affordable, and with the upgrade pricing even more so, but will they upgrade from a competing product?

And is there a big difference between, say, version 14 of OmniPage Pro and ABBYY? This is a major purchase, and if I can get by with this (which was paid for in full at the time), well, I will. So, is it worth it?

I'm doing a lot of research lately, and I need good OCR. The little server-based apps for phones mostly suck. Which is why these are so expensive, obviously.

rjm

