How to install an OCR program on Debian


Lately I have been digitizing a large number of documents that I had at home: documents that take up space I need to free, but that I do not want to lose. Searching the internet, I found a solution based on an OCR program and the smartphone camera.

With the smartphone camera I photograph the document and then run an OCR program on the image to produce a text document that can be used and saved on the computer. But which program should be used for OCR on Debian or another GNU/Linux distribution?

Browsing the Internet, I found several websites that discussed this type of program. On GNU/Linux, an OCR program is made up of a recognition engine and an interface. For the recognition engine there is a very good one called tesseract-ocr (I tested it personally and it works very well), which is the one we will use. For the interface we will choose gImageReader, which is friendly for all types of users.

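To install them, we open a terminal and type the following. The tesseract-ocr-spa package adds the Spanish language data, and apt can be used in place of aptitude with the same package names: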

sudo aptitude install tesseract-ocr tesseract-ocr-spa gimagereader
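To confirm that the Spanish language data was installed, the engine can be asked for its language list. This is only a minimal sketch: has_lang is a hypothetical helper name, and the 2>&1 redirection is there on the assumption that some tesseract versions print the list to standard error.

```shell
#!/bin/sh
# has_lang CODE: succeeds if CODE appears as a whole line on standard input.
# (Hypothetical helper, not part of tesseract.)
has_lang() {
  grep -qxF -- "$1"
}

# Only query the engine if the tesseract command is actually available.
if command -v tesseract >/dev/null 2>&1; then
  if tesseract --list-langs 2>&1 | has_lang spa; then
    echo "Spanish language data found"
  else
    echo "Spanish language data missing"
  fi
fi
```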

Once the installation is finished, we just have to run gImageReader and it is ready to use. We select the image or batch of images that we want to digitize and press the option at the top called "Recognize All". This starts the character recognition of the document and exports the result to a txt document that we can open with any text editor.

The gImageReader interface is intuitive, so using the OCR program is quick and makes digitizing text documents very easy.

Of course, if we have unrelated documents, we have to go image by image, because processing them as a batch would create a single txt document containing the text of all of them. In any case, there is no longer any excuse not to have our text documents in digital format. Don't you think?
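For those who prefer the terminal, the same recognition can be run directly with the tesseract command installed by the tesseract-ocr package, without going through gImageReader. The following is a rough sketch rather than part of the method described here: ocr_one is a hypothetical helper name, factura.jpg is an invented file name, and Spanish is assumed as the default language.

```shell
#!/bin/sh
# ocr_one IMAGE [LANG]: recognize one image and write the text next to it.
# (Hypothetical helper; LANG defaults to "spa", an assumption for this sketch.)
ocr_one() {
  img=$1
  lang=${2:-spa}
  # tesseract takes an output base name and appends ".txt" to it itself,
  # so the image extension is stripped here.
  tesseract "$img" "${img%.*}" -l "$lang"
}

# Example with an invented file name; it would produce factura.txt:
# ocr_one factura.jpg
```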
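One way to avoid going image by image without ending up with a single merged file is to call the engine from the terminal once per image. This is a sketch under assumptions, not something gImageReader does for us: ocr_each is a hypothetical helper name, the scans directory is invented, and the language is fixed to Spanish.

```shell
#!/bin/sh
# ocr_each IMAGE...: run OCR on every image given, one .txt file per image.
# (Hypothetical helper; the language is assumed to be Spanish.)
ocr_each() {
  for img in "$@"; do
    # Skip arguments that are not existing files (for example an unmatched glob).
    [ -f "$img" ] || continue
    # Output base is the image path without its extension; tesseract adds ".txt".
    tesseract "$img" "${img%.*}" -l spa
  done
}

# Example with an invented directory name:
# ocr_each scans/*.jpg
```

Each photograph then gets its own text file next to it, so unrelated documents stay separate.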


3 comments

  1.   Diego Retero said

    Using the mobile camera to scan is a bad idea. Buy yourself the cheapest scanner and it will give you better results than the most expensive mobile on the market.

  2.   Daniel said

    Very good. I suppose it also runs on Ubuntu and derivatives. It will have to be tried. Greetings.

  3.   DaBry.O.Diaz said

    Thank you very much! This gImageReader program is really great! It was very useful for me on my Linux-Debian-Q4OS. I needed it very urgently to digitize some images from a coexistence manual for a residential complex, which had been on paper for 20 years and had to be updated. First I scanned the entire document, page by page, with an Epson printer-scanner; then, with the image files, I was able to edit and correct each text very easily and directly in the same program. From there I generated simple plain-text documents, and finally I copied, pasted and did the final editing and corrections with the rich text editor of LibreOffice. gImageReader is really very useful and good. Again, thank you very much and blessings. Sincerely: DaBry.O.Díaz