Just as OpenCV is the tool par excellence for computer vision, TensorFlow for deep learning, and scikit-learn for machine learning, Tesseract is the tool par excellence for OCR.
For this reason, this short post shows how to install Tesseract on three of the most widely used operating systems: macOS, Ubuntu, and Windows.
By the end of this article you will know:
- How to install Tesseract on macOS.
- How to install Tesseract on Ubuntu.
- How to install Tesseract on Windows.
Ready? Let’s get started!
How to Install Tesseract on macOS
Installing Tesseract on macOS is absurdly simple.
Assuming you already have Homebrew installed, you just have to open your terminal and run this command:
brew install tesseract
How to Install Tesseract on Ubuntu
To install Tesseract on Ubuntu, we follow much the same steps as on macOS (after all, both are Unix-like operating systems).
Open your terminal and run:
sudo apt-get install tesseract-ocr
How to Install Tesseract on Windows
To install Tesseract on Windows, go to this link: https://github.com/UB-Mannheim/tesseract/wiki
Then download the installer that matches your architecture (32-bit or 64-bit).
Once you download the installer, run it and follow the installer instructions.
While doing so, be sure to check the Additional script data (download) and Additional language data (download) options.
It’s also important to remember where you installed Tesseract, because we’ll need to add this location to the Path later.
Once the installation is complete, edit your Path environment variable and add the directory that contains the tesseract.exe file, which is the directory where you installed Tesseract.
For example, in my case the file is C:\Program Files\Tesseract-OCR\tesseract.exe, so the entry to add is C:\Program Files\Tesseract-OCR.
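If you want to double-check the Path edit, here is a minimal Python sketch. Path entries are directories, so it looks for the folder that holds tesseract.exe rather than the file itself. The helper name dir_on_path is hypothetical, and the install directory is only the example location from this post; adjust it to wherever you installed Tesseract.

```python
import ntpath
import os


def dir_on_path(directory, path_value):
    """Return True if `directory` is one of the entries of a Windows-style Path string."""

    def normalize(p):
        # Windows paths are case-insensitive and may carry quotes, forward
        # slashes, or a trailing backslash, so normalize before comparing.
        return ntpath.normcase(ntpath.normpath(p.strip().strip('"')))

    target = normalize(directory)
    # Path entries are separated by semicolons on Windows; skip empty ones.
    entries = [e for e in path_value.split(";") if e.strip()]
    return any(normalize(e) == target for e in entries)


if __name__ == "__main__":
    # Example install location (an assumption); yours may differ.
    install_dir = r"C:\Program Files\Tesseract-OCR"
    print(dir_on_path(install_dir, os.environ.get("PATH", "")))
```

Note that an entry pointing at the executable itself (ending in \tesseract.exe) does not count as a match here, because Windows searches the directories listed in the Path, not individual files. Open a new terminal after editing the variable so the change is picked up.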
Validating the Tesseract Installation
Regardless of the operating system you use, you can validate that your Tesseract installation was successful by opening a terminal and running the following command:
tesseract --version
You should see something like this:
tesseract v220.127.116.1120118
 leptonica-1.78.0
  libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found libarchive 3.5.0 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5 libzstd/1.4.5
 Found libcurl/7.77.0-DEV Schannel zlib/1.2.11 zstd/1.4.5 libidn2/2.0.4 nghttp2/1.31.0
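If you would rather run this check from a script, the following is a minimal Python sketch that uses only the standard library. The function name tesseract_version is hypothetical, and the sketch assumes the executable is reachable under the name tesseract; it returns None instead of raising an error when the executable cannot be found on the Path.

```python
import shutil
import subprocess


def tesseract_version(executable="tesseract"):
    """Return the first line printed by `<executable> --version`, or None if it is not found."""
    # shutil.which searches the Path the same way the terminal would.
    resolved = shutil.which(executable)
    if resolved is None:
        return None
    result = subprocess.run(
        [resolved, "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Fall back to stderr in case a build writes the version banner there.
    output = (result.stdout or result.stderr).strip()
    return output.splitlines()[0] if output else None


if __name__ == "__main__":
    version = tesseract_version()
    print(version if version else "tesseract was not found on the Path")
```

If the script reports that tesseract was not found, the most likely cause on Windows is that the install directory is missing from the Path, or that the terminal was opened before the variable was edited.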
In this post we learned how to install Tesseract on three of the most popular operating systems out there: macOS, Ubuntu, and Windows.
On the Unix-like systems, a single command was enough to download and install Tesseract.
Windows took a few more steps, but nothing too traumatic.
We are now ready for the articles to come, in which we will explore the capabilities, benefits, and characteristics of Tesseract and the world of OCR in general.