Tesseract is an OCR tool developed by HP Labs. It is one of the most powerful and accurate OCR systems, and it is open source too, so I decided to give it a try.
Two Options
- Directly installing (what's the fun in that?)
- Compile from the source code
sudo apt-get install autotools-dev libleptonica-dev
sudo apt-get install autoconf automake libtool
sudo apt-get install libpng12-dev
sudo apt-get install libjpeg62-dev
sudo apt-get install libtiff4-dev
sudo apt-get install zlib1g-dev
sudo apt-get install libicu-dev # (if you plan to make the training tools)
sudo apt-get install libpango1.0-dev # (if you plan to make the training tools)
sudo apt-get install libcairo2-dev # (if you plan to make the training tools)
After installing all the dependencies, I extracted the source code into a folder. Now it is compile time. :)
./autogen.sh
./configure
make
sudo make install
sudo ldconfig
The make step may take some time. After compiling, we need to add the language data file, which goes into /usr/local/share/tessdata. Don't forget to give it proper permissions, otherwise tesseract cannot access the language file.
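The copy-and-permissions step could look roughly like the sketch below. The helper name `install_langdata` and the file name `eng.traineddata` are my own placeholders, not something from the Tesseract sources; the idea is simply to copy the file and make it readable by every user in one go.

```shell
# Hypothetical helper: copy a language data file into a tessdata
# directory and make it readable by all users (mode 644, rw-r--r--).
install_langdata() {
    src="$1"     # language file, e.g. eng.traineddata (assumed name)
    dest="$2"    # target directory, e.g. /usr/local/share/tessdata
    # Create the directory if it is missing, then copy with mode 644.
    mkdir -p "$dest" &&
    install -m 644 "$src" "$dest/"
}
```

Since /usr/local/share/tessdata normally belongs to root, the one-off equivalent there would be something like `sudo install -m 644 eng.traineddata /usr/local/share/tessdata/`.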
After everything, just run:
tesseract phototest.tif out
Wow, the accuracy is unbelievable!
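The second argument to tesseract is an output base name, so the recognized text ends up in out.txt. A small sketch for running the OCR and printing the result in one step is below; the function name `ocr_print` is just something I made up, and it assumes the English language data is installed (`-l eng`).

```shell
# Hypothetical helper: run tesseract on an image and print the text.
ocr_print() {
    image="$1"
    base="${2:-out}"    # tesseract appends .txt to this base name
    # Bail out early if tesseract is not installed or not on PATH.
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract not found on PATH" >&2
        return 127
    fi
    # Assumes the English language data is available.
    tesseract "$image" "$base" -l eng && cat "$base.txt"
}
```

For the test image used here, that would be `ocr_print phototest.tif out`.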