tesseract | Tesseract Open Source OCR Engine | Computer Vision library
Trending Discussions on tesseract
QUESTION
I am trying to read some data with tesseract, but it is already struggling with the date and time, so I created a minimal test case.
code:
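// Header names were lost from the original post; these are inferred from the identifiers used below.
#include <iostream>
#include <sstream>
#include <string>
#include <boost/algorithm/string.hpp>
#include <opencv2/opencv.hpp>
#include <tesseract/baseapi.h>
#include <leptonica/allheaders.h>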
using namespace std;
using namespace cv;
int main(int argc, const char * argv[]) {
    string outText, imPath = argv[1];
    cv::Mat image_final = cv::imread(imPath, CV_8UC1);
    tesseract::TessBaseAPI *api = new tesseract::TessBaseAPI();
    api->Init(NULL, "eng", tesseract::OEM_LSTM_ONLY);
    api->SetPageSegMode(tesseract::PSM_AUTO_ONLY);
    cv::adaptiveThreshold(image_final,image_final,255,ADAPTIVE_THRESH_MEAN_C, cv::THRESH_BINARY,11,2);
    api->SetImage(image_final.data, image_final.cols, image_final.rows, 3, image_final.step);
    api->SetVariable("tessedit_char_whitelist", "0123456789- :");
    outText = string(api->GetUTF8Text());
    api->End();
    std::istringstream iss(outText);
    for (std::string line; std::getline(iss, line); ) {
        boost::algorithm::trim(line);
        if (!line.empty()) cout << line << endl;
    }
    cv::imwrite("out.png", image_final);
    return 0;
}
1122-03-08 18:10
2122-030 18:10
I even tried to whitelist these characters (which will not be the case in the final version) but still getting very bad results.
ANSWER
Answered 2022-Apr-17 at 21:56
It looks like the main issue is setting bytes_per_pixel to 3 instead of 1 in api->SetImage.
The image after cv::adaptiveThreshold is 1 color channel (1 byte per pixel) and not 3.
Replace api->SetImage(image_final.data, image_final.cols, image_final.rows, 3, image_final.step);
with:
api->SetImage(image_final.data, image_final.cols, image_final.rows, 1, image_final.step);
Replace cv::imread(imPath, CV_8UC1) with cv::imread(imPath, cv::IMREAD_GRAYSCALE)
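To make the mismatch concrete, the sketch below derives the last two SetImage arguments from an image's shape. The helper name set_image_args is hypothetical, and it assumes 8-bit samples stored row by row without padding; in the C++ code, image_final.step remains the safer source for the bytes-per-line value.

```python
def set_image_args(shape):
    """Return (width, height, bytes_per_pixel, bytes_per_line) for an
    8-bit image whose shape is (rows, cols) or (rows, cols, channels).

    Hypothetical helper: assumes tightly packed rows with no padding.
    """
    rows, cols = shape[0], shape[1]
    channels = shape[2] if len(shape) == 3 else 1
    bytes_per_pixel = channels          # one byte per 8-bit sample
    bytes_per_line = cols * channels    # row stride without padding
    return cols, rows, bytes_per_pixel, bytes_per_line


# A thresholded grayscale image has a single channel ...
print(set_image_args((480, 640)))      # (640, 480, 1, 640)
# ... while a BGR colour image has three.
print(set_image_args((480, 640, 3)))   # (640, 480, 3, 1920)
```

Passing 3 for a single-channel buffer tells Tesseract to read three bytes for every pixel, so it interprets the data as a different, garbled image.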
You may also try replacing tesseract::PSM_AUTO_ONLY with tesseract::PSM_AUTO or tesseract::PSM_SINGLE_BLOCK.
According to the comment in the header file:
PSM_AUTO_ONLY = 2, ///< Automatic page segmentation, but no OSD, or OCR.
(Unless this is on purpose. I have never used the C++ interface.)
I tried to reproduce the problem using pytesseract and Python, but I get an error when setting PSM to 2.
I am probably also using a different version of Tesseract.
The result is perfect, and it is supposed to be perfect with the image from your post.
Python code:
import cv2
from pytesseract import pytesseract
# Tesseract path
pytesseract.tesseract_cmd = "C:\\Program Files\\Tesseract-OCR\\tesseract.exe"
img = cv2.imread("out.png", cv2.IMREAD_GRAYSCALE) # Read input image as Grayscale
# The language is passed through the lang argument rather than inside the config string
text = pytesseract.image_to_string(
    img,
    lang="eng",
    config="--psm 3 -c tessedit_char_whitelist=' '0123456789-:",
)
print(text)
Output:
2022-03-08 18:19:15
QUESTION
I am trying to read numbers from an image with 20x10 resolution. I know this question might be a duplicate. I've gone through most of the questions here on Stack Overflow, but none of the answers seem to work for me. Here is the image I am trying to read text from:
Here is my current code:
import pytesseract as pt
from PIL import Image
pt.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
img = Image.open('foo.PNG')
text = pt.image_to_string(img)
print(text)
?
I am new to pytesseract and image processing. Any suggestions or help will be greatly appreciated.
ANSWER
Answered 2022-Feb-18 at 11:41
Tesseract is very touchy to work with. In my experience, if you, as a human, cannot read a text clearly, you shouldn't expect tesseract to read it either.
First of all, good preprocessing is a must for better results. I strongly recommend that anyone working with tesseract read its documentation on improving the quality of the output.
In your case, the problem is the resolution. Can low resolution prevent tesseract from reading text? Absolutely. The documentation says:
Tesseract works best on images which have a DPI of at least 300 dpi, so it may be beneficial to resize images.
Here DPI means dots per inch, and the suggested lower limit of 300 DPI is higher than what your image provides. When you resize the image to a higher resolution, for example 10 times bigger:
Now, even if the DPI requirement is satisfied, the upscaled image is blurry and noisy, so accuracy is still lost.
Note: Higher resolution does not automatically mean better results. Please check here.
Note: If you really need to work with images like this, you may want to look here. Upscaling first and then applying a deblurring operation may help.
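A minimal sketch of the upscaling step, assuming OpenCV (cv2) is installed. The helper names and the 30-pixel target text height are illustrative assumptions, not values taken from the Tesseract documentation, and bicubic interpolation is only one reasonable choice.

```python
def upscale_factor(text_height_px, target_height_px=30):
    """Integer factor needed to bring text up to the target height.

    target_height_px is an assumed working value; tune it for your images.
    """
    if text_height_px <= 0:
        raise ValueError("text_height_px must be positive")
    # Ceiling division, never scaling below 1x.
    return max(1, -(-target_height_px // text_height_px))


def upscale(image, factor):
    """Resize an image array by an integer factor with bicubic interpolation."""
    import cv2  # imported lazily so upscale_factor works without OpenCV

    return cv2.resize(image, None, fx=factor, fy=factor,
                      interpolation=cv2.INTER_CUBIC)


# A 10-pixel-tall image, as in the question, would need a 3x enlargement.
print(upscale_factor(10))  # 3
```

As the answer points out, enlarging alone does not add detail, so expect to combine it with denoising or deblurring and to compare results at a few different factors.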
QUESTION
I am using Tesseract version 4.1.1 in a .NET Core 3.1 project. It works perfectly on Windows, but when I publish it on Ubuntu it throws the following error on this line:
new TesseractEngine(Tessdatapath, LanguageCode, EngineMode.TesseractAndLstm);
Could not load file or assembly 'Tesseract, Version=4.1.1.0, Culture=neutral, PublicKeyToken=null'. The system cannot find the file specified.
I copied the x64 and x86 DLLs alongside the published files and made sure they are at the same level as tessdata.
I also tried installing tesseract on Ubuntu and copying the .so files into the x64 and x86 folders, but still no luck.
ANSWER
Answered 2021-Nov-03 at 17:04
Here is how I fixed it.
It turned out that the system didn't display the correct error message because it couldn't use the library System.Drawing.Common, which is not supported on Linux.
I fixed that by installing libgdiplus, the GDI+ implementation that System.Drawing.Common depends on under Linux:
sudo apt-get -f install libgdiplus
Then it displayed the correct message, which is:
Failed to find library "libleptonica-1.80.0.so" for platform x64.
To fix that, I had to compile this leptonica version from source, available at http://www.leptonica.org/download.html
This guide helped me compile it: http://www.leptonica.org/source/README.html
Once "libleptonica-1.80.0.so" was installed, I created a link to the leptonica files inside my x64 folder, following this comment: Tesseract Issue #503
QUESTION
I would like to use ocrmypdf to convert a PDF file that contains only a picture into a readable PDF.
I tried it with the following simple code (invoice.pdf is of course available in the same path as the Python script, and output.pdf should be generated):
import ocrmypdf
if __name__ == '__main__':
    fn = r"C:\Users\Polzi\Documents\DEV\Python-Diverses\PDFOCR\invoice.pdf"
    ocrmypdf.ocr(fn, 'output.pdf', deskew=True)
But unfortunately I get this error message:
$ python exPDFOCR.py
[WinError 2] Das System kann die angegebene Datei nicht finden
Traceback (most recent call last):
File "C:\Users\Polzi\Documents\DEV\Python-Diverses\PDFOCR\exPDFOCR.py", line 25, in <module>
ocrmypdf.ocr('invoice.pdf', 'output.pdf', deskew=True)
File "C:\Users\Polzi\Documents\DEV\.venv\testing\lib\site-packages\ocrmypdf\api.py", line 336, in ocr
check_options(options, plugin_manager)
File "C:\Users\Polzi\Documents\DEV\.venv\testing\lib\site-packages\ocrmypdf\_validation.py", line 271, in check_options
ocr_engine_languages = plugin_manager.hook.get_ocr_engine().languages(options)
File "C:\Users\Polzi\Documents\DEV\.venv\testing\lib\site-packages\ocrmypdf\builtin_plugins\tesseract_ocr.py", line 155, in languages
return tesseract.get_languages()
File "C:\Users\Polzi\Documents\DEV\.venv\testing\lib\site-packages\ocrmypdf\_exec\tesseract.py", line 143, in get_languages
proc = run(
File "C:\Users\Polzi\Documents\DEV\.venv\testing\lib\site-packages\ocrmypdf\subprocess\__init__.py", line 53, in run
proc = subprocess_run(args, env=env, **kwargs)
File "c:\users\polzi\appdata\local\programs\python\python39\lib\subprocess.py", line 505, in run
with Popen(*popenargs, **kwargs) as process:
File "c:\users\polzi\appdata\local\programs\python\python39\lib\subprocess.py", line 951, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "c:\users\polzi\appdata\local\programs\python\python39\lib\subprocess.py", line 1420, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] Das System kann die angegebene Datei nicht finden
Why can't it find the file, which is in the same folder as the .py file being executed?
ANSWER
Answered 2022-Jan-15 at 19:26
Sometimes the first error message is misleading and does not point to a clear cause.
In this case the primary message, "The system cannot find the specified file", leads a user to concentrate on why a filename is incorrect.
What the error should report is that a required dependency was not found. This can happen when Tesseract, or the related Leptonica or language data files, are not in the correct location, either because they were never installed or because the install was faulty.
It transpired that after installing tesseract on Windows from https://github.com/UB-Mannheim/tesseract/wiki, "the script now works fine".
Note that a missing dependency was the cause of a similar message here: Import ocrmypdf in Visual Stdio Code in Python
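Because the traceback ends in subprocess rather than in anything that mentions Tesseract, a quick pre-flight check can make the real cause visible. This is a minimal sketch using only the standard library; the helper name missing_tools is hypothetical, and the list of executables to check is an assumption you should adapt to your setup.

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # "tesseract" is the program that ocrmypdf tried to launch in the traceback.
    missing = missing_tools(["tesseract"])
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All required executables were found.")
```

Running this before calling ocrmypdf.ocr separates "my PDF path is wrong" from "an external program is not installed or not on PATH".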
QUESTION
I am attempting to collect data from a shop in a game (Starbase) in order to feed it to a website that displays it as a candlestick chart
So far I have started using Tesseract OCR 5.0.0 and I have been running into issues as I cannot get the values reliably
I have seen that the images can be pre-processed in order to increase the reliability but I have run into a bottleneck as I am not familiar enough with Tesseract and OpenCV in order to know what to do more
Please note that since this is an in-game UI the images are going to be very constant as there is no colour variations / light changes / font size changes / ... I technically only need to get it to work once and that's it
Here are the steps I have taken so far and the results :
I have started by getting a screen of only the part of the UI I am interested in in order to remove as much clutter as possible
I have then set a threshold as shown here (I will also be using the cropping part when doing the automation, but I am not there yet), set the language to English and the psm argument to 6, which gives me the following code:
import cv2
import pytesseract
def clean_text(text):
    ret = text.replace("\n\n", "\n")  # remove the blank lines
    return ret
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
img = cv2.imread('screens/ressources_list_array_1.png', 0)
thresh = 255 - cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
print("======= Output")
print(clean_text(pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')))
cv2.imshow('thresh', thresh)
cv2.waitKey()
Here is an example of the output I get :
======= Output
Aegisium Ore 4490 456
Ajatite Ore 600 332
Arkanium Ore 84999 53
Bastium Ore 2350 421
Charodium Ore 5 280 366
Corazium Ore 39 896 212
Exorium Ore 5 380 112
Ice 980 141
Karnite Crystal ele) 111
Kutonium Ore 14 000 215
Lukium Ore 31 000 158
Nhurgite Crystal 3144 64
Surtrite Crystal 4198 70
Valkite Ore 545 150
Vokarium Ore 1850 415
Ymrium Ore 69 899 60
There are two main issues :
1 - It is not reliable enough: you can see it confused 6 000 with ele)
2 - It does not properly understand where the numbers start and end, which makes the two columns difficult to tell apart
I think I can solve the second issue by further splitting the image into 3 columns, but I am unsure whether that would be a big hit on CPU / GPU usage, which I would prefer to avoid
I also found the OpenCV documentation that shows all of the possible image processing methods, but there are a lot of them and I am unsure which ones to use to further increase reliability
Any help is much appreciated
ANSWER
Answered 2022-Jan-03 at 23:02
Pytesseract, on its own, doesn't handle table detection very well: the table format isn't retained in the output, which can make it difficult to parse, as seen in your output.
So splitting the table into distinct columns, performing OCR on each, and then rejoining the columns will help. This is slower, but it is more accurate.
Dilation can help, which adds white pixels to existing white areas (using the threshold and image you currently have). This expands the narrow areas of the numbers.
In my experience, to improve the accuracy generally means splitting the table up into different sections, as well as testing different thresholds and dilation settings.
import cv2
import numpy as np
import pandas as pd
import pytesseract

def clean_text(text):
    return text.replace("\n\n", "\n")  # remove the blank lines (same helper as in the question)

def read_img(img):
    '''
    Read in a grayscale image.
    '''
    img = cv2.imread(img)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    return img

img = read_img("img_path.png")
thresh = 255 - cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1] # your current threshold
dilated = cv2.dilate(thresh, np.ones((3,1)), iterations=1) # dilate vertically (don't want to smudge the numbers together)

cols = []
# split image into columns by array slicing
for i, v in enumerate([dilated[:,0:200],thresh[:,200:500],dilated[:,800:900]]):
    # The middle column isn't dilated; when it is, a spurious decimal point is found
    config_options = '--psm 6'
    cols.append(clean_text(pytesseract.image_to_string(v, lang='eng', config=config_options)).split('\n'))

pd.DataFrame(cols).T
                   0       1    2
0       Aegisium Ore    4490  456
1        Ajatite Ore     600  332
2       Arkanlum Ore   84999   53
3        Bastium Ore    2350  421
4      Charodium Ore   5 280  366
5       Corazium Ore  39 896  212
6        Exorlum Ore   5 380  112
7                Ice     980  141
8    Karnite Crystal   6 000  111
9       Kutonlum Ore  14 000  215
10        Lukium Ore  31 000  158
11  Nhurgite Crystal    3144   64
12  Surtrite Crystal    4198   70
13       Valkite Ore     545  150
14      Vokarlum Ore    1850  415
15        Ymrium Ore  69 899   60
The np.ones provides a kernel for the dilation to use. Documentation.
Lastly, depending on your use case, AWS Textract does a good job parsing tables and numbers, and they provide sample Python code in the documentation to connect to the API, which worked really well for me, at least. Hopefully some of this is helpful.
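Once each column is read separately, the numeric cells still contain spaces used as thousands separators (for example 39 896). The sketch below shows one way to turn such cells into integers and to flag misreads; parse_count is a hypothetical helper name, and the sample cells are copied from the outputs shown in this thread.

```python
import re


def parse_count(cell):
    """Turn an OCR'd cell such as '39 896' into the integer 39896.

    Returns None when the cell holds anything other than digits and spaces,
    so misreads like 'ele)' can be flagged instead of silently accepted.
    """
    compact = re.sub(r"\s+", "", cell)
    return int(compact) if re.fullmatch(r"[0-9]+", compact) else None


column = ["4490", "5 280", "39 896", "ele)"]
print([parse_count(c) for c in column])  # [4490, 5280, 39896, None]
```

Returning None rather than raising keeps the row count intact, which makes it easier to line the parsed values back up with the item names.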
QUESTION
I've been trying to get tesseract OCR to extract some digits from a pre-cropped image and it's not working well at all even though the images are fairly clear. I've tried looking around for solutions but all the other questions I've seen on here involve a problem with cropping or skewed text.
Here's an example of my code which tries to read the image and output to the command line.
#convert image to greyscale for OCR
im_g = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
#create threshold image to simplify things.
im_t = cv2.threshold(im_g, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)[1]
#define kernel size
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (20,20))
#Apply dilation to threshold image
im_d = cv2.dilate(im_t, rect_kernel, iterations = 1)
#Find countours
contours = cv2.findContours(im_t, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[0]
for cnt in contours:
    x,y,w,h = cv2.boundingRect(cnt)
    #crop
    im_c = im[y:y+h, x:x+w]
    speed = pytesseract.image_to_string(im_c)
    print(im_path +" : " + speed)
The output for it is:
frame10008.jpg : VAeVAs}
I've gotten a tiny improvement in some images by adding the following config to the tesseract image to string function:
config="--psm 7"
Without the new config, it would detect nothing for this image. Now it outputs
frame100.jpg : | U |
Any ideas as to what I'm doing wrong? Is there a different approach I could be taking to solve this problem? I'm open to not using Tesseract at all.
ANSWER
Answered 2021-Dec-20 at 03:04
I've found a decent workaround. First off, I've made the image larger. More area for tesseract to work with helped it a lot. Second, to get rid of non-digit outputs, I've used the following config on the image to string function:
config = "--psm 7 outputbase digits"
That line now looks like this:
speed = pytesseract.image_to_string(im_c, config = "--psm 7 outputbase digits")
The data coming back is far from perfect but the success rate is high enough that I should be able to clean up the garbage data and interpolate where tesseract returns no digits.
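The clean-up and interpolation step mentioned above could look roughly like the sketch below. Both helper names are hypothetical, the sample readings are invented for illustration, and linear interpolation is an assumption about what "interpolate" means here.

```python
import re


def to_number(raw):
    """Return a float for a purely numeric OCR string, else None."""
    text = raw.strip()
    return float(text) if re.fullmatch(r"[0-9]+(\.[0-9]+)?", text) else None


def fill_gaps(values):
    """Linearly interpolate None entries that lie between known neighbours.

    Leading and trailing gaps stay None because there is nothing to
    interpolate between.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Invented per-frame OCR results: two good reads around garbage and blanks.
readings = ["50", "| U |", "54", "", "", "60"]
print(fill_gaps([to_number(r) for r in readings]))
# [50.0, 52.0, 54.0, 56.0, 58.0, 60.0]
```

This only makes sense when the measured quantity changes smoothly between frames; for values that can jump, keeping the gaps as None may be the safer choice.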
QUESTION
I am working on a Kaggle notebook and whenever I run a cell that references the TensorFlow module at all, it prints out a huge warning about some sort of settings but still works. I looked up how to suppress warnings from TensorFlow, and everything I found said to do the following:
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2" # Or "3", either one should work and I've tried both
I have tried putting this both before and after importing TensorFlow but to no avail. The message still prints out. This is the message I am getting:
User settings:
KMP_AFFINITY=granularity=fine,verbose,compact,1,0
KMP_BLOCKTIME=0
KMP_SETTINGS=1
KMP_WARNINGS=0
Effective settings:
KMP_ABORT_DELAY=0
KMP_ADAPTIVE_LOCK_PROPS='1,1024'
KMP_ALIGN_ALLOC=64
KMP_ALL_THREADPRIVATE=128
KMP_ATOMIC_MODE=2
KMP_BLOCKTIME=0
KMP_CPUINFO_FILE: value is not defined
KMP_DETERMINISTIC_REDUCTION=false
KMP_DEVICE_THREAD_LIMIT=2147483647
KMP_DISP_NUM_BUFFERS=7
KMP_DUPLICATE_LIB_OK=false
KMP_ENABLE_TASK_THROTTLING=true
KMP_FORCE_REDUCTION: value is not defined
KMP_FOREIGN_THREADS_THREADPRIVATE=true
KMP_FORKJOIN_BARRIER='2,2'
KMP_FORKJOIN_BARRIER_PATTERN='hyper,hyper'
KMP_GTID_MODE=3
KMP_HANDLE_SIGNALS=false
KMP_HOT_TEAMS_MAX_LEVEL=1
KMP_HOT_TEAMS_MODE=0
KMP_INIT_AT_FORK=true
KMP_LIBRARY=throughput
KMP_LOCK_KIND=queuing
KMP_MALLOC_POOL_INCR=1M
KMP_NUM_LOCKS_IN_BLOCK=1
KMP_PLAIN_BARRIER='2,2'
KMP_PLAIN_BARRIER_PATTERN='hyper,hyper'
KMP_REDUCTION_BARRIER='1,1'
KMP_REDUCTION_BARRIER_PATTERN='hyper,hyper'
KMP_SCHEDULE='static,balanced;guided,iterative'
KMP_SETTINGS=true
KMP_SPIN_BACKOFF_PARAMS='4096,100'
KMP_STACKOFFSET=64
KMP_STACKPAD=0
KMP_STACKSIZE=8M
KMP_STORAGE_MAP=false
KMP_TASKING=2
KMP_TASKLOOP_MIN_TASKS=0
KMP_TASK_STEALING_CONSTRAINT=1
KMP_TEAMS_THREAD_LIMIT=4
KMP_TOPOLOGY_METHOD=all
KMP_USE_YIELD=1
KMP_VERSION=false
KMP_WARNINGS=false
OMP_AFFINITY_FORMAT='OMP: pid %P tid %i thread %n bound to OS proc set {%A}'
OMP_ALLOCATOR=omp_default_mem_alloc
OMP_CANCELLATION=false
OMP_DEFAULT_DEVICE=0
OMP_DISPLAY_AFFINITY=false
OMP_DISPLAY_ENV=false
OMP_DYNAMIC=false
OMP_MAX_ACTIVE_LEVELS=1
OMP_MAX_TASK_PRIORITY=0
OMP_NESTED: deprecated; max-active-levels-var=1
OMP_NUM_THREADS: value is not defined
OMP_PLACES: value is not defined
OMP_PROC_BIND='intel'
OMP_SCHEDULE='static'
OMP_STACKSIZE=8M
OMP_TARGET_OFFLOAD=DEFAULT
OMP_THREAD_LIMIT=2147483647
OMP_WAIT_POLICY=PASSIVE
KMP_AFFINITY='verbose,warnings,respect,granularity=fine,compact,1,0'
Is there any way I can stop this from printing?
EDIT: Code to reproduce this message:
import tensorflow as tf
tf.constant(())
EDIT: Output of env:
{'SHELL': '/bin/bash',
'KMP_WARNINGS': '0',
'DL_ANACONDA_HOME': '/opt/conda',
'KAGGLE_DATA_PROXY_TOKEN': '',
'KAGGLE_URL_BASE': 'https://www.kaggle.com',
'KAGGLE_KERNEL_INTEGRATIONS': '',
'CONTAINER_NAME': 'tf2-cpu/2-6',
'PWD': '/kaggle/working',
'TESSERACT_PATH': '/usr/bin/tesseract',
'TENSORFLOW_VERSION': '2.6.0',
'HOME': '/root',
'LANG': 'C.UTF-8',
'KMP_SETTINGS': '1',
'JAX_VERSION': '0.2.19',
'CONTAINER_URL': 'gcr.io/deeplearning-platform-release/tf-cpu.2-6:nightly-2021-11-17',
'ANACONDA_PYTHON_VERSION': '3.7',
'PYTHONPATH': '/kaggle/lib/kagglegym:/kaggle/lib:/kaggle/input/tensorflow-great-barrier-reef',
'KMP_BLOCKTIME': '0',
'KAGGLE_DATA_PROXY_PROJECT': 'kaggle-161607',
'KAGGLE_USER_SECRETS_TOKEN': '',
'SHLVL': '1',
'KAGGLE_KERNEL_RUN_TYPE': 'Interactive',
'PROJ_LIB': '/opt/conda/share/proj',
'MPLBACKEND': 'agg',
'LD_LIBRARY_PATH': '/usr/local/cuda/lib64:/usr/local/cuda/lib:/usr/local/lib/x86_64-linux-gnu:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:',
'KMP_AFFINITY': 'granularity=fine,verbose,compact,1,0',
'MKL_THREADING_LAYER': 'GNU',
'LC_ALL': 'C.UTF-8',
'PATH': '/opt/conda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin',
'PYTHONUSERBASE': '/root/.local',
'KAGGLE_DATA_PROXY_URL': 'https://dp.kaggle.net',
'_': '/opt/conda/bin/jupyter',
'GIT_PYTHON_REFRESH': 'quiet',
'PYDEVD_USE_FRAME_EVAL': 'NO',
'JPY_PARENT_PID': '9',
'TERM': 'xterm-color',
'CLICOLOR': '1',
'PAGER': 'cat',
'GIT_PAGER': 'cat',
'TF_CPP_MIN_LOG_LEVEL': '2',
'TF2_BEHAVIOR': '1'}
ANSWER
Answered 2021-Dec-09 at 15:47
So I managed to fix the problem with the following line:
os.environ["KMP_SETTINGS"] = "false"
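The answer does not say where this line should go. Assuming the OpenMP runtime reads KMP_SETTINGS when it is first loaded, the safest place is before the first TensorFlow import. The sketch below wraps the assignment in a hypothetical helper that also reports whether TensorFlow was already imported.

```python
import os
import sys


def silence_openmp_settings(env=None):
    """Set KMP_SETTINGS to "false" in `env` (defaults to os.environ).

    Returns True if TensorFlow had not been imported yet, i.e. the change
    was made before the OpenMP runtime could have read the variable.
    """
    env = os.environ if env is None else env
    env["KMP_SETTINGS"] = "false"
    return "tensorflow" not in sys.modules


if not silence_openmp_settings():
    print("TensorFlow is already imported; run this before the first import.")

# import tensorflow as tf  # import only after the variable is set
```

The env output in the question shows KMP_SETTINGS set to 1 by the container, which is why TF_CPP_MIN_LOG_LEVEL had no effect on this particular message.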
QUESTION
In my Colab notebook I installed and imported pytesseract as:
!pip install pytesseract
import pytesseract
import cv2
Load the image:
image = cv2.imread('drive/MyDrive/test.png')
Then I'll get this message: (2, 'Usage: pytesseract [-l lang] input_file') if I write code as:
pytesseract.pytesseract.tesseract_cmd = r'/usr/local/bin/pytesseract'
text = pytesseract.image_to_string(image)
And this message: /usr/bin/tesseract is not installed or it's not in your PATH. See README file for more information. if I write:
pytesseract.pytesseract.tesseract_cmd = (r'/usr/bin/tesseract')
text = pytesseract.image_to_string(image)
Do you know why and how can I fix it? Please tell me if you need more information.
ANSWER
Answered 2021-Nov-23 at 15:35
Just be sure you've installed the underlying library the Python module is taking advantage of, for example:
!sudo apt install tesseract-ocr
# then you can do:
!pip install pytesseract
QUESTION
I have very simple python code:
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'C:\\Tesseract-OCR\\tesseract.exe'
img = cv2.imread('1.png')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
hImg,wImg,_ = img.shape
#detecting words
boxes = pytesseract.image_to_data(img)
for x,b in enumerate(boxes.splitlines()):
    if x!=0:
        b = b.split()
        if len(b) == 12:
            x,y,w,h = int(b[6]), int(b[7]), int(b[8]), int(b[9])
            cv2.rectangle(img, (x,y), (w+x,h+y), (0,0,255), 3)
cv2.imshow('result', img)
cv2.waitKey(0)
ANSWER
Answered 2021-Nov-13 at 17:09
You'll have better OCR results if you improve the quality of the image you are giving Tesseract.
While tesseract version 3.05 (and older) handle inverted image (dark background and light text) without problem, for 4.x version use dark text on light background.
Convert from BGR to HLS to later remove background colors from the numbers in the top half of the image. Then, create a "blue" mask with cv2.inRange and replace anything that's not "blue" with the color white.
import numpy as np  # needed for the colour limits below
hls=cv2.cvtColor(img,cv2.COLOR_BGR2HLS)
# Define lower and upper limits for the number colors.
blue_lo=np.array([114, 70, 70])
blue_hi=np.array([154, 225, 225])
# Mask image to only select "blue"
mask=cv2.inRange(hls,blue_lo,blue_hi)
# copy original image
img1 = img.copy()
img1[mask==0]=(255,255,255)
Help pytesseract by converting the image to black and white
This is converting an image to black and white. Tesseract does this internally (Otsu algorithm), but the result can be suboptimal, particularly if the page background is of uneven darkness.
# img1 is a copy of img, so it is still in the original colour space (BGR assumed here), not HLS
gray = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
_, img1 = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
cv2.imshow('img_to_binary',img1)
...
hImg,wImg,_ = img.shape
#detecting words
boxes = pytesseract.image_to_data(img1)
for x,b in enumerate(boxes.splitlines()):
    ...
...
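The loop above relies on whitespace splitting and a length check of 12 to find word rows. Since image_to_data returns tab-separated text with a header row (level, page_num, block_num, par_num, line_num, word_num, left, top, width, height, conf, text), the rows can also be read by column name. A minimal sketch with the standard csv module follows; the helper name word_boxes and the sample text are illustrative, not real OCR output.

```python
import csv
import io


def word_boxes(tsv_text, min_conf=0.0):
    """Yield (left, top, width, height, text) for each recognised word."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    for row in reader:
        text = (row.get("text") or "").strip()
        # Structural rows (page, block, line) carry no text and conf -1.
        if not text or float(row["conf"]) < min_conf:
            continue
        yield (int(row["left"]), int(row["top"]),
               int(row["width"]), int(row["height"]), text)


# Hand-written sample in the image_to_data layout (not real OCR output).
sample = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t200\t50\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t10\t12\t40\t20\t96.5\t123\n"
)
print(list(word_boxes(sample)))  # [(10, 12, 40, 20, '123')]
```

Each tuple can be passed to cv2.rectangle exactly as in the loop above, and min_conf gives a simple way to skip low-confidence detections.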
QUESTION
I am trying to run the following script in a Databricks Python notebook:
pip install presidio-image-redactor
pip install pytesseract
python -m spacy download en_core_web_lg
from PIL import Image
from presidio_image_redactor import ImageRedactorEngine
import pytesseract
image = Image.open("images/ImageData.PNG")
engine = ImageRedactorEngine()
redacted_image = engine.redact(image, (255, 192, 203))
Upon running the last line, I'm getting the error below:
TesseractNotFoundError: tesseract is not installed or it's not in your PATH.
Am I missing anything?
ANSWER
Answered 2021-Nov-02 at 14:08
You can use %sh in a separate cell to execute shell commands on the driver node. To install tesseract, you can do:
%sh apt-get -f -y install tesseract-ocr
If you need to install it on all nodes of the cluster, you need to use a cluster init script with the same command (without %sh).
Community Discussions, Code Snippets contain sources that include Stack Exchange Network