How to do only page segmentation / layout detection with Tesseract (mode --psm 2)?

I would like to use Tesseract's page segmentation without running the OCR. I have my own custom OCR model, and running page segmentation AND OCR takes too long. I tried the --psm 2 mode both on the Tesseract command line and in pytesseract, and it didn't work as documented.

I'm working on Linux and coding in Python 3.10.

I currently use the Tesseract OCR agent described in the layoutparser documentation. The code looks like the following:

import layoutparser as lp
ocr_agent = lp.TesseractAgent()
res = ocr_agent.detect(img_path, return_response=True)
layout_info = res['data']

The layout_info is then a pd.DataFrame that contains layout information at the level of blocks, paragraphs, lines and words, as well as the OCR output. The problem is that this is very slow: on my machine it takes 7s per image, and I don't actually need the OCR. Hence, I want page segmentation (sometimes also called layout detection) only.
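To make clear which part of that output I care about, here is a minimal sketch that separates the layout columns from the OCR columns in Tesseract's TSV output. It only post-processes the result, so it does not make anything faster. The function name `layout_rows` and the sample data are made up for illustration; the column names and the level numbering (1 = page, 2 = block, 3 = paragraph, 4 = line, 5 = word) are the ones Tesseract writes in its TSV format.

```python
import csv
import io

# Tesseract TSV hierarchy levels: 1=page, 2=block, 3=paragraph, 4=line, 5=word.
LEVEL_NAMES = {1: "page", 2: "block", 3: "paragraph", 4: "line", 5: "word"}


def layout_rows(tsv_text, levels=(2, 3, 4, 5)):
    """Return bounding boxes from Tesseract TSV text, dropping `conf` and `text`.

    `layout_rows` is a hypothetical helper, not part of pytesseract or layoutparser.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # recognized text may contain quote characters
    )
    boxes = []
    for row in reader:
        level = int(row["level"])
        if level not in levels:
            continue
        boxes.append(
            {
                "type": LEVEL_NAMES[level],
                "block_num": int(row["block_num"]),
                "left": int(row["left"]),
                "top": int(row["top"]),
                "width": int(row["width"]),
                "height": int(row["height"]),
            }
        )
    return boxes


# Made-up sample in the TSV column layout, for illustration only.
SAMPLE_TSV = "\n".join(
    "\t".join(fields)
    for fields in [
        ["level", "page_num", "block_num", "par_num", "line_num", "word_num",
         "left", "top", "width", "height", "conf", "text"],
        ["1", "1", "0", "0", "0", "0", "0", "0", "640", "480", "-1", ""],
        ["2", "1", "1", "0", "0", "0", "10", "20", "300", "40", "-1", ""],
        ["3", "1", "1", "1", "0", "0", "10", "20", "300", "40", "-1", ""],
        ["4", "1", "1", "1", "1", "0", "10", "20", "300", "40", "-1", ""],
        ["5", "1", "1", "1", "1", "1", "10", "20", "120", "40", "96.5", "Hello"],
        ["5", "1", "1", "1", "1", "2", "140", "20", "170", "40", "95.1", "World"],
    ]
)

boxes = layout_rows(SAMPLE_TSV)
```

These boxes, without the `text` and `conf` columns, are all I need from Tesseract.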

According to the Tesseract documentation, there is the --psm 2 mode, "Automatic page segmentation, but no OSD, or OCR". When I try this on the command line, it does not produce an output file (even if the output type is specified):

tesseract img.png outfile --psm 2
tesseract img.png outfile --psm 2 tsv
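The check above can also be scripted. The following is a rough sketch of how I compare which output files each invocation creates; `build_command` and `created_outputs` are hypothetical helper names, and the sketch assumes the `tesseract` binary is on the PATH.

```python
import subprocess
from pathlib import Path


def build_command(image, out_base, psm, fmt=None):
    """Build the tesseract CLI call used above (hypothetical helper)."""
    cmd = ["tesseract", str(image), str(out_base), "--psm", str(psm)]
    if fmt is not None:
        cmd.append(fmt)  # output config such as "tsv"
    return cmd


def created_outputs(image, out_base, psm, fmt=None):
    """Run tesseract and list the files it wrote for `out_base`.

    Assumes the `tesseract` binary is installed and on the PATH.
    """
    subprocess.run(
        build_command(image, out_base, psm, fmt),
        check=False,          # inspect the result instead of raising
        capture_output=True,  # keep stdout/stderr out of the console
    )
    base = Path(out_base)
    return sorted(p.name for p in base.parent.glob(base.name + ".*"))


# Example (not executed here, since it needs tesseract and an image):
# created_outputs("img.png", "outfile", psm=2, fmt="tsv")
# created_outputs("img.png", "outfile", psm=3, fmt="tsv")
```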

I also tried the Python wrapper pytesseract, but it is quite slow, and it again returns a pd.DataFrame with the layout AND OCR data, despite --psm 2 being specified:

import cv2
import pytesseract

img = cv2.imread(img_path)
layout_info = pytesseract.image_to_data(img, config='tsv --psm 2', output_type='data.frame')

I'm using pytesseract==0.3.10 and tesseract 5.3.3-30-gea0b.

Do you have any ideas on how I can achieve page segmentation without OCR with Tesseract (or at least speed up the processing time of page segmentation + OCR)?

source https://stackoverflow.com/questions/77704558/how-to-do-only-page-segmentation-layout-detection-with-tesseract-mode-psm-2
