LinuxQuestions.org


kaz2100 08-13-2018 04:38 AM

tesseract-4 (pdfsandwich) and high load average/CPU load
 
Hya,

I'd appreciate expert opinion.

System: Core i3 (quad core), debian amd64 buster

While preparing a searchable PDF file with pdfsandwich and tesseract-4, the load average goes as high as 16, with all CPUs at 100% load. That is with only one pdfsandwich thread.

I know I can restrict tesseract threads with the -nthreads option, but the load average still goes insanely high (8 with two threads, with all 4 CPUs at 100% load).

I somewhat worry that context switching (or whatever) is wasting resources. It is slower than the old system (with tesseract-3).

Is there any easy way to control threads/load? This is a computation node in my cluster, and it usually runs headless.
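A quick way to put those numbers in context is to divide the load average by the CPU count. The sketch below is only an illustration: `load_per_cpu` is a made-up helper name, and it assumes a Unix system where `os.getloadavg()` is available. With the figures above, 16 on 4 CPUs comes out at 4 per CPU and 8 on 4 CPUs at 2 per CPU, so there are roughly that many tasks competing for each core.

```python
import os


def load_per_cpu(load_avg, cpus):
    """Load average divided by CPU count; well above 1.0 means tasks are queuing."""
    return load_avg / cpus


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages (Unix only).
    one_min, _, _ = os.getloadavg()
    cpus = os.cpu_count() or 1
    print(f"load {one_min:.2f} on {cpus} CPUs = {load_per_cpu(one_min, cpus):.2f} per CPU")
```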

Any opinion or comment will be appreciated.

cheers

syg00 08-13-2018 05:12 AM

A quick search found this - might be worth reading.

kaz2100 08-13-2018 09:02 PM

Hya

syg00: Thanks, the trick in that link works (env OMP_THREAD_LIMIT=1). I may need further optimization, but it now runs within an acceptable time (20 sec; it used to take almost an hour).
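For anyone driving this from a script, here is a minimal sketch of the same trick in Python. It assumes pdfsandwich is on the PATH; the helper names `ocr_env` and `pdfsandwich_cmd` and the script name in the comment are made up for illustration. The only things taken from this thread are the OMP_THREAD_LIMIT variable and the -nthreads option.

```python
import os
import subprocess
import sys


def ocr_env(thread_limit=1):
    """Return a copy of the current environment with OMP_THREAD_LIMIT set."""
    env = dict(os.environ)
    env["OMP_THREAD_LIMIT"] = str(thread_limit)
    return env


def pdfsandwich_cmd(pdf_path, nthreads=1):
    """Build the pdfsandwich command line, using the -nthreads option."""
    return ["pdfsandwich", "-nthreads", str(nthreads), pdf_path]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python ocr_limited.py scan.pdf
    subprocess.run(pdfsandwich_cmd(sys.argv[1]), env=ocr_env(1), check=True)
```

The environment is copied rather than modified in place, so the limit applies only to the pdfsandwich run and not to anything else the script starts.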

cheers

