qemu-devel

Re: [Qemu-devel] [PULL 7/8] tests/acceptance: Add test of NeXTcube frame


From: Philippe Mathieu-Daudé
Subject: Re: [Qemu-devel] [PULL 7/8] tests/acceptance: Add test of NeXTcube framebuffer using OCR
Date: Tue, 10 Sep 2019 14:58:06 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.0

On 9/10/19 2:07 PM, Thomas Huth wrote:
> On 10/09/2019 14.02, Peter Maydell wrote:
>> On Sat, 7 Sep 2019 at 16:47, Thomas Huth <address@hidden> wrote:
>>>
>>> From: Philippe Mathieu-Daudé <address@hidden>
>>>
>>> Add a test of the NeXTcube framebuffer using the Tesseract OCR
>>> engine on a screenshot of the framebuffer device.
>>>
>>> The test is very quick:
>>>
>>>   $ avocado --show=app,console run tests/acceptance/machine_m68k_nextcube.py
>>>   JOB ID     : 78844a92424cc495bd068c3874d542d1e20f24bc
>>>   JOB LOG    : /home/phil/avocado/job-results/job-2019-08-13T13.16-78844a9/job.log
>>>    (1/3) tests/acceptance/machine_m68k_nextcube.py:NextCubeMachine.test_bootrom_framebuffer_size: PASS (2.16 s)
>>>    (2/3) tests/acceptance/machine_m68k_nextcube.py:NextCubeMachine.test_bootrom_framebuffer_ocr_with_tesseract_v3: -
>>>   ue r pun Honl'flx ; 5‘ 55‘
>>>   avg ncaaaaa 25 MHZ, memary jag m
>>>   Backplane slat «a
>>>   Ethernet address a a r a r3 2
>>>   Memgry sackets aea canflqured far 16MB Darlly page made stMs but have 16MB page made stMs )nstalled
>>
>> By the way, do we know why the output from this test case is
>> garbled like this ? It suggests that something's not right
>> somewhere...

I got better results by tuning a few options, but later noticed that they
differ between Fedora and Ubuntu.
Tesseract v4 gives better results, but it is still alpha and requires
installing the trained data. It is not that big, 15 MiB:
https://github.com/tesseract-ocr/tessdata_best/blob/master/eng.traineddata
I preferred to keep the simplest test that gives an acceptable result. We
are not interested in fully readable text output: we only want to know
that the framebuffer model works. Reading "Ethernet address" is good and
quick enough.
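
To illustrate the idea, here is a minimal sketch of such a check. It is
not the code from the patch: the helper names ocr_lines() and
framebuffer_text_ok() are made up for this example, and it assumes a
tesseract binary in PATH that accepts "stdout" as its output base. The
point is only that one recognizable phrase is enough, even when the rest
of the OCR output is garbled.

```python
import shutil
import subprocess


def ocr_lines(image_path, tesseract_bin="tesseract"):
    """Run the Tesseract CLI on a screenshot and return its non-empty lines.

    Hypothetical helper: assumes 'tesseract <image> stdout' prints the
    recognized text to standard output.
    """
    if shutil.which(tesseract_bin) is None:
        raise RuntimeError("tesseract binary not found in PATH")
    result = subprocess.run(
        [tesseract_bin, image_path, "stdout"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        universal_newlines=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def framebuffer_text_ok(lines, expected="Ethernet address"):
    """Return True if any OCR line contains the expected phrase."""
    return any(expected in line for line in lines)


# Two of the garbled lines quoted earlier in this thread.
sample = [
    "avg ncaaaaa 25 MHZ, memary jag m",
    "Ethernet address a a r a r3 2",
]
check_passed = framebuffer_text_ok(sample)
```

With the sample lines above the check passes, although the first line is
mostly unreadable. That is all the test needs to conclude that the
framebuffer shows the boot ROM text.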

> The text is created from the framebuffer with the OCR-tool Tesseract -
> which is just not good enough to recognize all words properly here.
> 
>  Thomas
> 


