Windows 11 SnippingTool OCR from the command line

How it works

The Windows 11 SnippingTool ships with impressive OCR functionality. A colleague recommended it to me, and I hear it can even run offline.

By reverse engineering OneOCR.dll, I identified several exported functions.

Then I used WinDbg to break on those functions and dig out their arguments. With the help of IDA, I figured out the basic argument-passing logic.

The main challenge was the image data format. I have limited knowledge of graphics programming, so I initially assumed I would find a PNG or JPEG magic number. After some failed attempts, I was lucky enough to find OpenCV matrix references in the code, so I pulled the OpenCV source to make an educated guess.
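For illustration, the magic-number check described above can be sketched as follows. This is not code from the project: the function name sniff_image_format is hypothetical, and only the well-known PNG and JPEG signatures are tested. A raw pixel buffer, such as the contents of an OpenCV matrix, carries no signature, so a check like this reports "unknown" for it.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical helper: returns "png", "jpeg", or "unknown" based on the
// leading signature bytes of a memory dump.
std::string sniff_image_format(const std::uint8_t* data, std::size_t len) {
    // PNG files start with these 8 bytes; JPEG files start with FF D8 FF.
    static const std::uint8_t kPng[8] = {0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A};
    static const std::uint8_t kJpeg[3] = {0xFF, 0xD8, 0xFF};

    if (len >= sizeof(kPng) && std::memcmp(data, kPng, sizeof(kPng)) == 0) {
        return "png";
    }
    if (len >= sizeof(kJpeg) && std::memcmp(data, kJpeg, sizeof(kJpeg)) == 0) {
        return "jpeg";
    }
    // Raw pixel data has no header, so it ends up here.
    return "unknown";
}
```

Running this over a suspicious dump is a quick way to rule out an encoded image before considering raw pixel layouts.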

I dumped the suspicious image data in WinDbg, then wrote an OpenCV program to read it, and worked out that the format is CV_8UC4 (8-bit unsigned integer with 4 channels).
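To make the CV_8UC4 layout concrete, here is a minimal sketch that uses only the standard library. It assumes packed 3-channel input with no row padding. The struct Image8UC4 and the function expand_to_4_channels are hypothetical names for illustration; they are not the structure that oneocr.dll expects, which is not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical holder for a CV_8UC4-style buffer: 4 bytes per pixel,
// rows stored one after another.
struct Image8UC4 {
    int rows;
    int cols;
    std::size_t step;               // bytes per row (cols * 4 without padding)
    std::vector<std::uint8_t> data; // rows * step bytes
};

// Expands packed 3-channel 8-bit pixels (e.g. BGR) to 4 channels by
// appending an opaque alpha byte to every pixel.
Image8UC4 expand_to_4_channels(const std::vector<std::uint8_t>& bgr, int rows, int cols) {
    assert(rows >= 0 && cols >= 0);
    const std::size_t pixels = static_cast<std::size_t>(rows) * static_cast<std::size_t>(cols);
    assert(bgr.size() == pixels * 3);

    Image8UC4 out{rows, cols, static_cast<std::size_t>(cols) * 4, {}};
    out.data.reserve(pixels * 4);
    for (std::size_t i = 0; i < pixels; ++i) {
        out.data.push_back(bgr[i * 3 + 0]);
        out.data.push_back(bgr[i * 3 + 1]);
        out.data.push_back(bgr[i * 3 + 2]);
        out.data.push_back(0xFF); // fourth channel: fully opaque alpha
    }
    return out;
}
```

With OpenCV available, cv::cvtColor with cv::COLOR_BGR2BGRA performs the same conversion on a cv::Mat, so the manual loop is only meant to show what the bytes look like.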

The final implementation involved reconstructing the entire data processing pipeline, enabling Windows OCR functionality without GUI dependencies.

The next step would be to reverse engineer onnxruntime.dll and make this code run on Linux as well. I tried, but the library seems to perform extensive image preprocessing, and the model is decrypted and loaded multiple times, so I did not get this part working.

How to use

Tested on Win11 23H2 + SnippingTool 11.2409.25.0

Building this code requires OpenCV to be installed.

The code depends on the DLLs and the offline AI model. The easiest way to get them is to copy the following files from the SnippingTool folder into the same folder as ocr.exe:

oneocr.dll
oneocr.onemodel
onnxruntime.dll

On my computer the SnippingTool folder is located at:

C:\Program Files\WindowsApps\Microsoft.ScreenSketch_11.2409.25.0_x64__8wekyb3d8bbwe\SnippingTool
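A small pre-flight check can confirm that the three files ended up next to ocr.exe before anything tries to load them. The sketch below is an illustration only and is not part of the project; the function name missing_runtime_files is hypothetical, and it simply tries to open each file in the given directory.

```cpp
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns the names of the required runtime files
// that cannot be opened in `dir`.
std::vector<std::string> missing_runtime_files(const std::string& dir) {
    const char* required[] = {"oneocr.dll", "oneocr.onemodel", "onnxruntime.dll"};
    std::vector<std::string> missing;
    for (const char* name : required) {
        // Forward slashes are accepted as path separators on Windows as well.
        std::ifstream f(dir + "/" + name, std::ios::binary);
        if (!f.is_open()) {
            missing.push_back(name);
        }
    }
    return missing;
}
```

Calling missing_runtime_files(".") from the folder that contains ocr.exe should return an empty list once the copy step is done.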

References

https://github.com/b1tg/win11-oneocr
