I was going through the _init_hardsubx() function in src/lib_ccx/hardsubx.c and noticed a few problems with how errors are handled:
- TessBaseAPICreate() return value is never checked. If it fails and returns NULL, every subsequent Tesseract call on that handle is going to blow up.
- When the tessdata_path lookup fails, the code frees pars_vec, pars_values, and ctx, but never calls TessBaseAPIDelete() on the handle that was already created, causing a resource leak.
- When TessBaseAPIInit4() fails, the handle gets leaked because only free(ctx) is called without deleting the Tesseract handle first.

Basically any error path between TessBaseAPICreate() and a successful TessBaseAPIInit4() leaks the Tesseract handle, and since the result of TessBaseAPICreate() is never checked, we don't even know whether the handle is valid in the first place.

Steps to trigger:
- Point CCExtractor at a system with no tessdata installed (or corrupt traineddata files)
- The tessdata probe paths will fail and hit these error returns
- Tesseract handle is leaked every time
Suggested fix:
- Add a NULL check right after TessBaseAPICreate()
- Add TessBaseAPIDelete(ctx->tess_handle) in each error path where the handle was created but Init hasn't succeeded yet
- After Init succeeds, use TessBaseAPIEnd() + TessBaseAPIDelete() (which the dec_sub malloc failure path already does correctly)
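
To make the cleanup ordering concrete, here is a rough sketch of the pattern I have in mind. It is not the actual _init_hardsubx() code and it does not link against Tesseract: fake_create(), fake_init(), fake_end() and fake_delete() are hypothetical stand-ins for TessBaseAPICreate(), TessBaseAPIInit4(), TessBaseAPIEnd() and TessBaseAPIDelete(), and sketch_ctx, init_sketch(), free_sketch(), the fail_at enum and the live_handles counter are made up purely so the leak can be observed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a Tesseract handle; not the real API. */
typedef struct fake_tess { int initialized; } fake_tess;

/* Handles created but not yet deleted (illustration only). */
static int live_handles = 0;

static fake_tess *fake_create(int fail)
{
    if (fail)
        return NULL;
    fake_tess *h = calloc(1, sizeof(*h));
    if (h)
        live_handles++;
    return h;
}

static int fake_init(fake_tess *h, int fail)
{
    if (fail)
        return -1;
    h->initialized = 1;
    return 0;
}

static void fake_end(fake_tess *h) { h->initialized = 0; }

static void fake_delete(fake_tess *h)
{
    assert(live_handles > 0); /* catches a double delete */
    live_handles--;
    free(h);
}

/* Hypothetical context holding only the handle. */
struct sketch_ctx { fake_tess *tess_handle; };

/* Which step of the sketch should fail (assumed failure points). */
enum fail_at { FAIL_NONE, FAIL_CREATE, FAIL_TESSDATA, FAIL_INIT, FAIL_LATE };

static struct sketch_ctx *init_sketch(enum fail_at where)
{
    struct sketch_ctx *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;

    /* 1. Check the create result before touching the handle. */
    ctx->tess_handle = fake_create(where == FAIL_CREATE);
    if (!ctx->tess_handle) {
        free(ctx);
        return NULL;
    }

    /* 2. Created but not initialized: delete only. */
    if (where == FAIL_TESSDATA) {
        fake_delete(ctx->tess_handle);
        free(ctx);
        return NULL;
    }
    if (fake_init(ctx->tess_handle, where == FAIL_INIT) != 0) {
        fake_delete(ctx->tess_handle);
        free(ctx);
        return NULL;
    }

    /* 3. Initialized: end first, then delete. */
    if (where == FAIL_LATE) {
        fake_end(ctx->tess_handle);
        fake_delete(ctx->tess_handle);
        free(ctx);
        return NULL;
    }
    return ctx;
}

static void free_sketch(struct sketch_ctx *ctx)
{
    fake_end(ctx->tess_handle);
    fake_delete(ctx->tess_handle);
    free(ctx);
}
```

Calling init_sketch() with each failure point should leave live_handles at 0, which is the property the three bullets above are meant to guarantee. A single goto-style cleanup label would work just as well as the repeated blocks; I kept them separate here only so each path reads on its own.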