When a different Tesseract configuration produces invalid OCR text, or invalid OCR text is otherwise passed to translate(ocr_text),
the following error is raised:
web_1 | File "./app/main.py", line 158, in textract
web_1 | translation = translate(ocr_text)
web_1 | File "./app/main.py", line 122, in translate
web_1 | language = detect_lang(text)
web_1 | File "./app/main.py", line 64, in detect_lang
web_1 | possible_lang = translator.detect(img_str)
web_1 | File "/usr/local/lib/python3.8/dist-packages/googletrans/client.py", line 255, in detect
web_1 | data = self._translate(text, 'en', 'auto', kwargs)
web_1 | File "/usr/local/lib/python3.8/dist-packages/googletrans/client.py", line 78, in _translate
web_1 | token = self.token_acquirer.do(text)
web_1 | File "/usr/local/lib/python3.8/dist-packages/googletrans/gtoken.py", line 194, in do
web_1 | self._update()
web_1 | File "/usr/local/lib/python3.8/dist-packages/googletrans/gtoken.py", line 62, in _update
web_1 | code = self.RE_TKK.search(r.text).group(1).replace('var ', '')
web_1 | AttributeError: 'NoneType' object has no attribute 'group'
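We could unit test this function, perhaps passing it None, to check whether that is really the issue. I'm not sure it is: the traceback
shows the None coming from RE_TKK.search(r.text) inside googletrans, not from our argument. A try/except block would also work as a
temporary fix.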
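
A minimal sketch of both ideas, assuming we wrap the detection call in a small helper. The name `safe_detect_lang` and its `detect` parameter are hypothetical and not part of app/main.py; the detector is passed in so the test can run without googletrans or network access.

```python
from typing import Callable, Optional


def safe_detect_lang(text: Optional[str], detect: Callable[[str], object]) -> Optional[object]:
    """Return detect(text), or None if the text is unusable or detection fails."""
    # Skip the call entirely for None, empty, or whitespace-only OCR output.
    if text is None or not text.strip():
        return None
    try:
        return detect(text)
    except AttributeError:
        # Same exception type as in the traceback above; treated as "no language found".
        return None


# Stand-in detector that fails the same way as the traceback: calling .group on None.
def failing_detect(text: str) -> object:
    return None.group(1)


assert safe_detect_lang(None, failing_detect) is None
assert safe_detect_lang("some text", failing_detect) is None
assert safe_detect_lang("some text", lambda text: "en") == "en"
```

In detect_lang this might be called as `safe_detect_lang(img_str, translator.detect)`, with translate() then handling a None result. Catching the exception only hides the failure, so it would still be worth logging it while the root cause is investigated.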