We extract text from the tables in invoices, where each row lists a purchased item.
Two things go wrong: the bounding boxes (bboxes) sometimes return strange values, and camelot does not extract every kind of table correctly, although it handles some very well. The system stores each row of an invoice as a record in a MySQL table. For the bboxes, some extracted values come out garbled: for example, instead of 0 it reads 4BNE.
The job is to repair the EXISTING solution.
Our system is already built, and the errors are inside the existing file. Your job is to fix them within the existing structure. Do not create your own files or reinterpret the design. The fix must work inside the existing Python file, with everything that is already there.
Your experience: Python, camelot, OpenCV, Tesseract; 5+ years.
The listed price is a placeholder.
We have a short deadline, 2 days.
Errors to solve. The errors that must be fixed are:
1. A value that showed 0.00 on the invoice became 4Nne when extracted.
2. Rows from an invoice are added repeatedly, which causes errors in the calculation verification.
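To illustrate error 2, here is a minimal sketch of how repeated rows could be filtered out before they are written to MySQL. The column names in KEY_FIELDS and the function name drop_repeated_rows are hypothetical, since the real schema and code are not shown here; the actual fix has to go into the existing Python file.

```python
# Hypothetical column names; the real MySQL schema is not shown in this posting.
KEY_FIELDS = ("invoice_no", "line_no", "description", "quantity", "unit_price")


def drop_repeated_rows(rows):
    """Return rows with exact repeats removed, keeping the first occurrence.

    Two rows count as the same when every field in KEY_FIELDS matches.
    """
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row.get(field) for field in KEY_FIELDS)
        if key in seen:
            continue  # already extracted once, skip the repeat
        seen.add(key)
        unique.append(row)
    return unique


rows = [
    {"invoice_no": "A1", "line_no": 1, "description": "Bolt", "quantity": 2, "unit_price": "1.50"},
    {"invoice_no": "A1", "line_no": 1, "description": "Bolt", "quantity": 2, "unit_price": "1.50"},
    {"invoice_no": "A1", "line_no": 2, "description": "Nut", "quantity": 4, "unit_price": "0.25"},
]
clean_rows = drop_repeated_rows(rows)
```

The key includes a line number on the assumption that an invoice can legitimately contain two otherwise identical lines; if the existing code has no such field, a different key would be needed.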
Then there are some simpler errors that are purely logic issues:
3. Decimal errors: 210000 is shown instead of 2100.00 (European and UK number formats differ; "," and "." mean different things).
4. The verification script does not set the correct flags.
5. Duplicate supplier records in a table that should hold each supplier only once. Only add a supplier if its information differs from the existing record.
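For the decimal error in item 3, the following is a rough sketch of one way to normalise amounts written in either European or UK style. The function name parse_amount and its rules are assumptions for illustration, not the existing code: when both separators are present the last one is treated as the decimal mark, and a single separator followed by exactly three digits is treated as a thousands separator. Text containing letters, such as the garbled value from item 1, is rejected rather than converted.

```python
import re
from decimal import Decimal, InvalidOperation


def parse_amount(text):
    """Parse '2.100,00' or '2,100.00' into Decimal('2100.00'); None if invalid."""
    # Drop whitespace and common currency symbols, then reject anything
    # that still contains characters other than digits and separators.
    cleaned = re.sub(r"[\s€£$]", "", text)
    if not re.fullmatch(r"-?[\d.,]+", cleaned):
        return None

    last_comma = cleaned.rfind(",")
    last_dot = cleaned.rfind(".")
    decimal_sep = None
    if last_comma != -1 and last_dot != -1:
        # Both present: whichever comes last is the decimal mark.
        decimal_sep = "," if last_comma > last_dot else "."
    elif last_comma != -1 or last_dot != -1:
        sep = "," if last_comma != -1 else "."
        digits_after = len(cleaned) - cleaned.rfind(sep) - 1
        # Heuristic: repeated separators, or exactly three trailing digits,
        # mean thousands grouping; otherwise it is the decimal mark.
        if cleaned.count(sep) == 1 and digits_after != 3:
            decimal_sep = sep

    if decimal_sep is None:
        normalized = cleaned.replace(",", "").replace(".", "")
    else:
        thousands_sep = "." if decimal_sep == "," else ","
        normalized = cleaned.replace(thousands_sep, "").replace(decimal_sep, ".")

    try:
        return Decimal(normalized)
    except InvalidOperation:
        return None
```

The three-digit heuristic is ambiguous for values such as 1,234, so if the invoice's country or locale is known in the existing system, using that instead would be more reliable.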
Hold off on this one for now: 6. Camelot cannot extract from all invoices, for a reason we have not identified (I know it cannot extract from OCR, so that is not the cause). This must be solved before we can go into production. I cannot show this error because it cannot be visualised.