Description: |
We need to train the AWS Comprehend service to read information from text documents and extract the relevant fields. In order to do this we need to annotate our sample data.
https://docs.aws.amazon.com/comprehend/latest/dg/cer-annotation-csv.html
This project is to create the annotated CSV file described in the link above. There will be 80 files to process: a mix of PDF, Excel, Word and email files. Each file needs the following steps:
1. Convert the file to a text file.
2. Add the records to the training CSV file for the custom entities (company name, email, postcode, customer PO, product and order quantity).
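As a rough illustration of step 2, the sketch below builds annotation rows from a text file that has already been converted. It assumes the column layout from the linked page (File, Line, Begin Offset, End Offset, Type), zero-based line numbers, and an end offset equal to the begin offset plus the entity length; these should be checked against the AWS documentation before use. The file name `order_001.txt`, the sample text, and the entity type labels such as `COMPANY_NAME` are hypothetical placeholders, not agreed names.

```python
import csv
import io

# Column layout assumed from the linked AWS annotation page.
HEADER = ["File", "Line", "Begin Offset", "End Offset", "Type"]


def annotate(file_name, text, entities):
    """Return one annotation row per occurrence of each entity string.

    entities is a list of (surface_text, entity_type) pairs.
    Line numbers and character offsets are zero-based; the end offset
    is begin + len(surface_text) (an assumption to verify).
    """
    rows = []
    for line_no, line in enumerate(text.splitlines()):
        for surface, entity_type in entities:
            start = line.find(surface)
            while start != -1:
                end = start + len(surface)
                rows.append([file_name, line_no, start, end, entity_type])
                start = line.find(surface, end)
    return rows


def to_csv(rows):
    """Serialise the rows, with the header, as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample document and entity labels, for illustration only.
sample = "Order from Acme Ltd\nContact: buyer@acme.example\nQty: 25"
rows = annotate(
    "order_001.txt",
    sample,
    [
        ("Acme Ltd", "COMPANY_NAME"),
        ("buyer@acme.example", "EMAIL"),
        ("25", "ORDER_QUANTITY"),
    ],
)
print(to_csv(rows))
```

In practice the entity strings would come from a person reading each document, and plain string matching like this would need manual review, since the same text (a number, for example) can appear in places where it is not the intended entity.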
We are looking for someone with knowledge of AWS Comprehend, preferably with experience of Custom Entity Recognition. |