The problem requires us to create a pipeline that converts the OCR output of different kinds of documents into a key-value structure, where the keys are the important fields one might need from the document. For an invoice, for example, the keys would be: invoice number, name of vendor ...
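To make the target output concrete, here is a minimal sketch of what such a key-value extraction step could look like for an invoice. It assumes the OCR output is already available as plain text and uses simple regular expressions; the field names (`invoice_number`, `vendor_name`), the patterns, the `extract_fields` helper, and the sample text are all hypothetical and only illustrate the shape of the result, not a recommended extraction method.

```python
import re

# Hypothetical patterns for two invoice fields. Real documents vary widely,
# so these are illustrative assumptions, not a general solution.
INVOICE_PATTERNS = {
    "invoice_number": re.compile(
        r"invoice\s*(?:no\.?|number|#)\s*[:\-]?\s*(\S+)", re.IGNORECASE
    ),
    "vendor_name": re.compile(
        r"vendor\s*(?:name)?\s*[:\-]\s*(.+)", re.IGNORECASE
    ),
}


def extract_fields(ocr_text, patterns):
    """Map each field name to its first match in the OCR text, or None."""
    result = {}
    for field, pattern in patterns.items():
        match = pattern.search(ocr_text)
        result[field] = match.group(1).strip() if match else None
    return result


# Made-up OCR text standing in for the output of an OCR engine.
sample = "ACME Supplies\nInvoice No: INV-1042\nVendor: ACME Supplies Ltd\nTotal: 99.00"
fields = extract_fields(sample, INVOICE_PATTERNS)
print(fields)
```

Each document type would get its own pattern dictionary, so the same function can be reused across invoices, receipts, and so on. Fields that cannot be found are returned as `None` rather than omitted, which keeps the set of keys stable for downstream code.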