Frequently asked questions

Read here for clarity on common questions about document classification, extraction, compliance, privacy, pricing, and more.

Document AI & Data extraction
The Recognic console is designed to understand and extract useful information from complex document structures and images. It uses a five-step process to parse the information into an easy-to-ingest JSON or XML format. This journey of a document from unstructured data to meaningful information is known as the document and image processing life cycle.
- Read & Identify: Recognic.AI learns to read and identify the layout and data points of a document by performing OCR on hundreds of thousands of document samples fed to the system.

- Classification: The system identifies the document type and is trained to classify each document into a specific category.

- Orientation: The AI system uses pre-processing to fix issues in low-quality images or documents so that they are ready for extraction.

- Extraction: Recognic.AI uses the Google Cloud Vision API to read and extract data from documents. This derives structured data from unstructured documents and makes it available to business apps and users in industry-standard formats such as JSON, XML, and CSV.

- Analytics: An analytics dashboard helps streamline document processing workflows by tracking success metrics and measuring the latency of processed documents, with in-depth reports and visualizations.
Our team is always available to address any issues or concerns you have. If data is not being extracted correctly on the console, you can raise a support ticket from the console, and our team will get back to you as soon as possible with possible solutions.
We are working on technology to extract data from handwritten documents, but we do not currently offer this service. Please stay tuned for further updates.
Yes, there is a limit on the file size of a document. We currently support files of up to 8 MB.
Processing time depends heavily on the complexity, quality, and structure of the document, but we generally extract data from standard documents such as Aadhaar or PAN cards in a matter of seconds.
Recognic uses the Google Cloud Vision API, which is among the most accurate in the industry. A pre-processing stage also runs multiple image processing algorithms to enhance the input image for better results.
When capturing an image of a document, make sure the background is plain (ideally white) and that no text overlaps. Avoid screenshots of documents. PDFs should primarily be system-generated; if a PDF contains scanned images, they should be of reasonably good quality. The document must not exceed 8 MB.
Console
Recognic supports a wide range of documents, including:
- Identity Cards
- Onboarding Forms
- Financial Documents
- Government Taxation Forms
- Invoices
- Contract Documents
Our self-serve onboarding feature can also capture data from any document defined by its layout. Prime examples include identity documents, bank statements, payslips, ITRs, or any other custom document.
The extracted data can be downloaded in industry-standard formats: JSON, CSV, and XML.
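To give a feel for how the same extracted fields look in each of these formats, here is a minimal sketch using only the Python standard library. The field names and values are invented for illustration, and the exact layout of Recognic's real exports may differ.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Hypothetical extraction result; field names and values are illustrative only.
fields = {"document_type": "pan_card", "name": "A. Sample", "date_of_birth": "1990-01-01"}


def to_json(data: dict) -> str:
    """Serialize the fields as a pretty-printed JSON object."""
    return json.dumps(data, indent=2)


def to_csv(data: dict) -> str:
    """Serialize the fields as a header row followed by one data row."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(data))
    writer.writeheader()
    writer.writerow(data)
    return buffer.getvalue()


def to_xml(data: dict, root_tag: str = "document") -> str:
    """Serialize the fields as child elements of a single root element."""
    root = ET.Element(root_tag)
    for key, value in data.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(to_json(fields))
print(to_csv(fields))
print(to_xml(fields))
```

JSON and XML suit nested or per-document results, while CSV is convenient when many documents of the same type are loaded into a spreadsheet or database table.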
Yes, the self-serve onboarding feature supports data extraction from any custom document template. The user controls the entire training process and the resulting extraction accuracy.
Recognic supports the following file formats: PDF, PNG, JPEG, JPG, and TIFF.
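If you upload documents programmatically, it can help to check files against the supported formats above and the 8 MB size limit mentioned earlier in this FAQ before sending them. The sketch below is a hypothetical helper, not part of any Recognic SDK, and it assumes that "8 MB" means 8 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Limits taken from this FAQ; the helper itself is hypothetical.
MAX_SIZE_BYTES = 8 * 1024 * 1024  # assumption: "8 MB" interpreted as 8 MiB
SUPPORTED_EXTENSIONS = {".pdf", ".png", ".jpeg", ".jpg", ".tiff"}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = Path(filename).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file format: {extension or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes; the limit is {MAX_SIZE_BYTES}")
    return problems


print(check_upload("pan_card.JPG", 2_000_000))
print(check_upload("notes.docx", 9 * 1024 * 1024))
```

The check only looks at the file name and size; it does not verify image quality or whether a PDF is system-generated.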
You can upload documents to the console in several ways:
1. Upload from computer
2. Upload using Google Drive
3. Upload using SFTP
If you signed up on the console using Google sign-in, please follow the steps below:
Go to standalone > Select ‘Domain’ & ‘Document type’ > Click on ‘Upload file’ > Click on ‘Upload via Google Drive’ > Select file to upload.
Note: If you did not use Google sign-in to sign up on the console, you need to grant Google Drive permissions after completing all the steps above.
Extraction accuracy depends heavily on the quality of the documents from which data is extracted. To ensure high accuracy, we recommend providing good-quality images and documents.
Data Privacy
We do not store any personal user data, and we are committed to complying with the highest standards in the industry. We extract data from documents provided by customers primarily to meet the extraction requirements stated by our clients, and secondarily for further research and development of data extraction technology.
Our APIs are secured through the Apigee gateway, which offers a very high level of security for API management. All data processed on Google Cloud Platform is encrypted. Please click here to learn more about the encryption process.
Recognic runs on Google Cloud, and the services are deployed separately in each of the following regions:
1. India
2. South-east Asia
3. North America
4. UK
We do not store any data on our servers, with one exception: for identity documents, we store cropped data anonymously. Our data security policies erase data 7 days after the upload date (the period is configurable), so no data is retained by Recognic once this process is complete.
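The retention rule can be pictured as a simple age check against the upload date. The sketch below is illustrative only and is not Recognic's implementation: the function name is hypothetical, and it assumes data becomes eligible for erasure once it is exactly 7 days old or older.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_DAYS = 7  # the FAQ states 7 days, configurable


def is_due_for_deletion(
    uploaded_at: datetime,
    now: datetime,
    retention_days: int = DEFAULT_RETENTION_DAYS,
) -> bool:
    """Return True once the data has reached the retention period (assumed inclusive)."""
    return now - uploaded_at >= timedelta(days=retention_days)


uploaded = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(is_due_for_deletion(uploaded, uploaded + timedelta(days=3)))
print(is_due_for_deletion(uploaded, uploaded + timedelta(days=7)))
```

Passing a different `retention_days` value mirrors the configurable period described above.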
Pricing & Integrations
Before we can give you a price estimate, we need to understand your requirements: ideally, the volume of documents you will process on the console, the data fields you need to extract from them, and whether any custom development or training is required. If you would like a quote, please visit our website and sign up for a detailed conversation with our sales team.
Recognic APIs are RESTful and easy to integrate with any backend system. For details about our APIs and how to integrate them, please refer to the detailed documentation on our website. Our team is always available to assist with integrating them into your business processes.
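As a rough idea of what calling a RESTful extraction API from a backend can look like, the sketch below builds a JSON POST request with Python's standard library. Everything specific in it is an assumption made for illustration: the URL is a placeholder, and the header and payload field names are hypothetical. The actual endpoints, authentication scheme, and request format are defined in Recognic's API documentation.

```python
import base64
import json
import urllib.request

# Placeholder values: the URL, header, and payload field names are assumptions
# for illustration and are NOT Recognic's documented API.
API_URL = "https://api.example.com/v1/extract"


def build_extraction_request(
    file_bytes: bytes, document_type: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying one document."""
    payload = {
        "document_type": document_type,
        "file_base64": base64.b64encode(file_bytes).decode("ascii"),
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_extraction_request(b"%PDF-1.4 sample", "invoice", "YOUR_API_KEY")
print(request.get_method(), request.full_url)
# Sending it would be urllib.request.urlopen(request), pointed at a real endpoint.
```

Building the request separately from sending it keeps the example runnable offline and makes the payload easy to inspect or log before integration testing.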
