Scraping information from a PDF

Hi all,

Does anyone know of a widget/Pipe/API we can use to scrape a PDF (in this case a CV) for information like address, telephone number, etc.?
Has anyone in the Tadabase user community added this feature to a Tadabase application?

Thanks and kind regards
Peter

Hi Peter,
I think that is possible if you use the Tadabase API. I have used Python libraries for scraping data.
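
As a rough illustration of the Python route, here is a minimal sketch. It assumes the CV text has already been pulled out of the PDF (for example with a library such as pypdf), and it only looks for an email address and a phone-like number. The function name and both regex patterns are hypothetical and would need tuning for real CVs; addresses in particular are too varied for a simple pattern.

```python
import re

# Hypothetical patterns for illustration only; real CVs vary widely.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Optional "+", a digit, then at least 6 digits/spaces/brackets/dots/hyphens, ending on a digit.
PHONE_RE = re.compile(r"\+?\d[\d ().-]{6,}\d")


def extract_contact_fields(text):
    """Return the first email and first phone-like string in text (None if absent)."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }


# Made-up sample text standing in for text extracted from a CV.
sample = (
    "Jane Doe\n"
    "12 Example Street, Springfield\n"
    "Tel: +1 (555) 010-9999\n"
    "jane.doe@example.com"
)
print(extract_contact_fields(sample))
```

The resulting dictionary could then be sent to a Tadabase table through the API, but that part depends on your app's setup and is not shown here.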

There are a lot of OCR and text-parsing services out there.

https://docparser.com/ is great. It’s super simple to use (I’ve used it with Integromat) but it’s expensive.

Other options like AWS Textract or Google Cloud Vision are nice, but it is much more difficult to get them to return specific pieces of a document.
