Try https://docstrange.nanonets.com/ once; you can process 10k docs for free. Strong table performance. Do give feedback if you have any. It's powered by a bigger model than our open-source one, which is quite popular on HF.
If LLMs let you deanonymize at scale, then on a personal level you should also be able to figure out which posts are leading to this deanonymization and remove or modify them.
The top 3 models on Hugging Face are all OCR models. Most automation projects involve documents, where you need a model fine-tuned to understand all the elements inside a document and provide grounding, confidence scores, etc., which is why this subset of models is gaining popularity.
Would be interesting to see where funding goes to fix these issues. News would heavily impact public opinion, and hence political influence and public funding.