Browse by author "Junior, Paulo Roberto Machado Silva"
- Intelligent OCR application for text extraction and structuring on online platforms and newspapers
  Publication. Junior, Paulo Roberto Machado Silva; Alves, Paulo; Fernandes, José Eduardo; Cunha, Márcio Rodrigues da
  The monitoring of print media is an important function for the advertising industry, enabling the identification of advertisements in newspapers and magazines for market analysis. However, automating this extraction is challenging due to the complex layouts of these publications. Conventional Optical Character Recognition (OCR) systems can transcribe individual characters but often fail to retain structural organization and logical reading order. To address these issues, the proposed approach integrates Document Layout Analysis (DLA) with OCR in a multi-stage process. YOLOv10 and YOLOv12 models detect and segment document elements, and the resulting regions are then passed to PaddleOCR for text extraction. Experimental results show that the first pre-trained model achieved a mAP@50 of 0.728 on a 2,000-image sample from DocLayNet. The second pre-trained model achieved a mAP@50 of 0.519 on a custom dataset. The fusion strategy reduced detection redundancy, and comparative evaluation against a production baseline indicates competitive performance. The final workflow produces a semi-structured JSON output that preserves the association between bounding box coordinates and extracted text. Future work will assess Vision Language Models (VLMs) to improve reading order reconstruction in more complex layouts.
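
The abstract does not say how detections from the two layout models are fused or what the JSON schema looks like. As a rough illustration only, the sketch below assumes a greedy IoU-based deduplication (a stand-in technique, not the authors' method) and a hypothetical record layout with `label`, `bbox`, and `text` fields. The `ocr` argument is a placeholder callable standing in for an OCR engine; no real YOLO or PaddleOCR calls are made.

```python
import json
from typing import Callable, Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse_detections(detections: List[Dict], iou_threshold: float = 0.5) -> List[Dict]:
    """Assumed fusion rule: keep the highest-scoring box among heavy overlaps."""
    kept: List[Dict] = []
    for det in sorted(detections, key=lambda d: d["score"], reverse=True):
        if all(iou(det["box"], k["box"]) < iou_threshold for k in kept):
            kept.append(det)
    return kept


def to_semi_structured_json(regions: List[Dict], ocr: Callable[[Box], str]) -> str:
    """Pair each region's bounding box with the text the OCR callable returns."""
    records = [
        {"label": r["label"], "bbox": list(r["box"]), "text": ocr(r["box"])}
        for r in regions
    ]
    return json.dumps(records, ensure_ascii=False, indent=2)


# Hypothetical detections from two layout models on the same page.
model_a = [{"box": (10, 10, 110, 60), "label": "title", "score": 0.9}]
model_b = [
    {"box": (12, 11, 108, 62), "label": "title", "score": 0.8},  # near-duplicate
    {"box": (10, 80, 200, 300), "label": "text", "score": 0.7},
]

fused = fuse_detections(model_a + model_b)
output = to_semi_structured_json(fused, ocr=lambda box: "<text for region>")
print(output)
```

In this toy input the two "title" boxes overlap almost entirely, so only the higher-scoring one survives, which is one simple way detection redundancy could be reduced. Each JSON record then keeps the box coordinates next to the text extracted from that region.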
