🔍 Code Extractor

Search Components

Full-Text: Fast keyword matching
Semantic: AI-powered understanding of intent (finds similar concepts)

Search Results for "tesseract"

Found 4 matching component(s)

class DocumentProcessor_v4

Handles document processing and text extraction using llmsherpa (same approach as offline_docstore_multi_vice.py).

File: /tf/active/vicechatdev/docchat/document_processor.py

class documentprocessor

class DocumentProcessor_v8

Processes different document types for indexing.

File: /tf/active/vicechatdev/docchat/document_indexer.py

class documentprocessor

function test_extraction_methods

A test function that compares two PDF text extraction methods (regular llmsherpa and OCR-based Tesseract) on a specific purchase order document from FileCloud, checking for vendor name detection.

File: /tf/active/vicechatdev/contract_validity_analyzer/test_extraction_methods.py

testing pdf-extraction ocr document-processing text-extraction

class DocumentProcessor_v3

A comprehensive PDF document processor that handles text extraction, OCR (Optical Character Recognition), layout analysis, table detection, and metadata extraction from PDF files.

File: /tf/active/vicechatdev/invoice_extraction/core/document_processor.py

pdf-processing ocr text-extraction document-processing invoice-processing

Search Examples

validation - Find validation functions
database - Find database-related components
email - Find email processing functions
api - Find API-related components
authentication - Find auth components