Search - Code Extractor

function create_word_report_improved

Generates a formatted Microsoft Word document report containing warranty disclosures with a table of contents, structured sections, and references.

File: /tf/active/vicechatdev/improved_convert_disclosures_to_table.py

document-generation word-processing report-generation docx warranty-management

function main_v14

Orchestrates the conversion of an improved markdown file containing warranty disclosures into multiple tabular formats (CSV, Excel, Word) with timestamp-based file naming.

File: /tf/active/vicechatdev/improved_convert_disclosures_to_table.py

file-conversion markdown-processing warranty-data csv-export excel-export
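
The timestamp-based file naming mentioned above could look roughly like the sketch below. The helper name `timestamped_outputs`, the `YYYYMMDD_HHMMSS` stamp format, and the three extensions are assumptions for illustration; the listing does not show how main_v14 actually builds its file names.

```python
from datetime import datetime
from pathlib import Path


def timestamped_outputs(stem, out_dir=".", extensions=("csv", "xlsx", "docx"), now=None):
    """Return one output path per extension, all sharing a single timestamp.

    Hypothetical helper: the stamp format and extension list are assumptions.
    """
    # Take the timestamp once so every output of a run carries the same stamp.
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    return {ext: Path(out_dir) / f"{stem}_{stamp}.{ext}" for ext in extensions}
```

Computing the stamp once per run keeps the CSV, Excel, and Word outputs of the same conversion grouped under one suffix.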

function test_complex_url_hyperlink

A test function that validates the creation of Word documents with complex FileCloud URLs containing special characters, query parameters, and URL fragments as clickable hyperlinks.

File: /tf/active/vicechatdev/test_complex_hyperlink.py

testing word-document hyperlink docx url-handling

function create_word_report

Generates a formatted Microsoft Word document report containing warranty disclosures with a table of contents, metadata, and structured sections for each warranty.

File: /tf/active/vicechatdev/convert_disclosures_to_table.py

document-generation word-document docx report-generation warranty

function main_v25

Converts a markdown file containing warranty disclosure data into multiple tabular formats (CSV, Excel, Word) with timestamped output files.

File: /tf/active/vicechatdev/convert_disclosures_to_table.py

markdown-conversion data-extraction report-generation csv-export excel-export

function create_enhanced_word_document

Converts markdown-formatted warranty disclosure content into a formatted Microsoft Word document with hierarchical headings, styled text, lists, and special formatting for block references.

File: /tf/active/vicechatdev/improved_word_converter.py

document-generation markdown-to-word docx warranty-processing legal-documents

function main_v30

Main entry point function that reads a markdown file, converts it to an enhanced Word document with preserved heading structure, and saves it with a timestamped filename.

File: /tf/active/vicechatdev/improved_word_converter.py

document-conversion markdown-to-word file-processing docx main-entry-point

function clean_text_for_xml_v1

Sanitizes text strings for XML 1.0 compatibility by removing or replacing invalid control characters, so that every character meets the XML specification requirements for Word document generation.

File: /tf/active/vicechatdev/enhanced_word_converter_fixed.py

text-processing xml sanitization data-cleaning word-documents
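
A minimal sketch of this kind of sanitization, assuming the goal is to keep only characters allowed by the XML 1.0 `Char` production. The function name `sanitize_for_xml` and its `replacement` parameter are hypothetical and are not taken from enhanced_word_converter_fixed.py.

```python
import re

# XML 1.0 allows only: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF].
# The negated class below matches everything else (control characters,
# lone surrogates, U+FFFE and U+FFFF).
_INVALID_XML_CHARS = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def sanitize_for_xml(text, replacement=""):
    """Remove (or replace) characters that are not legal in XML 1.0."""
    if text is None:
        return ""
    return _INVALID_XML_CHARS.sub(replacement, str(text))
```

Tabs, newlines, and carriage returns survive, while characters such as NUL or vertical tab, which typically make a .docx file unreadable, are dropped.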

function create_enhanced_word_document_v1

Converts markdown content into a formatted Microsoft Word document with proper styling, table of contents, warranty sections, and reference handling for Project Victoria warranty disclosures.

File: /tf/active/vicechatdev/enhanced_word_converter_fixed.py

document-generation word-processing markdown-conversion docx formatting

function format_inline_references

Formats inline citation references (e.g., [1], [2]) in a Word document paragraph by applying italic styling to them while preserving the rest of the text.

File: /tf/active/vicechatdev/enhanced_word_converter_fixed.py

document-formatting word-processing python-docx text-formatting citations
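
One way to approach this is to split the text into citation and non-citation segments first, then style each segment separately. The sketch below covers only the splitting step with the standard library; `split_citations` is a hypothetical name, and the pattern assumes citations are plain bracketed integers such as [1].

```python
import re

# A capturing group makes re.split() keep the citations in its output.
_CITATION = re.compile(r"(\[\d+\])")


def split_citations(text):
    """Return (segment, is_citation) pairs that cover `text` in order."""
    segments = []
    for part in _CITATION.split(text):
        if part:  # re.split() yields empty strings between adjacent matches
            segments.append((part, bool(_CITATION.fullmatch(part))))
    return segments
```

With python-docx, each segment could then be added through `paragraph.add_run(segment)`, setting `run.italic = True` on the runs flagged as citations.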

function main_v2

Main orchestration function that reads an improved markdown file and converts it to an enhanced Word document with comprehensive formatting, including table of contents, warranty sections, disclosures, and bibliography.

File: /tf/active/vicechatdev/enhanced_word_converter_fixed.py

document-generation word-processing markdown-conversion docx file-processing

class OneCo_hybrid_RAG

A class named OneCo_hybrid_RAG

File: /tf/active/vicechatdev/OneCo_hybrid_RAG copy.py

class oneco_hybrid_rag

class DocumentDetail

Document detail view component

File: /tf/active/vicechatdev/document_detail_backup.py

class documentdetail

class OneCo_hybrid_RAG_v1

A class named OneCo_hybrid_RAG

File: /tf/active/vicechatdev/OneCo_hybrid_RAG_old.py

class oneco_hybrid_rag

class DocumentProcessor_v5

Process different document types for RAG context extraction

File: /tf/active/vicechatdev/offline_docstore_multi_vice.py

class documentprocessor

class DocumentConverter

A class that converts various document formats (Word, Excel, PowerPoint, OpenDocument, Visio) to PDF using LibreOffice's headless conversion capabilities, with support for parallel processing and directory structure preservation.

File: /tf/active/vicechatdev/pdfconverter.py

document-conversion pdf libreoffice batch-processing parallel-processing

class DocumentDetail_v1

Document detail view component

File: /tf/active/vicechatdev/document_detail_old.py

class documentdetail

class ReferenceManager_v4

Manages extraction and formatting of references for LLM chat responses. Handles both file references and BibTeX citations, formatting them according to various academic citation styles.

File: /tf/active/vicechatdev/OneCo_hybrid_RAG.py

class referencemanager

class OneCo_hybrid_RAG_v2

A class named OneCo_hybrid_RAG

File: /tf/active/vicechatdev/OneCo_hybrid_RAG.py

class oneco_hybrid_rag

class PDFConverter

A class that converts various document formats (Word, PowerPoint, Excel, images) to PDF format using LibreOffice and ReportLab libraries.

File: /tf/active/vicechatdev/msg_to_eml.py

pdf conversion document-processing file-conversion libreoffice

class DocxMerger

A class named DocxMerger

File: /tf/active/vicechatdev/word_merge.py

class docxmerger

function merge_word_documents

Merges track changes and comments from a revision Word document into a base Word document, creating a combined output document.

File: /tf/active/vicechatdev/word_merge.py

document-processing word-documents docx merge track-changes

class DocumentProcessor_v6

Process different document types for RAG context extraction

File: /tf/active/vicechatdev/offline_docstore_multi.py

class documentprocessor

class DocumentExtractor

A document text extraction class that supports multiple file formats including Word, PowerPoint, PDF, and plain text files, with automatic format detection and conversion capabilities.

File: /tf/active/vicechatdev/leexi/document_extractor.py

document-processing text-extraction pdf word powerpoint

function test_document_extractor

A test function that validates the DocumentExtractor class by testing file type support detection, text extraction from various document formats, and error handling.

File: /tf/active/vicechatdev/leexi/test_document_extractor.py

testing document-extraction file-processing validation text-extraction

function search_and_locate

Searches for specific numbered folders (01-08) in a SharePoint site and traces their locations, contents, and file distributions by type.

File: /tf/active/vicechatdev/SPFCsync/search_detailed.py

sharepoint search diagnostic folder-discovery microsoft-graph

function build_document_tree_lazy

Builds a single-level document tree structure for lazy loading, scanning only immediate children of a target directory without recursively loading subdirectories.

File: /tf/active/vicechatdev/docchat/app.py

file-system directory-tree lazy-loading document-management file-browser
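
As an illustration of single-level scanning, the sketch below lists only the immediate children of a directory and marks folders as not yet loaded. The name `list_children`, the node dictionary layout, and the default extension list are assumptions, not the structure used in docchat/app.py.

```python
import os


def list_children(directory, extensions=(".pdf", ".docx", ".txt")):
    """List immediate children of `directory` without descending into folders."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            if entry.name.startswith("."):
                continue  # skip hidden files and directories
            if entry.is_dir(follow_symlinks=False):
                # children=None signals "not loaded yet" to the caller
                entries.append({"name": entry.name, "type": "folder",
                                "path": entry.path, "children": None})
            elif entry.is_file() and entry.name.lower().endswith(tuple(extensions)):
                entries.append({"name": entry.name, "type": "file",
                                "path": entry.path})
    # Folders first, then files, each group sorted case-insensitively by name.
    entries.sort(key=lambda e: (e["type"] != "folder", e["name"].lower()))
    return entries
```

A file browser would call this again with a folder's path when the user expands that folder.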

function build_document_tree_recursive

Recursively builds a complete hierarchical tree structure of documents and folders from a target directory path, filtering for supported file types and skipping hidden/cache directories.

File: /tf/active/vicechatdev/docchat/app.py

file-system directory-traversal recursive document-management tree-structure

function view_document

Flask route handler that serves documents for in-browser viewing by accepting a file path as a query parameter, validating security constraints, and returning the file with appropriate MIME types and CORS headers.

File: /tf/active/vicechatdev/docchat/app.py

flask file-serving document-viewer security path-validation
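
The path validation step can be sketched independently of Flask. The example below assumes the security constraint is "the requested file must resolve to a location inside a configured root directory"; `resolve_inside_root` is a hypothetical helper, and the actual checks in view_document may differ.

```python
import os


def resolve_inside_root(root, requested):
    """Resolve `requested` against `root`; refuse paths that escape the root."""
    root_real = os.path.realpath(root)
    # realpath() collapses ".." segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(root_real, requested))
    # Note: on Windows, commonpath() raises ValueError for paths on different drives.
    if os.path.commonpath([root_real, candidate]) != root_real:
        raise PermissionError(f"path escapes document root: {requested!r}")
    return candidate
```

Comparing resolved paths rather than raw strings is what rejects inputs such as `../outside.txt` or an absolute path outside the root.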

function export_to_word

Flask route handler that exports a chat conversation to a formatted Microsoft Word (.docx) document with styled headings, timestamps, and references.

File: /tf/active/vicechatdev/docchat/app.py

export word-document docx chat-history conversation-export

class DocumentProcessor_v4

Handles document processing and text extraction using llmsherpa (same approach as offline_docstore_multi_vice.py).

File: /tf/active/vicechatdev/docchat/document_processor.py

class documentprocessor

class DocumentProcessor_v8

Process different document types for indexing

File: /tf/active/vicechatdev/docchat/document_indexer.py

class documentprocessor

function test_docx_file

Tests the ability to open and read a Microsoft Word (.docx) document file, validating file existence, size, and content extraction capabilities.

File: /tf/active/vicechatdev/docchat/test_problematic_files.py

document-testing file-validation docx word-document diagnostic

function test_libreoffice_conversion

Tests LibreOffice's ability to convert a document file to PDF format using headless mode, with timeout protection and comprehensive error reporting.

File: /tf/active/vicechatdev/docchat/test_problematic_files.py

libreoffice pdf-conversion document-processing testing subprocess
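
A headless conversion with timeout protection might be wrapped as shown below. This is a sketch under assumptions: the binary is taken to be `soffice` on the PATH, and the helper names and the 60-second default are illustrative rather than taken from test_problematic_files.py.

```python
import subprocess


def build_convert_command(source, out_dir, binary="soffice"):
    """Build the LibreOffice command line for a headless PDF conversion."""
    return [binary, "--headless", "--convert-to", "pdf",
            "--outdir", str(out_dir), str(source)]


def convert_to_pdf(source, out_dir, timeout=60, binary="soffice"):
    """Run the conversion and return (ok, message) instead of raising."""
    try:
        result = subprocess.run(build_convert_command(source, out_dir, binary),
                                capture_output=True, text=True, timeout=timeout)
    except FileNotFoundError:
        return False, f"{binary} not found on PATH"
    except subprocess.TimeoutExpired:
        return False, f"conversion timed out after {timeout}s"
    return result.returncode == 0, (result.stderr or result.stdout).strip()
```

Returning a status tuple keeps a batch run going when one problematic file hangs or the binary is missing.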

function main_v109

A test harness function that validates the ability to open and process PowerPoint and Word document files, with fallback to LibreOffice conversion for problematic files.

File: /tf/active/vicechatdev/docchat/test_problematic_files.py

testing document-processing file-validation powerpoint word

function check_dependencies

Validates the installation status of all required Python packages for the DocChat application by attempting to import each dependency and logging the results.

File: /tf/active/vicechatdev/docchat/integration.py

dependency-check validation installation package-management flask
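
The import-and-log approach described above can be sketched as follows. The name `check_modules` and the returned dictionary are assumptions; the real check_dependencies may report its results differently.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def check_modules(module_names):
    """Try to import each module and return {name: True/False}."""
    status = {}
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            logger.warning("missing dependency %s: %s", name, exc)
            status[name] = False
        else:
            logger.info("dependency %s is available", name)
            status[name] = True
    return status
```

Note that a package's import name can differ from its install name, for example python-docx is imported as `docx`.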

function test_document_processor

A test function that validates the DocumentProcessor component's ability to extract text from PDF files with improved error handling and llmsherpa integration.

File: /tf/active/vicechatdev/contract_validity_analyzer/test_improved_processor.py

testing document-processing pdf-extraction text-extraction integration-test

function test_filecloud_connection

Tests the connection to a FileCloud server by establishing a client connection and performing a document search operation to verify functionality.

File: /tf/active/vicechatdev/contract_validity_analyzer/test_implementation.py

testing filecloud connection-test document-search integration-test

function test_ocr_fallback

A test function that validates OCR fallback functionality when the primary llmsherpa PDF text extraction method fails.

File: /tf/active/vicechatdev/contract_validity_analyzer/test_ocr_fallback.py

testing ocr pdf-processing text-extraction fallback-mechanism

function explore_documents

Explores and tests document accessibility across multiple FileCloud directory paths, attempting to download and validate document content from various locations in a hierarchical search pattern.

File: /tf/active/vicechatdev/contract_validity_analyzer/explore_documents.py

filecloud document-exploration diagnostic file-search pdf-validation

function parse_arguments

Parses command-line arguments for a contract validity analysis tool that processes FileCloud documents with configurable options for paths, concurrency, output, and file filtering.

File: /tf/active/vicechatdev/contract_validity_analyzer/main.py

cli argument-parsing command-line argparse configuration

class Config_v3

Configuration manager class that loads, manages, and persists configuration settings for a contract validity analyzer application, supporting YAML files and environment variable overrides.

File: /tf/active/vicechatdev/contract_validity_analyzer/config/config.py

configuration config-management yaml environment-variables settings

class FileCloudClient_v1

A client class for interacting with FileCloud storage systems through direct API calls, providing authentication, file search, download, and metadata retrieval capabilities.

File: /tf/active/vicechatdev/contract_validity_analyzer/utils/filecloud_client.py

filecloud storage api-client file-management document-management

class DocumentProcessor_v1

A document processing class that extracts text from PDF and Word documents using llmsherpa as the primary method with fallback support for PyPDF2, pdfplumber, and python-docx.

File: /tf/active/vicechatdev/contract_validity_analyzer/utils/document_processor_new.py

document-processing text-extraction pdf-processing word-processing llmsherpa
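
The primary-method-with-fallback pattern can be expressed generically, as in the sketch below. The extractor callables are placeholders supplied by the caller; nothing here reproduces the llmsherpa, PyPDF2, pdfplumber, or python-docx calls used by the class itself.

```python
def extract_with_fallback(path, extractors):
    """Try each (name, callable) in order; return the first non-empty result.

    `extractors` is a list of (name, function) pairs, where each function
    takes a path and returns extracted text. Hypothetical helper.
    """
    errors = []
    for name, extract in extractors:
        try:
            text = extract(path)
        except Exception as exc:  # any failure moves on to the next extractor
            errors.append(f"{name}: {exc}")
            continue
        if text and text.strip():
            return name, text
        errors.append(f"{name}: empty result")
    raise RuntimeError("all extractors failed: " + "; ".join(errors))
```

Treating an empty result as a failure matters for scanned PDFs, where a parser can succeed technically yet return no text.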

class DocumentProcessor_v2

A document processing class that extracts text from PDF and Word documents using llmsherpa as the primary method with fallback support for PyPDF2, pdfplumber, and python-docx.

File: /tf/active/vicechatdev/contract_validity_analyzer/utils/document_processor_old.py

document-processing text-extraction pdf-processing word-processing llmsherpa

function main_v88

Entry point function that demonstrates document processing workflow by creating an audited, watermarked, and protected PDF/A document from a DOCX file with audit trail data.

File: /tf/active/vicechatdev/document_auditor/main.py

document-processing pdf-generation audit-trail watermarking pdf-a-compliance

class DocumentConverter_v1

A class that converts various document formats (Word, Excel, PowerPoint, images) to PDF format using LibreOffice, unoconv, or PIL.

File: /tf/active/vicechatdev/document_auditor/src/document_converter.py

document-conversion pdf file-processing office-documents image-to-pdf

function api_export_document

Flask API endpoint that exports a document in either DOCX or PDF format, with authentication and authorization checks.

File: /tf/active/vicechatdev/vice_ai/complex_app.py

api export document pdf docx

function process_markdown_content

Parses markdown-formatted text content and converts it into a structured list of content elements with type annotations and formatting metadata suitable for document export.

File: /tf/active/vicechatdev/vice_ai/complex_app.py

markdown parser document-processing text-processing content-conversion
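
A reduced illustration of turning markdown lines into typed elements is given below. It handles only headers, bullet items, numbered items, and paragraphs; the element keys (`type`, `level`, `text`) and the name `parse_markdown` are assumptions and do not mirror the structure produced by process_markdown_content.

```python
import re

_HEADER = re.compile(r"^(#{1,6})\s+(.*)$")
_BULLET = re.compile(r"^\s*[-*+]\s+(.*)$")
_NUMBERED = re.compile(r"^\s*\d+\.\s+(.*)$")


def parse_markdown(text):
    """Convert markdown text into a flat list of typed element dicts."""
    elements = []
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line.strip():
            continue  # blank lines only separate elements
        match = _HEADER.match(line)
        if match:
            elements.append({"type": "header", "level": len(match.group(1)),
                             "text": match.group(2)})
            continue
        match = _BULLET.match(line)
        if match:
            elements.append({"type": "bullet", "text": match.group(1)})
            continue
        match = _NUMBERED.match(line)
        if match:
            elements.append({"type": "numbered", "text": match.group(1)})
            continue
        elements.append({"type": "paragraph", "text": line.strip()})
    return elements
```

An exporter can then walk this list and map each element type to a Word style, which keeps parsing and document generation separate.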

function add_formatted_content_to_word_v1

Converts processed markdown elements into formatted content within a Microsoft Word document, handling headers, paragraphs, lists, tables, and code blocks with appropriate styling.

File: /tf/active/vicechatdev/vice_ai/complex_app.py

markdown-conversion word-document document-generation formatting docx

Search Results for "docx"
