function process_single_file
Asynchronously processes a single file (likely PDF) through an LLM pipeline, generating a response PDF with optional conversation continuity, multi-page support, and editing workflow capabilities.
File: /tf/active/vicechatdev/e-ink-llm/processor.py
Lines: 458-493
Complexity: complex
Purpose
This function serves as a high-level interface for processing individual files through an e-ink optimized LLM workflow without starting a file watcher. It's designed for one-off file processing tasks where you need to extract content, send it to an LLM (likely OpenAI), and generate a formatted PDF response. Use cases include: processing handwritten notes, converting annotated PDFs to LLM responses, continuing multi-turn conversations from file context, and handling multi-page document analysis with optional text editing detection.
Source Code
async def process_single_file(file_path: str, api_key: Optional[str] = None,
                              conversation_id: Optional[str] = None,
                              compact_mode: bool = True,
                              auto_detect_session: bool = True,
                              enable_multi_page: bool = True,
                              max_pages: int = 50,
                              enable_editing_workflow: bool = True,
                              enable_hybrid_mode: bool = True) -> Optional[str]:
    """
    Process a single file without starting the watcher

    Args:
        file_path: Path to file to process
        api_key: OpenAI API key (optional if set in environment)
        conversation_id: Conversation ID to continue (optional)
        compact_mode: Use compact response formatting
        auto_detect_session: Enable automatic session detection from file
        enable_multi_page: Enable multi-page PDF processing
        max_pages: Maximum pages to process for multi-page PDFs
        enable_editing_workflow: Enable annotation detection and text editing workflow
        enable_hybrid_mode: Enable hybrid response handling

    Returns:
        Path to generated response PDF or None if failed
    """
    processor = EInkLLMProcessor(
        api_key=api_key,
        conversation_id=conversation_id,
        compact_mode=compact_mode,
        auto_detect_session=auto_detect_session,
        enable_multi_page=enable_multi_page,
        max_pages=max_pages,
        enable_editing_workflow=enable_editing_workflow,
        enable_hybrid_mode=enable_hybrid_mode
    )
    result = await processor.process_file(Path(file_path))
    return str(result) if result else None
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| file_path | str | - | positional_or_keyword |
| api_key | Optional[str] | None | positional_or_keyword |
| conversation_id | Optional[str] | None | positional_or_keyword |
| compact_mode | bool | True | positional_or_keyword |
| auto_detect_session | bool | True | positional_or_keyword |
| enable_multi_page | bool | True | positional_or_keyword |
| max_pages | int | 50 | positional_or_keyword |
| enable_editing_workflow | bool | True | positional_or_keyword |
| enable_hybrid_mode | bool | True | positional_or_keyword |
Parameter Details
file_path: String path to the input file to process. Expected to be a valid file path, likely a PDF file based on the context of multi-page and annotation detection features.
api_key: Optional OpenAI API key for authentication. If None, the function will attempt to use an API key from environment variables (likely OPENAI_API_KEY). Provide this if you want to override the environment setting.
conversation_id: Optional conversation identifier to continue an existing conversation thread. If provided, the processor will maintain context from previous interactions. Use None to start a fresh conversation.
compact_mode: Boolean flag to enable compact response formatting optimized for e-ink displays. When True (default), responses are formatted with space-efficient layouts suitable for limited screen real estate.
auto_detect_session: Boolean flag to enable automatic session detection from the input file. When True (default), the system attempts to identify and resume previous conversation sessions based on file metadata or content markers.
enable_multi_page: Boolean flag to enable processing of multi-page PDF documents. When True (default), all pages up to max_pages will be processed; when False, only the first page is processed.
max_pages: Integer specifying the maximum number of pages to process from multi-page PDFs. Default is 50. This prevents excessive processing time and API costs for very long documents.
enable_editing_workflow: Boolean flag to enable detection of annotations and text editing marks in the input file. When True (default), the system can identify handwritten edits, highlights, or markup and incorporate them into the LLM workflow.
enable_hybrid_mode: Boolean flag to enable hybrid response handling, likely combining multiple processing strategies or output formats. When True (default), provides enhanced response generation capabilities.
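To maintain context across several files, pass the same conversation_id to each call. The sketch below chains two calls under one ID; the inline stand-in for process_single_file exists only so the snippet runs on its own, and in the real project you would import the function from processor.py instead.

```python
import asyncio
from typing import Optional

# Stand-in so this sketch is self-contained; in the real project,
# import process_single_file from processor.py instead.
async def process_single_file(file_path: str,
                              conversation_id: Optional[str] = None,
                              **kwargs) -> Optional[str]:
    return f"{file_path}.response.pdf"

async def continue_conversation(paths, conversation_id: str):
    """Process files one after another under a single conversation ID,
    so each response can build on the context of the previous ones."""
    outputs = []
    for path in paths:
        outputs.append(await process_single_file(path, conversation_id=conversation_id))
    return outputs

outputs = asyncio.run(continue_conversation(["q1.pdf", "q2.pdf"], "conv_12345"))
```

Processing sequentially (rather than with asyncio.gather) matters here: each response must be generated before the next file is sent, or the conversation context would be incomplete.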
Return Value
Type: Optional[str]
On success, a string containing the path to the newly created PDF with the LLM's formatted response; on failure, None.
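Because failure is signalled by None rather than an exception, callers that prefer fail-fast behavior can wrap the return value. The helper below is a hypothetical caller-side pattern, not part of processor.py:

```python
from pathlib import Path
from typing import Optional

def require_result(result: Optional[str]) -> Path:
    """Convert the Optional[str] return of process_single_file into a
    Path, raising explicitly instead of silently propagating None.

    Hypothetical caller-side helper; not part of the module itself."""
    if result is None:
        raise RuntimeError("process_single_file failed (returned None)")
    return Path(result)
```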
Dependencies
watchdog, pathlib, typing, logging, asyncio
Required Imports
import asyncio
from pathlib import Path
from typing import Optional
Conditional/Optional Imports
These imports are only needed under specific conditions; each must be available in the same package/module:
- from input_processor import InputProcessor (always required by EInkLLMProcessor)
- from llm_handler import LLMHandler (always required by EInkLLMProcessor)
- from pdf_generator import PDFGenerator (always required by EInkLLMProcessor)
- from session_manager import SessionManager (always required by EInkLLMProcessor)
- from compact_formatter import CompactResponseFormatter (required when compact_mode is True)
- from session_detector import SessionDetector, detect_session_from_file (required when auto_detect_session is True)
- from multi_page_llm_handler import MultiPageLLMHandler (required when enable_multi_page is True)
- from editing_workflow import EditingWorkflowHandler (required when enable_editing_workflow is True)
- from conversation_context import ConversationContextManager (always required by EInkLLMProcessor)
- from hybrid_response_handler import HybridResponseHandler (required when enable_hybrid_mode is True)
Usage Example
import asyncio
from pathlib import Path
from typing import Optional

# Basic usage with environment API key
async def main():
    result = await process_single_file(
        file_path="/path/to/input.pdf"
    )
    if result:
        print(f"Response generated at: {result}")
    else:
        print("Processing failed")

# Advanced usage with custom settings
async def advanced_example():
    result = await process_single_file(
        file_path="/path/to/document.pdf",
        api_key="sk-your-openai-key",
        conversation_id="conv_12345",
        compact_mode=True,
        enable_multi_page=True,
        max_pages=20,
        enable_editing_workflow=True,
        enable_hybrid_mode=True
    )
    return result

# Run the async function
if __name__ == "__main__":
    asyncio.run(main())
Best Practices
- Always use asyncio.run() or await this function within an async context - it cannot be called synchronously
- Ensure the file_path points to an existing, readable file before calling to avoid processing errors
- Set max_pages appropriately to balance processing time and API costs for large documents
- Provide api_key explicitly if running in environments where environment variables are not reliable
- Handle the None return value to detect and respond to processing failures gracefully
- Consider disabling enable_multi_page for single-page documents to improve performance
- Use conversation_id to maintain context across multiple file processing calls in a conversation flow
- Monitor API usage when processing multiple files, especially with enable_multi_page enabled
- Ensure all custom module dependencies are properly installed and accessible in your Python path
- The function creates a new EInkLLMProcessor instance for each call, so it's safe to call concurrently for different files
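Since each call builds its own EInkLLMProcessor, concurrent processing of different files is safe, but unbounded concurrency can exhaust API rate limits. A sketch of bounded batch processing follows; the inline stand-in for process_single_file only makes the snippet self-contained, and in practice you would import the real function from processor.py.

```python
import asyncio
from typing import Optional

# Stand-in so this sketch runs standalone; replace with the real
# process_single_file import from processor.py.
async def process_single_file(file_path: str, **kwargs) -> Optional[str]:
    return f"{file_path}.response.pdf"

async def process_batch(paths, max_concurrent: int = 3):
    """Process many files concurrently, capping simultaneous LLM calls
    with a semaphore to keep API usage under control."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def one(path: str) -> Optional[str]:
        async with semaphore:
            return await process_single_file(path)

    results = await asyncio.gather(*(one(p) for p in paths))
    return dict(zip(paths, results))

results = asyncio.run(process_batch(["a.pdf", "b.pdf", "c.pdf"]))
```

The semaphore value is a tuning knob: raise it for throughput, lower it when enable_multi_page and large max_pages values make each call expensive.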
Similar Components
AI-powered semantic similarity - components with related functionality:
- function process_single_remarkable_file (62.5% similar)
- function process_multi_page_pdf (58.3% similar)
- class MultiPageLLMHandler (55.4% similar)
- function main_v68 (54.2% similar)
- function main_v10 (53.3% similar)