function init_engines
Initializes the RAG (Retrieval-Augmented Generation) engine and document indexer components, loads persisted sessions, and optionally starts background auto-indexing of documents.
/tf/active/vicechatdev/docchat/app.py
636 - 701
moderate
Purpose
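This function sets up the core components of a document chat system. It first loads previously saved user sessions, then creates the global DocChatRAG and DocumentIndexer instances. If config.AUTO_INDEX_ON_STARTUP is set and the indexer was created, it starts document indexing in a background daemon thread. Each constructor call and the indexing task are wrapped in their own try/except block that logs the error and its traceback instead of raising, so a failure in one component does not stop the others from initializing. The call to load_all_sessions() is not wrapped, so an exception raised there propagates to the caller.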
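The log-and-continue error handling described above can be sketched in isolation. This is a minimal illustration, not the code in app.py: `safe_init`, `GoodComponent`, and `BrokenComponent` are hypothetical names introduced here, and only the standard `logging` and `traceback` modules are used.

```python
import logging
import traceback

logger = logging.getLogger("init_sketch")


def safe_init(name, factory):
    """Call factory(); on failure, log the error and traceback and return None."""
    try:
        instance = factory()
        logger.info("%s initialized", name)
        return instance
    except Exception as exc:
        # Log instead of raising so other components can still be initialized
        logger.error("Failed to initialize %s: %s", name, exc)
        logger.error("Traceback: %s", traceback.format_exc())
        return None


class GoodComponent:
    """Hypothetical stand-in for a component that starts cleanly."""


class BrokenComponent:
    """Hypothetical stand-in for a component whose constructor fails."""

    def __init__(self):
        raise RuntimeError("backend unavailable")


good = safe_init("good component", GoodComponent)
broken = safe_init("broken component", BrokenComponent)
```

With this pattern a failed component is represented by None, so code that uses it later has to check for that value before calling into it.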
Source Code
def init_engines():
    """Initialize RAG engine and document indexer"""
    global rag_engine, document_indexer

    # Load persisted sessions
    load_all_sessions()

    try:
        rag_engine = DocChatRAG()
        logger.info("✓ RAG engine initialized")
    except Exception as e:
        logger.error(f"Failed to initialize RAG engine: {e}")
        try:
            import traceback
            logger.error(f"Traceback: {traceback.format_exc()}")
        except:
            pass

    try:
        document_indexer = DocumentIndexer()
        logger.info("✓ Document indexer initialized")
    except Exception as e:
        logger.error(f"Failed to initialize document indexer: {e}")
        try:
            import traceback
            logger.error(f"Traceback: {traceback.format_exc()}")
        except:
            pass

    # Auto-index documents on startup if enabled
    if config.AUTO_INDEX_ON_STARTUP and document_indexer:
        logger.info(f"Auto-indexing enabled. Scanning folder: {config.DOCUMENT_FOLDER}")

        def auto_index():
            """Background task to auto-index documents on startup"""
            try:
                if config.DOCUMENT_FOLDER.exists():
                    logger.info("Starting automatic document indexing...")
                    results = document_indexer.index_folder(
                        config.DOCUMENT_FOLDER,
                        recursive=True,
                        force_reindex=False
                    )
                    new_docs = results['success'] - results['reindexed']
                    logger.info(
                        f"Auto-indexing complete: "
                        f"{new_docs} new, "
                        f"{results['reindexed']} re-indexed, "
                        f"{results['skipped']} skipped, "
                        f"{results['failed']} failed"
                    )
                else:
                    logger.info(f"Document folder does not exist yet: {config.DOCUMENT_FOLDER}")
            except Exception as e:
                logger.error(f"CRITICAL: Error during auto-indexing: {e}")
                try:
                    import traceback
                    logger.error(f"Traceback: {traceback.format_exc()}")
                except:
                    pass

        # Start auto-indexing in background thread
        index_thread = threading.Thread(target=auto_index, daemon=True)
        index_thread.start()
        logger.info("✓ Auto-indexing started in background")
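The summary line logged by auto_index relies on a small piece of arithmetic: the count of new documents is the number of successes minus the number that were re-indexed. The sketch below works that through with a made-up results dictionary. The numbers are illustrative only, and `format_summary` is a hypothetical helper, not part of DocumentIndexer. It assumes, as the subtraction in the source implies, that 'success' includes re-indexed documents.

```python
def format_summary(results):
    """Build the same summary text that auto_index logs."""
    # New documents are successes that were not re-indexes
    new_docs = results["success"] - results["reindexed"]
    return (
        f"Auto-indexing complete: "
        f"{new_docs} new, "
        f"{results['reindexed']} re-indexed, "
        f"{results['skipped']} skipped, "
        f"{results['failed']} failed"
    )


# Illustrative counts only, not real output from index_folder
example_results = {"success": 12, "reindexed": 3, "skipped": 40, "failed": 1}
summary = format_summary(example_results)
print(summary)  # Auto-indexing complete: 9 new, 3 re-indexed, 40 skipped, 1 failed
```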
Return Value
This function does not return any value (implicitly returns None). It modifies global variables 'rag_engine' and 'document_indexer' as side effects.
Dependencies
- flask
- logging
- datetime
- uuid
- pathlib
- threading
- werkzeug
- functools
- traceback
- json
- os
- time
- python-docx
- reportlab
Required Imports
import logging
import threading
import traceback
import config
from rag_engine import DocChatRAG
from document_indexer import DocumentIndexer
Conditional/Optional Imports
These imports are only needed under specific conditions:
import traceback
Condition: imported inside exception handlers for detailed error logging
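Because `traceback` is part of the Python standard library, the import inside the handler is not expected to fail. As an alternative sketch (not what app.py does), the standard `logging` module can attach the traceback itself through `logger.exception`, which removes the need for the import. The logger name and the `failing_constructor` function below are hypothetical.

```python
import io
import logging

# Capture log output in memory so the result can be inspected
stream = io.StringIO()
handler = logging.StreamHandler(stream)
logger = logging.getLogger("traceback_sketch")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False


def failing_constructor():
    """Hypothetical stand-in for a constructor that raises."""
    raise ValueError("missing index directory")


try:
    failing_constructor()
except Exception:
    # Logs at ERROR level and appends the active exception's traceback
    logger.exception("Failed to initialize document indexer")

log_output = stream.getvalue()
```

The captured `log_output` contains the message followed by the full traceback, which is the same information the source assembles with `traceback.format_exc()`.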
Usage Example
# Declare global variables first
rag_engine = None
document_indexer = None

# Configure logger
import logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)

# Define load_all_sessions function
def load_all_sessions():
    # Load persisted session data
    pass

# Configure settings in config module
import config
from pathlib import Path
config.AUTO_INDEX_ON_STARTUP = True
config.DOCUMENT_FOLDER = Path('./documents')

# Import required classes
from rag_engine import DocChatRAG
from document_indexer import DocumentIndexer

# Initialize engines
init_engines()

# Now rag_engine and document_indexer are available globally
if rag_engine:
    response = rag_engine.query('What is in the documents?')
    print(response)
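In the example above, init_engines() returns while indexing may still be running in its daemon thread. The following sketch, using only the standard library, shows that pattern with a stand-in task. `fake_index_folder` and the `done` event are hypothetical and exist only to make completion observable; the real function does not expose such an event.

```python
import threading
import time

done = threading.Event()
indexed = []


def fake_index_folder():
    """Hypothetical stand-in for a slow indexing job."""
    time.sleep(0.1)
    indexed.append("example.pdf")
    done.set()


# daemon=True means the thread will not keep the process alive at exit
index_thread = threading.Thread(target=fake_index_folder, daemon=True)
index_thread.start()

# start() returns immediately; wait up to 5 seconds for the job to signal completion
finished = done.wait(timeout=5)
```

A completion signal of this kind is mainly useful in tests or scripts that need indexed documents to be available before issuing a query.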
Best Practices
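- Initialize the module-level variables 'rag_engine' and 'document_indexer' (for example to None) before calling this function. If a constructor raises, the corresponding name is never assigned, and the later 'document_indexer' check raises NameError if the name does not already exist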
- Configure all required settings in the config module before initialization
- The function uses a daemon thread for background indexing, which is stopped abruptly when the main program exits, so indexing still in progress at shutdown is cut off
- Initialization failures are logged rather than raised, which allows partial initialization; check that 'rag_engine' and 'document_indexer' are set before using them
- Auto-indexing runs in background to avoid blocking application startup
- The function modifies global state, so it should only be called once during application startup
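- Set config.DOCUMENT_FOLDER to a pathlib.Path (or similar object), since the code calls .exists() on it. If the folder does not exist, auto-indexing logs a message and skips indexing rather than failing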
- Consider the performance impact of auto-indexing large document collections on startup
- The function uses force_reindex=False to avoid re-indexing unchanged documents
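The advice to call the function only once can be enforced with a small guard. This is a sketch under assumptions: `init_once`, `_init_lock`, and `_initialized` are hypothetical names, and the `init_func` argument stands in for init_engines.

```python
import threading

_init_lock = threading.Lock()
_initialized = False


def init_once(init_func):
    """Run init_func the first time only; return True if it ran on this call."""
    global _initialized
    with _init_lock:
        if _initialized:
            return False
        init_func()
        _initialized = True
        return True


calls = []


def fake_init():
    """Hypothetical stand-in for init_engines."""
    calls.append("init")


first = init_once(fake_init)
second = init_once(fake_init)
```

The lock matters only if startup code can reach the guard from more than one thread; in a single-threaded startup path the boolean flag alone would be enough.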
Similar Components
AI-powered semantic similarity - components with related functionality:
- function init_chat_engine_v1 76.0% similar
- function init_chat_engine 71.3% similar
- function basic_rag_example 69.8% similar
- function process_chat_background 68.2% similar
- class DocChatRAG 67.1% similar