🔍 Code Extractor

function api_chat_v1

Maturity: 26

Handle chat API requests with support for long-running tasks

File: /tf/active/vicechatdev/vice_ai/app.py
Lines: 726-1176
Complexity: moderate

Purpose

Handles chat API requests, with support for long-running tasks. The handler validates the incoming message, then either dispatches the request to a background thread (when extensive search or async_processing is requested in the config) or processes it synchronously: it configures the chat engine's flow control and chat memory from the supplied options and history, registers data sources (ChromaDB collections, uploaded documents, reference blocks, and Neo4j reference documents), generates a response via the engine's response_callback, post-processes file:// links and document references in the output, and returns a JSON payload with the response, available references, and debug information.
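
Request Payload

The JSON body this handler reads, reconstructed from the request.get_json() and config.get(...) calls in the source below. The values shown are the code's defaults or illustrative placeholders; any key that is omitted simply falls back to its default.

# Illustrative request payload (keys taken from the config.get(...) calls in
# the source; values are defaults or placeholders).
example_request_body = {
    "message": "Summarize the latest validation report.",
    "config": {
        "async_processing": False,            # True forces background processing
        "enable_extensive_search": False,     # True also triggers background processing
        "extensive_chunks": 200,
        "summary_tokens": 5000,
        "detail_level": "detailed",
        "enable_search": True,                # mapped to both database and web search
        "enable_memory": True,
        "chat_history": [                     # optional prior exchanges for context
            {"role": "user", "content": "..."},
            {"role": "assistant", "content": "..."},
        ],
        "enable_keyword_filtering": False,
        "manual_keywords": "",
        "enable_reference_filtering": True,
        "relevance_threshold": 0.7,
        "instruction_template": None,         # optional instruction template name
        "collections": ["example_collection"],    # illustrative collection name
        "uploaded_documents": [],             # entries need at least an 'id' field
    },
}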

Source Code

def api_chat():
    """Handle chat API requests with support for long-running tasks"""
    try:
        data = request.get_json()
        message = data.get('message', '').strip()
        
        if not message:
            return jsonify({'error': 'Message cannot be empty'}), 400
        
        if not chat_engine:
            return jsonify({'error': 'Chat engine not available'}), 500
        
        # Extract configuration options
        config = data.get('config', {})
        
        # Check if this should be processed asynchronously
        # Use async for extensive search or when explicitly requested
        use_async = config.get('enable_extensive_search', False) or config.get('async_processing', False)
        
        if use_async:
            # Start background processing
            task_id = str(uuid.uuid4())
            user_info = session['user']
            
            create_task(task_id, user_info.get('email', 'unknown'))
            
            # Start background thread
            thread = threading.Thread(
                target=process_chat_request_background,
                args=(task_id, message, config, user_info),
                daemon=True
            )
            thread.start()
            
            logger.info(f"🚀 Started async task {task_id} for user {user_info.get('name', 'unknown')}")
            
            return jsonify({
                'task_id': task_id,
                'status': 'processing',
                'message': 'Your request is being processed in the background. Please check the status using the task ID.',
                'estimated_time': 'This may take 5-20 minutes for extensive searches.'
            })
        
        else:
            # Process synchronously for quick requests
            # Configure chat engine based on user selections
            if hasattr(chat_engine, 'flow_control'):
                # Database search (for internal documents)
                chat_engine.flow_control['enable_search'] = config.get('enable_search', True)
                # Web search (for external web content) - map the frontend's enable_search to web search
                chat_engine.flow_control['enable_web_search'] = config.get('enable_search', True)
                logger.info(f"🔍 Search configuration - Database: {chat_engine.flow_control['enable_search']}, Web: {chat_engine.flow_control['enable_web_search']}")
                
                chat_engine.flow_control['enable_memory'] = config.get('enable_memory', True)
                chat_engine.flow_control['enable_extensive_search'] = config.get('enable_extensive_search', False)
                chat_engine.flow_control['extensive_search_chunks'] = config.get('extensive_chunks', 200)
                chat_engine.flow_control['target_summary_tokens'] = config.get('summary_tokens', 5000)  # Increased default to match UI
                chat_engine.flow_control['detail_level'] = config.get('detail_level', 'detailed')
                
                # Log target summary tokens for debugging
                target_tokens = config.get('summary_tokens', 5000)
                logger.info(f"📊 Target summary tokens configured: {target_tokens}")
                
                chat_engine.flow_control['enable_keyword_filtering'] = config.get('enable_keyword_filtering', False)
                chat_engine.flow_control['manual_keywords'] = config.get('manual_keywords', '')
                chat_engine.flow_control['enable_reference_filtering'] = config.get('enable_reference_filtering', True)
                chat_engine.flow_control['relevance_threshold'] = config.get('relevance_threshold', 0.7)
                
                # Log keyword filtering configuration for debugging
                if config.get('enable_keyword_filtering', False):
                    keywords = config.get('manual_keywords', '').strip()
                    logger.info(f"🔍 Keyword filtering configured - Keywords: '{keywords}'")

            # Handle chat history for context-aware processing
            if config.get('enable_memory', False) and config.get('chat_history'):
                chat_history = config.get('chat_history', [])
                logger.info(f"📚 Received chat history with {len(chat_history)} messages for context-aware processing")
                
                # Temporarily populate the chat memory with the provided history
                # Clear existing memory first
                if hasattr(chat_engine, 'chat_memory'):
                    logger.info(f"🔍 Chat memory before clearing: {len(chat_engine.chat_memory.messages)} messages")
                    chat_engine.chat_memory.messages = []
                    
                    # Add the history messages in pairs (user, assistant)
                    added_exchanges = 0
                    
                    # Group messages into user-assistant pairs
                    # Find all user messages and their corresponding assistant responses
                    user_messages = [msg for msg in chat_history if msg.get('role') == 'user']
                    assistant_messages = [msg for msg in chat_history if msg.get('role') == 'assistant']
                    
                    logger.info(f"🔍 Found {len(user_messages)} user messages and {len(assistant_messages)} assistant messages")
                    
                    # Pair them up (assuming they alternate in conversation)
                    max_pairs = min(len(user_messages), len(assistant_messages))
                    
                    for i in range(max_pairs):
                        user_msg = user_messages[i]
                        assistant_msg = assistant_messages[i]
                        
                        if user_msg and assistant_msg:
                            chat_engine.chat_memory.save_context(
                                {"role": "user", "content": user_msg.get('content', '')},
                                {"role": "assistant", "content": assistant_msg.get('content', '')}
                            )
                            added_exchanges += 1
                            logger.info(f"   đŸ’Ŧ Added exchange {added_exchanges}: '{user_msg.get('content', '')[:50]}...' -> '{assistant_msg.get('content', '')[:50]}...'")
                        else:
                            logger.warning(f"   âš ī¸  Missing message in pair {i+1}")
                    
                    if len(user_messages) != len(assistant_messages):
                        logger.warning(f"   âš ī¸  Uneven number of user ({len(user_messages)}) and assistant ({len(assistant_messages)}) messages")
                    
                    logger.info(f"✅ Populated chat memory with {added_exchanges} conversation exchanges (total messages: {len(chat_engine.chat_memory.messages)})")
                    
                    # Verify memory state after population
                    if hasattr(chat_engine.chat_memory, 'get_formatted_history'):
                        formatted_history = chat_engine.chat_memory.get_formatted_history()
                        logger.info(f"🔍 Formatted history length: {len(formatted_history)} characters")
                    
                else:
                    logger.warning("âš ī¸  Chat engine does not have chat_memory attribute")
            elif config.get('enable_memory', False):
                logger.info("📝 Memory enabled but no chat history provided from frontend")
                
            # Set instruction template if provided
            template = config.get('instruction_template')
            if template and hasattr(chat_engine, 'set_instruction_template'):
                chat_engine.set_instruction_template(template)
            elif template and hasattr(chat_engine, 'current_instruction_template'):
                chat_engine.current_instruction_template = template
            
            # Add data sources if specified
            collections = config.get('collections', [])
            logger.info(f"Selected collections: {collections}")
            
            if hasattr(chat_engine, 'data_handles'):
                # Always clear existing data sources first, regardless of whether new collections are selected
                try:
                    chat_engine.data_handles.clear_data()
                    logger.info("Cleared existing data sources")
                except Exception as e:
                    logger.warning(f"Could not clear existing data sources: {e}")
                
                # Add selected collections as chromaDB type (only if collections are selected)
                if collections:
                    for collection in collections:
                        available_collections = getattr(chat_engine, 'available_collections', [])
                        logger.info(f"Available collections: {available_collections}")
                        
                        if collection in available_collections:
                            try:
                                # Configure processing steps based on user settings
                                processing_steps = ["similarity", "extend_query"]
                                
                                # Add keyword filtering if enabled
                                if config.get('enable_keyword_filtering', False):
                                    processing_steps.append("keyword_filter")
                                    logger.info(f"🔍 Keyword filtering enabled for collection {collection}")
                                
                                # Add collection as chromaDB type which is the correct type for ChromaDB collections
                                chat_engine.data_handles.add_data(
                                    name=f"Internal data store: {collection}",
                                    type="chromaDB",
                                    data=collection,
                                    filters="",
                                    processing_steps=processing_steps,
                                    inclusions=10,
                                    instructions=""
                                )
                                logger.info(f"✅ Added ChromaDB collection: {collection} with processing steps: {processing_steps}")
                                
                            except Exception as e:
                                logger.error(f"❌ Failed to add collection {collection}: {e}")
                        else:
                            logger.warning(f"âš ī¸  Collection {collection} not found in available collections: {available_collections}")
                else:
                    logger.info("â„šī¸  No collections selected - data sources cleared")
                            
                # Log current data handles for debugging
                try:
                    current_handlers = list(chat_engine.data_handles.handlers.keys())
                    logger.info(f"🔍 Current data handlers after setup: {current_handlers}")
                except Exception as e:
                    logger.warning(f"Could not log current handlers: {e}")
            else:
                logger.warning("âš ī¸  Chat engine does not have data_handles attribute")
                    
            # Handle uploaded documents
            uploaded_docs = config.get('uploaded_documents', [])
            if uploaded_docs and hasattr(chat_engine, 'data_handles'):
                logger.info(f"📄 Processing {len(uploaded_docs)} uploaded documents")
                
                for doc_info in uploaded_docs:
                    try:
                        # Get document content from storage
                        user_email = session['user'].get('email', 'unknown')
                        user_documents = get_user_documents(user_email)
                        
                        # Check if required fields exist
                        if 'id' not in doc_info:
                            logger.error(f"❌ Document missing 'id' field: {doc_info}")
                            continue
                        
                        doc_id = doc_info['id']
                        doc_type = doc_info.get('type', 'uploaded')
                        
                        # Handle different types of documents
                        if doc_type == 'reference' and doc_id.startswith('ref_') and not doc_id.startswith('ref_block_'):
                            # This is a ChromaDB reference document - skip it since it's already been processed
                            # These come from AI responses that cite ChromaDB collection documents
                            logger.info(f"â„šī¸  Skipping ChromaDB reference document: {doc_id} (already processed in context)")
                            continue
                        elif doc_type == 'reference_block' or doc_id.startswith('ref_block_'):
                            # This is a reference block - should be processed as uploaded document
                            if doc_id in user_documents:
                                doc_data = user_documents[doc_id]
                                text_content = doc_data['text_content']
                                filename = doc_data['metadata'].get('original_filename', 'Unknown Block')
                                
                                # Add as text data source
                                processing_steps = ["similarity"]
                                if config.get('enable_keyword_filtering', False):
                                    processing_steps.append("keyword_filter")
                                
                                chat_engine.data_handles.add_data(
                                    name=f"Reference block: {filename}",
                                    type="text",
                                    data=text_content,
                                    filters="",
                                    processing_steps=processing_steps,
                                    inclusions=5,
                                    instructions=f"This content is from the reference block '{filename}'"
                                )
                                logger.info(f"✅ Added reference block: {filename}")
                            else:
                                logger.warning(f"âš ī¸  Reference block {doc_id} not found in user storage")
                        elif doc_type == 'reference' and 'uuid' in doc_info:
                            # This is a Neo4j reference document with UUID - try to retrieve from Neo4j
                            document_uuid = doc_info.get('uuid')
                            if document_uuid and hasattr(chat_engine, 'extensive_search_manager'):
                                try:
                                    document_content = chat_engine.extensive_search_manager.get_document_by_uuid(document_uuid)
                                    if document_content:
                                        # Add as text data source
                                        processing_steps = ["similarity"]
                                        if config.get('enable_keyword_filtering', False):
                                            processing_steps.append("keyword_filter")
                                        
                                        # Extract filename from content or use UUID
                                        filename = document_content.get('doc_name', f"Document {document_uuid[:8]}")
                                        text_content = document_content.get('content', '')
                                        
                                        chat_engine.data_handles.add_data(
                                            name=f"Reference document: {filename}",
                                            type="text",
                                            data=text_content,
                                            filters="",
                                            processing_steps=processing_steps,
                                            inclusions=5,
                                            instructions=f"This content is from the Neo4j reference document '{filename}' with UUID {document_uuid}"
                                        )
                                        logger.info(f"✅ Added Neo4j reference document: {filename} (UUID: {document_uuid})")
                                    else:
                                        logger.warning(f"âš ī¸  Neo4j document {document_uuid} not found or empty")
                                except Exception as e:
                                    logger.error(f"❌ Failed to retrieve Neo4j document {document_uuid}: {e}")
                            else:
                                logger.warning(f"âš ī¸  Neo4j reference document missing UUID or extensive_search_manager not available")
                        elif doc_id in user_documents:
                            # This is a regular uploaded document
                            doc_data = user_documents[doc_id]
                            text_content = doc_data['text_content']
                            filename = doc_data['metadata'].get('original_filename', 'Unknown')
                            
                            # Add as text data source
                            processing_steps = ["similarity"]
                            if config.get('enable_keyword_filtering', False):
                                processing_steps.append("keyword_filter")
                            
                            chat_engine.data_handles.add_data(
                                name=f"Uploaded document: {filename}",
                                type="text",
                                data=text_content,
                                filters="",
                                processing_steps=processing_steps,
                                inclusions=5,  # Fewer inclusions for uploaded docs
                                instructions=f"This content is from the uploaded document '{filename}'"
                            )
                            logger.info(f"✅ Added uploaded document: {filename}")
                        else:
                            logger.warning(f"âš ī¸  Document {doc_id} not found in user storage")
                            
                    except Exception as e:
                        logger.error(f"❌ Failed to add uploaded document {doc_info.get('name', 'unknown')}: {e}")
                        logger.error(f"   Document info: {doc_info}")
                        import traceback
                        logger.error(f"   Traceback: {traceback.format_exc()}")
            
            # Generate response using chat engine
            try:
                # Use the response_callback method which is the main interface for OneCo_hybrid_RAG
                if hasattr(chat_engine, 'response_callback'):
                    raw_response = chat_engine.response_callback(message)
                    
                    # Handle different response types more carefully
                    if hasattr(raw_response, '_repr_markdown_'):
                        # This is a Panel Markdown object - get the actual markdown content
                        try:
                            response = raw_response._repr_markdown_()
                        except:
                            response = str(raw_response)
                    elif hasattr(raw_response, 'object'):
                        # Panel pane object
                        if hasattr(raw_response.object, '_repr_markdown_'):
                            response = raw_response.object._repr_markdown_()
                        else:
                            response = str(raw_response.object)
                    elif hasattr(raw_response, 'value'):
                        # Some objects have a value attribute
                        response = str(raw_response.value)
                    elif str(type(raw_response)).startswith('<class \'panel.'):
                        # Panel object - try to extract content
                        try:
                            if hasattr(raw_response, '_repr_markdown_'):
                                response = raw_response._repr_markdown_()
                            elif hasattr(raw_response, 'object'):
                                response = str(raw_response.object)
                            else:
                                response = str(raw_response)
                        except:
                            response = str(raw_response)
                    else:
                        response = str(raw_response)
                        
                    # Clean up the response if it contains Markdown object representations
                    if isinstance(response, str) and 'Markdown(' in response:
                        # Extract content from Markdown(content) format
                        import re
                        match = re.search(r'Markdown\((.*)\)', response, re.DOTALL)
                        if match:
                            response = match.group(1).strip('\'"')
                    
                    # Post-process the response to fix file:// links and enhance document references
                    if isinstance(response, str):
                        import re
                        
                        # First, convert file:// links to document references with collection info
                        # Pattern to match file:// links in HTML format
                        file_link_pattern = r'<a href=[\'"]file:///[^\'">]*[/\\]([^\'">]+)[\'"][^>]*>([^<]+)</a>'
                        
                        def create_document_link(match):
                            filename = match.group(1)
                            link_text = match.group(2)
                            # Create a more informative link that shows it's a document reference
                            return f'📄 **[{link_text}]** (Document: {filename})'
                        
                        response = re.sub(file_link_pattern, create_document_link, response)
                        
                        # Handle remaining file:// links in markdown format
                        md_file_link_pattern = r'\[([^\]]+)\]\(file:///[^)]*[/\\]([^)]+)\)'
                        
                        def create_md_document_link(match):
                            link_text = match.group(1)
                            filename = match.group(2)
                            return f'📄 **[{link_text}]** (Document: {filename})'
                        
                        response = re.sub(md_file_link_pattern, create_md_document_link, response)
                        
                        # Enhance ChromaDB collection references to be more readable
                        collection_ref_pattern = r'Document from ChromaDB collection [\'"]([^\'"]+)[\'"]:\s*([^<\n]+)'
                        
                        def enhance_collection_reference(match):
                            collection = match.group(1)
                            document_info = match.group(2).strip()
                            # Make collection references more visually appealing and clickable-looking
                            return f'📚 **Collection: {collection}** → 📄 [{document_info}]'
                        
                        response = re.sub(collection_ref_pattern, enhance_collection_reference, response)
                        
                        # Clean up any remaining problematic HTML anchor tags (keep web URLs)
                        # Only remove local file links, keep http/https links
                        local_link_pattern = r'<a href=[\'"](?!https?://)([^\'">]*)[\'"][^>]*>([^<]+)</a>'
                        response = re.sub(local_link_pattern, r'📎 **\2**', response)
                        
                        # Additional enhancement for standalone document references in references section
                        # Look for patterns like "2.13.4.4.3.1 WX-RPT-14144-01.pdf" and make them more prominent
                        doc_pattern = r'(\d+(?:\.\d+)*\s+)([A-Z0-9_-]+\.(?:pdf|docx?|xlsx?|pptx?))'
                        response = re.sub(doc_pattern, r'📄 **\1\2**', response, flags=re.IGNORECASE)
                        
                        # Enhance "References" section header if present
                        response = re.sub(r'^(References?)\s*$', r'## 📚 \1', response, flags=re.MULTILINE)
                        
                        # Make numbered references more visually appealing
                        numbered_ref_pattern = r'^\[(\d+)\]:\s*'
                        response = re.sub(numbered_ref_pattern, r'**[\1]:** ', response, flags=re.MULTILINE)
                    
                    logger.info(f"✅ Generated response: {len(response)} characters")
                    
                    # Get available references from the chat engine after response generation
                    available_references = []
                    if hasattr(chat_engine, 'get_available_references'):
                        try:
                            available_references = chat_engine.get_available_references(response_text=response)
                            logger.info(f"📚 Found {len(available_references)} available references for frontend")
                        except Exception as e:
                            logger.warning(f"âš ī¸  Could not retrieve available references: {e}")
                            # Fallback to get all references without filtering
                            try:
                                available_references = chat_engine.get_available_references()
                                logger.info(f"📚 Fallback: Found {len(available_references)} total references")
                            except:
                                available_references = []
                    
                elif hasattr(chat_engine, 'chat_interface') and hasattr(chat_engine.chat_interface, 'send'):
                    # Use the Panel chat interface method
                    response = str(chat_engine.chat_interface.send(message))
                    available_references = []
                elif hasattr(chat_engine, 'process_message'):
                    # Use a custom process_message method
                    response = str(chat_engine.process_message(message))
                    available_references = []
                else:
                    # Try to call the chat engine directly
                    response = str(chat_engine(message))
                    available_references = []
                    
            except Exception as e:
                logger.error(f"Chat engine error: {e}")
                response = f"I apologize, but I encountered an error while processing your message: {str(e)}"
                available_references = []
            
            return jsonify({
                'response': response,
                'timestamp': datetime.now().isoformat(),
                'user': session['user']['name'],
                'available_references': available_references,
                'debug_info': {
                    'collections_requested': collections,
                    'available_collections': getattr(chat_engine, 'available_collections', []),
                    'has_data_handles': hasattr(chat_engine, 'data_handles'),
                    'database_search_enabled': config.get('enable_search', True),
                    'web_search_enabled': config.get('enable_search', True),
                    'references_count': len(available_references)
                }
            })
        
    except Exception as e:
        logger.error(f"API error: {e}")
        return jsonify({'error': 'Internal server error'}), 500
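
Background Task Helpers

The asynchronous branch depends on two helpers defined elsewhere in app.py and not shown in this extract: create_task and process_chat_request_background. A minimal sketch of what such helpers might look like, assuming a simple in-memory task registry (the real implementation may differ):

import threading
from datetime import datetime

# Hypothetical in-memory task registry; the real app.py may store tasks differently.
tasks = {}
tasks_lock = threading.Lock()

def create_task(task_id, user_email):
    """Register a new background task so its status can be polled later."""
    with tasks_lock:
        tasks[task_id] = {
            'status': 'processing',
            'user': user_email,
            'created_at': datetime.now().isoformat(),
            'result': None,
            'error': None,
        }

def process_chat_request_background(task_id, message, config, user_info):
    """Run the chat request in a background thread and record the outcome."""
    try:
        # The real implementation would configure chat_engine from `config`
        # and call its response_callback, as the synchronous path above does.
        result = f"(response for: {message})"
        with tasks_lock:
            tasks[task_id]['status'] = 'completed'
            tasks[task_id]['result'] = result
    except Exception as exc:
        with tasks_lock:
            tasks[task_id]['status'] = 'failed'
            tasks[task_id]['error'] = str(exc)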

Return Value

Returns a Flask JSON response built with jsonify. On the asynchronous path it returns a task descriptor with task_id, status, message, and estimated_time; on the synchronous path it returns the generated response together with timestamp, user, available_references, and debug_info. An empty message returns an error payload with HTTP 400, and a missing chat engine or an unexpected exception returns an error payload with HTTP 500.
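
Illustrative response shapes, reconstructed from the jsonify(...) calls in the source; field values are placeholders.

# Asynchronous path (extensive search or async_processing requested):
async_response = {
    "task_id": "3f2c9c2e-...",            # placeholder UUID
    "status": "processing",
    "message": "Your request is being processed in the background. "
               "Please check the status using the task ID.",
    "estimated_time": "This may take 5-20 minutes for extensive searches.",
}

# Synchronous path:
sync_response = {
    "response": "...generated answer with post-processed document references...",
    "timestamp": "2024-01-01T12:00:00",   # placeholder ISO timestamp
    "user": "Jane Doe",                   # placeholder user name
    "available_references": [],
    "debug_info": {
        "collections_requested": ["example_collection"],
        "available_collections": ["example_collection"],
        "has_data_handles": True,
        "database_search_enabled": True,
        "web_search_enabled": True,
        "references_count": 0,
    },
}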

Required Imports

from flask import Flask
from flask import render_template
from flask import request
from flask import jsonify
from flask import session
import uuid
import threading
from datetime import datetime

Note: re and traceback are imported inline within the function body. The function also relies on module-level objects defined elsewhere in app.py: chat_engine, logger, create_task, process_chat_request_background, and get_user_documents.

Usage Example

# Example usage:
# api_chat() is a Flask route handler: it reads request.get_json() and
# session['user'], so it is invoked by Flask when its route is requested
# rather than being called directly (see the client sketch below).
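
A sketch of how a client might call the endpoint, assuming the handler is registered at /api/chat (the route decorator is not included in this extract, so the path is an assumption) and that the request carries an authenticated session cookie, since the handler reads session['user']:

import requests

# Hypothetical route path; the actual @app.route decorator is not shown in the extract.
API_URL = "http://localhost:5000/api/chat"

payload = {
    "message": "Summarize the latest validation report.",
    "config": {
        "enable_search": True,
        "enable_memory": True,
        "enable_extensive_search": False,   # True would switch to the async path
        "collections": ["example_collection"],
    },
}

# session['user'] must be populated, so send an authenticated session cookie.
resp = requests.post(API_URL, json=payload, cookies={"session": "..."})
resp.raise_for_status()
body = resp.json()

if "task_id" in body:
    # Asynchronous path: poll the task-status endpoint (not part of this extract).
    print("Background task started:", body["task_id"])
else:
    print(body["response"])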

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function api_chat 65.4% similar

    Flask API endpoint that handles chat requests asynchronously, processing user queries through a RAG (Retrieval-Augmented Generation) engine with support for multiple modes, memory, web search, and custom configurations.

    From: /tf/active/vicechatdev/docchat/app.py
  • function process_chat_request_background 61.7% similar

    Process chat request in background thread

    From: /tf/active/vicechatdev/vice_ai/app.py
  • function api_chat_status 59.9% similar

    Flask API endpoint that retrieves the detailed status of a chat task by delegating to the api_task_status function using the provided task_id.

    From: /tf/active/vicechatdev/docchat/app.py
  • function api_send_chat_message_v1 57.1% similar

    Flask API endpoint that handles sending messages in a chat session, processes them through a RAG (Retrieval-Augmented Generation) engine with configurable LLM models, and returns AI-generated responses with references.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function process_chat_background 56.5% similar

    Processes chat requests asynchronously in a background thread, managing RAG engine interactions, progress updates, and session state for various query modes including basic, extensive, full_reading, and deep_reflection.

    From: /tf/active/vicechatdev/docchat/app.py