api_chat_upload_document - Code Extractor

function api_chat_upload_document

Maturity: 56

Flask API endpoint that handles document upload for chat context, processes the document to extract text content, and stores it for later retrieval in chat sessions.

File:
/tf/active/vicechatdev/vice_ai/complex_app.py

Lines:
2259 - 2331

Complexity:
complex

Purpose

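This endpoint lets users upload a document to use as context in chat conversations. It checks that a file with a non-empty filename was submitted and that the user is authenticated, generates a UUID as the document ID, writes the upload to a temporary file, extracts the text with the document processor, and stores the extracted text under the authenticated user's email. Which formats are accepted (for example PDF, DOCX, or TXT) depends on the document processor; the endpoint itself does not check file type or size.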

Source Code

def api_chat_upload_document():
    """Upload document for chat context"""
    try:
        if 'file' not in request.files:
            return jsonify({'error': 'No file provided'}), 400
        
        file = request.files['file']
        if file.filename == '':
            return jsonify({'error': 'No file selected'}), 400
        
        user_email = get_user_email()
        if not user_email:
            return jsonify({'error': 'User not authenticated'}), 401
        
        # Generate unique document ID
        document_id = str(uuid.uuid4())
        
        try:
            # Process the document using the document processor
            if not document_processor:
                return jsonify({'error': 'Document processor not available'}), 500
            
            # Save file content to temporary file for processing
            with tempfile.NamedTemporaryFile(delete=False, suffix=os.path.splitext(file.filename)[1]) as temp_file:
                file_content = file.read()
                temp_file.write(file_content)
                temp_file.flush()
                
                try:
                    # Process the document
                    processed_result = document_processor.process_document(temp_file.name)
                    
                    if 'error' in processed_result:
                        return jsonify({'error': f'Document processing failed: {processed_result["error"]}'}), 400
                    
                    # Extract combined text content
                    extracted_content = document_processor.get_combined_text(processed_result)
                    
                    if not extracted_content or not extracted_content.strip():
                        return jsonify({'error': 'Could not extract text from document'}), 400
                    
                finally:
                    # Clean up temp file
                    try:
                        os.unlink(temp_file.name)
                    except:
                        pass
            
            # Store the document
            store_uploaded_document(
                user_email=user_email,
                document_id=document_id,
                name=file.filename,
                content=extracted_content,
                file_type=file.content_type or 'application/octet-stream'
            )
            
            return jsonify({
                'document_id': document_id,
                'name': file.filename,
                'size': len(file_content),
                'type': file.content_type,
                'content_length': len(extracted_content),
                'message': 'Document uploaded successfully'
            })
            
        except Exception as e:
            logger.error(f"Document processing error: {e}")
            return jsonify({'error': f'Failed to process document: {str(e)}'}), 500
            
    except Exception as e:
        logger.error(f"Upload document error: {e}")
        return jsonify({'error': 'Failed to upload document'}), 500
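
One detail in the listing above is worth illustrating: the temporary file is processed and unlinked while the `with` block still holds it open. That works on POSIX systems, but on Windows deleting a file that is still open typically fails, and the bare `except` would hide the failure and leave the file behind. The following is a minimal sketch of a more portable variant, not the project's code; `process_via_temp_file` and its `processor` argument are hypothetical names introduced here.

```python
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def process_via_temp_file(data: bytes, suffix: str, processor: Callable[[str], T]) -> T:
    """Write data to a temp file, close it, run processor(path), then delete the file.

    Hypothetical helper: `processor` stands in for a call such as
    document_processor.process_document from the listing above.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        # Close the handle before the processor opens the file by name.
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        return processor(path)
    finally:
        # Remove the file whether processing succeeded or raised.
        try:
            os.unlink(path)
        except OSError:
            pass
```

Catching `OSError` instead of using a bare `except` keeps the cleanup forgiving without also swallowing unrelated exceptions such as `KeyboardInterrupt`.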

Return Value

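Returns a JSON response with an HTTP status code. On success (200): {'document_id': str, 'name': str, 'size': int, 'type': str or null, 'content_length': int, 'message': str}, where 'size' is the uploaded file's size in bytes, 'content_length' is the character count of the extracted text, and 'type' is the client-supplied content type (it may be null if the client sent none). On error: {'error': str} with status 400 (no file, empty filename, a failure reported by the document processor, or no extractable text), 401 (user not authenticated), or 500 (document processor unavailable, or an unexpected exception during processing or storage).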
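
To make the response shape concrete, here is a small sketch of how a Python caller might turn the status code and decoded JSON body into a typed result. `UploadResult`, `UploadError`, and `parse_upload_response` are hypothetical names introduced for illustration; only the JSON keys come from the source code above.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class UploadResult:
    document_id: str
    name: str
    size: int                     # uploaded file size in bytes
    content_type: Optional[str]   # the 'type' key; may be missing or None
    content_length: int           # length of the extracted text


class UploadError(Exception):
    """Raised when the endpoint answers with an error payload."""

    def __init__(self, status: int, message: str) -> None:
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def parse_upload_response(status: int, payload: Mapping[str, Any]) -> UploadResult:
    """Convert a status code and decoded JSON body into an UploadResult."""
    if status != 200 or "error" in payload:
        raise UploadError(status, str(payload.get("error", "unknown error")))
    return UploadResult(
        document_id=payload["document_id"],
        name=payload["name"],
        size=payload["size"],
        content_type=payload.get("type"),
        content_length=payload["content_length"],
    )
```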

Dependencies

flask
uuid
os
tempfile
logging

Required Imports

from flask import request, jsonify
import uuid
import os
import tempfile

Conditional/Optional Imports

These imports are only needed under specific conditions:

from document_processor import DocumentProcessor

Condition: Required for document processing functionality; must be available as a module-level 'document_processor' instance (the endpoint returns 500 if that instance is missing or falsy)

Required (conditional)

import logging

Condition: Required for error logging via 'logger' instance

Required (conditional)
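
The endpoint's `if not document_processor` check suggests the processor can be absent at runtime. How complex_app.py actually creates that instance is not shown in the source, so the following is only a sketch of one common way to load an optional dependency; `load_optional_class` is a hypothetical helper.

```python
import importlib
import logging
from typing import Optional

logger = logging.getLogger(__name__)


def load_optional_class(module_name: str, class_name: str) -> Optional[type]:
    """Return the named class, or None if the module or the class is unavailable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        logger.warning("Optional module %s unavailable: %s", module_name, exc)
        return None
    return getattr(module, class_name, None)
```

An application could call `load_optional_class("document_processor", "DocumentProcessor")` at startup and leave `document_processor` as `None` when the result is `None`, which is the case the endpoint answers with a 500.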

Usage Example

// Client-side usage example (JavaScript fetch):
const formData = new FormData();
formData.append('file', fileInput.files[0]);

fetch('/api/chat-upload-document', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer ' + authToken
  },
  body: formData
})
.then(response => response.json())
.then(data => {
  if (data.error) {
    console.error('Upload failed:', data.error);
  } else {
    console.log('Document uploaded:', data.document_id);
    console.log('Extracted content length:', data.content_length);
  }
})
.catch(error => console.error('Error:', error));

// Server-side context:
// Flask calls this function automatically when a POST request is made to /api/chat-upload-document.
// Ensure document_processor and store_uploaded_document are properly initialized before use.

Best Practices

Always validate file presence and filename before processing
Use temporary files with proper cleanup (try-finally blocks) to avoid disk space issues
Implement proper error handling at multiple levels (file validation, processing, storage)
Log errors with sufficient context for debugging
Validate extracted content is not empty before storing
Generate unique document IDs using UUID to prevent collisions
Clean up temporary files even if processing fails
Return appropriate HTTP status codes for different error scenarios
Verify user authentication before processing uploads
Check document_processor availability before attempting to use it
Consider implementing file size limits to prevent resource exhaustion
Consider implementing file type validation based on content, not just extension
Store both original file metadata and extracted content for reference
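
The two "consider" items above (size limits and content-based type validation) are not implemented by the endpoint. A minimal sketch of what such checks could look like follows; the 10 MiB cap, the function names, and the set of recognised types are all assumptions made for illustration.

```python
import codecs
from typing import Optional, Sequence

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed cap of 10 MiB; not from the source


def sniff_file_type(data: bytes) -> Optional[str]:
    """Guess a coarse file type from the leading bytes; None if unrecognised."""
    if not data:
        return None
    if data.startswith(b"%PDF-"):
        return "pdf"
    if data.startswith(b"PK\x03\x04"):
        # ZIP container signature; DOCX files are ZIP archives.
        return "zip"
    sample = data[:4096]
    if b"\x00" in sample:
        return None  # NUL bytes are a strong hint of binary content
    try:
        # final=False tolerates a multi-byte character cut off at the sample edge.
        codecs.getincrementaldecoder("utf-8")().decode(sample, final=False)
    except UnicodeDecodeError:
        return None
    return "text"


def validate_upload(data: bytes, allowed: Sequence[str] = ("pdf", "zip", "text")) -> Optional[str]:
    """Return an error message, or None when the upload passes both checks."""
    if len(data) > MAX_UPLOAD_BYTES:
        return "File too large"
    if sniff_file_type(data) not in allowed:
        return "Unsupported file type"
    return None
```

In the endpoint this check would run on `file_content` before the temporary file is written. Flask's `MAX_CONTENT_LENGTH` configuration setting can additionally reject oversized requests before the body is read.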

Similar Components

AI-powered semantic similarity - components with related functionality:

function api_get_chat_uploaded_documents 84.1% similar

Flask API endpoint that retrieves a list of documents uploaded by the authenticated user for chat functionality, returning document metadata without full content.
From: /tf/active/vicechatdev/vice_ai/complex_app.py

function api_upload_document_v1 80.8% similar

Flask API endpoint that handles document file uploads, validates file type and size, stores the file temporarily, and extracts basic text content for processing.
From: /tf/active/vicechatdev/vice_ai/new_app.py

function api_upload_document 78.9% similar

Flask API endpoint that handles document upload, validates file type and size, processes the document to extract text content, and stores the document metadata in the system.
From: /tf/active/vicechatdev/vice_ai/app.py

function api_delete_chat_uploaded_document 78.8% similar

Flask API endpoint that deletes a user's uploaded document by document ID, requiring authentication and returning success/error responses.
From: /tf/active/vicechatdev/vice_ai/complex_app.py

function upload_document 75.2% similar

Flask route handler that processes file uploads, saves them securely to disk, and indexes the document content for retrieval-augmented generation (RAG) search.
From: /tf/active/vicechatdev/docchat/blueprint.py
