function api_chat_upload_document
Flask API endpoint that handles document upload for chat context, processes the document to extract text content, and stores it for later retrieval in chat sessions.
File: /tf/active/vicechatdev/vice_ai/complex_app.py
Lines: 2259 - 2331
Complexity: complex
Purpose
This endpoint enables users to upload documents (PDF, DOCX, TXT, etc.) that will be processed and stored for use as context in chat conversations. It validates the uploaded file, extracts text content using a document processor, generates a unique document ID, and stores the processed content associated with the authenticated user's email.
Source Code
def api_chat_upload_document():
    """Upload document for chat context"""
    try:
        if 'file' not in request.files:
            return jsonify({'error': 'No file provided'}), 400

        file = request.files['file']
        if file.filename == '':
            return jsonify({'error': 'No file selected'}), 400

        user_email = get_user_email()
        if not user_email:
            return jsonify({'error': 'User not authenticated'}), 401

        # Generate unique document ID
        document_id = str(uuid.uuid4())

        try:
            # Process the document using the document processor
            if not document_processor:
                return jsonify({'error': 'Document processor not available'}), 500

            # Save file content to temporary file for processing
            with tempfile.NamedTemporaryFile(delete=False, suffix=os.path.splitext(file.filename)[1]) as temp_file:
                file_content = file.read()
                temp_file.write(file_content)
                temp_file.flush()

                try:
                    # Process the document
                    processed_result = document_processor.process_document(temp_file.name)
                    if 'error' in processed_result:
                        return jsonify({'error': f'Document processing failed: {processed_result["error"]}'}), 400

                    # Extract combined text content
                    extracted_content = document_processor.get_combined_text(processed_result)
                    if not extracted_content or not extracted_content.strip():
                        return jsonify({'error': 'Could not extract text from document'}), 400
                finally:
                    # Clean up temp file
                    try:
                        os.unlink(temp_file.name)
                    except:
                        pass

            # Store the document
            store_uploaded_document(
                user_email=user_email,
                document_id=document_id,
                name=file.filename,
                content=extracted_content,
                file_type=file.content_type or 'application/octet-stream'
            )

            return jsonify({
                'document_id': document_id,
                'name': file.filename,
                'size': len(file_content),
                'type': file.content_type,
                'content_length': len(extracted_content),
                'message': 'Document uploaded successfully'
            })
        except Exception as e:
            logger.error(f"Document processing error: {e}")
            return jsonify({'error': f'Failed to process document: {str(e)}'}), 500
    except Exception as e:
        logger.error(f"Upload document error: {e}")
        return jsonify({'error': 'Failed to upload document'}), 500
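The temporary-file handling in the listing (write the upload to disk, process it, always delete it) can be isolated into a small context manager. The following is a minimal sketch, not the code in complex_app.py: the name `temporary_copy` is hypothetical. Unlike the listing, it closes the file before handing the path to a processor and suppresses only `OSError` during cleanup instead of using a bare `except`.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def temporary_copy(data: bytes, filename: str):
    """Write data to a temp file that keeps the original extension,
    yield its path, and delete the file on exit (hypothetical helper)."""
    suffix = os.path.splitext(filename)[1]
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as temp_file:
        temp_file.write(data)
    # The file is closed at this point, so other code can reopen it by path.
    try:
        yield temp_file.name
    finally:
        # Suppress only filesystem errors; anything else still propagates.
        with contextlib.suppress(OSError):
            os.unlink(temp_file.name)


if __name__ == "__main__":
    with temporary_copy(b"hello", "notes.txt") as path:
        print(path, os.path.exists(path))
    print(os.path.exists(path))
```

Inside the endpoint, the processing calls would sit in the `with temporary_copy(...)` body, so the cleanup runs whether processing returns early or raises.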
Return Value
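Returns a JSON response with an HTTP status code. On success (200): {'document_id': str, 'name': str, 'size': int, 'type': str, 'content_length': int, 'message': str}. Here 'size' is the byte length of the uploaded file and 'content_length' is the character length of the extracted text. 'type' is passed through from file.content_type without the 'application/octet-stream' fallback that is applied when storing, so it may be null. On error: {'error': str} with status 400 (no file, empty filename, processor-reported failure, or no extractable text), 401 (user not authenticated), or 500 (document processor unavailable or an unexpected exception).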
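A caller can tell the two payload shapes apart by checking the status code and the 'error' key. The sketch below shows one way to turn a decoded response into a typed result. The names `UploadResult`, `UploadError`, and `parse_upload_response` are hypothetical and not part of the application.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UploadResult:
    document_id: str
    name: str
    size: int                    # bytes in the uploaded file
    content_type: Optional[str]  # the 'type' field
    content_length: int          # characters of extracted text


class UploadError(Exception):
    """Raised for any {'error': ...} payload or non-200 status."""

    def __init__(self, status: int, message: str):
        super().__init__(f"{status}: {message}")
        self.status = status
        self.message = message


def parse_upload_response(status: int, payload: dict) -> UploadResult:
    """Map the endpoint's JSON payload to UploadResult or raise UploadError."""
    if status != 200 or "error" in payload:
        raise UploadError(status, payload.get("error", "Unknown error"))
    return UploadResult(
        document_id=payload["document_id"],
        name=payload["name"],
        size=payload["size"],
        content_type=payload.get("type"),
        content_length=payload["content_length"],
    )
```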
Dependencies
flask, uuid, os, tempfile, logging
Required Imports
from flask import request, jsonify
import uuid
import os
import tempfile
Conditional/Optional Imports
These imports are only needed under specific conditions:
from document_processor import DocumentProcessor
Condition: Required for document processing functionality; must be available as 'document_processor' instance
Required (conditional)
import logging
Condition: Required for error logging via 'logger' instance
Required (conditional)
Usage Example
// Client-side usage example (JavaScript fetch):
const formData = new FormData();
formData.append('file', fileInput.files[0]);

fetch('/api/chat-upload-document', {
    method: 'POST',
    headers: {
        'Authorization': 'Bearer ' + authToken
    },
    body: formData
})
    .then(response => response.json())
    .then(data => {
        if (data.error) {
            console.error('Upload failed:', data.error);
        } else {
            console.log('Document uploaded:', data.document_id);
            console.log('Extracted content length:', data.content_length);
        }
    })
    .catch(error => console.error('Error:', error));

// Server-side context:
// Flask calls this function when a POST request is made to /api/chat-upload-document.
// Ensure document_processor and store_uploaded_document are initialized before use.
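The same request can be made from Python using only the standard library. This is a rough sketch: the route and bearer-token header are taken from the JavaScript example above, while `build_multipart`, `upload_document`, and the `base_url` parameter are hypothetical names introduced here. It assumes the filename contains no double quotes.

```python
import json
import mimetypes
import urllib.request
import uuid
from typing import Tuple


def build_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data.

    Returns (body, value for the Content-Type request header).
    """
    boundary = uuid.uuid4().hex
    part_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {part_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def upload_document(base_url: str, token: str, filename: str, data: bytes) -> dict:
    """POST the file under the 'file' field and return the decoded JSON body."""
    body, content_type = build_multipart("file", filename, data)
    request = urllib.request.Request(
        base_url + "/api/chat-upload-document",
        data=body,
        method="POST",
        headers={"Content-Type": content_type, "Authorization": "Bearer " + token},
    )
    # Non-2xx statuses raise urllib.error.HTTPError; the JSON error body
    # can be read from that exception object.
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

The field name must be 'file', since the endpoint looks up request.files['file'].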
Best Practices
- Always validate file presence and filename before processing
- Use temporary files with proper cleanup (try-finally blocks) to avoid disk space issues
- Implement proper error handling at multiple levels (file validation, processing, storage)
- Log errors with sufficient context for debugging
- Validate extracted content is not empty before storing
- Generate unique document IDs using UUID to prevent collisions
- Clean up temporary files even if processing fails
- Return appropriate HTTP status codes for different error scenarios
- Verify user authentication before processing uploads
- Check document_processor availability before attempting to use it
- Consider implementing file size limits to prevent resource exhaustion
- Consider implementing file type validation based on content, not just extension
- Store both original file metadata and extracted content for reference
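The two "consider" items above (size limits and content-based type checks) are not implemented in the listed source. A minimal sketch of both checks follows. The 10 MiB limit, the signature table, and the names `sniff_type` and `validate_upload` are assumptions chosen for illustration.

```python
from typing import Optional

MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # assumed limit; tune for the deployment

# Leading bytes ("magic numbers") of two common formats.
SIGNATURES = {
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",  # DOCX files are ZIP containers
}


def sniff_type(data: bytes) -> Optional[str]:
    """Return a coarse type from the leading bytes, or None if unrecognised."""
    for magic, kind in SIGNATURES.items():
        if data.startswith(magic):
            return kind
    # Treat anything that decodes cleanly as UTF-8 as plain text.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return None
    return "text"


def validate_upload(data: bytes) -> Optional[str]:
    """Return an error message, or None if the upload passes both checks."""
    if len(data) > MAX_UPLOAD_BYTES:
        return "File too large"
    if not data:
        return "File is empty"
    if sniff_type(data) is None:
        return "Unsupported file type"
    return None
```

A non-None result would map naturally to a 400 response. Flask can also reject oversized requests before the view runs through its `MAX_CONTENT_LENGTH` configuration value.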
Similar Components
AI-powered semantic similarity - components with related functionality:
- function api_get_chat_uploaded_documents (84.1% similar)
- function api_upload_document_v1 (80.8% similar)
- function api_upload_document (78.9% similar)
- function api_delete_chat_uploaded_document (78.8% similar)
- function upload_document (75.2% similar)