function store_document
Thread-safe function that stores document information (file path, text content, metadata) in a global dictionary indexed by user email and document ID.
/tf/active/vicechatdev/vice_ai/app.py
94 - 105
simple
Purpose
This function manages document storage for a multi-user Flask application by maintaining a nested dictionary structure where documents are organized by user email. It uses thread locking to ensure safe concurrent access when multiple users upload documents simultaneously. The function creates user entries on-demand and timestamps each document with creation time.
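To illustrate the concurrent-access behavior described above, the following minimal sketch stores documents from several threads at once. It redefines store_document locally so the example is self-contained; the upload_worker function, the thread count, and the example email addresses are assumptions made for illustration and are not part of app.py.

```python
from datetime import datetime
from threading import Lock, Thread

uploaded_documents = {}
document_lock = Lock()


def store_document(user_email, document_id, file_path, text_content, metadata):
    """Same logic as the function documented here."""
    with document_lock:
        if user_email not in uploaded_documents:
            uploaded_documents[user_email] = {}
        uploaded_documents[user_email][document_id] = {
            'file_path': file_path,
            'text_content': text_content,
            'metadata': metadata,
            'created_at': datetime.now(),
        }


def upload_worker(user_index, docs_per_user):
    """Hypothetical worker simulating one user uploading several documents."""
    email = f'user{user_index}@example.com'
    for n in range(docs_per_user):
        store_document(email, f'doc_{n}', f'/uploads/{user_index}_{n}.txt', 'text', {})


# Four simulated users upload 50 documents each, all at the same time.
threads = [Thread(target=upload_worker, args=(i, 50)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Each simulated user ends up with its own inner dictionary of documents.
document_counts = {email: len(docs) for email, docs in uploaded_documents.items()}
print(document_counts)
```

Because the membership check and the assignment happen inside a single `with document_lock:` block, two threads cannot both create (and thereby overwrite) the inner dictionary for the same user.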
Source Code
def store_document(user_email, document_id, file_path, text_content, metadata):
    """Store document information for a user session"""
    with document_lock:
        if user_email not in uploaded_documents:
            uploaded_documents[user_email] = {}
        uploaded_documents[user_email][document_id] = {
            'file_path': file_path,
            'text_content': text_content,
            'metadata': metadata,
            'created_at': datetime.now()
        }
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| user_email | - | - | positional_or_keyword |
| document_id | - | - | positional_or_keyword |
| file_path | - | - | positional_or_keyword |
| text_content | - | - | positional_or_keyword |
| metadata | - | - | positional_or_keyword |
Parameter Details
user_email: String representing the user's email address, used as the primary key to organize documents by user. Expected to be a valid email string.
document_id: Unique identifier for the document, typically a UUID or hash. Used as the secondary key to retrieve specific documents for a user.
file_path: String containing the file system path where the uploaded document is stored. Should be an absolute or relative path to the document file.
text_content: String containing the extracted text content from the document. This is the parsed/processed text that can be used for search, RAG, or other text processing operations.
metadata: Dictionary or object containing additional information about the document such as filename, file size, upload time, document type, or any custom metadata fields.
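Since document_id is described as typically a UUID or hash, one way to produce such an identifier is sketched below. Both helpers (new_document_id and content_document_id) are hypothetical names that do not come from app.py, and the choice of UUID4 and SHA-256 is an assumption.

```python
import hashlib
import uuid


def new_document_id():
    """Hypothetical helper: random UUID4 rendered as a string."""
    return str(uuid.uuid4())


def content_document_id(text_content):
    """Hypothetical helper: deterministic ID derived from the document text.

    The same text always yields the same ID, so storing identical content
    twice for one user would overwrite the earlier entry instead of adding
    a second one.
    """
    return hashlib.sha256(text_content.encode('utf-8')).hexdigest()


random_id = new_document_id()
hashed_id = content_document_id('This is the extracted text from the document.')
print(random_id)
print(hashed_id)
```

A random ID suits the case where every upload should be kept separately; a content hash suits the case where duplicate uploads should collapse into one entry.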
Return Value
This function returns None (implicit). It performs an in-place modification of the global 'uploaded_documents' dictionary and does not return any value.
Dependencies
- datetime
- threading
Required Imports
from datetime import datetime
from threading import Lock
Usage Example
from datetime import datetime
from threading import Lock

# Initialize required global variables
uploaded_documents = {}
document_lock = Lock()

# Define the function
def store_document(user_email, document_id, file_path, text_content, metadata):
    with document_lock:
        if user_email not in uploaded_documents:
            uploaded_documents[user_email] = {}
        uploaded_documents[user_email][document_id] = {
            'file_path': file_path,
            'text_content': text_content,
            'metadata': metadata,
            'created_at': datetime.now()
        }

# Example usage
user_email = 'user@example.com'
document_id = 'doc_12345'
file_path = '/uploads/document.pdf'
text_content = 'This is the extracted text from the document.'
metadata = {'filename': 'document.pdf', 'size': 1024, 'type': 'pdf'}
store_document(user_email, document_id, file_path, text_content, metadata)

# Verify storage
print(uploaded_documents[user_email][document_id])
Best Practices
- Always initialize 'uploaded_documents' as an empty dictionary and 'document_lock' as a Lock() object in the global scope before using this function
- Ensure document_id is unique per user to avoid overwriting existing documents
- Consider implementing a cleanup mechanism to remove old documents and prevent unbounded memory growth
- The function modifies global state while holding 'document_lock'; any other code that reads or writes 'uploaded_documents' should acquire the same lock, otherwise the protection is incomplete
- Consider adding validation for parameters (e.g., checking if user_email is a valid email format, if file_path exists)
- For production use, consider replacing the in-memory dictionary with a persistent storage solution (database, Redis, etc.)
- The created_at timestamp uses the server's local time; consider using UTC for consistency across time zones
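Two of the suggestions above, UTC timestamps and a cleanup mechanism, could be combined roughly as follows. This is a sketch under stated assumptions: store_document_utc and remove_expired_documents are hypothetical names that do not exist in app.py, and the one-hour age limit is arbitrary.

```python
from datetime import datetime, timedelta, timezone
from threading import Lock

uploaded_documents = {}
document_lock = Lock()


def store_document_utc(user_email, document_id, file_path, text_content, metadata):
    """Hypothetical variant of store_document that records a timezone-aware UTC timestamp."""
    with document_lock:
        user_docs = uploaded_documents.setdefault(user_email, {})
        user_docs[document_id] = {
            'file_path': file_path,
            'text_content': text_content,
            'metadata': metadata,
            'created_at': datetime.now(timezone.utc),
        }


def remove_expired_documents(max_age, now=None):
    """Hypothetical cleanup: delete entries older than max_age and return how many were removed."""
    now = now or datetime.now(timezone.utc)
    removed = 0
    with document_lock:
        # Iterate over copies of the keys so entries can be deleted safely.
        for user_email in list(uploaded_documents):
            user_docs = uploaded_documents[user_email]
            for document_id in list(user_docs):
                if now - user_docs[document_id]['created_at'] > max_age:
                    del user_docs[document_id]
                    removed += 1
            if not user_docs:
                # Drop users that have no remaining documents.
                del uploaded_documents[user_email]
    return removed


store_document_utc('user@example.com', 'doc_12345', '/uploads/document.pdf', 'text', {})

# Pretend two hours have passed so the one-hour limit is exceeded.
later = datetime.now(timezone.utc) + timedelta(hours=2)
removed_count = remove_expired_documents(timedelta(hours=1), now=later)
print(removed_count, uploaded_documents)
```

The cleanup only removes in-memory entries; deleting the files referenced by file_path would be a separate step. Timezone-aware and naive datetimes cannot be subtracted from each other, so the two timestamp styles should not be mixed in the same dictionary.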
Similar Components
AI-powered semantic similarity - components with related functionality:
- function store_uploaded_document (88.6% similar)
- function get_user_documents (83.9% similar)
- function store_uploaded_document_v1 (83.2% similar)
- function get_uploaded_document_v1 (75.4% similar)
- function get_uploaded_document (73.8% similar)