function load_document_from_file
Loads a document from a JSON file stored in a documents directory, deserializes it into a ComplexDocument object, and returns it.
/tf/active/vicechatdev/vice_ai/complex_app.py
85 - 97
simple
Purpose
This function retrieves persisted document data from the file system. It builds the path '{doc_id}.json' under DOCUMENTS_DIR, reads the JSON file if it exists, and deserializes the data into a ComplexDocument object using the from_dict class method. Any exception raised along the way is caught and logged, and the function returns None instead of raising. It's used for document persistence and retrieval in a document management system, likely part of a Flask web application with RAG (Retrieval-Augmented Generation) capabilities.
Source Code
def load_document_from_file(doc_id):
    """Load document from file"""
    try:
        file_path = os.path.join(DOCUMENTS_DIR, f"{doc_id}.json")
        if os.path.exists(file_path):
            with open(file_path, 'r') as f:
                data = json.load(f)
            document = ComplexDocument.from_dict(data)
            logger.info(f"📂 Document loaded from file: {doc_id}")
            return document
    except Exception as e:
        logger.error(f"❌ Failed to load document {doc_id}: {e}")
    return None
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| doc_id | - | - | positional_or_keyword |
Parameter Details
doc_id: A unique identifier for the document to be loaded. This is used to construct the filename as '{doc_id}.json' in the DOCUMENTS_DIR directory. Expected to be a string, typically a UUID or other unique identifier.
Return Value
Returns a ComplexDocument object if the file exists and is successfully loaded and deserialized. Returns None if the file doesn't exist, if there's an error during loading/parsing, or if deserialization fails. The ComplexDocument type is a custom class that must have a from_dict class method for deserialization.
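Because a missing file and a failed load both produce None, a caller that needs to tell the two cases apart has to inspect the file system itself. The following is a minimal sketch, not part of the original module: `explain_missing_document` is a hypothetical helper, and its `documents_dir` argument stands in for the module-level DOCUMENTS_DIR.

```python
import os


def explain_missing_document(doc_id, documents_dir):
    """Hypothetical helper: call after load_document_from_file returned None.

    Returns "missing" if no JSON file exists for doc_id, otherwise
    "unreadable" (the file exists, so reading or deserialization failed).
    """
    file_path = os.path.join(documents_dir, f"{doc_id}.json")
    if not os.path.exists(file_path):
        return "missing"
    return "unreadable"
```

The check runs after the load attempt, so another process could create or delete the file in between; the result is best treated as a diagnostic hint rather than a guarantee.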
Dependencies
os, json, logging
Required Imports
import os
import json
import logging
Usage Example
import os
import json
import logging

# Setup
DOCUMENTS_DIR = './documents'
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
logger.addHandler(logging.StreamHandler())

# Assuming ComplexDocument class exists
class ComplexDocument:
    def __init__(self, id, content):
        self.id = id
        self.content = content

    @classmethod
    def from_dict(cls, data):
        return cls(data['id'], data['content'])

# Create documents directory if it doesn't exist
os.makedirs(DOCUMENTS_DIR, exist_ok=True)

# Use the function (assumed to be defined in this same module, as shown
# under Source Code, so that it sees DOCUMENTS_DIR, logger and ComplexDocument)
doc_id = 'doc_12345'
document = load_document_from_file(doc_id)
if document:
    print(f'Successfully loaded document: {document.id}')
else:
    print('Document not found or failed to load')
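The example above never writes a file, so it only shows the "not found" branch. The sketch below exercises all three outcomes (loaded, missing, corrupted) in a temporary directory. It is an illustration under assumptions: `load_document_from_dir` is a hypothetical copy of the documented logic that takes the directory as an argument, `demo_outcomes` is a hypothetical driver, and the ComplexDocument stand-in has only the two fields used in the example above.

```python
import json
import logging
import os
import tempfile

logger = logging.getLogger(__name__)


class ComplexDocument:
    """Stand-in with two fields; the real class's shape is an assumption."""

    def __init__(self, id, content):
        self.id = id
        self.content = content

    @classmethod
    def from_dict(cls, data):
        return cls(data["id"], data["content"])


def load_document_from_dir(doc_id, documents_dir):
    """Hypothetical variant of the documented function with an explicit directory."""
    try:
        file_path = os.path.join(documents_dir, f"{doc_id}.json")
        if os.path.exists(file_path):
            with open(file_path, "r") as f:
                data = json.load(f)
            return ComplexDocument.from_dict(data)
    except Exception as e:
        logger.error(f"Failed to load document {doc_id}: {e}")
    return None


def demo_outcomes():
    """Return the results for a valid file, a missing file and a corrupted file."""
    with tempfile.TemporaryDirectory() as documents_dir:
        # A well-formed document file.
        with open(os.path.join(documents_dir, "doc_ok.json"), "w") as f:
            json.dump({"id": "doc_ok", "content": "hello"}, f)
        # A file that is not valid JSON.
        with open(os.path.join(documents_dir, "doc_bad.json"), "w") as f:
            f.write("{not valid json")

        loaded = load_document_from_dir("doc_ok", documents_dir)
        missing = load_document_from_dir("doc_absent", documents_dir)
        corrupted = load_document_from_dir("doc_bad", documents_dir)
        return loaded, missing, corrupted
```

Calling `demo_outcomes()` yields a ComplexDocument for the valid file and None for the other two, with an error logged only for the corrupted file.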
Best Practices
- Ensure DOCUMENTS_DIR exists and is readable before calling this function (the function only reads; it never writes to the directory)
- The ComplexDocument class must implement a from_dict() class method that can reconstruct the object from a dictionary
- Handle the None return value appropriately in calling code to account for missing or corrupted files
- Consider implementing file locking if multiple processes/threads might access the same document files concurrently
- The function never raises: read and deserialization errors are logged and None is returned, so check the logs for details. A missing file also returns None but is not logged
- Validate doc_id to prevent directory traversal attacks if it comes from user input
- Consider adding file validation (e.g., JSON schema validation) before deserialization for production use
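The doc_id validation suggested above could look like the following sketch. Everything here is an assumption rather than part of the original module: the helper names `is_safe_doc_id` and `resolve_document_path` are hypothetical, and the allowed character set (letters, digits, underscore, hyphen) is a guess that happens to cover UUIDs and IDs such as 'doc_12345'.

```python
import os
import re

# Assumption: valid IDs contain only letters, digits, underscores and hyphens.
_SAFE_DOC_ID = re.compile(r"[A-Za-z0-9_-]{1,128}")


def is_safe_doc_id(doc_id):
    """Return True if doc_id is a string made only of allowed characters."""
    return isinstance(doc_id, str) and _SAFE_DOC_ID.fullmatch(doc_id) is not None


def resolve_document_path(doc_id, documents_dir):
    """Return the JSON path for doc_id, or None if the ID is rejected."""
    if not is_safe_doc_id(doc_id):
        return None
    base = os.path.realpath(documents_dir)
    path = os.path.realpath(os.path.join(base, f"{doc_id}.json"))
    # Defense in depth: the resolved path must stay inside the base directory.
    if os.path.commonpath([base, path]) != base:
        return None
    return path
```

An allowlist is used instead of stripping "../" sequences because it rejects path separators and other unexpected input in one step; the `commonpath` check is a second layer in case the allowed character set is later widened.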
Similar Components
AI-powered semantic similarity - components with related functionality:
- function load_all_documents (72.9% similar)
- function save_document_to_file (64.8% similar)
- function get_document_v7 (61.5% similar)
- function load_session_from_disk (61.2% similar)
- function load_chat_session_from_file (59.6% similar)