function compare_document_versions
Compares two document versions by their UIDs and generates a summary of changes including metadata differences and hash comparisons.
/tf/active/vicechatdev/CDocs/utils/document_processor.py
430 - 481
moderate
Purpose
This function supports document version control in the CDocs system by comparing two versions of a document. It retrieves both versions by their unique identifiers, compares their content hashes to detect changes, records metadata differences (currently only the title), and returns a structured dictionary with the comparison results. This is useful for audit trails, change tracking, and displaying version history to users.
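The hash-based change detection described above can be sketched in isolation. This is a minimal illustration, not the CDocs implementation: `content_changed` is a hypothetical helper name, and the SHA-256 digests are only an assumed example of what a version hash might look like. It mirrors the rule in the source, where two versions count as unchanged only when both hashes are present and equal.

```python
import hashlib
from typing import Optional


def content_changed(old_hash: Optional[str], new_hash: Optional[str]) -> bool:
    """Hypothetical helper mirroring the 'changed' rule in compare_document_versions."""
    # With both hashes present, the versions differ exactly when the hashes differ.
    if old_hash and new_hash:
        return old_hash != new_hash
    # A missing (None or empty) hash gives no evidence of equality, so report a change.
    return True


# Assumed example hashes; the hashing scheme CDocs actually uses is not shown on this page.
h1 = hashlib.sha256(b"draft text").hexdigest()
h2 = hashlib.sha256(b"revised text").hexdigest()

print(content_changed(h1, h1))    # False: identical hashes
print(content_changed(h1, h2))    # True: hashes differ
print(content_changed(h1, None))  # True: one hash is missing
```

Treating a missing hash as "changed" is the conservative choice: a caller may re-check a document that did not change, but will not skip one that did.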
Source Code
def compare_document_versions(old_version_uid: str, new_version_uid: str) -> Dict[str, Any]:
    """
    Compare two document versions and generate a summary of changes.

    Args:
        old_version_uid: UID of older version
        new_version_uid: UID of newer version

    Returns:
        Dictionary with comparison results
    """
    try:
        # Import here to avoid circular imports
        from CDocs.models.document import DocumentVersion

        # Get document versions
        old_version = DocumentVersion(uid=old_version_uid)
        new_version = DocumentVersion(uid=new_version_uid)
        if not old_version or not new_version:
            return {'error': 'One or both versions not found'}

        # Basic comparison info
        comparison = {
            'old_version': old_version.version_number,
            'new_version': new_version.version_number,
            'changed': old_version.hash != new_version.hash if old_version.hash and new_version.hash else True,
            'changes': []
        }

        # Add document metadata changes
        old_doc = old_version.document
        new_doc = new_version.document
        if old_doc and new_doc:
            if old_doc.title != new_doc.title:
                comparison['changes'].append({
                    'type': 'metadata',
                    'field': 'title',
                    'old': old_doc.title,
                    'new': new_doc.title
                })

        # Add explicit change summary if available
        if new_version.change_summary:
            comparison['change_summary'] = new_version.change_summary

        return comparison
    except Exception as e:
        logger.error(f"Error comparing document versions: {e}")
        return {'error': f"Error comparing versions: {e}"}
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| old_version_uid | str | - | positional_or_keyword |
| new_version_uid | str | - | positional_or_keyword |
Parameter Details
old_version_uid: String containing the unique identifier (UID) of the older document version to compare. This should be a valid UID that exists in the DocumentVersion model. The function will attempt to retrieve this version from the database.
new_version_uid: String containing the unique identifier (UID) of the newer document version to compare. This should be a valid UID that exists in the DocumentVersion model. The function will attempt to retrieve this version from the database.
Return Value
Type: Dict[str, Any]
Returns a dictionary containing the comparison results. On success, the dictionary includes: 'old_version' (version number of the old version), 'new_version' (version number of the new version), 'changed' (boolean; True if the content hashes differ, and also True if either hash is missing), 'changes' (list of dictionaries describing specific changes, each with 'type', 'field', 'old', and 'new' keys; currently only title changes are recorded), and optionally 'change_summary' (present only when the new version has a non-empty change summary). On error, the dictionary contains a single 'error' key whose value is an error message string.
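The two result shapes can be handled with a small formatter. The sketch below is illustrative only: `format_comparison` is a hypothetical helper, and the sample dictionaries are hand-written to match the keys listed above rather than produced by a real CDocs call. The string version numbers are likewise an assumption.

```python
from typing import Any, Dict, List


def format_comparison(result: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: turn a comparison result into display lines."""
    # Error results carry only the 'error' key, so handle them first.
    if 'error' in result:
        return [f"Error: {result['error']}"]

    lines = [
        f"Version {result['old_version']} -> {result['new_version']}",
        f"Content changed: {result['changed']}",
    ]
    for change in result.get('changes', []):
        lines.append(
            f"{change['type']} {change['field']}: {change['old']!r} -> {change['new']!r}"
        )
    # 'change_summary' is optional, so use .get() instead of indexing.
    summary = result.get('change_summary')
    if summary:
        lines.append(f"Summary: {summary}")
    return lines


# Hand-written sample results (assumed values, for illustration only).
success = {
    'old_version': '1.0',
    'new_version': '1.1',
    'changed': True,
    'changes': [
        {'type': 'metadata', 'field': 'title', 'old': 'Draft SOP', 'new': 'Final SOP'}
    ],
    'change_summary': 'Renamed document',
}
failure = {'error': 'One or both versions not found'}

for line in format_comparison(success) + format_comparison(failure):
    print(line)
```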
Dependencies
- logging
- typing
- CDocs.models.document
Required Imports
from typing import Dict, Any
import logging
Conditional/Optional Imports
These imports are only needed under specific conditions:
from CDocs.models.document import DocumentVersion
Condition: imported lazily inside the function to avoid circular imports; always needed when function executes
Required (conditional)
Usage Example
# Assuming CDocs environment is set up and logger is configured
from typing import Dict, Any
import logging

# Import path inferred from the file location shown above (an assumption)
from CDocs.utils.document_processor import compare_document_versions

logger = logging.getLogger(__name__)

# Compare two document versions
old_uid = "abc123-old-version-uid"
new_uid = "def456-new-version-uid"
result = compare_document_versions(old_uid, new_uid)

if 'error' in result:
    print(f"Error: {result['error']}")
else:
    print(f"Comparing version {result['old_version']} to {result['new_version']}")
    print(f"Document changed: {result['changed']}")
    if result['changes']:
        print("Changes detected:")
        for change in result['changes']:
            print(f"  {change['field']}: '{change['old']}' -> '{change['new']}'")
    if 'change_summary' in result:
        print(f"Summary: {result['change_summary']}")
Best Practices
- Always check for the 'error' key in the returned dictionary before accessing other fields
- Ensure both UIDs are valid and exist in the database before calling this function
- The function uses lazy imports to avoid circular dependencies; ensure CDocs.models.document is available at runtime
- The 'changed' field relies on hash comparison; if hashes are not available, it defaults to True
- Currently only compares title metadata; extend the metadata comparison logic for additional fields as needed
- The function catches all exceptions, logs them, and returns an error dictionary instead of raising; callers do not need a try/except around the call but must inspect the result for failures
- The function requires a configured logger instance in the module scope for error logging
- Consider validating UID format before passing to this function to avoid unnecessary database queries
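One way to extend the metadata comparison beyond the title, as suggested above, is to loop over a list of field names. The following is a sketch under stated assumptions: `diff_metadata` and the `doc_type` and `department` fields are hypothetical, and `SimpleNamespace` objects stand in for the real document objects, whose attributes other than `title` are not shown on this page.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List


def diff_metadata(old_doc: Any, new_doc: Any, fields: Iterable[str]) -> List[Dict[str, Any]]:
    """Hypothetical helper: build 'changes' entries for every field that differs."""
    changes = []
    for field in fields:
        # getattr with a default tolerates objects that lack a given attribute.
        old_value = getattr(old_doc, field, None)
        new_value = getattr(new_doc, field, None)
        if old_value != new_value:
            changes.append(
                {'type': 'metadata', 'field': field, 'old': old_value, 'new': new_value}
            )
    return changes


# Stand-in documents; only 'title' is known to exist on the real model.
old_doc = SimpleNamespace(title="Cleaning SOP", doc_type="SOP", department="QA")
new_doc = SimpleNamespace(title="Cleaning SOP v2", doc_type="SOP", department="Production")

changes = diff_metadata(old_doc, new_doc, ["title", "doc_type", "department"])
for change in changes:
    print(change)
```

Each entry uses the same 'type', 'field', 'old', and 'new' keys as the existing title comparison, so code that already iterates over 'changes' would not need to change.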
Similar Components
AI-powered semantic similarity - components with related functionality:
- function get_document_v2: 74.9% similar
- function get_document_v6: 72.8% similar
- function get_document_v5: 66.8% similar
- function get_document: 65.3% similar
- function get_document_v1: 65.1% similar