function compare_document_versions

Maturity: 59

Compares two document versions by their UIDs and generates a summary of changes including metadata differences and hash comparisons.

File: /tf/active/vicechatdev/CDocs/utils/document_processor.py
Lines: 430-481
Complexity: moderate

Purpose

This function supports document version control by comparing two versions of a document stored in the CDocs system. It loads both versions by their unique identifiers, compares their content hashes to detect changes, records metadata differences (currently only the document title), and returns a structured dictionary with the comparison results. This is useful for audit trails, change tracking, and displaying version history to users.

Source Code

def compare_document_versions(old_version_uid: str, new_version_uid: str) -> Dict[str, Any]:
    """
    Compare two document versions and generate a summary of changes.
    
    Args:
        old_version_uid: UID of older version
        new_version_uid: UID of newer version
        
    Returns:
        Dictionary with comparison results
    """
    try:
        # Import here to avoid circular imports
        from CDocs.models.document import DocumentVersion
        
        # Get document versions
        old_version = DocumentVersion(uid=old_version_uid)
        new_version = DocumentVersion(uid=new_version_uid)
        
        if not old_version or not new_version:
            return {'error': 'One or both versions not found'}
            
        # Basic comparison info
        comparison = {
            'old_version': old_version.version_number,
            'new_version': new_version.version_number,
            'changed': old_version.hash != new_version.hash if old_version.hash and new_version.hash else True,
            'changes': []
        }
        
        # Add document metadata changes
        old_doc = old_version.document
        new_doc = new_version.document
        
        if old_doc and new_doc:
            if old_doc.title != new_doc.title:
                comparison['changes'].append({
                    'type': 'metadata',
                    'field': 'title',
                    'old': old_doc.title,
                    'new': new_doc.title
                })
                
        # Add explicit change summary if available
        if new_version.change_summary:
            comparison['change_summary'] = new_version.change_summary
            
        return comparison
        
    except Exception as e:
        logger.error(f"Error comparing document versions: {e}")
        return {'error': f"Error comparing versions: {e}"}

Parameters

Name             Type  Default  Kind
old_version_uid  str   -        positional_or_keyword
new_version_uid  str   -        positional_or_keyword

Parameter Details

old_version_uid: Unique identifier (UID) of the older document version. It should refer to an existing DocumentVersion record, which the function attempts to load from the database.

new_version_uid: Unique identifier (UID) of the newer document version. It should refer to an existing DocumentVersion record, which the function attempts to load from the database.

Return Value

Type: Dict[str, Any]

Returns a dictionary of comparison results. On success it contains: 'old_version' (version number of the older version), 'new_version' (version number of the newer version), 'changed' (boolean; True if the content hashes differ, and also True whenever either hash is missing), 'changes' (list of dictionaries describing specific changes, each with 'type', 'field', 'old', and 'new' keys; currently only title changes are recorded), and optionally 'change_summary' (present only when the new version has one). On error it returns a dictionary with a single 'error' key holding the error message.
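
The rule behind the 'changed' flag can be isolated for illustration. The sketch below mirrors the conditional expression in the source code; `content_changed` is a hypothetical helper name and is not part of CDocs.

```python
from typing import Optional


def content_changed(old_hash: Optional[str], new_hash: Optional[str]) -> bool:
    """Mirror the 'changed' rule: compare hashes only when both are present."""
    if old_hash and new_hash:
        return old_hash != new_hash
    # A missing or empty hash cannot prove the content is identical,
    # so the versions are conservatively reported as changed.
    return True


print(content_changed("abc", "abc"))  # both hashes present and equal
print(content_changed("abc", "xyz"))  # both hashes present and different
print(content_changed(None, "xyz"))   # one hash missing
```

Because a missing hash yields True, a 'changed' value of True does not by itself prove that the content differs.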

Dependencies

  • logging
  • typing
  • CDocs.models.document

Required Imports

from typing import Dict, Any
import logging

Conditional/Optional Imports

These imports are only needed under specific conditions:

from CDocs.models.document import DocumentVersion

Condition: imported lazily inside the function body to avoid circular imports. The import is required every time the function runs.

Usage Example

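# Assuming the CDocs environment is set up.
# The import path is inferred from the file location listed above.
import logging

from CDocs.utils.document_processor import compare_document_versions

logging.basicConfig(level=logging.INFO)  # make logged comparison errors visible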

# Compare two document versions
old_uid = "abc123-old-version-uid"
new_uid = "def456-new-version-uid"

result = compare_document_versions(old_uid, new_uid)

if 'error' in result:
    print(f"Error: {result['error']}")
else:
    print(f"Comparing version {result['old_version']} to {result['new_version']}")
    print(f"Document changed: {result['changed']}")
    
    if result['changes']:
        print("Changes detected:")
        for change in result['changes']:
            print(f"  {change['field']}: '{change['old']}' -> '{change['new']}'")
    
    if 'change_summary' in result:
        print(f"Summary: {result['change_summary']}")
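
Callers that prefer exceptions over checking for the 'error' key can wrap the result. The following is a minimal sketch, assuming only the documented return shapes; `VersionComparisonError` and `unwrap_comparison` are hypothetical names, and the version numbers are illustrative stand-ins.

```python
from typing import Any, Dict


class VersionComparisonError(Exception):
    """Hypothetical exception type; not defined by CDocs."""


def unwrap_comparison(result: Dict[str, Any]) -> Dict[str, Any]:
    """Return the comparison unchanged, or raise if it is an error dictionary."""
    if 'error' in result:
        raise VersionComparisonError(result['error'])
    return result


# Stand-in dictionaries shaped like the documented return values.
ok = {'old_version': '1.0', 'new_version': '1.1', 'changed': True, 'changes': []}
failed = {'error': 'One or both versions not found'}

print(unwrap_comparison(ok)['changed'])
try:
    unwrap_comparison(failed)
except VersionComparisonError as exc:
    print(f"Comparison failed: {exc}")
```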

Best Practices

  • Always check for the 'error' key in the returned dictionary before accessing other fields
  • Ensure both UIDs are valid and exist in the database before calling this function
  • The function uses lazy imports to avoid circular dependencies; ensure CDocs.models.document is available at runtime
  • The 'changed' field relies on hash comparison; if either hash is missing, it defaults to True
  • Currently only the title is compared; extend the metadata comparison logic for additional fields as needed
  • The function catches all exceptions and returns an error dictionary instead of raising, so wrapping the call in try/except will not surface failures; inspect the result instead
  • The function logs errors through a module-level logger, which must be defined in the module that contains the function
  • Consider validating UID format before passing to this function to avoid unnecessary database queries
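
One way to extend the metadata comparison beyond the title is to loop over a list of field names. This is a sketch under stated assumptions, not the CDocs implementation: `diff_metadata` is a hypothetical helper, the documents are stand-in objects, and 'department' is an illustrative field that may not exist on the real document model.

```python
from types import SimpleNamespace
from typing import Any, Dict, Iterable, List


def diff_metadata(old_doc: Any, new_doc: Any, fields: Iterable[str]) -> List[Dict[str, Any]]:
    """Build 'changes' entries for every listed field whose value differs."""
    changes = []
    for field in fields:
        # getattr with a default keeps the sketch safe if a field is absent.
        old_value = getattr(old_doc, field, None)
        new_value = getattr(new_doc, field, None)
        if old_value != new_value:
            changes.append({'type': 'metadata', 'field': field,
                            'old': old_value, 'new': new_value})
    return changes


# Stand-in objects; 'department' is a hypothetical field used for illustration.
old_doc = SimpleNamespace(title="Cleaning SOP", department="QA")
new_doc = SimpleNamespace(title="Cleaning SOP v2", department="QA")

for change in diff_metadata(old_doc, new_doc, ['title', 'department']):
    print(change)
```

Each entry uses the same 'type', 'field', 'old', and 'new' keys as the existing title comparison, so consumers of 'changes' would not need to change.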

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function get_document_v2 74.9% similar

    Retrieves detailed information about a specific document version by its UID, including associated document context and version status.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function get_document_v6 72.8% similar

    Retrieves all versions of a document from the database given its unique identifier (UID).

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function get_document_v5 66.8% similar

    Retrieves all versions of a controlled document from a Neo4j graph database, including metadata about which version is current.

    From: /tf/active/vicechatdev/CDocs/db/db_operations.py
  • function get_document 65.3% similar

    Retrieves comprehensive details of a controlled document by its UID, with optional inclusion of version history, review cycles, and approval cycles.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function get_document_v1 65.1% similar

    Retrieves comprehensive details of a controlled document by its UID, including optional version history, review cycles, and approval workflows.

    From: /tf/active/vicechatdev/CDocs/controllers/document_controller.py