function prepare_audit_data_for_document_processor

Maturity: 61

Prepares comprehensive audit data for a controlled document version, aggregating information from document history, reviews, approvals, and audit events into a structured format for DocumentProcessor.

File: /tf/active/vicechatdev/CDocs/controllers/document_controller.py
Lines: 1735 - 2135
Complexity: complex

Purpose

This function serves as a data aggregation and formatting layer that collects all audit-related information for a controlled document version. It retrieves and formats document metadata, revision history, review cycles, approval cycles, and audit events from a Neo4j database, handling various date formats and potential data inconsistencies. The output is structured specifically for PDF generation and document archival purposes, ensuring all compliance and traceability information is properly captured.
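The date handling mentioned above can be illustrated in isolation. The following is a minimal standalone sketch, not the function's own helper: `normalize_date` and `FakeNeo4jDate` are hypothetical names, and the sketch assumes that plain Python `datetime` values should honour the requested format while other date-like objects (such as `neo4j.time.DateTime`) are reduced to their date part.

```python
import re
from datetime import datetime


def normalize_date(date_value, format_str="%Y-%m-%d"):
    """Return a formatted date string, or "" when the value cannot be parsed."""
    if not date_value:
        return ""
    # Plain datetime objects honour the requested format.
    if isinstance(date_value, datetime):
        return date_value.strftime(format_str)
    # Duck-typed date-like objects (assumed to cover neo4j.time.DateTime).
    if all(hasattr(date_value, attr) for attr in ("year", "month", "day")):
        return f"{date_value.year:04d}-{date_value.month:02d}-{date_value.day:02d}"
    if isinstance(date_value, str):
        try:
            parsed = datetime.fromisoformat(date_value.replace("Z", "+00:00"))
            return parsed.strftime(format_str)
        except ValueError:
            # Fall back to the leading date part if the string looks like one.
            if re.match(r"\d{4}-\d{2}-\d{2}", date_value):
                return date_value.split("T")[0]
    return ""


class FakeNeo4jDate:
    """Hypothetical stand-in exposing only year/month/day."""
    year, month, day = 2024, 3, 5


print(normalize_date(datetime(2024, 3, 5, 14, 30)))   # 2024-03-05
print(normalize_date("2024-03-05T14:30:00Z"))         # 2024-03-05
print(normalize_date(FakeNeo4jDate()))                # 2024-03-05
print(repr(normalize_date("not a date")))             # ''
```

Returning an empty string instead of raising keeps one malformed timestamp from aborting the whole aggregation, which matches the tolerant style of the function documented here.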

Source Code

def prepare_audit_data_for_document_processor(document: ControlledDocument, version: DocumentVersion, user: DocUser) -> Dict[str, Any]:
    """
    Prepare audit data in the format expected by the DocumentProcessor.
    
    Parameters
    ----------
    document : ControlledDocument
        The document being converted
    version : DocumentVersion
        The document version being converted
    user : DocUser
        User performing the conversion
    
    Returns
    -------
    Dict[str, Any]
        Audit data in the format expected by DocumentProcessor
    """
    # Get document audit trail using the correct function
    audit_events = audit_trail.get_document_history(document.uid)
    logger = logging.getLogger('CDocs.controllers.document_controller')
    
    # Helper function to safely format dates from various types including neo4j.time.DateTime
    def format_date(date_value, format_str="%Y-%m-%d"):
        if not date_value:
            return ""

        # Handle Python datetime objects first: they also expose year/month/day,
        # so the duck-typed branch below would otherwise ignore format_str
        if isinstance(date_value, datetime):
            return date_value.strftime(format_str)

        # Handle neo4j.time.DateTime and other date-like objects (date part only)
        if hasattr(date_value, 'year') and hasattr(date_value, 'month') and hasattr(date_value, 'day'):
            try:
                return f"{date_value.year:04d}-{date_value.month:02d}-{date_value.day:02d}"
            except (AttributeError, TypeError, ValueError):
                pass

        # Handle string dates
        if isinstance(date_value, str):
            # Try to parse the string date
            try:
                dt = datetime.fromisoformat(date_value.replace('Z', '+00:00'))
                return dt.strftime(format_str)
            except (ValueError, AttributeError):
                # If it already looks like a date string, return its date part
                if re.match(r'\d{4}-\d{2}-\d{2}', date_value):
                    return date_value.split('T')[0]

        # Return empty string for unparseable dates
        return ""

    # Format date strings
    created_date_str = format_date(version.created_date) or datetime.now().strftime("%Y-%m-%d")
    effective_date_str = format_date(version.effective_date)
    
    # Get all version history for revision history
    all_versions = []
    try:
        all_versions = document.get_all_versions()
    except Exception as e:
        logger.warning(f"Error getting document versions: {e}")
        # Try to get versions through direct DB query as fallback
        try:
            version_uids = db.run_query(
                """
                MATCH (d:ControlledDocument {UID: $doc_uid})-[:HAS_VERSION]->(v:DocumentVersion)
                RETURN v.UID as uid
                ORDER BY v.versionNumber DESC
                """,
                {"doc_uid": document.uid}
            )
            if version_uids:
                all_versions = [DocumentVersion(uid=record['uid']) for record in version_uids]
        except Exception as e2:
            logger.warning(f"Fallback for versions also failed: {e2}")
    
    # Format revision history
    revision_history = []
    for ver in all_versions:
        try:
            # Format date using helper function
            ver_date = format_date(ver.created_date) or datetime.now().strftime("%Y-%m-%d")
                    
            author_name = "Unknown"
            if hasattr(ver, 'author') and ver.author:
                author_name = ver.author.name
            elif hasattr(ver, 'created_by_name') and ver.created_by_name:
                author_name = ver.created_by_name
            
            # Get change summary, handling different property names
            change_summary = None
            if hasattr(ver, 'change_summary'):
                change_summary = ver.change_summary
            elif hasattr(ver, 'comment'):
                change_summary = ver.comment
                
            revision_history.append({
                "version": ver.version_number,
                "date": ver_date,
                "author": author_name,
                "changes": change_summary or f"Version {ver.version_number} created"
            })
        except Exception as e:
            logger.warning(f"Error processing version history for version {ver.uid}: {e}")
    
    # Get review cycles directly for the current version using the model
    from CDocs.models.review import ReviewCycle
    review_cycles = []
    
    try:
        # Query review cycles linked to this specific version
        result = db.run_query(
            """
            MATCH (v:DocumentVersion {UID: $version_uid})-[:FOR_REVIEW]-(r:ReviewCycle)
            RETURN r.UID as review_id
            """,
            {"version_uid": version.uid}
        )
        
        # Process each review cycle
        for record in result:
            if record.get("review_id"):
                try:
                    review_cycle = ReviewCycle(uid=record.get("review_id"))
                    if review_cycle:
                        review_cycles.append(review_cycle)
                except Exception as e:
                    logger.warning(f"Error loading review cycle {record.get('review_id')}: {e}")
    except Exception as e:
        logger.warning(f"Error retrieving review cycles for version {version.uid}: {e}")
    
    # Format the review information
    formatted_reviews = []
    for review_cycle in review_cycles:
        try:
            # Get reviewer assignments for this cycle - safely handling potential errors
            reviewer_assignments = []
            try:
                if hasattr(review_cycle, 'get_reviewer_assignments'):
                    reviewer_assignments = review_cycle.get_reviewer_assignments() or []
            except Exception as e:
                logger.warning(f"Error getting reviewer assignments: {e}")
                # Alternative method if direct method fails
                try:
                    assignment_results = db.run_query(
                        """
                        MATCH (r:ReviewCycle {UID: $review_uid})-[:HAS_REVIEWER]->(ra:ReviewAssignment)
                        RETURN ra
                        """,
                        {"review_uid": review_cycle.uid}
                    )
                    if assignment_results:
                        for record in assignment_results:
                            reviewer_assignments.append(record.get('ra'))
                except Exception as e2:
                    logger.warning(f"Alternative method for reviewer assignments also failed: {e2}")
            
            for assignment in reviewer_assignments:
                # FIXED: Only include date when review is actually completed
                # Review date should be empty if review is not completed
                review_date = ""
                decision = "PENDING"
                if hasattr(assignment, 'decision') and assignment.decision:
                    decision = str(assignment.decision).upper()
                    # Only set review_date if there's a decision (APPROVED/REJECTED)
                    if decision in ["APPROVED", "REJECTED"]:
                        if hasattr(assignment, 'decision_date') and assignment.decision_date:
                            # Use the helper function to handle different date types
                            review_date = format_date(assignment.decision_date)
                
                # Get any comments made by this reviewer - robustly
                reviewer_comments = []
                if hasattr(review_cycle, 'comments') and review_cycle.comments:
                    for comment in review_cycle.comments:
                        if hasattr(comment, 'commenter_uid') and hasattr(assignment, 'reviewer_uid') and \
                           comment.commenter_uid == assignment.reviewer_uid:
                            comment_text = comment.text or ""
                            # Truncate comment if too long to fit in table
                            if len(comment_text) > 200:
                                comment_text = comment_text[:197] + '...'
                            reviewer_comments.append(comment_text)
                
                # Get reviewer name - handling different possible attribute names
                reviewer_name = "Unknown"
                if hasattr(assignment, 'reviewer_name') and assignment.reviewer_name:
                    reviewer_name = assignment.reviewer_name
                elif hasattr(assignment, 'reviewer_uid'):
                    try:
                        reviewer = DocUser(uid=assignment.reviewer_uid)
                        if reviewer and reviewer.name:
                            reviewer_name = reviewer.name
                    except Exception:
                        # Keep "Unknown" if the reviewer lookup fails
                        pass
                
                # FIXED: Use decision as status (not the review cycle status)
                # For the status field, show the reviewer's decision
                formatted_reviews.append({
                    "reviewer_name": reviewer_name,  # Essential for signature lookup
                    "reviewer_role": "Reviewer",  # Default role
                    "reviewer_username": assignment.reviewer_uid if hasattr(assignment, 'reviewer_uid') else "",
                    "review_date": review_date,  # FIXED: Only populated if review has decision
                    "status": decision,  # FIXED: Now shows the reviewer's decision
                    "decision": decision,
                    "comments": "; ".join(reviewer_comments) if reviewer_comments else ""
                })
        except Exception as e:
            logger.warning(f"Error processing review assignments: {e}")
    
    # Get approval cycles directly for the current version using the model
    from CDocs.models.approval import ApprovalCycle
    approval_cycles = []
    
    try:
        # Query approval cycles linked to this specific version
        result = db.run_query(
            """
            MATCH (v:DocumentVersion {UID: $version_uid})-[:FOR_APPROVAL]-(a:ApprovalCycle)
            RETURN a.UID as approval_id
            """,
            {"version_uid": version.uid}
        )
        
        # Process each approval cycle
        for record in result:
            if record.get("approval_id"):
                try:
                    approval_cycle = ApprovalCycle(uid=record.get("approval_id"))
                    if approval_cycle:
                        approval_cycles.append(approval_cycle)
                except Exception as e:
                    logger.warning(f"Error loading approval cycle {record.get('approval_id')}: {e}")
    except Exception as e:
        logger.warning(f"Error retrieving approval cycles for version {version.uid}: {e}")
    
    # Format the approval information
    formatted_approvals = []
    for approval_cycle in approval_cycles:
        try:
            # Get approver assignments for this cycle - safely handling potential errors
            approver_assignments = []
            try:
                if hasattr(approval_cycle, 'get_approver_assignments'):
                    approver_assignments = approval_cycle.get_approver_assignments() or []
            except Exception as e:
                logger.warning(f"Error getting approver assignments: {e}")
                # Alternative method if direct method fails
                try:
                    assignment_results = db.run_query(
                        """
                        MATCH (a:ApprovalCycle {UID: $approval_uid})-[:HAS_APPROVER]->(aa:ApprovalAssignment)
                        RETURN aa
                        """,
                        {"approval_uid": approval_cycle.uid}
                    )
                    if assignment_results:
                        for record in assignment_results:
                            approver_assignments.append(record.get('aa'))
                except Exception as e2:
                    logger.warning(f"Alternative method for approver assignments also failed: {e2}")
            
            for assignment in approver_assignments:
                # FIXED: Only include date when approval is actually completed
                # Approval date should be empty if approval is not completed
                approval_date = ""
                decision = "PENDING"
                if hasattr(assignment, 'decision') and assignment.decision:
                    decision = str(assignment.decision).upper()
                    # Only set approval_date if there's a decision
                    if decision in ["APPROVED", "REJECTED"]:
                        if hasattr(assignment, 'decision_date') and assignment.decision_date:
                            # Use the helper function to handle different date types
                            approval_date = format_date(assignment.decision_date)
                
                # Get sequence order to determine level (for approver role)
                approver_level = "Approver"
                if hasattr(assignment, 'sequence_order') and assignment.sequence_order:
                    approver_level = f"Level {assignment.sequence_order}"
                
                # Format comments for consistent display
                comment = ""
                if hasattr(assignment, 'decision_comments') and assignment.decision_comments:
                    comment = assignment.decision_comments
                elif hasattr(assignment, 'comments') and assignment.comments:
                    comment = assignment.comments
                
                # Truncate comment if too long to fit in table
                if len(comment) > 200:
                    comment = comment[:197] + '...'
                
                # Get approver name - handling different possible attribute names
                approver_name = "Unknown"
                if hasattr(assignment, 'approver_name') and assignment.approver_name:
                    approver_name = assignment.approver_name
                elif hasattr(assignment, 'approver_uid'):
                    try:
                        approver = DocUser(uid=assignment.approver_uid)
                        if approver and approver.name:
                            approver_name = approver.name
                    except Exception:
                        # Keep "Unknown" if the approver lookup fails
                        pass
                
                # FIXED: Use decision as status (not the approval cycle status)
                formatted_approvals.append({
                    "approver_name": approver_name,  # Essential for signature lookup
                    "approver_role": approver_level,
                    "approval_date": approval_date,  # FIXED: Only populated if approval has decision
                    "status": decision,  # FIXED: Now shows the approver's decision
                    "decision": decision,
                    "comments": comment
                })
        except Exception as e:
            logger.warning(f"Error processing approval assignments: {e}")
    
    # Format audit events for event history
    event_history = []
    for event in audit_events:
        try:
            # Format timestamp consistently using our helper function
            event_date = format_date(event.get("timestamp"), "%Y-%m-%d %H:%M:%S")
            
            # Format details
            details_str = ""
            if event.get("details"):
                if isinstance(event.get("details"), dict):
                    # Convert dictionary to string representation
                    details_str = "; ".join([f"{k}: {v}" for k, v in event.get("details").items()])
                else:
                    details_str = str(event.get("details"))
                
                # Truncate if too long
                if len(details_str) > 200:
                    details_str = details_str[:197] + '...'
            
            event_history.append({
                "date": event_date,
                "user": event.get("userName", "System"),
                "action": event.get("eventType", ""),
                "description": event.get("description", ""),
                "details": details_str
            })
        except Exception as e:
            logger.warning(f"Error processing audit event: {e}")
    
    # Get department name with fallback
    department_name = document.department
    if hasattr(document, 'get_department_name'):
        try:
            department_name = document.get_department_name()
        except Exception:
            # Keep the raw department value
            pass
    
    # Get status name with fallback
    status_name = document.status
    if hasattr(document, 'get_status_name'):
        try:
            status_name = document.get_status_name()
        except Exception:
            # Keep the raw status value
            pass
    
    # Get document type name with fallback
    doc_type_name = document.doc_type
    if hasattr(document, 'get_doc_type_name') or hasattr(document, 'doc_type_name'):
        try:
            if hasattr(document, 'get_doc_type_name'):
                doc_type_name = document.get_doc_type_name()
            else:
                doc_type_name = document.doc_type_name
        except Exception:
            # Keep the raw document type value
            pass
    
    # Get author name for the version
    author_name = user.name
    if hasattr(version, 'author') and version.author:
        author_name = version.author.name
    elif hasattr(version, 'created_by_name') and version.created_by_name:
        author_name = version.created_by_name
    
    # Build the final JSON structure according to expected format
    audit_data = {
        "document_title": document.title,
        "document_id": document.doc_number,
        "version": version.version_number,
        "author": author_name,
        "department": department_name,
        "creation_date": created_date_str,
        "effective_date": effective_date_str,
        "status": status_name,
        "doc_type": doc_type_name,
        "reviews": formatted_reviews,
        "approvals": formatted_approvals,
        "revision_history": revision_history,
        "event_history": event_history,
        "conversion_info": {
            "converted_by": user.name,
            "conversion_date": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
            "conversion_reason": "PDF conversion requested for document archival and distribution"
        }
    }
    
    return audit_data

Parameters

Name      Type                Default  Kind
document  ControlledDocument  -        positional_or_keyword
version   DocumentVersion     -        positional_or_keyword
user      DocUser             -        positional_or_keyword

Parameter Details

document: A ControlledDocument instance representing the document being converted. Must have properties like uid, title, doc_number, department, status, and doc_type. Used to retrieve document-level metadata and version history.

version: A DocumentVersion instance representing the specific version being converted. Must have properties like uid, version_number, created_date, effective_date, and author information. Used to retrieve version-specific audit data including reviews and approvals.

user: A DocUser instance representing the user performing the conversion. Must have a name property. Used to record who initiated the conversion in the audit trail.
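For unit tests it can help to see the minimal attribute surface the function reads. The stubs below are hypothetical stand-ins (they are not CDocs classes) that carry only the attributes named above, and `resolve_author` is an illustrative helper that mirrors the author fallback order in the source: `version.author.name`, then `version.created_by_name`, then `user.name`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Optional


@dataclass
class StubUser:
    name: str


@dataclass
class StubVersion:
    uid: str
    version_number: str
    created_date: Optional[datetime] = None
    effective_date: Optional[datetime] = None
    author: Optional[StubUser] = None
    created_by_name: Optional[str] = None


@dataclass
class StubDocument:
    uid: str
    title: str
    doc_number: str
    department: str
    status: str
    doc_type: str


def resolve_author(version: Any, user: Any) -> str:
    """Pick the version author, falling back to the converting user."""
    author = getattr(version, "author", None)
    if author:
        return author.name
    created_by = getattr(version, "created_by_name", None)
    if created_by:
        return created_by
    return user.name


user = StubUser(name="Converter")
version = StubVersion(uid="ver-456", version_number="1.0",
                      created_by_name="Original Author")
print(resolve_author(version, user))  # Original Author
```

Real `ControlledDocument` and `DocumentVersion` instances also provide methods such as `get_all_versions()`, which these stubs deliberately omit.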

Return Value

Type: Dict[str, Any]

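Returns a dictionary containing comprehensive audit data with keys: 'document_title', 'document_id', 'version', 'author', 'department', 'creation_date', 'effective_date', 'status', 'doc_type', 'reviews' (list of reviewer assignments with decisions), 'approvals' (list of approver assignments with decisions), 'revision_history' (list of all version changes), 'event_history' (list of audit events), and 'conversion_info' (metadata about the conversion process). Document, revision, review, and approval dates are formatted as YYYY-MM-DD strings. The 'conversion_date' inside 'conversion_info' uses YYYY-MM-DD HH:MM:SS, and 'event_history' timestamps are requested in that format as well, although date-like objects such as neo4j.time.DateTime are rendered as date only. The review and approval lists contain one entry per assignment, including pending ones: an assignment without a decision is reported as PENDING, and its date stays empty unless the decision is APPROVED or REJECTED.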
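As an illustration of the shape described above, here is a sketch of a returned dictionary. The keys follow the source code; every value is invented for illustration only and does not come from a real document.

```python
# Illustrative shape only: keys match the source, values are made up.
example_audit_data = {
    "document_title": "Example SOP",
    "document_id": "DOC-0001",
    "version": "1.0",
    "author": "Example Author",
    "department": "Quality",
    "creation_date": "2024-01-15",
    "effective_date": "",
    "status": "IN_REVIEW",
    "doc_type": "SOP",
    "reviews": [
        {
            "reviewer_name": "Example Reviewer",
            "reviewer_role": "Reviewer",
            "reviewer_username": "user-001",
            "review_date": "",        # empty until APPROVED or REJECTED
            "status": "PENDING",
            "decision": "PENDING",
            "comments": "",
        }
    ],
    "approvals": [
        {
            "approver_name": "Example Approver",
            "approver_role": "Level 1",
            "approval_date": "2024-01-20",
            "status": "APPROVED",
            "decision": "APPROVED",
            "comments": "",
        }
    ],
    "revision_history": [
        {"version": "1.0", "date": "2024-01-15",
         "author": "Example Author", "changes": "Version 1.0 created"}
    ],
    "event_history": [
        {"date": "2024-01-15 09:30:00", "user": "System",
         "action": "DOCUMENT_CREATED", "description": "", "details": ""}
    ],
    "conversion_info": {
        "converted_by": "Example Converter",
        "conversion_date": "2024-01-21 10:00:00",
        "conversion_reason": "PDF conversion requested for document archival and distribution",
    },
}

print(sorted(example_audit_data))
```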

Dependencies

  • logging
  • datetime
  • re
  • typing

Required Imports

import logging
from datetime import datetime
import re
from typing import Dict, Any
from CDocs import db
from CDocs.models.document import ControlledDocument, DocumentVersion
from CDocs.models.user_extensions import DocUser
from CDocs.utils import audit_trail

Conditional/Optional Imports

These imports are only needed under specific conditions:

from CDocs.models.review import ReviewCycle

Condition: imported within the function to retrieve review cycle data

Required (conditional)
from CDocs.models.approval import ApprovalCycle

Condition: imported within the function to retrieve approval cycle data

Required (conditional)

Usage Example

from CDocs.models.document import ControlledDocument, DocumentVersion
from CDocs.models.user_extensions import DocUser
from CDocs.controllers.document_controller import prepare_audit_data_for_document_processor

# Load document, version, and user objects
document = ControlledDocument(uid='doc-123')
version = DocumentVersion(uid='ver-456')
user = DocUser(uid='user-789')

# Prepare audit data for PDF generation
audit_data = prepare_audit_data_for_document_processor(document, version, user)

# Access structured audit information
print(f"Document: {audit_data['document_title']}")
print(f"Version: {audit_data['version']}")
# Both lists include pending assignments, so these are total counts
print(f"Reviews: {len(audit_data['reviews'])} assignments")
print(f"Approvals: {len(audit_data['approvals'])} assignments")
print(f"Revision History: {len(audit_data['revision_history'])} versions")

# Pass to DocumentProcessor for PDF generation
# processor.generate_pdf(content, audit_data)
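
Because the review and approval lists contain pending assignments as well as decided ones, a caller that wants a count of completed actions has to filter on the 'decision' field. A minimal sketch follows, with hypothetical sample data and a hypothetical helper name `split_by_completion`; it assumes that only APPROVED and REJECTED count as completed, as in the source.

```python
# Decisions the source code treats as completed.
COMPLETED_DECISIONS = {"APPROVED", "REJECTED"}


def split_by_completion(entries):
    """Split review or approval entries into (completed, pending) lists."""
    completed = [e for e in entries if e.get("decision") in COMPLETED_DECISIONS]
    pending = [e for e in entries if e.get("decision") not in COMPLETED_DECISIONS]
    return completed, pending


# Hypothetical entries shaped like audit_data['reviews']
sample_reviews = [
    {"reviewer_name": "A", "decision": "APPROVED", "review_date": "2024-01-20"},
    {"reviewer_name": "B", "decision": "PENDING", "review_date": ""},
    {"reviewer_name": "C", "decision": "REJECTED", "review_date": "2024-01-21"},
]

completed, pending = split_by_completion(sample_reviews)
print(f"Reviews: {len(completed)} completed, {len(pending)} pending")
# Reviews: 2 completed, 1 pending
```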

Best Practices

  • The function includes extensive error handling with fallback mechanisms for database queries, ensuring partial data is returned even if some queries fail
  • Date formatting is handled through a helper function that supports multiple date types including neo4j.time.DateTime, Python datetime, and ISO string formats
  • Review and approval dates are populated only when the recorded decision is APPROVED or REJECTED; assignments without a decision are reported as PENDING with an empty date
  • Long comments and details are truncated to 200 characters to prevent formatting issues in generated documents
  • The function logs warnings for non-critical errors rather than failing completely, allowing document conversion to proceed with available data
  • All database queries use parameterized queries to prevent injection attacks
  • The function attempts multiple methods to retrieve data (e.g., model methods first, then direct database queries as fallback)
  • Ensure the Neo4j database connection is active before calling this function
  • The returned dictionary structure must match the expected format of DocumentProcessor for successful PDF generation
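
Two of the patterns listed above, truncation to 200 characters and "model method first, direct query as fallback", can be sketched generically. The helper names `truncate` and `first_successful` are hypothetical and do not exist in CDocs; the loaders are stand-ins for a model call and a database query. As in the source, the fallback runs only when the earlier attempt raises, not when it returns an empty result.

```python
import logging

logger = logging.getLogger("audit_sketch")


def truncate(text, limit=200):
    """Shorten text to at most `limit` characters, ending in '...' if cut."""
    if len(text) > limit:
        return text[:limit - 3] + "..."
    return text


def first_successful(*loaders, default=None):
    """Call loaders in order and return the first result that does not raise."""
    for loader in loaders:
        try:
            return loader()
        except Exception as exc:
            # Log and continue so that partial data can still be returned.
            name = getattr(loader, "__name__", repr(loader))
            logger.warning("Loader %s failed: %s", name, exc)
    return default


def broken_model_call():
    # Stand-in for a model method that fails.
    raise RuntimeError("model unavailable")


def direct_query():
    # Stand-in for a direct database query.
    return ["ver-1", "ver-2"]


versions = first_successful(broken_model_call, direct_query, default=[])
print(versions)                  # ['ver-1', 'ver-2']
print(len(truncate("x" * 250)))  # 200
```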

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function generate_audit_report 64.9% similar

    Generates a comprehensive audit report for a controlled document by aggregating document metadata, version history, review cycles, approvals, and categorized audit events.

    From: /tf/active/vicechatdev/CDocs/utils/audit_trail.py
  • function convert_document_to_pdf 63.8% similar

    Converts a document version to PDF format with audit trail, signatures, watermarks, and PDF/A compliance options, then uploads the result to FileCloud storage.

    From: /tf/active/vicechatdev/CDocs/controllers/document_controller.py
  • function get_document_audit_trail 62.1% similar

    Retrieves the complete audit trail for a controlled document from a Neo4j graph database, including timestamps, user actions, and event details.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function convert_document_to_pdf_v1 60.4% similar

    Converts a document version from an editable format (e.g., Word) to PDF without changing the document's status, uploading the result to FileCloud and updating the version record.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function create_document_v2 59.1% similar

    Creates a new version of a controlled document by generating version metadata, storing the file in FileCloud, updating the document's revision number, and creating an audit trail entry.

    From: /tf/active/vicechatdev/CDocs/controllers/document_controller.py