function convert_document_to_pdf_v1

Maturity: 71

Converts a document version from an editable format (e.g., Word) to PDF without changing the document's status, uploading the result to FileCloud and updating the version record.

File: /tf/active/vicechatdev/document_controller_backup.py
Lines: 1280 - 1436
Complexity: complex

Purpose

This function provides a way to generate PDF versions of controlled documents for viewing and distribution purposes. It retrieves a specific document version (or the current version if not specified), downloads the editable file from FileCloud, converts it to PDF using a document converter, uploads the PDF back to FileCloud, and updates the document version record with the PDF path. The conversion is logged in the audit trail for compliance tracking.
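
The download, convert, upload sequence described above can be sketched independently of CDocs. This is a minimal sketch, not the module's implementation: `run_conversion` and its three callable parameters are hypothetical stand-ins for the FileCloud client, the converter, and the upload helper. It uses `tempfile.TemporaryDirectory` instead of the `mkdtemp` plus `finally` pattern in the source, so cleanup happens even when a step raises.

```python
import os
import tempfile
from typing import Callable


def run_conversion(
    source_path: str,
    download: Callable[[str], bytes],
    convert: Callable[[str, str], None],
    upload: Callable[[str, bytes], None],
) -> str:
    """Download, convert, and upload a file; return the remote PDF path.

    All three callables are hypothetical stand-ins, not CDocs APIs.
    """
    stem, ext = os.path.splitext(source_path)
    remote_pdf_path = f"{stem}.pdf"

    # The directory and its contents are removed when the block exits,
    # whether normally or through an exception.
    with tempfile.TemporaryDirectory() as temp_dir:
        local_source = os.path.join(temp_dir, f"document{ext}")
        local_pdf = os.path.join(temp_dir, "document.pdf")

        # Step 1: fetch the editable file into the temporary directory.
        with open(local_source, "wb") as f:
            f.write(download(source_path))

        # Step 2: produce the PDF next to it.
        convert(local_source, local_pdf)

        # Step 3: send the PDF back to remote storage.
        with open(local_pdf, "rb") as f:
            upload(remote_pdf_path, f.read())

    return remote_pdf_path
```

Passing the three steps in as callables keeps the sketch testable without a FileCloud server or an office converter installed.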

Source Code

def convert_document_to_pdf(
    user: DocUser,
    document_uid: str,
    version_uid: Optional[str] = None
) -> Dict[str, Any]:
    """
    Convert a document version to PDF without changing status
    
    Parameters
    ----------
    user : DocUser
        User performing the conversion
    document_uid : str
        ID of the document
    version_uid : str, optional
        ID of a specific version (default is current version)
        
    Returns
    -------
    Dict[str, Any]
        Dictionary with conversion results
    """
    try:
        # Get document instance
        document = ControlledDocument(uid=document_uid)
        if not document.uid:
            raise ResourceNotFoundError(f"Document not found: {document_uid}")
            
        # Get version
        version = None
        if version_uid:
            version = DocumentVersion(uid=version_uid)
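            # Mirror the document check above: an unknown UID may still yield an object, so test its uid too
            if not version or not version.uid or version.document_uid != document_uid: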
                raise ResourceNotFoundError(f"Version not found: {version_uid}")
        else:
            version = document.current_version
            if not version:
                raise ResourceNotFoundError(f"No versions found for document: {document_uid}")
                
        # Check if the version has an editable file
        if not version.word_file_path:
            raise BusinessRuleError("Version has no editable document to convert")
            
        # Check if PDF already exists
        if version.pdf_file_path:
            return {
                'success': True,
                'message': 'PDF version already exists',
                'document_uid': document_uid,
                'version_uid': version.uid,
                'pdf_path': version.pdf_file_path
            }
            
        # Create a temporary directory for processing
        temp_dir = tempfile.mkdtemp()
        
        try:
            # Download the editable file - without requiring user for direct file access
            editable_file_path = version.word_file_path
            
            # Use internal file download method without permission check
            # FIX: Get the FileCloud client properly
            try:
                # Initialize FileCloud client
                filecloud_client = get_filecloud_client()
                
                # Download file content
                file_content = filecloud_client.download_file(editable_file_path)
                if not isinstance(file_content, bytes):
                    raise BusinessRuleError("Failed to download editable document")
            except Exception as download_err:
                raise BusinessRuleError(f"Failed to download editable document: {str(download_err)}")
                
            # Save to temp file
            file_ext = os.path.splitext(editable_file_path)[1]
            temp_file_path = os.path.join(temp_dir, f"document{file_ext}")
            
            with open(temp_file_path, 'wb') as f:
                f.write(file_content)
                
            # Initialize the document converter
            converter = ControlledDocumentConverter()
            
            # Convert to PDF (simple conversion without signature page or audit trail)
            output_pdf_path = os.path.join(temp_dir, "document.pdf")
            
            try:
                converter.convert_to_pdf(temp_file_path, output_pdf_path)
            except Exception as convert_err:
                raise BusinessRuleError(f"Failed to convert document to PDF: {str(convert_err)}")
                
            # Upload PDF to FileCloud
            # Calculate the FileCloud path for the PDF
            editable_dir = os.path.dirname(editable_file_path)
            pdf_filename = f"{os.path.splitext(os.path.basename(editable_file_path))[0]}.pdf"
            pdf_file_path = os.path.join(editable_dir, pdf_filename)
            
            # Upload PDF to FileCloud
            with open(output_pdf_path, 'rb') as pdf_file:
                upload_result = upload_document_to_filecloud(
                    user=user,
                    file_content=pdf_file.read(),
                    document=document_uid,
                    file_path=pdf_file_path,
                    metadata={
                        'docNumber': document.doc_number,
                        'version': version.version_number,
                        'status': document.status,
                        'convertedBy': user.username,
                        'convertedDate': datetime.now().isoformat()
                    }
                )
                
            if not upload_result.get('success', False):
                raise BusinessRuleError(f"Failed to upload PDF to FileCloud: {upload_result.get('message', 'Unknown error')}")
                
            # Update document version with PDF path
            version.pdf_file_path = pdf_file_path
            
            # Log conversion event
            audit_trail.log_document_lifecycle_event(
                event_type="DOCUMENT_CONVERTED_TO_PDF",
                user=user,
                document_uid=document_uid,
                details={
                    'version_uid': version.uid,
                    'version_number': version.version_number,
                    'pdf_path': pdf_file_path
                }
            )
            
            return {
                'success': True,
                'message': 'Document successfully converted to PDF',
                'document_uid': document_uid,
                'version_uid': version.uid,
                'version_number': version.version_number,
                'pdf_path': pdf_file_path
            }
            
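        except BusinessRuleError as e:
            # Already carries a specific message; log and re-raise without wrapping it again
            logger.error(f"Error in document conversion process: {str(e)}")
            raise
        except Exception as e:
            logger.error(f"Error in document conversion process: {str(e)}")
            raise BusinessRuleError(f"Failed to convert document to PDF: {str(e)}")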
        finally:
            # Clean up temporary directory
            try:
                if os.path.exists(temp_dir):
                    shutil.rmtree(temp_dir)
            except Exception:
                logger.warning(f"Failed to remove temporary directory: {temp_dir}")
                
    except (ResourceNotFoundError, ValidationError, PermissionError, BusinessRuleError) as e:
        # Re-raise known errors
        raise
    except Exception as e:
        logger.error(f"Error converting document to PDF: {str(e)}")
        raise BusinessRuleError(f"Failed to convert document to PDF: {str(e)}")

Parameters

Name          Type           Default  Kind
user          DocUser        -        positional_or_keyword
document_uid  str            -        positional_or_keyword
version_uid   Optional[str]  None     positional_or_keyword

Parameter Details

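user: DocUser object representing the authenticated user performing the conversion. It is passed to the FileCloud upload and to the audit-trail entry, and its username is recorded as 'convertedBy' in the upload metadata. The 'CONVERT_DOCUMENT' permission is described as enforced by a decorator; the listing above does not show one, and the download step deliberately bypasses permission checks, so verify enforcement at the call site.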

document_uid: String identifier (UID) of the controlled document to convert. Must correspond to an existing ControlledDocument in the system. Used to retrieve the document and its versions.

version_uid: Optional string identifier (UID) of a specific document version to convert. If None, the current/latest version of the document will be used. If provided, must belong to the specified document_uid.
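
The version-selection rule for `version_uid` can be illustrated in isolation. This is a minimal sketch under stated assumptions: `Version`, `resolve_version`, and the `versions_by_uid` lookup table are hypothetical stand-ins for `DocumentVersion` and the database, and `LookupError` stands in for `ResourceNotFoundError`.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Version:
    """Hypothetical stand-in for DocumentVersion."""
    uid: str
    document_uid: str


def resolve_version(
    document_uid: str,
    current: Optional[Version],
    versions_by_uid: Dict[str, Version],
    version_uid: Optional[str] = None,
) -> Version:
    """Pick the requested version, or the current one when none is requested."""
    # Like the source, treat None and the empty string the same way.
    if not version_uid:
        if current is None:
            raise LookupError(f"No versions found for document: {document_uid}")
        return current

    version = versions_by_uid.get(version_uid)
    # Reject unknown versions and versions that belong to another document.
    if version is None or version.document_uid != document_uid:
        raise LookupError(f"Version not found: {version_uid}")
    return version
```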

Return Value

Type: Dict[str, Any]

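Returns a dictionary with conversion results. After a fresh conversion: {'success': True, 'message': 'Document successfully converted to PDF', 'document_uid': str, 'version_uid': str, 'version_number': str, 'pdf_path': str}, where 'pdf_path' is the FileCloud path of the generated PDF. If the version already has a PDF, the function returns early with 'message': 'PDF version already exists' and the same keys except 'version_number', which is omitted. On error it raises ResourceNotFoundError (document or version not found) or BusinessRuleError (no editable file, or a download, conversion, or upload failure). ValidationError and PermissionError are re-raised if they occur before processing begins; any other exception is wrapped in BusinessRuleError.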
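
Because the early-return dictionary in the source listing has no 'version_number' key, code that consumes the result should not index that key directly. A minimal sketch of a tolerant reader follows; `describe_result` is a hypothetical helper, not part of CDocs.

```python
from typing import Any, Dict


def describe_result(result: Dict[str, Any]) -> str:
    """Build a one-line summary that works for both success shapes."""
    # 'version_number' is absent when the function returns early because a
    # PDF already exists, so read it with .get() instead of indexing.
    version_number = result.get("version_number", "unknown")
    return f"{result['message']} (version {version_number}): {result['pdf_path']}"
```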

Dependencies

  • logging
  • uuid
  • os
  • tempfile
  • typing
  • datetime
  • io
  • panel
  • shutil
  • traceback
  • CDocs
  • CDocs.db
  • CDocs.config
  • CDocs.models
  • CDocs.utils
  • CDocs.controllers
  • CDocs.db.schema_manager

Required Imports

import logging
import os
import tempfile
import shutil
from typing import Dict, Any, Optional
from datetime import datetime
from CDocs.models.document import ControlledDocument, DocumentVersion
from CDocs.models.user_extensions import DocUser
from CDocs.utils import audit_trail
from CDocs.controllers import require_permission, log_controller_action, ResourceNotFoundError, ValidationError, PermissionError, BusinessRuleError
from CDocs.controllers.filecloud_controller import upload_document_to_filecloud, get_filecloud_client
from CDocs.utils.document_converter import ControlledDocumentConverter

# The function logs through a module-level `logger`; its real setup is not shown
# in the listing, so this line is an assumption.
logger = logging.getLogger(__name__)

Usage Example

from CDocs.models.user_extensions import DocUser
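# This page documents the copy in document_controller_backup.py. The import below
# resolves to CDocs/controllers/document_controller.py, which holds a similar but
# different function of the same name; adjust the path if the backup copy is intended.
from CDocs.controllers.document_controller import convert_document_to_pdf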

# Get authenticated user
user = DocUser(username='john.doe')

# Convert current version of a document to PDF
result = convert_document_to_pdf(
    user=user,
    document_uid='doc-12345-abcde'
)

if result['success']:
    print(f"PDF created at: {result['pdf_path']}")
    # 'version_number' is missing when the PDF already existed, so use .get()
    print(f"Version: {result.get('version_number', 'n/a')}")

# Convert a specific version to PDF
result = convert_document_to_pdf(
    user=user,
    document_uid='doc-12345-abcde',
    version_uid='ver-67890-fghij'
)

if result['success']:
    print(f"Conversion complete: {result['message']}")
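
The calls above raise exceptions rather than returning a failure dictionary, so a caller typically wraps them. The following is a minimal sketch of that pattern. The two exception classes are local stand-ins so the block runs on its own (real code would import them from `CDocs.controllers`), and `try_convert` is a hypothetical helper.

```python
import logging
from typing import Any, Callable, Dict, Optional

logger = logging.getLogger(__name__)


# Hypothetical stand-ins so the sketch runs on its own; in real code import
# ResourceNotFoundError and BusinessRuleError from CDocs.controllers instead.
class ResourceNotFoundError(Exception):
    pass


class BusinessRuleError(Exception):
    pass


def try_convert(
    convert: Callable[..., Dict[str, Any]], **kwargs: Any
) -> Optional[Dict[str, Any]]:
    """Call a conversion function and return None on the documented errors."""
    try:
        return convert(**kwargs)
    except ResourceNotFoundError as err:
        # Unknown document UID, unknown version UID, or a document with no versions.
        logger.warning("Document or version not found: %s", err)
    except BusinessRuleError as err:
        # No editable file, or a download, conversion, or upload failure.
        logger.error("Conversion failed: %s", err)
    return None
```

With the real function this would be called as `try_convert(convert_document_to_pdf, user=user, document_uid='doc-12345-abcde')`.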

Best Practices

  • Confirm that the user has the 'CONVERT_DOCUMENT' permission before calling; it is described as enforced by a decorator, but none appears in the listing above
  • Handle ResourceNotFoundError when document or version doesn't exist
  • Handle BusinessRuleError for conversion failures or missing editable files
  • The function checks if PDF already exists to avoid redundant conversions
  • Temporary files are automatically cleaned up in the finally block
  • All conversion operations are logged to the audit trail for compliance
  • The function does not change document status - it only generates a PDF representation
  • Ensure FileCloud client is properly configured before calling this function
  • The function requires write access to temporary directories for file processing
  • PDF is stored in the same FileCloud directory as the editable file with .pdf extension
  • If version_uid is not provided, the current version is used automatically
  • The function updates the DocumentVersion object with the pdf_file_path after successful conversion
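
The rule that the PDF sits beside the editable file can be shown as a small path helper. The source builds this path with `os.path`, which uses the host's separator; the sketch below uses `posixpath` on the assumption that FileCloud paths always use forward slashes, so the result does not depend on the operating system. `derive_pdf_path` is a hypothetical helper, not part of CDocs.

```python
import posixpath


def derive_pdf_path(editable_file_path: str) -> str:
    """Return the sibling .pdf path for a forward-slash remote path."""
    directory = posixpath.dirname(editable_file_path)
    # splitext only strips the final extension, so dots in the stem survive.
    stem = posixpath.splitext(posixpath.basename(editable_file_path))[0]
    return posixpath.join(directory, f"{stem}.pdf")
```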

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function convert_document_to_pdf 92.0% similar

    Converts a document version to PDF format with audit trail, signatures, watermarks, and PDF/A compliance options, then uploads the result to FileCloud storage.

    From: /tf/active/vicechatdev/CDocs/controllers/document_controller.py
  • function download_document_version 75.7% similar

    Downloads a specific version of a controlled document from FileCloud storage, with optional audit trail and watermark inclusion, and logs the download event.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function get_document_edit_url_v1 75.1% similar

    Generates a FileCloud URL to view or edit a controlled document, selecting the appropriate file format (PDF or Word) based on document status and version availability.

    From: /tf/active/vicechatdev/CDocs/controllers/document_controller.py
  • function download_document_version_v1 74.0% similar

    Downloads a specific version of a controlled document, with optional audit trail and watermark inclusion, returning file content and metadata.

    From: /tf/active/vicechatdev/CDocs/controllers/document_controller.py
  • function get_document_download_url 73.3% similar

    Retrieves a download URL for a controlled document, automatically selecting between editable (Word) and PDF formats based on document status or explicit request.

    From: /tf/active/vicechatdev/document_controller_backup.py