class PDFGenerator

Maturity: 28

PDF document generation for reports and controlled documents. This class provides methods to generate PDF documents from scratch, including audit reports, document covers, and certificate pages.

File: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py
Lines: 613 - 1156
Complexity: moderate

Purpose

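Generates PDF documents from scratch for reports and controlled documents. The public methods produce three kinds of output: document cover pages, approval certificate pages, and audit reports. Each document is assembled from ReportLab flowables (SimpleDocTemplate, Paragraph, Table, Spacer) and written to the given output path.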

Source Code

class PDFGenerator:
    """
    PDF document generation for reports and controlled documents
    
    This class provides methods to generate PDF documents from scratch,
    including audit reports, document covers, and certificate pages.
    """
    
    def __init__(self, 
                 company_name: str = "Company", 
                 logo_path: Optional[str] = None,
                 font_dir: Optional[str] = None):
        """
        Initialize the PDF generator
        
        Parameters
        ----------
        company_name : str
            Name of the company to include in generated documents
        logo_path : str, optional
            Path to the company logo image
        font_dir : str, optional
            Directory containing custom fonts
        """
        self.company_name = company_name
        self.logo_path = logo_path
        
        # Register custom fonts if provided
        if font_dir and os.path.exists(font_dir):
            self._register_fonts(font_dir)
        
        # Initialize default styles
        self.styles = getSampleStyleSheet()
        self._initialize_custom_styles()
    
    def _register_fonts(self, font_dir: str):
        """
        Register custom fonts for use in PDF documents
        
        Parameters
        ----------
        font_dir : str
            Directory containing font files
        """
        try:
            # Common font families to look for
            font_families = {
                'arial': ['arial.ttf', 'arialbd.ttf', 'ariali.ttf', 'arialbi.ttf'],
                'times': ['times.ttf', 'timesbd.ttf', 'timesi.ttf', 'timesbi.ttf'],
                'calibri': ['calibri.ttf', 'calibrib.ttf', 'calibrii.ttf', 'calibriz.ttf'],
            }
            
            for family, fonts in font_families.items():
                for font_file in fonts:
                    font_path = os.path.join(font_dir, font_file)
                    if os.path.exists(font_path):
                        font_name = os.path.splitext(font_file)[0]
                        pdfmetrics.registerFont(TTFont(font_name, font_path))
                        logger.info(f"Registered font: {font_name} from {font_path}")
        except Exception as e:
            logger.error(f"Error registering fonts: {str(e)}")
    
    def _initialize_custom_styles(self):
        """Initialize custom paragraph styles for consistent document formatting"""
        # Add custom styles
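        # The sample stylesheet already defines 'Title', and StyleSheet1.add()
        # raises KeyError for a duplicate name, so update that style in place.
        title_style = self.styles['Title']
        title_style.fontSize = 18
        title_style.leading = 22
        title_style.alignment = 1  # Center
        title_style.spaceAfter = 12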
        
        self.styles.add(ParagraphStyle(
            name='Subtitle',
            parent=self.styles['Heading2'],
            fontSize=14,
            leading=18,
            alignment=1,  # Center
            spaceAfter=10
        ))
        
        self.styles.add(ParagraphStyle(
            name='Normal-Bold',
            parent=self.styles['Normal'],
            fontName='Helvetica-Bold'
        ))
        
        self.styles.add(ParagraphStyle(
            name='Normal-Italic',
            parent=self.styles['Normal'],
            fontName='Helvetica-Oblique'
        ))
        
        self.styles.add(ParagraphStyle(
            name='Caption',
            parent=self.styles['Normal'],
            fontSize=8,
            leading=10,
            alignment=1  # Center
        ))
        
        self.styles.add(ParagraphStyle(
            name='Header',
            parent=self.styles['Normal'],
            fontSize=9,
            leading=11,
            alignment=0  # Left
        ))
        
        self.styles.add(ParagraphStyle(
            name='Footer',
            parent=self.styles['Normal'],
            fontSize=9,
            leading=11,
            alignment=1  # Center
        ))
    
    def generate_document_cover(self, 
                               output_path: str,
                               doc_number: str,
                               title: str,
                               revision: str,
                               date: str,
                               author: str,
                               department: str,
                               doc_type: str,
                               confidentiality: str = "Internal Use") -> str:
        """
        Generate a cover page for a controlled document
        
        Parameters
        ----------
        output_path : str
            Path where the PDF will be saved
        doc_number : str
            Document number/identifier
        title : str
            Document title
        revision : str
            Revision/version number
        date : str
            Document date
        author : str
            Document author
        department : str
            Responsible department
        doc_type : str
            Document type
        confidentiality : str, optional
            Confidentiality level
            
        Returns
        -------
        str
            Path to the generated PDF
        """
        # Create PDF document
        doc = SimpleDocTemplate(
            output_path,
            pagesize=A4,
            leftMargin=inch,
            rightMargin=inch,
            topMargin=inch,
            bottomMargin=inch
        )
        
        # Create content elements
        elements = []
        
        # Add logo if available
        if self.logo_path and os.path.exists(self.logo_path):
            try:
                img = Image(self.logo_path, width=2*inch, height=1*inch)
                elements.append(img)
                elements.append(Spacer(1, 0.5*inch))
            except Exception as e:
                logger.warning(f"Could not load logo: {str(e)}")
        
        # Add company name
        elements.append(Paragraph(self.company_name, self.styles['Title']))
        elements.append(Spacer(1, 0.25*inch))
        
        # Add document type and confidentiality
        elements.append(Paragraph(f"{doc_type} - {confidentiality}", self.styles['Subtitle']))
        elements.append(Spacer(1, 0.5*inch))
        
        # Add document title
        elements.append(Paragraph(title, self.styles['Title']))
        elements.append(Spacer(1, 0.5*inch))
        
        # Add document information table
        data = [
            ["Document Number:", doc_number],
            ["Revision:", revision],
            ["Date:", date],
            ["Author:", author],
            ["Department:", department]
        ]
        
        # Create table with appropriate styling
        table = Table(data, colWidths=[1.5*inch, 3*inch])
        table.setStyle(TableStyle([
            ('FONTNAME', (0, 0), (0, -1), 'Helvetica-Bold'),
            ('FONTNAME', (1, 0), (1, -1), 'Helvetica'),
            ('ALIGN', (0, 0), (0, -1), 'RIGHT'),
            ('ALIGN', (1, 0), (1, -1), 'LEFT'),
            ('VALIGN', (0, 0), (-1, -1), 'MIDDLE'),
            ('GRID', (0, 0), (-1, -1), 0.5, colors.grey),
            ('BOX', (0, 0), (-1, -1), 1, colors.black),
            ('BACKGROUND', (0, 0), (0, -1), colors.lightgrey)
        ]))
        
        elements.append(table)
        elements.append(Spacer(1, 1*inch))
        
        # Add approval notice
        elements.append(Paragraph(
            "This document is subject to controlled distribution and requires "
            "formal approval before use.",
            self.styles['Normal-Bold']
        ))
        
        # Build the document
        doc.build(elements)
        
        logger.info(f"Generated document cover page: {output_path}")
        return output_path
    
    def generate_certificate_page(self,
                                 output_path: str,
                                 doc_number: str,
                                 title: str,
                                 revision: str,
                                 approvers: List[Dict[str, str]],
                                 approved_date: str,
                                 signature_dir: Optional[str] = None) -> str:
        """
        Generate an approval certificate for a controlled document
        
        Parameters
        ----------
        output_path : str
            Path where the PDF will be saved
        doc_number : str
            Document number/identifier
        title : str
            Document title
        revision : str
            Revision/version number
        approvers : List[Dict[str, str]]
            List of approvers containing 'name', 'role' and 'date' keys
        approved_date : str
            Final approval date
        signature_dir : str, optional
            Directory containing signature images
            
        Returns
        -------
        str
            Path to the generated PDF
        """
        # Create PDF document
        doc = SimpleDocTemplate(
            output_path,
            pagesize=A4,
            leftMargin=inch,
            rightMargin=inch,
            topMargin=inch,
            bottomMargin=inch
        )
        
        # Create content elements
        elements = []
        
        # Add certificate title
        elements.append(Paragraph("Document Approval Certificate", self.styles['Title']))
        elements.append(Spacer(1, 0.25*inch))
        
        # Add document information
        elements.append(Paragraph(f"Document: {title}", self.styles['Normal']))
        elements.append(Paragraph(f"Document Number: {doc_number}", self.styles['Normal']))
        elements.append(Paragraph(f"Revision: {revision}", self.styles['Normal']))
        elements.append(Spacer(1, 0.5*inch))
        
        # Add certificate text
        elements.append(Paragraph(
            "This document has been reviewed and approved according to the Document "
            f"Control Procedure. It was approved on {approved_date} and is subject "
            "to periodic review.",
            self.styles['Normal']
        ))
        elements.append(Spacer(1, 0.5*inch))
        
        # Add approvers table with signatures
        elements.append(Paragraph("Approvals:", self.styles['Normal-Bold']))
        elements.append(Spacer(1, 0.1*inch))
        
        # Create approvers table data
        approver_data = [["Name", "Role", "Date", "Signature"]]
        
        for approver in approvers:
            name = approver.get('name', '')
            role = approver.get('role', '')
            date = approver.get('date', '')
            
            # Check for signature image
            signature_img = None
            if signature_dir:
                # Look for signature file based on name 
                # (typically using a sanitized version of the name)
                safe_name = "".join(c for c in name if c.isalnum()).lower()
                sig_path = os.path.join(signature_dir, f"{safe_name}.png")
                alt_sig_path = os.path.join(signature_dir, f"{safe_name}.jpg")
                
                if os.path.exists(sig_path):
                    signature_img = SignatureImage(sig_path, width=1.5*inch, height=0.6*inch)
                elif os.path.exists(alt_sig_path):
                    signature_img = SignatureImage(alt_sig_path, width=1.5*inch, height=0.6*inch)
                else:
                    # Use a signature placeholder
                    signature_img = SignatureImage("nonexistent.png", width=1.5*inch, height=0.6*inch)
            else:
                # Use a signature placeholder
                signature_img = SignatureImage("nonexistent.png", width=1.5*inch, height=0.6*inch)
            
            # Add row to table
            approver_data.append([name, role, date, signature_img])
        
        # Create table with appropriate styling
        table = Table(approver_data, colWidths=[1.3*inch, 1.3*inch, 1*inch, 1.7*inch])
        table.setStyle(TableStyle([
            ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
            ('ALIGN', (0, 0), (-1, 0), 'CENTER'),
            ('BACKGROUND', (0, 0), (-1, 0), colors.lightgrey),
            ('GRID', (0, 0), (-1, -1), 0.5, colors.grey),
            ('BOX', (0, 0), (-1, -1), 1, colors.black),
            ('VALIGN', (0, 0), (-1, -1), 'MIDDLE'),
            ('ALIGN', (0, 1), (-1, -1), 'CENTER'),
        ]))
        
        elements.append(table)
        elements.append(Spacer(1, 0.5*inch))
        
        # Add validity statement
        elements.append(Paragraph(
            "This certificate confirms that the document has been approved "
            "by all required stakeholders and is valid for use.",
            self.styles['Normal-Italic']
        ))
        
        # Add verification information
        verification_code = self._generate_verification_code(doc_number, revision, approved_date)
        elements.append(Spacer(1, 0.5*inch))
        elements.append(Paragraph(f"Verification Code: {verification_code}", self.styles['Caption']))
        elements.append(Paragraph(f"Generated on: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}", self.styles['Caption']))
        
        # Build the document
        doc.build(elements)
        
        logger.info(f"Generated approval certificate: {output_path}")
        return output_path
    
    def _generate_verification_code(self, doc_number: str, revision: str, date: str) -> str:
        """
        Generate a verification code for document validation
        
        Parameters
        ----------
        doc_number : str
            Document number/identifier
        revision : str
            Revision/version number
        date : str
            Approval date
            
        Returns
        -------
        str
            Verification code
        """
        # Create a hash of the document information
        verification_string = f"{doc_number}-{revision}-{date}"
        hash_object = hashlib.sha256(verification_string.encode())
        # Return a shortened version of the hash
        return hash_object.hexdigest()[:12].upper()
    
    def generate_audit_report(self,
                             output_path: str,
                             doc_number: str,
                             title: str,
                             revision: str,
                             audit_data: List[Dict[str, Any]],
                             audit_date: str,
                             auditor: str) -> str:
        """
        Generate an audit report for a document
        
        Parameters
        ----------
        output_path : str
            Path where the PDF will be saved
        doc_number : str
            Document number/identifier
        title : str
            Document title
        revision : str
            Revision/version number
        audit_data : List[Dict[str, Any]]
            List of audit entries
        audit_date : str
            Date of the audit
        auditor : str
            Name of the auditor
            
        Returns
        -------
        str
            Path to the generated PDF
        """
        # Create PDF document
        doc = SimpleDocTemplate(
            output_path,
            pagesize=A4,
            leftMargin=inch,
            rightMargin=inch,
            topMargin=inch,
            bottomMargin=inch
        )
        
        # Create content elements
        elements = []
        
        # Add report title
        elements.append(Paragraph("Document Audit Report", self.styles['Title']))
        elements.append(Spacer(1, 0.25*inch))
        
        # Add document information
        elements.append(Paragraph(f"Document: {title}", self.styles['Normal']))
        elements.append(Paragraph(f"Document Number: {doc_number}", self.styles['Normal']))
        elements.append(Paragraph(f"Revision: {revision}", self.styles['Normal']))
        elements.append(Paragraph(f"Audit Date: {audit_date}", self.styles['Normal']))
        elements.append(Paragraph(f"Auditor: {auditor}", self.styles['Normal']))
        elements.append(Spacer(1, 0.5*inch))
        
        # Add audit summary
        elements.append(Paragraph("Audit Summary", self.styles['Heading2']))
        
        # Count audit events by type
        event_counts = {}
        for entry in audit_data:
            event_type = entry.get('event_type', 'Unknown')
            event_counts[event_type] = event_counts.get(event_type, 0) + 1
        
        # Add summary table
        summary_data = [["Event Type", "Count"]]
        for event_type, count in event_counts.items():
            summary_data.append([event_type, str(count)])
        
        summary_table = Table(summary_data, colWidths=[3*inch, 1*inch])
        summary_table.setStyle(TableStyle([
            ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
            ('ALIGN', (0, 0), (-1, 0), 'CENTER'),
            ('BACKGROUND', (0, 0), (-1, 0), colors.lightgrey),
            ('GRID', (0, 0), (-1, -1), 0.5, colors.grey),
            ('BOX', (0, 0), (-1, -1), 1, colors.black),
            ('VALIGN', (0, 0), (-1, -1), 'MIDDLE'),
            ('ALIGN', (1, 0), (1, -1), 'CENTER'),
        ]))
        
        elements.append(summary_table)
        elements.append(Spacer(1, 0.5*inch))
        
        # Add detailed audit log
        elements.append(Paragraph("Audit Log Detail", self.styles['Heading2']))
        
        # Create audit log table
        log_data = [["Timestamp", "User", "Event Type", "Details"]]
        
        for entry in audit_data:
            timestamp = entry.get('timestamp', '')
            user = entry.get('user', '')
            event_type = entry.get('event_type', '')
            details = entry.get('details', '')
            
            # Truncate long details
            if len(details) > 100:
                details = details[:97] + "..."
            
            log_data.append([timestamp, user, event_type, details])
        
        # Create table with appropriate styling
        log_table = Table(log_data, colWidths=[1.2*inch, 1*inch, 1.2*inch, 2.1*inch])
        log_table.setStyle(TableStyle([
            ('FONTNAME', (0, 0), (-1, 0), 'Helvetica-Bold'),
            ('ALIGN', (0, 0), (-1, 0), 'CENTER'),
            ('BACKGROUND', (0, 0), (-1, 0), colors.lightgrey),
            ('GRID', (0, 0), (-1, -1), 0.5, colors.grey),
            ('BOX', (0, 0), (-1, -1), 1, colors.black),
            ('VALIGN', (0, 0), (-1, -1), 'MIDDLE'),
            ('ALIGN', (0, 1), (2, -1), 'CENTER'),
            ('ALIGN', (3, 1), (3, -1), 'LEFT'),
        ]))
        
        # Add zebra striping
        for i in range(1, len(log_data), 2):
            log_table.setStyle(TableStyle([('BACKGROUND', (0, i), (-1, i), colors.whitesmoke)]))
        
        elements.append(log_table)
        elements.append(Spacer(1, 0.5*inch))
        
        # Add certification
        elements.append(Paragraph(
            "This report has been generated automatically by the Controlled "
            "Document Management System. It provides a complete audit trail "
            "of all recorded activities for this document.",
            self.styles['Normal-Italic']
        ))
        
        # Add report footer
        elements.append(Spacer(1, 0.5*inch))
        elements.append(Paragraph(f"Report generated: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}", self.styles['Caption']))
        elements.append(Paragraph(f"Report ID: {self._generate_report_id()}", self.styles['Caption']))
        
        # Build the document
        doc.build(elements)
        
        logger.info(f"Generated audit report: {output_path}")
        return output_path
    
    def _generate_report_id(self) -> str:
        """
        Generate a unique report ID
        
        Returns
        -------
        str
            Unique report ID
        """
        # Generate a timestamp-based report ID
        timestamp = datetime.now().strftime('%Y%m%d%H%M%S')
        random_suffix = hashlib.md5(os.urandom(8)).hexdigest()[:6]
        return f"RPT-{timestamp}-{random_suffix}"

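Parameters

Name          Type           Default    Kind
company_name  str            "Company"  positional or keyword
logo_path     Optional[str]  None       positional or keyword
font_dir      Optional[str]  None       positional or keyword

Parameter Details

company_name: Name of the company to include in generated documents.
logo_path: Path to the company logo image. Optional.
font_dir: Directory containing custom fonts. Optional.

Return Value

Instantiating the class returns a PDFGenerator instance. Each public generate_* method returns the path of the PDF it wrote, as a str.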

Class Interface

Methods

__init__(self, company_name, logo_path, font_dir)

Purpose: Initialize the PDF generator. Custom fonts are registered only when font_dir is given and exists; the default stylesheet and the custom paragraph styles are then set up.

Parameters:

  • company_name: Type: str. Name of the company to include in generated documents (default: "Company")
  • logo_path: Type: Optional[str]. Path to the company logo image (default: None)
  • font_dir: Type: Optional[str]. Directory containing custom fonts (default: None)

Returns: None

_register_fonts(self, font_dir)

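Purpose: Register custom fonts for use in PDF documents. The method looks in font_dir for a fixed list of Arial, Times, and Calibri TrueType files and registers each file it finds under its file name without the extension. Errors are logged rather than raised.

Parameters:

  • font_dir: Type: str. Directory containing font files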

Returns: None
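
The naming rule described above can be illustrated without ReportLab. The following is a minimal sketch, assuming only the file list shown in the source; the helper name discover_fonts is hypothetical and not part of the class, and it only reports which names would be registered instead of calling pdfmetrics.registerFont.

```python
import os
import tempfile

# File names the class looks for, copied from _register_fonts.
FONT_FAMILIES = {
    "arial": ["arial.ttf", "arialbd.ttf", "ariali.ttf", "arialbi.ttf"],
    "times": ["times.ttf", "timesbd.ttf", "timesi.ttf", "timesbi.ttf"],
    "calibri": ["calibri.ttf", "calibrib.ttf", "calibrii.ttf", "calibriz.ttf"],
}


def discover_fonts(font_dir: str) -> dict:
    """Map font name -> file path for every known font file present.

    Hypothetical helper: it mirrors the lookup in _register_fonts but
    performs no registration.
    """
    found = {}
    for font_files in FONT_FAMILIES.values():
        for font_file in font_files:
            font_path = os.path.join(font_dir, font_file)
            if os.path.exists(font_path):
                # The registered name is the file name without its extension.
                font_name = os.path.splitext(font_file)[0]
                found[font_name] = font_path
    return found


def demo() -> list:
    """Scan a temporary directory holding two known files and one unknown."""
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("arial.ttf", "calibrib.ttf", "unrelated.ttf"):
            open(os.path.join(tmp, name), "wb").close()
        return sorted(discover_fonts(tmp))
```

Files outside the fixed list, such as unrelated.ttf in the demo, are ignored, so a custom typeface would need its file names added to the list before it could be picked up.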

_initialize_custom_styles(self)

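Purpose: Initialize custom paragraph styles for consistent document formatting. The styles configured are Title, Subtitle, Normal-Bold, Normal-Italic, Caption, Header, and Footer.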

Returns: None

generate_document_cover(self, output_path, doc_number, title, revision, date, author, department, doc_type, confidentiality) -> str

Purpose: Generate a cover page for a controlled document. The page shows the company logo (when the file exists), the company name, the document type and confidentiality level, the title, a document information table, and a controlled-distribution notice.

Parameters:

  • output_path: Type: str. Path where the PDF will be saved
  • doc_number: Type: str. Document number/identifier
  • title: Type: str. Document title
  • revision: Type: str. Revision/version number
  • date: Type: str. Document date
  • author: Type: str. Document author
  • department: Type: str. Responsible department
  • doc_type: Type: str. Document type
  • confidentiality: Type: str, optional. Confidentiality level (default: "Internal Use")

Returns: str. Path to the generated PDF

generate_certificate_page(self, output_path, doc_number, title, revision, approvers, approved_date, signature_dir) -> str

Purpose: Generate an approval certificate for a controlled document. The certificate lists each approver with name, role, date, and a signature image or placeholder, followed by a verification code and a generation timestamp.

Parameters:

  • output_path: Type: str. Path where the PDF will be saved
  • doc_number: Type: str. Document number/identifier
  • title: Type: str. Document title
  • revision: Type: str. Revision/version number
  • approvers: Type: List[Dict[str, str]]. List of approvers containing 'name', 'role' and 'date' keys
  • approved_date: Type: str. Final approval date
  • signature_dir: Type: Optional[str]. Directory containing signature images, looked up by sanitized approver name with a .png or .jpg extension (default: None)

Returns: str. Path to the generated PDF
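
The signature lookup inside this method can be sketched in isolation. The example below is a minimal illustration, assuming the same sanitization rule as the source (alphanumeric characters only, lower-cased) and the same PNG-before-JPG preference; the function name find_signature is hypothetical, and it returns None where the class would fall back to a placeholder image.

```python
import os
import tempfile
from typing import Optional


def find_signature(signature_dir: str, name: str) -> Optional[str]:
    """Return the signature image path for an approver, or None if absent."""
    # Same sanitization as the class: keep alphanumerics only, lower-case.
    safe_name = "".join(c for c in name if c.isalnum()).lower()
    for ext in (".png", ".jpg"):  # PNG is checked before JPG
        candidate = os.path.join(signature_dir, safe_name + ext)
        if os.path.exists(candidate):
            return candidate
    return None


def demo() -> tuple:
    """Look up one approver with a signature file and one without."""
    with tempfile.TemporaryDirectory() as tmp:
        open(os.path.join(tmp, "janeodoe.png"), "wb").close()
        hit = find_signature(tmp, "Jane O'Doe")
        miss = find_signature(tmp, "John Smith")
        return (os.path.basename(hit), miss)
```

Because spaces and punctuation are dropped, signature files need to be named after the compacted form of the approver name, for example janeodoe.png for "Jane O'Doe". Two approvers whose names compact to the same string would share one file.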

_generate_verification_code(self, doc_number, revision, date) -> str

Purpose: Generate a verification code for document validation. The code is the first 12 hexadecimal characters, upper-cased, of the SHA-256 hash of the string "{doc_number}-{revision}-{date}".

Parameters:

  • doc_number: Type: str. Document number/identifier
  • revision: Type: str. Revision/version number
  • date: Type: str. Approval date

Returns: str. Verification code
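
Since the code depends only on its three inputs, anyone holding the same values can recompute and compare it. The sketch below reproduces the hashing step as a standalone function; the names verification_code and code_matches are hypothetical and are used here only for illustration.

```python
import hashlib


def verification_code(doc_number: str, revision: str, date: str) -> str:
    """Recompute the 12-character code the same way the class does."""
    payload = f"{doc_number}-{revision}-{date}"
    digest = hashlib.sha256(payload.encode()).hexdigest()
    # Keep the first 12 hex characters (48 bits) and upper-case them.
    return digest[:12].upper()


def code_matches(code: str, doc_number: str, revision: str, date: str) -> bool:
    """Check a printed code against freshly recomputed document metadata."""
    return code == verification_code(doc_number, revision, date)
```

The hash uses no secret key, so a matching code shows that the printed metadata is internally consistent; it does not by itself prove who issued the certificate.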

generate_audit_report(self, output_path, doc_number, title, revision, audit_data, audit_date, auditor) -> str

Purpose: Generate an audit report for a document. The report contains a summary table of event counts by type and a detailed log table with alternating row shading; details longer than 100 characters are truncated.

Parameters:

  • output_path: Type: str. Path where the PDF will be saved
  • doc_number: Type: str. Document number/identifier
  • title: Type: str. Document title
  • revision: Type: str. Revision/version number
  • audit_data: Type: List[Dict[str, Any]]. List of audit entries; the method reads the keys 'timestamp', 'user', 'event_type', and 'details'
  • audit_date: Type: str. Date of the audit
  • auditor: Type: str. Name of the auditor

Returns: str. Path to the generated PDF
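
The data preparation this method performs before building its tables can be shown without ReportLab. The following sketch assumes the entry keys used in the source; the function names are hypothetical, and collections.Counter stands in for the manual dictionary counting in the class.

```python
from collections import Counter
from typing import Any, Dict, List


def summarize_events(audit_data: List[Dict[str, Any]]) -> Dict[str, int]:
    """Count audit entries per event type; missing types count as 'Unknown'."""
    return dict(Counter(entry.get("event_type", "Unknown") for entry in audit_data))


def truncate_details(details: str, limit: int = 100) -> str:
    """Shorten long detail text to `limit` characters, ending in '...'."""
    if len(details) > limit:
        return details[: limit - 3] + "..."
    return details


def striped_rows(row_count: int) -> List[int]:
    """Row indices that receive the shaded background.

    Row 0 is the header, so shading starts at row 1 and skips every
    other row, matching range(1, len(log_data), 2) in the source.
    """
    return list(range(1, row_count, 2))
```

An entry such as {"timestamp": "2024-01-15 10:00", "user": "jdoe", "event_type": "VIEW", "details": "Opened document"} supplies every key the log table reads; entries that omit a key fall back to an empty string in the log and to "Unknown" in the summary.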

_generate_report_id(self) -> str

Purpose: Generate a unique report ID of the form RPT-<YYYYMMDDHHMMSS>-<6 hexadecimal characters>, combining the current timestamp with a random suffix.

Returns: str. Unique report ID
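
The ID format can be reproduced and validated with the standard library alone. This is a minimal sketch that copies the generation logic from the source into a free function; the name generate_report_id (without the leading underscore) and the REPORT_ID_PATTERN constant are hypothetical additions for illustration.

```python
import hashlib
import os
import re
from datetime import datetime

# Hypothetical pattern describing the ID layout: prefix, 14-digit
# timestamp, and 6 lower-case hexadecimal characters.
REPORT_ID_PATTERN = re.compile(r"^RPT-\d{14}-[0-9a-f]{6}$")


def generate_report_id() -> str:
    """Build a report ID from the current time and 8 random bytes."""
    timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
    # MD5 is used here only to turn random bytes into hex text,
    # not for any security purpose.
    random_suffix = hashlib.md5(os.urandom(8)).hexdigest()[:6]
    return f"RPT-{timestamp}-{random_suffix}"


def is_report_id(value: str) -> bool:
    """Return True when the value has the expected report ID layout."""
    return REPORT_ID_PATTERN.match(value) is not None
```

The suffix carries 24 bits of randomness, so two reports generated within the same second are very unlikely, though not guaranteed, to receive different IDs.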

Required Imports

import os
import io
import logging
import tempfile
import shutil

The class body also references names that are not in the list above: hashlib, datetime (used as datetime.now()), the typing names Optional, List, Dict, and Any, and the ReportLab names colors, A4, inch, getSampleStyleSheet, ParagraphStyle, pdfmetrics, TTFont, SimpleDocTemplate, Paragraph, Spacer, Table, TableStyle, and Image. The names logger and SignatureImage are not defined in this excerpt and are presumably provided elsewhere in pdf_utils.py.

Usage Example

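# Example usage (all argument values are illustrative):
# generator = PDFGenerator(company_name="Company")
# path = generator.generate_document_cover("cover.pdf", "DOC-001", "Example Title", "1.0", "2024-01-15", "Author Name", "Quality", "SOP")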

Similar Components

AI-powered semantic similarity - components with related functionality:

  • class AuditPageGenerator 68.0% similar

    A class that generates comprehensive PDF audit trail pages for documents, including document information, reviews, approvals, revision history, and event history with electronic signatures.

    From: /tf/active/vicechatdev/document_auditor/src/audit_page_generator.py
  • class ControlledDocumentConverter 58.3% similar

    A comprehensive document converter class that transforms controlled documents into archived PDFs with signature pages, audit trails, hash-based integrity verification, and PDF/A compliance for long-term archival.

    From: /tf/active/vicechatdev/CDocs/utils/document_converter.py
  • class PDFManipulator 57.8% similar

    Manipulates existing PDF documents This class provides methods to add watermarks, merge PDFs, extract pages, and perform other manipulation operations.

    From: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py
  • class HashGenerator 56.6% similar

    A class that provides cryptographic hashing functionality for PDF documents, including hash generation, embedding, and verification for document integrity checking.

    From: /tf/active/vicechatdev/document_auditor/src/security/hash_generator.py
  • class DocumentProcessor 55.8% similar

    A comprehensive document processing class that converts documents to PDF, adds audit trails, applies security features (watermarks, signatures, hashing), and optionally converts to PDF/A format with document protection.

    From: /tf/active/vicechatdev/document_auditor/src/document_processor.py