function add_table_to_pdf

Maturity: 49

Adds a formatted table to a ReportLab PDF document with automatic text wrapping, column width calculation, and alternating row colors.

File: /tf/active/vicechatdev/vice_ai/complex_app.py
Lines: 1639 - 1773
Complexity: moderate

Purpose

This function converts table data into a ReportLab Table flowable and appends it to a PDF story list. It pads short rows so that every row has the same number of cells, divides the available width (6.5 inches) equally among the columns and clamps each column to between 1.0 and 3.0 inches, wraps cell text by rendering every cell as a Paragraph, styles the header row and alternates the data-row background colors, and strips HTML tags before escaping XML special characters. It is intended for generating PDF reports that contain tabular data.
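The column width arithmetic can be sketched without ReportLab. This is a minimal illustration, assuming widths are expressed in points (ReportLab's inch constant equals 72 points). The helper name compute_col_widths and its keyword parameters are hypothetical and are not part of the original module.

```python
POINTS_PER_INCH = 72.0  # same value as reportlab.lib.units.inch


def compute_col_widths(max_cols, available_in=6.5, min_in=1.0, max_in=3.0):
    """Return equal column widths in points, clamped to [min_in, max_in] inches.

    Hypothetical helper that mirrors the arithmetic in add_table_to_pdf.
    """
    if max_cols <= 0:
        return []
    width = available_in * POINTS_PER_INCH / max_cols
    # Clamp the per-column width to the allowed range
    width = max(min_in * POINTS_PER_INCH, min(width, max_in * POINTS_PER_INCH))
    return [width] * max_cols


for n in (2, 4, 8):
    widths = compute_col_widths(n)
    total_in = sum(widths) / POINTS_PER_INCH
    print(f"{n} columns: {widths[0] / POINTS_PER_INCH:.3f} in each, {total_in:.2f} in total")
```

With two columns the 3.0-inch cap leaves the table narrower than 6.5 inches. With eight columns the 1.0-inch floor makes the table 8 inches wide, which is more than the 6.5 inches the function assumes is available.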

Source Code

def add_table_to_pdf(story, table_data):
    """Add a table to PDF document with proper formatting and text wrapping"""
    if not table_data:
        print("DEBUG: No table data provided for PDF")
        return
    
    from reportlab.platypus import Paragraph
    from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle
    
    print(f"DEBUG: Processing PDF table with {len(table_data)} rows")
    
    # Get styles for table cells
    styles = getSampleStyleSheet()
    
    # Create custom styles for table cells
    header_style = ParagraphStyle(
        'TableHeader',
        parent=styles['Normal'],
        fontSize=9,
        fontName='Helvetica-Bold',
        textColor=colors.whitesmoke,
        alignment=0,  # Left alignment
        leading=11,
        leftIndent=0,
        rightIndent=0,
        spaceAfter=0,
        spaceBefore=0
    )
    
    cell_style = ParagraphStyle(
        'TableCell',
        parent=styles['Normal'],
        fontSize=8,
        fontName='Helvetica',
        textColor=colors.black,
        alignment=0,  # Left alignment
        leading=10,
        leftIndent=0,
        rightIndent=0,
        spaceAfter=0,
        spaceBefore=0
    )
    
    # First pass: determine maximum columns
    max_cols = 0
    for row_data in table_data:
        cells = row_data.get('cells', [])
        max_cols = max(max_cols, len(cells))
    
    if max_cols == 0:
        print("DEBUG: No columns found in table data")
        return
    
    # Calculate column widths to fit page
    # Available width: 6.5 inches fits US Letter with 1-inch margins;
    # A4 with 1-inch margins is narrower (about 6.27 inches)
    available_width = 6.5 * inch
    col_width = available_width / max_cols
    
    # Set minimum and maximum column widths
    min_col_width = 1.0 * inch
    max_col_width = 3.0 * inch
    
    if col_width < min_col_width:
        col_width = min_col_width
    elif col_width > max_col_width:
        col_width = max_col_width
    
    # Create column widths list
    col_widths = [col_width] * max_cols
    
    # Prepare table data for ReportLab with Paragraph objects for text wrapping
    pdf_table_data = []
    
    for row_idx, row_data in enumerate(table_data):
        cells = row_data.get('cells', [])
        
        # Convert cells to Paragraph objects for proper text wrapping
        formatted_cells = []
        for col_idx, cell in enumerate(cells):
            cell_text = str(cell).strip()
            
            # Remove HTML tags from cell content
            cell_text = clean_html_tags(cell_text)
            
            # Escape XML characters for ReportLab
            cell_text = cell_text.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')
            
            # Create Paragraph object for proper text wrapping
            if row_idx == 0:  # Header row
                paragraph = Paragraph(cell_text, header_style)
            else:  # Data rows
                paragraph = Paragraph(cell_text, cell_style)
            
            formatted_cells.append(paragraph)
        
        # Pad row to match max columns
        while len(formatted_cells) < max_cols:
            if row_idx == 0:  # Header row
                formatted_cells.append(Paragraph('', header_style))
            else:  # Data rows
                formatted_cells.append(Paragraph('', cell_style))
        
        pdf_table_data.append(formatted_cells)
        print(f"DEBUG: PDF row {row_idx}: {len(formatted_cells)} cells")
    
    if not pdf_table_data:
        print("DEBUG: No valid table data for PDF")
        return
    
    # Create ReportLab Table with specified column widths
    table = Table(pdf_table_data, colWidths=col_widths, repeatRows=1)
    
    # Define table style
    table_style = TableStyle([
        # Basic table formatting
        ('BACKGROUND', (0, 0), (-1, 0), colors.grey),  # Header row background
        ('ALIGN', (0, 0), (-1, -1), 'LEFT'),  # Left align all cells
        ('VALIGN', (0, 0), (-1, -1), 'TOP'),  # Top vertical alignment
        
        # Data rows formatting
        ('ROWBACKGROUNDS', (0, 1), (-1, -1), [colors.beige, colors.white]),  # Alternating colors
        
        # Grid and borders
        ('GRID', (0, 0), (-1, -1), 0.5, colors.black),  # Grid lines
        ('LEFTPADDING', (0, 0), (-1, -1), 6),  # Cell padding
        ('RIGHTPADDING', (0, 0), (-1, -1), 6),
        ('TOPPADDING', (0, 0), (-1, -1), 4),
        ('BOTTOMPADDING', (0, 0), (-1, -1), 4),
    ])
    
    table.setStyle(table_style)
    
    print(f"DEBUG: PDF table created successfully with {max_cols} columns, width {col_width}")
    story.append(table)
    story.append(Spacer(1, 12))  # Add space after table

Parameters

Name        Type  Default  Kind
story       -     -        positional_or_keyword
table_data  -     -        positional_or_keyword

Parameter Details

story: A ReportLab story list (typically from SimpleDocTemplate) where the table will be appended. This is a list that accumulates flowable elements (Paragraphs, Tables, Spacers) that will be rendered into the PDF document.

table_data: A list of dictionaries where each dictionary represents a row with a 'cells' key containing a list of cell values. Format: [{'cells': ['col1', 'col2', ...]}, ...]. The first row is treated as the header. Rows may have different numbers of cells; shorter rows are padded with empty cells, and each cell value is converted with str(). Can be empty or None, in which case the function returns early without adding anything.
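Data that starts out as plain lists of values has to be wrapped in this dictionary format first. The following is a small sketch of one way to do that. Both helper names (rows_to_table_data and max_columns) are hypothetical and do not exist in the original module.

```python
def rows_to_table_data(rows):
    """Wrap plain row sequences in the {'cells': [...]} dicts the function expects.

    Hypothetical helper; not part of the original module.
    """
    return [{'cells': list(row)} for row in rows]


def max_columns(table_data):
    """Length of the widest row, computed like the function's first pass."""
    return max((len(row.get('cells', [])) for row in table_data), default=0)


rows = [
    ['Name', 'Age', 'City'],  # first row becomes the header
    ['John Doe', '30'],       # short row: add_table_to_pdf pads it with empty cells
]
table_data = rows_to_table_data(rows)
print(table_data)
print(max_columns(table_data))
```

A result of 0 from max_columns corresponds to the case in which add_table_to_pdf returns early because no row contains any cells.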

Return Value

Returns None. The function modifies the 'story' parameter in-place by appending a Table object followed by a 12-point Spacer. If table_data is empty or None, or if no row contains any cells, the function returns early without modifying the story.

Dependencies

  • reportlab

Required Imports

from reportlab.platypus import Table, TableStyle, Spacer
from reportlab.lib import colors
from reportlab.lib.units import inch

Conditional/Optional Imports

These imports are performed inside the function body, so a module-level import is optional:

from reportlab.platypus import Paragraph

Condition: imported inside the function for text wrapping in table cells (required)

from reportlab.lib.styles import getSampleStyleSheet, ParagraphStyle

Condition: imported inside the function for creating custom cell styles (required)

Usage Example

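import re

# Table, TableStyle, Spacer, colors and inch are used as module-level
# names inside add_table_to_pdf, so they must be imported here as well
from reportlab.platypus import SimpleDocTemplate, Table, TableStyle, Spacer
from reportlab.lib.pagesizes import A4
from reportlab.lib import colors
from reportlab.lib.units import inch

# Helper function (must be defined in the same module as add_table_to_pdf)
def clean_html_tags(text):
    return re.sub(r'<[^>]+>', '', text)

# add_table_to_pdf itself (see Source Code) must be defined at this point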

# Create PDF document
pdf_file = 'output.pdf'
doc = SimpleDocTemplate(pdf_file, pagesize=A4)
story = []

# Prepare table data
table_data = [
    {'cells': ['Name', 'Age', 'City']},  # Header row
    {'cells': ['John Doe', '30', 'New York']},
    {'cells': ['Jane Smith', '25', 'Los Angeles']},
    {'cells': ['Bob Johnson', '35', 'Chicago']}
]

# Add table to PDF
add_table_to_pdf(story, table_data)

# Build PDF
doc.build(story)

Best Practices

  • Ensure the 'clean_html_tags' function is defined in the same module before calling this function, along with module-level imports of Table, TableStyle, Spacer, colors, and inch
  • The first row in table_data is always treated as the header row with bold styling and grey background
  • Table data should be structured as a list of dictionaries with 'cells' keys containing lists of cell values
  • The function automatically pads rows with fewer columns to match the maximum column count
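  • All columns get the same width: 6.5 inches divided by the column count, clamped to between 1.0 and 3.0 inches. With more than six columns the 1.0-inch minimum makes the table wider than 6.5 inches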
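  • HTML tags are stripped from cell content by clean_html_tags, after which the characters &, <, and > are escaped for ReportLab's Paragraph markup. Content that already contains entities such as &amp; is escaped a second time unless clean_html_tags decodes it first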
  • Debug print statements are included; consider removing or replacing with proper logging in production
  • The function modifies the story list in-place, so pass the same story object to multiple calls to build a complete document
  • The Table is created with repeatRows=1, so the header row is repeated on each page when the table spans a page break
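The per-cell sanitization order (stringify and trim, strip tags, then escape) can be sketched with the standard library alone. In this sketch clean_html_tags is an assumed regex-based stand-in, since the real helper in the module is not shown, and sanitize_cell is a hypothetical name. xml.sax.saxutils.escape is used because by default it escapes exactly &, <, and >, which matches the chained replace calls in the source.

```python
import re
from xml.sax.saxutils import escape


def clean_html_tags(text):
    """Assumed stand-in: remove anything shaped like <...>."""
    return re.sub(r'<[^>]+>', '', text)


def sanitize_cell(cell):
    """Stringify, trim, strip tags, then escape &, < and > (same order as the source)."""
    text = str(cell).strip()
    text = clean_html_tags(text)
    return escape(text)  # escapes only &, < and >


print(sanitize_cell('  <b>R&D</b> budget  '))
print(sanitize_cell('AT&amp;T'))  # an existing entity gets escaped again
print(sanitize_cell(42))
```

The second call shows the double-escaping case: input that is already entity-encoded comes out as AT&amp;amp;T, so cell values are best passed in as unescaped text.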

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function add_table_to_pdf_v1 92.8% similar

    Adds a formatted table to a PDF document story with proper text wrapping, styling, and header formatting using ReportLab's platypus components.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function add_formatted_content_to_pdf 76.0% similar

    Processes markdown elements and adds them to a PDF document story with appropriate formatting, handling headers, paragraphs, lists, and tables.

    From: /tf/active/vicechatdev/vice_ai/complex_app.py
  • function add_formatted_content_to_pdf_v1 75.7% similar

    Converts processed markdown elements into formatted PDF content by adding paragraphs, headers, lists, and tables to a ReportLab story object with appropriate styling.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function add_table_to_word 68.6% similar

    Adds a formatted table to a Word document using python-docx, with support for header rows and automatic column sizing based on table data.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function add_table_to_word_v1 67.1% similar

    Adds a formatted table to a Microsoft Word document using the python-docx library, with automatic column detection, header row styling, and debug logging.

    From: /tf/active/vicechatdev/vice_ai/complex_app.py