
class PDF

Maturity: 42

A custom PDF generation class that extends FPDF to create formatted PDF documents with titles, text, horizontal rules, and tables from pandas DataFrames.

File: /tf/active/vicechatdev/resources/reports.py
Lines: 16 - 121
Complexity: complex

Purpose

This class provides specialized methods for generating PDF reports with structured content. It handles complex table rendering including multi-indexed DataFrames, automatic column splitting for wide tables, and multi-line cell content. The class is designed for creating data-driven PDF reports with consistent formatting and layout management.

Source Code

class PDF(FPDF):
    def titles(self, title):
        # Add title at top of the page
        self.set_xy(10.0, 10.0)
        self.set_font('Arial', 'B', 14)
        for txt in title:
            self.cell(0, 10, txt)
            self.ln()
        self.horizontal_rule()
        self.ln()
        
    def add_text(self, txt):
        self.set_font('Arial', '', 12)
        self.write(10, txt)  # write the caller's text with a line height of 10
            
    def horizontal_rule(self):
        self.set_line_width(0.0)  # Set line width (optional)
        self.set_draw_color(0, 0, 0)  # Set line color (RGB values, optional)
        self.line(10.0, self.get_y(), self.w - 10.0, self.get_y())  # Draw horizontal line
        
    def table(self, df, index_label=None):
        num_cols = df.shape[1]
        if num_cols > 6: #too wide for page so we split on the columns
            num_splits = num_cols // 6 #get number of chunks for chunk size of 6
            chunk_sizes = [6]*num_splits #define chunk sizes
            remainder = num_cols % 6 #if there's a remainder
            if remainder:
                chunk_sizes.append(remainder) #add remainder size
            #then we slice our chunks from the df and gather it in a list we can loop over
            df_chunks = [df.iloc[:, 0+sum(chunk_sizes[:i]):0+sum(chunk_sizes[:i])+size] for i, size in enumerate(chunk_sizes)]
        else:
            #if no df chunks, the entire df is one chunk
            df_chunks = [df]
        self.set_xy(10.0, 50.0)
        self.set_font('Arial', '', 12)
        page_width = self.w - 2 * self.l_margin
        for df in df_chunks:
            if isinstance(df.columns[0], tuple): #specific for our multi indexed friends
                groups=[]
                sizes=[]
                # if index_label:
                #     groups.append('')
                #     sizes.append('')
                for col in df.columns:
                    if not col[0] in groups:
                        groups.append(col[0])
                        sizes.append(col[1])
                col_width = page_width/(len(df.columns)+1) if index_label else page_width/len(df.columns)
                hdr_col_width = (page_width - col_width) /len(groups) if index_label else page_width/len(groups)
                if index_label:
                    self.cell(col_width, 10, '', border=1) #empty small cell
                for i in groups:
                    self.cell(hdr_col_width, 10, i, border=1)
                self.ln()
                if index_label:
                    self.cell(col_width, 10, 'Group Size', border=1) #row label for the group-size header row
                for i in sizes:
                    self.cell(hdr_col_width, 10, i, border=1)
                self.ln()
                if index_label:
                    self.cell(col_width, 10, index_label, border=1)
                for col in df.columns:
                    self.cell(col_width, 10, col[-1], border=1)
            else:
                col_width = page_width/len(df.columns)
                hdr_col_width = col_width
                for i, col in enumerate(df.columns):
                    self.cell(col_width, 10, col, border=1)
            self.ln()
            for row in df.itertuples():
                is_index=True
                for value in row if index_label else row[1:]:
                    if isinstance(value, float):
                        value = round(value, 2)
                    self.cell(col_width, 10, str(value), border=1)
                    is_index=False
                self.ln()
            self.ln(20)
            
    def multi_cell_table(self, df):
        page_width = self.w - 2 * self.l_margin
        col_width = page_width/(len(df.columns))
        self.set_font('Arial','B',12)
        for col in df.columns:
            self.cell(col_width, 10, col, border=1)
        self.ln()
        self.set_font('Arial','',12)
        for row in df.itertuples():
            longest_cell = max(row[1:], key=len)
            cell_margin = 2 #cell has margins
            offset = 0.5 #get string width is not entirely accurate and tends to underestimate, got to round up rather aggressively, so we offset + math ceil
            cell_lines = math.ceil(self.get_string_width(longest_cell) / (col_width - cell_margin) + offset)
            cell_lines = max(cell_lines, 2)
            x = self.get_x()
            y = self.get_y()
            if ((cell_lines+1) * 10 + y) > self.page_break_trigger: #if we approach page break, we manually break
                self.add_page()
                y = 10
            for i, value in enumerate(row[1:]):
                if value == longest_cell:
                    newlines = 0
                else:
                    newlines = math.ceil(cell_lines - (self.get_string_width(value) / (col_width - cell_margin)))
                self.set_y(y)
                self.set_x(x + col_width * i)
                self.multi_cell(w=col_width, h=10, txt=value + (" \n "*(newlines-1)), border=1)

Parameters

Name   Type  Default  Kind
bases  FPDF  -

Parameter Details

bases: The class inherits from FPDF, which provides the base PDF generation functionality. PDF defines no __init__ of its own, so it is constructed through the FPDF constructor, which accepts optional parameters such as orientation ('P' or 'L'), unit ('pt', 'mm', 'cm', 'in'), and format ('A4', 'Letter', etc.).

Return Value

Instantiation returns a PDF object that can be used to build PDF documents. All five methods (titles, add_text, horizontal_rule, table, and multi_cell_table) return None; they work by side effect, modifying the internal PDF state.

Class Interface

Methods

titles(self, title: list) -> None

Purpose: Writes each title line at the top of the page (starting at x=10, y=10) in bold 14 pt Arial, then draws a horizontal rule

Parameters:

  • title: A list of strings where each string is rendered as a separate title line

Returns: None - modifies the PDF state by adding title content

add_text(self, txt: str) -> None

Purpose: Adds plain text content to the PDF using Arial 12pt font

Parameters:

  • txt: The text string to add to the PDF document

Returns: None - modifies the PDF state by adding text content

horizontal_rule(self) -> None

Purpose: Draws a horizontal line across the page at the current Y position

Returns: None - modifies the PDF state by drawing a line

table(self, df: pd.DataFrame, index_label: str = None) -> None

Purpose: Renders a pandas DataFrame as a formatted table in the PDF, with automatic column splitting for wide tables and support for multi-indexed columns

Parameters:

  • df: A pandas DataFrame to render as a table. Supports both regular and multi-indexed column DataFrames
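  • index_label: Optional label for the index column. If provided, each row is prefixed with its index value. The label itself is drawn as a header cell only when the columns are multi-indexed; the flat-column branch adds no header cell for the index and does not adjust the column width for it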

Returns: None - modifies the PDF state by adding table content. Automatically splits tables wider than 6 columns into multiple tables
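
The column splitting described above can be sketched without pandas or FPDF. This is a minimal illustration, not the class's own code: `column_chunks` is a hypothetical helper name, and it returns the start/stop column positions that table() would slice with `df.iloc[:, start:stop]`.

```python
def column_chunks(num_cols, chunk_size=6):
    """Return (start, stop) column positions, at most chunk_size columns each.

    Hypothetical helper that mirrors the splitting rule in table():
    full chunks of chunk_size columns, followed by one chunk for the remainder.
    """
    return [
        (start, min(start + chunk_size, num_cols))
        for start in range(0, num_cols, chunk_size)
    ]


# A 14-column frame would be rendered as three consecutive tables.
print(column_chunks(14))  # [(0, 6), (6, 12), (12, 14)]
```

A frame with 6 or fewer columns yields a single chunk, which matches the branch in table() that keeps the whole DataFrame as one table.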

multi_cell_table(self, df: pd.DataFrame) -> None

Purpose: Renders a pandas DataFrame as a table with multi-line cell support, automatically wrapping long text content and handling page breaks

Parameters:

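  • df: A pandas DataFrame to render as a table with multi-line cell support. Every cell value must be a string, because the method calls len() on the values and concatenates padding text to them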

Returns: None - modifies the PDF state by adding table content with wrapped cells. Before each row, it starts a new page manually if the estimated row height would pass page_break_trigger
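
The row-height estimate and the manual page-break check in multi_cell_table() can be isolated as plain arithmetic. The sketch below assumes the rendered text width is already known (the class obtains it from get_string_width()); the function names and the sample numbers are illustrative assumptions, not part of the class.

```python
import math


def estimate_cell_lines(text_width, col_width, cell_margin=2, offset=0.5, min_lines=2):
    """Estimate how many lines a cell needs, using the same formula as the source.

    text_width is the rendered width of the longest cell in the row.
    The offset compensates for width underestimation, as the source comment notes.
    """
    usable_width = col_width - cell_margin
    return max(math.ceil(text_width / usable_width + offset), min_lines)


def needs_page_break(cell_lines, y, page_break_trigger, line_height=10):
    """Return True when the row, plus one spare line, would pass the trigger."""
    return (cell_lines + 1) * line_height + y > page_break_trigger


# Illustrative values only: a 52-unit column and a trigger at y = 277.
lines = estimate_cell_lines(text_width=120, col_width=52)
print(lines)                              # 3
print(needs_page_break(lines, 250, 277))  # True
print(needs_page_break(lines, 100, 277))  # False
```

Short cells still get two lines because of the `max(cell_lines, 2)` floor in the source, so every row drawn by this method is at least two line heights tall.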

Attributes

Name                Type   Description                                                                           Scope
w                   float  Inherited from FPDF - the width of the page in the current unit                       instance
l_margin            float  Inherited from FPDF - the left margin of the page in the current unit                 instance
page_break_trigger  float  Inherited from FPDF - the Y position threshold that triggers an automatic page break  instance
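
Both table methods derive their column widths from w and l_margin. The following is a small sketch of that calculation with made-up page dimensions; `column_width` is a hypothetical helper, and its `with_index` flag mirrors the multi-indexed branch of table(), which reserves one extra column for the index.

```python
def column_width(page_w, l_margin, n_cols, with_index=False):
    """Split the usable page width evenly across the columns.

    The usable width is page_w - 2 * l_margin, as in the source, which
    assumes the right margin equals the left margin.
    """
    usable_width = page_w - 2 * l_margin
    return usable_width / (n_cols + 1 if with_index else n_cols)


# Illustrative numbers: a 210-unit-wide page with 10-unit margins.
print(column_width(210, 10, 4))                   # 47.5
print(column_width(210, 10, 4, with_index=True))  # 38.0
```

Because the width is divided evenly, long column names or values are not given extra room; they are drawn in a cell of this fixed width.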

Dependencies

  • fpdf
  • pandas
  • math

Required Imports

from fpdf import FPDF
import pandas as pd
import math

Usage Example

import pandas as pd
from fpdf import FPDF
import math

# Instantiate the PDF class (assumes the PDF class from the Source Code section is defined or imported in this scope)
pdf = PDF()
pdf.add_page()

# Add titles
pdf.titles(['Report Title', 'Subtitle'])

# Add text content
pdf.add_text('This is some text content in the PDF.')

# Add a horizontal rule
pdf.horizontal_rule()

# Create a DataFrame and add it as a table
df = pd.DataFrame({
    'Column1': [1, 2, 3],
    'Column2': ['A', 'B', 'C'],
    'Column3': [10.5, 20.3, 30.7]
})
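# index_label is omitted here: with flat (non-multi-indexed) columns it would add
# an index cell to every row without a matching header cell
pdf.table(df)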

# Save the PDF
pdf.output('report.pdf')

Best Practices

  • Always call add_page() before adding content to the PDF
  • Use the titles() method at the beginning of each page for consistent header formatting
  • For wide DataFrames (>6 columns), the table() method automatically splits them into multiple tables of at most 6 columns each
  • When using multi-indexed DataFrames, table() reads level 0 of each column tuple as the group name, level 1 as the group size, and the last level as the column label
  • The multi_cell_table() method is better for tables with long text content that needs wrapping; every cell value must be a string
  • Call the output() method at the end to save the PDF file
  • Be aware that table() always moves the cursor to (10.0, 50.0) before drawing, so it can overdraw content already placed below that point
  • All methods hard-code the Arial font family, which FPDF maps to its built-in Helvetica core font
  • Pass index_label to include the index values in each row; the label itself gets a header cell only when the columns are multi-indexed
  • multi_cell_table() checks page_break_trigger explicitly before each row; table() has no such check and relies on FPDF's automatic page break
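
To make the multi-index expectation in the list above concrete, here is a minimal sketch of how table() derives its three header rows from tuple column keys. The helper name `header_rows` and the sample column tuples are assumptions for illustration; the loop itself follows the source. String values are used for every level because the source passes them straight to cell().

```python
def header_rows(columns):
    """Derive the three header rows table() draws for multi-indexed columns.

    columns is a sequence of tuples: level 0 is the group name, level 1 the
    group size, and the last level the column label.
    """
    groups, sizes = [], []
    for col in columns:
        if col[0] not in groups:  # first time this group is seen
            groups.append(col[0])
            sizes.append(col[1])
    labels = [col[-1] for col in columns]
    return groups, sizes, labels


# Hypothetical column index with two groups of two columns each.
columns = [
    ("Control", "n=10", "mean"),
    ("Control", "n=10", "sd"),
    ("Treated", "n=12", "mean"),
    ("Treated", "n=12", "sd"),
]
groups, sizes, labels = header_rows(columns)
print(groups)  # ['Control', 'Treated']
print(sizes)   # ['n=10', 'n=12']
print(labels)  # ['mean', 'sd', 'mean', 'sd']
```

Each group header cell receives an equal share of the page width regardless of how many columns belong to the group, so the group row lines up with the label row only when every group has the same number of columns.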

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function add_table_to_pdf 64.0% similar

    Adds a formatted table to a ReportLab PDF document with automatic text wrapping, column width calculation, and alternating row colors.

    From: /tf/active/vicechatdev/vice_ai/complex_app.py
  • function reagents_report 62.2% similar

    Generates a PDF report for reagents audit log data, including title, requester information, date, and a table of the provided dataframe.

    From: /tf/active/vicechatdev/resources/reports.py
  • function add_table_to_pdf_v1 62.2% similar

    Adds a formatted table to a PDF document story with proper text wrapping, styling, and header formatting using ReportLab's platypus components.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • class PDFGenerator_v1 61.1% similar

    PDF document generation for reports and controlled documents This class provides methods to generate PDF documents from scratch, including audit reports, document covers, and certificate pages.

    From: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py
  • class HybridPDFGenerator 58.9% similar

    A class that generates hybrid PDF documents combining formatted text content with embedded graphics, optimized for e-ink displays.

    From: /tf/active/vicechatdev/e-ink-llm/hybrid_pdf_generator.py