PDF - Code Extractor

class PDF

Maturity: 42

A custom PDF generation class that extends FPDF to create formatted PDF documents with titles, text, horizontal rules, and tables from pandas DataFrames.

File:
/tf/active/vicechatdev/resources/reports.py

Lines:
16 - 121

Complexity:
complex

Purpose

This class provides specialized methods for generating PDF reports with structured content. It handles complex table rendering including multi-indexed DataFrames, automatic column splitting for wide tables, and multi-line cell content. The class is designed for creating data-driven PDF reports with consistent formatting and layout management.

Source Code

class PDF(FPDF):
    def titles(self, title):
        # Add title at top of the page
        self.set_xy(10.0, 10.0)
        self.set_font('Arial', 'B', 14)
        for txt in title:
            self.cell(0, 10, txt)
            self.ln()
        self.horizontal_rule()
        self.ln()
        
    def add_text(self, txt):
        self.set_font('Arial', '', 12)
        self.write(10, txt)
            
    def horizontal_rule(self):
        self.set_line_width(0.0)  # Set line width (optional)
        self.set_draw_color(0, 0, 0)  # Set line color (RGB values, optional)
        self.line(10.0, self.get_y(), self.w - 10.0, self.get_y())  # Draw horizontal line
        
    def table(self, df, index_label=None):
        num_cols = df.shape[1]
        if num_cols > 6: #too wide for page so we split on the columns
            num_splits = num_cols // 6 #get number of chunks for chunk size of 6
            chunk_sizes = [6]*num_splits #define chunk sizes
            remainder = num_cols % 6 #if there's a remainder
            if remainder:
                chunk_sizes.append(remainder) #add remainder size
            #then we slice our chunks from the df and gather it in a list we can loop over
            df_chunks = [df.iloc[:, 0+sum(chunk_sizes[:i]):0+sum(chunk_sizes[:i])+size] for i, size in enumerate(chunk_sizes)]
        else:
            #if no df chunks, the entire df is one chunk
            df_chunks = [df]
        self.set_xy(10.0, 50.0)
        self.set_font('Arial', '', 12)
        page_width = self.w - 2 * self.l_margin
        for df in df_chunks:
            if isinstance(df.columns[0], tuple): #specific for our multi indexed friends
                groups=[]
                sizes=[]
                # if index_label:
                #     groups.append('')
                #     sizes.append('')
                for col in df.columns:
                    if not col[0] in groups:
                        groups.append(col[0])
                        sizes.append(col[1])
                col_width = page_width/(len(df.columns)+1) if index_label else page_width/len(df.columns)
                hdr_col_width = (page_width - col_width) /len(groups) if index_label else page_width/len(groups)
                if index_label:
                    self.cell(col_width, 10, '', border=1) #empty small cell
                for i in groups:
                    self.cell(hdr_col_width, 10, i, border=1)
                self.ln()
                if index_label:
                    self.cell(col_width, 10, 'Group Size', border=1) #label cell for the group-size row
                for i in sizes:
                    self.cell(hdr_col_width, 10, i, border=1)
                self.ln()
                if index_label:
                    self.cell(col_width, 10, index_label, border=1)
                for col in df.columns:
                    self.cell(col_width, 10, col[-1], border=1)
            else:
                col_width = page_width/len(df.columns)
                hdr_col_width = col_width
                for i, col in enumerate(df.columns):
                    self.cell(col_width, 10, col, border=1)
            self.ln()
            for row in df.itertuples():
                is_index=True
                for value in row if index_label else row[1:]:
                    if isinstance(value, float):
                        value = round(value, 2)
                    self.cell(col_width, 10, str(value), border=1)
                    is_index=False
                self.ln()
            self.ln(20)
            
    def multi_cell_table(self, df):
        page_width = self.w - 2 * self.l_margin
        col_width = page_width/(len(df.columns))
        self.set_font('Arial','B',12)
        for col in df.columns:
            self.cell(col_width, 10, col, border=1)
        self.ln()
        self.set_font('Arial','',12)
        for row in df.itertuples():
            longest_cell = max(row[1:], key=len)
            cell_margin = 2 #cell has margins
            offset = 0.5 #get string width is not entirely accurate and tends to underestimate, got to round up rather aggressively, so we offset + math ceil
            cell_lines = math.ceil(self.get_string_width(longest_cell) / (col_width - cell_margin) + offset)
            cell_lines = max(cell_lines, 2)
            x = self.get_x()
            y = self.get_y()
            if ((cell_lines+1) * 10 + y) > self.page_break_trigger: #if we approach page break, we manually break
                self.add_page()
                y = 10
            for i, value in enumerate(row[1:]):
                if value == longest_cell:
                    newlines = 0
                else:
                    newlines = math.ceil(cell_lines - (self.get_string_width(value) / (col_width - cell_margin)))
                self.set_y(y)
                self.set_x(x + col_width * i)
                self.multi_cell(w=col_width, h=10, txt=value + (" \n "*(newlines-1)), border=1)

Parameters

Name	Type	Default	Kind
`bases`	FPDF	-

Parameter Details

bases: The class inherits from FPDF, which provides the base PDF generation functionality. PDF defines no __init__ of its own, so instantiation uses the FPDF constructor, which accepts optional parameters such as orientation ('P' or 'L'), unit ('mm', 'cm', 'in'), and format ('A4', 'Letter', etc.).

Return Value

Instantiation returns a PDF object that can be used to build PDF documents. The methods titles, add_text, horizontal_rule, table, and multi_cell_table all return None; they work by side effect, modifying the internal PDF state.

Class Interface

Methods

`titles(self, title: list) -> None`

Purpose: Adds formatted title(s) at the top of the page with bold Arial 14pt font, followed by a horizontal rule

Parameters:

title: A list of strings where each string is rendered as a separate title line

Returns: None - modifies the PDF state by adding title content

`add_text(self, txt: str) -> None`

Purpose: Adds plain text content to the PDF using Arial 12pt font

Parameters:

txt: The text string to add to the PDF document

Returns: None - modifies the PDF state by adding text content

`horizontal_rule(self) -> None`

Purpose: Draws a horizontal line across the page at the current Y position

Returns: None - modifies the PDF state by drawing a line

`table(self, df: pd.DataFrame, index_label: str = None) -> None`

Purpose: Renders a pandas DataFrame as a formatted table in the PDF, with automatic column splitting for wide tables and support for multi-indexed columns

Parameters:

df: A pandas DataFrame to render as a table. Supports both regular and multi-indexed column DataFrames
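index_label: Optional label for the index column. If provided, each data row starts with its index value. The label itself is only written into the header for multi-indexed columns; with flat columns the header gets no index cell, so the header and the data rows do not line up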

Returns: None - modifies the PDF state by adding table content. Automatically splits tables wider than 6 columns into multiple tables
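
The column-splitting step can be looked at separately from the drawing code. The following is a minimal sketch of the same arithmetic in plain Python. `chunk_bounds` is a hypothetical helper name that does not exist in reports.py; the chunk size of 6 is taken from the source above.

```python
def chunk_bounds(num_cols, chunk_size=6):
    """Return (start, stop) column positions for each table chunk.

    Hypothetical helper, not part of reports.py: it mirrors the
    arithmetic table() performs before slicing with df.iloc[:, start:stop].
    """
    if num_cols <= chunk_size:
        # Narrow enough: the whole DataFrame is a single chunk.
        return [(0, num_cols)]
    return [(start, min(start + chunk_size, num_cols))
            for start in range(0, num_cols, chunk_size)]


print(chunk_bounds(14))  # [(0, 6), (6, 12), (12, 14)]
print(chunk_bounds(4))   # [(0, 4)]
```

Each pair corresponds to one `df.iloc[:, start:stop]` slice, so a 14-column DataFrame is drawn as three consecutive tables of 6, 6, and 2 columns.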

`multi_cell_table(self, df: pd.DataFrame) -> None`

Purpose: Renders a pandas DataFrame as a table with multi-line cell support, automatically wrapping long text content and handling page breaks

Parameters:

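df: A pandas DataFrame to render as a table with multi-line cell support. Every cell value must be a string, because the method calls len() on the values of each row to find the longest cell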

Returns: None - modifies the PDF state by adding table content with wrapped cells. Automatically handles page breaks when content exceeds page height
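
The row-height estimate and the manual page-break check in multi_cell_table() come down to two small formulas. The sketch below restates them as standalone functions. The function names are hypothetical, and the widths and positions are illustrative numbers rather than values measured from a real document; in the class, the text width comes from get_string_width().

```python
import math


def estimate_lines(text_width, col_width, cell_margin=2, offset=0.5):
    """Estimate how many lines the longest cell of a row needs (minimum 2).

    Mirrors the source: the offset pads the estimate before rounding up.
    """
    lines = math.ceil(text_width / (col_width - cell_margin) + offset)
    return max(lines, 2)


def needs_page_break(cell_lines, y, page_break_trigger, line_height=10):
    """True when the row plus one spare line would pass the break trigger."""
    return (cell_lines + 1) * line_height + y > page_break_trigger


lines = estimate_lines(text_width=100.0, col_width=47.5)
print(lines)                                  # 3
print(needs_page_break(lines, 250.0, 277.0))  # True
print(needs_page_break(lines, 200.0, 277.0))  # False
```

When the check is true, the method calls add_page() and restarts the row at y = 10.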

Attributes

Name	Type	Description	Scope
`w`	float	Inherited from FPDF - the width of the page in the current unit	instance
`l_margin`	float	Inherited from FPDF - the left margin of the page in the current unit	instance
`page_break_trigger`	float	Inherited from FPDF - the Y position threshold that triggers an automatic page break	instance
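
As a rough illustration of how `w` and `l_margin` feed into the table layout, the sketch below repeats the width calculations from table() for multi-indexed columns. The function names are hypothetical, and the page width of 210 and margin of 10 are assumed example values (roughly an A4 page in millimetres), not values read from the class.

```python
def usable_width(page_w, l_margin):
    """Width available for a table: page width minus left and right margin."""
    return page_w - 2 * l_margin


def multiindex_widths(page_width, num_cols, num_groups, with_index):
    """Return (col_width, hdr_col_width) as table() computes them.

    With an index column, one extra cell of col_width is reserved and the
    group header cells share the remaining width.
    """
    if with_index:
        col_width = page_width / (num_cols + 1)
        hdr_col_width = (page_width - col_width) / num_groups
    else:
        col_width = page_width / num_cols
        hdr_col_width = page_width / num_groups
    return col_width, hdr_col_width


page_width = usable_width(210.0, 10.0)
col_w, hdr_w = multiindex_widths(page_width, num_cols=4, num_groups=2, with_index=True)
print(page_width, col_w, hdr_w)  # 190.0 38.0 76.0
```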

Dependencies

fpdf
pandas
math

Required Imports

from fpdf import FPDF
import pandas as pd
import math

Usage Example

import pandas as pd
from fpdf import FPDF
import math

# Instantiate the PDF class (the PDF class from the Source Code section must already be defined or imported)
pdf = PDF()
pdf.add_page()

# Add titles
pdf.titles(['Report Title', 'Subtitle'])

# Add text content
pdf.add_text('This is some text content in the PDF.')

# Move below the text line, then add a horizontal rule
pdf.ln(10)
pdf.horizontal_rule()

# Create a DataFrame and add it as a table
df = pd.DataFrame({
    'Column1': [1, 2, 3],
    'Column2': ['A', 'B', 'C'],
    'Column3': [10.5, 20.3, 30.7]
})
pdf.table(df)  # flat columns: index_label is omitted so the header and data rows stay aligned

# Save the PDF
pdf.output('report.pdf')

Best Practices

Always call add_page() before adding content to the PDF
Use titles() method at the beginning of each page for consistent header formatting
For wide DataFrames (>6 columns), the table() method automatically splits them into multiple tables
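When using multi-indexed DataFrames, ensure the first level of the column index contains the group names, the second level the value to show in the 'Group Size' row, and the last level the column label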
The multi_cell_table() method is better for tables with long text content that needs wrapping
Call output() method at the end to save the PDF file
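Be aware that table() always moves the cursor to (10.0, 50.0) before drawing, so content already placed at or below y = 50 on the current page will be overprinted
The class uses Arial throughout; FPDF treats Arial as an alias of its built-in Helvetica core font, so no font file has to be added
For tables with multi-indexed columns, pass the index_label parameter to include the index column in the output
multi_cell_table() checks for a page break manually before each row; table() has no such check and relies on FPDF's automatic page break, and neither method repeats the header row on a new page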
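
The multi-index handling in table() can be illustrated without pandas, since iterating the columns of a multi-indexed DataFrame yields tuples. The sketch below is a minimal restatement of the header-collection loop. The helper name and the column labels are made up for illustration, and the three-level layout (group, group size, column label) is an assumption inferred from how the source reads col[0], col[1], and col[-1].

```python
def header_groups(columns):
    """Collect first-level group names and their second-level entries, in order.

    Hypothetical helper mirroring the loop in table(): a group's second-level
    value is taken from the first column in which the group appears.
    """
    groups, sizes = [], []
    for col in columns:
        if col[0] not in groups:
            groups.append(col[0])
            sizes.append(col[1])
    return groups, sizes


# Made-up column tuples standing in for df.columns of a multi-indexed DataFrame.
columns = [
    ("Control", "n=3", "mean"),
    ("Control", "n=3", "sd"),
    ("Treated", "n=5", "mean"),
    ("Treated", "n=5", "sd"),
]
groups, sizes = header_groups(columns)
print(groups)                        # ['Control', 'Treated']
print(sizes)                         # ['n=3', 'n=5']
print([col[-1] for col in columns])  # ['mean', 'sd', 'mean', 'sd']
```

The three printed lists correspond to the three header rows that table() draws for multi-indexed columns. The labels here are strings, since the source passes them to cell() without converting them.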

Similar Components

AI-powered semantic similarity - components with related functionality:

function add_table_to_pdf 64.0% similar

Adds a formatted table to a ReportLab PDF document with automatic text wrapping, column width calculation, and alternating row colors.
From: /tf/active/vicechatdev/vice_ai/complex_app.py

function reagents_report 62.2% similar

Generates a PDF report for reagents audit log data, including title, requester information, date, and a table of the provided dataframe.
From: /tf/active/vicechatdev/resources/reports.py

function add_table_to_pdf_v1 62.2% similar

Adds a formatted table to a PDF document story with proper text wrapping, styling, and header formatting using ReportLab's platypus components.
From: /tf/active/vicechatdev/vice_ai/new_app.py

class PDFGenerator_v1 61.1% similar

PDF document generation for reports and controlled documents. This class provides methods to generate PDF documents from scratch, including audit reports, document covers, and certificate pages.
From: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py

class HybridPDFGenerator 58.9% similar

A class that generates hybrid PDF documents combining formatted text content with embedded graphics, optimized for e-ink displays.
From: /tf/active/vicechatdev/e-ink-llm/hybrid_pdf_generator.py


