function merge_pdfs

Maturity: 57

Merges multiple PDF files into a single consolidated PDF document by delegating to a PDFManipulator instance.

File: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py
Lines: 2126 - 2143
Complexity: simple

Purpose

This function provides a convenient wrapper for combining multiple PDF files into one output file. It's useful for consolidating reports, combining document sections, or aggregating multiple PDF sources into a single deliverable. The function handles the instantiation of the PDFManipulator class and delegates the actual merging operation to it.

Source Code

def merge_pdfs(input_paths: List[str], output_path: str) -> str:
    """
    Merge multiple PDF files into a single document
    
    Parameters
    ----------
    input_paths : List[str]
        List of paths to input PDFs
    output_path : str
        Path where the merged PDF will be saved
        
    Returns
    -------
    str
        Path to the merged PDF
    """
    manipulator = PDFManipulator()
    return manipulator.merge_pdfs(input_paths, output_path)

Parameters

Name          Type        Default   Kind
input_paths   List[str]   -         positional_or_keyword
output_path   str         -         positional_or_keyword

Parameter Details

input_paths: A list of file paths (strings) pointing to the PDF files to merge. The PDFs are merged in the order they appear in the list. Every path must point to a readable PDF file. An empty list or an invalid path may raise an error; the exact behavior depends on the PDFManipulator implementation.

output_path: The file path (string) where the merged PDF will be saved. It should include the filename with a .pdf extension, and its directory must exist and be writable. If a file already exists at this path, it may be overwritten, depending on the PDFManipulator implementation.
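
The constraints above can be checked before the merge is attempted. The following is a minimal sketch, not part of pdf_utils.py: the helper name check_merge_arguments is hypothetical, and the "%PDF-" header test is only a cheap heuristic rather than full PDF validation.

```python
import os
from typing import List


def check_merge_arguments(input_paths: List[str], output_path: str) -> None:
    """Raise early if the arguments are unlikely to merge cleanly.

    Hypothetical helper for illustration; it is not part of pdf_utils.py.
    """
    if not input_paths:
        raise ValueError("input_paths is empty; nothing to merge")

    for path in input_paths:
        if not os.path.isfile(path):
            raise FileNotFoundError(f"input PDF not found: {path}")
        # Heuristic only: PDF files normally begin with the bytes "%PDF-".
        with open(path, "rb") as handle:
            if handle.read(5) != b"%PDF-":
                raise ValueError(f"missing %PDF- header: {path}")

    # The output directory must already exist and be writable.
    out_dir = os.path.dirname(os.path.abspath(output_path))
    if not os.path.isdir(out_dir):
        raise FileNotFoundError(f"output directory does not exist: {out_dir}")
    if not os.access(out_dir, os.W_OK):
        raise PermissionError(f"output directory is not writable: {out_dir}")
```

Calling check_merge_arguments(input_paths, output_path) immediately before merge_pdfs(input_paths, output_path) turns most argument problems into a clear exception instead of a failure inside the merge itself.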

Return Value

Type: str

Returns the path to the merged PDF file as a string. This is typically the same as the output_path argument, confirming that the file was created at the specified location.

Dependencies

  • fitz
  • pikepdf
  • reportlab
  • typing

Required Imports

from typing import List

Usage Example

# Assumes the directory containing the CDocs package is on sys.path;
# the module path is inferred from the file location listed under "File".
from CDocs.utils.pdf_utils import merge_pdfs

# Define input PDF files to merge
input_pdfs = [
    '/path/to/document1.pdf',
    '/path/to/document2.pdf',
    '/path/to/document3.pdf'
]

# Define output path
output_pdf = '/path/to/merged_output.pdf'

# Merge the PDFs
result_path = merge_pdfs(input_pdfs, output_pdf)
print(f'Merged PDF created at: {result_path}')

Best Practices

  • Ensure all input PDF paths exist and are readable before calling this function
  • Verify the output directory exists and has write permissions
  • Consider validating that input files are actually valid PDFs before merging
  • Handle potential exceptions from PDFManipulator.merge_pdfs() method
  • Be aware that the order of PDFs in input_paths determines the order in the merged output
  • Consider checking available disk space before merging large PDF files
  • The function creates a new PDFManipulator instance on every call; when merging several sets of PDFs, consider creating one instance and calling its merge_pdfs() method directly
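
The points about exception handling, disk space, and instance reuse can be combined in one small batch helper. This is a sketch under stated assumptions: the names has_room_for_merge and merge_batches are hypothetical, the free-space estimate (sum of input sizes times a safety factor) is a rough guess, and the broad except clause is used only because the exceptions raised by PDFManipulator.merge_pdfs() are not documented here.

```python
import os
import shutil
from typing import Callable, List, Optional, Sequence, Tuple

# Any callable with the same signature as merge_pdfs(input_paths, output_path).
MergeFn = Callable[[List[str], str], str]


def has_room_for_merge(input_paths: Sequence[str], output_path: str,
                       safety_factor: float = 1.5) -> bool:
    """Rough estimate: the merged file needs about the sum of the input sizes."""
    needed = sum(os.path.getsize(p) for p in input_paths) * safety_factor
    target_dir = os.path.dirname(os.path.abspath(output_path))
    return shutil.disk_usage(target_dir).free >= needed


def merge_batches(
    merge_fn: MergeFn,
    jobs: Sequence[Tuple[Sequence[str], str]],
) -> List[Tuple[str, Optional[str]]]:
    """Run several merges with one callable; return (path, error) per job."""
    results: List[Tuple[str, Optional[str]]] = []
    for input_paths, output_path in jobs:
        try:
            if not has_room_for_merge(input_paths, output_path):
                raise OSError(f"not enough free space for {output_path}")
            merged_path = merge_fn(list(input_paths), output_path)
            results.append((merged_path, None))
        except Exception as exc:  # exception types are not documented; catch broadly
            results.append((output_path, str(exc)))
    return results
```

Passing the bound method of a single instance, for example merge_batches(PDFManipulator().merge_pdfs, jobs), reuses one PDFManipulator across all jobs, while passing merge_pdfs itself keeps the one-instance-per-call behavior.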

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function merge_pdfs_v1 72.0% similar

    Merges multiple PDF files into a single output PDF file with robust error handling and fallback mechanisms.

    From: /tf/active/vicechatdev/msg_to_eml.py
  • function convert_to_pdf 58.4% similar

    Converts a document file to PDF format, automatically generating an output path if not specified.

    From: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py
  • class DocumentMerger 58.2% similar

    A class that merges PDF documents with audit trail pages, combining an original PDF with an audit page and updating metadata to reflect the audit process.

    From: /tf/active/vicechatdev/document_auditor/src/document_merger.py
  • class PDFManipulator 51.5% similar

    Manipulates existing PDF documents. This class provides methods to add watermarks, merge PDFs, extract pages, and perform other manipulation operations.

    From: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py
  • function add_watermark 50.3% similar

    A wrapper function that adds a customizable text watermark to every page of a PDF document with configurable opacity and color.

    From: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py