class Watermarker

Maturity: 52

A class that adds watermark images to PDF documents with configurable opacity, scale, and positioning options.

File: /tf/active/vicechatdev/document_auditor/src/security/watermark.py
Lines: 8 - 178
Complexity: moderate

Purpose

The Watermarker class provides functionality to apply watermark images to all pages of a PDF document. It supports multiple positioning strategies (center, corner, tile), adjustable opacity and scale, and handles the entire workflow from image preparation to PDF generation. The watermark is placed behind the original PDF content, ensuring document readability while providing visual branding or protection.

Source Code

class Watermarker:
    """Adds watermarks to PDF documents"""
    
    def __init__(self):
        self.logger = logging.getLogger(__name__)
    
    def add_watermark(self, pdf_path, watermark_image_path, output_path, 
                     scale=0.2, opacity=0.1, position="center"):
        """
        Add a watermark image to each page of the PDF
        
        Args:
            pdf_path (str): Path to the PDF file
            watermark_image_path (str): Path to the watermark image
            output_path (str): Path where watermarked PDF will be saved
            scale (float): Scale factor for watermark (0.0-1.0)
            opacity (float): Opacity of watermark (0.0-1.0)
            position (str): Watermark position ('center', 'corner', 'tile')
            
        Returns:
            str: Path to the watermarked PDF
        """
        if not os.path.exists(pdf_path):
            raise FileNotFoundError(f"PDF file not found: {pdf_path}")
        
        if not os.path.exists(watermark_image_path):
            raise FileNotFoundError(f"Watermark image not found: {watermark_image_path}")
        
        try:
            # Prepare watermark image with specified opacity
            watermark = self._prepare_watermark(watermark_image_path, opacity)
            
            # Create a new PDF with the watermark 
            temp_output = f"{output_path}.temp"
            
            # Open source PDF
            pdf = fitz.open(pdf_path)
            
            # Create a new PDF for output to avoid overwriting issues
            out_pdf = fitz.open()
            
            # Process each page
            for page_num in range(len(pdf)):
                src_page = pdf[page_num]
                
                # Create new page with same dimensions
                out_page = out_pdf.new_page(
                    width=src_page.rect.width,
                    height=src_page.rect.height
                )
                
                # First insert watermark
                self._insert_watermark(out_page, watermark, scale, position)
                
                # Then place original page content over it
                out_page.show_pdf_page(
                    rect=src_page.rect,
                    src=pdf,
                    pno=page_num
                )
            
            # Save the result
            out_pdf.save(temp_output)
            out_pdf.close()
            pdf.close()
            
            # Move to final location
            if os.path.exists(output_path):
                os.remove(output_path)
            shutil.move(temp_output, output_path)
            
            # Clean up
            if hasattr(watermark, 'name') and os.path.exists(watermark.name):
                os.unlink(watermark.name)
            
            self.logger.info(f"Added watermark to PDF: {output_path}")
            return output_path
            
        except Exception as e:
            self.logger.error(f"Error adding watermark: {e}")
            raise
    
    def _insert_watermark(self, page, watermark, scale, position="center"):
        """Insert watermark image on the page with specified position"""
        # Get page dimensions
        page_rect = page.rect
        
        # Calculate scaled watermark dimensions
        watermark_width = watermark.width * scale
        watermark_height = watermark.height * scale
        
        if position == "center":
            # Position watermark in center
            watermark_rect = fitz.Rect(
                (page_rect.width - watermark_width) / 2,
                (page_rect.height - watermark_height) / 2,
                (page_rect.width + watermark_width) / 2,
                (page_rect.height + watermark_height) / 2
            )
            page.insert_image(watermark_rect, filename=watermark.name)
            
        elif position == "corner":
            # Position watermark in bottom right corner with margin
            margin = 20  # margin in points
            watermark_rect = fitz.Rect(
                page_rect.width - watermark_width - margin,
                page_rect.height - watermark_height - margin,
                page_rect.width - margin,
                page_rect.height - margin
            )
            page.insert_image(watermark_rect, filename=watermark.name)
            
        elif position == "tile":
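            # Tile watermark across the page; clamp the step to at least 1 point
            # so range() never receives a step of 0 for very small watermarks
            step_x = max(1, int(watermark_width * 1.5))
            step_y = max(1, int(watermark_height * 1.5))
            for x in range(0, int(page_rect.width), step_x):
                for y in range(0, int(page_rect.height), step_y):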
                    watermark_rect = fitz.Rect(
                        x, y, x + watermark_width, y + watermark_height
                    )
                    page.insert_image(watermark_rect, filename=watermark.name)
        else:
            # Default to center if invalid position
            watermark_rect = fitz.Rect(
                (page_rect.width - watermark_width) / 2,
                (page_rect.height - watermark_height) / 2,
                (page_rect.width + watermark_width) / 2,
                (page_rect.height + watermark_height) / 2
            )
            page.insert_image(watermark_rect, filename=watermark.name)
    
    def _prepare_watermark(self, image_path, opacity=0.1):
        """
        Prepare watermark image with specified opacity
        
        Args:
            image_path (str): Path to the watermark image
            opacity (float): Opacity of watermark (0.0-1.0)
            
        Returns:
            object: Object with watermark info (name, width, height)
        """
        # Open image and convert to RGBA
        img = Image.open(image_path).convert("RGBA")
        
        # Apply opacity
        self.logger.info(f"Setting watermark opacity to {opacity:.0%}")
        alpha_value = int(opacity * 255)  # Convert opacity to alpha value (0-255)
        
        # Apply the alpha to all pixels
        data = img.getdata()
        new_data = []
        for item in data:
            # Modify alpha channel only, preserve original rgb
            new_alpha = min(item[3], alpha_value) if len(item) > 3 else alpha_value
            new_data.append((item[0], item[1], item[2], new_alpha))
        img.putdata(new_data)
        
        # Create temporary file for processed watermark; close the handle first
        # so it is not left open, then save the image by file name
        temp = tempfile.NamedTemporaryFile(suffix=".png", delete=False)
        temp.close()
        img.save(temp.name, "PNG")
        
        # Create a simple object to hold info about the watermark
        class WatermarkInfo:
            pass
        
        watermark = WatermarkInfo()
        watermark.name = temp.name
        watermark.width = img.width
        watermark.height = img.height
        
        return watermark

Parameters

Name Type Default Kind
bases - -

Parameter Details

__init__: The constructor takes no parameters and initializes a logger instance for the class. The logger is configured using the module's __name__ for proper logging hierarchy.

Return Value

Instantiation returns a Watermarker object. The main method add_watermark() returns a string containing the path to the newly created watermarked PDF file. Private methods return various types: _prepare_watermark() returns a WatermarkInfo object with name, width, and height attributes; _insert_watermark() returns None as it modifies the page in-place.

Class Interface

Methods

__init__(self)

Purpose: Initialize the Watermarker instance with a logger

Returns: None - initializes the instance

add_watermark(self, pdf_path: str, watermark_image_path: str, output_path: str, scale: float = 0.2, opacity: float = 0.1, position: str = 'center') -> str

Purpose: Add a watermark image to each page of a PDF document with specified parameters

Parameters:

  • pdf_path: Path to the source PDF file to be watermarked
  • watermark_image_path: Path to the image file to use as watermark (any PIL-supported format)
  • output_path: Path where the watermarked PDF will be saved
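  • scale: Multiplier applied to the watermark image's pixel dimensions; the result is used as the watermark size in PDF points (documented range 0.0-1.0, default 0.2; the range is not enforced)
  • opacity: Opacity of the watermark (0.0 = invisible, 1.0 = the image's original alpha, default 0.1). The value caps each pixel's alpha, so areas that are already transparent stay transparent
  • position: Watermark placement strategy: 'center' (single centered), 'corner' (bottom-right with a 20-point margin), or 'tile' (repeated grid). Any other value falls back to 'center'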

Returns: String containing the path to the newly created watermarked PDF file
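
The ranges documented for scale and opacity are not checked by add_watermark() in the source shown on this page, and an unrecognized position falls back to 'center' without warning. A caller who prefers early, explicit errors could validate the arguments first. The helper below is a minimal sketch of that idea; validate_watermark_args and VALID_POSITIONS are hypothetical names, not part of the Watermarker class.

```python
VALID_POSITIONS = ("center", "corner", "tile")


def validate_watermark_args(scale, opacity, position):
    """Hypothetical pre-flight check; not part of Watermarker itself."""
    # Documented range for scale is 0.0-1.0; a scale of 0 would draw nothing.
    if not 0.0 < scale <= 1.0:
        raise ValueError(f"scale must be in (0.0, 1.0], got {scale}")
    # Documented range for opacity is 0.0-1.0.
    if not 0.0 <= opacity <= 1.0:
        raise ValueError(f"opacity must be in [0.0, 1.0], got {opacity}")
    # Reject unknown positions instead of silently centering.
    if position not in VALID_POSITIONS:
        raise ValueError(
            f"position must be one of {VALID_POSITIONS}, got {position!r}"
        )
    return scale, opacity, position
```

Calling validate_watermark_args(scale, opacity, position) before add_watermark() turns a typo such as position='centre' into an immediate ValueError rather than a centered watermark.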

_insert_watermark(self, page, watermark, scale: float, position: str = 'center') -> None

Purpose: Internal method to insert watermark image onto a PDF page at the specified position

Parameters:

  • page: PyMuPDF page object to insert watermark into
  • watermark: WatermarkInfo object containing name, width, and height attributes
  • scale: Scale factor for watermark dimensions
  • position: Positioning strategy: 'center', 'corner', or 'tile'

Returns: None - modifies the page object in-place
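
The three placement strategies reduce to simple rectangle arithmetic in page coordinates (points). The sketch below reproduces that arithmetic with plain tuples so it can be tried without PyMuPDF; placement_rects is a hypothetical helper, and in the class itself each rectangle would be a fitz.Rect passed to page.insert_image(). The tile step is clamped to at least 1 so that range() is never given a step of 0 for very small watermarks.

```python
def placement_rects(page_w, page_h, wm_w, wm_h, position="center", margin=20):
    """Return a list of (x0, y0, x1, y1) tuples, one per watermark copy.

    Hypothetical helper for illustration; all values are in PDF points.
    """
    if position == "corner":
        # Bottom-right corner, inset by the margin on both axes.
        return [(page_w - wm_w - margin, page_h - wm_h - margin,
                 page_w - margin, page_h - margin)]
    if position == "tile":
        # Grid spacing is 1.5x the watermark size, never less than 1 point.
        step_x = max(1, int(wm_w * 1.5))
        step_y = max(1, int(wm_h * 1.5))
        return [(x, y, x + wm_w, y + wm_h)
                for x in range(0, int(page_w), step_x)
                for y in range(0, int(page_h), step_y)]
    # 'center', and any unrecognized value, yields one centered rectangle.
    return [((page_w - wm_w) / 2, (page_h - wm_h) / 2,
             (page_w + wm_w) / 2, (page_h + wm_h) / 2)]


# A 600 x 800 point page with a watermark scaled to 100 x 50 points.
center = placement_rects(600, 800, 100, 50, "center")
corner = placement_rects(600, 800, 100, 50, "corner")
tiles = placement_rects(600, 800, 100, 50, "tile")
```

With these inputs the tile grid has 4 columns and 11 rows, so 44 copies of the watermark are placed on each page, which is worth keeping in mind when choosing scale for tiled output.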

_prepare_watermark(self, image_path: str, opacity: float = 0.1) -> object

Purpose: Internal method to load and process watermark image with specified opacity, creating a temporary file

Parameters:

  • image_path: Path to the watermark image file
  • opacity: Desired opacity level (0.0-1.0) to apply to the watermark

Returns: WatermarkInfo object with attributes: name (temp file path), width (image width in pixels), height (image height in pixels)
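
The opacity handling amounts to capping each pixel's alpha at int(opacity * 255) while leaving the colour channels alone. The following pure-Python sketch applies the same rule to a short list of RGBA tuples; cap_alpha is a hypothetical name used only for illustration, and the real method operates on a PIL image rather than a list.

```python
def cap_alpha(pixels, opacity):
    """Cap each RGBA pixel's alpha at int(opacity * 255), keeping RGB as-is."""
    alpha_cap = int(opacity * 255)
    return [(r, g, b, min(a, alpha_cap)) for r, g, b, a in pixels]


pixels = [
    (255, 0, 0, 255),  # fully opaque red
    (0, 255, 0, 10),   # nearly transparent green
    (0, 0, 255, 0),    # fully transparent blue
]

# int(0.1 * 255) is 25, so only alpha values above 25 are lowered.
capped = cap_alpha(pixels, opacity=0.1)
```

Because the rule is a cap rather than a multiplication, a logo with a transparent background keeps that background fully transparent, and only its visible parts are faded.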

Attributes

Name | Type | Description | Scope
logger | logging.Logger | Logger instance for recording watermarking operations and errors, initialized with the module's __name__ | instance

Dependencies

  • logging
  • fitz
  • os
  • PIL
  • tempfile
  • shutil

Required Imports

import logging
import fitz
import os
from PIL import Image
import tempfile
import shutil

Usage Example

import logging

# Assumption: this import path is inferred from the file location shown
# above; adjust it to match your project layout.
from src.security.watermark import Watermarker

# Configure logging (optional but recommended)
logging.basicConfig(level=logging.INFO)

# Instantiate the Watermarker
watermarker = Watermarker()

# Add watermark to a PDF with default settings (center position)
output_path = watermarker.add_watermark(
    pdf_path='document.pdf',
    watermark_image_path='logo.png',
    output_path='watermarked_document.pdf'
)

# Add watermark with custom settings
output_path = watermarker.add_watermark(
    pdf_path='document.pdf',
    watermark_image_path='logo.png',
    output_path='watermarked_document.pdf',
    scale=0.3,
    opacity=0.2,
    position='corner'
)

# Tile watermark across pages
output_path = watermarker.add_watermark(
    pdf_path='document.pdf',
    watermark_image_path='logo.png',
    output_path='watermarked_document.pdf',
    scale=0.15,
    opacity=0.05,
    position='tile'
)

Best Practices

  • Always ensure input PDF and watermark image files exist before calling add_watermark() to avoid FileNotFoundError
  • Use opacity values between 0.05 and 0.3 for subtle watermarks that don't obscure content
  • For tiled watermarks, use smaller scale values (0.1-0.2) to avoid overwhelming the document
  • Ensure write permissions exist for the output_path directory
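  • The class creates temporary files during processing (a processed PNG copy of the watermark and an intermediate <output_path>.temp file); these are removed on success, but may be left behind if an exception is raised partway through
  • The watermark is drawn first and the original page content is rendered on top of it, which keeps text readable; pages with an opaque background, such as full-page scans, may hide the watermark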
  • Watermark images are converted to RGBA format automatically, so any image format supported by PIL can be used
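  • Concurrent calls that write to the same output_path will collide, because the intermediate file is always named <output_path>.temp; using separate instances does not avoid this, so give each concurrent call its own output_path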
  • Large PDFs or high-resolution watermarks may consume significant memory during processing
  • The output PDF will overwrite any existing file at output_path
  • Invalid position values default to 'center' positioning
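
The points above about intermediate files can be handled on the caller's side by staging the output in a scratch directory. The sketch below assumes only that the object passed in exposes add_watermark() with the signature documented on this page; watermark_safely is a hypothetical wrapper, not part of the class.

```python
import os
import shutil
import tempfile


def watermark_safely(watermarker, pdf_path, image_path, output_path, **options):
    """Hypothetical wrapper: run add_watermark() inside a scratch directory.

    The scratch directory, including any leftover '<name>.temp' file, is
    removed whether or not add_watermark() raises.
    """
    with tempfile.TemporaryDirectory() as scratch:
        # Stage the result under a path that is unique to this call.
        staged = os.path.join(scratch, os.path.basename(output_path))
        watermarker.add_watermark(pdf_path, image_path, staged, **options)
        # Only a completed file is moved to the real destination.
        shutil.move(staged, output_path)
    return output_path
```

This covers the intermediate PDF only. The temporary PNG written by _prepare_watermark() lives in the system temp directory and is not affected by the wrapper.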

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function add_watermark 74.3% similar

    A wrapper function that adds a customizable text watermark to every page of a PDF document with configurable opacity and color.

    From: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py
  • class PDFManipulator 65.9% similar

    Manipulates existing PDF documents. This class provides methods to add watermarks, merge PDFs, extract pages, and perform other manipulation operations.

    From: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py
  • class DocumentProtector 56.5% similar

    A class that handles protecting PDF documents from editing by applying encryption and permission restrictions using pikepdf and PyMuPDF libraries.

    From: /tf/active/vicechatdev/document_auditor/src/security/document_protection.py
  • class DocumentProcessor 56.3% similar

    A comprehensive document processing class that converts documents to PDF, adds audit trails, applies security features (watermarks, signatures, hashing), and optionally converts to PDF/A format with document protection.

    From: /tf/active/vicechatdev/document_auditor/src/document_processor.py
  • class DocumentMerger 55.1% similar

    A class that merges PDF documents with audit trail pages, combining an original PDF with an audit page and updating metadata to reflect the audit process.

    From: /tf/active/vicechatdev/document_auditor/src/document_merger.py