🔍 Code Extractor

function is_valid_document_file

Maturity: 51

Validates whether a given filename has an extension corresponding to a supported document type by checking against a predefined list of valid document extensions.

File: /tf/active/vicechatdev/CDocs/utils/__init__.py
Lines: 121 - 133
Complexity: simple

Purpose

This function is a validation utility that decides, from the filename alone, whether a file is an acceptable document format before it is processed. Typical uses are filtering uploads, checking compatibility with a document processing pipeline, and keeping unsupported file types out of the system. It accepts exactly five extensions: doc and docx (Microsoft Word), pdf, txt (plain text), and rtf (Rich Text Format).

Source Code

def is_valid_document_file(filename: str) -> bool:
    """
    Check if file is a valid document type.
    
    Args:
        filename: Filename to check
        
    Returns:
        Boolean indicating if file is a valid document type
    """
    ext = get_file_extension(filename)
    valid_extensions = ["doc", "docx", "pdf", "txt", "rtf"]
    return ext in valid_extensions

Parameters

Name      Type  Default  Kind
filename  str   -        positional_or_keyword

Parameter Details

filename: A string naming the file to validate, including its extension (e.g., 'report.pdf', 'document.docx'). It may be a bare filename or a full path. The function passes this string to get_file_extension and validates the result, so both path handling and case sensitivity depend on the get_file_extension implementation, which is not shown on this page.
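Because the outcome depends on how the extension is extracted, the sketch below shows one way the path and case questions could be settled. It is an illustration under stated assumptions, not the project's code: extract_extension and is_valid are hypothetical names, and the real get_file_extension may behave differently.

```python
from pathlib import PurePath

# Same values as the valid_extensions list in the source, stored as a set.
VALID_EXTENSIONS = {"doc", "docx", "pdf", "txt", "rtf"}


def extract_extension(filename: str) -> str:
    """Hypothetical stand-in for get_file_extension.

    Returns the final extension, lowercased and without the leading dot.
    """
    # PurePath.suffix inspects only the last path component, so a dot in a
    # directory name is never mistaken for an extension.
    return PurePath(filename).suffix.lstrip(".").lower()


def is_valid(filename: str) -> bool:
    """Same membership test as is_valid_document_file, using the stand-in."""
    return extract_extension(filename) in VALID_EXTENSIONS


print(is_valid("REPORT.PDF"))              # True: case is normalized
print(is_valid("/srv/uploads.v2/README"))  # False: no extension on the file itself
print(is_valid("notes.final.txt"))         # True: only the last suffix counts
print(is_valid("backup.tar.gz"))           # False: "gz" is not a document type
```

With an extractor like this, uppercase extensions and dotted directory names are handled predictably; an extractor that does not lowercase its result would reject 'REPORT.PDF'.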

Return Value

Type: bool

Returns True if the extension reported by get_file_extension is one of the supported document types (doc, docx, pdf, txt, rtf), and False otherwise. The result can be used directly in conditionals and filters in file processing workflows.

Usage Example

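# Stand-in for get_file_extension, for illustration only; the real helper
# is not shown on this page and may behave differently.
def get_file_extension(filename: str) -> str:
    return filename.rsplit('.', 1)[-1].lower() if '.' in filename else ''

# Example usage (assumes is_valid_document_file from the Source Code section
# is defined in the same scope as the stand-in above)
filename1 = 'report.pdf'
filename2 = 'image.jpg'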

if is_valid_document_file(filename1):
    print(f'{filename1} is a valid document')  # Output: report.pdf is a valid document
else:
    print(f'{filename1} is not a valid document')

if is_valid_document_file(filename2):
    print(f'{filename2} is a valid document')
else:
    print(f'{filename2} is not a valid document')  # Output: image.jpg is not a valid document

# Use in file upload validation
uploaded_files = ['contract.pdf', 'photo.png', 'notes.txt']
valid_files = [f for f in uploaded_files if is_valid_document_file(f)]
print(valid_files)  # Output: ['contract.pdf', 'notes.txt']

Best Practices

  • Ensure the get_file_extension function is properly defined and handles edge cases like files without extensions or multiple dots in filenames
  • Consider case sensitivity: the valid_extensions list uses lowercase, so get_file_extension should normalize the extension to lowercase
  • This function only validates file extensions, not actual file content. For security-critical applications, implement additional content-based validation
  • The list of valid extensions is hardcoded. For more flexible systems, consider making this configurable through parameters or configuration files
  • Use this function as a first-pass filter before more expensive file processing operations to improve performance
  • Consider extending the valid_extensions list if your application needs to support additional document formats like odt, pages, or markdown files
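
Two of the points above, a configurable extension list and content-based validation, can be sketched as follows. This is a minimal illustration, not part of CDocs: has_allowed_extension, content_matches_extension, DEFAULT_EXTENSIONS, and MAGIC_PREFIXES are hypothetical names, and the byte signatures are the well-known leading bytes of each format rather than anything taken from the project.

```python
from typing import Iterable

# Same five extensions as the hardcoded list in the source.
DEFAULT_EXTENSIONS = frozenset({"doc", "docx", "pdf", "txt", "rtf"})

# Leading byte signatures. Plain text has no signature, so "txt" is absent.
MAGIC_PREFIXES = {
    "pdf": b"%PDF-",
    "rtf": b"{\\rtf",
    "docx": b"PK\x03\x04",  # ZIP container, shared with other ZIP-based formats
    "doc": b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",  # OLE2 compound file
}


def _extension(filename: str) -> str:
    """Lowercased text after the last dot, or '' if there is no dot."""
    return filename.rsplit(".", 1)[-1].lower() if "." in filename else ""


def has_allowed_extension(
    filename: str, allowed: Iterable[str] = DEFAULT_EXTENSIONS
) -> bool:
    """Extension check with a caller-supplied allow-list."""
    normalized = {a.lower().lstrip(".") for a in allowed}
    return _extension(filename) in normalized


def content_matches_extension(filename: str, header: bytes) -> bool:
    """Check that the first bytes of a file agree with its extension."""
    ext = _extension(filename)
    prefix = MAGIC_PREFIXES.get(ext)
    if prefix is None:
        # No known signature (e.g. txt): fall back to the extension check.
        return ext in DEFAULT_EXTENSIONS
    return header.startswith(prefix)


print(has_allowed_extension("notes.md", {"md", "txt"}))      # True
print(has_allowed_extension("notes.md"))                     # False
print(content_matches_extension("a.pdf", b"%PDF-1.7\n"))     # True
print(content_matches_extension("a.pdf", b"PK\x03\x04..."))  # False
```

A signature match is still only a coarse check: a ZIP prefix, for example, does not prove the archive is a Word document, so security-critical code would need deeper structural validation.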

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function validate_document 67.9% similar

    Validates document files by checking file size, extension, and optionally performing type-specific structural validation for supported document formats.

    From: /tf/active/vicechatdev/CDocs/utils/document_processor.py
  • function allowed_file 62.0% similar

    Validates whether a filename has an allowed file extension by checking if it contains a dot and if the extension (case-insensitive) exists in a predefined ALLOWED_EXTENSIONS collection.

    From: /tf/active/vicechatdev/leexi/app.py
  • function get_mime_type 60.4% similar

    Determines the MIME type of a file based on its file extension by mapping common extensions to their corresponding MIME type strings.

    From: /tf/active/vicechatdev/CDocs/utils/__init__.py
  • function is_valid_document_status 60.2% similar

    Validates whether a given status code exists in the DOCUMENT_STATUS_CONFIG configuration.

    From: /tf/active/vicechatdev/CDocs/settings_prod.py
  • function allowed_file_v1 59.9% similar

    Validates whether a given filename has an allowed file extension by checking if the extension exists in a configured whitelist.

    From: /tf/active/vicechatdev/full_smartstat/app.py