function is_valid_document_file
Validates whether a given filename has an extension corresponding to a supported document type by checking against a predefined list of valid document extensions.
/tf/active/vicechatdev/CDocs/utils/__init__.py
121 - 133
simple
Purpose
This function serves as a validation utility to determine if a file is an acceptable document format before processing. It's commonly used in document management systems to filter uploads, ensure compatibility with document processing pipelines, and prevent unsupported file types from entering the system. The function supports common document formats including Microsoft Word documents, PDFs, plain text files, and Rich Text Format files.
Source Code
def is_valid_document_file(filename: str) -> bool:
    """
    Check if file is a valid document type.

    Args:
        filename: Filename to check

    Returns:
        Boolean indicating if file is a valid document type
    """
    ext = get_file_extension(filename)
    valid_extensions = ["doc", "docx", "pdf", "txt", "rtf"]
    return ext in valid_extensions
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| filename | str | - | positional_or_keyword |
Parameter Details
filename: A string naming the file to validate, including its extension (e.g., 'report.pdf', 'document.docx'). It can be a full path or just a filename; the function extracts the extension via get_file_extension and validates only that. Whether matching is case-sensitive depends on the get_file_extension implementation.
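Since the behaviour for paths, mixed case, and dot-less names depends entirely on get_file_extension, whose implementation is not shown on this page, the following is a minimal sketch of one possible helper. The names get_file_extension_sketch and is_valid_document_file_sketch are hypothetical stand-ins, and the real helper in CDocs may behave differently.

```python
import os


def get_file_extension_sketch(filename: str) -> str:
    """Hypothetical stand-in for get_file_extension (assumption, not the CDocs code)."""
    # os.path.splitext splits on the last dot of the final path component,
    # so dots in directory names are ignored.
    _, ext = os.path.splitext(filename)
    # Drop the leading dot and normalise to lowercase.
    return ext.lstrip(".").lower()


def is_valid_document_file_sketch(filename: str) -> bool:
    """Same check as is_valid_document_file, wired to the stand-in helper."""
    return get_file_extension_sketch(filename) in {"doc", "docx", "pdf", "txt", "rtf"}


print(is_valid_document_file_sketch("/tmp/uploads/Report.PDF"))  # full path, upper case
print(is_valid_document_file_sketch("archive.tar.gz"))           # only the last extension counts
print(is_valid_document_file_sketch("README"))                   # no extension at all
```

With this particular helper, a full path with an upper-case extension is accepted, while a multi-dot archive name and a name without an extension are rejected. A helper that does not lowercase its result would reject 'Report.PDF'.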
Return Value
Type: bool
Returns a boolean value: True if the file extension is one of the supported document types (doc, docx, pdf, txt, rtf), False otherwise. This allows for simple conditional logic in file processing workflows.
Usage Example
# Assuming get_file_extension is defined or imported
def get_file_extension(filename: str) -> str:
    return filename.rsplit('.', 1)[-1].lower() if '.' in filename else ''

# Example usage
filename1 = 'report.pdf'
filename2 = 'image.jpg'
filename3 = 'document.docx'

if is_valid_document_file(filename1):
    print(f'{filename1} is a valid document')  # Output: report.pdf is a valid document
else:
    print(f'{filename1} is not a valid document')

if is_valid_document_file(filename2):
    print(f'{filename2} is a valid document')
else:
    print(f'{filename2} is not a valid document')  # Output: image.jpg is not a valid document

# Use in file upload validation
uploaded_files = ['contract.pdf', 'photo.png', 'notes.txt']
valid_files = [f for f in uploaded_files if is_valid_document_file(f)]
print(valid_files)  # Output: ['contract.pdf', 'notes.txt']
Best Practices
- Ensure the get_file_extension function is properly defined and handles edge cases like files without extensions or multiple dots in filenames
- Consider case sensitivity: the valid_extensions list uses lowercase, so get_file_extension should normalize the extension to lowercase
- This function only validates file extensions, not actual file content. For security-critical applications, implement additional content-based validation
- The list of valid extensions is hardcoded. For more flexible systems, consider making this configurable through parameters or configuration files
- Use this function as a first-pass filter before more expensive file processing operations to improve performance
- Consider extending the valid_extensions list if your application needs to support additional document formats like odt, pages, or markdown files
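Two of the points above, a configurable extension list and content-based validation, can be sketched as follows. This is an illustration under stated assumptions, not part of CDocs: has_allowed_extension, content_matches_extension, DEFAULT_DOCUMENT_EXTENSIONS, and MAGIC_PREFIXES are hypothetical names. The byte signatures are the well-known file headers for PDF, ZIP-based docx, OLE2-based doc, and RTF. Plain text has no signature, so it cannot be confirmed this way.

```python
from typing import Iterable

# Hypothetical default, mirroring the hardcoded list in is_valid_document_file.
DEFAULT_DOCUMENT_EXTENSIONS = frozenset({"doc", "docx", "pdf", "txt", "rtf"})

# Leading bytes that files of each type normally start with.
MAGIC_PREFIXES = {
    "pdf": (b"%PDF-",),
    "docx": (b"PK\x03\x04",),                       # docx is a ZIP container
    "doc": (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1",),  # OLE2 compound file
    "rtf": (b"{\\rtf",),
}


def has_allowed_extension(
    filename: str,
    allowed: Iterable[str] = DEFAULT_DOCUMENT_EXTENSIONS,
) -> bool:
    """Extension check with a caller-supplied allow-list (case-insensitive)."""
    normalized = {e.lower().lstrip(".") for e in allowed}
    # rpartition yields an empty separator when the name contains no dot.
    _, dot, ext = filename.rpartition(".")
    return bool(dot) and ext.lower() in normalized


def content_matches_extension(ext: str, head: bytes) -> bool:
    """Compare the first bytes of a file with the signature expected for ext."""
    prefixes = MAGIC_PREFIXES.get(ext.lower())
    if prefixes is None:
        # No known signature (e.g. txt): nothing to confirm or reject.
        return True
    return head.startswith(prefixes)


print(has_allowed_extension("notes.md", allowed=["md", ".odt"]))
print(content_matches_extension("pdf", b"%PDF-1.7\n"))
print(content_matches_extension("pdf", b"MZ\x90\x00"))
```

In an upload handler, head would typically be the first few bytes read from the uploaded stream. A signature match is still only a heuristic: a ZIP header, for instance, does not prove that the archive is a well-formed docx.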
Similar Components
AI-powered semantic similarity - components with related functionality:
- function validate_document 67.9% similar
- function allowed_file 62.0% similar
- function get_mime_type 60.4% similar
- function is_valid_document_status 60.2% similar
- function allowed_file_v1 59.9% similar