function create_test_file
Creates a temporary text file with predefined multi-chapter test content for testing document extraction and processing functionality.
/tf/active/vicechatdev/vice_ai/test_extraction_debug.py
15 - 42
simple
Purpose
This utility function generates a temporary test file containing structured text with multiple chapters and sections. It's designed for debugging and testing document processing systems, particularly for verifying text extraction, chunking, and logging capabilities. The function creates a file with realistic document structure including headers, sections, and descriptive content that can be used to validate document processing pipelines.
Source Code
def create_test_file():
    """Create a simple test text file"""
    test_content = """
Test Document for Extraction Debugging
=====================================
This is a test document to verify that the extraction debugging
functionality is working correctly.
Chapter 1: Introduction
-----------------------
This document contains multiple sections to test text chunking.
Chapter 2: Content Analysis
---------------------------
The document processor should extract this text and save it
to a debug log file in the extracted directory.
Chapter 3: Conclusion
---------------------
If you can see this text in the extracted debug log, then
the debugging functionality is working correctly.
"""
    # Create temporary file
    with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False) as f:
        f.write(test_content)
        return f.name
Return Value
Returns a string containing the absolute file path to the created temporary text file. The file has a '.txt' suffix and contains multi-section test content. The file is not automatically deleted (delete=False), so the caller is responsible for cleanup. The returned path can be used immediately to read or process the test file.
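The properties described above can be checked directly. The following is a minimal sketch, assuming a shortened stand-in for create_test_file (the content is abbreviated here; the variable names is_absolute, has_txt_suffix, and still_exists are illustrative only):

```python
import os
import tempfile


def create_test_file():
    # Shortened stand-in for the documented function; content abbreviated.
    with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
        f.write("Test Document for Extraction Debugging\n")
        return f.name


path = create_test_file()

is_absolute = os.path.isabs(path)       # the path should be absolute
has_txt_suffix = path.endswith(".txt")  # the suffix comes from suffix='.txt'
still_exists = os.path.exists(path)     # delete=False keeps the file on disk

# The caller owns the file, so remove it once the checks are done.
os.unlink(path)
```

Because the file outlives the context manager, the final os.unlink call is what actually removes it.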
Dependencies
tempfile
Required Imports
import tempfile
Usage Example
import tempfile
import os

def create_test_file():
    test_content = """
Test Document for Extraction Debugging
=====================================
This is a test document to verify that the extraction debugging
functionality is working correctly.
Chapter 1: Introduction
-----------------------
This document contains multiple sections to test text chunking.
Chapter 2: Content Analysis
---------------------------
The document processor should extract this text and save it
to a debug log file in the extracted directory.
Chapter 3: Conclusion
---------------------
If you can see this text in the extracted debug log, then
the debugging functionality is working correctly.
"""
    with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False) as f:
        f.write(test_content)
        return f.name

# Usage
test_file_path = create_test_file()
print(f"Test file created at: {test_file_path}")

# Read and verify content
with open(test_file_path, 'r') as f:
    content = f.read()
print(f"File contains {len(content)} characters")

# Clean up when done
os.unlink(test_file_path)
Best Practices
- Always delete the temporary file after use by calling os.unlink() or os.remove() on the returned path; otherwise stale files accumulate in the system temp directory
- The function uses delete=False to allow the file to persist after the context manager closes, making it the caller's responsibility to manage file lifecycle
- Consider wrapping file usage in a try-finally block to ensure cleanup even if processing fails
- The returned path is platform-specific; use pathlib.Path for cross-platform path manipulation if needed
- For automated testing, consider using pytest's tmp_path fixture or the tempfile.TemporaryDirectory context manager as alternatives
- The test content is hardcoded and multi-line; if you need different test content, consider parameterizing this function or creating variants
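Several of the points above (guaranteed cleanup, try-finally, parameterized content) can be combined in one helper. The following is a minimal sketch, not part of the original module: the names temporary_test_file and DEFAULT_CONTENT are hypothetical, and the default content is a shortened placeholder rather than the full multi-chapter string.

```python
import os
import tempfile
from contextlib import contextmanager

# Hypothetical default; the documented function hardcodes a longer string.
DEFAULT_CONTENT = "Chapter 1: Introduction\n-----------------------\nSample text.\n"


@contextmanager
def temporary_test_file(content=DEFAULT_CONTENT, suffix=".txt"):
    """Yield the path of a temp file holding `content`; delete it on exit."""
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=suffix, delete=False, encoding="utf-8"
    ) as f:
        f.write(content)
        path = f.name
    try:
        yield path
    finally:
        # Runs even if the caller's processing raises an exception.
        if os.path.exists(path):
            os.unlink(path)


# Usage: the file exists only inside the with block.
with temporary_test_file("Chapter 1\nHello\n") as example_path:
    with open(example_path, encoding="utf-8") as fh:
        example_text = fh.read()
```

The design choice here is to move cleanup out of the caller's hands: the finally clause removes the file whether or not the body of the with block succeeds.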
Similar Components
AI-powered semantic similarity - components with related functionality:
- function create_test_document 78.7% similar
- function create_test_file_v1 78.4% similar
- function test_multiple_files 61.8% similar
- function test_extraction_debugging 60.7% similar
- function test_document_processing 60.5% similar