test_simulated_document - Code Extractor

function test_simulated_document

Maturity: 49

Integration test function that validates end date extraction from a simulated contract document containing an explicit term clause, using a two-step LLM-based analysis process.

File:
/tf/active/vicechatdev/contract_validity_analyzer/test_simulated_document.py

Lines:
49 - 141

Complexity:
complex

Purpose

This test function exercises the complete contract analysis pipeline on a simulated document with known term clause content. It runs a two-step extraction: first finding expiry dates, then extracting complete contract information including start and end dates, contract type, third parties, and validity status. The function logs detailed results, saves them to JSON, and compares the extracted end date against the expected value (2026-06-02), logging a warning rather than failing when they differ. It is designed to check that the LLM client correctly interprets term clauses and performs date calculations.
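
The expected end date comes from adding the one-year term to the commencement date of June 2, 2025. A minimal sketch of that arithmetic, assuming a plain calendar-year addition; the helper name `add_years` is hypothetical and not part of the project, and whether a contract ends on the anniversary or the day before is a question the test itself does not settle:

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (hypothetical helper)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Feb 28.
        return start.replace(year=start.year + years, day=28)


# Commencement date and term taken from the test's own comment.
expected_end = add_years(date(2025, 6, 2), 1)
print(expected_end.isoformat())
```

The ISO-formatted string is what the test compares against the `end_date` field returned by the LLM client.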

Source Code

def test_simulated_document():
    """Test end date extraction with simulated document text containing the term clause."""
    logger = setup_test_logging()
    logger.info("Starting test on simulated document with term clause")
    
    try:
        # Load configuration
        config = Config()
        
        # Initialize LLM client
        llm_client = LLMClient(config.get_section('llm'))
        
        logger.info("Testing simulated document with term clause")
        
        # Show the document text being tested
        logger.info("=" * 80)
        logger.info("SIMULATED DOCUMENT TEXT:")
        logger.info("=" * 80)
        logger.info(SIMULATED_DOCUMENT_TEXT)
        logger.info("=" * 80)
        
        # Check if the key term clause is present
        term_keywords = ["Term:", "shall commence", "period of one (1) year", "thirty (30)", "five (5) years"]
        found_keywords = []
        for keyword in term_keywords:
            if keyword.lower() in SIMULATED_DOCUMENT_TEXT.lower():
                found_keywords.append(keyword)
                
        logger.info(f"Found term keywords: {found_keywords}")
        
        # Step 1: Test expiry date finding
        logger.info("STEP 1: Testing expiry date extraction...")
        step1_result = llm_client._find_expiry_dates(SIMULATED_DOCUMENT_TEXT, "simulated_cda.pdf")
        
        logger.info("STEP 1 RESULT:")
        logger.info("-" * 60)
        expiry_analysis = step1_result.get('expiry_analysis', 'No analysis available')
        logger.info(expiry_analysis)
        logger.info("-" * 60)
        
        # Step 2: Test complete contract analysis
        logger.info("STEP 2: Testing complete contract analysis...")
        step2_result = llm_client._extract_complete_contract_info(step1_result, SIMULATED_DOCUMENT_TEXT, "simulated_cda.pdf")
        
        logger.info("STEP 2 RESULT:")
        logger.info("-" * 60)
        if step2_result.get('error'):
            logger.error(f"Step 2 failed: {step2_result.get('error')}")
        else:
            logger.info(f"Contract Type: {step2_result.get('contract_type', 'Unknown')}")
            logger.info(f"Third Parties: {step2_result.get('third_parties', [])}")
            logger.info(f"Start Date: {step2_result.get('start_date', 'Not found')}")
            logger.info(f"End Date: {step2_result.get('end_date', 'Not found')}")
            logger.info(f"Is In Effect: {step2_result.get('is_in_effect', 'Unknown')}")
            logger.info(f"Confidence: {step2_result.get('confidence', 0.0)}")
            logger.info(f"Analysis Notes: {step2_result.get('analysis_notes', 'None')}")
        logger.info("-" * 60)
        
        # Save results to JSON for further analysis
        result_data = {
            'test_type': 'simulated_document',
            'text_length': len(SIMULATED_DOCUMENT_TEXT),
            'found_keywords': found_keywords,
            'step1_result': step1_result,
            'step2_result': step2_result
        }
        
        with open('simulated_test_results.json', 'w') as f:
            json.dump(result_data, f, indent=2, default=str)
        
        # Check if end date was successfully extracted
        end_date = step2_result.get('end_date')
        if end_date and end_date not in ['null', None, '']:
            logger.info("✓ SUCCESS: End date extracted successfully!")
            logger.info(f"End date found: {end_date}")
            
            # Verify the calculation is correct (should be 2026-06-02)
            expected_end_date = "2026-06-02"  # June 2, 2025 + 1 year
            if end_date == expected_end_date:
                logger.info("✓ CALCULATION CORRECT: End date matches expected calculation")
            else:
                logger.warning(f"⚠ CALCULATION ISSUE: Expected {expected_end_date}, got {end_date}")
            
            return True
        else:
            logger.warning("✗ FAILED: End date not extracted even with explicit term clause")
            return False
            
    except Exception as e:
        logger.error(f"Test failed with error: {e}")
        import traceback
        logger.error(traceback.format_exc())
        return False
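
The source above passes `default=str` to `json.dump`. The following sketch shows what that argument does, assuming a result dictionary that happens to contain a `date` object; the dictionary contents are illustrative only and are not actual output from the LLM client:

```python
import json
from datetime import date

# Illustrative data only; the real step results come from the LLM client.
result_data = {"end_date": date(2026, 6, 2), "confidence": 0.9}

# Without default=str, json.dumps raises TypeError on the date object.
try:
    json.dumps(result_data)
    raised = False
except TypeError:
    raised = True

# With default=str, any non-serializable value is converted via str().
serialized = json.dumps(result_data, default=str)
print(raised, serialized)
```

This keeps the results file writable even if a step returns values that JSON cannot encode natively, at the cost of those values being stored as plain strings.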

Return Value

Returns a boolean value: True if a non-empty end date was extracted, False if no end date was extracted or an exception occurred during testing. A mismatch with the expected date (2026-06-02) only logs a warning; the function still returns True in that case. The function also produces side effects: detailed logging output and a 'simulated_test_results.json' file containing the keyword check and both step results.
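
Because the return value alone does not distinguish "extracted" from "extracted and correct", a caller may want both facts separately. A minimal sketch, assuming the step 2 result is a dictionary with an `end_date` key as in the source; the function name `evaluate_end_date` is hypothetical and not part of the project:

```python
def evaluate_end_date(step2_result: dict, expected: str) -> dict:
    """Report extraction and correctness separately (hypothetical helper)."""
    end_date = step2_result.get("end_date")
    # Mirror the source's notion of "extracted": non-empty and not the string 'null'.
    extracted = bool(end_date) and end_date != "null"
    return {
        "extracted": extracted,
        "matches_expected": extracted and end_date == expected,
    }


print(evaluate_end_date({"end_date": "2026-06-01"}, "2026-06-02"))
```

A stricter test could then fail when `matches_expected` is False instead of only logging a warning.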

Dependencies

pathlib
json
logging
traceback
os
sys

Required Imports

import os
import sys
import json
from pathlib import Path
from config.config import Config
from utils.llm_client import LLMClient
import logging
import traceback

Usage Example

# Prerequisites (all in the same module as test_simulated_document):
# 1. Define the SIMULATED_DOCUMENT_TEXT constant with the contract text
# 2. Implement the setup_test_logging() function
# 3. Configure config/config.py with LLM settings
import json
import logging

SIMULATED_DOCUMENT_TEXT = '''
This Confidential Disclosure Agreement shall commence on June 2, 2025
and continue for a period of one (1) year, with automatic renewal...
'''

def setup_test_logging():
    logging.basicConfig(level=logging.INFO)
    return logging.getLogger(__name__)

# Run the test
if __name__ == '__main__':
    success = test_simulated_document()
    if success:
        print('Test passed: End date extracted correctly')
    else:
        print('Test failed: End date extraction issue')
    
    # Review detailed results
    with open('simulated_test_results.json', 'r') as f:
        results = json.load(f)
        print(json.dumps(results, indent=2))

Best Practices

Ensure SIMULATED_DOCUMENT_TEXT contains realistic contract language with clear term clauses for accurate testing
Review the generated 'simulated_test_results.json' file for detailed analysis of extraction results
Monitor logging output to understand the two-step extraction process and identify any issues
Verify that the expected end date (2026-06-02) matches your test document's term clause calculations
Ensure LLM API credentials are properly configured before running the test to avoid authentication errors
Run this test in isolation or as part of a test suite to validate contract analysis functionality
Check that the Config object's 'llm' section contains all required parameters for the LLMClient
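Handle the boolean return value to integrate this test into automated testing pipelines, keeping in mind that True only means an end date was extracted; a mismatch with the expected date is logged as a warning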
Be aware that LLM responses may vary; consider running multiple times for consistency validation
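
The last point, running several times to check consistency, can be sketched as a small tally over repeated extractions. This is an illustration under stated assumptions: `tally_end_dates` and `fake_extract` are hypothetical names, and the stub stands in for a real call to the LLM client, which is not reproduced here:

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Optional


def tally_end_dates(extract: Callable[[], Optional[str]], runs: int = 5) -> Counter:
    """Call `extract` repeatedly and count each distinct end date returned."""
    return Counter(extract() for _ in range(runs))


# Deterministic stand-in for an LLM call; real responses would vary on their own.
_responses = cycle(["2026-06-02", "2026-06-02", "2026-06-01"])


def fake_extract() -> Optional[str]:
    return next(_responses)


counts = tally_end_dates(fake_extract, runs=6)
print(counts.most_common())
```

In a real run, `extract` would wrap the two-step analysis and return the `end_date` field, so a spread of different dates signals unstable extraction.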

Similar Components

AI-powered semantic similarity - components with related functionality:

function test_with_simulated_content 86.1% similar

Tests LLM-based contract analysis prompts using simulated NDA content containing a term clause to verify extraction of contract dates and metadata.
From: /tf/active/vicechatdev/contract_validity_analyzer/test_local_document.py

function test_end_date_extraction 82.7% similar

Tests end date extraction functionality for contract documents that previously had missing end dates by downloading documents from FileCloud, extracting text, analyzing with LLM, and comparing results.
From: /tf/active/vicechatdev/contract_validity_analyzer/test_missing_end_dates.py

function test_local_document 82.2% similar

Integration test function that validates end date extraction from a local PDF document using document processing and LLM-based analysis.
From: /tf/active/vicechatdev/contract_validity_analyzer/test_local_document.py

function test_single_document 80.8% similar

Tests end date extraction from a specific PDF document by downloading it from FileCloud, extracting text, and using LLM-based analysis to identify contract expiry dates.
From: /tf/active/vicechatdev/contract_validity_analyzer/test_single_document.py

function test_llm_extraction 70.7% similar

A test function that validates LLM-based contract data extraction by processing a sample contract and verifying the extracted fields against expected values.
From: /tf/active/vicechatdev/contract_validity_analyzer/test_extractor.py
