function test_simulated_document
Integration test function that validates end date extraction from a simulated contract document containing an explicit term clause, using a two-step LLM-based analysis process.
/tf/active/vicechatdev/contract_validity_analyzer/test_simulated_document.py
49 - 141
complex
Purpose
This test function exercises the contract analysis pipeline on a simulated document whose term clause is known in advance. Extraction runs in two steps: `_find_expiry_dates` first locates expiry information, then `_extract_complete_contract_info` uses that result to extract the start and end dates, contract type, third parties, and validity status. The function logs each step's output, saves the results to JSON, and compares the extracted end date against the expected value (2026-06-02, i.e. June 2, 2025 plus one year). A mismatch is logged as a warning and does not fail the test. The goal is to confirm that the LLM client interprets term clauses and performs the date calculation correctly.
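The expected end date comes from simple calendar arithmetic on the commencement date in the term clause. The following is a minimal sketch of that calculation, not code from the project; the helper name `add_years` is hypothetical, and it assumes the "same calendar day one year later" convention that the test's expected value implies.

```python
from datetime import date


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (hypothetical helper).

    Feb 29 falls back to Feb 28 when the target year is not a leap year.
    """
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        # Only raised for Feb 29 when the target year has no Feb 29.
        return start.replace(year=start.year + years, day=28)


commencement = date(2025, 6, 2)           # "shall commence on June 2, 2025"
expected_end = add_years(commencement, 1)  # "period of one (1) year"
print(expected_end.isoformat())            # 2026-06-02
```

Computing the expected value this way, rather than hard-coding it, keeps the check valid if the simulated document's commencement date is ever changed.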
Source Code
def test_simulated_document():
    """Test end date extraction with simulated document text containing the term clause."""
    logger = setup_test_logging()
    logger.info("Starting test on simulated document with term clause")
    try:
        # Load configuration
        config = Config()
        # Initialize LLM client
        llm_client = LLMClient(config.get_section('llm'))
        logger.info("Testing simulated document with term clause")
        # Show the document text being tested
        logger.info("=" * 80)
        logger.info("SIMULATED DOCUMENT TEXT:")
        logger.info("=" * 80)
        logger.info(SIMULATED_DOCUMENT_TEXT)
        logger.info("=" * 80)
        # Check if the key term clause is present
        term_keywords = ["Term:", "shall commence", "period of one (1) year", "thirty (30)", "five (5) years"]
        found_keywords = []
        for keyword in term_keywords:
            if keyword.lower() in SIMULATED_DOCUMENT_TEXT.lower():
                found_keywords.append(keyword)
        logger.info(f"Found term keywords: {found_keywords}")
        # Step 1: Test expiry date finding
        logger.info("STEP 1: Testing expiry date extraction...")
        step1_result = llm_client._find_expiry_dates(SIMULATED_DOCUMENT_TEXT, "simulated_cda.pdf")
        logger.info("STEP 1 RESULT:")
        logger.info("-" * 60)
        expiry_analysis = step1_result.get('expiry_analysis', 'No analysis available')
        logger.info(expiry_analysis)
        logger.info("-" * 60)
        # Step 2: Test complete contract analysis
        logger.info("STEP 2: Testing complete contract analysis...")
        step2_result = llm_client._extract_complete_contract_info(step1_result, SIMULATED_DOCUMENT_TEXT, "simulated_cda.pdf")
        logger.info("STEP 2 RESULT:")
        logger.info("-" * 60)
        if step2_result.get('error'):
            logger.error(f"Step 2 failed: {step2_result.get('error')}")
        else:
            logger.info(f"Contract Type: {step2_result.get('contract_type', 'Unknown')}")
            logger.info(f"Third Parties: {step2_result.get('third_parties', [])}")
            logger.info(f"Start Date: {step2_result.get('start_date', 'Not found')}")
            logger.info(f"End Date: {step2_result.get('end_date', 'Not found')}")
            logger.info(f"Is In Effect: {step2_result.get('is_in_effect', 'Unknown')}")
            logger.info(f"Confidence: {step2_result.get('confidence', 0.0)}")
            logger.info(f"Analysis Notes: {step2_result.get('analysis_notes', 'None')}")
        logger.info("-" * 60)
        # Save results to JSON for further analysis
        result_data = {
            'test_type': 'simulated_document',
            'text_length': len(SIMULATED_DOCUMENT_TEXT),
            'found_keywords': found_keywords,
            'step1_result': step1_result,
            'step2_result': step2_result
        }
        with open('simulated_test_results.json', 'w') as f:
            json.dump(result_data, f, indent=2, default=str)
        # Check if end date was successfully extracted
        end_date = step2_result.get('end_date')
        if end_date and end_date not in ['null', None, '']:
            logger.info("✓ SUCCESS: End date extracted successfully!")
            logger.info(f"End date found: {end_date}")
            # Verify the calculation is correct (should be 2026-06-02)
            expected_end_date = "2026-06-02"  # June 2, 2025 + 1 year
            if end_date == expected_end_date:
                logger.info("✓ CALCULATION CORRECT: End date matches expected calculation")
            else:
                logger.warning(f"⚠ CALCULATION ISSUE: Expected {expected_end_date}, got {end_date}")
            return True
        else:
            logger.warning("✗ FAILED: End date not extracted even with explicit term clause")
            return False
    except Exception as e:
        logger.error(f"Test failed with error: {e}")
        import traceback
        logger.error(traceback.format_exc())
        return False
Return Value
Returns a boolean. True means a non-empty end date was extracted; the value is also compared against the expected date (2026-06-02), but a mismatch only logs a warning and the function still returns True. False means no end date was extracted or an exception was raised during the test. Side effects: detailed logging output and a 'simulated_test_results.json' file containing the keyword check and both step results.
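Because the function returns True whenever an end date is extracted, a caller that needs to tell a correct date from a wrong one has to inspect the result itself. The following sketch shows one way to do that; `classify_end_date` is a hypothetical helper, not part of the project, and it assumes the step 2 result is a dict with an `end_date` key as in the source above.

```python
def classify_end_date(step2_result: dict, expected: str = "2026-06-02") -> str:
    """Classify an extraction result as 'missing', 'mismatch', or 'match'.

    Hypothetical helper: mirrors the emptiness check used in the test,
    but reports a wrong date separately instead of treating it as success.
    """
    end_date = step2_result.get("end_date")
    if end_date in ("null", None, ""):
        return "missing"
    return "match" if end_date == expected else "mismatch"


# Illustrative inputs only; these are not real LLM outputs.
print(classify_end_date({"end_date": "2026-06-02"}))  # match
print(classify_end_date({"end_date": "2026-06-01"}))  # mismatch
print(classify_end_date({"end_date": "null"}))        # missing
```

A stricter test could then assert that the classification is "match" instead of relying on the boolean alone.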
Dependencies
- pathlib
- json
- logging
- traceback
- os
- sys
Required Imports
import os
import sys
import json
from pathlib import Path
from config.config import Config
from utils.llm_client import LLMClient
import logging
import traceback
Usage Example
import json
import logging

# Ensure prerequisites are set up
# 1. Define SIMULATED_DOCUMENT_TEXT constant with contract text
# 2. Implement setup_test_logging() function
# 3. Configure config/config.py with LLM settings
SIMULATED_DOCUMENT_TEXT = '''
This Confidential Disclosure Agreement shall commence on June 2, 2025
and continue for a period of one (1) year, with automatic renewal...
'''

def setup_test_logging():
    logging.basicConfig(level=logging.INFO)
    return logging.getLogger(__name__)

# Run the test
if __name__ == '__main__':
    success = test_simulated_document()
    if success:
        # True only means an end date was extracted; check the log for a calculation warning
        print('Test passed: End date extracted')
    else:
        print('Test failed: End date extraction issue')
    # Review detailed results
    with open('simulated_test_results.json', 'r') as f:
        results = json.load(f)
    print(json.dumps(results, indent=2))
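The test writes its results with `json.dump(..., default=str)`, so any value that JSON cannot represent natively is converted with `str()` before it is written. The sketch below illustrates the consequence when reading the file back: such values come back as plain strings. The dictionary contents are made-up illustration data, not real test output, and whether the LLM client actually returns `date` objects is an assumption.

```python
import json
from datetime import date

# Made-up result data for illustration; the real step results may differ.
result_data = {
    "test_type": "simulated_document",
    "step2_result": {"end_date": date(2026, 6, 2), "confidence": 0.9},
}

# default=str is called for objects json cannot serialize (here, the date).
serialized = json.dumps(result_data, indent=2, default=str)
loaded = json.loads(serialized)

print(type(loaded["step2_result"]["end_date"]).__name__)  # str
print(loaded["step2_result"]["end_date"])                  # 2026-06-02
```

When comparing saved results against expected values, compare strings (or parse them explicitly) and do not expect the original object types.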
Best Practices
- Ensure SIMULATED_DOCUMENT_TEXT contains realistic contract language with clear term clauses for accurate testing
- Review the generated 'simulated_test_results.json' file for detailed analysis of extraction results
- Monitor logging output to understand the two-step extraction process and identify any issues
- Verify that the expected end date (2026-06-02) matches your test document's term clause calculations
- Ensure LLM API credentials are properly configured before running the test to avoid authentication errors
- Run this test in isolation or as part of a test suite to validate contract analysis functionality
- Check that the Config object's 'llm' section contains all required parameters for the LLMClient
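- Handle the boolean return value when integrating this test into automated pipelines, keeping in mind that True only means an end date was extracted; a wrong date is reported as a logged warning, so check the extracted value separately if correctness matters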
- Be aware that LLM responses may vary; consider running multiple times for consistency validation
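The last point, checking consistency across repeated runs, can be sketched as follows. The helper `summarize_end_dates` and the stand-in `fake_extract` are hypothetical; in practice the callable would run both extraction steps against the LLM and return `step2_result.get('end_date')`.

```python
from collections import Counter
from typing import Callable, Optional


def summarize_end_dates(extract: Callable[[], Optional[str]], runs: int = 5) -> Counter:
    """Call `extract` `runs` times and count each distinct end date returned."""
    return Counter(extract() for _ in range(runs))


def fake_extract() -> str:
    # Stand-in for a real LLM-backed extraction (assumption for this sketch).
    return "2026-06-02"


counts = summarize_end_dates(fake_extract, runs=3)
is_consistent = len(counts) == 1  # every run produced the same end date
print(counts, is_consistent)
```

More than one key in the counter indicates that the model produced different end dates for the same document, which is worth investigating before trusting a single passing run.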
Similar Components
AI-powered semantic similarity - components with related functionality:
- function test_with_simulated_content 86.1% similar
- function test_end_date_extraction 82.7% similar
- function test_local_document 82.2% similar
- function test_single_document 80.8% similar
- function test_llm_extraction 70.7% similar