Code Extractor

function test_new_fields

Maturity: 47

A test function that validates an LLM client's ability to extract third-party email addresses and tax identification numbers from contract documents.

File: /tf/active/vicechatdev/contract_validity_analyzer/test_new_fields.py
Lines: 18 - 104
Complexity: moderate

Purpose

This function serves as an integration test for the LLMClient's contract analysis capabilities, specifically the extraction of third-party emails and tax IDs from a sample confidentiality agreement. It builds a sample contract text inline, passes it to LLMClient.analyze_contract, and prints whether the expected fields (third_party_emails and third_party_tax_ids) are present and contain the expected values. These checks are reported on the console only; they do not affect the return value.

Source Code

def test_new_fields():
    """Test that the LLM client can extract third party emails and tax IDs."""
    
    # Sample contract text with email and tax ID information
    test_document = """
    CONFIDENTIALITY AND NON-DISCLOSURE AGREEMENT

    This Confidentiality and Non-Disclosure Agreement ("Agreement") is entered into on January 15, 2024, 
    between ViceBio Ltd, a company incorporated in England and Wales, and TechCorp Solutions Inc., 
    a Delaware corporation.

    TechCorp Solutions Inc.
    123 Innovation Drive
    San Francisco, CA 94105
    Email: contracts@techcorp.com
    Federal Tax ID: 12-3456789
    Phone: (555) 123-4567

    Contact Person: Jane Smith
    Email: jane.smith@techcorp.com

    This Agreement shall commence on January 15, 2024 and shall remain in effect for a period of three (3) years 
    from the effective date, unless terminated earlier in accordance with the terms herein.

    The parties agree to maintain confidentiality of all proprietary information exchanged during the term 
    of this Agreement.

    Additional Contact:
    Legal Department: legal@techcorp.com
    Tax Department: tax@techcorp.com

    Registration Number: 987654321
    Business License: BL-2023-456

    Signed:
    ViceBio Ltd
    TechCorp Solutions Inc.
    """
    
    # Initialize LLM client
    config = {
        'provider': 'openai',
        'model': 'gpt-4o',
        'temperature': 0.0,
        'max_tokens': 4000
    }
    
    llm_client = LLMClient(config)
    
    # Test the contract analysis
    print("Testing contract analysis with new fields...")
    try:
        result = llm_client.analyze_contract(test_document, "test_contract.pdf")
        
        print("\nAnalysis Result:")
        print(json.dumps(result, indent=2))
        
        # Verify new fields are present
        if 'third_party_emails' in result:
            print(f"\n✓ Third party emails found: {result['third_party_emails']}")
        else:
            print("\n✗ third_party_emails field missing")
            
        if 'third_party_tax_ids' in result:
            print(f"✓ Third party tax IDs found: {result['third_party_tax_ids']}")
        else:
            print("✗ third_party_tax_ids field missing")
            
        # Check if emails were actually extracted
        emails = result.get('third_party_emails', [])
        if any('techcorp.com' in email for email in emails):
            print("✓ Successfully extracted TechCorp emails")
        else:
            print("✗ Failed to extract expected emails")
            
        # Check if tax IDs were extracted
        tax_ids = result.get('third_party_tax_ids', [])
        if any('12-3456789' in tax_id or '987654321' in tax_id for tax_id in tax_ids):
            print("✓ Successfully extracted tax/registration IDs")
        else:
            print("✗ Failed to extract expected tax/registration IDs")
            
        return True
        
    except Exception as e:
        print(f"Error during analysis: {e}")
        return False

Return Value

Returns a boolean value: True if the analysis completes successfully (regardless of extraction accuracy), False if an exception occurs during the analysis process. The function also prints detailed output to console showing the analysis results and validation checks.
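
Because the return value only reflects whether the call completed, a caller that needs a strict pass/fail signal has to inspect the result dictionary itself. The following is a minimal sketch of such a check. `check_extraction` and its parameters are hypothetical names that are not part of the original module, and the default expected values are simply the ones embedded in the sample document.

```python
from typing import Any, Dict, List, Sequence


def check_extraction(
    result: Dict[str, Any],
    expected_email_domain: str = "techcorp.com",
    expected_ids: Sequence[str] = ("12-3456789", "987654321"),
) -> List[str]:
    """Return a list of failure messages; an empty list means every check passed.

    Hypothetical helper: mirrors the printed checks in test_new_fields,
    but collects failures instead of printing them.
    """
    failures: List[str] = []

    # Presence checks for the two new fields
    for field in ("third_party_emails", "third_party_tax_ids"):
        if field not in result:
            failures.append(f"{field} field missing")

    # At least one email should contain the expected domain
    emails = result.get("third_party_emails") or []
    if not any(expected_email_domain in str(email) for email in emails):
        failures.append("no email with expected domain")

    # At least one extracted ID should contain one of the expected values
    tax_ids = result.get("third_party_tax_ids") or []
    if not any(exp in str(tax_id) for tax_id in tax_ids for exp in expected_ids):
        failures.append("no expected tax/registration ID")

    return failures
```

A caller could then treat the run as passed only when the helper returns an empty list, for example `passed = not check_extraction(result)`.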

Dependencies

  • openai
  • anthropic
  • json
  • pathlib

Required Imports

import os
import sys
import json
from pathlib import Path
from utils.llm_client import LLMClient

Conditional/Optional Imports

These imports are only needed under specific conditions:

import openai

Condition: required by LLMClient when provider is set to 'openai'. Status: required (conditional).

import anthropic

Condition: required by LLMClient if provider is set to 'anthropic' or 'claude'. Status: optional.
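
Since the provider package is only needed for the configured provider, it can be useful to check for it before constructing the client. The sketch below uses the standard library's `importlib.util.find_spec`. The `PROVIDER_PACKAGES` mapping and the `provider_package_installed` function are hypothetical; the mapping is inferred from the conditions listed here, not taken from the LLMClient source.

```python
import importlib.util

# Assumed mapping from provider name to the package LLMClient would import.
# Inferred from the conditions documented above; not read from LLMClient itself.
PROVIDER_PACKAGES = {
    "openai": "openai",
    "anthropic": "anthropic",
    "claude": "anthropic",
}


def provider_package_installed(provider: str) -> bool:
    """Return True if the package assumed to back `provider` is importable."""
    try:
        package = PROVIDER_PACKAGES[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider!r}") from None
    # find_spec returns None when a top-level package cannot be found
    return importlib.util.find_spec(package) is not None
```

A test script could call `provider_package_installed(config['provider'])` first and skip the run with a clear message when it returns False.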

Usage Example

# Ensure environment variables are set (placeholder value shown)
import os
os.environ['OPENAI_API_KEY'] = 'your-api-key-here'

# Import required modules
import sys

# Import the test function; assumes test_new_fields.py is on the import path
from test_new_fields import test_new_fields

# Run the test (True means the analysis completed, not that every check passed)
result = test_new_fields()
if result:
    print('Analysis completed; review the console output for the check results')
else:
    print('Test failed')

# Alternative: run as a script and report the outcome through the exit code
if __name__ == '__main__':
    success = test_new_fields()
    sys.exit(0 if success else 1)

Best Practices

  • Ensure API keys are properly set in environment variables before running this test
  • The test uses temperature=0.0 to make results as repeatable as possible, which is appropriate for testing; LLM output may still vary slightly between runs
  • The function prints detailed output for debugging; consider capturing this output in automated test environments
  • The test document contains specific patterns (techcorp.com emails, tax ID 12-3456789); modify these if testing different extraction patterns
  • Consider wrapping this in a proper test framework (pytest, unittest) for production use
  • The function returns True even if extraction fails but analysis completes; check console output for actual validation results
  • API calls to LLM providers may incur costs; be mindful when running repeatedly
  • The max_tokens setting (4000) should be sufficient for most contract analyses but may need adjustment for larger documents
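
Several of the points above (using a test framework, avoiding API costs, capturing output, and surfacing extraction failures) can be combined in one place. The following is a sketch only. `StubLLMClient`, `run_extraction_check`, and `test_new_fields_with_stub` are hypothetical names; the stub mimics the `analyze_contract(text, filename)` call seen in the source, and the canned response shape is an assumption based on the two fields the test inspects.

```python
import contextlib
import io
import json


class StubLLMClient:
    """Hypothetical stand-in for LLMClient: returns a canned result, makes no API call."""

    def __init__(self, canned_result):
        self._canned_result = canned_result

    def analyze_contract(self, document_text, filename):
        # Return a copy so callers cannot mutate the canned data
        return dict(self._canned_result)


def run_extraction_check(client, document_text):
    """Assert on the extracted fields instead of only printing them."""
    result = client.analyze_contract(document_text, "test_contract.pdf")
    print(json.dumps(result, indent=2))
    assert "third_party_emails" in result
    assert "third_party_tax_ids" in result
    assert any("techcorp.com" in e for e in result["third_party_emails"])
    assert any("12-3456789" in t for t in result["third_party_tax_ids"])
    return result


def test_new_fields_with_stub():
    stub = StubLLMClient({
        "third_party_emails": ["contracts@techcorp.com"],
        "third_party_tax_ids": ["12-3456789"],
    })
    # Capture the printed output so it can be inspected or logged
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        run_extraction_check(stub, "Email: contracts@techcorp.com\nFederal Tax ID: 12-3456789")
    assert "third_party_emails" in buffer.getvalue()
```

Under pytest the last function would be collected by its `test_` prefix, and a failed extraction would fail the test rather than only printing a message. The real LLMClient could be passed to `run_extraction_check` for an occasional live run.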

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function test_llm_client 81.7% similar

    Tests the LLM client functionality by analyzing a sample contract text and verifying the extraction of key contract metadata such as third parties, dates, and status.

    From: /tf/active/vicechatdev/contract_validity_analyzer/test_implementation.py
  • function test_international_tax_ids 79.9% similar

    A test function that validates an LLM client's ability to extract tax identification numbers and business registration numbers from a multi-party international contract document across 8 different countries.

    From: /tf/active/vicechatdev/contract_validity_analyzer/test_international_tax_ids.py
  • function test_llm_extraction 78.2% similar

    A test function that validates LLM-based contract data extraction by processing a sample contract and verifying the extracted fields against expected values.

    From: /tf/active/vicechatdev/contract_validity_analyzer/test_extractor.py
  • function test_llm_connectivity 71.9% similar

    Tests the connectivity and functionality of an OpenAI LLM integration by analyzing a mock email with vendor information extraction.

    From: /tf/active/vicechatdev/find_email/test_vendor_extractor.py
  • function test_edge_cases 68.8% similar

    Tests edge cases and variations in European tax ID formats by analyzing a sample contract document containing Swiss, Norwegian, Swedish, and Danish tax identifiers.

    From: /tf/active/vicechatdev/contract_validity_analyzer/test_international_tax_ids.py