function main_v27

Maturity: 46

Demonstrates usage of the VendorEmailExtractor class by searching Office 365 mailboxes for a vendor's email addresses and printing a summary of the results.

File: /tf/active/vicechatdev/find_email/vendor_email_extractor.py
Lines: 948 - 1000
Complexity: moderate

Purpose

This function is a demonstration and testing entry point for the VendorEmailExtractor system. It loads credentials from vendor_email_config.py, initializes the extractor with Office 365 and OpenAI settings, and runs an unbounded search (no cap on mailboxes, emails per mailbox, or date range) for one hard-coded vendor, Merck. It then prints the total record count, the number of unique vendor addresses, and, for the first ten unique addresses, the most frequent confidence value, the number of matching emails, and the number of mailboxes in which the address was found.

Source Code

def main():
    """Example usage"""
    # Load configuration
    try:
        from vendor_email_config import (
            TENANT_ID, CLIENT_ID, CLIENT_SECRET, 
            OPENAI_API_KEY, DOMAIN
        )
    except ImportError:
        print("ERROR: vendor_email_config.py not found!")
        print("Please create it from the O365_APP_SETUP_GUIDE.md instructions")
        return
    
    # Create extractor
    extractor = VendorEmailExtractor(
        tenant_id=TENANT_ID,
        client_id=CLIENT_ID,
        client_secret=CLIENT_SECRET,
        openai_api_key=OPENAI_API_KEY,
        domain=DOMAIN
    )
    
    # Example: Extract for single vendor across ALL mailboxes
    print("\n=== Single Vendor Test - All Mailboxes ===")
    print("Searching all vicebio.com mailboxes for: Merck")
    print("No limits: All mailboxes, all emails, all time\n")
    
    df = extractor.extract_for_vendor(
        vendor_name="Merck",
        max_mailboxes=None,  # Search ALL mailboxes
        max_emails_per_mailbox=None,  # No email limit - get ALL matching emails
        days_back=None  # No date limit
    )
    
    print(f"\n{'='*60}")
    print("TEST COMPLETE")
    print(f"{'='*60}")
    print(f"Found {len(df)} total records")
    
    if not df.empty:
        print(f"\nUnique vendor emails: {df['vendor_email'].nunique()}")
        print("\nTop results:")
        for email in df['vendor_email'].unique()[:10]:
            # Filter once and reuse the subset for all three statistics
            rows = df[df['vendor_email'] == email]
            count = len(rows)
            conf = rows['confidence'].mode()[0]  # most frequent confidence value
            mailboxes = rows['found_in_mailbox'].nunique()
            print(f"  • {email}")
            print(f"    Confidence: {conf}, Found in {count} emails across {mailboxes} mailboxes")
    else:
        print("\n⚠️  No vendor emails found for Merck")
        print("This may be normal if Merck is not in email communications")
    
    print(f"\n{'='*60}\n")

Return Value

This function returns no value (implicitly None), including on the early exit taken when vendor_email_config.py cannot be imported. Its effects are side effects: it prints results to stdout and constructs a VendorEmailExtractor instance, which may cache data.
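The per-address summary that main() prints could instead be returned as data. The following is a minimal sketch, not the author's implementation: summarize_vendor_emails is a hypothetical helper name, and the sample rows are invented for illustration. It assumes only the three column names that main() reads (vendor_email, confidence, found_in_mailbox).

```python
import pandas as pd


def summarize_vendor_emails(df: pd.DataFrame, top_n: int = 10) -> pd.DataFrame:
    """Hypothetical helper: one row per vendor address, returned rather than printed."""
    columns = ["vendor_email", "email_count", "confidence", "mailboxes"]
    if df.empty:
        return pd.DataFrame(columns=columns)
    summary = (
        # sort=False keeps first-appearance order, like unique()[:10] in main()
        df.groupby("vendor_email", sort=False)
        .agg(
            email_count=("confidence", "size"),                      # rows per address
            confidence=("confidence", lambda s: s.mode().iloc[0]),   # most frequent value
            mailboxes=("found_in_mailbox", "nunique"),               # distinct mailboxes
        )
        .reset_index()
    )
    return summary.head(top_n)


# Invented sample data, only to show the shape of the result
records = pd.DataFrame({
    "vendor_email": ["a@example.com", "a@example.com", "b@example.com"],
    "confidence": ["high", "high", "low"],
    "found_in_mailbox": ["u1@example.org", "u2@example.org", "u1@example.org"],
})
summary = summarize_vendor_emails(records)
print(summary.to_string(index=False))
```

A caller can then print, filter, or export the returned frame as needed, which keeps the display logic out of the extraction path.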

Dependencies

  • msal
  • requests
  • pandas
  • openai
  • pathlib

Required Imports

from vendor_email_config import TENANT_ID, CLIENT_ID, CLIENT_SECRET, OPENAI_API_KEY, DOMAIN

Conditional/Optional Imports

The vendor_email_config import listed above is executed inside main(), within a try/except ImportError block, rather than at module level.

Condition: vendor_email_config.py must exist and define these five constants. If the import fails, main() prints an error message and returns without searching.
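For orientation, a configuration module satisfying this import might look like the sketch below. Only the five constant names come from the source; the environment variable names, the placeholder values, and the choice to read from the environment at all are assumptions. The O365_APP_SETUP_GUIDE.md referenced by main() remains the authoritative description.

```python
# vendor_email_config.py (hypothetical sketch; the real file's layout is not shown here)
import os

# App registration values used for Office 365 access.
# The environment variable names and fallback placeholders are assumptions.
TENANT_ID = os.environ.get("O365_TENANT_ID", "00000000-0000-0000-0000-000000000000")
CLIENT_ID = os.environ.get("O365_CLIENT_ID", "00000000-0000-0000-0000-000000000000")
CLIENT_SECRET = os.environ.get("O365_CLIENT_SECRET", "replace-me")

# Key passed to the extractor as openai_api_key.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY", "replace-me")

# Mail domain to search; the default mirrors the domain named in main()'s output.
DOMAIN = os.environ.get("O365_DOMAIN", "vicebio.com")
```

Reading secrets from the environment keeps them out of version control, but plain string assignments would satisfy the import just as well.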

Usage Example

if __name__ == '__main__':
    main()

Best Practices

  • Ensure vendor_email_config.py exists before running this function; otherwise it prints an error message and exits early
  • The function searches ALL mailboxes with no limits (max_mailboxes=None, max_emails_per_mailbox=None, days_back=None), which may take significant time and consume substantial API quota in large organizations
  • This is a demonstration/testing function, not production code; consider adding parameters for the vendor name and search limits
  • The function assumes the VendorEmailExtractor class is defined in the same module
  • Error handling is minimal: only the ImportError raised by a missing config file is caught
  • Results are printed to stdout rather than returned, which makes the function unsuitable for programmatic use
  • Consider wrapping the main logic in try-except blocks to handle API errors gracefully
  • For production use, extract the core logic into a separate function that returns data rather than printing it
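Several of these points (parameters for the vendor and limits, graceful failure, returning data) can be combined in a thin wrapper. The sketch below is one possible shape, not part of the module: run_vendor_search and _StubExtractor are hypothetical names, the default limits of 5, 50, and 90 are arbitrary illustrative values, and the stub exists only so the example runs without Office 365 access. The keyword arguments match the extract_for_vendor call shown in the source.

```python
from typing import Any, Optional


def run_vendor_search(
    extractor: Any,
    vendor_name: str,
    max_mailboxes: Optional[int] = 5,            # arbitrary illustrative default
    max_emails_per_mailbox: Optional[int] = 50,  # arbitrary illustrative default
    days_back: Optional[int] = 90,               # arbitrary illustrative default
) -> Optional[Any]:
    """Call extract_for_vendor with bounded defaults; return its result, or None on failure."""
    try:
        return extractor.extract_for_vendor(
            vendor_name=vendor_name,
            max_mailboxes=max_mailboxes,
            max_emails_per_mailbox=max_emails_per_mailbox,
            days_back=days_back,
        )
    except Exception as exc:
        # Broad on purpose: the extractor's specific exception types are not shown in the source
        print(f"Search for {vendor_name!r} failed: {exc}")
        return None


class _StubExtractor:
    """Hypothetical stand-in for VendorEmailExtractor, used only to exercise the wrapper."""

    def extract_for_vendor(self, vendor_name, max_mailboxes, max_emails_per_mailbox, days_back):
        if not vendor_name:
            raise ValueError("vendor_name must not be empty")
        return {"vendor": vendor_name, "max_mailboxes": max_mailboxes, "days_back": days_back}


ok = run_vendor_search(_StubExtractor(), "Merck")
failed = run_vendor_search(_StubExtractor(), "")
```

Passing None for any limit reproduces the unbounded behavior of main(), so the wrapper narrows the defaults without removing the option.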

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function test_email_search 75.5% similar

    Tests the email search functionality of a VendorEmailExtractor instance by searching for emails containing common business terms in the first available mailbox.

    From: /tf/active/vicechatdev/find_email/test_vendor_extractor.py
  • function main_v28 75.3% similar

    Command-line entry point that parses arguments and orchestrates the extraction of vendor emails from all vicebio.com mailboxes using Microsoft Graph API.

    From: /tf/active/vicechatdev/find_email/extract_vendor_batch.py
  • function extract_batch 73.5% similar

    Batch processes a list of vendors from an Excel file to extract their email addresses by searching through Microsoft 365 mailboxes using AI-powered email analysis.

    From: /tf/active/vicechatdev/find_email/extract_vendor_batch.py
  • class VendorEmailExtractor 70.6% similar

    Extract vendor email addresses from all organizational mailboxes

    From: /tf/active/vicechatdev/find_email/vendor_email_extractor.py
  • function test_mailbox_access 70.1% similar

    Tests the ability to access and retrieve mailboxes from Microsoft Graph API through a VendorEmailExtractor instance, displaying results and troubleshooting information.

    From: /tf/active/vicechatdev/find_email/test_vendor_extractor.py