function test_agent_executor
Integration test function that validates the AgentExecutor's ability to generate and execute data analysis projects using synthetic test data.
/tf/active/vicechatdev/full_smartstat/debug_agent.py
24 - 101
complex
Purpose
This test function validates the complete workflow of the AgentExecutor system by: (1) creating a synthetic pandas DataFrame with various data types and missing values, (2) generating an analysis project based on a user query and data summary, and (3) executing the generated project to produce analysis outputs. It serves as both a functional test and a demonstration of the agent executor's capabilities for automated data analysis.
Source Code
def test_agent_executor():
    """Test the agent executor with a simple analysis"""
    # Initialize config and agent
    app_config = config['development']()
    agent = AgentExecutor(app_config)

    # Create test data summary
    test_data_summary = {
        'shape': (100, 5),
        'column_info': {
            'A': {'type': 'float64', 'non_null': 100, 'unique': 95},
            'B': {'type': 'float64', 'non_null': 100, 'unique': 88},
            'C': {'type': 'int64', 'non_null': 100, 'unique': 3},
            'D': {'type': 'object', 'non_null': 100, 'unique': 2},
            'E': {'type': 'float64', 'non_null': 95, 'unique': 85}
        }
    }

    test_session_id = "debug_session_001"
    test_query = "Generate descriptive statistics and basic plots for this dataset"

    print(f"Testing agent executor with session {test_session_id}")
    print(f"Query: {test_query}")
    print(f"Data summary: {test_data_summary}")
    print("-" * 50)

    # Create test data
    import pandas as pd
    import numpy as np

    # Generate test dataframe
    np.random.seed(42)
    test_df = pd.DataFrame({
        'A': np.random.normal(100, 15, 100),
        'B': np.random.normal(50, 10, 100),
        'C': np.random.choice([1, 2, 3], 100),
        'D': np.random.choice(['Group1', 'Group2'], 100),
        'E': np.random.normal(25, 5, 100)
    })

    # Add some missing values
    test_df.loc[95:99, 'E'] = np.nan

    # Test project generation
    print("Step 1: Generating analysis project...")
    project_result = agent.generate_analysis_project(
        session_id=test_session_id,
        user_query=test_query,
        data_summary=test_data_summary,
        analysis_config=None,
        session_data=test_df  # Pass the test dataframe
    )

    print(f"Project generation result: {project_result}")

    if project_result['success']:
        print(f"Project created at: {project_result['project_dir']}")

        # Test project execution
        print("\nStep 2: Executing analysis project...")
        execution_result = agent.execute_analysis_project(
            project_dir=project_result['project_dir'],
            max_iterations=2
        )

        print(f"Execution result: {execution_result}")

        # Check generated files
        project_path = Path(project_result['project_dir'])
        print(f"\nFiles in project directory {project_path}:")
        if project_path.exists():
            for file_path in project_path.rglob('*'):
                if file_path.is_file():
                    print(f"  {file_path.relative_to(project_path)} ({file_path.stat().st_size} bytes)")
    else:
        print(f"Project generation failed: {project_result.get('error', 'Unknown error')}")
Return Value
This function does not return any value (implicitly returns None). It prints status messages and results to stdout throughout execution, including project generation results, execution results, and a listing of generated files in the project directory.
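Because the test only prints its findings, a caller who wants programmatic checks has to inspect the result dictionaries directly. The sketch below is a minimal adaptation, not part of the original module; it assumes only the keys visible in the source above ('success', 'project_dir', 'error'), and the full result schema may contain additional fields.
from config import config
from agent_executor import AgentExecutor

def check_agent_executor(session_id, query, data_summary, df):
    # Hypothetical helper mirroring the test above, but returning the
    # result dictionaries instead of printing them.
    agent = AgentExecutor(config['development']())
    project_result = agent.generate_analysis_project(
        session_id=session_id,
        user_query=query,
        data_summary=data_summary,
        analysis_config=None,
        session_data=df,
    )
    assert project_result['success'], project_result.get('error', 'Unknown error')
    execution_result = agent.execute_analysis_project(
        project_dir=project_result['project_dir'],
        max_iterations=2,
    )
    return project_result, execution_result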
Dependencies
pandas, numpy, pathlib, logging, os, sys
Required Imports
import os
import sys
import logging
from pathlib import Path
from config import config
from agent_executor import AgentExecutor
from models import DatabaseManager
import pandas as pd
import numpy as np
Usage Example
# Requires config.py, agent_executor.py, and models.py to be importable
# (see Required Imports above), with a valid development configuration
from debug_agent import test_agent_executor

# Run the test
test_agent_executor()

# Expected output:
# - Prints test session information
# - Creates a test DataFrame with 100 rows and 5 columns
# - Generates an analysis project in a project directory
# - Executes the project with at most 2 iterations
# - Lists all generated files with their sizes
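Since the function lives in debug_agent.py, it can also be exercised as a standalone script. The entry point below is a sketch and assumes no other bootstrap code is required by the module; logging is already imported at module level.
if __name__ == '__main__':
    # Assumed bootstrap: configure logging, then run the test.
    logging.basicConfig(level=logging.INFO)
    test_agent_executor()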
Best Practices
- This is a test function and should be run in a development or testing environment, not in production
- Ensure sufficient disk space is available for generated analysis projects and outputs
- The function uses a fixed random seed (42) for reproducibility of test data
- Monitor the output directory for accumulated test projects that may need cleanup
- The max_iterations parameter is set to 2 for testing; adjust based on analysis complexity needs
- Verify that config['development']() returns a valid configuration object before running
- The test creates a session with ID 'debug_session_001', which may conflict with existing sessions
- Consider wrapping this test in a try-except block to handle potential failures gracefully (see the cleanup sketch after this list)
- The function intentionally creates missing values in column 'E' (rows 95-99) to test handling of incomplete data
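A minimal sketch of the wrapping and cleanup suggested above. The projects_root argument and the debug_session_* directory naming are assumptions made for illustration; the real output location and layout depend on the application configuration.
import shutil
from pathlib import Path

from debug_agent import test_agent_executor

def run_test_with_cleanup(projects_root: Path):
    # projects_root is a hypothetical location where generated test projects accumulate.
    try:
        test_agent_executor()
    except Exception as exc:
        print(f"Agent executor test failed: {exc}")
    finally:
        # Remove accumulated debug projects so disk usage stays bounded.
        # The naming pattern is assumed from the test session ID.
        for project_dir in projects_root.glob('debug_session_*'):
            if project_dir.is_dir():
                shutil.rmtree(project_dir)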
Tags
Similar Components
AI-powered semantic similarity - components with related functionality:
- function test_enhanced_workflow (67.3% similar)
- function main_v25 (61.9% similar)
- function test_data_analysis_service (60.7% similar)
- function demo_statistical_agent (60.6% similar)
- class AgentExecutor_v1 (59.1% similar)