function demo_analysis_workflow
Demonstrates a complete end-to-end statistical analysis workflow using the SmartStat system, including session creation, data loading, natural language query processing, analysis execution, and result interpretation.
File: /tf/active/vicechatdev/full_smartstat/demo.py
Lines: 72-179
Complexity: complex
Purpose
This function serves as a comprehensive demonstration and testing tool for the SmartStat statistical analysis platform. It showcases the full workflow from data ingestion through natural language query interpretation to statistical analysis execution and visualization. The function creates sample quality control data, processes multiple types of statistical queries (descriptive statistics, normality tests, ANOVA, correlation analysis), and generates interpretations of results. It is primarily used for demonstrations and testing, and serves as a reference implementation for integrating the SmartStat services.
Source Code
def demo_analysis_workflow():
    """Demonstrate the complete analysis workflow"""
    print("\n🔧 SmartStat Demo - Statistical Analysis Workflow")
    print("=" * 60)

    # Initialize services
    config = DevelopmentConfig()
    service = StatisticalAnalysisService(config)

    # Create sample data
    csv_path, sample_df = create_sample_data()

    # Create analysis session
    print("\n📋 Creating analysis session...")
    session_id = service.create_analysis_session(
        title="Demo Analysis - Quality Control Study",
        description="Demonstration of SmartStat capabilities with quality control data"
    )
    print(f"   Session created: {session_id}")

    # Load data
    print("\n📊 Loading data...")
    data_source = DataSource(
        source_type=DataSourceType.FILE_UPLOAD,
        file_path=csv_path
    )
    load_result = service.load_data_for_session(session_id, data_source)

    if load_result['success']:
        print("   ✅ Data loaded successfully")
        print(f"   Shape: {load_result['shape']}")
        print(f"   Columns: {load_result['columns']}")
    else:
        print(f"   ❌ Data loading failed: {load_result['error']}")
        return

    # Demonstrate query interpretation
    print("\n🤖 Testing natural language queries...")
    queries = [
        "Generate summary statistics for all numeric variables",
        "Test if the measurement data follows a normal distribution",
        "Compare mean measurements between the three groups using ANOVA",
        "Create correlation analysis between measurement and quality score"
    ]

    for i, query in enumerate(queries, 1):
        print(f"\n   Query {i}: {query}")

        # Process query
        interpretation_result = service.process_user_query(session_id, query)

        if interpretation_result['success']:
            print("   ✅ Interpretation successful")
            analysis_plan = interpretation_result['interpretation'].get('analysis_plan', {})
            print(f"   Analysis type: {analysis_plan.get('analysis_type', 'unknown')}")
            print(f"   Target variables: {analysis_plan.get('target_variables', [])}")

            # Generate and execute analysis
            print("   🔧 Generating and executing analysis...")
            analysis_result = service.generate_and_execute_analysis(
                session_id,
                interpretation_result['suggested_config'],
                query
            )

            if analysis_result['success']:
                print("   ✅ Analysis completed successfully")
                if analysis_result.get('execution_result', {}).get('plots'):
                    print(f"   📊 Generated {len(analysis_result['execution_result']['plots'])} plots")
            else:
                print(f"   ⚠️ Analysis failed: {analysis_result.get('error', 'Unknown error')}")
                if analysis_result.get('debug_available'):
                    print("   🔧 Debug mode available for failed step")
        else:
            print(f"   ❌ Query interpretation failed: {interpretation_result['error']}")

    # Get session summary
    print("\n📋 Session Summary...")
    summary = service.get_session_summary(session_id)

    if summary['success']:
        print("   ✅ Analysis session completed")
        print(f"   Total steps: {summary['summary']['total_steps']}")
        print(f"   Successful steps: {summary['summary']['successful_steps']}")
        print(f"   Plots generated: {summary['summary']['plots_generated']}")
        print(f"   Session status: {summary['summary']['status']}")
    else:
        print("   ❌ Could not retrieve session summary")

    # Generate interpretation
    print("\n📝 Generating results interpretation...")
    interpretation = service.generate_interpretation(session_id)

    if interpretation['success']:
        print("   ✅ Interpretation generated")
        key_findings = interpretation.get('key_findings', [])
        if key_findings:
            print("   Key findings:")
            for finding in key_findings[:3]:  # Show first 3
                print(f"   • {finding}")
    else:
        print("   ⚠️ Interpretation generation failed")

    print(f"\n🎉 Demo completed! Session ID: {session_id}")
    print(f"   Access the web interface at: http://localhost:5000/workspace/{session_id}")

    return session_id
Return Value
Returns a string representing the session_id of the created analysis session. This ID can be used to access the analysis results through the web interface at http://localhost:5000/workspace/{session_id}. Returns None if the data loading step fails early in the workflow.
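Because the function returns None when data loading fails, callers should guard before constructing the workspace URL. A minimal sketch, using a hypothetical build_workspace_url helper (not part of SmartStat) and the demo's default localhost base URL:

```python
def build_workspace_url(session_id, base_url="http://localhost:5000"):
    """Return the workspace URL for a session, or None on early failure.

    base_url matches the demo's default; adjust it for non-local
    deployments. This helper is illustrative, not a SmartStat API.
    """
    if session_id is None:
        # demo_analysis_workflow() returns None if data loading failed
        return None
    return f"{base_url}/workspace/{session_id}"

url = build_workspace_url("abc123")   # -> "http://localhost:5000/workspace/abc123"
failed = build_workspace_url(None)    # -> None
```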
Dependencies
pandas, numpy, json, datetime, sys, os, traceback
Required Imports
import pandas as pd
import numpy as np
import json
from datetime import datetime
import sys
import os
import traceback
from models import StatisticalSession, DataSource, DataSourceType, AnalysisConfiguration, AnalysisType, DatabaseManager
from services import StatisticalAnalysisService
from config import DevelopmentConfig
from statistical_agent import StatisticalAgent
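The demo also depends on a create_sample_data() helper that is not shown here. A minimal sketch of what such a helper might look like, assuming a quality-control dataset with a three-level group factor, a continuous measurement, and a quality score; the column names, distributions, and filename are illustrative assumptions only:

```python
import numpy as np
import pandas as pd

def create_sample_data(csv_path="demo_quality_control.csv", n_per_group=50, seed=42):
    """Hypothetical stand-in for the demo's create_sample_data() helper.

    Builds a quality-control dataset with three groups, writes it to CSV,
    and returns (csv_path, sample_df) as the demo expects.
    """
    rng = np.random.default_rng(seed)
    sample_df = pd.DataFrame({
        "group": np.repeat(["A", "B", "C"], n_per_group),
        # Slightly different group means give the ANOVA query an effect to detect
        "measurement": np.concatenate([
            rng.normal(loc=10.0 + shift, scale=1.5, size=n_per_group)
            for shift in (0.0, 0.5, 1.0)
        ]),
        "quality_score": rng.uniform(60, 100, size=3 * n_per_group).round(1),
    })
    sample_df.to_csv(csv_path, index=False)
    return csv_path, sample_df
```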
Usage Example
# Basic usage - run the complete demo workflow
session_id = demo_analysis_workflow()
# The function will:
# 1. Create a sample dataset with quality control measurements
# 2. Initialize an analysis session
# 3. Load the data into the session
# 4. Process 4 different natural language queries:
# - Summary statistics
# - Normality testing
# - ANOVA comparison
# - Correlation analysis
# 5. Generate interpretations of results
# 6. Print progress and results to console
# 7. Return the session_id for further access
# Access results via web interface
print(f"View results at: http://localhost:5000/workspace/{session_id}")
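The demo leaves its sample CSV on disk; one cleanup pattern is to write it into a temporary directory instead. A sketch, with a tiny placeholder DataFrame standing in for create_sample_data()'s output:

```python
import os
import tempfile

import pandas as pd

# Writing the demo CSV into a temporary directory means it is removed
# automatically; the DataFrame below is a placeholder for the
# quality-control data that create_sample_data() would produce.
with tempfile.TemporaryDirectory() as tmpdir:
    csv_path = os.path.join(tmpdir, "sample_quality_control.csv")
    pd.DataFrame({"measurement": [1.0, 2.0, 3.0]}).to_csv(csv_path, index=False)
    # ... load csv_path into the analysis session here ...
    print(os.path.exists(csv_path))   # True while the context is open
# The directory and its CSV are deleted once the with-block exits
print(os.path.exists(csv_path))       # False
```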
Best Practices
- This function is designed for demonstration and testing purposes, not for production use
- Ensure all required services (database, web server) are running before executing
- The function creates temporary CSV files - ensure proper cleanup in production environments
- Monitor console output for detailed progress and error messages during execution
- The function processes 4 predefined queries sequentially - failures in one query don't stop subsequent queries
- Session IDs are generated and can be reused to access analysis results later
- The create_sample_data() function must be defined in the same module or imported separately
- Consider wrapping this function in error handling when integrating into larger applications
- The function assumes localhost:5000 for the web interface - adjust URL if deploying elsewhere
- Each query demonstrates a different type of statistical analysis - useful as a template for custom queries
Similar Components
AI-powered semantic similarity - components with related functionality:
- function demonstrate_sql_workflow_v1 (76.9% similar)
- function demonstrate_sql_workflow (76.3% similar)
- function main_v62 (75.8% similar)
- function main_v61 (71.7% similar)
- function demo_statistical_agent (69.9% similar)