function smartstat_get_history

Maturity: 52

Flask API endpoint that retrieves analysis history for a SmartStat session, with automatic session recovery from saved data if the session is not found in memory.

File: /tf/active/vicechatdev/vice_ai/new_app.py
Lines: 5551 - 5621
Complexity: complex

Purpose

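This endpoint returns the analysis history of a SmartStat session. If the session is not in memory, it tries to rebuild it: it looks for one of the current user's data sections whose analysis_session_id matches, recreates the session object, and restores dataframes from the most recently modified project_* directory under smartstat_config.GENERATED_SCRIPTS_FOLDER. If that directory contains any CSV file other than data.csv, the session is restored in multi-dataset mode (one dataset per file, with the first also assigned to session.dataframe); otherwise data.csv alone is loaded in single-dataset mode. Saved history is then loaded from the data section's metadata when present. The function identifies the caller through get_current_user() and checks that the caller owns the data section before returning anything; a session that cannot be recovered yields 404, and an ownership mismatch yields 403.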
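The directory selection and mode detection described above can be sketched in isolation. This is a minimal illustration, not the application's code: `find_latest_project` and `classify_project` are hypothetical helper names, and the `is_dir()` filter is an addition that the original function does not perform.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def find_latest_project(scripts_root: Path, session_id: str) -> Optional[Path]:
    """Return the most recently modified project_* directory for a session, if any."""
    # is_dir() is an extra safety check that the original code does not make.
    candidates = [p for p in (scripts_root / session_id).glob("project_*") if p.is_dir()]
    if not candidates:
        return None
    return max(candidates, key=lambda p: p.stat().st_mtime)


def classify_project(project_dir: Path) -> Tuple[str, List[Path]]:
    """Decide which recovery mode applies to a saved project directory.

    Any CSV other than data.csv means multi-dataset mode; data.csv on its own
    means single-dataset mode; with neither there is nothing to restore.
    """
    extra = sorted(f for f in project_dir.glob("*.csv") if f.name != "data.csv")
    if extra:
        return "multi", extra
    data_file = project_dir / "data.csv"
    if data_file.exists():
        return "single", [data_file]
    return "none", []
```

Note that a project holding both data.csv and other CSV files is treated as multi-dataset, and data.csv itself is not loaded in that case; this mirrors the filter in the source code.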

Source Code

def smartstat_get_history(session_id):
    """Get analysis history for a SmartStat session"""
    user_email = get_current_user()
    
    # Get or recover session
    session = smartstat_service.get_session(session_id)
    if not session:
        logger.warning(f"Session {session_id} not found - attempting to recover")
        all_sections = data_section_service.get_user_data_sections(user_email)
        data_section = next((ds for ds in all_sections if ds.analysis_session_id == session_id), None)
        
        if data_section:
            # Recreate session
            session = SmartStatSession(session_id, data_section.id, data_section.title)
            smartstat_service.sessions[session_id] = session
            
            # Try to restore dataframes from saved project if it exists
            from pathlib import Path
            project_dirs = list(Path(smartstat_config.GENERATED_SCRIPTS_FOLDER).glob(f"{session_id}/project_*"))
            if project_dirs:
                # Get the most recent project
                latest_project = max(project_dirs, key=lambda p: p.stat().st_mtime)
                
                # Check for multi-dataset files (any CSV file other than data.csv)
                csv_files = list(latest_project.glob("*.csv"))
                dataset_files = [f for f in csv_files if f.name != "data.csv"]
                
                if len(dataset_files) > 0:
                    # Multi-dataset mode - restore all datasets
                    logger.info(f"Recovering multi-dataset session with {len(dataset_files)} datasets")
                    from smartstat_service import smart_read_csv
                    for csv_file in dataset_files:
                        dataset_name = csv_file.stem  # filename without extension
                        df = smart_read_csv(str(csv_file))
                        session.datasets[dataset_name] = df
                        logger.info(f"Restored dataset '{dataset_name}': {len(df)} rows, {len(df.columns)} columns")
                    
                    # Set the primary dataframe to the first dataset for backward compatibility
                    if session.datasets:
                        first_dataset_name = list(session.datasets.keys())[0]
                        session.dataframe = session.datasets[first_dataset_name]
                else:
                    # Single dataset mode - restore from data.csv
                    data_file = latest_project / "data.csv"
                    if data_file.exists():
                        from smartstat_service import smart_read_csv
                        df = smart_read_csv(str(data_file))
                        session.dataframe = df
                        logger.info(f"Recovered single-dataset session with {len(df)} rows from saved project")
            
            # Try to load history from metadata if available
            if data_section.metadata and 'analysis_history' in data_section.metadata:
                try:
                    session.analysis_history = data_section.metadata['analysis_history']
                except Exception as e:
                    logger.error(f"Error loading history from metadata: {e}")
        else:
            return jsonify({'error': 'Session not found'}), 404
    
    # Verify ownership
    data_section = data_section_service.get_data_section(session.data_section_id)
    if not data_section or data_section.owner != user_email:
        return jsonify({'error': 'Access denied'}), 403
    
    logger.info(f"Returning history: {len(session.analysis_history)} analyses, has_data={session.dataframe is not None}")
    
    return jsonify({
        'success': True,
        'session': session.to_dict(),
        'history': clean_for_json(session.analysis_history)
    })

Parameters

Name        Type  Default  Kind
session_id  -     -        positional_or_keyword

Parameter Details

session_id: Unique identifier (string) for the SmartStat analysis session. Used to locate the session in memory or recover it from saved project files and data sections.

Return Value

Returns a Flask JSON response:

  • Success (200): {'success': True, 'session': <session_dict>, 'history': <cleaned_analysis_history_list>}
  • Session not found and not recoverable (404): {'error': 'Session not found'}
  • Access denied (403): {'error': 'Access denied'}

The session dict contains session metadata (the output of session.to_dict()), and history is the list of previous analyses performed in the session after it has passed through clean_for_json().
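The implementation of clean_for_json() is not part of this extract. As a rough idea of what such a sanitizer typically has to do, the following sketch converts values that the standard json module rejects or mishandles. The name `clean_for_json_sketch` and every rule in it are assumptions for illustration; the real helper may behave differently (for example, it may also handle NumPy or pandas types).

```python
import json
import math
from datetime import date, datetime
from typing import Any


def clean_for_json_sketch(value: Any) -> Any:
    """Recursively convert a value into something json.dumps can serialise."""
    if isinstance(value, float):
        # NaN and +/-inf are not valid JSON, so map them to null.
        return value if math.isfinite(value) else None
    if value is None or isinstance(value, (str, int, bool)):
        return value
    if isinstance(value, (datetime, date)):
        return value.isoformat()
    if isinstance(value, dict):
        # JSON object keys must be strings.
        return {str(k): clean_for_json_sketch(v) for k, v in value.items()}
    if isinstance(value, (list, tuple, set)):
        return [clean_for_json_sketch(v) for v in value]
    # Fallback for unknown objects: use their string representation.
    return str(value)


if __name__ == "__main__":
    entry = {"query": "mean of x", "result": float("nan"), "at": datetime(2024, 1, 2)}
    # allow_nan=False makes json.dumps fail loudly if a NaN slipped through.
    print(json.dumps(clean_for_json_sketch([entry]), allow_nan=False))
```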

Dependencies

  • flask
  • logging
  • pathlib
  • pandas
  • json

Required Imports

from flask import jsonify
import logging
from pathlib import Path
from services import DataSectionService
from smartstat_service import SmartStatService, SmartStatSession, smart_read_csv
from auth.azure_auth import get_current_user
from functools import wraps

Conditional/Optional Imports

These imports are only needed under specific conditions:

from pathlib import Path

Condition: only when session recovery is needed and project directories must be scanned. Required (conditional).

from smartstat_service import smart_read_csv

Condition: only when restoring dataframes from saved CSV files during session recovery. Required (conditional).

Usage Example

# This is a Flask route handler, typically called via HTTP GET request
# Example HTTP request:
# GET /api/smartstat/abc123-session-id/history
# Headers: Authorization: Bearer <token>

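# In Flask application setup:
from functools import wraps

from flask import Flask, jsonify

app = Flask(__name__)

# Assuming all services are initialized
# smartstat_service = SmartStatService()
# data_section_service = DataSectionService()

# Placeholder only: the application's real require_auth decorator is not shown
# in this extract. This pass-through stands in for it so the snippet imports cleanly.
def require_auth(view):
    @wraps(view)
    def wrapper(*args, **kwargs):
        return view(*args, **kwargs)
    return wrapper

@app.route('/api/smartstat/<session_id>/history', methods=['GET'])
@require_auth
def smartstat_get_history(session_id):
    # Function implementation as provided in the Source Code section
    pass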

# Client-side usage example (JavaScript):
# fetch('/api/smartstat/abc123-session-id/history', {
#   method: 'GET',
#   headers: {
#     'Authorization': 'Bearer ' + token
#   }
# })
# .then(response => response.json())
# .then(data => {
#   console.log('Session:', data.session);
#   console.log('History:', data.history);
# });
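A Python client has to distinguish the same three outcomes listed under Return Value. The sketch below only interprets a status code and decoded JSON body, so it works with any HTTP library; `parse_history_response` is a hypothetical helper name, and mapping 404 and 403 to LookupError and PermissionError is a design choice made here, not something the application prescribes.

```python
from typing import Any, Dict, List


def parse_history_response(status: int, payload: Dict[str, Any]) -> List[Any]:
    """Return the history list on success, or raise an exception describing the failure."""
    if status == 200 and payload.get("success"):
        return payload.get("history", [])
    message = payload.get("error", "no error message in response")
    if status == 404:
        # Session was neither in memory nor recoverable from saved data.
        raise LookupError(message)
    if status == 403:
        # Caller does not own the data section behind this session.
        raise PermissionError(message)
    raise RuntimeError(f"Unexpected response {status}: {message}")


if __name__ == "__main__":
    ok = {"success": True, "session": {}, "history": [{"query": "t-test"}]}
    print(parse_history_response(200, ok))
```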

Best Practices

  • Always ensure the require_auth decorator is applied to protect this endpoint from unauthorized access
  • Session recovery depends on smartstat_config.GENERATED_SCRIPTS_FOLDER pointing at the directory that holds the saved project folders; if it is misconfigured, the session is recreated without any dataframes
  • Session recovery attempts to restore both single-dataset and multi-dataset modes automatically based on saved CSV files
  • The function verifies ownership of the data section before returning history to prevent unauthorized access
  • Logging is used extensively for debugging session recovery - monitor logs when sessions fail to load
  • The clean_for_json() utility must be implemented to sanitize analysis history before returning to avoid JSON serialization errors
  • Session recovery prioritizes the most recent project directory when multiple exist
  • For multi-dataset sessions, all CSV files except 'data.csv' are treated as separate datasets
  • The function maintains backward compatibility by setting session.dataframe to the first dataset in multi-dataset mode
  • Consider implementing caching or session persistence strategies to reduce the need for frequent recovery operations
  • smart_read_csv is called during recovery without a surrounding try/except, so an unreadable CSV file raises out of the endpoint; handle errors inside smart_read_csv or wrap the calls
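One way to address the unguarded CSV reads noted in the list above is to wrap each read so that a single bad file is logged and skipped instead of failing the whole request. This is a sketch under assumptions: `restore_datasets` is a hypothetical helper, and the reader is passed in as a callable so that the example does not depend on the real smart_read_csv, whose error behaviour is not documented here.

```python
import logging
from pathlib import Path
from typing import Any, Callable, Dict, Iterable

logger = logging.getLogger(__name__)


def restore_datasets(
    csv_files: Iterable[Path],
    read_csv: Callable[[str], Any],
) -> Dict[str, Any]:
    """Load each CSV with read_csv, skipping (and logging) files that fail to parse."""
    restored: Dict[str, Any] = {}
    for csv_file in csv_files:
        try:
            # Dataset name is the filename without its extension, as in the endpoint.
            restored[csv_file.stem] = read_csv(str(csv_file))
        except Exception as exc:
            logger.error("Could not restore dataset %s: %s", csv_file.name, exc)
    return restored
```

Whether to skip a broken file or abort recovery entirely is a judgment call; skipping keeps the history available even when some data cannot be restored.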

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function smartstat_download_log 75.8% similar

    Flask API endpoint that generates and downloads an execution log file containing the analysis history and debug information for a SmartStat session.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function smartstat_run_analysis 74.2% similar

    Flask API endpoint that initiates a SmartStat statistical analysis in a background thread, tracking progress and persisting results to a data section.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function smartstat_save_to_document 73.3% similar

    Flask route handler that saves SmartStat statistical analysis results back to a data section document, generating a final report with queries, results, and plots.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function smartstat_workspace 73.0% similar

    Flask route handler that opens a SmartStat statistical analysis workspace for a specific data section, managing session creation, data restoration, and access control.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function smartstat_upload_data 73.0% similar

    Flask route handler that uploads CSV or Excel data files to a SmartStat analysis session, with support for multi-sheet Excel files and session recovery.

    From: /tf/active/vicechatdev/vice_ai/new_app.py