🔍 Code Extractor

function smartstat_select_sheet

Maturity: 50

Flask API endpoint that processes one or more Excel sheets from an uploaded file, validates them, categorizes them as datasets or information sheets, and adds them to a SmartStat analysis session.

File:
/tf/active/vicechatdev/vice_ai/new_app.py
Lines:
4902 - 4994
Complexity:
complex

Purpose

This endpoint is part of a statistical analysis workflow where users upload Excel files and select specific sheets to analyze. It reads the selected sheets, validates their format, distinguishes between data sheets (for analysis) and information sheets (for context), stores them in the session, and updates metadata. It supports multi-sheet selection for comprehensive analysis across related datasets.

Source Code

def smartstat_select_sheet(session_id):
    """Select one or more Excel sheets to analyze"""
    user_email = get_current_user()
    
    session = smartstat_service.get_session(session_id)
    if not session:
        return jsonify({'error': 'Session not found'}), 404
    
    # Verify ownership
    data_section = data_section_service.get_data_section(session.data_section_id)
    if not data_section or data_section.owner != user_email:
        return jsonify({'error': 'Access denied'}), 403
    
    data = request.get_json()
    sheet_names = data.get('sheet_names')  # Now accepts array of sheet names
    file_path = data.get('file_path')
    
    if not sheet_names or not file_path:
        return jsonify({'error': 'Sheet names and file path required'}), 400
    
    # Ensure sheet_names is a list
    if isinstance(sheet_names, str):
        sheet_names = [sheet_names]
    
    try:
        from smartstat_service import read_excel_file
        from pathlib import Path
        
        datasets_info = []
        info_sheets_info = []
        all_columns = []
        total_rows = 0
        
        # Load each selected sheet
        for sheet_name in sheet_names:
            sheet_data = read_excel_file(file_path, sheet_name=sheet_name)
            df = sheet_data['dataframe']
            validation = sheet_data.get('format_validation')
            context = sheet_data.get('context')
            
            # Add to session datasets with validation and context
            result = smartstat_service.add_dataset(
                session_id, 
                sheet_name, 
                df,
                validation=validation,
                context=context
            )
            
            if result['success']:
                if result['type'] == 'information_sheet':
                    info_sheets_info.append({
                        'name': sheet_name,
                        'type': 'information_sheet',
                        'validation': validation,
                        'context_length': result.get('context_length', 0)
                    })
                else:
                    datasets_info.append(result['dataset_info'])
                    all_columns.extend(df.columns.tolist())
                    total_rows += len(df)
        
        session.updated_at = datetime.now()
        
        # Generate combined dataset info for preview
        combined_info = {
            'rows': total_rows,
            'columns': len(set(all_columns)),  # Unique columns across all sheets
            'column_names': list(set(all_columns)),
            'datasets': datasets_info,
            'info_sheets': info_sheets_info,
            'dataset_count': len(datasets_info),
            'info_sheet_count': len(info_sheets_info)
        }
        
        # Update metadata
        data_section = data_section_service.get_data_section(session.data_section_id)
        if not data_section.metadata:
            data_section.metadata = {}
        data_section.metadata['dataset_count'] = len(datasets_info)
        data_section.metadata['info_sheet_count'] = len(info_sheets_info)
        data_section.metadata['datasets'] = {ds['name']: {'rows': ds['rows'], 'columns': ds['columns']} for ds in datasets_info}
        data_section.metadata['info_sheets'] = [sheet['name'] for sheet in info_sheets_info]
        data_section_service.update_data_section(data_section)
        
        return jsonify({
            'success': True,
            'dataset_info': combined_info
        })
            
    except Exception as e:
        logger.error(f"Error selecting Excel sheets: {e}")
        return jsonify({'error': str(e)}), 500

Parameters

Name Type Default Kind
session_id - - positional_or_keyword

Parameter Details

session_id: String identifier for the SmartStat session. Used to retrieve the existing session and associate the selected sheets with it. Must correspond to a valid, existing session owned by the authenticated user.

Return Value

Returns a JSON response with HTTP status code. On success (200): {'success': True, 'dataset_info': {...}} containing combined information about all loaded sheets including total rows, unique columns, dataset count, info sheet count, and details for each sheet. On error: {'error': 'message'} with status 404 (session not found), 403 (access denied), 400 (missing parameters), or 500 (processing error).

Dependencies

  • flask
  • datetime
  • pathlib
  • logging
  • pandas
  • smartstat_service
  • data_section_service
  • models

Required Imports

from flask import request, jsonify
from datetime import datetime
from pathlib import Path
import logging

Conditional/Optional Imports

These imports are only needed under specific conditions:

from smartstat_service import read_excel_file

Condition: imported inside the function when processing Excel files

Required (conditional)
from pathlib import Path

Condition: imported inside the function for file path handling

Required (conditional)

Usage Example

# Client-side usage (JavaScript fetch example)
const response = await fetch('/api/smartstat/abc123/select-sheet', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer <token>'
  },
  body: JSON.stringify({
    sheet_names: ['Sales Data', 'Customer Info'],
    file_path: '/uploads/data_file.xlsx'
  })
});

const result = await response.json();
if (result.success) {
  console.log('Loaded datasets:', result.dataset_info.dataset_count);
  console.log('Info sheets:', result.dataset_info.info_sheet_count);
  console.log('Total rows:', result.dataset_info.rows);
  console.log('Unique columns:', result.dataset_info.columns);
}

Best Practices

  • Always verify session ownership before allowing sheet selection to prevent unauthorized access
  • Handle both single sheet name (string) and multiple sheet names (array) for backward compatibility
  • Distinguish between data sheets (for analysis) and information sheets (for context) using validation results
  • Track metadata at both session and data section levels for comprehensive dataset management
  • Use try-except blocks to catch and log errors during Excel file reading and processing
  • Update session timestamps to track activity and enable session expiration logic
  • Aggregate statistics (rows, columns) across multiple sheets for unified dataset preview
  • Store validation results and context information with each dataset for downstream analysis
  • Return detailed information about both successful and failed sheet loads
  • Ensure file_path is validated and sanitized before reading to prevent path traversal attacks

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function smartstat_upload_data 78.6% similar

    Flask route handler that uploads CSV or Excel data files to a SmartStat analysis session, with support for multi-sheet Excel files and session recovery.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function smartstat_upload_files 75.6% similar

    Flask API endpoint that handles multi-file uploads (CSV, Excel, PDF, Word, PowerPoint) to a SmartStat session, processing data files as datasets and documents as information sheets.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function upload_analysis_dataset 66.5% similar

    Flask API endpoint that handles file upload for data analysis sessions, accepting CSV and Excel files, validating user access, and processing the dataset through a data analysis service.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function smartstat_save_selective 66.1% similar

    Flask route handler that saves selected statistical analysis rounds and their associated plots from a SmartStat session to a data section, with options to replace or append to existing content.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function smartstat_workspace 65.7% similar

    Flask route handler that opens a SmartStat statistical analysis workspace for a specific data section, managing session creation, data restoration, and access control.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
← Back to Browse