function smartstat_select_sheet
Flask API endpoint that processes one or more Excel sheets from an uploaded file, validates them, categorizes them as datasets or information sheets, and adds them to a SmartStat analysis session.
/tf/active/vicechatdev/vice_ai/new_app.py
4902 - 4994
complex
Purpose
This endpoint is part of a statistical analysis workflow where users upload Excel files and select specific sheets to analyze. It reads the selected sheets, validates their format, distinguishes between data sheets (for analysis) and information sheets (for context), stores them in the session, and updates metadata. It supports multi-sheet selection for comprehensive analysis across related datasets.
Source Code
def smartstat_select_sheet(session_id):
"""Select one or more Excel sheets to analyze"""
user_email = get_current_user()
session = smartstat_service.get_session(session_id)
if not session:
return jsonify({'error': 'Session not found'}), 404
# Verify ownership
data_section = data_section_service.get_data_section(session.data_section_id)
if not data_section or data_section.owner != user_email:
return jsonify({'error': 'Access denied'}), 403
data = request.get_json()
sheet_names = data.get('sheet_names') # Now accepts array of sheet names
file_path = data.get('file_path')
if not sheet_names or not file_path:
return jsonify({'error': 'Sheet names and file path required'}), 400
# Ensure sheet_names is a list
if isinstance(sheet_names, str):
sheet_names = [sheet_names]
try:
from smartstat_service import read_excel_file
from pathlib import Path
datasets_info = []
info_sheets_info = []
all_columns = []
total_rows = 0
# Load each selected sheet
for sheet_name in sheet_names:
sheet_data = read_excel_file(file_path, sheet_name=sheet_name)
df = sheet_data['dataframe']
validation = sheet_data.get('format_validation')
context = sheet_data.get('context')
# Add to session datasets with validation and context
result = smartstat_service.add_dataset(
session_id,
sheet_name,
df,
validation=validation,
context=context
)
if result['success']:
if result['type'] == 'information_sheet':
info_sheets_info.append({
'name': sheet_name,
'type': 'information_sheet',
'validation': validation,
'context_length': result.get('context_length', 0)
})
else:
datasets_info.append(result['dataset_info'])
all_columns.extend(df.columns.tolist())
total_rows += len(df)
session.updated_at = datetime.now()
# Generate combined dataset info for preview
combined_info = {
'rows': total_rows,
'columns': len(set(all_columns)), # Unique columns across all sheets
'column_names': list(set(all_columns)),
'datasets': datasets_info,
'info_sheets': info_sheets_info,
'dataset_count': len(datasets_info),
'info_sheet_count': len(info_sheets_info)
}
# Update metadata
data_section = data_section_service.get_data_section(session.data_section_id)
if not data_section.metadata:
data_section.metadata = {}
data_section.metadata['dataset_count'] = len(datasets_info)
data_section.metadata['info_sheet_count'] = len(info_sheets_info)
data_section.metadata['datasets'] = {ds['name']: {'rows': ds['rows'], 'columns': ds['columns']} for ds in datasets_info}
data_section.metadata['info_sheets'] = [sheet['name'] for sheet in info_sheets_info]
data_section_service.update_data_section(data_section)
return jsonify({
'success': True,
'dataset_info': combined_info
})
except Exception as e:
logger.error(f"Error selecting Excel sheets: {e}")
return jsonify({'error': str(e)}), 500
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
session_id |
- | - | positional_or_keyword |
Parameter Details
session_id: String identifier for the SmartStat session. Used to retrieve the existing session and associate the selected sheets with it. Must correspond to a valid, existing session owned by the authenticated user.
Return Value
Returns a JSON response with HTTP status code. On success (200): {'success': True, 'dataset_info': {...}} containing combined information about all loaded sheets including total rows, unique columns, dataset count, info sheet count, and details for each sheet. On error: {'error': 'message'} with status 404 (session not found), 403 (access denied), 400 (missing parameters), or 500 (processing error).
Dependencies
flaskdatetimepathlibloggingpandassmartstat_servicedata_section_servicemodels
Required Imports
from flask import request, jsonify
from datetime import datetime
from pathlib import Path
import logging
Conditional/Optional Imports
These imports are only needed under specific conditions:
from smartstat_service import read_excel_file
Condition: imported inside the function when processing Excel files
Required (conditional)from pathlib import Path
Condition: imported inside the function for file path handling
Required (conditional)Usage Example
# Client-side usage (JavaScript fetch example)
const response = await fetch('/api/smartstat/abc123/select-sheet', {
method: 'POST',
headers: {
'Content-Type': 'application/json',
'Authorization': 'Bearer <token>'
},
body: JSON.stringify({
sheet_names: ['Sales Data', 'Customer Info'],
file_path: '/uploads/data_file.xlsx'
})
});
const result = await response.json();
if (result.success) {
console.log('Loaded datasets:', result.dataset_info.dataset_count);
console.log('Info sheets:', result.dataset_info.info_sheet_count);
console.log('Total rows:', result.dataset_info.rows);
console.log('Unique columns:', result.dataset_info.columns);
}
Best Practices
- Always verify session ownership before allowing sheet selection to prevent unauthorized access
- Handle both single sheet name (string) and multiple sheet names (array) for backward compatibility
- Distinguish between data sheets (for analysis) and information sheets (for context) using validation results
- Track metadata at both session and data section levels for comprehensive dataset management
- Use try-except blocks to catch and log errors during Excel file reading and processing
- Update session timestamps to track activity and enable session expiration logic
- Aggregate statistics (rows, columns) across multiple sheets for unified dataset preview
- Store validation results and context information with each dataset for downstream analysis
- Return detailed information about both successful and failed sheet loads
- Ensure file_path is validated and sanitized before reading to prevent path traversal attacks
Tags
Similar Components
AI-powered semantic similarity - components with related functionality:
-
function smartstat_upload_data 78.6% similar
-
function smartstat_upload_files 75.6% similar
-
function upload_analysis_dataset 66.5% similar
-
function smartstat_save_selective 66.1% similar
-
function smartstat_workspace 65.7% similar