🔍 Code Extractor

function upload_data_section_dataset

Maturity: 54

Flask API endpoint that handles CSV file uploads for data section analysis, processes the file, extracts metadata, and stores it in the data section for persistence.

File:
/tf/active/vicechatdev/vice_ai/new_app.py
Lines:
4464 - 4527
Complexity:
moderate

Purpose

This endpoint allows authenticated users to upload CSV datasets to their data sections for analysis. It validates file ownership, checks for an active analysis session, processes the uploaded CSV file to extract metadata (rows, columns, preview), stores this information in the data section, and integrates with the data analysis service for further processing.
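The metadata-extraction step can be sketched in isolation. Here `pandas.read_csv` stands in for `smart_read_csv` (whose encoding detection lives inside `smartstat_service`), and the inline CSV text stands in for the saved temp file:

```python
import pandas as pd
from io import StringIO

# Stand-in input; the endpoint reads the saved temp file instead.
csv_text = "name,score\nalice,1\nbob,2\n"
df = pd.read_csv(StringIO(csv_text))  # smart_read_csv adds encoding detection

# Same shape as the csv_info dict persisted on the data section.
csv_info = {
    'rows': len(df),
    'columns': len(df.columns),
    'column_names': list(df.columns),
    'preview': df.head(20).to_dict('records'),
}
```

Capping the stored preview at `head(20)` keeps the persisted `csv_data` small regardless of upload size.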

Source Code

def upload_data_section_dataset(section_id):
    """Upload dataset for data section analysis"""
    if not DATA_ANALYSIS_AVAILABLE:
        return jsonify({'error': 'Data analysis service not available'}), 503
    
    user_email = get_current_user()
    
    # Verify ownership
    data_section = data_section_service.get_data_section(section_id)
    if not data_section or data_section.owner != user_email:
        return jsonify({'error': 'Data section not found or access denied'}), 404
    
    if not data_section.analysis_session_id:
        return jsonify({'error': 'No analysis session found. Create session first.'}), 400
    
    if 'file' not in request.files:
        return jsonify({'error': 'No file uploaded'}), 400
    
    file = request.files['file']
    if file.filename == '':
        return jsonify({'error': 'No file selected'}), 400
    
    if not file.filename.lower().endswith('.csv'):
        return jsonify({'error': 'Only CSV files are supported'}), 400
    
    try:
        import tempfile
        from werkzeug.utils import secure_filename
        
        filename = secure_filename(file.filename)
        with tempfile.NamedTemporaryFile(delete=False, suffix='.csv') as tmp_file:
            file.save(tmp_file.name)
            
            # Process the dataset
            result = data_analysis_service.upload_dataset(
                session_id=data_section.analysis_session_id,
                file_path=tmp_file.name,
                original_filename=filename
            )
            
            # Save CSV data to data section for persistence
            import pandas as pd
            from smartstat_service import smart_read_csv
            df = smart_read_csv(tmp_file.name)
            csv_info = {
                'rows': len(df),
                'columns': len(df.columns),
                'column_names': list(df.columns),
                'preview': df.head(20).to_dict('records')
            }
            
            # Update the data section with CSV info
            data_section = data_section_service.get_data_section(section_id)
            data_section.csv_data = json.dumps(csv_info)
            data_section_service.update_data_section(data_section)
            
            # Clean up temp file
            os.unlink(tmp_file.name)
            
            return jsonify(result)
            
    except Exception as e:
        logger.error(f"Error uploading dataset: {e}")
        return jsonify({'error': str(e)}), 500
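One caveat worth noting: `os.unlink` runs only on the success path, so an exception raised by `upload_dataset` or `smart_read_csv` leaves the temp file behind. A `try`/`finally` variant (a sketch, not a quote of the cited lines; `save_fn` and `process_fn` are hypothetical stand-ins for the save and processing calls) guarantees cleanup:

```python
import os
import tempfile

def process_upload(save_fn, process_fn):
    # Create the temp file first, then guarantee deletion even if
    # processing raises, so failed uploads do not leak disk space.
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix='.csv')
    tmp.close()
    try:
        save_fn(tmp.name)
        return process_fn(tmp.name)
    finally:
        os.unlink(tmp.name)
```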

Parameters

Name        Type  Default  Kind
section_id  str   -        positional_or_keyword

Parameter Details

section_id: String identifier for the data section where the dataset will be uploaded. Must correspond to an existing data section owned by the authenticated user.

Return Value

Returns a JSON response. On success (200), returns the result from data_analysis_service.upload_dataset, containing the upload confirmation and metadata. On error, returns JSON with an 'error' key and the appropriate HTTP status code: 503 if the data analysis service is unavailable; 404 if the section is not found or access is denied; 400 if no analysis session exists or file validation fails; 500 for processing errors.

Dependencies

  • flask
  • werkzeug
  • pandas
  • tempfile
  • os
  • json
  • logging

Required Imports

from flask import request, jsonify
from werkzeug.utils import secure_filename
import tempfile
import os
import json
import pandas as pd
from smartstat_service import smart_read_csv
from services import DataSectionService
from data_analysis_service import DataAnalysisService
from auth.azure_auth import require_auth, get_current_user

Conditional/Optional Imports

These imports are only needed under specific conditions:

  • import tempfile
    Condition: imported inside the try block for temporary file handling during CSV upload. Required (conditional).
  • from werkzeug.utils import secure_filename
    Condition: imported inside the try block for sanitizing uploaded filenames. Required (conditional).
  • import pandas as pd
    Condition: imported inside the try block for reading and processing CSV data. Required (conditional).
  • from smartstat_service import smart_read_csv
    Condition: imported inside the try block for intelligent CSV reading with encoding detection. Required (conditional).
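The pattern behind DATA_ANALYSIS_AVAILABLE and these deferred imports can be sketched generically: probe the optional dependency once and gate the feature on a flag instead of letting an ImportError crash the request handler. The module names below are illustrative, not taken from the endpoint:

```python
def optional_import(module_name):
    # Probe an optional dependency; callers branch on the boolean
    # rather than handling ImportError at request time.
    try:
        __import__(module_name)
        return True
    except ImportError:
        return False

JSON_AVAILABLE = optional_import('json')             # stdlib: always present
ANALYSIS_AVAILABLE = optional_import('no_such_pkg')  # missing: flag stays False
```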

Usage Example

// Client-side usage example (JavaScript fetch)
const formData = new FormData();
formData.append('file', csvFile); // csvFile is a File object

fetch('/api/data-sections/section-123/analysis/upload', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer ' + authToken
  },
  body: formData
})
.then(response => response.json())
.then(data => {
  console.log('Upload successful:', data);
  // data contains upload result with dataset info
})
.catch(error => {
  console.error('Upload failed:', error);
});

# Server-side context (Flask app setup)
from flask import Flask, request
from services import DataSectionService
from data_analysis_service import DataAnalysisService
from auth.azure_auth import require_auth

app = Flask(__name__)
data_section_service = DataSectionService()
data_analysis_service = DataAnalysisService()
DATA_ANALYSIS_AVAILABLE = True

# The endpoint is then registered with decorators:
# @app.route('/api/data-sections/<section_id>/analysis/upload', methods=['POST'])
# @require_auth
# def upload_data_section_dataset(section_id): ...

Best Practices

  • Always verify user ownership of the data section before allowing file uploads to prevent unauthorized access
  • Ensure an analysis session exists before uploading datasets to maintain proper workflow
  • Use secure_filename() to sanitize uploaded filenames and prevent directory traversal attacks
  • Store files in temporary locations and clean up (os.unlink) after processing to avoid disk space issues
  • Limit file uploads to CSV format only for security and consistency
  • Extract and store dataset metadata (rows, columns, preview) for quick access without re-reading files
  • Wrap file processing in try-except blocks to handle encoding issues, malformed CSV, and other errors gracefully
  • Use smart_read_csv for robust CSV reading with automatic encoding detection
  • Return appropriate HTTP status codes (503, 404, 400, 500) to help clients handle different error scenarios
  • Log errors with sufficient detail for debugging while avoiding sensitive data exposure
  • Consider implementing file size limits to prevent resource exhaustion
  • Store only a preview (first 20 rows) rather than entire dataset to optimize storage
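Several of these checks, plus the suggested size limit, can be collected into one helper. This is a sketch: MAX_UPLOAD_BYTES and the 413 response are assumptions, not values from the endpoint, which currently checks only filename presence and the .csv extension:

```python
MAX_UPLOAD_BYTES = 10 * 1024 * 1024  # hypothetical 10 MB cap

def validate_upload(filename, size_bytes):
    # Mirrors the endpoint's filename checks and adds a size cap.
    if not filename:
        return 'No file selected', 400
    if not filename.lower().endswith('.csv'):
        return 'Only CSV files are supported', 400
    if size_bytes > MAX_UPLOAD_BYTES:
        return 'File too large', 413
    return None, 200
```

A usage note: running these checks before touching the filesystem means rejected uploads never create a temp file at all.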

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function upload_analysis_dataset 86.3% similar

    Flask API endpoint that handles file upload for data analysis sessions, accepting CSV and Excel files, validating user access, and processing the dataset through a data analysis service.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function upload_data 77.8% similar

    Flask route handler that accepts file uploads via POST request, validates the file, saves it with a timestamp, and loads the data into an analysis session.

    From: /tf/active/vicechatdev/full_smartstat/app.py
  • function smartstat_upload_data 72.8% similar

    Flask route handler that uploads CSV or Excel data files to a SmartStat analysis session, with support for multi-sheet Excel files and session recovery.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function save_data_section_analysis 71.9% similar

    Flask API endpoint that saves analysis results (plots, conclusions, and analysis data) from a data analysis session to a data section in the database.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function smartstat_upload_files 71.2% similar

    Flask API endpoint that handles multi-file uploads (CSV, Excel, PDF, Word, PowerPoint) to a SmartStat session, processing data files as datasets and documents as information sheets.

    From: /tf/active/vicechatdev/vice_ai/new_app.py