
class Config_v1

Maturity: 52

Configuration class that centralizes all application settings including Flask configuration, directory paths, API keys, LLM model settings, and statistical analysis parameters.

File: /tf/active/vicechatdev/full_smartstat/config.py
Lines: 9 - 92
Complexity: moderate

Purpose

This class is the central configuration point for a statistical analysis application (SmartStat). It defines Flask settings, the directory layout for uploads, generated scripts, reports, sessions, sandboxed execution, and agent output, credentials and endpoints for several LLM providers (OpenAI, Anthropic, Google Gemini, Azure OpenAI), statistical analysis limits, script execution limits, SQL Server connection settings, allowed upload file types, and cleanup policy. Instantiating the class creates the required directories. Sensitive values (SECRET_KEY, DATABASE_URL, the API keys, and the Azure endpoint) are read from environment variables, with hardcoded fallbacks used when a variable is unset or empty.
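The override behaviour described above comes from the `os.environ.get(NAME) or default` idiom. A minimal sketch of how that idiom behaves follows; the helper `env_or_default` and the variable name `DEMO_SECRET_KEY` are hypothetical and exist only for illustration.

```python
import os


def env_or_default(name: str, default: str) -> str:
    """Mirror the `os.environ.get(name) or default` idiom (hypothetical helper)."""
    return os.environ.get(name) or default


# Case 1: variable is not set, so the fallback is used.
os.environ.pop("DEMO_SECRET_KEY", None)
unset_value = env_or_default("DEMO_SECRET_KEY", "dev-key")

# Case 2: variable is set but empty; `or` treats "" as falsy, so the fallback still wins.
os.environ["DEMO_SECRET_KEY"] = ""
empty_value = env_or_default("DEMO_SECRET_KEY", "dev-key")

# Case 3: variable has a real value, which overrides the fallback.
os.environ["DEMO_SECRET_KEY"] = "from-env"
set_value = env_or_default("DEMO_SECRET_KEY", "dev-key")

print(unset_value, empty_value, set_value)
```

One consequence worth keeping in mind: an empty environment variable cannot be used to blank out a setting, because the fallback is applied whenever the value is falsy.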

Source Code

class Config:
    """Base configuration class"""
    
    # Flask settings
    SECRET_KEY = os.environ.get('SECRET_KEY') or 'smartstat-dev-key-change-in-production'
    MAX_CONTENT_LENGTH = 100 * 1024 * 1024  # 100MB max file size
    
    # Application directories
    BASE_DIR = Path(__file__).parent
    UPLOAD_FOLDER = BASE_DIR / 'uploads'
    GENERATED_SCRIPTS_FOLDER = BASE_DIR / 'generated_scripts'
    REPORTS_FOLDER = BASE_DIR / 'reports'
    SESSIONS_FOLDER = BASE_DIR / 'sessions'
    SANDBOX_FOLDER = BASE_DIR / 'sandbox'
    OUTPUT_DIR = BASE_DIR / 'output'  # For agent-generated files
    
    # Database
    DATABASE_URL = os.environ.get('DATABASE_URL') or f'sqlite:///{BASE_DIR}/smartstat.db'
    
    def __init__(self):
        # Ensure directories exist
        for directory in [self.UPLOAD_FOLDER, self.GENERATED_SCRIPTS_FOLDER, 
                         self.REPORTS_FOLDER, self.SESSIONS_FOLDER, self.SANDBOX_FOLDER,
                         self.OUTPUT_DIR]:
            directory.mkdir(parents=True, exist_ok=True)
    
    # LLM API Configuration (matching your existing setup)
    OPENAI_API_KEY = os.environ.get('OPENAI_API_KEY') or '<redacted>'  # hardcoded fallback key redacted
    GEMINI_API_KEY = os.environ.get('GEMINI_API_KEY') or '<redacted>'  # hardcoded fallback key redacted
    AZURE_OPENAI_ENDPOINT = os.environ.get('AZURE_OPENAI_ENDPOINT') or 'https://vice-llm-2.openai.azure.com/'
    AZURE_OPENAI_API_KEY = os.environ.get('AZURE_OPENAI_API_KEY') or '<redacted>'  # hardcoded fallback key redacted
    
    # LLM Model Configuration
    AVAILABLE_MODELS = {
        'gpt-4o': {
            'name': 'GPT-4o (OpenAI)',
            'provider': 'openai',
            'description': 'OpenAI GPT-4o - Latest model with excellent reasoning and analysis'
        },
        'claude-sonnet-4-5-20250929': {
            'name': 'Claude Sonnet 4.5 (Anthropic)',
            'provider': 'anthropic',
            'description': 'Anthropic Claude Sonnet 4.5 - Latest model with superior analytical and coding capabilities'
        },
        'gemini-2.0-flash-exp': {
            'name': 'Gemini 2.0 Flash (Google)',
            'provider': 'gemini',
            'description': 'Google Gemini 2.0 Flash - Experimental model with enhanced reasoning'
        }
    }
    
    DEFAULT_MODEL = 'gpt-4o'
    
    # Anthropic API Configuration
    ANTHROPIC_API_KEY = os.environ.get('ANTHROPIC_API_KEY') or '<redacted>'  # hardcoded fallback key redacted
    
    # Statistical Analysis Settings
    DEFAULT_SIGNIFICANCE_LEVEL = 0.05
    MAX_DATASET_ROWS = 100000  # Medium size datasets
    MAX_COLUMNS = 500
    DEFAULT_SAMPLE_SIZE = 1000  # For large dataset sampling
    
    # Python execution settings
    SCRIPT_TIMEOUT = 300  # 5 minutes max execution time
    ALLOWED_IMPORTS = [
        'pandas', 'numpy', 'scipy', 'matplotlib', 'seaborn', 'statsmodels',
        'sklearn', 'math', 'statistics', 'datetime', 'json', 'warnings'
    ]
    
    # SQL Server settings
    MSSQL_DRIVER = '{ODBC Driver 18 for SQL Server}'
    MSSQL_TIMEOUT = 60
    
    # File format settings
    ALLOWED_EXTENSIONS = {'csv', 'xlsx', 'xls', 'txt', 'tsv'}
    
    # Report settings
    REPORT_TEMPLATE_DIR = BASE_DIR / 'templates' / 'reports'
    
    # Cleanup settings
    AUTO_CLEANUP_ENABLED = True  # Enable automatic cleanup
    KEEP_RECENT_ANALYSES = 8     # Number of recent analyses to keep per session
    CLEANUP_OLD_SESSIONS_DAYS = 15  # Days after which to clean up entire sessions
    MAX_ANALYSES_PER_SESSION = 10  # Maximum analyses before forced cleanup

Parameters

Name | Type | Default | Kind
bases | - | - |

Parameter Details

__init__: The constructor takes no parameters. It automatically creates all required directories (UPLOAD_FOLDER, GENERATED_SCRIPTS_FOLDER, REPORTS_FOLDER, SESSIONS_FOLDER, SANDBOX_FOLDER, OUTPUT_DIR) if they don't exist, using mkdir with parents=True and exist_ok=True to ensure safe directory creation.
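The directory creation described above can be sketched in isolation. This is an illustration under assumptions, not the application's code: it uses a temporary directory in place of BASE_DIR and made-up folder names, and it runs the loop twice to show that `exist_ok=True` makes repeated calls harmless.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base_dir = Path(tmp)  # stand-in for Config.BASE_DIR
    folders = [
        base_dir / "uploads",
        base_dir / "reports" / "nested",  # needs parents=True to create "reports" first
    ]

    # Running the loop twice mimics instantiating the config class twice.
    for _ in range(2):
        for folder in folders:
            # parents=True creates missing intermediate directories;
            # exist_ok=True suppresses FileExistsError on the second pass.
            folder.mkdir(parents=True, exist_ok=True)

    all_created = all(folder.is_dir() for folder in folders)

print(all_created)
```

Without `exist_ok=True`, the second pass would raise `FileExistsError`; without `parents=True`, the nested folder would raise `FileNotFoundError` on the first pass.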

Return Value

Instantiation returns a Config object. The __init__ method itself returns None; its only effect is to create the required directories. Every setting is a class attribute, so it can be read from the class (Config.SCRIPT_TIMEOUT) without instantiation, or from any instance (config.SCRIPT_TIMEOUT).
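Because the settings live on the class rather than on each instance, it helps to see how reads and writes behave. The sketch below uses a hypothetical two-attribute `DemoConfig` in place of the real class.

```python
class DemoConfig:
    """Hypothetical stand-in with two of the documented settings."""
    DEFAULT_MODEL = "gpt-4o"
    SCRIPT_TIMEOUT = 300


# Class attributes are readable without creating an instance.
class_level = DemoConfig.SCRIPT_TIMEOUT

first = DemoConfig()
second = DemoConfig()

# Assigning through an instance creates an instance attribute that shadows
# the class attribute on that one object only.
first.SCRIPT_TIMEOUT = 60

# Assigning through the class changes what every non-shadowing instance sees.
DemoConfig.DEFAULT_MODEL = "other-model"

print(class_level, first.SCRIPT_TIMEOUT, second.SCRIPT_TIMEOUT, second.DEFAULT_MODEL)
```

In practice this means that changing a setting on one instance does not propagate to other instances, while changing it on the class does.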

Class Interface

Methods

__init__(self) -> None

Purpose: Initializes the Config instance and creates all required directories if they don't exist

Returns: None - side effect is creating directories on the filesystem

Attributes

Name | Type | Description | Scope
SECRET_KEY | str | Flask secret key for session management and CSRF protection, sourced from environment or default | class
MAX_CONTENT_LENGTH | int | Maximum file upload size in bytes (100MB) | class
BASE_DIR | Path | Base directory path of the application, calculated from Config file location | class
UPLOAD_FOLDER | Path | Directory path for storing uploaded files | class
GENERATED_SCRIPTS_FOLDER | Path | Directory path for storing LLM-generated Python scripts | class
REPORTS_FOLDER | Path | Directory path for storing generated analysis reports | class
SESSIONS_FOLDER | Path | Directory path for storing user session data | class
SANDBOX_FOLDER | Path | Directory path for sandboxed script execution environment | class
OUTPUT_DIR | Path | Directory path for agent-generated output files | class
DATABASE_URL | str | Database connection URL, defaults to SQLite database in BASE_DIR | class
OPENAI_API_KEY | str | API key for OpenAI services, sourced from environment or default | class
GEMINI_API_KEY | str | API key for Google Gemini services, sourced from environment or default | class
AZURE_OPENAI_ENDPOINT | str | Azure OpenAI service endpoint URL, sourced from environment or default | class
AZURE_OPENAI_API_KEY | str | API key for Azure OpenAI services, sourced from environment or default | class
AVAILABLE_MODELS | dict | Dictionary mapping model IDs to their metadata (name, provider, description) for GPT-4o, Claude Sonnet 4.5, and Gemini 2.0 Flash | class
DEFAULT_MODEL | str | Default LLM model identifier, set to 'gpt-4o' | class
ANTHROPIC_API_KEY | str | API key for Anthropic Claude services, sourced from environment or default | class
DEFAULT_SIGNIFICANCE_LEVEL | float | Default statistical significance level (alpha) for hypothesis tests, set to 0.05 | class
MAX_DATASET_ROWS | int | Maximum number of rows allowed in datasets (100,000) | class
MAX_COLUMNS | int | Maximum number of columns allowed in datasets (500) | class
DEFAULT_SAMPLE_SIZE | int | Default sample size for large dataset sampling (1,000 rows) | class
SCRIPT_TIMEOUT | int | Maximum execution time for Python scripts in seconds (300 = 5 minutes) | class
ALLOWED_IMPORTS | list | List of Python module names allowed for import in generated scripts (pandas, numpy, scipy, matplotlib, seaborn, statsmodels, sklearn, math, statistics, datetime, json, warnings) | class
MSSQL_DRIVER | str | ODBC driver string for SQL Server connections | class
MSSQL_TIMEOUT | int | Timeout in seconds for SQL Server queries (60 seconds) | class
ALLOWED_EXTENSIONS | set | Set of allowed file extensions for uploads (csv, xlsx, xls, txt, tsv) | class
REPORT_TEMPLATE_DIR | Path | Directory path for report templates | class
AUTO_CLEANUP_ENABLED | bool | Flag to enable/disable automatic cleanup of old files (True) | class
KEEP_RECENT_ANALYSES | int | Number of recent analyses to keep per session during cleanup (8) | class
CLEANUP_OLD_SESSIONS_DAYS | int | Number of days after which to clean up entire sessions (15) | class
MAX_ANALYSES_PER_SESSION | int | Maximum number of analyses per session before forced cleanup (10) | class
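The AVAILABLE_MODELS and DEFAULT_MODEL attributes lend themselves to a simple provider lookup. The sketch below is one possible way to use them, not code from the application: the function `resolve_provider` is hypothetical, and the dictionary is a trimmed copy of the documented one (descriptions omitted).

```python
# Trimmed copy of the documented AVAILABLE_MODELS mapping.
AVAILABLE_MODELS = {
    "gpt-4o": {"name": "GPT-4o (OpenAI)", "provider": "openai"},
    "claude-sonnet-4-5-20250929": {
        "name": "Claude Sonnet 4.5 (Anthropic)",
        "provider": "anthropic",
    },
    "gemini-2.0-flash-exp": {"name": "Gemini 2.0 Flash (Google)", "provider": "gemini"},
}
DEFAULT_MODEL = "gpt-4o"


def resolve_provider(model_id: str) -> str:
    """Return the provider for model_id, falling back to DEFAULT_MODEL (hypothetical helper)."""
    metadata = AVAILABLE_MODELS.get(model_id) or AVAILABLE_MODELS[DEFAULT_MODEL]
    return metadata["provider"]


print(resolve_provider("claude-sonnet-4-5-20250929"))
print(resolve_provider("unknown-model"))
```

Falling back silently to the default model is a design choice made for this sketch; an application might prefer to reject unknown model IDs instead.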

Dependencies

  • os
  • pathlib

Required Imports

import os
from pathlib import Path

Usage Example

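# Assumes the file above is importable as `config` and that Flask is installed
from flask import Flask
from config import Config

app = Flask(__name__)

# Instantiate the configuration (this also creates the required directories)
config = Config()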

# Access Flask settings
app.config['SECRET_KEY'] = config.SECRET_KEY
app.config['MAX_CONTENT_LENGTH'] = config.MAX_CONTENT_LENGTH

# Access directory paths
upload_path = config.UPLOAD_FOLDER
reports_path = config.REPORTS_FOLDER

# Access LLM configuration
api_key = config.OPENAI_API_KEY
default_model = config.DEFAULT_MODEL
available_models = config.AVAILABLE_MODELS

# Access statistical settings
significance = config.DEFAULT_SIGNIFICANCE_LEVEL
max_rows = config.MAX_DATASET_ROWS

# Access execution settings
timeout = config.SCRIPT_TIMEOUT
allowed_imports = config.ALLOWED_IMPORTS

# Check file extensions (file_ext is a placeholder for an uploaded file's extension)
file_ext = 'csv'
if file_ext in config.ALLOWED_EXTENSIONS:
    # Process file
    pass
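The extension check at the end of the example assumes the extension has already been extracted from the filename. A minimal sketch of that step is shown here; the helper `has_allowed_extension` is hypothetical, and the set is copied from the documented ALLOWED_EXTENSIONS.

```python
from pathlib import Path

# Copied from the documented ALLOWED_EXTENSIONS setting.
ALLOWED_EXTENSIONS = {"csv", "xlsx", "xls", "txt", "tsv"}


def has_allowed_extension(filename: str) -> bool:
    """Check the final extension of filename, case-insensitively (hypothetical helper)."""
    suffix = Path(filename).suffix  # e.g. ".csv", or "" when there is no extension
    return suffix.lstrip(".").lower() in ALLOWED_EXTENSIONS


csv_ok = has_allowed_extension("survey_results.CSV")
archive_ok = has_allowed_extension("data.tar.gz")
bare_ok = has_allowed_extension("README")

print(csv_ok, archive_ok, bare_ok)
```

An extension check only filters by name; it says nothing about the file's actual contents, so it complements rather than replaces content validation.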

Best Practices

  • Instantiate Config once at application startup to ensure all directories are created before use
  • Override sensitive values (API keys, SECRET_KEY) using environment variables in production
  • Do not commit hardcoded API keys to version control; supply them through environment variables, and rotate any key that has already been committed
  • All settings are class attributes, so every instance shares the same values; the environment variables are read once, when the class body executes at import time
  • Directory paths are pathlib.Path objects; use Path operations on them or convert them to strings with str()
  • Check ALLOWED_EXTENSIONS before processing uploaded files for security
  • Respect MAX_CONTENT_LENGTH, SCRIPT_TIMEOUT, and MAX_DATASET_ROWS limits to prevent resource exhaustion
  • Use ALLOWED_IMPORTS list to validate Python imports before executing user-generated scripts
  • The AUTO_CLEANUP_ENABLED and related settings control automatic cleanup of old analyses and sessions
  • AVAILABLE_MODELS dictionary provides metadata for UI display and provider routing
  • BASE_DIR is the directory containing the config module, so all data directories are created next to the code; make sure that location is writable in the deployed layout
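The advice above to validate imports against ALLOWED_IMPORTS can be sketched with the standard `ast` module. This is an assumption about how such a check might look, not the application's implementation: `find_disallowed_imports` is a hypothetical helper, and it inspects only static `import` and `from ... import` statements, so dynamic imports (for example via `__import__` or `importlib`) are not detected.

```python
import ast
from typing import List

# Copied from the documented ALLOWED_IMPORTS setting.
ALLOWED_IMPORTS = [
    "pandas", "numpy", "scipy", "matplotlib", "seaborn", "statsmodels",
    "sklearn", "math", "statistics", "datetime", "json", "warnings",
]


def find_disallowed_imports(source: str) -> List[str]:
    """Return sorted module names imported by source that are not allowed (hypothetical helper)."""
    allowed = set(ALLOWED_IMPORTS)
    disallowed = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as `from . import x`.
            names = [node.module or "."]
        else:
            continue
        for name in names:
            # Compare on the top-level package, so `scipy.stats` matches `scipy`.
            if name.split(".")[0] not in allowed:
                disallowed.add(name)
    return sorted(disallowed)


clean = find_disallowed_imports("import pandas as pd\nfrom scipy import stats\n")
flagged = find_disallowed_imports("import os\nimport numpy as np\nfrom subprocess import run\n")

print(clean, flagged)
```

A static check like this is best treated as a first filter in front of the sandbox and SCRIPT_TIMEOUT limits, not as a substitute for them.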

Similar Components

AI-powered semantic similarity - components with related functionality:

  • class Config 89.8% similar

    Configuration class that manages application-wide settings, directory structures, API keys, and operational parameters for a statistical analysis application.

    From: /tf/active/vicechatdev/vice_ai/smartstat_config.py
  • class SmartStatConfig 71.4% similar

    Configuration class for SmartStat service that manages directory paths and API keys for various LLM providers integrated into Vice AI.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • class Config_v3 64.5% similar

    Configuration manager class that loads, manages, and persists configuration settings for a contract validity analyzer application, supporting YAML files and environment variable overrides.

    From: /tf/active/vicechatdev/contract_validity_analyzer/config/config.py
  • class Config_v5 60.2% similar

    A hierarchical configuration manager that loads and manages settings from multiple sources (defaults, files, environment variables) with support for nested structures and dynamic updates.

    From: /tf/active/vicechatdev/invoice_extraction/config.py
  • class Config_v2 56.5% similar

    Configuration class that manages environment-based settings for a SharePoint to FileCloud synchronization application.

    From: /tf/active/vicechatdev/SPFCsync/config.py