class AnalysisRequest

Maturity: 51

A dataclass that encapsulates an analysis request combining data retrieval requirements and statistical analysis specifications.

File: /tf/active/vicechatdev/full_smartstat/enhanced_sql_workflow.py
Lines: 26 - 44
Complexity: simple

Purpose

AnalysisRequest serves as a structured container for defining both data needs (what data to retrieve, filters, row limits) and analysis requirements (what statistical analysis to perform, expected result types). It acts as a communication object between components that need to coordinate data retrieval and statistical processing, providing a standardized way to specify data queries alongside analysis intentions.

Source Code

@dataclass
class AnalysisRequest:
    """Analysis request combining data needs and statistical analysis"""
    data_description: str
    analysis_description: str
    max_rows: Optional[int] = 10000
    specific_columns: List[str] = None
    filters: Dict[str, Any] = None
    expected_result_type: str = None  # 'summary', 'comparison', 'trend', 'correlation', etc.
    
    def to_dict(self) -> Dict[str, Any]:
        """Convert analysis request to dictionary"""
        return {
            'data_description': self.data_description,
            'analysis_description': self.analysis_description,
            'max_rows': self.max_rows,
            'specific_columns': self.specific_columns or [],
            'filters': self.filters or {},
            'expected_result_type': self.expected_result_type
        }

Parameters

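Name | Type | Default | Kind
data_description | str | - | positional or keyword
analysis_description | str | - | positional or keyword
max_rows | Optional[int] | 10000 | positional or keyword
specific_columns | List[str] | None | positional or keyword
filters | Dict[str, Any] | None | positional or keyword
expected_result_type | str | None | positional or keyword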

Parameter Details

data_description: A textual description of the data needed for analysis. This should describe what data is being requested in natural language (e.g., 'sales data from Q4 2023').

analysis_description: A textual description of the statistical analysis to be performed on the data. This describes the analytical operation or question to be answered (e.g., 'calculate average sales by region').

max_rows: Optional integer limiting the maximum number of rows to retrieve. Defaults to 10000. Use this to prevent excessive data retrieval and manage memory constraints.

specific_columns: Optional list of column names to retrieve. If None or empty, all columns may be retrieved. Use this to limit data to only necessary columns for the analysis.

filters: Optional dictionary of filter conditions to apply to the data retrieval. Keys are typically column names and values are filter criteria. Structure depends on the consuming component's expectations.

expected_result_type: Optional string indicating the type of analysis result expected. Common values include 'summary', 'comparison', 'trend', 'correlation', etc. This helps downstream components understand how to process and present results.
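The source does not show how a consuming component interprets specific_columns, filters, and max_rows. As one possible reading, the sketch below treats filters as simple equality conditions and builds a parameterized query for sqlite3. The helper build_query, the allowed_columns allowlist, and the purchases table are all hypothetical and are not part of enhanced_sql_workflow.py.

```python
import sqlite3
from typing import Any, Dict, List, Optional, Tuple


def build_query(
    table: str,
    allowed_columns: List[str],
    specific_columns: Optional[List[str]] = None,
    filters: Optional[Dict[str, Any]] = None,
    max_rows: Optional[int] = 10000,
) -> Tuple[str, List[Any]]:
    """Hypothetical helper: turn request fields into a parameterized SELECT.

    Assumes every filter is an equality test and that `table` is trusted.
    """
    columns = specific_columns or allowed_columns  # None or empty means all columns
    filters = filters or {}
    # Identifiers cannot be bound as parameters, so check them against an allowlist.
    for name in list(columns) + list(filters):
        if name not in allowed_columns:
            raise ValueError(f"unknown column: {name}")
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    params: List[Any] = []
    if filters:
        sql += " WHERE " + " AND ".join(f"{name} = ?" for name in filters)
        params.extend(filters.values())
    if max_rows is not None:
        sql += " LIMIT ?"
        params.append(max_rows)
    return sql, params


# Small in-memory table to exercise the helper (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (customer_id INTEGER, segment TEXT, year INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?)",
    [
        (1, "retail", 2023, "completed"),
        (2, "retail", 2022, "completed"),
        (3, "wholesale", 2023, "pending"),
    ],
)

sql, params = build_query(
    "purchases",
    allowed_columns=["customer_id", "segment", "year", "status"],
    specific_columns=["customer_id", "segment"],
    filters={"year": 2023, "status": "completed"},
    max_rows=5000,
)
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Binding the filter values and the row limit as parameters keeps request data out of the SQL text. Column names still have to be validated separately, which is why the sketch checks them against an allowlist.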

Return Value

Instantiation returns an AnalysisRequest object with all specified attributes. The to_dict() method returns a dictionary representation with keys: 'data_description', 'analysis_description', 'max_rows', 'specific_columns' (empty list if None), 'filters' (empty dict if None), and 'expected_result_type'.

Class Interface

Methods

to_dict(self) -> Dict[str, Any]

Purpose: Converts the AnalysisRequest instance to a dictionary representation suitable for serialization, API transmission, or storage

Returns: Dictionary with keys 'data_description', 'analysis_description', 'max_rows', 'specific_columns', 'filters', and 'expected_result_type'. Because the method uses the `or` operator, a specific_columns value of None (or an empty list) is returned as an empty list, and a filters value of None (or an empty dict) is returned as an empty dict. All other values are passed through unchanged.
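The normalization performed by to_dict() can be checked with a request that supplies only the two required fields. The sketch below redefines the class locally so it runs on its own. The Optional[...] annotations on the last three fields and the JSON round trip are additions for illustration and are not taken from the original file.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class AnalysisRequest:
    """Local copy of the documented class so the sketch is self-contained."""
    data_description: str
    analysis_description: str
    max_rows: Optional[int] = 10000
    specific_columns: Optional[List[str]] = None
    filters: Optional[Dict[str, Any]] = None
    expected_result_type: Optional[str] = None

    def to_dict(self) -> Dict[str, Any]:
        return {
            'data_description': self.data_description,
            'analysis_description': self.analysis_description,
            'max_rows': self.max_rows,
            'specific_columns': self.specific_columns or [],
            'filters': self.filters or {},
            'expected_result_type': self.expected_result_type,
        }


# Only the required fields are given; the rest fall back to their defaults.
minimal = AnalysisRequest(
    data_description="sales data from Q4 2023",
    analysis_description="calculate average sales by region",
)
payload = minimal.to_dict()

# The attributes themselves remain None; only the dictionary is normalized.
assert minimal.specific_columns is None and minimal.filters is None
assert payload["specific_columns"] == [] and payload["filters"] == {}

# The dictionary contains only JSON-friendly values, so it survives a round trip.
encoded = json.dumps(payload)
restored = AnalysisRequest(**json.loads(encoded))
assert restored.to_dict() == payload
```

Note that the restored object holds an empty list and an empty dict rather than None, so the round trip preserves the dictionary form but not the original attribute values.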

Attributes

Name | Type | Description | Scope
data_description | str | Natural language description of the data needed for the analysis | instance
analysis_description | str | Natural language description of the statistical analysis to perform | instance
max_rows | Optional[int] | Maximum number of rows to retrieve, defaults to 10000 | instance
specific_columns | List[str] | List of specific column names to retrieve, None means no restriction | instance
filters | Dict[str, Any] | Dictionary of filter conditions to apply during data retrieval | instance
expected_result_type | str | Type of analysis result expected (e.g., 'summary', 'comparison', 'trend', 'correlation') | instance

Dependencies

  • typing
  • dataclasses

Required Imports

from dataclasses import dataclass
from typing import Dict, List, Optional, Any

Usage Example

from dataclasses import dataclass
from typing import Dict, List, Optional, Any

@dataclass
class AnalysisRequest:
    data_description: str
    analysis_description: str
    max_rows: Optional[int] = 10000
    specific_columns: List[str] = None
    filters: Dict[str, Any] = None
    expected_result_type: str = None
    
    def to_dict(self) -> Dict[str, Any]:
        return {
            'data_description': self.data_description,
            'analysis_description': self.analysis_description,
            'max_rows': self.max_rows,
            'specific_columns': self.specific_columns or [],
            'filters': self.filters or {},
            'expected_result_type': self.expected_result_type
        }

# Create an analysis request
request = AnalysisRequest(
    data_description="Customer purchase data from 2023",
    analysis_description="Calculate average purchase value by customer segment",
    max_rows=5000,
    specific_columns=["customer_id", "segment", "purchase_amount", "purchase_date"],
    filters={"year": 2023, "status": "completed"},
    expected_result_type="summary"
)

# Convert to dictionary for serialization or API calls
request_dict = request.to_dict()
print(request_dict)

Best Practices

  • Always provide meaningful data_description and analysis_description to help downstream components understand the intent
  • Set max_rows appropriately based on memory constraints and analysis needs - the default 10000 may need adjustment
  • Use specific_columns to limit data retrieval to only necessary fields for better performance
  • Structure filters dictionary consistently with the expectations of the consuming SQL or data retrieval component
  • Use expected_result_type to guide result formatting and presentation logic
  • Call to_dict() when serializing for API calls, logging, or storage
  • Be aware that specific_columns and filters default to None, not empty collections - handle this in consuming code
  • Since this is a dataclass, avoid mutable default arguments (the current implementation is safe as defaults are None)
  • Consider validating that data_description and analysis_description are non-empty strings after instantiation
  • This class is mutable by default (the dataclass is not declared with frozen=True), so attributes can be reassigned after creation; best practice is to treat instances as immutable once created
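Several of the practices above (validating the descriptions, avoiding shared mutable defaults, and treating requests as immutable) can be combined in a stricter variant. The sketch below is one possible approach and not the implementation in enhanced_sql_workflow.py. The class name ValidatedAnalysisRequest is hypothetical, and the switch from None defaults to empty collections is a design assumption.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace
from typing import Any, Dict, List, Optional


@dataclass(frozen=True)
class ValidatedAnalysisRequest:
    """Hypothetical stricter variant of AnalysisRequest (illustration only)."""
    data_description: str
    analysis_description: str
    max_rows: Optional[int] = 10000
    # default_factory gives every instance its own empty list and dict.
    specific_columns: List[str] = field(default_factory=list)
    filters: Dict[str, Any] = field(default_factory=dict)
    expected_result_type: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject missing, non-string, or whitespace-only descriptions.
        for name in ("data_description", "analysis_description"):
            value = getattr(self, name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{name} must be a non-empty string")


request = ValidatedAnalysisRequest(
    "Customer purchase data from 2023",
    "Calculate average purchase value by customer segment",
)

# frozen=True blocks attribute reassignment after creation.
try:
    request.max_rows = 500
    mutated = True
except FrozenInstanceError:
    mutated = False

# dataclasses.replace builds a modified copy and re-runs the validation.
smaller = replace(request, max_rows=500)

# An empty description is rejected at construction time.
try:
    ValidatedAnalysisRequest("", "anything")
    rejected = False
except ValueError:
    rejected = True

print(mutated, smaller.max_rows, rejected)
```

frozen=True only prevents rebinding attributes. The list and dict held by an instance can still be modified in place, so callers that need full immutability would have to store tuples or copies instead.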

Similar Components

AI-powered semantic similarity - components with related functionality:

  • class AnalysisResult_v1 69.7% similar

    A dataclass that encapsulates the results from statistical analysis operations, including metadata, file paths, and timestamps.

    From: /tf/active/vicechatdev/vice_ai/models.py
  • class AnalysisResult 69.6% similar

    A dataclass that encapsulates the results from statistical analysis operations, including metadata, file paths, and timestamps.

    From: /tf/active/vicechatdev/vice_ai/smartstat_models.py
  • class AnalysisResult_v1 68.0% similar

    A dataclass that encapsulates results from statistical analysis operations, providing structured storage and serialization capabilities for analysis outputs.

    From: /tf/active/vicechatdev/vice_ai/models.py
  • class DataAnalysisSession_v1 64.5% similar

    A dataclass representing a statistical analysis session that is linked to specific document sections, managing analysis state, messages, plots, and configuration.

    From: /tf/active/vicechatdev/vice_ai/models.py
  • class DataSection 63.1% similar

    A dataclass representing a dedicated data analysis section that stores analysis results, plots, dataset information, and conclusions separately from text content.

    From: /tf/active/vicechatdev/vice_ai/models.py