class AnalysisStep

Maturity: 48

A dataclass representing an individual step in an analysis process, tracking execution details, scripts, outputs, and metadata for each step in a data analysis workflow.

File: /tf/active/vicechatdev/vice_ai/smartstat_models.py
Lines: 151 - 187
Complexity: simple

Purpose

AnalysisStep serves as a structured container for tracking individual steps within a larger analysis session. It captures all relevant information about a step including its type (data_load, script_generation, execution, debugging, reporting), input data, generated scripts, execution results, and associated metadata. This class is designed to maintain a complete audit trail of analysis operations, enabling debugging, reporting, and session reconstruction.

Source Code

@dataclass  # required for the generated __init__ and for __post_init__ to run
class AnalysisStep:
    """Individual step in analysis process"""
    step_id: str
    session_id: str
    step_number: int
    step_type: str  # "data_load", "script_generation", "execution", "debugging", "reporting"
    input_data: Dict[str, Any]
    generated_script: str = ""
    execution_output: str = ""
    execution_error: str = ""
    execution_success: bool = False
    notes: str = None
    metadata: Dict[str, Any] = None
    created_at: datetime = None
    
    def __post_init__(self):
        if self.created_at is None:
            self.created_at = datetime.now()
        if self.metadata is None:
            self.metadata = {}
    
    def to_dict(self) -> Dict[str, Any]:
        """Convert to dictionary with proper serialization"""
        return {
            'step_id': self.step_id,
            'session_id': self.session_id,
            'step_number': self.step_number,
            'step_type': self.step_type,
            'input_data': self.input_data,
            'generated_script': self.generated_script,
            'execution_output': self.execution_output,
            'execution_error': self.execution_error,
            'execution_success': self.execution_success,
            'notes': self.notes,
            'metadata': self.metadata,
            'created_at': self.created_at.isoformat() if self.created_at else None
        }

Parameters

Name Type Default Kind
bases - -

Parameter Details

step_id: Unique identifier for this analysis step, typically a UUID string

session_id: Identifier linking this step to its parent analysis session

step_number: Sequential number indicating the order of this step within the session (integer)

step_type: Type of analysis step being performed. Documented values: 'data_load', 'script_generation', 'execution', 'debugging', 'reporting'. The field is a plain str and the class does not validate it

input_data: Dictionary containing input parameters and data for this step. Structure varies by step_type

generated_script: Python or other script code generated during this step (default: empty string)

execution_output: Standard output captured from script execution (default: empty string)

execution_error: Error messages or stack traces if execution failed (default: empty string)

execution_success: Boolean flag indicating whether the step executed successfully (default: False)

notes: Optional human-readable notes or comments about this step (default: None)

metadata: Dictionary for storing additional arbitrary metadata about the step (default: None, initialized to {} in __post_init__)

created_at: Timestamp when this step was created (default: None, initialized to current datetime in __post_init__)
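
Because step_type is an unchecked string, a caller who wants to catch typos has to validate it before constructing a step. The following is a minimal sketch of such a check; KNOWN_STEP_TYPES and check_step_type are hypothetical names and are not part of smartstat_models.py.

```python
# Documented step types, copied from the comment on the step_type field.
KNOWN_STEP_TYPES = frozenset(
    {"data_load", "script_generation", "execution", "debugging", "reporting"}
)


def check_step_type(step_type: str) -> str:
    """Return step_type unchanged, or raise ValueError if it is not documented.

    Hypothetical helper: AnalysisStep itself performs no such check.
    """
    if step_type not in KNOWN_STEP_TYPES:
        raise ValueError(
            f"unknown step_type {step_type!r}; "
            f"expected one of {sorted(KNOWN_STEP_TYPES)}"
        )
    return step_type


print(check_step_type("data_load"))
try:
    check_step_type("plotting")
except ValueError as exc:
    print(f"rejected: {exc}")
```

The value returned by check_step_type can be passed straight to the step_type argument of the constructor.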

Return Value

Instantiation returns an AnalysisStep object with all specified attributes. The to_dict() method returns a dictionary with one key per attribute; created_at is converted to an ISO format string, and every other value (including input_data and metadata) is passed through unchanged.

Class Interface

Methods

__post_init__(self) -> None

Purpose: Dataclass post-initialization hook that sets default values for created_at and metadata if not provided

Returns: None - modifies instance attributes in place
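
The None-then-fill pattern in __post_init__ gives every instance its own metadata dict and its own timestamp. The sketch below illustrates this with a hypothetical stand-in class, StepSketch, that keeps only the relevant fields. StepSketchWithFactory shows the default_factory spelling for comparison; it is an alternative, not what the source uses.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class StepSketch:
    """Hypothetical stand-in with the same default handling as AnalysisStep."""
    step_id: str
    metadata: Optional[Dict[str, Any]] = None
    created_at: Optional[datetime] = None

    def __post_init__(self):
        # Fill in per-instance defaults after the generated __init__ has run.
        if self.created_at is None:
            self.created_at = datetime.now()
        if self.metadata is None:
            self.metadata = {}


@dataclass
class StepSketchWithFactory:
    """Alternative spelling using default_factory (not what the source does)."""
    step_id: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    created_at: datetime = field(default_factory=datetime.now)


a = StepSketch("a")
b = StepSketch("b")
a.metadata["rows"] = 10
print(b.metadata)  # {} because each instance received a fresh dict
```

One behavioral difference: with __post_init__, an explicit metadata=None is also replaced by {}. With default_factory, an explicitly passed None is kept as None.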

to_dict(self) -> Dict[str, Any]

Purpose: Converts the AnalysisStep instance to a dictionary with proper serialization of datetime objects

Returns: Dictionary containing all step attributes with created_at converted to ISO format string
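
A JSON round trip using to_dict() might look like the sketch below. The class here is a condensed copy so the snippet runs on its own; its to_dict() is built from vars() but should yield the same keys and values as the method in the Source Code section. step_from_dict is a hypothetical inverse and is not part of the documented class.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class AnalysisStep:
    """Condensed copy of the documented class so this snippet is runnable."""
    step_id: str
    session_id: str
    step_number: int
    step_type: str
    input_data: Dict[str, Any]
    generated_script: str = ""
    execution_output: str = ""
    execution_error: str = ""
    execution_success: bool = False
    notes: Optional[str] = None
    metadata: Optional[Dict[str, Any]] = None
    created_at: Optional[datetime] = None

    def __post_init__(self):
        if self.created_at is None:
            self.created_at = datetime.now()
        if self.metadata is None:
            self.metadata = {}

    def to_dict(self) -> Dict[str, Any]:
        data = dict(vars(self))  # shallow copy of the instance attributes
        data["created_at"] = self.created_at.isoformat() if self.created_at else None
        return data


def step_from_dict(data: Dict[str, Any]) -> AnalysisStep:
    """Hypothetical inverse of to_dict(): parse created_at back to datetime."""
    fields = dict(data)
    if fields.get("created_at"):
        fields["created_at"] = datetime.fromisoformat(fields["created_at"])
    return AnalysisStep(**fields)


step = AnalysisStep(
    "step-001", "session-abc", 1, "data_load", {"file_path": "/data/input.csv"}
)
payload = json.dumps(step.to_dict())            # plain JSON text
restored = step_from_dict(json.loads(payload))  # back to an AnalysisStep
print(restored == step)
```

json.dumps succeeds here only because input_data and metadata hold JSON-compatible values. to_dict() does not convert nested objects such as DataFrames or datetimes inside those dictionaries.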

Attributes

Name | Type | Description | Scope
step_id | str | Unique identifier for this analysis step | instance
session_id | str | Identifier of the parent analysis session | instance
step_number | int | Sequential order number of this step within the session | instance
step_type | str | Type of analysis step: 'data_load', 'script_generation', 'execution', 'debugging', or 'reporting' | instance
input_data | Dict[str, Any] | Dictionary containing input parameters and data for this step | instance
generated_script | str | Script code generated during this step (default: empty string) | instance
execution_output | str | Standard output captured from script execution (default: empty string) | instance
execution_error | str | Error messages or stack traces from failed execution (default: empty string) | instance
execution_success | bool | Flag indicating whether the step executed successfully (default: False) | instance
notes | str | Optional human-readable notes about this step (default: None) | instance
metadata | Dict[str, Any] | Dictionary for storing additional arbitrary metadata (initialized to {} if None) | instance
created_at | datetime | Timestamp when this step was created (initialized to current time if None) | instance

Dependencies

  • datetime
  • typing
  • dataclasses

Required Imports

from datetime import datetime
from typing import Dict, Any
from dataclasses import dataclass

Usage Example

# Assumes smartstat_models.py is importable under this name; adjust to your package layout
from smartstat_models import AnalysisStep

# Create a new analysis step
step = AnalysisStep(
    step_id='step-001',
    session_id='session-abc',
    step_number=1,
    step_type='data_load',
    input_data={'file_path': '/data/input.csv', 'format': 'csv'},
    notes='Loading initial dataset'
)

# Update step with execution results
step.generated_script = 'import pandas as pd\ndf = pd.read_csv("/data/input.csv")'
step.execution_success = True
step.execution_output = 'Successfully loaded 1000 rows'

# Add custom metadata
step.metadata['rows_loaded'] = 1000
step.metadata['columns'] = ['id', 'name', 'value']

# Convert to dictionary for serialization
step_dict = step.to_dict()
print(step_dict['created_at'])  # ISO format timestamp

Best Practices

  • Give every step a unique step_id, for example uuid.uuid4().hex
  • Set step_type to one of the documented values: 'data_load', 'script_generation', 'execution', 'debugging', 'reporting'. The class does not validate it, so a typo is stored silently
  • Update execution_success, execution_output, and execution_error after running generated scripts to maintain accurate state
  • Use the metadata dictionary for extensible custom attributes rather than modifying the class
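  • Call to_dict() before serializing to JSON or storing in a database. It converts created_at to an ISO format string but passes input_data and metadata through unchanged, so keep their values serializable
  • The created_at timestamp is set automatically from datetime.now() if not provided, so it records step creation time as a naive local datetime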
  • Link steps together using consistent session_id values and sequential step_number values
  • Store detailed error information in execution_error for debugging failed steps
  • Use notes field for human-readable context that helps understand the step's purpose
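
Several of the practices above (unique IDs, sequential numbering, recording the outcome after execution) can be combined in small helpers. The following is a sketch under assumptions: StepRecord is a hypothetical stand-in carrying only the AnalysisStep fields used here, and next_step and record_result are illustrative names that do not exist in the source module.

```python
import traceback
import uuid
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class StepRecord:
    """Hypothetical stand-in exposing only the AnalysisStep fields used below."""
    step_id: str
    session_id: str
    step_number: int
    step_type: str
    input_data: Dict[str, Any]
    execution_output: str = ""
    execution_error: str = ""
    execution_success: bool = False


def next_step(steps: List[StepRecord], session_id: str, step_type: str,
              input_data: Dict[str, Any]) -> StepRecord:
    """Append a step with a fresh UUID and the next sequential step_number."""
    step = StepRecord(
        step_id=uuid.uuid4().hex,
        session_id=session_id,
        step_number=len(steps) + 1,
        step_type=step_type,
        input_data=input_data,
    )
    steps.append(step)
    return step


def record_result(step: StepRecord, run: Callable[[], Any]) -> None:
    """Call run() and store its outcome on the step instead of raising."""
    try:
        step.execution_output = str(run())
        step.execution_success = True
    except Exception:
        # Keep the full stack trace so the failed step can be debugged later.
        step.execution_error = traceback.format_exc()
        step.execution_success = False


steps: List[StepRecord] = []
load = next_step(steps, "session-abc", "data_load", {"file_path": "/data/input.csv"})
record_result(load, lambda: "loaded 1000 rows")
failing = next_step(steps, "session-abc", "execution", {})
record_result(failing, lambda: 1 / 0)
print([(s.step_number, s.execution_success) for s in steps])
```

Deriving step_number from the list length assumes steps are only ever appended, and from a single thread.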

Similar Components

AI-powered semantic similarity - components with related functionality:

  • class AnalysisStep_v1 97.7% similar

    A dataclass representing an individual step in an analysis process, tracking execution details, scripts, outputs, and errors for a specific analysis operation.

    From: /tf/active/vicechatdev/smartstat/models.py
  • class DataAnalysisSession_v1 73.5% similar

    A dataclass representing a statistical analysis session that is linked to specific document sections, managing analysis state, messages, plots, and configuration.

    From: /tf/active/vicechatdev/vice_ai/models.py
  • class DataSection 70.6% similar

    A dataclass representing a dedicated data analysis section that stores analysis results, plots, dataset information, and conclusions separately from text content.

    From: /tf/active/vicechatdev/vice_ai/models.py
  • class DataAnalysisSession 70.1% similar

    A dataclass representing a data analysis session that is linked to a specific text section within a document, managing conversation messages, analysis results, plots, and configuration.

    From: /tf/active/vicechatdev/vice_ai/models.py
  • class StatisticalSession 68.8% similar

    A dataclass representing a statistical analysis session that tracks metadata, configuration, and status of data analysis operations.

    From: /tf/active/vicechatdev/vice_ai/smartstat_models.py