class AnalysisStep_v1
A dataclass representing an individual step in an analysis process, tracking execution details, scripts, outputs, and errors for a specific analysis operation.
File: /tf/active/vicechatdev/smartstat/models.py
Lines: 146 - 176
Complexity: simple
Purpose
AnalysisStep serves as a data container for tracking individual steps within a multi-step analysis workflow. It captures metadata about each step including its type (data loading, script generation, execution, debugging, or reporting), the generated scripts, execution results, and any errors encountered. This class is designed to maintain a complete audit trail of analysis operations within a session, enabling debugging, reporting, and workflow reconstruction.
Source Code
@dataclass
class AnalysisStep:
    """Individual step in analysis process"""
    step_id: str
    session_id: str
    step_number: int
    step_type: str  # "data_load", "script_generation", "execution", "debugging", "reporting"
    input_data: Dict[str, Any]
    generated_script: str = ""
    execution_output: str = ""
    execution_error: str = ""
    execution_success: bool = False
    created_at: datetime = None

    def __post_init__(self):
        # Fill in the creation time when the caller did not supply one
        if self.created_at is None:
            self.created_at = datetime.now()

    def to_dict(self) -> Dict[str, Any]:
        """Convert to dictionary with proper serialization"""
        return {
            'step_id': self.step_id,
            'session_id': self.session_id,
            'step_number': self.step_number,
            'step_type': self.step_type,
            'input_data': self.input_data,
            'generated_script': self.generated_script,
            'execution_output': self.execution_output,
            'execution_error': self.execution_error,
            'execution_success': self.execution_success,
            'created_at': self.created_at.isoformat() if self.created_at else None
        }
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| bases | - | - | |
Parameter Details
step_id: Unique identifier for this analysis step, typically a UUID string
session_id: Identifier linking this step to a parent analysis session
step_number: Sequential number indicating the order of this step within the session (integer)
step_type: Type of analysis step being performed. Expected values: 'data_load', 'script_generation', 'execution', 'debugging', 'reporting'. The class does not validate this field; the values are listed only in a source comment
input_data: Dictionary containing input parameters and data for this step. Structure varies based on step_type
generated_script: The script code generated or used in this step (default: empty string)
execution_output: Standard output captured from script execution (default: empty string)
execution_error: Error messages or stack traces if execution failed (default: empty string)
execution_success: Boolean flag indicating whether the step executed successfully (default: False)
created_at: Timestamp when the step was created. Auto-populated with current datetime if not provided (default: None, set in __post_init__)
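Because step_type is a plain string, nothing in the class rejects a misspelled value. One way a caller might guard against that is sketched below. The names `VALID_STEP_TYPES` and `validate_step_type` are hypothetical and do not appear in the source; the set of values is taken from the comment on the step_type field.

```python
# Hypothetical helper; not part of smartstat/models.py.
VALID_STEP_TYPES = frozenset(
    {"data_load", "script_generation", "execution", "debugging", "reporting"}
)


def validate_step_type(step_type: str) -> str:
    """Return step_type unchanged, or raise ValueError if it is not a documented value."""
    if step_type not in VALID_STEP_TYPES:
        raise ValueError(
            f"Unknown step_type {step_type!r}; expected one of {sorted(VALID_STEP_TYPES)}"
        )
    return step_type
```

A caller could wrap the argument at construction time, for example `step_type=validate_step_type('execution')`, so that a typo fails early instead of being stored in the audit trail.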
Return Value
Instantiation returns an AnalysisStep object with all specified attributes. The to_dict() method returns a dictionary containing every field. Only created_at is converted (to an ISO 8601 string); the other values, including input_data, are passed through unchanged, so the result is JSON-serializable only if input_data is.
Class Interface
Methods
__post_init__(self) -> None
Purpose: Dataclass post-initialization hook that sets created_at to current datetime if not provided
Returns: None - modifies the instance in place
to_dict(self) -> Dict[str, Any]
Purpose: Converts the AnalysisStep instance to a dictionary. created_at is converted to an ISO 8601 string; all other fields are copied as-is
Returns: Dictionary containing all step attributes with created_at converted to ISO format string (or None if created_at is None)
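The source defines to_dict() but no inverse. The sketch below shows one way a step might be written to JSON and rebuilt, assuming a hypothetical helper `step_from_dict` (not in the source). The class is repeated in condensed form only so the example runs on its own; `Optional[datetime]` is used for created_at here as a stricter annotation than the original.

```python
import json
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class AnalysisStep:
    """Condensed copy of the documented class, repeated so the sketch runs standalone."""
    step_id: str
    session_id: str
    step_number: int
    step_type: str
    input_data: Dict[str, Any]
    generated_script: str = ""
    execution_output: str = ""
    execution_error: str = ""
    execution_success: bool = False
    created_at: Optional[datetime] = None

    def __post_init__(self):
        if self.created_at is None:
            self.created_at = datetime.now()

    def to_dict(self) -> Dict[str, Any]:
        return {
            "step_id": self.step_id,
            "session_id": self.session_id,
            "step_number": self.step_number,
            "step_type": self.step_type,
            "input_data": self.input_data,
            "generated_script": self.generated_script,
            "execution_output": self.execution_output,
            "execution_error": self.execution_error,
            "execution_success": self.execution_success,
            "created_at": self.created_at.isoformat() if self.created_at else None,
        }


def step_from_dict(data: Dict[str, Any]) -> AnalysisStep:
    """Hypothetical inverse of to_dict(): parse created_at back into a datetime."""
    fields = dict(data)  # shallow copy so the caller's dict is not mutated
    raw = fields.get("created_at")
    fields["created_at"] = datetime.fromisoformat(raw) if raw else None
    return AnalysisStep(**fields)


original = AnalysisStep(
    step_id="step-001",
    session_id="session-abc",
    step_number=1,
    step_type="data_load",
    input_data={"file_path": "/data/input.csv"},
)
payload = json.dumps(original.to_dict())  # created_at is already a string at this point
restored = step_from_dict(json.loads(payload))
```

The round trip works here because input_data holds only JSON-friendly values. If input_data contained objects such as DataFrames or datetimes, `json.dumps` would raise a TypeError and the caller would need to convert those values first.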
Attributes
| Name | Type | Description | Scope |
|---|---|---|---|
| step_id | str | Unique identifier for this analysis step | instance |
| session_id | str | Identifier of the parent analysis session this step belongs to | instance |
| step_number | int | Sequential number indicating the position of this step in the analysis workflow | instance |
| step_type | str | Type of analysis operation: 'data_load', 'script_generation', 'execution', 'debugging', or 'reporting' | instance |
| input_data | Dict[str, Any] | Dictionary containing input parameters and data specific to this step's operation | instance |
| generated_script | str | The script code generated or executed in this step (default: empty string) | instance |
| execution_output | str | Standard output captured from script execution (default: empty string) | instance |
| execution_error | str | Error messages or stack traces if execution failed (default: empty string) | instance |
| execution_success | bool | Flag indicating whether the step executed successfully (default: False) | instance |
| created_at | datetime | Timestamp when the step was created, automatically set to current time if not provided | instance |
Dependencies
- datetime
- typing
- dataclasses
Required Imports
from datetime import datetime
from typing import Dict, Any
from dataclasses import dataclass
Usage Example
from datetime import datetime
from typing import Dict, Any
from dataclasses import dataclass

@dataclass
class AnalysisStep:
    step_id: str
    session_id: str
    step_number: int
    step_type: str
    input_data: Dict[str, Any]
    generated_script: str = ""
    execution_output: str = ""
    execution_error: str = ""
    execution_success: bool = False
    created_at: datetime = None

    def __post_init__(self):
        if self.created_at is None:
            self.created_at = datetime.now()

    def to_dict(self) -> Dict[str, Any]:
        return {
            'step_id': self.step_id,
            'session_id': self.session_id,
            'step_number': self.step_number,
            'step_type': self.step_type,
            'input_data': self.input_data,
            'generated_script': self.generated_script,
            'execution_output': self.execution_output,
            'execution_error': self.execution_error,
            'execution_success': self.execution_success,
            'created_at': self.created_at.isoformat() if self.created_at else None
        }

# Create a new analysis step
step = AnalysisStep(
    step_id='step-001',
    session_id='session-abc',
    step_number=1,
    step_type='data_load',
    input_data={'file_path': '/data/input.csv', 'format': 'csv'}
)

# Update execution results
step.generated_script = 'import pandas as pd\ndf = pd.read_csv("/data/input.csv")'
step.execution_output = 'Successfully loaded 1000 rows'
step.execution_success = True

# Serialize to dictionary
step_dict = step.to_dict()
print(step_dict['created_at'])  # ISO format timestamp
Best Practices
- Always provide step_id and session_id to maintain proper tracking and relationships between steps
- Use consistent step_type values from the defined set: 'data_load', 'script_generation', 'execution', 'debugging', 'reporting'
- Set execution_success to True only after successful execution, and populate execution_error if failures occur
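- created_at is set automatically to datetime.now() (a naive, local-time value) when not provided, so every step has a creation timestamp; pass a timezone-aware datetime explicitly if steps are compared across time zones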
- Use to_dict() method for serialization to JSON or database storage, as it properly handles datetime conversion
- Keep input_data structure consistent for the same step_type to facilitate processing and analysis
- Update execution_output and execution_error fields immediately after script execution for accurate tracking
- Keep step_number sequential within a session so the steps of the analysis workflow can be ordered reliably
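Two of the practices above (keeping execution_success consistent with execution_error, and keeping step_number sequential) can be sketched with small helpers. `StepStub`, `record_execution`, and `next_step_number` are hypothetical names introduced for illustration and are not part of the source. `StepStub` is a stand-in that carries only the AnalysisStep fields this sketch touches; the helpers only read and set those attribute names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StepStub:
    """Stand-in with only the AnalysisStep fields used below (assumption)."""
    step_number: int
    execution_output: str = ""
    execution_error: str = ""
    execution_success: bool = False


def record_execution(step: StepStub, stdout: str, error: str = "") -> None:
    """Hypothetical helper: store output and error together so the flags stay consistent."""
    step.execution_output = stdout
    step.execution_error = error
    # Success is derived from the absence of an error message.
    step.execution_success = not error


def next_step_number(steps: List[StepStub]) -> int:
    """Hypothetical helper: next sequential number, starting at 1 for an empty session."""
    return max((s.step_number for s in steps), default=0) + 1


steps = [StepStub(step_number=1), StepStub(step_number=2)]
record_execution(steps[0], "Successfully loaded 1000 rows")
record_execution(steps[1], "", error="FileNotFoundError: /data/missing.csv")
new_step = StepStub(step_number=next_step_number(steps))
```

Deriving execution_success from the error string is one possible convention; a workflow that treats warnings on stderr as non-fatal would need a separate success signal, such as the script's exit code.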
Similar Components
AI-powered semantic similarity - components with related functionality:
- class AnalysisStep (97.7% similar)
- class DataAnalysisSession_v1 (71.8% similar)
- class AnalysisResult_v1 (67.6% similar)
- class DataSection (67.1% similar)
- class DataAnalysisSession (66.2% similar)