🔍 Code Extractor

class DataAnalysisSession

Maturity: 50

A dataclass representing a data analysis session that is linked to a specific text section within a document, managing conversation messages, analysis results, plots, and configuration.

File:
/tf/active/vicechatdev/vice_ai/models.py
Lines:
1973 - 2072
Complexity:
moderate

Purpose

DataAnalysisSession manages the complete lifecycle of a data analysis workflow within a document context. It tracks the analysis conversation through messages, stores configuration and data sources, maintains analysis status, holds SQL queries, stores results and generated plots, and provides serialization/deserialization capabilities. This class serves as the central state container for interactive data analysis sessions that are embedded within document sections.

Source Code

class DataAnalysisSession:
    """Data analysis session linked to a text section"""
    session_id: str
    section_id: str  # Link to TextSection
    document_id: str  # Link to Document
    user_id: str = "default"
    created_at: datetime = None
    updated_at: datetime = None
    title: str = ""
    description: str = ""
    data_source: DataSource = None
    analysis_config: AnalysisConfiguration = None
    status: AnalysisStatus = AnalysisStatus.PENDING
    sql_query: str = ""
    
    # Chat messages for analysis conversation
    messages: List[Dict] = None
    
    # Results storage
    analysis_results: List[Dict] = None
    generated_plots: List[str] = None  # Paths to plot files
    conclusions: str = ""
    
    def __post_init__(self):
        if self.created_at is None:
            self.created_at = datetime.now()
        if self.updated_at is None:
            self.updated_at = datetime.now()
        if self.messages is None:
            self.messages = []
        if self.analysis_results is None:
            self.analysis_results = []
        if self.generated_plots is None:
            self.generated_plots = []
    
    def add_message(self, role: str, content: str, analysis_data: Dict = None):
        """Add a message to the analysis conversation"""
        message = {
            'id': str(uuid.uuid4()),
            'role': role,
            'content': content,
            'timestamp': datetime.now().isoformat(),
            'analysis_data': analysis_data or {}
        }
        self.messages.append(message)
        self.updated_at = datetime.now()
    
    def to_dict(self) -> Dict[str, Any]:
        """Convert to dictionary with proper serialization"""
        return {
            'session_id': self.session_id,
            'section_id': self.section_id,
            'document_id': self.document_id,
            'user_id': self.user_id,
            'created_at': self.created_at.isoformat() if self.created_at else None,
            'updated_at': self.updated_at.isoformat() if self.updated_at else None,
            'title': self.title,
            'description': self.description,
            'data_source': self.data_source.to_dict() if self.data_source else None,
            'analysis_config': self.analysis_config.to_dict() if self.analysis_config else None,
            'status': self.status.value,
            'sql_query': self.sql_query,
            'messages': self.messages,
            'analysis_results': self.analysis_results,
            'generated_plots': self.generated_plots,
            'conclusions': self.conclusions
        }
    
    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> 'DataAnalysisSession':
        """Create instance from dictionary"""
        session = cls(
            session_id=data['session_id'],
            section_id=data['section_id'],
            document_id=data['document_id'],
            user_id=data.get('user_id', 'default'),
            title=data.get('title', ''),
            description=data.get('description', ''),
            sql_query=data.get('sql_query', ''),
            messages=data.get('messages', []),
            analysis_results=data.get('analysis_results', []),
            generated_plots=data.get('generated_plots', []),
            conclusions=data.get('conclusions', '')
        )
        
        # Handle datetime fields
        if data.get('created_at'):
            session.created_at = datetime.fromisoformat(data['created_at'])
        if data.get('updated_at'):
            session.updated_at = datetime.fromisoformat(data['updated_at'])
        
        # Handle complex objects
        if data.get('data_source'):
            session.data_source = DataSource.from_dict(data['data_source'])
        if data.get('analysis_config'):
            session.analysis_config = AnalysisConfiguration.from_dict(data['analysis_config'])
        if data.get('status'):
            session.status = AnalysisStatus(data['status'])
        
        return session

Parameters

Name Type Default Kind
bases - -

Parameter Details

session_id: Unique identifier for the analysis session (string)

section_id: Identifier linking this session to a specific TextSection (string)

document_id: Identifier linking this session to a parent Document (string)

user_id: Identifier for the user who owns this session, defaults to 'default' (string)

created_at: Timestamp when the session was created, auto-set to current time if None (datetime)

updated_at: Timestamp when the session was last modified, auto-set to current time if None (datetime)

title: Human-readable title for the analysis session, defaults to empty string (string)

description: Detailed description of what this analysis session is about, defaults to empty string (string)

data_source: DataSource object containing information about the data being analyzed, defaults to None (DataSource)

analysis_config: AnalysisConfiguration object with settings for the analysis, defaults to None (AnalysisConfiguration)

status: Current status of the analysis (PENDING, RUNNING, COMPLETED, etc.), defaults to AnalysisStatus.PENDING (AnalysisStatus enum)

sql_query: SQL query string used for data retrieval, defaults to empty string (string)

messages: List of conversation messages in the analysis chat, auto-initialized to empty list (List[Dict])

analysis_results: List of dictionaries containing analysis results, auto-initialized to empty list (List[Dict])

generated_plots: List of file paths to generated plot images, auto-initialized to empty list (List[str])

conclusions: Text summary of analysis conclusions, defaults to empty string (string)

Return Value

Instantiation returns a DataAnalysisSession object with all attributes initialized. The to_dict() method returns a dictionary representation with proper serialization of datetime and enum values. The from_dict() class method returns a new DataAnalysisSession instance reconstructed from a dictionary.

Class Interface

Methods

__post_init__(self) -> None

Purpose: Dataclass post-initialization hook that sets default values for datetime fields and initializes empty lists

Returns: None - modifies instance attributes in place

add_message(self, role: str, content: str, analysis_data: Dict = None) -> None

Purpose: Adds a new message to the analysis conversation with automatic ID generation, timestamp, and updates the session's updated_at field

Parameters:

  • role: Role of the message sender (e.g., 'user', 'assistant', 'system')
  • content: Text content of the message
  • analysis_data: Optional dictionary containing additional analysis-related data for this message

Returns: None - appends message to self.messages list and updates self.updated_at

to_dict(self) -> Dict[str, Any]

Purpose: Converts the DataAnalysisSession instance to a dictionary with proper serialization of datetime objects, enums, and nested objects

Returns: Dictionary representation of the session with all attributes serialized to JSON-compatible types

from_dict(cls, data: Dict[str, Any]) -> DataAnalysisSession

Purpose: Class method that creates a DataAnalysisSession instance from a dictionary, handling deserialization of datetime strings, enums, and nested objects

Parameters:

  • data: Dictionary containing serialized session data with keys matching class attributes

Returns: New DataAnalysisSession instance reconstructed from the dictionary data

Attributes

Name Type Description Scope
session_id str Unique identifier for this analysis session instance
section_id str Identifier linking this session to a TextSection instance
document_id str Identifier linking this session to a Document instance
user_id str Identifier for the user who owns this session, defaults to 'default' instance
created_at datetime Timestamp when the session was created, auto-initialized to current time instance
updated_at datetime Timestamp when the session was last modified, auto-updated by add_message() instance
title str Human-readable title for the analysis session instance
description str Detailed description of the analysis session purpose instance
data_source DataSource DataSource object containing information about the data being analyzed instance
analysis_config AnalysisConfiguration Configuration object with settings for the analysis instance
status AnalysisStatus Current status of the analysis (enum value like PENDING, RUNNING, COMPLETED) instance
sql_query str SQL query string used for data retrieval in this analysis instance
messages List[Dict] List of conversation messages, each with id, role, content, timestamp, and analysis_data instance
analysis_results List[Dict] List of dictionaries containing structured analysis results instance
generated_plots List[str] List of file paths to generated plot images instance
conclusions str Text summary of analysis conclusions and findings instance

Dependencies

  • uuid
  • json
  • datetime
  • typing
  • dataclasses
  • enum
  • sqlite3
  • os

Required Imports

import uuid
from datetime import datetime
from typing import List, Dict, Optional, Any
from dataclasses import dataclass

Usage Example

from datetime import datetime
from typing import List, Dict
from dataclasses import dataclass

# Create a new analysis session
session = DataAnalysisSession(
    session_id='sess_123',
    section_id='sec_456',
    document_id='doc_789',
    user_id='user_001',
    title='Sales Analysis Q4',
    description='Analyzing quarterly sales data'
)

# Add messages to the conversation
session.add_message(
    role='user',
    content='Show me sales trends',
    analysis_data={'query_type': 'trend'}
)

session.add_message(
    role='assistant',
    content='Here are the sales trends...',
    analysis_data={'chart_type': 'line'}
)

# Update session properties
session.status = AnalysisStatus.COMPLETED
session.sql_query = 'SELECT * FROM sales WHERE quarter = 4'
session.generated_plots.append('/path/to/plot.png')
session.conclusions = 'Sales increased by 15% in Q4'

# Serialize to dictionary
data_dict = session.to_dict()

# Deserialize from dictionary
restored_session = DataAnalysisSession.from_dict(data_dict)

Best Practices

  • Always provide session_id, section_id, and document_id when instantiating to maintain proper linking
  • Use add_message() method to add conversation messages rather than directly appending to messages list, as it handles timestamp and ID generation
  • The __post_init__ method automatically initializes None values for created_at, updated_at, and list attributes, so these can be omitted during instantiation
  • When serializing/deserializing, ensure DataSource and AnalysisConfiguration classes are available and implement their own to_dict/from_dict methods
  • The updated_at timestamp is automatically updated when add_message() is called
  • Status should be updated as the analysis progresses through its lifecycle (PENDING -> RUNNING -> COMPLETED)
  • Store plot file paths in generated_plots list for later retrieval and display
  • Use to_dict() for JSON serialization and database storage, and from_dict() for reconstruction
  • The messages list maintains conversation history with unique IDs and timestamps for each message
  • Analysis results should be stored as dictionaries in analysis_results list for structured data storage

Similar Components

AI-powered semantic similarity - components with related functionality:

  • class DataAnalysisSession_v1 94.6% similar

    A dataclass representing a statistical analysis session that is linked to specific document sections, managing analysis state, messages, plots, and configuration.

    From: /tf/active/vicechatdev/vice_ai/models.py
  • class StatisticalSession 82.6% similar

    A dataclass representing a statistical analysis session that tracks metadata, configuration, and status of data analysis operations.

    From: /tf/active/vicechatdev/vice_ai/smartstat_models.py
  • class DataSection 81.3% similar

    A dataclass representing a dedicated data analysis section that stores analysis results, plots, dataset information, and conclusions separately from text content.

    From: /tf/active/vicechatdev/vice_ai/models.py
  • class StatisticalSession_v1 79.4% similar

    A dataclass representing a statistical analysis session that tracks user data analysis workflows, including data sources, configurations, and execution status.

    From: /tf/active/vicechatdev/smartstat/models.py
  • class ChatSession_v1 77.6% similar

    A dataclass representing a chat session associated with a specific text section in a document, managing conversation messages, context, and references.

    From: /tf/active/vicechatdev/vice_ai/models.py
← Back to Browse