🔍 Code Extractor

class DatabaseInfo

Maturity: 46

A dataclass that encapsulates complete database schema information including tables, columns, relationships, and metadata for a specific database instance.

File:
/tf/active/vicechatdev/full_smartstat/dynamic_schema_discovery.py
Lines:
32 - 67
Complexity:
moderate

Purpose

DatabaseInfo serves as a comprehensive data container for storing and serializing complete database schema information. It aggregates metadata about a database including its structure (tables, columns), statistics (row counts), relationships between tables, and discovery metadata. The class provides serialization capabilities to convert the schema information into dictionary format suitable for JSON export, making it useful for schema documentation, analysis, and data catalog applications.

Source Code

@dataclass
class DatabaseInfo:
    """Complete database schema information"""
    database_name: str
    server_name: str
    discovery_timestamp: str
    total_tables: int
    total_columns: int
    total_rows: int
    tables: List[TableInfo]
    relationships: List[Dict[str, Any]]
    
    def to_dict(self) -> Dict[str, Any]:
        """Convert to dictionary for JSON serialization"""
        return {
            'database_name': self.database_name,
            'server_name': self.server_name,
            'discovery_timestamp': self.discovery_timestamp,
            'total_tables': self.total_tables,
            'total_columns': self.total_columns,
            'total_rows': self.total_rows,
            'tables': [asdict(table) for table in self.tables],
            'relationships': self.relationships,
            'columns_by_table': {table.name: [col['name'] for col in table.columns] for table in self.tables},
            'system_architecture': {
                'database_name': self.database_name,
                'server': self.server_name,
                'total_tables': self.total_tables,
                'total_columns': self.total_columns,
                'total_rows': self.total_rows,
                'generated_on': self.discovery_timestamp,
                'columns_by_table': {
                    table.name: [{'COLUMN_NAME': col['name'], **col} for col in table.columns] 
                    for table in self.tables
                }
            }
        }

Parameters

Name Type Default Kind
bases - -

Parameter Details

database_name: The name of the database being documented. This is a required string identifier for the database instance.

server_name: The name or address of the server hosting the database. Used to identify the database location in distributed environments.

discovery_timestamp: ISO format timestamp string indicating when the schema information was collected. Useful for tracking schema evolution over time.

total_tables: Integer count of the total number of tables discovered in the database schema.

total_columns: Integer count of the total number of columns across all tables in the database.

total_rows: Integer count of the total number of rows across all tables in the database. May be an estimate depending on discovery method.

tables: List of TableInfo objects, where each TableInfo contains detailed information about a single table including its columns, row count, and metadata.

relationships: List of dictionaries describing relationships between tables (foreign keys, joins). The class does not enforce a schema for these dictionaries; they typically carry relationship metadata such as source table, target table, and key columns.
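The three total_* fields are supplied by the caller rather than computed by the class, so they can be derived from the tables list before instantiation. The following is a minimal sketch under stated assumptions: the TableInfo stand-in (with a row_count field) and the compute_totals helper are hypothetical and not part of the original module.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass
class TableInfo:
    """Hypothetical stand-in; the real TableInfo may define more fields."""
    name: str
    columns: List[Dict[str, Any]]
    row_count: int


def compute_totals(tables: List[TableInfo]) -> Tuple[int, int, int]:
    """Return (total_tables, total_columns, total_rows) for a list of tables."""
    total_tables = len(tables)
    total_columns = sum(len(t.columns) for t in tables)
    total_rows = sum(t.row_count for t in tables)
    return total_tables, total_columns, total_rows


tables = [
    TableInfo("users", [{"name": "id"}, {"name": "email"}], 1000),
    TableInfo("orders", [{"name": "id"}, {"name": "user_id"}], 5000),
]
print(compute_totals(tables))  # (2, 4, 6000)
```

The returned tuple can be unpacked straight into the total_tables, total_columns, and total_rows arguments of DatabaseInfo.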

Return Value

As a dataclass, instantiation returns a DatabaseInfo object with all specified attributes. The to_dict() method returns a comprehensive dictionary containing all database information in a nested structure suitable for JSON serialization, including the original attributes plus computed fields like 'columns_by_table' and 'system_architecture'.

Class Interface

Methods

to_dict(self) -> Dict[str, Any]

Purpose: Converts the DatabaseInfo instance into a comprehensive dictionary structure suitable for JSON serialization, including computed fields for easier data access

Returns: A dictionary containing all database information with keys: 'database_name', 'server_name', 'discovery_timestamp', 'total_tables', 'total_columns', 'total_rows', 'tables' (list of table dicts), 'relationships', 'columns_by_table' (mapping of table names to column name lists), and 'system_architecture' (nested structure with detailed schema information)
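The two 'columns_by_table' entries in the returned dictionary have different shapes. The sketch below reproduces the two comprehensions from to_dict() as standalone functions so the difference is visible; the function names and the TableInfo stand-in are hypothetical and exist only for illustration.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class TableInfo:
    """Hypothetical stand-in; the real TableInfo may define more fields."""
    name: str
    columns: List[Dict[str, Any]]
    row_count: int


def column_names_by_table(tables: List[TableInfo]) -> Dict[str, List[str]]:
    # Same expression as the top-level 'columns_by_table' key in to_dict()
    return {t.name: [col["name"] for col in t.columns] for t in tables}


def column_details_by_table(tables: List[TableInfo]) -> Dict[str, List[Dict[str, Any]]]:
    # Same expression as system_architecture['columns_by_table'] in to_dict():
    # each column dict is copied and gains a 'COLUMN_NAME' key
    return {
        t.name: [{"COLUMN_NAME": col["name"], **col} for col in t.columns]
        for t in tables
    }


tables = [TableInfo("users", [{"name": "id", "type": "int"}], 1000)]
names = column_names_by_table(tables)
details = column_details_by_table(tables)

# Both shapes contain only plain dicts, lists, and strings here,
# so json.dumps can serialize them directly.
print(json.dumps({"names": names, "details": details}, indent=2))
```

Both functions index col['name'], so a column dictionary without a 'name' key raises KeyError, just as it would inside to_dict().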

Attributes

Name | Type | Description | Scope
database_name | str | The name identifier of the database | instance
server_name | str | The server hostname or address where the database is hosted | instance
discovery_timestamp | str | ISO format timestamp indicating when the schema was discovered/captured | instance
total_tables | int | Total count of tables in the database | instance
total_columns | int | Total count of columns across all tables | instance
total_rows | int | Total count of rows across all tables | instance
tables | List[TableInfo] | List of TableInfo objects containing detailed information about each table in the database | instance
relationships | List[Dict[str, Any]] | List of dictionaries describing relationships (foreign keys, joins) between tables | instance

Dependencies

  • dataclasses
  • typing

Required Imports

from dataclasses import dataclass, asdict
from typing import Dict, List, Any

Usage Example

from dataclasses import dataclass, asdict
from typing import Dict, List, Any
from datetime import datetime

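# DatabaseInfo is assumed to be defined as in the Source Code section above.
# Minimal stand-in for TableInfo; the real class may define more fields.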
@dataclass
class TableInfo:
    name: str
    columns: List[Dict[str, Any]]
    row_count: int

# Create sample table info
table1 = TableInfo(
    name='users',
    columns=[{'name': 'id', 'type': 'int'}, {'name': 'email', 'type': 'varchar'}],
    row_count=1000
)

table2 = TableInfo(
    name='orders',
    columns=[{'name': 'id', 'type': 'int'}, {'name': 'user_id', 'type': 'int'}],
    row_count=5000
)

# Instantiate DatabaseInfo
db_info = DatabaseInfo(
    database_name='production_db',
    server_name='db-server-01',
    discovery_timestamp=datetime.now().isoformat(),
    total_tables=2,
    total_columns=4,
    total_rows=6000,
    tables=[table1, table2],
    relationships=[{'from': 'orders.user_id', 'to': 'users.id', 'type': 'foreign_key'}]
)

# Convert to dictionary for serialization
db_dict = db_info.to_dict()
print(db_dict['database_name'])
print(db_dict['columns_by_table'])

# Access attributes directly
print(f"Database: {db_info.database_name}")
print(f"Total tables: {db_info.total_tables}")
for table in db_info.tables:
    print(f"Table: {table.name}, Columns: {len(table.columns)}")

Best Practices

  • Always ensure discovery_timestamp is in ISO format for consistency and parseability
  • Validate that total_tables, total_columns, and total_rows match the actual counts in the tables list before instantiation
  • Use the to_dict() method for JSON serialization rather than accessing attributes directly when exporting data
  • Ensure all TableInfo objects in the tables list have properly structured columns (list of dicts with 'name' key) to avoid KeyError in to_dict()
  • Dataclasses are mutable by default; if instances should be immutable, pass frozen=True to the @dataclass decorator
  • Consider validating relationships reference actual tables in the tables list to maintain referential integrity
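  • The to_dict() output is partly redundant: 'columns_by_table' appears at the top level (column names only) and again under 'system_architecture' (full column dicts with an added 'COLUMN_NAME' key), and the summary counts are repeated there as well; account for this when processing the output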
  • This class is designed for read-only schema representation; for schema modification operations, use separate database management tools
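The referential-integrity check suggested above could be sketched as follows. This assumes the relationship format used in the Usage Example ('from' and 'to' keys holding 'table.column' strings); the class itself does not define that format, and find_dangling_relationships is a hypothetical helper, not part of the module.

```python
from typing import Any, Dict, Iterable, List


def find_dangling_relationships(
    table_names: Iterable[str],
    relationships: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return relationships whose 'from' or 'to' table is not in table_names."""
    known = set(table_names)
    dangling = []
    for rel in relationships:
        # Assumed format: 'table.column' strings under the 'from' and 'to' keys
        endpoints = [str(rel.get("from", "")), str(rel.get("to", ""))]
        referenced = [endpoint.split(".", 1)[0] for endpoint in endpoints]
        if any(table not in known for table in referenced):
            dangling.append(rel)
    return dangling


relationships = [
    {"from": "orders.user_id", "to": "users.id", "type": "foreign_key"},
    {"from": "orders.product_id", "to": "products.id", "type": "foreign_key"},
]
dangling = find_dangling_relationships(["users", "orders"], relationships)
print(dangling)  # only the relationship pointing at the unknown 'products' table
```

With a DatabaseInfo instance, the table names would come from [table.name for table in db_info.tables]. A relationship missing either key is also reported as dangling.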

Similar Components

AI-powered semantic similarity - components with related functionality:

  • class TableInfo 83.0% similar

    A dataclass that encapsulates comprehensive metadata about a database table, including schema information, columns, keys, and data quality metrics.

    From: /tf/active/vicechatdev/full_smartstat/dynamic_schema_discovery.py
  • class DatabaseSchema 78.7% similar

    A dataclass that represents comprehensive database schema information, including table structures, columns, relationships, and categorizations for SQL database introspection and query generation.

    From: /tf/active/vicechatdev/full_smartstat/sql_query_generator.py
  • class DatabaseSchema_v1 73.1% similar

    A dataclass that represents database schema information, including table categories, relationships, and system architecture. Provides functionality to load schema from JSON files.

    From: /tf/active/vicechatdev/smartstat/sql_query_generator.py
  • class DataSource 52.7% similar

    A dataclass that represents configuration for various data sources, supporting file-based, SQL database, and query-based data access patterns.

    From: /tf/active/vicechatdev/vice_ai/models.py
  • class DataSource_v1 50.7% similar

    A dataclass that encapsulates configuration for various data sources used in analysis, supporting file-based, SQL database, and query-based data sources.

    From: /tf/active/vicechatdev/vice_ai/models.py