šŸ” Code Extractor

function check_specific_corruption

Maturity: 36

Detects and fixes specific corruption patterns in the chat_session_id column of a SQLite database's text_sections table, replacing invalid values with NULL.

File:
/tf/active/vicechatdev/vice_ai/check_specific_corruption.py
Lines:
7 - 67
Complexity:
moderate

Purpose

This function is designed to diagnose and repair data corruption in a SQLite database where chat_session_id fields contain invalid JSON-like patterns ('{}', '[]', 'null', or empty strings). It scans for these specific patterns, reports their occurrence with affected row IDs, automatically fixes them by setting the values to NULL, and displays a sample of remaining non-NULL values for verification. This is particularly useful for cleaning up database inconsistencies that may cause application errors.

Source Code

def check_specific_corruption():
    db_path = '/tf/active/vice_ai/documents.db'
    
    print("šŸ” Checking for specific corruption patterns...")
    
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    
    try:
        # Check for the specific patterns the Flask logs are showing
        patterns = ['{}', '[]', 'null', '']
        
        for pattern in patterns:
            cursor.execute("""
                SELECT COUNT(*), GROUP_CONCAT(id) as section_ids
                FROM text_sections 
                WHERE chat_session_id = ?
            """, (pattern,))
            
            result = cursor.fetchone()
            count = result[0] if result[0] else 0
            section_ids = result[1] if result[1] else ""
            
            if count > 0:
                print(f"   šŸ”“ Pattern '{pattern}': {count} rows -> {section_ids}")
                
                # Fix this specific pattern
                cursor.execute("""
                    UPDATE text_sections 
                    SET chat_session_id = NULL 
                    WHERE chat_session_id = ?
                """, (pattern,))
                
                updated = cursor.rowcount
                if updated > 0:
                    print(f"      āœ… Fixed {updated} rows with pattern '{pattern}'")
                    conn.commit()
            else:
                print(f"   āœ… No corruption with pattern '{pattern}'")
        
        # Now check what's left
        cursor.execute("""
            SELECT id, chat_session_id, title
            FROM text_sections 
            WHERE chat_session_id IS NOT NULL
            ORDER BY id
            LIMIT 5
        """)
        
        remaining_rows = cursor.fetchall()
        print(f"\nšŸ“Š Sample of remaining non-NULL chat_session_id values:")
        for row in remaining_rows:
            section_id, chat_session_id, title = row
            title_short = title[:30] + "..." if len(title) > 30 else title
            print(f"   šŸ“„ {section_id}: '{chat_session_id}' -> {title_short}")
        
    except Exception as e:
        print(f"āŒ Error: {e}")
        conn.rollback()
    finally:
        conn.close()

Return Value

This function does not return any value (implicitly returns None). It performs side effects by modifying the database and printing diagnostic information to stdout.

Dependencies

  • sqlite3

Required Imports

import sqlite3

Usage Example

import sqlite3

def check_specific_corruption():
    db_path = '/tf/active/vice_ai/documents.db'
    
    print("šŸ” Checking for specific corruption patterns...")
    
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    
    try:
        patterns = ['{}', '[]', 'null', '']
        
        for pattern in patterns:
            cursor.execute("""
                SELECT COUNT(*), GROUP_CONCAT(id) as section_ids
                FROM text_sections 
                WHERE chat_session_id = ?
            """, (pattern,))
            
            result = cursor.fetchone()
            count = result[0] if result[0] else 0
            section_ids = result[1] if result[1] else ""
            
            if count > 0:
                print(f"   šŸ”“ Pattern '{pattern}': {count} rows -> {section_ids}")
                
                cursor.execute("""
                    UPDATE text_sections 
                    SET chat_session_id = NULL 
                    WHERE chat_session_id = ?
                """, (pattern,))
                
                updated = cursor.rowcount
                if updated > 0:
                    print(f"      āœ… Fixed {updated} rows with pattern '{pattern}'")
                    conn.commit()
            else:
                print(f"   āœ… No corruption with pattern '{pattern}'")
        
        cursor.execute("""
            SELECT id, chat_session_id, title
            FROM text_sections 
            WHERE chat_session_id IS NOT NULL
            ORDER BY id
            LIMIT 5
        """)
        
        remaining_rows = cursor.fetchall()
        print(f"\nšŸ“Š Sample of remaining non-NULL chat_session_id values:")
        for row in remaining_rows:
            section_id, chat_session_id, title = row
            title_short = title[:30] + "..." if len(title) > 30 else title
            print(f"   šŸ“„ {section_id}: '{chat_session_id}' -> {title_short}")
        
    except Exception as e:
        print(f"āŒ Error: {e}")
        conn.rollback()
    finally:
        conn.close()

# Execute the function
check_specific_corruption()

Best Practices

  • Always backup the database before running this function as it performs UPDATE operations
  • The function uses hardcoded database path '/tf/active/vice_ai/documents.db' - consider parameterizing this for reusability
  • The function commits changes immediately after each pattern fix - consider batching commits for better performance
  • Error handling includes rollback on exceptions, but successful partial updates are committed and won't be rolled back
  • The function prints diagnostic information to stdout - redirect or capture output if running in production
  • Consider adding a dry-run mode that reports issues without making changes
  • The GROUP_CONCAT function may truncate results if there are many affected IDs - check SQLite limits
  • The function only checks for 4 specific patterns - extend the patterns list if other corruption types are discovered

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function check_and_fix_corruption 92.3% similar

    Scans a SQLite database for corrupted chat_session_id values in the text_sections table and automatically fixes them by setting invalid entries to NULL.

    From: /tf/active/vicechatdev/vice_ai/direct_corruption_checker.py
  • function direct_fix 84.8% similar

    A database maintenance function that detects and fixes corrupted chat_session_id values in a SQLite database's text_sections table by identifying invalid patterns and setting them to NULL.

    From: /tf/active/vicechatdev/vice_ai/direct_sqlite_fix.py
  • function cleanup_corrupted_chat_session_ids 82.2% similar

    Database maintenance function that identifies and fixes corrupted chat_session_id values in the text_sections table by replacing invalid string representations with NULL.

    From: /tf/active/vicechatdev/vice_ai/cleanup_database.py
  • function raw_cleanup_database 79.0% similar

    Performs raw database cleanup on a SQLite database to identify and fix corrupted chat_session_id values in the text_sections table by converting invalid string representations ('{}', '[]', 'null', '') to NULL.

    From: /tf/active/vicechatdev/vice_ai/raw_database_cleanup.py
  • function inspect_database 75.6% similar

    Inspects a SQLite database at a hardcoded path, examining table structure, row counts, and identifying potentially corrupted chat_session_id values in the text_sections table.

    From: /tf/active/vicechatdev/vice_ai/database_inspector.py
← Back to Browse