check_specific_corruption - Code Extractor

function check_specific_corruption

Maturity: 36

Detects and fixes specific corruption patterns in the chat_session_id column of a SQLite database's text_sections table, replacing invalid values with NULL.

File:
/tf/active/vicechatdev/vice_ai/check_specific_corruption.py

Lines:
7 - 67

Complexity:
moderate

Purpose

This function is designed to diagnose and repair data corruption in a SQLite database where chat_session_id fields contain invalid JSON-like patterns ('{}', '[]', 'null', or empty strings). It scans for these specific patterns, reports their occurrence with affected row IDs, automatically fixes them by setting the values to NULL, and displays a sample of remaining non-NULL values for verification. This is particularly useful for cleaning up database inconsistencies that may cause application errors.

Source Code

def check_specific_corruption():
    db_path = '/tf/active/vice_ai/documents.db'
    
    print("🔍 Checking for specific corruption patterns...")
    
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    
    try:
        # Check for the specific patterns the Flask logs are showing
        patterns = ['{}', '[]', 'null', '']
        
        for pattern in patterns:
            cursor.execute("""
                SELECT COUNT(*), GROUP_CONCAT(id) as section_ids
                FROM text_sections 
                WHERE chat_session_id = ?
            """, (pattern,))
            
            result = cursor.fetchone()
            count = result[0] if result[0] else 0
            section_ids = result[1] if result[1] else ""
            
            if count > 0:
                print(f"   🔴 Pattern '{pattern}': {count} rows -> {section_ids}")
                
                # Fix this specific pattern
                cursor.execute("""
                    UPDATE text_sections 
                    SET chat_session_id = NULL 
                    WHERE chat_session_id = ?
                """, (pattern,))
                
                updated = cursor.rowcount
                if updated > 0:
                    print(f"      ✅ Fixed {updated} rows with pattern '{pattern}'")
                    conn.commit()
            else:
                print(f"   ✅ No corruption with pattern '{pattern}'")
        
        # Now check what's left
        cursor.execute("""
            SELECT id, chat_session_id, title
            FROM text_sections 
            WHERE chat_session_id IS NOT NULL
            ORDER BY id
            LIMIT 5
        """)
        
        remaining_rows = cursor.fetchall()
        print(f"\n📊 Sample of remaining non-NULL chat_session_id values:")
        for row in remaining_rows:
            section_id, chat_session_id, title = row
            title_short = title[:30] + "..." if len(title) > 30 else title
            print(f"   📄 {section_id}: '{chat_session_id}' -> {title_short}")
        
    except Exception as e:
        print(f"❌ Error: {e}")
        conn.rollback()
    finally:
        conn.close()

Return Value

This function does not return any value (implicitly returns None). It performs side effects by modifying the database and printing diagnostic information to stdout.

Dependencies

sqlite3

Required Imports

import sqlite3

Usage Example

import sqlite3

def check_specific_corruption():
    db_path = '/tf/active/vice_ai/documents.db'
    
    print("🔍 Checking for specific corruption patterns...")
    
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    
    try:
        patterns = ['{}', '[]', 'null', '']
        
        for pattern in patterns:
            cursor.execute("""
                SELECT COUNT(*), GROUP_CONCAT(id) as section_ids
                FROM text_sections 
                WHERE chat_session_id = ?
            """, (pattern,))
            
            result = cursor.fetchone()
            count = result[0] if result[0] else 0
            section_ids = result[1] if result[1] else ""
            
            if count > 0:
                print(f"   🔴 Pattern '{pattern}': {count} rows -> {section_ids}")
                
                cursor.execute("""
                    UPDATE text_sections 
                    SET chat_session_id = NULL 
                    WHERE chat_session_id = ?
                """, (pattern,))
                
                updated = cursor.rowcount
                if updated > 0:
                    print(f"      ✅ Fixed {updated} rows with pattern '{pattern}'")
                    conn.commit()
            else:
                print(f"   ✅ No corruption with pattern '{pattern}'")
        
        cursor.execute("""
            SELECT id, chat_session_id, title
            FROM text_sections 
            WHERE chat_session_id IS NOT NULL
            ORDER BY id
            LIMIT 5
        """)
        
        remaining_rows = cursor.fetchall()
        print(f"\n📊 Sample of remaining non-NULL chat_session_id values:")
        for row in remaining_rows:
            section_id, chat_session_id, title = row
            title_short = title[:30] + "..." if len(title) > 30 else title
            print(f"   📄 {section_id}: '{chat_session_id}' -> {title_short}")
        
    except Exception as e:
        print(f"❌ Error: {e}")
        conn.rollback()
    finally:
        conn.close()

# Execute the function
check_specific_corruption()

Best Practices

Always backup the database before running this function as it performs UPDATE operations
The function uses hardcoded database path '/tf/active/vice_ai/documents.db' - consider parameterizing this for reusability
The function commits changes immediately after each pattern fix - consider batching commits for better performance
Error handling includes rollback on exceptions, but successful partial updates are committed and won't be rolled back
The function prints diagnostic information to stdout - redirect or capture output if running in production
Consider adding a dry-run mode that reports issues without making changes
The GROUP_CONCAT function may truncate results if there are many affected IDs - check SQLite limits
The function only checks for 4 specific patterns - extend the patterns list if other corruption types are discovered

Similar Components

AI-powered semantic similarity - components with related functionality:

function check_and_fix_corruption 92.3% similar

Scans a SQLite database for corrupted chat_session_id values in the text_sections table and automatically fixes them by setting invalid entries to NULL.
From: /tf/active/vicechatdev/vice_ai/direct_corruption_checker.py
function direct_fix 84.8% similar

A database maintenance function that detects and fixes corrupted chat_session_id values in a SQLite database's text_sections table by identifying invalid patterns and setting them to NULL.
From: /tf/active/vicechatdev/vice_ai/direct_sqlite_fix.py
function cleanup_corrupted_chat_session_ids 82.2% similar

Database maintenance function that identifies and fixes corrupted chat_session_id values in the text_sections table by replacing invalid string representations with NULL.
From: /tf/active/vicechatdev/vice_ai/cleanup_database.py
function raw_cleanup_database 79.0% similar

Performs raw database cleanup on a SQLite database to identify and fix corrupted chat_session_id values in the text_sections table by converting invalid string representations ('{}', '[]', 'null', '') to NULL.
From: /tf/active/vicechatdev/vice_ai/raw_database_cleanup.py
function inspect_database 75.6% similar

Inspects a SQLite database at a hardcoded path, examining table structure, row counts, and identifying potentially corrupted chat_session_id values in the text_sections table.
From: /tf/active/vicechatdev/vice_ai/database_inspector.py

← Back to Browse

Assistant

Hi! I can help improve this code. Tell me what you'd like to enhance (e.g., "add error handling", "optimize performance", "improve readability", "add type hints").

Code Comparison

Original Code

                            def check_specific_corruption():
    db_path = '/tf/active/vice_ai/documents.db'
    
    print("🔍 Checking for specific corruption patterns...")
    
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()
    
    try:
        # Check for the specific patterns the Flask logs are showing
        patterns = ['{}', '[]', 'null', '']
        
        for pattern in patterns:
            cursor.execute("""
                SELECT COUNT(*), GROUP_CONCAT(id) as section_ids
                FROM text_sections 
                WHERE chat_session_id = ?
            """, (pattern,))
            
            result = cursor.fetchone()
            count = result[0] if result[0] else 0
            section_ids = result[1] if result[1] else ""
            
            if count > 0:
                print(f"   🔴 Pattern '{pattern}': {count} rows -> {section_ids}")
                
                # Fix this specific pattern
                cursor.execute("""
                    UPDATE text_sections 
                    SET chat_session_id = NULL 
                    WHERE chat_session_id = ?
                """, (pattern,))
                
                updated = cursor.rowcount
                if updated > 0:
                    print(f"      ✅ Fixed {updated} rows with pattern '{pattern}'")
                    conn.commit()
            else:
                print(f"   ✅ No corruption with pattern '{pattern}'")
        
        # Now check what's left
        cursor.execute("""
            SELECT id, chat_session_id, title
            FROM text_sections 
            WHERE chat_session_id IS NOT NULL
            ORDER BY id
            LIMIT 5
        """)
        
        remaining_rows = cursor.fetchall()
        print(f"\n📊 Sample of remaining non-NULL chat_session_id values:")
        for row in remaining_rows:
            section_id, chat_session_id, title = row
            title_short = title[:30] + "..." if len(title) > 30 else title
            print(f"   📄 {section_id}: '{chat_session_id}' -> {title_short}")
        
    except Exception as e:
        print(f"❌ Error: {e}")
        conn.rollback()
    finally:
        conn.close()
                        

Improved Code

🔍 Code Extractor

function check_specific_corruption

Purpose

Source Code

Return Value

Dependencies

Required Imports

Usage Example

Best Practices

Tags

Similar Components

function check_and_fix_corruption 92.3% similar

function direct_fix 84.8% similar

function cleanup_corrupted_chat_session_ids 82.2% similar

function raw_cleanup_database 79.0% similar

function inspect_database 75.6% similar

function check_specific_corruption

Purpose

Source Code

Return Value

Dependencies

Required Imports

Usage Example

Best Practices

Tags

Similar Components

function check_and_fix_corruption 92.3% similar

function direct_fix 84.8% similar

function cleanup_corrupted_chat_session_ids 82.2% similar

function raw_cleanup_database 79.0% similar

function inspect_database 75.6% similar

✨ Improve Code: check_specific_corruption

Code Comparison