function validate_schema

Maturity: 56

Validates that a Neo4j database schema contains all required constraints and node labels for a controlled document management system.

File: /tf/active/vicechatdev/CDocs/db/schema_manager.py
Lines: 348 - 423
Complexity: moderate

Purpose

This function validates the schema of a Neo4j graph database used in a document management system. It checks that the required uniqueness constraints (for documents, approvals, reviews, and related node types) exist by name, and that at least one node exists for each required node label. The function is decorated with a guard_execution cooldown of 5000ms to prevent excessive validation calls. It returns a tuple of a success flag and a list of any issues found, which makes it suitable for health checks, deployment validation, and system diagnostics.
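Because of the cooldown, a caller that polls the schema status (for example from a health endpoint) may want to throttle on its own side as well. The sketch below is one possible caller-side cache, not the guard_execution implementation, which is not shown on this page. The class name CachedSchemaCheck, its parameters, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable, List, Optional, Tuple

Result = Tuple[bool, List[str]]


class CachedSchemaCheck:
    """Hypothetical caller-side cache: re-run `check` at most once per interval."""

    def __init__(
        self,
        check: Callable[[], Result],
        min_interval_s: float = 5.0,  # mirrors the 5000ms cooldown described above
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._check = check
        self._min_interval_s = min_interval_s
        self._clock = clock
        self._last_run: Optional[float] = None
        self._last_result: Optional[Result] = None

    def __call__(self) -> Result:
        now = self._clock()
        # Run the real check on the first call, or once the interval has elapsed.
        if self._last_result is None or now - self._last_run >= self._min_interval_s:
            self._last_result = self._check()
            self._last_run = now
        # Otherwise return the most recent result without touching the database.
        return self._last_result
```

A caller would wrap the real function, for example `check = CachedSchemaCheck(lambda: validate_schema(driver))`, and then call `check()` as often as needed.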

Source Code

def validate_schema(driver: Driver) -> Tuple[bool, List[str]]:
    """
    Validate that the database schema is correctly set up.
            
    Args:
        driver: Neo4j driver instance
        
    Returns:
        Tuple of (success boolean, list of issues found)
    """
    logger.info("Beginning schema validation")
    issues = []
    
    try:
        with driver.session() as session:
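            # Check for required constraints
            # In Neo4j 4.x/5.x, SHOW CONSTRAINTS returns "id" as its first
            # column, so read the "name" column explicitly instead of row[0].
            constraints = session.run("SHOW CONSTRAINTS")
            
            constraint_names = [record["name"] for record in constraints]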
            
            required_constraints = [
                "cdocs_uid_unique",
                "controlled_document_uid_unique",
                "document_version_uid_unique",
                "review_cycle_uid_unique",
                "approval_uid_unique",
                "review_comment_uid_unique",
                "document_type_code_unique",
                "department_code_unique",
                # Add new approval constraints
                "approval_step_uid_unique",
                "approver_uid_unique",
                "approval_workflow_uid_unique",
                # Add new approval cycle constraints
                "approval_cycle_uid_unique",
                "approver_assignment_uid_unique",
                "approval_comment_uid_unique"
            ]
            
            for constraint in required_constraints:
                if constraint not in constraint_names:
                    issues.append(f"Missing constraint: {constraint}")
            
            # Check for required node types
            required_node_labels = [
                NodeLabels.CDOCS,
                NodeLabels.APPROVAL,
                NodeLabels.APPROVAL_STEP,
                NodeLabels.APPROVER,
                # Add new approval cycle node types
                NodeLabels.APPROVAL_CYCLE,
                NodeLabels.APPROVER_ASSIGNMENT,
                NodeLabels.APPROVAL_COMMENT
            ]
            
            for label in required_node_labels:
                result = session.run(
                    f"MATCH (n:{label}) RETURN count(n) > 0 as exists"
                )
                record = result.single()
                if not record or not record["exists"]:
                    issues.append(f"No nodes found with label: {label}")
            
            # Rest of existing validation code...
            
            if issues:
                logger.warning(f"Schema validation found issues: {issues}")
                return False, issues
            else:
                logger.info("Schema validation passed")
                return True, []
                
    except Exception as e:
        logger.error(f"Schema validation failed with error: {e}")
        issues.append(f"Validation error: {str(e)}")
        return False, issues

Parameters

Name    Type    Default  Kind
driver  Driver  -        positional_or_keyword

Parameter Details

driver: A Neo4j Driver instance that provides connection to the Neo4j database. This should be an active, authenticated driver object capable of creating sessions. The driver is used to execute Cypher queries for constraint checking and node label verification.

Return Value

Type: Tuple[bool, List[str]]

Returns a Tuple[bool, List[str]]. The first element is True if all checks pass and False if any issue is found or an exception is raised during validation. The second element is a list of strings describing the issues; it is empty on success. Issue strings follow the patterns 'Missing constraint: {constraint_name}', 'No nodes found with label: {label}', or, when an exception is caught, 'Validation error: {error}'.
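Since the issues are plain strings with fixed prefixes, a caller can sort them into categories before reporting. The following is a minimal sketch of that idea; the helper name group_issues and the category keys are hypothetical and not part of CDocs, and the prefixes are taken from the source code shown above.

```python
from typing import Dict, List


def group_issues(issues: List[str]) -> Dict[str, List[str]]:
    """Group validate_schema issue strings by their message prefix."""
    prefixes = {
        "Missing constraint: ": "missing_constraints",
        "No nodes found with label: ": "empty_labels",
        "Validation error: ": "errors",
    }
    grouped: Dict[str, List[str]] = {name: [] for name in prefixes.values()}
    grouped["other"] = []

    for issue in issues:
        for prefix, name in prefixes.items():
            if issue.startswith(prefix):
                # Keep only the part after the prefix (constraint, label, or error text).
                grouped[name].append(issue[len(prefix):])
                break
        else:
            # No known prefix matched; keep the full string for inspection.
            grouped["other"].append(issue)
    return grouped
```

Grouping this way lets a report distinguish a schema that was never initialized (many missing constraints) from a connection failure (a single error entry).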

Dependencies

  • neo4j
  • logging
  • typing
  • uuid
  • traceback
  • CDocs

Required Imports

from typing import Tuple, List
from neo4j import Driver
import logging
from CDocs import guard_execution

Usage Example

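import logging

from neo4j import GraphDatabase

# Assumed import path, inferred from the file location listed above
from CDocs.db.schema_manager import validate_schema

# Configure logging so the function's log messages are visible
logging.basicConfig(level=logging.INFO)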

# Create Neo4j driver
driver = GraphDatabase.driver(
    "bolt://localhost:7687",
    auth=("neo4j", "password")
)

try:
    # Validate schema
    is_valid, issues = validate_schema(driver)
    
    if is_valid:
        print("Schema validation passed successfully")
    else:
        print(f"Schema validation failed with {len(issues)} issues:")
        for issue in issues:
            print(f"  - {issue}")
finally:
    driver.close()

Best Practices

  • Always close the Neo4j driver after validation to prevent connection leaks
  • The function is protected by a 5000ms cooldown via the guard_execution decorator, so avoid calling it too frequently
  • Check both the boolean return value and the issues list for comprehensive error handling
  • Ensure NodeLabels enum/class is properly defined before calling this function
  • Run this validation during application startup or deployment to catch schema issues early
  • The function logs validation results - ensure logging is properly configured to capture these messages
  • Consider running this as part of CI/CD pipeline to validate database migrations
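  • The function only checks that at least one node exists for each required label, so an empty but otherwise correctly constrained database will still report issues. It doesn't validate node properties or relationships, so additional validation may be needed for complete schema verification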
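For the startup and CI/CD practices above, the result can be turned into a process exit code. The sketch below assumes a hypothetical helper named schema_gate that takes the validation function as a parameter, so it can be exercised with a stub instead of a live Neo4j instance.

```python
from typing import Any, Callable, List, Tuple


def schema_gate(
    validate: Callable[[Any], Tuple[bool, List[str]]],
    driver: Any,
) -> int:
    """Return 0 if the schema is valid, 1 otherwise (hypothetical helper)."""
    is_valid, issues = validate(driver)
    # Check both the flag and the list, as recommended above.
    if is_valid and not issues:
        return 0
    for issue in issues:
        print(f"schema issue: {issue}")
    return 1
```

In a pipeline step this could be called as `sys.exit(schema_gate(validate_schema, driver))`, so that any reported issue fails the job.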

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function initialize_schema 75.7% similar

    Initializes the Neo4j database schema by creating required constraints, indexes, root nodes, audit trails, and migrating existing data structures to current schema versions.

    From: /tf/active/vicechatdev/CDocs/db/schema_manager.py
  • function init_database 60.1% similar

    Initializes a Neo4j database with required schema constraints, creates an AuditTrail node, and migrates existing audit events.

    From: /tf/active/vicechatdev/CDocs/db/__init__.py
  • function update_schema 59.2% similar

    Updates a Neo4j database schema to match a specific version, enabling schema migrations during system upgrades.

    From: /tf/active/vicechatdev/CDocs/db/schema_manager.py
  • function validate_and_fix_document_permissions 58.3% similar

    Validates and optionally fixes document sharing permissions for controlled documents in a Neo4j database, processing documents in configurable batches with detailed progress tracking and error handling.

    From: /tf/active/vicechatdev/CDocs/utils/sharing_validator.py
  • function generate_neo4j_schema_report 56.7% similar

    Generates a comprehensive schema report of a Neo4j graph database, including node labels, relationships, properties, constraints, indexes, and sample data, outputting multiple file formats (JSON, HTML, Python snippets, Cypher examples).

    From: /tf/active/vicechatdev/neo4j_schema_report.py