function search_documents_v1

Maturity: 61

Searches for controlled documents in a Neo4j graph database based on multiple optional filter criteria including text query, document type, department, status, and owner.

File: /tf/active/vicechatdev/CDocs/controllers/document_controller.py
Lines: 489 - 583
Complexity: moderate

Purpose

This function provides a flexible search interface for retrieving controlled documents from a Neo4j database. It dynamically constructs Cypher queries based on provided filter parameters, supports text search across title and description fields, and returns matching documents sorted by creation date. The function is designed for document management systems where users need to find documents based on various metadata attributes.
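
As a rough illustration of the dynamic construction described above, the sketch below isolates the query-building step as a pure function, so the generated Cypher can be inspected without a database connection. The name `build_search_query` is hypothetical and not part of CDocs. Passing the limit as a `$limit` parameter is also an assumption; the original interpolates `int(limit)` into the query text instead.

```python
from typing import Any, Dict, Optional, Tuple


def build_search_query(
    query: Optional[str] = None,
    doc_type: Optional[str] = None,
    department: Optional[str] = None,
    status: Optional[str] = None,
    owner: Optional[str] = None,
    limit: int = 100,
) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: return (cypher, params) for the given filters."""
    where_clauses = []
    # Assumption: the limit is sent as a query parameter, not interpolated.
    params: Dict[str, Any] = {"limit": int(limit)}

    if query:
        where_clauses.append(
            "(d.title CONTAINS $query OR d.description CONTAINS $query)"
        )
        params["query"] = query

    # Exact-match filters: (value, node property, parameter name)
    exact_filters = [
        (doc_type, "doc_type", "doc_type"),
        (department, "department", "department"),
        (status, "status", "status"),
        (owner, "owner_id", "owner"),
    ]
    for value, prop, name in exact_filters:
        if value:
            where_clauses.append(f"d.{prop} = ${name}")
            params[name] = value

    lines = ["MATCH (d:ControlledDocument)"]
    if where_clauses:
        lines.append("WHERE " + " AND ".join(where_clauses))
    lines += ["RETURN d", "ORDER BY d.created_date DESC", "LIMIT $limit"]
    return "\n".join(lines), params


cypher, params = build_search_query(query="safety", status="PUBLISHED", limit=50)
print(cypher)
print(params)
```

Keeping the builder free of database calls makes it straightforward to unit-test each filter combination separately from query execution.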

Source Code

def search_documents(query=None, doc_type=None, department=None, status=None, owner=None, limit=100, user=None):
    """
    Search for documents based on criteria.
    
    Parameters
    ----------
    query : str, optional
        Text search query
    doc_type : str, optional
        Document type to filter by
    department : str, optional
        Department to filter by
    status : str, optional
        Status to filter by
    owner : str, optional
        Owner UID to filter by
    limit : int, optional
        Maximum number of results to return
    user : DocUser, optional
        The current user (for permission filtering)
        
    Returns
    -------
    List[Dict[str, Any]]
        List of document dictionaries matching the search criteria
    """
    try:
        from CDocs.db import db_operations
        
        logger.info("Controller action: search_documents")
        
        # Build the Cypher query
        cypher_query = """
        MATCH (d:ControlledDocument)
        """
        
        # Add optional filters
        where_clauses = []
        params = {}
        
        if query:
            where_clauses.append("(d.title CONTAINS $query OR d.description CONTAINS $query)")
            params["query"] = query
        
        if doc_type:
            where_clauses.append("d.doc_type = $doc_type")
            params["doc_type"] = doc_type
        
        if department:
            where_clauses.append("d.department = $department")
            params["department"] = department
        
        if status:
            where_clauses.append("d.status = $status")
            params["status"] = status
        
        if owner:
            where_clauses.append("d.owner_id = $owner")
            params["owner"] = owner
        
        # Add WHERE clause if we have any conditions
        if where_clauses:
            cypher_query += "WHERE " + " AND ".join(where_clauses)
        
        # Add permission filtering if user is provided
        # This is commented out for now as it depends on schema details
        # if user and hasattr(user, 'uid') and user.role != 'ADMIN':
        #    # Only add more WHERE conditions if we already have some
        #    connector = "AND" if where_clauses else "WHERE"
        #    cypher_query += f" {connector} (d.owner_id = $user_id OR d.is_public = true)"
        #    params["user_id"] = user.uid
        
        # Add RETURN clause with LIMIT
        cypher_query += f"""
        RETURN d 
        ORDER BY d.created_date DESC
        LIMIT {int(limit)}
        """
        
        # Execute query
        result = db_operations.run_query(cypher_query, params)
        
        # Process results into a list of document dictionaries
        documents = []
        if result:
            for record in result:
                if 'd' in record:
                    document = dict(record['d'])
                    documents.append(document)
        
        return documents
        
    except Exception as e:
        logger.error(f"Error in controller action search_documents: {e}")
        raise

Parameters

Name        Type  Default  Kind
query       -     None     positional_or_keyword
doc_type    -     None     positional_or_keyword
department  -     None     positional_or_keyword
status      -     None     positional_or_keyword
owner       -     None     positional_or_keyword
limit       -     100      positional_or_keyword
user        -     None     positional_or_keyword

Parameter Details

query: Optional text string to search within document titles and descriptions using CONTAINS matching. Case-sensitive partial text matching is performed.
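
Because `CONTAINS` is case-sensitive, a search for 'safety' will not match a title that contains 'Safety'. One possible workaround, sketched below, lowercases the stored text with Cypher's `toLower()` function and lowercases the search term before sending it. The helper name `case_insensitive_text_clause` is hypothetical, and this variant is not what the original function does.

```python
from typing import Dict, Tuple


def case_insensitive_text_clause(query: str) -> Tuple[str, Dict[str, str]]:
    """Hypothetical helper: build a case-insensitive title/description filter."""
    clause = (
        "(toLower(d.title) CONTAINS $query "
        "OR toLower(d.description) CONTAINS $query)"
    )
    # Lowercase the search term once on the client side.
    return clause, {"query": query.lower()}


clause, params = case_insensitive_text_clause("Safety")
print(clause)
print(params)
```

Wrapping a property in `toLower()` may prevent the database from using an index on that property, so this approach could be slower on large graphs.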

doc_type: Optional string to filter documents by their type (e.g., 'Policy', 'Procedure', 'Work Instruction'). Must match the doc_type property exactly.

department: Optional string to filter documents by department ownership. Must match the department property exactly.

status: Optional string to filter documents by their current status (e.g., 'DRAFT', 'PUBLISHED', 'ARCHIVED'). Must match the status property exactly.

owner: Optional string representing the owner's UID to filter documents by ownership. Must match the owner_id property exactly.

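limit: Integer specifying the maximum number of results to return. Defaults to 100. The value is passed through int() before being interpolated into the query, so it should be a positive integer or a value that converts to one.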
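
The function converts `limit` with `int()` but does not otherwise validate it, so a non-numeric value raises and a negative value would likely be rejected by the database. A caller could normalize the value first, along the lines of the sketch below. The helper name `normalize_limit` and the ceiling of 1000 are assumptions chosen for illustration.

```python
def normalize_limit(limit, default=100, maximum=1000):
    """Hypothetical helper: coerce limit to an int within [1, maximum]."""
    try:
        value = int(limit)
    except (TypeError, ValueError):
        # Not convertible to an integer: fall back to the default.
        return default
    if value < 1:
        # Zero or negative limits are treated as "not specified".
        return default
    return min(value, maximum)


print(normalize_limit("25"))
print(normalize_limit(None))
print(normalize_limit(-5))
print(normalize_limit(10**6))
```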

user: Optional DocUser object representing the current user. Intended for permission filtering, but that logic is commented out in the implementation, so the argument is currently ignored. The commented-out code expects 'uid' and 'role' attributes.
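
If permission filtering were enabled, one way to express it is to append the condition to the list of WHERE clauses before they are joined, which avoids choosing between "AND" and "WHERE" by hand. The sketch below follows the commented-out logic in the source; `StubUser` and `add_permission_filter` are hypothetical names, and the `is_public` property and the 'ADMIN' role come from that commented-out code and may not match the actual schema.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class StubUser:
    """Stand-in for DocUser; only the attributes used here are modeled."""
    uid: str
    role: str


def add_permission_filter(
    where_clauses: List[str],
    params: Dict[str, Any],
    user: Optional[StubUser],
) -> None:
    """Hypothetical helper: restrict results for non-admin users, in place."""
    if user is None or not hasattr(user, "uid"):
        return
    if getattr(user, "role", None) == "ADMIN":
        return
    # Assumption: documents carry owner_id and is_public properties.
    where_clauses.append("(d.owner_id = $user_id OR d.is_public = true)")
    params["user_id"] = user.uid


clauses = ["d.status = $status"]
params = {"status": "PUBLISHED"}
add_permission_filter(clauses, params, StubUser(uid="user123", role="EDITOR"))
where = "WHERE " + " AND ".join(clauses)
print(where)
```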

Return Value

Returns a List[Dict[str, Any]] containing document dictionaries. Each dictionary represents a ControlledDocument node from Neo4j with all its properties (e.g., title, description, doc_type, department, status, owner_id, created_date). Returns an empty list if no documents match the criteria. Results are ordered by created_date in descending order (newest first).
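
Each dictionary contains only the properties actually stored on its node, so a given key may be absent. The sketch below reads results defensively with `dict.get`. The helper name `summarize`, the fallback strings, and the sample data are invented for illustration.

```python
from typing import Any, Dict, List


def summarize(documents: List[Dict[str, Any]]) -> List[str]:
    """Hypothetical helper: one display line per document, tolerating missing keys."""
    lines = []
    for doc in documents:
        title = doc.get("title", "(untitled)")
        status = doc.get("status", "UNKNOWN")
        lines.append(f"{title} [{status}]")
    return lines


# Sample data standing in for the list returned by search_documents.
sample = [
    {"title": "Lab Safety Policy", "status": "PUBLISHED"},
    {"title": "Draft Procedure"},
]
for line in summarize(sample):
    print(line)
```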

Dependencies

  • logging
  • CDocs.db.db_operations
  • CDocs.controllers (for log_controller_action decorator)

Required Imports

import logging
from CDocs.controllers import log_controller_action

Conditional/Optional Imports

The following import is executed inside the function body rather than at module level:

from CDocs.db import db_operations

It is deferred, not optional: the import runs on every call, so CDocs.db must be importable wherever the function is used.

Usage Example

# Basic text search
results = search_documents(query='safety', limit=50)

# Filter by multiple criteria
results = search_documents(
    doc_type='Policy',
    department='Engineering',
    status='PUBLISHED',
    limit=25
)

# Search with owner filter
results = search_documents(
    query='quality',
    owner='user123',
    limit=10
)

# Get all documents (up to limit)
all_docs = search_documents(limit=100)

# Process results
for doc in results:
    print(f"Title: {doc['title']}, Status: {doc['status']}")

Best Practices

  • An empty list means no documents matched; iterating over it is safe, but check that it is non-empty before indexing into the results
  • Use appropriate limit values to avoid performance issues with large datasets
  • The query parameter performs case-sensitive CONTAINS matching - consider normalizing search terms
  • Filter parameters must match database values exactly - no fuzzy matching is performed
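  • The function logs each call with logger.info and each failure with logger.error; the source shown here does not apply the log_controller_action decorator listed under Dependencies
  • Permission filtering is currently commented out, so the user argument has no effect and every caller sees all matching documents - implement user-based access control if needed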
  • Wrap calls in try-except blocks as the function re-raises exceptions after logging
  • The limit parameter is converted to int and directly interpolated into the query - ensure it's a valid positive integer
  • Results are always sorted by created_date DESC - cannot be customized without modifying the function
  • The function expects db_operations.run_query to return records with a 'd' key containing document nodes
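
Since the function re-raises after logging, callers decide how a failure should surface. The sketch below shows one option: a wrapper that logs the error and falls back to an empty list. The name `search_or_empty` is hypothetical, and `failing_search` is a stand-in used here in place of the real function. Whether returning an empty list is acceptable depends on the caller, because it hides the difference between "no matches" and "search failed".

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)


def search_or_empty(
    search_fn: Callable[..., List[Dict[str, Any]]],
    **filters: Any,
) -> List[Dict[str, Any]]:
    """Hypothetical wrapper: call search_fn and fall back to [] on failure."""
    try:
        return search_fn(**filters)
    except Exception:
        # logger.exception records the message together with the traceback.
        logger.exception("Document search failed for filters %r", filters)
        return []


def failing_search(**_filters: Any) -> List[Dict[str, Any]]:
    """Stand-in that simulates a database error."""
    raise RuntimeError("database unavailable")


results = search_or_empty(failing_search, query="safety", limit=10)
print(results)
```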

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function search_documents 94.7% similar

    Searches for documents in a Neo4j graph database based on multiple optional filter criteria including text query, document type, department, status, and owner.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function get_documents 85.9% similar

    Retrieves controlled documents from a Neo4j database with comprehensive filtering, permission-based access control, pagination, and full-text search capabilities.

    From: /tf/active/vicechatdev/CDocs/controllers/document_controller.py
  • function get_documents_v1 83.8% similar

    Retrieves filtered and paginated documents from a Neo4j graph database with permission-based access control, supporting multiple filter criteria and search functionality.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function search_documents_in_filecloud 75.6% similar

    Searches for controlled documents in FileCloud using text search and optional metadata filters, returning structured document information including UIDs, versions, and metadata.

    From: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
  • function get_all_documents 71.0% similar

    Retrieves all controlled documents from a Neo4j graph database with their associated owner information, formatted for administrative management interfaces.

    From: /tf/active/vicechatdev/CDocs/controllers/admin_controller.py