function search_documents
Searches for documents in a Neo4j graph database based on multiple optional filter criteria including text query, document type, department, status, and owner.
/tf/active/vicechatdev/document_controller_backup.py
308 - 402
moderate
Purpose
This function provides flexible document search for a controlled document management system (CDocs). It builds a Cypher query from whichever filters are supplied, runs it against Neo4j through db_operations.run_query, and returns the matching Document nodes as a list of dictionaries. Text search uses CONTAINS on the title and description properties; the remaining filters (doc_type, department, status, owner_id) are exact matches combined with AND. Each call is logged with logger.info in the function body (the required imports also list the log_controller_action decorator, although the excerpt below does not show it applied), and failures are logged and re-raised. A permission filter based on the user argument is present in the source but commented out, so user currently has no effect on the results.
Source Code
def search_documents(query=None, doc_type=None, department=None, status=None, owner=None, limit=100, user=None):
    """
    Search for documents based on criteria.

    Parameters
    ----------
    query : str, optional
        Text search query
    doc_type : str, optional
        Document type to filter by
    department : str, optional
        Department to filter by
    status : str, optional
        Status to filter by
    owner : str, optional
        Owner UID to filter by
    limit : int, optional
        Maximum number of results to return
    user : DocUser, optional
        The current user (for permission filtering)

    Returns
    -------
    List[Dict[str, Any]]
        List of document dictionaries matching the search criteria
    """
    try:
        from CDocs.db import db_operations

        logger.info("Controller action: search_documents")

        # Build the Cypher query
        cypher_query = """
        MATCH (d:Document)
        """

        # Add optional filters
        where_clauses = []
        params = {}

        if query:
            where_clauses.append("(d.title CONTAINS $query OR d.description CONTAINS $query)")
            params["query"] = query

        if doc_type:
            where_clauses.append("d.doc_type = $doc_type")
            params["doc_type"] = doc_type

        if department:
            where_clauses.append("d.department = $department")
            params["department"] = department

        if status:
            where_clauses.append("d.status = $status")
            params["status"] = status

        if owner:
            where_clauses.append("d.owner_id = $owner")
            params["owner"] = owner

        # Add WHERE clause if we have any conditions
        if where_clauses:
            cypher_query += "WHERE " + " AND ".join(where_clauses)

        # Add permission filtering if user is provided
        # This is commented out for now as it depends on schema details
        # if user and hasattr(user, 'uid') and user.role != 'ADMIN':
        #     # Only add more WHERE conditions if we already have some
        #     connector = "AND" if where_clauses else "WHERE"
        #     cypher_query += f" {connector} (d.owner_id = $user_id OR d.is_public = true)"
        #     params["user_id"] = user.uid

        # Add RETURN clause with LIMIT
        cypher_query += f"""
        RETURN d
        ORDER BY d.created_date DESC
        LIMIT {int(limit)}
        """

        # Execute query
        result = db_operations.run_query(cypher_query, params)

        # Process results into a list of document dictionaries
        documents = []
        if result:
            for record in result:
                if 'd' in record:
                    document = dict(record['d'])
                    documents.append(document)

        return documents

    except Exception as e:
        logger.error(f"Error in controller action search_documents: {e}")
        raise e
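The query-building part of the source above can be isolated and inspected without a database. The following is a minimal sketch, not the module's own code: build_search_cypher is a hypothetical helper name, and it only mirrors the filter logic shown in the source so that the generated Cypher and parameter map can be printed or tested.

```python
from typing import Any, Dict, List, Optional, Tuple


def build_search_cypher(
    query: Optional[str] = None,
    doc_type: Optional[str] = None,
    department: Optional[str] = None,
    status: Optional[str] = None,
    owner: Optional[str] = None,
    limit: int = 100,
) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: return (cypher, params) without touching the database."""
    where_clauses: List[str] = []
    params: Dict[str, Any] = {}

    # Partial text match on title or description
    if query:
        where_clauses.append("(d.title CONTAINS $query OR d.description CONTAINS $query)")
        params["query"] = query

    # Exact-match filters: (argument value, node property, parameter name)
    for value, prop, name in (
        (doc_type, "doc_type", "doc_type"),
        (department, "department", "department"),
        (status, "status", "status"),
        (owner, "owner_id", "owner"),
    ):
        if value:
            where_clauses.append(f"d.{prop} = ${name}")
            params[name] = value

    lines = ["MATCH (d:Document)"]
    if where_clauses:
        lines.append("WHERE " + " AND ".join(where_clauses))
    # int() keeps the interpolated LIMIT numeric, as in the original function
    lines += ["RETURN d", "ORDER BY d.created_date DESC", f"LIMIT {int(limit)}"]
    return "\n".join(lines), params


cypher, params = build_search_cypher(query="safety", status="PUBLISHED", limit=10)
print(cypher)
print(params)
```

Only the filter values travel as query parameters; limit is the one value interpolated into the query text, which is why it is passed through int() first.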
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| query | - | None | positional_or_keyword |
| doc_type | - | None | positional_or_keyword |
| department | - | None | positional_or_keyword |
| status | - | None | positional_or_keyword |
| owner | - | None | positional_or_keyword |
| limit | - | 100 | positional_or_keyword |
| user | - | None | positional_or_keyword |
Parameter Details
query: Optional text string to search within document titles and descriptions. Uses CONTAINS operator for partial matching. Can be None to skip text search filtering.
doc_type: Optional string to filter documents by their type (e.g., 'policy', 'procedure', 'form'). Must match the doc_type property exactly. Can be None to include all document types.
department: Optional string to filter documents by department (e.g., 'HR', 'Engineering', 'Finance'). Must match the department property exactly. Can be None to include all departments.
status: Optional string to filter documents by their current status (e.g., 'DRAFT', 'PUBLISHED', 'ARCHIVED'). Must match the status property exactly. Can be None to include all statuses.
owner: Optional string representing the owner's UID (user identifier) to filter documents by ownership. Must match the owner_id property exactly. Can be None to include documents from all owners.
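limit: Integer specifying the maximum number of documents to return. Defaults to 100. The value is coerced with int() and interpolated into the LIMIT clause, so it should be a non-negative integer; the function does not validate this itself. Results are ordered by created_date in descending order (newest first).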
user: Optional DocUser object representing the current user making the search request. Intended for permission filtering to restrict results based on user access rights. Currently not actively used in the query but included for future permission implementation.
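The commented-out permission filter in the source suggests how user might eventually be applied. The sketch below is one possible shape, under stated assumptions: StubUser stands in for DocUser (whose real fields are not shown here), add_permission_clause is a hypothetical helper, and the is_public property and 'ADMIN' role are taken from the commented-out code rather than from a confirmed schema. Appending to the clause list before the WHERE is joined avoids the separate AND/WHERE connector logic.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class StubUser:
    """Stand-in for DocUser; the real class may have a different shape."""
    uid: str
    role: str


def add_permission_clause(
    where_clauses: List[str], params: Dict[str, Any], user: Optional[StubUser]
) -> None:
    """Append an ownership/visibility condition for non-admin users, in place."""
    if user is None or getattr(user, "role", None) == "ADMIN":
        return  # no user context, or admin: leave the filters unchanged
    # d.is_public comes from the commented-out code; the actual schema may differ
    where_clauses.append("(d.owner_id = $user_id OR d.is_public = true)")
    params["user_id"] = user.uid


clauses = ["d.department = $department"]
params = {"department": "HR"}
add_permission_clause(clauses, params, StubUser(uid="user456", role="VIEWER"))
print("WHERE " + " AND ".join(clauses))
```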
Return Value
Returns a List[Dict[str, Any]] containing document dictionaries. Each dictionary represents a Document node from the Neo4j database with all its properties (e.g., title, description, doc_type, department, status, owner_id, created_date). Returns an empty list if no documents match the criteria or if run_query returns an empty result. If an error occurs during query execution, the function does not return a list: it logs the error and re-raises the exception. The list is ordered by created_date in descending order and limited to the specified number of results.
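Because the except block in the source logs and then re-raises, a caller that prefers an empty list on failure has to catch the exception itself. A minimal sketch of such a wrapper follows; search_or_empty and failing_search are hypothetical names used only for illustration, and the search function is passed in so the example runs without CDocs or a database.

```python
import logging
from typing import Any, Callable, Dict, List

logger = logging.getLogger(__name__)


def search_or_empty(
    search: Callable[..., List[Dict[str, Any]]], **filters: Any
) -> List[Dict[str, Any]]:
    """Hypothetical wrapper: call `search` and fall back to [] if it raises."""
    try:
        return search(**filters)
    except Exception:
        # logger.exception records the message together with the traceback
        logger.exception("Document search failed for filters %r", filters)
        return []


def failing_search(**_filters: Any) -> List[Dict[str, Any]]:
    """Stand-in for search_documents when the database is unreachable."""
    raise RuntimeError("database unavailable")


results = search_or_empty(failing_search, status="DRAFT")
print(results)
```

Whether swallowing the error is appropriate depends on the caller; a UI listing may want the empty list, while a batch job usually should let the exception propagate.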
Dependencies
- logging
- CDocs.db.db_operations
- CDocs.models.user_extensions
- CDocs.controllers
Required Imports
import logging
from CDocs.controllers import log_controller_action
from CDocs.models.user_extensions import DocUser
Conditional/Optional Imports
These imports are only needed under specific conditions:
from CDocs.db import db_operations
Condition: imported lazily inside the function at runtime, always needed for function execution
Required (conditional)
Usage Example
from CDocs.controllers import search_documents
from CDocs.models.user_extensions import DocUser

# Simple text search
results = search_documents(query='safety protocol')

# Search with multiple filters
results = search_documents(
    query='procedure',
    doc_type='SOP',
    department='Engineering',
    status='PUBLISHED',
    limit=50
)

# Search by owner
results = search_documents(
    owner='user123',
    status='DRAFT'
)

# Search with user context for future permission filtering
current_user = DocUser(uid='user456', role='VIEWER')
results = search_documents(
    department='HR',
    user=current_user
)

# Process results
for doc in results:
    print(f"Title: {doc.get('title')}, Status: {doc.get('status')}")
Best Practices
- Always handle the returned list defensively, as it may be empty if no documents match; errors are not reported as an empty list but raised as exceptions
- Use the limit parameter to prevent retrieving excessive amounts of data, especially in production environments
- Consider implementing pagination for large result sets rather than increasing the limit
- The text query parameter uses CONTAINS which is case-sensitive in Neo4j; consider normalizing search terms
- Ensure proper error handling when calling this function as it re-raises exceptions after logging
- The user parameter is included for future permission filtering but is currently not enforced; do not rely on it for access control until implemented
- Filter parameters must match exact values in the database; consider providing users with valid options from a controlled vocabulary
- Results are ordered by created_date DESC, so newest documents appear first
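- The function builds its Cypher query dynamically; filter values are passed as query parameters rather than concatenated into the query text, and limit, the only interpolated value, is coerced with int() first. Keep both safeguards if the function is modified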
- Consider adding indexes on frequently queried properties (doc_type, department, status, owner_id) for better performance
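Two of the points above, case sensitivity of CONTAINS and pagination, can be combined in one query. The sketch below is an illustration rather than part of the module: build_paged_text_search is a hypothetical helper, it relies on Cypher's toLower() function and the SKIP/LIMIT clauses with parameters, and it assumes that lowercasing both sides is an acceptable form of normalization for the stored titles and descriptions.

```python
from typing import Any, Dict, Tuple


def build_paged_text_search(
    query: str, page: int = 0, page_size: int = 50
) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical helper: case-insensitive text search, one page at a time."""
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size must be >= 1")
    cypher = (
        "MATCH (d:Document)\n"
        # The search term is lowercased in Python; the properties in Cypher
        "WHERE toLower(d.title) CONTAINS $query "
        "OR toLower(d.description) CONTAINS $query\n"
        "RETURN d\n"
        "ORDER BY d.created_date DESC\n"
        "SKIP $skip LIMIT $limit"
    )
    params = {"query": query.lower(), "skip": page * page_size, "limit": page_size}
    return cypher, params


cypher, params = build_paged_text_search("Safety Protocol", page=2, page_size=25)
print(cypher)
print(params)
```

Applying toLower() to a property on every row may prevent an index on that property from being used, so for large collections a full-text index is likely the better fit; that choice depends on the deployment and is not something the source specifies.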
Similar Components
AI-powered semantic similarity - components with related functionality:
- function search_documents_v1 94.7% similar
- function get_documents_v1 82.2% similar
- function get_documents 80.7% similar
- function search_documents_in_filecloud 71.5% similar
- function get_all_documents 65.3% similar