function search_documents_in_filecloud

Maturity: 68

Searches for controlled documents in FileCloud using text search and optional metadata filters, returning structured document information including UIDs, versions, and metadata.

File: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
Lines: 963 - 1055
Complexity: moderate

Purpose

This function provides a search interface for FileCloud documents that combines text search, hierarchical folder filtering by department and document type, and metadata filtering. It targets controlled documents (those with doc_uid metadata) and returns structured information suitable for document management systems. It builds the folder path from the filters, delegates the search to the FileCloud API client, and keeps only results that carry doc_uid, flattening their metadata into a consistent structure.

Source Code

def search_documents_in_filecloud(
    search_text: str = "",
    doc_type: Optional[str] = None,
    department: Optional[str] = None,
    metadata: Optional[Dict[str, str]] = None,
    max_results: int = 100
) -> Dict[str, Any]:
    """
    Search for documents in FileCloud using text and metadata.
    
    Args:
        search_text: Text to search for
        doc_type: Optional document type filter
        department: Optional department filter
        metadata: Optional metadata criteria
        max_results: Maximum results to return
        
    Returns:
        Dictionary with search results
        
    Raises:
        FileCloudError: If search fails
    """
    # Define folder path based on filters
    folder_path = f"/{settings.FILECLOUD_ROOT_FOLDER}"
    if department:
        folder_path += f"/{department}"
        if doc_type:
            folder_path += f"/{doc_type}"
    
    # Prepare metadata criteria (copy so the caller's dict is not mutated below)
    search_metadata = dict(metadata) if metadata else {}
    
    # Add doc_type and department to metadata if provided
    if doc_type and not department:
        search_metadata["doc_type"] = doc_type
    if department and not doc_type:
        search_metadata["department"] = department
    
    try:
        client = get_filecloud_client()
        
        # Search documents
        result = client.search_documents(
            search_text=search_text,
            folder_path=folder_path,
            metadata=search_metadata,
            max_results=max_results
        )
        
        if not result.get('success', False):
            logger.error(f"Failed to search documents in FileCloud: {result.get('message', 'Unknown error')}")
            raise FileCloudError(f"Failed to search documents: {result.get('message', 'Unknown error')}")
        
        # Extract document information from search results
        search_results = result.get('results', [])
        documents = []
        
        for item in search_results:
            # Extract path and metadata
            file_path = item.get('path', '')
            file_metadata = item.get('metadata', {})
            
            # Check if this is a controlled document by looking for doc_uid in metadata
            if 'doc_uid' in file_metadata:
                doc_info = {
                    'file_path': file_path,
                    'doc_uid': file_metadata.get('doc_uid', ''),
                    'doc_number': file_metadata.get('doc_number', ''),
                    'title': file_metadata.get('title', ''),
                    'version': file_metadata.get('version_number', ''),
                    'doc_type': file_metadata.get('doc_type', ''),
                    'department': file_metadata.get('department', ''),
                    'status': file_metadata.get('status', ''),
                    'metadata': file_metadata
                }
                documents.append(doc_info)
        
        return {
            "success": True,
            "count": len(documents),
            "documents": documents,
            "search_criteria": {
                "text": search_text,
                "doc_type": doc_type,
                "department": department,
                "metadata": search_metadata
            }
        }
        
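    except FileCloudError:
        # Already logged above; re-raise without wrapping it a second time
        raise
    except Exception as e:
        logger.error(f"Error searching documents in FileCloud: {e}")
        raise FileCloudError(f"Error searching documents: {e}") from e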

Parameters

Name         Type                      Default  Kind
search_text  str                       ''       positional_or_keyword
doc_type     Optional[str]             None     positional_or_keyword
department   Optional[str]             None     positional_or_keyword
metadata     Optional[Dict[str, str]]  None     positional_or_keyword
max_results  int                       100      positional_or_keyword

Parameter Details

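search_text: Text string to search for. Can be an empty string to search by metadata/filters only. The value is passed unchanged to client.search_documents, so which fields are matched (content, metadata, or both) is determined by the FileCloud client rather than by this function.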

doc_type: Optional document type filter (e.g., 'SOP', 'Policy', 'Form'). When provided with department, narrows folder path to /{root}/{department}/{doc_type}. When provided alone, adds to metadata search criteria.

department: Optional department filter (e.g., 'Engineering', 'QA'). When provided, it always narrows the folder path to /{root}/{department}. When provided without doc_type, it is additionally added to the metadata search criteria.

metadata: Optional dictionary of additional metadata key-value pairs to filter results. These criteria are combined with doc_type and department filters when applicable.

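max_results: Maximum number of search results requested from the FileCloud client. Defaults to 100. The function has no offset or page parameter, so this is a limit rather than pagination. Because results without doc_uid are discarded afterwards, the returned count can be lower than max_results.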
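
The interaction between doc_type and department described above can be easier to follow in isolation. The sketch below copies the path and metadata rules from the source into a standalone helper; resolve_search_scope is a hypothetical name, and the root folder "CDocs" stands in for settings.FILECLOUD_ROOT_FOLDER, whose real value is not shown in the source.

```python
from typing import Dict, Optional, Tuple


def resolve_search_scope(
    root_folder: str,
    doc_type: Optional[str] = None,
    department: Optional[str] = None,
    metadata: Optional[Dict[str, str]] = None,
) -> Tuple[str, Dict[str, str]]:
    """Hypothetical helper mirroring the folder/metadata rules of the function."""
    # The folder path only gains a doc_type segment when a department is given.
    folder_path = f"/{root_folder}"
    if department:
        folder_path += f"/{department}"
        if doc_type:
            folder_path += f"/{doc_type}"

    # Work on a copy so the caller's dictionary is left untouched.
    search_metadata = dict(metadata or {})

    # A filter given on its own is also expressed as a metadata criterion.
    if doc_type and not department:
        search_metadata["doc_type"] = doc_type
    if department and not doc_type:
        search_metadata["department"] = department

    return folder_path, search_metadata


# "CDocs" is an assumed root folder name used only for illustration.
both = resolve_search_scope("CDocs", doc_type="SOP", department="Engineering")
type_only = resolve_search_scope("CDocs", doc_type="SOP")
dept_only = resolve_search_scope("CDocs", department="QA")
```

With both filters the scope is expressed purely through the folder path; with doc_type alone the path stays at the root and the filter moves into the metadata; with department alone the filter appears in both places.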

Return Value

Type: Dict[str, Any]

Returns a dictionary with keys: 'success' (bool, always True when the function returns, since failures raise FileCloudError), 'count' (int, number of controlled documents kept after filtering), 'documents' (list of dicts, each containing file_path, doc_uid, doc_number, title, version, doc_type, department, status, and the full metadata dict), and 'search_criteria' (dict echoing the search parameters used). The 'version' value is read from the 'version_number' metadata field, and any missing metadata field defaults to an empty string. Only documents that have a 'doc_uid' in their metadata (controlled documents) are returned.
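
To make the shape of the 'documents' entries concrete, here is a minimal sketch of the filtering step applied to hand-written sample data. The helper name extract_controlled_documents and the sample paths and values are illustrative assumptions; only the field names and the doc_uid check are taken from the source.

```python
from typing import Any, Dict, List


def extract_controlled_documents(
    search_results: List[Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Hypothetical helper: keep only items whose metadata carries doc_uid."""
    documents = []
    for item in search_results:
        file_metadata = item.get("metadata", {})
        if "doc_uid" not in file_metadata:
            continue  # not a controlled document, so it is dropped
        documents.append({
            "file_path": item.get("path", ""),
            "doc_uid": file_metadata.get("doc_uid", ""),
            "doc_number": file_metadata.get("doc_number", ""),
            "title": file_metadata.get("title", ""),
            # The output key is 'version'; the metadata key is 'version_number'.
            "version": file_metadata.get("version_number", ""),
            "doc_type": file_metadata.get("doc_type", ""),
            "department": file_metadata.get("department", ""),
            "status": file_metadata.get("status", ""),
            "metadata": file_metadata,
        })
    return documents


# Made-up sample data: one controlled document and one ordinary file.
sample_results = [
    {
        "path": "/CDocs/QA/SOP/sop-001.pdf",
        "metadata": {
            "doc_uid": "abc-123",
            "doc_number": "SOP-001",
            "title": "Sample procedure",
            "version_number": "1.0",
        },
    },
    {"path": "/CDocs/QA/notes.txt", "metadata": {}},
]

documents = extract_controlled_documents(sample_results)
```

Two raw results go in and one document comes out, which is why 'count' can be smaller than the number of hits the FileCloud client returned.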

Dependencies

  • logging
  • CDocs.config.settings
  • CDocs.utils.FC_api.FileCloudAPI
  • CDocs.controllers.log_controller_action

Required Imports

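from typing import Dict, Any, Optional
from CDocs.config import settings
from CDocs.utils.FC_api import get_filecloud_client
from CDocs.controllers import log_controller_action
import logging

# The function also uses `logger` and `FileCloudError`; both must be in scope.
# Their definitions are not shown in the extracted source.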

Usage Example

# Search for all SOPs in Engineering department containing 'safety'
results = search_documents_in_filecloud(
    search_text='safety',
    doc_type='SOP',
    department='Engineering',
    max_results=50
)

if results['success']:
    print(f"Found {results['count']} documents")
    for doc in results['documents']:
        print(f"{doc['doc_number']}: {doc['title']} (v{doc['version']})")

# Search with custom metadata
results = search_documents_in_filecloud(
    search_text='',
    metadata={'status': 'approved', 'author': 'John Doe'},
    max_results=100
)

# Search all documents in a department
results = search_documents_in_filecloud(
    search_text='',
    department='Quality',
    max_results=200
)

Best Practices

  • Always wrap calls in try-except to handle FileCloudError exceptions
  • Use empty search_text with metadata filters for metadata-only searches
  • Be aware that only documents with 'doc_uid' metadata are returned (controlled documents)
  • The function offers no offset or page parameter; for large result sets, raise max_results, and expect fewer documents than requested when some hits lack doc_uid
  • The folder path construction is hierarchical: providing both department and doc_type creates a more specific path than either alone
  • When doc_type is provided without department, it's added to metadata search rather than folder path
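  • The 'success' key is always True when the function returns, because failures raise FileCloudError; rely on exception handling rather than on this key
  • The required imports include log_controller_action, which suggests the function is decorated with @log_controller_action for audit trail purposes, although the decorator is not visible in the extracted source above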
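
The first practice above, wrapping calls in try-except, might look like the sketch below. It is a self-contained illustration only: FileCloudError and search_documents_in_filecloud are local stand-ins for the project's real exception class and function (the stub always fails, to exercise the error path), and safe_search is a hypothetical wrapper name.

```python
import logging
from typing import Any, Dict, List

logger = logging.getLogger(__name__)


class FileCloudError(Exception):
    """Stand-in for the project's FileCloudError (real import path not shown)."""


def search_documents_in_filecloud(**kwargs: Any) -> Dict[str, Any]:
    """Stub standing in for the real function; it always fails here."""
    raise FileCloudError("Failed to search documents: connection refused")


def safe_search(**kwargs: Any) -> List[Dict[str, Any]]:
    """Hypothetical wrapper: return an empty list instead of propagating errors."""
    try:
        results = search_documents_in_filecloud(**kwargs)
    except FileCloudError as exc:
        # Failures surface as exceptions, never as success=False.
        logger.warning("FileCloud search failed: %s", exc)
        return []
    return results["documents"]


documents = safe_search(search_text="safety", department="Engineering")
```

Whether to swallow the error, as this wrapper does, or let it propagate to a higher-level handler is a design choice that depends on the caller; returning an empty list hides the difference between "no matches" and "search unavailable".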

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function search_documents_v1 75.6% similar

    Searches for controlled documents in a Neo4j graph database based on multiple optional filter criteria including text query, document type, department, status, and owner.

    From: /tf/active/vicechatdev/CDocs/controllers/document_controller.py
  • function search_filecloud_for_documents 73.6% similar

    Searches FileCloud for documents that have the 'is_cdoc' metadata flag set to true, retrieving their file paths and associated metadata attributes.

    From: /tf/active/vicechatdev/CDocs/FC_sync.py
  • function search_documents 71.5% similar

    Searches for documents in a Neo4j graph database based on multiple optional filter criteria including text query, document type, department, status, and owner.

    From: /tf/active/vicechatdev/document_controller_backup.py
  • function get_documents 67.3% similar

    Retrieves controlled documents from a Neo4j database with comprehensive filtering, permission-based access control, pagination, and full-text search capabilities.

    From: /tf/active/vicechatdev/CDocs/controllers/document_controller.py
  • function get_documents_v1 65.7% similar

    Retrieves filtered and paginated documents from a Neo4j graph database with permission-based access control, supporting multiple filter criteria and search functionality.

    From: /tf/active/vicechatdev/document_controller_backup.py