function search_filecloud_for_documents

Maturity: 53

Searches FileCloud for documents that have the 'is_cdoc' metadata flag set to true, retrieving their file paths and associated metadata attributes.

File: /tf/active/vicechatdev/CDocs/FC_sync.py
Lines: 404 - 481
Complexity: moderate

Purpose

This function queries a FileCloud instance for all controlled documents, that is, files whose is_cdoc attribute equals 'true' in the 'CDocs' metadata set. It runs the search through MetadataCatalog, then reads each match's metadata and maps it to a fixed set of fields: document UID, controlled-document UID, number, type, title, department, status, owner, and revision. A file whose metadata cannot be read is logged as a warning and skipped. The function is typically used in document management workflows to identify controlled documents that require special handling or tracking.

Source Code

def search_filecloud_for_documents() -> List[Dict]:
    """
    Search FileCloud for documents with controlled document metadata flag.
    
    Returns:
        List[Dict]: List of document information dictionaries
    """
    try:
        client = get_filecloud_client()
        
        # Initialize MetadataCatalog with the client
        from metadata_catalog import MetadataCatalog
        catalog = MetadataCatalog(client)
        
        # Define search criteria for is_cdoc = true
        search_criteria = [
            {
                'set_name': 'CDocs',  # Assuming the set name, adjust if needed
                'attribute_name': 'is_cdoc',
                'value': 'true',
                'operator': 'equals'
            }
        ]
        
        # Search using MetadataCatalog
        logger.info("Searching for documents with is_cdoc=true metadata")
        file_paths = catalog.search_files_by_metadata(
            search_criteria=search_criteria,
            search_string="**"  # Match any filename
        )
        
        if not file_paths:
            logger.info("No documents with controlled document flag found in FileCloud")
            return []
            
        logger.info(f"Found {len(file_paths)} documents with controlled document flag in FileCloud")
        
        # Get detailed information for each file
        documents = []
        for path in file_paths:
            try:
                # Get file metadata
                metadata_values = catalog.get_metadata_values(path)
                
                # Extract attributes as a simple dictionary across all sets
                attributes = catalog.get_attribute_values(metadata_values)
                
                # Add file path and metadata to document list
                document_info = {
                    'file_path': path,
                    'filename': os.path.basename(path),
                    'metadata': {
                        # Map known metadata fields
                        'doc_uid': attributes.get('doc_uid', ''),
                        'is_cdoc': attributes.get('is_cdoc', 'true'),
                        'cdoc_uid': attributes.get('cdoc_uid', ''),
                        'doc_number': attributes.get('doc_number', ''),
                        'doc_type': attributes.get('doc_type', ''),
                        'title': attributes.get('title', os.path.basename(path)),
                        'department': attributes.get('department', ''),
                        'status': attributes.get('status', ''),
                        'owner': attributes.get('owner', ''),
                        'revision': attributes.get('revision', '')
                    }
                }
                documents.append(document_info)
                
            except Exception as e:
                logger.warning(f"Error getting metadata for file {path}: {e}")
                continue
        
        return documents
        
    except Exception as e:
        logger.error(f"Error searching FileCloud: {e}")
        import traceback
        logger.error(traceback.format_exc())
        return []
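
The flow above can be exercised without a live FileCloud instance by injecting the catalog instead of building it inside the function. The following is a minimal sketch, not the module's code: StubCatalog and collect_documents are hypothetical names, the stub's return shapes are assumptions about MetadataCatalog, the file paths are made up, and only three metadata fields are mapped to keep it short.

```python
import os
from typing import Dict, List


class StubCatalog:
    """Hypothetical stand-in for MetadataCatalog; return shapes are assumed."""

    def __init__(self, files: Dict[str, Dict[str, str]]):
        self._files = files

    def search_files_by_metadata(self, search_criteria, search_string="**"):
        # Return every path whose attributes satisfy all 'equals' criteria.
        return [
            path for path, attrs in self._files.items()
            if all(attrs.get(c['attribute_name']) == c['value'] for c in search_criteria)
        ]

    def get_metadata_values(self, path):
        return self._files[path]

    def get_attribute_values(self, metadata_values):
        # Assumed to flatten metadata into a plain attribute dictionary.
        return dict(metadata_values)


def collect_documents(catalog) -> List[Dict]:
    """Same search-then-enrich loop as the source, with the catalog passed in."""
    criteria = [{
        'set_name': 'CDocs',
        'attribute_name': 'is_cdoc',
        'value': 'true',
        'operator': 'equals',
    }]
    documents = []
    for path in catalog.search_files_by_metadata(search_criteria=criteria, search_string="**"):
        try:
            attributes = catalog.get_attribute_values(catalog.get_metadata_values(path))
        except Exception:
            continue  # skip unreadable files, as the original loop does
        documents.append({
            'file_path': path,
            'filename': os.path.basename(path),
            'metadata': {
                'doc_number': attributes.get('doc_number', ''),
                'title': attributes.get('title', os.path.basename(path)),
                'status': attributes.get('status', ''),
            },
        })
    return documents


# Made-up sample data: one controlled document and one ordinary file.
catalog = StubCatalog({
    '/docs/SOP-001.pdf': {'is_cdoc': 'true', 'doc_number': 'SOP-001', 'status': 'approved'},
    '/docs/notes.txt': {'is_cdoc': 'false'},
})
result = collect_documents(catalog)
```

Passing the catalog as a parameter is a design choice of this sketch only; the original function constructs MetadataCatalog itself from get_filecloud_client().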

Return Value

Type: List[Dict]

Each dictionary in the returned list describes one document and has three keys: 'file_path' (full path in FileCloud), 'filename' (base filename), and 'metadata' (a dictionary with the fields doc_uid, is_cdoc, cdoc_uid, doc_number, doc_type, title, department, status, owner, and revision). The function returns an empty list both when no documents match and when the search itself fails, so an empty result alone does not distinguish the two cases.
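
The shape of one list element can be written down as typed dictionaries. This is a sketch for illustration: the names CDocMetadata, CDocRecord, and format_record are hypothetical and do not exist in the module, and the sample values are made up. All metadata values are typed as str because the source fills missing attributes with string defaults.

```python
from typing import TypedDict


class CDocMetadata(TypedDict):
    doc_uid: str
    is_cdoc: str
    cdoc_uid: str
    doc_number: str
    doc_type: str
    title: str
    department: str
    status: str
    owner: str
    revision: str


class CDocRecord(TypedDict):
    file_path: str
    filename: str
    metadata: CDocMetadata


def format_record(record: CDocRecord) -> str:
    """Build a one-line summary from a record (hypothetical helper)."""
    meta = record['metadata']
    return f"{meta['doc_number']} rev {meta['revision']}: {meta['title']} [{meta['status']}]"


# Made-up sample shaped like one element of the returned list.
sample: CDocRecord = {
    'file_path': '/docs/SOP-001.pdf',
    'filename': 'SOP-001.pdf',
    'metadata': {
        'doc_uid': 'uid-1', 'is_cdoc': 'true', 'cdoc_uid': 'cuid-1',
        'doc_number': 'SOP-001', 'doc_type': 'SOP', 'title': 'Sample procedure',
        'department': 'QA', 'status': 'approved', 'owner': 'jdoe', 'revision': '1.0',
    },
}
line = format_record(sample)
```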

Dependencies

  • os
  • sys
  • logging
  • tempfile
  • uuid
  • io
  • typing
  • datetime
  • traceback
  • FC_api
  • metadata_catalog
  • CDocs.db.db_operations
  • CDocs.models.document
  • CDocs.models.user_extensions
  • CDocs.controllers.filecloud_controller
  • CDocs.controllers.document_controller
  • CDocs.config

Required Imports

import os
import logging
from typing import List, Dict
from CDocs.controllers.filecloud_controller import get_filecloud_client

Conditional/Optional Imports

These imports occur inside the function body rather than at the top of the shown source:

from metadata_catalog import MetadataCatalog

Condition: imported inside the function right after the FileCloud client is obtained. Every call reaches this import, so it is required in practice.

import traceback

Condition: imported inside the outer exception handler and used only to log the full stack trace when the search fails.
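
Importing inside the function defers the dependency until the function actually runs. A minimal sketch of that pattern with an explicit failure path follows; the helper name load_class is hypothetical and not part of the module, and the standard-library module collections stands in for metadata_catalog so the example runs anywhere.

```python
import importlib
from typing import Optional


def load_class(module_name: str, class_name: str) -> Optional[type]:
    """Import a module on demand and return one attribute from it, or None."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The dependency is not installed or not on the import path.
        return None
    return getattr(module, class_name, None)


# Stand-in for: load_class("metadata_catalog", "MetadataCatalog")
ordered = load_class("collections", "OrderedDict")
missing = load_class("module_that_does_not_exist", "Anything")
```

In the original function a failed import is not handled separately; it is caught by the outer except block, logged, and turned into an empty list.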

Usage Example

import logging

# Assumed import path, inferred from the file location CDocs/FC_sync.py
from CDocs.FC_sync import search_filecloud_for_documents

# Configure logging so the function's info, warning, and error messages are visible
logging.basicConfig(level=logging.INFO)

# Call the function
documents = search_filecloud_for_documents()

# Process results
if documents:
    print(f"Found {len(documents)} controlled documents")
    for doc in documents:
        print(f"File: {doc['filename']}")
        print(f"Path: {doc['file_path']}")
        print(f"Doc Number: {doc['metadata']['doc_number']}")
        print(f"Title: {doc['metadata']['title']}")
        print(f"Status: {doc['metadata']['status']}")
        print("---")
else:
    # An empty list can also mean the search failed; check the log output
    print("No controlled documents found, or the search failed")
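
Because an empty search and a failed search both return [], the caller above cannot tell them apart from the return value. One option, sketched here under assumptions, is to attach a temporary logging handler that records whether an ERROR-level message was emitted during the call. The logger name "fc_sync_demo" and the two stand-in search functions are hypothetical; the real module's logger name is not shown on this page.

```python
import logging
from contextlib import contextmanager


class ErrorFlag(logging.Handler):
    """Handler that only remembers whether an ERROR (or higher) record arrived."""

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.triggered = False

    def emit(self, record):
        self.triggered = True


@contextmanager
def watch_errors(logger: logging.Logger):
    """Attach an ErrorFlag to the logger for the duration of the block."""
    handler = ErrorFlag()
    logger.addHandler(handler)
    try:
        yield handler
    finally:
        logger.removeHandler(handler)


logger = logging.getLogger("fc_sync_demo")  # hypothetical logger name
logger.propagate = False  # keep the demo quiet


def failing_search():
    # Stand-in for a call that hits the outer except block.
    logger.error("Error searching FileCloud: connection refused")
    return []


def empty_search():
    # Stand-in for a call that succeeds but finds nothing; logs no error.
    return []


with watch_errors(logger) as flag:
    failed_docs = failing_search()
failed_flag = flag.triggered

with watch_errors(logger) as flag:
    empty_docs = empty_search()
empty_flag = flag.triggered
```

Both calls return an empty list, but only the first sets the flag, which is enough for a caller to report a failure instead of "no documents".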

Best Practices

  • Ensure the FileCloud client is properly configured before calling this function
  • The function catches all exceptions and returns an empty list on failure, so an empty result can mean either "no controlled documents" or "the search failed"; check the error log to tell them apart
  • The 'CDocs' metadata set name is hardcoded; adjust if your FileCloud instance uses a different set name
  • The function logs warnings for individual file metadata retrieval failures but continues processing other files
  • Consider the performance impact when searching large FileCloud instances: the function calls get_metadata_values once for every matching file
  • The search passes search_string="**" to match any filename, so results are filtered by the metadata flag only
  • Ensure proper logging configuration to capture info, warning, and error messages for debugging
  • The function assumes specific metadata attributes (doc_uid, cdoc_uid, doc_number, etc.); a missing attribute defaults to an empty string, except is_cdoc, which defaults to 'true', and title, which defaults to the filename
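
Since the source fills missing attributes with defaults ('' for most fields, the filename for title), a record can look complete while carrying fallback values. The sketch below flags such fields after the fact. The helper name find_defaulted_fields and the sample record are hypothetical, and a title equal to the filename is only a heuristic sign of a fallback, because a document could legitimately use its filename as its title.

```python
from typing import Dict, List

# Fields that fall back to '' in the source's metadata mapping.
EMPTY_DEFAULT_FIELDS = (
    'doc_uid', 'cdoc_uid', 'doc_number', 'doc_type',
    'department', 'status', 'owner', 'revision',
)


def find_defaulted_fields(document: Dict) -> List[str]:
    """Return the metadata fields that appear to hold a default value."""
    metadata = document['metadata']
    flagged = [name for name in EMPTY_DEFAULT_FIELDS if metadata.get(name, '') == '']
    # The source defaults title to the basename of the path.
    if metadata.get('title') == document['filename']:
        flagged.append('title')
    return flagged


# Made-up record shaped like one element of the function's return value.
sample = {
    'file_path': '/docs/SOP-001.pdf',
    'filename': 'SOP-001.pdf',
    'metadata': {
        'doc_uid': 'uid-1', 'is_cdoc': 'true', 'cdoc_uid': '',
        'doc_number': 'SOP-001', 'doc_type': 'SOP', 'title': 'SOP-001.pdf',
        'department': '', 'status': 'approved', 'owner': 'jdoe', 'revision': '1.0',
    },
}
flagged = find_defaulted_fields(sample)
```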

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function search_documents_in_filecloud 73.6% similar

    Searches for controlled documents in FileCloud using text search and optional metadata filters, returning structured document information including UIDs, versions, and metadata.

    From: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
  • function get_document_metadata_from_filecloud 68.0% similar

    Retrieves metadata for a specific document (and optionally a specific version) from FileCloud storage system.

    From: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
  • function import_document_from_filecloud 63.7% similar

    Imports a document from FileCloud into the system by extracting metadata, creating a controlled document record, downloading the file content, creating a document version, and uploading it back to FileCloud with proper folder structure.

    From: /tf/active/vicechatdev/CDocs/FC_sync.py
  • function upload_document_to_filecloud 63.0% similar

    Uploads a document version to FileCloud storage system with metadata, handling file creation, folder structure, and audit logging.

    From: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
  • function extract_metadata_from_filecloud 61.6% similar

    Extracts and normalizes metadata from FileCloud for document creation, providing default values and generating document numbers when needed.

    From: /tf/active/vicechatdev/CDocs/FC_sync.py