function search_filecloud_for_documents
Searches FileCloud for documents that have the 'is_cdoc' metadata flag set to true, retrieving their file paths and associated metadata attributes.
File: /tf/active/vicechatdev/CDocs/FC_sync.py
Lines: 404 - 481
Complexity: moderate
Purpose
This function queries a FileCloud instance to find all controlled documents (documents with is_cdoc=true metadata). It uses the MetadataCatalog to perform metadata-based searches and retrieves detailed information including document UIDs, numbers, types, titles, departments, status, owners, and revisions. This is typically used in document management systems to identify and process controlled documents that require special handling or tracking.
Source Code
def search_filecloud_for_documents() -> List[Dict]:
    """
    Search FileCloud for documents with controlled document metadata flag.

    Returns:
        List[Dict]: List of document information dictionaries
    """
    try:
        client = get_filecloud_client()

        # Initialize MetadataCatalog with the client
        from metadata_catalog import MetadataCatalog
        catalog = MetadataCatalog(client)

        # Define search criteria for is_cdoc = true
        search_criteria = [
            {
                'set_name': 'CDocs',  # Assuming the set name, adjust if needed
                'attribute_name': 'is_cdoc',
                'value': 'true',
                'operator': 'equals'
            }
        ]

        # Search using MetadataCatalog
        logger.info("Searching for documents with is_cdoc=true metadata")
        file_paths = catalog.search_files_by_metadata(
            search_criteria=search_criteria,
            search_string="**"  # Match any filename
        )

        if not file_paths:
            logger.info("No documents with controlled document flag found in FileCloud")
            return []

        logger.info(f"Found {len(file_paths)} documents with controlled document flag in FileCloud")

        # Get detailed information for each file
        documents = []
        for path in file_paths:
            try:
                # Get file metadata
                metadata_values = catalog.get_metadata_values(path)

                # Extract attributes as a simple dictionary across all sets
                attributes = catalog.get_attribute_values(metadata_values)

                # Add file path and metadata to document list
                document_info = {
                    'file_path': path,
                    'filename': os.path.basename(path),
                    'metadata': {
                        # Map known metadata fields
                        'doc_uid': attributes.get('doc_uid', ''),
                        'is_cdoc': attributes.get('is_cdoc', 'true'),
                        'cdoc_uid': attributes.get('cdoc_uid', ''),
                        'doc_number': attributes.get('doc_number', ''),
                        'doc_type': attributes.get('doc_type', ''),
                        'title': attributes.get('title', os.path.basename(path)),
                        'department': attributes.get('department', ''),
                        'status': attributes.get('status', ''),
                        'owner': attributes.get('owner', ''),
                        'revision': attributes.get('revision', '')
                    }
                }
                documents.append(document_info)
            except Exception as e:
                # A single unreadable file is logged and skipped
                logger.warning(f"Error getting metadata for file {path}: {e}")
                continue

        return documents

    except Exception as e:
        logger.error(f"Error searching FileCloud: {e}")
        import traceback
        logger.error(traceback.format_exc())
        return []
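The per-file mapping step in the loop above can be pulled out into a pure function, which makes the default handling easy to check without a FileCloud connection. The following is a minimal sketch, not part of FC_sync.py: the helper name build_document_info is hypothetical, and the sample path and attribute values in the usage note are illustrative only.

```python
import os
from typing import Dict, Mapping

# Fields that fall back to an empty string when the attribute is absent
PLAIN_FIELDS = (
    'doc_uid', 'cdoc_uid', 'doc_number', 'doc_type',
    'department', 'status', 'owner', 'revision',
)


def build_document_info(path: str, attributes: Mapping[str, str]) -> Dict:
    """Hypothetical helper: map a flat attribute dict to the document_info shape."""
    filename = os.path.basename(path)
    metadata = {name: attributes.get(name, '') for name in PLAIN_FIELDS}
    # Two fields have non-empty defaults in the original mapping
    metadata['is_cdoc'] = attributes.get('is_cdoc', 'true')
    metadata['title'] = attributes.get('title', filename)
    return {'file_path': path, 'filename': filename, 'metadata': metadata}
```

Calling build_document_info('/SHARED/docs/SOP-001.pdf', {'doc_number': 'SOP-001'}) would give a title of 'SOP-001.pdf' and an empty owner, mirroring the defaults in the source above.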
Return Value
Type: List[Dict]
Returns a list of dictionaries, one per document. Each dictionary has the keys 'file_path' (full path in FileCloud), 'filename' (base filename), and 'metadata' (a dictionary with the fields doc_uid, is_cdoc, cdoc_uid, doc_number, doc_type, title, department, status, owner, revision). Returns an empty list if no documents are found or if the search itself fails. A file whose metadata cannot be read is logged as a warning and left out of the result; it does not fail the whole call.
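The shape of the return value can be written down as types, which helps callers and static checkers. This is a sketch under assumptions: the original function is annotated only as List[Dict], and the names DocumentMetadata, DocumentInfo, and count_by_status are hypothetical additions for illustration.

```python
from collections import Counter
from typing import Dict, List, TypedDict


class DocumentMetadata(TypedDict):
    """Fields of the nested 'metadata' dictionary (all values are strings)."""
    doc_uid: str
    is_cdoc: str
    cdoc_uid: str
    doc_number: str
    doc_type: str
    title: str
    department: str
    status: str
    owner: str
    revision: str


class DocumentInfo(TypedDict):
    """One entry of the returned list."""
    file_path: str
    filename: str
    metadata: DocumentMetadata


def count_by_status(documents: List[DocumentInfo]) -> Dict[str, int]:
    """Hypothetical consumer: tally documents per status value."""
    # An empty status (the default for a missing attribute) gets its own bucket
    return dict(Counter(doc['metadata']['status'] or '(no status)' for doc in documents))
```

TypedDict does no validation at runtime, so this only documents the structure; the dictionaries themselves are still plain dicts.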
Dependencies
- os
- sys
- logging
- tempfile
- uuid
- io
- typing
- datetime
- traceback
- FC_api
- metadata_catalog
- CDocs.db.db_operations
- CDocs.models.document
- CDocs.models.user_extensions
- CDocs.controllers.filecloud_controller
- CDocs.controllers.document_controller
- CDocs.config
Required Imports
import os
from typing import List, Dict
from CDocs.controllers.filecloud_controller import get_filecloud_client
Conditional/Optional Imports
These imports are only needed under specific conditions:
from metadata_catalog import MetadataCatalog
Condition: imported inside the function after getting the FileCloud client, required for all executions
Required (conditional)
import traceback
Condition: used in exception handling to log detailed error information
Required (conditional)
Usage Example
import logging

# Assumed import path, inferred from the file location CDocs/FC_sync.py
from CDocs.FC_sync import search_filecloud_for_documents

# Configure logging so the function's info, warning, and error messages are visible
logging.basicConfig(level=logging.INFO)

# Call the function
documents = search_filecloud_for_documents()

# Process results
if documents:
    print(f"Found {len(documents)} controlled documents")
    for doc in documents:
        print(f"File: {doc['filename']}")
        print(f"Path: {doc['file_path']}")
        print(f"Doc Number: {doc['metadata']['doc_number']}")
        print(f"Title: {doc['metadata']['title']}")
        print(f"Status: {doc['metadata']['status']}")
        print("---")
else:
    # An empty list can also mean the search failed; check the log output
    print("No controlled documents found")
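When the results feed a sync or reconciliation step, a lookup by document number is often more convenient than a flat list. The sketch below assumes doc_number is meant to be unique per document, which the source does not state; the function name index_by_doc_number is hypothetical.

```python
from typing import Dict, List, Tuple


def index_by_doc_number(documents: List[Dict]) -> Tuple[Dict[str, Dict], List[Dict]]:
    """Return (index, leftovers).

    index maps doc_number to the first document seen with that number.
    leftovers collects documents with an empty or already-seen doc_number,
    so they can be reviewed instead of silently overwritten.
    """
    index: Dict[str, Dict] = {}
    leftovers: List[Dict] = []
    for doc in documents:
        number = doc.get('metadata', {}).get('doc_number', '')
        if not number or number in index:
            leftovers.append(doc)
        else:
            index[number] = doc
    return index, leftovers
```

Keeping the leftovers separate is a design choice: a missing doc_number defaults to an empty string in the returned metadata, so several documents could otherwise collide on the same empty key.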
Best Practices
- Ensure the FileCloud client is properly configured before calling this function
- The function catches all exceptions and returns an empty list on error, so an empty result can mean either "no matches" or "search failed"; check the logs to tell the two apart
- The 'CDocs' metadata set name is hardcoded; adjust if your FileCloud instance uses a different set name
- The function logs warnings for individual file metadata retrieval failures but continues processing other files
- Consider the performance impact when searching large FileCloud instances as it retrieves detailed metadata for each matching file
- The search uses '**' as a wildcard to match any filename; this searches all files with the metadata flag
- Ensure proper logging configuration to capture info, warning, and error messages for debugging
- The function assumes specific metadata attributes (doc_uid, cdoc_uid, doc_number, etc.); missing attributes default to an empty string, except is_cdoc (defaults to 'true') and title (defaults to the filename)
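Since the 'CDocs' set name is hardcoded, one way to follow the advice above is to build the search criteria in a small function that takes the set name as a parameter. This is a sketch only: build_cdoc_search_criteria and DEFAULT_SET_NAME are hypothetical names, and the dictionary keys are copied from the source code as shown, not from the MetadataCatalog documentation.

```python
from typing import Dict, List

# Default mirrors the value hardcoded in the original function
DEFAULT_SET_NAME = 'CDocs'


def build_cdoc_search_criteria(set_name: str = DEFAULT_SET_NAME) -> List[Dict[str, str]]:
    """Hypothetical helper: build the is_cdoc=true criteria for a given metadata set."""
    cleaned = (set_name or '').strip()
    if not cleaned:
        raise ValueError("set_name must be a non-empty string")
    return [
        {
            'set_name': cleaned,
            'attribute_name': 'is_cdoc',
            'value': 'true',
            'operator': 'equals',
        }
    ]
```

The returned list has the same structure as the search_criteria literal in the source, so it could be passed to the search call unchanged on an instance that uses a different set name.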
Similar Components
AI-powered semantic similarity - components with related functionality:
- function search_documents_in_filecloud 73.6% similar
- function get_document_metadata_from_filecloud 68.0% similar
- function import_document_from_filecloud 63.7% similar
- function upload_document_to_filecloud 63.0% similar
- function extract_metadata_from_filecloud 61.6% similar