
function import_document_from_filecloud

Maturity: 63

Imports a document from FileCloud into the system by extracting metadata, creating a controlled document record, downloading the file content, creating a document version, and uploading it back to FileCloud with proper folder structure.

File: /tf/active/vicechatdev/CDocs/FC_sync.py
Lines: 276 - 402
Complexity: complex

Purpose

This function is the main entry point for importing a document from FileCloud into the document management system. It runs the complete workflow in order: extract and normalize the FileCloud metadata, create a new controlled document record in the database, download the file content from FileCloud, create a document version from that content, set it as the current version, ensure the expected folder structure exists, and upload the document back to FileCloud with metadata calculated by the upload function. The duplicate checks by cdoc_uid and doc_number are present in the source but commented out, so no existing-document check is performed. The function is typically used during bulk imports or migrations from FileCloud storage.

Source Code

def import_document_from_filecloud(file_path: str, metadata: Dict, admin_user: DocUser) -> Optional[Dict]:
    """
    Import a document from FileCloud into the system.
    
    Args:
        file_path: Path to the document in FileCloud
        metadata: Document metadata
        admin_user: Admin user performing the import
        
    Returns:
        Optional[Dict]: Result dict on success, or None if a critical step fails
    """
    try:
        # Extract metadata
        core_file_path = "/".join(file_path.split("/")[:-1])
        normalized_metadata = extract_metadata_from_filecloud(metadata)
        
        # Set custom_path if available in original metadata or from core_file_path
        if "custom_path" in metadata:
            normalized_metadata["custom_path"] = metadata["custom_path"]
        elif core_file_path:
            normalized_metadata["custom_path"] = core_file_path
            
        logger.info(f"Normalized metadata: {normalized_metadata}")
        
        # Duplicate check by cdoc_uid (most reliable method); the check itself is currently disabled
        cdoc_uid = normalized_metadata.get("cdoc_uid")
        # if cdoc_uid:
        #     existing_doc = check_document_exists_by_uid(cdoc_uid)
        #     if existing_doc:
        #         logger.info(f"Document with UID {cdoc_uid} already exists in database")
        #         return {
        #             "success": False,
        #             "message": f"Document with UID {cdoc_uid} already exists",
        #             "document": existing_doc.to_dict()
        #         }
        
        # Fallback duplicate check by document number; the check itself is currently disabled
        doc_number = normalized_metadata.get("doc_number")
        # if doc_number:
        #     existing_doc = check_document_exists_by_doc_number(doc_number)
        #     if existing_doc:
        #         logger.info(f"Document {doc_number} already exists in database")
        #         return {
        #             "success": False,
        #             "message": f"Document {doc_number} already exists",
        #             "document": existing_doc.to_dict()
        #         }
        
        # Create new controlled document
        document = create_controlled_document(normalized_metadata, core_file_path, admin_user)
        if not document:
            return None
            
        # Download document content from FileCloud
        client = get_filecloud_client()
        file_content = client.download_file(file_path)
        
        if not isinstance(file_content, bytes):
            logger.error(f"Failed to download file from FileCloud: {file_path}")
            return None
            
        # Get filename from path
        filename = os.path.basename(file_path)
            
        # Create document version
        version_result = create_document_version(
            user=admin_user,
            document_uid=document.uid,
            file_content=file_content,
            file_name=filename,
            comment="Imported from FileCloud"
        )
        
        if not version_result:
            logger.error(f"Failed to create document version for {doc_number}")
            return None
            
        # Set document as the current version
        from CDocs.controllers.document_controller import set_current_version
        set_result = set_current_version(
            user=admin_user,
            document_uid=document.uid,
            version_uid=version_result.get("UID")
        )
        
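        if not set_result or not set_result.get("success"):
            # set_result may be None, so guard before reading its message
            set_message = set_result.get("message", "Unknown error") if set_result else "Unknown error"
            logger.warning(f"Failed to set current version for {doc_number}: {set_message}")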
            
        # Get custom path and ensure folder structure
        custom_path = normalized_metadata.get("custom_path")
        if custom_path:
            # Ensure we have the correct folder structure
            try:
                # Create expected folder structure
                ensure_document_folders(document)
            except FileCloudError as e:
                logger.warning(f"Error ensuring folder structure (non-critical): {e}")
        
        # Upload to FileCloud using the same logic as clone function
        from CDocs.controllers.filecloud_controller import upload_document_to_filecloud
        filecloud_result = upload_document_to_filecloud(
            user=admin_user,
            document=document,
            file_content=file_content,
            version_comment="Imported from FileCloud",
            metadata=None  # Let the upload function calculate metadata
        )
        
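        if not filecloud_result or not filecloud_result.get('success', False):
            # filecloud_result may be None, so guard before reading its message
            upload_message = filecloud_result.get('message', 'Unknown error') if filecloud_result else 'Unknown error'
            logger.warning(f"FileCloud upload warning: {upload_message}")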
        
        logger.info(f"Successfully imported document {doc_number} from {file_path}")
        
        return {
            "success": True,
            "message": f"Document {doc_number} imported successfully",
            "document_uid": document.uid,
            "version_uid": version_result.get("UID"),
            "doc_number": doc_number
        }
        
    except Exception as e:
        logger.error(f"Error importing document from FileCloud: {e}")
        import traceback
        logger.error(traceback.format_exc())
        return None

Parameters

Name        Type     Default  Kind
file_path   str      -        positional_or_keyword
metadata    Dict     -        positional_or_keyword
admin_user  DocUser  -        positional_or_keyword

Parameter Details

file_path: String representing the full path to the document in FileCloud storage (e.g., '/folder/subfolder/document.pdf'). The function extracts the directory path and filename from this parameter.

metadata: Dictionary containing document metadata from FileCloud. Expected keys include 'cdoc_uid' (unique document identifier), 'doc_number' (document number), 'custom_path' (optional custom folder path), and other document attributes. This metadata is normalized before creating the document record.

admin_user: DocUser object representing the administrator performing the import operation. This user must have appropriate permissions to create documents and versions. The user is associated with the document creation and version history.
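
The path handling described for file_path and the custom_path fallback described for metadata can be sketched in isolation. This is a minimal illustration that reuses only the two expressions visible in the source ("/".join(file_path.split("/")[:-1]) and os.path.basename). The helper names split_filecloud_path and resolve_custom_path are hypothetical and do not exist in FC_sync.py.

```python
import os
from typing import Dict, Optional, Tuple


def split_filecloud_path(file_path: str) -> Tuple[str, str]:
    """Return (directory, filename) using the same expressions as the source."""
    core_file_path = "/".join(file_path.split("/")[:-1])
    filename = os.path.basename(file_path)
    return core_file_path, filename


def resolve_custom_path(metadata: Dict, file_path: str) -> Optional[str]:
    """Prefer metadata['custom_path']; otherwise fall back to the file's directory.

    Hypothetical helper: returns None when neither source yields a path.
    """
    if "custom_path" in metadata:
        return metadata["custom_path"]
    core_file_path, _ = split_filecloud_path(file_path)
    return core_file_path or None
```

Note that the real function writes the chosen value into normalized_metadata rather than returning it, so any custom_path already produced by extract_metadata_from_filecloud would remain in place when neither branch applies.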

Return Value

Type: Optional[Dict]

On success, returns a dictionary with the keys 'success' (True), 'message' (success message), 'document_uid' (UID of the created document), 'version_uid' (UID of the created version), and 'doc_number' (document number). Returns None when document creation, the file download, or version creation fails, or when any exception is raised. A failure to set the current version or to upload back to FileCloud is only logged as a warning and does not change the return value. The commented-out duplicate checks would additionally return a dictionary with 'success' (False), 'message' (error description), and 'document' (the existing document as a dict) if they were re-enabled.
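
Because the return value has three possible shapes (a success dict, None, and a failure dict if the duplicate checks are ever re-enabled), callers may want to branch on all of them. The following is a minimal sketch under that assumption. The helper name describe_import_result is hypothetical and not part of the module.

```python
from typing import Dict, Optional


def describe_import_result(result: Optional[Dict]) -> str:
    """Turn the function's return value into a one-line status string."""
    if result is None:
        # Current behavior: every critical failure path returns None
        return "failed: see logs for details"
    if result.get("success"):
        return (
            f"imported {result.get('doc_number')} "
            f"(document_uid={result.get('document_uid')})"
        )
    # Only reachable if the commented-out duplicate checks are re-enabled
    return f"skipped: {result.get('message', 'no message')}"
```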

Dependencies

  • os
  • sys
  • logging
  • tempfile
  • uuid
  • io
  • typing
  • datetime
  • traceback
  • CDocs.db.db_operations
  • CDocs.models.document
  • CDocs.models.user_extensions
  • CDocs.controllers.filecloud_controller
  • CDocs.controllers.document_controller
  • CDocs.config
  • FC_api
  • metadata_catalog

Required Imports

import os
import logging
from typing import Dict, Optional
from CDocs.models.user_extensions import DocUser
from CDocs.controllers.filecloud_controller import get_filecloud_client, upload_document_to_filecloud, ensure_document_folders, FileCloudError
from CDocs.controllers.document_controller import create_document_version, set_current_version
import traceback

Conditional/Optional Imports

These imports are performed inline within the function body and are only needed when the corresponding step runs:

from CDocs.controllers.document_controller import set_current_version

Condition: imported inline within the function when setting the current version of the document. Required (conditional).

from CDocs.controllers.filecloud_controller import upload_document_to_filecloud

Condition: imported inline within the function when uploading to FileCloud. Required (conditional).

Usage Example

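from typing import Dict
from CDocs.models.user_extensions import DocUser
# Assumption: FC_sync.py is importable as CDocs.FC_sync
from CDocs.FC_sync import import_document_from_filecloud

# Assume admin_user is an already-authenticated DocUser.
# The lookup below is illustrative only; obtain the DocUser however your deployment does.
admin_user = DocUser.query.filter_by(username='admin').first()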

# Metadata from FileCloud
metadata = {
    'cdoc_uid': 'DOC-12345-UID',
    'doc_number': 'DOC-12345',
    'title': 'Engineering Specification',
    'custom_path': '/Engineering/Specifications',
    'author': 'John Doe',
    'revision': 'A'
}

# Import document from FileCloud
file_path = '/Engineering/Specifications/DOC-12345.pdf'
result = import_document_from_filecloud(file_path, metadata, admin_user)

if result and result.get('success'):
    print(f"Document imported successfully: {result['doc_number']}")
    print(f"Document UID: {result['document_uid']}")
    print(f"Version UID: {result['version_uid']}")
else:
    print("Failed to import document")

Best Practices

  • Ensure the admin_user has appropriate permissions before calling this function
  • The metadata dictionary should contain at least 'doc_number' and preferably 'cdoc_uid' for proper document identification
  • The function includes commented-out duplicate checking logic that may need to be enabled based on business requirements
  • Handle the None return value explicitly: it means document creation, the download, or version creation failed, or an exception was raised
  • The function performs multiple operations (database creation, file download, version creation, file upload) without rollback, so a failure after the document record is created can leave a record with no version - consider implementing transaction rollback on failure
  • Monitor logs for warnings: the function still returns a success result when setting the current version or uploading back to FileCloud fails
  • Ensure extract_metadata_from_filecloud, create_controlled_document, and ensure_document_folders functions are properly defined in the module
  • The function downloads file content into memory - be cautious with large files that may cause memory issues
  • Custom paths in metadata are used to organize documents in FileCloud folder structure
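
For bulk runs, the advice above about handling None returns can be sketched as a small batch wrapper that records which paths failed. This is a minimal illustration, not code from FC_sync.py. The names import_batch and import_fn are hypothetical, and the import function is passed in as a parameter so the sketch stays self-contained.

```python
from typing import Any, Callable, Dict, Iterable, List, Optional, Tuple

# Signature shared by import_document_from_filecloud and any test double
ImportFn = Callable[[str, Dict, Any], Optional[Dict]]


def import_batch(
    items: Iterable[Tuple[str, Dict]],
    admin_user: Any,
    import_fn: ImportFn,
) -> Dict[str, List[str]]:
    """Import each (file_path, metadata) pair and collect outcomes by path."""
    summary: Dict[str, List[str]] = {"imported": [], "failed": []}
    for file_path, metadata in items:
        result = import_fn(file_path, metadata, admin_user)
        if result and result.get("success"):
            summary["imported"].append(file_path)
        else:
            # None or a non-success dict: keep the path for a retry or review
            summary["failed"].append(file_path)
    return summary
```

In real use, import_fn would be import_document_from_filecloud. Paths listed under "imported" may still have logged upload warnings, so the logs remain worth checking.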

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function upload_document_to_filecloud 77.6% similar

    Uploads a document version to FileCloud storage system with metadata, handling file creation, folder structure, and audit logging.

    From: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
  • function extract_metadata_from_filecloud 72.2% similar

    Extracts and normalizes metadata from FileCloud for document creation, providing default values and generating document numbers when needed.

    From: /tf/active/vicechatdev/CDocs/FC_sync.py
  • function get_document_metadata_from_filecloud 67.7% similar

    Retrieves metadata for a specific document (and optionally a specific version) from FileCloud storage system.

    From: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
  • function update_document_metadata_in_filecloud 67.5% similar

    Updates metadata for a document stored in FileCloud, merging new metadata with existing values and logging the update in an audit trail.

    From: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
  • function main_v6 65.6% similar

    Main execution function that orchestrates the import of controlled documents from FileCloud into a Neo4j database, checking for duplicates and managing document metadata.

    From: /tf/active/vicechatdev/CDocs/FC_sync.py