function extract_metadata_from_filecloud

Maturity: 54

Extracts and normalizes metadata from FileCloud for document creation, providing default values and generating document numbers when needed.

File: /tf/active/vicechatdev/CDocs/FC_sync.py
Lines: 140 - 192
Complexity: moderate

Purpose

This function is a normalization layer between FileCloud metadata and the internal document management system. It fills in defaults for missing fields, generates missing identifiers (doc_uid and the document number), and returns a fixed set of keys for creating a controlled document. Missing keys always receive a default; empty strings are replaced only for doc_uid, status, revision, and doc_number.

Source Code

def extract_metadata_from_filecloud(metadata: Dict) -> Dict:
    """
    Extract relevant metadata for document creation.
    
    Args:
        metadata: Metadata from FileCloud
        
    Returns:
        Dict: Normalized metadata for document creation
    """

    # A missing key and an empty string are treated alike: fall back to a default.
    doc_uid = metadata.get("doc_uid", "")
    if doc_uid == "":
        doc_uid = str(uuid.uuid4())

    status = metadata.get("status", "")
    if status == "":
        status = 'DRAFT'

    revision = metadata.get("revision", "")
    if revision == "":
        revision = '0.0'
    
    # Initialize with default values
    extracted = {
        "doc_uid": doc_uid,
        "title": metadata.get("title", "Untitled Document"),
        "doc_number": metadata.get("doc_number", ""),
        "doc_type": metadata.get("doc_type", "SOP"),
        "department": metadata.get("department", "QA"),
        "status": status,
        "owner": metadata.get("owner", ""),
        "revision": revision,
        "cdoc_uid": metadata.get("cdoc_uid", ""),
        "custom_path": ''
    }
    
    # If no doc number provided, generate one from other metadata
    if not extracted["doc_number"] and extracted["doc_type"] and extracted["department"]:
        try:
            extracted["doc_number"] = settings.generate_document_number(
                extracted["doc_type"], 
                extracted["department"]
            )
        except Exception:
            # Fall back to "<doc_type>-<department>-<6 hex chars>" if generation fails
            extracted["doc_number"] = f"{extracted['doc_type']}-{extracted['department']}-{uuid.uuid4().hex[:6]}"
    
    return extracted
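
The generate-then-fall-back step can be looked at in isolation. The sketch below is a minimal illustration, not code from CDocs: number_with_fallback and broken_generator are hypothetical names, and the generator is passed in as a parameter so the fallback path can be exercised without CDocs.config.settings.

```python
import uuid
from typing import Callable


def number_with_fallback(
    doc_type: str,
    department: str,
    generator: Callable[[str, str], str],
) -> str:
    """Return the generator's number, or a '<type>-<dept>-<6 hex>' fallback."""
    try:
        return generator(doc_type, department)
    except Exception:
        # Same fallback shape as in the function above: six hex characters
        # from a fresh UUID4. The suffix is random, so collisions are
        # unlikely but not impossible.
        return f"{doc_type}-{department}-{uuid.uuid4().hex[:6]}"


def broken_generator(doc_type: str, department: str) -> str:
    # Hypothetical stand-in for a misconfigured settings.generate_document_number
    raise RuntimeError("numbering not configured")


fallback_number = number_with_fallback("SOP", "QA", broken_generator)
print(fallback_number)  # e.g. SOP-QA-3f9c1a (the suffix is random)
```

Because the fallback is silent, a caller that needs to know whether the configured numbering scheme was used has to inspect the returned number itself.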

Parameters

Name      Type  Default  Kind
metadata  Dict  -        positional_or_keyword

Parameter Details

metadata: Dictionary containing metadata from FileCloud. Expected keys include: 'doc_uid' (document unique identifier), 'title' (document title), 'doc_number' (document number), 'doc_type' (document type like 'SOP'), 'department' (department code like 'QA'), 'status' (document status), 'owner' (document owner), 'revision' (version number), 'cdoc_uid' (controlled document UID). All keys are optional; the function provides defaults for missing values.
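
The expected input keys can also be written down as a type. The following is a sketch only: FileCloudMetadata is a hypothetical name that does not appear in the source, and the str annotations are an assumption, since the function itself accepts a plain Dict and does not check value types.

```python
from typing import TypedDict


class FileCloudMetadata(TypedDict, total=False):
    """Hypothetical description of the keys the function reads.

    total=False marks every key as optional, matching the text above.
    """
    doc_uid: str
    title: str
    doc_number: str
    doc_type: str
    department: str
    status: str
    owner: str
    revision: str
    cdoc_uid: str


# A partial dictionary is valid because every key is optional.
partial: FileCloudMetadata = {"title": "New Document", "department": "QA"}
print(sorted(partial))  # ['department', 'title']
```

Note that custom_path is not listed: the function never reads it from the input.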

Return Value

Type: Dict

Returns a dictionary with ten keys: 'doc_uid' (UUID string, generated if missing or empty), 'title' (string, defaults to 'Untitled Document'), 'doc_number' (string, generated if missing or empty), 'doc_type' (string, defaults to 'SOP'), 'department' (string, defaults to 'QA'), 'status' (string, defaults to 'DRAFT'), 'owner' (string, empty if not provided), 'revision' (string, defaults to '0.0'), 'cdoc_uid' (string, empty if not provided), 'custom_path' (string, always empty). All ten keys are always present. Values taken from the input are passed through unchanged, so a None in the input is returned as None, and an empty 'title', 'doc_type', 'department', 'owner', or 'cdoc_uid' stays empty. 'doc_number' remains empty if 'doc_type' or 'department' is explicitly empty.
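
The difference between an absent key and a key that holds an empty string or None follows from how dict.get behaves. The snippet below is a standalone illustration using only built-in dictionaries; the variable names are illustrative and nothing here calls CDocs.

```python
# dict.get(key, default) uses the default only when the key is absent.
absent = {}
empty = {"title": "", "status": ""}
null = {"title": None}

title_absent = absent.get("title", "Untitled Document")
title_empty = empty.get("title", "Untitled Document")
title_null = null.get("title", "Untitled Document")

print(repr(title_absent))  # 'Untitled Document'
print(repr(title_empty))   # ''
print(repr(title_null))    # None

# The explicit == "" comparison used for doc_uid, status, and revision
# catches the empty string (but would not catch None).
status_value = empty.get("status", "")
status = "DRAFT" if status_value == "" else status_value
print(status)  # DRAFT
```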

Dependencies

  • uuid
  • CDocs.config.settings

Required Imports

import uuid
from typing import Dict
from CDocs.config import settings

Usage Example

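from typing import Dict
import uuid
from CDocs.config import settings
# Module path inferred from the file location above; adjust to your layout
from CDocs.FC_sync import extract_metadata_from_filecloud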

# Example 1: Complete metadata
filecloud_metadata = {
    'doc_uid': 'abc-123',
    'title': 'Quality Assurance Procedure',
    'doc_number': 'SOP-QA-001',
    'doc_type': 'SOP',
    'department': 'QA',
    'status': 'APPROVED',
    'owner': 'john.doe@example.com',
    'revision': '1.0',
    'cdoc_uid': 'cdoc-456'
}

result = extract_metadata_from_filecloud(filecloud_metadata)
print(result['doc_number'])  # 'SOP-QA-001'

# Example 2: Minimal metadata with defaults
minimal_metadata = {
    'title': 'New Document'
}

result = extract_metadata_from_filecloud(minimal_metadata)
print(result['status'])  # 'DRAFT'
print(result['revision'])  # '0.0'
print(result['doc_type'])  # 'SOP'

# Example 3: Empty metadata requiring document number generation
empty_metadata = {}

result = extract_metadata_from_filecloud(empty_metadata)
print(result['title'])  # 'Untitled Document'
print(result['doc_number'])  # From settings.generate_document_number; the fallback looks like 'SOP-QA-a1b2c3'

Best Practices

  • Always pass a dictionary as metadata, even an empty one; passing None raises AttributeError at the first metadata.get call
  • Ensure the settings.generate_document_number function is properly configured before relying on automatic document number generation
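  • Document number generation is wrapped in a try-except, so an error raised by settings.generate_document_number does not propagate; the function silently substitutes a fallback number of the form '<doc_type>-<department>-<6 hex characters>', which is worth checking for if settings may be misconfigured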
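  • Empty strings are treated as missing only for doc_uid, status, revision, and doc_number; for title, doc_type, department, owner, and cdoc_uid, an empty string in the input is returned unchanged, and the default applies only when the key is absent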
  • A fresh UUID is generated on every call when doc_uid is missing or empty, so repeated calls with the same incomplete metadata return a different doc_uid each time and can lead to duplicate document records; persist the generated doc_uid if the call may be repeated
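  • Document numbers are auto-generated only when doc_number is missing or empty and both doc_type and department are non-empty (from the input or via defaults); an explicitly empty doc_type or department leaves doc_number empty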
  • The custom_path field is always set to empty string and cannot be overridden through input metadata
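
Several of the points above concern blank input values. One way to make the defaults apply uniformly is to strip blank values before calling the function. The helper below is a hypothetical sketch (drop_blank_values is not part of CDocs) and assumes that None, empty strings, and whitespace-only strings should all count as missing.

```python
from typing import Any, Dict, Optional


def drop_blank_values(metadata: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy without None, empty, or whitespace-only string values."""
    if metadata is None:
        # Guard against callers that pass None instead of a dictionary.
        return {}
    cleaned: Dict[str, Any] = {}
    for key, value in metadata.items():
        if value is None:
            continue
        if isinstance(value, str) and value.strip() == "":
            continue
        cleaned[key] = value
    return cleaned


raw = {"title": "", "doc_type": None, "department": "  ", "owner": "jane.doe@example.com"}
print(drop_blank_values(raw))   # {'owner': 'jane.doe@example.com'}
print(drop_blank_values(None))  # {}
```

Calling extract_metadata_from_filecloud(drop_blank_values(raw)) would then apply the title, doc_type, and department defaults as if those keys had been absent. Whether whitespace-only values should be dropped is a design choice that depends on how the FileCloud metadata is produced.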

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function import_document_from_filecloud 72.2% similar

    Imports a document from FileCloud into the system by extracting metadata, creating a controlled document record, downloading the file content, creating a document version, and uploading it back to FileCloud with proper folder structure.

    From: /tf/active/vicechatdev/CDocs/FC_sync.py
  • function get_document_metadata_from_filecloud 67.0% similar

    Retrieves metadata for a specific document (and optionally a specific version) from FileCloud storage system.

    From: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
  • function update_document_metadata_in_filecloud 66.6% similar

    Updates metadata for a document stored in FileCloud, merging new metadata with existing values and logging the update in an audit trail.

    From: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
  • function upload_document_to_filecloud 65.2% similar

    Uploads a document version to FileCloud storage system with metadata, handling file creation, folder structure, and audit logging.

    From: /tf/active/vicechatdev/CDocs/controllers/filecloud_controller.py
  • function search_filecloud_for_documents 61.6% similar

    Searches FileCloud for documents that have the 'is_cdoc' metadata flag set to true, retrieving their file paths and associated metadata attributes.

    From: /tf/active/vicechatdev/CDocs/FC_sync.py