function main_v98

Maturity: 38

Command-line application that uploads PDF files lacking WUXI coding from a local directory to a FileCloud server, with a dry-run mode and a configurable file pattern.

File: /tf/active/vicechatdev/mailsearch/upload_non_wuxi_coded.py
Lines: 157 - 246
Complexity: moderate

Purpose

This is the main entry point for a file upload utility. It collects the files in a source directory that match a glob pattern (PDFs by default), keeps those whose names has_wuxi_coding() does not flag as WUXI-coded, and uploads them to a specific FileCloud folder. It's designed for document management workflows where files need to be categorized and uploaded to shared cloud storage. The function handles argument parsing, file filtering, FileCloud login, and batch uploading, and reports progress followed by a success/error summary.
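The filtering step can be sketched in isolation. This is a minimal sketch, not the script's own code: select_uncoded_files and looks_coded are hypothetical names, and the "WX-" prefix rule is only a stand-in, since the real has_wuxi_coding() rule is not shown on this page.

```python
from pathlib import Path
from typing import Callable, List


def select_uncoded_files(source: str, pattern: str,
                         is_coded: Callable[[str], bool]) -> List[Path]:
    """Return files in `source` matching `pattern` whose names are not coded."""
    # Path.glob with a pattern such as '*.pdf' is non-recursive, as in main().
    candidates = Path(source).glob(pattern)
    # The predicate receives only the file name, mirroring has_wuxi_coding(f.name).
    return sorted(p for p in candidates if not is_coded(p.name))


def looks_coded(name: str) -> bool:
    # Hypothetical stand-in for has_wuxi_coding(); the prefix rule is an assumption.
    return name.upper().startswith("WX-")
```

Passing the predicate as a parameter keeps the selection logic testable without knowing the actual coding convention.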

Source Code

def main():
    parser = argparse.ArgumentParser(
        description="Upload non-WUXI coded files from output folder to FileCloud"
    )
    
    parser.add_argument(
        '--source',
        default='./output',
        help='Source directory (default: ./output)'
    )
    
    parser.add_argument(
        '--target',
        default='/SHARED/vicebio_shares/03_CMC/e-sign - document to approve/Extract docusign - not Wuxi coded',
        help='Target folder in FileCloud'
    )
    
    parser.add_argument(
        '--dry-run',
        action='store_true',
        help='Show what would be uploaded without actually uploading'
    )
    
    parser.add_argument(
        '--pattern',
        default='*.pdf',
        help='File pattern to match (default: *.pdf)'
    )
    
    args = parser.parse_args()
    
    # Setup timezone
    cet_timezone = ZoneInfo("Europe/Brussels")
    
    # Find all PDF files without WUXI coding
    source_path = Path(args.source)
    all_files = list(source_path.glob(args.pattern))
    non_wuxi_files = [f for f in all_files if not has_wuxi_coding(f.name)]
    
    print(f"Found {len(all_files)} total files")
    print(f"Filtered to {len(non_wuxi_files)} files without WUXI coding")
    print(f"Target folder: {args.target}")
    print("=" * 80)
    
    if not non_wuxi_files:
        print("No files to upload")
        return
    
    if args.dry_run:
        print("\nDRY RUN MODE - No files will be uploaded")
        print("=" * 80)
        for file_path in sorted(non_wuxi_files):
            print(f"\n{file_path.name}")
            print(f"  → Would upload to: {args.target}/{file_path.name}")
        return
    
    # Login to FileCloud
    print("\nLogging in to FileCloud...")
    Headers = {'Accept': 'application/json'}
    Creds = {'userid': 'wim@vicebio.com', 'password': 'Studico01!'}
    ServerURL = 'https://filecloud.vicebio.com/'
    LoginEndPoint = 'core/loginguest'
    
    s = requests.session()
    LoginCall = s.post(ServerURL + LoginEndPoint, data=Creds, headers=Headers).json()
    print("✓ Logged in successfully")
    print("=" * 80)
    
    # Upload files
    success_count = 0
    error_count = 0
    
    for file_path in sorted(non_wuxi_files):
        try:
            if upload_file_to_filecloud(str(file_path), args.target, s, cet_timezone, args.dry_run):
                success_count += 1
            else:
                error_count += 1
        except Exception as e:
            print(f"\n{file_path.name}")
            print(f"  ✗ Error: {e}")
            error_count += 1
    
    # Summary
    print("\n" + "=" * 80)
    print("SUMMARY")
    print("=" * 80)
    print(f"Total files: {len(non_wuxi_files)}")
    print(f"Successful: {success_count}")
    print(f"Errors: {error_count}")

Return Value

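Returns None implicitly. The function works through side effects (file uploads and console output). It returns early when no files remain after filtering, or after listing the planned uploads in dry-run mode. Per-file upload errors are caught, counted, and reported in the summary rather than re-raised.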
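If a caller such as a scheduler needs to know whether any upload failed, one option is to return the counts instead of None. The following is a sketch under that assumption only: UploadSummary is a hypothetical type that does not exist in the script, and main() would have to be changed to build and return it.

```python
from dataclasses import dataclass


@dataclass
class UploadSummary:
    """Hypothetical result object; the documented main() returns None."""
    total: int
    succeeded: int
    failed: int

    @property
    def exit_code(self) -> int:
        # Conventional process status: 0 for success, non-zero for any failure.
        return 0 if self.failed == 0 else 1
```

With such a change, the entry point could end with sys.exit(main().exit_code).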

Dependencies

  • argparse
  • pathlib
  • requests
  • xmltodict
  • datetime
  • zoneinfo
  • os
  • re

Required Imports

import argparse
from pathlib import Path
import requests
import xmltodict
from datetime import datetime
from zoneinfo import ZoneInfo
import os
import re

Usage Example

# Run with default settings
if __name__ == '__main__':
    main()

# Command-line usage examples:
# python upload_non_wuxi_coded.py
# python upload_non_wuxi_coded.py --source ./my_pdfs --pattern '*.pdf'
# python upload_non_wuxi_coded.py --dry-run
# python upload_non_wuxi_coded.py --target '/SHARED/custom_folder' --source ./docs
# python upload_non_wuxi_coded.py --pattern '*.docx' --dry-run

Best Practices

  • SECURITY WARNING: Credentials are hardcoded in the source code. Use environment variables or secure credential management instead.
  • The function depends on external functions 'has_wuxi_coding()' and 'upload_file_to_filecloud()' which must be defined in the same module.
  • Use --dry-run flag first to verify which files will be uploaded before performing actual uploads.
  • Ensure the source directory exists and contains files matching the pattern before running.
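  • The function reuses a single requests session for all FileCloud API calls so that authentication persists across uploads. The login response is stored but never inspected, so the success message does not confirm that authentication actually succeeded.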
  • Error handling is implemented per-file, so one failure won't stop the entire batch.
  • The timezone is set to Europe/Brussels (CET/CEST) - adjust if needed for different regions.
  • Consider implementing retry logic for network failures in production use.
  • The function prints progress to stdout - redirect or capture if logging to file is needed.
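
Two of the points above, loading credentials from the environment and retrying on network failures, can be sketched as follows. This is an illustration under stated assumptions, not the script's implementation: load_credentials and with_retries are hypothetical helpers, and FILECLOUD_USER and FILECLOUD_PASSWORD are assumed variable names.

```python
import os
import time
from typing import Callable, Dict, Mapping, Tuple, Type, TypeVar

T = TypeVar("T")


def load_credentials(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Build the login payload from environment variables (assumed names)."""
    try:
        return {"userid": env["FILECLOUD_USER"],
                "password": env["FILECLOUD_PASSWORD"]}
    except KeyError as missing:
        raise RuntimeError(
            f"Missing environment variable: {missing.args[0]}") from None


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
) -> T:
    """Call `action`, retrying with exponential backoff on `retry_on` errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            time.sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

The default retry_on tuple uses built-in exception types. When the wrapped call goes through requests, pass retry_on=(requests.exceptions.ConnectionError, requests.exceptions.Timeout) instead, since those classes do not derive from the built-ins. An upload could then be wrapped as with_retries(lambda: upload_file_to_filecloud(...)), keeping the per-file try/except in main() as the final fallback.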

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function main_v22 66.9% similar

    Command-line entry point for a reMarkable PDF upload tool that handles argument parsing, folder listing, and PDF document uploads to a reMarkable device.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/upload_pdf_new.py

  • function main_v39 64.7% similar

    Main entry point function that orchestrates a file synchronization process from a FileCloud source to a local directory, with progress reporting and error handling.

    From: /tf/active/vicechatdev/UQchat/download_uq_files.py

  • function main_v77 64.5% similar

    Executes a dry run comparison analysis of PDF upload requests between a simulated implementation and a real application, without making actual API calls.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/dry_run_comparison.py

  • function main_v26 63.9% similar

    Command-line test function that uploads a PDF document to a reMarkable device, with optional parent folder specification via command-line argument.

    From: /tf/active/vicechatdev/e-ink-llm/cloudtest/final_uploads.py

  • function main_v1 63.2% similar

    Main execution function that processes and copies document files from an output directory to target folders based on document codes, with support for dry-run and test modes.

    From: /tf/active/vicechatdev/mailsearch/copy_signed_documents.py