🔍 Code Extractor

function sync_directory

Maturity: 48

Recursively synchronizes a directory from a FileCloud remote server to a local filesystem, downloading new or modified files and creating directory structures as needed.

File: /tf/active/vicechatdev/UQchat/download_uq_files.py
Lines: 110 - 167
Complexity: moderate

Purpose

This function performs a one-way synchronization from a FileCloud server to a local path. It traverses the remote directory tree recursively, creates matching local directories, and downloads files that are missing locally or whose remote modification timestamp differs from the local one. Files whose timestamps match are skipped, which makes the function suitable for incremental backups. Local files that no longer exist on the server are not deleted.

Source Code

def sync_directory(session, remote_path, local_path):
    """Recursively sync a directory from FileCloud"""
    print(f"\n📂 {remote_path}")
    
    # Get list of entries in this directory
    entries = get_file_list(session, remote_path)
    
    if not entries:
        print("   (empty)")
        return
    
    print(f"   Found {len(entries)} items")
    
    for entry in entries:
        entry_name = entry.get('name')
        entry_type = entry.get('type')
        
        if not entry_name:
            continue
        
        # Build remote and local paths
        remote_entry_path = f"{remote_path}/{entry_name}"
        local_entry_path = local_path / entry_name
        
        # Check if it's a directory (type can be 'dir', '0', or 'directory')
        if entry_type in ['dir', '0', 'directory']:
            print(f"   📁 {entry_name}/")
            # Create local directory
            local_entry_path.mkdir(parents=True, exist_ok=True)
            # Recursively sync subdirectory
            sync_directory(session, remote_entry_path, local_entry_path)
            
        # Check if it's a file (type can be 'file', '1', or missing)
        elif entry_type in ['file', '1'] or entry_type is None:
            # Check if file already exists locally
            if local_entry_path.exists():
                # Compare modification times
                try:
                    remote_modified = entry.get('modifiediso', '')
                    local_modified = datetime.fromtimestamp(
                        local_entry_path.stat().st_mtime,
                        tz=cet_timezone
                    ).isoformat(timespec='seconds')
                    
                    if remote_modified.split('+')[0] == local_modified.split('+')[0]:
                        print(f"   ⊘ {entry_name} (already up to date)")
                        continue
                    else:
                        print(f"   ↻ {entry_name} (updating)")
                except Exception:
                    # Timestamp missing or unreadable: download again to be safe
                    print(f"   ↓ {entry_name} (re-downloading)")
            else:
                print(f"   ↓ {entry_name}")
            
            # Download the file
            download_file(session, remote_entry_path, local_entry_path)
        else:
            print(f"   ⚠ Unknown type '{entry_type}' for: {entry_name}")

Parameters

Name         Type  Default  Kind
session      -     -        positional_or_keyword
remote_path  -     -        positional_or_keyword
local_path   -     -        positional_or_keyword

Parameter Details

session: An authenticated requests.Session object or similar session object that maintains connection state and authentication credentials for making API calls to the FileCloud server. This session should already be authenticated before calling this function.

remote_path: A string representing the path to the directory on the FileCloud server to sync from. Should be in the format expected by the FileCloud API (e.g., '/path/to/directory'). This is the source directory that will be mirrored locally.

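local_path: A pathlib.Path object representing the local directory that receives the remote contents. The function creates subdirectories beneath it as needed but does not create local_path itself, so the caller should make sure it exists before the first call (as the usage example does). Existing local files are downloaded again when their timestamps differ from the remote ones; nothing is deleted locally.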
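
The way the two path parameters are extended for each entry can be illustrated in isolation. The sketch below uses a hypothetical helper, build_entry_paths, that is not part of the original module. It differs from the source in two labeled ways: it strips a trailing slash from remote_path (the original f-string would produce '//' in that case), and it wraps local_path in Path() so a plain string is also accepted.

```python
from pathlib import Path


def build_entry_paths(remote_path, local_path, entry_name):
    """Hypothetical helper mirroring how sync_directory derives child paths."""
    # Remote paths are plain strings joined with '/'. Stripping a trailing
    # slash (an addition, not in the original) avoids '//' in the result.
    remote_entry_path = f"{remote_path.rstrip('/')}/{entry_name}"
    # Local paths use pathlib's '/' operator, which needs a Path on the left;
    # Path() here also tolerates a string argument (again an addition).
    local_entry_path = Path(local_path) / entry_name
    return remote_entry_path, local_entry_path


remote, local = build_entry_paths("/shared/documents/", "backup", "report.pdf")
print(remote)  # remote path as a string
print(local)   # local path as a Path
```

In the original function a string passed as local_path fails at `local_path / entry_name` with a TypeError, which is why the parameter is documented as a pathlib.Path.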

Return Value

This function does not return any value (returns None implicitly). It performs side effects by creating directories, downloading files to the local filesystem, and printing status messages to stdout. The function operates through recursive calls and file I/O operations.
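
Because the only observable output besides the filesystem changes is text on stdout, a caller that wants a log can capture it. The following is a minimal sketch using the standard library's contextlib.redirect_stdout. The names run_captured and demo_sync are hypothetical; demo_sync merely stands in for sync_directory so the block runs on its own.

```python
import io
from contextlib import redirect_stdout


def run_captured(func, *args, **kwargs):
    """Call func and return (its result, everything it printed to stdout)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


def demo_sync(remote_path):
    # Stand-in for sync_directory: prints status lines and returns None.
    print(f"\n{remote_path}")
    print("   Found 2 items")


result, log_text = run_captured(demo_sync, "/shared/documents")
# result is None, as with sync_directory; log_text holds the printed lines.
```

With the real function the call would be run_captured(sync_directory, session, remote_directory, local_directory); the captured text could then be written to a log file.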

Dependencies

  • requests
  • xmltodict
  • pathlib
  • datetime
  • zoneinfo

Required Imports

import os
import requests
import xmltodict
from pathlib import Path
from datetime import datetime
from zoneinfo import ZoneInfo

Usage Example

import requests
from pathlib import Path
from datetime import datetime
from zoneinfo import ZoneInfo
import xmltodict

# Define timezone
cet_timezone = ZoneInfo('CET')

# Assume get_file_list and download_file functions are defined
# Create authenticated session
session = requests.Session()
session.auth = ('username', 'password')
session.headers.update({'User-Agent': 'FileCloudSync/1.0'})

# Define paths
remote_directory = '/shared/documents'
local_directory = Path('./local_backup/documents')

# Ensure local directory exists
local_directory.mkdir(parents=True, exist_ok=True)

# Sync the directory
sync_directory(session, remote_directory, local_directory)
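
The example above assumes get_file_list and download_file already exist. For a dry run without a server, in-memory stand-ins like the ones below could be used. This is only a sketch of the contract that sync_directory appears to rely on (a list of dicts with 'name', 'type' and 'modifiediso' keys, and a download helper that writes to a Path). FAKE_REMOTE, the entry values and the timestamp format are illustrative assumptions, not data from the real API.

```python
from pathlib import Path

# Hypothetical in-memory "remote": directory path -> list of entry dicts.
FAKE_REMOTE = {
    "/shared/documents": [
        {"name": "reports", "type": "dir"},
        {"name": "readme.txt", "type": "file",
         "modifiediso": "2024-01-15T10:30:00+0100"},  # assumed format
    ],
    "/shared/documents/reports": [
        {"name": "q1.txt", "type": "file",
         "modifiediso": "2024-02-01T08:00:00+0100"},  # assumed format
    ],
}


def get_file_list(session, remote_path):
    """Stand-in: return the entries of remote_path, or [] if unknown."""
    return FAKE_REMOTE.get(remote_path, [])


def download_file(session, remote_path, local_path):
    """Stand-in: write placeholder content instead of fetching from a server."""
    local_path.parent.mkdir(parents=True, exist_ok=True)
    local_path.write_text(f"contents of {remote_path}\n")
```

For sync_directory to pick these up, they would need to be defined in (or patched into) the module that defines sync_directory, since it looks the helpers up as module-level names. The session argument is ignored by the stand-ins, so None can be passed.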

Best Practices

  • Ensure the session object is properly authenticated before calling this function to avoid authentication errors during sync
  • The local_path should be a pathlib.Path object, not a string, for proper path handling across different operating systems
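  • The function relies on the helper functions get_file_list and download_file, which must be defined at module level in the module that defines sync_directory (or imported into it)
  • The cet_timezone variable must likewise be a module-level global in that module (for example ZoneInfo('CET')); defining it only in the caller's local scope is not enough, because Python resolves names lexically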
  • Consider implementing error handling around the recursive calls to prevent complete failure if one subdirectory fails
  • The function prints status messages to stdout; consider redirecting or capturing output if running in a non-interactive environment
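  • Be aware that the function compares timestamps as strings after cutting each at the first '+', which removes only a positive UTC offset; a negative offset or a trailing 'Z' stays in the string, and timestamps that denote the same instant in different zones compare as unequal, so such files are always re-downloaded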
  • The function creates directories with parents=True and exist_ok=True, so it's safe to call on existing directory structures
  • File type detection accepts several values ('dir', '0', 'directory' for directories and 'file', '1' for files), presumably to accommodate differing FileCloud API responses; an entry with no type at all is treated as a file
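
The timestamp caveat above could be addressed by comparing parsed, timezone-aware datetimes instead of strings. The sketch below is one possible approach, not the original author's method. The function name is_up_to_date and the one-second tolerance are assumptions, and the format string assumes the remote value looks like '2024-01-15T10:30:00+0100' (the `%z` directive also accepts '+01:00' and 'Z').

```python
from datetime import datetime, timezone


def is_up_to_date(remote_iso, local_mtime, tolerance_seconds=1.0):
    """Return True if the remote timestamp and local mtime are the same instant.

    remote_iso  -- ISO-like string with a UTC offset (assumed format)
    local_mtime -- POSIX timestamp, e.g. from Path.stat().st_mtime
    """
    try:
        remote_dt = datetime.strptime(remote_iso, "%Y-%m-%dT%H:%M:%S%z")
    except (TypeError, ValueError):
        # Missing or unparseable value: report "stale" so the file is fetched.
        return False
    local_dt = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    # Both values are timezone-aware, so subtraction compares instants,
    # independent of the offset each one is expressed in.
    return abs((remote_dt - local_dt).total_seconds()) <= tolerance_seconds


mtime = datetime(2024, 1, 15, 9, 30, 0, tzinfo=timezone.utc).timestamp()
same = is_up_to_date("2024-01-15T10:30:00+0100", mtime)
```

Either form of the comparison can only report a match if download_file stamps the local file with the remote modification time (for example via os.utime); whether it does is not shown in this section.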

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function create_folder 67.8% similar

    Creates a nested folder structure on a FileCloud server by traversing a path and creating missing directories.

    From: /tf/active/vicechatdev/filecloud_wuxi_sync.py
  • function main_v29 66.0% similar

    Main entry point function that orchestrates a file synchronization process from a FileCloud source to a local directory, with progress reporting and error handling.

    From: /tf/active/vicechatdev/UQchat/download_uq_files.py
  • function check_filecloud_structure 59.5% similar

    Diagnostic function that checks the FileCloud server structure and verifies accessibility of various paths including root, SHARED, and configured base paths.

    From: /tf/active/vicechatdev/SPFCsync/check_filecloud_structure.py
  • class SharePointFileCloudSync 58.5% similar

    Orchestrates synchronization of documents from SharePoint to FileCloud, managing the complete sync lifecycle including document retrieval, comparison, upload, and folder structure creation.

    From: /tf/active/vicechatdev/SPFCsync/sync_service.py
  • function test_filecloud_operations 58.0% similar

    Tests FileCloud basic operations by creating a test folder to verify connectivity and authentication with a FileCloud server.

    From: /tf/active/vicechatdev/SPFCsync/test_connections.py