function sync_directory
Recursively synchronizes a directory from a FileCloud remote server to a local filesystem, downloading new or modified files and creating directory structures as needed.
/tf/active/vicechatdev/UQchat/download_uq_files.py
110 - 167
moderate
Purpose
This function performs a one-way synchronization from a FileCloud server to a local path. It walks the remote directory tree recursively, creates matching local directories, and downloads every file that is missing locally or whose modification timestamp differs from the remote one. Files with matching timestamps are skipped, which keeps repeated runs cheap for incremental backups. Local files that no longer exist on the server are left in place; the function never deletes anything.
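The two decisions described above (directory or file, and up to date or not) can be sketched in isolation. This is a minimal sketch that mirrors the checks in the source listing; the helper names `classify_entry` and `is_up_to_date` are hypothetical, and a fixed UTC+1 offset stands in for the module-level `cet_timezone`.

```python
from datetime import datetime, timedelta, timezone

# Type codes copied from the source listing; the helper names are hypothetical.
DIR_TYPES = {'dir', '0', 'directory'}
FILE_TYPES = {'file', '1'}


def classify_entry(entry_type):
    """Return 'dir', 'file', or 'unknown', following sync_directory's branches."""
    if entry_type in DIR_TYPES:
        return 'dir'
    if entry_type in FILE_TYPES or entry_type is None:
        return 'file'
    return 'unknown'


def is_up_to_date(remote_modified, local_mtime, tz):
    """Compare timestamps as strings after cutting at the first '+'."""
    local_modified = datetime.fromtimestamp(local_mtime, tz=tz).isoformat(timespec='seconds')
    return remote_modified.split('+')[0] == local_modified.split('+')[0]


# Assumption: a fixed UTC+1 offset stands in for cet_timezone.
tz = timezone(timedelta(hours=1))
mtime = datetime(2024, 1, 15, 10, 30, 0, tzinfo=tz).timestamp()

print(classify_entry('0'))                                    # dir
print(classify_entry(None))                                   # file
print(is_up_to_date('2024-01-15T10:30:00+01:00', mtime, tz))  # True
print(is_up_to_date('2024-01-15T10:31:00+01:00', mtime, tz))  # False
```

An entry with no `type` field is treated as a file, so a listing that omits the field still gets downloaded.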
Source Code
def sync_directory(session, remote_path, local_path):
    """Recursively sync a directory from FileCloud"""
    print(f"\n📁 {remote_path}")
    # Get list of entries in this directory
    entries = get_file_list(session, remote_path)
    if not entries:
        print(f" (empty)")
        return
    print(f" Found {len(entries)} items")
    for entry in entries:
        entry_name = entry.get('name')
        entry_type = entry.get('type')
        if not entry_name:
            continue
        # Build remote and local paths
        remote_entry_path = f"{remote_path}/{entry_name}"
        local_entry_path = local_path / entry_name
        # Check if it's a directory (type can be 'dir', '0', or 'directory')
        if entry_type in ['dir', '0', 'directory']:
            print(f" 📁 {entry_name}/")
            # Create local directory
            local_entry_path.mkdir(parents=True, exist_ok=True)
            # Recursively sync subdirectory
            sync_directory(session, remote_entry_path, local_entry_path)
        # Check if it's a file (type can be 'file', '1')
        elif entry_type in ['file', '1'] or entry_type is None:
            # Check if file already exists locally
            if local_entry_path.exists():
                # Compare modification times
                try:
                    remote_modified = entry.get('modifiediso', '')
                    local_modified = datetime.fromtimestamp(
                        local_entry_path.stat().st_mtime,
                        tz=cet_timezone
                    ).isoformat(timespec='seconds')
                    if remote_modified.split('+')[0] == local_modified.split('+')[0]:
                        print(f" ✓ {entry_name} (already up to date)")
                        continue
                    else:
                        print(f" ↻ {entry_name} (updating)")
                except Exception as e:
                    print(f" ⚠ {entry_name} (re-downloading)")
            else:
                print(f" ↓ {entry_name}")
            # Download the file
            download_file(session, remote_entry_path, local_entry_path)
        else:
            print(f" ⚠ Unknown type '{entry_type}' for: {entry_name}")
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| session | - | - | positional_or_keyword |
| remote_path | - | - | positional_or_keyword |
| local_path | - | - | positional_or_keyword |
Parameter Details
session: An authenticated requests.Session object or similar session object that maintains connection state and authentication credentials for making API calls to the FileCloud server. This session should already be authenticated before calling this function.
remote_path: A string representing the path to the directory on the FileCloud server to sync from. Should be in the format expected by the FileCloud API (e.g., '/path/to/directory'). This is the source directory that will be mirrored locally.
local_path: A pathlib.Path object representing the local filesystem directory that receives the remote directory contents. The function itself only creates subdirectories beneath this path, so create the root directory before the first call, as the usage example does. Files and subdirectories are added or updated to mirror the remote tree; nothing is deleted locally.
Return Value
This function does not return any value (returns None implicitly). It performs side effects by creating directories, downloading files to the local filesystem, and printing status messages to stdout. The function operates through recursive calls and file I/O operations.
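Because the only observable output besides the filesystem changes is text on stdout, a caller that wants a record of the run has to capture that stream. A minimal sketch using the standard library; `noisy_sync` is a hypothetical stand-in for `sync_directory` so the example runs without a server.

```python
import io
from contextlib import redirect_stdout


def noisy_sync():
    """Hypothetical stand-in for sync_directory: prints status, returns None."""
    print(" Found 2 items")
    print(" report.pdf (already up to date)")


buffer = io.StringIO()
with redirect_stdout(buffer):
    result = noisy_sync()

# The captured text can be written to a log file or inspected afterwards.
status_lines = buffer.getvalue().splitlines()
print(result)             # None
print(len(status_lines))  # 2
```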
Dependencies
- requests
- xmltodict
- pathlib
- datetime
- zoneinfo
Required Imports
import os
import requests
import xmltodict
from pathlib import Path
from datetime import datetime
from zoneinfo import ZoneInfo
Usage Example
import requests
from pathlib import Path
from datetime import datetime
from zoneinfo import ZoneInfo
import xmltodict
# Define timezone
cet_timezone = ZoneInfo('CET')
# Assume get_file_list and download_file functions are defined
# Create authenticated session
session = requests.Session()
session.auth = ('username', 'password')
session.headers.update({'User-Agent': 'FileCloudSync/1.0'})
# Define paths
remote_directory = '/shared/documents'
local_directory = Path('./local_backup/documents')
# Ensure local directory exists
local_directory.mkdir(parents=True, exist_ok=True)
# Sync the directory
sync_directory(session, remote_directory, local_directory)
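The usage example above makes a single call, so if one of the helper functions raises, the whole run stops at that point. One way to limit the damage is to sync several top-level directories separately and record failures instead of aborting. This is a sketch under stated assumptions: `sync_all` and `fake_sync` are hypothetical names, and `fake_sync` stands in for `sync_directory` so the block runs without a server.

```python
def sync_all(roots, sync_fn):
    """Call sync_fn(remote, local) for each pair and collect failures."""
    failures = []
    for remote, local in roots:
        try:
            sync_fn(remote, local)
        except Exception as exc:  # broad on purpose: one bad tree should not stop the rest
            failures.append((remote, str(exc)))
    return failures


def fake_sync(remote, local):
    """Hypothetical stand-in for sync_directory that fails for one path."""
    if remote == '/shared/broken':
        raise RuntimeError('listing failed')


roots = [
    ('/shared/documents', './local_backup/documents'),
    ('/shared/broken', './local_backup/broken'),
    ('/shared/reports', './local_backup/reports'),
]
failures = sync_all(roots, fake_sync)
print(failures)  # [('/shared/broken', 'listing failed')]
```

In real use, `sync_fn` would be a small wrapper that passes the authenticated session along, for example `lambda remote, local: sync_directory(session, remote, Path(local))`.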
Best Practices
- Ensure the session object is properly authenticated before calling this function to avoid authentication errors during sync
- The local_path should be a pathlib.Path object, not a string, for proper path handling across different operating systems
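- The function relies on the helper functions get_file_list and download_file, which must be defined in, or imported into, the module that defines sync_directory
- The cet_timezone variable must be a module-level global in that same module (for example ZoneInfo('CET')); Python resolves the name where the function is defined, not in the caller's scope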
- Consider implementing error handling around the recursive calls to prevent complete failure if one subdirectory fails
- The function prints status messages to stdout; consider redirecting or capturing output if running in a non-interactive environment
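- The timestamp check compares ISO strings after cutting each one at the first '+', so it only strips positive UTC offsets such as '+01:00'. A remote value with a negative offset or a 'Z' suffix never matches, and the file is downloaded again on every run. The same happens unless download_file (not shown here) sets the local modification time to the remote value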
- The function creates directories with parents=True and exist_ok=True, so it's safe to call on existing directory structures
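- File type detection accepts several codes ('dir', '0', 'directory' for directories; 'file', '1' for files), presumably to cope with different FileCloud API responses. Entries without a type field (None) are treated as files, and any other value is reported as unknown and skipped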
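As a possible alternative to the string comparison mentioned in the list above, the two timestamps can be parsed and compared as points in time. This is not what the source does; it is a sketch that assumes the remote `modifiediso` value is an ISO 8601 string carrying a UTC offset, and the names `parse_iso` and `same_instant` are hypothetical.

```python
from datetime import datetime, timezone


def parse_iso(value):
    """Parse an ISO 8601 string into an aware datetime, or None if unusable."""
    if not value:
        return None
    if value.endswith('Z'):
        # Older Python versions of fromisoformat reject a trailing 'Z'.
        value = value[:-1] + '+00:00'
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        return None
    if parsed.tzinfo is None:
        return None  # a naive timestamp is ambiguous, so treat it as unknown
    return parsed


def same_instant(remote_modified, local_mtime, tolerance_seconds=1):
    """True if the remote timestamp and the local mtime denote the same moment."""
    remote_dt = parse_iso(remote_modified)
    if remote_dt is None:
        return False  # unknown remote time: let the caller download again
    local_dt = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return abs((remote_dt - local_dt).total_seconds()) <= tolerance_seconds


mtime = datetime(2024, 1, 15, 9, 30, 0, tzinfo=timezone.utc).timestamp()
print(same_instant('2024-01-15T10:30:00+01:00', mtime))  # True
print(same_instant('2024-01-15T04:30:00-05:00', mtime))  # True
print(same_instant('2024-01-15T10:31:00+01:00', mtime))  # False
```

Comparing instants removes the dependency on a fixed timezone, but it changes behavior if the server reports wall-clock times with a wrong or missing offset, so it should be checked against real server responses first.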
Similar Components
AI-powered semantic similarity - components with related functionality:
- function create_folder (67.8% similar)
- function main_v29 (66.0% similar)
- function check_filecloud_structure (59.5% similar)
- class SharePointFileCloudSync (58.5% similar)
- function test_filecloud_operations (58.0% similar)