analyze_structure - Code Extractor

function analyze_structure

Maturity: 44

Analyzes and reports on the folder structure of a SharePoint site, displaying folder paths, file counts, and searching for expected folder patterns.

File:
/tf/active/vicechatdev/SPFCsync/analyze_structure.py

Lines:
10 - 92

Complexity:
moderate

Purpose

This function connects to a SharePoint site via the Microsoft Graph API, retrieves all documents starting from the root directory, and prints a report of the folder structure. The report includes the total item count, each unique folder path with its file count, up to three sample files per folder, and the results of a case-insensitive substring search for expected folder patterns (numbered folders 01-08 and named folders such as UCJ, Toxicology, CMC, Quality, Clinical, Regulatory, Marketing, and Manufacturing) in both folder paths and file names. It is useful for auditing SharePoint document libraries, understanding folder organization, and verifying that an expected folder structure exists.
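The grouping and pattern search described above can be tried without a SharePoint connection. The following is a minimal sketch, assuming documents are dicts with 'name' and 'folder_path' keys as the source code reads them; the sample records and the helper names group_by_folder and find_pattern are hypothetical and not part of the original module.

```python
from collections import defaultdict

# Hypothetical sample records shaped like the dicts the function reads
# ('name' and 'folder_path' keys); real data comes from get_all_documents().
sample_documents = [
    {"name": "protocol.docx", "folder_path": "/01 UCJ"},
    {"name": "report.pdf", "folder_path": "/01 UCJ"},
    {"name": "tox_summary.xlsx", "folder_path": "/02 Toxicology"},
    {"name": "orphan.txt"},  # no folder_path, so it is grouped under 'Unknown'
]


def group_by_folder(documents):
    """Map each folder path to the list of documents stored in it."""
    grouped = defaultdict(list)
    for doc in documents:
        grouped[doc.get("folder_path", "Unknown")].append(doc)
    return dict(grouped)


def find_pattern(documents, pattern):
    """Case-insensitive substring match against folder paths and file names."""
    needle = pattern.lower()
    folders = sorted(
        {
            doc.get("folder_path", "Unknown")
            for doc in documents
            if needle in doc.get("folder_path", "Unknown").lower()
        }
    )
    files = [doc for doc in documents if needle in doc.get("name", "").lower()]
    return folders, files


grouped = group_by_folder(sample_documents)
for path in sorted(grouped):
    print(f"{path}: {len(grouped[path])} files")

folders, files = find_pattern(sample_documents, "tox")
print(f"Folders matching 'tox': {folders}")
print(f"Files matching 'tox': {len(files)}")
```

Using defaultdict here is a stylistic substitute for the explicit "if folder_path not in file_by_folder" check in the original; the resulting mapping is the same.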

Source Code

def analyze_structure():
    """Analyze the current folder structure"""
    config = Config()
    
    try:
        client = SharePointGraphClient(
            site_url=config.SHAREPOINT_SITE_URL,
            client_id=config.AZURE_CLIENT_ID,
            client_secret=config.AZURE_CLIENT_SECRET
        )
        
        print("✅ SharePoint Graph client initialized successfully")
        print(f"Site ID: {client.site_id}")
        print(f"Drive ID: {client.drive_id}")
        print()
        
    except Exception as e:
        print(f"❌ Failed to initialize client: {e}")
        return
    
    print("🔍 ANALYZING CURRENT FOLDER STRUCTURE")
    print("=" * 60)
    
    # Get all documents and analyze their paths
    try:
        documents = client.get_all_documents("/")
        print(f"✅ Found {len(documents)} total items")
        print()
        
        # Analyze folder distribution
        folder_paths = set()
        file_by_folder = {}
        
        for doc in documents:
            folder_path = doc.get('folder_path', 'Unknown')
            folder_paths.add(folder_path)
            
            if folder_path not in file_by_folder:
                file_by_folder[folder_path] = []
            file_by_folder[folder_path].append(doc)
        
        print(f"📁 Found {len(folder_paths)} unique folder paths:")
        print("-" * 40)
        
        for folder_path in sorted(folder_paths):
            files_in_folder = len(file_by_folder[folder_path])
            print(f"📁 {folder_path}: {files_in_folder} files")
            
            # Show first few files as examples
            if files_in_folder > 0:
                example_files = file_by_folder[folder_path][:3]
                for file_info in example_files:
                    print(f"   📄 {file_info.get('name', 'Unknown')}")
                if files_in_folder > 3:
                    print(f"   ... and {files_in_folder - 3} more files")
            print()
        
        # Look for specific patterns
        print("\n🔍 SEARCHING FOR EXPECTED FOLDER PATTERNS")
        print("-" * 50)
        
        expected_patterns = [
            "01", "02", "03", "04", "05", "06", "07", "08",
            "UCJ", "Toxicology", "CMC", "Quality", "Clinical", 
            "Regulatory", "Marketing", "Manufacturing"
        ]
        
        for pattern in expected_patterns:
            matching_folders = [path for path in folder_paths if pattern.lower() in path.lower()]
            matching_files = [doc for doc in documents if pattern.lower() in doc.get('name', '').lower()]
            
            if matching_folders or matching_files:
                print(f"🔍 Pattern '{pattern}':")
                if matching_folders:
                    print(f"   📁 Folders: {matching_folders}")
                if matching_files:
                    print(f"   📄 Files: {len(matching_files)} files contain this pattern")
                print()
        
    except Exception as e:
        print(f"❌ Failed to get documents: {e}")
        import traceback
        traceback.print_exc()

Return Value

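This function does not return a value (it implicitly returns None), whether the analysis succeeds or fails. It prints its results directly to stdout, including success and error messages, folder structure information, file counts, and pattern matching results. If document retrieval fails, the traceback is written to stderr by traceback.print_exc().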
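Because the report exists only as printed text, a caller that needs it as a string has to capture stdout. The following is a minimal sketch using the standard library's contextlib.redirect_stdout; capture_stdout and fake_report are hypothetical names, and fake_report stands in for analyze_structure so the example runs without SharePoint credentials.

```python
import contextlib
import io


def capture_stdout(func, *args, **kwargs):
    """Run func and return everything it printed to stdout as one string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def fake_report():
    """Hypothetical stand-in that prints like analyze_structure()."""
    print("Found 2 unique folder paths:")
    print("/01 UCJ: 2 files")


report = capture_stdout(fake_report)
lines = report.splitlines()
print(f"Captured {len(lines)} lines")
```

With the real function, the call would be capture_stdout(analyze_structure). Output written to stderr, such as the traceback printed on a retrieval failure, is not captured by this approach.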

Dependencies

sharepoint_graph_client
config
traceback

Required Imports

from sharepoint_graph_client import SharePointGraphClient
from config import Config

Conditional/Optional Imports

These imports are only needed under specific conditions:

import traceback

Condition (optional): only used when an exception occurs during document retrieval, to print detailed error information. traceback is part of the Python standard library, so no installation is required.

Usage Example

# Ensure config.py exists with required settings
# Example config.py:
# class Config:
#     SHAREPOINT_SITE_URL = 'https://yourtenant.sharepoint.com/sites/yoursite'
#     AZURE_CLIENT_ID = 'your-client-id'
#     AZURE_CLIENT_SECRET = 'your-client-secret'

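# Run the analysis (the import assumes analyze_structure.py is on the import path)
from analyze_structure import analyze_structure
analyze_structure()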

# Output will be printed to console showing:
# - SharePoint connection status
# - Total number of items found
# - List of folder paths with file counts
# - Sample files in each folder
# - Pattern matching results for expected folders

Best Practices

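Ensure the Azure AD application has been granted appropriate Microsoft Graph permissions (Sites.Read.All or Sites.ReadWrite.All) before running.
The function prints its report directly to stdout; redirect or capture stdout if you need the results programmatically.
The function returns None on both success and failure, so an initialization failure can only be detected from the printed error message, not from the return value.
The function retrieves all documents starting from the root ('/'), which may be slow for large SharePoint sites with many files.
The expected patterns are hard-coded; customize them by editing the expected_patterns list in the source code. Matching is a case-insensitive substring test, so short patterns such as '01' can also match unrelated names (for example, a folder named '2010').
Error handling is built in but basic: exceptions from client initialization and document retrieval are caught and printed, while Config() is constructed outside any try block, so configuration errors propagate to the caller.
Only the first 3 files per folder are shown as examples to avoid cluttering the output.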
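Since the pattern search is a plain substring test, numeric patterns can produce false positives. The following is a minimal sketch of a stricter alternative that compares whole alphanumeric tokens instead; it is not what the original function does, and segment_tokens, matches_token, and the sample paths are hypothetical.

```python
import re


def segment_tokens(path):
    """Split a folder path into lowercase ASCII alphanumeric tokens."""
    return [tok for tok in re.split(r"[^0-9a-z]+", path.lower()) if tok]


def matches_token(path, pattern):
    """True if pattern equals one whole token of the path (case-insensitive)."""
    return pattern.lower() in segment_tokens(path)


# Hypothetical folder paths for illustration only.
paths = ["/01 UCJ", "/Reports/2010", "/Budget 2018", "/CMC/Quality"]

# Substring test, as used by the original function.
substring_hits = [p for p in paths if "01" in p.lower()]

# Token test: '01' must appear as a token of its own.
token_hits = [p for p in paths if matches_token(p, "01")]

print(f"Substring matches: {substring_hits}")
print(f"Token matches: {token_hits}")
```

The trade-off is that token matching no longer finds a pattern embedded in a longer word, and this tokenizer treats non-ASCII letters as separators, so it would need adjusting for folder names that use them.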

Similar Components

AI-powered semantic similarity - components with related functionality:

function test_folder_structure 80.9% similar

Tests SharePoint folder structure by listing root-level folders, displaying their contents, and providing a summary of total folders and documents.
From: /tf/active/vicechatdev/SPFCsync/test_folder_structure.py

function search_and_locate 74.9% similar

Searches for specific numbered folders (01-08) in a SharePoint site and traces their locations, contents, and file distributions by type.
From: /tf/active/vicechatdev/SPFCsync/search_detailed.py

function search_for_folders 74.8% similar

Searches for specific predefined folders in a SharePoint site using Microsoft Graph API and prints the search results with their locations.
From: /tf/active/vicechatdev/SPFCsync/diagnostic_comprehensive.py

function explore_site_structure 73.6% similar

Explores and displays the complete structure of a SharePoint site using Microsoft Graph API, including drives, document libraries, lists, and alternative API endpoints.
From: /tf/active/vicechatdev/SPFCsync/diagnostic_comprehensive.py

function compare_with_expected_folders 72.7% similar

Compares SharePoint folders found via Microsoft Graph API against a predefined list of expected folder names from a reference screenshot, reporting matches, missing folders, and additional folders.
From: /tf/active/vicechatdev/SPFCsync/test_folder_structure.py
