šŸ” Code Extractor

function search_and_locate

Maturity: 47

Searches for specific numbered folders (01-08) in a SharePoint site and traces their locations, contents, and file distributions by type.

File: /tf/active/vicechatdev/SPFCsync/search_detailed.py
Lines: 10 - 125
Complexity: moderate

Purpose

This diagnostic function performs comprehensive searches across a SharePoint site to locate expected organizational folders (Research, Toxicology, CMC, Quality, Clinical, Regulatory, Marketing, Manufacturing) and analyze file distributions. It provides detailed output about folder locations, contents, and file type distributions to help troubleshoot missing folders or understand SharePoint site structure.
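The file-distribution analysis mentioned above amounts to counting search results per parent folder path. The following is a minimal sketch of that step using collections.Counter instead of the hand-rolled dictionary in the source. The helper name count_by_parent_path and the sample items are hypothetical; the items are only shaped like entries in a Graph search response's 'value' list.

```python
from collections import Counter


def count_by_parent_path(items):
    """Count items per parent folder path (hypothetical helper)."""
    return Counter(
        item.get("parentReference", {}).get("path", "Unknown")
        for item in items
    )


# Hypothetical sample data, not real search output
sample_items = [
    {"name": "a.pdf", "parentReference": {"path": "/drive/root:/04 Quality"}},
    {"name": "b.pdf", "parentReference": {"path": "/drive/root:/04 Quality"}},
    {"name": "c.pdf", "parentReference": {"path": "/drive/root:/05 Clinical"}},
    {"name": "d.pdf"},  # no parentReference, so it is counted under 'Unknown'
]

distribution = count_by_parent_path(sample_items)
for path, count in sorted(distribution.items()):
    print(f"{count} files in: {path}")
```

A folder that is missing from the name search but still appears as a parent path here would suggest the folder exists and is simply not matched by the search query.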

Source Code

def search_and_locate():
    """Search for specific folders and trace their locations"""
    config = Config()
    
    try:
        client = SharePointGraphClient(
            site_url=config.SHAREPOINT_SITE_URL,
            client_id=config.AZURE_CLIENT_ID,
            client_secret=config.AZURE_CLIENT_SECRET
        )
        
        print("✅ SharePoint Graph client initialized successfully")
        
    except Exception as e:
        print(f"❌ Failed to initialize client: {e}")
        return
    
    print("🔍 DETAILED FOLDER LOCATION SEARCH")
    print("=" * 60)
    
    # List of folders we expect to find
    expected_folders = [
        "01 UCJ Research",
        "02 Toxicology", 
        "03 CMC",
        "04 Quality",
        "05 Clinical",
        "06 Regulatory",
        "07 Marketing",
        "08 Manufacturing"
    ]
    
    for folder_name in expected_folders:
        print(f"\n📁 Searching for: '{folder_name}'")
        print("-" * 40)
        
        # Search using Graph API
        search_url = f"{client.graph_base_url}/sites/{client.site_id}/drive/root/search(q='{folder_name}')"
        
        try:
            response = client.session.get(search_url)
            if response.status_code == 200:
                results = response.json()
                
                if 'value' in results and results['value']:
                    print(f"✅ Found {len(results['value'])} items")
                    
                    for item in results['value']:
                        item_type = "📁 Folder" if 'folder' in item else "📄 File"
                        name = item.get('name', 'Unknown')
                        parent_path = item.get('parentReference', {}).get('path', 'Unknown path')
                        web_url = item.get('webUrl', 'No URL')
                        
                        print(f"  {item_type}: {name}")
                        print(f"    Path: {parent_path}")
                        print(f"    URL: {web_url}")
                        
                        # If it's a folder, try to get its contents
                        if 'folder' in item:
                            folder_id = item['id']
                            children_url = f"{client.graph_base_url}/sites/{client.site_id}/drive/items/{folder_id}/children"
                            try:
                                children_response = client.session.get(children_url)
                                if children_response.status_code == 200:
                                    children = children_response.json()
                                    if 'value' in children:
                                        child_count = len(children['value'])
                                        folders = sum(1 for child in children['value'] if 'folder' in child)
                                        files = sum(1 for child in children['value'] if 'file' in child)
                                        print(f"    Contents: {folders} folders, {files} files (total: {child_count})")
                            except Exception as e:
                                print(f"    ⚠️  Couldn't access folder contents: {e}")
                        
                        print()
                else:
                    print("❌ No results found")
            else:
                print(f"❌ Search failed: {response.status_code} - {response.text}")
                
        except Exception as e:
            print(f"❌ Search error: {e}")
    
    # Also try searching for common file types to see if we find files from missing folders
    print("\n\n🔍 SEARCHING FOR FILES BY TYPE")
    print("=" * 40)
    
    file_types = ['docx', 'pdf', 'xlsx', 'pptx']
    
    for file_type in file_types:
        print(f"\n📄 Searching for .{file_type} files...")
        search_url = f"{client.graph_base_url}/sites/{client.site_id}/drive/root/search(q='*.{file_type}')"
        
        try:
            response = client.session.get(search_url)
            if response.status_code == 200:
                results = response.json()
                
                if 'value' in results and results['value']:
                    print(f"✅ Found {len(results['value'])} .{file_type} files")
                    
                    # Group by parent path to see folder distribution
                    path_counts = {}
                    for item in results['value']:
                        parent_path = item.get('parentReference', {}).get('path', 'Unknown')
                        path_counts[parent_path] = path_counts.get(parent_path, 0) + 1
                    
                    print("  Distribution by folder:")
                    for path, count in sorted(path_counts.items()):
                        print(f"    {count} files in: {path}")
                else:
                    print("❌ No files found")
            else:
                print(f"❌ Search failed: {response.status_code}")
                
        except Exception as e:
            print(f"❌ Search error: {e}")

Return Value

This function always returns None. Its only output is text printed to stdout: folder locations, folder contents (folder and file counts), web URLs, and file type distributions across the SharePoint site. If the SharePoint client cannot be initialized, the function prints the error and returns early without running any searches.
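Because the results exist only as printed text, a caller who wants to keep them has to capture stdout. The sketch below shows one way to do that with the standard library's contextlib.redirect_stdout. The capture_output helper and the fake_diagnostic stand-in are hypothetical; the stand-in replaces search_and_locate so the example runs without SharePoint credentials.

```python
import contextlib
import io


def capture_output(func, *args, **kwargs):
    """Run func and return everything it printed to stdout as one string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def fake_diagnostic():
    """Stand-in for search_and_locate(); prints two report lines."""
    print("Searching for: '01 UCJ Research'")
    print("No results found")


report = capture_output(fake_diagnostic)
# 'report' now holds the printed text and can be written to a log file
```

In real use, passing search_and_locate in place of fake_diagnostic should capture the full diagnostic report the same way.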

Dependencies

  • sharepoint_graph_client
  • config
  • json
  • requests

Required Imports

import json
from sharepoint_graph_client import SharePointGraphClient
from config import Config

Usage Example

# Ensure config.py exists with required settings:
# class Config:
#     SHAREPOINT_SITE_URL = 'https://yourtenant.sharepoint.com/sites/yoursite'
#     AZURE_CLIENT_ID = 'your-client-id'
#     AZURE_CLIENT_SECRET = 'your-client-secret'

# Assumes search_detailed.py (the file listed above) is on the import path;
# it imports SharePointGraphClient and Config itself
from search_detailed import search_and_locate

# Simply call the function - it handles all initialization and output
search_and_locate()

# The function will print:
# - Search results for each expected folder (01-08)
# - Folder paths and web URLs
# - Contents of found folders (file/folder counts)
# - Distribution of common file types (.docx, .pdf, .xlsx, .pptx) across folders

Best Practices

  • This is a diagnostic/debugging function intended for interactive use, not production automation
  • Ensure Azure AD credentials have sufficient permissions (Sites.Read.All minimum) before running
  • The function makes multiple API calls and may take time on large SharePoint sites
  • Output is printed to stdout - redirect or capture if logging is needed
  • The hardcoded folder list (01-08) should be modified if searching for different organizational structures
  • Consider rate limiting when running against large SharePoint sites to avoid throttling
  • The function does not handle pagination - results may be limited to first page of search results
  • Error handling is present but errors are printed rather than raised, making this unsuitable for automated pipelines
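The pagination and throttling points above could be addressed with a small helper along the following lines. This is a sketch, not part of the documented module: get_all_results is a hypothetical name, and it assumes the session behaves like a requests.Session (get(), status_code, headers, json(), raise_for_status()). It follows the '@odata.nextLink' URL that Microsoft Graph includes when more results are available, and on HTTP 429 it waits for the Retry-After header value, assumed here to be a number of seconds.

```python
import time


def get_all_results(session, url, max_retries=3, sleep=time.sleep):
    """Collect every item from a paged Graph response (hypothetical helper).

    Follows '@odata.nextLink' until it is absent. On HTTP 429 it waits for
    the Retry-After interval and retries the same URL, up to max_retries
    consecutive times.
    """
    items = []
    retries = 0
    while url:
        response = session.get(url)
        if response.status_code == 429 and retries < max_retries:
            # Assumption: Retry-After is in seconds; fall back to 1 if missing
            sleep(int(response.headers.get("Retry-After", "1")))
            retries += 1
            continue
        response.raise_for_status()  # raise on any other error status
        retries = 0
        page = response.json()
        items.extend(page.get("value", []))
        url = page.get("@odata.nextLink")  # None on the last page ends the loop
    return items
```

Unlike the function documented here, this helper raises on failure instead of printing, which makes it a better fit for automated pipelines.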

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function search_for_folders 78.5% similar

    Searches for specific predefined folders in a SharePoint site using Microsoft Graph API and prints the search results with their locations.

    From: /tf/active/vicechatdev/SPFCsync/diagnostic_comprehensive.py
  • function analyze_structure 74.9% similar

    Analyzes and reports on the folder structure of a SharePoint site, displaying folder paths, file counts, and searching for expected folder patterns.

    From: /tf/active/vicechatdev/SPFCsync/analyze_structure.py
  • function main_v35 70.2% similar

    A diagnostic function that explores SharePoint site structure to investigate why only 2 folders are visible when more are expected in the web interface.

    From: /tf/active/vicechatdev/SPFCsync/diagnostic_comprehensive.py
  • function test_folder_structure 69.9% similar

    Tests SharePoint folder structure by listing root-level folders, displaying their contents, and providing a summary of total folders and documents.

    From: /tf/active/vicechatdev/SPFCsync/test_folder_structure.py
  • function explore_alternative_endpoints 68.3% similar

    Tests multiple Microsoft Graph API endpoints to locate missing folders in a SharePoint drive by trying different URL patterns and searching for expected folders.

    From: /tf/active/vicechatdev/SPFCsync/diagnostic_comprehensive.py