function search_and_locate
Searches for specific numbered folders (01-08) in a SharePoint site and traces their locations, contents, and file distributions by type.
/tf/active/vicechatdev/SPFCsync/search_detailed.py
10 - 125
moderate
Purpose
This diagnostic function searches a SharePoint site through the Microsoft Graph API for eight expected organizational folders (01 UCJ Research, 02 Toxicology, 03 CMC, 04 Quality, 05 Clinical, 06 Regulatory, 07 Marketing, 08 Manufacturing). For each match it prints the parent path, the web URL, and, for folders, the number of child folders and files. It then searches for common file types (.docx, .pdf, .xlsx, .pptx) and prints how many of each sit under each parent path, which helps troubleshoot missing folders or understand the site structure.
Source Code
def search_and_locate():
    """Search for specific folders and trace their locations"""
    config = Config()

    try:
        client = SharePointGraphClient(
            site_url=config.SHAREPOINT_SITE_URL,
            client_id=config.AZURE_CLIENT_ID,
            client_secret=config.AZURE_CLIENT_SECRET
        )
        print("✅ SharePoint Graph client initialized successfully")
    except Exception as e:
        print(f"❌ Failed to initialize client: {e}")
        return

    print("🔍 DETAILED FOLDER LOCATION SEARCH")
    print("=" * 60)

    # List of folders we expect to find
    expected_folders = [
        "01 UCJ Research",
        "02 Toxicology",
        "03 CMC",
        "04 Quality",
        "05 Clinical",
        "06 Regulatory",
        "07 Marketing",
        "08 Manufacturing"
    ]

    for folder_name in expected_folders:
        print(f"\n🔍 Searching for: '{folder_name}'")
        print("-" * 40)

        # Search using Graph API
        search_url = f"{client.graph_base_url}/sites/{client.site_id}/drive/root/search(q='{folder_name}')"

        try:
            response = client.session.get(search_url)
            if response.status_code == 200:
                results = response.json()
                if 'value' in results and results['value']:
                    print(f"✅ Found {len(results['value'])} items")
                    for item in results['value']:
                        item_type = "📁 Folder" if 'folder' in item else "📄 File"
                        name = item.get('name', 'Unknown')
                        parent_path = item.get('parentReference', {}).get('path', 'Unknown path')
                        web_url = item.get('webUrl', 'No URL')
                        print(f" {item_type}: {name}")
                        print(f" Path: {parent_path}")
                        print(f" URL: {web_url}")

                        # If it's a folder, try to get its contents
                        if 'folder' in item:
                            folder_id = item['id']
                            children_url = f"{client.graph_base_url}/sites/{client.site_id}/drive/items/{folder_id}/children"
                            try:
                                children_response = client.session.get(children_url)
                                if children_response.status_code == 200:
                                    children = children_response.json()
                                    if 'value' in children:
                                        child_count = len(children['value'])
                                        folders = sum(1 for child in children['value'] if 'folder' in child)
                                        files = sum(1 for child in children['value'] if 'file' in child)
                                        print(f" Contents: {folders} folders, {files} files (total: {child_count})")
                            except Exception as e:
                                print(f" ⚠️ Couldn't access folder contents: {e}")
                        print()
                else:
                    print("❌ No results found")
            else:
                print(f"❌ Search failed: {response.status_code} - {response.text}")
        except Exception as e:
            print(f"❌ Search error: {e}")

    # Also try searching for common file types to see if we find files from missing folders
    print("\n\n📄 SEARCHING FOR FILES BY TYPE")
    print("=" * 40)

    file_types = ['docx', 'pdf', 'xlsx', 'pptx']
    for file_type in file_types:
        print(f"\n🔍 Searching for .{file_type} files...")
        search_url = f"{client.graph_base_url}/sites/{client.site_id}/drive/root/search(q='*.{file_type}')"

        try:
            response = client.session.get(search_url)
            if response.status_code == 200:
                results = response.json()
                if 'value' in results and results['value']:
                    print(f"✅ Found {len(results['value'])} .{file_type} files")

                    # Group by parent path to see folder distribution
                    path_counts = {}
                    for item in results['value']:
                        parent_path = item.get('parentReference', {}).get('path', 'Unknown')
                        path_counts[parent_path] = path_counts.get(parent_path, 0) + 1

                    print(" Distribution by folder:")
                    for path, count in sorted(path_counts.items()):
                        print(f" {count} files in: {path}")
                else:
                    print("❌ No files found")
            else:
                print(f"❌ Search failed: {response.status_code}")
        except Exception as e:
            print(f"❌ Search error: {e}")
Return Value
This function returns None. Its only effect is printing search results to stdout: folder locations, folder contents (folder/file counts), web URLs, and file type distributions across the SharePoint site. If the client fails to initialize, it prints the error and returns early without searching.
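Because everything is printed rather than returned, a caller who wants the report as a string has to capture stdout. A minimal sketch using the standard library's contextlib.redirect_stdout; the helper capture_output and the stand-in fake_diagnostic are hypothetical names used so the example runs without a SharePoint connection:

```python
import contextlib
import io


def capture_output(func, *args, **kwargs):
    """Run func and return everything it printed to stdout as one string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def fake_diagnostic():
    # Hypothetical stand-in for search_and_locate(), so the sketch runs offline.
    print("Searching for: '01 UCJ Research'")
    print("No results found")


report = capture_output(fake_diagnostic)
```

In real use, capture_output(search_and_locate) should work the same way, and the resulting string can be written to a log file or searched for failure markers.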
Dependencies
- sharepoint_graph_client
- config
- json
- requests
Required Imports
import json
from sharepoint_graph_client import SharePointGraphClient
from config import Config
Usage Example
# Ensure config.py exists with required settings:
# class Config:
# SHAREPOINT_SITE_URL = 'https://yourtenant.sharepoint.com/sites/yoursite'
# AZURE_CLIENT_ID = 'your-client-id'
# AZURE_CLIENT_SECRET = 'your-client-secret'
# Import the function from its module (search_detailed.py, per the file path above);
# that module imports SharePointGraphClient and Config itself
from search_detailed import search_and_locate
# Simply call the function - it handles all initialization and output
search_and_locate()
# The function will print:
# - Search results for each expected folder (01-08)
# - Folder paths and web URLs
# - Contents of found folders (file/folder counts)
# - Distribution of common file types (.docx, .pdf, .xlsx, .pptx) across folders
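The per-folder distribution in the last comment comes from counting the parentReference path of each search hit. A minimal offline sketch of that counting step, using collections.Counter in place of the function's plain dict; the helper name count_by_parent_path and the sample items are hypothetical, shaped like the 'value' array the source code reads:

```python
from collections import Counter


def count_by_parent_path(items):
    """Count drive items per parent folder path, like the path_counts loop."""
    return Counter(
        item.get("parentReference", {}).get("path", "Unknown") for item in items
    )


# Hypothetical sample shaped like the 'value' array of a search response.
sample_items = [
    {"name": "a.docx", "parentReference": {"path": "/drive/root:/02 Toxicology"}},
    {"name": "b.docx", "parentReference": {"path": "/drive/root:/02 Toxicology"}},
    {"name": "c.docx", "parentReference": {"path": "/drive/root:/03 CMC"}},
    {"name": "d.docx"},  # no parentReference, so it is counted under "Unknown"
]

distribution = count_by_parent_path(sample_items)
for path, count in sorted(distribution.items()):
    print(f"{count} files in: {path}")
```

An expected folder that never appears as a parent path for any file type is a hint that it is missing, empty, or not visible to the credentials in use.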
Best Practices
- This is a diagnostic/debugging function intended for interactive use, not production automation
- Ensure Azure AD credentials have sufficient permissions (Sites.Read.All minimum) before running
- The function makes multiple API calls and may take time on large SharePoint sites
- Output is printed to stdout - redirect or capture if logging is needed
- The hardcoded folder list (01-08) should be modified if searching for different organizational structures
- Consider rate limiting when running against large SharePoint sites to avoid throttling
- The function does not handle pagination - results may be limited to the first page of search results
- Error handling is present but errors are printed rather than raised, making this unsuitable for automated pipelines
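The pagination and throttling points above can be addressed together. The following is a sketch, not part of the original function: it assumes a requests-style session such as client.session, follows the @odata.nextLink field that Microsoft Graph includes when more pages exist, and retries HTTP 429 responses after the Retry-After delay (assumed here to be given in seconds). The names get_with_retry and iter_graph_items are hypothetical.

```python
import time


def get_with_retry(session, url, max_retries=3):
    """GET url, retrying on HTTP 429 after the delay the service suggests."""
    for attempt in range(max_retries + 1):
        response = session.get(url)
        if response.status_code != 429 or attempt == max_retries:
            return response
        # Assumption: Retry-After holds a number of seconds; default to 1.
        time.sleep(float(response.headers.get("Retry-After", 1)))


def iter_graph_items(session, url, max_retries=3):
    """Yield every item of a paged Graph collection, following @odata.nextLink."""
    while url:
        response = get_with_retry(session, url, max_retries)
        response.raise_for_status()  # raise instead of printing, for pipelines
        payload = response.json()
        yield from payload.get("value", [])
        # The next-page URL is absent on the last page, which ends the loop.
        url = payload.get("@odata.nextLink")
```

Inside search_and_locate, a call such as list(iter_graph_items(client.session, search_url)) could replace the single client.session.get(search_url) so that the counts cover all pages. Unlike the original, this raises on failed requests, which suits automated use better than printed errors.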
Similar Components
AI-powered semantic similarity - components with related functionality:
- function search_for_folders 78.5% similar
- function analyze_structure 74.9% similar
- function main_v35 70.2% similar
- function test_folder_structure 69.9% similar
- function explore_alternative_endpoints 68.3% similar