function validate_azure_token_v1
Validates an Azure AD token by parsing the JWT id_token and extracting user information such as user ID, email, name, and preferred username.
/tf/active/vicechatdev/docchat/auth/azure_auth.py
168 - 231
moderate
Purpose
This function processes the token response obtained from an Azure AD OAuth flow and extracts user profile information from the claims of the JWT id_token. It does not verify the token's signature: it splits the id_token into its three parts, base64url-decodes the payload, and reads the standard Azure AD claims (oid, sub, email, upn, name, preferred_username). If access_token is missing, or if the id_token is malformed or cannot be decoded, it returns None. If id_token is absent altogether, it returns placeholder user information. It is typically called after obtaining tokens from Azure AD to identify the signed-in user in a web application.
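The parsing step described above can be sketched in isolation. This is a minimal illustration rather than the module's own code: `decode_jwt_payload` is a hypothetical helper name, and the sample token is an unsigned placeholder whose payload encodes `{"oid":"abc"}`. No signature check takes place, so the decoded claims should be treated as unverified.

```python
import base64
import json


def decode_jwt_payload(jwt_token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = jwt_token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWT segments are base64url-encoded with the '=' padding stripped;
    # restore it so the length is a multiple of 4 before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


# Placeholder token for illustration only: empty header, tiny payload, fake signature.
token = "e30.eyJvaWQiOiJhYmMifQ.sig"
claims = decode_jwt_payload(token)
print(claims.get("oid", claims.get("sub", "")))
```

The `-len(payload) % 4` expression yields the number of padding characters still needed (0 to 3), which is equivalent to the two-step padding calculation in the documented function.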
Source Code
def validate_azure_token(token_data: dict) -> dict:
    """
    Validate an Azure AD token and extract user information.

    Parameters:
        token_data (dict): Token data from get_token_from_code

    Returns:
        dict: User information extracted from token
    """
    try:
        logger.info(f"Validating Azure token")

        if not token_data or 'access_token' not in token_data:
            logger.error("Invalid token data - missing access_token")
            return None

        # Parse the ID token to extract user information
        if 'id_token' in token_data:
            # Parse the JWT without full validation
            jwt_parts = token_data['id_token'].split('.')
            if len(jwt_parts) != 3:
                logger.error("Invalid JWT format in id_token")
                return None

            # Decode the payload (second part)
            payload_bytes = jwt_parts[1].encode('utf-8')
            # Add padding if needed
            padding_needed = len(payload_bytes) % 4
            if padding_needed:
                payload_bytes += b'=' * (4 - padding_needed)

            try:
                decoded_bytes = base64.urlsafe_b64decode(payload_bytes)
                claims = json.loads(decoded_bytes.decode('utf-8'))

                # Extract user info from claims
                user_info = {
                    'user_id': claims.get('oid', claims.get('sub', '')),  # Object ID or Subject
                    'email': claims.get('email', claims.get('upn', claims.get('preferred_username', ''))),
                    'name': claims.get('name', ''),
                    'preferred_username': claims.get('preferred_username', '')
                }

                logger.info(f"Successfully extracted user info: {user_info['email']}")
                return user_info
            except Exception as decode_error:
                logger.error(f"Error decoding id_token: {decode_error}")
                return None

        # If no id_token, use minimal user info
        logger.warning("No id_token found, using minimal user info")
        return {
            'user_id': 'unknown',
            'email': 'unknown@example.com',
            'name': 'Unknown User',
            'preferred_username': 'unknown'
        }
    except Exception as e:
        logger.error(f"Error validating Azure token: {e}", exc_info=True)
        return None
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| token_data | dict | - | positional_or_keyword |
Parameter Details
token_data: A dictionary containing Azure AD token information, expected to have 'access_token' and optionally 'id_token' keys. The 'id_token' should be a JWT (JSON Web Token) string in standard three-part format (header.payload.signature). This is typically the response from an Azure AD token endpoint or from a function like get_token_from_code.
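For local testing it can help to build a `token_data` dictionary of the expected shape without contacting Azure AD. The sketch below assumes nothing beyond the structure described above; `make_unsigned_id_token` is a hypothetical helper, the claim values are invented, and the resulting token carries no real signature, so it is only suitable for exercising the parsing path.

```python
import base64
import json


def make_unsigned_id_token(claims: dict) -> str:
    """Return a three-part, JWT-shaped string with a fake signature (tests only)."""
    def b64url(obj: dict) -> str:
        raw = json.dumps(obj).encode("utf-8")
        # JWT segments use base64url without '=' padding.
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")

    header = {"alg": "none", "typ": "JWT"}
    return f"{b64url(header)}.{b64url(claims)}.placeholder-signature"


token_data = {
    # The documented function only checks that this key is present.
    "access_token": "placeholder-access-token",
    "id_token": make_unsigned_id_token({
        "oid": "00000000-0000-0000-0000-000000000000",
        "preferred_username": "test.user@example.com",
        "name": "Test User",
    }),
}
```

Dropping the `id_token` key from this dictionary exercises the placeholder path, and dropping `access_token` exercises the `None` path.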
Return Value
Type: dict, or None on failure (the annotation declares dict only)
Returns a dictionary containing user information with the keys 'user_id' (Azure object ID, falling back to the subject), 'email' (user's email address), 'name' (display name), and 'preferred_username' (username). Any of these values is an empty string when the corresponding claims are absent from the id_token. Returns None if access_token is missing, the id_token is not in three-part JWT format, the payload cannot be decoded, or an unexpected exception occurs. If id_token is missing but access_token exists, returns a dictionary of placeholder values ('unknown', 'unknown@example.com', 'Unknown User', 'unknown').
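Because the function has three distinct outcomes, callers may want to tell them apart explicitly rather than only testing for truthiness. The following is a sketch under that assumption; `classify_result` and its labels are hypothetical and not part of the module.

```python
from typing import Optional

# Matches the placeholder 'user_id' documented for the missing-id_token case.
PLACEHOLDER_USER_ID = "unknown"


def classify_result(user_info: Optional[dict]) -> str:
    """Label the outcome of a validate_azure_token call."""
    if user_info is None:
        return "failed"          # missing access_token, bad JWT, decode error
    user_id = user_info.get("user_id", "")
    if not user_id or user_id == PLACEHOLDER_USER_ID:
        return "unidentified"    # placeholder dict, or no oid/sub claim present
    return "identified"


print(classify_result(None))
print(classify_result({"user_id": "unknown", "email": "unknown@example.com"}))
print(classify_result({"user_id": "00000000-0000-0000-0000-000000000000"}))
```

Treating the placeholder and empty-ID cases the same way is a design choice of this sketch: in both, the result does not identify a specific user.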
Dependencies
- base64
- json
- logging
- msal
Required Imports
import base64
import json
import logging
Conditional/Optional Imports
These imports are only needed under specific conditions:
import msal
Condition: Required for the Azure AD authentication context. It is not used directly in this function but is a dependency of the module.
Required (conditional)
Usage Example
import base64
import json
import logging

# Assumed import path (the module file is azure_auth.py); adjust to your package layout
from azure_auth import validate_azure_token

# Setup logger
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
handler = logging.StreamHandler()
logger.addHandler(handler)

# Example token data from Azure AD
token_data = {
    'access_token': 'eyJ0eXAiOiJKV1QiLCJhbGc...',
    'id_token': 'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImtpZCI6Ik1yNS1BVWliZkJpaTdOZDFqQmViYXhib1hXMCJ9.eyJvaWQiOiIxMjM0NTY3OC0xMjM0LTEyMzQtMTIzNC0xMjM0NTY3ODkwYWIiLCJlbWFpbCI6InVzZXJAZXhhbXBsZS5jb20iLCJuYW1lIjoiSm9obiBEb2UiLCJwcmVmZXJyZWRfdXNlcm5hbWUiOiJqb2huLmRvZUBleGFtcGxlLmNvbSJ9.signature'
}

# Validate token and extract user info
user_info = validate_azure_token(token_data)

if user_info:
    print(f"User ID: {user_info['user_id']}")
    print(f"Email: {user_info['email']}")
    print(f"Name: {user_info['name']}")
    print(f"Username: {user_info['preferred_username']}")
else:
    print("Token validation failed")
Best Practices
- This function parses the JWT without verifying its signature, so the extracted claims are unverified; this is acceptable for reading claims from a token received directly from Azure AD but must not be the sole security mechanism
- Always ensure the token_data comes from a trusted source (e.g., directly from Azure AD token endpoint)
- The function returns None on critical errors, so always check the return value before using user_info
- A logger instance must be configured in the module scope before calling this function
- The function prefers 'oid' over 'sub' for user_id, and 'email' over 'upn' over 'preferred_username' for email; the first claim present in that order is used
- Consider implementing full JWT validation with signature verification for production security-critical applications
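- Handle the fallback case explicitly: when id_token is missing, the same placeholder values ('unknown', 'unknown@example.com') are returned for every caller, so they do not identify a specific user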
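As a partial step toward the stricter validation suggested above, basic claim checks can be applied to the decoded id_token payload. The sketch below is an assumption-laden illustration: `check_basic_claims` and the `expected_audience` value are hypothetical, it expects the raw claims dictionary (which the documented function does not return), and it is not a substitute for signature verification against Azure AD's published signing keys.

```python
import time
from typing import List, Optional


def check_basic_claims(claims: dict, expected_audience: str,
                       now: Optional[float] = None) -> List[str]:
    """Return a list of problems found; an empty list means these checks passed."""
    now = time.time() if now is None else now
    problems = []

    # 'exp' is the expiry time in seconds since the Unix epoch.
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= now:
        problems.append("token is expired or has no usable exp claim")

    # 'aud' should name this application (assumed here to be a single string).
    if claims.get("aud") != expected_audience:
        problems.append("aud does not match the expected audience")

    return problems


# Hypothetical values for illustration only.
sample_claims = {"exp": 200, "aud": "my-client-id"}
print(check_basic_claims(sample_claims, "my-client-id", now=100))
```

Passing `now` explicitly keeps the function deterministic in tests; in normal use it can be omitted so the current time is used.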
Similar Components
AI-powered semantic similarity - components with related functionality:
- function validate_azure_token 96.0% similar
- function test_azure_token 60.4% similar
- function validate_azure_client_secret 55.9% similar
- function auth_callback_v2 54.2% similar
- function azure_callback 53.7% similar