function validate_azure_token

Maturity: 55

Validates an Azure AD token by decoding the JWT id_token and extracting user information such as email, name, and object ID.

File: /tf/active/vicechatdev/CDocs/auth/azure_auth.py
Lines: 196 - 269
Complexity: moderate

Purpose

This function processes Azure AD authentication tokens to extract user identity information. It decodes the JWT id_token payload without verifying the signature (the source comments mark this as suitable for testing only), reads user claims such as email, name, and Azure AD object ID, and returns them as a dictionary. If the id_token is missing, it falls back to minimal user info built from the remaining token data. It is typically used in OAuth2/OpenID Connect authentication flows with Azure AD.
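
The decoding step described above can be tried without a real Azure AD tenant. The following is a minimal sketch, not code from azure_auth.py: the helpers make_unsigned_jwt and decode_jwt_payload and the claim values are hypothetical and exist only for illustration. It builds an unsigned test token and reads its payload back using the same pad-then-decode approach, with no signature check.

```python
import base64
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode and strip '=' padding, as JWT segments do."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_unsigned_jwt(claims: dict) -> str:
    """Hypothetical test helper: build a three-part token with a dummy signature."""
    header = _b64url(json.dumps({"alg": "none", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url(json.dumps(claims).encode("utf-8"))
    return f"{header}.{payload}.signature"


def decode_jwt_payload(token: str) -> dict:
    """Hypothetical helper: decode the payload segment WITHOUT verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT parts")
    segment = parts[1]
    # Restore the padding that base64url encoding in JWTs leaves out
    segment += "=" * (-len(segment) % 4)
    return json.loads(base64.urlsafe_b64decode(segment))


# Made-up claims for illustration only
sample_claims = {"email": "alice@example.com", "name": "Alice", "oid": "0000-1111"}
test_token = make_unsigned_jwt(sample_claims)
decoded_claims = decode_jwt_payload(test_token)
```

A token built this way is useful only for exercising the id_token branch in tests, for example by passing {'access_token': 'x', 'id_token': test_token} as token_data. Anyone can forge such a token, which is why the unverified decode is unsuitable for production.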

Source Code

def validate_azure_token(token_data: dict) -> dict:
    """
    Validate an Azure AD token and extract user information.
    
    Parameters
    ----------
    token_data : dict
        Token data from get_token_from_code
    
    Returns
    -------
    dict
        User information extracted from token
    """
    try:
        logger.info(f"Validating Azure token, keys: {token_data.keys()}")
        
        if not token_data or 'access_token' not in token_data:
            logger.error("Invalid token data - missing access_token")
            return None
            
        # For simple validation, just decode the ID token
        if 'id_token' in token_data:
            # Parse the JWT without validation (for testing only)
            # In production, you should properly validate the token signature
            import base64
            import json
            
            # Parse the JWT parts
            jwt_parts = token_data['id_token'].split('.')
            if len(jwt_parts) != 3:
                logger.error("Invalid JWT format in id_token")
                return None
                
            # Decode the payload (second part)
            payload_bytes = jwt_parts[1].encode('utf-8')
            
            # Add padding if needed
            padding_needed = len(payload_bytes) % 4
            if padding_needed:
                payload_bytes += b'=' * (4 - padding_needed)
                
            try:
                decoded_bytes = base64.urlsafe_b64decode(payload_bytes)
                claims = json.loads(decoded_bytes.decode('utf-8'))
                
                # Extract user info from claims
                user_info = {
                    'email': claims.get('email', claims.get('upn', '')),
                    'name': claims.get('name', ''),
                    'oid': claims.get('oid', ''),  # Object ID in Azure AD
                    'preferred_username': claims.get('preferred_username', '')
                }
                
                logger.info(f"Successfully extracted user info from id_token: {user_info}")
                return user_info
                
            except Exception as decode_error:
                logger.error(f"Error decoding id_token: {decode_error}")
                return None
        
        # If no id_token, try to use the access token info
        logger.warning("No id_token found, using minimal user info")
        return {
            'email': token_data.get('username', 'unknown@example.com'),
            'name': 'Unknown User',
            'token_type': token_data.get('token_type', '')
        }
    
    except Exception as e:
        logger.error(f"Error validating Azure token: {e}")
        import traceback
        logger.error(traceback.format_exc())
        return None

Parameters

Name        Type  Default  Kind
token_data  dict  -        positional_or_keyword

Parameter Details

token_data: A dictionary containing Azure AD token information, expected to be returned from a token exchange function like get_token_from_code. Must contain 'access_token' key at minimum. Ideally contains 'id_token' key with a JWT string. May also contain 'username' and 'token_type' keys for fallback scenarios.

Return Value

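Type: dict (None on failure, despite the -> dict annotation)

On success, returns a dictionary with the keys 'email' (the email claim, falling back to the UPN), 'name' (display name), 'oid' (Azure AD object ID), and 'preferred_username'. Any claim missing from the token comes back as an empty string. If id_token is not present, returns minimal info with 'email' (the 'username' value from token_data, or 'unknown@example.com'), 'name' (always 'Unknown User'), and 'token_type'. Returns None if token_data is empty, lacks 'access_token', contains a malformed id_token, or triggers an unexpected error.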
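
Because the function can return three different shapes, callers may want to tell them apart before trusting the result. The sketch below assumes a hypothetical caller-side helper, describe_user_info, which is not part of azure_auth.py. It relies only on the key sets documented above, and the sample dictionaries are made up.

```python
from typing import Optional


def describe_user_info(user_info: Optional[dict]) -> str:
    """Hypothetical caller-side helper: classify the documented return shapes."""
    if user_info is None:
        # Missing access_token, malformed id_token, or unexpected error
        return "failed"
    if "oid" in user_info:
        # Only the id_token branch sets 'oid' (possibly to an empty string)
        return "from_id_token"
    # The fallback branch returns 'email', 'name' and 'token_type' only
    return "fallback"


# Made-up examples of each documented shape
full_result = {
    "email": "alice@example.com",
    "name": "Alice",
    "oid": "0000-1111",
    "preferred_username": "alice@example.com",
}
fallback_result = {
    "email": "unknown@example.com",
    "name": "Unknown User",
    "token_type": "Bearer",
}

shapes = [describe_user_info(r) for r in (full_result, fallback_result, None)]
```

Treating the "fallback" shape as unauthenticated is a reasonable default, since its values are placeholders rather than verified identity claims.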

Dependencies

  • base64
  • json
  • logging
  • traceback

Required Imports

import logging

Conditional/Optional Imports

These imports are performed inside the function body and are only needed under specific conditions:

import base64
Condition: only when id_token is present in token_data for JWT decoding
Status: required (conditional)

import json
Condition: only when id_token is present in token_data for parsing JWT payload
Status: required (conditional)

import traceback
Condition: only when an exception occurs for detailed error logging
Status: required (conditional)

Usage Example

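import logging

# Assumption: the import path is inferred from the file location above;
# adjust it to match your project layout.
from CDocs.auth.azure_auth import validate_azure_token

# Route the function's log messages (INFO and above) to the console
logging.basicConfig(level=logging.INFO)

# Example token data from Azure AD. The token strings are truncated
# placeholders: with these values the id_token is not a valid three-part JWT,
# so the call returns None. Substitute real tokens to see user details.
token_data = {
    'access_token': 'eyJ0eXAiOiJKV1QiLCJhbGc...',
    'id_token': 'eyJ0eXAiOiJKV1QiLCJhbGciOiJSUzI1NiIsImtpZCI6Ik...',
    'token_type': 'Bearer',
    'expires_in': 3600
}

user_info = validate_azure_token(token_data)

# Always check for None before reading the result
if user_info:
    print(f"User Email: {user_info.get('email')}")
    print(f"User Name: {user_info.get('name')}")
    print(f"Azure AD OID: {user_info.get('oid')}")
else:
    print("Token validation failed")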

Best Practices

  • WARNING: This function does NOT cryptographically validate the JWT signature; the source comments mark the decoding as for testing only. In production, verify the signature against Azure AD's public keys from the JWKS endpoint, using a JWT library such as PyJWT or python-jose.
  • The function writes all diagnostics through a module-level logger, so configure logging before calling it if you want to see why a token was rejected.
  • The function returns None on any validation failure, so always check the return value before using the user information.
  • The function restores missing base64 padding before decoding, because JWT segments are base64url-encoded with the trailing '=' padding stripped.
  • The function does not check token expiration. Consider validating the 'exp' claim in the JWT payload.
  • The fallback path (when id_token is missing) returns placeholder user info and should not be relied upon for secure user identification.
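
The expiration check suggested above could look like the following. This is a minimal sketch built around a hypothetical helper, is_token_expired, which is not part of azure_auth.py. It assumes the claims dictionary has already been decoded and that 'exp' holds seconds since the Unix epoch, as the JWT specification defines it. It complements signature verification and does not replace it.

```python
import time
from typing import Optional


def is_token_expired(
    claims: dict,
    now: Optional[float] = None,
    leeway: float = 0.0,
) -> bool:
    """Hypothetical helper: True if 'exp' is missing, malformed, or in the past.

    `leeway` is a tolerance in seconds for clock skew between systems.
    """
    exp = claims.get("exp")
    # Reject missing or non-numeric values (bool is excluded explicitly,
    # because bool is a subclass of int in Python)
    if isinstance(exp, bool) or not isinstance(exp, (int, float)):
        return True
    current = time.time() if now is None else now
    return current >= exp + leeway


# Made-up timestamps for illustration; `now` is injected to keep this deterministic
still_valid = is_token_expired({"exp": 1_700_000_600}, now=1_700_000_000)
already_expired = is_token_expired({"exp": 1_700_000_000}, now=1_700_000_600)
missing_exp = is_token_expired({})
```

Treating a missing or malformed 'exp' as expired is a deliberately strict choice for this sketch; a caller with different requirements might prefer to raise an error instead.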

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function validate_azure_token_v1 96.0% similar

    Validates an Azure AD token by parsing the JWT id_token and extracting user information such as user ID, email, name, and preferred username.

    From: /tf/active/vicechatdev/docchat/auth/azure_auth.py
  • function test_azure_token 60.0% similar

    Tests Azure AD authentication by attempting to acquire an OAuth2 access token using client credentials flow for Microsoft Graph API access.

    From: /tf/active/vicechatdev/SPFCsync/diagnose_sharepoint.py
  • function validate_azure_client_secret 56.3% similar

    Validates an Azure client secret by checking for placeholder values, minimum length requirements, and common invalid patterns.

    From: /tf/active/vicechatdev/SPFCsync/validate_config.py
  • function auth_callback_v2 53.3% similar

    Flask route handler that processes OAuth 2.0 callback from Azure AD, exchanges authorization code for access tokens, and establishes user session.

    From: /tf/active/vicechatdev/vice_ai/app.py
  • function azure_callback 52.8% similar

    OAuth 2.0 callback endpoint for Azure AD authentication that exchanges authorization codes for access tokens and establishes user sessions.

    From: /tf/active/vicechatdev/docchat/app.py