🔍 Code Extractor

function parse_log_line

Maturity: 44

Parses a structured log line string and extracts timestamp, logger name, log level, and message components into a dictionary.

File: /tf/active/vicechatdev/SPFCsync/monitor.py
Lines: 15 - 34
Complexity: simple

Purpose

This function is designed to parse log lines that follow a specific format (timestamp - logger_name - level - message) and convert them into structured data. It's useful for log analysis, monitoring systems, and log aggregation tools where raw log strings need to be converted into queryable data structures. The function handles malformed lines gracefully by returning None when the pattern doesn't match or timestamp parsing fails.
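As a rough illustration of the log-analysis use case, the sketch below tallies log levels across a handful of lines and counts the ones that could not be parsed. The helper count_levels and the sample lines are hypothetical and not part of monitor.py; parse_log_line is repeated only so the snippet runs on its own.

```python
import re
from collections import Counter
from datetime import datetime


def parse_log_line(line):
    """Same logic as the function shown in the Source Code section."""
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {'timestamp': timestamp, 'logger': logger_name,
                    'level': level, 'message': message}
        except ValueError:
            pass
    return None


def count_levels(lines):
    """Hypothetical helper: tally levels and count lines that fail to parse."""
    counts = Counter()
    skipped = 0
    for line in lines:
        record = parse_log_line(line)
        if record is None:
            # Malformed lines (e.g. traceback continuation lines) are skipped.
            skipped += 1
            continue
        counts[record['level']] += 1
    return counts, skipped


# Made-up sample data for illustration only.
sample_lines = [
    '2024-01-15 14:30:45,123 - myapp.module - ERROR - Connection timeout',
    '2024-01-15 14:30:46,001 - myapp.module - INFO - Retrying',
    'Traceback (most recent call last):',
    '2024-01-15 14:30:47,250 - myapp.db - ERROR - Query failed',
]
level_counts, skipped_count = count_levels(sample_lines)
print(dict(level_counts), skipped_count)
```

Because malformed lines come back as None rather than raising, the loop needs no try/except for ordinary string input.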

Source Code

def parse_log_line(line):
    """Parse a log line and extract information."""
    # Expected format: timestamp - name - level - message
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {
                'timestamp': timestamp,
                'logger': logger_name,
                'level': level,
                'message': message
            }
        except ValueError:
            pass
    
    return None

Parameters

Name Type Default Kind
line - - positional_or_keyword

Parameter Details

line: A string representing a single log line. Expected format: 'YYYY-MM-DD HH:MM:SS,mmm - logger_name - LEVEL - message'. The function will strip whitespace from the line before processing. Can be any string, but will only successfully parse if it matches the expected log format.

Return Value

Returns a dictionary with keys 'timestamp' (datetime object), 'logger' (string), 'level' (string), and 'message' (string) if the log line is successfully parsed. Returns None if the line doesn't match the expected pattern or if the timestamp cannot be parsed. The timestamp is converted from string to a datetime object using the format '%Y-%m-%d %H:%M:%S,%f'.
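The two return paths can be checked with a short sketch like the one below. The input lines are invented for illustration, and parse_log_line is repeated so the snippet is self-contained. Note that the three digits after the comma are read by %f as a fraction of a second, so ',123' ends up as 123000 microseconds on the datetime object.

```python
import re
from datetime import datetime


def parse_log_line(line):
    """Same logic as the function shown in the Source Code section."""
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {'timestamp': timestamp, 'logger': logger_name,
                    'level': level, 'message': message}
        except ValueError:
            pass
    return None


# Leading spaces and the trailing newline are stripped before matching.
record = parse_log_line('  2024-01-15 14:30:45,123 - myapp.module - ERROR - Connection timeout\n')
print(type(record['timestamp']).__name__)  # datetime
print(record['timestamp'].microsecond)     # 123000
print(record['message'])                   # Connection timeout

# This line matches the regex, but month 13 is not a valid date, so
# strptime raises ValueError inside the function and None is returned.
bad_date = parse_log_line('2024-13-01 14:30:45,123 - myapp.module - INFO - Started')
print(bad_date)  # None
```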

Dependencies

  • re
  • datetime

Required Imports

import re
from datetime import datetime

Usage Example

import re
from datetime import datetime

def parse_log_line(line):
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {
                'timestamp': timestamp,
                'logger': logger_name,
                'level': level,
                'message': message
            }
        except ValueError:
            pass
    
    return None

# Example usage
log_line = '2024-01-15 14:30:45,123 - myapp.module - ERROR - Connection timeout'
result = parse_log_line(log_line)
if result:
    print(f"Timestamp: {result['timestamp']}")
    print(f"Logger: {result['logger']}")
    print(f"Level: {result['level']}")
    print(f"Message: {result['message']}")
else:
    print("Failed to parse log line")

# Example with invalid line
invalid_line = 'This is not a valid log line'
result = parse_log_line(invalid_line)
print(result)  # Output: None

Best Practices

  • Always check if the return value is None before accessing dictionary keys; subscripting None raises TypeError
  • The function expects milliseconds in the timestamp (3 digits after comma). Ensure your log format matches this expectation
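  • The regex pattern uses non-greedy matching (.*?) for the logger name, so the logger field ends at the first ' - ' that is followed by a word and another ' - '. Messages containing ' - ' are parsed correctly, but a logger name that itself contains ' - ' can be split in the wrong place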
  • The function strips whitespace from input, so leading/trailing spaces won't cause parsing failures
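  • For batch processing of log files, skip lines for which the function returns None so that processing continues past malformed entries. The function does not raise for malformed strings, but non-string input does raise (for example, None fails at line.strip() with AttributeError)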
  • The log level is expected to be a word character sequence (\w+), typically INFO, DEBUG, ERROR, WARNING, etc.
  • If you need to parse logs with different formats, consider modifying the regex pattern or creating format-specific variants of this function
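The matching rules listed above can be probed with a few edge cases. This is a small sketch using invented log lines, with parse_log_line repeated so it runs standalone; the comments describe what the regex in the Source Code section does with each input.

```python
import re
from datetime import datetime


def parse_log_line(line):
    """Same logic as the function shown in the Source Code section."""
    pattern = r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - (.*?) - (\w+) - (.*)'
    match = re.match(pattern, line.strip())
    if match:
        timestamp_str, logger_name, level, message = match.groups()
        try:
            timestamp = datetime.strptime(timestamp_str, '%Y-%m-%d %H:%M:%S,%f')
            return {'timestamp': timestamp, 'logger': logger_name,
                    'level': level, 'message': message}
        except ValueError:
            pass
    return None


prefix = '2024-01-15 14:30:45,123 - '

# ' - ' inside the message is safe: the lazy logger group stops at the
# first position where ' - LEVEL - ' can follow it.
msg_hyphen = parse_log_line(prefix + 'myapp - INFO - retry 3 - giving up')
print(msg_hyphen['message'])  # retry 3 - giving up

# A hyphen without surrounding spaces in the logger name is also safe.
dashed_logger = parse_log_line(prefix + 'my-app.worker - INFO - started')
print(dashed_logger['logger'])  # my-app.worker

# A logger name containing ' - ' is split early: 'east' is taken as the level.
split_logger = parse_log_line(prefix + 'svc - east - INFO - started')
print(split_logger['logger'], '|', split_logger['level'], '|', split_logger['message'])

# Exactly three fractional digits are required; six digits do not match.
six_digits = parse_log_line('2024-01-15 14:30:45,123456 - myapp - INFO - started')
print(six_digits)  # None
```

If logger names with embedded ' - ' or microsecond timestamps occur in practice, a format-specific variant of the pattern is likely the simpler fix.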

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function parse_datetime 49.0% similar

    Parses a datetime string in YYYY-MM-DD HH:MM:SS format into a Python datetime object, returning None if parsing fails.

    From: /tf/active/vicechatdev/CDocs/utils/__init__.py
  • function parse_date 46.2% similar

    Parses a date string in YYYY-MM-DD format into a datetime object, returning None if parsing fails or input is empty.

    From: /tf/active/vicechatdev/CDocs/utils/__init__.py
  • function setup_logging 45.7% similar

    Configures and initializes a Python logging system with both console and rotating file handlers, supporting customizable log levels, formats, and file rotation policies.

    From: /tf/active/vicechatdev/contract_validity_analyzer/utils/logging_utils.py
  • function setup_logging_v2 45.2% similar

    Configures Python's logging system with both console and file output, creating a timestamped log file for real document testing sessions.

    From: /tf/active/vicechatdev/contract_validity_analyzer/test_real_documents.py
  • function parse_references_section 44.6% similar

    Parses a formatted references section string and extracts structured data including reference numbers, sources, and content previews using regular expressions.

    From: /tf/active/vicechatdev/improved_convert_disclosures_to_table.py