function analyze_temporal_trends

Maturity: 42

Prints yearly counts of treatments recorded before a flock's start date or after its end date, plus a monthly breakdown of the before-start cases. Counts are grouped by the flock's start or end date, not by the treatment date.

File: /tf/active/vicechatdev/data_quality_dashboard.py
Lines: 297 - 321
Complexity: simple

Purpose

This function provides a temporal analysis of data quality issues in livestock/poultry flock management systems. It takes two pre-filtered sets of treatment records, those dated before a flock's start date and those dated after its end date, and counts them by the year (and, for before-start records, the month) of the associated flock date. The resulting counts help reveal systematic data entry errors, seasonal patterns, or year-over-year trends in data quality.

Source Code

def analyze_temporal_trends(before_start, after_end):
    """Analyze temporal trends in timing issues."""
    print("\nTEMPORAL TRENDS ANALYSIS")
    print("-" * 40)
    
    # Trends by year
    print("Before-start treatments by flock start year:")
    start_years = before_start['StartDate'].dt.year.value_counts().sort_index()
    for year, count in start_years.items():
        print(f"  {year}: {count} treatments")
    
    print("\nAfter-end treatments by flock end year:")
    end_years = after_end['EndDate'].dt.year.value_counts().sort_index()
    for year, count in end_years.items():
        print(f"  {year}: {count} treatments")
    
    # Monthly patterns
    print("\nMonthly patterns (before-start treatments):")
    if len(before_start) > 0:
        monthly = before_start['StartDate'].dt.month.value_counts().sort_index()
        months = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 
                 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']
        for month_num, count in monthly.items():
            month_name = months[month_num - 1]
            print(f"  {month_name}: {count} treatments")

Parameters

Name          Type  Default  Kind
before_start  -     -        positional_or_keyword
after_end     -     -        positional_or_keyword

Parameter Details

before_start: A pandas DataFrame containing treatment records that occurred before their associated flock's start date. Must have a 'StartDate' column with datetime values representing the flock start dates.

after_end: A pandas DataFrame containing treatment records that occurred after their associated flock's end date. Must have an 'EndDate' column with datetime values representing the flock end dates.
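
Both requirements can be checked before the call. The following is a minimal sketch and not part of data_quality_dashboard.py; the helper name validate_timing_frame and the sample frames are hypothetical.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype


def validate_timing_frame(df, date_column):
    """Hypothetical helper: raise early if df lacks a datetime-typed date_column."""
    if date_column not in df.columns:
        raise KeyError(f"missing required column: {date_column!r}")
    if not is_datetime64_any_dtype(df[date_column]):
        raise TypeError(
            f"column {date_column!r} has dtype {df[date_column].dtype}; "
            "convert it with pd.to_datetime() first"
        )


# Illustrative data only
good = pd.DataFrame({"StartDate": pd.to_datetime(["2022-01-15", "2023-02-10"])})
bad = pd.DataFrame({"StartDate": ["2022-01-15", "2023-02-10"]})  # strings, not datetimes

validate_timing_frame(good, "StartDate")  # passes silently

try:
    validate_timing_frame(bad, "StartDate")
except TypeError as exc:
    bad_frame_error = str(exc)  # explains which column needs conversion
```

Calling such a check once per DataFrame ('StartDate' for before_start, 'EndDate' for after_end) would turn a failure partway through the printed report into a single clear error before anything is printed.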

Return Value

This function does not return any value (returns None implicitly). It outputs formatted text directly to stdout showing: (1) count of before-start treatments grouped by flock start year, (2) count of after-end treatments grouped by flock end year, and (3) monthly distribution of before-start treatments with month names.
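
If the counts are needed as data rather than as printed text, the same three computations can be returned instead. This is a sketch of one possible variant, assuming the same column names; temporal_trend_counts is a hypothetical name and does not exist in the source file.

```python
import pandas as pd


def temporal_trend_counts(before_start, after_end):
    """Hypothetical variant: return the counts that analyze_temporal_trends prints."""
    return {
        "before_start_by_year": before_start["StartDate"].dt.year.value_counts().sort_index(),
        "after_end_by_year": after_end["EndDate"].dt.year.value_counts().sort_index(),
        "before_start_by_month": before_start["StartDate"].dt.month.value_counts().sort_index(),
    }


# Illustrative data only
before_start = pd.DataFrame(
    {"StartDate": pd.to_datetime(["2022-01-15", "2022-03-20", "2023-02-10"])}
)
after_end = pd.DataFrame({"EndDate": pd.to_datetime(["2022-06-30", "2023-08-15"])})

counts = temporal_trend_counts(before_start, after_end)
by_year = counts["before_start_by_year"].to_dict()
by_month = counts["before_start_by_month"].to_dict()
```

Each value in the returned dictionary is a pandas Series indexed by year or month number, so it can be plotted, exported, or compared across runs without parsing stdout.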

Dependencies

  • pandas

Required Imports

import pandas as pd

Usage Example

import pandas as pd

# Create sample data with timing issues
before_start_data = pd.DataFrame({
    'TreatmentID': [1, 2, 3],
    'StartDate': pd.to_datetime(['2022-01-15', '2022-03-20', '2023-02-10']),
    'TreatmentDate': pd.to_datetime(['2022-01-10', '2022-03-15', '2023-02-05'])
})

after_end_data = pd.DataFrame({
    'TreatmentID': [4, 5],
    'EndDate': pd.to_datetime(['2022-06-30', '2023-08-15']),
    'TreatmentDate': pd.to_datetime(['2022-07-05', '2023-08-20'])
})

# Analyze temporal trends
analyze_temporal_trends(before_start_data, after_end_data)

Best Practices

  • Convert the 'StartDate' and 'EndDate' columns with pd.to_datetime() before calling this function; the .dt accessor raises AttributeError on columns that are not datetime-typed
  • Only the monthly section is guarded by len(before_start) > 0. The yearly sections run unconditionally: an empty DataFrame with datetime-typed columns prints the headings with no rows, while an empty DataFrame whose date column is not datetime-typed still raises AttributeError
  • This function prints directly to stdout, so redirect stdout if you need to capture the results programmatically
  • The function assumes a 'StartDate' column in before_start and an 'EndDate' column in after_end; a missing column raises KeyError, so validate column names beforehand or wrap the call in a try-except block
  • For large datasets, note that each .dt.year or .dt.month call builds an intermediate Series as long as the input column, so memory use grows with the number of rows even though the resulting counts are small
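
Capturing the printed report can be sketched with the standard library's contextlib.redirect_stdout. The helper capture_stdout is hypothetical, and print_year_counts is a small stand-in that mimics the yearly output format so the example runs on its own; in practice analyze_temporal_trends would be passed instead.

```python
import contextlib
import io

import pandas as pd


def capture_stdout(func, *args, **kwargs):
    """Hypothetical helper: run func and return everything it printed as a string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def print_year_counts(frame, column):
    """Stand-in for analyze_temporal_trends: prints yearly counts in the same format."""
    for year, count in frame[column].dt.year.value_counts().sort_index().items():
        print(f"  {year}: {count} treatments")


# Illustrative data only
before_start = pd.DataFrame(
    {"StartDate": pd.to_datetime(["2022-01-15", "2022-03-20", "2023-02-10"])}
)

report = capture_stdout(print_year_counts, before_start, "StartDate")
lines = report.splitlines()  # one entry per printed line
```

With the real function, the call would be capture_stdout(analyze_temporal_trends, before_start_data, after_end_data), and the returned string could be written to a log file or compared in a test.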

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function show_critical_errors 75.7% similar

    Displays critical data quality errors in treatment records, focusing on date anomalies including 1900 dates, extreme future dates, and extreme past dates relative to flock lifecycles.

    From: /tf/active/vicechatdev/data_quality_dashboard.py
  • function create_data_quality_dashboard_v1 75.4% similar

    Creates an interactive data quality dashboard for analyzing treatment timing issues in poultry flock management data by loading and processing CSV files containing timing anomalies.

    From: /tf/active/vicechatdev/data_quality_dashboard.py
  • function create_data_quality_dashboard 74.8% similar

    Creates an interactive command-line dashboard for analyzing data quality issues in treatment timing data, specifically focusing on treatments administered outside of flock lifecycle dates.

    From: /tf/active/vicechatdev/data_quality_dashboard.py
  • function show_problematic_flocks 73.2% similar

    Analyzes and displays problematic flocks from a dataset by identifying those with systematic timing issues in their treatment records, categorizing them by severity and volume.

    From: /tf/active/vicechatdev/data_quality_dashboard.py
  • function analyze_flock_type_patterns 72.2% similar

    Analyzes and prints timing pattern statistics for flock data by categorizing issues that occur before start time and after end time, grouped by flock type.

    From: /tf/active/vicechatdev/data_quality_dashboard.py