show_problematic_flocks - Code Extractor

function show_problematic_flocks

Maturity: 42

Analyzes and displays problematic flocks from a dataset by identifying those with systematic timing issues in their treatment records, categorizing them by severity and volume.

File:
/tf/active/vicechatdev/data_quality_dashboard.py

Lines:
267 - 295

Complexity:
simple

Purpose

This function provides a diagnostic report for data quality analysis in livestock/poultry management systems. It identifies flocks with timing data entry errors by examining treatment records, highlighting both systematic issues (100% error rate) and high-volume flocks with significant but partial timing problems. This helps data managers prioritize data cleaning efforts and identify systematic data entry problems.

Source Code

def show_problematic_flocks(flocks_issues):
    """Show the most problematic flocks."""
    print("\nMOST PROBLEMATIC FLOCKS")
    print("-" * 40)
    
    # Flocks with 100% timing issues
    perfect_issues = flocks_issues[flocks_issues['TimingIssueRate'] == 1.0]
    print(f"Flocks with 100% timing issues: {len(perfect_issues)}")
    print("(These likely have systematic data entry errors)")
    
    if len(perfect_issues) > 0:
        print("\nTop 10 flocks with 100% issues (by treatment count):")
        top_perfect = perfect_issues.nlargest(10, 'TotalTreatments')
        for _, flock in top_perfect.iterrows():
            print(f"  {flock['FlockCD']}: {flock['TotalTreatments']} treatments, {flock['Type']} type")
    
    # Flocks with partial issues but high volume
    partial_issues = flocks_issues[
        (flocks_issues['TimingIssueRate'] < 1.0) & 
        (flocks_issues['TimingIssueRate'] > 0.1) &
        (flocks_issues['TotalTreatments'] >= 10)
    ]
    
    if len(partial_issues) > 0:
        print("\nHigh-volume flocks with significant timing issues (10+ treatments, >10% issues):")
        top_partial = partial_issues.nlargest(10, 'TotalTreatments')
        for _, flock in top_partial.iterrows():
            rate = flock['TimingIssueRate'] * 100
            print(f"  {flock['FlockCD']}: {rate:.1f}% issues ({flock['TimingIssueCount']}/{flock['TotalTreatments']} treatments)")

Parameters

Name	Type	Default	Kind
`flocks_issues`	-	-	positional_or_keyword

Parameter Details

flocks_issues: A pandas DataFrame with one row per flock and the following expected columns: 'TimingIssueRate' (float from 0.0 to 1.0, the fraction of the flock's treatments that have timing issues, not a percentage), 'FlockCD' (string identifier for the flock), 'TotalTreatments' (integer, total number of treatments for the flock), 'Type' (string, type or category of the flock), and 'TimingIssueCount' (integer, number of treatments with timing issues). The function does not compute these metrics; the DataFrame must already be aggregated to flock level before it is passed in.
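The source does not show how flocks_issues is built. A minimal sketch of one way to pre-compute it from treatment-level records follows. The helper name build_flocks_issues and the per-treatment boolean column 'HasTimingIssue' are assumptions made for illustration; only the output column names come from the documentation above.

```python
import pandas as pd


def build_flocks_issues(treatments: pd.DataFrame) -> pd.DataFrame:
    """Aggregate treatment-level rows into one row per flock.

    Assumes hypothetical input columns: 'FlockCD', 'Type', and a boolean
    'HasTimingIssue' flag for each treatment record.
    """
    flocks_issues = (
        treatments.groupby("FlockCD")
        .agg(
            TotalTreatments=("HasTimingIssue", "size"),   # rows per flock
            TimingIssueCount=("HasTimingIssue", "sum"),   # True counts as 1
            Type=("Type", "first"),                       # assumes one type per flock
        )
        .reset_index()
    )
    # Rate is stored as a fraction (0.0-1.0), as the function expects.
    flocks_issues["TimingIssueRate"] = (
        flocks_issues["TimingIssueCount"] / flocks_issues["TotalTreatments"]
    )
    return flocks_issues


# Illustrative data only
treatments = pd.DataFrame({
    "FlockCD": ["F001", "F001", "F002", "F002", "F002", "F002"],
    "Type": ["Broiler", "Broiler", "Layer", "Layer", "Layer", "Layer"],
    "HasTimingIssue": [True, True, False, True, False, False],
})
flocks_issues = build_flocks_issues(treatments)
```

The resulting frame has the five columns listed above and can be passed to show_problematic_flocks directly.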

Return Value

This function returns None. All results are printed to the console: (1) the number of flocks with a 100% timing issue rate, followed by the top 10 of those flocks by treatment count, and (2) the top 10 high-volume flocks (10 or more treatments) by treatment count whose issue rate is above 10% but below 100%, each with its issue rate and issue count.
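Because the report is only printed, a caller that needs the text has to capture standard output. A minimal sketch using the standard library's contextlib.redirect_stdout follows. The helper capture_report and the stand-in demo_report are hypothetical names; in real use the stand-in would be replaced by show_problematic_flocks.

```python
import contextlib
import io


def capture_report(report_func, *args, **kwargs) -> str:
    """Run a function that prints and return what it wrote to stdout."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        report_func(*args, **kwargs)
    return buffer.getvalue()


# Stand-in for show_problematic_flocks so this sketch runs on its own.
def demo_report(n_flocks):
    print("MOST PROBLEMATIC FLOCKS")
    print(f"Flocks with 100% timing issues: {n_flocks}")


text = capture_report(demo_report, 3)
```

With the real function, the call would be capture_report(show_problematic_flocks, flocks_issues).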

Dependencies

pandas

Required Imports

import pandas as pd

Usage Example

import pandas as pd

# Create sample flocks_issues DataFrame
flocks_issues = pd.DataFrame({
    'FlockCD': ['F001', 'F002', 'F003', 'F004', 'F005'],
    'TotalTreatments': [50, 25, 15, 100, 8],
    'TimingIssueCount': [50, 25, 3, 20, 1],
    'TimingIssueRate': [1.0, 1.0, 0.2, 0.2, 0.125],
    'Type': ['Broiler', 'Layer', 'Broiler', 'Layer', 'Broiler']
})

# Display problematic flocks report
show_problematic_flocks(flocks_issues)
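
To see which sample flocks land in each category, the two filters can be applied separately and returned as DataFrames rather than printed. This is a sketch, not part of the original module: split_problematic_flocks and its min_rate and min_treatments parameters are hypothetical, with defaults that mirror the thresholds hardcoded in the function.

```python
import pandas as pd


def split_problematic_flocks(flocks_issues, min_rate=0.1, min_treatments=10):
    """Return (systematic, partial) frames using the same filters as the report."""
    rate = flocks_issues["TimingIssueRate"]
    # Every treatment in the flock has a timing issue.
    systematic = flocks_issues[rate == 1.0]
    # Some, but not all, treatments have issues, and the flock is high-volume.
    partial = flocks_issues[
        (rate < 1.0)
        & (rate > min_rate)
        & (flocks_issues["TotalTreatments"] >= min_treatments)
    ]
    return systematic, partial


# Same sample data as the usage example
flocks_issues = pd.DataFrame({
    "FlockCD": ["F001", "F002", "F003", "F004", "F005"],
    "TotalTreatments": [50, 25, 15, 100, 8],
    "TimingIssueCount": [50, 25, 3, 20, 1],
    "TimingIssueRate": [1.0, 1.0, 0.2, 0.2, 0.125],
    "Type": ["Broiler", "Layer", "Broiler", "Layer", "Broiler"],
})

systematic, partial = split_problematic_flocks(flocks_issues)
print(systematic["FlockCD"].tolist())  # ['F001', 'F002']
print(partial["FlockCD"].tolist())     # ['F003', 'F004']
```

F005 has a 12.5% issue rate but only 8 treatments, so it falls below the volume threshold and appears in neither group.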

Best Practices

Ensure the input DataFrame contains all required columns (TimingIssueRate, FlockCD, TotalTreatments, Type, TimingIssueCount) before calling this function
Pre-calculate TimingIssueRate as a float between 0.0 and 1.0 (not as a percentage)
This function is designed for console output; redirect stdout if you need to capture the output programmatically
The function uses hardcoded thresholds (10+ treatments, >10% issues) which may need adjustment based on your dataset characteristics
Consider the output as a diagnostic tool for identifying data quality issues rather than production reporting
The function assumes FlockCD is a meaningful identifier; ensure it's populated and unique in your dataset
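
Several of the checks above can be run before calling the function. The following is a minimal sketch under stated assumptions: validate_flocks_issues is a hypothetical helper, not part of data_quality_dashboard.py, and it only covers the column, rate-range, and FlockCD checks named in this list.

```python
import pandas as pd

REQUIRED_COLUMNS = {
    "FlockCD", "TotalTreatments", "TimingIssueCount", "TimingIssueRate", "Type",
}


def validate_flocks_issues(flocks_issues: pd.DataFrame) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    missing = REQUIRED_COLUMNS - set(flocks_issues.columns)
    if missing:
        # Later checks need these columns, so stop here.
        problems.append(f"missing columns: {sorted(missing)}")
        return problems
    # between() is inclusive on both ends; NaN rates are also flagged.
    if not flocks_issues["TimingIssueRate"].between(0.0, 1.0).all():
        problems.append("TimingIssueRate outside 0.0-1.0 (stored as a percentage?)")
    if flocks_issues["FlockCD"].isna().any():
        problems.append("FlockCD has missing values")
    if flocks_issues["FlockCD"].duplicated().any():
        problems.append("FlockCD is not unique")
    return problems


# Illustrative data: the rate is mistakenly stored as a percentage
bad = pd.DataFrame({
    "FlockCD": ["F001", "F001"],
    "TotalTreatments": [50, 25],
    "TimingIssueCount": [10, 5],
    "TimingIssueRate": [20.0, 20.0],
    "Type": ["Broiler", "Layer"],
})
issues = validate_flocks_issues(bad)
```

Returning a list instead of raising lets the caller decide whether a problem should block the report or only be logged.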

Similar Components

AI-powered semantic similarity - components with related functionality:

function analyze_flock_type_patterns 76.1% similar

Analyzes and prints timing pattern statistics for flock data by categorizing issues that occur before start time and after end time, grouped by flock type.
From: /tf/active/vicechatdev/data_quality_dashboard.py

function quick_clean 75.0% similar

Cleans flock data by identifying and removing flocks that have treatment records with timing inconsistencies (treatments administered outside the flock's start/end date range).
From: /tf/active/vicechatdev/quick_cleaner.py

function show_critical_errors 74.3% similar

Displays critical data quality errors in treatment records, focusing on date anomalies including 1900 dates, extreme future dates, and extreme past dates relative to flock lifecycles.
From: /tf/active/vicechatdev/data_quality_dashboard.py

function analyze_temporal_trends 73.2% similar

Analyzes and prints temporal trends in timing issues for treatments that occur before flock start dates or after flock end dates, breaking down occurrences by year and month.
From: /tf/active/vicechatdev/data_quality_dashboard.py

function create_data_quality_dashboard 72.9% similar

Creates an interactive command-line dashboard for analyzing data quality issues in treatment timing data, specifically focusing on treatments administered outside of flock lifecycle dates.
From: /tf/active/vicechatdev/data_quality_dashboard.py
