
function create_data_quality_dashboard_v1

Maturity: 44

Creates an interactive data quality dashboard for analyzing treatment timing issues in poultry flock management data by loading and processing CSV files containing timing anomalies.

File: /tf/active/vicechatdev/data_quality_dashboard.py
Lines: 76 - 102
Complexity: moderate

Purpose

This function serves as the main entry point for a data quality dashboard that visualizes and analyzes treatment timing issues in poultry flock data. It loads pre-generated CSV files containing treatments administered before flock start dates, after flock end dates, severe timing issues, and flocks with timing problems. The function is designed to help identify and investigate data quality issues in treatment administration records.

Source Code

def create_data_quality_dashboard():
    """Create an interactive data quality dashboard."""
    
    print("TREATMENT TIMING DATA QUALITY DASHBOARD")
    print("=" * 50)
    
    # Dataset selection
    dataset_choice = select_dataset()
    if dataset_choice is None:
        return
    
    # Load data
    try:
        before_start = pd.read_csv("/tf/active/timing_analysis_output/treatments_before_start.csv")
        after_end = pd.read_csv("/tf/active/timing_analysis_output/treatments_after_end.csv")
        severe_cases = pd.read_csv("/tf/active/timing_analysis_output/severe_timing_issues.csv")
        flocks_issues = pd.read_csv("/tf/active/timing_analysis_output/flocks_with_timing_issues.csv")
        
        # Convert dates
        before_start['AdministeredDate'] = pd.to_datetime(before_start['AdministeredDate'])
        before_start['StartDate'] = pd.to_datetime(before_start['StartDate'])
        after_end['AdministeredDate'] = pd.to_datetime(after_end['AdministeredDate'])
        after_end['EndDate'] = pd.to_datetime(after_end['EndDate'])
        
    except FileNotFoundError:
        print("Error: Analysis output files not found. Please run the main analysis first.")
        return
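
The load-then-convert steps above could also be written with the parse_dates argument of pd.read_csv. The following is a minimal sketch, not the author's implementation: the helper name load_timing_outputs and the DATE_COLUMNS mapping are hypothetical, while the file and column names are taken from the excerpt.

```python
from pathlib import Path

import pandas as pd

# Hypothetical mapping (not in the original source): file name -> date columns.
# File and column names are the ones used in the excerpt above.
DATE_COLUMNS = {
    "treatments_before_start.csv": ["AdministeredDate", "StartDate"],
    "treatments_after_end.csv": ["AdministeredDate", "EndDate"],
    "severe_timing_issues.csv": [],
    "flocks_with_timing_issues.csv": [],
}


def load_timing_outputs(output_dir):
    """Load the four analysis CSVs, parsing the known date columns on read."""
    output_dir = Path(output_dir)
    frames = {}
    for file_name, date_cols in DATE_COLUMNS.items():
        # parse_dates converts the listed columns while reading;
        # False disables date parsing for files with no date columns listed.
        frames[file_name] = pd.read_csv(
            output_dir / file_name, parse_dates=date_cols or False
        )
    return frames
```

One behavioral difference is worth keeping in mind: when read_csv cannot parse a listed column it returns that column unaltered as strings, whereas pd.to_datetime raises by default. The explicit conversion in the original is therefore the stricter of the two.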

Return Value

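Returns None in every case. The function works through side effects: it prints dashboard information to the console and may display visualizations in the part of the implementation that lies beyond the excerpt shown. It returns early if select_dataset() returns None or if any of the required CSV files is missing, so a caller cannot tell a completed run from an early exit by the return value alone.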

Dependencies

  • pandas
  • matplotlib
  • seaborn
  • datetime (standard library)
  • warnings (standard library)
  • os (standard library)

Required Imports

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import warnings
import os

Usage Example

# Ensure the required CSV files exist in the expected directory
# and the select_dataset() function is defined

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import warnings
import os

# Define the select_dataset function (example implementation)
def select_dataset():
    print("Select dataset: 1) Dataset A, 2) Dataset B")
    choice = input("Enter choice (1 or 2): ")
    return choice if choice in ['1', '2'] else None

# Call the dashboard function
create_data_quality_dashboard()

# The function will:
# 1. Print a dashboard header
# 2. Prompt for dataset selection
# 3. Load and process timing analysis CSV files
# 4. Convert date columns to datetime format
# 5. Display data quality metrics (implementation continues beyond shown code)
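
The example above defines select_dataset() in the calling script, which works when create_data_quality_dashboard() lives in that same script. If the function is instead imported from another module, Python resolves select_dataset in that module's globals at call time, not in the caller's. The sketch below demonstrates the lookup rule with a stand-in module; the module name dashboard_demo and the function create_dashboard are hypothetical and exist only for this illustration.

```python
import types

# Build a stand-in module whose function calls a helper it does not define.
demo = types.ModuleType("dashboard_demo")
exec(
    "def create_dashboard():\n"
    "    return select_dataset()\n",
    demo.__dict__,
)


def select_dataset():
    # Defined in the caller's namespace, so demo.create_dashboard cannot see it.
    return "1"


try:
    demo.create_dashboard()
    lookup_failed = False
except NameError:
    # The name is searched in the stand-in module's globals, then builtins.
    lookup_failed = True

# Attaching the helper to the module itself makes the lookup succeed.
demo.select_dataset = select_dataset
result = demo.create_dashboard()
```

The same assignment technique can supply a non-interactive select_dataset when the dashboard is exercised from a test.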

Best Practices

  • Ensure all required CSV files are generated by running the main timing analysis before calling this function
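  • The function expects 'AdministeredDate' and 'StartDate' columns in treatments_before_start.csv and 'AdministeredDate' and 'EndDate' columns in treatments_after_end.csv; a missing column raises a KeyError that the function does not catch
  • The function catches FileNotFoundError itself, prints an error message, and returns early; if that message appears, run the prerequisite analysis scripts and call the function again
  • select_dataset() must be resolvable from the module that defines this function, because Python looks the name up in that module's globals at call time
  • Consider wrapping the function call in a try-except block to handle unexpected data format issues, such as a KeyError for a missing column or a ValueError for an unparseable date
  • Ensure the process has read permission on the '/tf/active/timing_analysis_output/' directory; a PermissionError is not caught by the function
  • The function parses dates with pd.to_datetime, which may fail if date formats in the CSV files are inconsistent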
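
Several of the points above can be checked before the dashboard is started. The following is a minimal sketch under stated assumptions: the helper names find_input_problems and count_unparseable_dates and the REQUIRED_COLUMNS mapping are hypothetical, while the file and column names come from the source excerpt. The output directory is passed in by the caller.

```python
from pathlib import Path

import pandas as pd

# File and column names come from the source excerpt; the mapping is hypothetical.
REQUIRED_COLUMNS = {
    "treatments_before_start.csv": ["AdministeredDate", "StartDate"],
    "treatments_after_end.csv": ["AdministeredDate", "EndDate"],
    "severe_timing_issues.csv": [],
    "flocks_with_timing_issues.csv": [],
}


def find_input_problems(output_dir):
    """Return a list of readable problems; an empty list means inputs look usable."""
    problems = []
    for file_name, columns in REQUIRED_COLUMNS.items():
        path = Path(output_dir) / file_name
        if not path.is_file():
            problems.append(f"missing file: {file_name}")
            continue
        # nrows=0 reads only the header row, so the check stays cheap on large files.
        header = pd.read_csv(path, nrows=0).columns
        for column in columns:
            if column not in header:
                problems.append(f"{file_name}: missing column {column}")
    return problems


def count_unparseable_dates(series):
    """Count non-empty values that pd.to_datetime cannot parse."""
    parsed = pd.to_datetime(series, errors="coerce")
    # Values that were present but became NaT failed to parse.
    return int((parsed.isna() & series.notna()).sum())
```

A caller could run find_input_problems("/tf/active/timing_analysis_output") first and start the dashboard only when the returned list is empty. count_unparseable_dates can then be applied to each date column to see how many values would break the stricter conversion.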

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function create_data_quality_dashboard 93.7% similar

    Creates an interactive command-line dashboard for analyzing data quality issues in treatment timing data, specifically focusing on treatments administered outside of flock lifecycle dates.

    From: /tf/active/vicechatdev/data_quality_dashboard.py
  • function analyze_temporal_trends 75.4% similar

    Analyzes and prints temporal trends in timing issues for treatments that occur before flock start dates or after flock end dates, breaking down occurrences by year and month.

    From: /tf/active/vicechatdev/data_quality_dashboard.py
  • function quick_clean 72.5% similar

    Cleans flock data by identifying and removing flocks that have treatment records with timing inconsistencies (treatments administered outside the flock's start/end date range).

    From: /tf/active/vicechatdev/quick_cleaner.py
  • function show_critical_errors 71.7% similar

    Displays critical data quality errors in treatment records, focusing on date anomalies including 1900 dates, extreme future dates, and extreme past dates relative to flock lifecycles.

    From: /tf/active/vicechatdev/data_quality_dashboard.py
  • function show_problematic_flocks 70.3% similar

    Analyzes and displays problematic flocks from a dataset by identifying those with systematic timing issues in their treatment records, categorizing them by severity and volume.

    From: /tf/active/vicechatdev/data_quality_dashboard.py