load_analysis_data - Code Extractor

function load_analysis_data

Maturity: 42

Loads CSV dataset(s) into pandas DataFrames based on dataset configuration, supporting both single dataset loading and comparison mode with two datasets.

File:
/tf/active/vicechatdev/data_quality_dashboard.py

Lines:
56 - 74

Complexity:
simple

Purpose

This function serves as a data loader for analysis workflows, handling two distinct loading patterns: (1) comparison mode where two CSV files (original and cleaned versions) are loaded for comparative analysis, and (2) single dataset mode where one CSV file is loaded. It returns a dictionary containing the loaded DataFrame(s) and metadata about the dataset type.

Source Code

def load_analysis_data(dataset_info):
    """Load analysis data based on dataset selection."""
    if dataset_info['type'] == 'compare':
        print("Loading data for comparison analysis...")
        # Load both datasets for comparison
        original_flocks = pd.read_csv(dataset_info['original'])
        cleaned_flocks = pd.read_csv(dataset_info['cleaned'])
        return {
            'original_flocks': original_flocks,
            'cleaned_flocks': cleaned_flocks,
            'type': 'compare'
        }
    else:
        print(f"Loading {dataset_info['type']} dataset...")
        flocks = pd.read_csv(dataset_info['path'])
        return {
            'flocks': flocks,
            'type': dataset_info['type']
        }

Parameters

Name	Type	Default	Kind
`dataset_info`	-	-	positional_or_keyword

Parameter Details

dataset_info: A dictionary containing dataset configuration. For comparison mode, must include keys: 'type' (set to 'compare'), 'original' (path to original CSV), and 'cleaned' (path to cleaned CSV). For single dataset mode, must include keys: 'type' (dataset type identifier, any string except 'compare') and 'path' (path to CSV file). All paths should be valid file system paths to CSV files.
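The key requirements above can be checked before any file is read. The following is a minimal sketch, not part of data_quality_dashboard.py: the helper name validate_dataset_info and the REQUIRED_KEYS / DEFAULT_KEYS tables are hypothetical and simply mirror the keys the function reads in each mode.

```python
# Hypothetical helper; it is not defined in data_quality_dashboard.py.
# Keys the loader reads in comparison mode.
REQUIRED_KEYS = {
    "compare": ("type", "original", "cleaned"),
}
# Keys the loader reads for every other 'type' value.
DEFAULT_KEYS = ("type", "path")


def validate_dataset_info(dataset_info):
    """Raise a descriptive error if dataset_info lacks keys the loader reads."""
    if not isinstance(dataset_info, dict):
        raise TypeError("dataset_info must be a dict")
    if "type" not in dataset_info:
        raise KeyError("dataset_info is missing the 'type' key")
    required = REQUIRED_KEYS.get(dataset_info["type"], DEFAULT_KEYS)
    missing = [key for key in required if key not in dataset_info]
    if missing:
        raise KeyError(f"dataset_info is missing required keys: {missing}")
    return dataset_info
```

Calling validate_dataset_info(config) before load_analysis_data(config) would turn a bare KeyError from inside the loader into a message that names the missing keys.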

Return Value

Returns a dictionary with different structures based on dataset type. For comparison mode (type='compare'): {'original_flocks': DataFrame, 'cleaned_flocks': DataFrame, 'type': 'compare'}. For single dataset mode: {'flocks': DataFrame, 'type': <dataset_type_string>}. All DataFrames are pandas DataFrame objects loaded from CSV files.
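Because the two modes return different keys, calling code usually has to branch on 'type'. One possible way to hide that branch is sketched below; frames_from_result is a hypothetical helper, not something the module provides, and the labels it returns are an illustrative choice.

```python
def frames_from_result(result):
    """Return a {label: DataFrame} mapping for either return shape.

    Hypothetical helper: 'result' is the dictionary returned by
    load_analysis_data.
    """
    if result["type"] == "compare":
        # Comparison mode carries two DataFrames under fixed keys.
        return {
            "original": result["original_flocks"],
            "cleaned": result["cleaned_flocks"],
        }
    # Single dataset mode carries one DataFrame, labelled by its type.
    return {result["type"]: result["flocks"]}
```

Downstream code can then iterate over frames_from_result(result).items() without checking the mode itself.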

Dependencies

pandas

Required Imports

import pandas as pd

Usage Example

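import pandas as pd
# Assumes load_analysis_data is already defined or imported in this scope
# (it lives in data_quality_dashboard.py).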

# Example 1: Load single dataset
dataset_config = {
    'type': 'original',
    'path': 'data/flocks.csv'
}
result = load_analysis_data(dataset_config)
flocks_df = result['flocks']
print(f"Loaded {len(flocks_df)} rows of type {result['type']}")

# Example 2: Load comparison datasets
comparison_config = {
    'type': 'compare',
    'original': 'data/original_flocks.csv',
    'cleaned': 'data/cleaned_flocks.csv'
}
result = load_analysis_data(comparison_config)
original_df = result['original_flocks']
cleaned_df = result['cleaned_flocks']
print(f"Loaded {len(original_df)} original and {len(cleaned_df)} cleaned records")

Best Practices

Ensure CSV files exist before calling this function to avoid FileNotFoundError
Validate the structure of dataset_info dictionary before passing to ensure required keys are present
Handle potential pandas CSV parsing errors (e.g., encoding issues, malformed CSV) with try-except blocks when calling this function
The function prints status messages to stdout; consider redirecting or capturing output in production environments
For large CSV files, consider memory constraints as entire datasets are loaded into memory
The returned dictionary structure differs based on dataset type; always check the 'type' key before accessing DataFrame keys
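Several of the points above (missing files, parsing errors, stdout messages) can be handled in one wrapper around the call. This is a sketch under stated assumptions rather than the module's own behaviour: load_quietly is a hypothetical name, the loader is passed in as an argument so the wrapper does not depend on how the module is imported, and the error strings are illustrative.

```python
import contextlib
import io


def load_quietly(loader, dataset_info):
    """Call loader(dataset_info), capturing its stdout and reporting load errors.

    Returns a (result, log, error) tuple. 'result' is None when loading
    fails, and 'error' is None when it succeeds.
    """
    buffer = io.StringIO()
    try:
        # Status messages printed by the loader land in 'buffer', not stdout.
        with contextlib.redirect_stdout(buffer):
            result = loader(dataset_info)
    except FileNotFoundError as exc:
        return None, buffer.getvalue(), f"file not found: {exc}"
    except KeyError as exc:
        return None, buffer.getvalue(), f"missing config key: {exc}"
    except ValueError as exc:
        # pandas.errors.ParserError and EmptyDataError subclass ValueError,
        # as does UnicodeDecodeError, so this covers malformed or
        # mis-encoded CSV files.
        return None, buffer.getvalue(), f"could not parse CSV: {exc}"
    return result, buffer.getvalue(), None
```

A call such as result, log, error = load_quietly(load_analysis_data, dataset_config) leaves the decision about logging or re-raising to the caller.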

Similar Components

AI-powered semantic similarity - components with related functionality:

function load_dataset 74.2% similar

Loads a CSV dataset from a specified file path using pandas and returns it as a DataFrame with error handling for file not found and general exceptions.
From: /tf/active/vicechatdev/vice_ai/smartstat_scripts/e1ecec5f-4ea5-49c5-b4f5-d051ce851294/project_1/analysis.py

function load_data 73.8% similar

Loads a CSV dataset from a specified filepath using pandas, with fallback to creating sample data if the file is not found.
From: /tf/active/vicechatdev/vice_ai/smartstat_scripts/5a059cb7-3903-4020-8519-14198d1f39c9/analysis_1.py

function compare_datasets 61.0% similar

Analyzes and compares two pandas DataFrames containing flock data (original vs cleaned), printing detailed statistics about removed records, type distributions, and impact assessment.
From: /tf/active/vicechatdev/data_quality_dashboard.py

function explore_data 59.8% similar

Performs comprehensive exploratory data analysis on a pandas DataFrame, printing dataset overview, data types, missing values, descriptive statistics, and identifying categorical and numerical variables.
From: /tf/active/vicechatdev/vice_ai/smartstat_scripts/5a059cb7-3903-4020-8519-14198d1f39c9/analysis_1.py

function select_dataset 56.5% similar

Interactive command-line function that prompts users to select between original, cleaned, or comparison of flock datasets for analysis.
From: /tf/active/vicechatdev/data_quality_dashboard.py
