🔍 Code Extractor

function test_us_csv

Maturity: 45

A unit test function that validates the smart_read_csv function's ability to correctly parse US-formatted CSV files with comma delimiters and point decimal separators.

File: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
Lines: 48 - 81
Complexity: simple

Purpose

This test function verifies that the smart_read_csv utility can read and parse CSV files in US format (comma delimiters, point decimal separators). It writes a temporary CSV file with four rows of sample data, reads it back with smart_read_csv, and asserts that the Weight column is parsed as a floating-point type (float64 or Float64) and that its first value is approximately 65.5. The loaded DataFrame, the column dtypes, and the mean of Weight are printed for inspection but are not asserted.
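
For reference, the parsing result the test expects can be reproduced with plain pandas. The sketch below does not call smart_read_csv, whose implementation is not shown on this page. Instead it passes the delimiter and decimal separator to pd.read_csv explicitly, which is an assumption about what auto-detection should arrive at for this data. The helper name read_us_csv is hypothetical.

```python
import io

import pandas as pd

# Same sample data as the test writes to its temporary file.
US_DATA = """Name,Weight,Height,Temperature
Alice,65.5,170.2,36.6
Bob,78.3,182.5,37.1
Charlie,71.2,175.8,36.9
Diana,58.7,165.3,36.7"""


def read_us_csv(text: str) -> pd.DataFrame:
    """Hypothetical stand-in: parse US-format CSV text with explicit settings."""
    return pd.read_csv(io.StringIO(text), sep=",", decimal=".")


df = read_us_csv(US_DATA)

# Columns pandas recognised as numeric; Name stays non-numeric.
numeric_columns = df.select_dtypes(include="number").columns.tolist()
print(numeric_columns)  # ['Weight', 'Height', 'Temperature']
```

If smart_read_csv detects the format correctly, its result for this file should match the DataFrame produced here.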

Source Code

def test_us_csv():
    """Test US CSV format (comma delimiter, point decimal)"""
    print("\n" + "="*60)
    print("Test 2: US CSV (comma delimiter, point decimals)")
    print("="*60)
    
    # Create test US CSV
    test_file = Path('/tmp/test_us.csv')
    us_data = """Name,Weight,Height,Temperature
Alice,65.5,170.2,36.6
Bob,78.3,182.5,37.1
Charlie,71.2,175.8,36.9
Diana,58.7,165.3,36.7"""
    
    test_file.write_text(us_data)
    
    # Read with smart_read_csv
    df = smart_read_csv(str(test_file))
    
    print(f"\nLoaded DataFrame:")
    print(df)
    print(f"\nColumn types:")
    print(df.dtypes)
    print(f"\nWeight column (should be numeric):")
    print(f"  Type: {df['Weight'].dtype}")
    print(f"  Values: {df['Weight'].tolist()}")
    print(f"  Mean: {df['Weight'].mean():.2f}")
    
    # Verify conversion
    assert df['Weight'].dtype in ['float64', 'Float64'], f"Weight should be numeric, got {df['Weight'].dtype}"
    assert 65.4 < df['Weight'].iloc[0] < 65.6, f"Alice's weight should be ~65.5, got {df['Weight'].iloc[0]}"
    
    print("\n✓ US CSV test PASSED")
    test_file.unlink()
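
The dtype assertion in the source compares a dtype object against a list of strings. A possible alternative, shown here only as a sketch, uses the public pandas helper pandas.api.types.is_float_dtype for the type check and math.isclose for the value check. The DataFrame is built by hand as a stand-in for the result of smart_read_csv, and the helper name check_weight_column is hypothetical.

```python
import math

import pandas as pd
from pandas.api.types import is_float_dtype

# Hand-built stand-in for the DataFrame that smart_read_csv would return.
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Weight": [65.5, 78.3]})


def check_weight_column(frame: pd.DataFrame) -> None:
    """Assert that Weight is a float column whose first value is about 65.5."""
    weight = frame["Weight"]
    assert is_float_dtype(weight), f"Weight should be float, got {weight.dtype}"
    first = weight.iloc[0]
    # Same tolerance as the original range check (65.4 < x < 65.6).
    assert math.isclose(first, 65.5, abs_tol=0.1), f"expected ~65.5, got {first}"


check_weight_column(df)
```

A column that was left as text, for example because a decimal comma was not recognised, fails the first assertion with a message naming the actual dtype.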

Return Value

This function does not return any value (implicitly returns None). It performs assertions and prints test results to stdout. If assertions fail, it raises an AssertionError.

Dependencies

  • pandas
  • pathlib
  • smartstat_service

Required Imports

import pandas as pd
from pathlib import Path
from smartstat_service import smart_read_csv

Usage Example

# Run the test function
test_us_csv()

# Expected output:
# ============================================================
# Test 2: US CSV (comma delimiter, point decimals)
# ============================================================
# 
# Loaded DataFrame:
#       Name  Weight  Height  Temperature
# 0    Alice    65.5   170.2         36.6
# 1      Bob    78.3   182.5         37.1
# 2  Charlie    71.2   175.8         36.9
# 3    Diana    58.7   165.3         36.7
# 
# Column types:
# Name            object
# Weight         float64
# Height         float64
# Temperature    float64
# dtype: object
# 
# Weight column (should be numeric):
#   Type: float64
#   Values: [65.5, 78.3, 71.2, 58.7]
#   Mean: 68.43
# 
# ✓ US CSV test PASSED

Best Practices

  • This is a test function; run it in a testing environment, not in production code
  • The function writes a temporary file to /tmp and deletes it with unlink() as its last step, so the file is removed only when every assertion passes; a failing assertion leaves /tmp/test_us.csv behind
  • A failed check raises AssertionError, which signals that smart_read_csv did not parse the file as expected
  • The assertions cover only the Weight column (its dtype and its first value); Height and Temperature are printed but not asserted
  • The test assumes smart_read_csv can auto-detect the US CSV format without extra arguments
  • The hardcoded path /tmp/test_us.csv is not portable; Windows, for example, uses a different temp directory
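
The file-handling points above can be addressed with the standard library's tempfile module. The following is a minimal sketch, not the project's code: with_temp_csv and read_first_weight are hypothetical helpers, and read_first_weight uses the csv module as a stand-in for smart_read_csv so that the example runs without the project installed.

```python
import csv
import tempfile
from pathlib import Path

US_DATA = """Name,Weight,Height,Temperature
Alice,65.5,170.2,36.6
Bob,78.3,182.5,37.1
Charlie,71.2,175.8,36.9
Diana,58.7,165.3,36.7"""


def with_temp_csv(content, check):
    """Write content to a temporary CSV, run check(path), then clean up.

    The directory is removed when the with block exits, even if check
    raises, so a failing assertion does not leave files behind.
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        csv_path = Path(tmp_dir) / "test_us.csv"
        csv_path.write_text(content, encoding="utf-8")
        return check(csv_path)


def read_first_weight(csv_path):
    """Stand-in reader (assumption): return the first row's Weight as a float."""
    with csv_path.open(newline="", encoding="utf-8") as handle:
        first_row = next(csv.DictReader(handle))
    return float(first_row["Weight"])


weight = with_temp_csv(US_DATA, read_first_weight)
assert 65.4 < weight < 65.6, f"Alice's weight should be ~65.5, got {weight}"
```

In the real test, the check callback would call smart_read_csv(str(csv_path)) and run the original assertions. Under pytest, the built-in tmp_path fixture offers the same portability and cleanup.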

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function test_us_with_thousands 87.9% similar

    A unit test function that validates the smart_read_csv function's ability to correctly parse US-formatted CSV files containing numbers with thousand separators (commas) and decimal points.

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function test_european_csv 82.7% similar

    A test function that validates the ability to read and parse European-formatted CSV files (semicolon delimiters, comma decimal separators) and convert them to proper numeric types.

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function test_tab_delimited_european 81.6% similar

    A unit test function that validates the smart_read_csv function's ability to correctly parse tab-delimited CSV files containing European-style decimal numbers (using commas instead of periods).

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function test_european_with_thousands 78.5% similar

    A unit test function that validates the smart_read_csv function's ability to correctly parse European-formatted CSV files with thousand separators (dots) and decimal commas.

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function smart_read_csv 72.0% similar

    Automatically detects CSV file delimiters (comma, semicolon, tab) and handles regional decimal formats (European comma vs US/UK point) to reliably parse CSV files from different locales.

    From: /tf/active/vicechatdev/vice_ai/smartstat_service.py