Code Extractor

function test_european_csv

Maturity: 45

A test function that validates the ability to read and parse European-formatted CSV files (semicolon delimiters, comma decimal separators) and convert them to proper numeric types.

File: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
Lines: 12 - 45
Complexity: moderate

Purpose

This function is a unit test verifying that smart_read_csv handles European CSV conventions. It writes a temporary CSV file that uses semicolons as field delimiters and commas as decimal separators, reads it with smart_read_csv, and asserts that the Weight column has a float dtype (float64 or Float64) and that its first value is approximately 65.5. The other columns, their dtypes, and the Weight mean are printed for inspection but are not asserted.
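For comparison, the result the test expects can be reproduced with plain pandas by passing the delimiter and decimal separator explicitly. This is a minimal sketch of the expected outcome only: it does not show how smart_read_csv detects the format, and the in-memory buffer is an assumption standing in for the temporary file.

```python
import io

import pandas as pd

# Same sample rows as the test, held in memory instead of written to /tmp
european_data = """Name;Weight;Height;Temperature
Alice;65,5;170,2;36,6
Bob;78,3;182,5;37,1
Charlie;71,2;175,8;36,9
Diana;58,7;165,3;36,7"""

# sep and decimal are given explicitly here; smart_read_csv is expected
# to reach the same result without being told the format
df = pd.read_csv(io.StringIO(european_data), sep=";", decimal=",")

print(df.dtypes)
print(df["Weight"].tolist())
```

If smart_read_csv behaves as the test requires, its numeric columns should match the ones produced here.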

Source Code

def test_european_csv():
    """Test European CSV format (semicolon delimiter, comma decimal)"""
    print("\n" + "="*60)
    print("Test 1: European CSV (semicolon delimiter, comma decimals)")
    print("="*60)
    
    # Create test European CSV
    test_file = Path('/tmp/test_european.csv')
    european_data = """Name;Weight;Height;Temperature
Alice;65,5;170,2;36,6
Bob;78,3;182,5;37,1
Charlie;71,2;175,8;36,9
Diana;58,7;165,3;36,7"""
    
    test_file.write_text(european_data)
    
    # Read with smart_read_csv
    df = smart_read_csv(str(test_file))
    
    print(f"\nLoaded DataFrame:")
    print(df)
    print(f"\nColumn types:")
    print(df.dtypes)
    print(f"\nWeight column (should be numeric):")
    print(f"  Type: {df['Weight'].dtype}")
    print(f"  Values: {df['Weight'].tolist()}")
    print(f"  Mean: {df['Weight'].mean():.2f}")
    
    # Verify conversion
    assert df['Weight'].dtype in ['float64', 'Float64'], f"Weight should be numeric, got {df['Weight'].dtype}"
    assert 65.4 < df['Weight'].iloc[0] < 65.6, f"Alice's weight should be ~65.5, got {df['Weight'].iloc[0]}"
    
    print("\n✓ European CSV test PASSED")
    test_file.unlink()

Return Value

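This function returns no value (implicitly None). It prints the loaded DataFrame, the column types, and the Weight column's type, values, and mean to stdout, then asserts the Weight dtype and first value. A failed assertion raises AssertionError before test_file.unlink() runs, so the temporary file is left in /tmp. On success, the function prints a confirmation message and deletes the file.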

Dependencies

  • pandas
  • pathlib
  • smartstat_service

Required Imports

import pandas as pd
from pathlib import Path
from smartstat_service import smart_read_csv

Usage Example

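# Import the test from its module. This assumes the directory containing
# test_regional_formats.py is on sys.path and that smartstat_service is importable,
# since the module itself imports Path and smart_read_csv.
from test_regional_formats import test_european_csv

# Execute the test
test_european_csv()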

# Expected output:
# ============================================================
# Test 1: European CSV (semicolon delimiter, comma decimals)
# ============================================================
# 
# Loaded DataFrame:
#       Name  Weight  Height  Temperature
# 0    Alice    65.5   170.2         36.6
# 1      Bob    78.3   182.5         37.1
# 2  Charlie    71.2   175.8         36.9
# 3    Diana    58.7   165.3         36.7
# 
# Column types:
# Name            object
# Weight         float64
# Height         float64
# Temperature    float64
# dtype: object
# 
# Weight column (should be numeric):
#   Type: float64
#   Values: [65.5, 78.3, 71.2, 58.7]
#   Mean: 68.42
# 
# ✓ European CSV test PASSED

Best Practices

  • This is a test function and should be run in a testing environment, not in production code
  • Requires write access to /tmp, which does not exist on Windows; tempfile.gettempdir() or pytest's tmp_path fixture is more portable
  • The temporary file is removed with unlink() only when every assertion passes; a try-finally block, a context manager, or a pytest fixture would ensure cleanup even when an assertion fails
  • The value check uses a tolerance range (65.4 < value < 65.6) rather than exact equality, to allow for floating-point representation error
  • This test validates both delimiter detection (semicolon) and decimal separator conversion (comma to period)
  • The function prints verbose output for debugging - consider using logging or pytest's capture mechanisms in a test suite
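The cleanup and tolerance points above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the project's code: with_temp_csv and check_first_weight are hypothetical helpers, and the one-line parser stands in for smart_read_csv.

```python
import math
import tempfile
from pathlib import Path

EUROPEAN_DATA = "Name;Weight\nAlice;65,5\nBob;78,3\n"


def with_temp_csv(content, check):
    """Write content to a temporary CSV, run check(path), and always clean up."""
    # TemporaryDirectory removes the directory and its files on exit,
    # even when check() raises AssertionError
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = Path(tmp_dir) / "test_european.csv"
        path.write_text(content, encoding="utf-8")
        return check(path)


def check_first_weight(path):
    # Stand-in parser for illustration only; the real test calls smart_read_csv
    first_row = path.read_text(encoding="utf-8").splitlines()[1]
    weight = float(first_row.split(";")[1].replace(",", "."))
    # math.isclose states the tolerance explicitly
    assert math.isclose(weight, 65.5, abs_tol=0.1), f"got {weight}"
    return weight


print(with_temp_csv(EUROPEAN_DATA, check_first_weight))
```

Because the temporary directory is managed by a context manager, no hard-coded /tmp path is needed and nothing is left behind when a check fails.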

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function test_tab_delimited_european 91.0% similar

    A unit test function that validates the smart_read_csv function's ability to correctly parse tab-delimited CSV files containing European-style decimal numbers (using commas instead of periods).

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function test_european_with_thousands 89.5% similar

    A unit test function that validates the smart_read_csv function's ability to correctly parse European-formatted CSV files with thousand separators (dots) and decimal commas.

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function test_us_csv 82.7% similar

    A unit test function that validates the smart_read_csv function's ability to correctly parse US-formatted CSV files with comma delimiters and point decimal separators.

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function smart_read_csv 77.0% similar

    Automatically detects CSV file delimiters (comma, semicolon, tab) and handles regional decimal formats (European comma vs US/UK point) to reliably parse CSV files from different locales.

    From: /tf/active/vicechatdev/vice_ai/smartstat_service.py
  • function test_us_with_thousands 76.5% similar

    A unit test function that validates the smart_read_csv function's ability to correctly parse US-formatted CSV files containing numbers with thousand separators (commas) and decimal points.

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
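The smart_read_csv entry above describes delimiter detection combined with decimal-format handling. The idea can be sketched with the standard library as follows. This is not the smart_read_csv implementation, whose source is not shown here; detect_delimiter, to_number, and read_decimal_comma_csv are hypothetical names, and the sketch handles only decimal commas without thousand separators.

```python
import csv
import io


def detect_delimiter(header_line, candidates=(";", ",", "\t")):
    """Pick the candidate that occurs most often in the header line."""
    return max(candidates, key=header_line.count)


def to_number(text):
    """Convert a decimal-comma string to float; leave other text unchanged."""
    try:
        return float(text.replace(",", "."))
    except ValueError:
        return text


def read_decimal_comma_csv(text):
    """Parse CSV text into a list of dicts with numeric fields converted."""
    delimiter = detect_delimiter(text.splitlines()[0])
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [{key: to_number(value) for key, value in row.items()} for row in reader]


rows = read_decimal_comma_csv("Name;Weight;Height\nAlice;65,5;170,2\nBob;78,3;182,5\n")
print(rows[0])
```

Counting candidates in the header works for this sample because the header contains no decimal commas; a more robust detector would have to weigh data rows too, where commas may be either delimiters or decimal marks.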