🔍 Code Extractor

function test_us_with_thousands

Maturity: 44

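A unit test that checks whether smart_read_csv parses a US-formatted CSV file (comma-delimited, with a period as the decimal point) into numeric columns. Despite the test's name, the sample data shown under Source Code contains no comma thousand separators; values such as 1234.56 are plain decimals.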

File: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
Lines: 118 - 149
Complexity: simple

Purpose

This test is intended to verify that smart_read_csv handles US number formatting, in which commas separate thousands and periods mark the decimal point. It writes a temporary CSV file, reads it with smart_read_csv, and asserts that the Price column has a float dtype (float64 or Float64) and that its first value lies between 1234.0 and 1235.0. Only Price is asserted; Quantity and Revenue are printed but not checked. Because the sample values contain no comma thousand separators, the test as written confirms that plain US decimals load as floats and are not misread as European-formatted numbers; it does not exercise separator handling itself.
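To exercise thousand separators in a comma-delimited file, the numeric fields would need to be quoted so that the comma inside a number is not read as a delimiter. The sketch below illustrates this with the standard library only; parse_us_number and the sample rows are hypothetical and do not reflect how smart_read_csv is implemented.

```python
import csv
import io


def parse_us_number(text: str) -> float:
    """Convert a US-formatted number such as '1,234.56' to a float."""
    # Drop the thousands commas; the period already marks the decimal point.
    return float(text.replace(",", ""))


# Hypothetical sample data: numbers with thousands commas are quoted,
# otherwise the comma would split them into two fields.
us_csv = '''Product,Price,Revenue
Widget A,"1,234.56","123,456.00"
Widget B,"2,567.89","128,394.50"
'''

rows = list(csv.DictReader(io.StringIO(us_csv)))
prices = [parse_us_number(row["Price"]) for row in rows]
revenues = [parse_us_number(row["Revenue"]) for row in rows]

print(prices)
print(revenues)
```

In pandas, read_csv accepts a thousands="," argument that performs the same conversion while loading; whether smart_read_csv relies on it is not stated in this page.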

Source Code

def test_us_with_thousands():
    """Test US format with thousand separators"""
    print("\n" + "="*60)
    print("Test 4: US CSV with thousand separators")
    print("="*60)
    
    # Create test CSV with US thousand separators (commas) and decimal points
    test_file = Path('/tmp/test_us_thousands.csv')
    us_data = """Product,Price,Quantity,Revenue
Widget A,1234.56,100,123456.00
Widget B,2567.89,50,128394.50
Widget C,987.65,200,197530.00"""
    
    test_file.write_text(us_data)
    
    # Read with smart_read_csv
    df = smart_read_csv(str(test_file))
    
    print(f"\nLoaded DataFrame:")
    print(df)
    print(f"\nColumn types:")
    print(df.dtypes)
    print(f"\nPrice column (should be numeric):")
    print(f"  Type: {df['Price'].dtype}")
    print(f"  Values: {df['Price'].tolist()}")
    
    # Verify conversion
    assert df['Price'].dtype in ['float64', 'Float64'], f"Price should be numeric, got {df['Price'].dtype}"
    assert 1234.0 < df['Price'].iloc[0] < 1235.0, f"Widget A price should be ~1234.56, got {df['Price'].iloc[0]}"
    
    print("\n✓ US thousands separator test PASSED")
    test_file.unlink()

Return Value

The function returns None implicitly. It prints diagnostic output to stdout and raises an AssertionError if either check fails. When an assertion fails, the final unlink() call is never reached, so the temporary file remains in /tmp.
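Because failure is signalled only through AssertionError, a caller that wants a pass/fail summary has to catch the exception itself. The following is a minimal sketch of such a runner; run_tests and the two example tests are hypothetical names used for illustration, not part of test_regional_formats.py.

```python
from typing import Callable, Dict, List


def run_tests(tests: List[Callable[[], None]]) -> Dict[str, str]:
    """Run each test and record PASSED or the assertion message."""
    results: Dict[str, str] = {}
    for test in tests:
        try:
            test()
        except AssertionError as exc:
            results[test.__name__] = f"FAILED: {exc}"
        else:
            results[test.__name__] = "PASSED"
    return results


def passing_example() -> None:
    # Mirrors the range check used for the Widget A price.
    assert 1234.0 < 1234.56 < 1235.0


def failing_example() -> None:
    price = "1,234.56"  # a value that was left as text instead of parsed
    assert isinstance(price, float), (
        f"Price should be numeric, got {type(price).__name__}"
    )


results = run_tests([passing_example, failing_example])
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

A test framework such as pytest or unittest provides this reporting out of the box, which is usually preferable to a hand-written loop.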

Dependencies

  • pandas
  • pathlib
  • smartstat_service

Required Imports

import pandas as pd
from pathlib import Path
from smartstat_service import smart_read_csv

Usage Example

# Run the test function
test_us_with_thousands()

# Expected output (assuming smart_read_csv returns a standard pandas DataFrame):
# ============================================================
# Test 4: US CSV with thousand separators
# ============================================================
# 
# Loaded DataFrame:
#     Product    Price  Quantity   Revenue
# 0  Widget A  1234.56       100  123456.0
# 1  Widget B  2567.89        50  128394.5
# 2  Widget C   987.65       200  197530.0
# 
# Column types:
# Product      object
# Price       float64
# Quantity      int64
# Revenue     float64
# dtype: object
# 
# Price column (should be numeric):
#   Type: float64
#   Values: [1234.56, 2567.89, 987.65]
# 
# ✓ US thousands separator test PASSED

Best Practices

  • This is a test function and should be run in a testing environment, not in production code
  • The function creates a file in /tmp and deletes it with unlink() only after the assertions pass; a failed assertion leaves the file behind. Wrap the body in a try-finally block to guarantee cleanup
  • The hard-coded /tmp path assumes a Unix-like system with a writable /tmp directory; the tempfile module is a portable alternative
  • The assertions check only the dtype and the first value of the Price column; extend them to Revenue and to the remaining rows for fuller coverage
  • Consider using pytest fixtures or unittest setUp/tearDown methods for better test isolation
  • The function prints verbose output for debugging - consider using logging or test framework reporting instead
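The cleanup and portability points above can be sketched as follows. This is one possible arrangement, not the project's code: read_rows is a hypothetical stand-in for smart_read_csv (which is not available here), and check_price is an illustrative name. tempfile.TemporaryDirectory removes the directory and its contents even when an assertion fails, and math.isclose replaces the loose range check with an explicit tolerance.

```python
import csv
import math
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def read_rows(path: str) -> List[Dict[str, str]]:
    """Hypothetical stand-in for smart_read_csv, which is not shown here."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def check_price(
    reader: Callable[[str], List[Dict[str, str]]] = read_rows,
    expected: float = 1234.56,
) -> Path:
    """Write a temporary CSV, read it back, and check the first price."""
    # The context manager deletes the directory on exit, including when
    # the assertion below raises.
    with tempfile.TemporaryDirectory() as tmp_dir:
        test_file = Path(tmp_dir) / "test_us_thousands.csv"
        test_file.write_text("Product,Price\nWidget A,1234.56\n")

        rows = reader(str(test_file))
        price = float(rows[0]["Price"])

        assert math.isclose(price, expected, abs_tol=0.005), (
            f"Widget A price should be ~{expected}, got {price}"
        )
        return test_file


leftover = check_price()
print(leftover.exists())
```

With pytest, the built-in tmp_path fixture gives the same guarantee without the explicit context manager.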

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function test_european_with_thousands 88.6% similar

    A unit test function that validates the smart_read_csv function's ability to correctly parse European-formatted CSV files with thousand separators (dots) and decimal commas.

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function test_us_csv 87.9% similar

    A unit test function that validates the smart_read_csv function's ability to correctly parse US-formatted CSV files with comma delimiters and point decimal separators.

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function test_tab_delimited_european 78.1% similar

    A unit test function that validates the smart_read_csv function's ability to correctly parse tab-delimited CSV files containing European-style decimal numbers (using commas instead of periods).

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function test_european_csv 76.5% similar

    A test function that validates the ability to read and parse European-formatted CSV files (semicolon delimiters, comma decimal separators) and convert them to proper numeric types.

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py
  • function main_v67 66.4% similar

    Test runner function that executes a suite of regional format handling tests for CSV parsing, including European and US number formats with various delimiters.

    From: /tf/active/vicechatdev/vice_ai/test_regional_formats.py