function clean_nan_for_json
Recursively traverses nested dicts and lists, converting NaN and other missing values recognized by pd.isna (such as pd.NA) to None for safe JSON serialization.
/tf/active/vicechatdev/vice_ai/data_analysis_service.py
484 - 516
moderate
Purpose
This function prepares data structures containing pandas/numpy numeric types for JSON serialization. It replaces NaN values, which Python's json module would otherwise emit as the non-standard token NaN (or reject with a ValueError when allow_nan=False), and converts numpy floating-point and integer scalars to Python floats. It is particularly useful for data analysis results derived from pandas DataFrames or numpy arrays. Nested dicts and lists are handled recursively, and if an unexpected error occurs the function logs a warning and returns the object unchanged.
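To illustrate why NaN needs cleaning before serialization, here is a minimal sketch that uses only the standard library. The payload and variable names are illustrative and are not taken from the original service.

```python
import json
import math

nan_value = float("nan")

# By default the json module emits the non-standard token NaN,
# which strict JSON parsers reject.
default_output = json.dumps({"score": nan_value})
print(default_output)

# With allow_nan=False the same payload raises ValueError instead.
try:
    json.dumps({"score": nan_value}, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc

# Replacing NaN with None produces the standard JSON literal null.
cleaned_output = json.dumps({"score": None if math.isnan(nan_value) else nan_value})
print(cleaned_output)
```

Replacing NaN with None is the same substitution clean_nan_for_json performs, applied here to a single value.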
Source Code
def clean_nan_for_json(obj):
    """Recursively clean NaN values from nested data structures for JSON serialization"""
    try:
        if isinstance(obj, dict):
            return {k: clean_nan_for_json(v) for k, v in obj.items()}
        elif isinstance(obj, list):
            return [clean_nan_for_json(item) for item in obj]
        elif isinstance(obj, (np.floating, np.integer)):
            try:
                if pd.isna(obj):
                    return None
                return float(obj)
            except (TypeError, ValueError):
                return float(obj) if obj is not None else None
        elif obj is None:
            return None
        elif isinstance(obj, float):
            try:
                if np.isnan(obj) or obj != obj:  # NaN check
                    return None
                return obj
            except (TypeError, ValueError):
                return obj
        else:
            try:
                if pd.isna(obj):
                    return None
            except (TypeError, ValueError):
                pass
            return obj
    except Exception as e:
        logger.warning(f"Error in clean_nan_for_json with obj type {type(obj)}: {e}")
        return obj
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| obj | - | - | positional_or_keyword |
Parameter Details
obj: Any Python object to be cleaned. Can be a dict, list, numpy numeric type (np.floating, np.integer), float, None, or any other type. Nested structures are traversed recursively. NaN values in any supported numeric type will be converted to None.
Return Value
Returns the cleaned version of the input object with the same structure. NaN values are replaced with None. Numpy numeric types are converted to Python float. For unsupported types or errors during processing, returns the original object unchanged. Return type matches input type structure (dict returns dict, list returns list, etc.).
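Because numpy integers are returned as Python floats, very large integer values may lose precision. The sketch below illustrates this under stated assumptions: the identifier value is hypothetical, and the int() workaround is a suggestion rather than part of the original function.

```python
import json

import numpy as np

# Small integers survive the conversion but come back as floats.
small = float(np.int64(10))

# Hypothetical large identifier: 2**53 + 1 has no exact float64 representation.
big_id = np.int64(9007199254740993)
as_float = float(big_id)

# Compare the original integer with the value after the float round trip.
exact = int(big_id) == int(as_float)

# One possible workaround (an assumption, not part of the original function):
# convert integer values with int() before cleaning so they stay exact.
as_int_json = json.dumps(int(big_id))
```

Python ints fall through to the function's final branch and are returned unchanged, so values converted this way keep their exact digits.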
Dependencies
- numpy
- pandas
- logging
Required Imports
import numpy as np
import pandas as pd
import logging
Usage Example
import numpy as np
import pandas as pd
import logging
import json
logger = logging.getLogger(__name__)
def clean_nan_for_json(obj):
    """Recursively clean NaN values from nested data structures for JSON serialization"""
    try:
        if isinstance(obj, dict):
            return {k: clean_nan_for_json(v) for k, v in obj.items()}
        elif isinstance(obj, list):
            return [clean_nan_for_json(item) for item in obj]
        elif isinstance(obj, (np.floating, np.integer)):
            try:
                if pd.isna(obj):
                    return None
                return float(obj)
            except (TypeError, ValueError):
                return float(obj) if obj is not None else None
        elif obj is None:
            return None
        elif isinstance(obj, float):
            try:
                if np.isnan(obj) or obj != obj:
                    return None
                return obj
            except (TypeError, ValueError):
                return obj
        else:
            try:
                if pd.isna(obj):
                    return None
            except (TypeError, ValueError):
                pass
            return obj
    except Exception as e:
        logger.warning(f"Error in clean_nan_for_json with obj type {type(obj)}: {e}")
        return obj
# Example usage
data = {
    'values': [1.0, np.nan, 3.5, float('nan')],
    'nested': {
        'score': np.float64(42.5),
        'invalid': np.nan,
        'count': np.int64(10)
    },
    'mixed': [np.nan, None, 'text', 123]
}
cleaned = clean_nan_for_json(data)
print(json.dumps(cleaned, indent=2))
# Output (lists shown on one line for brevity; indent=2 prints one element per line):
# {
#   "values": [1.0, null, 3.5, null],
#   "nested": {
#     "score": 42.5,
#     "invalid": null,
#     "count": 10.0
#   },
#   "mixed": [null, null, "text", 123]
# }
Best Practices
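- Define a module-level logger (for example logging.getLogger(__name__)) before calling this function; it logs a warning whenever processing fails
- Call this function before JSON serialization of data derived from pandas DataFrames or numpy arrays. Only dicts and lists are traversed, so convert such objects to built-in containers first (for example with df.to_dict(orient='records') or arr.tolist())
- Be aware that numpy integer types are converted to float, so np.int64(10) is serialized as 10.0
- The function is defensive: if processing fails it logs a warning and returns the original object, so no data is lost, although that object may still not be JSON serializable
- Each level of nesting adds a recursive call, so very large or deeply nested structures may be slow to process or hit Python's recursion limit
- The function handles multiple NaN representations (np.nan, float('nan'), pd.NA) for robustness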
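The fallback branch in the source code calls pd.isna on any object that is not a dict, list, float, or numpy number. The following sketch, which uses an illustrative array that does not come from the original service, suggests why a multi-element numpy array passes through that branch unchanged and why converting it to a list first may help.

```python
import numpy as np
import pandas as pd

arr = np.array([1.0, np.nan, 3.5])

# pd.isna on an array returns an element-wise boolean mask, not a single bool.
mask = pd.isna(arr)

# Using that mask in an `if` raises ValueError, which the function's
# fallback branch catches before returning the array unchanged.
try:
    bool(mask)
    ambiguous = False
except ValueError:
    ambiguous = True

# Converting to a plain list first exposes each element as a Python float,
# which the list and float branches can then clean one by one.
as_list = arr.tolist()
nan_positions = [i for i, v in enumerate(as_list) if v != v]
```

After tolist(), the NaN at index 1 is an ordinary Python float, so passing as_list to clean_nan_for_json would be expected to replace it with None.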
Similar Components
AI-powered semantic similarity - components with related functionality:
- function clean_for_json_v4 (91.6% similar)
- function clean_for_json_v2 (91.2% similar)
- function clean_for_json_v6 (90.2% similar)
- function clean_for_json_v5 (90.1% similar)
- function clean_for_json_v1 (89.9% similar)