🔍 Code Extractor

function clean_for_json

Maturity: 45

Recursively traverses Python data structures (dicts, lists, tuples, numpy arrays) and prepares them for json.dumps() by converting numpy scalar types to native Python types, replacing NaN/Inf and pandas missing values with None, and converting dictionary keys to strings.

File: /tf/active/vicechatdev/vice_ai/smartstat_scripts/f0b81d95-24d9-418a-8d9f-1b241684e64c/project_1/analysis.py
Lines: 472 - 494
Complexity: moderate

Purpose

This function prepares nested Python data structures containing numpy arrays, numpy scalars, and pandas missing-value markers (such as pd.NA) for JSON serialization. It handles NaN, Infinity, numpy-specific numeric types, and nested containers. It does not convert pandas Series or DataFrame objects. Common use cases include preparing data analysis results for API responses, saving computation results to JSON files, or transmitting scientific computing data over web services.
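
To illustrate why the NaN/Infinity handling matters, the following standard-library-only sketch shows how Python's json module treats non-finite floats. It does not call clean_for_json; the last step only imitates its "replace with None" behavior for plain floats, and the variable names are illustrative.

```python
import json
import math

values = {"nan": float("nan"), "inf": float("inf")}

# By default json.dumps emits the JavaScript literals NaN and Infinity,
# which are not valid JSON and may be rejected by strict parsers.
lenient = json.dumps(values)
print(lenient)  # {"nan": NaN, "inf": Infinity}

# With allow_nan=False the encoder raises ValueError instead.
try:
    json.dumps(values, allow_nan=False)
    strict_error = None
except ValueError as exc:
    strict_error = exc
print(type(strict_error).__name__)  # ValueError

# Replacing non-finite floats with None (as clean_for_json does) gives valid JSON.
safe = {k: (v if math.isfinite(v) else None) for k, v in values.items()}
strict = json.dumps(safe, allow_nan=False)
print(strict)  # {"nan": null, "inf": null}
```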

Source Code

def clean_for_json(obj):
    """Recursively clean data structure for JSON serialization"""
    if isinstance(obj, dict):
        return {str(k): clean_for_json(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [clean_for_json(item) for item in obj]
    elif isinstance(obj, tuple):
        return clean_for_json(list(obj))
    elif isinstance(obj, (np.integer, np.int64, np.int32)):
        return int(obj)
    elif isinstance(obj, (np.floating, np.float64, np.float32)):
        if math.isnan(obj) or math.isinf(obj):
            return None
        return float(obj)
    elif isinstance(obj, np.ndarray):
        return clean_for_json(obj.tolist())
    elif isinstance(obj, float):
        if math.isnan(obj) or math.isinf(obj):
            return None
        return obj
    elif pd.isna(obj):
        return None
    return obj

Parameters

Name  Type  Default  Kind
obj   -     -        positional_or_keyword

Parameter Details

obj: Any Python object to be cleaned for JSON serialization. Can be a primitive type (int, float, str), collection (dict, list, tuple), numpy array (np.ndarray), numpy scalar (np.integer, np.floating), pandas NA value, or nested combinations of these types. The function recursively processes nested structures.

Return Value

Returns a cleaned copy of the input in which every recognized type has been converted to a JSON-compatible one. Dictionaries become dictionaries with string keys and cleaned values; lists and tuples become lists of cleaned elements; numpy arrays become (possibly nested) lists; numpy integer and floating scalars become Python int and float; NaN and Inf floats become None; pandas missing values (such as pd.NA) become None. All other values are returned unchanged, so the result is only fully serializable if the input contains no other unsupported types. The nesting of the input is preserved, but container types are not: tuples and arrays come back as lists.
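
The effect of the key and tuple conversions can be seen with a small standard-library-only sketch. It mirrors only the `str(k)` and tuple-to-list steps of clean_for_json rather than calling the function, and the sample data is made up for illustration.

```python
# Keys 1 and "1" are distinct in Python but identical once stringified.
original = {1: "int key", "1": "str key", "point": (1, 2)}

# Same key conversion as the dict branch, plus the tuple -> list step.
converted = {
    str(k): (list(v) if isinstance(v, tuple) else v)
    for k, v in original.items()
}

print(converted)  # {'1': 'str key', 'point': [1, 2]}
```

The later entry silently overwrites the earlier one, so inputs that mix keys such as 1 and "1" lose data during cleaning.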

Dependencies

  • numpy
  • pandas
  • math

Required Imports

import numpy as np
import pandas as pd
import math

Usage Example

import numpy as np
import pandas as pd
import math
import json

def clean_for_json(obj):
    if isinstance(obj, dict):
        return {str(k): clean_for_json(v) for k, v in obj.items()}
    elif isinstance(obj, list):
        return [clean_for_json(item) for item in obj]
    elif isinstance(obj, tuple):
        return clean_for_json(list(obj))
    elif isinstance(obj, (np.integer, np.int64, np.int32)):
        return int(obj)
    elif isinstance(obj, (np.floating, np.float64, np.float32)):
        if math.isnan(obj) or math.isinf(obj):
            return None
        return float(obj)
    elif isinstance(obj, np.ndarray):
        return clean_for_json(obj.tolist())
    elif isinstance(obj, float):
        if math.isnan(obj) or math.isinf(obj):
            return None
        return obj
    elif pd.isna(obj):
        return None
    return obj

# Example usage
data = {
    'array': np.array([1, 2, 3]),
    'float': np.float64(3.14),
    'nan': float('nan'),
    'inf': float('inf'),
    'nested': {
        'tuple': (1, 2, 3),
        'pd_na': pd.NA
    }
}

cleaned = clean_for_json(data)
print(json.dumps(cleaned, indent=2))
# Output (indent=2 places each list element on its own line):
# {
#   "array": [
#     1,
#     2,
#     3
#   ],
#   "float": 3.14,
#   "nan": null,
#   "inf": null,
#   "nested": {
#     "tuple": [
#       1,
#       2,
#       3
#     ],
#     "pd_na": null
#   }
# }

Best Practices

  • Use this function before calling json.dumps() on data structures containing numpy scalars, numpy arrays, or pandas missing-value markers to avoid TypeError exceptions
  • Be aware that NaN and Infinity values are converted to None (null in JSON), which may affect downstream data analysis
  • Dictionary keys are converted to strings, so numeric keys lose their original type, and keys that stringify identically (for example 1 and "1") collapse into a single entry
  • Tuples are converted to lists in the output, losing the immutability property
  • For large numpy arrays, consider the memory implications of converting to nested Python lists
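  • The function does not convert custom objects, sets, datetime values, or np.bool_ - these are returned unchanged and may still cause JSON serialization errors. Passing a pandas Series or DataFrame raises ValueError at the pd.isna() check, because the result is not a single boolean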
  • Consider validating the output with json.dumps() to ensure complete serializability if dealing with unknown data types
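
The validation suggested in the last point can be sketched as follows. `find_unserializable` is a hypothetical helper, not part of the documented module; it uses only the standard library and assumes dictionary keys are already strings, as they are after clean_for_json.

```python
import json

def find_unserializable(obj, path="$"):
    """Hypothetical helper: list the paths of values json.dumps would reject."""
    if isinstance(obj, dict):
        problems = []
        for key, value in obj.items():
            problems += find_unserializable(value, f"{path}.{key}")
        return problems
    if isinstance(obj, (list, tuple)):
        problems = []
        for index, item in enumerate(obj):
            problems += find_unserializable(item, f"{path}[{index}]")
        return problems
    try:
        # allow_nan=False also flags NaN/Infinity that slipped through.
        json.dumps(obj, allow_nan=False)
    except (TypeError, ValueError):
        return [f"{path} ({type(obj).__name__})"]
    return []

# Stand-in for cleaned output that still holds unsupported values.
cleaned = {"ok": [1, 2.5, None], "when": {"stamp": complex(1, 2)}, "tags": {"a"}}
print(find_unserializable(cleaned))
# ['$.when.stamp (complex)', '$.tags (set)']
```

Reporting the path of each offending value makes it easier to decide whether to extend the cleaning function or to drop the field before serialization.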

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function clean_for_json_v7 95.9% similar

    Recursively traverses and sanitizes data structures (dicts, lists, numpy types) to ensure JSON serialization compatibility by converting numpy types to native Python types and handling NaN/Inf values.

    From: /tf/active/vicechatdev/vice_ai/smartstat_scripts/d1e252f5-950c-4ad7-b425-86b4b02c3c62/analysis_1.py
  • function clean_for_json_v12 94.1% similar

    Recursively sanitizes Python objects to make them JSON-serializable by converting non-serializable types (NumPy types, pandas objects, tuples, NaN/Inf values) into JSON-compatible formats.

    From: /tf/active/vicechatdev/vice_ai/smartstat_scripts/290a39ea-3ae0-4301-8e2f-9d5c3bf80e6e/project_3/analysis.py
  • function clean_for_json_v15 93.1% similar

    Recursively sanitizes Python objects to make them JSON-serializable by converting NumPy types to native Python types and handling NaN/Inf float values.

    From: /tf/active/vicechatdev/vice_ai/smartstat_scripts/290a39ea-3ae0-4301-8e2f-9d5c3bf80e6e/analysis_3.py
  • function clean_for_json_v8 92.8% similar

    Recursively traverses and converts a nested data structure (dicts, lists, numpy types, pandas NaN) into JSON-serializable Python primitives.

    From: /tf/active/vicechatdev/vice_ai/smartstat_scripts/d1e252f5-950c-4ad7-b425-86b4b02c3c62/analysis_5.py
  • function clean_for_json_v13 92.3% similar

    Recursively sanitizes Python objects to make them JSON-serializable by converting NumPy types to native Python types and handling NaN/Inf values.

    From: /tf/active/vicechatdev/vice_ai/smartstat_scripts/d48d7789-9627-4e96-9f48-f90b687cd07d/analysis_1.py