function is_dataframe

Maturity: 38

Checks whether the supplied data object is a pandas DataFrame or a Dask DataFrame. dask.dataframe is imported lazily, and only when it is already loaded in sys.modules.

File:
/tf/active/vicechatdev/patches/util.py
Lines:
1477 - 1485
Complexity:
simple

Purpose

This utility function provides a safe way to determine whether an object is a DataFrame from either the pandas or the Dask library. pandas is reached through the module-level name pd, which the function allows to be None. dask.dataframe is imported only when both 'dask.dataframe' and 'pandas' are already present in sys.modules, so the check never loads Dask on its own. This is particularly useful in libraries that support multiple DataFrame implementations without requiring every dependency to be installed.

Source Code

def is_dataframe(data):
    """
    Checks whether the supplied data is of DataFrame type.
    """
    dd = None
    if 'dask.dataframe' in sys.modules and 'pandas' in sys.modules:
        import dask.dataframe as dd
    return((pd is not None and isinstance(data, pd.DataFrame)) or
          (dd is not None and isinstance(data, dd.DataFrame)))

Parameters

Name    Type    Default    Kind
data    -       -          positional_or_keyword

Parameter Details

data: Any Python object to be checked. Typically expected to be a DataFrame-like object, but can be any type. No type constraints are enforced at the parameter level.

Return Value

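Returns a boolean value: True if the data is an instance of pandas.DataFrame or dask.dataframe.DataFrame, False otherwise. The pandas check is skipped when the module-level pd is None, and the Dask check is skipped unless both 'dask.dataframe' and 'pandas' are present in sys.modules. If both checks are skipped, the function returns False.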

Dependencies

  • sys
  • pandas
  • dask

Required Imports

import sys
import pandas as pd

Conditional/Optional Imports

These imports are only needed under specific conditions:

import dask.dataframe as dd

Condition: only if both 'dask.dataframe' and 'pandas' are already present in sys.modules

Optional
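
The condition above can be illustrated by looking at the sys.modules guard in isolation. This is a minimal sketch, not code from util.py: the helper name loaded_isinstance is hypothetical, and the standard-library fractions module stands in for dask.dataframe so the snippet runs without optional dependencies.

```python
import sys
import fractions  # stand-in for an optional dependency the caller has loaded


def loaded_isinstance(obj, module_name, class_name):
    """Hypothetical helper: isinstance check against a class from a module,
    performed only if that module has already been imported."""
    module = sys.modules.get(module_name)
    if module is None:
        # The module was never imported, so obj cannot be an instance
        # of any class it defines.
        return False
    return isinstance(obj, getattr(module, class_name))


print(loaded_isinstance(fractions.Fraction(1, 2), "fractions", "Fraction"))  # True
print(loaded_isinstance(1, "fractions", "Fraction"))                         # False
print(loaded_isinstance(1, "module_that_is_not_loaded", "Thing"))            # False
```

The guard works because an object can only be an instance of a class whose defining module has been imported somewhere in the process, so a missing sys.modules entry is enough to answer False without importing anything.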

Usage Example

import sys
import pandas as pd

# Assuming the function is defined or imported
def is_dataframe(data):
    dd = None
    if 'dask.dataframe' in sys.modules and 'pandas' in sys.modules:
        import dask.dataframe as dd
    return((pd is not None and isinstance(data, pd.DataFrame)) or
          (dd is not None and isinstance(data, dd.DataFrame)))

# Example usage
df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
print(is_dataframe(df))  # Output: True

my_list = [1, 2, 3]
print(is_dataframe(my_list))  # Output: False

# With Dask (if available)
try:
    import dask.dataframe as dd
    dask_df = dd.from_pandas(df, npartitions=2)
    print(is_dataframe(dask_df))  # Output: True
except ImportError:
    print('Dask not available')

Best Practices

  • The function reads a module-level name pd (import pandas as pd) from the module where it is defined. If that name is not defined there, the call raises a NameError; what the calling code has imported makes no difference
  • The pd is not None check suggests that the defining module may set pd to None when pandas is unavailable, but the extracted lines do not show that module-level import
  • dask.dataframe is imported inside the function, and only when it is already present in sys.modules, so calling the function never fails because Dask is not installed
  • The sys.modules check is a dictionary lookup; unlike a try-except around the import, it never loads Dask just to answer the question
  • This function is best used in libraries that want to support both pandas and dask without making dask a hard requirement
  • The function does not check for DataFrame-like objects from other libraries (e.g., polars, cudf) - it only checks pandas and dask
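
The NameError risk noted above can be avoided with a guarded module-level import. The following is a sketch under an assumption: the extract does not show how util.py actually binds pd, so the try/except block here is illustrative only. The function body is the one from the Source Code section.

```python
import sys

# Assumed module-level guard (not shown in the extracted source):
# bind pd to None when pandas cannot be imported, so the name always exists.
try:
    import pandas as pd
except ImportError:
    pd = None


def is_dataframe(data):
    """
    Checks whether the supplied data is of DataFrame type.
    """
    dd = None
    if 'dask.dataframe' in sys.modules and 'pandas' in sys.modules:
        import dask.dataframe as dd
    return ((pd is not None and isinstance(data, pd.DataFrame)) or
            (dd is not None and isinstance(data, dd.DataFrame)))


# Works whether or not pandas or Dask is installed.
print(is_dataframe([1, 2, 3]))  # False
```

With this guard in place, the pd is not None test in the return expression does real work: on a system without pandas the function returns False instead of raising.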

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function is_series 82.3% similar

    Checks whether the supplied data object is a pandas Series or dask Series type, with lazy loading support for dask.

    From: /tf/active/vicechatdev/patches/util.py
  • function is_dask_array 70.7% similar

    Checks whether the provided data object is a Dask array by conditionally importing dask.array and performing an isinstance check.

    From: /tf/active/vicechatdev/patches/util.py
  • function is_cupy_array 62.2% similar

    Checks whether the provided data object is a CuPy ndarray by conditionally importing CuPy and performing an isinstance check.

    From: /tf/active/vicechatdev/patches/util.py
  • function isdatetime 57.7% similar

    Determines whether a given value (array or scalar) is a recognized datetime type, checking both NumPy datetime64 arrays and Python datetime objects.

    From: /tf/active/vicechatdev/patches/util.py
  • function load_dataset 52.2% similar

    Loads a CSV dataset from a specified file path using pandas and returns it as a DataFrame with error handling for file not found and general exceptions.

    From: /tf/active/vicechatdev/vice_ai/smartstat_scripts/e1ecec5f-4ea5-49c5-b4f5-d051ce851294/project_1/analysis.py