function dimension_sort

Maturity: 44

Sorts the items of an ordered dictionary by the specified dimension indexes, using standard Python tuple sorting for ordinary dimensions and categorical ordering for dimensions with predefined values.

File: /tf/active/vicechatdev/patches/util.py
Lines: 1239 - 1258
Complexity: moderate

Purpose

This function provides flexible sorting for multi-dimensional data structures where dimensions can be either continuous (sorted naturally) or categorical (sorted by predefined order). It's designed to work with data structures that have key dimensions (kdims) and value dimensions (vdims), commonly used in data visualization and analysis frameworks like HoloViews. The function handles categorical dimensions by using their predefined value order rather than natural sorting.
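The difference between natural and categorical ordering can be seen in a minimal plain-Python sketch. The `levels` list here is a hypothetical stand-in for a Dimension's `values`; it is not taken from the source file.

```python
# Hypothetical category levels standing in for a Dimension's `values`
levels = ['low', 'medium', 'high']
data = ['high', 'low', 'medium', 'low']

# Natural sorting compares the strings alphabetically
natural = sorted(data)

# Categorical sorting ranks each item by its position in `levels`
categorical = sorted(data, key=levels.index)

print(natural)      # ['high', 'low', 'low', 'medium']
print(categorical)  # ['low', 'low', 'medium', 'high']
```

Alphabetical order puts 'high' first, which is rarely what a reader of a chart expects; ranking by position in the predefined list restores the intended order.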

Source Code

def dimension_sort(odict, kdims, vdims, key_index):
    """
    Sorts data by key using usual Python tuple sorting semantics
    or sorts in categorical order for any categorical Dimensions.
    """
    sortkws = {}
    ndims = len(kdims)
    dimensions = kdims+vdims
    indexes = [(dimensions[i], int(i not in range(ndims)),
                    i if i in range(ndims) else i-ndims)
                for i in key_index]
    cached_values = {d.name: [None]+list(d.values) for d in dimensions}

    if len(set(key_index)) != len(key_index):
        raise ValueError("Cannot sort on duplicated dimensions")
    else:
       sortkws['key'] = lambda x: tuple(cached_values[dim.name].index(x[t][d])
                                        if dim.values else x[t][d]
                                        for i, (dim, t, d) in enumerate(indexes))
    return python2sort(odict.items(), **sortkws)

Parameters

Name Type Default Kind
odict - - positional_or_keyword
kdims - - positional_or_keyword
vdims - - positional_or_keyword
key_index - - positional_or_keyword

Parameter Details

odict: An ordered dictionary (OrderedDict) containing the data to be sorted. The dictionary items will be sorted based on the specified dimensions.

kdims: A list of key dimensions (Dimension objects) that define the primary structure of the data. Each dimension must have a 'name' attribute and a 'values' attribute. 'values' must be iterable for every dimension, because the function calls list(d.values) on all of them: use a non-empty list for categorical ordering and an empty list for natural sorting.

vdims: A list of value dimensions (Dimension objects) that represent dependent variables or measurements. Like kdims, these must have 'name' and 'values' attributes.

key_index: A list of integers specifying which dimensions (by index) to use for sorting. Indexes refer to positions in the combined list of kdims+vdims. Must not contain duplicates.
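How a combined index is resolved can be sketched as follows. `resolve_key_index` is a hypothetical helper written for illustration, assuming non-negative indexes; it mirrors the second and third elements of the `indexes` tuples built in the source.

```python
def resolve_key_index(n_kdims, key_index):
    """Map each combined index to (slot, position).

    slot 0 selects the key tuple of an item, slot 1 selects its value;
    position is the offset within that tuple.
    Assumes non-negative indexes (hypothetical helper, not in the source).
    """
    resolved = []
    for i in key_index:
        if i < n_kdims:
            resolved.append((0, i))            # a key dimension
        else:
            resolved.append((1, i - n_kdims))  # a value dimension
    return resolved

# Two key dimensions and one value dimension:
# indexes 0 and 1 are kdims, index 2 is the vdim
mapping = resolve_key_index(2, [2, 0])
print(mapping)  # [(1, 0), (0, 0)]
```

The order of key_index matters: the first entry is the primary sort criterion, and later entries break ties.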

Return Value

Returns the (key, value) items of the input ordered dictionary in sorted order, as produced by python2sort with a custom key function. Wrap the result in list() if a concrete list is required, since the exact return type depends on the python2sort implementation. For categorical dimensions (those with a non-empty values list), items are ordered by their position in that list; for all other dimensions, natural Python ordering applies.
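The key function described above can be approximated with a short sketch. It assumes sorting on key dimensions only and uses the built-in `sorted` in place of python2sort; `Dim` and `make_sort_key` are hypothetical names introduced for illustration.

```python
from collections import namedtuple

# Hypothetical minimal stand-in for a Dimension object
Dim = namedtuple('Dim', ['name', 'values'])

def make_sort_key(dims):
    """Build a key function over (key, value) items.

    dims is a list of (Dim, position) pairs; position indexes the key tuple.
    """
    def key(item):
        key_tuple = item[0]
        return tuple(
            # The leading None mirrors cached_values in the source
            ([None] + list(dim.values)).index(key_tuple[pos]) if dim.values
            else key_tuple[pos]
            for dim, pos in dims
        )
    return key

size = Dim('size', ['small', 'large'])  # categorical
year = Dim('year', [])                  # sorted naturally
items = [(('large', 2021), 1.0), (('small', 2022), 2.0), (('small', 2020), 3.0)]

ordered = sorted(items, key=make_sort_key([(size, 0), (year, 1)]))
print([k for k, _ in ordered])
# [('small', 2020), ('small', 2022), ('large', 2021)]
```

Each item is reduced to a tuple such as (1, 2020), so categorical rank and natural values combine under ordinary tuple comparison.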

Dependencies

  • python2sort
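python2sort is not shown on this page. For experimenting with dimension_sort in isolation, a stand-in along the following lines may be enough. This is a sketch based only on the one-line description of python2sort given under Similar Components, not the implementation in util.py, and the name `python2sort_sketch` is hypothetical.

```python
from itertools import chain

def python2sort_sketch(items, key=None):
    """Hypothetical stand-in: group mutually comparable items, sort each group."""
    keyfn = key if key is not None else (lambda item: item)
    groups = []
    for item in items:
        for group in groups:
            try:
                keyfn(item) < keyfn(group[0])  # raises TypeError if incomparable
            except TypeError:
                continue
            group.append(item)
            break
        else:
            # No existing group could be compared with this item
            groups.append([item])
    return list(chain.from_iterable(sorted(g, key=keyfn) for g in groups))

mixed = [3, "b", 1, "a", 2]
result = python2sort_sketch(mixed)
print(result)  # [1, 2, 3, 'a', 'b']
```

Groups appear in the order their first member was encountered, so integers precede strings here only because 3 came first in the input.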

Required Imports

from collections import OrderedDict

Usage Example

# Assuming dimension_sort and python2sort are available (e.g. imported from util.py)
from collections import OrderedDict

# Create mock Dimension objects. values defaults to an empty list because
# dimension_sort calls list(d.values) on every dimension, which fails on None.
class Dimension:
    def __init__(self, name, values=None):
        self.name = name
        self.values = values if values is not None else []

# Define dimensions
kdims = [Dimension('category', values=['low', 'medium', 'high']), Dimension('year')]
vdims = [Dimension('value')]

# Create data
odict = OrderedDict([
    (('high', 2020), {'value': 100}),
    (('low', 2021), {'value': 50}),
    (('medium', 2020), {'value': 75})
])

# Sort by first dimension (category) using categorical order
key_index = [0]
# list() makes the result concrete regardless of what python2sort returns
sorted_data = list(dimension_sort(odict, kdims, vdims, key_index))
# Result: (key, value) pairs ordered by categorical values:
# [(('low', 2021), {'value': 50}),
#  (('medium', 2020), {'value': 75}),
#  (('high', 2020), {'value': 100})]

Best Practices

  • Ensure key_index does not contain duplicate dimension indexes, as this will raise a ValueError
  • Dimension objects must have a 'name' attribute and a 'values' attribute; 'values' should be a non-empty list for categorical dimensions and an empty list (not None) for natural sorting, since list(d.values) is evaluated for every dimension
  • The key_index values should be valid indexes within the range of len(kdims + vdims)
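  • For categorical dimensions, every data value must either appear in the dimension's values list or be None (which sorts first); otherwise index() will raise a ValueError
  • The input odict keys should be tuples where each element corresponds to a key dimension value; when sorting on a value dimension, each odict value must also be indexable by position
  • Performance consideration: the lookup list for each dimension is built once in cached_values, but every key computation still calls list.index(), which is linear in the number of categorical values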
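The two ValueError cases listed above can be reproduced without the full function. The sketch below copies the duplicate check from the source and rebuilds one lookup list the way cached_values does; `levels` and `check_key_index` are hypothetical names for illustration.

```python
levels = ['low', 'medium', 'high']   # hypothetical categorical values
lookup = [None] + levels             # same layout as cached_values in the source

def check_key_index(key_index):
    # Same condition the function uses to reject duplicates
    if len(set(key_index)) != len(key_index):
        raise ValueError("Cannot sort on duplicated dimensions")

errors = []
try:
    check_key_index([0, 0])
except ValueError as exc:
    errors.append(str(exc))

try:
    lookup.index('extreme')          # value not declared in `levels`
except ValueError as exc:
    errors.append(type(exc).__name__)

print(errors)              # ['Cannot sort on duplicated dimensions', 'ValueError']
print(lookup.index(None))  # 0
```

Validating the data against each dimension's values before sorting gives a clearer error than the bare ValueError raised from inside the key function.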

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function python2sort 55.3% similar

    A sorting function that mimics Python 2's behavior of grouping incomparable types separately and sorting within each group, rather than raising a TypeError when comparing incompatible types.

    From: /tf/active/vicechatdev/patches/util.py
  • function merge_dimensions 53.0% similar

    Merges multiple lists of Dimension objects by combining their values while preserving unique dimensions and maintaining order of first appearance.

    From: /tf/active/vicechatdev/patches/util.py
  • function arglexsort 50.2% similar

    Returns the indices that would lexicographically sort multiple arrays, treating them as columns of a structured array.

    From: /tf/active/vicechatdev/patches/util.py
  • function sort_topologically 47.7% similar

    Performs stackless topological sorting on a directed acyclic graph (DAG), organizing nodes into levels based on their dependencies.

    From: /tf/active/vicechatdev/patches/util.py
  • function group_select 46.5% similar

    Recursively groups a list of key tuples into a nested dictionary structure to optimize indexing operations by avoiding duplicate key lookups.

    From: /tf/active/vicechatdev/patches/util.py