function dimension_sort
Sorts an ordered dictionary by specified dimension keys, supporting both standard Python tuple sorting and categorical ordering for dimensions with predefined values.
/tf/active/vicechatdev/patches/util.py
1239 - 1258
moderate
Purpose
This function provides flexible sorting for multi-dimensional data structures where dimensions can be either continuous (sorted naturally) or categorical (sorted by predefined order). It's designed to work with data structures that have key dimensions (kdims) and value dimensions (vdims), commonly used in data visualization and analysis frameworks like HoloViews. The function handles categorical dimensions by using their predefined value order rather than natural sorting.
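The difference between natural and categorical ordering can be seen in a small standalone sketch. It uses only built-in `sorted` and `list.index`; the `levels` and `observed` names are illustrative and do not come from the original module.

```python
# Intended categorical order, as a Dimension's values list might declare it.
levels = ["low", "medium", "high"]
observed = ["high", "low", "medium", "low"]

# Natural sorting compares the strings alphabetically.
natural = sorted(observed)

# Categorical sorting ranks each item by its position in the declared order.
categorical = sorted(observed, key=levels.index)

print(natural)      # ['high', 'low', 'low', 'medium']
print(categorical)  # ['low', 'low', 'medium', 'high']
```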
Source Code
def dimension_sort(odict, kdims, vdims, key_index):
    """
    Sorts data by key using usual Python tuple sorting semantics
    or sorts in categorical order for any categorical Dimensions.
    """
    sortkws = {}
    ndims = len(kdims)
    dimensions = kdims+vdims
    indexes = [(dimensions[i], int(i not in range(ndims)),
                i if i in range(ndims) else i-ndims)
               for i in key_index]
    cached_values = {d.name: [None]+list(d.values) for d in dimensions}
    if len(set(key_index)) != len(key_index):
        raise ValueError("Cannot sort on duplicated dimensions")
    else:
        sortkws['key'] = lambda x: tuple(cached_values[dim.name].index(x[t][d])
                                         if dim.values else x[t][d]
                                         for i, (dim, t, d) in enumerate(indexes))
    return python2sort(odict.items(), **sortkws)
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| odict | - | - | positional_or_keyword |
| kdims | - | - | positional_or_keyword |
| vdims | - | - | positional_or_keyword |
| key_index | - | - | positional_or_keyword |
Parameter Details
odict: An ordered dictionary (OrderedDict) containing the data to be sorted. The dictionary items will be sorted based on the specified dimensions.
kdims: A list of key dimensions (Dimension objects) that define the primary structure of the data. Each must expose a 'name' attribute and a 'values' attribute. A non-empty 'values' list marks the dimension as categorical and fixes its sort order; an empty list selects natural sorting. The function reads 'values' on every dimension, including those not named in key_index.
vdims: A list of value dimensions (Dimension objects) that represent dependent variables or measurements. Like kdims, these must expose 'name' and 'values' attributes.
key_index: A list of integers specifying which dimensions (by index) to use for sorting. Indexes refer to positions in the combined list of kdims+vdims. Must not contain duplicates.
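To make the key_index convention concrete, the sketch below mirrors how the source resolves each index into a triple of (dimension, tuple slot, position). The `Dim` namedtuple and the `resolve_indexes` helper are hypothetical names introduced for illustration, and the sketch assumes non-negative indexes.

```python
from collections import namedtuple

# Hypothetical stand-in for a Dimension object (illustration only).
Dim = namedtuple("Dim", ["name", "values"])

def resolve_indexes(kdims, vdims, key_index):
    """Map each index into kdims+vdims to (dimension, slot, position).

    slot is 0 when the index points into the key tuple and 1 when it
    points into the value tuple; position is the offset within that tuple.
    """
    ndims = len(kdims)
    dimensions = kdims + vdims
    return [
        (dimensions[i], int(i >= ndims), i if i < ndims else i - ndims)
        for i in key_index
    ]

kdims = [Dim("category", ["low", "medium", "high"]), Dim("year", [])]
vdims = [Dim("value", [])]

# Index 2 is the first value dimension; index 0 is the first key dimension.
resolved = resolve_indexes(kdims, vdims, [2, 0])
summary = [(dim.name, slot, pos) for dim, slot, pos in resolved]
print(summary)  # [('value', 1, 0), ('category', 0, 0)]
```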
Return Value
Returns a sorted list of tuples from the input ordered dictionary, where each tuple is (key, value). The sorting is performed using the python2sort function with a custom key function that respects categorical ordering for dimensions with predefined values. For categorical dimensions, items are sorted by their position in the dimension's values list; for non-categorical dimensions, natural Python sorting is used.
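The key function described above can be approximated as follows. This is a minimal sketch, not the module's implementation: it assumes built-in `sorted` as a stand-in for `python2sort` (whose code is not shown in this section), and `Dim` and `make_sort_key` are hypothetical names.

```python
from collections import OrderedDict, namedtuple

# Hypothetical stand-in for a Dimension object (illustration only).
Dim = namedtuple("Dim", ["name", "values"])

def make_sort_key(indexes):
    """Build a key function from (dimension, slot, position) triples."""
    # [None] is prepended, as in the source, so None ranks before listed values.
    ranks = {dim.name: [None] + list(dim.values) for dim, _, _ in indexes}

    def sort_key(item):
        return tuple(
            ranks[dim.name].index(item[slot][pos]) if dim.values else item[slot][pos]
            for dim, slot, pos in indexes
        )
    return sort_key

category = Dim("category", ["low", "medium", "high"])
year = Dim("year", [])

data = OrderedDict([
    (("high", 2020), (100,)),
    (("low", 2021), (50,)),
    (("medium", 2020), (75,)),
    (("low", 2019), (60,)),
])

# Sort by category (categorical order), then by year (natural order).
key = make_sort_key([(category, 0, 0), (year, 0, 1)])
result = sorted(data.items(), key=key)  # sorted() assumed in place of python2sort
keys_in_order = [k for k, _ in result]
print(keys_in_order)
```

Each element of `result` is a (key, value) pair, matching the shape of the list that dimension_sort returns.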
Dependencies
python2sort
Required Imports
from collections import OrderedDict
Usage Example
# Assuming dimension_sort and python2sort are importable from the module above
from collections import OrderedDict

# Minimal mock of a Dimension object; values defaults to an empty list
# because dimension_sort calls list(d.values) on every dimension
class Dimension:
    def __init__(self, name, values=None):
        self.name = name
        self.values = values or []

# Define dimensions
kdims = [Dimension('category', values=['low', 'medium', 'high']), Dimension('year')]
vdims = [Dimension('value')]

# Create data: keys and values are tuples indexed by dimension position
odict = OrderedDict([
    (('high', 2020), (100,)),
    (('low', 2021), (50,)),
    (('medium', 2020), (75,))
])

# Sort by first dimension (category) using categorical order
key_index = [0]
sorted_data = dimension_sort(odict, kdims, vdims, key_index)
# sorted_data is a list of (key, value) tuples
# Expected key order: ('low', 2021), ('medium', 2020), ('high', 2020)
Best Practices
- Ensure key_index does not contain duplicate dimension indexes, as this will raise a ValueError
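- Dimension objects must have a 'name' attribute and a 'values' attribute; 'values' should be a non-empty list for categorical dimensions and an empty list (not None) for natural sorting, because the function calls list(d.values) on every dimension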
- The key_index values should be valid indexes within the range of len(kdims + vdims)
- For categorical dimensions, all data values must exist in the dimension's values list, otherwise index() will raise a ValueError
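- The input odict keys should be tuples where each element corresponds to a key dimension value; when key_index refers to value dimensions, the odict values must likewise be indexable by position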
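- Performance consideration: each dimension's values list is built once and cached in a dictionary, but every key evaluation still calls list.index(), a linear scan, so categorical dimensions with many values slow the sort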
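The constraints in this list can be checked before sorting. The sketch below shows a hypothetical `validate_sort_request` helper (not part of the original module) that reports duplicate or out-of-range indexes and data values missing from a categorical dimension. It also includes a hypothetical `rank_table` function that builds a value-to-position dictionary, one possible alternative to repeated list.index() calls, assuming the categorical values are hashable and unique.

```python
from collections import namedtuple

# Hypothetical stand-in for a Dimension object (illustration only).
Dim = namedtuple("Dim", ["name", "values"])

def validate_sort_request(items, kdims, vdims, key_index):
    """Return a list of human-readable problems; empty if none are found."""
    problems = []
    dimensions = kdims + vdims
    ndims = len(kdims)
    if len(set(key_index)) != len(key_index):
        problems.append("duplicate indexes in key_index")
    for i in dict.fromkeys(key_index):  # unique indexes, original order
        if not 0 <= i < len(dimensions):
            problems.append(f"index {i} out of range")
            continue
        dim = dimensions[i]
        if not dim.values:
            continue  # non-categorical: natural ordering, nothing to check
        allowed = set(dim.values) | {None}
        slot, pos = (0, i) if i < ndims else (1, i - ndims)
        for item in items:
            if item[slot][pos] not in allowed:
                problems.append(f"{item[slot][pos]!r} not in values of {dim.name!r}")
    return problems

def rank_table(dim):
    """Map each categorical value to its position (None ranks first)."""
    return {value: rank for rank, value in enumerate([None] + list(dim.values))}

kdims = [Dim("category", ["low", "medium", "high"]), Dim("year", [])]
vdims = [Dim("value", [])]
items = [(("high", 2020), (100,)), (("extreme", 2021), (50,))]

dup = validate_sort_request(items, kdims, vdims, [0, 0])
missing = validate_sort_request(items, kdims, vdims, [0])
out_of_range = validate_sort_request(items, kdims, vdims, [5])
ranks = rank_table(kdims[0])
print(missing)
```

Returning a list of problems rather than raising on the first one is a design choice for this illustration; the original function raises ValueError directly.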
Similar Components
AI-powered semantic similarity - components with related functionality:
- function python2sort 55.3% similar
- function merge_dimensions 53.0% similar
- function arglexsort 50.2% similar
- function sort_topologically 47.7% similar
- function group_select 46.5% similar