function group_select
Recursively groups a list of key tuples into a nested dictionary structure to optimize indexing operations by avoiding duplicate key lookups.
/tf/active/vicechatdev/patches/util.py
1433 - 1448
moderate
Purpose
This function takes a list of tuples representing nested keys/indices and organizes them into a hierarchical dictionary structure. By grouping tuples that share common prefixes, it enables efficient batch indexing operations where the same parent keys don't need to be accessed multiple times. This is particularly useful for optimizing nested data structure access patterns.
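To make the claim about avoided lookups concrete, here is a minimal sketch of how the grouped result might be consumed. The `fetch_grouped` helper, the `lookups` counter and the sample `data` dictionary are hypothetical and are not part of util.py; `group_select` is copied from the Source Code section.

```python
import itertools
import operator
from collections import defaultdict


def group_select(selects, length=None, depth=None):
    # Copied from the Source Code section, indentation restored.
    if length == None and depth == None:
        length = depth = len(selects[0])
    getter = operator.itemgetter(depth - length)
    if length > 1:
        selects = sorted(selects, key=getter)
        grouped_selects = defaultdict(dict)
        for k, v in itertools.groupby(selects, getter):
            grouped_selects[k] = group_select(list(v), length - 1, depth)
        return grouped_selects
    else:
        return list(selects)


def fetch_grouped(data, grouped, lookups):
    """Hypothetical consumer: index each shared prefix of `data` only once."""
    results = {}
    if isinstance(grouped, list):
        # Leaf level: the last element of each tuple indexes the current container.
        for key in grouped:
            lookups.append(key)
            results[key] = data[key[-1]]
        return results
    for k, sub in grouped.items():
        lookups.append(k)  # one lookup per shared prefix key
        results.update(fetch_grouped(data[k], sub, lookups))
    return results


# Illustrative nested data, assumed for this sketch only.
data = {"a": {"b": {"c": 1, "d": 2}, "e": {"f": 3}}, "x": {"y": {"z": 4}}}
selects = [("a", "b", "c"), ("a", "b", "d"), ("a", "e", "f"), ("x", "y", "z")]

lookups = []
values = fetch_grouped(data, group_select(selects), lookups)
print(values)
print(len(lookups))  # 9 index operations, versus 4 tuples * 3 levels = 12 when done naively
```

The saving grows with the number of tuples that share a prefix; with no shared prefixes the grouped walk performs the same number of lookups as the naive one.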
Source Code
def group_select(selects, length=None, depth=None):
    """
    Given a list of key tuples to select, groups them into sensible
    chunks to avoid duplicating indexing operations.
    """
    if length == None and depth == None:
        length = depth = len(selects[0])
    getter = operator.itemgetter(depth-length)
    if length > 1:
        selects = sorted(selects, key=getter)
        grouped_selects = defaultdict(dict)
        for k, v in itertools.groupby(selects, getter):
            grouped_selects[k] = group_select(list(v), length-1, depth)
        return grouped_selects
    else:
        return list(selects)
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| selects | - | - | positional_or_keyword |
| length | - | None | positional_or_keyword |
| depth | - | None | positional_or_keyword |
Parameter Details
selects: A list of tuples where each tuple represents a sequence of keys/indices for nested data access. All tuples should have the same length. Example: [('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'e', 'f')]
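length: Optional integer giving the number of tuple positions still to be processed at the current recursion level. If both length and depth are None, both default to the length of the first tuple in selects. Used internally for recursion tracking; the position grouped on at each level is depth - length.
depth: Optional integer giving the total length of the original tuples. It stays constant during recursion so that depth - length identifies the current position. Pass either both length and depth or neither: if only one is given, the other stays None and depth - length raises a TypeError.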
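The interaction between length and depth can be sketched with a few direct calls. The sample tuples are illustrative only, and calling the function with explicit length and depth is shown purely to expose the recursion bookkeeping, not as a documented usage pattern.

```python
import itertools
import operator
from collections import defaultdict


def group_select(selects, length=None, depth=None):
    # Copied from the Source Code section, indentation restored.
    if length == None and depth == None:
        length = depth = len(selects[0])
    getter = operator.itemgetter(depth - length)
    if length > 1:
        selects = sorted(selects, key=getter)
        grouped_selects = defaultdict(dict)
        for k, v in itertools.groupby(selects, getter):
            grouped_selects[k] = group_select(list(v), length - 1, depth)
        return grouped_selects
    else:
        return list(selects)


selects = [("a", "b", "c"), ("a", "b", "d"), ("x", "b", "e")]

# Default call: length = depth = 3, so grouping runs on position 0, then position 1.
full = group_select(selects)

# Explicit call: depth - length == 1, so grouping starts at position 1
# and position 0 is not used as a key.
partial = group_select(selects, length=2, depth=3)

# Passing only one of the two leaves the other as None, and the
# subtraction depth - length fails.
try:
    group_select(selects, length=2)
    error = None
except TypeError as exc:
    error = exc

print(dict(partial))
print(type(error).__name__)
```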
Return Value
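Returns a nested defaultdict (when length > 1) or a plain list of the input tuples (otherwise). For tuples of length n, the result has n - 1 levels of dictionaries keyed by the first n - 1 tuple elements, and each innermost value is a list of the complete original tuples that share that prefix. The last element of each tuple is never used as a key. The result therefore forms a tree in which tuples with common prefixes share a branch.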
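A short sketch of the two return shapes follows, along with one consequence of the result being a defaultdict(dict): indexing a missing key silently inserts an empty dict. The sample tuples and the "zzz" key are illustrative.

```python
import itertools
import operator
from collections import defaultdict


def group_select(selects, length=None, depth=None):
    # Copied from the Source Code section, indentation restored.
    if length == None and depth == None:
        length = depth = len(selects[0])
    getter = operator.itemgetter(depth - length)
    if length > 1:
        selects = sorted(selects, key=getter)
        grouped_selects = defaultdict(dict)
        for k, v in itertools.groupby(selects, getter):
            grouped_selects[k] = group_select(list(v), length - 1, depth)
        return grouped_selects
    else:
        return list(selects)


# 1-tuples: nothing to group on, so a plain list comes back in input order.
flat = group_select([("b",), ("a",)])

# 2-tuples: one dictionary level keyed by the first element; the leaves
# hold the complete original tuples.
pairs = group_select([("a", 1), ("b", 2), ("a", 3)])

# defaultdict gotcha: a membership test is safe, but indexing a missing key inserts it.
present_before = "zzz" in pairs
inserted_value = pairs["zzz"]
present_after = "zzz" in pairs

print(flat)
print(present_before, inserted_value, present_after)
```

Using `in` or `.get()` when probing the result avoids creating such empty entries.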
Dependencies
- operator
- itertools
- collections
Required Imports
import operator
import itertools
from collections import defaultdict
Usage Example
import operator
import itertools
from collections import defaultdict

def group_select(selects, length=None, depth=None):
    if length == None and depth == None:
        length = depth = len(selects[0])
    getter = operator.itemgetter(depth-length)
    if length > 1:
        selects = sorted(selects, key=getter)
        grouped_selects = defaultdict(dict)
        for k, v in itertools.groupby(selects, getter):
            grouped_selects[k] = group_select(list(v), length-1, depth)
        return grouped_selects
    else:
        return list(selects)

# Example usage
selects = [('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'e', 'f'), ('x', 'y', 'z')]
result = group_select(selects)
print(result)
# Output: defaultdict(<class 'dict'>, {'a': defaultdict(<class 'dict'>, {'b': [('a', 'b', 'c'), ('a', 'b', 'd')], 'e': [('a', 'e', 'f')]}), 'x': defaultdict(<class 'dict'>, {'y': [('x', 'y', 'z')]})})
Best Practices
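- Ensure selects is non-empty and that all tuples have the same length: an empty list raises IndexError on selects[0], and a tuple shorter than the others can raise IndexError while sorting
- The function does not modify its input (sorted() returns a new list), but it builds a new list for every group at every level and its recursion depth equals the tuple length, so be aware of the memory implications for very long tuple lists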
- The returned defaultdict structure may need to be converted to regular dicts if serialization is required
- This function is most beneficial when you have many tuples sharing common prefixes, as it reduces redundant indexing operations
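- The elements at each grouped position (all but the last) must be hashable, because they become dictionary keys, and mutually orderable, because they are sorted before grouping; mixing types such as int and str at one position raises TypeError in Python 3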
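The conversion to regular dicts mentioned above could look like the following minimal sketch. The `to_plain` helper is hypothetical and not part of util.py. Whether a given serializer actually requires the conversion is an assumption: the standard json module accepts dict subclasses directly, so the main benefits here are a cleaner repr and dropping the auto-insert behaviour of defaultdict.

```python
import itertools
import json
import operator
from collections import defaultdict


def group_select(selects, length=None, depth=None):
    # Copied from the Source Code section, indentation restored.
    if length == None and depth == None:
        length = depth = len(selects[0])
    getter = operator.itemgetter(depth - length)
    if length > 1:
        selects = sorted(selects, key=getter)
        grouped_selects = defaultdict(dict)
        for k, v in itertools.groupby(selects, getter):
            grouped_selects[k] = group_select(list(v), length - 1, depth)
        return grouped_selects
    else:
        return list(selects)


def to_plain(grouped):
    """Hypothetical helper: recursively replace defaultdicts with plain dicts."""
    if isinstance(grouped, dict):
        return {k: to_plain(v) for k, v in grouped.items()}
    return grouped  # leaf lists are returned unchanged


selects = [("a", "b", "c"), ("a", "b", "d"), ("a", "e", "f"), ("x", "y", "z")]
plain = to_plain(group_select(selects))

# JSON has no tuple type, so the leaf tuples come back as lists after a round trip.
round_tripped = json.loads(json.dumps(plain))

print(plain)
```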
Similar Components
AI-powered semantic similarity - components with related functionality:
- function iterative_select 63.2% similar
- function unpack_group 55.8% similar
- function layer_groups 55.3% similar
- function get_unique_keys 49.7% similar
- class ndmapping_groupby 48.9% similar