🔍 Code Extractor

function group_select

Maturity: 42

Recursively groups a list of key tuples into a nested dictionary structure to optimize indexing operations by avoiding duplicate key lookups.

File: /tf/active/vicechatdev/patches/util.py
Lines: 1433 - 1448
Complexity: moderate

Purpose

This function takes a list of tuples representing nested keys/indices and organizes them into a hierarchical dictionary structure. By grouping tuples that share a common prefix, it lets a caller index each parent key once and then resolve all of its children, instead of repeating the parent lookup for every tuple. The saving grows with the number of tuples that share leading keys.
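To make the benefit concrete, the sketch below walks a nested dictionary using the grouped structure, so each parent key is looked up once per group rather than once per tuple. `fetch_grouped` and the sample `data` are hypothetical names introduced only for illustration; `group_select` is reproduced from the source so the block runs on its own.

```python
import itertools
import operator
from collections import defaultdict


def group_select(selects, length=None, depth=None):
    # Reproduced from the documented source so this sketch is self-contained.
    if length is None and depth is None:
        length = depth = len(selects[0])
    getter = operator.itemgetter(depth - length)
    if length > 1:
        selects = sorted(selects, key=getter)
        grouped_selects = defaultdict(dict)
        for k, v in itertools.groupby(selects, getter):
            grouped_selects[k] = group_select(list(v), length - 1, depth)
        return grouped_selects
    return list(selects)


def fetch_grouped(container, grouped):
    """Hypothetical helper: yield (key_tuple, value) pairs from a nested dict.

    Each parent key is indexed once per group instead of once per tuple.
    """
    if isinstance(grouped, dict):
        for key, subgroup in grouped.items():
            child = container[key]  # single lookup shared by the whole group
            yield from fetch_grouped(child, subgroup)
    else:
        # Leaf level: a list of full key tuples; only the last key remains.
        for key_tuple in grouped:
            yield key_tuple, container[key_tuple[-1]]


# Assumed sample data, not taken from the source.
data = {"a": {"b": {"c": 1, "d": 2}, "e": {"f": 3}}, "x": {"y": {"z": 4}}}
selects = [("a", "b", "c"), ("a", "b", "d"), ("a", "e", "f"), ("x", "y", "z")]
values = dict(fetch_grouped(data, group_select(selects)))
```

In this sketch `data["a"]` is indexed once even though three tuples start with `"a"`; a naive loop over `selects` would index it three times.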

Source Code

def group_select(selects, length=None, depth=None):
    """
    Given a list of key tuples to select, groups them into sensible
    chunks to avoid duplicating indexing operations.
    """
    if length is None and depth is None:
        length = depth = len(selects[0])
    getter = operator.itemgetter(depth-length)
    if length > 1:
        selects = sorted(selects, key=getter)
        grouped_selects = defaultdict(dict)
        for k, v in itertools.groupby(selects, getter):
            grouped_selects[k] = group_select(list(v), length-1, depth)
        return grouped_selects
    else:
        return list(selects)

Parameters

Name     Type  Default  Kind
selects  -     -        positional_or_keyword
length   -     None     positional_or_keyword
depth    -     None     positional_or_keyword

Parameter Details

selects: A list of tuples where each tuple represents a sequence of keys/indices for nested data access. All tuples should have the same length. Example: [('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'e', 'f')]

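length: Optional integer giving the number of tuple positions still to be grouped at the current recursion level. It defaults to the length of the first tuple in selects, but only when depth is also None; passing just one of the two leaves the other as None and raises a TypeError when depth - length is evaluated. Used internally for recursion tracking.

depth: Optional integer giving the total length of the original tuples. It defaults to the length of the first tuple in selects under the same condition as length. Used internally so that depth - length identifies the tuple position to group on at each level.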
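Since `depth` is taken from the first tuple only, mismatched input is not rejected up front: a tuple that is too short can raise `IndexError` during sorting, an empty `selects` fails at `selects[0]`, and other mismatches are grouped silently. A small pre-check can surface these problems earlier. `check_selects` is a hypothetical helper, not part of the source module.

```python
def check_selects(selects):
    """Hypothetical pre-check: return the common tuple length or raise ValueError."""
    if not selects:
        raise ValueError("selects must contain at least one key tuple")
    lengths = {len(s) for s in selects}
    if len(lengths) != 1:
        raise ValueError(f"key tuples have mixed lengths: {sorted(lengths)}")
    return lengths.pop()


# Uniform input: the common length is returned.
depth = check_selects([("a", "b", "c"), ("a", "e", "f")])

# Mixed input: the problem is reported before any grouping happens.
try:
    check_selects([("a", "b", "c"), ("a", "b")])
    mixed_error = None
except ValueError as exc:
    mixed_error = str(exc)
```

Calling such a check before `group_select` keeps the failure close to the bad input rather than inside the recursion.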

Return Value

Returns either a nested defaultdict structure (when length > 1) where keys at each level map to sub-dictionaries or lists, or a list of tuples (when length == 1) representing the leaf nodes. The structure groups tuples by common prefixes at each level, creating a tree-like organization.
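The two return shapes can be compared in the sketch below: one-element tuples come back as a plain list, while longer tuples produce nested `defaultdict` levels whose leaves are lists of the original tuples. `group_select` is reproduced from the source so the block is self-contained; the sample inputs are assumptions.

```python
import itertools
import operator
from collections import defaultdict


def group_select(selects, length=None, depth=None):
    # Reproduced from the documented source so this sketch is self-contained.
    if length is None and depth is None:
        length = depth = len(selects[0])
    getter = operator.itemgetter(depth - length)
    if length > 1:
        selects = sorted(selects, key=getter)
        grouped_selects = defaultdict(dict)
        for k, v in itertools.groupby(selects, getter):
            grouped_selects[k] = group_select(list(v), length - 1, depth)
        return grouped_selects
    return list(selects)


# Tuples of length 1: nothing to group on, so a list is returned as-is.
flat = group_select([("a",), ("b",)])

# Tuples of length 2: one dictionary level keyed by the first element,
# with each value holding the full original tuples for that key.
nested = group_select([("a", 1), ("b", 2), ("a", 3)])
```

Because every level is a `defaultdict(dict)`, looking up an absent key with `[]` inserts an empty dict rather than raising `KeyError`; use `in` or `.get()` for membership checks on the result.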

Dependencies

  • operator
  • itertools
  • collections

Required Imports

import operator
import itertools
from collections import defaultdict

Usage Example

import operator
import itertools
from collections import defaultdict

def group_select(selects, length=None, depth=None):
    if length is None and depth is None:
        length = depth = len(selects[0])
    getter = operator.itemgetter(depth-length)
    if length > 1:
        selects = sorted(selects, key=getter)
        grouped_selects = defaultdict(dict)
        for k, v in itertools.groupby(selects, getter):
            grouped_selects[k] = group_select(list(v), length-1, depth)
        return grouped_selects
    else:
        return list(selects)

# Example usage
selects = [('a', 'b', 'c'), ('a', 'b', 'd'), ('a', 'e', 'f'), ('x', 'y', 'z')]
result = group_select(selects)
print(result)
# Output: defaultdict(<class 'dict'>, {'a': defaultdict(<class 'dict'>, {'b': [('a', 'b', 'c'), ('a', 'b', 'd')], 'e': [('a', 'e', 'f')]}), 'x': defaultdict(<class 'dict'>, {'y': [('x', 'y', 'z')]})})

Best Practices

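  • Ensure selects is non-empty and all of its tuples have the same length: depth is taken from the first tuple only, so a shorter tuple can raise IndexError and a longer one is grouped by its leading keys only
  • The function does not modify its input; it sorts a copy and builds a new nested structure at every level, so memory use grows with the number of tuples and the recursion depth equals the tuple length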
  • The returned defaultdict structure may need to be converted to regular dicts if serialization is required
  • This function is most beneficial when you have many tuples sharing common prefixes, as it reduces redundant indexing operations
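  • The function assumes the values at each grouped tuple position are hashable (they become dictionary keys) and mutually orderable (they are sorted before grouping); mixing types such as str and int at one position raises TypeError in Python 3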
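The conversion to regular dicts mentioned above can be sketched as follows. `to_plain` is a hypothetical helper, and `grouped` is a hand-built stand-in with the same shape as a `group_select` result. Note that `json.dumps` accepts a `defaultdict` directly, so the conversion mainly matters when a consumer expects exact `dict` types or when auto-insertion of missing keys is unwanted.

```python
import json
from collections import defaultdict


def to_plain(grouped):
    """Hypothetical helper: recursively convert nested defaultdicts to dicts."""
    if isinstance(grouped, dict):
        return {key: to_plain(value) for key, value in grouped.items()}
    # Leaf level: copy the list of key tuples unchanged.
    return list(grouped)


# Hand-built stand-in with the same shape as a group_select result.
grouped = defaultdict(dict)
grouped["a"] = defaultdict(dict, {"b": [("a", "b", "c"), ("a", "b", "d")]})
grouped["x"] = defaultdict(dict, {"y": [("x", "y", "z")]})

plain = to_plain(grouped)

# JSON has no tuple type, so the leaf tuples are written as arrays.
encoded = json.dumps(plain, sort_keys=True)
```

After conversion, a lookup of a missing key raises `KeyError` as with any ordinary dict.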

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function iterative_select 63.2% similar

    Recursively selects subgroups from a hierarchical object structure by iterating through dimensions and applying select operations, avoiding duplication of selections.

    From: /tf/active/vicechatdev/patches/util.py
  • function unpack_group 55.8% similar

    Unpacks a pandas DataFrame group by iterating over rows and yielding tuples of keys and objects, with special handling for objects with 'kdims' attribute.

    From: /tf/active/vicechatdev/patches/util.py
  • function layer_groups 55.3% similar

    Groups elements from an ordering list into a dictionary based on a slice of each element's specification, using the first 'length' items as the grouping key.

    From: /tf/active/vicechatdev/patches/util.py
  • function get_unique_keys 49.7% similar

    Extracts unique key values from an ndmapping object for specified dimensions, returning an iterator of unique tuples.

    From: /tf/active/vicechatdev/patches/util.py
  • class ndmapping_groupby 48.9% similar

    A parameterized function class that performs groupby operations on NdMapping objects, automatically using pandas for improved performance when available, falling back to pure Python implementation otherwise.

    From: /tf/active/vicechatdev/patches/util.py