🔍 Code Extractor

function search_indices

Maturity: 38

Finds the indices of specified values within a source array by using sorted search for efficient lookup.

File: /tf/active/vicechatdev/patches/util.py
Lines: 2180 - 2186
Complexity: moderate

Purpose

This function locates the positions (indices) of multiple target values within a source array. It sorts the source once with numpy's argsort, then uses searchsorted to binary-search the sorted copy for each target, for a total cost of O(n log n + m log n), where n is the size of the source array and m is the number of values to find. This is particularly useful for mapping values between arrays, finding positions of elements, or performing reverse lookups in data processing pipelines.
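
The two steps described above can be traced on a small array. This is only an illustrative walkthrough; the intermediate names (order, sorted_source, pos) are chosen here for readability and do not appear in util.py.

```python
import numpy as np

source = np.array([10, 50, 30, 20, 40])
values = np.array([30, 10, 40])

# Step 1: positions that would sort source (one O(n log n) sort).
order = source.argsort()
print(order)            # [0 3 2 4 1]

# Step 2: a sorted copy of source, built with fancy indexing.
sorted_source = source[order]
print(sorted_source)    # [10 20 30 40 50]

# Step 3: binary search for each value in the sorted copy (O(m log n)).
pos = np.searchsorted(sorted_source, values)
print(pos)              # [2 0 3]

# Step 4: map positions in the sorted copy back to positions in source.
result = order[pos]
print(result)           # [2 0 4]
```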

Source Code

def search_indices(values, source):
    """
    Given a set of values returns the indices of each of those values
    in the source array.
    """
    orig_indices = source.argsort()
    return orig_indices[np.searchsorted(source[orig_indices], values)]

Parameters

Name     Type   Default   Kind
values   -      -         positional_or_keyword
source   -      -         positional_or_keyword

Parameter Details

values: An array-like object (list, numpy array, pandas Series, etc.) containing the values whose indices need to be found in the source array. These are the target values to search for.

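source: A one-dimensional numpy array that serves as the reference array in which to search for the values. It needs to be an actual numpy array rather than a plain list, because the function calls source.argsort() and indexes source with an integer array. The function builds a sorted copy internally to enable binary search; the original array is not modified.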

Return Value

Returns a numpy array of integers with the same shape as 'values'; each element is the index in 'source' of the corresponding value. If a value appears multiple times in source, searchsorted selects the first match in sorted order, and the function returns that element's original index. Because argsort's default sort is not stable, which of the duplicate positions this is should not be relied upon.
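
The shape and duplicate behaviour described above can be checked with a short sketch. The arrays are made up for illustration, and the function body is copied from the Source Code section.

```python
import numpy as np

def search_indices(values, source):
    orig_indices = source.argsort()
    return orig_indices[np.searchsorted(source[orig_indices], values)]

source = np.array([7, 3, 7, 1])   # the value 7 sits at positions 0 and 2
values = np.array([7, 1, 7])

idx = search_indices(values, source)
print(idx.shape)      # (3,): one index per entry in values
print(source[idx])    # [7 1 7]: every index points at a matching element

# Whether 7 maps to position 0 or 2 depends on how argsort orders ties.
# If a deterministic choice matters, sorting with
# source.argsort(kind="stable") keeps ties in their original order,
# so the lowest matching position would be returned.
```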

Dependencies

  • numpy

Required Imports

import numpy as np

Usage Example

import numpy as np

def search_indices(values, source):
    orig_indices = source.argsort()
    return orig_indices[np.searchsorted(source[orig_indices], values)]

# Example usage
source = np.array([10, 50, 30, 20, 40])
values = np.array([30, 10, 40])

indices = search_indices(values, source)
print(indices)  # Output: [2 0 4]
print(source[indices])  # Output: [30 10 40] - verifies the indices are correct

Best Practices

  • Ensure that every entry in 'values' actually exists in 'source'. A missing value silently yields the index of the next larger element in source, and a value greater than every element in source raises an IndexError
  • The source array should contain unique values for predictable results; duplicate values in source may lead to ambiguous index returns
  • This function assumes numeric or comparable data types that can be sorted; ensure both arrays contain compatible data types
  • For very large arrays, be aware of memory usage: argsort allocates an index array as long as source, and source[orig_indices] creates a full sorted copy of source
  • Consider using this function when you need to find multiple values at once, as it's more efficient than repeated individual searches
  • The function does not validate that values exist in source; if that cannot be guaranteed, check membership before calling it, for example with np.isin(values, source).all()
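
One way to add the validation mentioned in the list above is a thin wrapper that verifies each lookup before returning. The name search_indices_checked is hypothetical and not part of util.py; this is a minimal sketch that assumes a non-empty one-dimensional source and one-dimensional values.

```python
import numpy as np

def search_indices_checked(values, source):
    """Hypothetical variant of search_indices that rejects missing values."""
    values = np.asarray(values)
    source = np.asarray(source)

    order = source.argsort()
    sorted_source = source[order]
    pos = np.searchsorted(sorted_source, values)

    # searchsorted returns len(source) for values above the maximum;
    # clip so the comparison below cannot index out of bounds.
    pos = np.clip(pos, 0, len(source) - 1)

    # A lookup is valid only if the element found equals the value sought.
    found = sorted_source[pos] == values
    if not np.all(found):
        raise KeyError(f"values not present in source: {values[~found]!r}")
    return order[pos]

source = np.array([10, 50, 30, 20, 40])
print(search_indices_checked([30, 10, 40], source))   # [2 0 4]

try:
    search_indices_checked([35], source)   # 35 is not in source
except KeyError as exc:
    print("rejected:", exc)
```

The check adds one comparison pass over 'values', so the overall cost stays at O(n log n + m log n).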

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function arglexsort 60.5% similar

    Returns the indices that would lexicographically sort multiple arrays, treating them as columns of a structured array.

    From: /tf/active/vicechatdev/patches/util.py
  • function find_range 42.7% similar

    Robustly computes the minimum and maximum values from a collection, with fallback mechanisms for edge cases and support for extending the range with soft bounds.

    From: /tf/active/vicechatdev/patches/util.py
  • function cross_index 42.2% similar

    Efficiently indexes into a Cartesian product of iterables without materializing the full product, using a linear index to retrieve the corresponding tuple of values.

    From: /tf/active/vicechatdev/patches/util.py
  • function find_minmax 41.6% similar

    Computes the minimum of the first elements and maximum of the second elements from two tuples of numeric values, handling NaN values gracefully.

    From: /tf/active/vicechatdev/patches/util.py
  • function dimension_sort 41.4% similar

    Sorts an ordered dictionary by specified dimension keys, supporting both standard Python tuple sorting and categorical ordering for dimensions with predefined values.

    From: /tf/active/vicechatdev/patches/util.py