function search_indices
Finds the indices of specified values within a source array by using sorted search for efficient lookup.
/tf/active/vicechatdev/patches/util.py
2180 - 2186
moderate
Purpose
This function efficiently locates the positions (indices) of multiple target values within a source array. It uses numpy's argsort and searchsorted functions to perform the lookup in O(n log n + m log n) time complexity, where n is the size of the source array and m is the number of values to find. This is particularly useful for mapping values between arrays, finding positions of elements, or performing reverse lookups in data processing pipelines.
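The argsort-then-searchsorted lookup described above can be traced step by step on a small array. This is an illustrative sketch rather than part of the documented source; the intermediate names `sorted_source` and `positions` are introduced here only for explanation.

```python
import numpy as np

source = np.array([10, 50, 30, 20, 40])
values = np.array([30, 10, 40])

# Step 1: compute the permutation that sorts source (O(n log n)).
orig_indices = source.argsort()           # [0 3 2 4 1]
sorted_source = source[orig_indices]      # [10 20 30 40 50]

# Step 2: binary-search each value in the sorted copy (O(m log n)).
positions = np.searchsorted(sorted_source, values)   # [2 0 3]

# Step 3: map sorted positions back to positions in the original array.
indices = orig_indices[positions]         # [2 0 4]

print(indices)
```

Step 3 is what turns "position in the sorted copy" into "position in source", which is why the permutation from step 1 has to be kept.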
Source Code
def search_indices(values, source):
    """
    Given a set of values returns the indices of each of those values
    in the source array.
    """
    orig_indices = source.argsort()
    return orig_indices[np.searchsorted(source[orig_indices], values)]
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| values | - | - | positional_or_keyword |
| source | - | - | positional_or_keyword |
Parameter Details
values: An array-like object (list, numpy array, pandas Series, etc.) containing the values whose indices need to be found in the source array. These are the target values to search for.
source: A numpy array that serves as the reference array in which to search for the values. The function calls source.argsort() and indexes source with the resulting integer array, so a plain Python list will not work; convert it with np.asarray first. The original array is not modified: sorting happens through an index permutation and a sorted copy.
Return Value
Returns a numpy array of integers giving, for each element of 'values', an index into 'source' at which that value occurs. The returned array has the same length as 'values'. If a value appears multiple times in source, searchsorted (with its default side='left') selects the first matching position in sorted order; because argsort's default sort is not stable, which of the duplicate original indices that corresponds to is not guaranteed. If a value is absent from source, the result is the index of the next larger element, or an IndexError is raised when the value is greater than every element of source.
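The behavior for values that are not present in source can be checked with a short experiment. The sketch below reuses the function as listed under Source Code; the variable names `wrong` and `raised` are illustrative only.

```python
import numpy as np

def search_indices(values, source):
    orig_indices = source.argsort()
    return orig_indices[np.searchsorted(source[orig_indices], values)]

source = np.array([10, 50, 30, 20, 40])

# 25 is absent: its insertion point in the sorted copy is where 30 sits,
# so the function silently returns the index of 30 in source.
wrong = search_indices(np.array([25]), source)
print(wrong, source[wrong])   # [2] [30]

# 60 is larger than every element: the insertion point is len(source),
# which is out of bounds when mapped back through orig_indices.
try:
    search_indices(np.array([60]), source)
    raised = False
except IndexError:
    raised = True
print(raised)                 # True
```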
Dependencies
numpy
Required Imports
import numpy as np
Usage Example
import numpy as np
def search_indices(values, source):
    orig_indices = source.argsort()
    return orig_indices[np.searchsorted(source[orig_indices], values)]
# Example usage
source = np.array([10, 50, 30, 20, 40])
values = np.array([30, 10, 40])
indices = search_indices(values, source)
print(indices) # Output: [2 0 4]
print(source[indices]) # Output: [30 10 40] - verifies the indices are correct
Best Practices
- Ensure that all values in 'values' actually exist in 'source'. The function does not validate this: a missing value silently yields the index of the next larger element, and a value greater than every element of source raises an IndexError. Add validation if your use case needs it
- The source array should contain unique values for predictable results; with duplicates, which of the matching indices is returned is not guaranteed
- This function assumes numeric or otherwise sortable data; ensure both arrays contain mutually comparable data types
- For very large arrays, be aware of memory usage: argsort allocates an index array the size of source, and source[orig_indices] creates a sorted copy of source
- Consider using this function when you need to find multiple values at once, as it's more efficient than repeated individual searches
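If validation is wanted, one possible approach is a wrapper that checks each match before returning. The following is a minimal sketch, not part of the documented module: the name `search_indices_checked`, the choice of ValueError, and the use of a stable sort (so duplicates resolve to the first occurrence in source) are all assumptions made for illustration.

```python
import numpy as np

def search_indices_checked(values, source):
    """Hypothetical validated variant of search_indices (illustration only)."""
    source = np.asarray(source)
    values = np.atleast_1d(np.asarray(values))
    if source.size == 0:
        if values.size:
            raise ValueError("source is empty; no value can be found")
        return np.empty(0, dtype=np.intp)

    # A stable sort keeps equal elements in original order, so with
    # side='left' a duplicated value maps to its first occurrence.
    orig_indices = source.argsort(kind="stable")
    sorted_source = source[orig_indices]
    positions = np.searchsorted(sorted_source, values)

    # Clip insertion points equal to len(source) so they can be indexed,
    # then confirm that each position really holds the requested value.
    clipped = np.minimum(positions, source.size - 1)
    found = sorted_source[clipped] == values
    if not np.all(found):
        raise ValueError(f"values not present in source: {values[~found]}")
    return orig_indices[clipped]

source = np.array([10, 50, 30, 20, 40])
print(search_indices_checked([30, 10, 40], source))   # [2 0 4]
```

The extra equality check costs O(m) on top of the search, which is small relative to the sort.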
Similar Components
AI-powered semantic similarity - components with related functionality:
- function arglexsort 60.5% similar
- function find_range 42.7% similar
- function cross_index 42.2% similar
- function find_minmax 41.6% similar
- function dimension_sort 41.4% similar