🔍 Code Extractor

function unique_iterator

Maturity: 45

A generator function that yields unique elements from an input sequence in order of first appearance, filtering out duplicates.

File: /tf/active/vicechatdev/patches/util.py
Lines: 1121 - 1130
Complexity: simple

Purpose

This function iterates lazily over a sequence, yielding each element the first time it appears and skipping later duplicates. It tracks the items already seen in a set, so the order of first appearance is preserved and no intermediate list of results is built. This is useful for deduplicating large iterables, although the set itself still grows with the number of distinct elements.

Source Code

def unique_iterator(seq):
    """
    Returns an iterator containing all non-duplicate elements
    in the input sequence.
    """
    seen = set()
    for item in seq:
        if item not in seen:
            seen.add(item)
            yield item

Parameters

Name  Type  Default  Kind
seq   -     -        positional_or_keyword

Parameter Details

seq: Any iterable (list, tuple, string, generator, etc.) whose elements are to be deduplicated. Every element must be hashable, because seen elements are stored in a set. Strings, numbers, and tuples of hashable values qualify; lists, dictionaries, sets, and tuples that contain them do not.
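
As a minimal sketch of the hashability requirement (the function body is copied from the Source Code section; the sample inputs are made up for illustration), the snippet below shows that an unhashable element only triggers a TypeError once iteration reaches it, and that values which compare equal are treated as duplicates:

```python
def unique_iterator(seq):
    """Copied from the Source Code section so the example runs on its own."""
    seen = set()
    for item in seq:
        if item not in seen:
            seen.add(item)
            yield item


# Creating the generator does not touch the data, so no error is raised here.
gen = unique_iterator([1, [2, 3], 1])
first = next(gen)  # 1 is hashable and is yielded normally

try:
    next(gen)  # the membership test has to hash [2, 3], which fails
    raised = False
except TypeError:
    raised = True

# 1, 1.0 and True compare equal and hash alike, so only the first is kept.
equal_values = list(unique_iterator([1, 1.0, True]))

print(first, raised, equal_values)  # 1 True [1]
```

Because the error surfaces during iteration rather than at the call site, it can appear far from the line that created the generator.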

Return Value

Returns a generator iterator that yields unique elements from the input sequence in the order they first appear. Each element is yielded exactly once, even if it appears multiple times in the input. The generator is lazy and only processes elements as they are consumed.
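
The laziness described above can be illustrated with an endless input, which an eager approach could never finish reading. This is a sketch only: cycling_values is a hypothetical helper invented for the example, and the function body is copied from the Source Code section.

```python
from itertools import count, islice


def unique_iterator(seq):
    """Copied from the Source Code section so the example runs on its own."""
    seen = set()
    for item in seq:
        if item not in seen:
            seen.add(item)
            yield item


def cycling_values():
    """Hypothetical endless stream: 0, 1, 2, 0, 1, 2, ..."""
    for n in count():
        yield n % 3


# islice stops after three results, so the endless input is only read
# as far as needed to produce them.
first_three = list(islice(unique_iterator(cycling_values()), 3))
print(first_three)  # [0, 1, 2]
```

Asking for a fourth unique value from this particular stream would never return, since only three distinct values exist; laziness bounds the work only when the requested number of unique items is actually available.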

Usage Example

# Basic usage with a list
data = [1, 2, 3, 2, 4, 1, 5, 3]
for unique_item in unique_iterator(data):
    print(unique_item)
# Output: 1, 2, 3, 4, 5

# Convert to list if needed
unique_list = list(unique_iterator(data))
print(unique_list)  # [1, 2, 3, 4, 5]

# Works with strings
text = "hello world"
unique_chars = list(unique_iterator(text))
print(''.join(unique_chars))  # 'helo wrd'

# Works with any hashable elements, such as tuples
tuples = [(1, 2), (3, 4), (1, 2), (5, 6)]
for item in unique_iterator(tuples):
    print(item)
# Output: (1, 2), (3, 4), (5, 6)

Best Practices

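  • Elements in the input sequence must be hashable since they are stored in a set. Unhashable values such as lists, dictionaries, or tuples containing them raise a TypeError, and because the generator is lazy the error appears only when iteration reaches the offending element.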
  • The function preserves the order of first appearance, making it suitable for order-sensitive operations.
  • Use this function when working with large sequences to avoid creating intermediate lists, as it yields elements one at a time.
  • If you need the results as a list, wrap the function call with list(): list(unique_iterator(seq))
  • The function maintains a set of all seen items in memory, so memory usage grows with the number of unique elements.
  • For very large datasets with many unique values, consider alternative approaches if memory is constrained.
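  • Prefer this function over set(seq) when order matters: a set does not guarantee iteration order in any Python version. list(dict.fromkeys(seq)) also preserves first-appearance order (dicts keep insertion order from Python 3.7 onward), but it processes the whole input eagerly.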
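
The trade-offs in the list above can be sketched as follows. The function body is copied from the Source Code section; unique_by_key is a hypothetical variant that does not exist in util.py, shown only as one possible way to deduplicate unhashable items by a hashable key.

```python
def unique_iterator(seq):
    """Copied from the Source Code section so the example runs on its own."""
    seen = set()
    for item in seq:
        if item not in seen:
            seen.add(item)
            yield item


data = [3, 1, 3, 2, 1]

# Lazy, order-preserving deduplication.
lazy_result = list(unique_iterator(data))

# Eager alternative: dict keys keep insertion order in Python 3.7+.
eager_result = list(dict.fromkeys(data))


def unique_by_key(seq, key):
    """Hypothetical variant: track a hashable key instead of the item itself."""
    seen = set()
    for item in seq:
        marker = key(item)
        if marker not in seen:
            seen.add(marker)
            yield item


# Dictionaries are unhashable, so they are deduplicated on their "id" field.
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 1, "name": "c"}]
kept = list(unique_by_key(records, key=lambda r: r["id"]))

print(lazy_result, eager_result)  # [3, 1, 2] [3, 1, 2]
print(kept)  # [{'id': 1, 'name': 'a'}, {'id': 2, 'name': 'b'}]
```

Both approaches hold every distinct element (or key) in memory; the generator version differs mainly in that it can start yielding results before the input has been fully read.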

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function unique_zip 59.0% similar

    Returns a unique list of tuples created by zipping multiple iterables together, removing any duplicate tuples while preserving order.

    From: /tf/active/vicechatdev/patches/util.py
  • function get_unique_keys 50.8% similar

    Extracts unique key values from an ndmapping object for specified dimensions, returning an iterator of unique tuples.

    From: /tf/active/vicechatdev/patches/util.py
  • function unique_array 46.2% similar

    Returns an array of unique values from the input array while preserving the original order of first occurrence.

    From: /tf/active/vicechatdev/patches/util.py
  • class Generator 43.2% similar

    Generator is a specialized Callable wrapper class that wraps Python generator objects, providing controlled iteration with no arguments and no memoization.

    From: /tf/active/vicechatdev/patches/spaces.py
  • function test_no_identical_chunks 42.3% similar

    A unit test function that verifies the HashCleaner's behavior when processing a list of unique text chunks, ensuring no chunks are removed when all are distinct.

    From: /tf/active/vicechatdev/chromadb-cleanup/tests/test_hash_cleaner.py