function date_range

Maturity: 45

Generates an evenly-spaced date range array with a specified number of samples between start and end dates, with dates centered in each interval.

File: /tf/active/vicechatdev/patches/util.py
Lines: 2073 - 2082
Complexity: moderate

Purpose

This function derives the time step that divides the span between a start and end date into exactly 'length' intervals, then returns the center of each interval (start + step/2 + k*step for k = 0 .. length-1). The step is rounded to a whole number of time_unit units, so time_unit sets the precision of the result. It's useful for creating time-based axes or sampling points for time series data visualization and analysis.
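
The centering arithmetic described above can be traced by hand with plain NumPy. This is a minimal sketch that assumes the step equals the span divided by 'length' (the reciprocal of a samples-per-unit density); the variable names are illustrative and it does not call date_range itself.

```python
import numpy as np

start = np.datetime64("2023-01-01T00:00:00")
end = np.datetime64("2023-01-01T00:00:10")
length = 5

# Span expressed as a number of microseconds (the default time_unit).
span_us = (end - start) / np.timedelta64(1, "us")

# Width of one interval, rounded to a whole number of microseconds.
step = np.timedelta64(int(round(span_us / length)), "us")

# Center of each interval: start + step/2 + k*step for k = 0 .. length-1.
centers = start + step / 2. + np.arange(length) * step

print(step)     # interval width (2 seconds, stored as microseconds)
print(centers)  # samples at 1 s, 3 s, 5 s, 7 s and 9 s after start
```

Note that the samples never coincide with start or end: the first lies half a step after start and the last half a step before end.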

Source Code

def date_range(start, end, length, time_unit='us'):
    """
    Computes a date range given a start date, end date and the number
    of samples.
    """
    step = (1./compute_density(start, end, length, time_unit))
    if pd and isinstance(start, pd.Timestamp):
        start = start.to_datetime64()
    step = np.timedelta64(int(round(step)), time_unit)
    return start+step/2.+np.arange(length)*step

Parameters

Name       Type  Default  Kind
start      -     -        positional_or_keyword
end        -     -        positional_or_keyword
length     -     -        positional_or_keyword
time_unit  -     'us'     positional_or_keyword

Parameter Details

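start: The starting date/time of the range. Can be a numpy datetime64 or a pandas Timestamp; a Timestamp is converted to datetime64 before the arithmetic. Other datetime-like objects are not special-cased by the source shown here. This represents the beginning of the time period to be divided into samples.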

end: The ending date/time of the range. Should be the same type as start. This represents the end of the time period to be divided into samples.

length: Integer specifying the number of date samples to generate in the range. Must be a positive integer representing how many evenly-spaced points to create between start and end.

time_unit: String specifying the time unit for the step calculation. Default is 'us' (microseconds). Common values include 'D' (days), 'h' (hours), 'm' (minutes), 's' (seconds), 'ms' (milliseconds), 'us' (microseconds), 'ns' (nanoseconds). This affects the precision of the datetime calculations.
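
Because the step is rounded to a whole number of time_unit units, the choice of unit can change how much of the span the samples cover. The sketch below uses a hypothetical helper, rounded_step, which is not part of the documented module and assumes the step is simply the span divided by 'length'.

```python
import numpy as np

def rounded_step(start, end, length, time_unit):
    """Hypothetical helper: interval width rounded to whole time_unit units."""
    span = (end - start) / np.timedelta64(1, time_unit)
    return np.timedelta64(int(round(span / length)), time_unit)

start = np.datetime64("2023-01-01")
end = np.datetime64("2023-01-10")  # 9-day span

# 9 days / 4 samples = 2.25 days, which rounds down to 2 days.
step_days = rounded_step(start, end, 4, "D")
# 216 hours / 4 samples = 54 hours, which needs no rounding.
step_hours = rounded_step(start, end, 4, "h")

print(4 * step_days)   # total width covered with 'D': one day short of the span
print(4 * step_hours)  # total width covered with 'h': the full span
```

A finer unit keeps the rounded step closer to the exact interval width, at the cost of larger integer values in the result.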

Return Value

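Returns a numpy.ndarray of 'length' evenly-spaced datetime64 values between start and end. Each value is the center of its interval (offset by step/2 from the interval's left edge). The resulting dtype follows NumPy's unit promotion: it is the finer of the unit carried by start and the requested time_unit, so it is datetime64[time_unit] only when start is no finer than time_unit.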
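
The dtype of the returned array comes from NumPy's datetime arithmetic rather than from time_unit alone. A small illustration using only NumPy (the variable names are illustrative, and the offsets stand in for the k*step term):

```python
import numpy as np

step = np.timedelta64(1, "h")
offsets = np.arange(3) * step  # timedelta64[h] array

day_start = np.datetime64("2023-01-01")             # unit 'D', coarser than 'h'
ns_start = np.datetime64("2023-01-01T00:00", "ns")  # unit 'ns', finer than 'h'

# Adding a timedelta to a datetime promotes to the finer of the two units.
print((day_start + offsets).dtype)  # hour resolution
print((ns_start + offsets).dtype)   # nanosecond resolution
```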

Dependencies

  • numpy
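  • pandas (only used to detect pandas Timestamp inputs; the 'if pd and ...' guard in the source suggests the module tolerates pd being unset)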

Required Imports

import numpy as np
import pandas as pd

Conditional/Optional Imports

No separate import is needed for compute_density. It is not a standalone package: it is defined in the same module as date_range (/tf/active/vicechatdev/patches/util.py, see Similar Components) and only has to be in scope when date_range is called.

Condition: compute_density is a required dependency. It calculates the density of samples per time unit, and date_range takes its reciprocal as the step.

Usage Example

import numpy as np
import pandas as pd

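# Simplified stand-in for compute_density (samples per time_unit); the real
# implementation lives alongside date_range in util.py
def compute_density(start, end, length, time_unit='us'):
    delta = (end - start) / np.timedelta64(1, time_unit)
    return length / delta

# date_range as listed under Source Code, repeated so the examples run
def date_range(start, end, length, time_unit='us'):
    step = (1./compute_density(start, end, length, time_unit))
    if pd and isinstance(start, pd.Timestamp):
        start = start.to_datetime64()
    step = np.timedelta64(int(round(step)), time_unit)
    return start+step/2.+np.arange(length)*step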

# Example 1: Using numpy datetime64
start = np.datetime64('2023-01-01')
end = np.datetime64('2023-01-10')
dates = date_range(start, end, length=10, time_unit='D')
print(dates)

# Example 2: Using pandas Timestamp
start_ts = pd.Timestamp('2023-01-01 00:00:00')
end_ts = pd.Timestamp('2023-01-01 12:00:00')
hourly_dates = date_range(start_ts, end_ts, length=12, time_unit='h')
print(hourly_dates)

# Example 3: High precision with microseconds
start_precise = np.datetime64('2023-01-01T00:00:00')
end_precise = np.datetime64('2023-01-01T00:00:01')
micro_dates = date_range(start_precise, end_precise, length=1000, time_unit='us')
print(micro_dates[:5])

Best Practices

  • Ensure that the start and end dates are compatible datetime types (numpy datetime64 or pandas Timestamp)
  • The length parameter should be positive and reasonable for the time span to avoid extremely small or large step sizes
  • Choose an appropriate time_unit based on the scale of your date range (e.g., 'D' for days when spanning months, 'us' for sub-second precision)
  • Be aware that the function centers dates in each interval (adds step/2), so the first date is offset from the exact start time; if the half-step is smaller than one time_unit it truncates to zero and the offset disappears
  • The compute_density function must be available in scope - this is a required dependency that needs to be defined or imported
  • When using pandas Timestamps, they are automatically converted to numpy datetime64 for consistent array operations
  • For very large length values or very small time ranges, pick a time_unit fine enough that the step, which is rounded to a whole number of units, is non-zero and close to the exact interval width
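
The last few points can be checked before calling date_range. The sketch below is a hypothetical pre-flight helper, check_step, which is not part of the documented module; it assumes the step is the span divided by 'length' and mirrors the rounding and half-step division used in the source.

```python
import numpy as np

def check_step(start, end, length, time_unit):
    """Hypothetical check: return (step, half_step) or raise if the step is zero."""
    span = (end - start) / np.timedelta64(1, time_unit)
    n_units = int(round(span / length))
    if n_units == 0:
        raise ValueError(
            f"step rounds to 0 {time_unit!r} units; choose a finer time_unit"
        )
    step = np.timedelta64(n_units, time_unit)
    # Dividing a timedelta64 by a float keeps the unit and truncates the value.
    return step, step / 2.

start = np.datetime64("2023-01-01")
end = np.datetime64("2023-01-11")  # 10-day span

# With days, the step is 1 day and the half-step truncates to 0 days,
# so the first sample would land exactly on start.
print(check_step(start, end, 10, "D"))

# With hours, the step is 24 hours and the half-step is 12 hours,
# so the samples are centered as intended.
print(check_step(start, end, 10, "h"))
```

If the half-step comes back as zero, switching to a finer time_unit restores the centering at no cost other than a finer result dtype.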

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function bound_range 58.1% similar

    Computes a bounding range and density from evenly spaced samples, extending the range by half the density on each side and detecting if values are inverted.

    From: /tf/active/vicechatdev/patches/util.py
  • function compute_density 57.9% similar

    Computes the density (samples per unit) of a grid given start and end boundaries and the number of samples, with special handling for datetime/timedelta types.

    From: /tf/active/vicechatdev/patches/util.py
  • function range_pad 56.0% similar

    Pads a numeric or datetime range by a specified fraction of the interval, with optional logarithmic scaling for positive numeric values.

    From: /tf/active/vicechatdev/patches/util.py
  • function find_range 51.7% similar

    Robustly computes the minimum and maximum values from a collection, with fallback mechanisms for edge cases and support for extending the range with soft bounds.

    From: /tf/active/vicechatdev/patches/util.py
  • function parse_datetime_v1 49.1% similar

    Converts various date representations (string, integer, pandas Timestamp) into a numpy datetime64 object using pandas datetime parsing capabilities.

    From: /tf/active/vicechatdev/patches/util.py