Search - Code Extractor

Search Components

Full-Text: Fast keyword matching | Semantic: AI-powered understanding of intent (finds similar concepts)

Search Results for "indexing"

Found 31 matching components

function add_document_to_graph_v1

Creates a Neo4j graph node for a processed document and connects it to a folder hierarchy, along with its text and table chunks.

File: /tf/active/vicechatdev/offline_docstore_multi_vice.py

neo4j graph-database document-management knowledge-graph cypher-query
class PowerPointProcessor

A class that processes PowerPoint (.pptx) presentations to extract text content and tables, converting tables to markdown format and organizing content by slides.

File: /tf/active/vicechatdev/leexi/enhanced_meeting_minutes_generator.py

powerpoint pptx document-processing text-extraction table-extraction
function index_documents_example

A demonstration function that indexes documents from a specified folder using a DocumentIndexer, creating the folder if it doesn't exist, and displays indexing results and collection statistics.

File: /tf/active/vicechatdev/docchat/example_usage.py

document-indexing example tutorial demonstration RAG
function main_v73

Orchestrates and executes a series of example demonstrations for the DocChat system, including document indexing, RAG queries, and conversation modes.

File: /tf/active/vicechatdev/docchat/example_usage.py

demo examples orchestration RAG document-chat
function build_document_tree_lazy

Builds a single-level document tree structure for lazy loading, scanning only immediate children of a target directory without recursively loading subdirectories.

File: /tf/active/vicechatdev/docchat/app.py

file-system directory-tree lazy-loading document-management file-browser
function build_document_tree_recursive

Recursively builds a complete hierarchical tree structure of documents and folders from a target directory path, filtering for supported file types and skipping hidden/cache directories.

File: /tf/active/vicechatdev/docchat/app.py

file-system directory-traversal recursive document-management tree-structure
function get_document_info

Retrieves indexing status and metadata for a document, including whether it's indexed, its document ID, chunk count, and reindexing status.

File: /tf/active/vicechatdev/docchat/app.py

document-management indexing metadata vector-database chromadb
function init_engines

Initializes the RAG (Retrieval-Augmented Generation) engine and document indexer components, loads persisted sessions, and optionally starts background auto-indexing of documents.

File: /tf/active/vicechatdev/docchat/app.py

initialization RAG document-indexing background-processing threading
function api_task_status

Flask API endpoint that retrieves and returns the status of asynchronous tasks (chat or indexing operations) by task ID.

File: /tf/active/vicechatdev/docchat/app.py

api flask rest-endpoint task-status async-polling
function api_upload

Flask API endpoint that handles file uploads, validates file types, saves files to a configured directory structure, and automatically indexes the uploaded document for search/retrieval.

File: /tf/active/vicechatdev/docchat/app.py

file-upload api-endpoint document-management rag indexing
function api_index_folder

Flask API endpoint that initiates a background task to index documents in a specified folder, tracking progress and returning a task ID for status monitoring.

File: /tf/active/vicechatdev/docchat/app.py

flask api endpoint background-task document-indexing
function api_index_progress

Flask API endpoint that retrieves the current progress status of an asynchronous indexing task by its task ID.

File: /tf/active/vicechatdev/docchat/app.py

flask api rest-endpoint progress-tracking async-task
function matches_source_filter

Checks if a document path matches any of the provided source filters using exact match, folder prefix match, path component sequence match, or filename match.

File: /tf/active/vicechatdev/docchat/rag_engine.py

path-matching file-filtering document-filtering path-normalization string-matching
function upload_document

Flask route handler that processes file uploads, saves them securely to disk, and indexes the document content for retrieval-augmented generation (RAG) search.

File: /tf/active/vicechatdev/docchat/blueprint.py

file-upload document-processing flask-route authentication RAG
function index_all_documents

Flask route handler that initiates background indexing of all documents in the system, creating a task ID for tracking progress and returning immediately while indexing continues asynchronously.

File: /tf/active/vicechatdev/docchat/blueprint.py

flask api-endpoint background-task document-indexing async-processing
function get_task_status

Flask API endpoint that retrieves the current status of a background task by its task ID from an in-memory active_tasks dictionary.

File: /tf/active/vicechatdev/docchat/blueprint.py

flask api rest-endpoint task-status background-task
function get_stats

Flask API endpoint that retrieves and returns statistics about a document collection from a RAG (Retrieval-Augmented Generation) system.

File: /tf/active/vicechatdev/docchat/blueprint.py

flask api endpoint statistics rag
class DocumentProcessor_v8

Processes different document types for indexing.

File: /tf/active/vicechatdev/docchat/document_indexer.py

class documentprocessor
class DocumentIndexer

A class for indexing documents into ChromaDB with support for multiple file formats (PDF, Word, PowerPoint, Excel, text files), smart incremental indexing, and document chunk management.

File: /tf/active/vicechatdev/docchat/document_indexer.py

document-indexing vector-database chromadb embeddings pdf-processing
function create_test_document

Creates a text file at the specified path with the given content, primarily used for testing purposes.

File: /tf/active/vicechatdev/docchat/test_incremental_indexing.py

testing file-creation test-utilities file-io document-creation
function test_incremental_indexing

Comprehensive test function that validates incremental indexing functionality of a document indexing system, including initial indexing, change detection, re-indexing, and force re-indexing scenarios.

File: /tf/active/vicechatdev/docchat/test_incremental_indexing.py

testing incremental-indexing document-indexing integration-test file-system
function extract_text_from_pdf

Extracts all text content from a PDF document and returns it as a string.

File: /tf/active/vicechatdev/CDocs/utils/pdf_utils.py

pdf text-extraction document-processing file-io pdf-parsing
function process_ellipses

Expands an Ellipsis (...) in a __getitem__ key by replacing it with the appropriate number of empty slices (slice(None)) to match the dimensions of an object.

File: /tf/active/vicechatdev/patches/util.py

indexing ellipsis slicing multi-dimensional data-structures
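
The ellipsis expansion described above can be illustrated with a minimal sketch. The function name `expand_ellipsis_sketch` and its `ndims` parameter are hypothetical and are not taken from `patches/util.py`; the real `process_ellipses` may take different arguments and handle more cases.

```python
def expand_ellipsis_sketch(key, ndims):
    """Replace a single Ellipsis in `key` with enough slice(None) entries
    so that the key addresses `ndims` dimensions (hypothetical helper)."""
    # Normalize a bare key such as obj[...] or obj[3] into a tuple.
    if not isinstance(key, tuple):
        key = (key,)

    # Use identity checks so array-like key entries are never compared with ==.
    positions = [i for i, k in enumerate(key) if k is Ellipsis]
    if not positions:
        return key
    if len(positions) > 1:
        raise ValueError("a key may contain at most one Ellipsis")

    pos = positions[0]
    # The Ellipsis stands in for every dimension not named explicitly.
    fill = ndims - (len(key) - 1)
    if fill < 0:
        raise IndexError("key has more entries than the object has dimensions")
    return key[:pos] + (slice(None),) * fill + key[pos + 1:]
```

For a three-dimensional object, a key of `(0, ...)` would expand to `(0, slice(None), slice(None))` under these assumptions.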
function int_to_alpha

Converts a non-negative integer to an Excel-style alphabetic column label (A, B, C, ..., Z, AA, AB, ..., ZZ, AAA, etc.).

File: /tf/active/vicechatdev/patches/util.py

string-conversion alphabetic-labels excel-columns base-26 spreadsheet
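
One way to produce such labels is bijective base-26 conversion, sketched below. The name `int_to_alpha_sketch` is hypothetical, and the sketch assumes that 0 maps to A; the actual `int_to_alpha` in `patches/util.py` may use a different algorithm or signature.

```python
def int_to_alpha_sketch(n: int) -> str:
    """Convert a non-negative integer to an Excel-style column label.

    Assumption: numbering starts at 0, so 0 maps to 'A' and 26 maps to 'AA'.
    """
    if n < 0:
        raise ValueError("n must be non-negative")

    label = ""
    n += 1  # shift to 1-based so the bijective base-26 arithmetic works
    while n > 0:
        # Subtracting 1 before dividing is what makes the system "bijective":
        # there is no zero digit, so 'Z' is followed by 'AA' rather than 'BA'.
        n, remainder = divmod(n - 1, 26)
        label = chr(ord("A") + remainder) + label
    return label
```

Under the zero-based assumption, 25 gives `Z`, 26 gives `AA`, and 702 gives `AAA`.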
function group_select

Recursively groups a list of key tuples into a nested dictionary structure to optimize indexing operations by avoiding duplicate key lookups.

File: /tf/active/vicechatdev/patches/util.py

data-structures optimization indexing grouping recursion
function get_spec

Extracts a specification tuple from a labeled data object, consisting of the class name, group, and label attributes.

File: /tf/active/vicechatdev/patches/util.py

metadata specification data-object identification holoviews
function cross_index

Efficiently indexes into a Cartesian product of iterables without materializing the full product, using a linear index to retrieve the corresponding tuple of values.

File: /tf/active/vicechatdev/patches/util.py

cartesian-product indexing combinatorics memory-efficient itertools-alternative
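
The idea of indexing a Cartesian product by a linear index can be sketched as a mixed-radix decomposition. The name `cross_index_sketch` is hypothetical, and the sketch assumes the same ordering as `itertools.product` (last sequence varies fastest); the real `cross_index` may differ in signature and error handling.

```python
def cross_index_sketch(values, index):
    """Return the `index`-th tuple of the Cartesian product of `values`
    without building the product (hypothetical helper).

    `values` is a list of sequences that support len() and indexing.
    """
    lengths = [len(v) for v in values]
    total = 1
    for n in lengths:
        total *= n
    if not 0 <= index < total:
        raise IndexError("index out of range for the Cartesian product")

    picks = []
    # Peel off one "digit" per sequence, starting with the fastest-varying
    # (last) sequence, exactly like converting a number to mixed radix.
    for seq, n in zip(reversed(values), reversed(lengths)):
        index, pos = divmod(index, n)
        picks.append(seq[pos])
    return tuple(reversed(picks))
```

With `values = [[1, 2], ["a", "b", "c"]]`, index 4 yields `(2, "b")`, which matches the fifth element of `itertools.product(*values)`.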
function search_indices

Finds the indices of specified values within a source array by using sorted search for efficient lookup.

File: /tf/active/vicechatdev/patches/util.py

array-search indexing numpy data-lookup sorting
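
The sorted-search lookup can be sketched as follows. The listing's tags suggest the original works on NumPy arrays; this sketch deliberately swaps in the standard-library `bisect` module and plain lists to stay dependency-free, and the name `search_indices_sketch` is hypothetical.

```python
from bisect import bisect_left


def search_indices_sketch(values, source):
    """Return, for each item in `values`, its position in `source`.

    Sorts the source once, then binary-searches each value, so the cost is
    O((n + m) log n) rather than O(n * m) for repeated linear scans.
    """
    # `order` maps positions in the sorted view back to original positions.
    order = sorted(range(len(source)), key=source.__getitem__)
    sorted_source = [source[i] for i in order]

    result = []
    for value in values:
        pos = bisect_left(sorted_source, value)
        # Assumption: a missing value is an error rather than a silent miss.
        if pos == len(sorted_source) or sorted_source[pos] != value:
            raise ValueError(f"{value!r} not found in source")
        result.append(order[pos])
    return result
```

For `source = [30, 10, 20]` and `values = [20, 30]`, the sketch returns `[2, 0]`.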
class DynamicMap

A DynamicMap is a type of HoloMap where the elements are dynamically generated by a callable. The callable is invoked with values associated with the key dimensions or with values supplied by stream parameters.

File: /tf/active/vicechatdev/patches/spaces.py

class dynamicmap
class GridSpace

GridSpace is a container class for organizing elements in a 1D or 2D grid structure with floating-point keys, ensuring all contained elements are of the same type.

File: /tf/active/vicechatdev/patches/spaces.py

grid layout container mapping 2D-grid
function scan_wuxi2_folder

Recursively scans a wuxi2 folder for PDF documents, extracts document codes from filenames, and organizes them into a dictionary mapping codes to file information.

File: /tf/active/vicechatdev/mailsearch/compare_documents.py

file-scanning directory-traversal pdf-processing document-indexing code-extraction

Search Examples

validation - Find validation functions
database - Find database-related components
email - Find email processing functions
api - Find API-related components
authentication - Find auth components
