function api_index_progress
Flask API endpoint that retrieves the current progress status of an asynchronous indexing task by its task ID.
/tf/active/vicechatdev/docchat/app.py
1404 - 1422
simple
Purpose
This endpoint lets clients poll the progress of a document indexing operation: task status, progress percentage, the file currently being processed, and the results once the task completes. The function reads the shared active_tasks dictionary while holding task_lock, and returns a 404 error if the task ID is not present.
Source Code
def api_index_progress(task_id):
    """Get progress of an indexing task"""
    with task_lock:
        if task_id not in active_tasks:
            return jsonify({'error': 'Task not found'}), 404
        task = active_tasks[task_id]
        response = {
            'status': task['status'],
            'progress': task['progress'],
            'current_file': task.get('current_file', ''),
            'processed': task.get('processed', 0),
            'total': task.get('total', 0)
        }
        if task['status'] == 'completed' and task.get('results'):
            response['results'] = task['results']
        return jsonify(response)
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| task_id | - | - | positional_or_keyword |
Parameter Details
task_id: String identifier for the indexing task. This ID is used to look up the task in the active_tasks dictionary. Expected to be a UUID or unique string generated when the indexing task was initiated.
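For illustration, a task entry of the shape this endpoint reads might be registered as sketched below. This is a minimal sketch, not the module's actual code: `register_task` is a hypothetical helper, the UUID4 format is an assumption, and the initial field values are inferred only from the keys the endpoint accesses.

```python
import uuid
from threading import Lock

task_lock = Lock()
active_tasks = {}


def register_task(total_files):
    """Hypothetical helper: create a task entry and return its ID."""
    task_id = str(uuid.uuid4())  # assumption: task IDs are UUID4 strings
    with task_lock:
        active_tasks[task_id] = {
            'status': 'processing',  # read with task['status'], so required
            'progress': 0,           # read with task['progress'], so required
            'current_file': '',
            'processed': 0,
            'total': total_files,
        }
    return task_id


task_id = register_task(total_files=20)
print(task_id in active_tasks)
```

The returned ID is the value a client would then place in the request path when polling.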
Return Value
Returns a Flask JSON response. On success (200): an object containing 'status' (task state), 'progress' (percentage complete), 'current_file' (file being processed, '' if unset), 'processed' (number of files completed, 0 if unset), and 'total' (total files to process, 0 if unset). 'results' is added only when the status is 'completed' and the task holds a non-empty 'results' value. On failure (404): an object with the 'error' key set to 'Task not found'.
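The rules for which keys appear in the success payload can be checked without Flask. The sketch below mirrors the response-building logic from the source in a plain function; `build_progress_payload` is a hypothetical name, and the `{'indexed': 20}` results value is made up for the demonstration.

```python
def build_progress_payload(task):
    """Hypothetical pure function mirroring the endpoint's response logic."""
    payload = {
        'status': task['status'],      # raises KeyError if missing
        'progress': task['progress'],  # raises KeyError if missing
        'current_file': task.get('current_file', ''),
        'processed': task.get('processed', 0),
        'total': task.get('total', 0),
    }
    # 'results' is included only for completed tasks with a truthy value.
    if task['status'] == 'completed' and task.get('results'):
        payload['results'] = task['results']
    return payload


running = build_progress_payload({'status': 'processing', 'progress': 45})
done = build_progress_payload(
    {'status': 'completed', 'progress': 100, 'results': {'indexed': 20}})
empty = build_progress_payload(
    {'status': 'completed', 'progress': 100, 'results': {}})

print('results' in running, 'results' in done, 'results' in empty)
```

Note that a completed task with empty results produces a payload without the 'results' key, so clients should not assume the key is present whenever the status is 'completed'.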
Dependencies
- flask
- threading
Required Imports
from flask import jsonify
Usage Example
# Setup required globals
from flask import Flask, jsonify
from threading import Lock

app = Flask(__name__)
task_lock = Lock()
active_tasks = {
    'task-123': {
        'status': 'processing',
        'progress': 45,
        'current_file': 'document.pdf',
        'processed': 9,
        'total': 20
    }
}

@app.route('/api/index-progress/<task_id>', methods=['GET'])
def api_index_progress(task_id):
    with task_lock:
        if task_id not in active_tasks:
            return jsonify({'error': 'Task not found'}), 404
        task = active_tasks[task_id]
        response = {
            'status': task['status'],
            'progress': task['progress'],
            'current_file': task.get('current_file', ''),
            'processed': task.get('processed', 0),
            'total': task.get('total', 0)
        }
        if task['status'] == 'completed' and task.get('results'):
            response['results'] = task['results']
        return jsonify(response)

# Client usage:
# GET /api/index-progress/task-123
# Response: {"status": "processing", "progress": 45, "current_file": "document.pdf", "processed": 9, "total": 20}
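On the client side, polling this route with a growing delay between requests could be sketched as follows. This is an illustration under stated assumptions: `poll_until_done` and its parameters are hypothetical, `fetch_status` stands in for whatever HTTP call returns the decoded JSON body, and only 'completed' is treated as a terminal state because the source does not show any other.

```python
import time


def poll_until_done(fetch_status, initial_delay=0.5, max_delay=8.0,
                    max_attempts=20, terminal_states=('completed',),
                    sleep=time.sleep):
    """Poll fetch_status() with exponential backoff until the task finishes.

    fetch_status is a hypothetical callable returning the decoded JSON dict.
    """
    delay = initial_delay
    for _ in range(max_attempts):
        status = fetch_status()
        if status.get('status') in terminal_states:
            return status
        sleep(delay)
        delay = min(delay * 2, max_delay)  # double the wait, up to a cap
    raise TimeoutError('task did not finish within max_attempts polls')


# Demonstration with canned responses and a recording stand-in for sleep.
responses = iter([
    {'status': 'processing', 'progress': 45},
    {'status': 'processing', 'progress': 80},
    {'status': 'completed', 'progress': 100},
])
delays = []
final = poll_until_done(lambda: next(responses), sleep=delays.append)
print(final['progress'], delays)
```

Passing `sleep` as a parameter keeps the function testable without real waiting; in production the default `time.sleep` applies.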
Best Practices
- Always use the task_lock when accessing active_tasks to prevent race conditions in multi-threaded environments
- Use the .get() method with default values for optional task fields to avoid KeyError exceptions; 'status' and 'progress' are read with direct indexing, so every task entry must define them
- Return appropriate HTTP status codes (404 for not found, 200 for success)
- Only include 'results' in response when task is completed to avoid sending incomplete data
- Consider implementing task cleanup to remove old completed tasks from active_tasks to prevent memory leaks
- Clients should implement exponential backoff when polling this endpoint to reduce server load
- Validate or sanitize task IDs that come from user input before using them anywhere beyond the dictionary lookup (for example in logs or file paths) to prevent injection attacks
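The cleanup suggestion above could be approached along these lines. This is a minimal sketch, not part of the documented module: `cleanup_finished_tasks` is a hypothetical helper, and it assumes each completed task records a `finished_at` timestamp, a field the source does not show.

```python
import time
from threading import Lock

task_lock = Lock()
active_tasks = {}


def cleanup_finished_tasks(max_age_seconds=3600, now=None):
    """Remove completed tasks older than max_age_seconds; return removed IDs.

    Assumes each completed task stores a 'finished_at' timestamp
    (hypothetical field, not present in the documented source).
    """
    now = time.time() if now is None else now
    removed = []
    with task_lock:
        # Iterate over a copy so entries can be deleted during the loop.
        for task_id, task in list(active_tasks.items()):
            finished_at = task.get('finished_at')
            if (task.get('status') == 'completed'
                    and finished_at is not None
                    and now - finished_at > max_age_seconds):
                del active_tasks[task_id]
                removed.append(task_id)
    return removed


active_tasks['old'] = {'status': 'completed', 'progress': 100, 'finished_at': 0}
active_tasks['new'] = {'status': 'processing', 'progress': 10}
removed = cleanup_finished_tasks(max_age_seconds=3600, now=10_000)
print(removed, sorted(active_tasks))
```

Holding `task_lock` for the whole sweep keeps it consistent with the endpoint's own locking; how the sweep is scheduled (timer thread, periodic job, or on each request) is left open.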
Similar Components
AI-powered semantic similarity - components with related functionality:
- function api_task_status 81.4% similar
- function api_index_folder 80.2% similar
- function get_task_status 75.5% similar
- function index_all_documents 73.4% similar
- function text_chat_get_progress 71.5% similar