
class OpenAIResponsesLLM

Maturity: 51

Adapter class for OpenAI's Responses API, specifically designed for GPT-5 family models with automatic fallback mechanisms to stable models when responses fail.

File:
/tf/active/vicechatdev/docchat/llm_factory.py
Lines:
22 - 72
Complexity:
moderate

Purpose

This class provides a robust interface to OpenAI's Responses API with a three-tier fallback strategy. It first attempts the Responses API (required for GPT-5 models); if that yields no content, it falls back to the Chat Completions API with GPT-5-specific parameters; and if that also fails, it routes the prompt to a stable model (GPT-4o by default). This keeps LLM responses reliable even when newer APIs misbehave.

Source Code

class OpenAIResponsesLLM:
    """Adapter using OpenAI Responses API (required/ideal for GPT-5 family)."""

    def __init__(self, model: str, api_key: Optional[str] = None, max_output_tokens: int = 4096, fallback_model: str = 'gpt-4o'):
        from openai import OpenAI  # lazy import
        self.client = OpenAI(api_key=api_key)
        self.model = model
        self.max_output_tokens = max_output_tokens
        self.fallback_model = fallback_model

    @property
    def model_name(self) -> str:
        return self.model

    def invoke(self, prompt: str) -> LLMMessage:
        # Primary: Responses API (text input). Guarded so that API errors
        # fall through to the same fallback chain as empty responses.
        content = None
        try:
            resp = self.client.responses.create(
                model=self.model,
                input=prompt,
                max_output_tokens=self.max_output_tokens,
            )
            content = getattr(resp, 'output_text', None)
            if not content:
                # Reassemble text from the structured output items
                parts = []
                for item in getattr(resp, 'output', []) or []:
                    if getattr(item, 'type', '') == 'message':
                        for c in getattr(item, 'content', []) or []:
                            if getattr(c, 'type', '') == 'output_text':
                                parts.append(getattr(c, 'text', '') or '')
                content = ''.join(parts)
        except Exception as e:
            logger.warning(f"Responses API call failed: {e}")

        # Fallback: try Chat Completions with extra_body override
        if not content:
            try:
                cc = self.client.chat.completions.create(
                    model=self.model,
                    messages=[{"role": "user", "content": prompt}],
                    # Omit temperature for GPT-5 (default only)
                    extra_body={"max_completion_tokens": self.max_output_tokens},
                )
                content = (cc.choices[0].message.content or '')
            except Exception as e:
                logger.warning(f"Fallback to chat.completions failed: {e}")

        # Last resort: route to stable GPT-4o
        if not content and self.fallback_model:
            logger.info(f"Responses returned empty. Falling back to {self.fallback_model}.")
            backup = OpenAIChatLLM(model=self.fallback_model, api_key=self.client.api_key, temperature=0, max_tokens=self.max_output_tokens)
            content = backup.invoke(prompt).content

        return LLMMessage(content=(content or '').strip())
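
The snippet references LLMMessage, OpenAIChatLLM, and logger, which are defined elsewhere in llm_factory.py. A minimal sketch of what those assumed definitions might look like (the real OpenAIChatLLM is the Chat Completions adapter listed under Similar Components; this stub only mirrors the constructor signature and invoke() call used above):

import logging
from dataclasses import dataclass
from typing import Optional

logger = logging.getLogger(__name__)

@dataclass
class LLMMessage:
    """Simple container for generated text; the real class may carry more fields."""
    content: str

class OpenAIChatLLM:
    """Stand-in for the Chat Completions adapter documented separately."""
    def __init__(self, model: str, api_key: Optional[str] = None, temperature: float = 0, max_tokens: int = 4096):
        from openai import OpenAI  # lazy import, matching the style above
        self.client = OpenAI(api_key=api_key)
        self.model = model
        self.temperature = temperature
        self.max_tokens = max_tokens

    def invoke(self, prompt: str) -> LLMMessage:
        cc = self.client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
            temperature=self.temperature,
            max_tokens=self.max_tokens,
        )
        return LLMMessage(content=cc.choices[0].message.content or '')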

Parameters

Name                Type            Default    Kind
model               str             required   positional-or-keyword
api_key             Optional[str]   None       positional-or-keyword
max_output_tokens   int             4096       positional-or-keyword
fallback_model      str             'gpt-4o'   positional-or-keyword

Parameter Details

model: The OpenAI model identifier to use (e.g., 'gpt-5' or another GPT-5 family model). This should be a model that supports the Responses API.

api_key: Optional OpenAI API key. If None, the OpenAI client will attempt to use the OPENAI_API_KEY environment variable.

max_output_tokens: Maximum number of tokens to generate in the response. Default is 4096. Controls the length of generated text.

fallback_model: Model to use as a last resort if all other methods fail. Default is 'gpt-4o', which is a stable, reliable model.

Return Value

Instantiating the class yields an OpenAIResponsesLLM object (the __init__ method itself returns None). The invoke() method returns an LLMMessage object containing the generated text content, and the model_name property returns the primary model identifier as a string.

Class Interface

Methods

__init__(self, model: str, api_key: Optional[str] = None, max_output_tokens: int = 4096, fallback_model: str = 'gpt-4o')

Purpose: Initializes the OpenAI Responses API adapter with configuration parameters and creates an OpenAI client instance

Parameters:

  • model: The primary OpenAI model to use (e.g., 'gpt-5')
  • api_key: Optional API key; if None, uses OPENAI_API_KEY environment variable
  • max_output_tokens: Maximum tokens to generate (default 4096)
  • fallback_model: Backup model to use if primary fails (default 'gpt-4o')

Returns: None (constructor)

model_name -> str (property)

Purpose: Returns the name of the primary model being used

Returns: String containing the model identifier (e.g., 'gpt-5')

invoke(self, prompt: str) -> LLMMessage

Purpose: Sends a prompt to the LLM and returns the generated response, with automatic fallback handling if the primary method fails

Parameters:

  • prompt: The text prompt/question to send to the language model

Returns: LLMMessage object containing the generated text content in its 'content' attribute

Attributes

Name                Type     Description                                                   Scope
client              OpenAI   The OpenAI client instance used to make API calls            instance
model               str      The primary model identifier used for generation             instance
max_output_tokens   int      Maximum number of tokens to generate in responses            instance
fallback_model      str      The backup model identifier used when primary methods fail   instance

Dependencies

  • openai
  • logging
  • typing
  • dataclasses

Required Imports

from typing import Optional
import logging

logger = logging.getLogger(__name__)  # module-level logger referenced by invoke()

Conditional/Optional Imports

These imports are only needed under specific conditions:

from openai import OpenAI

Condition: required for instantiation and API calls; lazily imported inside __init__. Required (conditional).

from dataclasses import dataclass

Condition: required only if LLMMessage is a dataclass defined elsewhere in the codebase. Required (conditional).

Usage Example

# Basic usage (the class is defined in docchat/llm_factory.py)
from llm_factory import OpenAIResponsesLLM

# Instantiate with API key
llm = OpenAIResponsesLLM(
    model='gpt-5',
    api_key='your-api-key-here',
    max_output_tokens=2048,
    fallback_model='gpt-4o'
)

# Get model name
print(llm.model_name)  # Output: 'gpt-5'

# Invoke the model with a prompt
prompt = 'Explain quantum computing in simple terms.'
response = llm.invoke(prompt)
print(response.content)

# Using environment variable for API key
import os
os.environ['OPENAI_API_KEY'] = 'your-api-key'
llm_auto = OpenAIResponsesLLM(model='gpt-5')
response = llm_auto.invoke('What is machine learning?')
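
Because the last-resort tier can still come back empty, invoke() may return an LLMMessage whose content is an empty string (the method returns (content or '').strip()). A defensive caller can check for that explicitly:

response = llm.invoke('Summarize the attached report.')
if not response.content:
    # All three tiers (Responses API, Chat Completions, fallback model) produced no text
    raise RuntimeError('LLM returned no content after exhausting fallbacks')
print(response.content)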

Best Practices

  • Always provide an API key either through the constructor or OPENAI_API_KEY environment variable
  • The class implements a three-tier fallback strategy: Responses API → Chat Completions API → Stable fallback model
  • Set max_output_tokens appropriately based on your use case to control costs and response length
  • The fallback_model should be a stable, well-tested model (default gpt-4o is recommended)
  • Monitor logs for fallback warnings to detect when the primary Responses API is failing (see the logging sketch after this list)
  • The invoke() method is synchronous and will block until a response is received
  • Empty responses trigger the automatic fallback chain; you get content whenever any tier succeeds, and an empty string only if all three fail
  • The class creates a new OpenAI client instance on initialization, so reuse the same instance for multiple invocations
  • Temperature is intentionally omitted for GPT-5 models (uses default only)
  • The class depends on external LLMMessage and OpenAIChatLLM classes that must be available in the module
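
Since the fallback paths surface only through logger.warning and logger.info calls, a simple way to monitor them is to enable logging before using the adapter (a minimal sketch; handler and level choices will vary by deployment):

import logging

# Surface the adapter's fallback warnings/info messages on stderr
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s %(name)s %(levelname)s %(message)s',
)

llm = OpenAIResponsesLLM(model='gpt-5')
llm.invoke('ping')  # any fallback activity will now appear in the log output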

Similar Components

AI-powered semantic similarity - components with related functionality:

  • class OpenAIChatLLM 80.1% similar

    Adapter class for interacting with OpenAI's Chat Completions API, supporting both GPT-4 and GPT-5 model families with automatic parameter adjustment based on model type.

    From: /tf/active/vicechatdev/docchat/llm_factory.py
  • class AzureOpenAIChatLLM 69.3% similar

    Adapter class for interacting with Azure OpenAI's Chat Completions API, providing a simplified interface for generating chat responses using Azure-hosted OpenAI models.

    From: /tf/active/vicechatdev/docchat/llm_factory.py
  • class GPT5Validator 68.5% similar

    A comprehensive testing and validation class for OpenAI GPT models, with special support for GPT-5 family models using the Responses API.

    From: /tf/active/vicechatdev/docchat/test_gpt5_readiness.py
  • class LLMClient_v1 67.0% similar

    Multi-LLM client that provides a unified interface for interacting with OpenAI GPT-4o, Azure OpenAI, Google Gemini, and Anthropic Claude models.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • class LLMClient_v1 64.9% similar

    A client class for interacting with Large Language Models (LLMs), specifically designed to work with OpenAI's chat completion API.

    From: /tf/active/vicechatdev/QA_updater/core/llm_client.py