class MeetingMinutesGenerator_v1

Maturity: 37

A class that generates professional meeting minutes from meeting transcripts using either OpenAI's GPT-4o or Google's Gemini AI models.

File:
/tf/active/vicechatdev/advanced_meeting_minutes_generator.py
Lines:
38 - 255
Complexity:
moderate

Purpose

This class provides a complete solution for transforming raw meeting transcripts into structured, professional meeting minutes. It is specifically designed for pharmaceutical development team meetings but can be adapted for other contexts. The class handles transcript loading, metadata extraction (date, speakers, duration), prompt engineering, LLM-based content generation, and output saving. It supports two AI models (GPT-4o and Gemini) and manages API authentication for both. The generated minutes include executive summaries, structured agendas, discussion points organized by topic, action items in table format, and technical specifications.

Source Code

class MeetingMinutesGenerator:
    def __init__(self, model: str = "gpt-4o", api_key: Optional[str] = None):
        """Initialize the generator with specified model and API key."""
        self.model = model.lower()
        
        if self.model == "gpt-4o":
            if not OPENAI_AVAILABLE:
                raise Exception("OpenAI library not installed. Run: pip install openai")
            if not api_key:
                api_key = os.getenv('OPENAI_API_KEY')
            if not api_key:
                raise Exception("OpenAI API key not provided")
            self.client = openai.OpenAI(api_key=api_key)
            
        elif self.model == "gemini":
            if not GEMINI_AVAILABLE:
                raise Exception("Google Generative AI library not installed. Run: pip install google-generativeai")
            if not api_key:
                api_key = os.getenv('GEMINI_API_KEY')
            if not api_key:
                raise Exception("Gemini API key not provided")
            genai.configure(api_key=api_key)
            self.client = genai.GenerativeModel('gemini-2.0-flash-exp')
            
        else:
            raise Exception(f"Unsupported model: {model}. Choose 'gpt-4o' or 'gemini'")
        
    def load_transcript(self, file_path: str) -> str:
        """Load transcript from file."""
        try:
            with open(file_path, 'r', encoding='utf-8') as file:
                return file.read()
        except Exception as e:
            raise Exception(f"Error loading transcript: {e}")
    
    def parse_transcript_metadata(self, transcript: str) -> Dict[str, str]:
        """Extract meeting metadata from transcript."""
        # Extract the first date-like string from the transcript content (the filename is not inspected)
        date_match = re.search(r'(\d{4}-\d{2}-\d{2}|\d{8})', transcript)
        meeting_date = date_match.group(1) if date_match else datetime.now().strftime('%Y-%m-%d')
        
        # Extract unique speakers (filter out generic ones)
        speaker_pattern = r'^(.+) at \d+[h:]?\d*[:\-]\d+ - \d+[h:]?\d*[:\-]\d+'
        speakers = set()
        for line in transcript.split('\n'):
            line = line.strip()
            if not line:
                continue
            match = re.match(speaker_pattern, line)
            if match:
                speaker = match.group(1).strip()
                # Filter out generic speaker names and meeting rooms
                if (speaker and 
                    not re.match(r'^Speaker \d+$', speaker) and 
                    speaker != 'Vice Lln Level0 58d Meeting Room'):
                    speakers.add(speaker)
        
        return {
            'date': meeting_date,
            'speakers': sorted(list(speakers)),  # Sort for consistent output
            'duration': self._extract_duration(transcript)
        }
    
    def _extract_duration(self, transcript: str) -> str:
        """Extract meeting duration from transcript."""
        time_pattern = r'(\d+[h:]?\d+[:\-]\d+)'
        times = re.findall(time_pattern, transcript)
        if len(times) >= 2:
            start_time = times[0]
            end_time = times[-1]
            return f"{start_time} - {end_time}"
        return "Duration not available"
    
    def _create_prompt(self, transcript: str, metadata: Dict, meeting_title: str) -> str:
        """Create the prompt for LLM processing."""
        return f"""You are an expert meeting secretary tasked with creating professional meeting minutes from a transcript of a pharmaceutical development team meeting.

Transform the following meeting transcript into well-structured meeting minutes with these sections:

## Meeting Information
- **Title:** {meeting_title}
- **Date:** {metadata['date']}
- **Duration:** {metadata['duration']}
- **Attendees:** {', '.join(metadata['speakers']) if metadata['speakers'] else 'Multiple participants (see transcript)'}

## Executive Summary
Provide a brief overview of the meeting's main purpose and key outcomes (2-3 sentences).

## Meeting Agenda
Based on the topics discussed in the transcript, create a structured agenda with numbered items:
1. [Main topic 1 - e.g., Preclinical Publications & IP Updates]
2. [Main topic 2 - e.g., Clinical Development Plan Review]
3. [Main topic 3 - e.g., Study Design Modifications]
4. [Additional topics as identified from transcript]

## Meeting Discussion by Agenda Item

### 1. [Agenda Item Title]
**Summary:** [2-3 sentence summary of the key discussion points, decisions, and outcomes for this agenda item]

**Key Points:**
- [Bullet point 1]
- [Bullet point 2]
- [Additional relevant details]

**Decisions Made:**
- [Specific decision 1 with rationale if provided]
- [Specific decision 2 with rationale if provided]

### 2. [Agenda Item Title]
**Summary:** [2-3 sentence summary of the key discussion points, decisions, and outcomes for this agenda item]

**Key Points:**
- [Bullet point 1]
- [Bullet point 2]
- [Additional relevant details]

**Decisions Made:**
- [Specific decision 1 with rationale if provided]
- [Specific decision 2 with rationale if provided]

[Continue this pattern for all agenda items identified]

## Action Items

| Priority | Action Item | Responsible Party | Deadline | Status | Notes |
|----------|-------------|-------------------|----------|---------|-------|
| High | [Action description] | [Name/Team] | [Date/Timeline] | Open | [Additional context] |
| Medium | [Action description] | [Name/Team] | [Date/Timeline] | Open | [Additional context] |
| Low | [Action description] | [Name/Team] | TBD | Open | [Additional context] |

## Next Steps & Follow-up Meetings
- **Upcoming Meetings:** [List scheduled meetings with dates/purposes]
- **Outstanding Issues:** [Items requiring further discussion or resolution]
- **Future Planning:** [Long-term items or considerations]

## Technical Specifications & Key Data
- **Dosages & Formulations:** [Key technical specifications discussed]
- **Study Parameters:** [Important study design elements]
- **Timeline Impacts:** [Critical dates and dependencies]

**Instructions for Processing:**
1. First, analyze the entire transcript to identify 4-6 main agenda topics based on discussion flow
2. Create a logical agenda structure that reflects the meeting's natural progression
3. Organize all content under the appropriate agenda items with chapter-like summaries
4. Extract action items and format them in a clear table with priority levels (High/Medium/Low)
5. Clean up conversational language into professional meeting language
6. Ignore technical difficulties, off-topic chatter, and multilingual fragments
7. Use clear, professional pharmaceutical industry language
8. Be specific about dosages, study phases, and technical details
9. Maintain context of vaccine development discussions
10. Ensure each agenda chapter has a concise summary plus detailed key points

**Meeting Transcript:**
{transcript}

Generate comprehensive meeting minutes following the structure above, focusing on the pharmaceutical development context and organizing content by agenda chapters."""

    def generate_meeting_minutes_gpt4o(self, transcript: str, meeting_title: str) -> str:
        """Generate meeting minutes using GPT-4o."""
        metadata = self.parse_transcript_metadata(transcript)
        prompt = self._create_prompt(transcript, metadata, meeting_title)
        
        try:
            response = self.client.chat.completions.create(
                model="gpt-4o",
                messages=[
                    {
                        "role": "system", 
                        "content": "You are an expert pharmaceutical industry meeting secretary who creates clear, professional meeting minutes from transcripts. Focus on extracting key decisions, action items, and technical discussions while maintaining professional pharmaceutical industry terminology."
                    },
                    {"role": "user", "content": prompt}
                ],
                max_tokens=4000,
                temperature=0.3
            )
            
            return response.choices[0].message.content
        
        except Exception as e:
            raise Exception(f"Error generating meeting minutes with GPT-4o: {e}")
    
    def generate_meeting_minutes_gemini(self, transcript: str, meeting_title: str) -> str:
        """Generate meeting minutes using Gemini 2.0 Flash (gemini-2.0-flash-exp)."""
        metadata = self.parse_transcript_metadata(transcript)
        prompt = self._create_prompt(transcript, metadata, meeting_title)
        
        try:
            response = self.client.generate_content(
                prompt,
                generation_config=genai.types.GenerationConfig(
                    max_output_tokens=4000,
                    temperature=0.3,
                )
            )
            
            return response.text
        
        except Exception as e:
            raise Exception(f"Error generating meeting minutes with Gemini: {e}")
    
    def generate_meeting_minutes(self, transcript: str, meeting_title: str = "Development Team Meeting") -> str:
        """Generate meeting minutes using the configured model."""
        if self.model == "gpt-4o":
            return self.generate_meeting_minutes_gpt4o(transcript, meeting_title)
        elif self.model == "gemini":
            return self.generate_meeting_minutes_gemini(transcript, meeting_title)
        else:
            raise Exception(f"Unknown model: {self.model}")
    
    def save_minutes(self, minutes: str, output_path: str):
        """Save meeting minutes to file."""
        try:
            with open(output_path, 'w', encoding='utf-8') as file:
                file.write(minutes)
            print(f"Meeting minutes saved to: {output_path}")
        except Exception as e:
            raise Exception(f"Error saving meeting minutes: {e}")

Parameters

Name Type Default Kind
bases - -

Parameter Details

model: The AI model to use for generating meeting minutes. Accepts 'gpt-4o' (OpenAI) or 'gemini' (Google). Case-insensitive. Defaults to 'gpt-4o'. This determines which API client is initialized and which generation method is called.

api_key: Optional API key for the selected model. If not provided, the class attempts to read from environment variables (OPENAI_API_KEY for GPT-4o, GEMINI_API_KEY for Gemini). If neither is available, initialization raises an exception.
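
The key lookup described above can be sketched as a standalone helper. This is a minimal illustration, not the class's own code: `resolve_api_key` and `ENV_VARS` are hypothetical names, and the sketch raises `ValueError` where the class raises a generic `Exception`. The environment variable names are the ones read in `__init__`.

```python
import os
from typing import Optional

# Environment variable consulted for each supported model (names taken from __init__).
ENV_VARS = {"gpt-4o": "OPENAI_API_KEY", "gemini": "GEMINI_API_KEY"}


def resolve_api_key(model: str, api_key: Optional[str] = None) -> str:
    """Return the explicit key if given, otherwise fall back to the environment."""
    model = model.lower()  # the class lowercases the model name as well
    if model not in ENV_VARS:
        raise ValueError(f"Unsupported model: {model}. Choose 'gpt-4o' or 'gemini'")
    key = api_key or os.getenv(ENV_VARS[model])
    if not key:
        raise ValueError(f"No API key for {model}: pass api_key or set {ENV_VARS[model]}")
    return key
```

An explicit `api_key` argument always wins; the environment is consulted only when the argument is missing or empty.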

Return Value

Instantiating the class yields a MeetingMinutesGenerator whose client attribute holds an initialized API client (either openai.OpenAI or genai.GenerativeModel); __init__ itself returns None. The generate_meeting_minutes method returns the model's response text as a string. The prompt requests Markdown with sections for meeting information, executive summary, agenda, discussion by agenda item, action items, next steps, and technical specifications, but the output is not validated against that structure. The load_transcript method returns the raw transcript as a string. The parse_transcript_metadata method returns a dictionary with keys 'date', 'speakers', and 'duration'.

Class Interface

Methods

__init__(self, model: str = 'gpt-4o', api_key: Optional[str] = None)

Purpose: Initialize the generator with specified AI model and API credentials, setting up the appropriate API client

Parameters:

  • model: AI model to use ('gpt-4o' or 'gemini'), defaults to 'gpt-4o'
  • api_key: Optional API key; if None, reads from environment variables

Returns: None (constructor)

load_transcript(self, file_path: str) -> str

Purpose: Load and return the contents of a transcript file

Parameters:

  • file_path: Path to the transcript text file (UTF-8 encoded)

Returns: String containing the full transcript text

parse_transcript_metadata(self, transcript: str) -> Dict[str, str]

Purpose: Extract meeting metadata including date, speakers, and duration from the transcript text

Parameters:

  • transcript: Raw transcript text to parse

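Returns: Dictionary with keys 'date' (the first YYYY-MM-DD or 8-digit match found in the transcript, returned as matched; today's date in YYYY-MM-DD format if there is no match), 'speakers' (sorted list of unique speaker names; this value is a list even though the annotation is Dict[str, str]), and 'duration' (time range string)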
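
The speaker and date extraction can be tried in isolation. The sketch below copies the regular expressions and the room-name filter from the source above; the helper names `extract_speakers` and `extract_date` and the sample transcript are hypothetical, and `extract_date` returns None where the class falls back to today's date.

```python
import re
from typing import List, Optional

# Patterns and room name copied from parse_transcript_metadata in the source above.
SPEAKER_PATTERN = r'^(.+) at \d+[h:]?\d*[:\-]\d+ - \d+[h:]?\d*[:\-]\d+'
DATE_PATTERN = r'(\d{4}-\d{2}-\d{2}|\d{8})'
ROOM_NAME = 'Vice Lln Level0 58d Meeting Room'


def extract_speakers(transcript: str) -> List[str]:
    """Collect named speakers, skipping 'Speaker N' placeholders and the room name."""
    speakers = set()
    for line in transcript.split('\n'):
        match = re.match(SPEAKER_PATTERN, line.strip())
        if not match:
            continue
        name = match.group(1).strip()
        if name and not re.match(r'^Speaker \d+$', name) and name != ROOM_NAME:
            speakers.add(name)
    return sorted(speakers)


def extract_date(transcript: str) -> Optional[str]:
    """Return the first date-like match, or None if the transcript has none."""
    match = re.search(DATE_PATTERN, transcript)
    return match.group(1) if match else None


# Hypothetical transcript excerpt, invented for illustration.
sample = """Recorded 2024-03-15
Alice Smith at 10:00 - 10:05
Welcome everyone.
Speaker 1 at 10:05 - 10:06
Can you hear me?
Bob Jones at 10:06 - 10:58
Yes, let's start.
"""

print(extract_speakers(sample))  # ['Alice Smith', 'Bob Jones']
print(extract_date(sample))      # 2024-03-15
```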

_extract_duration(self, transcript: str) -> str

Purpose: Internal method to extract meeting start and end times from transcript

Parameters:

  • transcript: Raw transcript text containing timestamps

Returns: String in format 'start_time - end_time' or 'Duration not available'
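
A minimal sketch of the same timestamp logic, using the pattern copied from `_extract_duration`. The function name `extract_time_range` and the sample lines are hypothetical. The second call shows a property of the pattern worth knowing: the leading part of an ISO date also matches, so a date that appears before the first timestamp is reported as the start.

```python
import re

# Timestamp pattern copied from _extract_duration in the source above.
TIME_PATTERN = r'(\d+[h:]?\d+[:\-]\d+)'


def extract_time_range(transcript: str) -> str:
    """Return 'first - last' timestamp match, mirroring _extract_duration."""
    times = re.findall(TIME_PATTERN, transcript)
    if len(times) >= 2:
        return f"{times[0]} - {times[-1]}"
    return "Duration not available"


# Hypothetical speaker lines, invented for illustration.
lines = "Alice Smith at 10:00 - 10:05\nBob Jones at 10:06 - 10:58\n"
print(extract_time_range(lines))                            # 10:00 - 10:58

# '2024-03' satisfies the pattern, so the date prefix becomes the start value.
print(extract_time_range("Recorded 2024-03-15\n" + lines))  # 2024-03 - 10:58
```

If transcripts carry a date header, stripping it before calling the method, or tightening the pattern, would avoid the second result.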

_create_prompt(self, transcript: str, metadata: Dict, meeting_title: str) -> str

Purpose: Internal method to construct the detailed prompt for the LLM with instructions and transcript

Parameters:

  • transcript: Raw transcript text
  • metadata: Dictionary containing meeting metadata (date, speakers, duration)
  • meeting_title: Title of the meeting

Returns: Formatted prompt string with instructions and transcript for LLM processing

generate_meeting_minutes_gpt4o(self, transcript: str, meeting_title: str) -> str

Purpose: Generate meeting minutes using OpenAI's GPT-4o model

Parameters:

  • transcript: Raw transcript text to process
  • meeting_title: Title of the meeting for the minutes header

Returns: Formatted meeting minutes as a Markdown string

generate_meeting_minutes_gemini(self, transcript: str, meeting_title: str) -> str

Purpose: Generate meeting minutes using Google's Gemini 2.0 Flash model

Parameters:

  • transcript: Raw transcript text to process
  • meeting_title: Title of the meeting for the minutes header

Returns: Formatted meeting minutes as a Markdown string

generate_meeting_minutes(self, transcript: str, meeting_title: str = 'Development Team Meeting') -> str

Purpose: Main method to generate meeting minutes using the configured model (routes to model-specific method)

Parameters:

  • transcript: Raw transcript text to process
  • meeting_title: Title of the meeting, defaults to 'Development Team Meeting'

Returns: Formatted meeting minutes as a Markdown string with sections for summary, agenda, discussions, action items, and next steps

save_minutes(self, minutes: str, output_path: str)

Purpose: Save generated meeting minutes to a file and print confirmation

Parameters:

  • minutes: Formatted meeting minutes string to save
  • output_path: File path where minutes should be saved (typically .md extension)

Returns: None (prints confirmation message to stdout)

Attributes

| Name | Type | Description | Scope |
|------|------|-------------|-------|
| model | str | Lowercase name of the AI model being used ('gpt-4o' or 'gemini') | instance |
| client | Union[openai.OpenAI, genai.GenerativeModel] | Initialized API client for the selected model: either an OpenAI client or a Gemini GenerativeModel instance | instance |

Dependencies

  • os
  • re
  • datetime
  • typing
  • argparse
  • openai
  • google-generativeai

Required Imports

import os
import re
from datetime import datetime
from typing import Dict, Optional

Conditional/Optional Imports

These imports are only needed under specific conditions:

import openai

Condition: optional; only needed when using the GPT-4o model (model='gpt-4o')

import google.generativeai as genai

Condition: optional; only needed when using the Gemini model (model='gemini')
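
The constructor checks two module-level flags, OPENAI_AVAILABLE and GEMINI_AVAILABLE, that are not defined in the excerpt above. A common way to set such flags is a guarded import, sketched here as an assumption about the surrounding module rather than a copy of it:

```python
# Guarded imports: each SDK is optional, so record whether it could be loaded.
try:
    import openai
    OPENAI_AVAILABLE = True
except ImportError:
    OPENAI_AVAILABLE = False

try:
    import google.generativeai as genai
    GEMINI_AVAILABLE = True
except ImportError:
    GEMINI_AVAILABLE = False
```

With this pattern the module imports cleanly when only one SDK is installed, and the missing one is reported only when the corresponding model is requested.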

Usage Example

# Basic usage with GPT-4o
import os
os.environ['OPENAI_API_KEY'] = 'your-api-key-here'

# Initialize generator
generator = MeetingMinutesGenerator(model='gpt-4o')

# Load transcript from file
transcript = generator.load_transcript('meeting_transcript.txt')

# Generate meeting minutes
minutes = generator.generate_meeting_minutes(
    transcript=transcript,
    meeting_title='Q1 Development Team Meeting'
)

# Save to file
generator.save_minutes(minutes, 'meeting_minutes.md')

# Alternative: Use Gemini model
generator_gemini = MeetingMinutesGenerator(
    model='gemini',
    api_key='your-gemini-api-key'
)
minutes_gemini = generator_gemini.generate_meeting_minutes(transcript)

# Extract metadata only
metadata = generator.parse_transcript_metadata(transcript)
print(f"Meeting date: {metadata['date']}")
print(f"Speakers: {metadata['speakers']}")

Best Practices

  • Always provide an API key either through the constructor parameter or environment variables before instantiation
  • Install the required library (openai or google-generativeai) before initializing with the corresponding model
  • The class is stateless after initialization - you can reuse the same instance for multiple transcript processing operations
  • Transcript files should be UTF-8 encoded text files for proper loading
  • The parse_transcript_metadata method expects transcripts with speaker names in the format 'Speaker Name at HH:MM - HH:MM'
  • Generated minutes are in Markdown format - save with .md extension for proper rendering
  • The model parameter is case-insensitive but must be exactly 'gpt-4o' or 'gemini'
  • For pharmaceutical contexts, the default prompt is optimized; for other domains, consider modifying the _create_prompt method
  • The temperature is set to 0.3 for consistent, factual output - modify in the generation methods if more creativity is needed
  • The prompt asks the LLM to format action items as a table, but the class does not validate the response - check that the generated minutes include this section
  • The class filters out generic speaker names like 'Speaker 1' and meeting room names from the attendee list
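
Because the class returns the LLM response unchecked, a small post-generation check can confirm that the action items table is present. The helper below is a hypothetical sketch, not part of the class; it assumes the response keeps the '## Action Items' heading and pipe-delimited rows requested by the prompt.

```python
def has_action_items_table(minutes: str) -> bool:
    """Return True if an '## Action Items' heading is followed by a Markdown table row."""
    in_section = False
    for line in minutes.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # Entering a new level-2 section; track whether it is the action items one.
            in_section = stripped.lower().startswith("## action items")
            continue
        if in_section and stripped.startswith("|"):
            return True
    return False
```

A caller could run this on the string returned by generate_meeting_minutes and regenerate or flag the minutes when it returns False.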

Similar Components

AI-powered semantic similarity - components with related functionality:

  • class MeetingMinutesGenerator 92.4% similar

    A class that generates professional meeting minutes from meeting transcripts using OpenAI's GPT-4o model, with capabilities to parse metadata, extract action items, and format output.

    From: /tf/active/vicechatdev/meeting_minutes_generator.py
  • function main_v14 79.5% similar

    Command-line interface function that orchestrates the generation of meeting minutes from a transcript file using either GPT-4o or Gemini LLM models.

    From: /tf/active/vicechatdev/advanced_meeting_minutes_generator.py
  • function main_v27 75.5% similar

    Entry point function that orchestrates the process of loading a meeting transcript, generating structured meeting minutes using OpenAI's GPT-4o API, and saving the output to a file.

    From: /tf/active/vicechatdev/meeting_minutes_generator.py
  • function main_v2 73.5% similar

    Command-line interface function that orchestrates the generation of enhanced meeting minutes from transcript files and PowerPoint presentations using various LLM models (GPT-4o, Azure GPT-4o, or Gemini).

    From: /tf/active/vicechatdev/leexi/enhanced_meeting_minutes_generator.py
  • function generate_minutes 70.3% similar

    Flask route handler that processes uploaded meeting transcripts and optional supporting documents to generate structured meeting minutes using AI, with configurable output styles and validation.

    From: /tf/active/vicechatdev/leexi/app.py