extract_total_references - Code Extractor

function extract_total_references

Maturity: 41

Extracts the total count of references from markdown-formatted content by first checking for a header line with the total, then falling back to manually counting reference entries.

File:
/tf/active/vicechatdev/enhanced_word_converter_fixed.py

Lines:
73 - 88

Complexity:
simple

Purpose

This function parses markdown documents that contain bibliographic references and determines how many references are present. It works in two stages: it first looks for an explicit '**Total References**:' header line and returns the integer that follows it; if no such line exists or its value cannot be parsed, it counts the lines that start with '**[' and contain ']**'. This is useful for document processing pipelines that need to validate or report reference counts in markdown-formatted academic or technical documents.

Source Code

def extract_total_references(markdown_content):
    """Extract total number of references from the markdown content"""
    lines = markdown_content.split('\n')
    for line in lines:
        if line.startswith('**Total References**:'):
            try:
                return int(line.split(':')[1].strip())
            except:
                pass
    
    # Count references manually if not found in header
    ref_count = 0
    for line in lines:
        if line.startswith('**[') and ']**' in line:
            ref_count += 1
    return ref_count

Parameters

Name	Type	Default	Kind
`markdown_content`	-	-	positional_or_keyword

Parameter Details

markdown_content: A string containing markdown-formatted text. It is expected to contain reference entries formatted as '**[reference_id]**' at the start of a line, or a header line '**Total References**: N' where N is an integer. The content may span multiple lines separated by newline characters ('\n').

Return Value

Returns an integer representing the total number of references found in the markdown content. If a '**Total References**:' header is found and successfully parsed, that value is returned as-is; it is not cross-checked against the number of reference entries actually present. Otherwise, the function returns the count of lines that start with '**[' and contain ']**'. Returns 0 if no references are found.
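
The return rules above can be checked with a few edge cases. The snippet below is a minimal sketch: the function body is reproduced from the Source Code section so the snippet runs on its own, and the three input strings are made-up illustrations, not data from the original module.

```python
def extract_total_references(markdown_content):
    """Copy of the documented function, repeated here for a standalone run."""
    lines = markdown_content.split('\n')
    for line in lines:
        if line.startswith('**Total References**:'):
            try:
                return int(line.split(':')[1].strip())
            except:
                pass

    # Count references manually if not found in header
    ref_count = 0
    for line in lines:
        if line.startswith('**[') and ']**' in line:
            ref_count += 1
    return ref_count


# Illustrative input 1: the header disagrees with the single entry below it.
# The header value is returned without any cross-check.
header_wins = extract_total_references(
    "**Total References**: 5\n**[1]** Only one entry."
)

# Illustrative input 2: the header value is not an integer, so the
# function falls back to counting the entry lines.
fallback_count = extract_total_references(
    "**Total References**: many\n**[1]** A.\n**[2]** B."
)

# Illustrative input 3: an indented header does not satisfy startswith,
# so it is ignored and the entry lines are counted instead.
indented_header = extract_total_references(
    "  **Total References**: 7\n**[1]** A."
)

print(header_wins, fallback_count, indented_header)
```

The first case is the one most likely to surprise callers: a stale header silently overrides the real number of entries.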

Usage Example

# Example 1: Markdown with explicit total
markdown_with_header = '''
**Total References**: 3

**[1]** Smith, J. (2020). Example Paper.
**[2]** Doe, J. (2021). Another Paper.
**[3]** Brown, A. (2022). Third Paper.
'''

total = extract_total_references(markdown_with_header)
print(f"Total references: {total}")  # Output: Total references: 3

# Example 2: Markdown without explicit total (manual count)
markdown_without_header = '''
**[1]** Smith, J. (2020). Example Paper.
**[2]** Doe, J. (2021). Another Paper.
'''

total = extract_total_references(markdown_without_header)
print(f"Total references: {total}")  # Output: Total references: 2

# Example 3: Empty or no references
empty_markdown = "Some text without references"
total = extract_total_references(empty_markdown)
print(f"Total references: {total}")  # Output: Total references: 0

Best Practices

Ensure markdown_content is a string; pass an empty string '' instead of None, because None has no split method and the call raises AttributeError
The header parse is wrapped in a bare except that silently swallows every exception, including KeyboardInterrupt and SystemExit; validate the input format beforehand, or narrow the handler to ValueError if you adapt the code
Lines are matched with startswith, so an entry is counted only if the line begins with '**[' and contains ']**'; entries with leading whitespace are skipped
The function assumes one reference per line; inline references are not counted
If the '**Total References**:' header exists but its value is not a valid integer, the function falls back to manual counting rather than raising an error
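
The points above could be addressed in a stricter variant. The following is a sketch only, not the implementation in enhanced_word_converter_fixed.py: the name extract_total_references_strict and both regular expressions are hypothetical, introduced here for illustration. It rejects non-string input explicitly and avoids the bare except by matching digits with a regular expression before converting.

```python
import re

# Hypothetical patterns, not part of the original module.
# Header: the exact label followed only by a non-negative integer.
_HEADER_RE = re.compile(r'^\*\*Total References\*\*:\s*(\d+)\s*$')
# Entry: a line that begins with a bold bracketed id such as **[1]**.
_ENTRY_RE = re.compile(r'^\*\*\[[^\]]+\]\*\*')


def extract_total_references_strict(markdown_content):
    """Stricter sketch of extract_total_references (assumed name)."""
    if not isinstance(markdown_content, str):
        # Fail loudly instead of raising AttributeError on None.
        raise TypeError("markdown_content must be a str")

    lines = markdown_content.splitlines()

    # Stage 1: trust the header only if its value is purely digits,
    # so int() cannot fail and no exception handler is needed.
    for line in lines:
        match = _HEADER_RE.match(line)
        if match:
            return int(match.group(1))

    # Stage 2: fall back to counting entry lines.
    return sum(1 for line in lines if _ENTRY_RE.match(line))


print(extract_total_references_strict("**Total References**: 3\n**[1]** A."))
print(extract_total_references_strict("**Total References**: n/a\n**[1]** A.\n**[2]** B."))
```

This variant is slightly narrower than the original: it requires the closing ']**' to follow the bracketed id directly, and it ignores headers with negative values or trailing text after the number. Whether that strictness is desirable depends on how the markdown is produced.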

Similar Components

AI-powered semantic similarity - components with related functionality:

function parse_references_section 66.4% similar

Parses a formatted references section string and extracts structured data including reference numbers, sources, and content previews using regular expressions.
From: /tf/active/vicechatdev/improved_convert_disclosures_to_table.py

function process_markdown_content_v1 48.3% similar

Parses markdown-formatted text content and converts it into a structured list of document elements (headers, paragraphs, lists, tables, code blocks) with their types and formatting preserved in original order.
From: /tf/active/vicechatdev/vice_ai/new_app.py

function process_markdown_content 48.3% similar

Parses markdown-formatted text content and converts it into a structured list of content elements with type annotations and formatting metadata suitable for document export.
From: /tf/active/vicechatdev/vice_ai/complex_app.py

function extract_warranty_data_improved 47.8% similar

Parses markdown-formatted warranty documentation to extract structured warranty data including IDs, titles, sections, disclosure text, and reference citations.
From: /tf/active/vicechatdev/improved_convert_disclosures_to_table.py

class ReferenceManager 47.0% similar

Manages document references for inline citation and bibliography generation in a RAG (Retrieval-Augmented Generation) system.
From: /tf/active/vicechatdev/fixed_project_victoria_generator.py
