function extract_total_references
Extracts the total count of references from markdown-formatted content by first checking for a header line with the total, then falling back to manually counting reference entries.
/tf/active/vicechatdev/enhanced_word_converter_fixed.py
73 - 88
simple
Purpose
This function parses markdown content that contains bibliographic references and returns the total number of references. It works in two stages. First it looks for a line starting with '**Total References**:' and returns the integer that follows the colon. If no such line exists, or its value cannot be parsed as an integer, it counts the lines that start with '**[' and contain ']**'. This is useful for document processing pipelines that need to validate or report reference counts in markdown-formatted academic or technical documents.
Source Code
def extract_total_references(markdown_content):
    """Extract total number of references from the markdown content"""
    lines = markdown_content.split('\n')
    for line in lines:
        if line.startswith('**Total References**:'):
            try:
                return int(line.split(':')[1].strip())
            except:
                pass
    # Count references manually if not found in header
    ref_count = 0
    for line in lines:
        if line.startswith('**[') and ']**' in line:
            ref_count += 1
    return ref_count
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| markdown_content | - | - | positional_or_keyword |
Parameter Details
markdown_content: A string containing markdown-formatted text. Expected to contain references formatted as '**[reference_id]**' or a header line '**Total References**: N' where N is an integer. Can be multi-line content with newline characters separating lines.
Return Value
Returns an integer representing the total number of references found in the markdown content. If a '**Total References**:' header is found and successfully parsed, returns that value. Otherwise, returns the count of lines matching the reference pattern '**[...]**'. Returns 0 if no references are found.
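The return rules above have a few edge cases that are easy to miss, so the following sketch exercises them. The function body is copied from the Source Code section so the snippet runs on its own; the sample strings and variable names are made up for illustration.

```python
def extract_total_references(markdown_content):
    """Copy of the documented function, repeated so this example is self-contained."""
    lines = markdown_content.split('\n')
    for line in lines:
        if line.startswith('**Total References**:'):
            try:
                return int(line.split(':')[1].strip())
            except:
                pass
    ref_count = 0
    for line in lines:
        if line.startswith('**[') and ']**' in line:
            ref_count += 1
    return ref_count


# The header value is returned as-is, even when it disagrees with the entries present.
mismatch = "**Total References**: 5\n**[1]** Smith, J. (2020). Example Paper."
header_wins = extract_total_references(mismatch)

# An unparseable header (int('N/A') raises ValueError) triggers the manual count.
malformed = "**Total References**: N/A\n**[1]** Smith, J.\n**[2]** Doe, J."
fallback = extract_total_references(malformed)

# An indented entry does not start with '**[' and is therefore not counted.
indented = "  **[1]** Smith, J.\n**[2]** Doe, J."
indented_count = extract_total_references(indented)

print(header_wins, fallback, indented_count)  # 5 2 1
```

In other words, the header is treated as authoritative whenever it parses, and the manual count is only a fallback rather than a cross-check.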
Usage Example
# Example 1: Markdown with explicit total
markdown_with_header = '''
**Total References**: 3
**[1]** Smith, J. (2020). Example Paper.
**[2]** Doe, J. (2021). Another Paper.
**[3]** Brown, A. (2022). Third Paper.
'''
total = extract_total_references(markdown_with_header)
print(f"Total references: {total}") # Output: Total references: 3
# Example 2: Markdown without explicit total (manual count)
markdown_without_header = '''
**[1]** Smith, J. (2020). Example Paper.
**[2]** Doe, J. (2021). Another Paper.
'''
total = extract_total_references(markdown_without_header)
print(f"Total references: {total}") # Output: Total references: 2
# Example 3: Empty or no references
empty_markdown = "Some text without references"
total = extract_total_references(empty_markdown)
print(f"Total references: {total}") # Output: Total references: 0
Best Practices
- Ensure markdown_content is a string; pass empty string '' instead of None to avoid AttributeError
- The header parse is wrapped in a bare except clause that silently swallows every exception, so a malformed count is never reported; validate the input format beforehand if that matters
- For manual counting, a line must begin with '**[' (no leading whitespace) and contain ']**' later in the line; anything else is ignored
- The function assumes references are on separate lines; inline references won't be counted
- If the '**Total References**:' header exists but contains invalid data, the function falls back to manual counting rather than raising an error
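One way to act on the points above is a stricter wrapper, sketched here under stated assumptions. The name extract_total_references_strict is hypothetical and is not part of enhanced_word_converter_fixed.py; it keeps the same header-then-count logic but rejects non-string input explicitly and catches only ValueError instead of using a bare except.

```python
def extract_total_references_strict(markdown_content):
    """Hypothetical stricter variant of extract_total_references (illustration only)."""
    # Fail early with a clear message instead of an AttributeError on None.
    if not isinstance(markdown_content, str):
        raise TypeError("markdown_content must be a str")

    lines = markdown_content.split('\n')
    for line in lines:
        if line.startswith('**Total References**:'):
            try:
                return int(line.split(':')[1].strip())
            except ValueError:
                # Malformed count: keep scanning, then fall back to manual counting.
                pass

    # Same line-based rule as the original: starts with '**[' and contains ']**'.
    return sum(1 for line in lines if line.startswith('**[') and ']**' in line)


print(extract_total_references_strict("**Total References**: 3"))          # 3
print(extract_total_references_strict("**Total References**: ?\n**[1]** A"))  # 1
```

Catching only ValueError is enough here because a line that starts with '**Total References**:' always contains a colon, so the index lookup after split cannot fail. Other exceptions, such as KeyboardInterrupt, now propagate instead of being swallowed.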
Similar Components
AI-powered semantic similarity - components with related functionality:
- function parse_references_section 66.4% similar
- function process_markdown_content_v1 48.3% similar
- function process_markdown_content 48.3% similar
- function extract_warranty_data_improved 47.8% similar
- class ReferenceManager 47.0% similar