function extract_warranty_sections
Parses markdown content to extract warranty section headers, returning a list of dictionaries containing section IDs and titles for table of contents generation.
/tf/active/vicechatdev/enhanced_word_converter_fixed.py
51 - 71
simple
Purpose
This function processes markdown-formatted warranty documents and extracts structured information from section headers. It targets level-2 headers (##) of the form 'ID - Title' and skips any header that contains the word 'References'. The extracted data is intended for generating a table of contents or navigation structure for warranty documentation.
Source Code
def extract_warranty_sections(markdown_content):
    """Extract warranty section IDs and titles for TOC generation"""
    sections = []
    lines = markdown_content.split('\n')

    for line in lines:
        line = line.strip()
        # Look for warranty section headers like ## 1.1(a) - Title
        if line.startswith('## ') and ' - ' in line and 'References' not in line:
            # Extract warranty ID and title
            content = line[3:]  # Remove '## '
            if ' - ' in content:
                parts = content.split(' - ', 1)
                warranty_id = parts[0].strip()
                warranty_title = parts[1].strip()
                sections.append({
                    'id': warranty_id,
                    'title': warranty_title
                })

    return sections
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| markdown_content | - | - | positional_or_keyword |
Parameter Details
markdown_content: A string containing markdown-formatted text with warranty sections. Expected to contain level-2 headers (##) formatted as '## ID - Title' where ID is the warranty section identifier (e.g., '1.1(a)') and Title is the descriptive name of the warranty section. The content should use newline characters to separate lines.
Return Value
Returns a list of dictionaries, where each dictionary represents a warranty section with two keys: 'id' (string containing the warranty section identifier, e.g., '1.1(a)') and 'title' (string containing the warranty section title). Returns an empty list if no matching sections are found. Example: [{'id': '1.1(a)', 'title': 'Financial Statements'}, {'id': '1.1(b)', 'title': 'Tax Compliance'}]
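A few edge cases follow directly from the source code and may be worth keeping in mind. The sketch below repeats the function in slightly condensed form so that it runs on its own; the sample headers are invented for illustration.

```python
def extract_warranty_sections(markdown_content):
    """Condensed copy of the documented function, repeated so this runs standalone."""
    sections = []
    for line in markdown_content.split('\n'):
        line = line.strip()
        if line.startswith('## ') and ' - ' in line and 'References' not in line:
            content = line[3:]  # Remove '## '
            if ' - ' in content:
                parts = content.split(' - ', 1)
                sections.append({'id': parts[0].strip(), 'title': parts[1].strip()})
    return sections

# No matching headers: the result is an empty list
no_match = extract_warranty_sections("# Title only\nPlain text")

# Only the first ' - ' splits ID from title; later ones stay in the title
dashed = extract_warranty_sections("## 2.1 - Title - With Dash")

# Any header containing 'References' is skipped, even a real warranty title
skipped = extract_warranty_sections("## 3.1 - External References")

# '\r\n' line endings work because each line is stripped before matching
crlf = extract_warranty_sections("## 1.1(a) - A\r\n## 1.1(b) - B")

print(no_match)  # []
print(dashed)    # [{'id': '2.1', 'title': 'Title - With Dash'}]
print(skipped)   # []
print(crlf)      # [{'id': '1.1(a)', 'title': 'A'}, {'id': '1.1(b)', 'title': 'B'}]
```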
Usage Example
markdown_text = '''
# Warranty Document
## 1.1(a) - Financial Statements
Content about financial statements...
## 1.1(b) - Tax Compliance
Content about tax compliance...
## References
Reference materials...
'''
sections = extract_warranty_sections(markdown_text)
print(sections)
# Output: [{'id': '1.1(a)', 'title': 'Financial Statements'}, {'id': '1.1(b)', 'title': 'Tax Compliance'}]
# Generate TOC from sections
for section in sections:
    print(f"{section['id']}: {section['title']}")
Best Practices
- Ensure markdown content uses consistent header formatting with '## ' prefix and ' - ' separator between ID and title
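- The warranty ID is whatever text precedes the first ' - ' separator (e.g., '1.1(a)', '2.3'); everything after it becomes the title
- Headers containing 'References' are intentionally excluded; the check is a case-sensitive substring match, so a warranty title that includes the word 'References' is skipped as well
- Lines are split on '\n'; because each line is stripped before matching, input with '\r\n' line endings is handled too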
- The function performs basic string operations and does not validate warranty ID format - consider adding validation if strict ID formats are required
- Whitespace is automatically stripped from extracted IDs and titles
- Only level-2 headers (##) are processed; level-1 (#) or level-3+ (###) headers are ignored
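If strict ID formats are required, validation could be layered on top of the returned list rather than built into the parser. The following is a minimal sketch, not part of the original module: the helper name `split_by_id_validity` and the regular expression are hypothetical, and the pattern assumes IDs are dotted numbers with an optional parenthesised lowercase suffix (such as '1.1(a)' or '2.3').

```python
import re

# Assumed ID shape: dotted numbers with an optional parenthesised suffix,
# e.g. '1.1(a)' or '2.3'. Hypothetical; adjust to the real numbering scheme.
WARRANTY_ID_PATTERN = re.compile(r'\d+(?:\.\d+)*(?:\([a-z]+\))?')


def split_by_id_validity(sections):
    """Separate extracted sections into those whose ID matches the pattern and the rest."""
    valid, invalid = [], []
    for section in sections:
        if WARRANTY_ID_PATTERN.fullmatch(section['id']):
            valid.append(section)
        else:
            invalid.append(section)
    return valid, invalid


# Sample data in the shape returned by extract_warranty_sections
sections = [
    {'id': '1.1(a)', 'title': 'Financial Statements'},
    {'id': '2.3', 'title': 'Tax Compliance'},
    {'id': 'Appendix', 'title': 'Supporting Notes'},
]

valid, invalid = split_by_id_validity(sections)
print([s['id'] for s in valid])    # ['1.1(a)', '2.3']
print([s['id'] for s in invalid])  # ['Appendix']
```

Keeping the check separate leaves the parser permissive, so unexpected IDs can be reported instead of silently dropped.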
Similar Components
AI-powered semantic similarity - components with related functionality:
- function extract_warranty_data 88.8% similar
- function extract_warranty_data_improved 85.7% similar
- function create_enhanced_word_document 68.6% similar
- function main_v15 66.6% similar
- function main_v8 64.4% similar