function simple_markdown_to_html
Converts a subset of Markdown syntax to clean HTML, supporting headers, bold text, unordered lists, and paragraphs.
/tf/active/vicechatdev/vice_ai/new_app.py
2542 - 2605
moderate
Purpose
This function is a lightweight Markdown-to-HTML converter for displaying formatted text in the data sections of a document management system. It handles headers (capped at h6), bold text (**text**), unordered lists (lines starting with "- " or "* "), and paragraphs. Paragraph lines that begin with image syntax (![...]) are dropped because plots are handled separately. The function keeps the HTML well-formed by opening and closing <ul> tags as list items start and stop, and it emits a <br/> for each empty line.
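The order in which the function tests each line determines how edge cases are treated. The sketch below mirrors that branch order in a small helper; `classify_line` is a hypothetical name used only for illustration and is not part of the original module.

```python
def classify_line(line):
    """Return the block type the converter would assign to a single input line.

    Hypothetical helper: it mirrors the branch order of simple_markdown_to_html
    but produces a label instead of HTML.
    """
    stripped = line.strip()
    if not stripped:
        return 'blank'          # rendered as <br/>
    if stripped.startswith('#'):
        return 'header'         # no space is required after the hashes
    if stripped.startswith('- ') or stripped.startswith('* '):
        return 'list_item'
    if stripped.startswith('!['):
        return 'skipped_image'  # dropped from the output
    return 'paragraph'


samples = ['', '## Title', '#tag', '- item', '- ![plot](a.png)',
           '![plot](a.png)', '**bold** text']
for sample in samples:
    print(repr(sample), '->', classify_line(sample))
```

Two consequences are worth noting: a line such as `#tag` is treated as a header because no space is required after the hashes, and an image inside a list item is not skipped, since the image check only runs in the paragraph branch.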
Source Code
def simple_markdown_to_html(markdown_text):
    """
    Convert markdown to clean HTML for display in data sections
    Handles: headers (#, ##, ###), bold (**text**), lists, and paragraphs
    """
    if not markdown_text:
        return ""

    lines = markdown_text.split('\n')
    html_lines = []
    in_list = False

    for line in lines:
        stripped = line.strip()

        if not stripped:
            if in_list:
                html_lines.append('</ul>')
                in_list = False
            html_lines.append('<br/>')
            continue

        # Headers - check for ## pattern at start
        if stripped.startswith('#'):
            if in_list:
                html_lines.append('</ul>')
                in_list = False

            # Count consecutive # at start
            level = 0
            for char in stripped:
                if char == '#':
                    level += 1
                else:
                    break

            # Cap at h6
            level = min(level, 6)
            text = stripped[level:].strip()
            # Handle bold within headers
            text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
            html_lines.append(f'<h{level}>{text}</h{level}>')

        # Lists
        elif stripped.startswith('- ') or stripped.startswith('* '):
            if not in_list:
                html_lines.append('<ul>')
                in_list = True
            text = stripped[2:].strip()
            # Handle bold in lists
            text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
            html_lines.append(f'<li>{text}</li>')

        # Regular paragraphs
        else:
            if in_list:
                html_lines.append('</ul>')
                in_list = False
            # Handle bold text
            text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', stripped)
            # Skip image markdown for now (plots are handled separately)
            if not text.startswith('!['):
                html_lines.append(f'<p>{text}</p>')

    if in_list:
        html_lines.append('</ul>')

    return '\n'.join(html_lines)
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| markdown_text | - | - | positional_or_keyword |
Parameter Details
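markdown_text: A string containing Markdown-formatted text. May be None or an empty string, in which case the function returns an empty string. Supports headers (# through ######), bold text (**text**), unordered lists (- or * prefix followed by a space), and regular paragraphs. Paragraph lines that begin with image markdown (![...]) are intentionally dropped; image syntax inside a header or list item is passed through as literal text.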
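The header handling can be isolated into a small sketch to show what happens when a line has more than six hashes. `parse_header` below reproduces the counting and capping logic from the source; `parse_header_strict` is a hypothetical variant, not part of the original code, that discards all leading hashes before extracting the text.

```python
def parse_header(stripped):
    """Mirror the original logic: count leading '#', cap at 6, slice by the capped level."""
    hashes = len(stripped) - len(stripped.lstrip('#'))
    level = min(hashes, 6)
    # Slicing by the capped level leaves any hashes beyond the sixth in the text.
    return level, stripped[level:].strip()


def parse_header_strict(stripped):
    """Hypothetical variant: cap the level at 6 but drop every leading '#'."""
    hashes = len(stripped) - len(stripped.lstrip('#'))
    return min(hashes, 6), stripped[hashes:].strip()


print(parse_header('### Title'))            # (3, 'Title')
print(parse_header('####### Deep'))         # (6, '# Deep')
print(parse_header_strict('####### Deep'))  # (6, 'Deep')
```

With the original slicing, a seven-hash line renders as an h6 whose text begins with a stray `#`. Whether that matters depends on the input the application actually receives.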
Return Value
Returns a string containing HTML markup. If the input is None or empty, returns an empty string. Otherwise, returns newline-separated HTML elements: <h1>-<h6> for headers, <strong> for bold text, <ul>/<li> for lists, <p> for paragraphs, and <br/> for empty lines. Any open list is closed before a header, paragraph, or empty line is emitted, and again at the end of the input, so <ul> tags are always balanced. The input text itself is inserted without HTML escaping.
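Which spans end up wrapped in <strong> depends entirely on the bold regular expression used in the source. The sketch below applies the same pattern on its own; `apply_bold` is a hypothetical wrapper introduced here for illustration.

```python
import re

# Same pattern as in the source: two asterisks, one or more non-asterisk
# characters, then two asterisks.
BOLD_PATTERN = re.compile(r'\*\*([^*]+)\*\*')


def apply_bold(text):
    """Replace **text** spans with <strong>text</strong> (hypothetical wrapper)."""
    return BOLD_PATTERN.sub(r'<strong>\1</strong>', text)


print(apply_bold('**a** and **b**'))  # <strong>a</strong> and <strong>b</strong>
print(apply_bold('**a*b**'))          # **a*b**  (inner asterisk prevents a match)
print(apply_bold('***x***'))          # *<strong>x</strong>*
print(apply_bold('**unclosed'))       # **unclosed
```

Because the character class excludes asterisks, a bold span can never contain one, which is why combined bold and italic markup is only partially converted.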
Dependencies
re
Required Imports
import re
Usage Example
import re

def simple_markdown_to_html(markdown_text):
    if not markdown_text:
        return ""

    lines = markdown_text.split('\n')
    html_lines = []
    in_list = False

    for line in lines:
        stripped = line.strip()

        if not stripped:
            if in_list:
                html_lines.append('</ul>')
                in_list = False
            html_lines.append('<br/>')
            continue

        if stripped.startswith('#'):
            if in_list:
                html_lines.append('</ul>')
                in_list = False

            level = 0
            for char in stripped:
                if char == '#':
                    level += 1
                else:
                    break

            level = min(level, 6)
            text = stripped[level:].strip()
            text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
            html_lines.append(f'<h{level}>{text}</h{level}>')

        elif stripped.startswith('- ') or stripped.startswith('* '):
            if not in_list:
                html_lines.append('<ul>')
                in_list = True
            text = stripped[2:].strip()
            text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', text)
            html_lines.append(f'<li>{text}</li>')

        else:
            if in_list:
                html_lines.append('</ul>')
                in_list = False
            text = re.sub(r'\*\*([^*]+)\*\*', r'<strong>\1</strong>', stripped)
            if not text.startswith('!['):
                html_lines.append(f'<p>{text}</p>')

    if in_list:
        html_lines.append('</ul>')

    return '\n'.join(html_lines)

# Example usage
markdown = """# Main Title
## Subtitle
This is a **bold** statement.
- First item
- Second **bold** item
- Third item
Regular paragraph text."""

html_output = simple_markdown_to_html(markdown)
print(html_output)
Best Practices
- Input validation: The function safely handles None and empty string inputs by returning an empty string
- The function does not escape HTML entities in the input text, so ensure markdown_text is from a trusted source or pre-sanitize it to prevent XSS vulnerabilities
- Image markdown syntax (![...]) is intentionally skipped as images are handled separately in the application context
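- The function caps header levels at h6 (the HTML maximum) even if more # symbols are provided; because the text is sliced at the capped level, any hashes beyond the sixth remain at the start of the header text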
- List state is properly managed to ensure closing </ul> tags are added when transitioning between block types
- The bold pattern (**text**) only matches spans that contain no asterisks between the ** delimiters, so nested or combined emphasis such as ***text*** is not converted correctly
- Output elements are joined with newlines for readability; these newlines do not affect how the HTML renders
- Consider using a full-featured Markdown library like markdown2 or mistune for production use with more complex Markdown syntax
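The escaping concern noted above can be addressed by escaping the raw text before any tags are generated. The following is a minimal sketch of that idea applied to the inline bold step only; `render_inline_safely` is a hypothetical helper, and wiring the same escaping into each branch of the full function is left as an assumption about how one might adapt it.

```python
import html
import re

BOLD_PATTERN = re.compile(r'\*\*([^*]+)\*\*')


def render_inline_safely(text):
    """Escape HTML special characters, then apply the bold substitution.

    Hypothetical helper: escaping runs first so that only the <strong> tags
    produced by the converter itself reach the output as markup.
    """
    # quote=False escapes &, < and > and leaves quotes alone; Markdown
    # markers such as *, # and - are not affected by escaping.
    escaped = html.escape(text, quote=False)
    return BOLD_PATTERN.sub(r'<strong>\1</strong>', escaped)


print(render_inline_safely('**Note:** <script>alert(1)</script>'))
# <strong>Note:</strong> &lt;script&gt;alert(1)&lt;/script&gt;
```

Escaping before substitution matters: if the order were reversed, the generated <strong> tags would be escaped along with the user input.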
Similar Components
AI-powered semantic similarity - components with related functionality:
- function basic_markdown_to_html (88.1% similar)
- function html_to_markdown_v1 (87.1% similar)
- function html_to_markdown (83.4% similar)
- function convert_markdown_to_html_v1 (78.3% similar)
- function convert_markdown_to_html (77.6% similar)