function convert_markdown_to_html
Converts basic markdown formatting (bold, italic, code) to HTML markup suitable for PDF generation using ReportLab.
File: /tf/active/vicechatdev/vice_ai/complex_app.py
Lines: 1999 - 2015
Complexity: simple
Purpose
This function transforms markdown-formatted text into HTML that can be rendered in PDF documents. It handles three common markdown patterns: bold (**text**), italic (*text*), and inline code (`text`). The function first escapes HTML entities to prevent injection issues, then applies regex-based transformations to convert markdown syntax to corresponding HTML tags. It's specifically designed for use with ReportLab's PDF generation where HTML-like markup is needed.
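Why the escaping step has to come first can be seen in a minimal sketch that uses only the bold rule. The helper names `escape_then_convert` and `convert_then_escape` are hypothetical and exist only for this illustration; they are not part of complex_app.py.

```python
import html
import re

# Same bold pattern the documented function uses.
BOLD = re.compile(r'\*\*(.*?)\*\*')

def escape_then_convert(text):
    # Order used by convert_markdown_to_html: escape first, then add tags.
    return BOLD.sub(r'<b>\1</b>', html.escape(text))

def convert_then_escape(text):
    # Reversed order (hypothetical): the generated tags get escaped as well.
    return html.escape(BOLD.sub(r'<b>\1</b>', text))

sample = "a < b and **bold**"
print(escape_then_convert(sample))  # a &lt; b and <b>bold</b>
print(convert_then_escape(sample))  # a &lt; b and &lt;b&gt;bold&lt;/b&gt;
```

Escaping first leaves the markdown markers intact, because `html.escape` does not touch asterisks or backticks, while the tags inserted afterwards stay as real markup.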
Source Code
def convert_markdown_to_html(text):
    """Convert basic markdown formatting to HTML for PDF generation"""
    if not text:
        return text

    import html
    import re

    # Escape HTML entities first
    text = html.escape(text)

    # Convert markdown formatting to HTML
    text = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', text)  # Bold
    text = re.sub(r'\*(.*?)\*', r'<i>\1</i>', text)  # Italic
    text = re.sub(r'`(.*?)`', r'<font name="Courier">\1</font>', text)  # Code

    return text
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| text | - | - | positional_or_keyword |
Parameter Details
text: The markdown-formatted string to convert. None and the empty string are returned as-is (the guard is `if not text`, so any falsy value passes through unchanged). Expected to contain markdown syntax such as **bold**, *italic*, or `code`. There is no length constraint, but a non-empty value must be a string.
Return Value
Returns a string with markdown formatting converted to HTML tags. Bold text becomes <b>text</b>, italic becomes <i>text</i>, and inline code becomes <font name="Courier">text</font>. HTML special characters are escaped. Returns the original value unchanged if input is None or empty string. Return type is str or None (matching input type).
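The return behavior described above can be checked with a short sketch. The function body is repeated from the Source Code section (with the imports moved to module level) so the example runs on its own; the sample strings are made up for illustration.

```python
import html
import re

def convert_markdown_to_html(text):
    # Copy of the documented function, repeated so the sketch is standalone.
    if not text:
        return text
    text = html.escape(text)
    text = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', text)
    text = re.sub(r'\*(.*?)\*', r'<i>\1</i>', text)
    text = re.sub(r'`(.*?)`', r'<font name="Courier">\1</font>', text)
    return text

# Falsy input is handed back unchanged.
print(convert_markdown_to_html(None))      # None
print(repr(convert_markdown_to_html("")))  # ''

# html.escape rewrites &, <, > and both quote characters before any tag is added.
print(convert_markdown_to_html('R&D says "use `x < y`"'))
# R&amp;D says &quot;use <font name="Courier">x &lt; y</font>&quot;
```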
Dependencies
- html
- re
Required Imports
import html
import re
Usage Example
import html
import re

def convert_markdown_to_html(text):
    if not text:
        return text
    text = html.escape(text)
    text = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', text)
    text = re.sub(r'\*(.*?)\*', r'<i>\1</i>', text)
    text = re.sub(r'`(.*?)`', r'<font name="Courier">\1</font>', text)
    return text

# Example usage
markdown_text = "This is **bold** and this is *italic* and this is `code`"
html_output = convert_markdown_to_html(markdown_text)
print(html_output)
# Output: This is <b>bold</b> and this is <i>italic</i> and this is <font name="Courier">code</font>

# Handles HTML escaping
text_with_html = "<script>alert('xss')</script> **bold**"
safe_output = convert_markdown_to_html(text_with_html)
print(safe_output)
# Output: &lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt; <b>bold</b>
Best Practices
- This function only handles basic markdown syntax (bold, italic, inline code). For comprehensive markdown conversion, consider using a full markdown library.
- HTML entities are automatically escaped, making this function safe against HTML injection when processing user input.
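- The function uses non-greedy regex matching (.*?) so that multiple markdown elements on the same line are converted separately. Because `.` does not match newlines here, an opening and closing marker must sit on the same line to be converted.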
- Order of regex operations matters: bold is processed before italic to avoid conflicts with nested asterisks.
- The Courier font reference in the code formatting is specific to ReportLab PDF generation and may not work in standard HTML contexts.
- Returns None/empty input unchanged, so always check the return value before further processing.
- Does not handle nested markdown: ***bold and italic*** yields overlapping tags (<b><i>...</b></i>) instead of properly nested ones. Other markdown features such as headers, links, lists, and blockquotes are left untouched.
- For production use with untrusted input, consider additional validation beyond HTML escaping.
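Several of the points above (rule ordering, nested markers, and the limits of the basic patterns) can be observed directly. The sketch below repeats the documented function and adds a hypothetical `italic_first` variant, which is not part of the source, to show what the reversed order would produce.

```python
import html
import re

def convert_markdown_to_html(text):
    # Copy of the documented function, repeated so the sketch is standalone.
    if not text:
        return text
    text = html.escape(text)
    text = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', text)
    text = re.sub(r'\*(.*?)\*', r'<i>\1</i>', text)
    text = re.sub(r'`(.*?)`', r'<font name="Courier">\1</font>', text)
    return text

def italic_first(text):
    # Hypothetical variant with the bold and italic rules swapped.
    text = html.escape(text)
    text = re.sub(r'\*(.*?)\*', r'<i>\1</i>', text)
    text = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', text)
    return text

# Swapped order: each "**" is consumed as an empty italic span.
print(italic_first("**bold**"))                # <i></i>bold<i></i>

# Triple asterisks produce overlapping, not nested, tags.
print(convert_markdown_to_html("***both***"))  # <b><i>both</b></i>

# Literal asterisks in ordinary text are treated as italic markers.
print(convert_markdown_to_html("2 * 3 * 4"))   # 2 <i> 3 </i> 4

# Asterisks inside a code span are converted before the code rule runs.
print(convert_markdown_to_html("`a*b*c`"))     # <font name="Courier">a<i>b</i>c</font>
```

Input that may contain literal asterisks or triple markers is therefore worth normalizing or validating before it reaches this function, or handing to a full markdown library instead.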
Similar Components
AI-powered semantic similarity - components with related functionality:
- function convert_markdown_to_html_v1 93.0% similar
- function simple_markdown_to_html 77.6% similar
- function format_inline_markdown 76.3% similar
- function html_to_markdown 76.0% similar
- function html_to_markdown_v1 75.5% similar