
function convert_markdown_to_html

Maturity: 47

Converts basic markdown formatting (bold, italic, code) to HTML markup suitable for PDF generation using ReportLab.

File: /tf/active/vicechatdev/vice_ai/complex_app.py
Lines: 1999 - 2015
Complexity: simple

Purpose

This function transforms markdown-formatted text into markup that can be rendered in PDF documents. It handles three common markdown patterns: bold (**text**), italic (*text*), and inline code (`text`). The function first escapes HTML special characters (&, <, >, and both quote characters) with html.escape so that literal characters in the input cannot be read as markup, then applies three regex substitutions in order (bold, italic, code) to convert the markdown syntax to tags. The output targets ReportLab's paragraph markup, which is HTML-like rather than standard HTML.
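The ordering described above can be traced step by step. The following is a minimal sketch that inlines the same three substitutions on a made-up sample string (the variable names are illustrative and do not appear in the source file). It also shows what would happen if escaping ran after the substitutions instead of before them.

```python
import html
import re

sample = "Use **bold** & `x < 1`"  # illustrative input, not from the source file

# Step 1: escape first, so literal &, < and > in the input cannot become markup.
escaped = html.escape(sample)

# Step 2: apply the three patterns in the same order as the function.
step = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', escaped)                # bold
step = re.sub(r'\*(.*?)\*', r'<i>\1</i>', step)                       # italic
result = re.sub(r'`(.*?)`', r'<font name="Courier">\1</font>', step)  # code

# Counter-example: escaping after a substitution destroys the generated tags.
escaped_last = html.escape(re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', sample))

print(escaped)       # Use **bold** &amp; `x &lt; 1`
print(result)        # Use <b>bold</b> &amp; <font name="Courier">x &lt; 1</font>
print(escaped_last)  # Use &lt;b&gt;bold&lt;/b&gt; &amp; `x &lt; 1`
```

Escaping first means the only angle brackets left in the output are the ones the substitutions themselves insert.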

Source Code

def convert_markdown_to_html(text):
    """Convert basic markdown formatting to HTML for PDF generation"""
    if not text:
        return text
    
    import html
    import re
    
    # Escape HTML entities first
    text = html.escape(text)
    
    # Convert markdown formatting to HTML
    text = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', text)  # Bold
    text = re.sub(r'\*(.*?)\*', r'<i>\1</i>', text)      # Italic
    text = re.sub(r'`(.*?)`', r'<font name="Courier">\1</font>', text)  # Code
    
    return text

Parameters

Name   Type   Default   Kind
text   -      -         positional_or_keyword

Parameter Details

text: The markdown-formatted string to convert. None, the empty string, or any other falsy value is returned as-is. Expected to contain markdown syntax like **bold**, *italic*, or `code`. There is no length constraint, but a non-empty value must be a str, since it is passed to html.escape.

Return Value

Returns a string with markdown formatting converted to tags: bold text becomes <b>text</b>, italic becomes <i>text</i>, and inline code becomes <font name="Courier">text</font>. HTML special characters in the input are escaped. If the input is None or an empty string, it is returned unchanged, so the return type is str or None, matching the input.
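The pass-through behaviour for empty input can be checked directly. The sketch below repeats the documented function so that it runs on its own, and adds a small wrapper, markup_or_empty, which is a hypothetical helper (not part of the source file) for callers that always want a str back.

```python
import html
import re

def convert_markdown_to_html(text):
    """Copy of the documented function, repeated so this snippet runs on its own."""
    if not text:
        return text
    text = html.escape(text)
    text = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', text)
    text = re.sub(r'\*(.*?)\*', r'<i>\1</i>', text)
    text = re.sub(r'`(.*?)`', r'<font name="Courier">\1</font>', text)
    return text

def markup_or_empty(text):
    """Hypothetical helper: always return a str, mapping None to ''."""
    return convert_markdown_to_html(text) or ""

print(repr(convert_markdown_to_html(None)))  # falsy input comes back unchanged
print(repr(convert_markdown_to_html("")))
print(repr(markup_or_empty(None)))

# A truthy non-string reaches html.escape, which expects a str.
try:
    convert_markdown_to_html(123)
    failed = False
except (AttributeError, TypeError):
    failed = True
print(failed)
```

Whether to normalise None at the call site or inside a wrapper like this is a design choice for the caller; the documented function itself leaves the value untouched.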

Dependencies

  • html
  • re

Required Imports

import html
import re

Usage Example

import html
import re

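def convert_markdown_to_html(text):
    """Convert basic markdown formatting to HTML for PDF generation"""
    if not text:
        return text
    # Escape HTML special characters first (html and re are imported above)
    text = html.escape(text)
    # Convert markdown formatting to tags
    text = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', text)  # Bold
    text = re.sub(r'\*(.*?)\*', r'<i>\1</i>', text)      # Italic
    text = re.sub(r'`(.*?)`', r'<font name="Courier">\1</font>', text)  # Code
    return text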

# Example usage
markdown_text = "This is **bold** and this is *italic* and this is `code`"
html_output = convert_markdown_to_html(markdown_text)
print(html_output)
# Output: This is <b>bold</b> and this is <i>italic</i> and this is <font name="Courier">code</font>

# Handles HTML escaping
text_with_html = "<script>alert('xss')</script> **bold**"
safe_output = convert_markdown_to_html(text_with_html)
print(safe_output)
# Output: &lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt; <b>bold</b>

Best Practices

  • This function only handles basic markdown syntax (bold, italic, inline code). For comprehensive markdown conversion, consider using a full markdown library.
  • HTML special characters are escaped before any tags are generated, so markup in user input is rendered as literal text rather than interpreted.
  • The patterns use non-greedy matching (.*?) so that multiple markdown elements on the same line are matched separately. Because . does not match a newline by default, a pair of markers that opens on one line and closes on another is not matched as a pair.
  • Order of regex operations matters: bold is processed before italic. If italic ran first, **bold** would become <i></i>bold<i></i>.
  • The <font name="Courier"> tag used for code is ReportLab paragraph markup, not standard HTML; browsers ignore the name attribute, so the text will not appear in a monospace font there.
  • None and empty input are returned unchanged, so callers that need a str should handle a None return value.
  • Does not handle nested markdown: ***bold and italic*** produces the mis-nested <b><i>bold and italic</b></i>. Other markdown features such as headers, links, lists, and blockquotes are not handled either.
  • For production use with untrusted input, consider additional validation beyond HTML escaping.
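
The limitations listed above can be reproduced with a few inputs. The sketch below repeats the documented function so that it runs on its own, then shows three cases: nested emphasis, stray asterisks, and asterisks inside inline code. The second function, convert_code_first, is a hypothetical variant (not in the source file) that stashes code spans before emphasis is processed; it assumes the input contains no NUL characters, which it uses as placeholder delimiters.

```python
import html
import re

def convert_markdown_to_html(text):
    """Copy of the documented function, repeated so this snippet runs on its own."""
    if not text:
        return text
    text = html.escape(text)
    text = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', text)
    text = re.sub(r'\*(.*?)\*', r'<i>\1</i>', text)
    text = re.sub(r'`(.*?)`', r'<font name="Courier">\1</font>', text)
    return text

def convert_code_first(text):
    """Hypothetical variant: protect inline code before emphasis is processed."""
    if not text:
        return text
    text = html.escape(text)
    stash = []

    def _stash(match):
        # Park the code span's content and leave a NUL-delimited index behind.
        stash.append(match.group(1))
        return f"\x00{len(stash) - 1}\x00"

    def _restore(match):
        # Put the parked content back, wrapped in the Courier font tag.
        return f'<font name="Courier">{stash[int(match.group(1))]}</font>'

    text = re.sub(r'`(.*?)`', _stash, text)
    text = re.sub(r'\*\*(.*?)\*\*', r'<b>\1</b>', text)
    text = re.sub(r'\*(.*?)\*', r'<i>\1</i>', text)
    return re.sub(r'\x00(\d+)\x00', _restore, text)

nested = convert_markdown_to_html("***x***")          # mis-nested tags
stray = convert_markdown_to_html("2 * 3 * 4")         # arithmetic read as italic
in_code = convert_markdown_to_html("`a * b * c` and *it*")
protected = convert_code_first("`a * b * c` and *it*")

print(nested)     # <b><i>x</b></i>
print(stray)      # 2 <i> 3 </i> 4
print(in_code)    # <font name="Courier">a <i> b </i> c</font> and <i>it</i>
print(protected)  # <font name="Courier">a * b * c</font> and <i>it</i>
```

The variant only addresses asterisks inside code spans; nested emphasis and stray asterisks behave as in the original, which is one reason a full markdown library may be the better choice for arbitrary input.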

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function convert_markdown_to_html_v1 93.0% similar

    Converts basic Markdown syntax to HTML markup compatible with ReportLab PDF generation, including support for clickable links, bold, italic, and inline code formatting.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function simple_markdown_to_html 77.6% similar

    Converts a subset of Markdown syntax to clean HTML, supporting headers, bold text, unordered lists, and paragraphs.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function format_inline_markdown 76.3% similar

    Converts inline Markdown syntax (bold, italic, code, links) to HTML tags while escaping HTML entities for safe rendering.

    From: /tf/active/vicechatdev/vice_ai/complex_app.py
  • function html_to_markdown 76.0% similar

    Converts HTML text back to Markdown format using regex-based pattern matching and replacement, handling headers, code blocks, formatting, links, lists, and HTML entities.

    From: /tf/active/vicechatdev/vice_ai/complex_app.py
  • function html_to_markdown_v1 75.5% similar

    Converts HTML markup to Markdown syntax, handling headers, code blocks, text formatting, links, lists, and paragraphs with proper spacing.

    From: /tf/active/vicechatdev/vice_ai/new_app.py