basic_markdown_to_html - Code Extractor

function basic_markdown_to_html

Maturity: 47

Converts basic Markdown syntax to HTML without using external Markdown libraries, handling headers, lists, code blocks, and inline formatting.

File:
/tf/active/vicechatdev/vice_ai/complex_app.py

Lines:
1821 - 1914

Complexity:
moderate

Purpose

This function provides a lightweight Markdown-to-HTML converter for applications that need basic Markdown rendering without adding external dependencies. It supports headers (h1-h3), ordered and unordered lists, code blocks (fenced with triple backticks, ```), and inline formatting, which is delegated to the format_inline_markdown helper. The function processes text line by line, tracking state for multi-line structures such as lists and code blocks, and escapes HTML content within code blocks.
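
The line-by-line state tracking described above can be illustrated in isolation. The following is a minimal sketch, not the function from complex_app.py: `render_fences` and `FENCE` are hypothetical names, and the sketch handles only code fences, passing every other line through unchanged.

```python
import html

FENCE = '`' * 3  # the triple-backtick fence marker

def render_fences(text):
    """Hypothetical reduction: handle only code fences, one line at a time."""
    out = []
    in_code_block = False  # state carried from one line to the next
    for line in text.split('\n'):
        if line.strip().startswith(FENCE):
            # A fence line toggles the state and emits the matching tag
            out.append('</code></pre>' if in_code_block else '<pre><code>')
            in_code_block = not in_code_block
        elif in_code_block:
            # Inside a block: keep the line's whitespace, escape its HTML
            out.append(html.escape(line))
        else:
            out.append(line)
    if in_code_block:
        # Unterminated fence: close it so the tags stay balanced
        out.append('</code></pre>')
    return '\n'.join(out)

print(render_fences('before\n' + FENCE + '\n<b>x</b>\n' + FENCE + '\nafter'))
```

The single boolean is what lets a one-pass loop treat the same line differently depending on what came before it; the full function adds a second piece of state for the open list type.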

Source Code

def basic_markdown_to_html(text):
    """Basic Markdown to HTML conversion without external libraries"""
    if not text:
        return ""
    
    # Split into lines for processing
    lines = text.split('\n')
    result_lines = []
    in_list = False
    in_code_block = False
    list_type = None  # 'ul' or 'ol'
    
    i = 0
    while i < len(lines):
        line = lines[i]
        stripped = line.strip()
        
        # Handle code blocks
        if stripped.startswith('```'):
            # Close any open list before starting code block
            if in_list:
                result_lines.append(f'</{list_type}>')
                in_list = False
                list_type = None
            
            if not in_code_block:
                # Start code block
                in_code_block = True
                result_lines.append('<pre><code>')
            else:
                # End code block
                in_code_block = False
                result_lines.append('</code></pre>')
            i += 1
            continue
        
        # If we're in a code block, just add the line as-is (escaped)
        if in_code_block:
            result_lines.append(html.escape(line))
            i += 1
            continue
        
        # Close any open list if this line doesn't continue it
        if in_list and not (stripped.startswith('- ') or stripped.startswith('* ') or re.match(r'^\d+\. ', stripped)) and stripped:
            result_lines.append(f'</{list_type}>')
            in_list = False
            list_type = None
        
        # Headers (process before other formatting)
        if stripped.startswith('### '):
            result_lines.append(f'<h3>{stripped[4:]}</h3>')
        elif stripped.startswith('## '):
            result_lines.append(f'<h2>{stripped[3:]}</h2>')
        elif stripped.startswith('# '):
            result_lines.append(f'<h1>{stripped[2:]}</h1>')
        # Unordered list
        elif stripped.startswith('- ') or stripped.startswith('* '):
            if not in_list or list_type != 'ul':
                if in_list:
                    result_lines.append(f'</{list_type}>')
                result_lines.append('<ul>')
                in_list = True
                list_type = 'ul'
            content = format_inline_markdown(stripped[2:])
            result_lines.append(f'<li>{content}</li>')
        # Ordered list
        elif re.match(r'^\d+\. ', stripped):
            if not in_list or list_type != 'ol':
                if in_list:
                    result_lines.append(f'</{list_type}>')
                result_lines.append('<ol>')
                in_list = True
                list_type = 'ol'
            content = re.sub(r'^\d+\. ', '', stripped)
            content = format_inline_markdown(content)
            result_lines.append(f'<li>{content}</li>')
        # Empty line
        elif not stripped:
            if not in_list:
                result_lines.append('')
        # Regular paragraph
        else:
            content = format_inline_markdown(stripped)
            result_lines.append(f'<p>{content}</p>')
        
        i += 1
    
    # Close any open structures
    if in_code_block:
        result_lines.append('</code></pre>')
    if in_list:
        result_lines.append(f'</{list_type}>')
    
    return '\n'.join(result_lines)

Parameters

Name	Type	Default	Kind
`text`	-	-	positional_or_keyword

Parameter Details

text: A string containing Markdown-formatted text to be converted to HTML. None or an empty string returns an empty string. Supported syntax: headers (#, ##, ###), unordered lists (-, *), ordered lists (1., 2., etc.), code blocks fenced with triple backticks (```), and inline formatting (processed by the format_inline_markdown helper function).
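
The list markers are detected on the stripped line, and each marker must be followed by a space. The checks below are copied from the source listing; the helper name `list_kind` is hypothetical and exists only to make them easy to try out.

```python
import re

ORDERED = re.compile(r'^\d+\. ')  # same pattern the function uses

def list_kind(line):
    """Hypothetical helper: return 'ul', 'ol', or None for one line."""
    stripped = line.strip()  # indentation is discarded, so nesting is lost
    if stripped.startswith('- ') or stripped.startswith('* '):
        return 'ul'
    if ORDERED.match(stripped):
        return 'ol'
    return None

for sample in ['- item', '  * indented', '10. tenth', '1.no space', '-no space']:
    print(repr(sample), '->', list_kind(sample))
```

Because the line is stripped before matching, an indented item is classified the same way as a top-level one, which is why nested lists come out flat.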

Return Value

Returns a string containing the HTML representation of the input Markdown text. The HTML includes semantic tags like <h1>-<h3>, <ul>, <ol>, <li>, <p>, <pre>, and <code>. Lines are joined with newline characters. Returns an empty string if input is None or empty. Code block content is HTML-escaped for safety.
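
The escaping mentioned above applies to code block lines. In the source listing, header text is interpolated directly into the <h1>-<h3> tags, without html.escape or format_inline_markdown. If header text may come from untrusted input, an escaped variant of the header branch might look like the sketch below; `render_header` is a hypothetical name and is not part of complex_app.py.

```python
import html

def render_header(stripped):
    """Hypothetical variant of the header branch that escapes the text."""
    for level in (3, 2, 1):  # test '### ' before '## ' before '# '
        prefix = '#' * level + ' '
        if stripped.startswith(prefix):
            text = html.escape(stripped[len(prefix):])
            return f'<h{level}>{text}</h{level}>'
    return None  # not a level 1-3 header (for example '#### deep')

print(render_header('# <b>Title</b>'))
print(render_header('#### deep'))
```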

Dependencies

html
re

Required Imports

import html
import re

Usage Example

import html
import re

# basic_markdown_to_html calls format_inline_markdown, so define it first
def format_inline_markdown(text):
    # Simplified stand-in for this example (bold and italic only)
    text = re.sub(r'\*\*(.+?)\*\*', r'<strong>\1</strong>', text)
    text = re.sub(r'\*(.+?)\*', r'<em>\1</em>', text)
    return text

markdown_text = '''# Main Title
## Subtitle
This is a paragraph.

- Item 1
- Item 2

1. First
2. Second


```
code example
```
'''

html_output = basic_markdown_to_html(markdown_text)
print(html_output)

Best Practices

Ensure the format_inline_markdown helper function is defined before calling this function, as it's a required dependency
Input text should use standard Markdown syntax; non-standard syntax may not be processed correctly
The function handles nested structures by closing open lists before starting code blocks, but does not support nested lists
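Code block content is escaped with html.escape; header text is inserted as-is, and escaping of list items and paragraphs depends on format_inline_markdown, so check both paths before rendering untrusted input
Empty lines inside a list are skipped and do not close it; a list closes at the next non-empty line that is not a list item, at a code fence, or when the list type switches between ordered and unordered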
Only supports headers up to level 3 (###); deeper headers will be treated as regular paragraphs
The function makes a single pass over the lines, so its own work grows linearly with input length; all input and output lines are held in memory
Ordered list numbering in the output HTML is automatic and doesn't preserve the original Markdown numbers
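
The list behaviour noted in the points above can be checked with a reduced sketch of the list branches. `render_lists` is a hypothetical name; it follows the list logic of the source listing but omits headers, code blocks, and inline formatting.

```python
import re

def render_lists(lines):
    """Hypothetical reduction of the list branches; no inline formatting."""
    out, list_type = [], None  # list_type is None, 'ul', or 'ol'
    for line in lines:
        s = line.strip()
        if s.startswith('- ') or s.startswith('* '):
            kind, item = 'ul', s[2:]
        elif re.match(r'^\d+\. ', s):
            kind, item = 'ol', re.sub(r'^\d+\. ', '', s)  # number is dropped
        else:
            kind, item = None, s
        if kind is None:
            if not s:
                if list_type is None:
                    out.append('')  # blank lines are kept only outside lists
                continue            # inside a list, a blank line is skipped
            if list_type:
                out.append(f'</{list_type}>')  # non-list text closes the list
                list_type = None
            out.append(f'<p>{item}</p>')
            continue
        if kind != list_type:  # open a list, closing the other kind first
            if list_type:
                out.append(f'</{list_type}>')
            out.append(f'<{kind}>')
            list_type = kind
        out.append(f'<li>{item}</li>')
    if list_type:
        out.append(f'</{list_type}>')
    return out

print('\n'.join(render_lists(['- a', '', '- b', '7. x', '3. y', 'text'])))
```

In this input the blank line between the two bullet items leaves them in one <ul>, and the source numbers 7 and 3 do not appear in the output, since numbering is left to the <ol> element.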

Similar Components

AI-powered semantic similarity - components with related functionality:

function simple_markdown_to_html 88.1% similar

Converts a subset of Markdown syntax to clean HTML, supporting headers, bold text, unordered lists, and paragraphs.
From: /tf/active/vicechatdev/vice_ai/new_app.py

function html_to_markdown_v1 85.0% similar

Converts HTML markup to Markdown syntax, handling headers, code blocks, text formatting, links, lists, and paragraphs with proper spacing.
From: /tf/active/vicechatdev/vice_ai/new_app.py

function html_to_markdown 83.2% similar

Converts HTML text back to Markdown format using regex-based pattern matching and replacement, handling headers, code blocks, formatting, links, lists, and HTML entities.
From: /tf/active/vicechatdev/vice_ai/complex_app.py

function markdown_to_html 78.4% similar

Converts Markdown formatted text to HTML using the python-markdown library with multiple extensions, falling back to basic conversion if the library is unavailable.
From: /tf/active/vicechatdev/vice_ai/complex_app.py

function format_inline_markdown 75.2% similar

Converts inline Markdown syntax (bold, italic, code, links) to HTML tags while escaping HTML entities for safe rendering.
From: /tf/active/vicechatdev/vice_ai/complex_app.py
