🔍 Code Extractor

function add_markdown_formatting_to_paragraph

Maturity: 47

Parses markdown-formatted text and applies corresponding formatting (bold, italic, code) to runs within a python-docx paragraph object.

File: /tf/active/vicechatdev/vice_ai/new_app.py
Lines: 3975-4003
Complexity: moderate

Purpose

This function converts markdown-style inline formatting into Microsoft Word formatting. It takes a text string containing markdown syntax (**bold**, *italic*, `code`) and appends correspondingly formatted runs to a python-docx paragraph object. This is useful when generating Word documents from markdown content, where the inline formatting of the source should survive the export.

Source Code

def add_markdown_formatting_to_paragraph(paragraph, text):
    """Add markdown formatting (bold, italic, code) to a paragraph"""
    import re
    
    # Handle bold text **text**
    parts = re.split(r'\*\*(.*?)\*\*', text)
    for i, part in enumerate(parts):
        if i % 2 == 0:  # Regular text
            if part:
                # Check for italic within regular text
                italic_parts = re.split(r'\*(.*?)\*', part)
                for j, italic_part in enumerate(italic_parts):
                    if j % 2 == 0:  # Regular text
                        if italic_part:
                            # Check for code within regular text
                            code_parts = re.split(r'`(.*?)`', italic_part)
                            for k, code_part in enumerate(code_parts):
                                if k % 2 == 0:  # Regular text
                                    if code_part:
                                        paragraph.add_run(code_part)
                                else:  # Code text
                                    run = paragraph.add_run(code_part)
                                    run.font.name = 'Courier New'
                    else:  # Italic text
                        run = paragraph.add_run(italic_part)
                        run.italic = True
        else:  # Bold text
            run = paragraph.add_run(part)
            run.bold = True
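
The even/odd index checks in the source rely on how re.split behaves when the pattern contains a capturing group: the captured text is kept in the result, so plain and matched segments alternate. The following minimal sketch shows this for the bold pattern only; the sample string is made up for illustration.

```python
import re

text = "This is **bold text** and **more bold** here."

# A capturing group makes re.split keep the matched contents in the result,
# so segments alternate: even index = plain text, odd index = bold text.
parts = re.split(r'\*\*(.*?)\*\*', text)

for i, part in enumerate(parts):
    kind = "bold" if i % 2 else "plain"
    print(f"{i}: {kind:5} {part!r}")
```

The same alternation is reused for the italic and code patterns, each applied only to the even-indexed (plain) segments left by the previous step.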

Parameters

Name       Type  Default  Kind
paragraph  -     -        positional_or_keyword
text       -     -        positional_or_keyword

Parameter Details

paragraph: A python-docx paragraph object (from docx.text.paragraph.Paragraph) to which formatted text runs will be added. This object must be from an active Document instance and will be modified in-place.

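text: A string containing markdown-formatted text. Supports **bold** (double asterisks), *italic* (single asterisks), and `code` (backticks) syntax. Bold spans are extracted first, then italic spans from the remaining text, then code spans from what is left. The contents of a matched span are not parsed again, so markers nested inside bold or italic text are emitted literally.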
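
To see which segments a given string produces without creating a Word document, the splitting logic can be mirrored in plain Python. This is only a sketch: split_markdown_segments is a hypothetical helper that does not exist in the source file, and it returns (text, style) pairs where the real function would call paragraph.add_run.

```python
import re

def split_markdown_segments(text):
    """Hypothetical helper: return (segment, style) pairs in the order the
    documented function would add runs ('plain', 'bold', 'italic', 'code')."""
    segments = []
    for i, part in enumerate(re.split(r'\*\*(.*?)\*\*', text)):
        if i % 2 == 1:
            # Bold content is added as-is and never parsed further.
            segments.append((part, 'bold'))
            continue
        if not part:
            continue
        for j, italic_part in enumerate(re.split(r'\*(.*?)\*', part)):
            if j % 2 == 1:
                # Italic content is likewise added as-is.
                segments.append((italic_part, 'italic'))
                continue
            if not italic_part:
                continue
            for k, code_part in enumerate(re.split(r'`(.*?)`', italic_part)):
                if k % 2 == 1:
                    segments.append((code_part, 'code'))
                elif code_part:
                    segments.append((code_part, 'plain'))
    return segments

segments = split_markdown_segments("Use **bold**, *italic* and `code`.")
for segment, style in segments:
    print(f"{style:6} {segment!r}")
```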

Return Value

This function returns None. It modifies the paragraph object in-place by adding formatted runs to it. Each run represents a segment of text with specific formatting applied (bold, italic, code font, or plain text).

Dependencies

  • re
  • python-docx

Required Imports

import re
from docx import Document

Conditional/Optional Imports

None. The re module is always required for the regex matching of markdown syntax, but the function imports it inside its own body, so callers do not need a module-level import.

Usage Example

from docx import Document
import re

def add_markdown_formatting_to_paragraph(paragraph, text):
    import re
    parts = re.split(r'\*\*(.*?)\*\*', text)
    for i, part in enumerate(parts):
        if i % 2 == 0:
            if part:
                italic_parts = re.split(r'\*(.*?)\*', part)
                for j, italic_part in enumerate(italic_parts):
                    if j % 2 == 0:
                        if italic_part:
                            code_parts = re.split(r'`(.*?)`', italic_part)
                            for k, code_part in enumerate(code_parts):
                                if k % 2 == 0:
                                    if code_part:
                                        paragraph.add_run(code_part)
                                else:
                                    run = paragraph.add_run(code_part)
                                    run.font.name = 'Courier New'
                    else:
                        run = paragraph.add_run(italic_part)
                        run.italic = True
        else:
            run = paragraph.add_run(part)
            run.bold = True

# Create a new Word document
doc = Document()

# Add a paragraph
para = doc.add_paragraph()

# Add formatted text
markdown_text = "This is **bold text**, this is *italic text*, and this is `code text`."
add_markdown_formatting_to_paragraph(para, markdown_text)

# Save the document
doc.save('formatted_document.docx')

Best Practices

  • Ensure the paragraph object is from a valid python-docx Document instance before calling this function
  • The function modifies the paragraph in-place, so no return value needs to be captured
  • Markdown syntax must be properly closed (e.g., **bold** not **bold) for correct parsing
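  • Nested formatting is not supported. Markers are processed in the order bold, italic, code, and each later pattern is applied only to text the earlier ones left unmatched, so the contents of a bold or italic span are added verbatim (e.g., **a *b* c** becomes a single bold run containing literal asterisks)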
  • The code font is hardcoded to 'Courier New' - modify the function if a different monospace font is needed
  • This function does not handle escaped markdown characters (e.g., \*\*) - the backslashes are not interpreted, so the asterisks can still be matched as formatting markers and the backslashes remain in the output
  • For complex markdown with multiple formatting types, test thoroughly as the regex-based parsing may have edge cases
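  • The function uses non-greedy (lazy) matching with .*?, so each opening marker pairs with the nearest closing marker; this works for most cases but may fail with nested identical markers. Because . does not match newlines by default, a formatted span cannot cross a line break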
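
Several of the edge cases above can be checked directly against the regex patterns used in the source, without python-docx. The sketch below applies the bold and italic patterns to three made-up inputs; the constant names BOLD and ITALIC are introduced here for readability and are not part of the original function.

```python
import re

# Patterns copied from the documented function; the names are illustrative.
BOLD = r'\*\*(.*?)\*\*'
ITALIC = r'\*(.*?)\*'

# Italic nested inside bold: the whole span is captured as bold content,
# and the inner asterisks stay in the text as literal characters.
nested = re.split(BOLD, "**bold with *italic* inside**")
print(nested)

# Unclosed bold marker: the bold pattern finds nothing, and the italic
# pattern then pairs the two asterisks into an empty italic span.
unclosed = re.split(ITALIC, "an **unclosed marker")
print(unclosed)

# Asterisks inside backticks: italic is handled before code, so the
# code span is broken apart and its backticks are left unmatched.
code_with_stars = re.split(ITALIC, "`a*b*c`")
print(code_with_stars)
```

Inputs like these are worth adding to a test suite before relying on the function for arbitrary markdown.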

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function add_inline_formatting_to_paragraph 89.5% similar

    Parses markdown-formatted text and applies inline formatting (bold, italic, code) to a Microsoft Word paragraph object using the python-docx library.

    From: /tf/active/vicechatdev/vice_ai/complex_app.py
  • function add_inline_formatting_to_paragraph_v1 76.3% similar

    Parses markdown-formatted text and adds it to a Word document paragraph, converting markdown links [text](url) into clickable hyperlinks while delegating other markdown formatting to a helper function.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function add_formatted_content_to_word 73.9% similar

    Converts processed markdown elements into formatted content within a Word document, handling headers, paragraphs, lists, tables, and code blocks with appropriate styling.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function add_formatted_content_to_word_v1 68.4% similar

    Converts processed markdown elements into formatted content within a Microsoft Word document, handling headers, paragraphs, lists, tables, and code blocks with appropriate styling.

    From: /tf/active/vicechatdev/vice_ai/complex_app.py
  • function convert_markdown_to_html 63.8% similar

    Converts basic markdown formatting (bold, italic, code) to HTML markup suitable for PDF generation using ReportLab.

    From: /tf/active/vicechatdev/vice_ai/complex_app.py