function add_markdown_formatting_to_paragraph
Parses markdown-formatted text and applies corresponding formatting (bold, italic, code) to runs within a python-docx paragraph object.
/tf/active/vicechatdev/vice_ai/new_app.py
3975 - 4003
moderate
Purpose
This function enables the conversion of markdown-style text formatting into Microsoft Word document formatting. It processes a text string containing markdown syntax (**bold**, *italic*, `code`) and adds appropriately formatted runs to a python-docx paragraph object. This is particularly useful when generating Word documents from markdown content or when you need to preserve text formatting from markdown sources in Word exports.
Source Code
def add_markdown_formatting_to_paragraph(paragraph, text):
    """Add markdown formatting (bold, italic, code) to a paragraph"""
    import re

    # Handle bold text **text**
    parts = re.split(r'\*\*(.*?)\*\*', text)
    for i, part in enumerate(parts):
        if i % 2 == 0:  # Regular text
            if part:
                # Check for italic within regular text
                italic_parts = re.split(r'\*(.*?)\*', part)
                for j, italic_part in enumerate(italic_parts):
                    if j % 2 == 0:  # Regular text
                        if italic_part:
                            # Check for code within regular text
                            code_parts = re.split(r'`(.*?)`', italic_part)
                            for k, code_part in enumerate(code_parts):
                                if k % 2 == 0:  # Regular text
                                    if code_part:
                                        paragraph.add_run(code_part)
                                else:  # Code text
                                    run = paragraph.add_run(code_part)
                                    run.font.name = 'Courier New'
                    else:  # Italic text
                        run = paragraph.add_run(italic_part)
                        run.italic = True
        else:  # Bold text
            run = paragraph.add_run(part)
            run.bold = True
Parameters
| Name | Type | Default | Kind |
|---|---|---|---|
| paragraph | - | - | positional_or_keyword |
| text | - | - | positional_or_keyword |
Parameter Details
paragraph: A python-docx paragraph object (from docx.text.paragraph.Paragraph) to which formatted text runs will be added. This object must be from an active Document instance and will be modified in-place.
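text: A string containing markdown-formatted text. Supports **bold** (double asterisks), *italic* (single asterisks), and `code` (backticks) syntax. Markers are resolved in that order: italic is only detected in text outside bold spans, and code is only detected in text outside both bold and italic spans. The content of a matched span is added as one run and is not parsed further.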
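The split order described above can be illustrated without python-docx. The following is a minimal sketch, not part of the original module: `classify_segments` is a hypothetical helper that reuses the same three regular expressions and returns (style, segment) pairs in place of adding runs to a paragraph.

```python
import re

def classify_segments(text):
    """Hypothetical helper: mirror the function's split order and
    return (style, segment) pairs instead of adding runs."""
    segments = []
    # re.split with one capture group alternates: outside, captured, outside, ...
    for i, part in enumerate(re.split(r'\*\*(.*?)\*\*', text)):
        if i % 2 == 1:
            # Bold content is emitted as-is, with no further parsing
            segments.append(("bold", part))
            continue
        for j, italic_part in enumerate(re.split(r'\*(.*?)\*', part)):
            if j % 2 == 1:
                # Italic content is also emitted as-is
                segments.append(("italic", italic_part))
                continue
            for k, code_part in enumerate(re.split(r'`(.*?)`', italic_part)):
                if k % 2 == 1:
                    segments.append(("code", code_part))
                elif code_part:
                    # Empty plain segments are skipped, as in the original
                    segments.append(("plain", code_part))
    return segments

result = classify_segments("a **b `c`** *d*")
print(result)
# [('plain', 'a '), ('bold', 'b `c`'), ('plain', ' '), ('italic', 'd')]
```

In this input the backticks sit inside a bold span, so they stay in the bold segment as literal characters and no code segment is produced.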
Return Value
This function returns None. It modifies the paragraph object in-place by adding formatted runs to it. Each run represents a segment of text with specific formatting applied (bold, italic, code font, or plain text).
Dependencies
- re
- python-docx
Required Imports
import re
from docx import Document
Conditional/Optional Imports
None. The function's only import, re, is always required: it is imported inside the function body and used to match the markdown patterns.
Usage Example
from docx import Document
import re

def add_markdown_formatting_to_paragraph(paragraph, text):
    import re
    parts = re.split(r'\*\*(.*?)\*\*', text)
    for i, part in enumerate(parts):
        if i % 2 == 0:
            if part:
                italic_parts = re.split(r'\*(.*?)\*', part)
                for j, italic_part in enumerate(italic_parts):
                    if j % 2 == 0:
                        if italic_part:
                            code_parts = re.split(r'`(.*?)`', italic_part)
                            for k, code_part in enumerate(code_parts):
                                if k % 2 == 0:
                                    if code_part:
                                        paragraph.add_run(code_part)
                                else:
                                    run = paragraph.add_run(code_part)
                                    run.font.name = 'Courier New'
                    else:
                        run = paragraph.add_run(italic_part)
                        run.italic = True
        else:
            run = paragraph.add_run(part)
            run.bold = True

# Create a new Word document
doc = Document()

# Add a paragraph
para = doc.add_paragraph()

# Add formatted text
markdown_text = "This is **bold text**, this is *italic text*, and this is `code text`."
add_markdown_formatting_to_paragraph(para, markdown_text)

# Save the document
doc.save('formatted_document.docx')
Best Practices
- Ensure the paragraph object is from a valid python-docx Document instance before calling this function
- The function modifies the paragraph in-place, so no return value needs to be captured
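- Close every marker pair (e.g., **bold**, not **bold). An unclosed ** is not kept as literal text: in a string such as "a **bold", the italic pattern matches the two asterisks as an empty italic span, so they are dropped from the output
- Formatting is not combined. Markers are resolved in a fixed order (bold, then italic, then code), and the content of a matched span is added as a single run without further parsing, so **bold *italic*** does not produce bold-italic text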
- The code font is hardcoded to 'Courier New' - modify the function if a different monospace font is needed
- This function does not handle escaped markdown characters (e.g., \*\*). The backslash is not interpreted, so an escaped asterisk can still be matched as a marker and the backslash stays in the output
- For complex markdown with multiple formatting types, test thoroughly as the regex-based parsing may have edge cases
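- The patterns use non-greedy (lazy) matching with .*?, so each opening marker pairs with the nearest closing marker. This works for most inputs but can mis-pair nested or adjacent identical markers. Because . does not match a newline, a span that crosses a line break is not recognized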
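Several of the edge cases in this list can be checked directly with re.split, without creating a document. The snippet below is an illustrative sketch: the names BOLD, BOLD_GREEDY, and ITALIC are introduced here for convenience, and BOLD_GREEDY is a deliberately altered pattern shown only for contrast, not something the function uses.

```python
import re

BOLD = r'\*\*(.*?)\*\*'        # pattern used by the function (lazy)
BOLD_GREEDY = r'\*\*(.*)\*\*'  # altered for contrast only (greedy)
ITALIC = r'\*(.*?)\*'          # pattern used by the function (lazy)

# Lazy matching pairs each opening marker with the nearest closing marker.
lazy = re.split(BOLD, "**a** and **b**")
print(lazy)        # ['', 'a', ' and ', 'b', '']

# A greedy group would swallow everything between the first and last marker.
greedy = re.split(BOLD_GREEDY, "**a** and **b**")
print(greedy)      # ['', 'a** and **b', '']

# An unclosed ** is consumed by the italic pattern as an empty italic span.
unclosed = re.split(ITALIC, "a **bold")
print(unclosed)    # ['a ', '', 'bold']

# Asterisks inside backticks are matched as italic before code is considered.
code_star = re.split(ITALIC, "`a*b*c`")
print(code_star)   # ['`a', 'b', 'c`']

# Backslash escapes are not interpreted; the backslashes remain in the text.
escaped = re.split(ITALIC, r"\*x\*")
print(escaped)     # ['\\', 'x\\', '']
```

Odd-indexed items in each result are the captured spans that the function would format; even-indexed items are passed on as regular text.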
Similar Components
AI-powered semantic similarity - components with related functionality:
- function add_inline_formatting_to_paragraph 89.5% similar
- function add_inline_formatting_to_paragraph_v1 76.3% similar
- function add_formatted_content_to_word 73.9% similar
- function add_formatted_content_to_word_v1 68.4% similar
- function convert_markdown_to_html 63.8% similar