function add_formatted_content_to_word_v1

Maturity: 45

Converts processed markdown elements into formatted content within a Microsoft Word document, handling headers, paragraphs, lists, tables, and code blocks with appropriate styling.

File: /tf/active/vicechatdev/vice_ai/complex_app.py
Lines: 1494 - 1527
Complexity: moderate

Purpose

This function serves as a markdown-to-Word converter that iterates through a list of parsed markdown elements and adds them to a Word document with proper formatting. It supports multiple element types including headers (up to 6 levels), paragraphs with inline formatting, bulleted and numbered lists, tables, and code blocks. The function is designed to be part of a document generation pipeline that converts markdown content into professionally formatted Word documents.

Source Code

def add_formatted_content_to_word(doc, elements):
    """Add processed markdown elements to Word document with proper formatting"""
    for element in elements:
        if element['type'] == 'header':
            level = min(element['level'], 6)  # Clamp to 6 levels (python-docx itself accepts 0-9)
            heading = doc.add_heading(element['content'], level)
            
        elif element['type'] == 'paragraph':
            para = doc.add_paragraph()
            add_inline_formatting_to_paragraph(para, element['content'])
            
        elif element['type'] == 'list_item':
            para = doc.add_paragraph(style='List Bullet')
            add_inline_formatting_to_paragraph(para, element['content'])
            
        elif element['type'] == 'numbered_list_item':
            para = doc.add_paragraph(style='List Number')
            add_inline_formatting_to_paragraph(para, element['content'])
            
        elif element['type'] == 'table':
            print(f"DEBUG: Adding table element with content: {element['content']}")
            add_table_to_word(doc, element['content'])
            
        elif element['type'] == 'code_block_start':
            # Handle code blocks (simplified for now)
            para = doc.add_paragraph(style='Normal')
            run = para.add_run(element['content'])
            run.font.name = 'Courier New'
            
        elif element['type'] == 'code_block_start':
            # NOTE: unreachable; this branch repeats the condition of the branch above
            para = doc.add_paragraph(style='Normal')
            run = para.add_run(element['content'])
            run.font.name = 'Courier New'

Parameters

Name      Type  Default  Kind
doc       -     -        positional_or_keyword
elements  -     -        positional_or_keyword

Parameter Details

doc: A python-docx document object representing the Word document to which formatted content will be added. This should be the object returned by calling docx.Document(), created before this function is called.

elements: A list of dictionaries where each dictionary represents a parsed markdown element. Each element must have a 'type' key (one of 'header', 'paragraph', 'list_item', 'numbered_list_item', 'table', 'code_block_start') and a 'content' key containing the text or data to be added. Header elements must also include an integer 'level' key; values above 6 are clamped to 6.
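The element structure described above can be checked before any content is written to the document. The following is a minimal sketch, not part of the documented module: the helper name find_element_problems and the constants REQUIRED_KEYS and SUPPORTED_TYPES are hypothetical, and the set of supported types is taken from the branches visible in the source code.

```python
# Hypothetical pre-flight check; names are illustrative, not from complex_app.py.
REQUIRED_KEYS = {'type', 'content'}
SUPPORTED_TYPES = {
    'header', 'paragraph', 'list_item',
    'numbered_list_item', 'table', 'code_block_start',
}


def find_element_problems(elements):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    for index, element in enumerate(elements):
        if not isinstance(element, dict):
            problems.append(
                f"element {index}: expected dict, got {type(element).__name__}"
            )
            continue
        missing = REQUIRED_KEYS - set(element)
        if missing:
            problems.append(f"element {index}: missing keys {sorted(missing)}")
            continue
        element_type = element['type']
        if not isinstance(element_type, str) or element_type not in SUPPORTED_TYPES:
            problems.append(f"element {index}: unsupported type {element_type!r}")
        elif element_type == 'header' and not isinstance(element.get('level'), int):
            problems.append(f"element {index}: header needs an integer 'level'")
    return problems
```

Calling find_element_problems(elements) first and refusing to proceed on a non-empty result keeps a malformed element from raising KeyError halfway through document generation.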

Return Value

This function does not return any value (implicitly returns None). It modifies the provided Word document object in-place by adding formatted content elements to it.

Dependencies

  • python-docx

Required Imports

from docx import Document
from docx.shared import Inches
from docx.shared import Pt
from docx.enum.text import WD_ALIGN_PARAGRAPH
from docx.enum.style import WD_STYLE_TYPE

Usage Example

from docx import Document

# Initialize a Word document
doc = Document()

# Define markdown elements
elements = [
    {'type': 'header', 'level': 1, 'content': 'Main Title'},
    {'type': 'paragraph', 'content': 'This is a paragraph with **bold** text.'},
    {'type': 'list_item', 'content': 'First bullet point'},
    {'type': 'numbered_list_item', 'content': 'First numbered item'},
    {'type': 'code_block_start', 'content': 'print("Hello World")'},
    # Table content shape is an assumption; check what add_table_to_word expects
    {'type': 'table', 'content': [['Header1', 'Header2'], ['Row1Col1', 'Row1Col2']]}
]

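# Add formatted content to document (the function and its helpers,
# add_inline_formatting_to_paragraph and add_table_to_word, must be in scope)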
add_formatted_content_to_word(doc, elements)

# Save the document
doc.save('output.docx')

Best Practices

  • Ensure the elements list is properly structured with required 'type' and 'content' keys before calling this function
  • The source contains two identical elif branches for the 'code_block_start' type; the second can never execute and should be removed
  • Header levels are clamped to a maximum of 6; this is the function's own limit (python-docx's add_heading accepts levels 0-9), and values below 1 are not clamped
  • The table branch contains a debug print statement that should be removed or replaced with proper logging in production
  • Requires the helper functions add_inline_formatting_to_paragraph and add_table_to_word to be available in scope
  • The function modifies the document in place and returns None, so keep your own reference to the document object and call its save method afterwards
  • Elements with an unrecognized type are silently skipped, and a missing 'type', 'content', or header 'level' key raises KeyError; consider validating elements or adding error handling
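Several of these points (logging instead of print, tolerance for malformed elements, explicit level clamping) can be combined in a small dispatcher. This is a sketch under stated assumptions rather than the module's actual design: render_elements, clamp_heading_level, and the handlers mapping are hypothetical names, and each handler is assumed to be a callable taking (doc, element).

```python
import logging

logger = logging.getLogger(__name__)


def clamp_heading_level(level, lowest=1, highest=6):
    """Clamp a heading level into the inclusive range [lowest, highest]."""
    return max(lowest, min(int(level), highest))


def render_elements(doc, elements, handlers):
    """Call handlers[element['type']](doc, element) for each well-formed element.

    Returns the number of elements skipped because they were malformed
    or had no registered handler.
    """
    skipped = 0
    for index, element in enumerate(elements):
        element_type = element.get('type') if isinstance(element, dict) else None
        handler = handlers.get(element_type)
        if handler is None or 'content' not in element:
            # Log and continue instead of raising or silently ignoring.
            logger.warning("Skipping element %d (type=%r)", index, element_type)
            skipped += 1
            continue
        logger.debug("Adding %s element", element_type)
        handler(doc, element)
    return skipped
```

Because every element type maps to a single dictionary entry, reviewing which types are handled means reading one mapping rather than a chain of elif conditions. In a real pipeline the handlers would wrap calls such as doc.add_heading(element['content'], clamp_heading_level(element['level'])).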

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function add_formatted_content_to_word 95.0% similar

    Converts processed markdown elements into formatted content within a Word document, handling headers, paragraphs, lists, tables, and code blocks with appropriate styling.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function add_inline_formatting_to_paragraph_v1 74.7% similar

    Parses markdown-formatted text and adds it to a Word document paragraph, converting markdown links [text](url) into clickable hyperlinks while delegating other markdown formatting to a helper function.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function add_formatted_content_to_pdf_v1 74.5% similar

    Converts processed markdown elements into formatted PDF content by adding paragraphs, headers, lists, and tables to a ReportLab story object with appropriate styling.

    From: /tf/active/vicechatdev/vice_ai/new_app.py
  • function add_inline_formatting_to_paragraph 74.0% similar

    Parses markdown-formatted text and applies inline formatting (bold, italic, code) to a Microsoft Word paragraph object using the python-docx library.

    From: /tf/active/vicechatdev/vice_ai/complex_app.py
  • function add_formatted_content_to_pdf 73.4% similar

    Processes markdown elements and adds them to a PDF document story with appropriate formatting, handling headers, paragraphs, lists, and tables.

    From: /tf/active/vicechatdev/vice_ai/complex_app.py