function main_v15

Maturity: 47

Converts a markdown file containing warranty disclosure data into multiple tabular formats (CSV, Excel, Word) with timestamped output files.

File: /tf/active/vicechatdev/convert_disclosures_to_table.py
Lines: 373 - 429
Complexity: moderate

Purpose

This function is the main entry point for a markdown-to-table conversion pipeline built for Project Victoria disclosure documents. It reads a markdown file from a hardcoded path, extracts warranty data with the helper extract_warranty_data(), and generates three kinds of report: CSV (summary and detailed), an Excel workbook, and a Word document. If the input file is missing or no warranties are extracted, it logs an error and returns early. Otherwise it logs progress and prints a console summary that lists each output file and notes any format skipped because its library is unavailable.

Source Code

def main():
    """Main function to convert markdown to tabular formats."""
    # Input and output paths
    input_file = Path('/tf/active/project_victoria_disclosures.md')
    output_dir = Path('/tf/active')
    
    # Check if input file exists
    if not input_file.exists():
        logger.error(f"Input file not found: {input_file}")
        return
    
    # Read markdown content
    logger.info(f"Reading markdown file: {input_file}")
    with open(input_file, 'r', encoding='utf-8') as f:
        content = f.read()
    
    # Extract warranty data
    logger.info("Extracting warranty data...")
    warranties = extract_warranty_data(content)
    
    if not warranties:
        logger.error("No warranties extracted from markdown file")
        return
    
    # Generate timestamp for output files
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    
    # Create CSV report
    csv_file = output_dir / f'project_victoria_disclosures_table_{timestamp}.csv'
    create_csv_report(warranties, csv_file)
    
    # Create Excel report
    excel_file = output_dir / f'project_victoria_disclosures_table_{timestamp}.xlsx'
    excel_created = create_excel_report(warranties, excel_file)
    
    # Create Word report
    word_file = output_dir / f'project_victoria_disclosures_{timestamp}.docx'
    word_created = create_word_report(warranties, word_file)
    
    # Print summary
    print("\n" + "="*60)
    print("PROJECT VICTORIA - DISCLOSURE TABLE CONVERSION COMPLETE")
    print("="*60)
    print(f"Total warranties processed: {len(warranties)}")
    print("Output files created:")
    print(f"  - Summary CSV: {csv_file}")
    print(f"  - Detailed CSV: {csv_file.with_name(csv_file.stem + '_detailed.csv')}")
    if excel_created:
        print(f"  - Excel report: {excel_file}")
    else:
        print("  - Excel report: SKIPPED (openpyxl not available)")
    if word_created:
        print(f"  - Word report: {word_file}")
    else:
        print("  - Word report: SKIPPED (python-docx not available)")
    print("\nFiles are ready for review and analysis!")
    print("="*60)

Return Value

Returns None in every case. The function works through side effects: it writes output files and prints a status summary to the console. If the input file is not found or no warranties are extracted, it logs an error and returns early, so the return value alone does not distinguish success from failure.
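
Because the return value carries no status, a caller that needs to confirm the result programmatically has to inspect the output directory. The following is a minimal sketch, assuming the filename prefix used in the source code above; list_outputs is a hypothetical helper and not part of the module.

```python
from pathlib import Path


def list_outputs(output_dir):
    """Group conversion outputs found in output_dir by file suffix.

    Hypothetical helper: it only matches names starting with
    'project_victoria_disclosures_', the prefix main() uses for its outputs.
    The input file 'project_victoria_disclosures.md' does not match.
    """
    found = {".csv": [], ".xlsx": [], ".docx": []}
    for path in sorted(Path(output_dir).glob("project_victoria_disclosures_*")):
        if path.suffix in found:
            found[path.suffix].append(path.name)
    return found
```

An empty list under ".xlsx" or ".docx" after a run would be consistent with the corresponding format having been skipped.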

Dependencies

  • pathlib
  • logging
  • datetime
  • re
  • csv
  • pandas
  • html
  • openpyxl (optional, needed for Excel output)
  • python-docx (optional, needed for Word output)

Required Imports

import re
import csv
import pandas as pd
from pathlib import Path
import html
import logging
from datetime import datetime
# Optional: the docx imports are only needed for Word output
from docx import Document
from docx.shared import Inches
from docx.enum.style import WD_STYLE_TYPE

Conditional/Optional Imports

These imports are only needed under specific conditions:

from docx import Document

Condition (optional): Required for Word document generation via create_word_report(). The function skips Word output if python-docx is not available.

from docx.shared import Inches

Condition (optional): Required for Word document formatting. The function skips Word output if python-docx is not available.

from docx.enum.style import WD_STYLE_TYPE

Condition (optional): Required for Word document styling. The function skips Word output if python-docx is not available.

import openpyxl

Condition (optional): Required for Excel file generation via create_excel_report(). The function skips Excel output if openpyxl is not available.
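
One common way to make an import optional is to guard it with try/except and record the result in a module-level flag. The sketch below illustrates that pattern only. The flag name OPENPYXL_AVAILABLE and the function write_excel_if_possible are assumptions for illustration, and the real create_excel_report() may be structured differently.

```python
try:
    import openpyxl
    OPENPYXL_AVAILABLE = True  # hypothetical flag name
except ImportError:
    openpyxl = None
    OPENPYXL_AVAILABLE = False


def write_excel_if_possible(warranties, excel_file):
    """Write one row per warranty dict; return False if openpyxl is missing.

    Illustrative stand-in, not the module's create_excel_report().
    """
    if not OPENPYXL_AVAILABLE:
        return False
    workbook = openpyxl.Workbook()
    sheet = workbook.active
    for warranty in warranties:
        sheet.append(list(warranty.values()))
    workbook.save(str(excel_file))
    return True
```

Returning a boolean rather than raising lets the caller report a skipped format in its summary, which matches how main() uses excel_created and word_created.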

Usage Example

# Ensure logger is configured
import logging
logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)

# Define required helper functions (simplified examples)
def extract_warranty_data(content):
    # Extract warranty data from markdown
    return [{'id': '1', 'title': 'Sample Warranty', 'description': 'Test'}]

def create_csv_report(warranties, csv_file):
    # Create CSV report
    pass

def create_excel_report(warranties, excel_file):
    # Create Excel report
    return True

def create_word_report(warranties, word_file):
    # Create Word report
    return True

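# main() (see Source Code above) must be defined in this same module, and the
# input file must exist at /tf/active/project_victoria_disclosures.md
main()

# With the real helpers in place of the stubs above, output files are created
# in /tf/active/ with timestamps: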
# - project_victoria_disclosures_table_YYYYMMDD_HHMMSS.csv
# - project_victoria_disclosures_table_YYYYMMDD_HHMMSS_detailed.csv
# - project_victoria_disclosures_table_YYYYMMDD_HHMMSS.xlsx
# - project_victoria_disclosures_YYYYMMDD_HHMMSS.docx
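
The filenames listed above follow directly from the strftime format and the pathlib calls in main(). The sketch below reproduces that naming for a fixed datetime so the result is predictable. The function output_paths is a hypothetical helper introduced here, and the detailed CSV path is the one main() prints in its summary.

```python
from datetime import datetime
from pathlib import Path


def output_paths(output_dir, now):
    """Build the four output paths main() reports for a given datetime."""
    timestamp = now.strftime("%Y%m%d_%H%M%S")
    out = Path(output_dir)
    csv_file = out / f"project_victoria_disclosures_table_{timestamp}.csv"
    return {
        "summary_csv": csv_file,
        # Same derivation as the summary print in main()
        "detailed_csv": csv_file.with_name(csv_file.stem + "_detailed.csv"),
        "excel": out / f"project_victoria_disclosures_table_{timestamp}.xlsx",
        "word": out / f"project_victoria_disclosures_{timestamp}.docx",
    }


paths = output_paths("/tf/active", datetime(2024, 1, 31, 9, 5, 7))
for label, path in paths.items():
    print(f"{label}: {path.name}")
# summary_csv: project_victoria_disclosures_table_20240131_090507.csv
# detailed_csv: project_victoria_disclosures_table_20240131_090507_detailed.csv
# excel: project_victoria_disclosures_table_20240131_090507.xlsx
# word: project_victoria_disclosures_20240131_090507.docx
```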

Best Practices

  • Ensure the input file path '/tf/active/project_victoria_disclosures.md' exists before calling this function
  • Configure logging before calling main() to capture progress and error messages
  • Ensure all helper functions (extract_warranty_data, create_csv_report, create_excel_report, create_word_report) are defined in the same module
  • The function uses hardcoded paths - consider refactoring to accept parameters for input/output paths for better reusability
  • Install optional dependencies (openpyxl, python-docx) for full functionality, though the function will gracefully skip unavailable formats
  • Output filenames include a second-resolution timestamp (YYYYMMDD_HHMMSS), which prevents overwriting earlier conversions unless two runs start within the same second
  • The function creates multiple output files - ensure sufficient disk space in the output directory
  • Review the console summary output to verify all expected files were created successfully
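
The refactoring suggested above, replacing the hardcoded paths with parameters, could look roughly like the following. This is a sketch under assumptions: the option names --input and --output-dir and the integer return codes are not in the original, and the conversion body is elided.

```python
import argparse
from pathlib import Path


def parse_args(argv=None):
    """Parse input/output locations; defaults mirror the hardcoded paths."""
    parser = argparse.ArgumentParser(
        description="Convert disclosures markdown to tabular formats."
    )
    parser.add_argument(
        "--input",
        type=Path,
        default=Path("/tf/active/project_victoria_disclosures.md"),
    )
    parser.add_argument("--output-dir", type=Path, default=Path("/tf/active"))
    return parser.parse_args(argv)


def main(input_file, output_dir):
    """Return 0 on success and 1 when the input file is missing."""
    if not input_file.exists():
        return 1
    output_dir.mkdir(parents=True, exist_ok=True)
    # ... extract warranties and write reports as in the original main() ...
    return 0
```

A script entry point could then call args = parse_args() followed by raise SystemExit(main(args.input, args.output_dir)), which also gives callers an exit status to check.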

Similar Components

AI-powered semantic similarity - components with related functionality:

  • function main_v8 94.7% similar

    Orchestrates the conversion of an improved markdown file containing warranty disclosures into multiple tabular formats (CSV, Excel, Word) with timestamp-based file naming.

    From: /tf/active/vicechatdev/improved_convert_disclosures_to_table.py
  • function main_v1 79.0% similar

    Main orchestration function that reads an improved markdown file and converts it to an enhanced Word document with comprehensive formatting, including table of contents, warranty sections, disclosures, and bibliography.

    From: /tf/active/vicechatdev/enhanced_word_converter_fixed.py
  • function create_enhanced_word_document 76.2% similar

    Converts markdown-formatted warranty disclosure content into a formatted Microsoft Word document with hierarchical headings, styled text, lists, and special formatting for block references.

    From: /tf/active/vicechatdev/improved_word_converter.py
  • function main_v18 74.5% similar

    Main entry point function that reads a markdown file, converts it to an enhanced Word document with preserved heading structure, and saves it with a timestamped filename.

    From: /tf/active/vicechatdev/improved_word_converter.py
  • function create_word_report 74.4% similar

    Generates a formatted Microsoft Word document report containing warranty disclosures with a table of contents, metadata, and structured sections for each warranty.

    From: /tf/active/vicechatdev/convert_disclosures_to_table.py