function test_markdown_link_parsing
A test function that exercises markdown link parsing: it extracts the link text and URL from a markdown link in the format produced by the Quill editor, then percent-encodes a complex URL containing special characters.
/tf/active/vicechatdev/test_complex_hyperlink.py
50 - 80
simple
Purpose
This function checks that a markdown link with a complex URL (containing special characters such as &, commas, encoded spaces, and a URL fragment) can be parsed, extracted, and encoded. It splits the markdown link syntax with a regular expression, takes the link text and URL from the captured groups, and percent-encodes everything after the domain while leaving the query and fragment delimiters (?, &, =, #) and existing percent-escapes intact. The function contains no assertions: each intermediate result is printed for manual inspection.
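The encoding step relies on the `safe` argument of `urllib.parse.quote`. The following minimal sketch shows how individual characters behave under the same safe set; the names `SAFE`, `samples`, and `encoded` are illustrative and do not appear in the original file.

```python
from urllib.parse import quote

# The safe set passed to quote() by the test function.
SAFE = "/?&=:#%"

# Illustrative inputs, one per character class of interest.
samples = ["a b", "a,b", "a+b", "a&b", "a%20b", "x?y=1#frag"]

# Map each sample to its encoded form.
encoded = {s: quote(s, safe=SAFE) for s in samples}

for original, result in encoded.items():
    print(f"{original!r} -> {result!r}")
```

Spaces, commas, and plus signs are percent-encoded, while the delimiters and the already-encoded `%20` pass through unchanged.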
Source Code
def test_markdown_link_parsing():
    """Test markdown link parsing with complex URLs"""
    print("\nTesting markdown link parsing...")
    # Test the exact format that would come from Quill editor
    markdown_text = "[3.5.1 Cost model for WBPK022&K024,K034_20240624.xlsx](https://filecloud.vicebio.com/ui/core/index.html?filter=3.5.1+Cost+model+for+WBPK022&K024,K034_20240624.xlsx#expl-tabl./SHARED/vicebio_shares/Wuxi/3%20WO-CO%20&%20invoice%20plan/3.5%20Cost%20Model/)"
    print(f"Input markdown: {markdown_text}")
    import re
    # Test URL extraction
    link_parts = re.split(r'\[([^\]]+)\]\(([^)]+)\)', markdown_text)
    print(f"Parsed parts: {link_parts}")
    if len(link_parts) >= 3:
        text = link_parts[1]
        url = link_parts[2]
        print(f"Extracted text: '{text}'")
        print(f"Extracted URL: '{url}'")
        # Test URL encoding
        import urllib.parse
        if '://' in url:
            scheme_and_domain, path_part = url.split('://', 1)
            if '/' in path_part:
                domain, path = path_part.split('/', 1)
                encoded_path = urllib.parse.quote(path, safe='/?&=:#%')
                clean_url = f"{scheme_and_domain}://{domain}/{encoded_path}"
                print(f"Cleaned URL: '{clean_url}'")
    print("✅ URL parsing test completed")
Return Value
This function does not return any value (implicitly returns None). It prints test results and status messages to stdout, including the input markdown, parsed parts, extracted text and URL, and the cleaned/encoded URL.
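Because nothing is returned, a caller cannot check the outcome programmatically. As a minimal sketch of an alternative, the extraction step could be wrapped in a helper that returns its result. The name `extract_markdown_link` and the `example.com` URL are hypothetical and not part of the original file; the regular expression is the one used above.

```python
import re

# Same pattern as the test function: group 1 is the link text, group 2 the URL.
LINK_PATTERN = re.compile(r'\[([^\]]+)\]\(([^)]+)\)')


def extract_markdown_link(markdown_text):
    """Return (text, url) for the first markdown link, or None if none is found."""
    match = LINK_PATTERN.search(markdown_text)
    if match is None:
        return None
    return match.group(1), match.group(2)


result = extract_markdown_link(
    "[report, v2.xlsx](https://example.com/files/report, v2.xlsx)"
)
print(result)
```

Returning a tuple lets a test runner compare the extracted values with `assert` instead of relying on someone reading the printed output.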
Dependencies
re
urllib.parse
Required Imports
import re
import urllib.parse
Usage Example
import re
import urllib.parse

def test_markdown_link_parsing():
    """Test markdown link parsing with complex URLs"""
    print("\nTesting markdown link parsing...")
    markdown_text = "[3.5.1 Cost model for WBPK022&K024,K034_20240624.xlsx](https://filecloud.vicebio.com/ui/core/index.html?filter=3.5.1+Cost+model+for+WBPK022&K024,K034_20240624.xlsx#expl-tabl./SHARED/vicebio_shares/Wuxi/3%20WO-CO%20&%20invoice%20plan/3.5%20Cost%20Model/)"
    print(f"Input markdown: {markdown_text}")
    link_parts = re.split(r'\[([^\]]+)\]\(([^)]+)\)', markdown_text)
    print(f"Parsed parts: {link_parts}")
    if len(link_parts) >= 3:
        text = link_parts[1]
        url = link_parts[2]
        print(f"Extracted text: '{text}'")
        print(f"Extracted URL: '{url}'")
        if '://' in url:
            scheme_and_domain, path_part = url.split('://', 1)
            if '/' in path_part:
                domain, path = path_part.split('/', 1)
                encoded_path = urllib.parse.quote(path, safe='/?&=:#%')
                clean_url = f"{scheme_and_domain}://{domain}/{encoded_path}"
                print(f"Cleaned URL: '{clean_url}'")
    print("✅ URL parsing test completed")

# Run the test
test_markdown_link_parsing()
Best Practices
- This is a test function meant for validation purposes, not production use
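- The regex pattern r'\[([^\]]+)\]\(([^)]+)\)' assumes well-formed markdown links; it does not handle nested brackets or escaped characters, and a URL that contains a closing parenthesis is cut off at the first ')'
- The safe set '/?&=:#%' leaves delimiters and existing percent-escapes untouched, but '+' and ',' are not in it, so in the sample URL the '+' separators in the query string become %2B and the commas become %2C; adjust the safe set if those characters must survive
- The function assumes the URL contains a '://' scheme separator and at least one '/' after the domain; if either is missing, no cleaned URL is printed and nothing signals the omission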
- For production code, consider using a dedicated markdown parsing library instead of regex
- The function prints directly to stdout; consider using logging or returning results for better testability
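One way to address the safe-set and URL-shape caveats above is to split the URL into components and encode each one with its own safe set. The sketch below uses `urllib.parse.urlsplit` and `urlunsplit`; the helper name `encode_url_components`, the per-component safe sets, and the `example.com` URL are assumptions chosen for illustration, not taken from the original code.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url_components(url):
    """Percent-encode path, query, and fragment separately (hypothetical helper)."""
    parts = urlsplit(url)
    # '%' stays safe everywhere so existing escapes are not double-encoded.
    path = quote(parts.path, safe="/%")
    # '+' stays safe in the query so form-style spaces survive unchanged.
    query = quote(parts.query, safe="=&+%")
    fragment = quote(parts.fragment, safe="/%&")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


cleaned = encode_url_components(
    "https://example.com/a b/c.html?filter=x+y&z=1,2#frag/part one"
)
print(cleaned)
```

Splitting first means a URL without a path still round-trips, and each component can keep the characters that carry meaning there (for example '+' in the query) without loosening the rules for the others.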
Similar Components
AI-powered semantic similarity - components with related functionality:
- function test_markdown_processing (73.0% similar)
- function test_complex_url_hyperlink (64.7% similar)
- function convert_markdown_to_html_v1 (60.0% similar)
- function format_inline_markdown (58.6% similar)
- function html_to_markdown_v1 (57.0% similar)