Markdown Proxy - URL to Markdown
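Convert any URL into clean Markdown. Supports pages that require login, PDFs, and proprietary platforms.
URL Routing (decide first, then execute)
After receiving a URL, determine its type first; each type goes through a different channel:

| URL Pattern | Route To | Reason |
|---|---|---|
| mp.weixin.qq.com | scripts/fetch_weixin.py | WeChat Official Account articles must be fetched with Playwright |
| feishu.cn/docx/, feishu.cn/wiki/, larksuite.com/docx/ | scripts/fetch_feishu.py | Requires Feishu API authentication |
| youtube.com, youtu.be | yt-search-download skill | YouTube has a dedicated toolchain |
| .pdf (URL or local path) | scripts/extract_pdf.sh | Dedicated PDF extraction |
| All other URLs | scripts/fetch.sh | Proxy cascade with automatic fallback |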
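
The routing decision in the table can be sketched as a small function. This is only an illustration: the route targets are copied from the table, but the function name `route`, the subdomain matching, and the treatment of scheme-less input are assumptions, not the skill's actual implementation.

```python
from urllib.parse import urlparse


def route(url: str) -> str:
    """Return the route target from the table for a URL or local path."""
    # Assumption: input without a scheme is parsed as if it were https.
    parsed = urlparse(url if "://" in url else "https://" + url)
    host = (parsed.hostname or "").lower()
    path = parsed.path

    def host_is(domain: str) -> bool:
        # Match the domain itself or any of its subdomains.
        return host == domain or host.endswith("." + domain)

    if host_is("mp.weixin.qq.com"):
        return "scripts/fetch_weixin.py"
    if (host_is("feishu.cn") and path.startswith(("/docx/", "/wiki/"))) or (
        host_is("larksuite.com") and path.startswith("/docx/")
    ):
        return "scripts/fetch_feishu.py"
    if host_is("youtube.com") or host_is("youtu.be"):
        return "yt-search-download skill"

    # Local paths have no host, so test the .pdf suffix on the raw string
    # after dropping any query string or fragment.
    bare = url.split("#", 1)[0].split("?", 1)[0]
    if bare.lower().endswith(".pdf"):
        return "scripts/extract_pdf.sh"

    return "scripts/fetch.sh"
```

Checking the rules in table order keeps the sketch easy to compare against the table; anything that matches no specific row falls through to `scripts/fetch.sh`.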
Workflow
Step 1: Route by URL Type
Step 2: Display Content
After fetching, show the content to the user.
Step 3: Save File (Default)
Save to ~/Downloads/{title}.md with YAML frontmatter by default.
- Filename: use article title, remove special characters
- Format: YAML frontmatter (title, author, date, url, source) + Markdown body
- Tell user the saved path
- Skip only if user says “just preview” or “don’t save”
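
A minimal sketch of the save step, assuming a simple character blacklist for filenames and double-quoted YAML string values. The helper names (`safe_filename`, `build_document`, `save_markdown`) and the exact set of characters removed are hypothetical; only the frontmatter fields and the default `~/Downloads` location come from the list above.

```python
import re
from pathlib import Path
from typing import Optional

FRONTMATTER_KEYS = ("title", "author", "date", "url", "source")


def safe_filename(title: str) -> str:
    """Strip characters that are invalid or awkward in filenames (assumed set)."""
    cleaned = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "", title)
    cleaned = re.sub(r"\s+", " ", cleaned).strip(" .")
    return cleaned or "untitled"


def build_document(meta: dict, body: str) -> str:
    """Prepend YAML frontmatter to the Markdown body."""
    lines = ["---"]
    for key in FRONTMATTER_KEYS:
        value = str(meta.get(key, ""))
        # Escape backslashes and double quotes for a double-quoted YAML scalar.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{key}: "{escaped}"')
    lines.append("---")
    return "\n".join(lines) + "\n\n" + body.rstrip() + "\n"


def save_markdown(meta: dict, body: str, out_dir: Optional[Path] = None) -> Path:
    """Write the document to out_dir (default ~/Downloads) and return its path."""
    out_dir = out_dir or Path.home() / "Downloads"
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{safe_filename(meta.get('title', ''))}.md"
    path.write_text(build_document(meta, body), encoding="utf-8")
    return path
```

The path returned by `save_markdown` is what would be reported back to the user.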
Examples
General URL
X/Twitter Post
WeChat Article
Feishu Document
PDF (Remote)
PDF (Local)
With Custom Proxy
Notes
- r.jina.ai and defuddle.md require no API key
- fetch.sh handles proxy cascade with automatic fallback
- Content validation: filters error pages, requires >5 lines
- WeChat script requires: pip install playwright beautifulsoup4 lxml && playwright install chromium
- Feishu script requires: FEISHU_APP_ID + FEISHU_APP_SECRET env vars
- PDF extraction tries: marker-pdf → pdftotext → pypdf
- For detailed method documentation, see references/methods.md
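
The cascade-with-validation behaviour described in the notes can be illustrated roughly as follows. This is a sketch under stated assumptions, not the logic of fetch.sh itself: the error markers are hypothetical, "lines" is taken to mean non-empty lines, and the fetchers are passed in as plain callables so that no real proxy API is implied.

```python
from typing import Callable, Iterable, Optional

# Hypothetical markers; the real error-page filter is not shown in this document.
ERROR_MARKERS = ("404 not found", "access denied", "captcha")


def looks_valid(markdown: str) -> bool:
    """Accept content with more than 5 non-empty lines and no error marker up front."""
    lines = [ln for ln in markdown.splitlines() if ln.strip()]
    if len(lines) <= 5:
        return False
    # Only inspect the first few lines so an article that merely mentions
    # one of these phrases further down is not rejected.
    head = "\n".join(lines[:3]).lower()
    return not any(marker in head for marker in ERROR_MARKERS)


def fetch_with_fallback(
    url: str, fetchers: Iterable[Callable[[str], str]]
) -> Optional[str]:
    """Try each fetcher in order; return the first result that validates."""
    for fetch in fetchers:
        try:
            text = fetch(url)
        except Exception:
            # A failing fetcher just means: move on to the next one.
            continue
        if looks_valid(text):
            return text
    return None
```

Each fetcher would wrap one method (for example r.jina.ai or defuddle.md); the caller receives `None` only when every method fails or returns content that does not validate.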