Converting Between HTML and Markdown in Python

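Preface

Typora makes it easy to export Markdown as HTML. I had long wanted to go the other way and restore HTML to Markdown, so I collected a few methods found online for later use.

If you only need to convert a single file, an online converter is the simplest option: Link Link Link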

1. html2text

pip install html2text

Conversion code:

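import html2text

# Read the source HTML; the with block closes the file afterwards
with open('ret.html', 'r', encoding='utf-8') as file:
    html_text = file.read()

# html2text.html2text takes an HTML string and returns Markdown
markdown = html2text.html2text(html_text)

with open('make2.md', 'w', encoding='utf-8') as file:
    file.write(markdown)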
    

2. html2markdown

pip install html2markdown

Conversion code:

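import html2markdown

# Read the source HTML; the with block closes the file afterwards
with open('ret.html', 'r', encoding='utf-8') as file:
    html_text = file.read()

# html2markdown.convert takes an HTML string and returns Markdown
markdown = html2markdown.convert(html_text)

with open('make3.md', 'w', encoding='utf-8') as file:
    file.write(markdown)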

In my testing, the html2text module gave acceptable conversion results.
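To make a comparison like this repeatable, a small helper can run the same HTML through several converters and collect the results. This is only a minimal sketch and not part of the original test: `compare_converters` and the stand-in converters are hypothetical names. With the libraries installed, one would presumably pass `html2text.html2text` and `html2markdown.convert`, since both are used above as functions that take an HTML string and return Markdown.

```python
def compare_converters(html, converters):
    """Run each converter on the same HTML and collect the Markdown output.

    `converters` maps a label to a callable that takes an HTML string and
    returns a Markdown string. A converter that raises is recorded as an
    error message instead of aborting the whole comparison.
    """
    results = {}
    for label, convert in converters.items():
        try:
            results[label] = convert(html)
        except Exception as exc:  # keep going so the other outputs are still shown
            results[label] = 'ERROR: {}'.format(exc)
    return results


if __name__ == '__main__':
    # Stand-in converters (hypothetical) so the sketch runs without
    # third-party packages. With the libraries installed, one could pass:
    #   {'html2text': html2text.html2text, 'html2markdown': html2markdown.convert}
    demo = {
        'upper': lambda html: html.upper(),
        'broken': lambda html: 1 / 0,
    }
    for label, output in compare_converters('<h1>Title</h1>', demo).items():
        print('--- {} ---'.format(label))
        print(output)
```

Passing the converters in as plain callables keeps the helper independent of any one library, so another converter can be added to the comparison without changing the function.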

3. pandoc

pip install pandoc

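Open cmd in the directory that contains the files to convert. The commands below invoke the pandoc command-line program, so the pandoc executable itself must be installed and on your PATH.

Convert Markdown to HTML: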

pandoc -f markdown -t html -o a.html a.md

Convert HTML to Markdown:

pandoc -f html -t markdown -o b.md b.html
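The same two commands can also be driven from a Python script. The sketch below is one way to do it, assuming the pandoc executable is on PATH. `build_pandoc_command` and `run_pandoc` are hypothetical helper names, and only the `-f`, `-t` and `-o` flags shown above are used.

```python
import shutil
import subprocess


def build_pandoc_command(src, dst, from_format, to_format):
    """Return the pandoc argument list for converting src into dst."""
    return ['pandoc', '-f', from_format, '-t', to_format, '-o', dst, src]


def run_pandoc(src, dst, from_format, to_format):
    """Run pandoc, failing early with a clear message if it is not installed."""
    if shutil.which('pandoc') is None:
        raise RuntimeError('pandoc executable not found on PATH')
    # check=True raises CalledProcessError if pandoc exits with an error
    subprocess.run(build_pandoc_command(src, dst, from_format, to_format), check=True)


if __name__ == '__main__':
    # Same arguments as: pandoc -f html -t markdown -o b.md b.html
    print(' '.join(build_pandoc_command('b.html', 'b.md', 'html', 'markdown')))
```

Calling `run_pandoc('a.md', 'a.html', 'markdown', 'html')` would perform the Markdown to HTML direction. Passing the arguments as a list rather than one shell string means file names with spaces need no extra quoting.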


4. Batch processing

To convert several files at once, sample code:

HTML to Markdown:

import os
import subprocess

path = r'file path'  # folder containing the .html files
for file in os.listdir(path):
    if file.endswith('.html'):
        name = os.path.splitext(file)[0]
        # Run pandoc inside `path`; an argument list avoids shell quoting issues
        subprocess.run(['pandoc', '-f', 'html', '-t', 'markdown', '-o', name + '.md', file], cwd=path)

The same batch conversion with html2text, walking subdirectories as well:

"""
@Author: ZS
@CSDN : https://zsyll.blog.csdn.net/
@Time : 2021/11/25 12:36
"""
import html2text
import os

# Walk the directory tree and write a .md file next to every .html file
for root, dirs, files in os.walk(r'E:\Python资料', topdown=True):
    for name in files:
        if name.endswith('.html'):
            src = os.path.join(root, name)
            dst = os.path.join(root, os.path.splitext(name)[0] + '.md')
            with open(src, encoding='utf-8') as html, open(dst, 'w', encoding='utf-8') as md:
                md.write(html2text.html2text(html.read()))
            # os.path.splitext returns a tuple, so print the file name itself
            print(name + ' converted successfully!')
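The directory walk can also be wrapped in a reusable function. The following is a minimal sketch under a few assumptions of my own: `convert_tree` and its `overwrite` parameter are hypothetical, and the converter is passed in as a callable (for example `html2text.html2text`) so the function itself depends only on the standard library.

```python
from pathlib import Path


def convert_tree(root, convert, overwrite=False):
    """Convert every .html file under root to a .md file beside it.

    `convert` is any callable that maps an HTML string to a Markdown string.
    Returns the list of .md paths that were written.
    """
    written = []
    for src in sorted(Path(root).rglob('*.html')):
        dst = src.with_suffix('.md')
        if dst.exists() and not overwrite:
            continue  # leave existing Markdown files untouched
        dst.write_text(convert(src.read_text(encoding='utf-8')), encoding='utf-8')
        written.append(dst)
    return written
```

A call such as `convert_tree(r'E:\Python资料', html2text.html2text)` would then cover the same folder as the script above. Skipping files that already have a `.md` counterpart is a design choice of this sketch, intended to avoid overwriting hand-edited Markdown on a second run.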


Keep going!

Thank you!

Keep working hard!

Copyright notice
This article was written by [ZSYL]. Please include a link to the original when reposting. Thank you.
https://blog.csdn.net/qq_46092061/article/details/121537820
