Using PHP's Markdown Parser Parsedown from Python

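I needed to convert Markdown to HTML in Python, but the popular packages I tried all produced broken output. PHP's Parsedown converted the same text correctly, so the problem lay in the Python packages rather than in my Markdown. Since this is only for a personal tool and runtime performance doesn't matter, I simply call PHP from Python and let Parsedown do the conversion.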

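The Python packages I tried were markdown, markdown2, and misaka. The trouble spot was a very long code block: probably because its content is fairly complex, none of the three packages recognized it as a code block, and each inserted many paragraph breaks inside it, thoroughly wrecking the formatting. All three are among the more popular Markdown packages for Python, yet none of them could handle my case. It seems Python's ecosystem in the HTML/Web area still lags behind PHP's by a fair margin.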

For the record, here is how each of the three packages is used:

mdtext = '**hello world!**'

import markdown
html = markdown.markdown(mdtext)

import markdown2
html = markdown2.markdown(mdtext)

import misaka
html = misaka.html(mdtext)

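Here is the final approach: a PHP file loads Parsedown to do the conversion, and Python calls that PHP file through subprocess. Because the length of a command line passed through subprocess is limited, the text to convert is first written to a temporary file. The temporary file's path is passed to the PHP file as an argument, and the PHP file reads the file's contents and converts them.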

The relevant code in the Python file:

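import io
import os
import subprocess

def file_in_samedir(filename):
    # Build a path to a file that sits next to this script
    return os.path.join(os.path.dirname(os.path.realpath(__file__)), filename)

mdtext = '** hello world **'
tmpfile = file_in_samedir("tmpfile")
# Write the Markdown to the temporary file; the with-block closes it before php reads it
with io.open(tmpfile, "w", encoding="utf-8") as f:
    f.write(mdtext)
# Pass the command as a list (no shell) so paths containing spaces are handled correctly
output = subprocess.check_output(["php", file_in_samedir("md2html.php"), "-f", tmpfile])
html = output.decode("utf-8", "ignore")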

The PHP file md2html.php, placed in the same directory as Parsedown.php:

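<?php
// Load Parsedown from the same directory as this script
require_once __DIR__ . '/Parsedown.php';
$Parsedown = new Parsedown();
// Path of the Markdown file, passed as: php md2html.php -f <path>
$f = getopt('f:')['f'];
$c = file_get_contents($f);
$c = $Parsedown->text($c);
echo $c;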
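
The Python side of the approach above could also be wrapped in a reusable function. The following is only a sketch under a few assumptions: it targets Python 3, the helper name run_converter and its command parameter are made up for illustration, and it uses the standard tempfile module instead of a fixed "tmpfile" next to the script, so the temporary file gets a unique name and is deleted afterwards.

```python
import os
import subprocess
import tempfile


def run_converter(mdtext, command):
    """Write mdtext to a temp file, run `command -f <tempfile>`, return stdout as text.

    `command` is the converter invocation as a list, for example
    ["php", "/path/to/md2html.php"]. Both this helper and its parameter
    are hypothetical names, not part of the original script.
    """
    # delete=False lets us close (and flush) the file before the child process reads it
    with tempfile.NamedTemporaryFile(
        "w", suffix=".md", encoding="utf-8", delete=False
    ) as tmp:
        tmp.write(mdtext)
        tmp_path = tmp.name
    try:
        result = subprocess.run(
            list(command) + ["-f", tmp_path],
            stdout=subprocess.PIPE,
            check=True,  # raise CalledProcessError if the converter exits non-zero
        )
        return result.stdout.decode("utf-8", "ignore")
    finally:
        os.remove(tmp_path)  # always clean up the temporary file
```

With the files from this post, a call would look like run_converter(mdtext, ["php", file_in_samedir("md2html.php")]). Passing the command as a list keeps the function independent of where php or the script lives.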

Where to get Parsedown:

Parsedown


-- EOF --

Last modified on 2019-05-22
