MediaWiki
https://dev.to/wizlee/sanitize-and-convert-html-to-markdown-for-importing-notes-into-joplin-4537
Step 2
To convert the sanitized HTML into Markdown, pandoc is used. Pandoc is a command-line tool, so an external library, pypandoc, is installed with pip to make pandoc easier to call from Python.
pip install pypandoc
Once pypandoc is installed, the code snippet below shows how to call it to convert HTML into Markdown.
import pypandoc

# html_path: the sanitized HTML file to read; md_path: the Markdown file to write
pypandoc.convert_file(html_path,
                      'markdown+pipe_tables+backtick_code_blocks-markdown_attribute',
                      format='html',
                      outputfile=md_path)
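For readers who want to wrap the call in a reusable function, here is a minimal sketch. The helper names `markdown_path_for` and `convert_html_to_markdown` are hypothetical and not from the original post, and the sketch assumes the Markdown file should sit next to the HTML file with the extension swapped. It also assumes pandoc itself is installed and on the PATH.

```python
from pathlib import Path

# Format string taken from the snippet above.
PANDOC_FORMAT = "markdown+pipe_tables+backtick_code_blocks-markdown_attribute"


def markdown_path_for(html_path):
    """Return the output path: same location as the input, '.md' extension (assumed convention)."""
    return Path(html_path).with_suffix(".md")


def convert_html_to_markdown(html_path, md_path=None):
    """Convert one HTML file to Markdown with pypandoc and return the output path."""
    # Imported here so the path helper above works even without pypandoc installed.
    import pypandoc

    md_path = Path(md_path) if md_path is not None else markdown_path_for(html_path)
    pypandoc.convert_file(
        str(html_path),
        PANDOC_FORMAT,
        format="html",
        outputfile=str(md_path),
    )
    return md_path
```

Calling `convert_html_to_markdown("notes/page.html")` would then write `notes/page.md`, provided the input file exists.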
The pandoc documentation lists all the supported input and output formats. If you are curious about the 'plus' and 'minus' strings after the format name, they enable or disable pandoc extensions respectively. The Markdown files generated with these extensions gave the best results when imported into Joplin. See the extensions section of the pandoc documentation for more details on each one.
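To make the plus and minus syntax concrete, the sketch below builds the same format string from a base format and two lists of extension names. The function `pandoc_format` is a hypothetical helper for illustration only and is not part of pypandoc or pandoc.

```python
def pandoc_format(base, enable=(), disable=()):
    """Compose a pandoc format spec: base name, then +ext to enable and -ext to disable."""
    parts = [base]
    parts += [f"+{name}" for name in enable]   # extensions to turn on
    parts += [f"-{name}" for name in disable]  # extensions to turn off
    return "".join(parts)


spec = pandoc_format(
    "markdown",
    enable=["pipe_tables", "backtick_code_blocks"],
    disable=["markdown_attribute"],
)
print(spec)  # markdown+pipe_tables+backtick_code_blocks-markdown_attribute
```

The resulting string can be passed as the second argument to `pypandoc.convert_file`, which makes it easier to experiment with different extension combinations.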