supersongssr / blog2md

This project forked from palaniraja/blog2md

JavaScript 100.00%

blog2md's Introduction

Blogger to Markdown

Convert Blogger & WordPress backup blog posts to Hugo-compatible Markdown documents

Usage: node index.js b|w <BACKUP XML> <OUTPUT DIR> [m|s] [paragraph-fix]

For Blogger imports, blog posts and comments (as a separate file, <postname>-comments.md) are created in the "out" directory

node index.js b your-blogger-backup-export.xml out

For WordPress imports, blog posts and comments (as a separate file, <postname>-comments.md) are created in the "out" directory

node index.js w your-wordpress-backup-export.xml out

If you want the comments merged into the post file itself, add the m flag at the end. The default is s, which writes a separate comments file

node index.js w your-wordpress-backup-export.xml out m
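The difference between the s and m modes can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper name planOutputFiles, the <postname>.md name for the post file, and the blank line used to join post and comments are all assumptions; only the <postname>-comments.md name comes from the text above.

```javascript
const path = require('path');

// Hypothetical helper (not from index.js): decide which files to write
// for one post, depending on the comments mode ('s' = separate, 'm' = merged).
function planOutputFiles(outputDir, postName, postMarkdown, commentsMarkdown, mode = 's') {
  // Assumption: the post itself is written to <postname>.md.
  const postFile = path.join(outputDir, `${postName}.md`);

  // A post without comments always produces a single file.
  if (!commentsMarkdown) {
    return [{ file: postFile, content: postMarkdown }];
  }

  // Merged mode: append the comments to the post file.
  if (mode === 'm') {
    return [{ file: postFile, content: `${postMarkdown}\n\n${commentsMarkdown}` }];
  }

  // Separate mode (default): comments go to <postname>-comments.md.
  return [
    { file: postFile, content: postMarkdown },
    { file: path.join(outputDir, `${postName}-comments.md`), content: commentsMarkdown },
  ];
}
```

Returning a plan instead of writing files directly keeps the sketch easy to inspect before anything touches the disk.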

When converting from WordPress, if some of your posts do not contain HTML, add the paragraph-fix flag at the end.

node index.js w your-wordpress-backup-export.xml out m paragraph-fix
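One plausible reading of paragraph-fix is that plain-text posts need their blank-line paragraph breaks turned into paragraph tags before the HTML-to-Markdown step, otherwise the breaks would be lost. The sketch below illustrates that idea only; the function name wrapPlainTextParagraphs and the HTML detection regex are assumptions, and the project's own implementation may differ.

```javascript
// Hypothetical sketch (not from index.js): wrap blank-line separated
// paragraphs of a plain-text post in <p> tags; leave HTML posts untouched.
function wrapPlainTextParagraphs(content) {
  // Assumption: anything that looks like an opening or closing tag means
  // the post already contains HTML.
  const looksLikeHtml = /<\/?[a-z][^>]*>/i.test(content);
  if (looksLikeHtml) {
    return content;
  }

  return content
    .split(/(?:\r?\n){2,}/)            // split on one or more blank lines
    .map((paragraph) => paragraph.trim())
    .filter((paragraph) => paragraph.length > 0)
    .map((paragraph) => `<p>${paragraph}</p>`)
    .join('\n\n');
}
```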

Installation (usual node project)

  • Download or Clone this project
  • cd to directory
  • Run npm install to install dependencies
  • Run node index.js <arg...>
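The <arg...> placeholder stands for the positional arguments from the Usage section. As a rough sketch of how they could be read from the command line (the parseArgs helper and the option names are hypothetical, not taken from index.js, and unlike the README this sketch accepts the optional flags in any order after the output directory):

```javascript
// Hypothetical helper (not from index.js): turn the command-line
// arguments into an options object.
function parseArgs(argv) {
  const [source, inputFile, outputDir, ...flags] = argv;

  if (source !== 'b' && source !== 'w') {
    throw new Error('First argument must be b (Blogger) or w (WordPress)');
  }
  if (!inputFile || !outputDir) {
    throw new Error('Both a backup XML file and an output directory are required');
  }

  return {
    source: source === 'b' ? 'blogger' : 'wordpress',
    inputFile,
    outputDir,
    // Comments are written to a separate file unless m is given.
    mergeComments: flags.includes('m'),
    paragraphFix: flags.includes('paragraph-fix'),
  };
}

// Possible usage inside a script:
// const options = parseArgs(process.argv.slice(2));
```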

Notes to self

Script to convert posts from Blogger to Markdown.

  • Read XML
  • Parse Entries (Posts and comments) (with xpath?)
  • Parse Title, Link, Created, Updated, Content
  • List posts and their respective comment counts
  • Content to MD - pandoc?
  • Parse Images, Files, Videos linked to the posts
  • Create output dir
  • List items that were not (or could not be) downloaded, along with their .md file, so the user can follow up
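The "list posts and comment counts" step from the notes above could look roughly like this once the XML has been parsed. The entry shape used here (id, kind, title, inReplyTo) is a hypothetical, already-normalized form chosen for illustration; it is not the structure of the Blogger export itself, and countCommentsPerPost is not a function from the project.

```javascript
// Hypothetical sketch: count comments per post from normalized entries.
// Assumed entry shape: { id, kind: 'post' | 'comment', title, inReplyTo }
function countCommentsPerPost(entries) {
  const byPostId = new Map();

  // First pass: register every post with a zero count.
  for (const entry of entries) {
    if (entry.kind === 'post') {
      byPostId.set(entry.id, { title: entry.title, comments: 0 });
    }
  }

  // Second pass: attribute each comment to the post it replies to.
  // Comments pointing at an unknown post are ignored.
  for (const entry of entries) {
    if (entry.kind === 'comment' && byPostId.has(entry.inReplyTo)) {
      byPostId.get(entry.inReplyTo).comments += 1;
    }
  }

  return [...byPostId.values()];
}
```

Two passes keep the result correct even if a comment appears before its post in the export.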

Reasons

  • Wrote this to consolidate and convert my blogs under one roof.
  • Plain simple workflow with hugo
  • The idea was to download the associated assets (images/files) linked to each post. Gave up because it was time-consuming, and I would still need to validate the converted Markdown against its assets anyway, so I don't see the benefit.
  • Initial assumption was to parse with xpath, but I found xml2json.js was easier
  • Also thought pandoc was overkill, and turndown.js was successful, though I had to wrap empty text to md instead of html.
  • I want to retain comments. Believe it or not, there were some good comments.
  • Was sick and spent around 12 hrs over 5 days coding and testing with my blog contents, over ~150 posts. Also, I find parsing oddly satisfying when it results in success. ¯\_(ツ)_/¯

blog2md's People

Contributors

palaniraja, coliff, ct-martin, jamesskemp, dependabot[bot], chathuras, lonelydev, joshuaulrich, foolip, amansh39
