wp2middleman's Introduction

wp2middleman

A command line tool to help move a Wordpress blog to Middleman.

wp2middleman migrates the posts contained in a Wordpress XML export file to Middleman-style markdown files.

Installation

gem install wp2middleman

Commandline Usage

wp2mm some_wordpress_export.xml

Results in YYYY-MM-DD-Some-Title.html.markdown, formatted as such:

---
title: 'Some Title'
date: YYYY-MM-DD
tags: foo, bar
---

<p>The post content in HTML or text, depending on how it was saved to Wordpress.</p>
<ul>
<li>list item</li>
<li>another list item</li>
</ul>

Optional Parameters

--body_to_markdown                         converts to markdown
--include_fields=FIELD_ONE FIELD_TWO ETC   includes specific fields in frontmatter

Convert to Markdown

wp2mm some_wordpress_export.xml --body_to_markdown true

Results in YYYY-MM-DD-Some-Title.html.markdown, formatted as such:

---
title: 'Some Title'
date: YYYY-MM-DD
tags: foo, bar
---

The post content in markdown or text, depending on how it was saved to Wordpress.

* list item
* another list item

Include specific post fields

wp2mm some_wordpress_export.xml --include_fields wp:post_id link

Pulls the specified key/value pairs out of the post XML and includes them in the frontmatter:

---
title: 'Some Title'
date: YYYY-MM-DD
tags: foo, bar
wp:post_id: '280'
link: http://somewebsite.com/2012/10/some-title
---

<p>The post content in HTML or text, depending on how it was saved to Wordpress.</p>
<ul>
<li>list item</li>
<li>another list item</li>
</ul>

wp2middleman's People

Contributors

bensheldon, brianauton, jgarber, justincampbell, mdb, monfresh, natedavisolds, ronen

wp2middleman's Issues

Single quotes within frontmatter titles are not escaped

If the title of a Wordpress post contains a single quote, the quote isn't escaped in the frontmatter and the Middleman parser throws a fit. I think the single-quote wrapping could be done away with entirely (YAML is pretty resilient).
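As an alternative to dropping the quotes, the title could keep its single-quote wrapping and escape embedded quotes the way YAML expects, by doubling them. The following is a minimal sketch under that assumption; `yaml_single_quoted` is a hypothetical helper name and not part of wp2middleman.

```ruby
require "yaml"

# Hypothetical helper, not part of wp2middleman: wraps a title in single
# quotes and doubles any embedded single quote, which is how YAML escapes
# a quote inside a single-quoted scalar.
def yaml_single_quoted(title)
  "'#{title.gsub("'", "''")}'"
end

title = "It's a 'quoted' title"
frontmatter = "title: #{yaml_single_quoted(title)}\n"

# Round-trip through the YAML parser to confirm the title survives intact.
parsed = YAML.safe_load(frontmatter)
puts parsed["title"]
```

Keeping the quotes also protects titles that contain a colon followed by a space, which an unquoted YAML scalar cannot hold.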

Post categories and tags should be treated separately

Currently, the specs assume that the following domain attributes should both be assigned to tags:

<category domain="category" nicename="another tag"><![CDATA[Another Tag]]></category>
<category domain="post_tag" nicename="tag"><![CDATA[tag]]></category>

That is wrong. In Wordpress, tags and categories are separate entities. Most people use categories in the URL for SEO purposes. For example, one of the categories on my music blog is "Fresh Tunes," and all my posts in that category have "fresh-tunes" in the URL, like this: http://www.monfresh.com/fresh-tunes/ltj-bukem-journey-inwards-good-looking-records/

In order to preserve that URL structure in Middleman, you can create a "category" entry in the frontmatter of the post and then set the permalink in config.rb like this:

activate :blog do |blog|
  blog.permalink = "{category}/{title}.html"
  ...
end

On my blog, each post only belongs to one category, so I just added a new category method in post.rb:

def category
  categories = post.xpath("category").to_a

  categories.map! do |cat|
    cat.css("@nicename").text if cat.css("@domain").text == "category"
  end

  categories.reject { |cat| cat.nil? }.first
end

and added a new frontmatter entry in migrator.rb:

def file_content(post)
  ...
  file_content << "category: #{post.category}\n"
  ...
end

This works beautifully for me personally, but it's possible that others might have more than one top-level category for their post. Since Middleman only supports one category per post, they would potentially have to do a little more work once the middleman files are generated in case the preferred category was not picked.

The other option is to remove the .first from the category method so that it returns an array of all the categories. But this creates more work because you would have to go through each post and delete the extra categories so that only the preferred one remains. Otherwise, your URL prefix would be a combination of all the categories separated by a dash. The other downside is semantic. Middleman only supports one category, so the frontmatter key should be singular, but if the method returns an array, it should be plural, which can be confusing.

I'm leaning towards the first option. What do you think?
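The distinction this issue draws between the two `domain` values can be sketched without touching the existing tag handling. The example below is only an illustration: it uses REXML rather than the Nokogiri calls shown above so that it runs on its own, and `nicenames_for` and `ITEM_XML` are hypothetical names.

```ruby
require "rexml/document"

# A trimmed-down <item> using the two category elements quoted in this issue.
ITEM_XML = <<~XML
  <item>
    <category domain="category" nicename="another tag"><![CDATA[Another Tag]]></category>
    <category domain="post_tag" nicename="tag"><![CDATA[tag]]></category>
  </item>
XML

# Hypothetical helper: returns the nicename of every <category> child
# whose domain attribute matches the requested domain.
def nicenames_for(item, domain)
  item.get_elements("category")
      .select { |node| node.attributes["domain"] == domain }
      .map { |node| node.attributes["nicename"] }
end

item = REXML::Document.new(ITEM_XML).root
categories = nicenames_for(item, "category") # Wordpress categories only
tags = nicenames_for(item, "post_tag")       # Wordpress tags only

puts "category: #{categories.first}"
puts "tags: #{tags.join(', ')}"
```

Taking `categories.first` corresponds to the first option discussed above; returning the whole array corresponds to the second.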

Test Migrator#frontmatter

I'd like to ensure all public methods have test coverage. This way, if ever there's a problem, we can more confidently and rapidly isolate the issue.

Update README

There are a few new options available that need to be documented in the README.

File name too long err

Hello,

I just installed wp2middleman and tried to import an old blog, but I run into this problem:

/usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/wp2middleman-0.0.3/lib/wp2middleman/migrator.rb:18:in `write': File name too long @ rb_sysopen - /home/ramon/code/ramonsuarez/bbmm/export/2006-12-13-Menos-mal-que-la-economa-va-bien-un-estudio-revela-que-el-salario-medio-de-los-espaoles-no-ha-variado-desde-1997-corta-pega.html.markdown (Errno::ENAMETOOLONG)
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/wp2middleman-0.0.3/lib/wp2middleman/migrator.rb:18:in `block in migrate'
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/wp2middleman-0.0.3/lib/wp2middleman/post_collection.rb:17:in `each'
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/wp2middleman-0.0.3/lib/wp2middleman/post_collection.rb:17:in `each'
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/wp2middleman-0.0.3/lib/wp2middleman/migrator.rb:17:in `migrate'
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/wp2middleman-0.0.3/lib/wp2middleman.rb:14:in `migrate'
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/wp2middleman-0.0.3/lib/wp2middleman/cli.rb:20:in `wp2mm'
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/thor-0.19.1/lib/thor/command.rb:27:in `run'
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/thor-0.19.1/lib/thor/invocation.rb:126:in `invoke_command'
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/thor-0.19.1/lib/thor.rb:359:in `dispatch'
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/thor-0.19.1/lib/thor/base.rb:440:in `start'
	from /usr/local/opt/rbenv/versions/2.3.1/lib/ruby/gems/2.3.0/gems/wp2middleman-0.0.3/bin/wp2mm:9:in `<top (required)>'
	from /usr/local/opt/rbenv/versions/2.3.1/bin/wp2mm:23:in `load'
	from /usr/local/opt/rbenv/versions/2.3.1/bin/wp2mm:23:in `<main>'

I've not been able to find information on any option that would rename or shorten these files. How should I proceed?

Thanks
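One possible workaround is to shorten the title portion of the filename before writing it. The sketch below is not an existing wp2mm option: `truncated_filename` is a hypothetical helper, and the 100-byte default is an arbitrary assumption, since the actual limit depends on the filesystem.

```ruby
# Hypothetical helper, not part of wp2middleman: builds a filename of the
# form DATE-SLUG.html.markdown whose total size stays within max_bytes.
def truncated_filename(date, slug, extension: ".html.markdown", max_bytes: 100)
  prefix = "#{date}-"
  # Bytes left for the slug once the date prefix and extension are counted.
  budget = [max_bytes - prefix.bytesize - extension.bytesize, 0].max
  # byteslice may cut a multibyte character in half; scrub drops the remains.
  short_slug = slug.byteslice(0, budget).scrub("")
  # Avoid ending the shortened slug on a dangling dash.
  short_slug = short_slug.sub(/-+\z/, "")
  "#{prefix}#{short_slug}#{extension}"
end

slug = "Menos-mal-que-la-economa-va-bien-un-estudio-revela-que-el-salario-" \
       "medio-de-los-espaoles-no-ha-variado-desde-1997-corta-pega"
puts truncated_filename("2006-12-13", slug)
```

Truncation can make two long titles from the same day collide, so a real implementation would probably need a uniqueness check as well.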

duplicate .html.markdown

Filenames end up with the extension .html.markdown.html.markdown because the extension is appended twice: once in the migrator and once in the post. Also, .markdown is appended regardless of whether the --body_to_markdown flag is set.

Just wanted to document it here until I'm able to make a pull request.
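A fix along these lines would append the extension in exactly one place and choose it based on the flag. The sketch below assumes that posts left as HTML should end in `.html`; that choice, and the `output_filename` helper itself, are assumptions rather than the gem's actual behavior.

```ruby
# Hypothetical helper, not part of wp2middleman: strips any extension that
# was already appended, then adds the right one exactly once.
def output_filename(basename, body_to_markdown: false)
  # Assumption: HTML bodies get ".html", converted bodies ".html.markdown".
  extension = body_to_markdown ? ".html.markdown" : ".html"
  # Remove any run of trailing .html / .markdown segments.
  stem = basename.sub(/(\.html|\.markdown)+\z/, "")
  "#{stem}#{extension}"
end

puts output_filename("2012-10-01-Some-Title.html.markdown", body_to_markdown: true)
puts output_filename("2012-10-01-Some-Title")
```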

Handle images embedded in post content

It would be great if wp2mm could handle images embedded in post content.

For example, the migrator could download Wordpress media library images referenced in post content to an 'export/images' directory and appropriately tweak each image's 'src' value in the post content to reference its new 'images/*' location.
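The rewriting half of this proposal can be sketched as follows. The example only computes the new `src` values and a list of files to fetch; it does not download anything. `localize_images` and the `IMG_SRC` pattern are hypothetical, and a regular expression is a simplification that a real implementation would likely replace with an HTML parser.

```ruby
require "uri"

# Matches the src attribute of an <img> tag, capturing the text before the
# value, the quote character, and the value itself.
IMG_SRC = /(<img\b[^>]*?\bsrc=)(["'])(.*?)\2/i

# Hypothetical helper, not part of wp2middleman: rewrites each image src to
# point into image_dir and returns the new HTML plus a map of
# original URL => local path for a later download step.
def localize_images(html, image_dir: "images")
  downloads = {}
  rewritten = html.gsub(IMG_SRC) do
    prefix, quote, src = $1, $2, $3
    # URI.parse raises on malformed URLs; real code would need to rescue that.
    local_path = File.join(image_dir, File.basename(URI.parse(src).path))
    downloads[src] = local_path
    "#{prefix}#{quote}#{local_path}#{quote}"
  end
  [rewritten, downloads]
end

html = '<p><img class="x" src="http://somewebsite.com/wp-content/uploads/2012/10/photo.jpg" alt="a"></p>'
rewritten, downloads = localize_images(html)
puts rewritten
```

Using only the file's basename means two images with the same name in different upload folders would collide, which a fuller version would have to handle.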
