meridius / confluence-to-markdown

This project forked from ewhite613/confluence-to-github-markdown

138.0 5.0 50.0 104 KB

Confluence to Markdown converter which is actually working

License: MIT License

CoffeeScript 7.86% HTML 10.90% CSS 81.22% JavaScript 0.02%

confluence-to-markdown's Introduction

Convert Confluence HTML export to Markdown

Requirements

You must have the pandoc command-line tool installed. Check that it is available by running:

pandoc --version

Install all project dependencies:

npm install

Usage

In the converter's directory:

npm run start <pathResource> <pathResult>

Parameters

parameter        description
<pathResource>   File or directory containing the extracted Confluence export to convert
<pathResult>     Directory in which the output will be generated. Defaults to the current working directory

Process description

  • Confluence page IDs in HTML file names and links are replaced with that page's heading
  • an overall index.md is created, linking to the index of each Confluence space
  • images and other inserted attachments are linked to generated markdown
    • whole images and attachments directories are copied to resulting directory
      • no check is made as to whether a particular file/image is actually used
  • markdown links to internal pages are generated without the trailing .md extension to comply with gitit's expectations
    • this can be changed by finding all occurrences of the text "gitit requires link to pages without .md extension" in the .coffee files and adding the extension there.
    • or you can send a PR ;)
  • the pandoc utility can accept quite a few options to alter its default behavior
    • those can be passed to it by adding them to @outputTypesAdd, @outputTypesRemove, @extraOptions properties in the App.coffee file
    • or you can send a PR ;)
    • here is the list of options pandoc can accept
  • throughout the application a single console logger is used, its default verbosity is set to INFO
    • you can change the verbosity to one of DEBUG, INFO, WARNING, ERROR levels in the Logger.coffee file
    • or you can send a PR ;)
  • a series of formatter rules is applied to the HTML text of Confluence page for it to be converted properly
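
The gitit link convention described in the list above can be illustrated with a minimal sketch. It is written in plain JavaScript rather than the project's CoffeeScript, and the function name internalLink and its keepExtension flag are hypothetical; they are not taken from the project's sources.

```javascript
// Hypothetical helper: build a markdown link to another converted page.
// By default the ".md" extension is omitted, which is what gitit expects.
function internalLink(text, pageName, keepExtension = false) {
  const target = keepExtension ? pageName + ".md" : pageName;
  return "[" + text + "](" + target + ")";
}

console.log(internalLink("Home", "Home"));        // link target without extension
console.log(internalLink("Home", "Home", true));  // link target with ".md"
```

A renderer that resolves links against files on disk would likely want the second form.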

Room for improvement

If you happen to find something not to your liking, you are welcome to send a PR. Some good starting points are mentioned in the Process description section above.

Export to HTML

Note that if the converter does not know how to handle a style, HTML to Markdown typically just leaves the HTML untouched (Markdown does allow for HTML tags).

Step by step guide for Confluence data export

  1. Go to the space and choose Space tools > Content Tools on the sidebar.
  2. Choose Export. This option will only be visible if you have the Export Space permission.
  3. Select HTML then choose Next.
  4. Decide whether you need to customize the export:
     • Select Normal Export to produce an HTML file containing all the pages that you have permission to view.
     • Select Custom Export if you want to export a subset of pages, or to exclude comments from the export.
  5. Extract the exported zip archive.

WARNING
Please note that blog posts will NOT be exported to HTML. You have to copy them manually or export them to XML or PDF, but those formats cannot be processed by this utility.

Attribution

Thanks to Eric White for a starting point.

confluence-to-markdown's People

Contributors

carun, christian-git-md, ewhite613, meridius, romanlevin

confluence-to-markdown's Issues

"Callback must be a function" error

Hi @meridius

I'm trying to run this in an Azure DevOps build pipeline (after I tried it manually on another host).
The output below is what I see in the build log, and also when I run the command manually afterwards.

Any idea what's going wrong and how to resolve it?
Thanks in advance.

C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master>npm run start ..\..\Confluence-export ..\..\Markdown

> [email protected] start C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master
> coffee ./src/index.coffee "..\..\Confluence-export" "..\..\Markdown"

Using source: C:\agent\_work\2\s\Confluence-export
Using destination: C:\agent\_work\2\s\Markdown
TypeError [ERR_INVALID_CALLBACK]: Callback must be a function
    at makeCallback (fs.js:137:11)
    at Object.unlink (fs.js:917:14)
    at App.writeMarkdownFile (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\src\App.coffee:94:10)
    at App.writeGlobalIndexFile (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\src\App.coffee:105:6)
    at App.convert (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\src\App.coffee:55:6)
    at Bootstrap.run (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\src\Bootstrap.coffee:34:9)
    at Object.<anonymous> (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\src\index.coffee:7:11)
    at Object.<anonymous> (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\src\index.coffee:1:1)
    at Module._compile (internal/modules/cjs/loader.js:689:30)
    at Object.exports.run (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\node_modules\coffee-script\lib\coffee-script\coffee-script.js:173:23)
    at compileScript (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\node_modules\coffee-script\lib\coffee-script\command.js:224:29)
    at compilePath (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\node_modules\coffee-script\lib\coffee-script\command.js:174:14)
    at Object.exports.run (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\node_modules\coffee-script\lib\coffee-script\command.js:98:20)
    at Object.<anonymous> (C:\agent\_work\2\s\confluence-to-markdown\confluence-to-markdown-master\node_modules\coffee-script\bin\coffee:15:45)
    at Module._compile (internal/modules/cjs/loader.js:689:30)
    at Object.Module._extensions..js (internal/modules/cjs/loader.js:700:10)
    at Module.load (internal/modules/cjs/loader.js:599:32)
    at tryModuleLoad (internal/modules/cjs/loader.js:538:12)
    at Function.Module._load (internal/modules/cjs/loader.js:530:3)
    at Function.Module.runMain (internal/modules/cjs/loader.js:742:12)
    at startup (internal/bootstrap/node.js:279:19)
    at bootstrapNodeJSCore (internal/bootstrap/node.js:752:3)

npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! [email protected] start: `coffee ./src/index.coffee "..\..\Confluence-export" "..\..\Markdown"`
npm ERR! Exit status 1
npm ERR!
npm ERR! Failed at the [email protected] start script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.

npm ERR! A complete log of this run can be found in:
npm ERR!     C:\Users\***\AppData\Roaming\npm-cache\_logs\2018-10-28T22_35_12_048Z-debug.log
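
The trace points at an fs.unlink call (App.coffee:94) made without a callback. Since Node.js 10, the asynchronous fs functions throw ERR_INVALID_CALLBACK in that case, whereas earlier versions only printed deprecation warning DEP0013. Below is a minimal sketch of two ways to avoid the error, written in plain JavaScript rather than the project's CoffeeScript. The helper names are hypothetical and this is not necessarily how the project fixed it.

```javascript
const fs = require("fs");

// Hypothetical helper: synchronously remove a stale output file.
// Returns true if a file was removed, false if it did not exist.
function removeIfExists(filePath) {
  try {
    fs.unlinkSync(filePath);
    return true;
  } catch (err) {
    if (err.code === "ENOENT") return false; // nothing to remove
    throw err; // permission problems etc. are still reported
  }
}

// Asynchronous variant: fs.unlink must always be given a callback.
function removeIfExistsAsync(filePath, done) {
  fs.unlink(filePath, (err) => {
    if (err && err.code !== "ENOENT") return done(err);
    done(null, !err);
  });
}
```

The synchronous form is the smaller change when the surrounding code already writes files synchronously.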

Sanitize the generated Markdown filenames

The filename of each generated Markdown file is derived from each exported page's title.

Sometimes a title contains characters (like colons or quotes) which are invalid on Windows file-systems. (*nix file-systems tend to allow more characters).

I'll raise a pull-request which more aggressively sanitizes the generated filenames so that the tool will run equally well on any platform.
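
A minimal sketch of what such sanitizing could look like, in plain JavaScript. The function name sanitizeFilename and the choice of "_" as the replacement are assumptions for illustration, not the contents of the pull request. The character set and reserved device names follow the Windows file naming rules.

```javascript
// Characters Windows does not allow in file names, plus control characters.
const INVALID_CHARS = /[<>:"\/\\|?*\u0000-\u001f]/g;
// Device names Windows reserves regardless of letter case.
const RESERVED_NAMES = /^(con|prn|aux|nul|com[1-9]|lpt[1-9])$/i;

// Hypothetical helper: turn a page title into a file name safe on any platform.
function sanitizeFilename(title, replacement = "_") {
  let name = title
    .replace(INVALID_CHARS, replacement)
    .replace(/[. ]+$/, ""); // Windows rejects trailing dots and spaces
  if (name === "" || RESERVED_NAMES.test(name)) {
    name = replacement + name;
  }
  return name;
}

console.log(sanitizeFilename('Release 1.0: "final"'));
```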

Extension blank_before_header not supported for gfm

All of the images are saved, but no MD files are saved.

I see this in the output:

Parsing ... somefile.html
Making Markdown ... somefile.md
The extension blank_before_header is not supported for gfm

pandoc version (on Windows)
pandoc.exe 2.9.2.1
Compiled with pandoc-types 1.20, texmath 0.12.0.1, skylighting 0.8.3.2
Default user data directory: C:\Users\cmerritt\AppData\Roaming\pandoc
Copyright (C) 2006-2020 John MacFarlane
Web: https://pandoc.org
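
pandoc takes extensions as part of the format name (FORMAT+EXT-EXT), and the message above says this pandoc version rejects blank_before_header for the gfm writer. Running pandoc --list-extensions=gfm shows what a given installation accepts. The sketch below shows how such a format string is assembled; the function name buildPandocFormat is hypothetical, and it is only an assumption that the project builds its format from @outputTypesAdd and @outputTypesRemove in a similar way.

```javascript
// Hypothetical helper: assemble a pandoc format string such as
// "markdown+pipe_tables-raw_html" from a base format and two extension lists.
function buildPandocFormat(base, add = [], remove = []) {
  const plus = add.map((ext) => "+" + ext).join("");
  const minus = remove.map((ext) => "-" + ext).join("");
  return base + plus + minus;
}

// The combination reported in this issue:
console.log(buildPandocFormat("gfm", ["blank_before_header"]));
// Leaving the extension out yields plain "gfm":
console.log(buildPandocFormat("gfm"));
```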

How to include Author details in the converted file

Hi,
I am using this code to convert Confluence-exported HTML files to .md files, which I'm uploading to Azure Wiki.
Could you please help me include the Confluence page's author details in the .md file?

I can see this detail in the exported HTML file, but it is missing from the converted .md file.

I do not have much experience with CoffeeScript.

Appreciate your help here.

Thank you!

Not able to generate md file

Hello,
I saved the project page as HTML and launched the converter. It does something, but I didn't get any .md file. Do you have any idea where I might have gone wrong?

C:\Users...\18_Tools\confluence-to-markdown-master\confluence-to-markdown-master>npm run start "C:\Users...\Downloads\GitHub - meridius_confluence-to-markdown.html"

[email protected] start C:\Users...\18_Tools\confluence-to-markdown-master\confluence-to-markdown-master
coffee ./src/index.coffee "C:\Users...\Downloads\GitHub - meridius_confluence-to-markdown.html"

Using source: C:\Users...\Downloads\GitHub - meridius_confluence-to-markdown.html
Using destination: C:\Users...\18_Tools\confluence-to-markdown-master\confluence-to-markdown-master
Parsing ... C:\Users...\Downloads\GitHub - meridius_confluence-to-markdown.html
Making Markdown ... C:\Users...\18_Tools\confluence-to-markdown-master\confluence-to-markdown-master\Downloads\GitHub_-meridius_confluence-to-markdown_Confluence_to_Markdown_converter_which_is_actually_working.md
pandoc: ...\18_Tools\confluence-to-markdown-master\confluence-to-markdown-master\Downloads\GitHub
-_meridius_confluence-to-markdown_Confluence_to_Markdown_converter_which_is_actually_working.md: openBinaryFile: does not exist (No such file or directory)

Done

Conversion done
(node:4568) [DEP0013] DeprecationWarning: Calling an asynchronous function without callback is deprecated.

Thanks,
G
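
One plausible reading of the log above is that pandoc was asked to write into a Downloads subdirectory of the destination that had not been created. This is an assumption and is not confirmed in the thread. A minimal sketch of creating the parent directory before writing follows; the helper name ensureParentDir is hypothetical, and the recursive option of fs.mkdirSync requires Node.js 10.12 or later.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: make sure the directory that will hold filePath exists.
// Returns the directory path. Existing directories are left untouched.
function ensureParentDir(filePath) {
  const dir = path.dirname(filePath);
  fs.mkdirSync(dir, { recursive: true });
  return dir;
}
```

Calling this just before handing the output path to pandoc would rule out a missing directory as the cause.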

Dependency tree error from npm / chai

During install, I receive a dependency error:

npm ERR! code ERESOLVE
npm ERR! ERESOLVE unable to resolve dependency tree
npm ERR!
npm ERR! While resolving: [email protected]
npm ERR! Found: [email protected]
npm ERR! node_modules/chai
npm ERR!   dev chai@"^3.5.0" from the root project
npm ERR!
npm ERR! Could not resolve dependency:
npm ERR! peer chai@">= 1.6.1 < 2" from [email protected]
npm ERR! node_modules/chai-fs
npm ERR!   dev chai-fs@"^0.1.0" from the root project
npm ERR!
npm ERR! Fix the upstream dependency conflict, or retry
npm ERR! this command with --force, or --legacy-peer-deps
npm ERR! to accept an incorrect (and potentially broken) dependency resolution.

npm package not working

I am trying to get the npm package from https://www.npmjs.com/package/confluence-to-markdown to work, but running it only opens index.coffee, which has this content:

Bootstrap = require './Bootstrap'

pathResource = process.argv[2] # can also be a file
pathResult = process.argv[3]

bootstrap = new Bootstrap
bootstrap.run pathResource, pathResult

Cloning this repo and running npm run start ... works.

Image element size formatting issue

Hi there,
thanks for your script, it fixes most of the problems I had with other implementations I tried.
I only had one issue: images that have specified sizes come out as e.g.
![](images/image.gif){width="8" height="8"}
which is not supported by any Markdown version I could find. It would be nice if this was formatted to be more compatible with general Markdown.
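
The {width=... height=...} suffix is pandoc's attribute syntax for links and images (its link_attributes extension). Until the converter handles this itself, one workaround is to post-process the generated files. Below is a minimal sketch in plain JavaScript; the function name stripImageAttributes is hypothetical, and the regular expression only covers simple inline images without nested brackets or parentheses.

```javascript
// Hypothetical helper: drop a pandoc attribute block that directly follows
// an inline image, leaving plain ![alt](target) markdown.
function stripImageAttributes(markdown) {
  return markdown.replace(/(!\[[^\]]*\]\([^)]*\))\{[^}]*\}/g, "$1");
}

console.log(stripImageAttributes('![](images/image.gif){width="8" height="8"}'));
```

Note that this discards the size information entirely; preserving it would mean emitting a raw HTML img tag instead.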
