ciconia's People

Contributors

evert, hkdobrev, joelcuevas, kzykhys

ciconia's Issues

GFM mode sometimes doesn't add <br> after lines ending with '&xx;'

With the GFM whitespace extension enabled:

&lt;
Hello

should be rendered as:

<p>&lt;<br>
Hello</p>

but currently the result is:

<p>&lt;
Hello</p>

Anyway, thanks for your great project :)

Update:

I found that lines containing Chinese characters are affected too.

你好
世界

is rendered as:

<p>你好
世界</p>

'_' bug when parsing links and images

/* Expected results are generated by marked in GFM mode. */

Hello! 2 new bugs :)

(1)

Input:

![Hello_world_img](/path/to/hello_world_img.jpg)

Output:

<img src="/path/to/hello_world_img.jpg" alt="Hello&lt;em&gt;world&lt;/em&gt;img">

Expected:

<img src="/path/to/hello_world_img.jpg" alt="Hello_world_img">

(2)

Input:

[Hello_world_link](/hello_world_link.html)

Output:

<a href="/hello&lt;em&gt;world&lt;/em&gt;link.html">Hello<em>world</em>link</a>

Expected:

<a href="/hello_world_link.html">Hello_world_link</a>

Escaping raw HTML

Hello!

I tried removing the htmlBlock extension, but raw HTML is still allowed. I can't find any option to disable it. That's fine for command-line usage where you control the inputs, but if you want to parse Markdown on a site with user-generated content, allowing raw HTML is a hazard.

Where and how could this be done?

Cheers,
Eugen

GFM whitespace extension creates too many <br> tags

I'm not 100% sure how it should work, but I've been rendering this document with Ciconia, and the whitespace behavior is inconsistent with that of GitHub.

Look at the leading paragraph, which ends in the words, "with a gentle learning curve" - there's an extra line break inserted there.

Then look at the source document - there is a line-break before those words, but it doesn't render as a <br> on GitHub.

If I comment out $engine->addExtension(new Gfm\WhiteSpaceExtension()), it looks more like GitHub.

What gives?

How to disable certain markdown?

I would like to disable certain Markdown syntax from the core. For example, I don't want users to post images. I could simply not document that syntax, but a user who is a little bit technical and knows Markdown could easily use it anyway...

Specifically, I would like to disable the following markdown:

![Alt text](/path/to/image.png)

Is there any option to do this? I haven't found anything...

Hook to process links

Hi!

I'm working on a static site generator with Ciconia.

I would like to automatically be able to process all the links in the markdown source, and prepend the link with a base url (if they are relative).

Ideally, I would also be able to automatically add .md to the resulting link.

To do this effectively, it would be awesome if it were somehow possible to add a callback that allows me to process and rewrite links... For example something like this:

$ciconia->setLinkHandler(function($in) {

   return $in . '.md';

});

But I would also settle for a simple baseUrl and let Ciconia do the heavy lifting :)
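Until such a hook exists, one possible workaround is to rewrite the links in the HTML after rendering. The following is a minimal sketch under that assumption: rewriteRelativeLinks() is a hypothetical helper, not part of Ciconia, and http://example.com/docs is a made-up base URL. It only looks at double-quoted href attributes on <a> tags and does not treat query strings or fragments specially when appending the suffix.

```php
<?php
// Hypothetical helper (not a Ciconia API): rewrites relative href values
// in already-rendered HTML.
function rewriteRelativeLinks(string $html, string $baseUrl, string $suffix = ''): string
{
    return preg_replace_callback(
        '/(<a\b[^>]*?\bhref=")([^"]*)(")/i',
        function (array $m) use ($baseUrl, $suffix): string {
            $href = $m[2];
            // Leave URLs with a scheme, protocol-relative URLs, and
            // in-page anchors untouched.
            if (preg_match('~^(?:[a-z][a-z0-9+.-]*:|//|#)~i', $href)) {
                return $m[0];
            }
            // Join base and path with exactly one slash, then add the suffix.
            return $m[1] . rtrim($baseUrl, '/') . '/' . ltrim($href, '/') . $suffix . $m[3];
        },
        $html
    ) ?? $html;
}

echo rewriteRelativeLinks(
    '<p><a href="guide/intro">Intro</a></p>',
    'http://example.com/docs',
    '.md'
), PHP_EOL;
```

The output of $ciconia->render() could be passed through this function as a final step. Working on the rendered HTML keeps the sketch independent of the parser's internals, at the cost of a second pass over the document.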

Improve performance

$ php bin/markbench benchmark --profile=github-sample
Runtime: PHP5.5.3
Host:    Linux vm1 3.8.0-31-generic #46-Ubuntu SMP Tue Sep 10 20:03:44 UTC 2013 x86_64
Profile: Sample content from Github (http://github.github.com/github-flavored-markdown/sample_content.html) / 1000 times
Class:   Markbench\Profile\GithubSampleProfile

+----------------------+---------+---------+---------------+---------+--------------+
| package              | version | dialect | duration (MS) | MEM (B) | PEAK MEM (B) |
+----------------------+---------+---------+---------------+---------+--------------+
| erusev/parsedown     | 0.4.6   |         | 10819         | 6291456 | 6553600      |
| michelf/php-markdown | 1.3     |         | 36887         | 6815744 | 6815744      |
| michelf/php-markdown | 1.3     | extra   | 49626         | 6815744 | 7340032      |
| kzykhys/ciconia      | v0.1.4  |         | 64959         | 7340032 | 7602176      |
| kzykhys/ciconia      | v0.1.4  | gfm     | 68987         | 7077888 | 7602176      |
+----------------------+---------+---------+---------------+---------+--------------+

Parsing bug

Hi, first, thanks for Ciconia. Nice project!
I have problems mixing HTML markup and Markdown. Is it possible?

<nav class="class">

* [Item <span class="sep">›</span>](/)
* [Item <span class="sep">›</span>](/link)
* [Item](/link)

</nav>

# lorem ipsum

This compiles to:

<p><nav class="class"></p>

<ul>
<li><a href="/">Item <span class="sep"></span></a></li>
<li><a href="/link">Item <span class="sep"></span></a></li>
<li><a href="/link">Item</a></li>
</ul>

<p></nav></p>

<h1>lorem ipsum</h1>

It creates paragraphs around the HTML markup. Is that intended?
Thanks

GFM auto-link recreated an existing link

New bug found ;D

[http://hello.world.com](http://hello.world.com)

Expected output:

<p><a href="http://hello.world.com">http://hello.world.com</a></p>

Current output:

<p><a href="http://hello.world.com"></a><a href="http://hello.world.com&lt;/a">http://hello.world.com&gt;</a></p>

Improve tests

  • <pre></pre> within blockquotes
  • Reference-style image with no ID
  • More complex markdown

Parsing bug

Hi, first, thanks for this project. I like its extensibility!

However, I've run into trouble in a special case when converting strings.
I would expect text like

(*My italic text*)

to be converted to

(<em>My italic text</em>)

But it isn't. The problem seems to come from the space character: as soon as this kind of text contains a space, the conversion is wrong.

Regards.

CLI support

Usage

ciconia /path/to/markdown.md > /path/to/html.html

TODO

  • Configure composer.json (bin and extra requirements)
  • Add console commands (use symfony/console)
  • Create single PHAR package

Wrong rendering with Gfm\UrlAutoLinkExtension()

I use Redactor as the text input. It is a rich-text editor, so when I enter a simple URL, for example, the text arrives as raw HTML like this:

<p>https://github.com/kzykhys/Ciconia</p>

I'm rendering it with the Gfm\UrlAutoLinkExtension() extension and the result is:

<p><p><a href="https://github.com/kzykhys/Ciconia</">https://github.com/kzykhys/Ciconia</</a>p></p>

I'm processing the text with Ciconia with the following extensions:

        $ciconia = new Ciconia();
        $ciconia->removeExtension('header');
        $ciconia->removeExtension('code');
        $ciconia->removeExtension('image');
        $ciconia->addExtension(new Gfm\UrlAutoLinkExtension());
        $text = $ciconia->render($text);

My guess is that something is wrong with the regexp being used, but I'm not sure how to fix it. I'd appreciate any help.

Regards

Parsing error when having url in code block

Ciconia is a VERY VERY useful markdown converter for me, excellent work!

But I found an issue: when a script tag appears in a code block and is followed by a link in a code quote, a parsing error occurs.

Since I don't know how to make a GitHub issue show this code correctly, I put it on Pastebin: http://pastebin.com/MccGF5rk

Consecutive tables - only first table is parsed

Two consecutive tables

| head | head |
|-------|-------|
| body | body |

| head | head |
|-------|-------|
| body | body |

are converted to

<table>
<thead>
<tr>
<th>head</th>
<th>head</th>
</tr>
</thead>
<tbody>
<tr>
<td>body</td>
<td>body</td>
</tr>
</tbody>
</table>

<p>| head | head |<br>
|-------|-------|<br>
| body | body |</p>

Feature: support for GFM anchors

Support for GFM-style auto-generated anchor tags is missing. For example, ## Opinionated in GFM generates an <a> tag with an auto-generated name attribute, e.g.:

<h2>
    <a id="user-content-opinionated" class="anchor" href="#opinionated" aria-hidden="true">
        <span class="octicon octicon-link"></span>
    </a>
    Opinionated
</h2>

Is support for GFM deliberately partial in Ciconia?

If so, it might be a good idea to clarify this in the documentation - I got the impression that GFM was fully-supported, but it appears to be partial? It would be good to list in the README not just which features are supported, but which ones are unsupported.

Extensible extension

Hi Kazuyuki Hayashi,

Good work on the Parsedown. I'm not sure this is the right place to ask a question; if not, I apologize.

I have been playing with @sculpin to convert my Octopress blog to Sculpin.
I ran into a few struggles with the Markdown along the way. Jekyll supports plugins for Markdown extras.

So what I was looking for was a parser that is extensible.

This is my markdown

[Aura.Cli_Project](https://github.com/auraphp/Aura.Cli_Project) help 
you to build cli ( command line ) applications.

{% showterm 4baa2e4db41b12786a7ce %}

If you need only web based application then 
[Aura.Web_Project](https://github.com/auraphp/Aura.Web_Project) 
is what you need.

You can see the showterm plugin, which converts this tag to HTML, at https://github.com/harikt/harikt.github.com/blob/source/plugins/showterm.rb

There are other similar plugins, so do you think we could add a hook that catches {%, reads the string showterm (or any <plugin-name>), and returns the appropriate HTML?
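One way to picture the requested "catch for {%" is a pre-processing pass that runs before the Markdown is rendered. The sketch below makes that assumption: expandLiquidTags() and the $handlers map are hypothetical names, not part of Ciconia, and the markup returned for showterm is a placeholder rather than the real embed code.

```php
<?php
// Hypothetical pre-processor (not a Ciconia API): replaces {% name args %}
// tags using a map of plugin name => callable.
function expandLiquidTags(string $markdown, array $handlers): string
{
    return preg_replace_callback(
        '/\{%\s*([A-Za-z][\w-]*)\s*(.*?)\s*%\}/',
        function (array $m) use ($handlers): string {
            [$tag, $name, $args] = $m;
            // Unknown plugins are left untouched so nothing is silently lost.
            if (!isset($handlers[$name])) {
                return $tag;
            }
            return $handlers[$name]($args);
        },
        $markdown
    ) ?? $markdown;
}

$handlers = [
    // Placeholder markup only; the real showterm embed HTML is not reproduced here.
    'showterm' => function (string $id): string {
        return '<div class="showterm" data-id="' . htmlspecialchars($id, ENT_QUOTES) . '"></div>';
    },
];

echo expandLiquidTags('{% showterm 4baa2e4db41b12786a7ce %}', $handlers), PHP_EOL;
```

Whether the generated HTML then survives the Markdown pass unchanged depends on how the parser treats raw HTML blocks, so the same logic might fit better inside an extension.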

Error on italics parsing

This:

$ciconia->render('`user_id` `user_id`');

Renders this:

<p><code>user<em>id</code> <code>user</em>id</code></p> 

Note the <em> opening before id even when there is no _ closing char before the code tag.

And bold, of course, behaves the same:

$ciconia->render('`user__id` `user__id`');

Anyways, should italics and bold be parsed inside code markup?
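A common way for Markdown processors to keep emphasis out of code spans is to swap the spans for placeholders before the other inline rules run, and to restore them afterwards. The sketch below illustrates that idea with a deliberately simplified inline pipeline. renderInline() is a hypothetical function and does not reflect how Ciconia is actually implemented.

```php
<?php
// Simplified stand-in for an inline pipeline; not Ciconia's actual code.
function renderInline(string $text): string
{
    $codeSpans = [];

    // 1. Swap each `code span` for an opaque placeholder.
    $text = preg_replace_callback('/`([^`]+)`/', function (array $m) use (&$codeSpans): string {
        $codeSpans[] = '<code>' . htmlspecialchars($m[1], ENT_NOQUOTES) . '</code>';
        return "\x1A" . (count($codeSpans) - 1) . "\x1A";
    }, $text);

    // 2. Apply bold, then italics, to what is left.
    $text = preg_replace('/__(?=\S)(.+?)(?<=\S)__/', '<strong>$1</strong>', $text);
    $text = preg_replace('/_(?=\S)(.+?)(?<=\S)_/', '<em>$1</em>', $text);

    // 3. Put the untouched code spans back.
    return preg_replace_callback('/\x1A(\d+)\x1A/', function (array $m) use ($codeSpans): string {
        return $codeSpans[(int) $m[1]];
    }, $text);
}

echo renderInline('`user_id` `user_id`'), PHP_EOL;
// <code>user_id</code> <code>user_id</code>
```

Because the underscores are hidden while step 2 runs, neither the italic nor the bold rule can pair them up across two code spans.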

Question about weird error

@kzykhys, this is more a question than a bug.

I have a fairly large table in GFM with code tags inside its cells that crashes my local Apache server when I parse it.

This doesn't happen in "Try Ciconia" or on my prod server, but I've already reproduced it on 3 completely different dev machines with the same results.

Do you have any idea what it could be?

This is the Markdown:

Table
-----

Atributo       | Tipo      | Notas
--             | --        | --
id             | Integer   | |
code           | String    | |
subcode        | String    | |
description    | String    | |
status         | Integer   | Uno de: `0` (pendiente), `1` (disponible), `2` (terminada), `3` (cancelada), `4` (vencida).
type           | Integer   | Uno de: `0` (normal), `1` (encuesta), `2` (supervisión).
priority       | Integer   | Del 1 al 5, siendo 5 la prioridad más alta.
street         | String    | |
district       | String    | |
zipcode        | String    | |
city           | String    | |
state          | String    | |
country        | String    | |
address        | String    | Dirección estilizada para mostrar.
latitude       | Decimal   | |
longitude      | Decimal   | |
form_id        | Integer   | |
group_id       | Integer   | |
created_at     | Timestamp | |
updated_at     | Timestamp | |
available_at   | Timestamp | |
expires_at     | Timestamp | |
started_at     | Timestamp | |
finished_at    | Timestamp | |
received_at    | Timestamp | |
location_id    | Integer   | |
distance       | Integer   | Distancia en metros a la que se realizó la visita.
timespan       | Integer   | Duración en minutos de la visita.
alarms         | Integer   | |
supervising_id | Integer   | El id de la visita que se está supervisando.
supervision    | Integer   | Uno de: `null` (sin supervisar), `0` (en supervisión), `1` (aceptada), `2` (corregida), `3` (rechazada).
version        | Integer   | |

It just "breaks" the server, and the Apache log says this:

[mpm_winnt:notice] [pid 5668:tid 468] AH00428: Parent: child process exited with status 3221225725 -- Restarting.

I'm pretty sure that the Markdown is fine.

BTW, if you do either of the following, the parser suddenly works again:

  • Remove one (any) of the lines with more than one code tag in the last column.
  • Remove two or more of any of the other lines.

I've tried to find the last piece of code in Ciconia that's being executed, but honestly I was unable to track it down.

Any hints?
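One hint is in the exit status itself. Written in hexadecimal, 3221225725 is 0xC00000FD, which is the Windows NTSTATUS code STATUS_STACK_OVERFLOW. With PHP this is often reported when a deeply recursive regular expression exhausts the thread stack, so that is a plausible (but unconfirmed) reading of the crash above. The sketch below decodes the code and shows a hypothetical safeReplace() wrapper, not part of Ciconia, that surfaces PCRE failures instead of letting a null result propagate. It assumes a 64-bit PHP build, where the status fits in an integer.

```php
<?php
// 3221225725 is the exit status from the Apache log above.
$status = 3221225725;
$hex = sprintf('0x%08X', $status);
echo $hex, PHP_EOL; // 0xC00000FD

// Hypothetical wrapper: throws when preg_replace() fails, for example after
// hitting pcre.backtrack_limit or pcre.recursion_limit.
function safeReplace(string $pattern, string $replacement, string $subject): string
{
    $result = preg_replace($pattern, $replacement, $subject);
    if ($result === null) {
        throw new RuntimeException('PCRE error code: ' . preg_last_error());
    }
    return $result;
}

echo safeReplace('/a/', 'b', 'aaa'), PHP_EOL; // bbb
```

If the regex engine is the culprit, lowering the pcre.recursion_limit ini setting on the affected machines may turn the hard crash into a catchable PCRE error, which would at least confirm the diagnosis.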

Missing Gfm features

  • Strike-through

    ~~word~~ into <del>word</del>

  • Table

    | TH | TH (align=right) | TH (align=center) |
    |----|-----------------:|:-----------------:|
    | TD |               TD |        TD         |
    

    into

    TH | TH (align=right) | TH (align=center)
    TD |               TD |        TD
  • Autolinking

These features should be optional

Extra <br> in lists

For this:

$ciconia = new Ciconia();
$ciconia->addExtension(new Gfm\WhiteSpaceExtension());

$ciconia->render(<<<EOT
- One
  - Two
- Three
EOT
);

Is this the expected output?:

<ul>
  <li>One
    <br>
    <ul>
      <br>
      <li>Two</li>
      <br>
    </ul>
  </li>
  <li>Three</li>
</ul>

I think there are a couple of extra <br> tags in the output.

Maintaining

Is this library still evolving? I like this library very much and it would be a shame if it died. :-(
I would also be willing to help maintain it. Should we split it, or how should we proceed?

Parsing error with italic and URLs

Hi

First off, excellent lib! I really like it :-)
I came across an issue where Ciconia is transforming URLs in <a> tags as well as <img> tags (don't know if it affects even more).

How to reproduce:

  1. Install Ciconia via Composer "kzykhys/ciconia": "1.*"
  2. Use the following script:
<?php

include 'vendor/autoload.php';

use Ciconia\Ciconia;

$ciconia = new Ciconia();
echo $ciconia->render('<a href="assets/images/5/tab_data_n_origin.png">Image</a>');
  3. The output will be
    <p><a href="assets/images/5/tab<em>data</em>n_origin.png">Image</a></p>

which is wrong because it should not touch inline HTML code (see http://daringfireball.net/projects/markdown/syntax#html).

A question about the performance of Ciconia

First of all, thank you for publishing this project. I like its code, the use of Symfony components + PHP 5.4, and I absolutely love the extension mechanism. However, I'm concerned about its performance.

Have you run any performance benchmarks on the library? Have you compared it to the well-known michelf/php-markdown library? Is performance one of the goals of Ciconia, or is it more oriented toward providing a lot of features? In my projects I currently use michelf/php-markdown and I'm always looking for a better-performing alternative (I'm a heavy user of Markdown because I use it to publish books with hundreds of pages).

Multiple underscores in words

1) MarkdownTest::testGfmMultipleUnderscore
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'<p>foo_bar_baz</p>'
+'<p>foo<em>bar</em>baz</p>'

Trigger an error or exception in "strict" mode

Possible syntax errors are:

  • Invalid ID of Reference-style link
  • Invalid ID of Reference-style image
  • Number of table cells differs between rows

For example:

[World Wide Web Consortium][w4c]

[w3c]: http://www.w3.org/

In normal mode, syntax errors should be ignored silently:

<p>[World Wide Web Consortium][w4c]</p>

In strict mode, an exception will be thrown.

It would be great if the error message pointed to where the problem is.

[SyntaxError]
Unable to find id "w4c" in Reference-style link at line 1
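The reference-link check described above can be sketched as a standalone pass over the source. findUndefinedReferences() is a hypothetical function, not Ciconia's implementation. It only understands the [text][id] form, reports reference-style images as links too, and ignores implicit references such as [text][].

```php
<?php
// Hypothetical checker, not Ciconia's implementation.
function findUndefinedReferences(string $markdown): array
{
    $lines = preg_split('/\R/', $markdown);

    // Collect ids from definition lines such as "[w3c]: http://www.w3.org/".
    $defined = [];
    foreach ($lines as $line) {
        if (preg_match('/^ {0,3}\[([^\]]+)\]:\s*\S+/', $line, $m)) {
            $defined[strtolower($m[1])] = true;
        }
    }

    // Report every "[text][id]" whose id has no definition.
    $errors = [];
    foreach ($lines as $index => $line) {
        preg_match_all('/\[[^\]]+\]\[([^\]]+)\]/', $line, $matches);
        foreach ($matches[1] as $id) {
            if (!isset($defined[strtolower($id)])) {
                $errors[] = sprintf(
                    'Unable to find id "%s" in Reference-style link at line %d',
                    $id,
                    $index + 1
                );
            }
        }
    }

    return $errors;
}

$errors = findUndefinedReferences("[World Wide Web Consortium][w4c]\n\n[w3c]: http://www.w3.org/");
print_r($errors);
```

In strict mode the first entry could be thrown as an exception, while normal mode would simply discard the list.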

Wrong list parsing

Input:

- item1
- item2
- item3

test
1. item1
2. item2
3. item3

* item1
* item2
* item3

Output:

<ul> <li>item1</li> <li>item2</li> <li>item3</li> </ul> <p>test</p> <ol> <li>item1</li> <li>item2</li> <li><p>item3</p></li> <li><p>item1</p></li> <li>item2</li> <li>item3</li> </ol>

Duplicate tests in the test-suite

I can understand duplicating certain tests when there is overlap between two test-suites:

2 duplicates:
- kzykhys/ciconia/test/Ciconia/Resources/core/em-spaces.md|out
- kzykhys/ciconia/test/Ciconia/Resources/gfm/em-spaces.md|out
2 duplicates:
- kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/list-multiparagraphs.md|out
- kzykhys/ciconia/test/Ciconia/Resources/gfm/ws-list-multiparagraphs.md|out
2 duplicates:
- kzykhys/ciconia/test/Ciconia/Resources/gfm/table-invalid-body.md|out
- kzykhys/ciconia/test/Ciconia/Resources/options/strict/gfm/table-invalid-body.md|out

But what's the purpose of tests duplicated within the same test-suite?

2 duplicates:
- kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/EOL-CR+LF.md|out
- kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/EOL-LF.md|out
2 duplicates:
- kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/inline-code-with-visible-backtick.md|out
- kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/inline-code.md|out
2 duplicates:
- kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/unordered-list-items-leading-1space.md|out
- kzykhys/ciconia/test/Ciconia/Resources/core/markdown-testsuite/unordered-list-items.md|out

I inspected EOL-CR+LF.md and EOL-LF.md in a hex editor as per your note, and as far as I can tell the files are identical. Both appear (on github.com) to be precisely 119 bytes, yet a file with CR+LF line breaks should have slightly more bytes than one with only LF line breaks, correct?

Perhaps your IDE or some other tool, at some point, cleaned up what was deliberately intended to be leading/trailing space for test purposes?

I hope this is helpful :-)

GitHub newlines do not work when there's additional formatting on the next line

For example, the following markdown:

**Article**: description
**Another one**: something-something

Will result in:

Article: description Another one: something-something

On the other hand, the following will work as expected:

**Article**: description
Another one: something-something

Resulting in:

Article: description
Another one: something-something

PHP User Deprecated Warning - OptionsResolver

I get a user deprecated warning:

Calling the Symfony\Component\OptionsResolver\OptionsResolver::setAllowedTypes method with an array of options is deprecated since version 2.6 and will be removed in 3.0. Use the new signature with a single option instead.

Code:

$ciconia = new \Ciconia\Ciconia();
$html = $ciconia->render(
        'Markdown is **awesome**',
        ['tabWidth' => 8, 'nestedTagLevel' => 5, 'strict' => true]
);

composer.json

"kzykhys/ciconia": "1.0.*"

Chaining extensions

I wrote a small extension that helps me integrate Bootstrap's grid system. But once the extension has run, the text is no longer parsed as a regular paragraph block. Is there a way to "chain" extensions so that parsing does not stop after one extension?

For example, if I have something like this:

{.col-md-6} This is a paragraph

It will be wrapped with <div class=".col-md-6"> by my extension, but it will not be transformed to a paragraph.

URL in [ ] breaks output HTML

<?php
require 'vendor/autoload.php';

use Ciconia\Ciconia;
use Ciconia\Extension\Gfm;

$ciconia = new Ciconia();
$ciconia->addExtension(new Gfm\FencedCodeBlockExtension());
$ciconia->addExtension(new Gfm\TaskListExtension());
$ciconia->addExtension(new Gfm\InlineStyleExtension());
$ciconia->addExtension(new Gfm\WhiteSpaceExtension());
$ciconia->addExtension(new Gfm\TableExtension());
$ciconia->addExtension(new Gfm\UrlAutoLinkExtension());

$html = $ciconia->render(
    '[phalcon-php.spec - See http://blog.ohgaki.net/phalcon-php-rpm-package](https://gist.github.com/yohgaki/ee26b252d85ac6655c16)'
);

echo $html, PHP_EOL;

Actual result:
<p><a href="https://gist.github.com/yohgaki/ee26b252d85ac6655c16">phalcon-php.spec - See <a href="http://blog.ohgaki.net/phalcon-php-rpm-package</a">http://blog.ohgaki.net/phalcon-php-rpm-package</a</a>></p>

Expected result:
<p><a href="https://gist.github.com/yohgaki/ee26b252d85ac6655c16">phalcon-php.spec - See http://blog.ohgaki.net/phalcon-php-rpm-package</a></p>

In GFM, a cell with 0 results in an empty cell

For this:

H1  | H2
--  | --
0   | Empty cell!

Is this expected?:

<table>
  <thead>
    <tr>
      <th>H1</th>
      <th>H2</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td></td>
      <td>Empty cell!</td>
    </tr>
  </tbody>
</table>

The zero from the cell in the body is removed resulting in an empty cell.
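A frequent cause of this kind of symptom in PHP is a loose emptiness check: the string "0" is falsy, so empty("0") is true. Whether that is what happens inside Ciconia's table code is an assumption. The two hypothetical functions below only illustrate the pitfall and a stricter alternative.

```php
<?php
// Hypothetical cell renderers; neither is taken from Ciconia's source.
function renderCellLoose(string $cell): string
{
    // Bug pattern: empty("0") is true, so a cell containing 0 is dropped.
    return empty($cell) ? '<td></td>' : '<td>' . $cell . '</td>';
}

function renderCellStrict(string $cell): string
{
    // Comparing against the empty string keeps "0" intact.
    return $cell === '' ? '<td></td>' : '<td>' . $cell . '</td>';
}

echo renderCellLoose('0'), PHP_EOL;  // <td></td>
echo renderCellStrict('0'), PHP_EOL; // <td>0</td>
```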

Array syntax?

Original title was 'Why 5.4?' but just noticed you are using traits.

Yes I could easily dig through the code (and I have to an extent) but I was wondering what the need for 5.4 was?

I noticed that you are not using the new short array syntax $array = []; (which was introduced in 5.4), any reason why?

Wrong underscore parsing

Input:

under_score and under_score

Output:

<p>under<em>score and under</em>score</p>

The _ should probably not be parsed unless it is preceded by a blank space.
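The rule suggested here is close to how GFM treats underscores inside words. A minimal sketch of it, assuming ASCII letters and digits are the only "word" characters, is a regular expression with a lookbehind and a lookahead around the delimiters. emphasizeUnderscores() is a hypothetical helper, not Ciconia code.

```php
<?php
// Hypothetical helper: italics via underscores, ignoring underscores that
// sit between ASCII letters or digits.
function emphasizeUnderscores(string $text): string
{
    return preg_replace(
        '/(?<![A-Za-z0-9])_(?=\S)(.+?)(?<=\S)_(?![A-Za-z0-9])/',
        '<em>$1</em>',
        $text
    );
}

echo emphasizeUnderscores('under_score and under_score'), PHP_EOL;
// under_score and under_score
echo emphasizeUnderscores('an _emphasized phrase_ here'), PHP_EOL;
// an <em>emphasized phrase</em> here
```

The opening underscore must not follow a word character and the closing one must not precede one, so identifiers such as foo_bar_baz pass through unchanged.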

Footnotes

Do you have plans to support this syntax?

I get 10 times more traffic from [Google] 1 than from
[Yahoo] 2 or [MSN] 3.
