jbroadway / urlify

A fast PHP slug generator and transliteration library that converts non-ascii characters for use in URLs.

License: BSD 3-Clause "New" or "Revised" License

PHP 100.00%
urlify php slugs transliteration blogging seo slug slugify ascii unicode

urlify's Issues

1.2.4 changed transliteration behaviour

Upgrading from 1.2.3 to 1.2.4 broke our test suite. In particular, some characters are now transliterated differently, which breaks our assertions and violates semver.

E.g. we test that това е текст на бълрагски за тест becomes tova-e-tekst-na-blragski-za-test which is true in 1.2.3 and false in 1.2.4.

In 1.2.4 it instead transliterates to tova-e-tekst-na-bielragski-za-test.

urlify version   in          out
1.2.3            бълрагски   blragski
1.2.4            бълрагски   bielragski

I'm sure the dependency has its reasons for doing this, but composer pulled in 1.2.4 automatically and broke our test suites. This should have been a 1.3.0 or a 2.0.0 release.

how to reverse url slug

echo URLify::slug('中文简体');
Result: zhong-wen-jian-ti
How can I get the original Chinese text back from the slug? In other words, how can the slug transliteration be reversed?
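
Transliteration is lossy (several different characters can map to the same Latin spelling), so a slug generally cannot be converted back by computation alone. One common workaround is to store the original text next to the slug and look it up later. The following is a minimal sketch of that idea; the function names and the in-memory array are hypothetical stand-ins and are not part of URLify.

```php
<?php
// Hypothetical in-memory store keyed by slug. A real application would
// more likely keep the original title in a database column.
$originalBySlug = [];

// Remember which original text produced a given slug.
function remember_slug(array &$store, string $slug, string $original): void
{
    $store[$slug] = $original;
}

// Look up the original text for a slug, or null if it was never stored.
function original_for_slug(array $store, string $slug): ?string
{
    return $store[$slug] ?? null;
}

// The slug is written as a literal here; in practice it would come from URLify.
remember_slug($originalBySlug, 'zhong-wen-jian-ti', '中文简体');

echo original_for_slug($originalBySlug, 'zhong-wen-jian-ti'), "\n";
```

The lookup returns null for slugs that were never stored, so callers can tell "unknown slug" apart from an empty title.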

Please retain license

If this is a port of URLify.js, as you write in the README, please retain the original license; otherwise you don't have the right to port it.

From a quick look, the original license is this one: https://github.com/django/django/blob/master/LICENSE

Copyright (c) Django Software Foundation and individual contributors.
All rights reserved.

Redistribution and use in source and binary forms, with or without modification,
are permitted provided that the following conditions are met:

    1. Redistributions of source code must retain the above copyright notice,
       this list of conditions and the following disclaimer.

    2. Redistributions in binary form must reproduce the above copyright
       notice, this list of conditions and the following disclaimer in the
       documentation and/or other materials provided with the distribution.

    3. Neither the name of Django nor the names of its contributors may be used
       to endorse or promote products derived from this software without
       specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Passing certain characters to add_chars() method causes "preg_match_all(): Unknown modifier ']'"

Consider the following:

URLify::add_chars(['/' => '']);

This causes a language exception, preg_match_all(): Unknown modifier ']', because the / character is used as the regular expression delimiter within the URLify library.

The above example derives from a fairly common and reasonable use-case: I want to remove all illegal characters from a file name, and on UNIX and Windows, / is illegal.

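To fix this, PHP's preg_quote() function must be called on the keys in the array argument passed to add_chars(), with '/' given as its second (delimiter) argument, since preg_quote() does not escape / by default.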

I'll submit a PR shortly that seeks to fix the issue.
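
As a rough illustration of the escaping described above, the sketch below builds a character-class pattern from the keys of a replacement map. The helper name build_char_pattern is hypothetical and this is not the library's actual code; it only shows how preg_quote() with a delimiter argument keeps / from terminating the pattern early.

```php
<?php
// Hypothetical helper (not part of URLify): builds a character-class
// pattern from the keys of a replacement map, escaping every key.
function build_char_pattern(array $map): string
{
    $escaped = array_map(
        function ($char) {
            // The second argument makes preg_quote() escape the "/" delimiter too.
            return preg_quote((string) $char, '/');
        },
        array_keys($map)
    );

    return '/[' . implode('', $escaped) . ']/u';
}

$map = ['/' => '', '\\' => '', ':' => ''];
$pattern = build_char_pattern($map);

// Without escaping, the "/" key would end the pattern and leave "]" as a modifier.
preg_match_all($pattern, 'a/b\\c:d', $matches);
print_r($matches[0]);
```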

Lithuanian map :)

    'lithuanian_map' => array (
        'ą' => 'a', 'č' => 'c', 'ę' => 'e', 'ė' => 'e', 'į' => 'i', 'š' => 's', 'ų' => 'u', 'ū' => 'u', 'ž' => 'z',
        'Ą' => 'A', 'Č' => 'C', 'Ę' => 'E', 'Ė' => 'E', 'Į' => 'I', 'Š' => 'S', 'Ų' => 'U', 'Ū' => 'U', 'Ž' => 'Z'
    )

preserve case feature

Hello,

I think there could be an option to preserve case?

Sometimes we want "Alfred is good" to be converted to "Alfred-is-good" instead of all lowercase.

save file extension when $file_name = true

\URLify::filter('abcdefghi.jpg', 6, 'en', true); // returns abcdef

It would be good to preserve the file extension, so that the result in this case is 'abcdef.jpg'.
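
One way to get this behaviour today, sketched under the assumption that the name has already been reduced to ASCII, is to split off the extension, truncate only the base name, and then re-attach the extension. The helper truncate_keep_extension is hypothetical and not part of URLify.

```php
<?php
// Hypothetical helper (not part of URLify): truncates the base name to
// $length bytes and re-attaches the original extension, if there is one.
// substr() is byte-based, so this assumes an ASCII file name.
function truncate_keep_extension(string $file, int $length): string
{
    $ext  = pathinfo($file, PATHINFO_EXTENSION); // "jpg", or "" if none
    $base = pathinfo($file, PATHINFO_FILENAME);  // name without extension
    $base = substr($base, 0, $length);

    return $ext === '' ? $base : $base . '.' . $ext;
}

echo truncate_keep_extension('abcdefghi.jpg', 6), "\n"; // abcdef.jpg
```

Note that in this sketch the length limit applies to the base name only, so the returned string can be longer than $length once the extension is added back.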

Difficulty generating a slug with / and ,

When generating a slug from text that contains / or , these characters are removed entirely, when they should be replaced with the separator.

  • Test string: Bomba Submersa 1/4HP 0,25 110V Lepono
  • Incorrect: bomba-submersa-14hp-025-110v-lepono
  • Correct: bomba-submersa-1-4hp-0-25-110v-lepono

I modified the following code snippet:

$string = (string) \preg_replace(
    [
        // 1) remove un-needed chars
        '/[^' . $separatorEscaped . $removePatternAddOn . '\-a-zA-Z0-9\s]/u',
        // 2) convert spaces to $separator
        '/[\s]+/u',
        // 3) remove some extra words
        $removeWordsSearch,
        // 4) remove double $separator's
        '/[' . ($separatorEscaped ?: ' ') . ']+/u',
        // 5) remove $separator at the end
        '/[' . ($separatorEscaped ?: ' ') . ']+$/u',
    ],
    [
        '',
        $separator,
        '',
        $separator,
        '',
    ],
    $string
);

To:

$string = (string) \preg_replace(
    [
        // 1) replace un-needed chars (previously removed)
        '/[^' . $separatorEscaped . $removePatternAddOn . '\-a-zA-Z0-9\s]/u',
        // 2) convert spaces to $separator
        '/[\s]+/u',
        // 3) remove some extra words
        $removeWordsSearch,
        // 4) remove double $separator's
        '/[' . ($separatorEscaped ?: ' ') . ']+/u',
        // 5) remove $separator at the end
        '/[' . ($separatorEscaped ?: ' ') . ']+$/u',
    ],
    [
        $separator, // the only change: was ''
        $separator,
        '',
        $separator,
        '',
    ],
    $string
);

And it worked correctly.
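
The effect of that one-line change can be reproduced with a much smaller stand-in for the pipeline. The function simple_slug below is a hypothetical simplification for illustration only (it omits the word-removal step and the library's variables); it takes the replacement used for disallowed characters as a parameter so both behaviours can be compared.

```php
<?php
// Simplified, hypothetical stand-in for the pipeline quoted above.
// $firstReplacement is what disallowed characters are replaced with.
function simple_slug(string $text, string $firstReplacement, string $separator = '-'): string
{
    $sep  = preg_quote($separator, '/');
    $text = strtolower($text);

    $text = preg_replace(
        [
            '/[^\-a-z0-9\s]/u',    // 1) characters outside the allowed set
            '/[\s]+/u',            // 2) runs of whitespace
            '/[' . $sep . ']+/u',  // 3) repeated separators
        ],
        [$firstReplacement, $separator, $separator],
        $text
    );

    // Drop any separator left at either end.
    return trim($text, $separator);
}

$input = 'Bomba Submersa 1/4HP 0,25 110V Lepono';

echo simple_slug($input, ''), "\n";  // bomba-submersa-14hp-025-110v-lepono
echo simple_slug($input, '-'), "\n"; // bomba-submersa-1-4hp-0-25-110v-lepono
```

One trade-off to keep in mind: with the separator as the replacement, every disallowed character becomes a word break, so in this sketch an input such as "don't" yields "don-t" rather than "dont".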

Replacing underscores with spaces

Hi. I'm trying to modify the code so that underscores are not treated as spaces. I thought it would be as simple as commenting out that line of code, but that doesn't work. Any ideas why?

Unable to urlify properly

Hi there,

I've been trying to urlify a very simple string, but the last part is being dropped. This is probably intended behaviour, but an option to disable it would be useful.

My string is "Brazilian Série A" and I want it to become "brazilian-serie-a". It becomes "brazilian-serie" instead without the final "-a" part. Any way I can do this?

My code is below:

\URLify::filter('Brazilian Série A') // produces "brazilian-serie"

I also tried:

\URLify::filter('Brazilian Série A', 120, 'en') // produces "brazilian-serie"

Ó => o

This is probably a typo in the code:

The uppercase Ó is converted to lowercase o due to line 72 in URLify.php:

'Ó' => 'o',

Correct:

'Ó' => 'O',

Why is $underscoreToSpace removed?

Hi,

Why is $underscoreToSpace removed from the filter? It was pretty handy for turning underscores into hyphens if you wanted, or spaces of course.

I hope there is a good reason for it!

Thanks

Support more characters by default

Had to add the following chars for our transliteration test to pass:

        URLify::add_chars(
            array(
                'Ÿ' => 'Y',
                'µ' => 'u',
                '¥' => 'Y',
                'Ĉ' => 'C',
                'ĉ' => 'c',
                'Ċ' => 'C',
                'ċ' => 'c',
                'Ĝ' => 'G',
                'ĝ' => 'g',
                'Ġ' => 'G',
                'ġ' => 'g',
                'Ĥ' => 'H',
                'ĥ' => 'h',
                'Ħ' => 'H',
                'ħ' => 'h',
                'Ĕ' => 'E',
                'ĕ' => 'e',
                'Ĭ' => 'I',
                'ĭ' => 'i',
                'Ĵ' => 'J',
                'ĵ' => 'j',
                'Ĺ' => 'L',
                'ĺ' => 'l',
                'Ľ' => 'L',
                'ľ' => 'l',
                'Ŀ' => 'L',
                'ŀ' => 'l',
                'ʼn' => 'n',
                'Ō' => 'O',
                'ō' => 'o',
                'Ŏ' => 'O',
                'ŏ' => 'o',
                'Ŕ' => 'R',
                'ŕ' => 'r',
                'Ŗ' => 'R',
                'ŗ' => 'r',
                'Ŝ' => 'S',
                'ŝ' => 's',
                'Ŧ' => 'T',
                'ŧ' => 't',
                'Ŭ' => 'U',
                'ŭ' => 'u',
                'Ŵ' => 'W',
                'ŵ' => 'w',
                'Ŷ' => 'Y',
                'ŷ' => 'y',
                'ſ' => 'i',
                'ƒ' => 'f',
                'O' => 'O',
                'o' => 'o',
                'U' => 'U',
                'u' => 'u',
                'Ǎ' => 'A',
                'ǎ' => 'a',
                'Ǐ' => 'I',
                'ǐ' => 'i',
                'Ǒ' => 'O',
                'ǒ' => 'o',
                'Ǔ' => 'U',
                'ǔ' => 'u',
                'Ǖ' => 'U',
                'ǖ' => 'u',
                'Ǘ' => 'U',
                'ǘ' => 'u',
                'Ǚ' => 'U',
                'ǚ' => 'u',
                'Ǜ' => 'U',
                'ǜ' => 'u',
                'Ǻ' => 'A',
                'ǻ' => 'a',
                'Ǿ' => 'O',
                'ǿ' => 'o',
                'Ǽ' => 'Ae',
                'ǽ' => 'ae',
                'IJ' => 'IJ',
                'ij' => 'ij',
                'J' => 'J',
                'ĸ' => 'k',
                'Ŋ' => 'N',
                'ŋ' => 'n',
                'Ẁ' => 'W',
                'ẁ' => 'w',
                'Ẃ' => 'W',
                'ẃ' => 'w',
                'Ẅ' => 'W',
                'ẅ' => 'w',
            )
        );

Unfortunately, since I do not know what language they belong to, I find it difficult to provide a PR when the code is structured based on language.

Missing A char

Hi, I found a strange bug. Look at the code below (local environment: PHP 5.6 on macOS; dev and prod environments: PHP 5.6 on Ubuntu 16):

  • var_dump(\URLify::filter('Text sample A')); // text-sample
  • var_dump(\URLify::filter('Text sample B')); // text-sample-b
  • var_dump(\URLify::filter('Text sample AA')); // text-sample-aa

Why is the final "a" missing from the output of the first var_dump?
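
A likely explanation, assuming the library strips a built-in list of common English words (such as "a", "an" and "the") before building the slug, is that the standalone "A" is treated as one of those words. The sketch below reproduces that effect in isolation; the word list and the function strip_words are hypothetical and the library's real list may differ.

```php
<?php
// Assumed word list for illustration only; not taken from the library.
$removeWords = ['a', 'an', 'the', 'of'];

// Removes standalone occurrences of the given words, ignoring case.
function strip_words(string $text, array $words): string
{
    $alternatives = implode('|', array_map(
        function ($word) {
            return preg_quote($word, '/');
        },
        $words
    ));

    // \b restricts matches to whole words, so "AA" and "sample" are untouched.
    $text = preg_replace('/\b(' . $alternatives . ')\b/i', '', $text);

    // Collapse the whitespace left behind by removed words.
    return trim(preg_replace('/\s+/', ' ', $text));
}

echo strip_words('Text sample A', $removeWords), "\n";  // Text sample
echo strip_words('Text sample B', $removeWords), "\n";  // Text sample B
echo strip_words('Text sample AA', $removeWords), "\n"; // Text sample AA
```

If this is indeed the cause, the library's word-removal settings would be the place to look for a way to keep the final "a".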

Is this package still maintained?

Not compatible with Laravel 9

Laravel 9 requires voku/portable-ascii:^2.0, while this repo requires voku/portable-ascii:^1.4, which causes a conflict when running a composer update.

Underscores as spaces

I just had URLify take the title _Summer and return the slug _summer, when I was actually expecting just summer. Maybe this is just me though?

Going forward I have my wrapper replace all occurrences of underscores with spaces. This matches at least my own internal logic much better but I wanted to throw it out there and see if maybe someone else also liked this behavior. Then I could roll a PR for it.
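
The wrapper described above could look roughly like the following. This is a minimal sketch with a hypothetical function name; the result would then be passed on to URLify for slugging.

```php
<?php
// Hypothetical pre-processing step: treat underscores as spaces and
// drop any leading or trailing whitespace that results.
function underscores_to_spaces(string $title): string
{
    return trim(str_replace('_', ' ', $title));
}

echo underscores_to_spaces('_Summer'), "\n";       // Summer
echo underscores_to_spaces('late_summer'), "\n";   // late summer
```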
