elvanto / litemoji

A PHP library simplifying the conversion of unicode, HTML and shortcode emoji 🔥

License: MIT License

PHP 100.00%
emoji php unicode

litemoji's People

Contributors

bensinclair, brandonkelly, inkognitoo, joshmcrae, karelwintersky, spirit55555

litemoji's Issues

Cp1251 support?

Hi! Is it possible to convert the cp1251 string "🔥 - это огонь" into "🔥 - это огонь"?
Currently LitEmoji::encodeHtml("🔥 - это огонь") returns " - это огонь" :(

unicodeToShortcode() removes mdash symbol

// text before:
// <p>А ведь тут совсем другой смысл заложен. Mdash — это черточка шириной с букву М. В русской типографике ее называют длинным тире. Ndash — соответственно более короткая черточка, часто даже уже, чем буква N.</p>\n

$content = self::unicodeToShortcode($content);

// text after
// "<p>А ведь тут совсем другой смысл заложен. Mdash  это черточка шириной с букву М. В русской типографике ее называют длинным тире. Ndash  соответственно более короткая черточка, часто даже уже, чем буква N.</p>\n"

The mdash symbol was copied and pasted from this article: https://medium.com/@sergeisoloviev/mdash-31c331397e46 (2nd paragraph)

Request: List

Is there a full list of shortcodes somewhere, with a sample of the output for each?

Kazakhstan flag not working

Hello.
Code: LitEmoji::encodeUnicode(':kz:');
Expected: 🇰🇿
Received: :kz:

Version litemoji: 4.3.0
PHP 8.1

Missing emoji in regex

Hi,
the Shopping trolley emoji (https://unicode-table.com/en/1F6D2/) is missing from the regex in:
https://github.com/elvanto/litemoji/blob/master/src/LitEmoji.php
const MB_REGEX = '/(
    \x23\xE2\x83\xA3 # Digits
    [\x30-\x39]\xE2\x83\xA3
    | \xE2[\x9C-\x9E][\x80-\xBF] # Dingbats
    | \xF0\x9F[\x85-\x88][\xA6-\xBF] # Enclosed characters
    | \xF0\x9F[\x8C-\x97][\x80-\xBF] # Misc
    | \xF0\x9F\x98[\x80-\xBF] # Smilies
    | \xF0\x9F\x99[\x80-\x8F]
    | \xF0\x9F\x9A[\x80-\xBF] # Transport and map symbols
    | \xF0\x9F[\xA4-\xA7][\x80-\xBF] # Supplementary symbols and pictographs
)/x';

UTF-8 (hex) for shopping trolley (U+1F6D2) is 0xF0 0x9F 0x9B 0x92; its third byte, 0x9B, falls outside every range in the pattern.
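
A quick way to check a report like this is to compute the UTF-8 bytes of the code point and run them through the pattern. The sketch below is only an illustration: the pattern is a subset of the alternatives quoted in this issue, not necessarily what the current library ships, and the variable names are made up.

```php
<?php
// Subset of the MB_REGEX alternatives quoted in the issue (assumption: copied
// from the issue text, not from the current library source).
$pattern = '/(
      \xE2[\x9C-\x9E][\x80-\xBF]       # Dingbats
    | \xF0\x9F[\x85-\x88][\xA6-\xBF]   # Enclosed characters
    | \xF0\x9F[\x8C-\x97][\x80-\xBF]   # Misc
    | \xF0\x9F\x98[\x80-\xBF]          # Smilies
    | \xF0\x9F\x99[\x80-\x8F]
    | \xF0\x9F\x9A[\x80-\xBF]          # Transport and map symbols
    | \xF0\x9F[\xA4-\xA7][\x80-\xBF]   # Supplementary symbols and pictographs
)/x';

// U+1F6D2 SHOPPING TROLLEY (the "\u{...}" escape needs PHP 7.0+).
$trolley = "\u{1F6D2}";

// Hex dump of the UTF-8 encoding, e.g. for comparing against the byte ranges.
$trolleyHex = strtoupper(bin2hex($trolley));

// True only if one of the alternatives matches the emoji's bytes.
$trolleyMatches = preg_match($pattern, $trolley) === 1;

echo $trolleyHex, ' matched: ', var_export($trolleyMatches, true), PHP_EOL;
```

The same two lines (bin2hex plus preg_match) can be reused for any other emoji suspected to be missing.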

Skin tones and genders not working correctly

Unfortunately, this lib has nearly the same problem as every alternative: it doesn't handle skin tones correctly.

This: Hello 👼🏿
Results in: Hello :angel::tone5:
But is expected to be: Hello :angel_tone5:

The actual result is obviously not correct.

But it's even worse with this:

This: Hello 👷🏿‍♀️
Results in: Hello :construction_worker::tone5:‍♀️
But is expected to be: Hello :woman_construction_worker_tone5:

The shortcodes-array has these items:

'angel_tone5' => '1F47C-1F3FF',
// ...
'woman_construction_worker_dark_skin_tone' => '1F477-1F3FF-200D-2640-FE0F',
// ...
'woman_construction_worker_tone5' => '1F477-1F3FF-200D-2640-FE0F',
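
One possible way to get the expected output is to match the longest code point sequence first, so a full skin-tone or ZWJ sequence wins over its individual parts. The following is a minimal sketch under stated assumptions, not the library's implementation: the table is a hypothetical five-entry subset of the shortcodes array, the function names are invented, and mb_chr() requires PHP 7.2+ with the mbstring extension.

```php
<?php
// Hypothetical subset of the shortcodes array (shortcode => hex code points).
$shortcodes = [
    'angel'                           => '1F47C',
    'tone5'                           => '1F3FF',
    'angel_tone5'                     => '1F47C-1F3FF',
    'construction_worker'             => '1F477',
    'woman_construction_worker_tone5' => '1F477-1F3FF-200D-2640-FE0F',
];

// Turn "1F47C-1F3FF" into the corresponding UTF-8 string.
function hexSequenceToUtf8(string $hex): string
{
    $out = '';
    foreach (explode('-', $hex) as $codePoint) {
        $out .= mb_chr(hexdec($codePoint), 'UTF-8');
    }
    return $out;
}

// Replace emoji with shortcodes, preferring the longest sequence.
function longestMatchToShortcode(string $text, array $shortcodes): string
{
    $map = [];
    foreach ($shortcodes as $name => $hex) {
        $map[hexSequenceToUtf8($hex)] = ':' . $name . ':';
    }
    // strtr() with an array tries the longest keys first and never
    // re-scans text it has already replaced.
    return strtr($text, $map);
}

$angel  = longestMatchToShortcode("Hello \u{1F47C}\u{1F3FF}", $shortcodes);
$worker = longestMatchToShortcode(
    "Hello \u{1F477}\u{1F3FF}\u{200D}\u{2640}\u{FE0F}",
    $shortcodes
);
```

When two shortcodes map to the same sequence, as with the two woman_construction_worker entries above, whichever is inserted into the map last wins, so a real implementation would need a rule for picking the preferred name.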

PHP 5.6 constant used

LitEmoji's composer.json says it requires PHP 5.4+:

"require": {
"php": ">=5.4"
},

However v1.4 is using the ARRAY_FILTER_USE_KEY constant, which is only available in PHP 5.6+ (7c0276d).

}, ARRAY_FILTER_USE_KEY);

So either that code should be updated to not use a PHP 5.6+ constant, or composer.json should be updated to require PHP 5.6+.
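
For the first option, filtering by key can be done without ARRAY_FILTER_USE_KEY. This is a sketch of one PHP 5.4-compatible approach; filterByKey is a hypothetical helper name, not something taken from the library.

```php
<?php
// Keep only the entries of $input whose key passes $predicate.
// Uses nothing newer than PHP 5.4 (closures, callable type hint, short arrays).
function filterByKey(array $input, callable $predicate)
{
    // Filter the list of keys, then keep the matching entries of $input.
    $keptKeys = array_filter(array_keys($input), $predicate);

    // array_flip() turns the kept keys back into array keys;
    // array_intersect_key() preserves the order and values of $input.
    return array_intersect_key($input, array_flip($keptKeys));
}

$result = filterByKey(
    ['a' => 1, 'bb' => 2, 'c' => 3],
    function ($key) {
        return strlen($key) === 1;
    }
);
```

The trade-off is two extra temporary arrays per call, which only matters for very large inputs.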

Missing geometric emojis in regex

Looks like there's a lack of support for a few geometric emojis in the MB_REGEX constant. For reference: https://unicode.org/emoji/charts/full-emoji-list.html#geometric
🟠 🟡 🟢 🟣 🟤 ⚫ ⚪ 🟥 🟧 🟨 🟩 🟦 🟪 🟫 ⬛ ⬜ ◼ ◻

The following are included already:
🔴 🔵

All of these already appear in the shortcodes array file; they just don't match the regex pattern.

I'd love to put a PR together, but I'm a novice with regex at best, sorry!

Additionally, it'd be neat if this constant could be changed to a static property so that others can modify the regex themselves, saving you from having to update it all the time. So we could do:

LitEmoji::$MB_REGEX = ...
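
To illustrate the idea, here is a minimal sketch with a stand-in class. EmojiMatcher and its members are hypothetical; LitEmoji itself defines MB_REGEX as a constant, and the starting pattern below is just one alternative borrowed from the regex quoted elsewhere on this page. The added range covers U+1F7E0 to U+1F7EB (the coloured circles and squares), which encode as F0 9F 9F A0 through F0 9F 9F AB.

```php
<?php
// Hypothetical stand-in for a class exposing its pattern as a static property.
class EmojiMatcher
{
    // Assumption: only the "Misc" alternative, kept short for the example.
    public static $pattern = '/(\xF0\x9F[\x8C-\x97][\x80-\xBF])/';

    // Number of emoji the current pattern finds in $text.
    public static function count($text)
    {
        return preg_match_all(self::$pattern, $text);
    }
}

// Red circle (already covered) followed by orange circle (not covered).
$text = "\u{1F534} \u{1F7E0}";

$before = EmojiMatcher::count($text);

// A caller extends the pattern without touching the class itself.
EmojiMatcher::$pattern =
    '/(\xF0\x9F[\x8C-\x97][\x80-\xBF]|\xF0\x9F\x9F[\xA0-\xAB])/';

$after = EmojiMatcher::count($text);
```

A static property is global mutable state, so a change made by one caller affects every other user of the class in the same process.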

encodeUnicode encodes Emojis in URLs

I'm using this library to encode emojis in HTML. The problem is that shortcode-like text inside links is converted as well, which breaks the links.

This is a sample input:

<h1>This is a heading :party:</h1>
<a href="https://dbsw.sharepoint.com/:x:/r/teams/Data/123">This is a link!</a>

This is the outcome:

<h1>This is a heading 🎉</h1>
<a href="https://dbsw.sharepoint.com/❌/r/teams/Data/123">This is a link!</a>

This is the expected outcome:

<h1>This is a heading 🎉</h1>
<a href="https://dbsw.sharepoint.com/:x:/r/teams/Data/123">This is a link!</a>

Is there any way to limit the conversion to shortcodes that have a space before or after them, e.g. " :x:" or ":x: "? This would prevent the conversion of "shortcodes" in links, which should not contain unescaped spaces.

Of course, I could parse the HTML and iterate through the filtered text nodes, but that seems quite complex. Maybe there is a better or simpler solution for this problem?

As a workaround, I currently parse the HTML and iterate over all links after the first conversion:

$html = LitEmoji::encodeUnicode($html);
$dom = new Dom();
$dom->loadStr($html);
$dom->find('a')->each(function ($a) {
    $a->setAttribute('href', LitEmoji::unicodeToShortcode($a->getAttribute('href')));
});
$html = $dom;

Thanks for any help in advance! :)
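
The delimiter idea from this issue could be sketched with lookarounds, without parsing the HTML. This is an assumption-laden illustration rather than a LitEmoji feature: replaceDelimitedShortcodes and the two-entry map are hypothetical, and the rule used here is slightly stricter than "space before or after": a shortcode is converted only when it is preceded by whitespace, ">" or the start of the text, and followed by whitespace, "<" or the end of the text.

```php
<?php
// Convert :shortcode: only when it stands on its own, not inside a URL path.
function replaceDelimitedShortcodes($text, array $map)
{
    return preg_replace_callback(
        // (?<![^\s>])  previous char is whitespace, ">" or start of text
        // (?![^\s<])   next char is whitespace, "<" or end of text
        '/(?<![^\s>]):([a-z0-9_+\-]+):(?![^\s<])/i',
        function ($m) use ($map) {
            // Unknown shortcodes are left exactly as they were.
            return isset($map[$m[1]]) ? $map[$m[1]] : $m[0];
        },
        $text
    );
}

// Hypothetical shortcode table for the example.
$map = ['party' => "\u{1F389}", 'x' => "\u{274C}"];

$input = '<h1>This is a heading :party:</h1>' . "\n"
    . '<a href="https://dbsw.sharepoint.com/:x:/r/teams/Data/123">This is a link!</a>';

$output = replaceDelimitedShortcodes($input, $map);
```

With this rule, adjacent shortcodes such as ":a::b:" are no longer converted, so it trades some coverage for safety inside attributes.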
