danielstjules / stringy
A PHP string manipulation library with multibyte support
License: MIT License
It would be interesting if the namespace was added automatically in autoload_namespaces.php file.
Some of the special characters are not translated to ASCII correctly.
For example, the letter Ľ (used in the Slovak word Ľudia, meaning "people") is not translated, so the output is "udia" instead of "Ludia".
kebab-case is almost equal to slug-case; the only difference is that kebab accepts dots (.).
Is this compatible with https://github.com/nicolas-grekas/Patchwork-UTF8?
Just got bitten by this bug: https://bugs.php.net/bug.php?id=47742
Stringy's toLowerCase and toUpperCase use them.
This seems to be a workaround that works for me:
mb_convert_case($string, MB_CASE_UPPER, 'UTF-8');
mb_convert_case($string, MB_CASE_LOWER, 'UTF-8');
Interestingly enough, you're already using mb_convert_case in the toTitleCase() method.
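The workaround above can be sketched as a pair of small helpers. The names upperCase and lowerCase are hypothetical and not part of Stringy, and the example assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical helpers (not part of Stringy) built on mb_convert_case().

function upperCase($str, $encoding = 'UTF-8')
{
    return mb_convert_case($str, MB_CASE_UPPER, $encoding);
}

function lowerCase($str, $encoding = 'UTF-8')
{
    return mb_convert_case($str, MB_CASE_LOWER, $encoding);
}

echo upperCase('fòô bàř'), "\n"; // FÒÔ BÀŘ
echo lowerCase('FÒÔ BÀŘ'), "\n"; // fòô bàř
```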
Fatal error: Cannot redeclare Stringy\create() (previously declared in /Users/crynobone/.composer/vendor/danielstjules/stringy/src/Create.php:14) in /Users/crynobone/Projects/orchestra/testbench/vendor/danielstjules/stringy/src/Create.php on line 17
Since this package is now used by illuminate/support, which is commonly installed both per project and globally, the function needs to be wrapped inside function_exists() to work properly.
See composer/composer#3003 and reactphp/promise#23 for reference.
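A minimal sketch of such a guard follows. The Demo namespace and its create() function are hypothetical stand-ins for Stringy\create(); the point is only that the declaration is skipped when a copy of the function has already been loaded from another install.

```php
<?php
namespace Demo;

// Hypothetical namespace and function, standing in for Stringy\create().
// function_exists() takes the fully qualified name without a leading backslash.
if (!function_exists('Demo\\create')) {
    function create($str)
    {
        return (string) $str;
    }
}

echo create('fòô'), "\n";
```

With the guard in place, including the same file from both a global and a per-project vendor directory should no longer trigger "Cannot redeclare"; whichever copy loads first wins.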
In src/Stringy.php, l.1862 :
throw new \RuntimeExpception('Stringy method requires the ' . 'mbstring module for encodings other than ASCII and UTF-8');
Fix :
throw new \RuntimeException(...)
I'm using a project that uses your project :) I'm attempting to submit my project to a repository (wordpress.org) that has a commit hook that checks all PHP code against a PHP 5.4 linter. Although Stringy supports PHP 5.6 shortcuts, it doesn't seem to require them, as the lint test is only failing for one of your test cases that uses use function. Any possibility that the test case can be rewritten to use Stringy\Stringy::create instead?
I need to be able to find a block defined by the occurrence of an H2 title followed by a number of HTML elements and ending just before the opening of the following H2, or the end of the container, in order to add a <section> around it. Here is an example of the markdown:
##Ingredients
* Garlic
* Bread
* Cherry tomatos
##Preparation
* Peel the garlic
* Cut the cherry tomatoes in half
* Toast the bread
* Eat the whole thing
The final output from twig should be something like this:
<section>
<h2>Ingredients</h2>
<ul>(…)</ul>
</section>
<section>
<h2>Preparation</h2>
<ul>(…)</ul>
</section>
Can Stringy help me to do that? How?
Any help on the subject is appreciated!
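Nothing in this thread suggests Stringy has a method for this, so here is a regex-based sketch in plain PHP that works on the rendered HTML. The function name wrapSections is hypothetical, and the approach assumes the H2 elements are top-level siblings that are never nested; a DOM parser would be more robust for arbitrary markup.

```php
<?php
// Hypothetical helper: wraps each <h2> and the content that follows it
// (up to the next <h2> or the end of the input) in a <section>.
function wrapSections($html)
{
    // Split at every position that is immediately followed by "<h2".
    $parts = preg_split('/(?=<h2\b)/i', $html, -1, PREG_SPLIT_NO_EMPTY);

    $out = '';
    foreach ($parts as $part) {
        if (stripos($part, '<h2') === 0) {
            $out .= "<section>\n" . rtrim($part) . "\n</section>\n";
        } else {
            // Content before the first <h2> is passed through untouched.
            $out .= $part;
        }
    }

    return $out;
}

$html = "<h2>Ingredients</h2>\n<ul></ul>\n<h2>Preparation</h2>\n<ul></ul>\n";
echo wrapSections($html);
```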
public function slugify($replacement = '-')
{
$stringy = self::create($this->str, $this->encoding);
$stringy->str = preg_replace("/[^a-zA-Z\d $replacement]/u", '',
$stringy->toAscii());
$stringy->str = $stringy->collapseWhitespace()->str;
$stringy->str = str_replace(' ', $replacement, strtolower($stringy));
return $stringy;
}
You need to replace all consecutive "unaccepted" characters with a single instance of the $replacement
preg_replace("/[^a-zA-Z\d $replacement]+/u", $replacement, $stringy->toAscii());
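As a standalone illustration of the suggested fix, the sketch below collapses every run of unaccepted characters into a single replacement. The function slugifySketch is hypothetical; it assumes the input has already been converted to ASCII (the toAscii() step is left out) and that $replacement is a single character.

```php
<?php
// Hypothetical standalone version of the proposed slugify() behaviour.
function slugifySketch($str, $replacement = '-')
{
    // Each run of characters outside [a-zA-Z0-9] becomes ONE replacement.
    $slug = preg_replace('/[^a-zA-Z\d]+/', $replacement, $str);

    // Drop replacements left over at either end, then lowercase.
    return strtolower(trim($slug, $replacement));
}

echo slugifySketch('Hello,   World!!'), "\n"; // hello-world
```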
When I first started using stringy, I added this line at the top of my bootloader:
use Stringy\Stringy as S;
and I was able to use it throughout the site like this:
$this->entityName = s($entity_slug)->upperCamelize();
but suddenly it stopped working for some reason. Now I'm getting this error message:
Fatal error: Call to undefined function s() in C:\wamp64\www\spider\chebi\orm_tools.php on line 39
any idea what the problem here might be?
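One possible explanation, offered as a guess rather than a diagnosis: use Stringy\Stringy as S; only aliases the class, while s() is a function that needs its own use function import (PHP 5.6+). Import statements also apply only to the file they appear in, so a use line in a bootloader does not carry over to other files. The sketch below uses a hypothetical Demo namespace as a stand-in for Stringy to show the two kinds of alias side by side.

```php
<?php
namespace Demo;

// Stand-ins for Stringy\Stringy and Stringy\create(); both are hypothetical.
final class Text
{
    private $str;

    public function __construct($str)
    {
        $this->str = (string) $str;
    }

    public static function create($str)
    {
        return new self($str);
    }

    public function upper()
    {
        return new self(strtoupper($this->str));
    }

    public function __toString()
    {
        return $this->str;
    }
}

function create($str)
{
    return new Text($str);
}

namespace App;

use Demo\Text as S;            // class alias: enables S::create(...)
use function Demo\create as s; // function alias: enables s(...)

echo S::create('foo')->upper(), "\n"; // via the class alias
echo s('bar')->upper(), "\n";         // via the function alias
```

Removing the use function line reproduces a "Call to undefined function" error for s(), while S::create() keeps working.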
For a 2.0.0 release of the lib, restructure things using the delegation pattern for composition. Avoid traits, as while they're a perfect solution, they're not compatible with PHP 5.3. Though it's been EOL'ed, it's still pretty popular, and I think Stringy is suitable for legacy projects.
S::create('área')->slugify()
Instead of converting characters like área to area, the method converts área to 225rea.
I need to implement a "Did you mean ....?" functionality, which you usually achieve using the levenshtein() function. But in that case I would need to search a database for possible matches. So given I have a search string foobar, I need all possible permutations to execute a LIKE query 😄
So I imagine this:
$permutations = s('foobar')->permutate();
// $permutations is now an array of "foobra", "foorba", "barfoo" etc. pp.
A cool addition to that would be an optional $maxLevenshteinDistance parameter. Like so:
$permutations = s('foobar')->permutate(2);
// $permutations should contain "foobra" (distance of 2) but not "barfoo" (distance of 7)
Do you see that fitting in this library? It's string manipulation, it just doesn't return a modified string. But chars() and others don't do that either :)
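A rough sketch of what the request could look like as free functions. Both permutations() and permutate() are hypothetical and not part of Stringy. The number of results grows factorially with the string length, and levenshtein() compares bytes, so the distance filter is only reliable for single-byte input.

```php
<?php
// Returns every distinct ordering of the given characters.
function permutations(array $chars)
{
    if (count($chars) <= 1) {
        return [implode('', $chars)];
    }

    $result = [];
    foreach ($chars as $i => $char) {
        $rest = $chars;
        unset($rest[$i]);
        foreach (permutations(array_values($rest)) as $tail) {
            $result[] = $char . $tail;
        }
    }

    // Repeated characters produce duplicates; keep each ordering once.
    return array_values(array_unique($result));
}

// Hypothetical permutate(): optionally keeps only the orderings whose
// Levenshtein distance to the input is at most $maxDistance.
function permutate($str, $maxDistance = null)
{
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    $all = permutations($chars);

    if ($maxDistance === null) {
        return $all;
    }

    return array_values(array_filter(
        $all,
        function ($candidate) use ($str, $maxDistance) {
            return levenshtein($str, $candidate) <= $maxDistance;
        }
    ));
}

$near = permutate('foobar', 2);
var_dump(in_array('foobra', $near, true)); // bool(true)
var_dump(in_array('barfoo', $near, true)); // bool(false)
```

Note that the input itself (distance 0) is part of the result; callers that only want alternatives would need to filter it out.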
I was expecting the titleize method to return "Title Case" for the string "TITLE CASE". Instead, it stayed all caps. I had to call toLower, then titleize to get the desired result.
Not so much an issue as an implementation question.
Was there a specific reason for using mb_ereg_match in matchPattern() vs. preg_match?
For example, the former would use [[:upper:]] to find uppercase letters and the latter would use \p{Lu}.
I know POSIX has been around longer, so maybe PCRE isn't as multibyte capable? Just trying to come to grips with the more awkward (to me) POSIX syntax.
For example, if I wanted to add a hasUpperCase() method to see if a string had any uppercase character in it, it looks like .*[[:upper:]] is the pattern that works, although my brain wants to interpret that as "the first character is upper or lowercase and the rest are uppercase".
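For comparison, here is a PCRE version of the same check as a standalone function; hasUpperCase is the name proposed above, not an existing Stringy method. As far as I know, mb_ereg_match only matches from the start of the string, which is why the POSIX pattern needs the leading .*; preg_match searches anywhere in the subject, so \p{Lu} alone is enough when combined with the u modifier.

```php
<?php
// Hypothetical standalone check using PCRE Unicode properties.
function hasUpperCase($str)
{
    // \p{Lu} matches any uppercase letter; /u treats the subject as UTF-8.
    return preg_match('/\p{Lu}/u', $str) === 1;
}

var_dump(hasUpperCase('fooBar')); // bool(true)
var_dump(hasUpperCase('foobar')); // bool(false)
var_dump(hasUpperCase('Ľudia'));  // bool(true)
```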
applyDelimiter should be public or have a public implementation, e.g. delimit(). Stringy currently only supports two fixed implementations, dasherize and underscored, which are wrong for many reasons:
underscored should actually be underscore.
protected access.
You'll agree that maintaining 2 codebases (regular and static) can easily become a nightmare.
Instead of waiting to see if one of the APIs is dying, how about using __callStatic() for the static class? Is it desirable, doable?
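To make the idea concrete, here is a minimal sketch of a static facade that forwards calls through __callStatic(). The Text and StaticText classes are hypothetical stand-ins, not Stringy's actual classes; the first argument of each static call is treated as the string and the rest are passed to the instance method.

```php
<?php
// Hypothetical instance class standing in for Stringy.
final class Text
{
    private $str;

    public function __construct($str)
    {
        $this->str = (string) $str;
    }

    public function upper()
    {
        return new self(strtoupper($this->str));
    }

    public function append($suffix)
    {
        return new self($this->str . $suffix);
    }

    public function __toString()
    {
        return $this->str;
    }
}

// Hypothetical static facade: StaticText::method($str, ...$args)
// is forwarded to (new Text($str))->method(...$args).
final class StaticText
{
    public static function __callStatic($name, $arguments)
    {
        $instance = new Text(array_shift($arguments));

        if (!method_exists($instance, $name)) {
            throw new BadMethodCallException("Unknown method: $name");
        }

        return call_user_func_array([$instance, $name], $arguments);
    }
}

echo StaticText::upper('foo'), "\n";         // FOO
echo StaticText::append('foo', 'bar'), "\n"; // foobar
```

The trade-off is that editors cannot see the forwarded methods without @method annotations, and each call pays for the extra indirection.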
Hi there. I have seen a few older comments around adding pluralize and singularize methods to Stringy, and that the concern was making it multilingual.
Would you accept a PR that added these methods using ICanBoogie/Inflector, which is multilingual and as of now supports English, French, Norwegian Bokmål, Portuguese, Spanish, and Turkish?
I don't think this is directly related, but I thought I would ask. I have been building a CI pipeline for a Laravel app with a PHPUnit testing suite, and for days I have been tracing errors back to
use function Stringy\create as s;
in the CreateTest class. I don't have this problem when I run it locally, but for whatever reason PHPUnit running in my CI container hates this and exits with a 255 code. Do you have any idea why this may be? Tried with PHP 5.5.9 and PHP 5.6.
I use the static implementation, StaticStringy, exclusively, but the v2 implementation is a regression. You might argue the magic method implementation means less maintenance overhead, but it doesn't, because you still have the burden of maintaining a list of valid methods and argument counts. Therefore I strongly recommend you revert to the v1 implementation for StaticStringy.
If you cannot be persuaded to revert, I strongly suggest you add @method annotations for the StaticStringy class. The v2 implementation breaks static analysis in editors, meaning features like code insight/code completion do not work. Adding @method annotations for each method exported by the class will fix this, but again, the maintenance overhead grows so much that you may as well just revert to the v1 implementation; moreover, magic methods incur a performance hit.
I earnestly cannot see any tangible benefit to the v2 implementation at all and implore you to revert it. In the meantime my Composer definition will remain locked to v1.
Warning: mb_regex_encoding(): Unknown encoding "" - \Stringy\src\Stringy.php on line 970
Receiving this error when an instance is cloned. Such as:
require_once 'Stringy/src/Stringy.php';
use Stringy\Stringy as Stringy;
final class S extends Stringy
{
public function __construct()
{
return clone new Stringy;
}
}
echo S::create('Camel-Case')->camelize();
Any suggestions?
Note, I have mbstring enabled.
Might need to update the format
There is a wrong matching character.
The greek letter "θ" corresponds to "th" not to "o".
When I tried to extend the base Stringy class I noticed some issues. These are:
class StringyExtended extends Stringy
{
    public function hasUpperCase()
    {
        // Match uppercase anywhere in the string
        $patternForUpperCase = ".*([[:upper:]])";
        return (bool) mb_ereg_match($patternForUpperCase, $this->str);
    }
}
I couldn't use Stringy::matchesPattern because it's a private method, and $str is also declared as private, so please consider these issues. Otherwise, it's a great library. Making them protected would probably solve these issues, but other approaches are possible.
Please take a look at this constructor from https://github.com/lanthaler/IRI/blob/a04d4f923700dc5b4a19e1e105f978b50647efaa/IRI.php#L80
public function __construct($iri = null)
{
if (null === $iri) {
return;
} elseif (is_string($iri)) {
$this->parse($iri);
} elseif ($iri instanceof IRI) {
$this->scheme = $iri->scheme;
$this->userinfo = $iri->userinfo;
$this->host = $iri->host;
$this->port = $iri->port;
$this->path = $iri->path;
$this->query = $iri->query;
$this->fragment = $iri->fragment;
} else {
throw new \InvalidArgumentException(
'Expecting a string or an IRI, got ' .
(is_object($iri) ? get_class($iri) : gettype($iri))
);
}
}
I naively wrote this:
$foo = new Stringy('bar');
$baz = new IRI($foo);
and got that:
InvalidArgumentException: Expecting a string or an IRI, got Stringy\Stringy
The solution is explicit type conversion:
$baz = new IRI((string)$foo);
My naive approach was done under the impression that Stringy should act like a PHP string, even though it is an object. How would you approach this issue? I can imagine:
(string)
). That could include an interface which is_string recognizes and returns true for. Also, an instance of the class implementing the interface would be treated as a PHP string most of the time, except when methods are invoked on the instance. And the string object would be passed by reference of course.
Approaches 1 and 2 seem like hacks, though they are in reach. Approach 3 is the best from the user's perspective, but I don't know if it's viable. It might entail a ton of changes in PHP core.
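On the receiving side, a library can accept string-like objects without any change to PHP by checking for __toString() before rejecting a value. The sketch below is one way to do that; toNativeString and the Word class are hypothetical names used only for illustration.

```php
<?php
// Hypothetical helper: accepts native strings and any object that
// implements __toString(), and rejects everything else.
function toNativeString($value)
{
    if (is_string($value)) {
        return $value;
    }

    if (is_object($value) && method_exists($value, '__toString')) {
        return (string) $value;
    }

    throw new InvalidArgumentException(
        'Expecting a string or a stringable object, got ' .
        (is_object($value) ? get_class($value) : gettype($value))
    );
}

// Hypothetical stand-in for a Stringy instance.
final class Word
{
    private $str;

    public function __construct($str)
    {
        $this->str = (string) $str;
    }

    public function __toString()
    {
        return $this->str;
    }
}

echo toNativeString(new Word('bar')), "\n"; // bar
```

A constructor like the IRI one quoted above could call such a helper before its is_string() branch, so callers would not need the explicit (string) cast.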
I think the toAscii method should not convert @ to "at", as the @ sign is already 100% ASCII.
The "@" => "at" conversion can absolutely be part of slugify(), though.
But toAscii should do exactly what the method name says.
Just had a few hours of debugging.
It turned out that after upgrading Stringy to 2.2.0+, our conversion of email addresses to ASCII was broken.
Example:
$email = "cé[email protected]";
$expected = "[email protected]";
$actual = (string) StaticStringy::toAscii($email); // produces cecileatexample.com
echo $expected == $actual ? 'Pass' : 'Fail'; // Fails
It would be nice if the "between" function was available. What I mean is this function: http://stringjs.com/#methods/between-left-right
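A byte-based sketch of such a function is shown below. The name between and its behaviour (returning an empty string when either delimiter is missing) are assumptions for illustration, and both delimiters are assumed to be non-empty.

```php
<?php
// Hypothetical helper: returns the text between the first $left and the
// first $right that follows it, or '' when either one is missing.
function between($str, $left, $right)
{
    $start = strpos($str, $left);
    if ($start === false) {
        return '';
    }
    $start += strlen($left);

    $end = strpos($str, $right, $start);
    if ($end === false) {
        return '';
    }

    return substr($str, $start, $end - $start);
}

echo between('<a>foo</a>', '<a>', '</a>'), "\n"; // foo
```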
It would be cool if English pluralisation and singularisation were possible in this library. If you agree, I would seek to make a more robust implementation of the below:
public function pluralize($word, $num = 2)
{
if ($num < 2){
return $this->singularize($word);
}
$plural = array(
'/(quiz)$/i' => '\1zes',
'/^(ox)$/i' => '\1en',
'/([ml])ouse$/i' => '\1ice',
'/(matr|vert|ind)(?:ix|ex)$/i' => '\1ices',
'/(x|ch|ss|sh)$/i' => '\1es',
'/([^aeiouy]|qu)ies$/i' => '\1y',
'/([^aeiouy]|qu)y$/i' => '\1ies',
'/(hive)$/i' => '\1s',
'/(?:([^f])fe|([lr])f)$/i' => '\1\2ves',
'/sis$/i' => 'ses',
'/([ti])um$/i' => '\1a',
'/(buffal|tomat)o$/i' => '\1oes',
'/(bu)s$/i' => '\1ses',
'/(alias|status)$/i'=> '\1es',
'/(octop|vir)us$/i'=> '\1i',
'/(ax|test)is$/i'=> '\1es',
'/s$/i'=> 's',
'/$/'=> 's');
$uncountable = array('equipment', 'information', 'rice', 'money', 'species', 'series', 'fish', 'sheep');
$irregular = array(
'person' => 'people',
'man' => 'men',
'child' => 'children',
'sex' => 'sexes',
'move' => 'moves');
$lowercased_word = strtolower($word);
foreach ($uncountable as $_uncountable){
if(substr($lowercased_word,(-1*strlen($_uncountable))) == $_uncountable){
return $word;
}
}
foreach ($irregular as $_plural=> $_singular){
if (preg_match('/('.$_plural.')$/i', $word, $arr)) {
return preg_replace('/('.$_plural.')$/i', substr($arr[0],0,1).substr($_singular,1), $word);
}
}
foreach ($plural as $rule => $replacement) {
if (preg_match($rule, $word)) {
return preg_replace($rule, $replacement, $word);
}
}
return false;
}
/**
* Singularizes English nouns.
*
* @access public
* @static
* @param string $word English noun to singularize
* @return string Singular noun.
*/
public function singularize($word, $num = 1)
{
if ($num > 1){
return $this->pluralize($word);
}
$singular = array (
'/(quiz)zes$/i' => '\1',
'/(matr)ices$/i' => '\1ix',
'/(vert|ind)ices$/i' => '\1ex',
'/^(ox)en/i' => '\1',
'/(alias|status)es$/i' => '\1',
'/(octop|vir)i$/i' => '\1us',
'/(cris|ax|test)es$/i' => '\1is',
'/(shoe)s$/i' => '\1',
'/(o)es$/i' => '\1',
'/(bus)es$/i' => '\1',
'/([ml])ice$/i' => '\1ouse',
'/(x|ch|ss|sh)es$/i' => '\1',
'/(m)ovies$/i' => '\1ovie',
'/(s)eries$/i' => '\1eries',
'/([^aeiouy]|qu)ies$/i' => '\1y',
'/([lr])ves$/i' => '\1f',
'/(tive)s$/i' => '\1',
'/(hive)s$/i' => '\1',
'/([^f])ves$/i' => '\1fe',
'/(^analy)ses$/i' => '\1sis',
'/((a)naly|(b)a|(d)iagno|(p)arenthe|(p)rogno|(s)ynop|(t)he)ses$/i' => '\1\2sis',
'/([ti])a$/i' => '\1um',
'/(n)ews$/i' => '\1ews',
'/s$/i' => '',
);
$uncountable = array('equipment', 'information', 'rice', 'money', 'species', 'series', 'fish', 'sheep');
$irregular = array(
'person' => 'people',
'man' => 'men',
'child' => 'children',
'sex' => 'sexes',
'move' => 'moves');
$lowercased_word = strtolower($word);
foreach ($uncountable as $_uncountable){
if(substr($lowercased_word,(-1*strlen($_uncountable))) == $_uncountable){
return $word;
}
}
foreach ($irregular as $_plural=> $_singular){
if (preg_match('/('.$_singular.')$/i', $word, $arr)) {
return preg_replace('/('.$_singular.')$/i', substr($arr[0],0,1).substr($_plural,1), $word);
}
}
foreach ($singular as $rule => $replacement) {
if (preg_match($rule, $word)) {
return preg_replace($rule, $replacement, $word);
}
}
return $word;
}
It would be great if there was a Laravel Service Provider and Facade so that we can use Stringy with Laravel views.
As per title, it would be nice to be able to do this:
if (s("string")->matchesPattern("pattern"))
{
echo 'We found a match.';
}
When I try the following:
Stringy\Stringy::at("Test", 2, "UTF-8")
I get this:
Strict standards: Non-static method Stringy\Stringy::at() should not be called statically
Isn't it the right way of calling it?
It looks like, because ext-mbstring was removed from composer.json, the default Heroku PHP buildpack does not install it during deploy. So mb_regex_encoding from the polyfill is used, and since that is not a built-in PHP function, a "\" is required before the function name to specify the root namespace.
How to reproduce: I suggest to run tests on environment without ext-mbstring
PHP: 7.0.4
First of all, thanks for a great library!
A strip_tags() function that actually works would be a great addition!
Must be able to handle:
Here is what I'm using at the moment:
https://gist.github.com/web64/b0efa3f001401950f26fb72f5eff8674
I was testing the class on a Laravel project I'm working on, using camelize() to return strings in the form used for naming functions. If I send a string like "MY FUNCTION", it returns "mYFUNCTION".
Another case is when you type a string like "TeStInG ClAsS", it would output the following "teStInGClAsS".
Is this intended behavior, or should it return the string fully camelized as when you send through a lower case one?
I saw the mention in description that it uses the polyfill when the module is not installed, however seems that it's not sufficient?
Fatal error: Uncaught Error: Call to undefined function mb_regex_encoding()
The product that uses Stringy is supposed to be distributed to users that may be unable to install the mbstring module.
Tested on PHP 7.0.4-6+deb.sury.org~trusty+3
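A defensive sketch for environments like this one, assuming (as the error suggests) that the polyfill in use does not provide the mb_ereg family of functions. The helper names are hypothetical. The fallback uses PCRE with the u modifier, which only behaves the same for patterns whose syntax is shared by both engines, and the mbstring branch changes the global regex encoding.

```php
<?php
// True only when the multibyte regex functions are really available.
function supportsMbRegex()
{
    return function_exists('mb_regex_encoding')
        && function_exists('mb_ereg_replace');
}

// Hypothetical helper: uses mb_ereg_replace() when present and falls
// back to preg_replace() otherwise.
function replacePattern($str, $pattern, $replacement)
{
    if (supportsMbRegex()) {
        mb_regex_encoding('UTF-8'); // note: global setting
        return mb_ereg_replace($pattern, $replacement, $str);
    }

    // Escape the delimiter so the pattern can be wrapped in slashes.
    $pcre = '/' . str_replace('/', '\/', $pattern) . '/u';
    return preg_replace($pcre, $replacement, $str);
}

echo replacePattern('a   b', '\s+', ' '), "\n"; // a b
```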
I found missing variants in $charsArray :
'a'=>'ä'
'ae'=>'æ'
'ss'=>'ß'
'U'=>'Ў'
'u'=>'ў'
Thanks for your excellent package!
https://github.com/danielstjules/Stringy/blob/master/tests/CreateTest.php#L5
The use function ... statement is a feature introduced in PHP 5.6 (http://php.net/manual/en/language.namespaces.importing.php). Your composer.json file, however, declares that this package is compatible with PHP 5.3+. This mistake causes my build scripts to break. Please either fix the composer.json declaration, or don't use namespaced function imports. Thanks!
Is there any way to get an instance of Stringy without passing a string to the constructor since StaticStringy has been removed?
I need to have Stringy in the IoC container which would usually be just this short piece.
$this->app->singleton('stringy', function ($app) {
return new Stringy();
});
The issue is that Stringy expects the first parameter to be a string, and since calling the facade would be something like Stringy::slugify('fòô bàř'), there is nothing I can pass to new Stringy(), which is why I would need new StaticStringy(), which works.
So how would I put Stringy behind a Facade now that StaticStringy is gone?
Hi,
I'm using Laravel 5.2.5, which uses a small part of Stringy's code. I have a problem with some Turkish characters (ö, Ö, ü, Ü) when converting to a slug.
For example in Turkish Language:
'Ö' => 'O',
'ö' => 'o',
In German, etc.:
'Ö' => 'OE',
'ö' => 'oe',
It is a really big problem (for information: https://en.wikipedia.org/wiki/%C3%96).
The conversion must take localization into account.
Here laravel issue: laravel/framework#11619
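One way to picture a locale-aware conversion is a lookup of per-language replacement maps applied before the generic transliteration. The function below is a hypothetical sketch, not Stringy's API; the two maps contain only the Ö/ö pairs quoted above, and unknown languages leave the string untouched.

```php
<?php
// Hypothetical helper: applies language-specific replacements first.
function toAsciiForLanguage($str, $language)
{
    // Only the mappings mentioned in this issue; a real table would be larger.
    $maps = [
        'tr' => ['Ö' => 'O', 'ö' => 'o'],
        'de' => ['Ö' => 'OE', 'ö' => 'oe'],
    ];

    if (!isset($maps[$language])) {
        return $str;
    }

    return strtr($str, $maps[$language]);
}

echo toAsciiForLanguage('Köln', 'de'), "\n"; // Koeln
echo toAsciiForLanguage('Köln', 'tr'), "\n"; // Koln
```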
Currently working on it in the https://github.com/danielstjules/Stringy/tree/2.0.0 branch. Welcome any input :)
Now that StaticStringy has been removed for 2.0 (#93), I have already had a few situations where I needed the StaticStringy class because I had to inject it or something like that.
So my proposal would be that StaticStringy be brought back to life in a separate package like danielstjules/staticstringy, which would only offer the StaticStringy class. This would make it optional and not clutter the core package, but we would still have the advantages of it.
So would this be a possibility, or is StaticStringy gone for good, and if I wanted to use StaticStringy post 2.0 I should create a package for it myself?
Hi. Started using your library today. It has most things, but lacks one that's pretty useful - a method that returns matches based on a regex pattern (More specifically, returns the matches from a preg_match. Maybe we can also have a matchAll() method corresponding to preg_match_all function).
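As a rough idea of what such methods could return, here are two standalone wrappers around preg_match and preg_match_all. The names matchPattern and matchAllPattern are hypothetical and chosen only for this sketch.

```php
<?php
// Hypothetical helper: returns the match array from preg_match(),
// or an empty array when the pattern does not match.
function matchPattern($str, $pattern)
{
    return preg_match($pattern, $str, $matches) === 1 ? $matches : [];
}

// Hypothetical helper: returns one array per match (PREG_SET_ORDER).
function matchAllPattern($str, $pattern)
{
    preg_match_all($pattern, $str, $matches, PREG_SET_ORDER);
    return $matches;
}

print_r(matchPattern('order 42', '/(\d+)/'));
print_r(matchAllPattern('a1b22', '/\d+/'));
```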
The current implementation of title case is not actual title case, in the sense that it simply capitalises each word. Notwithstanding the ability to set $ignore words, I think we could do better.
I have ported John Gruber's title casing script to PHP; would you accept a PR altering the toTitleCase method to use this technique?
Hello @danielstjules,
Your library is really great! I'm using it in several projects and it is a must-have Swiss Army knife.
Here's what I suggest for improving it:
On the next major version, rename your package to stringy/stringy. No problem with your username, but I always have to go on github to check the vendor name.
The split() method is very useful, but a join() factory (to do the inverse thing, join an array of Stringy objects) would also be really helpful.
Please merge SubStringy methods into the main library. The substring methods are extremely useful (and we expect them when using a string manipulation library), but the use of a "plugin", which is actually an override requiring a refactor of each existing import, is a real pain.
I have made my own fork which integrates these 3 suggestions: https://github.com/bpolaszek/Stringy (I didn't register it on Packagist, to let you submit the stringy/stringy package if you agree with my 1st point).
That's what I use in my projects now, and things are way simpler.
What do you think?
Having a way to pluralize words would be nice. Currently using Stringy with Laravel, and it already has a str_plural(), but I'm using Stringy for the OO chaining approach.
I really like this library but there are a number of methods that have been added that do not belong here.
A good example is isJson(). Where do you draw the line with this? Should we also add isXml() and isPdf()? There are already built-in PHP functions for this. Can Fileinfo detect JSON? I don't know, but regardless, such a function belongs to a JSON library, not a general purpose string manipulation library.
The methods I would remove are:
is*
All of these contribute bloat either because they don't belong or because they would seldom, if ever, be used.
I think it would be beneficial to add the capability for the humanize method to format dashes to make the transition between slugify and humanize easier without having to use the replace method to remove dashes and add spaces.
For example right now I would have to do this:
Stringy::create('some-slug-text')->replace('-', ' ')->humanize()
This would be nice:
Stringy::create('some-slug-text')->humanize()
Not a big deal, just something I do often in a project I am working on.
I can submit a PR for it if you like.
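A standalone sketch of the proposed behaviour: treat dashes like underscores before capitalizing. The function humanizeSketch is hypothetical and is not Stringy's humanize(); it uses the byte-based ucfirst(), so it only capitalizes ASCII first letters.

```php
<?php
// Hypothetical helper: turns slug-style or snake_case text into a phrase.
function humanizeSketch($str)
{
    // Dashes and underscores both become spaces.
    $spaced = str_replace(['_', '-'], ' ', trim($str));

    return ucfirst($spaced);
}

echo humanizeSketch('some-slug-text'), "\n"; // Some slug text
```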
Hello,
I'm about to fork this in order to implement the following methods:
containsInt(): Checks if the string contains an integer value.
containsFloat(): Checks if the string contains a float value.
containsNumber(): Checks if the string contains a number, integer or float.
isEmpty(): Checks if the string is empty (doesn't contain any char).
afterFirstSubstring(): Returns a substring of the input string containing everything after the first occurrence of a given substring.
afterLastSubstring(): Returns a substring of the input string containing everything after the last occurrence of a given substring.
beforeFirstSubstring(): Returns a substring of the input string containing everything before the first occurrence of a given substring.
beforeLastSubstring(): Returns a substring of the input string containing everything before the last occurrence of a given substring.
However, I'm not sure about the naming convention I should be using. I had these methods on a String class I've been developing for years, but now I'm about to move to Stringy.
What do you guys think?
Thank you.
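To make the four substring methods concrete, here is a byte-based sketch as free functions. The shortened names and the choice to return an empty string when the substring is not found are assumptions for illustration, and the needle is assumed to be non-empty.

```php
<?php
// Everything after the first occurrence of $sub ('' if not found).
function afterFirst($str, $sub)
{
    $pos = strpos($str, $sub);
    return $pos === false ? '' : (string) substr($str, $pos + strlen($sub));
}

// Everything after the last occurrence of $sub ('' if not found).
function afterLast($str, $sub)
{
    $pos = strrpos($str, $sub);
    return $pos === false ? '' : (string) substr($str, $pos + strlen($sub));
}

// Everything before the first occurrence of $sub ('' if not found).
function beforeFirst($str, $sub)
{
    $pos = strpos($str, $sub);
    return $pos === false ? '' : (string) substr($str, 0, $pos);
}

// Everything before the last occurrence of $sub ('' if not found).
function beforeLast($str, $sub)
{
    $pos = strrpos($str, $sub);
    return $pos === false ? '' : (string) substr($str, 0, $pos);
}

echo afterFirst('a.b.c', '.'), "\n";  // b.c
echo afterLast('a.b.c', '.'), "\n";   // c
echo beforeFirst('a.b.c', '.'), "\n"; // a
echo beforeLast('a.b.c', '.'), "\n";  // a.b
```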