Comments (3)
Do you know what charset you are working with? I just tried this extension on some Shift-JIS characters and it failed to encode them in UTF-8. I think this extension is not detecting Shift-JIS correctly.
Try doing it manually with iconv.
// Assumes $string_to_encode really holds Shift-JIS bytes (e.g. this file is saved as Shift-JIS).
$string_to_encode = '質量';
$encoded_str = iconv('shift-jis', 'utf-8//TRANSLIT', $string_to_encode);
echo $encoded_str;
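If you want to try this without depending on how your source file is saved, something like the following might help. It is only a sketch: it assumes the file is saved as UTF-8 and that both the mbstring and iconv extensions are available, and the variable names are made up for illustration.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring and iconv extensions are loaded.
$utf8 = "質量";

// Build genuine Shift-JIS bytes to stand in for data received from elsewhere.
$shiftJis = mb_convert_encoding($utf8, 'SJIS', 'UTF-8');

// The Shift-JIS bytes are not valid UTF-8, so a UTF-8 check can flag them.
var_dump(mb_check_encoding($shiftJis, 'UTF-8')); // bool(false)

// Convert with the source charset stated explicitly instead of guessed.
$converted = iconv('SHIFT_JIS', 'UTF-8', $shiftJis);
var_dump($converted === $utf8); // bool(true)
```

The point is that the source charset is named explicitly, so nothing has to be detected.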
from forceutf8.
The main function of this package is toUTF8(), which encodes Latin1 to UTF8 but detects characters that are already UTF-8 encoded and keeps them as they are, avoiding the usual problem of double encoding them.
FixUTF8 is an auxiliary function that fixes Latin1 characters in double-encoded UTF8 strings, converting, for instance, "RepÃºblica" back to "República", but losing all non-Latin1 characters.
In other words, do not try to use it with Japanese characters (or any character outside the Latin1 set). It will break them.
Do not use it in production either. It's a hacky and slow function designed to be used in manual, human-assisted batch processes by people who only use characters in the Latin1 set.
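To make the double-encoding problem concrete, here is a minimal sketch that uses PHP's mbstring extension rather than forceutf8 itself. It assumes the file is saved as UTF-8, and the variable names are illustrative only. It is not how the library is implemented.

```php
<?php
// Assumes this file is saved as UTF-8 and the mbstring extension is loaded.
$original = "República";

// Double encoding: the UTF-8 bytes are wrongly treated as Latin1 and encoded again.
$doubleEncoded = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');
echo $doubleEncoded, PHP_EOL; // RepÃºblica

// Undo one layer: converting "back" to Latin1 yields the original UTF-8 bytes.
$repaired = mb_convert_encoding($doubleEncoded, 'ISO-8859-1', 'UTF-8');
echo $repaired, PHP_EOL; // República

// Characters outside Latin1 cannot be represented in that step and are replaced.
$japanese = "質量";
$broken = mb_convert_encoding($japanese, 'ISO-8859-1', 'UTF-8');
var_dump($broken === $japanese); // bool(false)
```

The last step shows why a Latin1-based repair cannot be applied to Japanese text: the intermediate Latin1 form has no slot for those characters.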
Oh okay, thanks for the explanation! Anyway, I wrote a function to handle the problem. I'm posting it here in case anyone wants to use or improve it. $nickname is a Unicode string, which means it consists of X pairs of bytes. In my case I'm working with BigEndian format. This function successfully returns any Unicode string by reading the numeric value of each byte pair.
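function fixNick($nickname) {
    $fixNick = "";
    $i = 0;
    // Stop before the final byte pair, which is never appended (e.g. a terminator).
    $strlen = strlen($nickname) - 2;
    if ($strlen < 0) {
        return $fixNick; // fewer than two bytes: nothing to decode
    }
    // Build a 16-bit code unit: the byte at $i is the low byte, the byte at $i+1 the high byte.
    $unicode = (ord($nickname[$i + 1]) << 8) + ord($nickname[$i]);
    while ($i < $strlen and $unicode > 0) {
        $i = $i + 2;
        // Decode the numeric HTML entity (with its terminating ';') to UTF-8.
        $fixNick .= mb_convert_encoding('&#' . $unicode . ';', 'UTF-8', 'HTML-ENTITIES');
        $unicode = (ord($nickname[$i + 1]) << 8) + ord($nickname[$i]);
    }
    return $fixNick;
}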
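For comparison, the same conversion can probably be done with a single mbstring call, since mbstring understands UTF-16 directly. This is only a sketch and not part of forceutf8: the function name utf16ToUtf8 is hypothetical, and it assumes the input is UTF-16 with the low byte first (UTF-16LE), which is the order fixNick() reads. Pass 'UTF-16BE' if your data really is big-endian.

```php
<?php
// Hypothetical helper, not part of forceutf8. Requires the mbstring extension.
function utf16ToUtf8(string $bytes, string $byteOrder = 'UTF-16LE'): string
{
    // Cut the input at the first 16-bit NUL code unit, mirroring the "$unicode > 0" check.
    for ($i = 0; $i + 1 < strlen($bytes); $i += 2) {
        if ($bytes[$i] === "\0" && $bytes[$i + 1] === "\0") {
            $bytes = substr($bytes, 0, $i);
            break;
        }
    }
    return mb_convert_encoding($bytes, 'UTF-8', $byteOrder);
}

// "質量" as UTF-16LE followed by a NUL terminator.
$nickname = "\xEA\x8C\xCF\x91\x00\x00";
echo utf16ToUtf8($nickname), PHP_EOL;
```

Unlike the loop above, this keeps the final byte pair unless it is a NUL terminator, and it avoids the 'HTML-ENTITIES' pseudo-encoding, which is deprecated as of PHP 8.2.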