Comments (10)
This seems to work for me (although I admit I have not pushed all of my changes yet). Can you provide an example of how this fails for you?
from chcsvparser.
Hi @brwnx,
I have a unit test in place to test for this, but it appears to be passing. Can you provide more information about how you're seeing this fail?
from chcsvparser.
I think I had the same problem. When I have special characters in names the parser stops at that character for the line. The CSV file I had came from exporting from Excel. However, I believe it is Excel that is failing to export UTF-8 characters correctly.
eg:
148,S†TTERLIN Jasha,MOV,MOVISTAR TEAM
should have been:
148,SÜTTERLIN Jasha,MOV,MOVISTAR TEAM
So, the fault was with the file Excel created when I used Save As ... CSV.
from chcsvparser.
@skyvalleystudio both of those strings parse correctly with the latest release of the parser.
from chcsvparser.
I tried with the July version and still had the problem (first on the line with bib 148). My test file is here:
https://drive.google.com/file/d/0B7DnwOciz86uWWk0UDNXV1IteXM/edit?usp=sharing
Download with:
https://docs.google.com/uc?authuser=0&id=0B7DnwOciz86uWWk0UDNXV1IteXM&export=download
I still think Excel is not really saving in Unicode.
from chcsvparser.
Thanks @skyvalleystudio, I'll start working on it. Is this CSV file something that I could check into the repository as part of the unit tests?
from chcsvparser.
Feel free to use the file. I wish I understood character sets better right about now...
I work around the problem by exporting to UTF-16 .txt in Excel, then replacing each tab with a comma and renaming the file. The result imports fine with your parser.
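For anyone who wants to script that workaround, here is a minimal sketch in Python (not part of CHCSVParser). It assumes the Excel export is UTF-16 with a byte-order mark and tab-delimited; the function name `utf16_tsv_to_utf8_csv` is hypothetical. It goes through the `csv` module rather than a plain tab-to-comma replace, so a field that already contains a comma gets quoted instead of being split.

```python
import csv
import io


def utf16_tsv_to_utf8_csv(data: bytes) -> bytes:
    """Convert UTF-16 tab-delimited bytes into UTF-8 comma-delimited bytes."""
    # "utf-16" reads the byte-order mark to pick the endianness and drops it.
    text = data.decode("utf-16")
    # newline="" leaves line endings untouched so the csv module handles them.
    rows = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    out = io.StringIO(newline="")
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue().encode("utf-8")


# Stand-in for the exported file, built in memory for illustration.
sample = "148\tSÜTTERLIN Jasha\tMOV\tMOVISTAR TEAM\r\n".encode("utf-16")
converted = utf16_tsv_to_utf8_csv(sample)
print(converted.decode("utf-8"), end="")
```

The output bytes are plain UTF-8, which is the encoding the parser appeared to expect in this thread.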
from chcsvparser.
It's a file encoding problem. The parser is coming across the Ü in the file, which is encoded as 0x86. However, 0x86 is not valid on its own in UTF-8: it is a continuation byte that can only appear inside a multi-byte character, so the parser is not able to extract a character from it. The file most likely isn't actually encoded as UTF-8 (if it were, it would not have encoded Ü as 0x86; it would be the two bytes 0xC3 0x9C).
You could work around this by explicitly specifying a different encoding for the file, but I'll try and figure out what the parser is supposed to do.
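To make the byte-level picture concrete, here is a small Python sketch (outside the parser itself) that decodes the problem line under three encodings. The 0x86 byte comes from the analysis above; the guess that the file is Mac OS Roman is an assumption that the thread does not confirm, although it is consistent with Ü turning into † when the same byte is read as Windows-1252. The helper `try_decode` is a hypothetical name.

```python
# The line from the sample file, with the byte 0x86 standing in for the Ü.
raw = b"148,S\x86TTERLIN Jasha,MOV,MOVISTAR TEAM"


def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(raw, "utf-8")           # invalid: 0x86 cannot start a character
as_cp1252 = try_decode(raw, "cp1252")        # Windows-1252 maps 0x86 to the dagger
as_mac_roman = try_decode(raw, "mac_roman")  # Mac OS Roman maps 0x86 to U-umlaut

for name, text in [("utf-8", as_utf8), ("cp1252", as_cp1252), ("mac_roman", as_mac_roman)]:
    print(name, "->", text)
```

If the Mac OS Roman guess holds for a given file, explicitly specifying that encoding would be the first thing to try.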
from chcsvparser.
Any progress with this? I have the same problem, and just realised I created a duplicate issue report :/
I tried forcing different encodings on the parser, but none helped. I have no control over the actual file and have to use it as given. I don't care how long parsing takes, so I would be happy to modify each row in my own code before the parser sees it.
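One way to pre-process the data before the parser sees it is to transcode the raw bytes to UTF-8 first. The following is only a sketch in Python, not something the parser provides: the function name `to_utf8` and the default fallback list are assumptions, and a single-byte fallback such as Mac OS Roman will "succeed" on almost any input, so the result is only right if the guess about the source encoding is right.

```python
def to_utf8(data: bytes, fallbacks=("mac_roman",)) -> bytes:
    """Return data as UTF-8 bytes, re-encoding from a fallback encoding if needed."""
    try:
        data.decode("utf-8")
        return data  # already valid UTF-8, leave it untouched
    except UnicodeDecodeError:
        pass
    for encoding in fallbacks:
        try:
            return data.decode(encoding).encode("utf-8")
        except UnicodeDecodeError:
            continue  # this guess did not fit, try the next one
    raise ValueError("could not decode data with any candidate encoding")


fixed = to_utf8(b"148,S\x86TTERLIN Jasha,MOV,MOVISTAR TEAM")
print(fixed.decode("utf-8"))
```

Checking for valid UTF-8 first means files that are already correct pass through unchanged, and only the problem files are rewritten.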
from chcsvparser.
I am also facing this with some special characters in ~2-7 MB files on both iOS and OS X.
Choosing the encoding manually sometimes helps and sometimes doesn't.
I also don't have control over the file encoding or structure.
@jomnius's #73 is closely related.
from chcsvparser.