Comments (10)
I can confirm this issue. It looks like the problem started in this commit: d4dd5d1
-    @encoding = raw_encoding(nil) ||
-                ( if encoding = options.delete(:internal_encoding)
-                    case encoding
-                    when Encoding; encoding
-                    else Encoding.find(encoding)
-                    end
-                  end ) ||
-                ( case encoding = options.delete(:encoding)
-                  when Encoding; encoding
-                  when /\A[^:]+/; Encoding.find($&)
-                  end ) ||
+    internal_encoding = Encoding.find(internal_encoding) if internal_encoding
+    if encoding
+      encoding, = encoding.split(":") if encoding.is_a?(String)
+      encoding = Encoding.find(encoding)
+    end
+    @encoding = raw_encoding(nil) || internal_encoding || encoding ||
On current master, the relevant section looks like this:
    # honor the IO encoding if we can, otherwise default to ASCII-8BIT
    internal_encoding = Encoding.find(internal_encoding) if internal_encoding
    external_encoding = Encoding.find(external_encoding) if external_encoding
    if encoding
      encoding, = encoding.split(":", 2) if encoding.is_a?(String)
      encoding = Encoding.find(encoding)
    end
    @encoding = raw_encoding(nil) || internal_encoding || encoding ||
                Encoding.default_internal || Encoding.default_external
Beyond the issue that @ShockwaveNN raised, I also noticed we aren't using the external_encoding argument anywhere.
from csv.
I am still having issues trying to read a CSV string that has a byte order mark prepended to it:
require 'csv'
puts "CSV VERSION #{CSV::VERSION}" # Shows 3.0.0
bom_character = 65_279
contents = "first_name\nRyan".codepoints.unshift(bom_character).pack("U*")
csv = CSV.parse(contents, headers: true, encoding: 'bom|utf-8')
csv.each do |row|
  p row.to_h.keys.first.codepoints
  p "ROW FIRST NAME IS #{row["first_name"]}"
end
This outputs nil for the first name and indicates that the key also contains the BOM. What am I doing wrong here?
from csv.
The BOM handling is for opening a file, not for a string that you pass to parse.
What am I doing wrong here?
You should not reuse a closed issue. Please open a new issue.
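To illustrate the point about BOM handling applying to files rather than strings, here is a minimal sketch of two workarounds. It is not taken from the csv sources: the `BOM` constant and the variable names are made up for illustration, and it assumes the input is UTF-8. The first option strips the BOM from the string before parsing; the second writes the data to a temporary file and lets Ruby's own "r:bom|utf-8" open mode consume the BOM.

```ruby
require "csv"
require "tempfile"

# U+FEFF, the byte order mark (illustrative constant, not part of csv).
BOM = "\uFEFF"
contents = "#{BOM}first_name\nRyan\n"

# Option 1: strip a leading BOM from the in-memory string before parsing.
clean = contents.sub(/\A\uFEFF/, "")
from_string = CSV.parse(clean, headers: true).first["first_name"]

# Option 2: let File.open consume the BOM via the "bom|utf-8" open mode,
# then hand the already-opened IO to CSV.
from_file = nil
Tempfile.create(["bom", ".csv"]) do |file|
  file.write(contents)
  file.flush
  File.open(file.path, "r:bom|utf-8") do |io|
    from_file = CSV.new(io, headers: true).first["first_name"]
  end
end
```

In both cases the header key no longer carries the BOM, so looking up "first_name" should find the value.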
from csv.
Is it good practice to post bugs here, or is it better to post them on https://bugs.ruby-lang.org/ ?
from csv.
@kou, @hsbt:
It looks like the change in the order of operations is what broke things. raw_encoding(nil) returns <Encoding:UTF-8> because @io.external_encoding == <Encoding:UTF-8>
def raw_encoding(default = Encoding::ASCII_8BIT)
  if @io.respond_to? :internal_encoding
    @io.internal_encoding || @io.external_encoding
  elsif @io.is_a? StringIO
    @io.string.encoding
  elsif @io.respond_to? :encoding
    @io.encoding
  else
    default
  end
end
To me, it looks like raw_encoding will always return an encoding, so
raw_encoding(nil) || internal_encoding || encoding || Encoding.default_internal || Encoding.default_external
will never reach internal_encoding.
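The short-circuit described above can be reproduced outside of CSV. The sketch below is a standalone approximation, not the library code: `raw_encoding` is copied from the snippet above but takes the IO as an explicit parameter instead of reading @io, and `requested` is a made-up stand-in for the caller's internal_encoding option.

```ruby
require "stringio"

# Same branches as the raw_encoding shown above, with the IO passed in
# explicitly (an adaptation for this sketch).
def raw_encoding(io, default = Encoding::ASCII_8BIT)
  if io.respond_to?(:internal_encoding)
    io.internal_encoding || io.external_encoding
  elsif io.is_a?(StringIO)
    io.string.encoding
  elsif io.respond_to?(:encoding)
    io.encoding
  else
    default
  end
end

io = StringIO.new("first_name\nRyan".encode("UTF-8"))

# Stand-in for an encoding the caller asked for explicitly.
requested = Encoding.find("ISO-8859-1")

# raw_encoding already returns the IO's encoding, so the || chain
# stops there and never looks at `requested`.
resolved = raw_encoding(io, nil) || requested || Encoding.default_external
```

Here `resolved` ends up as the StringIO's UTF-8 encoding, which is the behavior the comment describes: the explicitly requested encoding is ignored whenever the IO reports one of its own.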
BTW, neither Ruby 2.4.3 nor 2.5.0 recognizes bom|utf-8 as a valid encoding:
irb(main):001:0> RUBY_VERSION
=> "2.4.3"
irb(main):002:0> Encoding.find("bom|utf-8")
ArgumentError: unknown encoding name - bom|utf-8
        from (irb):2:in `find'
        from (irb):2
        from /Users/steven/.rbenv/versions/2.4.3/bin/irb:11:in `<main>'
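As a rough illustration of why that lookup fails and one way around it: "bom|utf-8" is understood by IO open modes such as "r:bom|utf-8", while Encoding.find only accepts plain encoding names. The helper below, `find_encoding_ignoring_bom`, is hypothetical and is not something the csv library provides; it simply drops the "bom|" marker before the lookup.

```ruby
# Hypothetical helper (not part of csv): drop a leading "bom|" marker,
# then look up the remaining encoding name.
def find_encoding_ignoring_bom(name)
  Encoding.find(name.sub(/\Abom\|/i, ""))
end

# The raw name is rejected, as in the irb session above.
lookup_failed =
  begin
    Encoding.find("bom|utf-8")
    false
  rescue ArgumentError
    true
  end

# With the marker removed, the lookup succeeds.
resolved = find_encoding_ignoring_bom("bom|utf-8")
```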
It doesn't look like we use the internal_encoding, encoding, external_encoding arguments. Do we still need them?
from csv.
@ShockwaveNN Thanks for your report. I've fixed it.
You can use here.
from csv.
@kou 4d13339 is so cool!
Could you release a new version including this commit?
I'd like to use this fix in a released version.
from csv.
OK.
Can you update news.md? We can release a new version after we have a release note for the next version in news.md.
from csv.
Of course!
Thanks for giving me the chance to contribute.
I sent a PR for news.md: #36
from csv.
Great!
from csv.