infused/dbf
DBF is a small, fast Ruby library for reading dBase, xBase, Clipper, and FoxPro database files.
Home Page: http://rdoc.info/projects/infused/dbf
License: MIT License
Hi there,
I think there is a slight typo in converters.rb: the lambda's parameter is rc, but the body refers to content.
def self.create_converters
...
:point => lambda { |rc| Point.new *(content.unpack("@4EE"))},
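For reference, a corrected version can be sketched like this (Point here is a stand-in Struct for illustration, since the real class lives in the gem):

```ruby
# Hypothetical sketch of the fixed converter: the block parameter rc is
# the raw value, so the body should unpack rc, not content.
Point = Struct.new(:x, :y)
point_converter = lambda { |rc| Point.new(*rc.unpack("@4EE")) }

# "@4EE" skips 4 header bytes, then reads two little-endian doubles:
raw = "\x00" * 4 + [1.5, -2.25].pack("EE")
point = point_converter.call(raw)
point.x  # => 1.5
point.y  # => -2.25
```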
In the supported data types documentation, the data type M (Memo) appears not to be supported for Version 30 - Visual FoxPro (the table shows a - instead of a Y).
However, using your library I successfully read a .dbf file identified as version=30 along with the corresponding .fpt, and the memo field values were imported OK. Thank you for that! 😄
So, I wondered if the docs should say that this combination is supported.
Add documentation on how to specify encodings.
For an administration job, I got a dbf file in which some column names aren't unique. Field lookups don't seem to work, since the column names are used as keys in the attributes hash.
The CSV export does see all columns correctly, but the rows aren't handled correctly. I suspect the last value of a same-name-column is used for all other columns with the same name.
Here's a minimal test case:
require 'base64'
File.write("sample", Base64.decode64("iw8CBAEAAAChA3QBAAAAAAAAAAAAAAAAAAAAAAAAAABJTVBBQ1RfMQAAAEMA\nAAAAAgAAAAAAAAAAAAAAAAAAAElNUEFDVF8yAAAAQwAAAAAeAAAAAAAAAAAA\nAAAAAAAARE1HX01FTU8AAABNAAAAAAoAAAAAAAAAAAAAAAAAAABEQl9WX0NP\nREUAAEMAAAAABwAAAAAAAAAAAAAAAAAAAFBMQVRFX05PAAAAQwAAAAAKAAAA\nAAAAAAAAAAAAAAAAUExBVEVfU1QAAABDAAAAAAIAAAAAAAAAAAAAAAAAAABW\nX1ZJTgAAAAAAAEMAAAAAGQAAAAAAAAAAAAAAAAAAAFZfQ09ORAAAAAAAQwAA\nAAACAAAAAAAAAAAAAAAAAAAAVl9QUk9EX0RUAABDAAAAAAQAAAAAAAAAAAAA\nAAAAAABWX01PREVMX1lSAEMAAAAAAgAAAAAAAAAAAAAAAAAAAFZfTUFLRUNP\nREUAQwAAAAAMAAAAAAAAAAAAAAAAAAAAVl9NQUtFREVTQwBDAAAAABQAAAAA\nAAAAAAAAAAAAAABWX01PREVMAAAAAEMAAAAAMgAAAAAAAAAAAAAAAAAAAFZf\nVFlQRQAAAAAAQwAAAAACAAAAAAAAAAAAAAAAAAAAVl9CU1RZTEUAAABDAAAA\nABQAAAAAAAAAAAAAAAAAAABWX1RSSU1DT0RFAEMAAAAAFAAAAAAAAAAAAAAA\nAAAAAFRSSU1fQ09MT1IAQwAAAAAUAAAAAAAAAAAAAAAAAAAAVl9NTERHQ09E\nRQBDAAAAABQAAAAAAAAAAAAAAAAAAABWX0VOR0lORQAAAEMAAAAAFAAAAAAA\nAAAAAAAAAAAAAFZfTUlMRUFHRQAAQwAAAAAGAAAAAAAAAAAAAAAAAAAAVl9P\nUFRJT05TAABNAAAAAAoAAAAAAAAAAAAAAAAAAABWX0NPTE9SAAAAAEMAAAAA\nFAAAAAAAAAAAAAAAAAAAAFZfVE9ORQAAAAAATgAAAAABAAAAAAAAAAAAAAAA\nAAAAVl9TVEFHRQAAAABOAAAAAAEAAAAAAAAAAAAAAAAAAABQQUlOVF9DRDEA\nAEMAAAAADwAAAAAAAAAAAAAAAAAAAFBBSU5UX0NEMgAAQwAAAAAPAAAAAAAA\nAAAAAAAAAAAAUEFJTlRfQ0QzAABDAAAAAA8AAAAAAAAAAAAAAAAAAABWX01F\nTU8AAAAAAE0AAAAACgAAAAAAAAAAAAAAAAAAAA0gICAgICAgICAgICAgICAg\nICAgICAgICAgICAgICAgICAgICAgICAgICA0TjY2NjQgIFhYWFhYICAgICAg\nIDJDNFJER0JHM0NSMTY2Nzk1ICAgICAgICAwICAgICAxMk42NjYgICAgICAg\nIERvZGdlICAgICAgICAgICAgICAgQ2FyYXZhbiAgICAgICAgICAgICAgICAg\nICAgICAgICAgICAgICAgICAgICAgICAgICAgIEdyYW5kIFNYVCA0IERSIFBh\nc3NlICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg\nICAgICAgICAgICAgICAgICAgNmN5bCBHYXNvbGluZSAzLjYgICA4MDAwMCAg\nICAgICAgICA2ICAgICAgICAgICAgICAgICAgICAxMiAgICAgICAgICAgICAg\nICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgIDc=\n"))
require 'dbf'
table_from_path = DBF::Table.new("sample")
table_from_file = DBF::Table.new(File.open("sample"))
table_from_stringio = DBF::Table.new(StringIO.new(File.read("sample")))
table_from_path.record(0)["V_MAKEDESC"] == "Dodge"
table_from_file.record(0)["V_MAKEDESC"] == "Dodge"
table_from_stringio.record(0)["V_MAKEDESC"] == "66 Dodge"
I found this bug when extracting files from a zip and wrapping them in a StringIO to pass to DBF. Some fields are read correctly, some are partially correct, and some are seemingly random data. My best guess is that File and StringIO differ in how they rewind or seek to a field, and StringIO needs an explicit seek/rewind that File does not.
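One thing worth checking (an assumption, not a confirmed diagnosis): File.read applies the process's default external encoding, which can tag or transcode the binary DBF bytes, while File.binread returns them untouched. A minimal illustration:

```ruby
require 'stringio'

# File.read may tag bytes with the default external encoding;
# File.binread always returns raw ASCII-8BIT bytes. Wrapping binread
# output in StringIO keeps the table data byte-identical to the file.
File.binwrite("demo.bin", [0x8B, 0x0F, 0x02, 0x04].pack("C*"))
raw = File.binread("demo.bin")
raw.encoding        # ASCII-8BIT (binary)
io = StringIO.new(raw)
io.read(2).bytes    # => [139, 15]
```

So StringIO.new(File.binread("sample")) may behave differently from StringIO.new(File.read("sample")) and is worth trying.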
There appears to be a problem with the gem since the last version bump. I get the following errors when trying to install the gem.
$ gem install dbf
ERROR: Could not find a valid gem 'dbf' (>= 0) in any repository
ERROR: Possible alternatives: dbf
$ gem install dbf -v '2.0.1'
ERROR: Could not find a valid gem 'dbf' (= 2.0.1) in any repository
ERROR: Possible alternatives: dbf
The dbf gem is a prerequisite for both georuby and rgeo. A bundle install provides the following additional information:
Gem::RemoteFetcher::FetchError: bad response Not Found 404 (https://s3.amazonaws.com/production.s3.rubygems.org/gems/dbf-2.0.1.gem)
Here's a test file with binary data in the IMAGEN column:
I expected the date 06/10/17 in the column, but I got #<Date: 2017-10-06 ((2458033j,0s,0n),+0s,2299161j)> in response...
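For what it's worth, that #<Date: ...> output is just the inspect form of a correctly parsed Ruby Date object; you can format it back into the expected string yourself:

```ruby
require 'date'

# The record gives you a Date object; strftime produces the dd/mm/yy
# string (assuming 06/10/17 means day/month/year).
d = Date.new(2017, 10, 6)
d.strftime("%d/%m/%y")  # => "06/10/17"
```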
DBF specifies an exact version dependency on ActiveSupport 3.0.1. This of course breaks if we're trying to run Rails 3.0.3 or any other version. Can't we relax this a bit and say ">= 3.0.1" or "~> 3.0"?
Depends on #33
On table instantiation, dynamically define a record class using the bindata field descriptors. Type casting should then operate on the defined bindata values.
Stop writing to a file in the current directory and output to STDOUT. This makes the command much more versatile.
Dear Team,
I have trouble using DBF with capitalized column names.
How do I fix it?
// in my controller
@itemases = DBF::Table.new("/mnt/g/DATA/SALES/ITEMAS.DBF")
// in views
<%
i=0
@itemases.each do |record|
i+=1
%>
<%=i%>
<%= record.C_ITENO%>
<br>
<%end%>
// result
undefined method `C_ITENO'
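The gem defines record accessors from an underscored version of the column name (the underscored_name values visible in the to_json output elsewhere in this thread suggest this), so record.c_iteno or record.attributes["C_ITENO"] should work where record.C_ITENO does not. The mapping is roughly ActiveSupport-style underscoring:

```ruby
# Approximation of how an uppercase column name maps to its accessor
# (a sketch of ActiveSupport-style underscoring; the gem's exact rule
# may differ). C_ITENO becomes the method name c_iteno.
def underscored(name)
  name.gsub(/([A-Z\d]+)([A-Z][a-z])/, '\1_\2')
      .gsub(/([a-z\d])([A-Z])/, '\1_\2')
      .tr('-', '_')
      .downcase
end

underscored("C_ITENO")  # => "c_iteno"
```

So in the view, <%= record.c_iteno %> (or record.attributes["C_ITENO"]) should avoid the NoMethodError.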
I have encountered .dbf files that have columns with spaces in their names. This causes the eval in define_accessors to generate a syntax error, because the space ends up in both the method name and the ivar.
Problem is here:
https://github.com/infused/dbf/blob/master/lib/dbf/record.rb#L51
I'm not sure of the best way to fix it. Replacing the space with an underscore is quick and easy, but what if another column already has the same name with an underscore instead of a space?
Here's a very small example .dbf with this issue:
http://dl.dropbox.com/u/8491051/bad.dbf
Which results in:
SyntaxError: (eval):2: syntax error, unexpected tIDENTIFIER, expecting keyword_end
@fiber b_id ||= attributes['fiber b_id']
^
from /Users/Ryan/.rvm/gems/ruby-1.9.3-p0/gems/dbf-1.7.2/lib/dbf/record.rb:53:in `class_eval'
from /Users/Ryan/.rvm/gems/ruby-1.9.3-p0/gems/dbf-1.7.2/lib/dbf/record.rb:53:in `block in define_accessors'
from /Users/Ryan/.rvm/gems/ruby-1.9.3-p0/gems/dbf-1.7.2/lib/dbf/record.rb:51:in `each'
Is it possible to properly read a file without having the corresponding memo file? I have the same issue outlined in #64 since I don't have the memo file.
Hi, I need your advice: what is the simplest way to write data to a dbf file?
Hello,
I am trying to read a DBF file with a date ('D') column and it returns nil. I found the method that handles date fields, decode_date in column.rb; the error is that .to_date does not work because the value is a string.
I replaced the method body with "puts value" and it printed "2010 715".
Do you know how I can resolve this issue?
gem version dbf 1.2.7
This is the DBF file:
http://rapidshare.com/files/407109992/file.dbf
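A possible explanation (an assumption based on the printed value): "2010 715" is a %Y%m%d date with a space-padded month, which to_date cannot parse until the padding is normalized:

```ruby
require 'date'

# "2010 715" looks like YYYYMMDD with the month's zero padding replaced
# by a space (an assumption about this particular file). Turning the
# spaces back into zeros makes it parseable.
raw = "2010 715"
date = Date.strptime(raw.tr(" ", "0"), "%Y%m%d")
date  # 2010-07-15
```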
Hi I need this encoding http://en.wikipedia.org/wiki/ISO/IEC_8859-1
Can I do it with your gem?
According to the xBase spec that I found [1] the I type should be an 4 byte little endian integer and is specific to FoxPro. I have a FoxPro database that I'm trying to read and it is failing to work. Unfortunately I cannot give you my dbf file and I'm not sure how to create a test case for you.
[1] http://www.clicketyclick.dk/databases/xbase/format/data_types.html
I am syncing data between a Rails app and a dbf file on a samba share. I had to add the following code:
table_variable.data.close
table_variable.memo.close
in order to eject the share when finished since otherwise the data file is still busy for as long as the ruby process is active (which for a Rails app is hopefully a long time).
You might consider adding a close method which closes the 2 file references which are opened when a new table instance is initialized.
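A sketch of what such a close method might look like (assuming @data and @memo hold the two IO handles opened in initialize, with @memo possibly nil for tables without a memo file):

```ruby
require 'stringio'

# Minimal stand-in showing the suggested Table#close: close both IO
# handles, guarding against tables that have no memo file.
class TableSketch
  def initialize(data_io, memo_io = nil)
    @data = data_io
    @memo = memo_io
  end

  def close
    @data.close
    @memo.close if @memo
  end
end

table = TableSketch.new(StringIO.new("data"), StringIO.new("memo"))
table.close
```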
Move all memo related code into a separate class
dbf -s spec/fixtures/dbase_30.dbf
/opt/local/lib/ruby/1.8/pathname.rb:263: warning: `*' interpreted as argument prefix
Problem with ruby 1.9 only:
dbf-1.2.2/lib/dbf/record.rb:8:in `<class:Record>': undefined method `delegate' for DBF::Record:Class (NoMethodError)
from /usr/lib/ruby/gems/1.9.1/gems/dbf-1.2.2/lib/dbf/record.rb:4:in `<module:DBF>'
from /usr/lib/ruby/gems/1.9.1/gems/dbf-1.2.2/lib/dbf/record.rb:1:in `<top (required)>'
from /usr/lib/ruby/gems/1.9.1/gems/dbf-1.2.2/lib/dbf.rb:17:in `require'
from /usr/lib/ruby/gems/1.9.1/gems/dbf-1.2.2/lib/dbf.rb:17:in `<top (required)>'
It was running fine at some older version...
Thanks ;)
I have a dbf file (09_02_23-nqhq.dbf.zip) which has 8687 rows, but record_count returns 8733. That may be why many methods fail.
filepath = '09_02_23-nqhq.dbf'
table = DBF::Table.new(filepath, nil, 'gbk')
table.record_count # works, returns 8733
table.count # error
table.each {|record| }# error
table.find 8686 # works
table.find 8687 # error
All errors are like the following:
NoMethodError: undefined method `unpack' for nil:NilClass
from /home/vanitas/.gem/ruby/2.3/gems/dbf-3.0.5/lib/dbf/table.rb:282:in `deleted_record?'
from /home/vanitas/.gem/ruby/2.3/gems/dbf-3.0.5/lib/dbf/table.rb:124:in `record'
from /home/vanitas/.gem/ruby/2.3/gems/dbf-3.0.5/lib/dbf/table.rb:113:in `block in each'
from /home/vanitas/.gem/ruby/2.3/gems/dbf-3.0.5/lib/dbf/table.rb:113:in `times'
from /home/vanitas/.gem/ruby/2.3/gems/dbf-3.0.5/lib/dbf/table.rb:113:in `each'
from (irb):18
Not sure whether the dbf file is corrupted, but reading it with Python's dbfread has no problem.
To speed up some imports from a legacy FoxPro DB into a postgresql, I am trying to utilize the parallel gem. But whenever I pass an array of DBF::Record objects to the parallel worker processes, I get the following error:
TypeError (no _dump_data is defined for class StringIO):
/usr/local/bundle/gems/parallel-1.12.1/lib/parallel.rb:66:in `dump'
/usr/local/bundle/gems/parallel-1.12.1/lib/parallel.rb:66:in `work'
/usr/local/bundle/gems/parallel-1.12.1/lib/parallel.rb:382:in `block (4 levels) in work_in_processes'
/usr/local/bundle/gems/parallel-1.12.1/lib/parallel.rb:495:in `with_instrumentation'
/usr/local/bundle/gems/parallel-1.12.1/lib/parallel.rb:381:in `block (3 levels) in work_in_processes'
/usr/local/bundle/gems/parallel-1.12.1/lib/parallel.rb:369:in `loop'
/usr/local/bundle/gems/parallel-1.12.1/lib/parallel.rb:369:in `block (2 levels) in work_in_processes'
/usr/local/bundle/gems/parallel-1.12.1/lib/parallel.rb:206:in `block (2 levels) in in_threads'
Now I know this is not due to your gem; the problem is that the parallel gem relies on Marshal.dump, and Marshal.dump does not work with StringIO, which is the underlying class of DBF::Record data.
Do you have any suggestions for making them work together?
This is the code I try to execute:
slices = dbf_table.each_slice(1023).to_a
def import_in_parallel(slices)
Parallel.each(
-> { slices.pop || Parallel::Stop},
in_processes: [1, Concurrent.processor_count - 1].max
) do |slice|
perform_import_job(slice)
end
end
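One possible workaround (assuming the workers only need the field values, not live Record objects): convert each record to its plain attributes hash before slicing, since hashes marshal cleanly while StringIO does not:

```ruby
require 'stringio'

# Marshal chokes on StringIO but round-trips a plain Hash, so mapping
# records with record.attributes before each_slice should make the
# slices safe to ship to worker processes.
error = begin
  Marshal.dump(StringIO.new("x"))
  nil
rescue TypeError => e
  e.message
end
error  # => "no _dump_data is defined for class StringIO"

copy = Marshal.load(Marshal.dump({ "L_NAME" => "Smith" }))
copy  # => {"L_NAME"=>"Smith"}
```

That is, something along the lines of slices = dbf_table.map(&:attributes).each_slice(1023).to_a (hypothetical; adjust to what perform_import_job expects).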
I have written a rake task to import from a dbf file into my current rails application but have run into a strange bug.
desc "Import brothers"
task :import_brothers => :environment do
bro_db = DBF::Table.new("bro.dbf")
bros = bro_db.find(:all)
bros.each do |brother|
b = Brother.new
b.last_name = brother["L_NAME"] unless brother["L_NAME"].nil?
I receive the error undefined method `get' for nil:NilClass.
The strange part is that I can p bros.first and see that there is a column named L_NAME, and when I created a fresh rails project with the same Gemfile and the same schema for "brother", it ran fine. Any ideas?
I'm sure you're aware of this, but I was working with a table with ~32,000 records (~2MB), and doing lookups by a single field was taking many seconds per query.
I ended up reading the whole table into a Hash keyed on my column of interest and just doing lookups by the hash key. Didn't take long to read in and lookups were near instantaneous. Wouldn't work for much larger tables, but maybe something to consider? I dunno, just wanted to let you know someone was making use of your gem. :-)
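For reference, the hash-index trick described above looks roughly like this (with plain hashes standing in for DBF::Record objects, which also support [] lookup by column name):

```ruby
# Build a one-time index keyed on the lookup column; subsequent
# lookups are O(1) instead of a full table scan per query.
records = [
  { "ID" => "A1", "NAME" => "foo" },
  { "ID" => "B2", "NAME" => "bar" },
]
index = records.each_with_object({}) { |rec, h| h[rec["ID"]] = rec }
index["B2"]["NAME"]  # => "bar"
```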
Move column typecasting into the record class. This is the 1st step of refactoring to use the BinData FieldDescriptors directly and will allow the removal of the old Column::Dbase and Column::Foxpro classes.
For whatever reason, the DBF files I'm working with use a carriage return (\r) instead of a null terminator (\0) to indicate the end of the field descriptor section. So when I attempt to find a record I get something like this...
ruby-1.9.3-p0 :001 > DBF::Table.new('tkempsch.dbf')
=> #<DBF::Table:0x00000100cb55d8 @data=#<File:tkempsch.dbf>, @version="30", @record_count=3103, @header_length=1160, @record_length=192, @encoding_key="03", @encoding="cp1252", @memo=nil>
ruby-1.9.3-p0 :002 > DBF::Table.new('tkempsch.dbf').find :first
SyntaxError: (eval):1: syntax error, unexpected tDOT2
def ..\..\data.dbc
^
from /Users/brent/dbf/lib/dbf/record.rb:54:in `class_eval'
from /Users/brent/dbf/lib/dbf/record.rb:54:in `block in define_accessors'
from /Users/brent/dbf/lib/dbf/record.rb:52:in `each'
from /Users/brent/dbf/lib/dbf/record.rb:52:in `define_accessors'
from /Users/brent/dbf/lib/dbf/record.rb:14:in `initialize'
from /Users/brent/dbf/lib/dbf/table.rb:95:in `new'
from /Users/brent/dbf/lib/dbf/table.rb:95:in `record'
from /Users/brent/dbf/lib/dbf/table.rb:83:in `block in each'
from /Users/brent/dbf/lib/dbf/table.rb:83:in `times'
from /Users/brent/dbf/lib/dbf/table.rb:83:in `each'
from /Users/brent/dbf/lib/dbf/table.rb:264:in `detect'
from /Users/brent/dbf/lib/dbf/table.rb:264:in `find_first'
from /Users/brent/dbf/lib/dbf/table.rb:182:in `find'
from (irb):2
Because the carriage return does not terminate column parsing, the library tries to read the backlink section of the DBF file as a column and to create an accessor method for it on the record. This fails because the file path it is trying to turn into a method name contains dots.
I found a noisy case: when I read a field on a DBF record whose string contains accented characters, for example "Inglés", it returns something like "Ingl�s". I have tried many ways to get the correct string, but the problem seems to be the reading encoding (I'm not an expert; I have been working in Rails for six months). It returns an ASCII version of the string. Can someone help me? Thanks in advance...
The dbf.gemspec includes this code:
if RUBY_VERSION.to_f < 1.9
s.add_dependency 'fastercsv', '~> 1.5.4'
end
I do not believe this works. The gemspec file gets executed when your gem is built, but then a new gemspec actually gets generated and written into the gem itself. So this code is going to decide whether or not to add the dependency, based not on what version of Ruby is running at install time (which is what you want) but on what version of Ruby you are running when you build the gem.
The upshot is that 1.8 users are not getting the correct dependency (because it looks like you are running 1.9 when you are building the gem). They have to add fastercsv explicitly to their Gemfiles.
I do not believe there is a way to set dependencies conditionally based on the environment of the installer. If you want to support 1.8 and you need fastercsv, you probably just need to add it to the gemspec for all users. It's not the end of the world for 1.9 users to require the fastercsv gem.
Really love this library. I have files in Visual FoxPro format with memo fields.
In the docs it says that Memo fields are supported in f5 type tables. I am unable to read memo field data from f5 type tables.
Thanks
Rod
Remove initialize_attributes in favor of lazy loading
Hi and first thanks for this great gem!
I haven't touched DBF files in 30 years, so bear with me 😄 If I open a given file in LibreOffice, there is one row where a column is blank, while parsing it with the gem gives me 0.0 instead.
Is NULL supported at all in dBase files (version "03" specifically)? If so, should the data come back as nil rather than 0.0?
I can provide the file privately if needed!
Thanks!
hi,
When I run bin/dbf (on Debian), I get the following error:
/usr/bin/env: ruby -s: No such file or directory
It seems that env cannot use an interpreter with a switch.
As a temporary workaround, I use #!/usr/bin/ruby -s as shebang.
I know that -s is very convenient to do cheap option parsing. But could you consider porting the script to some standard solution, like optparse?
Thanks!
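A minimal optparse-based replacement could look like this (the flag names below are hypothetical; the real bin/dbf options may differ):

```ruby
require 'optparse'

# Sketch: standard option parsing instead of the ruby -s shebang
# switch, so the script works under /usr/bin/env on Debian.
options = {}
parser = OptionParser.new do |opts|
  opts.banner = "Usage: dbf [options] FILE"
  opts.on("-s", "--schema", "Print the table schema") { options[:schema] = true }
  opts.on("-c", "--csv", "Dump records as CSV") { options[:csv] = true }
end
args = parser.parse(["-s", "table.dbf"])
options  # => {:schema=>true}
args     # => ["table.dbf"]
```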
I've recently encountered a problem with a wrong encoding: the file header says the contents are encoded in X, but the actual encoding is Y. Fortunately, every file belongs to a certain category, and I know the actual encoding used by each category. I solved the problem by monkeypatching DBF::Table#initialize and DBF::Table#get_header_info, adding a parameter to specify the encoding when opening a dbf file. I was going to make a pull request, and then noticed this feature had already been removed from the project (commit c08b766). Would you be interested in adding it back? If so, I'll make the pull request at once.
Hello,
here is my code
socrs = DBF::Table.new(Rails.root.join('fias','socrbase.dbf'))
socr = socrs.find(1)
puts socr.scname
At the command line I see "????????? ??????".
The files are taken from the database of the Federal Information Address System (FIAS) of the Russian Federation - http://fias.nalog.ru/Public/DownloadPage.aspx
The file is not damaged; it reads successfully in DBFViewer.
Maybe the bug is related to the presence of Russian characters?
socr.to_json returns -
{"data":["1 \ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd\ufffd \ufffd\ufffd\ufffd\ufffd\ufffd \ufffd\ufffd 0 "],"columns":[{"encoding":null,"version":"03","decimal":0,"length":5,"type":"C","name":"LEVEL","underscored_name":"level"},{"encoding":null,"version":"03","decimal":0,"length":50,"type":"C","name":"SOCRNAME","underscored_name":"socrname"},{"encoding":null,"version":"03","decimal":0,"length":10,"type":"C","name":"SCNAME","underscored_name":"scname"},{"encoding":null,"version":"03","decimal":0,"length":4,"type":"C","name":"KOD_T_ST","underscored_name":"kod_t_st"}],"version":"03","memo":null,"column_names":["level","socrname","scname","kod_t_st"]}
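The ????? output is the classic symptom of Cyrillic bytes being read with the wrong encoding. FIAS dbf files are typically CP866 (an assumption worth verifying), and the table constructor accepts an encoding as its third argument, as used elsewhere in this thread, e.g. DBF::Table.new(path, nil, 'cp866'). What forcing the right encoding does to raw bytes:

```ruby
# CP866 bytes interpreted correctly round-trip to readable Cyrillic;
# read with the wrong encoding they degrade to ? replacement marks.
raw = "\xAF\xAE\xE1\xA5\xAB\xAE\xAA".b         # CP866 bytes
text = raw.force_encoding("CP866").encode("UTF-8")
text  # => "поселок"
```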
Several people have requested a file I/O only mode for use with large files that will not fit into memory.
Hi,
I am trying to use the in memory database feature, but I get an error:
require 'dbf'
require 'pp'
clients = DBF::Table.new(StringIO.new(File.read("tmp/exports/paciente.dbf")))
client = clients.find(50)
pp client.attributes
# => "/gems/dbf-2.0.3/lib/dbf/record.rb:74:in `init_attribute': undefined method `get' for nil:NilClass (NoMethodError)"
clients = DBF::Table.new("tmp/exports/paciente.dbf")
client = clients.find(50)
pp client.attributes
# => { ...all the attributes... }
Thanks
Is there any way to read .dbt files?
Currently, when parsing invalid dates such as negative numbers, DBF throws an "invalid date" error due to the default behavior of Date.strptime.
Is that the desired behavior for this gem, or should it return nil for invalid dates? It seems that DateTime, for example, is more lenient towards faulty input (https://github.com/infused/dbf/blob/master/lib/dbf/column_type.rb#L71)
Context:
We had a client providing us shape files and due to a bug on their end the shapes started containing negative dates after a point. Even though the columns in question were never used by us, RGeo uses dbf gem to parse every line and this blew up in our face.
We had to hack around this by monkeypatching dbf:
DBF::ColumnType::Date.class_eval do
def type_cast(value)
value =~ /\d{8}/ && ::Date.strptime(value, '%Y%m%d')
rescue
nil
end
end
My table version is "30" and I haven't been able to read any memo fields from it. Also, in this same table, say two columns named "name" and "lastname" have a record with name="leonardo" and lastname="cabeza"; dbf shows me name="leona" and lastname="rdocabeza". Am I doing something wrong? I can send the sample file if you want. Thanks