herrfugbaum / qsv
Process CSV and TSV files with SQL.
License: MIT License
So, I ran a simple SELECT * FROM table on a 28 MB file and got this:
QSV> SELECT * FROM table
<--- Last few GCs --->
[30685:0x336f730] 226079 ms: Scavenge 1381.8 (1422.7) -> 1380.9 (1423.2) MB, 5.4 / 0.0 ms (average mu = 0.314, current mu = 0.284) allocation failure
[30685:0x336f730] 226086 ms: Scavenge 1381.8 (1423.2) -> 1380.9 (1423.7) MB, 4.5 / 0.0 ms (average mu = 0.314, current mu = 0.284) allocation failure
[30685:0x336f730] 226091 ms: Scavenge 1381.9 (1423.7) -> 1381.0 (1424.2) MB, 3.5 / 0.0 ms (average mu = 0.314, current mu = 0.284) allocation failure
<--- JS stacktrace --->
==== JS stack trace =========================================
0: ExitFrame [pc: 0xba9113dbe1d]
1: StubFrame [pc: 0xba9113dd190]
Security context: 0x0a4abec9e6e1 <JSObject>
2: /* anonymous */(aka /* anonymous */) [0x341e99061761] [/media/common/code/projects/github/qsv/node_modules/slice-ansi/index.js:~42] [pc=0xba9119dc292](this=0x3dc8beb026f1 <undefined>,str=0x208746d1b409 <String[64]\: \x1b[32mBecause the jobs ads are right there on Stack Overflow\x1b[39m>,begin=0,end=58)
3: /* ano...
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory
1: 0x8dbaa0 node::Abort() [node]
2: 0x8dbaec [node]
3: 0xad83de v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
4: 0xad8614 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
5: 0xec5c42 [node]
6: 0xec5d48 v8::internal::Heap::CheckIneffectiveMarkCompact(unsigned long, double) [node]
7: 0xed1e22 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [node]
8: 0xed2754 v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [node]
9: 0xed53c1 v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [node]
10: 0xe9e844 v8::internal::Factory::NewFillerObject(int, bool, v8::internal::AllocationSpace) [node]
11: 0x113dfae v8::internal::Runtime_AllocateInNewSpace(int, v8::internal::Object**, v8::internal::Isolate*) [node]
12: 0xba9113dbe1d
Aborted (core dumped)
Unexpected, right?
For smaller files (say, less than 50 MB?), maybe we should have an option to read them entirely into memory and then run operations on them. Would that speed things up? If so, maybe this should be the first question asked when a file is loaded. We can caution the end user that importing into memory might take a long time.
I would really like the convenience of not having to reach for the Shift key when typing keywords. select * from table seems as legit to me as SELECT * FROM table does. And maybe I'm wrong, but I feel that this should be a simple matter of applying lowercase functions to the keywords (tokens?).
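One way to picture the request is a lexer whose keyword patterns carry the case-insensitive flag. The sketch below is not qsv's lexer (its token definitions are not shown here); the keyword table, `classifyWord`, and `tokenize` are hypothetical names, and splitting on whitespace is a simplification.

```javascript
// Hypothetical keyword table. The `i` flag makes each pattern match
// SELECT, select, Select, and so on.
const KEYWORDS = {
  Select: /^select$/i,
  From: /^from$/i,
  Where: /^where$/i,
};

// Return the keyword token type for a word, or 'Identifier' otherwise.
function classifyWord(word) {
  for (const [type, pattern] of Object.entries(KEYWORDS)) {
    if (pattern.test(word)) return type;
  }
  return 'Identifier';
}

// Split a query on whitespace and classify every word.
function tokenize(query) {
  return query
    .trim()
    .split(/\s+/)
    .filter(Boolean)
    .map((text) => ({ type: text === '*' ? 'Star' : classifyWord(text), text }));
}
```

With this approach the original text of each token is preserved, so identifiers such as column names keep their casing while keywords are matched in any case.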
The README is missing whitespace in the WHERE clause examples.
Don't worry, you can't break anything!
Your pull request will get checked by some bots and eventually by a human.
If everything goes well your first pull request will be accepted and merged into the codebase.
CONGRATULATIONS!
If you need help, just comment here or join the chat.
Streams should be helpful to handle larger files.
In Papa Parse, the csv parser used in this project, there is already support for streams built in.
Files could be handled row by row instead of "all or nothing".
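To illustrate what "row by row" means in practice, here is a minimal sketch of an incremental splitter that hands each row to a callback as soon as its line is complete, so only the current partial line stays buffered. `createRowSplitter` is a hypothetical helper, not part of qsv or Papa Parse, and the naive `split` on the delimiter ignores quoted fields, which a real CSV parser has to handle.

```javascript
// Hypothetical helper: calls onRow(fields) for every completed line.
function createRowSplitter(onRow, delimiter = ',') {
  let buffer = '';

  // Skip empty lines and pass the split fields to the callback.
  const emit = (line) => {
    if (line.length > 0) onRow(line.split(delimiter));
  };

  return {
    // Accept the next chunk of text from the file.
    write(chunk) {
      buffer += chunk;
      const lines = buffer.split(/\r?\n/);
      buffer = lines.pop(); // keep the trailing partial line for later
      lines.forEach(emit);
    },
    // Flush whatever is left once the input is exhausted.
    end() {
      emit(buffer);
      buffer = '';
    },
  };
}
```

In Node this could be fed from `fs.createReadStream(path, 'utf8')` by calling `write` on each `data` event and `end` on the `end` event. Papa Parse offers a comparable per-row hook through its `step` callback, which is probably the better fit here since it already handles quoting.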
So I ran the tool against the StackOverflow Survey Results CSV (28 MB file) that looks like this:
And then when I try to run qsv I get the following errors:
qsv$ node bin/qsv.js -p "/home/ankush/Desktop/data.csv" -h
QSV> select * from table;
Lexing Errors detected.
unexpected character: ->;<- at offset: 19, skipped 1 characters.
QSV> Select * from table;
Lexing Errors detected.
unexpected character: ->;<- at offset: 19, skipped 1 characters.
QSV> SELECT * from table;
Lexing Errors detected.
unexpected character: ->;<- at offset: 19, skipped 1 characters.
QSV> SELECT * from Table
Parsing errors detected!
Expecting token of type --> From <-- but found --> 'from' <--
QSV> SELECT * from table
Parsing errors detected!
Expecting token of type --> From <-- but found --> 'from' <--
I'm at a loss to understand why even the basic "SELECT" statement is not being accepted. I've tried several variations, as evident in the snippet, but something seems to be stuck somewhere.
How to resolve this? Happy to provide more info if needed.
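The transcript shows two separate failures: the lexer rejects the trailing ";" (offset 19 is the last character of the 20-character query), and the parser rejects lowercase "from" because it expects the From token. For the first one, a small normalization step before lexing might be enough. `normalizeQuery` below is a hypothetical helper, sketched under the assumption that qsv only ever receives one statement per line.

```javascript
// Hypothetical pre-processing step applied before the query reaches the lexer:
// trims surrounding whitespace and drops any trailing semicolons.
function normalizeQuery(input) {
  return input.trim().replace(/;+\s*$/, '');
}
```

Until something like this lands, the queries in the transcript suggest that typing keywords in uppercase and leaving off the semicolon is the form the parser accepts.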
When passing the CSV path as a string, qsv doesn't seem to recognize the tilde "~" character on Linux (Ubuntu 18.04). Here's the command history that reproduces the bug:
qsv$ node bin/qsv.js -p "~/Desktop/data.csv" -h
{ [Error: ENOENT: no such file or directory, open '~/Desktop/data.csv']
errno: -2,
code: 'ENOENT',
syscall: 'open',
path: '~/Desktop/data.csv' }
(node:16684) UnhandledPromiseRejectionWarning: TypeError [ERR_INVALID_ARG_TYPE]: The "chunk" argument must be one of type string or Buffer. Received type object
at validChunk (_stream_writable.js:258:10)
at WriteStream.Writable.write (_stream_writable.js:292:21)
at start (/media/common/code/projects/github/qsv/bin/qsv.js:23:20)
(node:16684) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). (rejection id: 2)
(node:16684) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
qsv$ node bin/qsv.js -p "/home/ankush/Desktop/data.csv" -h
QSV>
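The shell does not expand "~" inside double quotes, so qsv receives the literal string "~/Desktop/data.csv" and passes it straight to the file system. A minimal sketch of expanding it inside the tool follows; `expandTilde` is a hypothetical helper name, and where it would be called in qsv is an assumption.

```javascript
const os = require('os');
const path = require('path');

// Expand a leading "~" or "~/" to the current user's home directory.
// Other forms, such as "~otheruser/file", are returned unchanged.
function expandTilde(filePath) {
  if (filePath === '~') return os.homedir();
  if (filePath.startsWith('~/')) return path.join(os.homedir(), filePath.slice(2));
  return filePath;
}
```

The second error in the log (the "chunk" argument must be a string or Buffer) looks like a separate problem: the error object itself appears to be written to a stream, so converting it to a string first would likely avoid the unhandled rejection.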
Related issue: #22
On a 28 MB file containing ~27,000 rows, the tool took two minutes and then broke. Even if it didn't break, such response times are unacceptable.
I was thinking that maybe we can add a page-by-page scroll functionality as offered by standard Unix file utilities.
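The paging idea can be sketched as a function that returns one slice of the result set plus the metadata needed for a "page x of y" footer. `paginate` and its parameters are hypothetical; how qsv would read keypresses to move between pages is left out.

```javascript
// Return one page of rows. pageNumber is 1-based and clamped to a valid range.
function paginate(rows, pageSize, pageNumber) {
  const totalPages = Math.max(1, Math.ceil(rows.length / pageSize));
  const page = Math.min(Math.max(1, pageNumber), totalPages);
  const start = (page - 1) * pageSize;
  return { page, totalPages, rows: rows.slice(start, start + pageSize) };
}
```

Rendering only one page at a time would also keep the table formatting work proportional to the page size rather than to the ~27,000 rows in the file.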
Is your feature request related to a problem? Please describe.
Command line users can't see which version of qsv they are using. Also there is no notification if a newer version is available for download.
Describe the solution you'd like
The current version should be exposed by a -v switch.
On startup there should be an automatic version check (once a day).
ToDo
Describe alternatives you've considered
The only way to determine which version you are running is to look into the package.json of the globally installed package. That's not a reasonable option :)
Additional context
There is a small package by Zeit to handle the update check and notification, which includes version caching.
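The two requested behaviours reduce to two small decisions, sketched below. Both function names are hypothetical, accepting `--version` in addition to `-v` is an assumption, and the actual lookup of the latest release (for example through the package mentioned above) is not shown.

```javascript
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// True if the command line arguments ask for the version.
function wantsVersion(argv) {
  return argv.includes('-v') || argv.includes('--version');
}

// True if no check has happened yet, or the last one is at least a day old.
// lastCheckedAt and now are timestamps in milliseconds.
function isUpdateCheckDue(lastCheckedAt, now = Date.now()) {
  return lastCheckedAt == null || now - lastCheckedAt >= ONE_DAY_MS;
}
```

A CLI entry point could call `wantsVersion(process.argv.slice(2))` and print the `version` field of its own package.json, and store the timestamp of the last check so that `isUpdateCheckDue` limits the network request to once a day.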
Describe the bug
If a table has many columns, or if the console window isn't large enough, the tables render poorly.
To Reproduce
Steps to reproduce the behavior:
qsv -p path/to/file
SELECT * FROM test
Expected behavior
Tables should look like tables
Software Versions (please complete the following information):
Additional context
Probably this can be tackled by configuring table correctly
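One possible shape for such a configuration is to shrink the column widths to the terminal width and truncate cells that no longer fit. This is a sketch under assumptions: both helpers are hypothetical, the border accounting (one character per column edge) is a guess at the table layout, and cells containing ANSI color codes would need a length function that ignores the escape sequences.

```javascript
// Shrink column widths proportionally so the rendered table fits the terminal.
// Columns never go below minWidth, so very wide tables can still overflow.
function fitColumnWidths(naturalWidths, terminalWidth, minWidth = 4) {
  const borders = naturalWidths.length + 1; // assumed: one border char per column edge
  const available = terminalWidth - borders;
  const total = naturalWidths.reduce((sum, w) => sum + w, 0);
  if (total <= available) return naturalWidths.slice();
  return naturalWidths.map((w) => Math.max(minWidth, Math.floor((w * available) / total)));
}

// Cut a cell to the given width, marking the cut with an ellipsis.
function truncateCell(text, width) {
  return text.length <= width ? text : text.slice(0, Math.max(0, width - 1)) + '…';
}
```

In Node, `process.stdout.columns` reports the terminal width when output goes to a TTY, so it could serve as the `terminalWidth` argument, with a fallback for piped output where it is undefined.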
The devDependency stryker-jest-runner was updated from 1.3.0 to 1.4.0.
This version is covered by your current version range and after updating it in your project the build failed.
stryker-jest-runner is a devDependency of this project. It might not break your production code or affect downstream projects, but probably breaks your build or test tools, which may prevent deploying or publishing.
There is a collection of frequently asked questions. If those don't help, you can always ask the humans behind Greenkeeper.
Your Greenkeeper Bot