Comments (4)
This is currently the default behavior. qsv reads the file into memory and then parses the CSV into an array of objects. After that, the array gets processed according to the SQL statement.
I think reading files into memory isn't that problematic. If I recall correctly, Node.js supports files up to 2 GB out of the box.
The problem occurs at one of three points: (1) parsing the CSV, (2) executing the SQL statement, or (3) rendering the result into a table. I'm not sure yet whether we are exceeding the size limit of an array or whether there is some kind of memory leak 🤔
Related #22
So I tested on the 28 MB StackOverflow results file, and have the following observations:
select Country from table
produces rapidly scrolling output and finishes in a few seconds (no problems here).
select * from table limit 10
Such queries are also reasonably fast. Even with limit 100, this works fine.
select * from table
causes everything to freeze, and roughly speaking, I saw memory usage jumping from 100 MB to 200 MB to 600 MB.
I have a feeling that your point "3. rendering the result into a table" might just be the bottleneck. I've come across such issues before where displaying to the terminal is slow (because the terminal is buffered, I guess?).
Any idea how this can be examined/confirmed?
You can look into memory usage with the Chrome DevTools, for example from within VS Code, or with ndb as a standalone tool. I'm not really experienced with the memory debugging tools though.
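A lighter-weight check than a full heap profile is to log heap usage around each of the three suspected stages. The sketch below is only an illustration: `measureStage`, `heapUsedMB`, and the stand-in stages are hypothetical and are not qsv's actual code. The only real API it relies on is Node's built-in `process.memoryUsage()`.

```javascript
// Hypothetical helper: report heap usage before and after a pipeline stage.
function heapUsedMB() {
  return process.memoryUsage().heapUsed / 1024 / 1024;
}

function measureStage(label, fn) {
  const before = heapUsedMB();
  const result = fn();
  const after = heapUsedMB();
  // Log to stderr so the measurements do not mix with the table output.
  console.error(`${label}: ${before.toFixed(1)} MB -> ${after.toFixed(1)} MB`);
  return result;
}

// Stand-in data and stages (assumptions, not qsv's real parser or renderer).
const csv = 'Country,Age\nGermany,30\nIndia,25\n';

// Stage 1: parse the CSV into an array of objects (naive split, no quoting).
const rows = measureStage('parse', () => {
  const [header, ...lines] = csv.trim().split('\n');
  const columns = header.split(',');
  return lines.map((line) => {
    const cells = line.split(',');
    return Object.fromEntries(columns.map((name, i) => [name, cells[i]]));
  });
});

// Stage 2: "execute" a query that keeps only the Country column.
const selected = measureStage('execute', () =>
  rows.map((row) => ({ Country: row.Country }))
);

// Stage 3: render every cell with ANSI color codes into one big string.
const rendered = measureStage('render', () =>
  selected.map((row) => `\u001b[32m${row.Country}\u001b[39m`).join('\n')
);
```

The readings are noisy because garbage collection runs at unpredictable times, so this only helps to spot a stage where usage jumps by hundreds of megabytes, which is the scale reported above.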
But what I've seen is that the file's content, in the state before it gets parsed, doesn't get garbage collected. I think that's not a huge issue as long as files are small, because the memory gets allocated only once, so it's not a classic memory leak that keeps growing. Nevertheless, it should be removed.
The other thing I found is a huge collection of strings that are ready to be rendered (ANSI escape codes are already applied). As far as I can tell, this is the point where we're running out of memory, because a lot of ANSI escape sequences are added to the data.
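To get a feel for how much the escape sequences add, here is a small sketch. It assumes each cell is wrapped in one color-on and one color-reset sequence; the exact codes qsv's table renderer emits may differ, and `colorize` as well as the table dimensions are hypothetical.

```javascript
// Wrap a cell in a foreground color sequence.
// \u001b[36m switches to cyan, \u001b[39m resets the foreground color.
function colorize(cell) {
  return `\u001b[36m${cell}\u001b[39m`;
}

const cell = 'Yes';
const styled = colorize(cell);
const overheadPerCell = styled.length - cell.length;

console.log(cell.length);     // 3
console.log(styled.length);   // 13
console.log(overheadPerCell); // 10

// Hypothetical table dimensions, chosen only to show the scale.
const rowCount = 100000;
const columnCount = 100;
const extraChars = overheadPerCell * rowCount * columnCount;

console.log(extraChars); // 100000000
```

Under these assumptions a three-character cell more than quadruples in size, and the styling alone adds 100 million characters before any borders or padding are counted.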
Another good argument for point 3 is that the SELECT clause is currently applied last, so in the case of select country from table the query operated on the complete table in memory without causing any problems. Relevant code.
So to make it work my idea would be to try streaming the results to the terminal instead of trying to dump the complete thing at once.
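A minimal sketch of that idea, assuming rows can be formatted one at a time: read the input line by line, format each line, and write it out immediately while respecting backpressure. `streamRows` and the formatter are hypothetical names, and the comma split ignores quoted fields, so this is not a real CSV parser.

```javascript
const readline = require('readline');
const { once } = require('events');
const { Readable } = require('stream');

// Hypothetical helper: format and write one row at a time instead of
// building the whole rendered table in memory first.
async function streamRows(input, output, formatRow) {
  const lines = readline.createInterface({ input, crlfDelay: Infinity });
  let count = 0;
  for await (const line of lines) {
    // write() returns false when the output buffer is full;
    // wait for 'drain' before producing more output.
    if (!output.write(formatRow(line) + '\n')) {
      await once(output, 'drain');
    }
    count += 1;
  }
  return count; // number of lines written
}

// Naive formatter for illustration only (no support for quoted commas).
const formatRow = (line) => line.split(',').join(' | ');

// Demo with an in-memory stream standing in for a file read stream.
const demoInput = Readable.from(['Country,Age\nGermany,30\nIndia,25\n']);
streamRows(demoInput, process.stdout, formatRow);
```

For a real file, the input would be something like `fs.createReadStream(path)`. Only one line plus the stream buffers is held in memory at a time, but a fixed-width table layout would then need column widths chosen up front or estimated from the first rows.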
I've just created #29 so we can test a version that supports streams. If we get this working we can support files of (theoretically) unlimited size.