Comments (8)
Do we need the regular version of this? It feels like clutter. The idea of minimal verbs is that you can do everything you need with them. Why not just Series.filter(not is_nil(_))?
from explorer.
Just thinking out loud here: I think there's got to be a more elegant way of dealing with lists than cleaving to Polars's API too closely. I'm not sure I like everything about purrr, but I wonder if something like map_depth might help us out here? There's also modify_tree. Something that indicates it's recursive might be useful?
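To make the map_depth idea concrete, here is a minimal pure-Python sketch of the concept, not Explorer or Polars code. It assumes a list series can be modelled as a plain nested Python list; the helper names `map_depth` and `drop_nil` are hypothetical and only mirror the names used in this thread.

```python
from typing import Any, Callable


def map_depth(data: Any, depth: int, fn: Callable[[Any], Any]) -> Any:
    """Apply fn to every value found `depth` levels below `data`.

    depth 0 applies fn to `data` itself; depth 1 applies it to each
    element of `data`, and so on (the convention purrr's map_depth uses).
    """
    if depth < 0:
        raise ValueError("depth must be non-negative")
    if depth == 0:
        return fn(data)
    if not isinstance(data, list):
        raise TypeError(f"expected a list with {depth} level(s) left to descend")
    return [map_depth(item, depth - 1, fn) for item in data]


def drop_nil(values: list) -> list:
    """Hypothetical stand-in for drop_nil on a single, flat list."""
    return [v for v in values if v is not None]


# A list "series" modelled as a nested Python list (an assumption of this sketch).
series = [[None, 1, None, 2], [None], [3, 4]]

# Depth 1 applies drop_nil inside each row instead of to the series itself.
cleaned = map_depth(series, 1, drop_nil)
print(cleaned)  # [[1, 2], [], [3, 4]]
```

With an explicit depth argument, the same verb can target either the series (depth 0) or the lists inside it (depth 1) without needing a separate list-specific function.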
@cigrainger you are right, we don't need drop_nil at the root. However, the issue presented here also applies to many of the aggregate functions. How do we distinguish between max of a list and max of the series?
map_depth/modify_tree is definitely interesting. Implementing it is a bit less trivial. We would need to introduce some sort of LazyDepthSeries that collects operations on a series as they relate to a struct or list field, and then translates them to Polars. Do we want to go down this route?
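The ambiguity between max of a list and max of the series can be illustrated with a small pure-Python sketch. It is only one possible reading of the two meanings, assuming a series is a plain Python list; `list_max` and `series_max` are hypothetical names, not Explorer or Polars functions.

```python
from typing import List, Optional


def series_max(values: List[Optional[int]]) -> Optional[int]:
    """Aggregate a flat series down to one value, ignoring nils."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


def list_max(rows: List[Optional[List[Optional[int]]]]) -> List[Optional[int]]:
    """Row-wise maximum: one result per inner list, so the length is preserved."""
    return [series_max(row) if row is not None else None for row in rows]


rows = [[None, 1, None, 2], [None], [3, 4]]

per_row = list_max(rows)       # one value per row
overall = series_max(per_row)  # a single value for the whole series
print(per_row)  # [2, None, 4]
print(overall)  # 4
```

The first call keeps the series length and the second collapses it, which is why a single unqualified max on a list series is ambiguous.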
Maybe. I need to explore how Polars handles this itself and in py polars.
So putting this a bit to the test with Python Polars:
It seems like nested lists may not be supported? That would fix the recursion problem pretty cleanly.
In [16]: df = pl.DataFrame({"values": [[None, 1, None, 2], [None], [3, 4], [[None, 1], [2], [None]]]})
In [17]: df
Out[17]:
shape: (4, 1)
┌────────────────────┐
│ values             │
│ ---                │
│ list[i64]          │
╞════════════════════╡
│ [null, 1, … 2]     │
│ [null]             │
│ [3, 4]             │
│ [null, null, null] │
└────────────────────┘
@cigrainger all lists need to be nested equally. When it fails to cast to a certain type, it returns null instead of raising.
@cigrainger Also, in case you missed it, there was an interesting saga of us discovering what Polars was doing WRT nested lists here:
Some take-aways:
- The first element of a nested list is the tie-breaker when the dtype is ambiguous.
- We decided to be more strict with our inference code than py-polars in certain situations.
I did! Thanks @billylanchantin