Comments (23)
I believe the numbers should definitely be displayed in full if there's space, or at least otherwise notify the user that they have been modified. Wasn't one of the founding principles of plyr (and thus the genesis of the tidyverse in general) to not surprise the user (i.e. provide output consistent with input)? If I enter 1000.34 in my data entry, I certainly don't expect to see "1000".
from pillar.
I don’t know if my opinion is worth much, but FWIW, I too find the current behaviour very misleading. It makes it look like there are no non-zero decimals (up to the precision/width used). I’m OK with hiding trailing zeros (up to the precision used), but hiding trailing non-zeros is confusing.
The current behaviour is:
pillar::pillar(c(1000.34, 1000, 0.34567))
#> <dbl>
#> 1000
#> 1000
#> 0.346
I would be happy with this being rendered as either
#> <dbl>
#> 1000.34
#> 1000
#> 0.346
or
#> <dbl>
#> 1000.340
#> 1000.000
#> 0.346
But perhaps dropping the decimals could be restricted to integers (defined as numbers `x` where `x == round(x)`), e.g.:
pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#> <dbl>
#> 1000.340
#> 1000.000
#> 1000
#> 0.346
or (preferably?)
pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#> <dbl>
#> 1000.34
#> 1000.
#> 1000
#> 0.346
That is, omitting the `.` indicates real integers. Or, in other words, having a decimal point is the formatting function telling the user ‘there is something after the decimal point – even though I might not display it (due to lack of space/precision)’.
from pillar.
I like the idea of using a trailing `.` to indicate that there's more there.
from pillar.
This is what I get when I don't specify any option:
> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
<dbl>
1000.
1000.
1000.
0.346
and now if I run
> options(pillar.sigfig=10)
> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
<dbl>
1.000340000e+3
1.000000078e+3
1.000000000e+3
3.456700000e-1
Damn... I just want to see my full number `1000.000078`.
Let's try again
> options(pillar.sigfig=5)
> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
<dbl>
1000.3
1000.0
1000.0
0.34567
which is still rounding my numbers :(
How can I disable this rounding + forced scientific formatting altogether? Again, rounding numbers like this is misleading and dangerous (if enabled by default). Perhaps some users may like that, but I am pretty sure most people won't.
Please let me know
Thanks!
from pillar.
I'm glad that `pillar.sigfig = 7` works for you:
data.frame(x = 1000.000078)
#> x
#> 1 1000
sprintf("%.23f", 1000.000078)
#> [1] "1000.00007800000003044260666"
Created on 2018-03-01 by the reprex package (v0.2.0).
from pillar.
I was the one who proposed the ‘trailing decimal point’ feature, but FWIW, I’m not happy with the way it has been implemented. The idea was to use the dot to indicate that ‘there is more here, but we’re not displaying all of it (because of lack of space)’. But the way it’s implemented is to add a trailing decimal dot for all `double` numbers, regardless of whether they are integers (i.e. `x %% 1 == 0`, or `x == round(x)`).
So now only `integer` values are shown without a dot. I don’t think that’s useful, and it clutters the display of tibbles. To see if a number is an `integer` or a `double`, it’s enough to look at the column header, so the extra dot doesn’t add any information. And, at least in my experience, it’s very common that integer values are stored in `numeric` (`double`) columns.
I still think the original idea made sense. It’s useful to see if a number is an integer (not necessarily an `integer`) or if it has been truncated for display purposes. Having a trailing `.` shown only for truncated numbers (`x %% 1 != 0`) would give this information, and would make it easy to spot hard-to-find floating-point related issues (e.g. code that assumes that `(.1 + .2) * 10` produces the number 3, something it doesn’t (it produces a number slightly larger than 3), but which R by default hides from you).
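The floating-point point above can be checked in plain base R. This is only a small sketch for illustration; it does not involve pillar, and the variable name `x` is arbitrary.

```r
# Base R only; pillar is not involved here.
x <- (0.1 + 0.2) * 10

print(x)                 # default printing rounds to 7 significant digits
x == 3                   # FALSE: the stored value is not exactly 3
x > 3                    # TRUE: it is slightly larger than 3
x %% 1 != 0              # TRUE: a (tiny) fractional part remains
sprintf("%.20f", x)      # shows the digits that default printing hides
isTRUE(all.equal(x, 3))  # TRUE: comparison with a numeric tolerance
```

Under the `x %% 1 != 0` rule, such a value would get the trailing dot, which is exactly the kind of hint the comment is asking for.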
from pillar.
One last thought. Cleveland's seminal work on visualizing data led to many improvements in graphing parameters and paradigms. The excellent lattice and ggplot2 packages make use of many of his concepts. Similarly, Brewer's extensive work in cartography and color theory guides optimal use of color in visualizations. I wonder if there exists some cognitive research on effective presentation of tabular data? If not, perhaps there's something for data that's analogous to the Chicago or Oxford Manuals of Style that could guide default format choices?
from pillar.
@randomgambit I think it's wholly unfair to have folks need an understanding of floating point approximations in the beautification of tibble output. There's only one mention of floating points in the entirety of http://r4ds.had.co.nz/ and that's as wooly as possible.
from pillar.
Yes
from pillar.
Maybe we could print them if there's enough space?
from pillar.
It was a deliberate choice. Maybe it's worth rethinking, as it does seem a bit arbitrary to not display digits when space is available, and sigfigs are highlighted using colour so it's still scannable.
from pillar.
@dpeterson71 what do you expect `sqrt(2) ^ 2` to print?
from pillar.
In this case, `sqrt(2)^2` should be just `2`, as in base R. I would expect `sqrt(2)` to provide the precision I've requested via base R's `digits` option. That's the crux of the problem, though, isn't it? The computer doesn't know a priori whether I have entered specific digits (or read them from a manually generated file) or computed something that could potentially be an irrational number.
If the computer is going to modify or change the data I have given it, it should at least have the courtesy to notify me that it has done so rather than blindly dropping information.
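For reference, here is a small base R sketch of the behaviour described above: default printing follows the `digits` setting, while the stored value keeps its full precision. It uses only base R functions; the name `y` is just for illustration.

```r
# Base R only; shows what the `digits` setting changes and what it does not.
y <- sqrt(2)^2

print(y)                      # prints 2 with the default of 7 significant digits
y == 2                        # FALSE: the stored value differs slightly from 2
print(sqrt(2), digits = 10)   # request more significant digits for one call
format(y, digits = 17)        # enough digits to expose the stored value
```

Printing only affects the display; the comparison `y == 2` works on the stored value regardless of `digits`.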
from pillar.
My point is that no floating point number is exact - I don't think it's unreasonable for tibble to not print .34 when it's only a small part of the value.
(BTW I don't like the principle of avoiding surprise; because different things surprise different people based on what they know)
from pillar.
@huftis has two very good suggestions, in my opinion. My personal preference would be the first example, where the entries for doubles are justified:
pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#> <dbl>
#> 1000.340
#> 1000.000
#> 1000
#> 0.346
The last option at least solves part of the issue where the careful observer might notice that the data has been modified by the subtle visual cue of a decimal point with missing digits. However, even though I could eventually learn to deal with that format, it is still harder to read and interpret with the uneven formatting and ragged edges. Our research group would never be allowed to present data that way in a public forum where readability and policy decisions matter.
from pillar.
In my opinion, this is extremely dangerous. I mean, I could honestly lose my job if I think I have 100 in my data frame when I actually have 100.2.
Formatting and color are fun, but this is way beyond that.
from pillar.
How do you like the current output with the decimal dot always printed?
from pillar.
Actually, setting `pillar.sigfig = 7` seems to be a good compromise here. 👍
from pillar.
Interesting. I think it would be worthwhile to educate the user about floating-point approximations here. For example, you could share a link to http://floating-point-gui.de/basic/ on the main tibble page as a reminder/warning.
from pillar.
@martinjhnhadley come on, seriously? Anybody can understand that. The point would be to say - look - you can control the `sigfig` parameter and do all sorts of funny stuff with color/shading. However, keep in mind that there is a physical limit on how accurate a number can be in the computer's memory. The reprex from @krlmlr is a nice example/reminder.
from pillar.
I really like the idea of the dot meaning there is more that we don't see. However, in practice, I will likely set enough significant digits so that I always see a few digits after the decimal point. So that option would not impact me as much as the other ones.
from pillar.
Closing in favor of #105. The dot will be shown only if `x %% 1 != 0`.
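As a rough illustration of that rule, here is a minimal base R sketch. The helper `mark_truncated()` is hypothetical and is not pillar's implementation; it simply rounds to whole numbers for display and appends a dot where a fractional part was dropped.

```r
# Hypothetical helper, NOT part of pillar: display whole numbers only and
# append a trailing "." where a fractional part was hidden (x %% 1 != 0).
mark_truncated <- function(x) {
  has_fraction <- x %% 1 != 0
  shown <- format(round(x), trim = TRUE, scientific = FALSE)
  ifelse(has_fraction, paste0(shown, "."), shown)
}

mark_truncated(c(1000.34, 1000.000078, 1000))
#> [1] "1000." "1000." "1000"
```

Only the exact integer value `1000` is shown without the dot; the other two values signal that digits were hidden.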
from pillar.
This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.
from pillar.