Trailing insignificant digits not printed? about pillar HOT 23 CLOSED

r-lib commented on July 18, 2024

Trailing insignificant digits not printed?

from pillar.

Comments (23)

dpeterson71 commented on July 18, 2024 7

I believe the numbers should definitely be displayed in full if there's space, or at least otherwise notify the user that they have been modified. Wasn't one of the founding principles of plyr (and thus the genesis of the tidyverse in general) to not surprise the user (i.e. provide output consistent with input)? If I enter 1000.34 in my data entry, I certainly don't expect to see "1000".

huftis commented on July 18, 2024 4

I don’t know if my opinion is worth much, but FWIW, I too find the current behaviour very misleading. It makes it look like there are no non-zero decimals (up to the precision/width used). I’m OK with hiding trailing zeros (up to the precision used), but hiding trailing non-zeros is confusing.

The current behaviour is:

pillar::pillar(c(1000.34, 1000, 0.34567))
#>    <dbl>
#> 1000    
#> 1000    
#>    0.346

I would be happy with this being rendered in either of the following two ways:

#>    <dbl>
#> 1000.34    
#> 1000    
#>    0.346

#>    <dbl>
#> 1000.340    
#> 1000.000    
#>    0.346

But perhaps dropping the decimals could be restricted to integers (defined as numbers x where x == round(x)), e.g.:

pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#>    <dbl>
#> 1000.340    
#> 1000.000    
#> 1000    
#>    0.346

or (preferably?)

pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#>    <dbl>
#> 1000.34    
#> 1000.
#> 1000    
#>    0.346

That is, omitting the . indicates real integers. Or, in other words, having a decimal point is the formatting function telling the user ‘there is something after the decimal point – even though I might not display it (due to lack of space/precision)’.

hadley commented on July 18, 2024 2

I like the idea of using a trailing . to indicate that there's more there

randomgambit commented on July 18, 2024 2

This is what I get when I don't specify any option:

> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
   <dbl>
1000.   
1000.   
1000.   
   0.346

and now if I run

> options(pillar.sigfig=10)
> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
         <dbl>
1.000340000e+3
1.000000078e+3
1.000000000e+3
3.456700000e-1

Damn... I just want to see my full number 1000.000078.
Let's try again:

> options(pillar.sigfig=5)
> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
     <dbl>
1000.3    
1000.0    
1000.0    
   0.34567

which is still rounding my numbers :(

How can I disable this rounding + forced scientific formatting altogether? Again, rounding numbers like this is misleading and dangerous (if enabled by default). Perhaps some users may like that, but I am pretty sure most people won't.

Please let me know
Thanks!

krlmlr commented on July 18, 2024 2

I'm glad that pillar.sigfig = 7 works for you:

data.frame(x = 1000.000078)
#>      x
#> 1 1000
sprintf("%.23f", 1000.000078)
#> [1] "1000.00007800000003044260666"

Created on 2018-03-01 by the reprex package (v0.2.0).

huftis commented on July 18, 2024 2

I was the one who proposed the ‘trailing decimal point’ feature, but FWIW, I’m not happy with the way it has been implemented. The idea was to use the dot to indicate that ‘there is more here, but we’re not displaying all of it (because of lack of space)’. But the way it’s implemented is to add a trailing decimal dot for all double numbers, regardless of whether they are integers (i.e. x %% 1 == 0, or x == round(x)).

So now only integer values are shown without a dot. I don’t think that’s useful, and it clutters the display of tibbles. To see if a number is an integer or a double, it’s enough to look at the column header, so the extra dot doesn’t add any information. And, at least in my experience, it’s very common that integer values are stored in numeric (double) columns.

I still think the original idea made sense. It’s useful to see if a number is an integer value (not necessarily of integer type) or if it has been truncated for display purposes. Having a trailing . shown only for truncated numbers (x %% 1 != 0) would give this information, and would make it easy to spot hard-to-find floating-point related issues (e.g. code that assumes that (.1 + .2) * 10 produces the number 3, which it doesn’t: it produces a number slightly larger than 3, but R by default hides this from you).
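
To make the (.1 + .2) * 10 example concrete, here is a minimal base-R sketch. It does not use pillar, and the variable name `x` is only illustrative. It shows that the default print hides the fractional part while `x %% 1 != 0` still detects it:

```r
# Base R only; pillar is not required for this illustration.
x <- (0.1 + 0.2) * 10

print(x)                  # default printing uses 7 significant digits
#> [1] 3
x == 3                    # exact comparison sees the difference
#> [1] FALSE
x %% 1 != 0               # the condition proposed for showing a trailing dot
#> [1] TRUE
sprintf("%.17g", x)       # 17 significant digits reveal the stored value
#> [1] "3.0000000000000004"
isTRUE(all.equal(x, 3))   # tolerance-based comparison treats them as equal
#> [1] TRUE
```

Under the proposal, a value like this would be displayed as `3.` rather than `3`, hinting that something follows the decimal point.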

dpeterson71 commented on July 18, 2024 1

One last thought. Cleveland's seminal work on visualizing data led to many improvements in graphing parameters and paradigms. The excellent lattice and ggplot2 packages make use of many of his concepts. Similarly, Brewer's extensive work in cartography and color theory guides optimal use of color in visualizations. I wonder if there exists some cognitive research on effective presentation of tabular data? If not, perhaps there's something for data that's analogous to the Chicago or Oxford Manuals of Style that could guide default format choices?

charliejhadley commented on July 18, 2024 1

@randomgambit I think it's wholly unfair to have folks need an understanding of floating point approximations in the beautification of tibble output. There's only one mention of floating points in the entirety of http://r4ds.had.co.nz/ and that's as woolly as possible.

hadley commented on July 18, 2024

Yes

krlmlr commented on July 18, 2024

Maybe we could print them if there's enough space?

hadley commented on July 18, 2024

It was a deliberate choice. Maybe it's worth rethinking, as it does seem a bit arbitrary to not display digits when space is available, and sigfigs are highlighted using colour so it's still scannable.

hadley commented on July 18, 2024

@dpeterson71 what do you expect sqrt(2) ^ 2 to print?

dpeterson71 commented on July 18, 2024

In this case, sqrt(2)^2 should be just 2, as in base R. I would expect sqrt(2) to provide the precision I've requested by base-R's digits option. That's the crux of the problem, though, isn't it? The computer doesn't know a priori whether I have entered specific digits (or read them from a manually generated file) or computed something that could potentially be an irrational number.

If the computer is going to modify or change the data I have given it, it should at least have the courtesy to notify me that it has done so rather than blindly dropping information.
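
As a rough illustration of this point in base R (no pillar involved; the `digits` values are arbitrary choices for the example), the default of 7 significant digits hides both the rounding error in sqrt(2)^2 and the hand-entered decimals in 1000.000078:

```r
# Base R only. format() is used instead of options(digits = ...) so that
# no global setting is changed.
computed <- sqrt(2)^2     # result of a computation, carries rounding error
entered  <- 1000.000078   # value typed in by the user

format(computed)          # default 7 significant digits
#> [1] "2"
computed == 2             # the stored value is not exactly 2
#> [1] FALSE
sprintf("%.17g", computed)
#> [1] "2.0000000000000004"

format(entered)           # the entered decimals are dropped at 7 digits
#> [1] "1000"
format(entered, digits = 10)
#> [1] "1000.000078"
```

Both values lose visible digits under the same default, and nothing in the formatted output distinguishes the computed case from the entered one.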

hadley commented on July 18, 2024

My point is that no floating point number is exact - I don't think it's unreasonable for tibble to not print .34 when it's only a small part of the value.

(BTW I don't like the principle of avoiding surprise; because different things surprise different people based on what they know)

dpeterson71 commented on July 18, 2024

@huftis has two very good suggestions, in my opinion. My personal preference would be the first example, where the entries for doubles are justified:

pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#>    <dbl>
#> 1000.340    
#> 1000.000    
#> 1000    
#>    0.346

The last option at least solves part of the issue where the careful observer might notice that the data has been modified by the subtle visual cue of a decimal point with missing digits. However, even though I could eventually learn to deal with that format, it is still harder to read and interpret with the uneven formatting and ragged edges. Our research group would never be allowed to present data that way in a public forum where readability and policy decisions matter.

randomgambit commented on July 18, 2024

In my opinion this is extremely dangerous. I mean, I could honestly lose my job if I think I have 100 in my data frame when I actually have 100.2.

Formatting and color are fun, but this is way beyond that.

krlmlr commented on July 18, 2024

How do you like the current output with the decimal dot always printed?

randomgambit commented on July 18, 2024

Actually, setting pillar.sigfig = 7 seems to be a good compromise here. 👍

randomgambit commented on July 18, 2024

Interesting. I think it would be worthwhile to educate the user about floating-point approximations here. For example, you could share a link to http://floating-point-gui.de/basic/ on the main tibble page as a reminder/warning.

randomgambit commented on July 18, 2024

@martinjhnhadley Come on, seriously? Anybody can understand that. The point would be to say: look, you can control the sigfig parameter and do all sorts of funny stuff with color/shading. However, keep in mind that there is a physical limit on how accurately a number can be stored in the computer's memory. The reprex from @krlmlr is a nice example/reminder.

randomgambit commented on July 18, 2024

I really like the idea of the dot meaning there is more, but we don't see it. However, in practice, I will likely set enough significant digits so that I would always see a few digits in the decimal space. So that option would not impact me as much as the other ones.

krlmlr commented on July 18, 2024

Closing in favor of #105. The dot will be shown only if x %% 1 != 0.
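
A minimal sketch of that condition in base R. The helper name `has_fraction()` is hypothetical and this is not pillar's actual implementation; it only marks which values have a fractional part and would therefore be eligible for a trailing dot when their decimals are hidden:

```r
# Hypothetical helper, not part of pillar: TRUE where a value has a
# non-zero fractional part. Non-finite values (NA, NaN, Inf) yield FALSE.
has_fraction <- function(x) {
  is.finite(x) & x %% 1 != 0
}

x <- c(1000.34, 1000.000078, 1000, 0.34567)
has_fraction(x)
#> [1]  TRUE  TRUE FALSE  TRUE
```

With this check, the double value 1000 would print without a dot, while 1000.34 and 1000.000078 would keep one whenever their decimals do not fit.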

github-actions commented on July 18, 2024

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.
