Comments (23)

dpeterson71 avatar dpeterson71 commented on June 19, 2024 7

I believe the numbers should definitely be displayed in full if there's space, or at least otherwise notify the user that they have been modified. Wasn't one of the founding principles of plyr (and thus the genesis of the tidyverse in general) to not surprise the user (i.e. provide output consistent with input)? If I enter 1000.34 in my data entry, I certainly don't expect to see "1000".

from pillar.

huftis avatar huftis commented on June 19, 2024 4

I don’t know if my opinion is worth much, but FWIW, I too find the current behaviour very misleading. It makes it look like there are no non-zero decimals (up to the precision/width used). I’m OK with hiding trailing zeros (up to the precision used), but hiding trailing non-zeros is confusing.

The current behaviour is:

pillar::pillar(c(1000.34, 1000, 0.34567))
#>    <dbl>
#> 1000    
#> 1000    
#>    0.346

I would be happy with this being rendered as either

#>    <dbl>
#> 1000.34    
#> 1000    
#>    0.346

or

#>    <dbl>
#> 1000.340    
#> 1000.000    
#>    0.346

But perhaps dropping the decimals could be restricted to integers (defined as numbers x where x == round(x)), e.g.:

pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#>    <dbl>
#> 1000.340    
#> 1000.000    
#> 1000    
#>    0.346

or (preferably?)

pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#>    <dbl>
#> 1000.34    
#> 1000.
#> 1000    
#>    0.346

That is, omitting the . indicates real integers. Or, in other words, having a decimal point is the formatting function telling the user ‘there is something after the decimal point – even though I might not display it (due to lack of space/precision)’.
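
To make the proposal concrete, here is a minimal base R sketch of the "trailing dot" rule. The helper name `format_with_dot` and its `sigfig` argument are hypothetical and are not part of pillar; the sketch applies a strict significant-digit budget, so it does not reproduce the wider layouts shown above, and it does not align the column.

```r
# Hypothetical helper (not pillar's implementation): format doubles to
# `sigfig` significant digits and append "." when a non-zero fractional
# part exists but no decimals are displayed.
format_with_dot <- function(x, sigfig = 3) {
  # Number of digits in the integer part (0 for values below 1).
  int_digits <- ifelse(abs(x) >= 1, nchar(sprintf("%.0f", trunc(abs(x)))), 0)
  # Decimals that still fit into the significant-digit budget.
  dec_digits <- pmax(sigfig - int_digits, 0)
  vapply(seq_along(x), function(i) {
    shown <- formatC(x[i], format = "f", digits = dec_digits[i])
    # The dot signals "there is more after the decimal point".
    hidden <- dec_digits[i] == 0 && x[i] %% 1 != 0
    if (hidden) paste0(shown, ".") else shown
  }, character(1))
}

format_with_dot(c(1000.34, 1000.000078, 1000, 0.34567))
```

With this rule, only the whole number 1000 is printed without a dot, while 1000.34 and 1000.000078 both get one.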

hadley avatar hadley commented on June 19, 2024 2

I like the idea of using a trailing . to indicate that there's more there

randomgambit avatar randomgambit commented on June 19, 2024 2

This is what I get when I don't specify any options:

> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
   <dbl>
1000.   
1000.   
1000.   
   0.346

and now if I run

> options(pillar.sigfig=10)
> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
         <dbl>
1.000340000e+3
1.000000078e+3
1.000000000e+3
3.456700000e-1

Damn... I just want to see my full number 1000.000078.
Let's try again:

> options(pillar.sigfig=5)
> pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
     <dbl>
1000.3    
1000.0    
1000.0    
   0.34567

which is still rounding my numbers :(

How can I disable this rounding and the forced scientific formatting altogether? Again, rounding numbers like this is misleading and dangerous (if enabled by default). Perhaps some users like it, but I am pretty sure most people won't.

Please let me know
Thanks!

krlmlr avatar krlmlr commented on June 19, 2024 2

I'm glad that pillar.sigfig = 7 works for you:

data.frame(x = 1000.000078)
#>      x
#> 1 1000
sprintf("%.23f", 1000.000078)
#> [1] "1000.00007800000003044260666"

Created on 2018-03-01 by the reprex package (v0.2.0).
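
As a small base R illustration of the same point (a sketch that uses only `format()` and `sprintf()`, nothing from pillar): with 7 significant digits the decimals of this value are rounded away, asking for more digits brings them back, and asking for far more exposes the binary approximation.

```r
x <- 1000.000078

# 7 significant digits (base R's default) are not enough to show the decimals.
format(x, digits = 7)

# 10 significant digits recover the value as it was typed.
format(x, digits = 10)

# Far more digits than a double can hold reveal the stored approximation.
sprintf("%.23f", x)
```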

huftis avatar huftis commented on June 19, 2024 2

I was the one who proposed the ‘trailing decimal point’ feature, but FWIW, I’m not happy with the way it has been implemented. The idea was to use the dot to indicate that ‘there is more here, but we’re not displaying all of it (because of lack of space)’. But the way it’s implemented is to add a trailing decimal dot for all double numbers, regardless of whether they are integers (i.e. x %% 1 == 0, or x == round(x)).

So now only integer values are shown without a dot. I don’t think that’s useful, and it clutters the display of tibbles. To see if a number is an integer or a double, it’s enough to look at the column header, so the extra dot doesn’t add any information. And, at least in my experience, it’s very common that integer values are stored in numeric (double) columns.

I still think the original idea made sense. It’s useful to see whether a number is a whole number (not necessarily stored as an integer type) or whether it has been truncated for display purposes. Having a trailing . shown only for truncated numbers (x %% 1 != 0) would give this information, and would make it easier to spot hard-to-find floating-point issues, e.g. code that assumes that (.1 + .2) * 10 produces the number 3. It doesn’t (it produces a number slightly larger than 3), but R by default hides this from you.
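
The floating-point example can be checked in a few lines of base R. This is only an illustration of the arithmetic, not of how pillar decides what to print.

```r
x <- (0.1 + 0.2) * 10

# Default formatting hides the difference from 3.
format(x)

# The exact comparison and the proposed `%% 1` test both detect it.
x == 3
x %% 1 != 0

# A tolerance-based comparison treats the two values as equal.
isTRUE(all.equal(x, 3))

# Requesting 17 significant digits makes the excess visible.
format(x, digits = 17)
```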

dpeterson71 avatar dpeterson71 commented on June 19, 2024 1

One last thought. Cleveland's seminal work on visualizing data led to many improvements in graphing parameters and paradigms. The excellent lattice and ggplot2 packages make use of many of his concepts. Similarly, Brewer's extensive work in cartography and color theory guides optimal use of color in visualizations. I wonder if there exists some cognitive research on effective presentation of tabular data? If not, perhaps there's something for data that's analogous to the Chicago or Oxford Manuals of Style that could guide default format choices?

charliejhadley avatar charliejhadley commented on June 19, 2024 1

@randomgambit I think it's wholly unfair to have folks need an understanding of floating point approximations in the beautification of tibble output. There's only one mention of floating points in the entirety of http://r4ds.had.co.nz/ and that's as woolly as possible.

hadley avatar hadley commented on June 19, 2024

Yes

krlmlr avatar krlmlr commented on June 19, 2024

Maybe we could print them if there's enough space?

hadley avatar hadley commented on June 19, 2024

It was a deliberate choice. Maybe it's worth rethinking, as it does seem a bit arbitrary to not display digits when space is available, and sigfigs are highlighted using colour so it's still scannable.

hadley avatar hadley commented on June 19, 2024

@dpeterson71 what do you expect sqrt(2) ^ 2 to print?

dpeterson71 avatar dpeterson71 commented on June 19, 2024

In this case, sqrt(2)^2 should be just 2, as in base R. I would expect sqrt(2) to be shown with the precision I've requested via base R's digits option. That's the crux of the problem, though, isn't it? The computer doesn't know a priori whether I have entered specific digits (or read them from a manually generated file) or computed something that could potentially be an irrational number.

If the computer is going to modify or change the data I have given it, it should at least have the courtesy to notify me that it has done so rather than blindly dropping information.
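
For reference, a short base R sketch of what happens with `sqrt(2)^2`: the default 7 significant digits print it as 2, although the stored value is not exactly 2. The variable name `y` is only for illustration.

```r
y <- sqrt(2)^2

# Formatted with 7 significant digits (base R's default), this looks like 2.
format(y, digits = 7)

# The stored value nevertheless differs from 2 by a tiny amount.
y == 2
y - 2

# More significant digits make the difference visible.
format(y, digits = 17)
```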

hadley avatar hadley commented on June 19, 2024

My point is that no floating point number is exact - I don't think it's unreasonable for tibble to not print .34 when it's only a small part of the value.

(BTW I don't like the principle of avoiding surprise; because different things surprise different people based on what they know)

dpeterson71 avatar dpeterson71 commented on June 19, 2024

@huftis has two very good suggestions, in my opinion. My personal preference would be the first example, where the entries for doubles are justified:

pillar::pillar(c(1000.34, 1000.000078, 1000, 0.34567))
#>    <dbl>
#> 1000.340    
#> 1000.000    
#> 1000    
#>    0.346

The last option at least solves part of the issue where the careful observer might notice that the data has been modified by the subtle visual cue of a decimal point with missing digits. However, even though I could eventually learn to deal with that format, it is still harder to read and interpret with the uneven formatting and ragged edges. Our research group would never be allowed to present data that way in a public forum where readability and policy decisions matter.

randomgambit avatar randomgambit commented on June 19, 2024

In my opinion this is extremely dangerous. I mean, I could honestly lose my job if I think I have 100 in my data frame when I actually have 100.2.

Formatting and color are fun, but this is way beyond that.

krlmlr avatar krlmlr commented on June 19, 2024

How do you like the current output with the decimal dot always printed?

randomgambit avatar randomgambit commented on June 19, 2024

Actually, setting pillar.sigfig = 7 seems to be a good compromise here. 👍

randomgambit avatar randomgambit commented on June 19, 2024

Interesting. I think it would be worthwhile to educate the user about floating-point approximations here. For example, you could share a link to http://floating-point-gui.de/basic/ on the main tibble page as a reminder/warning.

randomgambit avatar randomgambit commented on June 19, 2024

@martinjhnhadley come on, seriously? Anybody can understand that. The point would be to say: look, you can control the sigfig parameter and do all sorts of funny stuff with color/shading. However, keep in mind that there is a physical limit on how accurately a number can be stored in the computer's memory. The reprex from @krlmlr is a nice example/reminder.

randomgambit avatar randomgambit commented on June 19, 2024

I really like the idea of the dot meaning there is more that we don't see. However, in practice, I will likely set enough significant digits that I always see a few digits after the decimal point. So that option would not impact me as much as the other ones.

krlmlr avatar krlmlr commented on June 19, 2024

Closing in favor of #105. The dot will be shown only if x %% 1 != 0.

github-actions avatar github-actions commented on June 19, 2024

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.
