Comments (3)

christophbaur commented on May 18, 2024

Hey @yvesmauron! Can you take a look at this bug in the next few days?

from catmaply.

yvesmauron commented on May 18, 2024

Hi @SimonSchuepbach

It seems that the value ranges of occupancy per occ_category overlap. Is this intended/possible in the vbz dataset, @christophbaur, or did I miss something?

library(catmaply)  # provides the vbz dataset
library(dplyr)

data("vbz")
df <- vbz[[3]]

df %>% 
  group_by(occ_category) %>% 
  summarize(
    min_occupancy = min(occupancy),
    max_occupancy = max(occupancy)
  )
# A tibble: 4 × 3
  occ_category min_occupancy max_occupancy
         <int>         <dbl>         <dbl>
1            1           0            31  
2            2          23            61.9
3            3          45.7          93  
4            4          69.2          96.2
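
The overlap can also be checked programmatically. The following is a minimal base R sketch that reuses the min/max values from the summary above (typed in by hand, so it runs without the dataset); the name `overlaps_next` is only illustrative.

```r
# Per-category ranges copied from the summary table above.
ranges <- data.frame(
  occ_category  = 1:4,
  min_occupancy = c(0, 23, 45.7, 69.2),
  max_occupancy = c(31, 61.9, 93, 96.2)
)

# Sort by lower bound, then compare each category's maximum
# with the minimum of the next category.
ranges <- ranges[order(ranges$min_occupancy), ]
n <- nrow(ranges)
overlaps_next <- ranges$max_occupancy[-n] > ranges$min_occupancy[-1]

# TRUE means the category overlaps with the one that follows it.
print(data.frame(
  occ_category  = ranges$occ_category[-n],
  overlaps_next = overlaps_next
))
```

With these numbers, every category overlaps with the next one.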

Nonetheless, there still seems to be a small issue with the current logic when the value ranges are not evenly balanced across categories. For example, evenly distributed z values per category, as shown below, work without issue:

df_test <- tibble(
  x=as.integer(c(1,1,1,1,2,2,2,2)),
  y=as.integer(c(1,2,3,4,1,2,3,4)),
  z=as.integer(c(1,3,5,7,8,6,4,2)),
  z_cat=as.factor(as.integer(c(1,2,3,4,4,3,2,1)))
)

df_test %>%
  group_by(z_cat) %>%
  summarize(
    min_z = min(z),
    max_z = max(z),
    category_range = max(z) - min(z)
  )

# A tibble: 4 × 4
  z_cat min_z max_z category_range
  <fct> <int> <int>          <int>
1 1         1     2              1
2 2         3     4              1
3 3         5     6              1
4 4         7     8              1

catmaply(
  df_test,
  x=x,
  y=y,
  z=z,
  categorical_color_range = TRUE,
  categorical_col = z_cat,
  legend_interactive = FALSE,
  x_range = 2
)

image

However, if the ranges are uneven, the legend is off because the min/max values of z do not align with the legend:

df_test <- tibble(
  x=as.integer(c(1,1,1,1,2,2,2,2)),
  y=as.integer(c(1,2,3,4,1,2,3,4)),
  z=as.integer(c(1,3,5,7,11,6,4,2)),
  z_cat=as.factor(as.integer(c(1,2,3,4,4,3,2,1)))
)

df_test %>%
  group_by(z_cat) %>%
  summarize(
    min_z = min(z),
    max_z = max(z),
    category_range = max(z) - min(z)
  )

# A tibble: 4 × 4
  z_cat min_z max_z category_range
  <fct> <int> <int>          <int>
1 1         1     2              1
2 2         3     4              1
3 3         5     6              1
4 4         7    11              4

catmaply(
  df_test,
  x=x,
  y=y,
  z=z,
  categorical_color_range = TRUE,
  categorical_col = z_cat,
  legend_interactive = FALSE,
  x_range = 2
)

image

We need to investigate the best way to fix this, for example by drawing the legend ranges from the per-category value ranges in the dataset, or by some other approach. Possible solutions will be posted in this thread in the coming weeks.
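
As a rough illustration of the first option, the per-category ranges can be computed from the data and rescaled to the [0, 1] interval that a continuous color scale typically spans. This is only a sketch in base R under that assumption, not catmaply's implementation; `legend_ranges`, `from`, and `to` are hypothetical names.

```r
# Uneven example from above, rebuilt as a plain data.frame.
df_test <- data.frame(
  x = as.integer(c(1, 1, 1, 1, 2, 2, 2, 2)),
  y = as.integer(c(1, 2, 3, 4, 1, 2, 3, 4)),
  z = as.integer(c(1, 3, 5, 7, 11, 6, 4, 2)),
  z_cat = as.factor(as.integer(c(1, 2, 3, 4, 4, 3, 2, 1)))
)

# Observed minimum and maximum of z within each category.
cat_min <- tapply(df_test$z, df_test$z_cat, min)
cat_max <- tapply(df_test$z, df_test$z_cat, max)

legend_ranges <- data.frame(
  z_cat = names(cat_min),
  min_z = as.vector(cat_min),
  max_z = as.vector(cat_max)
)

# Rescale the boundaries to [0, 1] relative to the overall range of z.
z_lo <- min(df_test$z)
z_hi <- max(df_test$z)
legend_ranges$from <- (legend_ranges$min_z - z_lo) / (z_hi - z_lo)
legend_ranges$to   <- (legend_ranges$max_z - z_lo) / (z_hi - z_lo)

print(legend_ranges)
```

In this example category 4 would occupy 0.6 to 1.0 of the scale while the other categories each cover a width of 0.1, with gaps between them where no values were observed. How such gaps should be drawn is one of the open design questions.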


christophbaur commented on May 18, 2024

Hey @yvesmauron!
Yes, the categories (occ_category) can overlap, because in this example they depend on the vehicle. This is intended: for example, a long train with no free seats carries a different total number of passengers than a small bus with no free seats.

Also, the occupancy values are based on "real" measurements, so the computed min/max values per category are essentially arbitrary.

library(catmaply)
library(dplyr)

data("vbz")
df <- vbz[[3]]

df %>% 
  group_by(occ_category, vehicle) %>% 
  summarize(
    min_occupancy = min(occupancy),
    max_occupancy = max(occupancy)
  ) %>%
  ungroup() %>%
  arrange(vehicle)
# A tibble: 8 × 4
  occ_category vehicle min_occupancy max_occupancy
         <int> <fct>           <dbl>         <dbl>
1            1 DGT             0              31  
2            2 DGT            31.0            61.9
3            3 DGT            62.2            93  
4            4 DGT            93.5            96.2
5            1 GT              0.389          22.5
6            2 GT             23              44.2
7            3 GT             45.7            64.2
8            4 GT             69.2            84.1

Adding the category (occ_category) is part of the data preprocessing, and catmaply does not know anything about how the categories are calculated. This is also intended.

The issue reported by @SimonSchuepbach depends on whether legend_interactive is FALSE or TRUE. catmaply should render the correct category regardless of the state of legend_interactive, shouldn't it? Maybe the vbz example is a bit overloaded and tricky because of the overlapping categories.

Let's try with this one.
Please note: each z value has its own category in z_cat, with the same value as its name in z_cat_name. The only difference between the two calls is legend_interactive = TRUE versus legend_interactive = FALSE.

df_test <- tibble(
  x=as.integer(c(1,1,1,1,2,2,2,2)),
  y=as.integer(c(1,2,3,4,1,2,3,4)),
  z=as.integer(c(1,3,5,11,1,3,5,11)),
  z_cat=as.integer(c(1,3,5,11,1,3,5,11)),
  z_cat_name=as.character(c(1,3,5,11,1,3,5,11))
)


catmaply(
  df_test,
  x=x,
  y=y,
  z=z,
  categorical_color_range = TRUE,
  color_palette = viridis::inferno,
  categorical_col = z_cat,
  legend_interactive = TRUE,
  legend_col = z_cat_name,
  x_range = 2
)

'5' is one of the orange colors, as expected:
image

VS.

df_test <- tibble(
  x=as.integer(c(1,1,1,1,2,2,2,2)),
  y=as.integer(c(1,2,3,4,1,2,3,4)),
  z=as.integer(c(1,3,5,11,1,3,5,11)),
  z_cat=as.integer(c(1,3,5,11,1,3,5,11)),
  z_cat_name=as.character(c(1,3,5,11,1,3,5,11))
)


catmaply(
  df_test,
  x=x,
  y=y,
  z=z,
  categorical_color_range = TRUE,
  color_palette = viridis::inferno,
  categorical_col = z_cat,
  legend_interactive = FALSE,
  legend_col = z_cat_name,
  x_range = 2
)

'5' is not among the orange colors, but I would expect it there:
image
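
To make the expectation concrete, the small sketch below compares two ways a value could be placed on a color scale: linearly by its z value, or by the rank of its category. It is purely illustrative arithmetic; whether either mapping matches what catmaply does internally for a given legend_interactive setting is an assumption that has not been verified here.

```r
z_values <- c(1, 3, 5, 11)

# Position on a [0, 1] scale if colors are assigned linearly by z.
linear_pos <- (z_values - min(z_values)) / (max(z_values) - min(z_values))

# Position if colors are assigned by category rank (1st, 2nd, 3rd, 4th).
rank_pos <- (seq_along(z_values) - 1) / (length(z_values) - 1)

print(data.frame(z = z_values, linear_pos = linear_pos, rank_pos = rank_pos))
```

Under these assumptions, '5' sits at 0.4 of the scale in the linear case but at about 0.67 when placed by rank, which would put it in visibly different parts of the inferno palette.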

