
adding ppc_rootogram_grouped function #419

Draft
behramulukir wants to merge 1 commit into master from ppc-rootogram-grouped


@behramulukir
Collaborator

This PR adds the ppc_rootogram_grouped function requested in #377. @mhollanders, if you have any feedback, I am happy to hear it.

ppc_rootogram_grouped simply creates multiple facets (plots) of the rootogram based on the grouping variable the user passes to the function as an argument.

Something to note here is that ppc_bars_grouped uses the force_axes_in_facets function to make sure all facets have axes. This resulted in errors when used with ppc_rootogram_grouped(style = "discrete"), since -Inf becomes NaN under scale_y_sqrt(). It is still used in the other rootogram styles, but so far I haven't managed to get it working for the discrete style. I'll mark this PR as ready once I find a solution to that.
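The NaN can be reproduced outside ggplot2. This is a minimal base R sketch, assuming the square-root scale ends up applying something equivalent to sqrt() to the annotation coordinates; the vector y_coords is made up for illustration.

```r
# Illustrative y coordinates: -Inf marks the lower panel edge.
y_coords <- c(-Inf, 0, Inf)

# sqrt() is undefined for negative input, so -Inf becomes NaN
# (R also emits a "NaNs produced" warning, silenced here).
transformed <- suppressWarnings(sqrt(y_coords))
print(transformed)  # NaN, 0, Inf
```

Only the -Inf coordinate is affected; 0 and Inf survive the transformation.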

Tasks:

  • implement functions
  • implement tests
  • update documentation
  • fixing axes in discrete style

Examples

y <- rpois(100, 20)
yrep <- matrix(rpois(10000, 20), ncol = 100)
group <- gl(2, 50, length = 100, labels = c("GroupA", "GroupB"))

ppc_rootogram_grouped(y, yrep, group, prob = 0.5)

[Image: ppc_rootogram_grouped]

ppc_rootogram_grouped(y, yrep, group, prob = 0.5, style = "discrete")

[Image: ppc_rootogram_grouped_discrete]

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 98.66%. Comparing base (d831b30) to head (16f1338).

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #419      +/-   ##
==========================================
+ Coverage   98.62%   98.66%   +0.04%     
==========================================
  Files          35       35              
  Lines        5798     5830      +32     
==========================================
+ Hits         5718     5752      +34     
+ Misses         80       78       -2     


@behramulukir behramulukir linked an issue Jan 28, 2026 that may be closed by this pull request
@behramulukir behramulukir self-assigned this Jan 28, 2026
@mhollanders

Fantastic, thank you! I will doubtless have a use case for this in the next two weeks, and I will report back if there are any issues.

@mhollanders

@behramulukir, sorry for the basic question, but how do I get to the function? I don't see it when I install the latest development version of bayesplot.

@behramulukir
Collaborator Author

Did you install the correct branch, @mhollanders? As the changes are not merged into the master branch yet, you need to install the branch I am working on. You can do that with devtools::install_github("stan-dev/bayesplot", ref = "ppc-rootogram-grouped").

@mhollanders

Thank you, and pardon my ignorance. It works great!

@behramulukir
Collaborator Author

No worries! I'm glad that you liked it.

@jgabry
Member

jgabry commented Feb 5, 2026

Thanks @behramulukir and thanks @mhollanders for testing it out!

> Something to note here is that ppc_bars_grouped uses the force_axes_in_facets function to make sure all facets have axes. This resulted in errors when used with ppc_rootogram_grouped(style = "discrete"), since -Inf becomes NaN under scale_y_sqrt(). It is still used in the other rootogram styles, but so far I haven't managed to get it working for the discrete style. I'll mark this PR as ready once I find a solution to that.

Does 0 work instead of -Inf? If 0 works here we could add an argument to force_axes_in_facets() to change between -Inf and 0 (leaving the default as -Inf so it stays the same for the other plots).
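One way such an argument could look is sketched below. Everything here is hypothetical: the argument name y_min, the helper axis_segment_coords(), and the function name force_axes_in_facets_sketch() are not part of bayesplot, the segment layout (one horizontal and one vertical line) is an assumption, and colour and linewidth are hard-coded rather than read from the theme.

```r
# Hypothetical helper: endpoints of the two axis segments.
# `y_min` is an assumed argument name; -Inf would keep the current behaviour.
axis_segment_coords <- function(y_min = -Inf) {
  data.frame(
    x    = c(-Inf, -Inf),    # both segments start at the left panel edge
    xend = c(Inf, -Inf),     # first runs along x, second stays at the left
    y    = c(y_min, y_min),  # both segments start at the chosen baseline
    yend = c(y_min, Inf)     # second runs up the panel
  )
}

# Hypothetical variant of force_axes_in_facets() built on the helper.
# Requires ggplot2 (>= 3.4 for `linewidth`) when called.
force_axes_in_facets_sketch <- function(y_min = -Inf) {
  coords <- axis_segment_coords(y_min)
  ggplot2::annotate(
    "segment",
    x = coords$x, xend = coords$xend,
    y = coords$y, yend = coords$yend,
    color = "black", linewidth = 0.5
  )
}

# Coordinates a square-root scale could accept: no negative values.
coords_sqrt <- axis_segment_coords(y_min = 0)
```

Keeping -Inf as the default means callers that do not pass y_min would see no change.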

@mhollanders

It's tangentially related but I might as well piggy-back on to this post: are there any PPC tips for zero-inflated count data? This output is for occupancy models, where zero-inflation is modeled at the level of the site and species, so rare species will have lots of 0s. I'm attaching two plots: one for the aggregated (predicted) counts per species and site (Q), and one for the predicted counts per species, site, and replicate survey (y). The excess zeros make them hard to interpret!

[Images: fig-ppc-Q, fig-ppc-y]

@behramulukir
Collaborator Author

> Does 0 work instead of -Inf? If 0 works here we could add an argument to force_axes_in_facets() to change between -Inf and 0 (leaving the default as -Inf so it stays the same for the other plots).

The issue here is that scale_y_sqrt adds padding below y = 0, so the x-axis line is drawn in the negative y range. The original ppc_rootogram(style = "discrete") does this as well; it is normally fine and looks good. The problem arises when we try to draw the y-axis line into that negative range. Since we cannot assign a negative y value, we can only draw the y-axis line down to y = 0, which leaves a gap between the end of the y-axis line and the x-axis line. Trying different values in force_axes_in_facets() gave results that work in some cases and fail in others. For example, here is a half-working solution I was able to come up with:

force_axes_in_facets_sqrt <- function() {
  thm <- bayesplot_theme_get()
  annotate(
    "segment",
    x = c(-Inf, -Inf), xend = c(0, -Inf),
    y = c(0, 0), yend = c(0, Inf),
    color = thm$axis.line$colour %||% thm$line$colour %||% "black",
    linewidth = thm$axis.line$linewidth %||% thm$line$linewidth %||% 0.5
  )
}

As you can see in the following examples, there is a gap between the y-axis line and the x-axis line, and it fails in cases where we have multiple rows, since it doesn't plot an x-axis.

ppc_rootogram_grouped(y, yrep, group, prob = 0.5, style = "discrete")
ppc_rootogram_grouped(y, yrep, group, prob = 0.5, style = "discrete", facet_args = list(nrow=2))
[Images: plot-76, plot-74]

If we try to fix the issue of the lack of an x-axis line in multiple rows of plots, then we get two x-axis lines:

force_axes_in_facets_sqrt <- function() {
  thm <- bayesplot_theme_get()
  annotate(
    "segment",
    x = c(-Inf, -Inf), xend = c(Inf, -Inf),
    y = c(0, 0), yend = c(0, Inf),
    color = thm$axis.line$colour %||% thm$line$colour %||% "black",
    linewidth = thm$axis.line$linewidth %||% thm$line$linewidth %||% 0.5
  )
}
[Images: plot-78, plot-77]

This is simply because we cannot use any negative value in y = c(0, 0), yend = c(0, Inf). The only viable solution I was able to find is to remove the padding between y = 0 and the x-axis line by using scale_y_sqrt(expand = expansion(mult = c(0, 0.05))).

[Images: plot-80, plot-81]
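The padding arithmetic can be sketched in base R. This assumes ggplot2 applies its default 5% multiplicative expansion on the transformed (square-root) scale; the 0 to 25 count range and the expand_range() helper are made up for illustration.

```r
# Assumed example: counts between 0 and 25 on a square-root axis.
y_range <- c(0, 25)
sqrt_range <- sqrt(y_range)  # 0 to 5 on the transformed scale

# Illustrative helper: widen a range by a fraction of its width on each side.
expand_range <- function(rng, mult) {
  mult <- rep_len(mult, 2)  # allow a single value or c(lower, upper)
  width <- diff(rng)
  c(rng[1] - mult[1] * width, rng[2] + mult[2] * width)
}

# Symmetric 5% padding pushes the lower panel edge below 0,
# where no value on the original count scale can be placed.
default_limits <- expand_range(sqrt_range, 0.05)

# mult = c(0, 0.05) keeps the lower edge at exactly 0.
no_lower_pad <- expand_range(sqrt_range, c(0, 0.05))
```

Under these assumptions the default lower limit is negative on the transformed scale, which is the gap described above, while the asymmetric expansion puts the x-axis line at y = 0.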

If we decide to use this solution, we need either to change the look of the discrete rootogram slightly or to have a small difference between the grouped and normal versions of the discrete rootogram. I am not sure which one is better, though.

@behramulukir
Collaborator Author

> It's tangentially related but I might as well piggy-back on to this post: are there any PPC tips for zero-inflated count data?

Among the existing PPC count plots, I think ppc_rootogram(style = "suspended") is worth trying if you haven't already. Since it plots the differences between expected and observed counts, it might be easier to interpret cases where the observed count is zero and the expected count is non-zero (or vice versa).
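To illustrate the kind of discrepancy such a plot is meant to surface, here is a rough base R sketch on simulated data. The simulation settings, the variable names, and the sign convention (observed minus expected on the square-root scale) are all assumptions for illustration and are not taken from bayesplot's implementation.

```r
set.seed(1)
n <- 200

# Zero-inflated observations: roughly half are structural zeros.
y <- ifelse(rbinom(n, 1, 0.5) == 1, 0L, rpois(n, 3))

# Replicates from a plain Poisson model that ignores the zero inflation.
yrep <- matrix(rpois(50 * n, mean(y)), nrow = 50)

# Frequency of each count value in the observed data.
counts <- 0:max(y, yrep)
observed <- as.numeric(table(factor(y, levels = counts)))

# Expected frequency of each count: average over the replicated data sets.
expected <- rowMeans(
  apply(yrep, 1, function(r) table(factor(r, levels = counts)))
)

# Discrepancy on the square-root scale (sign convention assumed here).
discrepancy <- setNames(sqrt(observed) - sqrt(expected), counts)
round(discrepancy, 2)
```

In this simulation the entry for count 0 should be clearly positive, because the plain Poisson replicates predict far fewer zeros than were observed. With bayesplot installed, ppc_rootogram(y, yrep, style = "suspended") can be called on the same y and yrep.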

You can also check the paper "Visualizing Count Data Regressions Using Rootograms" by Christian Kleiber and Achim Zeileis. It includes an example based on "a well-known data set from ethology, for which excess zeros and, more generally, overdispersion require treatment".

If you find a solution that you think is worth implementing in bayesplot, please let me know! I would be interested in implementing it (unless you want to do it yourself, of course).


Development

Successfully merging this pull request may close these issues.

Feature request: ppc_rootogram_grouped()

4 participants