Countin’ with histograms and pools

dyce provides two core primitives for enumeration¹.

1	`>>> from dyce import H, P`

H objects represent histograms for modeling discrete outcomes. They encode finite discrete probability distributions as integer counts without any denominator. P objects represent pools (ordered sequences) of histograms. If all you need is to aggregate outcomes (sums) from rolling a bunch of dice (or perform calculations on aggregate outcomes), H objects are probably sufficient. If you need to select certain histograms from a group prior to computing aggregate outcomes (e.g., taking the highest and lowest of each possible roll of n dice), that’s where P objects come in.

As a wise person whose name has been lost to history once said: “Language is imperfect. If at all possible, shut up and point.” So with that illuminating (or perhaps impenetrable) introduction out of the way, let’s dive into some examples!

Basic examples

A six-sided die can be modeled as:

1 2	`>>> H(6) H(6)`

H(n) is shorthand for explicitly enumerating outcomes \([{ {1} .. {n} }]\), each with a frequency of 1.

1 2	`>>> H(6) == H({1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1}) True`

Tuples with repeating outcomes are accumulated. A six-sided “2, 3, 3, 4, 4, 5” die can be modeled as:

1 2	`>>> H((2, 3, 3, 4, 4, 5)) H({2: 1, 3: 2, 4: 2, 5: 1})`

A fudge die can be modeled as:

1 2	`>>> H((-1, 0, 1)) H({-1: 1, 0: 1, 1: 1})`

Python’s matrix multiplication operator (@) is used to express the number of a particular die (roughly equivalent to the “d” operator in common notations). The outcomes of rolling two six-sided dice (2d6) are:

1 2	`>>> 2@H(6) H({2: 1, 3: 2, 4: 3, 5: 4, 6: 5, 7: 6, 8: 5, 9: 4, 10: 3, 11: 2, 12: 1})`

A pool of two six-sided dice is:

1 2	`>>> P(H(6), H(6)) P(6, 6)`

Where n is an integer, P(n, ...) is shorthand for P(H(n), ...). Python’s matrix multiplication operator (@) can also be used with pools. The above can be expressed more succinctly.

1 2	`>>> 2@P(6) P(6, 6)`

Pools (in this case, Sicherman dice) can be compared to histograms.

1
2
3

>>> d_sicherman = P(H((1, 2, 2, 3, 3, 4)), H((1, 3, 4, 5, 6, 8)))
>>> d_sicherman == 2@H(6)
True

Both histograms and pools support arithmetic operations. 3×(2d6+4) is:

1 2	`>>> 3*(2@H(6)+4) H({18: 1, 21: 2, 24: 3, 27: 4, 30: 5, 33: 6, 36: 5, 39: 4, 42: 3, 45: 2, 48: 1})`

The results show there is one way to make 18, two ways to make 21, three ways to make 24, etc.

Histograms provide rudimentary formatting for convenience.

>>> print((2@H(6)).format())
avg |    7.00
std |    2.42
var |    5.83
  2 |   2.78% |#
  3 |   5.56% |##
  4 |   8.33% |####
  5 |  11.11% |#####
  6 |  13.89% |######
  7 |  16.67% |########
  8 |  13.89% |######
  9 |  11.11% |#####
 10 |   8.33% |####
 11 |   5.56% |##
 12 |   2.78% |#

The Miwin-Distribution is:

>>> miwin_iii = H((1, 2, 5, 6, 7, 9))
>>> miwin_iv = H((1, 3, 4, 5, 8, 9))
>>> miwin_v = H((2, 3, 4, 6, 7, 8))
>>> miwin_dist = miwin_iii + miwin_iv + miwin_v ; miwin_dist
H({4: 1, 5: 2, 6: 3, 7: 4, 8: 7, ..., 22: 7, 23: 4, 24: 3, 25: 2, 26: 1})
>>> print((miwin_dist).format(scaled=True))
avg |   15.00
std |    4.47
var |   20.00
  4 |   0.46% |##
  5 |   0.93% |#####
  6 |   1.39% |#######
  7 |   1.85% |##########
  8 |   3.24% |##################
  9 |   4.17% |#######################
 10 |   4.63% |##########################
 11 |   5.09% |############################
 12 |   7.87% |############################################
 13 |   8.80% |#################################################
 14 |   8.33% |###############################################
 15 |   6.48% |####################################
 16 |   8.33% |###############################################
 17 |   8.80% |#################################################
 18 |   7.87% |############################################
 19 |   5.09% |############################
 20 |   4.63% |##########################
 21 |   4.17% |#######################
 22 |   3.24% |##################
 23 |   1.85% |##########
 24 |   1.39% |#######
 25 |   0.93% |#####
 26 |   0.46% |##

One way to model the outcomes of subtracting the lesser of two six-sided dice from the greater is:

1 2	`>>> abs(H(6) - H(6)) H({0: 6, 1: 10, 2: 8, 3: 6, 4: 4, 5: 2})`

Arithmetic operations implicitly “flatten” pools into histograms.

>>> 3*(2@P(6)+4)
H({18: 1, 21: 2, 24: 3, 27: 4, 30: 5, 33: 6, 36: 5, 39: 4, 42: 3, 45: 2, 48: 1})
>>> abs(P(6) - P(6))
H({0: 6, 1: 10, 2: 8, 3: 6, 4: 4, 5: 2})

Histograms should be sufficient for most calculations. However, pools are useful for “taking” (selecting) only some of each roll’s outcomes. This is done by providing one or more index arguments to the P.h method or the P.rolls_with_counts method. Indexes can be integers, slices, or a mix thereof. Outcome indexes are ordered from least to greatest with negative values counting from the right, as one would expect (i.e., [0], [1], …, [-2], [-1]). Summing the least two faces when rolling three six-sided dice would be:

>>> 3@P(6)
P(6, 6, 6)
>>> (3@P(6)).h(0, 1)  # see warning below about parentheses
H({2: 16, 3: 27, 4: 34, 5: 36, 6: 34, 7: 27, 8: 19, 9: 12, 10: 7, 11: 3, 12: 1})

Mind your parentheses

Parentheses are needed in the above example because @ has a lower precedence than . and […].

>>> 2@P(6).h(1)  # equivalent to 2@(P(6).h(1))
Traceback (most recent call last):
...
IndexError: tuple index out of range
>>> (2@P(6)).h(1)
H({1: 1, 2: 3, 3: 5, 4: 7, 5: 9, 6: 11})

Taking the least, middle, or greatest face when rolling three six-sided dice would be:

>>> p_3d6 = 3@P(6)
>>> p_3d6.h(0)
H({1: 91, 2: 61, 3: 37, 4: 19, 5: 7, 6: 1})
>>> print(p_3d6.h(0).format())
avg |    2.04
std |    1.14
var |    1.31
  1 |  42.13% |#####################
  2 |  28.24% |##############
  3 |  17.13% |########
  4 |   8.80% |####
  5 |   3.24% |#
  6 |   0.46% |

>>> p_3d6.h(1)
H({1: 16, 2: 40, 3: 52, 4: 52, 5: 40, 6: 16})
>>> print(p_3d6.h(1).format())
avg |    3.50
std |    1.37
var |    1.88
  1 |   7.41% |###
  2 |  18.52% |#########
  3 |  24.07% |############
  4 |  24.07% |############
  5 |  18.52% |#########
  6 |   7.41% |###

>>> p_3d6.h(2)
H({1: 1, 2: 7, 3: 19, 4: 37, 5: 61, 6: 91})
>>> print(p_3d6.h(-1).format())
avg |    4.96
std |    1.14
var |    1.31
  1 |   0.46% |
  2 |   3.24% |#
  3 |   8.80% |####
  4 |  17.13% |########
  5 |  28.24% |##############
  6 |  42.13% |#####################

Summing the greatest and the least faces when rolling a typical six-die polygonal set would be:

>>> d10 = H(10)-1 ; d10  # a common “d10” with faces [0 .. 9]
H({0: 1, 1: 1, 2: 1, 3: 1, 4: 1, 5: 1, 6: 1, 7: 1, 8: 1, 9: 1})
>>> h = P(4, 6, 8, d10, 12, 20).h(0, -1)
>>> print(h.format(scaled=True))
avg |   13.48
std |    4.40
var |   19.39
  1 |   0.00% |
  2 |   0.01% |
  3 |   0.06% |
  4 |   0.30% |#
  5 |   0.92% |#####
  6 |   2.03% |###########
  7 |   3.76% |####################
  8 |   5.57% |##############################
  9 |   7.78% |###########################################
 10 |   8.99% |##################################################
 11 |   8.47% |###############################################
 12 |   8.64% |################################################
 13 |   8.66% |################################################
 14 |   6.64% |####################################
 15 |   5.62% |###############################
 16 |   5.16% |############################
 17 |   5.00% |###########################
 18 |   5.00% |###########################
 19 |   5.00% |###########################
 20 |   5.00% |###########################
 21 |   4.50% |#########################
 22 |   2.01% |###########
 23 |   0.73% |####
 24 |   0.18% |

Pools are ordered and iterable.

1 2	`>>> list(2@P(8, 4, 6)) [H(4), H(4), H(6), H(6), H(8), H(8)]`

Indexing selects particular histograms into a new pool.

>>> 2@P(8, 4, 6)
P(4, 4, 6, 6, 8, 8)
>>> (2@P(8, 4, 6))[:2]
P(4, 4)
>>> (2@P(8, 4, 6))[::3]
P(4, 6)

An inefficient way to enumerate all possible rolls is:

1
2
3

>>> import itertools
>>> list(itertools.product(*P(-3, 3)))
[(-3, 1), (-3, 2), (-3, 3), (-2, 1), (-2, 2), (-2, 3), (-1, 1), (-1, 2), (-1, 3)]

Both histograms and pools support various comparison operations as well as substitution. The odds of observing all even faces when rolling \(n\) six-sided dice, for \(n\) in \([1..6]\) is:

>>> d6_even = H(6).is_even()
>>> for n in range(6, 0, -1):
...   number_of_evens_in_nd6 = n@d6_even
...   all_even = number_of_evens_in_nd6.eq(n)
...   print(f"{n: >2}d6: {all_even[1] / sum(all_even.counts()): >6.2%}")
 6d6:  1.56%
 5d6:  3.12%
 4d6:  6.25%
 3d6: 12.50%
 2d6: 25.00%
 1d6: 50.00%

The odds of scoring at least one nine or higher on any single die when rolling \(n\) “exploding” six-sided dice, for \(n\) in \([1..10]\) is:

>>> # By the time we're rolling a third die, we're guaranteed a nine or higher, so we only need to look that far
>>> exploding_d6 = H(6).explode(max_depth=2)
>>> for n in range(10, 0, -1):
...   d6e_ge_9 = exploding_d6.ge(9)
...   number_of_nines_or_higher_in_nd6e = n@d6e_ge_9
...   at_least_one_9 = number_of_nines_or_higher_in_nd6e.ge(1)
...   print(f"{n: >2}d6-exploding: {at_least_one_9[1] / sum(at_least_one_9.counts()): >6.2%}")
10d6-exploding: 69.21%
 9d6-exploding: 65.36%
 8d6-exploding: 61.03%
 7d6-exploding: 56.15%
 6d6-exploding: 50.67%
 5d6-exploding: 44.51%
 4d6-exploding: 37.57%
 3d6-exploding: 29.77%
 2d6-exploding: 20.99%
 1d6-exploding: 11.11%

Dependent probabilities

Where we can identify independent terms and reduce the dependent term to a calculation solely involving independent terms, dependent probabilities can often be compactly expressed via H.substitute,H.foreach, or P.foreach. First, we express independent terms as histograms or pools. Second, we express the dependent term as a callback function. Finally, we can pass the dependent callback function to H.substitute, H.foreach, or P.foreach, along with the independent terms (as arguments in the cases of the latter two).

H.substitute is well suited to situations with a single independent term that can be expressed as a histogram. Say we want to roll a d6 and compare whether the result is strictly greater than its distance from some constant.

>>> d6 = H(6)  # independent term
>>> constant = 4

>>> def outcome_vs_distance_to_constant(__: H, d6_outcome):
...   return d6_outcome > abs(d6_outcome - constant)  # dependent term

>>> print(d6.substitute(outcome_vs_distance_to_constant).format())
avg |    0.67
std |    0.47
var |    0.22
  0 |  33.33% |################
  1 |  66.67% |#################################

Instead of a constant, let’s use another die as a second independent term. We’ll roll a d4 and a d6 and compare whether the d6 is strictly greater than the absolute difference between dice. For multiple independent terms that can be expressed as histograms, the H.foreach class method is often easiest.

>>> d4 = H(4)  # first independent term
>>> d6 = H(6)  # second independent term

>>> def outcome_vs_difference_dependent_term(d4_outcome, d6_outcome):
...   return d6_outcome > abs(d4_outcome - d6_outcome)  # dependent term

>>> h = H.foreach(outcome_vs_difference_dependent_term, d4_outcome=d4, d6_outcome=d6)
>>> print(h.format())
avg |    0.83
std |    0.37
var |    0.14
  0 |  16.67% |########
  1 |  83.33% |#########################################

In the alternative, one could nest substitution functions, where the innermost holds the dependent term, and the outer functions each establish the scope of their respective independent outcomes. However, this isn’t very readable, and is often less efficient than using H.foreach class method.

>>> def sub_d4(__: H, d4_outcome):
...   def sub_d6(__: H, d6_outcome):
...     return d6_outcome > abs(d4_outcome - d6_outcome)
...   return d6.substitute(sub_d6)

>>> d4.substitute(sub_d4) == h
True

Where the dependent term requires inspection of rolls from one or more pools as independent terms, P.foreach class method is useful. Let’s say we have two pools. A roll from the first pool wins if it shows no duplicates but a roll from the second does. A roll from the second pool wins if it shows no duplicates but a roll from the first does. Otherwise, it’s a tie (i.e., if neither or both rolls show duplicates). Let’s compare how three six-sided dice fair against two four-sided dice.

>>> from enum import IntEnum

>>> class DupeVs(IntEnum):
...   SECOND_WINS = -1  # where second_roll shows no duplicates, but first_roll does
...   TIE = 0  # where both rolls show no duplicates or rolls pools have duplicates
...   FIRST_WINS = 1  # where first_roll shows no duplicates, but second_roll does

>>> def compare_fours(first_roll, second_roll):
...   return DupeVs((len(set(first_roll)) == len(first_roll)) - (len(set(second_roll)) == len(second_roll)))

>>> h = P.foreach(compare_fours, first_roll=P(6, 6, 6), second_roll=P(4, 4)) ; h
H({<DupeVs.SECOND_WINS: -1>: 12,
 <DupeVs.TIE: 0>: 19,
 <DupeVs.FIRST_WINS: 1>: 5})
>>> print(h.format())
avg |   -0.19
std |    0.66
var |    0.43
 -1 |  33.33% |################
  0 |  52.78% |##########################
  1 |  13.89% |######

Visualization

H objects provide a distribution method and a distribution_xy method to ease integration with plotting packages like matplotlib. In addition, anydyce provides additional visualization and interactivity conveniences. (Many of the figures in these docs leverage anydyce in their construction.)

Plot: Distribution for 3d6

Source: plot_histogram.py

# ======================================================================================
# Copyright and other protections apply. Please see the accompanying LICENSE file for
# rights and restrictions governing use of this software. All rights not expressly
# waived or licensed are reserved. If that file is missing or appears to be modified
# from its original, then please contact the author before viewing or using this
# software in any capacity.
# ======================================================================================

from __future__ import annotations

from anydyce.viz import plot_bar

from dyce import H


def do_it(style: str) -> None:
    import matplotlib.pyplot

    ax = matplotlib.pyplot.axes()
    text_color = "white" if style == "dark" else "black"
    ax.tick_params(axis="x", colors=text_color)
    ax.tick_params(axis="y", colors=text_color)
    plot_bar(ax, [("", 3 @ H(6))])
    ax.set_title("Distribution for 3d6", color=text_color)

Time to get meta-evil on those outcomes!

Thanks to numerary, dyce offers best-effort support for arbitrary number-like outcomes, including primitives from symbolic expression packages such as SymPy.

>>> import sympy.abc
>>> d6x = H(6) + sympy.abc.x
>>> d8y = H(8) + sympy.abc.y
>>> P(d6x, d8y, d6x).h()
H({2*x + y + 3: 1, 2*x + y + 4: 3, 2*x + y + 5: 6, ..., 2*x + y + 18: 6, 2*x + y + 19: 3, 2*x + y + 20: 1})

Note

Be aware that, depending on implementation, performance can suffer quite a bit when using symbolic primitives.

For histograms and pools, dyce remains opinionated about ordering. For non-critical contexts where relative values are indeterminate, dyce will attempt a “natural” ordering based on the string representation of each outcome. This is to accommodate symbolic expressions whose relative values are often unknowable.

>>> expr = sympy.abc.x < sympy.abc.x * 3 ; expr
x < 3*x
>>> bool(expr)  # nope
Traceback (most recent call last):
  ...
TypeError: cannot determine truth value of Relational

SymPy does not even attempt simple relative comparisons between symbolic expressions, even where they are unambiguously resolvable. Instead, it relies on the caller to invoke its proprietary solver APIs.

>>> bool(sympy.abc.x < sympy.abc.x + 1)
Traceback (most recent call last):
  ...
TypeError: cannot determine truth value of Relational
>>> import sympy.solvers.inequalities
>>> sympy.solvers.inequalities.reduce_inequalities(sympy.abc.x < sympy.abc.x + 1, [sympy.abc.x])
True

dyce, of course, is happily ignorant of all that keenness. (As it should be.) In practice, that means that certain operations won’t work with symbolic expressions where correctness depends on ordering outcomes according to relative value (e.g., dice selection from pools).

Flattening pools works.

>>> d3x = H(3) * sympy.abc.x ; d3x
H({2*x: 1, 3*x: 1, x: 1})
>>> p = P(d3x / 3, (d3x + 1) / 3, (d3x + 2) / 3)
>>> p.h()
H({2*x + 1: 7, 3*x + 1: 1, 4*x/3 + 1: 3, 5*x/3 + 1: 6, 7*x/3 + 1: 6, 8*x/3 + 1: 3, x + 1: 1})

Selecting the “lowest” die doesn’t.

>>> p.h(0)
Traceback (most recent call last):
  ...
TypeError: cannot determine truth value of Relational

Selecting all dice works, since it’s equivalent to flattening (no sorting is required).

1 2	`>>> p.h(slice(None)) H({2x + 1: 7, 3x + 1: 1, 4x/3 + 1: 3, 5x/3 + 1: 6, 7x/3 + 1: 6, 8x/3 + 1: 3, x + 1: 1})`

Enumerating rolls doesn’t, even where there is no selection, because each roll’s outcomes are sorted least-to-greatest.

>>> list(p.rolls_with_counts())
Traceback (most recent call last):
  ...
TypeError: cannot determine truth value of Relational

P.roll “works” (i.e., falls back to natural ordering of outcomes), but that is a deliberate compromise of convenience.

1 2	`>>> p.roll() # doctest: +SKIP (2x/3, 2x/3 + 1/3, x/3 + 2/3)`

P.umap can help pave the way back to concrete outcomes.

>>> f = lambda outcome: outcome.subs({sympy.abc.x: sympy.Rational(1, 3)})
>>> p.umap(f)
P(H({1/9: 1, 2/9: 1, 1/3: 1}), H({4/9: 1, 5/9: 1, 2/3: 1}), H({7/9: 1, 8/9: 1, 1: 1}))
>>> p.umap(f).h(-1)
H({7/9: 9, 8/9: 9, 1: 9})

Further exploration

Consider delving into some applications and translations for more sophisticated examples, or jump right into the API.

Anywhere you see a Binder logo Binder , you can click on it to immediately start tinkering with a temporal instance of that example using anydyce. Just be aware that Binder will eventually delete any changes, so make sure to download any notebooks you want to preserve.

dyce also provides additional primitives (R objects and their kin) which are useful for producing weighted randomized rolls without the overhead of enumeration. These are covered seperately. ↩