Timothy McCarthy

I’ve written a library called intime, which provides exhaustive integration between the classes in the java.time library and some common Scala libraries. The most interesting problem I encountered was defining an Ordering (or an Order for Cats) for the java.time.Period class. Because months and years cannot simply be expressed as a number of days, two periods cannot always be compared. This post will discuss those issues.

The Period class

The java.time package is one of my favourite APIs to work with. Using these classes when they were released along with Java 8 taught me most of what I know about working with dates and times. The first time I tried to convert the truly awful java.util.Date class to LocalDateTime was a formative programming experience. Maybe I’ll write more about the lessons I learned from that experience in another post.

Among the key improvements that came with the java.time classes was the ability to represent “amounts” of time in a semantically clear way. This came through two classes:

  • Duration, which is basically a wrapper around an amount of elapsed nanoseconds, and
  • Period, which represents a number of elapsed days, months and years.

Duration is convenient, but fundamentally is a pretty simple class. Two given points on the “time line” will always be separated by a number of seconds/nanoseconds. When it came time for me to define integrations for Duration for my intime library, they were pretty simple.

Period is an entirely different beast. Whereas the idea of a “day” is pretty simple in this context, the fact that a Period is also composed of “months” and “years” introduces lots of complexity. Exactly how long a month or a year is depends on what your calendar rules are, and where you sit in history.

In the end, the rules for combining two Period instances (i.e. defining a Semigroup) were pretty simple. The additive operation is provided by Period.plus, and Period forms an abelian (commutative) group, a Cats CommutativeGroup, with this operation.
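To make the group structure concrete, here is a minimal sketch that checks the laws on individual values. It is written in plain Java against java.time so that it runs without Scala or Cats; the class and method names are hypothetical and are not part of intime, and the sketch ignores integer overflow in the fields.

```java
import java.time.Period;

// Hypothetical helper (not part of intime): spot-checks the commutative
// group laws for Period under Period.plus. Overflow is ignored.
final class PeriodGroupSketch {

    // Identity: adding Period.ZERO on either side changes nothing.
    static boolean hasIdentity(Period p) {
        return p.plus(Period.ZERO).equals(p) && Period.ZERO.plus(p).equals(p);
    }

    // Inverse: p combined with its negation is the zero period.
    static boolean hasInverse(Period p) {
        return p.plus(p.negated()).isZero();
    }

    // Commutativity: a + b equals b + a.
    static boolean commutes(Period a, Period b) {
        return a.plus(b).equals(b.plus(a));
    }

    // Associativity: (a + b) + c equals a + (b + c).
    static boolean associates(Period a, Period b, Period c) {
        return a.plus(b).plus(c).equals(a.plus(b.plus(c)));
    }
}
```

Note that Period.equals compares the years, months and days fields individually, so Period.ofMonths(12) and Period.ofYears(1) are not equal even though they describe the same span.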

Comparing two Period values, though, was an interesting problem.

Comparing Period values

Intuitively, we know that a Period has an ordering. 1 day is clearly shorter than 2 days. 14 days is obviously shorter than a month. One year is definitely longer than 150 days.

But there are some comparisons that don’t work intuitively. Is 1 month shorter or longer than 30 days? It’s shorter if it’s February (28 or 29 days), but it’s longer if it’s March (31 days). A 5 year period might have two leap years (1827 days), one (1826 days), or, if it spans a skipped century leap year such as 1900, none (1825 days).
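These numbers are easy to check directly. The sketch below, in plain Java with a hypothetical helper name, counts the days that elapse when a Period is added to a particular start date.

```java
import java.time.LocalDate;
import java.time.Period;
import java.time.temporal.ChronoUnit;

// Hypothetical helper (not part of intime).
final class PeriodLengthSketch {

    // Number of days between `start` and the date reached by adding `period`.
    static long daysElapsed(LocalDate start, Period period) {
        return ChronoUnit.DAYS.between(start, start.plus(period));
    }
}
```

With this, daysElapsed(LocalDate.of(2023, 2, 1), Period.ofMonths(1)) is 28, while the same period starting on 2023-03-01 is 31 days. Five years from 2020-01-01 is 1827 days, and five years from 1897-01-01 is 1825 days, because 1900 was not a leap year.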

We should consider periods to be a partially ordered set. That is, not all pairs of Period values can be compared (e.g. 29 days vs 1 month), but most pairs can be. In Scala we can represent this relationship with the PartialOrdering trait in the standard library, or with PartialOrder in Cats.

In this way, we can construct a simple formulation of how this partial ordering should work:

Adding a period p to a base date d gives d' = d + p. Here, d' is the date after p has elapsed from the base date d.

Then, for any two periods p₁ and p₂:

  • p₁ = p₂ if d + p₁ = d + p₂ for all dates d
  • p₁ > p₂ if d + p₁ > d + p₂ for all dates d
  • p₁ < p₂ if d + p₁ < d + p₂ for all dates d
  • p₁ and p₂ are incomparable otherwise

For example:

  • 1 month > 27 days because no matter what date you pick, a date 1 month later will always be after a date 27 days later.
  • 1 month is incomparable with 30 days because 1 month can be longer than, the same as, or shorter than 30 days depending on the date.
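The definition above can be turned into a brute-force check almost word for word. This is only a sketch of the definition, not how intime implements it: the names are hypothetical, it is written in Java rather than Scala, and it assumes that trying every start date in one 400-year stretch is enough, since the ISO calendar used by LocalDate repeats every 146097 days. An empty OptionalInt plays the role of None in PartialOrdering.tryCompare.

```java
import java.time.LocalDate;
import java.time.Period;
import java.util.OptionalInt;

// Hypothetical helper (not part of intime): compares two periods by
// adding both to every start date in one 400-year calendar cycle.
final class PeriodPartialOrderSketch {

    // Days in 400 ISO calendar years, after which the calendar repeats.
    private static final int DAYS_IN_CYCLE = 146_097;

    // Returns -1, 0 or 1 when the same relation holds for every start
    // date, and an empty result when the periods are incomparable.
    static OptionalInt tryCompare(Period p1, Period p2) {
        LocalDate d = LocalDate.of(2000, 1, 1); // arbitrary start of a cycle
        boolean sawLess = false;
        boolean sawEqual = false;
        boolean sawGreater = false;

        for (int i = 0; i < DAYS_IN_CYCLE; i++) {
            int c = d.plus(p1).compareTo(d.plus(p2));
            if (c < 0) {
                sawLess = true;
            } else if (c > 0) {
                sawGreater = true;
            } else {
                sawEqual = true;
            }
            // Two different outcomes mean no single relation holds for all dates.
            int outcomes = (sawLess ? 1 : 0) + (sawEqual ? 1 : 0) + (sawGreater ? 1 : 0);
            if (outcomes > 1) {
                return OptionalInt.empty();
            }
            d = d.plusDays(1);
        }

        if (sawLess) {
            return OptionalInt.of(-1);
        }
        return sawGreater ? OptionalInt.of(1) : OptionalInt.of(0);
    }
}
```

Under this strict reading, tryCompare(Period.ofMonths(1), Period.ofDays(27)) is positive, while comparing 1 month with 30 days gives an empty result. It is far too slow to use as a real Ordering, but it is a handy reference to test a faster implementation against.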

So how long is a month?

To efficiently compare two periods, we need to determine their minimum and maximum lengths. For periods under 1 year, variation comes from the fact that months vary in length. We can see this behaviour in the following table:

Num months   Min length (days)   Max length (days)   Variation (days)
0            0                   0                   0
1            28                  31                  3
2            59                  62                  3
3            89                  92                  3
4            120                 123                 3
5            150                 153                 3
6            181                 184                 3
7            212                 215                 3
8            242                 245                 3
9            273                 276                 3
10           303                 306                 3
11           334                 337                 3
12           365                 366                 1

At first glance, you might imagine that because the shortest 1-month period is 28 days, you can just double this number to get the shortest 2-month period (56 days). In fact, the shortest 2-month period is 59 days. This is because you never have two back-to-back 28-day months. February is always preceded by January and followed by March, both of which have 31 days.
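One way to find these bounds is simply to try every start date. The sketch below does that in plain Java; the names are hypothetical, and it relies on two assumptions: that one 400-year cycle of start dates covers every case, and that LocalDate.plusMonths is the right notion of "n months later" (it clamps to the last valid day when the target month is shorter).

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical helper (not part of intime): finds the shortest and
// longest span of `months` months by trying every start date in one
// 400-year calendar cycle.
final class MonthBoundsSketch {

    private static final int DAYS_IN_CYCLE = 146_097;

    // Returns {minDays, maxDays} for a period of the given number of months.
    static long[] dayBounds(int months) {
        LocalDate d = LocalDate.of(2000, 1, 1); // arbitrary start of a cycle
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (int i = 0; i < DAYS_IN_CYCLE; i++) {
            long days = ChronoUnit.DAYS.between(d, d.plusMonths(months));
            min = Math.min(min, days);
            max = Math.max(max, days);
            d = d.plusDays(1);
        }
        return new long[] {min, max};
    }
}
```

For example, dayBounds(1) gives 28 and 31, dayBounds(2) gives 59 and 62, and dayBounds(12) gives 365 and 366.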

If we plot the “variation” in periods of n months, we can also see that something weird happens at multiples of 12 months.

Every 12-month period has either 365 or 366 days. This variation is obviously down to leap years, but it is interesting that when you pick a multiple of 12 months, the variation due to month length disappears.

How long is a year?

Periods over a year vary in length due to leap years. The algorithm for leap years introduces a bit more variation in period length than you might expect, since you “skip” a leap year in 3 of every 4 centuries.
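For reference, the leap year rule can be written out as below. This is a sketch with hypothetical names; java.time already provides the same check as Year.isLeap, which the last method uses for comparison.

```java
import java.time.Year;

// Hypothetical helper (not part of intime): the Gregorian leap year rule.
final class LeapYearSketch {

    // Leap if divisible by 4, except century years, which must also be
    // divisible by 400. So 1700, 1800 and 1900 are skipped; 2000 is not.
    static boolean isLeap(int year) {
        return year % 4 == 0 && (year % 100 != 0 || year % 400 == 0);
    }

    // Counts the leap years in the 400 years beginning at startYear.
    static int leapYearsInCycle(int startYear) {
        int count = 0;
        for (int y = startYear; y < startYear + 400; y++) {
            if (isLeap(y)) {
                count++;
            }
        }
        return count;
    }

    // Checks the hand-written rule against java.time for one year.
    static boolean agreesWithJavaTime(int year) {
        return isLeap(year) == Year.isLeap(year);
    }
}
```

Any run of 400 consecutive years contains 97 leap years: 100 multiples of 4, minus the 3 skipped century years.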

The effect of this is that, for periods of up to 400 years, the variation in period length due to the number of leap years ranges from 0 to 3 days.

Interestingly, no matter what the start date, every 400-year period has exactly 146097 days (400 × 365, plus 97 leap days), so any period that’s a multiple of 400 years has a fixed length.
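The same brute-force approach works for whole years. The sketch below (hypothetical names, plain Java, and again assuming that one 400-year cycle of start dates covers every case) measures how much a period of n years can vary in length.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

// Hypothetical helper (not part of intime): measures how much the
// length of an n-year period depends on its start date.
final class YearVariationSketch {

    private static final int DAYS_IN_CYCLE = 146_097; // 400 * 365 + 97

    // Difference in days between the longest and shortest span of
    // `years` years, over every start date in one 400-year cycle.
    static long variation(int years) {
        LocalDate d = LocalDate.of(2000, 1, 1); // arbitrary start of a cycle
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (int i = 0; i < DAYS_IN_CYCLE; i++) {
            long days = ChronoUnit.DAYS.between(d, d.plusYears(years));
            min = Math.min(min, days);
            max = Math.max(max, days);
            d = d.plusDays(1);
        }
        return max - min;
    }

    // Length in days of the 400 years beginning at `start`.
    static long daysIn400Years(LocalDate start) {
        return ChronoUnit.DAYS.between(start, start.plusYears(400));
    }
}
```

Here variation(1) is 1, variation(5) is 2 and variation(400) is 0, and daysIn400Years returns 146097 for whichever start date you give it.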

Pulling it all together

A Period, then, can vary in length by up to 3 days due to the number of months, and by another 3 days depending on the number of years. This means that the length of any given Period can be spread over a window of up to 6 days, and it cannot be compared with other Period values whose lengths fall inside that window.