33  The Julia Ecosystem Today

Alec Loudenback

33.1 Chapter Overview

A tour of relevant available packages as of 2023.

33.2 The Julia Ecosystem

The Julia ecosystem favors composability and interoperability, enabled by multiple dispatch. In other words, because it’s easy to automatically specialize functionality based on the type of data being used, there’s much less need to bundle a lot of features within a single package.
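As a rough illustration of what "specializing on the type of data" means, here is a minimal sketch. The `Cashflow`, `Premium`, `Claim`, and `Expense` types and the `net` and `total` functions are hypothetical names made up for this example; they don't come from any package.

```julia
# Hypothetical types for illustration only.
abstract type Cashflow end

struct Premium <: Cashflow
    amount::Float64
end

struct Claim <: Cashflow
    amount::Float64
end

# One generic function; the method is chosen by the argument's type.
net(c::Premium) = c.amount       # premiums are inflows
net(c::Claim) = -c.amount        # claims are outflows

# Generic code written once works for any mix of Cashflow subtypes.
total(flows) = sum(net, flows)

flows = [Premium(1000.0), Claim(250.0), Premium(1000.0)]
total(flows)                     # 1750.0

# A different "package" can add a new type later without touching the code above.
struct Expense <: Cashflow
    amount::Float64
end
net(c::Expense) = -c.amount

total([Premium(1000.0), Expense(100.0)])   # 900.0
```

The point of the sketch is that `total` never needed to know about `Expense`: the new type just had to provide its own `net` method.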

As you’ll see, Julia packages tend to be less vertically integrated because it’s easier to pass data around. Counterexamples of this in Python and R:

  • NumPy-compatible packages that are designed to work with a subset of numerically fast libraries in Python
  • Special functions in pandas to read CSV, JSON, database connections, etc.
  • The Tidyverse in R has a tightly coupled set of packages that works well together but has limitations with some other R packages

Julia is not perfect in this regard, but it’s neat to see how frequently things just work. It’s not magic, but because of Julia features outside the scope of this article it’s easy for package developers (and you!) to do this.

Julia also has language-level support for documentation, so packages can follow a consistent style of help-text and have the docs be auto-generated into web pages available locally or online.
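A docstring is just a string literal placed directly above a definition. The sketch below uses a made-up function, `discount_factor`, purely to show the convention.

```julia
"""
    discount_factor(rate, t)

Return the present value of 1 paid at time `t`, discounted at the
annual effective interest rate `rate`.
"""
discount_factor(rate, t) = 1 / (1 + rate)^t

# In the REPL, typing `?discount_factor` displays the text above.
discount_factor(0.05, 10)
```

Tools such as Documenter.jl can collect these same docstrings when building a package's documentation site.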

The following highlighted packages were chosen for their relevance to typical actuarial work, with a bias towards those used regularly by the authors. This is a small sampling of the over 6000 registered Julia packages.[^2]

33.2.1 Data

Julia offers a rich data ecosystem with a multitude of available packages. Perhaps at the center of the data ecosystem are CSV.jl and DataFrames.jl. CSV.jl is for reading and writing delimited text files (namely CSVs) and offers top-class read and write performance. DataFrames.jl is a mature package for working with dataframes, comparable to pandas or dplyr.
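As a small sketch of how the two packages fit together (assuming both are installed), the following parses some made-up policy data and totals premium by product. The column names and values are invented for the example.

```julia
using CSV, DataFrames

# Made-up policy data, inlined so the example needs no file on disk.
raw = """
policy_id,product,premium
1,term,1000
2,term,1200
3,whole_life,2500
"""

# CSV.read parses the text into the requested sink type, here a DataFrame.
df = CSV.read(IOBuffer(raw), DataFrame)

# Split-apply-combine: total premium by product.
by_product = combine(groupby(df, :product), :premium => sum => :total_premium)
```

Note that CSV.jl knows nothing about DataFrames.jl specifically: the `DataFrame` argument is simply the table type that should receive the parsed columns.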

Other notable packages include ODBC.jl, which lets you connect to any database for which you have the right drivers installed, and Arrow.jl, which implements the Apache Arrow standard in Julia.

Worth mentioning also is Dates, a built-in package making date manipulation straightforward and robust.
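A few lines give the flavor of date arithmetic with `Dates`. The dates below are arbitrary and chosen only for illustration.

```julia
using Dates

issue = Date(2023, 1, 31)

# Adding a month clamps to the last valid day of the target month.
next_month = issue + Month(1)                    # 2023-02-28

# Subtracting two dates gives a period in days.
valuation = Date(2023, 12, 31)
days_in_force = Dates.value(valuation - issue)   # 334

# Policy anniversaries as a range of dates, stepping one year at a time.
anniversaries = issue:Year(1):Date(2026, 1, 31)
```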

Check out the JuliaData organization for more packages and information.

33.2.2 Plotting

Plots.jl is a meta-package providing a consistent interface to several plotting backends, so you can choose one depending on whether you want to emphasize interactivity on the web or print-quality output. You can very easily add animations or change almost any feature of a plot.

StatsPlots.jl extends Plots.jl with a focus on data visualization and compatibility with dataframes.

Makie.jl supports GPU-accelerated plotting and can create very rich, beautiful visualizations, but its main downside is that it has not yet been optimized to minimize the time-to-first-plot.

33.2.3 Statistics

Julia has first-class support for missing values, which follows the rules of three-valued logic so other packages don’t need to do anything special to incorporate missing values.
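The short sketch below, using only built-in functionality, shows how `missing` propagates and how three-valued logic resolves a result when the unknown operand cannot change it.

```julia
x = [1.0, missing, 3.0]

# Arithmetic involving an unknown value is itself unknown.
sum(x)                   # missing
sum(skipmissing(x))      # 4.0

# Three-valued logic: the answer is known when the missing operand can't change it.
true | missing           # true
false & missing          # false
true & missing           # missing

# `coalesce` substitutes a default for missing values.
coalesce.(x, 0.0)        # [1.0, 0.0, 3.0]
```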

StatsBase.jl and Distributions.jl are essentials for a range of statistics functions and probability distributions respectively.
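For a sense of how these read in practice, here is a minimal sketch assuming both packages are installed. The lognormal severity model and its parameters are made up for illustration.

```julia
using Distributions, StatsBase, Random

# Hypothetical claim-severity model; the parameters are invented.
severity = LogNormal(8.0, 1.0)

mean(severity)               # analytic mean, exp(8 + 1/2)
quantile(severity, 0.95)     # 95th percentile
cdf(severity, 10_000)        # probability a claim is at most 10,000

# Simulate claims and summarize the sample with StatsBase.
Random.seed!(1)
claims = rand(severity, 10_000)
summarystats(claims)
```

The same handful of functions (`mean`, `quantile`, `cdf`, `rand`) works across the distributions in the package, so swapping in a different model usually means changing one line.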

Others include:

  • Turing.jl, a probabilistic programming (Bayesian statistics) library, which is outstanding in its combination of clear model syntax with performance.
  • GLM.jl for linear and generalized linear models (mimicking R’s glm functionality).
  • LsqFit.jl for fitting data to non-linear models.
  • MultivariateStats.jl for multivariate statistics, such as PCA.

You can find more packages and learn about them here.

33.2.4 Machine Learning

Flux, Gen, Knet, and MLJ are all very popular machine learning libraries. There are also packages for PyTorch, TensorFlow, and scikit-learn available. One advantage for users is that the Julia packages are written in Julia, so it can be easier to adapt or see what’s going on in the entire stack. In contrast, PyTorch and TensorFlow are built primarily with C++.

Another advantage is that the Julia libraries can use automatic differentiation to optimize on a wider range of data and functions than those built into libraries in other languages.

33.2.5 Differentiable Programming

Sensitivity testing is very common in actuarial workflows: essentially, it’s understanding the change in one variable in relation to another. In other words, the derivative!

Julia has an unusual capability: across almost the entire language and ecosystem, you can take the derivative of entire functions or scripts. For example, the following is real Julia code to automatically calculate the sensitivity of the ending account value with respect to the inputs:

julia> using Zygote

julia> function policy_av(pol)
    COIs = [0.00319, 0.00345, 0.0038, 0.00419, 0.0047, 0.00532]
    av = 0.0
    for (i,coi) in enumerate(COIs)
        av += av * pol.credit_rate
        av += pol.annual_premium
        av -= pol.face * coi
    end
    return av                # return the final account value
end

julia> pol = (annual_premium = 1000, face = 100_000, credit_rate = 0.05);

julia> policy_av(pol)        # the ending account value
4048.08

julia> policy_av'(pol)       # the derivative of the account value with respect to the inputs
(annual_premium = 6.802, face = -0.0275, credit_rate = 10972.52)

When executing the code above, Julia isn’t just adding a small amount and calculating the finite difference. Differentiation is applied to entire programs through extensive use of basic derivatives and the chain rule. Automatic differentiation has uses in optimization, machine learning, sensitivity testing, and risk analysis. You can read more about Julia’s autodiff ecosystem here.
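To make "basic derivatives and the chain rule" concrete, here is a toy forward-mode sketch using dual numbers. It is not how Zygote works internally (Zygote uses reverse-mode differentiation); the `Dual` type and `ending_av` function are hypothetical names defined only for this illustration, and only the credited rate is differentiated.

```julia
# Toy forward-mode automatic differentiation (illustration only, not Zygote's method).
struct Dual <: Number
    val::Float64   # the value
    der::Float64   # the derivative with respect to the chosen input
end

# Basic derivative rules, one per arithmetic operation.
Base.:+(a::Dual, b::Dual) = Dual(a.val + b.val, a.der + b.der)
Base.:-(a::Dual, b::Dual) = Dual(a.val - b.val, a.der - b.der)
Base.:*(a::Dual, b::Dual) = Dual(a.val * b.val, a.der * b.val + a.val * b.der)

# Let plain numbers mix with duals: constants carry a zero derivative.
Base.convert(::Type{Dual}, x::Real) = Dual(x, 0.0)
Base.promote_rule(::Type{Dual}, ::Type{<:Real}) = Dual

# The same account-value roll-forward as above, as a function of the credited rate.
function ending_av(rate; premium = 1000.0, face = 100_000.0)
    cois = [0.00319, 0.00345, 0.0038, 0.00419, 0.0047, 0.00532]
    av = zero(rate)
    for coi in cois
        av += av * rate
        av += premium
        av -= face * coi
    end
    return av
end

# Seed the input with derivative 1 and read the sensitivity off the result.
result = ending_av(Dual(0.05, 1.0))
result.val   # about 4048.08, the ending account value
result.der   # about 10972.5, the sensitivity to the credited rate
```

Every arithmetic step carries its derivative along with its value, so the final sensitivity is exact up to floating-point rounding rather than an approximation from bumping the input.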

33.2.6 Utilities

There are also a lot of quality-of-life packages, like Revise.jl which lets you edit code on the fly without needing to re-run entire scripts.

BenchmarkTools.jl makes it incredibly easy to benchmark your code: simply add @benchmark in front of what you want to test, and you will be presented with detailed statistics. For example:

julia> using ActuaryUtilities, BenchmarkTools

julia> @benchmark present_value(0.05,[10,10,10])

BenchmarkTools.Trial: 10000 samples with 994 evaluations.
 Range (min … max):  33.492 ns … 829.015 ns  ┊ GC (min … max): 0.00% … 95.40%
 Time  (median):     34.708 ns               ┊ GC (median):    0.00%
 Time  (mean ± σ):   36.599 ns ±  33.686 ns  ┊ GC (mean ± σ):  4.40% ±  4.55%

  ▁▃▆▆▆██▇▄▃▂         ▁                                        ▂
  █████████████▆▆▇█▇████▇██▇█▇█▇▇▆▆▅▅▅▅▅▄▅▄▄▅▅▅▅▄▄▁▅▄▄▅▄▄▅▅▆▅▆ █
  33.5 ns       Histogram: log(frequency) by time      45.6 ns <

 Memory estimate: 112 bytes, allocs estimate: 1.

Test is a built-in package for writing unit tests and grouping them into test sets, while Documenter.jl builds high-quality documentation from your inline documentation.
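A test set can be as short as the sketch below. The `pv` helper is a hypothetical stand-in defined here so the example is self-contained; it is not the ActuaryUtilities implementation.

```julia
using Test

# Hypothetical helper under test: present value of end-of-period cashflows.
pv(rate, cashflows) = sum(cf / (1 + rate)^t for (t, cf) in enumerate(cashflows))

@testset "present value" begin
    @test pv(0.0, [10, 10, 10]) == 30
    @test pv(0.05, [10, 10, 10]) ≈ 27.2325 atol=1e-4
    @test pv(0.05, [100]) ≈ 100 / 1.05
end
```

Running the block prints a summary of passed and failed tests for the set.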

ClipData.jl lets you copy and paste from spreadsheets to Julia sessions.

33.2.7 Other packages

Julia is a general-purpose language, so you will find packages for web development, graphics, game development, audio production, and much more. You can explore packages (and their dependencies) at https://juliahub.com/.

33.2.8 Actuarial packages

Saving the best for last, the next article in the series will dive deeper into actuarial packages, such as those published by JuliaActuary for easy mortality table manipulation, common actuarial functions, financial math, and experience analysis.