5  Elements of Programming

Alec Loudenback

“Programming is not about typing, it’s about thinking.” — Rich Hickey (2011)

5.1 Chapter Overview

Start building up computer science concepts by introducing tangible programming essentials. Data types, variables, control flow, functions, and scope are introduced.

Tip: On Your First Read-through

This chapter is intended to be an introductory reference for most of the basic building blocks on top of which we will build abstractions in the chapters that follow. We want it to be an easy and mildly opinionated stepping-stone on your journey.

At some point, you will likely find yourself seeking more precise or thorough documentation and will begin directly searching or reading the documentation of a language or library itself. However, reference documentation can be intimidating or frustrating to read due to its density and terminology. Let this chapter (and the book, writ large) be a bridge for you.

If reading this book in a linear fashion and new to programming, we suggest skipping the following sections and returning when encountering the concept or term later in the book:

Caution

This introductory chapter is intended to provide a survey of the important concepts and building blocks, not to be a complete reference. For full details on available functions, more complete definitions, and a more complete tour of all language features, see the Manual at docs.julialang.org.

5.2 Computer Science, Programming, and Coding

Computer Science is the study of computing and information. As a science, it is distinct from programming languages, which are merely coarse implementations of specific computer science concepts¹.

Programming (or “coding”) is the art and science of writing code in programming languages to have the computer perform desired tasks. While this may sound mechanistic, programming truly is one of the highest forms of abstract thinking. The design space of potential solutions is so large and potentially complex that much art and experience is needed to create a well-made program.

The language of computer science also provides a lexicon so that financial practitioners can discuss model architecture and characteristics of problems with precision and clarity. Simply having additional terminology and language to describe a concept illuminates aspects of the problem in new ways, opening oneself up to more innovative solutions.

In this light, the financial modeling that we do can be considered a type of computer program. It takes as input abstract information (data), performs calculations (an algorithm), and returns new data as an output. We generally do not need to consider many things that a software engineer may contemplate such as a graphical user interface, networking, or access restrictions. However, the programming fundamentals are there: a good financial modeler must understand data types, algorithms, and some hardware details.

We will build up the concepts over this and the following chapters:

  • This chapter will provide a survey of important concepts in computer science that will prove useful for our financial modeling. First, we will talk about data types, boolean logic, and basic expressions. We’ll build on those to discuss algorithms (functions) which perform useful work and use control flow and recursion.
  • In the following chapters about abstraction, we will step back and discuss higher level concepts: the “schools of thought” around organizing the relationship between data and functions (functional versus object-oriented programming), design patterns, computational complexity, and compilation.
Tip

There will be brief references to hardware considerations for completeness, but hardware knowledge is not necessary to understand most programming languages (including Julia). It’s impossible to completely avoid talking about hardware when you care about the performance of your code, so feel free to gloss over the reference to hardware details on the first read and come back later after Chapter 9.

It’s highly recommended that you have a Julia session open (e.g. a REPL or a notebook) and follow along with the examples when first going through this chapter. See the first part of Chapter 21 if you haven’t gotten that set up yet.

Tip

You can get some help in the REPL by typing a ? followed by the symbol you want help with, for example:

 help?> sum
search: sum sum! summary cumsum cumsum! ...

  sum(f, itr; [init])


  Sum the results of calling function f on each element of itr.

... More text truncated...

5.3 Expressions and Control Flow

5.3.1 Naming Values with Variables

One of the first things it will be convenient to understand is the concept of variables. In virtually every programming language, we can assign values to make our program more organized and meaningful to the human reader. In the following example, we assign values to intermediate symbols to benefit us humans as we convert (silly!) American distance units:

feet_per_yard = 3
yards_per_mile = 1760

feet = 3000
miles = feet / feet_per_yard / yards_per_mile
0.5681818181818182

The above is technically the same thing as just computing 3000 / 3 / 1760; however, we’ve given the elements names meaningful to the human user.

Beyond readability, variables are a form of abstraction which allows us to think beyond specific instances of data and numbers to a more general representation. For example, the last line in the prior code example is a very generic computation of a unit conversion relationship and feet could be any number and the expression remains a valid calculation.

Tip

We will dive a little bit deeper into variables and assignment in Section 5.3.4, distinguishing between assignment and references.

5.3.2 Expressions

Within the code examples above, we can zoom in on small pieces of code called expressions. Expressions are effectively the basic block of code that gets evaluated to produce a value. Here is an expression that adds two integers together and evaluates to a new integer (3 in this case):

1 + 2
3

A bigger program is built up of many of these smaller bits of code.

5.3.2.1 Compound Expression

There are two kinds of blocks where we can ensure that sub-expressions get evaluated in order and return the last expression as the overall return value: begin and let blocks.

c = begin
    a = 3
    b = 4
    a + b
end

a, b, c
(3, 4, 7)

Alternatively, you can chain together ;s to create a compound expression:

z = (x = 1; y = 2; x + y)
3

Compound expressions allow you to group multiple operations together while still having the entire block evaluate to a single value, typically the last expression. This makes it easy to use complex logic anywhere a value is needed, like in function arguments or assignments.
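
As a small illustration of using a compound expression where a value is needed, the sketch below passes one directly as a function argument. The variable names (u, w, hypotenuse) are made up for this example.

```julia
# A parenthesized, semicolon-separated compound expression used as a function argument
hypotenuse = sqrt((u = 3; w = 4; u^2 + w^2))

# The same computation using a begin block as the argument
hypotenuse_again = sqrt(begin
    u = 3
    w = 4
    u^2 + w^2
end)

(hypotenuse, hypotenuse_again)  # (5.0, 5.0)
```

In both cases only the value of the last sub-expression (25) is passed to sqrt.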

A let block defines variables within its own scope, and those variables cannot be accessed outside that scope. More on scope in Section 5.6.

c = let
    g = 3
    h = 4
    g + h
end

@show c
@show g
c = 7
UndefVarError: `g` not defined in `Main.Notebook`
Suggestion: check for spelling errors or missing imports.
Stacktrace:
 [1] macro expansion
   @ show.jl:1232 [inlined]
 [2] top-level scope
   @ ~/prog/julia-fin-book/foundations-of-programming.qmd:131

5.3.2.2 Conditional Expressions

Conditionals are expressions that evaluate to a boolean true or false. This is the beginning of really being able to assemble complex logic to perform useful work. Here are a handful of expressions that would evaluate to true:

1 > 0
1 == 1 # check for equality
1.0 isa Float64
(5 > 0) & (-1 < 2) # "and" expression
(5 > 0) | (-1 > 2) # "or" expression
1 != 2
true
Note

In Julia, the booleans have an integer equality: true is equal to 1 (true == 1) and false is equal to 0 (false == 0). However:

  • true != 5. Only 1 is equal to true (in some languages, any non-zero number is “truthy”).
  • true is not egal to 1 (egal is defined later in this chapter).
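
The points in this note can be checked directly in the REPL. The variable names below are only labels chosen for this sketch.

```julia
int_equal_one  = (true == 1)     # true
int_equal_zero = (false == 0)    # true
equal_five     = (true == 5)     # false: only 1 is equal to true
egal_one       = (true === 1)    # false: a Bool and an Int can be told apart
bool_sum       = true + true     # 2, because booleans take part in arithmetic as 0 and 1
```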

Conditionals can be used to assemble different logical paths for the program to follow and the general pattern is an if block:

if condition
    # do one thing
elseif other_condition
    # do something else
else
    # do something if none of the 
    # other conditions are met
end

A complete example:

function buy_or_sell(my_value, market_price)
    if my_value > market_price
        "buy more"
    elseif my_value < market_price
        "sell"
    else
        "hold"
    end
end

buy_or_sell(10, 15), buy_or_sell(15, 10), buy_or_sell(10, 10)
("sell", "buy more", "hold")

5.3.3 Equality

The “Ship of Theseus problem”² is an example of how equality can be a philosophically complex concept. In computer science we have the advantage that while we may not be able to resolve what’s the “right” type of equality, we can be more precise about it.

We can distinguish between two types of equality:

  • Egal equality is when a program could not distinguish between two objects at all

  • Equal equality is when the values of two objects are the same

If two things are egal, then they are also equal.

In the following example, s and t are equal but not egal:

s = [1, 2, 3]
t = [1, 2, 3]
s == t, s === t
(true, false)

One way to think about this is that while the values are equal, there is a way that one of the arrays could be made not equal to the other:

t[2] = 5
t
3-element Vector{Int64}:
 1
 5
 3

Now t is no longer equal to s:

s == t
false

The reason this happens is that arrays are containers that can have their contents modified. Even though they originally had the same values, s and t are different containers, and it just so happened that the values they contained started out the same.

Some data can’t be modified, including some kinds of collections. Immutable types like the following tuple, with the same stored values, are egal because there is no way for us to make them different:

(2, 4) === (2, 4)
true

Using this terminology, we could now interpret the “Ship of Theseus” as saying that the rebuilt ship is “equal” but not “egal” to the original.
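
To make the distinction concrete, here is a short sketch of a case where two names are egal (they refer to the very same array) and a case where two numbers are equal in value but still distinguishable by type. The name alias is chosen just for this illustration.

```julia
s = [1, 2, 3]
alias = s                        # a second name bound to the very same array
same_object = (alias === s)      # true: no program could tell them apart

numbers_equal = (1 == 1.0)       # true: same numeric value
numbers_egal  = (1 === 1.0)      # false: an Int64 and a Float64 are distinguishable
```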

5.3.4 Assignment, References, and Mutability

When we say x = 2 we are assigning the integer value of 2 to the variable x. This is an expression that lets us bind a value to a variable so that it can be referenced more concisely or in different parts of our code. When we re-assign the variable we are not mutating the value: x = 3 does not change the 2.

When we have a mutable object (e.g. an Array or a mutable struct), we can mutate the value inside the referenced container. For example:

x = [1, 2, 3]  # x refers to the array, which currently contains the elements 1, 2, and 3
x[1] = 5       # re-assign the first element of the array to be the value 5 instead of 1
x
3-element Vector{Int64}:
 5
 2
 3

In the above example, x has not been reassigned. It is possible for two variables to refer to the same object:

x = [1, 2, 3]
y = x          # y refers to the same underlying array as x
x[1] = 6
y
3-element Vector{Int64}:
 6
 2
 3
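
If an independent array is wanted instead of a second reference, one option is copy. The sketch below contrasts the two; the name z is introduced only for this example.

```julia
x = [1, 2, 3]
y = x            # y is another name for the same array
z = copy(x)      # z is a new array that starts with the same values

x[1] = 6

first_elements = (y[1], z[1])     # (6, 1): the change shows through y but not z
identity_check = (y === x, z === x)   # (true, false)
```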

Note that variables can be re-assigned unless they are marked as const:

const PHI = π * 2  # capitalizing constant variables is a convention in Julia

If we tried to re-assign PHI, Julia would raise an error (or, in some versions, a warning).

Warning

Note that if we declare a const variable that refers to a mutable container like an array, the container can still be mutated. It’s the reference to the container that remains constant, not necessarily the elements within the container.

For example, while MY_ARRAY will always point to the same array, the array itself can get mutated:

const MY_ARRAY = [1,2]
MY_ARRAY[1] = 99
MY_ARRAY
2-element Vector{Int64}:
 99
  2

5.3.5 Loops

Loops let a program repeat a set of expressions for as long as we want it to. There are two primary kinds of loops: for and while.

for loops are loops that iterate over a defined range or set of values. Let’s assume that we have the array v = [6,7,8]. Here are multiple examples of using a for loop in order to print each value to output (println):

# use fixed indices
for i in 1:3
    println(v[i])
end
# use the indices of the array
for i in eachindex(v)
    println(v[i])
end
# use the elements of the array
for x in v
    println(x)
end
# use the elements of the array, written with ∈ instead of in
for x ∈ v          # ∈ is typed \in<tab>
    println(x)
end

while loops will run repeatedly until an expression is false. Here are two examples of printing each value of v again:

# index the array
i = 1
while i <= length(v)
    println(v[i])
    # global is used to increment i by 1, since i is defined
    # outside the scope of the while loop (see Section 5.6)
    global i += 1
end

# index the array, leaving the loop manually
i = 1
while true
    println(v[i])
    if i >= length(v)
        # break terminates the loop manually, since the condition
        # that follows the while will never be false
        break
    end
    global i += 1
end

5.3.6 Performance of loops

Loops are highly performant in Julia and are often the fastest and most readable way to implement an algorithm. This contrasts with advice often given to Python or R users, where vectorized operations are heavily favored over loops for performance.
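
As a minimal sketch of the kind of plain loop this refers to, the function below accumulates a total element by element. The name loop_sum is hypothetical, and no timing claims are made here; it simply shows that an explicit loop is an ordinary way to write such a routine.

```julia
# Sum the elements of a collection with an explicit for loop
function loop_sum(values)
    total = zero(eltype(values))   # start from a zero of the element type
    for x in values
        total += x
    end
    return total
end

v = [6, 7, 8]
loop_sum(v)   # 21, the same result as the built-in sum(v)
```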

5.4 Data Types

Data types are a way of categorizing information by intrinsic characteristics. We instinctively know that 13.24 is different than "this set of words" and types are how we will formalize this distinction. This is a key conceptual point, and mathematically it’s like we have different sets of objects to perform specialized operations on. Beyond this set-like abstraction are implementation details related to computer hardware. You probably know that computers only natively “speak” in binary zeros and ones. Data types are a primary way that a computer can understand if it should interpret 01000010 as the character B or as the number 66³.

Each 0 or 1 within a computer is called a bit and eight bits in a row form a byte (such as 01000010). This is where we get terms like “gigabytes” or “kilobits per second” as a measure of the quantity or rate of bits something can handle⁴.
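
The same byte can be inspected under both interpretations. This is a small sketch using a binary literal; the variable names are chosen just for the example.

```julia
byte = 0b01000010            # an 8-digit binary literal, stored as a UInt8

as_integer   = Int(byte)         # 66
as_character = Char(byte)        # 'B'
bits         = bitstring(byte)   # "01000010"
bytes_used   = sizeof(byte)      # 1 byte, i.e. 8 bits
```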

5.4.1 Numbers

Numbers are usually grouped into two categories: integers and floating-point⁵ numbers. Integers are like the mathematical set of integers while floating-point is a way of representing decimal numbers. Both have some limitations since computers can only natively represent a finite set of numbers due to the hardware (more on this in Chapter 9). Here are three integers that are input into the REPL (Read-Eval-Print-Loop)⁶ and the result is printed below the input:

2
2
423
423
1929234
1929234

And three floating-point numbers:

0.2
0.2
-23.3421
-23.3421
14e3      # the same as 14,000.0
14000.0

In Julia, a literal such as 0.2 is interpreted as a 64-bit floating point type called Float64, regardless of whether the system architecture is 64-bit⁷ or 32-bit; a 32-bit Float32 must be requested explicitly (e.g. Float32(0.2)). Because a finite number of bits is used to represent a continuous, infinite set of numbers, some numbers cannot be represented with perfect precision. For example, if we ask for 0.2, the closest representations in 32 and 64 bits are:

  • 0.20000000298023223876953125 in 32-bit

  • 0.200000000000000011102230246251565404236316680908203125 in 64-bit

This leads to special considerations when computers perform floating point calculations, some of which will be covered in more detail in Chapter 9. For now, just note that floating point numbers have limited precision: even if we input 0.2, computations will use the approximations above, even though fewer digits are shown when the number is printed:

# Assign the value 0.2 to a variable x.
# More on variables/assignments in Section 5.3.4.
x = 0.2

# big(x) is an arbitrary precision floating point number and by default
# prints the full precision that was embedded in our variable x,
# which was originally a Float64.
big(x)
0.200000000000000011102230246251565404236316680908203125
Note

Note the difference in what was printed between the last example and when we input 0.2 earlier in the chapter. The earlier example had the same (not exactly equal to \(0.2\)) value, but it printed an abbreviated set of digits as a nicety for the user, who usually doesn’t want to look at floating point numbers with their full machine precision. The system has the full precision (0.20...3125) but is truncating the output.

In the last example, we’ve converted the normal Float64 to a BigFloat which will not truncate the output when printing.
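
A few quick checks make the limited precision visible. This is only a sketch; the variable names are labels for the example.

```julia
literal_type  = typeof(0.2)              # Float64
exact_compare = (0.1 + 0.2 == 0.3)       # false: the rounded sum is not the same float as 0.3
approx_compare = (0.1 + 0.2 ≈ 0.3)       # true: ≈ (isapprox) compares within a tolerance
narrow_vs_wide = (Float32(0.2) == 0.2)   # false: the nearest Float32 differs from the nearest Float64
```

When comparing the results of floating point calculations, a tolerance-based comparison such as isapprox is usually the safer choice.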

Integers are similarly represented with 32 or 64 bits (with Int32 and Int64). They are stored exactly, but only within a limited range:

  • -2,147,483,648 to 2,147,483,647 for Int32

  • -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 for Int64

Additional range in the positive direction is available if one chooses to use “unsigned”, non-negative numbers (UInt32 and UInt64). Unlike floating point numbers, the integers have a type Int which will use the system bit architecture by default (that is, Int(30) will create a 64-bit integer on 64-bit systems and a 32-bit integer on 32-bit systems).
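
The limits of each integer type can be queried with typemin and typemax. The sketch below also shows what happens when a calculation steps past the largest Int64: the result silently wraps around rather than raising an error.

```julia
int32_range = (typemin(Int32), typemax(Int32))   # (-2147483648, 2147483647)
int64_range = (typemin(Int64), typemax(Int64))   # (-9223372036854775808, 9223372036854775807)

uint32_max = Int64(typemax(UInt32))              # 4294967295, roughly double the Int32 maximum

wraps_around = (typemax(Int64) + 1 == typemin(Int64))   # true: overflow wraps
no_overflow  = big(typemax(Int64)) + 1                  # a BigInt can grow past the limit
```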

Tip: Floating Point and Excel

Excel’s numeric storage and calculation routines are complex and not quite the same as those of most programming languages, which follow the standards of the Institute of Electrical and Electronics Engineers (such as the IEEE 754 standard for double precision floating point numbers). Excel uses IEEE 754 for the computations, but results (and therefore the cells that hold the interim values of many calculations) are stored with 15 significant digits of information. In some ways this is the worst of both worlds: it has the sometimes unusual (but well-defined) behavior of floating point arithmetic, plus additional modifications at various steps of a calculation. In general, you can assume that the programming language result (following the IEEE 754 standard) is the better result, because the standard defines techniques to minimize issues that arise in floating point math. Some of those issues (round-off or truncation) can be amplified instead of minimized in Excel.

In practice, this means that it can be difficult to exactly replicate a calculation in Excel in a programming language and vice-versa. It’s best to try to validate a programming model versus Excel model using very small unit calculations (e.g. a single step or iteration of a routine) instead of an all-in result. You may need to define some tolerance threshold for comparison of a value that is the result of a long chain of calculation.
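
One way to express such a tolerance threshold is isapprox with a relative tolerance. In this sketch both numbers and the tolerance of 1e-12 are hypothetical values picked for illustration, not recommendations.

```julia
julia_value = 1234.5678901234567   # hypothetical model output
excel_value = 1234.56789012345     # hypothetical spreadsheet value, 15 significant digits

exactly_equal = (julia_value == excel_value)                       # false
within_tol    = isapprox(julia_value, excel_value; rtol = 1e-12)   # true
```

The appropriate tolerance depends on the length of the calculation chain being compared.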

Tip: Currencies and Decimals

Due to the inherent imprecision of floating point numbers, they should not be used for storing financial transaction records! The trade-offs inherent in floating point math described above do not lend themselves to accurate record-keeping. For example, consider summing two items sold for $0.11 and $0.12: the floating point operation 0.11 + 0.12 does not precisely equal 0.23.

BigFloat(0.11 + 0.12)
0.229999999999999982236431605997495353221893310546875

See how the result is slightly lower than 0.23: if we were adding US Dollars here, our system would be off by a fraction of a cent. Do that for millions of transactions in a day and you have a problem!

Generally, when doing modeling or even creating a valuation model, it’s okay to use floating point math. As an example, if you are trying to determine the value of an exotic option, your model is likely just fine outputting a value like 101.987087. If you go and sell this option, you’ll have to settle for either 101.98 or 101.99 when booking it. In most contexts this imprecision is likely okay!

If you are implementing a transaction or trading system, ensure proper treatment of the types representing your monetary numbers. A full treatment is beyond the scope of this book, but for a good introduction to the subject, see https://cs-syd.eu/posts/2022-08-22-how-to-deal-with-money-in-software.
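
As a rough sketch of two commonly used alternatives (not a complete treatment), monetary amounts can be held as an integer number of cents or as exact rational numbers. The variable names are hypothetical.

```julia
# Floating point: the sum is not exactly 0.23
float_matches = (0.11 + 0.12 == 0.23)         # false

# Integer cents: addition is exact
amounts_in_cents = [11, 12]
total_cents = sum(amounts_in_cents)           # 23

# Rational numbers: also exact
rational_total = 11//100 + 12//100            # 23//100
rational_matches = (rational_total == 23//100)   # true
```

Which representation is appropriate depends on the system's rounding rules and currencies, which is part of what the linked article discusses.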

5.4.2 Type Hierarchy

We can describe a hierarchy of types. Both Float64 and Int64 are examples of Real numbers (here, Real is an abstract Julia type which corresponds to the mathematical set of real numbers, commonly denoted \(\mathbb{R}\)). If both are Real numbers, why not just define all numbers as a Real type? Because for performant calculations, the computer must know in advance how many bits each number is represented with.

The diagram below shows the type hierarchy for most built-in Julia number types.

Figure: Numeric Type Hierarchy in Julia. Leaves of the tree are concrete types.

graph TD
    Number --> Real
    Number --> Complex

    Real --> Integer
    Real --> AbstractFloat
    Real --> Rational
    Real --> Irrational

    Integer --> Signed
    Integer --> Unsigned

    Signed --> Int8
    Signed --> Int16
    Signed --> Int32
    Signed --> Int64
    Signed --> Int128
    Signed --> BigInt

    Unsigned --> UInt8
    Unsigned --> UInt16
    Unsigned --> UInt32
    Unsigned --> UInt64
    Unsigned --> UInt128

    AbstractFloat --> Float16
    AbstractFloat --> Float32
    AbstractFloat --> Float64
    AbstractFloat --> BigFloat

The integer and floating point types described in the prior section are known as concrete types because there are no possible subtypes (child types). Further, a concrete type can be a bit type if the data type will always have the same number of bits in memory: a Float32 will always be 32 bits in memory, for example. Contrast this with strings (described below) which can contain an arbitrary number of characters.
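
The hierarchy can also be explored from code. The following sketch uses a few built-in query functions; the variable names are only labels.

```julia
parent_of_float64 = supertype(Float64)          # AbstractFloat
parent_of_int64   = supertype(Int64)            # Signed

float_is_real = (Float64 <: Real)               # true
int_is_real   = (Int32 <: Real)                 # true

real_is_abstract    = isabstracttype(Real)      # true
float64_is_concrete = isconcretetype(Float64)   # true

float32_is_bits = isbitstype(Float32)           # true: always 32 bits
string_is_bits  = isbitstype(String)            # false: length can vary
float32_bytes   = sizeof(Float32)               # 4 bytes = 32 bits
```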

5.4.3 Collections

Collections are types that are really useful for storing data which contains many elements. This section describes some of the most common and useful types of containers.

5.4.3.1 Arrays

Arrays are the most common way to represent a collection of similar data. For example, we can represent a set of integers as follows:

[1, 10, 300]
3-element Vector{Int64}:
   1
  10
 300

And a floating point array:

[0.2, 1.3, 300.0]
3-element Vector{Float64}:
   0.2
   1.3
 300.0

Note the above two arrays are different types of arrays. The first is Vector{Int64} and the second is Vector{Float64}. These are arrays of concrete types and so Julia will know that each element of an array is the same number of bits, which will enable more efficient computations. With the following set of mixed numbers, Julia will promote the integers to floating point since the integers can be accurately represented⁸ in floating point.

[1, 1.3, 300.0, 21]
4-element Vector{Float64}:
   1.0
   1.3
 300.0
  21.0

However, if we explicitly ask Julia to use a Real-typed array, the type is now Vector{Real}. Recall that Real is an abstract type. Having heterogeneous types within the array is conceptually fine, but in practice limits performance. Again, this will be covered in more detail in Chapter 9.
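
For instance, prefixing the array literal with the element type asks for a Real-typed array. In this sketch the name mixed is chosen for illustration; note that the elements keep their original types rather than being promoted.

```julia
mixed = Real[1, 1.3, 300.0, 21]

array_type   = typeof(mixed)        # Vector{Real}
first_is_int = mixed[1] isa Int     # true: not promoted to floating point
second_type  = typeof(mixed[2])     # Float64
```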

In Julia, arrays can be multi-dimensional. Here are two three-dimensional arrays with length three in each dimension:

rand(3, 3, 3)
3×3×3 Array{Float64, 3}:
[:, :, 1] =
 0.227295  0.155038  0.386715
 0.693694  0.846773  0.600266
 0.816159  0.484829  0.119723

[:, :, 2] =
 0.0742944  0.0137831  0.496126
 0.0402936  0.987477   0.562173
 0.156228   0.510746   0.880989

[:, :, 3] =
 0.409902   0.706882  0.130413
 0.0212694  0.663674  0.270278
 0.526567   0.111197  0.313601
[x + y + z for x in 1:3, y in 11:13, z in 21:23]
3×3×3 Array{Int64, 3}:
[:, :, 1] =
 33  34  35
 34  35  36
 35  36  37

[:, :, 2] =
 34  35  36
 35  36  37
 36  37  38

[:, :, 3] =
 35  36  37
 36  37  38
 37  38  39

The above example demonstrates array comprehension syntax which is a convenient way to create arrays in Julia.

A two-dimensional array has its rows separated by semicolons (;):

x = [1 2 3; 4 5 6]
2×3 Matrix{Int64}:
 1  2  3
 4  5  6
Note

In Julia, a Vector{Float64} is simply a one-dimensional array of floating points and a Matrix{Float64} is a two-dimensional array. More precisely, they are type aliases of the more generic Array{Float64,1} and Array{Float64,2} names. Arrays with three or more dimensions don’t have a type alias pre-defined.
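
The aliases and the shape of an array can be checked directly, as in this short sketch.

```julia
x = [1 2 3; 4 5 6]

shape      = size(x)     # (2, 3): two rows, three columns
dimensions = ndims(x)    # 2
element    = x[2, 3]     # 6: row 2, column 3

vector_alias = (Vector{Float64} === Array{Float64, 1})   # true
matrix_alias = (Matrix{Float64} === Array{Float64, 2})   # true
```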

5.4.3.2 Array indexing

Array elements are accessed with the integer position, starting at 1 for the first element⁹ ¹⁰:

v = [10, 20, 30, 40, 50]
v[2]
20

We can also access a subset of the vector’s contents by passing a range:

v[2:4]
3-element Vector{Int64}:
 20
 30
 40

And we can generically reference the array’s contents, such as:

v[begin+1:end-1]
3-element Vector{Int64}:
 20
 30
 40

We can also assign values into the array, push new elements to the end, and combine arrays:

v[2] = -1
push!(v, 5)
vcat(v, [1, 2, 3])
9-element Vector{Int64}:
 10
 -1
 30
 40
 50
  5
  1
  2
  3

5.4.3.3 Array Alignment

When you have an MxN matrix (M rows, N columns), a choice must be made as to which elements are next to each other in memory. Typical math convention and fundamental computer linear algebra libraries (dating back decades!) are column major and Julia follows that legacy. Column major means that elements going down the rows of a column are stored next to each other in memory. This is important to know so that (1) you remember that vectors are treated like a column vector when working with arrays (that is: an N-element 1D vector is like an Nx1 matrix), and (2) when iterating through an array, it will be faster for the computer to access elements next to each other column-wise. A 10x10 matrix is actually stored in memory as 100 elements coming in order, one after another in single file.

This 3x3 matrix is stored with the elements of columns next to each other, which we can see with vec:

mat = [1 2 3; 4 5 6; 7 8 9]
3×3 Matrix{Int64}:
 1  2  3
 4  5  6
 7  8  9
vec(mat)
9-element Vector{Int64}:
 1
 4
 7
 2
 5
 8
 3
 6
 9
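
One practical consequence is the loop order used when visiting every element. The sketch below (with a hypothetical function name, sum_by_column) puts the column index in the outer loop so that the inner loop walks down a column, following the memory layout. It also shows that a single linear index follows the same column-by-column order as vec.

```julia
mat = [1 2 3; 4 5 6; 7 8 9]

# A single (linear) index walks down the first column, then the second, and so on
linear_examples = (mat[2], mat[4])   # (4, 2)

function sum_by_column(A)
    total = zero(eltype(A))
    for j in axes(A, 2)        # outer loop: columns
        for i in axes(A, 1)    # inner loop: rows within a column (adjacent in memory)
            total += A[i, j]
        end
    end
    return total
end

sum_by_column(mat)   # 45
```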

5.4.3.4 Ranges

A range is a representation of a range of numbers. We actually used them above to index into arrays. They are expressed as start:stop.

We don’t have to actually store all of these numbers on the computer somewhere as in an Array. Instead, this is an object that represents the ordered set of numbers. So for example, we can sum up 1 through the number of atoms on the earth instantaneously:

sum(1:big(100_000_000_000_000_000_000_000_000_000_000_000_000_000_000_000_000))
5000000000000000000000000000000000000000000000000050000000000000000000000000000000000000000000000000

This is possible due to two things:

  1. not needing to actually store that many numbers in memory, and
  2. Julia being smart enough to apply the triangular number formula¹¹ when sum is given a consecutive range.

There are more general ways to construct ranges:

Step by another number instead of the default 1:

1:2:7
1:2:7

Specify the number of values within the range, inclusive of the first number¹²:

# range(start, stop, length)
range(0, 10, 21)
0.0:0.5:10.0
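
Ranges behave like read-only arrays in most respects, and collect turns one into an actual Array when the values need to be stored. A brief sketch, with variable names chosen for illustration:

```julia
odd_numbers = collect(1:2:7)        # [1, 3, 5, 7]
how_many    = length(1:2:7)         # 4
contains_5  = 5 in 1:2:7            # true

r = range(0, 10; length = 21)       # keyword form of the call shown above
second_value = r[2]                 # 0.5
step_size    = step(r)              # 0.5
```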

5.4.3.5 Characters, Strings, and Symbols

Characters are represented in most programming languages as letters within quotation marks. In Julia, individual characters are represented using single quotes:

'a'
'a': ASCII/Unicode U+0061 (category Ll: Letter, lowercase)

Letters and other characters present more difficulties than numbers to represent within a computer (think of how many languages and alphabets exist!), and it essentially only works because the world at large has agreed to a given representation. Originally ASCII (American Standard Code for Information Interchange) was used to represent just 95 of the most common English characters (“a” through “z”, zero through nine, etc.). Now, UTF (Unicode Transformation Format) can encode more than a million characters and symbols from many human languages.

Strings are a collection¹³ of characters, and can be created in Julia with double quotes:

"hello world"
"hello world"

It’s easy to ascertain how ‘normal’ characters can be inserted into a string, but what about things like new lines or tabs? They are represented by their own characters but are normally not printed in computer output. However, those otherwise invisible characters do exist. For example, here we will use a triple-quoted string literal (indicated by the """ ) to tell Julia to interpret the string as given, including the invisible new line created by hitting return on the keyboard between the two words:

"""
hello
world
"""
"hello\nworld\n"

The output above shows the \n character contained within the string.
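
A few basic operations on characters and strings are sketched below; the variable names are illustrative.

```julia
greeting = "hello"

n_chars    = length(greeting)            # 5
first_char = greeting[1]                 # 'h', a Char
joined     = greeting * " world"         # "hello world": * concatenates strings

name = "world"
interpolated = "hello $name"             # "hello world": $ inserts a variable's value

with_newlines = "hello\nworld\n"
n_with_newlines = length(with_newlines)  # 12: the two \n characters are counted

code_point = Int('a')                    # 97
next_char  = 'a' + 1                     # 'b'
```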

Symbols are a way of representing an identifier, which cannot be treated as a collection of individual characters. :helloworld is distinct from "helloworld". You can think of the former as an un-executed bit of code: if we were to execute it (with eval(:helloworld)), we would get the error UndefVarError: `helloworld` not defined. Symbols can look like strings but do not behave like them. For now, it is best not to worry about symbols, but they are an important aspect of Julia which allows the language to represent aspects of itself as data. This allows for powerful self-reference and self-modification of code, but this is a more advanced topic generally out of scope of this book.
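
The difference between a symbol and a string can be seen in a short sketch:

```julia
sym = :helloworld

is_symbol      = sym isa Symbol                     # true
equals_string  = (sym == "helloworld")              # false: a Symbol is not a String
as_string      = String(sym)                        # "helloworld"
from_string    = (Symbol("helloworld") === sym)     # true
```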

5.4.3.6 Tuples

Tuples are a set of values that belong together and are denoted by values inside parentheses, separated by commas. An example might be x-y coordinates in 2 dimensional space:

x = 3
y = 4
p1 = (x, y)
(3, 4)

A tuple’s values can be accessed like an array’s:

p1[1]
3

Tuples fill a middle ground between scalar types and arrays in more ways than one:

  • Tuples have no problem having heterogeneous types in the different slots.

  • Tuples are immutable, meaning that you cannot overwrite the value in memory (an error will be thrown if we try to do p1[1] = 5).

  • It’s generally expected that within an array, you would be able to apply the same operation to all the elements (e.g. square each element) or do something like sum all of the elements together, which isn’t generally the case for a tuple.

  • Tuples are generally stack allocated instead of being heap allocated like arrays¹⁴, meaning that a lot of times they can be faster than arrays.
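
The first two points can be demonstrated with a short sketch. The names mixed and err are chosen for this example.

```julia
p1 = (3, 4)
x, y = p1                       # unpack the tuple into two variables

mixed = (1, "two", 3.0)         # heterogeneous slots are fine
mixed_type = typeof(mixed)      # Tuple{Int64, String, Float64}

# Attempting to overwrite an element throws an error, which we capture here
err = try
    p1[1] = 5
catch e
    e
end
err isa MethodError             # true: tuples do not support assignment to an element
```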

5.4.3.6.1 Named Tuples

Named tuples provide a way to give each field within the tuple a specific name. For example, our x-y coordinate example above could become:

p2 = (x=3, y=4)
(x = 3, y = 4)

The benefit is that we can give more meaning to each field and access the values in a nicer way. Previously, we used p1[1] to access the x-value, but with the new definition we can access it by name:

p2.x
3
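
Named tuples still support positional access, and a few built-in functions work with the field names. A short sketch (p3 is a name made up for this example):

```julia
p2 = (x = 3, y = 4)

same_value  = (p2.x, p2[1], p2[:x])     # (3, 3, 3): by name, by position, by symbol
field_names = keys(p2)                  # (:x, :y)
field_vals  = values(p2)                # (3, 4)

# Named tuples are immutable, so "adding" a field creates a new named tuple
p3 = merge(p2, (z = 5,))                # (x = 3, y = 4, z = 5)
```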

5.4.3.7 Dictionaries

Dictionaries are containers which relate a key to an associated value. This is similar to how arrays relate an index to a value, but the difference is that a dictionary (1) is un-ordered and (2) has keys that don’t have to be integers.

Here’s an example which relates a name to an age:

d = Dict(
    "Joelle" => 10,
    "Monica" => 84,
    "Zaylee" => 39,
)
Dict{String, Int64} with 3 entries:
  "Monica" => 84
  "Zaylee" => 39
  "Joelle" => 10

Then we can look up an age given a name:

d["Zaylee"]
39

Dictionaries are super flexible data structures and can be used in many situations.
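
Some other common dictionary operations are sketched below. The added entry "Amara" and the lookup key "Nobody" are hypothetical.

```julia
d = Dict("Joelle" => 10, "Monica" => 84, "Zaylee" => 39)

d["Amara"] = 27                       # add a new key-value pair
d["Joelle"] += 1                      # update an existing value

has_monica  = haskey(d, "Monica")     # true
default_age = get(d, "Nobody", 0)     # 0: fall back to a default when the key is absent
n_entries   = length(d)               # 4

# Iterate over the pairs (the order is not guaranteed)
for (name, age) in d
    println(name, " is ", age)
end
```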

5.4.4 Parametric Types

We just saw how tuples can contain heterogeneous types of data inside a common container. Parametric Types are a way of allowing types themselves to be variable, with a wrapper type containing a to-be-specified inner-type.

Let’s look at this a little bit closer by looking at the full type:

typeof(p1)
Tuple{Int64, Int64}

p1 is a Tuple{Int64,Int64} type, which means that its first and second elements are both Int64. Contrast this with:

typeof(("hello", 1.0))
Tuple{String, Float64}

These tuples are both of the form Tuple{T,U} where T and U are both types. Why does this matter? Both we and the compiler can distinguish between a Tuple{Int64,Int64} and a Tuple{String,Float64}, which allows us to reason about things (“I can add the elements of a tuple together only if both are numbers”) and the compiler to optimize (sometimes it can know exactly how many bits in memory a tuple of a certain kind will need and be more efficient about memory use). Further, we will see how this can become a powerful force in writing appropriately abstracted code and in organizing our entire program more logically when we encounter “multiple dispatch” later on.

This is a very powerful technique - we’ve already seen the flexibility of having an Array type which can contain arbitrary inner types and dimensions. The full type signature for an Array looks like Array{InnerType,NumDimensions}.

let
    x = [1 2
        3 4]
    typeof(x)
end
Matrix{Int64} (alias for Array{Int64, 2})

5.4.5 Types for things not there

nothing represents that there’s nothing to be returned - for example if there’s no solution to an optimization problem or if a function just doesn’t have any value to return (such as in the case with input/output like println).

missing represents something that should be there but is not, as is all too common in real-world data. Julia natively supports missing and three-value logic, which is an extension of the two-value boolean (true/false) logic, to handle missing logical values:

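Table 5.1: Three value logic with true, missing, and false.

(a) Not logic

| NOT (!) | Value   |
|---------|---------|
| true    | false   |
| missing | missing |
| false   | true    |

(b) And logic

| AND (&) | true    | missing | false |
|---------|---------|---------|-------|
| true    | true    | missing | false |
| missing | missing | missing | false |
| false   | false   | false   | false |

(c) Or logic

| OR (\|) | true | missing | false   |
|---------|------|---------|---------|
| true    | true | true    | true    |
| missing | true | missing | missing |
| false   | true | missing | false   |
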
Tip

Missing and Nothing are the types while missing and nothing are the values here15. This is analogous to Float64 being a type and 2.0 being a value.
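The following sketch shows how `missing` and `nothing` behave in practice. The variable names and sample readings are illustrative assumptions; `ismissing`, `isnothing`, `skipmissing`, `coalesce`, and `findfirst` are standard Base functions.

```julia
# `missing` propagates through arithmetic and comparisons
total = 1 + missing            # missing
comparison = missing > 0       # missing

# Three-value logic: the result is known only when the missing value cannot change it
and_result = false & missing   # false
or_result = true | missing     # true

# Test for the values with `ismissing` and `isnothing` rather than `==`
readings = [1.5, missing, 2.5]
n_missing = count(ismissing, readings)

# `skipmissing` ignores missing entries; `coalesce` substitutes a default
clean_sum = sum(skipmissing(readings))
filled = coalesce.(readings, 0.0)

# `findfirst` returns `nothing` when there is no match
no_match = findfirst(==(10), [1, 2, 3])
isnothing(no_match)
```

Because `missing == missing` itself evaluates to `missing`, the dedicated `ismissing` check is the reliable way to test for it.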

5.4.6 Union Types

When two types may arise in a context, union types are a way to represent that. For example, if we have a data feed and we know that it will produce either a Float64 or a Missing type then we can say that the value for this is Union{Float64,Missing}. This is much better for the compiler (and our performance!) than saying that the type of this is Any.
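As a small illustration of the data-feed case, the sketch below builds arrays whose element type is a union. The variable names and values are assumptions made for the example.

```julia
# A literal array mixing Float64 and missing is given a Union element type
feed = [1.25, missing, 3.5]
eltype(feed)                       # Union{Missing, Float64}

# The same element type can be requested explicitly
typed_feed = Union{Float64, Missing}[2.0, missing]

# Both member types are subtypes of the union
Float64 <: Union{Float64, Missing}
Missing <: Union{Float64, Missing}

# An array declared with element type Any can hold anything,
# but it tells the compiler nothing about its contents
mixed = Any[1.25, "n/a", missing]
eltype(mixed)
```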

5.4.7 Creating User Defined Types

We’ve talked about some built-in types, but much additional capability comes from being able to define our own types. For example, taking the x-y-coordinate example from above, we could do the following instead of defining a tuple:

struct BasicPoint
    x::Int64
    y::Int64
end

p3 = BasicPoint(3, 4)
BasicPoint(3, 4)

BasicPoint is a composite type because it is composed of elements of other types. Fields are accessed the same way as named tuples:

p3.x, p3.y  # (1)
(3, 4)

(1) Note that here, Julia will return a tuple instead of a single value due to the comma separated expressions.

structs in Julia are immutable like tuples above.

But wait, didn’t tuples let us mix types too via parametric types? Yes, and we can do the same with our type!

struct Point{T}
    x::T
    y::T
end

Line 1: The {T} after the type’s name allows for different Points to be created depending on what the type of the underlying x and y is.

Here are two new points which now have different types:

p4 = Point(1, 4)
p5 = Point(2.0, 3.0)

p4, p5
(Point{Int64}(1, 4), Point{Float64}(2.0, 3.0))

Note that the types are not equal because they have different type parameters!

typeof(p4), typeof(p5), typeof(p4) == typeof(p5)
(Point{Int64}, Point{Float64}, false)

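But both are instances of the parametric type Point, because Point{Int64} and Point{Float64} are both subtypes of Point. The expression x isa T is true when the value x is an instance of the type T (or of one of T’s subtypes):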

p4 isa Point, p5 isa Point
(true, true)

Note, though, that x and y are both of the same type in each Point that we created. If instead we wanted to allow the coordinates to be of different types, then we could have defined Point as follows:

struct Point{T,U}
    x::T
    y::U
end
Note

Can we define the structs above without indicating a (parametric) type? Yes!

struct PointUntyped
    x # no type here!
    y # no type declared here either!
end

But! x and y will both be allowed to be Any, which is the fallback type where Julia says that it doesn’t know any more about the type until runtime (the time at which our program encounters the data when running). Observe that the type of x and y here is Any:

fieldtypes(PointUntyped)
(Any, Any)

This means that the compiler (and us!) can’t reason about or optimize the code as effectively as when the types are explicit or parametric. This is an example of how Julia can provide a nice learning curve - don’t worry about the types until you start to get more sophisticated about the program design or need to extract more performance from the code.

The above structs that we have defined are examples of concrete types: types which hold data. Abstract types don’t directly hold data themselves but are used to define a hierarchy of types which we will later exploit (Chapter 8) to implement custom behavior depending on what type our data is.

Here’s an example of defining a set of related types that sit above our Point2D:

abstract type Coordinate end
abstract type CartesianCoordinate <: Coordinate end
abstract type PolarCoordinate <: Coordinate end

struct Point2D{T} <: CartesianCoordinate
    x::T
    y::T
end

struct Point3D{T} <: CartesianCoordinate
    x::T
    y::T
    z::T
end

struct Polar2D{T} <: PolarCoordinate
    r::T
    θ::T
end
Tip: Unicode Characters

Julia has wonderful Unicode support, meaning that it’s not a problem to include characters like θ. The character can be typed in Julia editors by entering \theta and then pressing the TAB key on the keyboard.

Unicode is helpful for following conventions that you may be used to in math. For example, the math formula \(\text{circumference}(r) = 2 \times r \times \pi\) can be written in Julia with circumference(r) = 2 * r * π.

The names of the characters follow the LaTeX conventions, so you can search the internet for, e.g., “theta LaTeX” to find the appropriate name. Further, you can use the REPL help mode to find out how to enter a character if you can copy and paste it from somewhere:

help?> θ
"θ" can be typed by \theta<tab>

To constrain the types that could be used within our coordinates above, such as if we wanted the fields to all be Real-valued, we could modify the struct definitions with the <:Real annotation:

struct Point2D{T<:Real} <: CartesianCoordinate
    # ...
end

struct Point3D{T<:Real} <: CartesianCoordinate
    # ...
end

struct Polar2D{T<:Real} <: PolarCoordinate
    # ...
end
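To see the effect of such a constraint, here is a minimal self-contained sketch. `RealPoint` is a hypothetical name, chosen so that it does not clash with the types defined above.

```julia
# `RealPoint` is a hypothetical name chosen to avoid clashing with the types above
struct RealPoint{T<:Real}
    x::T
    y::T
end

int_point = RealPoint(1, 2)          # T is inferred as Int
float_point = RealPoint(1.5, 2.5)    # T is inferred as Float64

# Strings are not Real, so no constructor method matches
rejected_strings = try
    RealPoint("a", "b")
    false
catch err
    err isa MethodError
end

# Both fields share one parameter T, so mixing Int and Float64 is also rejected
rejected_mixed = try
    RealPoint(1, 2.5)
    false
catch err
    err isa MethodError
end
```

The second rejection is a consequence of both fields sharing the single parameter `T`, not of the `<:Real` constraint itself.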

5.4.8 Mutable structs

It is possible to define structs where the data can be modified - such a data field is said to be mutable because it can be changed or mutated. Here’s an example of what it would look like if we made Point2D mutable:

mutable struct Point2D{T}
    x::T
    y::T
end

You may find that this more naturally represents what you are trying to do. However, recall that an advantage of an immutable datatype is that costly memory doesn’t necessarily have to be allocated for it. So you may think that you’re being more efficient by re-using the same object… but it may not actually be faster. Again, more will be revealed in Chapter 9.
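The difference in behavior can be sketched as follows. `FixedPoint` and `MovablePoint` are hypothetical names used so that the block runs on its own without redefining Point2D.

```julia
# Hypothetical names, chosen so they do not clash with Point2D above
struct FixedPoint
    x::Float64
    y::Float64
end

mutable struct MovablePoint
    x::Float64
    y::Float64
end

fixed = FixedPoint(1.0, 2.0)
movable = MovablePoint(1.0, 2.0)

# Fields of a mutable struct can be reassigned in place
movable.x = 10.0

# Assigning to a field of an immutable struct throws an error
assignment_failed = try
    fixed.x = 10.0
    false
catch
    true
end

# The immutable alternative: build a new value from the old one
shifted = FixedPoint(fixed.x + 9.0, fixed.y)
```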

Tip: Financial Modeling Pro-tip

For financial models, it is best practice to default to immutable structs. Immutability prevents accidental modification of data, making your model’s state easier to reason about and debug. This is especially critical in complex models with many interacting components. Use mutable struct only when you have a specific reason to modify data in-place.

5.4.9 Constructors

Constructors are functions that create an instance of a data type (functions will be covered more generally later in the chapter). When we declare a struct, an implicit function is defined that takes the field values as arguments and returns an instance of the data type that was declared. In the following example, after we define the struct MyDate, Julia creates a function (also called MyDate) which takes three arguments and returns a value of type MyDate:

struct MyDate
    year::Int
    month::Int
    day::Int
end

methods(MyDate)
# 2 methods for type constructor:

Implicit constructors are nice in that you don’t have to define a default method yourself: the language does it for you. Sometimes there are reasons to want to control how an object is created, either for convenience or to enforce certain restrictions.

We can use an inner constructor (i.e. inside the struct block) to enforce restrictions:

struct MyDate
    year::Int
    month::Int
    day::Int

    # Inner constructor: validate the inputs before creating the object with `new`
    function MyDate(y, m, d)
        if !(m in 1:12)
            error("month is not between 1 and 12")
        elseif !(d in 1:31)
            error("day is not between 1 and 31")
        else
            return new(y, m, d)
        end
    end
end

Outer constructors are simply functions that have the same name as the data type but are not defined inside the struct block. Extending the MyDate example, say we want to provide a constructor for when no day is given, such that the date defaults to the 1st of the month:

function MyDate(y,m)
    return MyDate(y,m,1)
end
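Putting the two kinds of constructor together, a self-contained sketch might look like the following. `SimpleDate` is a hypothetical stand-in for MyDate so that the block can run on its own, and the short-circuit `||` form is one of several equivalent ways to write the validation.

```julia
# `SimpleDate` is a hypothetical stand-in for MyDate so this block runs on its own
struct SimpleDate
    year::Int
    month::Int
    day::Int

    # Inner constructor: validate before creating the object with `new`
    function SimpleDate(y, m, d)
        m in 1:12 || error("month is not between 1 and 12")
        d in 1:31 || error("day is not between 1 and 31")
        return new(y, m, d)
    end
end

# Outer constructor: default to the first day of the month
SimpleDate(y, m) = SimpleDate(y, m, 1)

first_of_month = SimpleDate(2024, 6)
full_date = SimpleDate(2024, 6, 15)

# An invalid month is rejected by the inner constructor
invalid_rejected = try
    SimpleDate(2024, 13, 1)
    false
catch err
    err isa ErrorException
end
```

Because the outer constructor calls the three-argument form, it goes through the same validation.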

5.5 Functions

Functions are a set of expressions that take inputs and return specified outputs.

5.5.1 Special Operators

Operators are the glue of expressions which combine values. We’ve already seen quite a few, but let’s develop a little bit of terminology for these functions.

Unary operators are operators which only take a single argument. Examples include the ! which negates a boolean value or - which negates a number:

!true, -5
(false, -5)

Binary operators take two arguments and are some of the most common functions we encounter, such as + or - or >:

1 + 2, 1 - 2, 1 > 2
(3, -1, false)

The above unary and binary operators are special kinds of functions which don’t require the use of parentheses. However, they can be written with parentheses for greater clarity:

!(true), -(5), +(1, 2), -(1, 2)
(false, -5, 3, -1)

In Julia, we distinguish between functions, which define behavior that maps a set of inputs to outputs, and methods, which are the specific implementations of that behavior. A single function can adapt its behavior to the arguments themselves. We have just seen the function - be used in two different ways: negation and subtraction, depending on whether it had one or two arguments given to it. In this way there is a conceptual hierarchy of functions that complements the hierarchy we have discussed in relation to types:

  • - is the overall function
  • -(x) is a unary function which negates its argument, while -(x,y) subtracts y from x
  • Specific methods are then created for each combination of concrete types: -(x::Float64) is a different method than -(x::Int)

Methods are specific compiled versions of the function for specific types. This is important because at a hardware level, operations for different types (e.g. integers versus floating point) differ considerably. By optimizing for the specific types Julia is able to achieve nearly ideal performance without the same sacrifices of other dynamic languages. We will develop more with respect to methods when we talk about dispatch in Chapter 8.

For example, factorial would be referred to as the function, while specific implementations are called methods. We can see all of the methods for any function with the methods function, like the following for factorial which has implementations taking into account the specialized needs for different types of arguments:

methods(factorial)
# 7 methods for generic function factorial from Base:

5.5.2 Defining Functions

Functions more generally are defined like so:

function functionname(arguments)
    # ... code that does things
end

Here’s an example which returns the difference between the highest and lowest values in a collection:

function value_range(collection)

    hi = maximum(collection)
    lo = minimum(collection)
    return hi - lo  # (1)
end

(1) return is optional but recommended to convey to readers of the program where you expect your function to terminate and return a value.

5.5.3 Defining Methods on Types

Here’s another example of a function which calculates the distance between a point and the origin:

function distance(point)  # (1)
    return sqrt(point.x^2 + point.y^2)  # (2)
end
distance (generic function with 1 method)

(1) A function block is declared with the name distance, which takes a single argument called point.
(2) We compute the distance formula for a point with x and y coordinates. The return keyword makes explicit what value the function will output.

Note

An alternate, simpler function syntax for distance would be:

distance(point) = sqrt(point.x^2 + point.y^2)

However, we might at this point note a flaw in our function’s definition if we think about the various Coordinates we defined earlier: our definition would currently only work for Point2D. For example, if we try a Point3D we will get the wrong answer:

distance(Point3D(1, 1, 1,))
1.4142135623730951

The above value should be \(\sqrt{3}\), or approximately \(1.73205\).

What we need to do is define a refined distance for each type, which we’ll call dist to distinguish from the earlier definition.

"""
    dist(point)

The euclidean distance of a point from the origin.
"""
dist(p::Point2D) = sqrt(p.x^2 + p.y^2)
dist(p::Point3D) = sqrt(p.x^2 + p.y^2 + p.z^2)
dist(p::Polar2D) = p.r
dist (generic function with 3 methods)

Now our result will be correct:

dist(Point3D(1, 1, 1,))
1.7320508075688772

This is referred to as dispatching on the argument types. Julia looks up the most specific method of a function for the given argument types, falling back to a more generic implementation if one is defined.

In Chapter 8 we will see how dispatch (single and multiple) can provide very nice abstractions to simplify the design of a model.
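The lookup of the most specific method, with a generic fallback, can be sketched with a toy function. `describe` is a hypothetical function used only to illustrate method selection.

```julia
# `describe` is a hypothetical function used only to illustrate method selection
describe(x) = "something"                 # generic fallback for any type
describe(x::Number) = "a number"
describe(x::Integer) = "an integer"       # more specific than Number

describe(3)        # uses the Integer method
describe(2.5)      # uses the Number method: Float64 is a Number but not an Integer
describe("text")   # falls back to the generic method

length(methods(describe))
```

Each call picks the method whose signature is the tightest match for the argument's type, so adding the `Integer` method does not change what happens for `2.5` or `"text"`.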

Note: Docstrings (Documentation Strings)

Notice the string preceding the definition of dist. In Julia, putting a string literal ("...") or a triple-quoted string literal ("""...""") right above a definition allows Julia to recognize the string as documentation and provide it to the user in help mode (Section 21.4.1) and/or have a documentation tool create a webpage or PDF documentation resource.

Caution: Defining Methods for Parametric Types

We learned that Float64 <: Real in the type hierarchy. However, note that Vector{Float64} is not a subtype of Vector{Real}. This is called being invariant in type theory… but for our purposes this just practically means that when we define a method we need to specify that we want it to apply to all subtypes. (Tuple types are the exception: they are covariant, so Tuple{Float64} <: Tuple{Real} is true.)

For example, myfunction(x::Vector{Real}) would not be called if x was a Vector{Float64} because it’s not a subtype of Vector{Real}. To act the way we want, we would define the method with the signature myfunction(x::Vector{<:Real}) or myfunction(x::Vector{T}) where {T<:Real}.
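These subtype relationships can be checked directly. The sketch below uses Vector as the parametric container; the function names `total_strict`, `total_flexible`, and `total_where` are hypothetical, and `applicable` is the Base function that reports whether a method exists for the given arguments.

```julia
# Parametric types such as Vector are invariant in their parameter
Vector{Float64} <: Vector{Real}       # false
Vector{Float64} <: Vector{<:Real}     # true

# Hypothetical functions that differ only in their signatures
total_strict(x::Vector{Real}) = sum(x)
total_flexible(x::Vector{<:Real}) = sum(x)
total_where(x::Vector{T}) where {T<:Real} = sum(x)   # equivalent, and gives T a name

data = [1.0, 2.0, 3.0]                # a Vector{Float64}

total_flexible(data)
total_where(data)

# The strict signature has no method that accepts a Vector{Float64}
strict_applies = applicable(total_strict, data)
```

The `where` form is useful when the body needs to refer to `T`, for example to create a zero of the right type with `zero(T)`.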

5.5.4 Keyword Arguments

Keyword arguments are arguments that are passed to a function by name rather than by position. In the following example, filepath is a positional argument while the two arguments after the semicolon (;) are keyword arguments.

function read_data(filepath; normalize_names, has_header_row)
    # ... function would be defined here
end

The function would need to be called and have the two keyword arguments specified:

read_data("results.csv"; normalize_names=true, has_header_row=false)

5.5.5 Default Arguments

We are able to define default arguments for both positional and keyword arguments via an assignment expression in the function signature. For example, we can make it so that the user need not specify all the options for each call. Modifying the prior example so that typical CSVs work with less customization from the user:

function read_data(filepath;
    normalize_names = true,
    has_header_row = false
)
    # ... function would be defined here
end

This is a simplified example, but if you look at the documentation for most data import packages you’ll see a lot of functionality defined via keyword arguments which have sensible defaults so that most of the time you need not worry about modifying them.
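Since read_data above has no body, here is a small runnable sketch of the same calling conventions. `summarize` and its arguments are hypothetical; it combines one required positional argument, one positional argument with a default, and two keyword arguments with defaults.

```julia
# `summarize` is a hypothetical function used to illustrate the calling conventions
function summarize(xs, scale = 1.0; places = 2, label = "total")
    total = round(scale * sum(xs); digits = places)
    return string(label, ": ", total)
end

all_defaults = summarize([1.234, 2.345])
scaled = summarize([1.234, 2.345], 10.0)            # positional default overridden
one_place = summarize([1.234, 2.345]; places = 1)   # one keyword overridden

# Keyword arguments can be given in any order
relabeled = summarize([1.234, 2.345], 10.0; label = "sum", places = 0)
```

Positional arguments must be supplied in order, so overriding `scale` requires passing it second, whereas keywords are identified by name.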

5.5.6 Anonymous Functions

Anonymous functions are functions that have no name and are used in contexts where the name does not matter. The syntax is x -> ...expression with x.... As an example, say that we want to create a vector from another where each element is squared. map applies a function to each member of a given collection:

v = [4, 1, 5]
map(x -> x^2, v)  # (1)
3-element Vector{Int64}:
 16
  1
 25

(1) The x -> x^2 is the anonymous function in this example.

They are often used when constructing something from another value, or defining a function within optimization or solving routines.
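A few more forms that anonymous functions commonly take are sketched below, using the same vector as above. The variable names are illustrative.

```julia
v = [4, 1, 5]

# Pass an anonymous function as a predicate
large = filter(x -> x > 2, v)

# Multiple arguments are wrapped in parentheses
pairwise_sum = map((x, y) -> x + y, v, [10, 20, 30])

# Anonymous functions can be bound to a name like any other value
square = x -> x^2
square(3)

# A `do` block is another way to write a longer anonymous function
# that is passed as the first argument
labels = map(v) do x
    x > 2 ? "large" : "small"
end
```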

5.5.7 First Class Nature

Functions in many languages including Julia are first class which means that functions can be assigned and moved around like data variables.

In this example, we have a general approach to calculate the error of a modeled result compared to a known truth. In this context, there are different ways to measure error of the modeled result and we can simplify the implementation of loss by keeping the different kinds of error defined separately. Then, we can assign a function to a variable and use it as an argument to another function:

function square_error(guess, correct)
    (correct - guess)^2
end

function abs_error(guess, correct)
    abs(correct - guess)
end

# obs meaning "observations"
function loss(modeled_obs,
    actual_obs,
    loss_function  # (1)
)
    sum(
        loss_function.(modeled_obs, actual_obs)
    )
end

let  # (2)
    a = loss([1, 5, 11], [1, 4, 9], square_error)  # (3)
    b = loss([1, 5, 11], [1, 4, 9], abs_error)
    a, b
end
(5, 3)

(1) loss_function is a variable that will refer to a function instead of data.
(2) Using a let block here is good practice to not have temporary variables a and b scattered around our workspace.
(3) Using a function as an argument to another function is an example of functions being treated as “first class”.

5.5.8 Broadcasting

Looking at the prior definition of dist, what if we wanted to compute the squared distance from the origin for a set of points? If those points are stored in an array, we can broadcast functions to all members of a collection at the same time. This is accomplished using the dot-syntax as follows:

points = [Point2D(1, 2), Point2D(3, 4), Point2D(6, 7)]
dist.(points) .^ 2
3-element Vector{Float64}:
  5.000000000000001
 25.0
 85.0

Let’s unpack that a bit more:

  1. The . in dist.(points) tells Julia to apply the function dist to each element in points.
  2. The . in .^ tells Julia to square each value as well.

Why broadcasting is useful:

  1. Without needing any redefinition of functions we were able to transform the function dist and exponentiation (^) to work on a collection of data. This means that we can keep our code simpler and easier to reason about (operating on individual things is easier than adding logic to handle collections of things).
  2. When multiple broadcasted operations are joined together, Julia can fuse the operations so that each operation is performed at the same time instead of each step sequentially. That is, if the operation were not fused, the computer would first calculate dist for each point, and then apply the square on the collection of distances. When it’s fused, the operations can happen at the same time without creating an interim set of values.
Note

For readers coming from numpy-flavored Python or R, broadcasting will feel familiar because it resembles the array-oriented behavior of those two languages. Once you feel comfortable with Julia in general, you may find yourself relaxing and relying less on array-oriented design and instead picking whichever iteration paradigm feels most natural for the problem at hand: loops or broadcasting over arrays.
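To make the fusion point concrete, the sketch below writes one element-wise calculation three ways. The formula and variable names are made up for illustration; `@.` is the Base macro that adds a dot to every call and operator in an expression.

```julia
xs = [1.0, 2.0, 3.0]

# Every dotted call in one expression is fused into a single loop
fused = sqrt.(xs .^ 2 .+ 1) .* 2

# The @. macro adds the dots to every call and operator in the expression
with_macro = @. sqrt(xs^2 + 1) * 2

# The same computation written as an explicit comprehension
looped = [sqrt(x^2 + 1) * 2 for x in xs]

# Broadcasting into an existing array with .= avoids allocating a new result
out = similar(xs)
out .= sqrt.(xs .^ 2 .+ 1) .* 2
```

All four results hold the same values; which form to use is largely a question of readability.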

5.5.8.1 Broadcasting Rules

What happens if one of the collections is not the same size as the others? When broadcasting, singleton dimensions (i.e. the 1 in 1xN, “1-by-N”, dimensions) will be expanded automatically to match the corresponding dimension of the other array. For example, if you have a single element and a one dimensional array, the single element will be expanded in the function call without using any additional memory.

The rules with an MxN and a PxQ array, applied to each dimension separately:

  • M and P need to be the same, or one of them needs to be 1, and
  • N and Q need to be the same, or one of them needs to be 1

Some examples might clarify. This 1x1 element is being combined with a 4x1, so there is a compatible dimension (N and Q match, M is 1):

2 .^ [0, 1, 2, 3]
4-element Vector{Int64}:
 1
 2
 4
 8

Here, this 1x3 works with the 2x3 (N and Q match, M is 1)

[1 2 3] .+ [1 2 3; 4 5 6]
2×3 Matrix{Int64}:
 2  4  6
 5  7  9

This 3x1 isn’t compatible with this 2x3 array (M and P differ, and neither is 1):

# throws a DimensionMismatch error
[1, 2, 3] .+ [1 2 3; 4 5 6]

This 2x2 isn’t compatible with the 2x3 (M and P match, but N and Q differ and neither is 1):

# throws a DimensionMismatch error
[1 2; 3 4] .+ [1 2 3; 4 5 6]
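Sizes are compared dimension by dimension, which a short sketch can confirm with `size` and a caught error. The array names are illustrative assumptions.

```julia
column = [1, 2, 3]          # size (3,), treated like 3x1
row = [10 20]               # size (1, 2)
matrix = [1 2 3; 4 5 6]     # size (2, 3)

# A column and a row expand against each other into a 3x2 result
grid = column .+ row
size(grid)

# A 3-element column cannot be stretched to match 2 rows
mismatch_caught = try
    column .+ matrix
    false
catch err
    err isa DimensionMismatch
end

# Transposing the matrix to 3x2 makes the first dimensions agree
fixed = column .+ transpose(matrix)
size(fixed)
```

Checking `size` on each operand before broadcasting is usually the quickest way to diagnose a `DimensionMismatch`.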

5.5.8.2 Not Broadcasting

What if you do not want the array to be used element-wise when broadcasting? Then you can wrap the array in a Ref, which is used in broadcasting to make the array be treated like a scalar. In the example below, in(needle,haystack) searches a collection (haystack) for an item (needle) and returns true if the item is in the collection and false otherwise:

in(4, [1 2 3; 4 5 6])
true

What if we had an array of things (“needles”) that we wanted to search for? By default, broadcasting pairs up the elements of both arrays, so each needle is compared against individual elements of the haystack rather than against the haystack as a whole:

in.([1, 9], [1 2 3; 4 5 6])
2×3 BitMatrix:
 1  0  0
 0  0  0

Effectively, each row of the result above comes from broadcasting one needle across one row of the matrix:

in.(1, [1, 2, 3]) # the first row of the above result
in.(9, [4, 5, 6]) # the second row of the above result

If we were expecting Julia to return [1,0] (that the first needle is in the haystack but the second needle is not), then we need to tell Julia not to broadcast along the second array with Ref:

in.([1, 9], Ref([1 2 3; 4 5 6]))
2-element BitVector:
 1
 0

5.5.9 Passing by Sharing

We often want to share data between scopes, such as between modules or by passing something into a function’s scope. Arguments to a function in Julia are passed-by-sharing which means that an outside variable can be mutated from within a function. We can modify the array in the outer scope (scope discussed later in this chapter) from within the function. In this example, we modify the array that is assigned to v by doubling each element:

v = [1, 2, 3]

function double!(v)
    for i in eachindex(v)
        v[i] = 2 * v[i]
    end
end

double!(v)

v
3-element Vector{Int64}:
 2
 4
 6
Tip

Convention in Julia is that a function that modifies its arguments has a ! in its name, and we follow this convention in double! above. Another example would be the built-in function sort! which will sort an array in-place without allocating a new array to store the sorted values.

We won’t discuss all potential ways that programming languages can behave in this regard, but an alternative that one may have seen before (e.g. in Matlab) is pass-by-value where a modification to an argument only modifies the value within the scope. Here’s how to replicate that in Julia by copying the value before handing it to a function. This time, v is not modified because we only passed a copy of the array and not the array itself:

v = [1, 2, 3]
double!(copy(v))
v
3-element Vector{Int64}:
 1
 2
 3
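A subtlety of pass-by-sharing is the difference between mutating the object an argument refers to and rebinding the argument's name. The sketch below uses hypothetical helper functions to show both cases.

```julia
# Mutates the array the caller passed in (hypothetical helper)
function zero_first!(a)
    a[1] = 0
    return nothing
end

# Rebinds the local name `a` to a new array; the caller's array is untouched
function try_to_replace(a)
    a = [0, 0, 0]
    return nothing
end

original = [1, 2, 3]

try_to_replace(original)
after_rebinding = copy(original)   # still [1, 2, 3]

zero_first!(original)
after_mutation = copy(original)    # now [0, 2, 3]

# Immutable values such as numbers cannot be changed by a function at all
function try_to_increment(n)
    n += 1
    return nothing
end
count_value = 5
try_to_increment(count_value)
```

Only `zero_first!` changes what the caller sees, because it is the only function that modifies the shared object rather than a local name.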

5.5.10 The Function Type

In Julia, every function is an object with its own unique type. Function is the abstract supertype of all functions. You can see this by inspecting the type of a function:

typeof(+)
typeof(+) (singleton type of function +, subtype of Function)

The output, typeof(+), indicates that the function + has its own special type. This specific type is a subtype of the abstract Function type:

typeof(+) <: Function
true

This is true for any function, including ones you define:

function my_func(x)
    x + 1
end

typeof(my_func) <: Function
true
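One practical use of the Function type is to restrict an argument to functions. In the sketch below, `apply_twice` and `add_one` are hypothetical names, and `applicable` is the Base function that checks whether a matching method exists.

```julia
# `apply_twice` is a hypothetical helper; `f::Function` restricts the first argument to functions
apply_twice(f::Function, x) = f(f(x))

add_one(x) = x + 1

from_named = apply_twice(add_one, 1)        # a named function
from_anonymous = apply_twice(x -> 2x, 3)    # anonymous functions are Functions too

# Each function has its own concrete type
same_type = typeof(add_one) == typeof(apply_twice)

# A non-function first argument does not match the method
accepts_number = applicable(apply_twice, 5, 1)
```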

5.6 Scope

In projects of even modest complexity, it can be challenging to come up with unique identifiers for different functions or variables. Scope refers to the bounds for which an identifier is available. We will often talk about the local scope that’s inside some expression that creates a narrowly defined scope (such as a function or let block) or the global scope, which is the top level scope that contains everything else inside of it (each module has its own global scope). Here are a few examples to demonstrate scope.

```{julia}
i = 1  # (1)
let  # (2)
    j = 3  # (3)
    i + j
end
```

(1) i is defined in the global scope and would be available to other inner scopes.
(2) The let ... end block creates a local scope which inherits the defined global scope definitions.
(3) j is only defined in the local scope created by the let block.

In fact, if we try to use j outside of the scope defined above we will get an error:

j
UndefVarError: UndefVarError(:j, Main.Notebook)
UndefVarError: `j` not defined in `Main.Notebook`
Suggestion: check for spelling errors or missing imports.
Tip

let blocks are a great way to organize your code in bite-sized chunks or to be able to re-use common variable names without worrying about conflict. Here’s an example of using let blocks to:

  1. Perform intermediate calculations without fear of returning a partially modified variable
  2. Re-use common variable names

bonds = let
    df = CSV.read("bonds.csv", DataFrame)
    df.issuer = lookup_issuer(df.CUSIP)
    df
end

mortgages = let
    df = CSV.read("mortgages.csv", DataFrame)
    df.issuer = lookup_issuer(df.CUSIP)
    df
end

If we were running this interactively (e.g. step-by-step in VS Code, the REPL, or notebooks) then each of these two code blocks runs to completion and independently of the other. The short, descriptive name df is reused, but there’s no chance of conflict. We also can’t easily run part of a block of code (let ... end) and get a partially evaluated result (e.g. getting the dataframe before it has been appropriately modified to add the issuer column).

Here is an example with functions:

x = 2
base = 10
foo() = base^x  # (1)
foo(x) = base^x  # (2)
foo(x, base) = base^x  # (3)

foo(), foo(4), foo(4, 4)
(100, 10000, 256)

(1) Both base and x are inherited from the global scope.
(2) x is based on the local scope from the function’s arguments and base is inherited from the global scope.
(3) Both base and x are defined in the local scope via the function’s arguments.

In Julia, it’s generally best to explicitly pass arguments to functions rather than relying on them coming from an inherited scope. This is more straightforward and easier to reason about, and it also allows Julia to optimize the function to run faster because all relevant variables coming from outside the function are defined at the function’s entry point (the arguments).
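A small sketch shows why relying on an inherited variable is harder to reason about. The discounting functions and rates are hypothetical, with values chosen so the arithmetic is exact.

```julia
rate = 0.25

# Reads `rate` from the enclosing scope (hypothetical function names)
discount_implicit(cashflow) = cashflow / (1 + rate)

# Receives everything it needs through its arguments
discount_explicit(cashflow, rate) = cashflow / (1 + rate)

before = discount_implicit(150.0)

rate = 0.5                         # rebinding the outer variable...
after = discount_implicit(150.0)   # ...silently changes the implicit version

explicit = discount_explicit(150.0, 0.25)
```

The same call, `discount_implicit(150.0)`, returns two different answers depending on when it runs, while the result of `discount_explicit` depends only on what is passed in.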

5.6.1 Modules and Namespaces

Modules are ways to encapsulate related functionality together. Another benefit is that the variables inside the module don’t “pollute” the namespace of your current scope. Here’s an example:

module Shape  # (1)

struct Triangle{T}
    base::T
    height::T
end

function area(t::Triangle)  # (2)
    return 1 / 2 * t.base * t.height
end
end

t = Shape.Triangle(4, 2)  # (3)
area = Shape.area(t)  # (4)
4.0

(1) module defines an encapsulated block of code which is anchored to the namespace Shape.
(2) Here, area is a function defined within the Shape module.
(3) Outside of the Shape module, we can access the definitions inside via the Module.identifier syntax.
(4) Here, area is a variable in our global scope that does not conflict with the area defined within the Shape module. If Shape.area were not within a module then when we said area = ... we would have reassigned area to no longer refer to the function and instead would refer to the area of our triangle.

Note

Summarizing related terminology:

  • A module is a block of code such as module MySimulation ... end

  • A package is a module that has a specific set of files and associated metadata. Essentially, it’s a module with a Project.toml file that has a name and unique identifier listed, and a file in a src/ directory called MySimulation.jl

    • Library is just another name for a package, and the most common context this comes up is when talking about the packages that are bundled with Julia itself called the standard library (stdlib).

  1. Said differently, computer science may contemplate ideas and abstractions more generally than a specific implementation, as in mathematics where a theorem may be proved (\(a^2 + b^2 = c^2\)) without resorting to specific numeric examples (\(3^2 + 4^2 = 5^2\)).↩︎

  2. The Ship of Theseus problem specifically refers to a legendary ancient Greek ship, owned by the hero Theseus. The paradox arises from the scenario where, over time, each wooden part of the ship is replaced with identical materials, leading to the question of whether the fully restored ship is still the same ship as the original. The Ship of Theseus problem is a thought experiment in philosophy that explores the nature of identity and change. It questions whether an object that has had all of its components replaced remains fundamentally the same object.↩︎

  3. These binary representations correspond to B and 66 with the ASCII character set and 8-bit integer encodings respectively, discussed later in this chapter.↩︎

  4. Some distinctions you may encounter: in short-form, “kb” means kilobits while the upper-case “B” in “kB” means kilobytes. Also confusingly, sometimes the “k” can be binary or decimal - because computers speak in binary, a binary “k” means 1024 (equal to 2^10) instead of the usual decimal 1000. In most computer contexts, the binary (multiples of 1024) is more common.↩︎

  5. The term floating point refers to the fact that the number’s radix (decimal) point can “float” between the significant digits of the number.↩︎

  6. That is, it reads the code input from the user, evaluates what code was given to it, prints the result of the input to the screen, and loops through the process again.↩︎

  7. This means that their central processing units (CPUs) operate natively on data and memory addresses that are 64 bits long.↩︎

  8. Accurate only to a limited precision, as described in Section 5.4.1.↩︎

  9. Whether an index starts at 1 or 0 is sometimes debated. Zero-based indexing is natural in the context of low-level programming, which deals with bits and positional offsets in computer memory. For higher level programming one-based indexing is more natural: in a set of data stored in an array, it is much more natural to reference the first (through \(n^{th}\)) datum instead of the zeroth (through \((n-1)^{th}\)) datum.↩︎

  10. Arrays in Julia can actually be indexed with an arbitrary starting point: see the package OffsetArrays.jl↩︎

  11. The triangular numbers (sum of integers from \(1\) to \(n\)) are:\[ T_n = \sum_{k=1}^n k = 1 + 2 + \cdots + n = \frac{n^2 + n}{2} = \frac{n(n+1)}{2} = \binom{n+1}{2} \]↩︎

  12. Whether the last number is in the resulting range depends on whether the step evenly divides the distance between the start and the end of the range.↩︎

  13. Under the hood, strings are essentially a vector of characters but there are complexities with character encoding that don’t allow a lossless conversion to individual characters of uniform bit length. This is for historical compatibility reasons and to avoid making most documents’ file sizes larger than they need to be.↩︎

  14. What this means will be explained in Chapter 9.↩︎

  15. Missing and Nothing are singleton types, which means that there is only a single value (missing and nothing, respectively) that each type can take on.↩︎