19  Matrices and Their Uses

Yun-Tien Lee

“The essence of mathematics is not to make simple things complicated, but to make complicated things simple.” ― Stan Gudder

19.1 Chapter Overview

Matrices and their myriad uses: reframing problems through the lens of linear algebra, an intuitive refresher on the applicable mathematics, and recurring patterns of matrix operations in financial modeling.

19.2 Matrix manipulation

We first review basic matrix manipulation routines before going into more advanced topics.

19.2.1 Addition and subtraction

Think of each matrix as a data grid (like a spreadsheet). Adding or subtracting values element by element is analogous to combining two sets of financial figures—such as merging cash inflows and outflows.

Example: In a cash flow projection, suppose each element of A and B represents the impact of a specific scenario. Adding the matrices gives the combined variation of A and B, while subtracting them shows how the variations in A and B differ.

# Define two matrices
A = [1 2 3;
    4 5 6;
    7 8 9]
B = [9 8 7;
    6 5 4;
    3 2 1]
# Perform element-wise matrix addition and subtraction
C = A .+ B
D = A .- B
# Display the result
println("Result of matrix addition:")
println(C)
println("Result of matrix subtraction:")
println(D)
Result of matrix addition:
[10 10 10; 10 10 10; 10 10 10]
Result of matrix subtraction:
[-8 -6 -4; -2 0 2; 4 6 8]

19.2.2 Transpose

Transposing a matrix is akin to flipping a dataset over its diagonal—turning rows into columns. This operation is useful when aligning data for regression or matching dimensions in financial models.

Example: Converting a time-series (rows as time points) in A into a format suitable for cross-sectional analysis (columns as different variables) in B.

# Define a matrix
A = [1 2 3;
    4 5 6;
    7 8 9]
# Perform matrix transpose
B = A'
# Display the result
println("Result of matrix transpose:")
println(B)
Result of matrix transpose:
[1 4 7; 2 5 8; 3 6 9]
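
One place the transpose shows up constantly is ordinary least squares, where the normal equations combine a design matrix with its transpose. A minimal sketch with a made-up design matrix X and response y (illustrative values only):

# Made-up regression data: 5 observations, an intercept column plus two regressors
X = [1.0 2.0 1.0;
    1.0 3.0 0.0;
    1.0 5.0 2.0;
    1.0 7.0 1.0;
    1.0 9.0 3.0]
y = [1.1, 1.9, 3.2, 3.8, 5.1]
# Normal equations: solve (X'X) β = X'y for the coefficient vector β
β = (X' * X) \ (X' * y)
println("Estimated coefficients:")
println(β)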

19.2.3 Determinant

The determinant acts as a “volume-scaling” factor. It indicates how much a linear transformation stretches or compresses space. A zero determinant signals that the transformation collapses the space into a lower dimension, implying that the matrix cannot be inverted.

Given a matrix A \[ \mathbf{A} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix} \]

the determinant of matrix A can be calculated by a Laplace expansion along the first row \[ \det(\mathbf{A}) = \sum_{j=1}^{n} (-1)^{1+j} a_{1j} \det(\mathbf{A}_{1j}) \] where \(\mathbf{A}_{1j}\) denotes the submatrix obtained from A by deleting the first row and the j-th column.
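
To make the expansion concrete, here is a deliberately naive recursive implementation (the helper name laplace_det is ours, for illustration only; the det function from LinearAlgebra used below relies on an LU factorization and is far more efficient):

# Naive Laplace expansion along the first row, for illustration only
function laplace_det(A::AbstractMatrix)
    n = size(A, 1)
    n == 1 && return A[1, 1]
    total = zero(eltype(A))
    for j in 1:n
        # Minor: drop the first row and the j-th column
        minor = A[2:end, [k for k in 1:n if k != j]]
        total += (-1)^(1 + j) * A[1, j] * laplace_det(minor)
    end
    return total
end

println(laplace_det([1 2 3; 5 10 20; 7 8 9]))  # 30, matching det(A) below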

Example: In portfolio theory, a near-zero determinant of a covariance matrix might indicate multicollinearity among assets.

using LinearAlgebra

# Define a matrix
A = [1 2 3;
    5 10 20;
    7 8 9]
# Perform matrix determinant calculation
B = det(A)
# Display the result
println("Result of matrix determinant:")
println(B)
Result of matrix determinant:
30.000000000000007

19.2.4 Trace

The trace, being the sum of the diagonal elements, offers a quick one-number summary of a matrix; for a covariance matrix, for example, it equals the total variance across all variables.

Example: In risk analysis, the trace of a covariance matrix may provide insights into the overall market volatility captured by the diagonal elements.

using LinearAlgebra

# Define a matrix
A = [1 2 3;
    5 10 20;
    7 8 9]
# Perform matrix trace calculation
B = tr(A)
# Display the result
println("Result of matrix trace:")
println(B)
Result of matrix trace:
20
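
Since the trace is simply the sum of the diagonal entries, the same value can be recovered directly from the diagonal (a quick check of the definition):

using LinearAlgebra

A = [1 2 3;
    5 10 20;
    7 8 9]
# tr(A) is equivalent to summing the diagonal entries explicitly
println(tr(A) == sum(diag(A)))  # true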

19.2.5 Norm

A matrix norm measures the “size” or “energy” of the matrix. It generalizes the concept of vector length to matrices, quantifying the overall magnitude.

The Frobenius norm of a matrix is defined as: \[ \|\mathbf{A}\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |A_{ij}|^2} \]

Example: A common use of norms is error analysis, where the norm of the difference between two matrices measures how far an approximation has deviated from the true values. Another common use in machine learning is regularization, where a norm penalty on the model parameters discourages overly large weights and guides the parameter updates during training.

using LinearAlgebra

# Define a matrix
A = [1 2 3;
    4 5 6;
    7 8 9]
# Perform matrix norm calculation
B = norm(A)
# Display the result
println("Result of matrix norm:")
println(B)
Result of matrix norm:
16.881943016134134
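
As a check on the Frobenius definition above, the same value can be computed element by element. Note that norm(A) on a matrix returns the Frobenius norm, while opnorm(A) gives the induced operator norm:

using LinearAlgebra

A = [1 2 3;
    4 5 6;
    7 8 9]
# Frobenius norm computed directly from the definition
frobenius = sqrt(sum(abs2, A))
println(frobenius ≈ norm(A))  # true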

19.2.6 Multiplication

Matrix multiplication (non-element-wise) represents the composition of linear transformations. It’s like applying a sequence of financial adjustments—first transforming the data with one factor and then modifying it with another. Other applications include:

  • Transforming asset returns by a matrix representing factor loadings to obtain risk contributions.

  • Neural network construction. Each layer of a neural network multiplies its inputs by a weight matrix, so matrix multiplication is fundamental to both training and inference.

  • Systems of linear equations. Many real-world problems reduce to solving systems of linear equations.

# Define two matrices
A = [1 2 3;
    4 5 6;
    7 8 9]
B = [9 8 7;
    6 5 4;
    3 2 1]
# Perform non-element-wise matrix multiplication
C = A * B
# Display the result
println("Result of non-element-wise matrix multiplication:")
println(C)
Result of non-element-wise matrix multiplication:
[30 24 18; 84 69 54; 138 114 90]
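
The composition interpretation can be verified directly: applying A to a vector and then B gives the same result as applying the single composed matrix B * A. A minimal sketch with an arbitrary vector x:

A = [1 2 3;
    4 5 6;
    7 8 9]
B = [9 8 7;
    6 5 4;
    3 2 1]
x = [1.0, 0.5, -1.0]  # an arbitrary vector of inputs
# Applying A and then B equals applying the composed transformation B * A
println(B * (A * x) ≈ (B * A) * x)  # true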

On the other hand, element-wise multiplication multiplies corresponding elements directly (like applying a weight matrix).

Example: Adjusting individual cash flow items by their respective risk weights in stress testing.

# Define two matrices
A = [1 2 3;
    4 5 6;
    7 8 9]
B = [9 8 7;
    6 5 4;
    3 2 1]
# Perform element-wise matrix multiplication
C = A .* B
# Display the result
println("Result of element-wise matrix multiplication:")
println(C)
Result of element-wise matrix multiplication:
[9 16 21; 24 25 24; 21 16 9]

19.2.7 Inversion

Matrix inversion “reverses” a transformation. If a matrix transforms one set of financial assets into another state, its inverse would bring them back.

Example: In solving linear systems for equilibrium pricing, obtaining the inverse of the coefficient matrix allows you to revert to the original asset prices.

# Define a matrix
A = [1 2; 3 4]
# Compute the inverse of the matrix
A_inv = inv(A)
# Display the result
println("Inverse of matrix A:")
println(A_inv)
Inverse of matrix A:
[-1.9999999999999996 0.9999999999999998; 1.4999999999999998 -0.4999999999999999]
Note

For a matrix to be invertible, it must meet two criteria:

  • Square matrix. The matrix must have the same number of rows and columns.
  • Non-zero determinant. The determinant of the matrix must be non-zero.
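
In practice, the inverse is rarely formed explicitly. To solve a linear system A x = b, Julia's backslash operator is faster and numerically more stable than multiplying by inv(A); a minimal sketch:

A = [1 2; 3 4]
b = [5.0, 6.0]
# Solve A * x = b directly, without computing inv(A)
x = A \ b
println(x)
println(A * x ≈ b)  # true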

19.3 Matrix decomposition

19.3.1 Eigenvalues

Eigenvalue decomposition, also known as eigendecomposition, is a matrix factorization that decomposes a square matrix into its eigenvectors and eigenvalues, where each eigenpair satisfies \[ \mathbf{A}\mathbf{v} = \lambda \mathbf{v} \] This technique uncovers the intrinsic “modes” or principal directions in a dataset: the eigenvalues indicate the strength of each mode, while the eigenvectors give the direction or pattern associated with that strength. Eigenvalues and eigenvectors are fundamental concepts in linear algebra, and their key roles include:

  • Eigenvalues help in analyzing how linear transformations affect vectors in a vector space.
  • Eigenvalues facilitate the diagonalization of matrices and simplify the calculations.
  • In systems of differential equations, eigenvalues help determine the stability of equilibrium points.
  • In finance, eigenvalues identify the main factors driving the variance of a set of asset returns, which is critical for risk management and for stress testing portfolios.
  • In graph theory, eigenvalues of the adjacency matrix provide insights into the properties of the graph, such as connectivity, stability, and clustering.
  • Many algorithms in data science, like clustering and factorization methods, rely on eigenvalues to identify patterns and reduce dimensionality, which enhances computational efficiency and interpretability.
using LinearAlgebra

# Create a square matrix
A = [1 2 3;
    4 5 6;
    7 8 9]
# Perform eigenvalue decomposition
eigen_A = eigen(A)
# Extract eigenvalues and eigenvectors
λ = eigen_A.values
V = eigen_A.vectors

# Display the results
println("Original Matrix:")
println(A)
println("\nEigenvalues:")
println(λ)
println("\nEigenvectors:")
println(V)
Original Matrix:
[1 2 3; 4 5 6; 7 8 9]

Eigenvalues:
[-1.1168439698070434, -8.582743335036247e-16, 16.11684396980703]

Eigenvectors:
[-0.7858302387420671 0.4082482904638635 -0.2319706872462857; -0.0867513392566285 -0.8164965809277261 -0.5253220933012335; 0.6123275602288101 0.4082482904638627 -0.8186734993561815]
Note

Eigenvalues are only defined for square matrices, i.e. matrices with the same number of rows and columns.
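
As a sanity check on the decomposition above, the eigenpairs must satisfy A v = λ v, or in matrix form A V = V Λ with Λ the diagonal matrix of eigenvalues:

using LinearAlgebra

A = [1 2 3;
    4 5 6;
    7 8 9]
F = eigen(A)
# Verify A * V ≈ V * Diagonal(λ), i.e. A * v = λ * v for each eigenpair
println(A * F.vectors ≈ F.vectors * Diagonal(F.values))  # true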

19.3.2 Singular values

Singular value decomposition (SVD) breaks a matrix A into three matrices, \[ \mathbf{A} = \mathbf{U} \boldsymbol{\Sigma} \mathbf{V}^\top \] where U holds the left singular vectors (analogous to the primary features), Σ is a diagonal matrix of singular values (capturing the importance of each component), and V holds the right singular vectors (describing how the features interact). Singular values are key to:

  • Matrix factorization, which simplifies many matrix operations, making it easier to analyze and manipulate data.
  • Dimensionality reduction. This is particularly useful in high-dimensional data scenarios, where reducing dimensions helps eliminate noise and improve computational efficiency.
  • SVD can be used for data compression, particularly in image processing.
  • SVD helps filter out noise in data analysis.
  • SVD provides a robust method for solving linear equations, particularly when the matrix is ill-conditioned or singular.
  • In machine learning, SVD helps extract important features from datasets.
  • SVD provides insights into the relationships within data. The singular values indicate the strength of the relationship, while the singular vectors offer a way to visualize and interpret those relationships.
using LinearAlgebra

# Create a random matrix
A = rand(4, 3)
# Perform Singular Value Decomposition (SVD)
U, Σ, V = svd(A)
# U: Left singular vectors
# Σ: Vector of singular values (wrapped in Diagonal() below for reconstruction)
# V: Right singular vectors (the reconstruction uses the transpose V')
# Reconstruct original matrix
A_reconstructed = U * Diagonal(Σ) * V'

# Display the results
println("Original Matrix:")
println(A)
println("\nLeft Singular Vectors:")
println(U)
println("\nSingular Values:")
println(Σ)
println("\nRight Singular Vectors:")
println(V)
println("\nReconstructed Matrix:")
println(A_reconstructed)
Original Matrix:
[0.5007618114416251 0.38164807936424316 0.565467532007918; 0.6141067393084273 0.8664751813175299 0.5171314041778887; 0.8264862204021687 0.8634101183424749 0.0029527900648631533; 0.5226928043290482 0.11709986845477349 0.8717720160766714]

Left Singular Vectors:
[-0.4192185697014721 -0.2571978343996437 0.008815372281681305; -0.5970891216177052 0.10743252252358972 -0.7552358624092278; -0.5423566385190048 0.6344036061632345 0.5450686994826252; -0.41664091051841623 -0.7209990232805003 0.3639247095030027]

Singular Values:
[1.9493714212901558, 0.8547557200151304, 0.2373139990152151]

Right Singular Vectors:
[-0.6374519419604103 0.09902831643955806 0.764099741024723; -0.6127214439967609 0.5361182687397203 -0.5806458765805965; -0.46714821390799105 -0.8383141382958251 -0.2810728584831737]

Reconstructed Matrix:
[0.5007618114416252 0.3816480793642434 0.5654675320079182; 0.6141067393084277 0.86647518131753 0.5171314041778889; 0.8264862204021688 0.8634101183424752 0.0029527900648633567; 0.5226928043290484 0.11709986845477378 0.8717720160766715]
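
The dimensionality-reduction and compression uses listed above amount to keeping only the largest singular values. Continuing with A, U, Σ and V from the decomposition above, a minimal rank-1 approximation looks like this:

# Keep only the k largest singular values (here k = 1)
k = 1
A_approx = U[:, 1:k] * Diagonal(Σ[1:k]) * V[:, 1:k]'
# The discarded singular values govern the size of the approximation error
println(norm(A - A_approx))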

19.3.3 Matrix Factorization and Factorization Machines

Matrix factorization is a popular technique in recommendation systems for modeling user-item interactions and making personalized recommendations. The core idea behind matrix factorization is to decompose the user-item interaction matrix into two lower-dimensional matrices, capturing latent factors that represent user preferences and item characteristics. By learning these latent factors, the recommendation system can make predictions for unseen user-item pairs.
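
In its most common form, the m × n user-item interaction matrix R is approximated by the product of two low-rank factor matrices \[ \mathbf{R} \approx \mathbf{P}\mathbf{Q}^\top \] where P is m × k (user latent factors), Q is n × k (item latent factors), and k is much smaller than m and n; predictions for unseen user-item pairs are read off the corresponding entries of the reconstructed product.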

Factorization Machines (FM) are a type of supervised machine learning model designed for tasks such as regression and classification, especially in the context of recommendation systems and predictive modeling with sparse data. FM models extend traditional linear models by incorporating interactions between features, allowing them to capture complex relationships within the data.
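
For a feature vector x with n entries, a second-order FM models the target as \[ \hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle x_i x_j \] where w_0 and the w_i are ordinary linear-model weights and each feature i carries a latent vector v_i; the inner products capture pairwise feature interactions, which is what lets FMs generalize matrix factorization to arbitrary sparse feature vectors.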

Example: In credit scoring or recommendation systems for financial products, these techniques reveal latent factors that influence customer behavior.

using Recommendation, SparseArrays, MLDataUtils

# Generate synthetic user-item interaction data
num_users = 100
num_items = 50
num_ratings = 500
user_ids = rand(1:num_users, num_ratings)
item_ids = rand(1:num_items, num_ratings)
ratings = rand(1:5, num_ratings)
# Create a sparse user-item matrix
user_item_matrix = sparse(user_ids, item_ids, ratings)
# Split data into training and testing sets
train_data, test_data = splitobs(user_item_matrix, 0.8)
# Set parameters for matrix factorization
num_factors = 10
num_iterations = 10
# Build the data accessor and fit the matrix factorization model
# (note: fitting here uses the full interaction matrix, not only train_data)
data = DataAccessor(user_item_matrix)
recommender = MF(data) # FactorizationMachines(data) alternatively
fit!(recommender)
# Predict ratings for the test set
rec = Dict()
for user in 1:num_users
    rec[user] = recommend(recommender, user, num_items, collect(1:num_items))
end
# Evaluate model performance
predictions = []
for (i, j, v) in zip(findnz(test_data.data)[1], findnz(test_data.data)[2], findnz(test_data.data)[3])
    for p in rec[i]
        if p[1] == j
            push!(predictions, p[2])
            break
        end
    end
end
rmse = measure(RMSE(), predictions, nonzeros(test_data.data))
println("Root Mean Squared Error (RMSE): ", rmse)
Root Mean Squared Error (RMSE): 1.3720390535260727

19.3.4 Principal component analysis

Principal Component Analysis (PCA) is a widely used technique in various fields for dimensionality reduction, data visualization, feature extraction, and noise reduction. PCA can also be applied to detect anomalies or outliers in the data by identifying data points that deviate significantly from the normal patterns captured by the principal components. Anomalies may appear as data points with large reconstruction errors or as outliers in the low-dimensional space spanned by the principal components.

Example: Compressing various economic indicators into a handful of principal components to illustrate predominant trends in market dynamics or risk factors.

using MultivariateStats

# Generate some synthetic data: 100 samples, 5 features.
# MultivariateStats expects observations in columns, so features are stored in rows.
data = randn(5, 100)  # 5 features × 100 samples
# Perform PCA, projecting onto 2 principal components
pca_model = fit(PCA, data; maxoutdim=2)
# Transform the data into the principal-component space
transformed_data = transform(pca_model, data)
# Access the principal variances and their ratio among the retained components
principal_components = principalvars(pca_model)
explained_variance_ratio = principal_components / sum(principal_components)

# Print results
println("Principal Components:")
println(principal_components)
println("Explained Variance Ratio:")
println(explained_variance_ratio)
Principal Components:
[35.77870977694024, 26.19672337340703]
Explained Variance Ratio:
[0.5773047150819269, 0.422695284918073]
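
The anomaly-detection use mentioned at the start of this section can be sketched via the reconstruction error: observations that the retained components reconstruct poorly are candidate outliers. A minimal sketch, reusing pca_model, data and transformed_data from above (reconstruct maps principal-component scores back to the original feature space):

using LinearAlgebra

# Map the principal-component scores back to the original feature space
data_reconstructed = reconstruct(pca_model, transformed_data)
# Per-observation reconstruction error (observations are columns)
errors = [norm(data[:, i] - data_reconstructed[:, i]) for i in 1:size(data, 2)]
# Observations with the largest reconstruction errors are candidate anomalies
println(sortperm(errors, rev=true)[1:5])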