1. Introduction to Systems of Linear Equations
A linear equation in the variables x₁, …, xₙ is an equation that can be written in the form a₁x₁ + a₂x₂ + ⋯ + aₙxₙ = b, where b and the coefficients a₁, …, aₙ are real or complex numbers. A system of linear equations (or a linear system) is a collection of one or more linear equations involving the same variables. Such a system can be compactly represented in several ways:
- As a coefficient matrix.
- As an augmented matrix.
- As a vector equation involving a linear combination of vectors.
- As a matrix equation of the form Ax = b.
A linear system can have three types of solution sets:
- One unique solution.
- Infinitely many solutions.
- No solution (inconsistent).
A system is homogeneous if it can be written as Ax = 0. Such a system always has at least one solution, namely the trivial solution x = 0. If x = 0 is the only solution, the columns of A are linearly independent; otherwise, any non-zero solution is a non-trivial solution, and the columns of A are linearly dependent. A nonhomogeneous linear system (Ax = b, with b ≠ 0) has solutions of the form x = xₚ + xₕ, where xₚ is a particular solution and xₕ is any solution of the homogeneous equation Ax = 0.
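To make these representations concrete, here is a minimal NumPy sketch (the 2x2 system itself is an illustrative assumption) showing the coefficient matrix, the augmented matrix, and the matrix equation Ax = b:

```python
import numpy as np

# Coefficient matrix and right-hand side for the system
#   1*x1 + 2*x2 = 5
#   3*x1 + 4*x2 = 6
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 6.0])

# Augmented matrix [A | b] -- the form the elimination methods below operate on.
augmented = np.column_stack([A, b])

# Unique solution of the matrix equation Ax = b (A is invertible here).
x = np.linalg.solve(A, b)
print(x)  # [-4.   4.5]
```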
2. Naïve Gauss Elimination (Gaussian Elimination with Back-Substitution)
This is a direct method used for solving linear systems.
- Purpose: To transform a given matrix into a row echelon form and then use back-substitution to find the solution for x.
- Process: It involves a systematic procedure using three basic row operations on the augmented matrix:
- Replacing one row by the sum of itself and a multiple of another row.
- Interchanging two rows.
- Multiplying a row by a non-zero constant.
The steps are:
1. Create an augmented matrix.
2. Begin with the leftmost non-zero column (the pivot column) and identify the pivot position (the top of the pivot column).
3. Select a non-zero entry in the pivot column as a pivot; interchange rows if necessary to move it to the pivot position.
4. Use row replacement operations to create zeros in all positions below the pivot.
5. Apply steps 2-4 to the submatrix that remains, repeating until the augmented matrix is in row echelon form.
6. Solve the new system, which now has triangular form, using back-substitution starting from the last equation (see the sketch below).
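A minimal Python/NumPy sketch of this procedure (the function name gauss_eliminate and the example system are illustrative assumptions; the pivot is chosen as the largest available entry, a common refinement of step 3 that also improves numerical stability):

```python
import numpy as np

def gauss_eliminate(A, b):
    """Gauss elimination on the augmented matrix, then back-substitution."""
    M = np.column_stack([A.astype(float), b.astype(float)])  # augmented matrix
    n = len(b)
    for k in range(n - 1):                       # forward phase
        p = k + np.argmax(np.abs(M[k:, k]))      # pick a non-zero pivot...
        M[[k, p]] = M[[p, k]]                    # ...and swap it into place
        for i in range(k + 1, n):                # zero out entries below the pivot
            m = M[i, k] / M[k, k]                # elimination multiplier
            M[i, k:] -= m * M[k, k:]             # row replacement
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):               # back-substitution
        x[i] = (M[i, -1] - M[i, i+1:n] @ x[i+1:]) / M[i, i]
    return x

A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])
b = np.array([8.0, -11.0, -3.0])
print(gauss_eliminate(A, b))  # [ 2.  3. -1.]
```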
- Key Concepts:
- A matrix is in row echelon form if: all non-zero rows are above any rows of all zeros; each leading entry (leftmost non-zero entry) of a row is in a column to the right of the leading entry of the row above it; and all entries in a column below a leading entry are zeros.
- A pivot position is a location in a matrix that corresponds to a leading 1 in its reduced row echelon form. A pivot column is a column containing a pivot position. A pivot is a non-zero number in a pivot position used to create zeros.
- The elimination steps to reach a row echelon form are known as the forward phase of row reduction.
3. Gauss-Jordan Elimination
This is also a direct method for solving linear systems.
- Purpose: To transform a given matrix into a reduced row echelon form, which directly provides the solution for x without needing back-substitution.
- Process: It extends Naïve Gauss Elimination:
- Follow the steps of Naïve Gauss Elimination (the forward phase) to transform the augmented matrix into row echelon form.
- Starting with the rightmost pivot and working upward and to the left (the backward phase), create zeros above each pivot.
- If a pivot is not 1, make it 1 by a division (scaling) operation.
- Repeat until the matrix is in reduced row echelon form.
- The final reduced row echelon form of the augmented matrix directly gives the solution for x, as sketched below.
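A minimal NumPy sketch of Gauss-Jordan elimination (the function name and example are illustrative; for brevity this version scales each pivot row and clears entries above and below the pivot in one pass, rather than as separate forward and backward phases):

```python
import numpy as np

def gauss_jordan(A, b):
    """Reduce [A | b] to reduced row echelon form; the last column is x."""
    M = np.column_stack([A.astype(float), b.astype(float)])
    n = len(b)
    for k in range(n):
        p = k + np.argmax(np.abs(M[k:, k]))   # choose a non-zero pivot
        M[[k, p]] = M[[p, k]]
        M[k] /= M[k, k]                       # scale the pivot row so pivot = 1
        for i in range(n):                    # create zeros above AND below pivot
            if i != k:
                M[i] -= M[i, k] * M[k]
    return M[:, -1]

A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])
b = np.array([8.0, -11.0, -3.0])
print(gauss_jordan(A, b))  # [ 2.  3. -1.] -- no back-substitution needed
```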
- Key Concepts:
- A matrix in reduced row echelon form satisfies all properties of row echelon form, plus: the leading entry in each non-zero row is 1, and each leading 1 is the only non-zero entry in its column.
- If free variables exist (columns without a pivot position in the reduced row echelon form), the system has infinitely many solutions.
- The forward phase generally takes longer than the backward phase.
4. Solving a System of Linear Equations by Matrix Inversion
This method solves the matrix equation Ax = b by utilising the inverse of the coefficient matrix A.
- Process:
- If A is an invertible n x n matrix, the unique solution is given by x = A⁻¹b.
- For a 2x2 matrix, the inverse A⁻¹ can be found from a formula involving its determinant: if A = [[a, b], [c, d]] and det A = ad − bc ≠ 0, then A⁻¹ = (1/(ad − bc)) [[d, −b], [−c, a]].
- For a general n x n matrix, the inverse A⁻¹ can be found by augmenting matrix A with the identity matrix I ([A | I]) and performing row operations to transform A into I. If successful, the identity matrix I will transform into A⁻¹ ([I | A⁻¹]). If A cannot be reduced to I, then A has no inverse.
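A minimal sketch of the [A | I] procedure in NumPy (the function name and example values are illustrative assumptions):

```python
import numpy as np

def invert(A):
    """Row-reduce the augmented matrix [A | I] to [I | A^-1]."""
    n = A.shape[0]
    M = np.column_stack([A.astype(float), np.eye(n)])  # [A | I]
    for k in range(n):
        p = k + np.argmax(np.abs(M[k:, k]))
        if np.isclose(M[p, k], 0.0):                   # A cannot be reduced to I
            raise ValueError("A is singular and has no inverse")
        M[[k, p]] = M[[p, k]]
        M[k] /= M[k, k]
        for i in range(n):
            if i != k:
                M[i] -= M[i, k] * M[k]
    return M[:, n:]                                    # the right half is A^-1

A = np.array([[4.0, 7.0],
              [2.0, 6.0]])
b = np.array([1.0, 0.0])
A_inv = invert(A)    # [[ 0.6 -0.7], [-0.2  0.4]] (matches the 2x2 formula)
print(A_inv @ b)     # [ 0.6 -0.2] -- the unique solution x = A^-1 b
```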
- Key Concepts:
- An n x n matrix A is invertible if there exists an n x n matrix C such that AC = CA = I, where I is the identity matrix. The unique inverse is denoted as A⁻¹.
- If the determinant of A is zero (det A = 0), then A is not invertible; a matrix with no inverse is called a singular matrix.
- This method is generally less efficient than elimination methods for larger systems, except for the 2x2 case.
- A condition number can be computed for a square matrix to indicate the accuracy of the solution x. A larger condition number means the matrix is closer to being singular. An ill-conditioned or “nearly singular” matrix is invertible but can become singular with slight changes to its entries.
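NumPy exposes the condition number directly; the nearly singular matrix below is an illustrative assumption:

```python
import numpy as np

well = np.array([[4.0, 7.0], [2.0, 6.0]])
ill  = np.array([[1.0, 1.0], [1.0, 1.0001]])  # nearly singular (det = 1e-4)

print(np.linalg.cond(well))  # ~10.4  -- well-conditioned
print(np.linalg.cond(ill))   # ~4e4   -- ill-conditioned: x can lose accuracy
```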
5. LU Factorization
This method expresses a matrix A as a product of two or more matrices (an L matrix and a U matrix) to efficiently solve multiple linear equations that share the same coefficient matrix A.
- Purpose: It is a form of “matrix factorization” or “analysis of data” that organises data into more useful parts. It is particularly useful for solving a sequence of equations with the same coefficient matrix.
- Process:
- If matrix A can be row-reduced to echelon form (U) without row interchanges, it can be factored as A = LU.
- L is an m x m lower triangular matrix with 1s on the diagonal (a unit triangular matrix), and U is the m x n echelon form of A. The entries of L are derived from the multipliers used during the row reduction of A to U.
- To solve Ax = b using LU factorization, one first solves Ly = b for y using forward substitution, and then solves Ux = y for x using back-substitution.
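A minimal Doolittle-style sketch (the names are illustrative; no row interchanges are performed, matching the assumption above):

```python
import numpy as np

def lu(A):
    """Factor A = L @ U with L unit lower triangular, U in echelon form."""
    n = A.shape[0]
    L, U = np.eye(n), A.astype(float).copy()
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]      # store the elimination multiplier
            U[i, k:] -= L[i, k] * U[k, k:]   # same row replacement as elimination
    return L, U

def solve_lu(L, U, b):
    """Solve Ly = b (forward substitution), then Ux = y (back-substitution)."""
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        y[i] = b[i] - L[i, :i] @ y[:i]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

A = np.array([[2.0, 1.0, -1.0],
              [-3.0, -1.0, 2.0],
              [-2.0, 1.0, 2.0]])
L, U = lu(A)                                          # factor once...
print(solve_lu(L, U, np.array([8.0, -11.0, -3.0])))   # ...reuse for each new b
```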
- Key Concepts: This factorization simplifies solving subsequent systems because solving triangular systems (Ly=b and Ux=y) is computationally easier.
6. Iterative Methods (General)
Unlike direct methods (e.g., Gaussian Elimination) which theoretically yield an exact solution in a finite number of steps, iterative methods begin with an initial guess for the solution and refine it in successive steps, gradually converging towards the true solution.
- Reasons for Use:
- Large, sparse systems: Direct methods are computationally expensive for matrices with many zero entries. Iterative methods require only a fraction of the operations per step.
- Polishing: When a good approximation to the solution is already known (e.g., from a previous, similar problem), iterative methods can refine this approximate solution to make it more accurate, which is common in real-time applications where data updates frequently.
- Mechanism: Iterative methods generate a sequence of approximate solution vectors x(1), x(2), … for Ax = b using the recursive equation Qx(n+1) = (Q-A)x(n) + b, where Q is a selected non-singular matrix.
- Convergence: Convergence is guaranteed if the spectral radius (ρ) of the iteration matrix B = I - Q⁻¹A satisfies ρ(B) < 1. A sufficient condition for convergence is that the coefficient matrix A is strictly diagonally dominant.
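A minimal sketch of this general scheme (the function name, tolerance, and step limit are illustrative assumptions); because the splitting matrix Q is a parameter, the same loop specialises to the Jacobi and Gauss-Seidel methods below:

```python
import numpy as np

def iterate(Q, A, b, x0=None, tol=1e-10, max_steps=500):
    """Solve Ax = b via the splitting Q x(n+1) = (Q - A) x(n) + b."""
    B = np.eye(len(b)) - np.linalg.solve(Q, A)   # iteration matrix B = I - Q^-1 A
    if max(abs(np.linalg.eigvals(B))) >= 1:      # spectral radius test: rho(B) < 1
        raise ValueError("rho(B) >= 1: the iteration will not converge")
    x = np.zeros(len(b)) if x0 is None else x0   # initial guess
    for _ in range(max_steps):
        x_new = np.linalg.solve(Q, (Q - A) @ x + b)   # one refinement step
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```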
7. Jacobi Method
This is a specific iterative method that is a form of fixed-point iteration.
- Mechanism: In Jacobi iteration, the matrix Q is taken to be the diagonal of matrix A. The iteration formula is x(n+1) = Bx(n) + h, where B = I - Q⁻¹A and h = Q⁻¹b. Each component of the solution vector at step (n+1) is calculated using the components of the solution vector from the previous iteration (n).
- Convergence:
- Convergence is guaranteed if the coefficient matrix A is strictly diagonally dominant. A matrix is strictly diagonally dominant if, for each row, the magnitude of the main diagonal entry is greater than the sum of the magnitudes of the rest of the entries in its row.
- The Spectral Radius Theorem confirms convergence if ρ(B) < 1.
- The Jacobi method typically exhibits linear convergence. If the matrix is not strictly diagonally dominant, the method may not converge.
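Reusing the iterate sketch from Section 6, Jacobi iteration amounts to choosing Q as the diagonal of A; the strictly diagonally dominant example system is an illustrative assumption:

```python
import numpy as np

A = np.array([[10.0, -1.0, 2.0],
              [-1.0, 11.0, -1.0],
              [2.0, -1.0, 10.0]])   # strictly diagonally dominant
b = np.array([6.0, 22.0, -10.0])

Q_jacobi = np.diag(np.diag(A))     # Q = diagonal part of A
print(iterate(Q_jacobi, A, b))     # converges to [ 1.  2. -1.]
```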
8. Gauss-Seidel Method
This is another iterative method, closely related to the Jacobi method; when it converges, it often does so faster than Jacobi.
- Mechanism: The key difference from Jacobi is that Gauss-Seidel uses the most recently updated values of the unknowns within the current iteration. In this method, Q is taken as the lower triangular part of A, including the diagonal. The iteration formula is x(n+1) = Bx(n) + h, where B = I - Q⁻¹A and h = Q⁻¹b.
- Convergence: Like Jacobi, the Gauss-Seidel method converges to the solution as long as the coefficient matrix A is strictly diagonally dominant. The Spectral Radius Theorem also applies to prove its convergence for all initial guesses.
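Continuing the same example, Gauss-Seidel changes only the choice of Q, taking the lower triangular part of A (diagonal included):

```python
Q_gs = np.tril(A)            # Q = lower triangle of A, including the diagonal
print(iterate(Q_gs, A, b))   # [ 1.  2. -1.], typically in fewer steps than Jacobi
```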
9. Eigenvectors and Eigenvalues
This topic focuses on solving linear systems of the form Ax = λx, which helps understand how a linear transformation A “acts” on a vector x.
- Definitions:
- An eigenvector x of an n x n matrix A is a non-zero vector such that multiplying it by A yields a scalar multiple of itself (Ax = λx).
- A scalar λ is an eigenvalue of A if there is a non-trivial solution x to Ax = λx. An eigenvalue can be zero, but an eigenvector must be non-zero.
- Solving Process:
- The equation Ax = λx is rewritten as Ax - λx = 0, which is equivalent to (A - λI)x = 0, where I is the identity matrix.
- For non-trivial solutions (eigenvectors), the matrix (A - λI) must not be invertible, meaning its determinant must be zero: det(A - λI) = 0. This equation is called the characteristic equation of the linear system.
- Solve the characteristic equation for the eigenvalues λ.
- For each found eigenvalue λ, substitute it back into (A - λI)x = 0 and solve the homogeneous system to find the corresponding eigenvectors x.
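A small worked 2x2 example of this process (the matrix is an illustrative assumption); numpy.linalg.eig carries out the same computation numerically:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Characteristic equation: det(A - lambda*I) = (2 - l)^2 - 1 = (l - 1)(l - 3) = 0,
# so the eigenvalues are l = 1 and l = 3.
eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)   # [3. 1.] (order may vary)
print(eigenvectors)  # columns are the corresponding (normalised) eigenvectors

# Verify the defining equation Ax = lambda*x for the first pair.
x = eigenvectors[:, 0]
print(np.allclose(A @ x, eigenvalues[0] * x))  # True
```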
- Key Concepts:
- The set of all solutions to (A - λI)x = 0 forms the null space of the matrix (A - λI). This set is a subspace of Rⁿ and is called the eigenspace of A corresponding to λ. The eigenspace consists of the zero vector and all eigenvectors for that λ.
- An eigenspace is often represented by its basis, which is a linearly independent set of vectors that spans the subspace.