Math 660 - Fall 1999

Summary

29 August

§1.1
Matrices
Matrix and vector operations
Algorithms
HW
  1. Problem 1.1.1. Solution

2 September

§1.2 & 1.3
Partitioning/block multiplication of matrices A, B
MATLAB notation
Banded matrices
HW
  1. Problem 1.2.5 Solution
  2. Problem 1.2.10 Solution
  3. Problem 1.3.4
  4. MATLAB matrix exercise. Let A be the matrix below. Find the submatrices A(2:4,2:5), A([3,2,4],1:5), and A([1,4],[6,3]). (See the indexing sketch after this list.)

    -2 1 5 1 2 3
    0 4 9 -1 -6 -4
    1 -3 7 0 1 -8
    5 1 -1 1 5 3

  5. Let A be the matrix above, and let B be a matrix with 6 rows, partitioned 1 2 | 3 4 5 | 6 . Give two block partitions for A compatible with B's partitioning scheme. How many partitions are there for A that are compatible with B?
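The submatrix notation above is MATLAB's indexing notation; here is a minimal sketch of how it works on a small generic matrix (not the homework matrix):

  M = [1 2 3 4; 5 6 7 8; 9 10 11 12];   % a generic 3×4 matrix
  M(2:3,1:2)      % rows 2-3, columns 1-2
  M([3 1],2:4)    % rows 3 and 1, in that order; columns 2-4
  M(:,2)          % all rows of column 2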

7 September

§2.1
Vector spaces
Definition - Vector space. A vector space is a set V together with two operations, + and × . If u & v are in V, then u + v is in V; if r is a scalar, then r×v is in V. The operations satisfy the following rules.

Addition:
  Associativity: u + (v + w) = (u + v) + w
  Identity: u + 0 = 0 + u = u
  Inverse: u + (-u) = (-u) + u = 0
  Commutativity: u + v = v + u
Scalar multiplication:
  Associativity: a×(b×u) = (ab)×u
  Distributivity: (a + b)×u = a×u + b×u and a×(u + v) = a×u + a×v
  Identity: 1×u = u

Subspaces
Definition - Subspace. A subset S of V is a subspace if, under + & × from V, S is a vector space in its own right.
Theorem. S is a subspace of V if and only if these hold:
  1. 0 is in S.
  2. S is closed under + .
  3. S is closed under × .
Ways of getting subspaces.

Linear independence and linear dependence

9 September
§2.1 and class notes

Bases

Definition - basis. A subset B = {v1 ... vn} of V is a basis for V if B spans V and is linearly independent. Equivalently, B is a basis if it is maximally linearly independent; that is, B is not a proper subset of some other linearly independent set. Unless we specifically state otherwise, we will assume that B is ordered.

Theorem. Every basis for V has the same number of vectors as every other basis. This common number is defined to be the dimension of V, dim(V).

Coordinates and bases. The properties defining a basis are exactly the ones needed to define ``good'' coordinates for a vector space. Given an ordered basis B = {v1 ... vn} and a vector v, we can uniquely write the vector as v = x1 v1 + ... + xn vn, and thus represent it by the column vector [v]_B = [x1, ..., xn]^T.
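Concretely, coordinates relative to a basis of R^n are found by solving a linear system whose coefficient matrix has the basis vectors as columns. A minimal MATLAB sketch (the basis here is an arbitrary example):

  B = [1 1; 0 1];   % columns are the basis vectors b1, b2 of R^2
  v = [3; 5];
  x = B \ v;        % solves B*x = v, so v = x(1)*b1 + x(2)*b2
  B*x - v           % zero vector: the coordinates reproduce v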

Theorem A. Let V be a vector space, and let dim(V) = n.
  1. V is isomorphic to R^n (or C^n, for the complex case).
  2. Let C = {v1 ... vk}, with k < n. If C is linearly independent, we may add vectors to it to turn it into a basis.
  3. Let D = {v1 ... vj}, with j > n. If D spans V, we may remove vectors from it to turn it into a basis.
  4. If S is a subspace of V, then dim(S) is at most n. If dim(S) = n, then S = V.

Linear transformations

Definition - linear transformation. Let V and W be vector spaces, and let L: V -> W. L is called a linear transformation if for all scalars a, b and vectors u, v, we have L[au+bv]=aL[u]+bL[v].

Subspaces associated with L.
  1. The null space of L - null(L) = {v in V | L[v] = 0} - is a subspace of V. The dimension of the null space is called the nullity.
  2. The range of L - ran(L) = {w in W | w = L[v] for some v in V} - is a subspace of W. The dimension of the range is called the rank of L.

Matrix for L. Let B = {v1 ... vn} and C = {w1 ... wm} be bases for V and W, respectively. The m×n matrix A that represents L has its jth column given by A(:,j) = [ L[vj] ]_C. To say that A represents L means that the equation w = L[v] is equivalent to the matrix equation y = Ax, where y = [w]_C and x = [v]_B.

Theorem B. Let K: U -> V and L: V -> W be linear transformations. Also, let A and B be the matrices representing L and K, respectively. (Appropriate bases are assumed given.)
  1. The composition L o K is a linear transformation, and AB represents L o K. (Matrix multiplication is thus composition of functions.)
  2. If both transformations take V to W, then the matrix representing L+K is A+B.
  3. If c is a scalar, then the matrix for cL is cA.
  4. If L: V -> V is one-to-one and onto, then the inverse mapping L^-1 is linear. The matrix for L^-1 is A^-1, the matrix inverse of A.
HW
  1. Prove Theorem A, Part 3.
  2. Prove Theorem A, Part 4.
  3. Show that ran(L) is a subspace of W.
  4. Find the matrix A of L: P3 -> P3, where L[p] = p' and B = {1, x, x^2, x^3}.
    1. Find a basis for the column space of A from among its columns. (See Theorem A, Part 3.) Find the rank of A (and of L, too.) Use the correspondence between A and L to read off ran(L) in terms of polynomials.
    2. Find the null space of A, and read off the null space of L in terms of polynomials. Find the nullity of A (and, again of L). Verify that rank(A) + nullity(A) = 4.
  5. Show that if A is an m×n matrix, then rank(A) + nullity(A) = n, the number of columns.

14 September
§2.1, §2.2, and §2.3

Determinants
  1. Recursive definition and permutation definition
  2. If B is A with two rows interchanged, then det(B) = - det(A).
  3. If B is A with a multiple of one row added to another, det(B) = det(A).
  4. If A has two rows identical, then det(A) = 0.
  5. det(A^T) = det(A)
  6. If A, B are both n×n matrices, then det(AB) = det(A)det(B)
  7. A is invertible if and only if det(A) is not 0.

Norms

Definition -- Norm. Let V be a vector space, and let f : V --> R. f is a norm for V if these hold.
  1. positivity - f(x) is nonnegative, with f(x) = 0 ==> x=0.
  2. triangle inequality - f(x+y) <= f(x) + f(y)
  3. positive homogeneity - f(a x) = |a|f(x)

p-norms -- ||x||_p := (|x1|^p + ... + |xn|^p)^(1/p), p >= 1; ||x||_infinity := max_j |x_j|

Holder's Inequality. |x^T y| <= ||x||_p ||y||_q, where 1/p + 1/q = 1

Minkowski's Inequality -- ||x + y||_p <= ||x||_p + ||y||_p

Equivalence of norms. c1 ||x||_a <= ||x||_b <= c2 ||x||_a

Convergence. x^(k) converges to x if and only if lim_{k -> infinity} ||x^(k) - x|| = 0.
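Holder and Minkowski are easy to test numerically with MATLAB's norm (a sketch; the vectors and the exponent p are arbitrary choices):

  x = [1; -2; 3];  y = [4; 0; -1];
  p = 3;  q = p/(p-1);                     % conjugate exponents: 1/p + 1/q = 1
  abs(x'*y) <= norm(x,p)*norm(y,q)         % Holder: returns 1 (true)
  norm(x+y,p) <= norm(x,p) + norm(y,p)     % Minkowski: returns 1 (true)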
Matrix Norms. Frobenius norm and operator norms.
HW (Due 9/23)
  1. Prove Holder's inequality. Two hints:
    1. If 0 < s < t, and if a, b are positive with a + b = 1, then ln(a s + b t) >= a ln(s) + b ln(t); consequently, s^a t^b <= a s + b t.
    2. There is no loss of generality in assuming that ||x||_p = ||y||_q = 1

16 September
§2.3

Matrix Norms
Definition - Operator Norm. Let V and W be vector spaces having norms || v ||_V and || w ||_W, respectively. If L: V -> W, then we define the operator norm of L by this:
|| L ||_V,W := sup{ || L[v] ||_W / || v ||_V : v ≠ 0 }
There are two other equivalent ways to formulate this.
  1. || L ||_V,W = sup{ || L[v] ||_W : || v ||_V = 1 }
  2. || L ||_V,W is the smallest C for which || L[v] ||_W <= C || v ||_V for all v in V.

Matrix Operator Norms. We apply the definition above to the case in which we have
  • A is in R^(m×n)
  • V = R^n and || v ||_V = || v ||_r (the p-norm with p = r)
  • W = R^m and || w ||_W = || w ||_p (the p-norm)
and we set || A ||_r,p equal to the operator norm || A ||_V,W. Also, if r = p, we write || A ||_p := || A ||_p,p.

Properties
  • There is a vector u in R^n, with || u ||_r = 1, such that || A ||_r,p = || A u ||_p.
  • || A ||_1 = || A ||_1,1 = max over j=1..n of || A(:,j) ||_1 (the largest absolute column sum).
  • || A ||_2 is the square root of the largest eigenvalue of A^T A.
  • Submultiplicative property. || A B ||_p <= || A ||_p || B ||_p. (Holds in other situations, too; see the numerical check below.)
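These properties can be confirmed numerically (a sketch on random matrices; norm(A,1) and norm(A,2) are MATLAB's operator norms):

  A = randn(4,3);  B = randn(3,3);
  norm(A,1) - max(sum(abs(A)))          % near 0: largest absolute column sum
  norm(A,2) - sqrt(max(eig(A'*A)))      % near 0: sqrt of largest eigenvalue of A'A
  norm(A*B,2) <= norm(A,2)*norm(B,2)    % returns 1: submultiplicative property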

Application - Perturbations
  • If F is in R^(n×n) with || F ||_p < 1, then I - F is nonsingular, and its inverse is given by the convergent Neumann series (geometric series)
    (I - F)^-1 = I + F + F^2 + F^3 + ...
    Moreover, || (I - F)^-1 ||_p <= (1 - || F ||_p)^-1
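A partial-sum check of the Neumann series (a minimal sketch; F is scaled so that || F ||_2 = 1/2 < 1):

  n = 4;
  F = randn(n);  F = 0.5*F/norm(F);   % force ||F||_2 = 1/2
  S = eye(n);  T = eye(n);
  for k = 1:60                        % accumulate I + F + F^2 + ... + F^60
      T = T*F;  S = S + T;
  end
  norm(S - inv(eye(n) - F))           % tiny: the series converges to (I - F)^-1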

HW (Due 9/23)
  1. 2.2.1
  2. 2.2.5
  3. 2.2.8
  4. 2.3.3
  5. 2.5.1 Solution

21 September
§2.4, §2.5.1

Floating Point Numbers

Orthogonal Sets
``Dot'' product: < x, y > = y^T x (for the real case; < x, y > = y^H x in the complex case)

Angle between x and y: cos^-1( < x, y > / (|| x || || y ||) )

x and y are orthogonal if < x, y > = 0

An orthogonal set of vectors is a set for which each pair of vectors is orthogonal.

A set is orthonormal if it is orthogonal and if each vector in the set has norm 1.

The orthogonal complement of a set S is the space of all vectors orthogonal to every vector in S.

Orthogonal matrix. An n×n matrix U is orthogonal if U^T U = I; that is, U^T = U^-1.

Proposition A.
  1. An orthogonal set of nonzero vectors is linearly independent.
  2. An orthonormal set is linearly independent.

Proposition B.
  1. An orthonormal set having n vectors is a basis for R^n.
  2. A set {u1, ..., un} is orthonormal in R^n if and only if the matrix U = [u1 ... un] is orthogonal.
  3. If the n×n matrix U is orthogonal, then its columns form an orthonormal basis, and coordinates relative to the standard basis (SB) transform to coordinates relative to the new orthonormal basis (NB) via
    [x]_NB = U^T [x]_SB
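Proposition B is easy to check numerically (a sketch; qr is used only to manufacture a random orthogonal U):

  [U,R] = qr(randn(3));   % columns of U form an orthonormal basis for R^3
  norm(U'*U - eye(3))     % near 0, so U is orthogonal
  x = [1; 2; 3];          % coordinates relative to the standard basis
  xNB = U'*x;             % coordinates relative to the new basis
  norm(U*xNB - x)         % near 0: the change of coordinates is consistent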

HW (Due 9/30)
  1. Find all floating point numbers in the case that b = 3, t = 3, L=-1, and U = 2. Give an example illustrating ``catastrophic cancellation'' for this system.
  2. Prove formulas 2.5.1 and 2.5.2 in the text. Solution
  3. Show that the singular values of an m×n matrix A are square roots of the eigenvalues of A^T A.
  4. 2.5.8
  5. 2.5.9
  6. Find the singular value decomposition for the matrices given below.
    1 -1 1
    3 -1 2

    1 3
    3 -1
    4 2


23 September
§§2.5.1-2.5.3

Gram-Schmidt process

Singular Value Decomposition

HW See the HW for 21 September.

28 September
§§2.5.4-2.5.5, 2.7.1

Thin SVD
Let A be an m×n matrix with SVD given by A = U S V^T. If m >= n, then A = U(:,1:n) S(1:n,1:n) V^T.

Numerical Rank

Definition.   rank(A,t) := min{ rank(B) : || A - B ||_2 <= t }

Distances to Lower Rank Matrices

Theorem.   min{ || A - B ||_2 : rank(B) = k } = s_(k+1) ,   k < rank(A)

Corollary.   If r_t = rank(A,t), then s_(r_t+1) <= t <= s_(r_t)

Inverse of a square matrix via SVD expansion
A = s1 u1 v1^T + ... + sn un vn^T   ==>   A^-1 = s1^-1 v1 u1^T + ... + sn^-1 vn un^T
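All three statements can be verified with MATLAB's svd (a sketch on a random square matrix; the truncation level k and the tolerance t are arbitrary):

  A = randn(5);
  [U,S,V] = svd(A);  s = diag(S);
  k = 3;
  Ak = U(:,1:k)*S(1:k,1:k)*V(:,1:k)';   % a nearest rank-k matrix
  norm(A - Ak) - s(k+1)                 % near 0, as the theorem predicts
  t = 0.1;  sum(s > t)                  % the numerical rank rank(A,t)
  Ainv = V*diag(1./s)*U';               % inverse via the SVD expansion
  norm(Ainv - inv(A))                   % near 0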

HW (Due 10/7)
  1. This exercise requires you to use MATLAB or software capable of doing singular value decompositions. (You may, of course, write your own C or FORTRAN programs.) Let A be the matrix below. Plot rank(A,t) vs. log10 t for t = 10^-7 to t = 10. Generate 3 random 6×6 matrices with coefficients < 0.05. Add each matrix to A, and repeat the exercise with it.

    1 1/2 1/3 1/4 1/5 1/6
    1/2 1/3 1/4 1/5 1/6 1/7
    1/3 1/4 1/5 1/6 1/7 1/8
    1/4 1/5 1/6 1/7 1/8 1/9
    1/5 1/6 1/7 1/8 1/9 1/10
    1/6 1/7 1/8 1/9 1/10 1/11

  2. For the four matrices generated in the previous exercise, find the condition number relative to the 2-norm.
  3. Let b be the column vector below. First, solve A x = b, with A as in question 1. Then generate 3 random 6×1 column vectors with coefficients < 0.05. Add each vector to b, and re-solve the system with the new value of b. Calculate the relative change in x divided by the relative change in b. (Use 2-norms here.)

    -1
    2
    3
    4
    -1
    3


30 September
§§2.7.2-2.7.4

Condition number
Definition. K(A) = ||A|| ||A^-1||

In solving A x = b, the relative error in x behaves like K(A) times the sum of the relative errors in A and b.

2-norm condition number. K2(A) = s1/sn
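In MATLAB the three descriptions of the 2-norm condition number agree (a sketch; hilb supplies a famously ill-conditioned test matrix):

  A = hilb(5);  s = svd(A);
  cond(A)                 % built-in 2-norm condition number
  norm(A)*norm(inv(A))    % K(A) = ||A|| ||A^-1||
  s(1)/s(end)             % s1/sn; all three values agree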

MATLAB demo - diary file

5 October
§§2.7.3 & 3.1-3.2.5.

Condition number - revisited
Reviewed definition.
Discussed why det(A) is a poor measure of the conditioning of A.

Upper and lower triangular matrices
Backward substitution. Solving systems of the form Ux=b, where U is upper triangular.

Forward substitution. Solving systems of the form Lx=b, where L is lower triangular.
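Backward substitution takes only a few lines of MATLAB (a sketch, assuming U is upper triangular with nonzero diagonal; forward substitution is the mirror image, with j running 1:n):

  function x = backsub(U,b)
  % Solve U*x = b by backward substitution
  n = length(b);  x = zeros(n,1);
  for j = n:-1:1
      x(j) = (b(j) - U(j,j+1:n)*x(j+1:n)) / U(j,j);
  end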

Properties.
  1. Products of upper triangular matrices are upper triangular.
  2. Inverses of upper triangular matrices are upper triangular.
  3. These two properties hold for unit upper triangular matrices.
  4. All the properties listed hold for lower triangular matrices, too.

Gauss elimination
Matrix form of row operations
  • Row(j) = c*Row(j)   <==>   left multiplication by I - (1-c) e_j e_j^T
  • Row(j) = Row(j) - c*Row(k), k ~= j   <==>   left multiplication by I - c e_j e_k^T
  • Zero out all entries in a column x below x_k (assume x_k ~= 0)   <==>   M_k = I - t_k e_k^T, where
    t_k(j) = 0 if j <= k and t_k(j) = x_j / x_k if j > k
  • M_k^-1 = I + t_k e_k^T
LU decomposition
A can be factored into the product of a unit lower triangular matrix L and an upper triangular matrix U provided the pivots u(1,1), u(2,2), ..., u(n-1,n-1) are all nonzero.

det(A) = u(1,1) u(2,2) ··· u(n,n)
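The factorization can be coded directly from the Gauss transformations above (a minimal MATLAB sketch; no pivoting, so it assumes every pivot is nonzero):

  function [L,U] = lunopiv(A)
  % LU factorization without pivoting; fails if a zero pivot appears
  n = size(A,1);  L = eye(n);  U = A;
  for k = 1:n-1
      L(k+1:n,k) = U(k+1:n,k) / U(k,k);             % the multipliers t_k
      U(k+1:n,:) = U(k+1:n,:) - L(k+1:n,k)*U(k,:);  % apply M_k
  end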

HW (Due 10/19)
  1. Let A be a real, 2×2 matrix having the singular value decomposition A = USV^T. Using the Frobenius norm, find the distance from A to the rank-1 matrices. (Hint: Use the invariance of the Frobenius norm under orthogonal transformations.)

7 October
§§3.2.5-8, 3.2.12.

An example
Let A be the 3×3 matrix below.

 1 -2  3
 2 -5  4
-1  3  4

The Gauss transformations used to triangularize A all have the form M_k = I - t_k e_k^T, where
t_1 = [0 2 -1]^T and t_2 = [0 0 -1]^T.
As we showed in class,

L = M_1^-1 M_2^-1 = I + t_1 e_1^T + t_2 e_2^T =
 1  0  0
 2  1  0
-1 -1  1

and the upper triangular matrix U = M_2 M_1 A is
U =
 1 -2  3
 0 -1 -2
 0  0  5

When does a matrix have an LU factorization?
Proposition. The k-th pivot for an n×n matrix A is
u(k,k) = det(A(1:k,1:k)) / det(A(1:(k-1),1:(k-1))).

Theorem. An n×n matrix A has an LU factorization if and only if the submatrices A(1:k,1:k), k=1:(n-1), are all invertible.

Outer product algorithm

HW (continued from 5 October)
  1. Using the matrix A in the example, go through each step in the algorithm on p. 98 of the text.
  2. Determine which of these 3×3 matrices has an LU factorization. For each that does, find it.
    4   -1 1
    1 4 6
    -2 3 -7

    2 -2 1
    -5 5 -9
    -4 3 1

    2 -2 -1
    -1 -1 3
    -4 -6 2

  3. For each of the matrices above, use MATLAB's lu command to find an appropriate factorization. In each case, look at L*U given by the command. Is it the same as the A you started with?
  4. 3.2.5

12 and 14 October
Review and mid-term test.

19 October
§§3.4.1-3.4.4

Partial Pivoting
Permutation matrices and properties
  • P is a matrix with rows or columns of I permuted.
  • P is orthogonal.
  • P*A permutes the rows of A
  • A*P permutes the columns of A.
Interchange matrices
  • E_(j<->k) interchanges rows j and k.
  • E_(j<->k) M_k = M'_k E_(j<->k), where M_k and M'_k are Gauss transformations
Strategy - pivot off the largest entry on or below the diagonal in a column.

Complete pivoting selects the largest entry from the whole submatrix remaining, not just the part of the column on or below the diagonal.
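A short MATLAB illustration of these facts (a sketch; lu with three outputs performs partial pivoting):

  p = [2 4 1 3];
  P = eye(4);  P = P(p,:);   % permutation matrix built from I
  A = randn(4);
  norm(P*A - A(p,:))         % 0: P*A permutes the rows of A
  norm(P'*P - eye(4))        % 0: P is orthogonal
  [L,U,P1] = lu(A);          % partial pivoting: P1*A = L*U
  norm(P1*A - L*U)           % near 0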

21 October
§§4.1.1-4.2.1

LDM^T Decomposition
Theorem.  If an n×n matrix A and all its leading principal submatrices are invertible, then there are unit lower triangular matrices L and M, and an invertible diagonal matrix D, such that A = LDM^T. Moreover, the decomposition is unique.

Proof: This is the LU decomposition in disguise. Here, D = diag(pivots) and M^T = D^-1 U. We proved uniqueness in class. Also, there is a version in case pivoting is required.

Algorithm. Derived in class using partitioning.

LDL^T Decomposition
Theorem.  If A is a symmetric n×n matrix, and A and all its leading principal submatrices are invertible, then A = LDL^T, where D is an invertible diagonal matrix and L is a unit lower triangular matrix. Moreover, the decomposition is unique.

Proof: A = A^T implies that LDM^T = MDL^T. Uniqueness of the LDM^T factorization implies that M = L.

For symmetric A, the algorithm mentioned above takes roughly half as many flops as the usual Gaussian elimination algorithm.
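The factors can be read off from an LU factorization (a MATLAB sketch on a small symmetric positive definite example, chosen so that lu performs no row exchanges):

  A = [4 2 1; 2 5 3; 1 3 6];   % symmetric; no pivoting is triggered here
  [L,U] = lu(A);               % so L is genuinely unit lower triangular
  D = diag(diag(U));           % D = diag(pivots)
  Mt = D\U;                    % M^T = D^-1 U, unit upper triangular
  norm(A - L*D*Mt)             % near 0: A = L*D*M^T
  norm(L - Mt')                % near 0: M = L, since A is symmetric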

Positive Definiteness
Definition. A quadratic form Q(x) = x^T A x is positive definite if Q(x) > 0 whenever x ≠ 0. (The matrix A is not unique: the same quadratic form is generated by A + B, where B is antisymmetric.)

Definition. If A is a symmetric matrix, then A is said to be positive definite if the associated quadratic form Q(x) = x^T A x is positive definite. Here are a few properties; throughout, we assume A is positive definite.
  • A is invertible.
  • If X is an n×k matrix with rank(X) = k, then X^T A X is a positive definite k×k matrix.
  • Every principal submatrix of A is positive definite.

HW (Due 10/28.)
  1. Find the permutation matrix P corresponding to p = [2 3 5 4 1]. Express this matrix as a product of interchange matrices. Determine whether the permutation is even or odd.
  2. Problem 3.4.1.
  3. Problem 3.4.3. (Write an m-file or C or FORTRAN program to do this.)
  4. Write MATLAB m-files (or C or FORTRAN programs) to carry out the steps in Algorithm 3.4.1 on pg. 112, modified by the change mentioned at the top of pg. 114. Test this m-file (or program) on the matrix in Example 3.4.2.
  5. Show that the matrices A and A+B generate the same quadratic form if and only if B is antisymmetric.

26 October
§§4.2.3-4.2.5

Cholesky Factorization

Algorithms for finding G
Update columns - §4.2.4
  • G(j,j)*G(:,j) = A(:,j) - G(j,1)*G(:,1) - G(j,2)*G(:,2) - ... - G(j,j-1)*G(:,j-1)
  • G(j,j) = (A(j,j) - G(j,1)*G(j,1) - G(j,2)*G(j,2) - ... - G(j,j-1)*G(j,j-1))½
Outer product rank-1 update - §4.2.5
Note: Both algorithms provide a test for a symmetric matrix to be positive definite.
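The column-update recurrence translates almost line for line into MATLAB (a minimal sketch, not the text's exact algorithm; A is assumed symmetric):

  function G = cholcols(A)
  % Cholesky factor G (lower triangular, A = G*G'), column by column
  n = size(A,1);  G = zeros(n);
  for j = 1:n
      v = A(:,j) - G*G(j,:)';   % subtract contributions of columns 1:j-1
      if v(j) <= 0, error('A is not positive definite'), end
      G(j:n,j) = v(j:n) / sqrt(v(j));
  end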

28 October
§§4.2.8-4.2.10, 4.3.1-4.3.2.

Symmetric positive semidefinite (spsd) matrices
Definition. A = A^T and x^T A x >= 0 for all x in R^n
  • |A(j,k)|<=(A(j,j)*A(k,k))½
  • A(j,j) = 0 ==> A(j,:) = 0 and A(:,j) = 0
  • Algorithms for getting the Cholesky factorization in the positive definite case work for spsd matrices, with minor modifications.
  • Small diagonal elements make pivoting useful.

Symmetric pivoting - PAP^T

Square root and polar decomposition

Banded matrices and the LU factorization
Theorem. Let A = LU. If A has upper bandwidth q and lower bandwidth p, then the lower triangular matrix L has lower bandwidth p, and the upper triangular matrix U has upper bandwidth q.

Forward substitution to solve Lx = b requires ~2pn flops, rather than n^2. Similarly, backward substitution in Ux = b requires only ~2qn flops.

HW (Due 11/9.)
  1. Consider the symmetric 4×4 matrix A below. Use the Cholesky Algorithm 4.2.1 (p. 144) to find G or to show that it is not positive definite.

    A =

    12
    4 3 2
    4 3 3 2
    3 2 2 3

  2. Write a MATLAB (or C or FORTRAN) program to implement Algorithm 4.2.2 (p. 145).
  3. Problem 4.2.1(a)
  4. Problem 4.2.5 (Assume A=AT.)
  5. Problem 4.2.11

2 November
§§4.3.1-4.3.3, 4.3.5, 5.2.7

Banded matrices - Cholesky factorization (finished).
Standard (``classical'') Gram-Schmidt algorithm for ``thin'' QR factorization

4 November
§§5.2.6-5.2.8

Modified Gram-Schmidt algorithm
QR Factorization - properties
Thin QR factorization: for A with full column rank, the factorization with positive diagonal in R is unique, and G = R^T in the Cholesky factorization of A^T A
ran(A(:,1:k)) = ran(Q(:,1:k))
null(A^T) = ran(Q(:, (n+1):m))
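Modified Gram-Schmidt for the thin QR factorization fits in a few lines (a sketch; assumes A is m×n with m >= n and full column rank):

  function [Q,R] = mgs(A)
  % Thin QR factorization by modified Gram-Schmidt
  [m,n] = size(A);  Q = A;  R = zeros(n);
  for k = 1:n
      R(k,k) = norm(Q(:,k));
      Q(:,k) = Q(:,k) / R(k,k);
      R(k,k+1:n) = Q(:,k)'*Q(:,k+1:n);               % components along q_k
      Q(:,k+1:n) = Q(:,k+1:n) - Q(:,k)*R(k,k+1:n);   % remove them immediately
  end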

Connection with least squares data fitting

HW (Due 11/17.)
  1. Problem 4.3.1
  2. Let f(x) be a smooth function that has period 1. That is, f(x+1) = f(x).
    1. Use a Taylor series argument to show that f ''(x) = (f(x+h) + f(x-h) - 2f(x)) h^-2 + O(h), for h small.
    2. Let h = 1/n. Suppose we are given y(k+1) = f(kh), k=0:(n-1). Show that

      f ''(jh) = -h^-2 A(j+1,:)*y + O(h),  j = 0:(n-1),

      where A is the matrix given in Problem 4.3.11.

  3. Problem 4.3.11(a,b)
  4. Problem 5.2.6 (Note: stated formula is incorrect, unless v is a column vector, which requires k=n-1. You may make this assumption.)
  5. Problem 5.2.7

9 November
§§5.1.1-5.1.6

Householder Matrices

Working with Householder matrices
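The basic computation (a sketch): given x, choose v so that the reflection P = I - 2vv^T/(v^Tv) maps x to a multiple of e1; the sign choice below avoids cancellation.

  x = [3; 1; 2];
  v = x;  v(1) = v(1) + sign(x(1))*norm(x);   % Householder vector
  P = eye(3) - 2*(v*v')/(v'*v);               % reflection: orthogonal and symmetric
  P*x                                         % = [-norm(x); 0; 0] for this x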

11 November
§§5.2.1, 5.3.1-5.3.5

Householder QR algorithm

Least squares problems

HW (Due 11/24.)
  1. P5.1.2
  2. P5.1.4
  3. P5.2.2
  4. P5.2.5
  5. P5.3.5

17 November
§§5.3.5 (finished), 5.3.7, 5.4.1-5.4.3

Corrected Problem 5.2.6 and gave hints on 5.2.7.

Sensitivity of LS - worked Problem 5.3.2

Rank deficiency and qr with column pivoting - an example

Complete orthogonal decompositions

Bidiagonalization

19 November
§5.5

Pseudoinverse

Rank deficient LS problems

23 November
§10.2.1

The method of steepest descent

HW (Due 12/2.)
  1. Write a MATLAB (or C or FORTRAN) program to upper bidiagonalize a matrix, and use it to upper bidiagonalize the 4×4 matrix B below.

    B =

    12
    4 3 2
    4 3 3 2
    3 2 2 3

  2. Consider the LS problem: minimize || A x - b ||, where b = [1 0 -1 2 0]^T and A is given below. Show that A is rank deficient. Find all x that solve the LS problem. (You may use MATLAB etc.)

    A =

    3 -4  -3 
    -1  -4 5 1
    -1 0 1 1
    1 -2 1 -1
    1 3 -4 -1

  3. P5.5.2
  4. P10.2.1
  5. Let D = diag(1, 1, 10^-2), b = [1 1 0]^T, F(x) = || D x - b ||_2, and x0 = [2 2 1]^T. Minimize F using the method of steepest descent with an initial guess x0. Illustrate convergence by plotting the iterates x_k. Use MATLAB to do the graphics and computations. Solution

30 November
§10.2.2-10.2.4, 7.1.1, 8.1.1

Conjugate gradient method
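For A x = b with A symmetric positive definite, the iteration takes the following form (a textbook-style sketch, not the course's exact pseudocode):

  function x = cg(A,b,tol,maxit)
  % Conjugate gradient method for A*x = b, A symmetric positive definite
  x = zeros(size(b));  r = b;  p = r;  rho = r'*r;
  for k = 1:maxit
      Ap = A*p;
      alpha = rho / (p'*Ap);                  % exact line search along p
      x = x + alpha*p;  r = r - alpha*Ap;
      rhonew = r'*r;
      if sqrt(rhonew) < tol, break, end
      p = r + (rhonew/rho)*p;  rho = rhonew;  % next A-conjugate direction
  end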


2 December
§8.1.1

Schur decomposition for a symmetric matrix A.
Courant-Fischer Theorem

7 December
Review