Notes for Analysis for Applications I

Vector Spaces

The concept of a vector space developed gradually over a long period of time, with roots in both pure mathematics and physics. Its axiomatic formulation is attributed to the mathematician Peano, in 1888. The word vector comes directly from Latin; it means "carrier." In astronomy in the 1600s, the line joining a planet to the sun was thought of as a kind of string that "carried" the planet around the sun in its orbit. Here is the familiar, modern definition:

Definition. A vector space is a set V together with two operations, + and × . If u, v are in V, then u + v is in V; if c is a scalar (a real or complex number for us), then c×v is in V. The operations satisfy the following rules.

Addition:
  1. Associativity: u + (v + w) = (u + v) + w
  2. Identity: u + 0 = 0 + u = u
  3. Inverse: u + (-u) = (-u) + u = 0
  4. Commutativity: u + v = v + u

Multiplication by a scalar:
  1. a×(b×u) = (ab)×u
  2. (a + b)×u = a×u + b×u
  3. a×(u + v) = a×u + a×v
  4. 1×u = u

The mathematician Peter Lax calls the idea of a vector space "a bare-bones concept," and, when we look at the definition, it's easy to see why. All that's required is a set, scalars, two operations, and a few axioms involving them. It is striking that there are many objects that satisfy these axioms, and that the "bare-bones" vector-space notion can yield a great deal of information concerning these objects.

The vector spaces that we will deal with here include the familiar finite dimensional spaces $\mathbb R^n$, $\mathbb C^n$, the space of $k$ times continuously differentiable functions $C^k$, polynomials of degree $n$ or less $P_n$, various spaces of Lebesgue integrable functions $L^p$, sequence spaces $\ell^p$, and other spaces that we will introduce later. At this point, it's a good idea to review some standard definitions and theorems that come up in linear algebra.

Subspaces

Definition. A subset U of V is a subspace if, under + and × from V, U is a vector space in its own right.

Theorem. U is a subspace of V if and only if these hold:
  1. 0 is in U.
  2. U is closed under + .
  3. U is closed under × .

Span

Definition. Let S={v1 ... vn} be a subset of a vector space V. The span of S is the set of linear combinations of vectors in S. That is, $\text{span}(S) := \{c_1v_1 + \cdots + c_n v_n \}$, where the $c$'s are arbitrary scalars and the $v$'s are vectors from $S$.

Proposition. The set span(S) is a subspace of V.
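
For concreteness, here is a minimal numerical sketch (not from the original notes) of testing whether a given vector of $\mathbb R^3$ lies in the span of a set S. It assumes numpy and compares matrix ranks; all names and numbers are illustrative.

```python
import numpy as np

# The vectors of S are the columns of A; b is the vector to test for membership.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([2.0, 3.0, 5.0])

# b lies in span(S) exactly when appending b to the columns does not raise the rank,
# i.e. when b is some linear combination c1*v1 + c2*v2 of the columns.
in_span = np.linalg.matrix_rank(np.column_stack([A, b])) == np.linalg.matrix_rank(A)
print(in_span)   # True, since b = 2*v1 + 3*v2
```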

Bases and dimension

Definition. We say that a set of vectors
S = {v1, v2, ... , vn}
is linearly independent if and only if the equation
c1v1 + c2v2 + ... + cnvn = 0
has only c1 = c2 = ... = cn = 0 as a solution. If it has solutions different from this one, then the set S is said to be linearly dependent.
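
As a quick numerical illustration (again an added sketch, not part of the notes), linear independence of vectors in $\mathbb R^n$ can be tested by comparing the rank of the matrix whose columns are the vectors with the number of vectors; numpy is assumed and the vectors are illustrative.

```python
import numpy as np

# The vectors v1, v2, v3 of S are the columns of A (here, three vectors in R^3).
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 0.0]])

# S is linearly independent exactly when c1*v1 + c2*v2 + c3*v3 = 0 forces c = 0,
# i.e. when the rank of A equals the number of columns.
independent = np.linalg.matrix_rank(A) == A.shape[1]
print(independent)   # False: v3 = v1 + v2, so S is linearly dependent
```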

Definition. A subset B = {v1 ... vn} of V is a basis for V if B spans V and is linearly independent. Equivalently, B is a basis if it is maximally linearly independent; that is, B is not a proper subset of some other linearly independent set. Unless we specifically state otherwise, we will assume that B is ordered.

Theorem. Every basis for V has the same number of vectors as every other basis. This common number is defined to be the dimension of V, dim(V).

Two remarks. If a vector space has arbitrarily large sets of linearly independent vectors, it is infinite dimensional. The subspace containing only $0$ has no linearly independent vectors and is assigned $0$ as its dimension.

Linear transformations and isomorphisms

Definition. A mapping $L$ from a vector space $V$ to a vector space $W$ that preserves addition and scalar multiplication is called a linear transformation. That is, $L:V\to W$ is linear if and only if it satisfies $L(c_1v_1+\cdots +c_nv_n) =c_1L(v_1)+\cdots +c_nL(v_n)$.

Definition. A linear transformation that is one-to-one and onto (bijective) is said to be an isomorphism.

Proposition. Under appropriate conditions, linear combinations, compositions, and inverses of linear transformations are linear.

The word "appropriate" means that, for example, the composition $K\circ L$ has to be defined. Thus the image of $L$ must be contained in the domain of $K$. Similar conditions need to be placed on the other operations mentioned in the proposition.

Coordinate vectors

Coordinates are used to associate a point in some space with a set of real or complex numbers — think of polar or Cartesian coordinates in 2D. The association is at least locally one-to-one and onto. For finite dimensional vector spaces, we not only want the association to be one-to-one and onto, but we also want it to preserve vector addition and multiplication by scalars. In other words, we want an isomorphism between a vector space $V$ and $\mathbb R^n$ or $\mathbb C^n$. The properties defining a basis are exactly the ones needed to define "good" coordinates for a finite dimensional vector space. We start by recalling the following important theorem:

Theorem. Given an ordered basis B = {v1 ... vn} and a vector v, we can uniquely write the vector as v = x1v1 +...+ xnvn, and thus represent it by the column vector [v]B = [x1, ..., xn]T.

As a consequence of this theorem, we can define a bijective map $\Phi$ from an $n$-dimensional vector space $V$ to $n\times 1$ columns of scalars. All we need is an ordered basis for $V$. Then, using the previous result, we define the map \[ \Phi[v] := [v]_B = \left(\begin{array}{c} x_1 \\ \vdots \\ x_n\end{array}\right). \] The inverse is given by $\Phi^{-1}[x_1 \ \cdots \ x_n]^T = x_1v_1+\cdots +x_nv_n$. The map $\Phi$ is easily shown to be linear, so $\Phi$ is an isomorphism between $V$ and $n\times 1$ columns of scalars. We will call $\Phi$ a coordinate map, and the column vector $[v]_B$ a coordinate vector.
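
The following small sketch (added here, not part of the notes) works in $\mathbb R^2$ with an illustrative basis and assumes numpy; it shows how the coordinate map $\Phi$ amounts to solving a linear system whose coefficient columns are the basis vectors.

```python
import numpy as np

# Ordered basis B = {v1, v2} of R^2, stored as the columns of VB.
VB = np.array([[1.0, 1.0],
               [0.0, 1.0]])   # v1 = (1, 0), v2 = (1, 1)

v = np.array([3.0, 2.0])

# Phi(v) = [v]_B solves VB @ x = v, i.e. v = x1*v1 + x2*v2.
x = np.linalg.solve(VB, v)
print(x)          # [1. 2.], so v = 1*v1 + 2*v2

# Phi^{-1} simply reassembles the linear combination.
print(VB @ x)     # [3. 2.], recovering v
```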

Change of coordinates

Often we want to change coordinates. For instance, we do this when we diagonalize a matrix. To see how to do this, we start with two ordered bases B = {v1 ... vn} and D = {w1 ... wn} for an n-dimensional vector space V. In addition, let $\Phi$ and $\Psi$ be coordinate maps for B and D, respectively. Finally, suppose that the coordinates of $v\in V$ relative to B and D are \[ \Phi(v) = [v]_B = \mathbf x \quad \text{and}\quad \Psi(v) = [v]_D = \mathbf y. \] From this, we see that $\Phi^{-1}(\mathbf x)=v$. Since $\Psi(v)=\mathbf y$, $\Psi(\Phi^{-1}(\mathbf x)) = \Psi(v) =\mathbf y$. Thus the map $\Psi\circ \Phi^{-1}$ changes $\mathbf x$-coordinates to $\mathbf y$-coordinates.

Of course, this doesn't give us an explicit formula for changing from one system to the other. To do that, we first let $\mathbf e_j= (0 \cdots 1 \cdots 0)^T$ be the column vector with 1 in the jth position and zeros elsewhere. Next, write $\mathbf x$ as a linear combination of the $\mathbf e_j$'s, $\mathbf x = \sum_j x_j \mathbf e_j$, and then apply $\Psi\circ \Phi^{-1}$ to get $\mathbf y = \sum_j x_j\Psi\circ \Phi^{-1}(\mathbf e_j)$. By the definition of $\Phi$, we see that $\Phi^{-1}(\mathbf e_j)=0\cdot v_1+\cdots +1\cdot v_j+\cdots +0\cdot v_n=v_j.\,$ Hence, $\mathbf y =\sum_j x_j\Psi(v_j)= \sum_j x_j[v_j]_D=S\mathbf x$. Here $S$ is called the transition matrix; its jth column is $[v_j]_D$, the D-coordinate vector of $v_j$. Explicitly, \[ S= \big[ [v_1]_D\ [v_2]_D \cdots [v_n]_D\big] = \big[\text{D-coordinates of the B-basis}\big]. \] It is worth seeing how the entries of $S$ relate the two bases. With a little matrix manipulation, we have that $S\mathbf e_j=[v_j]_D=\sum_k S_{kj}\mathbf e_k$. Applying $\Psi^{-1}$ to this equation yields \[ v_j = \Psi^{-1}[v_j]_D = \Psi^{-1}\big(\sum_k S_{kj}\mathbf e_k\big)= \sum_k S_{kj}\underbrace{\Psi^{-1}(\mathbf e_k)}_{w_k} = \sum_k S_{kj}w_k. \] One final remark. If we take components in $\mathbf y = S\mathbf x$, we get $ y_j=\sum_k S_{jk}x_k$. In the formula for $v_j$ above, the row and column indices of $S$ are reversed. This means that the matrix relating the two bases is the transpose of the transition matrix, which relates the two sets of coordinates.
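
As a numerical check of the discussion above (an added sketch with illustrative bases in $\mathbb R^2$, assuming numpy), the transition matrix $S$ can be built column by column as the D-coordinates of the B-basis and then verified to satisfy $\mathbf y = S\mathbf x$.

```python
import numpy as np

# Two ordered bases of R^2, stored column-wise: B = {v1, v2} and D = {w1, w2}.
VB = np.array([[1.0, 1.0],
               [0.0, 1.0]])
WD = np.array([[2.0, 0.0],
               [0.0, 1.0]])

# The jth column of the transition matrix S is [v_j]_D,
# obtained by solving WD @ s = v_j; solve handles all columns at once.
S = np.linalg.solve(WD, VB)

v = np.array([3.0, 2.0])
x = np.linalg.solve(VB, v)    # x = [v]_B
y = np.linalg.solve(WD, v)    # y = [v]_D

print(np.allclose(S @ x, y))  # True: S changes B-coordinates into D-coordinates
```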

Matrix representations of linear transformations

Let $L:V\to W$ be linear, and suppose that the dimension of $V$ is $n$ and that of $W$ is $m$. Furthermore, we will take B = {v1 ... vn} and D = {w1 ... wm} to be bases for $V$ and $W$, respectively, and we will let $\Phi$ and $\Psi$ be the associated coordinate maps. (Keep in mind that here the coordinate maps are for different vector spaces.) It follows that the linear map $\Psi\circ L \circ \Phi^{-1}$ takes $n\times 1$ columns into $m\times 1$ columns. If $w=L(v)$, $\mathbf x=\Phi(v)$, $\mathbf y = \Psi(w)$, then $\mathbf y = \Psi\circ L \circ \Phi^{-1} (\mathbf x)$. Next, note that $\mathbf x= \sum_k x_k\mathbf e_k$, so \[ \mathbf y = \Psi\circ L \circ \Phi^{-1} (\mathbf x) = \sum_k x_k \Psi\circ L \circ \Phi^{-1} (\mathbf e_k) = \sum_k x_k \Psi(L(v_k))= \sum_k [L(v_k)]_D x_k = A_L \mathbf x, \] where $A_L =\big[[L(v_1)]_D \ [L(v_2)]_D \cdots [L(v_n)]_D\big]=\big[\text{ D-coordinates of } L(\text{B-basis})\big]$. To summarize, we have constructed a unique matrix $A_L$ such that $w=L(v)$ if and only if $\mathbf y=A_L\mathbf x$. Thus $A_L$ is the matrix representation of $L$ relative to the bases involved.
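
Here is a small sketch (not from the notes; the map and bases are made up for illustration, numpy assumed) that builds $A_L$ column by column as the D-coordinates of $L$ applied to the B-basis and checks that $\mathbf y = A_L\mathbf x$.

```python
import numpy as np

# L: R^2 -> R^3, given in standard coordinates by the (made-up) matrix M.
M = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

# Ordered bases, stored column-wise: B = {v1, v2} for R^2, D = {w1, w2, w3} for R^3.
VB = np.array([[1.0, 1.0],
               [0.0, 1.0]])
WD = np.array([[1.0, 1.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])

# The kth column of A_L is [L(v_k)]_D: apply L to the B-basis, then take D-coordinates.
A_L = np.linalg.solve(WD, M @ VB)

v = np.array([3.0, 2.0])
x = np.linalg.solve(VB, v)         # x = [v]_B
y = np.linalg.solve(WD, M @ v)     # y = [L(v)]_D
print(np.allclose(A_L @ x, y))     # True: y = A_L x
```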

Dual space

Definition. A linear functional is a linear transformation $\varphi$ from $V$ to the scalars.

Definition. The set $V^\ast$ of all linear functionals is called the (algebraic) dual of V.

Proposition. $V^\ast$ is a vector space under the operations of addition of functions and multiplication of a function by a scalar.
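
As a simple illustration (an added sketch; the names are illustrative and numpy is assumed), on $\mathbb R^n$ every map of the form $\varphi(v) = a\cdot v$ with $a$ fixed is a linear functional, and such functionals can be added and scaled pointwise, which is exactly the vector-space structure on $V^\ast$ asserted in the proposition.

```python
import numpy as np

# On R^n, phi_a(v) = a . v defines a linear functional for each fixed vector a.
def functional(a):
    return lambda v: float(np.dot(a, v))

phi = functional(np.array([1.0, 2.0]))
psi = functional(np.array([0.0, 1.0]))

# V* is a vector space: functionals are added and scaled pointwise.
chi = lambda v: 3.0 * phi(v) + psi(v)

v = np.array([1.0, 1.0])
print(phi(v), psi(v), chi(v))   # 3.0 1.0 10.0
```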

Next: Inner product spaces

Updated 9/1/14 (fjn).