Addition | Scalar multiplication
---|---
u + (v + w) = (u + v) + w | a·(b·u) = (ab)·u
Identity: u + 0 = 0 + u = u | (a + b)·u = a·u + b·u
Inverse: u + (-u) = (-u) + u = 0 | a·(u + v) = a·u + a·v
u + v = v + u | 1·u = u
When V is finite dimensional, its LI sets cannot be arbitrarily large. Suppose that n is the maximum number of vectors an LI set in V can have. A linearly independent set S in V having n vectors in it will be called maximal.
Proof. Since B is a basis, it spans V. Consequently, we have
scalars c^{1}, ..., c^{n} such that the representation
above holds. We want to show that this representation is unique;
i.e., no other set of scalars can be used to represent
v. Keeping this in mind, suppose that we also have the
representation
v =
d^{1}v_{1} +...+
d^{n}v_{n},
where the coefficients are allowed to be different from the
c^{k}'s. Subtracting the two representations
for v yields
0 =
(c^{1} - d^{1}) v_{1} +...+
(c^{n} - d^{n}) v_{n}.
Now, B is a basis for V, and is therefore LI; the last equation
implies that c^{1} - d^{1} = 0, c^{2} -
d^{2} = 0, ..., c^{n} - d^{n} = 0. That is,
the c's and d's are the same, and so the representation is unique.
This theorem gives us a way to assign coordinates to V, for the
correspondence
v=
c^{1}v_{1} +...+
c^{n}v_{n}
<-> (c^{1}, c^{2}, ... , c^{n})
it sets up is both 1:1 and onto. The condition of linear independence
gives us that it is 1:1, and the condition of spanning gives us that
it is onto. It is also easy to show that this correspondence preserves
addition and scalar multiplication, which are the properties needed in
defining "good" coordinates for a vector space.
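Concretely, finding [v]_{B} amounts to solving a linear system whose coefficient matrix has the basis vectors as its columns. Here is a minimal NumPy sketch; the basis and the vector v are illustrative choices, not taken from the text:

```python
import numpy as np

# Basis B = {v1, v2, v3} of R^3 (an illustrative choice), stored as columns of P.
v1, v2, v3 = [1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]
P = np.column_stack([v1, v2, v3])

v = np.array([2.0, 3.0, 1.0])

# The coordinate vector [v]_B solves P c = v; its uniqueness is the theorem above.
c = np.linalg.solve(P, v)
print(c)      # the coordinates c^1, c^2, c^3
print(P @ c)  # reconstructs v, confirming v = c^1 v1 + c^2 v2 + c^3 v3
```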
The word isomorphism comes from two Greek words: "isos," meaning "same," and "morphē," meaning "form." As far as vector space operations go, two isomorphic vector spaces have the "same form" and behave the same way. Essentially, the spaces are the same thing, just with different labels. For example, a basis in one space corresponds to a basis in the other. Indeed, any property in one space that involves only vector addition and scalar multiplication will hold in the other. This makes the following theorem, which is a consequence of what we said above concerning coordinates, very important.
This isn't the only isomorphism between P_{2} and
R^{3}. Recall that a quadratic polynomial is determined
by its values at three distinct values of x; for instance, x=-1, 0,
and 1. Also, we are free to assign whatever values we please at these
points, and we can get a quadratic that passes through them. Thus, the
correspondence
p <-> [p(-1) p(0) p(1)]^{T}
between P_{2} and R^{3} is both 1:1 and
onto. It is easy to show that it also preserves addition and scalar
multiplication, so it is another isomorphism between
P_{2} and R^{3}. Let's use this to find
a new basis for P_{2}. (Remember, a basis in one
isomorphic space corresponds to a basis in the other.) Since { [ 1 0 0
]^{T}, [ 0 1 0 ]^{T}, [ 0 0 1 ]^{T} } is a
basis for R^{3}, the set of polynomials
C = { p_{1}(x) = -½x + ½x^{2},
p_{2}(x) = 1 - x^{2}, p_{3}(x) = ½x +
½x^{2} },
which satisfy
p_{1}(-1) = 1, p_{1}(0) = 0, p_{1}(1) = 0,
p_{2}(-1) = 0, p_{2}(0) = 1, p_{2}(1) = 0,
p_{3}(-1) = 0, p_{3}(0) = 0, p_{3}(1) = 1,
is another basis for P_{2}. This raises the question of
how the coordinate vectors [p]_{C} and [p]_{B} are
related.
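As a quick check of this correspondence, we can evaluate each polynomial in C at x = -1, 0, 1 and confirm that it maps to a standard basis vector of R^{3}. A small NumPy sketch (the coefficient-list encoding of polynomials, lowest degree first, is our own bookkeeping choice):

```python
import numpy as np

def iso(coeffs):
    """Map p(x) = c0 + c1*x + c2*x^2 to the vector [p(-1), p(0), p(1)]^T."""
    p = np.polynomial.Polynomial(coeffs)
    return np.array([p(-1.0), p(0.0), p(1.0)])

# The polynomials satisfying the interpolation conditions above.
p1 = [0.0, -0.5, 0.5]   # -x/2 + x^2/2
p2 = [1.0, 0.0, -1.0]   # 1 - x^2
p3 = [0.0, 0.5, 0.5]    # x/2 + x^2/2

# Each p_k maps to a standard basis vector of R^3, so C is a basis for P_2.
print(iso(p1), iso(p2), iso(p3))
```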
v_{j}=A^{1}_{j}w_{1} +
A^{2}_{j}w_{2} + ... +
A^{n}_{j}w_{n}
w_{k}=C^{1}_{k}v_{1} +
C^{2}_{k}v_{2} + ... +
C^{n}_{k}v_{n}
(Note that the sums are over the row index for each matrix A and C.)
For any vector v with representations
v = b^{1}v_{1} +...+
b^{n}v_{n}
v = d^{1}w_{1} +...+
d^{n}w_{n}
and corresponding coordinate vectors
[v]_{B} = [b^{1},..., b^{n}]^{T}
[v]_{D} = [d^{1},..., d^{n}]^{T}
we have the change-of-basis formulas
[v]_{D} = A[v]_{B} and
[v]_{B} = C[v]_{D}.
These imply that AC=CA=I_{n×n}, so C=A^{-1} and
A=C^{-1} .
For purposes of comparison, we want to write out the expressions for
the coordinate changes. Writing the d's in terms of the b's, we have
d^{k} = A^{k}_{1} b^{1} + ... +
A^{k}_{n} b^{n}.
Going the other way, we can write the b's in terms of the d's,
b^{k} = C^{k}_{1} d^{1} + ... +
C^{k}_{n} d^{n}.
We note that quantities transforming according to the formula for bases
are called covariant, and quantities transforming like the
coordinates are called contravariant.
C =
1 | 2 |
-1 | 3 |
and the matrix A = C^{-1} =
3/5 | -2/5 |
1/5 | 1/5 |
For the 3×3 example, C =
9 | 2 | -1 |
-6 | 1 | 1 |
1 | 0 | 0 |
The matrix that takes coordinates relative to B into ones relative to D is A = C^{-1} =
0 | 0 | 1 |
1/3 | 1/3 | -1 |
-1/3 | 2/3 | 7 |
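The relation AC = CA = I can be checked numerically for both of the pairs above; a short NumPy sketch:

```python
import numpy as np

# The 2x2 pair from the text.
C2 = np.array([[1.0, 2.0],
               [-1.0, 3.0]])
A2 = np.array([[3/5, -2/5],
               [1/5, 1/5]])

# The 3x3 pair from the text.
C3 = np.array([[9.0, 2.0, -1.0],
               [-6.0, 1.0, 1.0],
               [1.0, 0.0, 0.0]])
A3 = np.array([[0.0, 0.0, 1.0],
               [1/3, 1/3, -1.0],
               [-1/3, 2/3, 7.0]])

print(C2 @ A2)  # 2x2 identity
print(C3 @ A3)  # 3x3 identity
```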
A simple physical example is the work W done by a force f applied at a point and producing a displacement s. Here, the work is given by W = L(s) = f·s. The point is that if we fix the force, then the work is a linear function of the displacement. Note that forces and displacements have different units and are thus in different vector spaces, even though the spaces are isomorphic.
Another simple example that frequently comes up is multiplication of a column vector X by a row vector Y. The linear functional in this case is just L(X) = YX. Our final example concerns C[0,1]: L[f] = ∫_{0}^{1} f(x)dx is a linear functional.
We can use the theorem we just obtained to define n linear functionals
{v^{1}, ..., v^{n}} via
v^{j}(v_{k}) = 1 if j = k, and v^{j}(v_{k}) = 0 if j ≠ k.
To make this clearer, let's look at what v^{1} does to
vectors. If we take a vector v = x^{1}
v_{1}+ ...+ x^{n} v_{n}, then
v^{1}(v) =
x^{1}v^{1}(v_{1}) +
x^{2}v^{1}(v_{2}) + ... +
x^{n}v^{1}(v_{n}) =
x^{1}·1 + x^{2}·0 +
... +x^{n}·0 = x^{1}.
A similar calculation shows that v^{2}(v) =
x^{2}, v^{3}(v) = x^{3}, ...,
v^{n}(v) = x^{n}. This means that we can
write L(v) = x^{1}y_{1} + ...+
x^{n}y_{n} as
L(v) = y_{1}v^{1}(v) + ... +
y_{n}v^{n}(v)
= (y_{1}v^{1} +
... +y_{n}v^{n})(v)
Now the two sides are equal for all values of the argument, so they
are the same function. That is, L =
y_{1}v^{1} + ... +
y_{n}v^{n}. Hence, the set B^{*} =
{v^{1}, ..., v^{n}} spans
V^{*}. The set is also linearly independent. If 0 =
y_{1}v^{1} + ... +
y_{n}v^{n}, then
0 = 0(v_{j}) = y_{j}. Hence, the only
y_{j}'s that give 0 are all 0. Summarizing, we have
obtained this result.
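In R^{n} this construction is concrete: if the basis vectors are stored as the columns of a matrix P, then the dual-basis functionals v^{1}, ..., v^{n} are simply the rows of P^{-1}. A sketch with an illustrative basis (not one from the text):

```python
import numpy as np

# A basis of R^3 (illustrative choice), stored as the columns of P.
P = np.column_stack([[1.0, 0.0, 1.0], [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])

# The dual-basis functionals are the rows of P^{-1}: row i applied to
# a vector v returns its i-th coordinate x^i relative to the basis.
dual = np.linalg.inv(P)

# v^i(v_j) = 1 if i = j and 0 otherwise:
print(dual @ P)  # identity matrix
```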
Schwarz's inequality shows that the quotient <u,v>/(||u|| ||v||) always lies between -1 and 1. Consequently, we may define the angle between vectors to be cos^{-1}(<u,v>/(||u|| ||v||)). The norm or length ||v|| of a vector satisfies three important properties.
Examples We verified in detail that the following are inner products on the spaces listed. In particular, we motivated the choice of the inner product on C[a,b] by working with the one for R^{n}, modifying it, and letting n tend to infinity.
One can always convert an orthogonal set into an orthonormal set: simply divide each vector in the set by its norm (length). Here are results for the examples above.
Orthogonal and orthonormal bases
Things are even simpler in an orthonormal basis. If we let B =
{u_{1}, u_{2}, u_{3},
..., u_{n}} be orthonormal, then ||
u_{j} || = 1, and
x^{j } =
<v,u_{j}>.
This is familiar from 3D vectors, with {i, j,
k} being the orthonormal basis.
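A short NumPy sketch of this formula, using an illustrative orthonormal basis of R^{3} (not one from the text). The coordinates are plain inner products, with no linear system to solve:

```python
import numpy as np

# An orthonormal basis of R^3: normalize the orthogonal set
# {(1,1,0), (1,-1,0), (0,0,1)} (an illustrative choice).
u1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2.0)
u2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2.0)
u3 = np.array([0.0, 0.0, 1.0])

v = np.array([3.0, 1.0, 2.0])

# In an orthonormal basis the coordinates are x^j = <v, u_j>.
x1, x2, x3 = v @ u1, v @ u2, v @ u3
recon = x1*u1 + x2*u2 + x3*u3
print(recon)  # reconstructs v
```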
We now want to address what happens if we change from B to a new
orthonormal basis, B' = {u'_{1}, u'_{2},
u'_{3}, ..., u'_{n}}.
u'_{j}=A^{1}_{j}u_{1} +
A^{2}_{j}u_{2} + ... +
A^{n}_{j}u_{n}.
If the matrix A has (k,j) entry A^{k}_{j}, then the
coordinate vectors transform according to the rule
[v]_{B} = A[v]_{B'}. By our previous
proposition, we thus have that
<v,w> = [w]_{B}^{T }
[v]_{B} = [w]_{B'}^{T } A^{T
} A[v]_{B'}
On the other hand, the proposition applies directly to the
basis B' itself. Hence,
<v,w> = [w]_{B'}^{T }
[v]_{B'}. Combining these two equations then gives us
[w]_{B'}^{T }
[v]_{B'} = [w]_{B'}^{T } A^{T
} A[v]_{B'},
which holds for any choice of vectors v and w.
We are interested in getting the components of A^{T}A. To do
this, choose w = u'_{j} and v =
u'_{k}. The coordinate vectors for these are
[w]_{B'} = [u'_{j}]_{B'} =
e_{j} and [v]_{B'} =
[u'_{k}]_{B'} = e_{k}. Inserting
these in the equation above gives us
e_{j}^{T } e_{k} =
e_{j}^{T } A^{T}A e_{k}
This implies that the (j,k) entry in A^{T}A is 1 if j=k and 0
if j is not equal to k. But these are exactly the entries in the
n×n identity matrix I. Thus, we have shown that A^{T}A =
I.
i' = cos(t) i + sin(t) j
j' = -sin(t) i + cos(t) j
k' = k
The matrix A for which [v]_{B} = A[v]_{B'} is A = R_{z}(t) =
cos(t) | -sin(t) | 0 |
sin(t) | cos(t) | 0 |
0 | 0 | 1 |
By simply relabeling the axes, we can obtain formulas for rotations with the x or y axis fixed. The one for a counterclockwise rotation about the x-axis through an angle t is A = R_{x}(t) =
1 | 0 | 0 |
0 | cos(t) | -sin(t) |
0 | sin(t) | cos(t) |
A = R_{z}(precession)
R_{x}(nutation) R_{z}(pure rotation)
This discussion is based on that given in the book by H. Goldstein:
Classical Mechanics, Addison-Wesley, Reading, MA, 1965.
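The identity A^{T}A = I is easy to check numerically for these rotations and their Euler-angle products. A sketch (the three angle values are arbitrary illustrative choices):

```python
import numpy as np

def Rz(t):
    """Counterclockwise rotation about the z-axis through angle t."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def Rx(t):
    """Counterclockwise rotation about the x-axis through angle t."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

# Euler-angle composition: precession, nutation, pure rotation.
A = Rz(0.3) @ Rx(0.7) @ Rz(1.1)

print(A.T @ A)  # identity: a product of rotations is again orthogonal
```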
R =
r_{11} | r_{12} | r_{13} | ... | r_{1n} |
0 | r_{22} | r_{23} | ... | r_{2n} |
0 | 0 | r_{33} | ... | r_{3n} |
... |
0 | 0 | 0 | ... | r_{nn} |
we can write the equations that give the v_{k}'s as linear combinations of the u_{j}'s in matrix form, A = QR. This is the QR factorization. When m = n, the matrix Q is orthogonal. If m > n, then Q satisfies Q^{T}Q = I_{n×n}. However, QQ^{T} will not be the m×m identity I_{m×m}.
Let A =
1 | 1 |
-2 | 0 |
2 | 1 |
We start with v_{1} = [1 -2 2]^{T} and v_{2} = [1 0 1]^{T}. Following the Gram-Schmidt procedure, we get R =
3 | 1 |
0 | 1 |
and the matrix Q =
1/3 | 2/3 |
-2/3 | 2/3 |
2/3 | 1/3 |
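We can confirm this factorization numerically, including the point that Q^{T}Q is the 2×2 identity while QQ^{T} is not the 3×3 identity:

```python
import numpy as np

A = np.array([[1.0, 1.0], [-2.0, 0.0], [2.0, 1.0]])
Q = np.array([[1/3, 2/3], [-2/3, 2/3], [2/3, 1/3]])
R = np.array([[3.0, 1.0], [0.0, 1.0]])

print(Q @ R)    # reproduces A
print(Q.T @ Q)  # 2x2 identity; Q @ Q.T is NOT the 3x3 identity since m > n
```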
t | 0 | 1 | 2 | 3 | 4 |
---|---|---|---|---|---|
ln(C) | -0.1 | -0.4 | -0.8 | -1.1 | -1.5 |
We know that the law of decay tells us that ln(C) = -r*t + ln(C_{0}), so the data should lie on a straight line. Of course, they don't; experimental errors offset the points. The question is, what values of r and ln(C_{0}) will best fit a straight line to the data? A more general problem is this. At times t_{1}, t_{2}, ..., t_{n}, we have measurements y_{1}, y_{2}, ..., y_{n}. Find a line y = a*t + b that fits the data.
One good way to solve this problem is the method of least squares. Let
E^{2} = (|y_{1} - a*t_{1} -b|^{2} +
|y_{2} - a*t_{2} -b|^{2} + ... +
|y_{n} - a*t_{n} -b|^{2})/n
The quantity E is the root mean square of all of the errors
y_{j} - a*t_{j} -b at each time t_{j}. The
idea is to choose a and b so as to minimize E. That is, we will try to
find a and b that give the least value for the sum of the
squares of the errors. We can put this in the form of an
inner product. Define the following column vectors in
R^{n}.
v = [y_{1} y_{2} ...
y_{n}]^{T}
v_{1} = [t_{1} t_{2} ...
t_{n}]^{T}
v_{2} = [1 1 ... 1]^{T}
Notice that the j^{th} component of the vector
v - av_{1} - bv_{2} is just the
difference y_{j} - a*t_{j} - b. From this it follows
that
E^{2} = ||v - av_{1} -
bv_{2}||^{2}/n
Now, n is simply the number of data points, so it is
fixed. Consequently, minimizing E is equivalent to minimizing
the distance from v to the space spanned by
v_{1} and v_{2}. Put a little
differently, minimizing E is equivalent to finding the
v* in span{v_{1},
v_{2}} that comes closest to v or best
approximates v.
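For the decay data in the table above, the minimizing a and b can be computed directly; numpy.linalg.lstsq solves exactly this least squares problem. A sketch:

```python
import numpy as np

# Decay data from the table: measurements of ln(C) at times t.
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([-0.1, -0.4, -0.8, -1.1, -1.5])

# Columns v1 = the t-values and v2 = all ones; minimize ||v - a*v1 - b*v2||.
M = np.column_stack([t, np.ones_like(t)])
(a, b), *_ = np.linalg.lstsq(M, y, rcond=None)

print(a, b)  # slope a = -r and intercept b = ln(C_0); here a = -0.35, b = -0.08
```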
Suppose that we are given a continuous function f(x) on the interval
[0,1] that has an upward trend or bias to it. One way to
measure this is to fit a straight line to the function f. The
difference here is that we know f at every x in [0,1]. The
discrete square error E^{2} goes over to an integral,
E^{2} = ∫_{0}^{1} (f(x) - ax - b)^{2}dx.
If we use the inner product < f,g > = ∫_{0}^{1} f(x)g(x)dx, then E^{2} = || f(x) - ax - b ||^{2}, and the problem again goes over to finding
the best approximation to f from the span{1,x}, relative to
the norm from our inner product < f,g >.
One can carry this further. If f(x) has not only an upward trend, but is also concave up, then it makes sense to fit a quadratic to it. The problem described above would change to finding the quadratic polynomial f*(x) in span{1,x,x^{2}} that minimizes || f(x) - a_{0} - a_{1}x - a_{2}x^{2} ||^{2}.
Proof. Let's first show that if v* in U minimizes ||
v - u ||, then it satisfies the normal equations. The
way we do this is similar to the way we proved Schwarz's
inequality. Fix u in U and define
q(t) := || v - v* + tu ||^{2} = ||
v - v* ||^{2} + 2t< v - v*,
u > + t^{2} || u ||^{2}
Because v* minimizes || v - u ||^{2} over
all u in U, the minimum of q(t) is at t = 0. This means that t
= 0 is a critical point for q(t), so q'(0) = 0. Calculating q'(0) then
gives us 2< v - v*, u > = 0 for all
u in U. Dividing by 2 yields the normal equations.
Conversely, if v* in U satisfies < v - v*,
u > = 0, then we will show not only that v* is
a minimizer, but also that it is the minimizer; that
is, v* is unique. To do this, let u be any vector in
U. Observe that we can write v - u = v -
v* + v* - u = v - v* + u',
where u' := v* - u is in U. Consequently, we also
have
|| v - u ||^{2} = ||v - v* +
u' ||^{2}
|| v - u ||^{2} = ||v - v*
||^{2} + 2< v - v*, u' > + ||
u' ||^{2}.
Since we are assuming that v - v* is orthogonal to every
vector in U, it is orthogonal to u'; hence, < v -
v*, u' > = 0, and so we have that
|| v - u ||^{2} = ||v - v*
||^{2} + || u' ||^{2}.
It follows that || v - u || >= ||v - v*
||, so that v* is a minimizer. Now, if equality holds,
that is, if || v - u || = ||v - v* ||,
then we also have || u' || = 0. Consequently, u' =
0. But then, we have to have u = v*. So, the
vector v* is unique.
Proof. If the normal equations are satisfied, they hold for
every vector in U, including the basis vectors. Thus, the equations
above have to hold, too. On the other hand, suppose the equations
above are satisfied. We can write any vector u in U as u
= c_{1} w_{1}+ ... +
c_{n}w_{n}. It then follows from the equations
above that
< v - v*, u > = c_{1} < v
- v*,w_{1} > + ... + c_{n} <
v - v*,w_{n} > = c_{1}·0 +
... + c_{n}·0 = 0,
and so the normal equations hold.
The orthonormal set of Legendre polynomials is formed by using the
Gram-Schmidt process on {1, x, x^{2}, x^{3}, ...}
relative to the inner product < f,g > = ∫_{-1}^{1} f(x)g(x)dx.
We denote these polynomials by {p_{0}, p_{1},
p_{2}, p_{3}, ...}. Earlier, we had seen that
p_{0} = 2^{-½}, p_{1} =
(3/2)^{½ }x, p_{2} = (5/8)^{½
}(3x^{2} - 1).
We remark that there are similar formulas for all of the orthonormal
Legendre polynomials.
For each n, we look at the subspace U_{n} =
span{p_{0}, p_{1}, p_{2}, ...,
p_{n}}. Of course, U_{n} = P_{n}, the
polynomials of degree n or less. In our least squares minimization
problem, we identify f with v and the u_{k}'s with
the p_{k}'s. The minimizer for U_{n} is
f*_{n} = < f, p_{0} >
p_{0}+ ... + < f, p_{n} >
p_{n}.
The minimizers f*_{n} change with n in a very simple
way. Namely, to go from n to n+1, we only need to add a term to the
previous minimizer. If we formally let n tend to infinity, then we get
the infinite series
< f, p_{0} >
p_{0} + < f, p_{1} >
p_{1} + < f, p_{2} >
p_{2} + < f, p_{3} > p_{3} + ...
for which the minimizer f*_{n} is the nth partial
sum.
Let the minimum error over U_{n} be E_{n} = || f -
f*_{n} ||. Because U_{n} = P_{n}
is contained in U_{n+1} = P_{n+1}, we must have
E_{n+1} <= E_{n}. That is, E_{n} decreases
as n gets bigger. Does E_{n} go to 0 as n -> infinity?
If it does, we say that f*_{n} converges in the
mean. We also say that the series converges in the mean to f, and
we also write
f = < f, p_{0} >
p_{0} + < f, p_{1} >
p_{1} + < f, p_{2} >
p_{2} + < f, p_{3} > p_{3} + ... .
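We can watch this mean convergence numerically. For f(x) = x^{2}, which already lies in P_{2}, the partial sum through p_{2} reproduces f and the error E_{2} should vanish (up to quadrature error). A sketch using a simple trapezoid-rule inner product:

```python
import numpy as np

# Orthonormal Legendre polynomials p0, p1, p2 from the text, on [-1, 1].
def p0(x): return np.full_like(x, 2.0 ** -0.5)
def p1(x): return (1.5 ** 0.5) * x
def p2(x): return (5.0 / 8.0) ** 0.5 * (3.0 * x**2 - 1.0)

x = np.linspace(-1.0, 1.0, 20001)
dx = x[1] - x[0]

def inner(g, h):
    """<g,h> = integral of g*h over [-1,1], via the trapezoid rule."""
    y = g * h
    return (y[0] / 2 + y[1:-1].sum() + y[-1] / 2) * dx

# Expand f(x) = x^2; since f is in P_2, the partial sum f*_2 recovers f.
f = x ** 2
fstar = sum(inner(f, p(x)) * p(x) for p in (p0, p1, p2))
E2 = np.sqrt(inner(f - fstar, f - fstar))
print(E2)  # the mean-square error E_2, essentially 0
```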
To find A, we first find the output of L applied to each basis vector;
that is, L[1], L[x], and L[x^{2}]. Doing this, we obtain L[1]
= 3, L[x] = 1 + 5x, and L[x^{2}] = 2x + 9x^{2}. By the
construction in the theorem, the kth column of A is the coordinate
vector [ L[v_{k}] ]_{D}. Consequently, we have
[ L[1] ]_{D} = [3 0 0]^{T}
[ L[x] ]_{D} = [1 5 0]^{T}
[ L[x^{2}] ]_{D} = [0 2 9]^{T}
We have now found that the matrix A =
3 | 1 | 0 |
0 | 5 | 2 |
0 | 0 | 9 |
To solve L[p] = 18x^{2} - x + 2, we go over to the matrix form of the equation,
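In coordinates, L[p] = 18x^{2} - x + 2 becomes A[p]_{B} = [2 -1 18]^{T}. Solving this system gives p(x) = 1 - x + 2x^{2}, which can be checked against L[1], L[x], and L[x^{2}] above. A numerical sketch:

```python
import numpy as np

# Matrix of L relative to {1, x, x^2}: columns are [L[1]]_D, [L[x]]_D, [L[x^2]]_D.
A = np.array([[3.0, 1.0, 0.0],
              [0.0, 5.0, 2.0],
              [0.0, 0.0, 9.0]])

# 18x^2 - x + 2 has coordinate vector [2, -1, 18] relative to {1, x, x^2}.
rhs = np.array([2.0, -1.0, 18.0])

c = np.linalg.solve(A, rhs)
print(c)  # [1, -1, 2], i.e., p(x) = 1 - x + 2x^2
```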
M =
1 | -1 | 2 | 3 |
1 | 1 | 1 | 4 |
We let L be the transformation L[v] = Mv, where v is in R^{4}. For this problem, V = R^{4}. The co-domain or set of outputs W is R^{2}. Here, we also have image(L) = R^{2}. The null space is the set of all v for which Mv = 0. Using row-reduction, one can show that null(L) = span{ [-3 1 2 0]^{T}, [-7 -1 0 2]^{T} }.
Inverses Let L : V -> V be linear. As a function, if L is both one-to-one and onto, then it has an inverse K : V -> V. One can show that K is linear and that LK = KL = I, the identity transformation. We write K = L^{-1}.
Associated matrices Recall that if B = {v_{1},
... , v_{n}} and D = {w_{1}, ... ,
w_{m}} are bases for V and W, respectively, then the matrix
associated with the linear transformation L : V -> W is
M_{L} = [ [ L[v_{1}] ]_{D}, ... , [
L[v_{n}] ]_{D} ]
Since each of the combinations listed above is still a linear
transformation, it will have a matrix associated with it. Here is how
the various matrices are related.
Polynomials in L : V -> V We define powers of L in the
usual way: L^{2} = LL, L^{3} = LLL, and so on. A
polynomial in L is then the transformation
p(L) = a_{0}I + a_{1}L + ... + a_{m}L^{m}
Later on we will encounter the Cayley-Hamilton theorem, which says
that if V has dimension n, then there is a degree n (or less)
polynomial p for which p(L) is the 0 transformation.
A =
1 | 4 |
1 | -2 |
When this system is written out for this choice of A, it looks like
dx_{1}/dt = x_{1} + 4x_{2}, dx_{2}/dt = x_{1} - 2x_{2}.
The change-of-basis matrix is S = S_{B'->B} =
4 | -1 |
1 | 1 |
Of course, S_{B->B'} = S^{-1}. Relative to the new basis, the matrix of L is A' = M'_{L} = S^{-1}AS =
2 | 0 |
0 | -3 |
Letting Z = [x]_{B'} = [z_{1} z_{2}]^{T}, we have the new system dZ/dt = A'Z, with Z(0) = [x(0)]_{B'}. In the new coordinates the system decouples and becomes
dz_{1}/dt = 2z_{1}, dz_{2}/dt = -3z_{2},
with solution Z(t) =
exp(2t) | 0 |
0 | exp(-3t) |
Z(0). This explicitly solves the problem. However, we still have to explain how to find the eigenvalues and eigenvectors of A.
For A =
1 | 4 |
1 | -2 |
we see that p_{A}(µ) = det(A - µI) = µ^{2} + µ - 6 = (µ+3)(µ-2). Thus the eigenvalues are µ = 2 and µ = -3. Now, we can solve for the eigenvectors. When µ = 2, the corresponding eigenvector X satisfies (A - 2I)X = 0. In augmented form, this becomes
-1 | 4 | 0 |
1 | -4 | 0 |
which has X = x_{2}·[4 1]^{T} as a solution. Repeating the argument for µ = -3 results in X = x_{2}·[-1 1]^{T}. To get the basis we want, we choose x_{2} = 1 in both cases. Other nonzero values will work equally well.
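A quick numerical check of this diagonalization: with S built from the eigenvectors [4 1]^{T} and [-1 1]^{T}, the product S^{-1}AS should be diag(2, -3).

```python
import numpy as np

A = np.array([[1.0, 4.0], [1.0, -2.0]])

# Columns of S are the eigenvectors for mu = 2 and mu = -3.
S = np.array([[4.0, -1.0], [1.0, 1.0]])

print(np.linalg.inv(S) @ A @ S)  # diag(2, -3)
```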
Consider the matrix
1 | a |
0 | 1 |
where a ≠ 0. This matrix is not diagonalizable. Its only eigenvalue is µ = 1, and the only eigenvectors have the form c·[1 0]^{T}, where c is a scalar. There is no second linearly independent eigenvector, and so there is no basis of eigenvectors.
2 | -1 |
-1 | 2 |
A normal mode of this system is a solution of the form X(t) = X_{w}e^{iwt}, where X_{w} is independent of t and not equal to 0. Plugging this solution back into the matrix equation yields, after cancelling e^{iwt},
Recall that we have these relations among currents and voltages in the
circuit components.
V_{L} = L dI/dt
I_{C} = CdV/dt
V_{R1} = R_{1}(I + I_{C}) = R_{1}(I +
CdV/dt)
V_{R2} = R_{2}I
Kirchhoff's laws for this circuit are as follows. For the loop
E-R1-C, we have
E = R_{1}(I + CdV/dt) + V
For the loop R2-L-C, we have
V = R_{2}I + L dI/dt
Rearranging these equations gives us
dI/dt = V/L - R_{2}I/L
dV/dt = (E - V)/(CR_{1}) - I/C
We want to put this in matrix form. Let X = [I V]^{T}, F(t) =
[0 E/(CR_{1})]^{T}, and finally let A =
-R_{2}/L | 1/L |
-1/C | -1/(CR_{1}) |
With A =
-1 | 1 |
-1 | -1 |
the characteristic polynomial for this matrix is p_{A}(µ) = (-1-µ)^{2} + 1. The two eigenvalues are then µ_{±} = -1 ± i.
For µ_{+} = -1 + i, the augmented matrix for (A - µ_{+}I)X = 0 is
-i | 1 | 0 |
-1 | -i | 0 |
This is equivalent to the single equation -ix_{1} + x_{2} = 0. Up to nonzero multiples, the eigenvector for µ_{+} is [1 i]^{T}. Either by repeating these steps with µ_{-} or by taking complex conjugates, we have that the eigenvector for µ_{-} is [1 -i]^{T}. As in our previous examples, we form the change-of-basis matrix S = S_{B' -> B} =
1 | 1 |
i | -i |
Setting Z = S^{-1}X, the system becomes dZ/dt = S^{-1}ASZ, with solution
Z(t) =
e^{(-1+i)t} | 0 |
0 | e^{(-1-i)t} |
Z(0).
A =
5 | 1 | 1 |
1 | 5 | 1 |
1 | 1 | 5 |
We wish to find coordinates relative to which the "cross terms" are removed. A set of axes that has this property is called principal. Since A is self-adjoint, it can be diagonalized by an orthogonal transformation S. That is, there is a diagonal matrix D such that D = S^{T}AS. Because S is orthogonal, we also have A = SDS^{T}. Now, let Y = [u v w]^{T} = S^{T}X. We see that
We now will finish the problem by finding the eigenvalues and eigenvectors of A. The characteristic polynomial of A is p_{A}(µ) = det(A - µI) =
5-µ | 1 | 1 |
1 | 5-µ | 1 |
1 | 1 | 5-µ |
Carrying out row and column operations to simplify this determinant, we find that
p_{A}(µ) = (7-µ)(4-µ)^{2}.
Thus the eigenvalues are µ = 7 and the double eigenvalue µ = 4. When µ = 4, the augmented matrix for the system (A - 4I)X = 0 is
1 | 1 | 1 | 0 |
1 | 1 | 1 | 0 |
1 | 1 | 1 | 0 |
When µ = 7, the augmented matrix for (A - 7I)X = 0 is
-2 | 1 | 1 | 0 |
1 | -2 | 1 | 0 |
1 | 1 | -2 | 0 |
Solving these systems and normalizing the solutions, we obtain the orthogonal matrix
S =
-2^{-½} | -6^{-½} | 3^{-½} |
2^{-½} | -6^{-½} | 3^{-½} |
0 | 2·6^{-½} | 3^{-½} |
whose columns are orthonormal eigenvectors of A.
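Since A is symmetric, numpy.linalg.eigh diagonalizes it with an orthogonal matrix, which lets us check the eigenvalues 4, 4, 7 and the relation D = S^{T}AS:

```python
import numpy as np

A = np.array([[5.0, 1.0, 1.0],
              [1.0, 5.0, 1.0],
              [1.0, 1.0, 5.0]])

# eigh returns eigenvalues in ascending order and orthonormal eigenvectors
# as the columns of S, so that S^T A S is diagonal.
w, S = np.linalg.eigh(A)
print(w)            # [4, 4, 7]
print(S.T @ A @ S)  # diag(4, 4, 7), up to rounding
```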
A =
-3 | 1 |
-1 | -1 |
The characteristic polynomial for this matrix is p_{A}(µ) = µ^{2} + 4µ + 4 = (µ+2)^{2}. Thus, there is only one eigenvalue, µ = -2. The augmented matrix representing the system (A - µI)X = 0 is
-1 | 1 | 0 |
-1 | 1 | 0 |
This is equivalent to the single equation -x_{1} + x_{2} = 0. Hence, up to nonzero multiples, the eigenvector for µ = -2 is [1 1]^{T}. There are no other linearly independent eigenvectors, and so A is not diagonalizable. What this means is that the system of equations doesn't completely decouple. It does partially decouple, though. Consider a new basis B' = {[1 1]^{T}, [0 1]^{T}}. As before, let S = S_{B'->B} =
1 | 0 |
1 | 1 |
Setting Z = S^{-1}X, the system becomes dZ/dt = S^{-1}ASZ. Doing a little matrix algebra, we see that S^{-1}AS =
-2 | 1 |
0 | -2 |
This is called the Jordan normal (canonical) form of A. The new system is
dz_{1}/dt = -2z_{1} + z_{2}, dz_{2}/dt = -2z_{2},
with solution Z(t) =
e^{-2t} | te^{-2t} |
0 | e^{-2t} |
Z(0).
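A short check that this basis really produces the Jordan form:

```python
import numpy as np

A = np.array([[-3.0, 1.0], [-1.0, -1.0]])

# Basis B' = {[1,1]^T, [0,1]^T} from the text, as the columns of S.
S = np.array([[1.0, 0.0], [1.0, 1.0]])

J = np.linalg.inv(S) @ A @ S
print(J)  # the Jordan form [[-2, 1], [0, -2]]
```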
T_{1,1} | 0 | 0 | ... | 0 |
0 | T_{2,2} | 0 | ... | 0 |
... | ... | ... | ... | ... |
0 | 0 | 0 | ... | T_{r,r} |
µ_{7} | * | * | * |
0 | µ_{7} | * | * |
0 | 0 | µ_{7} | * |
0 | 0 | 0 | µ_{7} |
3 | 1 | 0 | 0 | 0 | 0 |
0 | 3 | 1 | 0 | 0 | 0 |
0 | 0 | 3 | 1 | 0 | 0 |
0 | 0 | 0 | 3 | 0 | 0 |
0 | 0 | 0 | 0 | 3 | 1 |
0 | 0 | 0 | 0 | 0 | 3 |
Af_{1} = µf_{1}
Af_{2} = µf_{2} + f_{1}
...
Af_{m} = µf_{m} + f_{m-1}
A =
2 | 1 | -1 |
0 | 2 | 3 |
0 | 0 | 2 |
We begin with the eigenvector f_{1} = (1,0,0)^{T}. Solving (A - 2I)f_{2} = f_{1} gives f_{2} = (0,1,0)^{T}. Finally, solving (A - 2I)f_{3} = f_{2} gives f_{3} = (0,1/3,1/3)^{T}. Thus, S^{-1}AS = J_{3}(2), where S = [f_{1}, f_{2}, f_{3}].
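Stacking the chain vectors as the columns of S, we can verify S^{-1}AS = J_{3}(2) numerically:

```python
import numpy as np

A = np.array([[2.0, 1.0, -1.0],
              [0.0, 2.0, 3.0],
              [0.0, 0.0, 2.0]])

# Jordan chain: f1 an eigenvector, (A-2I)f2 = f1, (A-2I)f3 = f2.
f1 = np.array([1.0, 0.0, 0.0])
f2 = np.array([0.0, 1.0, 0.0])
f3 = np.array([0.0, 1/3, 1/3])

S = np.column_stack([f1, f2, f3])
print(np.linalg.inv(S) @ A @ S)  # the Jordan block J_3(2)
```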
Displacements | Dual space | |
---|---|---|
Basis | (J^{T})^{-1} | J |
Components | J | (J^{T})^{-1} |
Original Basis | Reciprocal Basis | ||
---|---|---|---|
Representation | v = v^{1}f_{1}+ v^{2}f_{2}+ v^{3}f_{3} | v = v_{1}f^{1}+ v_{2}f^{2}+ v_{3}f^{3} | |
Components | v^{j} = f^{j}·v = SUM_{k}g^{j,k}v_{k} | v_{j} = f_{j}·v = SUM_{k}g_{j,k}v^{k} | |
Transformation matrix | J | (J^{T})^{-1} | |
Transformation law | Contravariant | Covariant |
There is another way to view the reciprocal basis vectors, a way that is similar to viewing basis vectors as tangent vectors to the coordinate curves. Let F(x^{1}, x^{2}, x^{3}) = C be the level surfaces for a function F. Recall that at a point P(x^{1}, x^{2}, x^{3}), the vector ∇F is normal to the plane tangent to F=C at P. Applying this to the coordinate surface q^{3}(x^{1}, x^{2}, x^{3}) = c^{3} = constant, we see that ∇q^{3} is perpendicular to the tangent vectors to the q^{1} and q^{2} coordinate curves. Since these tangent vectors are precisely the two basis vectors e_{1} and e_{2}, it follows that ∇q^{3} is parallel to e_{1}×e_{2} and hence to e^{3}. In fact, we will show that they are equal.
1 | 0 | 0 |
0 | r^{2} | 0 |
0 | 0 | 1 |
1+4(q^{1})^{2} | 1-2q^{1} |
1-2q^{1} | 2 |
2 | 2q^{1}-1 |
2q^{1}-1 | 1+4(q^{1})^{2} |
Input basis | Output basis | Matrix | Column k | (j,k)-entry | Tensor type |
---|---|---|---|---|---|
B | B | M | [T(e_{k})]_{B} | T^{j}_{ k} | Mixed |
B | B_{r} | N | [T(e_{k})]_{Br} | T_{j}_{ k} | Covariant |
B_{r} | B_{r} | P | [T(e^{k})]_{Br} | T_{j}^{ k} | Mixed |
B_{r} | B | Q | [T(e^{k})]_{B} | T^{j}^{ k} | Contravariant |
The names under the heading matrix are arbitrary labels. They
are used only here and nowhere else. Using the change of basis
formulas from the previous section, we can write all of the matrices in
terms of M, g, and g^{-1}.
N = gM and T_{j}_{ k} = ∑g_{j
m}T^{m}_{ k}
P = gMg^{-1} and T_{j}^{ k} = ∑g_{j
m} g^{k n} T^{m}_{ n}
Q = Mg^{-1} and T^{j}^{ k} =
∑g^{k m} T^{j}_{ m}
The point is that once one set of components is determined, so are all
of the rest.
Green's Theorem Let C be a piecewise smooth simple closed curve that is the boundary of its interior region R. If F(x,y) = A(x,y)i + B(x,y)j is a vector-valued function that is continuously differentiable on and in C, then
(Figure: a small parallelepiped of fluid with edges Vdt, f_{1}du^{1}, and f_{2}du^{2}.)
The mass of the fluid crossing the base in the time from t to t+dt is then density×volume, or
(µVdt)·(f_{1}×f_{2}) du^{1}du^{2}
Thus the mass per unit time crossing the base is
F·N du^{1}du^{2}, where F
= µV, and N =
f_{1}×f_{2} is the standard
normal. Recall that the area of the surface element is dS =
|N|du^{1}du^{2}. Consequently the mass per unit
time crossing the base is F·n dS, where n
is the unit normal. Integrating over the whole surface then yields
This surface integral is called the flux of the vector field F.
To state this theorem, we also need to define the curl
of a vector field
F(x)=A(x,y,z)i + B(x,y,z)j
+C(x,y,z)k.
We will assume that F has continuous partial derivatives. The
curl is then defined by
∇×F = (∂C/∂y − ∂B/∂z)i + (∂A/∂z − ∂C/∂x)j + (∂B/∂x − ∂A/∂y)k.
There is a useful physical interpretation for the curl. Suppose that a
fluid is rotating about a fixed axis with angular velocity
ω. Define ω to be the vector with magnitude ω
and with direction along the axis of rotation. The velocity of an
element of the fluid located at the position with radius vector
x is v(x) = ω×x. With a
little work, one can show that ω =
½∇×v. Thus one half of the curl of the
velocity vector v is the vector ω mentioned above.
Stokes' Theorem Let S be an orientable surface bounded by a simple closed positively oriented curve C. If F is a continuously differentiable vector-valued function defined in a region containing S, then
We will first compute the line integral over C. In the xy-plane, C is
parameterized via
x(t) = 3 cos(t) i + 3 sin(t) j, 0 ≤ t
≤ 2π,
and so we have:
dx = (- 3 sin(t) i + 3 cos(t) j)dt
F(x(t)) = 2·3 sin(t)i + 3^{2} cos(t)j - 0^{2}k = 2·3 sin(t)i + 3^{2} cos(t)j
F·dx = (- 18 sin^{2}(t) + 27 cos^{2}(t))dt
∫_{C}F·dx = ∫_{0}^{2π}(- 18 sin^{2}(t) + 27 cos^{2}(t))dt = 9π
We now turn to finding the surface integral ∫∫_{S} ∇×F·n dσ. The normal compatible with the orientation of C is n = x/|x| = x/3. Thus, on the surface of the hemisphere S, we have
∫∫_{S} ∇×F·n dσ = ∫∫_{S} k·n dσ
∫∫_{S} ∇×F·n dσ = ∫_{0}^{2π} ∫_{0}^{½π} cos(θ) 3^{2} sin(θ)dφdθ
∫∫_{S} ∇×F·n dσ = 9π
Since both terms in Stokes's Theorem have the same value, we have verified the theorem in this case.
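The line-integral side can also be checked by numerical quadrature of the integrand -18 sin^{2}(t) + 27 cos^{2}(t) over [0, 2π]; the result should be 9π. A sketch using the trapezoid rule:

```python
import numpy as np

# F.dx along the circle of radius 3, parameterized by t in [0, 2*pi].
t = np.linspace(0.0, 2.0*np.pi, 20001)
integrand = -18.0*np.sin(t)**2 + 27.0*np.cos(t)**2

dt = t[1] - t[0]
line_integral = (integrand[0]/2 + integrand[1:-1].sum() + integrand[-1]/2) * dt
print(line_integral / np.pi)  # 9, agreeing with the surface integral 9*pi
```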
Divergence Theorem Let V be a region in 3D bounded by a closed, piecewise smooth, orientable surface S; let the outward-drawn normal be n. Then,
To do this we must compute both integrals in the Divergence
Theorem. We will first do the volume integral. It is easy to check
that ∇·F = 3+1+2=6. Hence, we have that
∫∫∫_{V}∇·FdV =
∫∫∫_{V}6dV = 6π·4^{2}·5 =
480π
The surface integral must be broken into three parts: one for the top
cap, a second for the curved sides, and a third for the bottom cap.
∫∫_{S} F·ndσ =
∫∫_{top} F·ndσ +
∫∫_{sides} F·ndσ +
∫∫_{bottom} F·ndσ
The outward normals for the top and bottom caps are k and
−k, respectively. For the top (z = 5), we are integrating
F(x,y,5)·k = 2·5 = 10, and for the
bottom (z = 0), F(x,y,0)·(−k) =
−2·0 = 0. Hence, we have
∫∫_{top} F·ndσ =
∫∫_{top} 10dσ = 10π4^{2} = 160π
∫∫_{bottom} F·ndσ =
∫∫_{bottom} 0dσ = 0
The integral over the curved sides will require a little more
effort. The outward normal (see my notes, Surfaces,
pg. 5) and area element are, respectively,
n = cos(θ)i + sin(θ)j and dσ =
4dθdz.
In addition, on the curved sides
F(4cos(θ),4sin(θ),z) = 12cos(θ)i +
4sin(θ)j + 2zk, so F·n
= 12cos^{2}(θ) + 4sin^{2}(θ).
The surface integral over the curved sides is then given by
∫∫_{sides} F·ndσ =
∫_{0}^{5}∫_{0}^{2π}
(12cos^{2}(θ) + 4sin^{2}(θ))4dθdz =
5·4(12π+4π)= 320π.
Combining these three integrals, we obtain
∫∫_{S} F·ndσ =
160π+320π + 0 = 480π,
which agrees with the result from the volume integral. Thus we have
verified the Divergence Theorem in this case.
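As a numerical cross-check of the hardest piece, we can integrate (12cos^{2}(θ) + 4sin^{2}(θ))·4 over θ and z by the trapezoid rule; the result should be the 320π found for the curved sides:

```python
import numpy as np

# Integrand over the curved sides of the cylinder (radius 4, height 5).
theta = np.linspace(0.0, 2.0*np.pi, 20001)
g = (12.0*np.cos(theta)**2 + 4.0*np.sin(theta)**2) * 4.0

dtheta = theta[1] - theta[0]
theta_integral = (g[0]/2 + g[1:-1].sum() + g[-1]/2) * dtheta
sides = 5.0 * theta_integral  # the z-integration just multiplies by the height

print(sides / np.pi)  # 320, so the total flux is 160*pi + 320*pi + 0 = 480*pi
```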
Type | Eigenvalues of A | Example | Variables | |
---|---|---|---|---|
Parabolic | +++0 | Heat equation | 3 space, 1 time | |
Elliptic | +++ | Laplace's equation | 3 space | |
Hyperbolic | +++- | Wave equation | 3 space, 1 time |
We remark that if the general PDE is multiplied by a minus sign, the patterns in the table will have "+" replaced by "−". In general, the solutions to the various types of equations behave like the corresponding example. For instance, hyperbolic equations have solutions that propagate in time, like those for the wave equation, while parabolic equations have solutions that behave like temperature in a heat flow problem. For further discussion, see section V.8 in Z/T.
∇^{2}u = r^{-1}∂/∂r[r∂u/∂r] + r^{-2}∂^{2}u/∂θ^{2} = 0
Because we are using polar coordinates, which are singular at r = 0 and have a discontinuity at θ = ±π, we have two additional "boundary conditions" -- namely, that u(r,θ) is well behaved (bounded) as r approaches 0 and that u is 2π periodic in θ.
u(a,θ) = f(θ) = known temperature on boundary (Dirichlet boundary condition).
If we ignore the nonhomogeneous boundary condition, u(a,θ) = f(θ), then the set of solutions is a vector space. Our aim is to construct a basis for this space. Separation of variables is a method for finding a basis. Once we have accomplished this, we then find the linear combination that also satisfies the nonhomogeneous condition.
r^{-1}∂/∂r[r∂u/∂r] + r^{-2}∂^{2}u/∂θ^{2} = 0
The solutions that we want have the form u(r,θ) = R(r)Θ(θ). Plugging into the equation gives us
u(r,θ) is bounded as r approaches 0
u(r,θ) is 2π periodic in θ.
r^{-1}[rR′]′Θ + r^{-2}RΘ″ = 0
If we now multiply this equation by r^{2} and divide by RΘ, we arrive at this equation:
r[rR′]′/R + Θ″/Θ = 0
Since r[rR′]′/R is a function of r only, and since Θ″/Θ is a function of θ only, it follows that both are constant. If we let μ = r[rR′]′/R, then Θ″/Θ = -μ. With a little algebra, we obtain the separation equations,
r[rR′]′ - μR = 0 and Θ″ + μΘ = 0.
Find all possible values of μ for which the problem
Θ″ + μΘ = 0 and Θ(θ) = Θ(θ+2π)
has a nonzero solution Θ. These values of μ are called eigenvalues, while the corresponding solutions Θ are called eigenfunctions. We can immediately eliminate μ < 0. The solutions to Θ″ − |μ|Θ = 0 are linear combinations of exp(±|μ|^{½}θ), which always blow up as θ approaches +∞ or −∞ or both. They therefore cannot be periodic. For μ = 0, we do have a single periodic solution, namely Θ = 1. The second solution is Θ(θ) = θ, which is not periodic.
This leaves the case in which μ > 0. The differential equation Θ″ + μΘ = 0 has two solutions, sin(μ^{½}θ) and cos(μ^{½}θ). These solutions are periodic with fundamental period 2πμ^{−½}. They will also have 2π as a period if and only if some integer multiple of 2πμ^{−½} is 2π. Thus, μ > 0 is an eigenvalue if and only if there is an integer n > 0 such that 2πμ^{−½}n = 2π. It follows that μ = n^{2} and that Θ(θ) is a linear combination of sin(nθ) and cos(nθ).
Eigenvalues μ | Eigenfunctions Θ(θ) |
---|---|
0^{2} | 1 |
1^{2} | cos(θ), sin(θ) |
2^{2} | cos(2θ), sin(2θ) |
⋮ | ⋮ |
n^{2} | cos(nθ), sin(nθ) |
⋮ | ⋮ |
When μ = n^{2}, n ≥ 1, R satisfies the equation r[rR′]′ - n^{2}R = 0. Working out the derivatives, we see that this is the equation
r^{2}R″ + rR′ - n^{2}R = 0,
which is a Cauchy-Euler equation. The technique for solving it is to assume a solution of the form R = r^{α} and determine α. Carrying this out, we obtain
α(α−1)r^{2}r^{α−2} + αrr^{α−1} − n^{2}r^{α} = 0
(α(α−1) + α − n^{2})r^{α} = 0
(α^{2} − n^{2})r^{α} = 0
Dividing the last equation by r^{α}, we see that α^{2} − n^{2} = 0, and so α = ±n and the possible solutions are linear combinations of r^{n} and r^{−n}. Of these two, only r^{n} is bounded as r approaches 0. Thus, only R(r) = r^{n} can be used. The separation solutions that we have obtained are listed in the table below.
n | R(r) | Θ(θ) | u = RΘ |
---|---|---|---|
0 | 1 | 1 | 1 |
1 | r | cos(θ), sin(θ) | r cos(θ), r sin(θ) |
2 | r^{2} | cos(2θ), sin(2θ) | r^{2}cos(2θ), r^{2}sin(2θ) |
⋮ | ⋮ | ⋮ | ⋮ |
n | r^{n} | cos(nθ), sin(nθ) | r^{n}cos(nθ), r^{n}sin(nθ) |
⋮ | ⋮ | ⋮ | ⋮ |
u(r,θ) = A_{0} + ∑_{n≥1}(A_{n}r^{n}cos(nθ) + B_{n}r^{n}sin(nθ)).
To match the boundary condition u(a,θ) = f(θ), we need to find coefficients such that
f(θ) = A_{0} + ∑_{n≥1}(A_{n}a^{n}cos(nθ) + B_{n}a^{n}sin(nθ))
holds. We have already seen that we can represent f this way via its Fourier series. Indeed, this type of problem was Fourier's motivation for introducing such series! All we need to do now is to identify the Fourier coefficients for f with the coefficients above: a_{0} = A_{0}, a_{n} = A_{n}a^{n}, and b_{n} = B_{n}a^{n}. The final solution is then
u(r,θ) = a_{0} + ∑_{n≥1}(r/a)^{n}(a_{n}cos(nθ) + b_{n}sin(nθ)),
where a_{n} and b_{n} are the Fourier coefficients for f.
f(θ) = ½π - (4/π)∑_{k≥1} (2k−1)^{−2}cos((2k−1)θ)
By what we said above, the temperature u(r,θ) corresponding to this f is
u(r,θ) = ½π - (4/π)∑_{k≥1} (r/a)^{2k−1 }(2k−1)^{−2} cos((2k−1)θ)
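A numerical sketch of this solution: truncating the series at a finite number of terms (2000 here, an arbitrary choice, with a = 1), the boundary values u(a,θ) should approach f(θ), which is the Fourier series of |θ| on [-π, π] -- for example, u(a,0) ≈ 0 and u(a,π) ≈ π.

```python
import numpy as np

def u(r, theta, a=1.0, terms=2000):
    """Truncated series ½π - (4/π) Σ (r/a)^(2k-1) (2k-1)^{-2} cos((2k-1)θ)."""
    k = np.arange(1, terms + 1)
    n = 2*k - 1
    return 0.5*np.pi - (4.0/np.pi) * np.sum((r/a)**n * np.cos(n*theta) / n**2)

print(u(1.0, 0.0))    # approximately 0
print(u(1.0, np.pi))  # approximately pi
```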