OBSOLETE WOLFRAM LANGUAGE TUTORIAL
Additional functionality related to this tutorial has been introduced in subsequent versions of Mathematica. For the latest information, see the current Wolfram Language documentation.
Matrix Computations
This tutorial reviews the functions that Mathematica provides for carrying out matrix computations. Further information on these functions can be found in standard texts on numerical linear algebra. The operations described in this tutorial are, in general, specific to matrices; an exception is the computation of norms, which also extends to scalars and vectors.
Basic Operations
This section gives a review of some basic concepts and operations that will be used throughout the tutorial to discuss matrix operations.
The norm of a mathematical object is a measurement of the length, size, or extent of the object. In Mathematica norms are available for scalars, vectors, and matrices.
Norm[num]    norm of a number
Norm[vec]    2-norm of a vector
Norm[vec,p]    p-norm of a vector
Norm[mat]    2-norm of a matrix
Norm[mat,p]    p-norm of a matrix
Norm[mat,"Frobenius"]    Frobenius norm of a matrix
Computing norms in Mathematica.
For numbers, the norm is the absolute value.
Vector Norms
For vector spaces, norms allow a measure of distance. This allows the definition of many familiar concepts such as neighborhood, closeness of approximation, and goodness of fit. A vector norm is a function ‖x‖ that satisfies the following relations.
‖x‖ ≥ 0, with ‖x‖ = 0 only when x = 0
‖α x‖ = |α| ‖x‖ for any scalar α
‖x + y‖ ≤ ‖x‖ + ‖y‖
Typically this function uses the notation ‖x‖_p. The subscript p distinguishes different norms, of which the p-norms are particularly important. For p ≥ 1, the p-norm is defined as follows.
‖x‖_p = (|x_1|^p + |x_2|^p + … + |x_n|^p)^(1/p)
Some common p-norms are the 1-, 2-, and ∞-norms.
In Mathematica, vector p-norms can be computed with the function Norm. The 1-, 2-, and ∞-norms are demonstrated in the following examples.
The 2-norm is particularly useful and is the default.
Norms are implemented for vectors with exact numerical entries.
Norms are also implemented for vectors with symbolic entries.
In addition, if a symbolic p is used, the result is symbolic.
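For concreteness, here is a brief sketch of these cases; the vectors and symbols are chosen for illustration and are not from the original tutorial.
In[1]:= v = {3, -4, 12};
In[2]:= {Norm[v, 1], Norm[v], Norm[v, Infinity]}
Out[2]= {19, 13, 12}
In[3]:= Norm[{a, b}, p]        (* symbolic entries and a symbolic p give a symbolic result *)
Out[3]= (Abs[a]^p + Abs[b]^p)^(1/p)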
Matrix Norms
Matrix norms are used to give a measure of distance in matrix spaces. This is necessary if you want to quantify what it means for one matrix to be near to another, for example, to say that a matrix is nearly singular.
Matrix norms also use the double bar notation of vector norms. One of the most common matrix norms is the Frobenius norm (also called the Euclidean norm).
‖A‖_F = sqrt(Σ_{i,j} |a_{i,j}|^2)
Other common norms are the p-norms. These are defined in terms of vector p-norms as follows.
‖A‖_p = max_{x ≠ 0} ‖A.x‖_p / ‖x‖_p
Thus the matrix p-norms measure the maximum expansion that a matrix can apply to any vector.
In Mathematica the Frobenius norm can be computed as follows.
The matrix 2-norm can be computed as follows.
It is also possible to give an argument to specify the 1-, 2-, or ∞- matrix p-norms.
This computes the expansion that a collection of different vectors gets from an input matrix. You can see that the maximum value is still less than the 2-norm.
All the p-norms are supported for exact numerical matrices.
However, only the 1- and ∞- matrix p-norms are supported for symbolic matrices.
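A short illustrative sketch of the matrix norms discussed above (the matrix is an arbitrary example):
In[1]:= m = {{1., 2.}, {3., 4.}};
In[2]:= Norm[m, "Frobenius"]          (* square root of the sum of squared entries *)
Out[2]= 5.47723
In[3]:= Norm[m]                       (* 2-norm: the largest singular value *)
Out[3]= 5.46499
In[4]:= {Norm[m, 1], Norm[m, Infinity]}   (* maximum column sum and maximum row sum *)
Out[4]= {6., 7.}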
One of the fundamental subspaces associated with each matrix is the null space. Vectors in the null space of a matrix are mapped to zero by the action of the matrix.
NullSpace[m]    a list of basis vectors whose linear combinations satisfy m.x == 0
For some matrices (for instance, nonsingular square matrices), the null space is empty.
This matrix has a null space with one vector.
This matrix has a null space with two basis vectors.
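A minimal sketch of NullSpace on an illustrative matrix:
In[1]:= m = {{1, 2, 3}, {2, 4, 6}, {0, 0, 1}};
In[2]:= NullSpace[m]
Out[2]= {{-2, 1, 0}}
In[3]:= m . First[NullSpace[m]]        (* the basis vector is mapped to zero *)
Out[3]= {0, 0, 0}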
The rank of a matrix corresponds to the number of linearly independent rows or columns in the matrix.
MatrixRank[mat]    give the rank of the matrix mat
If an m×n matrix has no linear dependencies between its rows, its rank is equal to m and it is said to be full rank.
If a matrix has any dependencies in its rows, then its rank is less than m. In this case the matrix is said to be rank deficient.
Note that the rank of a matrix is equal to the rank of its transpose. This means that the number of linearly independent rows is equal to the number of linearly independent columns.
For an m×n matrix A the following relations hold: Length[NullSpace[A]] + MatrixRank[A] == n, and Length[NullSpace[Transpose[A]]] + MatrixRank[Transpose[A]] == m. From this it follows that the null space of A is empty if and only if the rank is equal to n, and that the null space of the transpose of A is empty if and only if the rank of A is equal to m.
If the rank is equal to the number of columns, it is said to have full column rank. If the rank is equal to the number of rows, it is said to have full row rank. One way to understand the rank of a matrix is to consider the row echelon form.
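The following sketch illustrates MatrixRank and the rank-nullity relation stated above (the matrix is illustrative):
In[1]:= m = {{1, 2, 3}, {2, 4, 6}, {0, 1, 1}};
In[2]:= {MatrixRank[m], MatrixRank[Transpose[m]]}
Out[2]= {2, 2}
In[3]:= Length[NullSpace[m]] + MatrixRank[m]     (* rank plus nullity equals the number of columns *)
Out[3]= 3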
Reduced Row Echelon Form
A matrix can be reduced to a row echelon form by a combination of row operations that start with a pivot position at the top-left element and subtract multiples of the pivot row from following rows so that all entries in the column below the pivot are zero. The next pivot is chosen by going to the next row and column. If this pivot is zero and any nonzero entries are in the column beneath, the rows are exchanged and the process is repeated. The process finishes when the last row or column is reached.
A matrix is in row echelon form if any row that consists entirely of zeros is followed only by other zero rows and, if the first nonzero entry in row i is in column j, then the elements in columns 1 through j in all rows below i are zero. The row echelon form is not unique, but its shape (in the sense of the positions of the pivots) is unique. It also gives a way to determine the rank of a matrix as the number of nonzero rows.
A matrix is in reduced row echelon form if it is in row echelon form, each pivot is one, and all the entries above (and below) each pivot are zero. It can be formed by a procedure similar to the procedure for the row echelon form, also taking steps to reduce the pivot to one (by division) and reduce entries in the column above each pivot to zero (by subtracting multiples of the current pivot row). The reduced row echelon form of a matrix is unique.
The reduced row echelon form (and row echelon form) gives a way to determine the rank of a matrix as the number of nonzero rows. In Mathematica the reduced row echelon form of a matrix can be computed by the function RowReduce.
RowReduce[mat]    give the reduced row echelon form of the matrix mat
The reduced row echelon form of this matrix only has one nonzero row. This means that the rank is 1.
Out[2]//MatrixForm=
This is a 3×2 random matrix whose columns are linearly independent.
Out[4]//MatrixForm=
The reduced row echelon form has two nonzero rows; thus, its rank should be 2.
Out[5]//MatrixForm=
MatrixRank also computes the rank as 2.
The reduced row echelon form of the transpose of the matrix also has two nonzero rows. This is consistent with the fact that the rank of a matrix is equal to the rank of its transpose, even when the matrix is rectangular.
Out[7]//MatrixForm=
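A short sketch relating RowReduce to the rank, using an illustrative rank-deficient matrix:
In[1]:= m = {{1, 2, 3}, {2, 4, 6}, {0, 1, 1}};
In[2]:= RowReduce[m]
Out[2]= {{1, 0, 1}, {0, 1, 1}, {0, 0, 0}}
In[3]:= MatrixRank[m]                  (* equals the number of nonzero rows in the reduced form *)
Out[3]= 2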
The inverse of a square matrix A is defined by A.A⁻¹ == A⁻¹.A == I, where I is the identity matrix. The inverse can be computed in Mathematica with the function Inverse.
Inverse[mat]    give the inverse of the matrix mat
Here is a sample 2×2 matrix.
Out[3]//MatrixForm=
This demonstrates that the matrix satisfies the definition.
Out[4]//MatrixForm=
Not all matrices have an inverse; if a matrix does not have an inverse it is said to be singular. If a matrix is rank deficient then it is singular.
The matrix inverse can in principle be used to solve the matrix equation A.x == b by multiplying both sides by the inverse of A.
However, it is almost always better to solve the matrix equation directly; such techniques are discussed in the section on solving linear systems.
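A minimal sketch of Inverse (the 2×2 matrix is illustrative):
In[1]:= m = {{1, 2}, {3, 4}};
In[2]:= Inverse[m]
Out[2]= {{-2, 1}, {3/2, -1/2}}
In[3]:= m . Inverse[m]                 (* recovers the identity matrix *)
Out[3]= {{1, 0}, {0, 1}}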
PseudoInverse
When the matrix is singular or is not square, it is still possible to find an approximate inverse X that minimizes ‖A.X − I‖. When the 2-norm is used, this leads to least squares solutions; the resulting matrix is known as the pseudoinverse.
PseudoInverse[mat]    give the pseudoinverse of the matrix mat
This finds the pseudoinverse for the singular matrix defined earlier.
Out[3]//MatrixForm=
The result is not the identity matrix, but it is "close" to the identity matrix.
Out[4]//MatrixForm=
This computes the pseudoinverse of a rectangular matrix.
Out[7]//MatrixForm=
The result is quite close to the identity matrix.
Out[8]//MatrixForm=
The solution found by the pseudoinverse is a least squares solution. These are discussed in more detail under Least Squares Solutions.
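As a separate illustration (not the matrices used above), a sketch of PseudoInverse on a singular matrix:
In[1]:= m = {{1, 2}, {2, 4}};                (* a singular matrix *)
In[2]:= pinv = PseudoInverse[m]
Out[2]= {{1/25, 2/25}, {2/25, 4/25}}
In[3]:= m . pinv                             (* not the identity, but the closest fit in the least squares sense *)
Out[3]= {{1/5, 2/5}, {2/5, 4/5}}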
Determinant
The determinant of an n×n matrix A is defined as follows.
det(A) = Σ_σ sgn(σ) a_{1,σ(1)} a_{2,σ(2)} … a_{n,σ(n)},  where the sum runs over all permutations σ of {1, …, n}
It can be computed in Mathematica with the function Det.
Det[mat]    determinant of the matrix mat
Minors[mat]    matrix of minors of mat
Minors[mat,k]    k×k minors of mat
Here is a sample 2×2 matrix.
One useful property of the determinant is that Det[A] == 0 if and only if A is singular.
As already pointed out, if the matrix is singular, it does not have full rank.
A minor of a matrix is the determinant of any k×k submatrix. This example uses the function Minors to compute all the 2×2 minors.
The size of the submatrices can be controlled by a second argument. Here the 1×1 minors are computed; these are just the entries of the matrix.
Note that Minors[m] is equivalent to Minors[m,n-1]. In general, Minors[m,k] computes the determinants of all the possible k×k submatrices that can be generated from the matrix m (this is done by picking different k rows and k columns). For the 2×2 minors of a 4×4 matrix you obtain a 6×6 matrix because there are only 6 different ways to pick 2 rows from 4 rows, and 6 ways to pick 2 columns from 4 columns.
One of the properties of the rank of a matrix is that it is equal to the size of the largest nonsingular submatrix. This can be demonstrated with Minors. In this example, the following matrix has a rank of 2.
The determinant of the matrix is zero.
When the 2×2 minors are computed you can see that not all are zero, confirming that the rank of the matrix is 2.
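A compact sketch of Det, Minors, and MatrixRank on an illustrative rank-2 matrix:
In[1]:= m = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
In[2]:= Det[m]
Out[2]= 0
In[3]:= Minors[m, 2]                   (* not all 2×2 minors vanish, so the rank is 2 *)
Out[3]= {{-3, -6, -3}, {-6, -12, -6}, {-3, -6, -3}}
In[4]:= MatrixRank[m]
Out[4]= 2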
Solving Linear Systems
One of the most important uses of matrices is to represent and solve linear systems. This section discusses how to solve linear systems with Mathematica. It makes strong use of LinearSolve, the main function provided for this purpose.
Solving a linear system involves solving a matrix equation A.x == b. Because A is an m×n matrix, this is a set of m linear equations in n unknowns. If m == n, the system is said to be square. If m > n, there are more equations than unknowns and the system is overdetermined. If m < n, there are fewer equations than unknowns and the system is underdetermined.
square (m == n)    a unique solution may or may not exist
overdetermined (m > n)    solutions may or may not exist
underdetermined (m < n)    no solutions or infinitely many solutions exist
nonsingular    number of independent equations equal to the number of variables and a unique solution exists
consistent    at least one solution exists
inconsistent    no solutions exist
Classes of linear systems represented by rectangular matrices.
Note that even though you could solve the matrix equation by computing the inverse via Inverse, this is not a recommended method. You should use a function that is designed to solve linear systems directly, and in Mathematica this is provided by LinearSolve.
LinearSolve[m,b]    a vector x that solves the matrix equation m.x == b
LinearSolve[m]    a LinearSolveFunction that can be applied repeatedly to many right-hand sides b
Solving linear systems with LinearSolve.
This uses LinearSolve to solve a matrix equation.
The solution can be tested to demonstrate that it solves the problem.
You can always generate the linear equations from the input matrices.
Solve can then be used to find the solution.
LinearSolve also works when the right-hand side of the matrix equation is a matrix.
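For concreteness, a minimal LinearSolve sketch with an illustrative matrix and right-hand side:
In[1]:= m = {{1, 2}, {3, 4}};
In[2]:= b = {5, 6};
In[3]:= x = LinearSolve[m, b]
Out[3]= {-4, 9/2}
In[4]:= m . x == b                     (* the solution satisfies the matrix equation *)
Out[4]= True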
If the system is overdetermined, LinearSolve will find a solution if one exists.
If no solution can be found, the system is inconsistent. In this case it might still be useful to find a best-fit solution. This is often done with a least squares technique, which is explored under Least Squares Solutions.
If the system is underdetermined, there is either no solution or infinitely many solutions, and LinearSolve will return one of them.
A system is consistent if and only if the rank of the matrix A equals the rank of the matrix formed by adding b as an extra column to A (known as the augmented matrix). The following demonstrates that the previous system is consistent.
LinearSolve can work with all the different types of matrices that Mathematica supports, including symbolic and exact, machine number, and arbitrary precision. It also works with both dense and sparse matrices. When a sparse matrix is used in LinearSolve, special techniques suitable for sparse matrices are used, and the result is the solution vector.
Method    Automatic    the method to use for solving
Modulus    0    take the equation to be modulo n
ZeroTest    (# == 0 &)    the function to be used by symbolic methods for testing for zero
Options for LinearSolve.
LinearSolve has three options that allow users more control. The Modulus and ZeroTest options are used by symbolic techniques and are discussed in the section on symbolic methods. The Method option allows you to choose methods that are known to be suitable for particular classes of problems; the default setting of Automatic means that LinearSolve chooses a method based on the input.
Singular Matrices
A matrix is singular if its inverse does not exist. One way to test for singularity is to compute the determinant; this will be zero if the matrix is singular. For example, the following matrix is singular.
For many right-hand sides there is no solution.
However, for certain values there will be a solution.
For the first example, the rank of the matrix does not equal that of the augmented matrix. This confirms that the system is not consistent and cannot be solved by LinearSolve.
For the second example, the rank of the matrix does equal that of the augmented matrix. This confirms that the system is consistent and can be solved by LinearSolve.
In those cases where a solution cannot be found, it is still possible to find a solution that makes a best fit to the problem. One important class of best-fit solutions involves least squares solutions and is discussed in Least Squares Solutions.
Homogeneous Equations
The homogeneous matrix equation A.x == 0 involves a zero right-hand side.
This equation has a nonzero solution if the matrix is singular. To test if a matrix is singular, you can compute the determinant.
To find the solution of the homogeneous equation you can use the function NullSpace. This returns a set of orthogonal vectors, each of which solves the homogeneous equation. In the next example, there is only one vector.
This demonstrates that the solution in fact solves the homogeneous equation.
The function Chop can be used to replace approximate numbers that are close to 0 with the exact integer 0.
The solution to the homogeneous equation can be used to form an infinite number of solutions to the inhomogeneous equation. This solves an inhomogeneous equation.
The solution does in fact solve the equation.
If you add to this solution an arbitrary factor times the homogeneous solution, the new vector also solves the matrix equation.
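A small sketch of this idea, with an illustrative singular matrix and right-hand side:
In[1]:= m = {{1, 2}, {2, 4}};                (* a singular matrix *)
In[2]:= xp = LinearSolve[m, {3, 6}];         (* one particular solution of the inhomogeneous equation *)
In[3]:= xh = First[NullSpace[m]];            (* a solution of the homogeneous equation *)
In[4]:= m . (xp + t xh)                      (* adding any multiple of xh still solves the equation *)
Out[4]= {3, 6}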
Estimating and Calculating Accuracy
An important way to quantify the accuracy of the solution of a linear system is to compute the condition number κ(A). This is defined as follows for a suitable choice of norm.
κ(A) = ‖A‖ ‖A⁻¹‖
It can be shown that for the matrix equation A.x == b, the relative error in the solution x is bounded by the condition number times the relative errors in A and b. Thus, the condition number quantifies the accuracy of the solution. If the condition number of a matrix is large, the matrix is said to be ill-conditioned. You cannot expect a good solution from an ill-conditioned system. For some extremely ill-conditioned systems it is not possible to obtain any solution.
In Mathematica an approximation of the condition number can be computed with the function LinearAlgebra`MatrixConditionNumber.
LinearAlgebra`MatrixConditionNumber[mat]
infinity‐norm approximate condition number of a matrix of approximate numbers
LinearAlgebra`MatrixConditionNumber[mat,Norm->p]
p‐norm approximate condition number of a matrix, where p may be 1, 2, or ∞
This computes the condition number of a matrix.
This matrix is singular and the condition number is ∞.
This matrix has a large condition number and is said to be ill-conditioned.
If a matrix is multiplied with itself, the condition number increases.
When you solve a matrix equation involving an ill-conditioned matrix, the result may not be as accurate.
The solution is less accurate for the more ill-conditioned matrix.
The solution to these problems is to avoid constructing matrices that may be ill-conditioned, for example, by avoiding multiplying matrices by themselves. Another way to solve problems more accurately is to use Mathematica's arbitrary-precision arithmetic; this is discussed under the section on arbitrary-precision matrices.
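As a rough illustration, the condition number can also be estimated directly from its definition ‖A‖ ‖A⁻¹‖; this sketch uses the 1-norm and is not the LinearAlgebra`MatrixConditionNumber function itself.
In[1]:= cond1[a_] := Norm[a, 1] Norm[Inverse[a], 1]
In[2]:= m = {{1., 2.}, {3., 4.}};
In[3]:= cond1[m]
Out[3]= 21.
In[4]:= cond1[m . m]                   (* multiplying the matrix with itself increases the condition number *)
Out[4]= 296.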
Symbolic and Exact Matrices
LinearSolve works for all the different types of matrices that can be represented in Mathematica.
Here is an example of working with purely symbolic matrices.
Out[2]//MatrixForm=
The original matrix can be multiplied into the solution.
The result of the multiplication needs some postprocessing with Simplify to reach the expected value. This demonstrates the need for intermediate processing in symbolic linear algebra computations. Without care these intermediate results can become extremely large and make the computation run slowly.
In order to simplify the intermediate expressions, the ZeroTest option might be useful.
LinearSolve can also be used to solve exact problems.
Out[8]//MatrixForm=
There are a number of methods that are specific to symbolic and exact computation: CofactorExpansion, DivisionFreeRowReduction, and OneStepRowReduction. These are discussed in the section on symbolic methods.
Row Reduction
Row reduction involves adding multiples of rows together so as to produce zero elements when possible. The final matrix is in reduced row echelon form as described earlier. If the input matrix is a nonsingular square matrix, the result will be an identity matrix.
This can be used as a way to solve a system of equations.
This can be compared with the following use of LinearSolve.
Saving the Factorization
Many applications of linear systems involve the same matrix m but different right-hand sides b. Because a significant part of the effort involved in solving the system involves processing m, it is common to save the factorization and use it to solve repeated problems. In Mathematica, this is done by using the one-argument form of LinearSolve; this returns a LinearSolveFunction that you can apply to different vectors to obtain each solution.
When you apply the LinearSolveFunction to a particular right-hand side, the solution is obtained.
This solves the matrix equation.
A different right-hand side yields a different solution.
This new solution solves this matrix equation.
The one-argument form of LinearSolve works in a completely equivalent way to the two-argument form. It works with the same range of input matrices, for example, returning the expected results for symbolic, exact, or sparse matrix input. It also accepts the same options.
One issue with the one-argument form is that for certain input matrices the factorization cannot be saved. For example, if the system is overdetermined, the exact solution is not certain to exist. In this case a message is generated that warns you that the factorization will be repeated each time the functional is applied to a particular right-hand side.
For this right-hand side there is a solution, and this is returned.
However, for this right-hand side there is no solution, and an error is encountered.
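A minimal sketch of the one-argument form (the matrix and right-hand sides are illustrative):
In[1]:= m = {{1, 2}, {3, 4}};
In[2]:= f = LinearSolve[m];            (* the factorization is computed once and saved *)
In[3]:= f[{5, 6}]
Out[3]= {-4, 9/2}
In[4]:= f[{1, 0}]                      (* the saved factorization solves a new right-hand side *)
Out[4]= {-2, 3/2}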
LinearSolve provides a number of different techniques to solve matrix equations that are specialized to particular problems. You can select between these using the Method option. In this way a uniform interface is provided to all the functionality that Mathematica provides for solving matrix equations.
The default setting of the Method option is Automatic. As is typical for Mathematica, this means that the system will make an automatic choice of method to use. For LinearSolve, if the input matrix is numerical and dense, a method based on LAPACK routines is used; if it is numerical and sparse, a multifrontal direct solver is used. If the matrix is symbolic, then symbolic methods are used.
The details of the different methods are now described.
LAPACK is the default method for solving dense numerical matrices. When the matrix is square and nonsingular the routines dgesv, dlange, and dgecon are used for real matrices and zgesv, zlange, and zgecon for complex matrices. When the matrix is nonsquare or singular, dgelss is used for real matrices and zgelss for complex matrices. More information on LAPACK is available in the references.
If the input matrix uses arbitrary-precision numbers, then LAPACK algorithms extended for arbitrary-precision computation are used.
Multifrontal
The multifrontal method is a direct solver used by default if the input matrix is sparse; it can also be selected by specifying a method option.
This reads in a sample sparse matrix stored in the Matrix Market format. More information on importing matrices is given in the Import documentation.
This solves the matrix equation using the sample matrix.
If the input matrix to the multifrontal method is dense, it is converted to a sparse matrix.
The implementation of the multifrontal method uses the UMFPACK library.
The Krylov method is an iterative solver that is suitable for large sparse linear systems, such as those arising from numerical solving of PDEs. In Mathematica two Krylov methods are implemented: conjugate gradient (for symmetric positive definite matrices) and BiCGSTAB (for nonsymmetric systems). It is possible to set the method that is used and a number of other parameters by using appropriate suboptions.
MaxIterations    the maximum number of iterations
Method    the method to use for solving; the default is BiCGSTAB
Preconditioner    the preconditioner
ResidualNormFunction    the norm to minimize
Tolerance    the tolerance to use to terminate the iteration
Suboptions of the Krylov method.
The default method for Krylov, BiCGSTAB, is more expensive but more generally applicable. The conjugate gradient method is suitable for symmetric positive definite systems, always converging to a solution (though the convergence may be slow). If the matrix is not symmetric positive definite, the conjugate gradient may not converge to a solution.
In this example, the matrix is not symmetric positive definite and the conjugate gradient method does not converge.
The default method, BiCGSTAB, converges and returns the solution.
This tests the solution that was found.
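For concreteness, here is a hedged sketch of selecting the Krylov method explicitly; the tridiagonal test matrix and the tolerance check are illustrative.
In[1]:= n = 1000;
In[2]:= m = SparseArray[{{i_, i_} -> 4., {i_, j_} /; Abs[i - j] == 1 -> -1.}, {n, n}];
In[3]:= b = Table[1., {n}];
In[4]:= x = LinearSolve[m, b, Method -> "Krylov"];
In[5]:= Norm[m . x - b] < 10.^-6       (* the iteration converges for this diagonally dominant system *)
Out[5]= True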
Here is an example that shows the benefit of the Krylov method and the use of a preconditioner. First, a function that builds a structured banded sparse matrix is defined.
This demonstrates the structure of the matrices generated by this function.
This builds a larger system.
This will use the default sparse solver, a multifrontal direct solver.
The Krylov method is faster.
The use of the ILU preconditioner is even faster. Currently, the preconditioner can only work if the matrix has nonzero diagonal elements.
At present, only the ILU preconditioner is built-in. You can still define your own preconditioner by defining a function that is applied to the input matrix. An example that involves solving a diagonal matrix is shown next.
First the problem is set up and solved without a preconditioner.
Now, a preconditioner function is supplied; there is a significant improvement to the speed of the computation.
Generally, a problem will not be structured so that it can benefit so much from such a simple preconditioner. However, this example is useful since it shows how to create and use your own preconditioner.
If the input matrix to the Krylov method is dense, the result is still found because the method is based on matrix/vector multiplication.
The Krylov method can be used to solve systems that are too large for a direct solver. However, it is not a general solver, being particularly suitable for those problems that have some form of diagonal dominance.
The Cholesky method is suitable for solving symmetric positive definite systems.
The Cholesky method is faster for this matrix.
The Cholesky method is also more stable. This can be demonstrated with the following matrix.
For this matrix the default LU decomposition method reports a potential error.
The Cholesky method succeeds.
For dense matrices the Cholesky method uses LAPACK functions such as dpotrf and dpotrs for real matrices and zpotrf and zpotrs for complex matrices. For sparse matrices the Cholesky method uses the TAUCS library.
Symbolic Methods
There are a number of methods that are specific to symbolic and exact computation: CofactorExpansion, DivisionFreeRowReduction, and OneStepRowReduction. These can be demonstrated for the following symbolic matrix.
The computation is repeated for all three methods.
Least Squares Solutions
This section follows from the previous section. It is concerned with finding solutions to the matrix equation A.x == b, where A is an m×n matrix and m > n (i.e., when there are more equations than unknowns). In this case, there may well be no solution, and LinearSolve will issue an error message.
In these cases it is possible to find a best-fit solution that minimizes ‖A.x − b‖_p. A particularly common choice of p is 2; this generates a least squares solution. These are often used because the function ‖A.x − b‖_2 is differentiable in x and because the 2-norm is preserved under orthogonal transformations.
If the rank of A is n (i.e., it has full column rank), it can be shown that there is a unique solution to the least squares problem and that it solves the linear system
Aᵀ.A.x == Aᵀ.b
These are called the normal equations. Although they can be solved directly to get a least squares solution, this is not recommended, because if the matrix A is ill-conditioned then the product Aᵀ.A will be even more ill-conditioned. Ways of finding least squares solutions for full-rank matrices are explored in the section on the QR decomposition.
It should be noted that best-fit solutions that minimize other norms, such as the 1- and ∞-norms, are also possible. A number of ways to find solutions to overdetermined systems that are suitable for special types of matrices or that minimize other norms are given elsewhere in the documentation.
A general way to find a least squares solution to an overdetermined system is to use a singular value decomposition to form a matrix known as the pseudoinverse of a matrix. In Mathematica this is computed with PseudoInverse. This technique works even if the input matrix is rank deficient. The basis of the technique follows.
Here is an overdetermined system; you can see that the matrix is rank deficient.
The least squares solution can still be found using PseudoInverse.
This shows how close the solution actually is.
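A separate illustrative sketch of a least squares solution via PseudoInverse (the data are invented for exposition):
In[1]:= a = {{1., 1.}, {1., 2.}, {1., 3.}, {1., 4.}};     (* an overdetermined 4×2 system *)
In[2]:= b = {2., 3., 5., 6.};
In[3]:= x = PseudoInverse[a] . b                          (* the least squares solution *)
Out[3]= {0.5, 1.4}
In[4]:= Norm[a . x - b]                                   (* the residual that is minimized *)
Out[4]= 0.447214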
Data Fitting
Scientific measurements often result in collections of ordered pairs of data, (t_i, y_i). In order to make predictions at times other than those that were measured, it is common to try to fit a curve through the data points. If the curve is a straight line y = a + b t, then the aim is to find unknown coefficients a and b such that a + b t_i ≈ y_i for all the data pairs. This can be written in matrix form and is immediately recognized as being equivalent to solving an overdetermined system of linear equations.
Fitting a more general curve, for example a polynomial y = c_0 + c_1 t + … + c_n t^n, is equivalent to a matrix equation whose left-hand matrix is a Vandermonde matrix. In fact any function that is linear in the unknown coefficients can be used.
An example of how to find best-fit parameters will now be given. This will start with the measurements shown next.
These can be plotted graphically.
You could try to solve this with the PseudoInverse technique introduced in the previous section. First, split the data into its t and y components.
Now form the left-hand side matrix.
Out[5]//MatrixForm=
The best-fit parameter can be found as follows.
Mathematica provides the function Fit that can find best-fit solutions to data. The solution agrees with that found by the PseudoInverse.
Finally, a plot is made that shows the data points and the fitted curve.
The function FindFit is more general; it can also fit functions to data that are not linear in the parameters.
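A minimal sketch of Fit on invented data; it reproduces the straight-line least squares solution from the pseudoinverse example earlier.
In[1]:= data = {{1, 2}, {2, 3}, {3, 5}, {4, 6}};
In[2]:= Fit[data, {1, t}, t]
Out[2]= 0.5 + 1.4 t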
Eigensystem Computations
The solution of the eigenvalue problem is one of the major areas for matrix computations. It has many applications in physics, chemistry, and engineering. For an n×n matrix A, the eigenvalues are the n roots of its characteristic polynomial, det(A − λ I). The set of roots {λ_i} is called the spectrum of the matrix. For each eigenvalue λ_i, the nonzero vectors x_i that satisfy
A.x_i == λ_i x_i
are described as eigenvectors. The matrix of the eigenvectors, if it exists and is nonsingular, may be used as a similarity transformation to form a diagonal matrix with the eigenvalues on the diagonal elements. Many important applications of eigenvalues involve the diagonalization of matrices in this way.
Mathematica has various functions for computing eigenvalues and eigenvectors.
Eigenvalues[m]    a list of the eigenvalues of m
Eigenvectors[m]    a list of the eigenvectors of m
Eigensystem[m]    a list of the form {eigenvalues, eigenvectors}
CharacteristicPolynomial[m,x]    the characteristic polynomial of m
Eigenvalues and eigenvectors.
Here is a sample 2×2 matrix.
This computes the eigenvalues.
This computes the eigenvectors.
You can compute both the eigenvalues and eigenvectors with Eigensystem.
You can confirm that the first eigenvalue and its eigenvector satisfy the definition.
You should note that Eigenvectors and Eigensystem return a list of eigenvectors. This means that if you want a matrix whose columns are the eigenvectors, you must take a transpose. Here, the eigenvalues and their corresponding eigenvectors are shown to meet the definition of eigenvectors.
You can also obtain the characteristic polynomial of the test matrix.
This finds the roots of the characteristic polynomial; they are seen to match the eigenvalues of the matrix.
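A compact sketch of these functions on an illustrative symmetric matrix:
In[1]:= m = {{2, 1}, {1, 2}};
In[2]:= Eigenvalues[m]
Out[2]= {3, 1}
In[3]:= Eigenvectors[m]
Out[3]= {{1, 1}, {-1, 1}}
In[4]:= CharacteristicPolynomial[m, x]
Out[4]= 3 - 4 x + x^2
In[5]:= m . {1, 1} == 3 {1, 1}         (* the first eigenpair satisfies the definition *)
Out[5]= True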
Eigenvalue computations work for sparse matrices as well as for dense matrices. In the following example, one of the eigenvalues is zero.
Out[10]//MatrixForm=
An eigenvalue being zero means that the null space of the matrix is not empty.
Certain applications of eigenvalues do not require all of the eigenvalues to be computed. Mathematica provides a mechanism for obtaining only some eigenvalues.
Eigenvalues[m,k]    the largest k eigenvalues of m
Eigenvectors[m,k]    the corresponding eigenvectors of m
Eigenvalues[m,-k]    the smallest k eigenvalues of m
Eigenvectors[m,-k]    the corresponding eigenvectors of m
This is particularly useful if the matrix is very large. Here is a large sparse matrix.
This is the largest eigenvalue.
Cubics    whether to return radicals for 3×3 matrices
Method    the method to use for solving
Quartics    whether to return radicals for 4×4 matrices
ZeroTest    (# == 0 &)    the function to be used by symbolic methods for testing for zero
Options for Eigenvalues.
Eigenvalues has four options that allow users more control. The Cubics, Quartics, and ZeroTest options are used by symbolic techniques and are discussed under the symbolic methods section. The Method option allows knowledgeable users to choose methods that are particularly appropriate for their problems; the default setting of Automatic means that Eigenvalues makes a choice of a method that is generally suitable.
Eigensystem Properties
This section describes certain properties of eigensystem computations. It should be remembered that because the eigenvalues of an n×n matrix can be associated with the roots of an nth-degree polynomial, there will always be n eigenvalues (counted with multiplicity).
Even if a matrix only contains real numbers, its eigenvalues may not be real.
The eigenvalues of a matrix may be repeated; eigenvalues that are not repeated are called simple. In this example, the repeated eigenvalue has as many eigenvectors as the number of times it is repeated; this is a semisimple eigenvalue.
The repeated eigenvalue may have fewer eigenvectors than the number of times it is repeated. In this case it is referred to as a defective eigenvalue and the matrix is referred to as a defective matrix.
A matrix may have a zero eigenvalue, but can still have a complete set of eigenvectors.
A matrix may have a zero eigenvalue, but not have a complete set of eigenvectors.
If a matrix is real and symmetric, all of the eigenvalues are real, and there is an orthonormal basis of eigenvectors.
The eigenvectors are an orthonormal basis.
A symmetric real matrix that has positive eigenvalues is known as positive definite.
A symmetric real matrix that has nonnegative eigenvalues is known as positive semidefinite.
If a matrix is complex and Hermitian (i.e., equal to its conjugate transpose), all of the eigenvalues are real and there is an orthonormal basis of eigenvectors.
The eigenvectors are an orthonormal basis.
A Hermitian matrix that has positive eigenvalues is known as positive definite.
Diagonalizing a Matrix
One of the uses of eigensystem computations is to provide a similarity transformation that diagonalizes a matrix.
Out[3]//MatrixForm=
Similarity transformations preserve a number of useful properties of a matrix such as determinant, rank, and trace.
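A minimal diagonalization sketch with an illustrative symmetric matrix:
In[1]:= m = {{2, 1}, {1, 2}};
In[2]:= {vals, vecs} = Eigensystem[m];
In[3]:= s = Transpose[vecs];                   (* columns of s are the eigenvectors *)
In[4]:= Inverse[s] . m . s                     (* the similarity transformation diagonalizes m *)
Out[4]= {{3, 0}, {0, 1}}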
If the matrix of eigenvectors is singular then the matrix cannot be diagonalized. This happens if the matrix is defective (i.e., there is an eigenvalue that does not have a complete set of eigenvectors) as in this example.
Out[7]//MatrixForm=
The matrix of column eigenvectors is singular and so a similarity transformation that diagonalizes the original matrix does not exist.
Symbolic and Exact Matrices
As is typical for Mathematica, if a computation is done for a symbolic or exact matrix it will use symbolic computer algebra techniques and return a symbolic or exact result.
This is a sample 3×3 matrix of integers.
This computes the eigenvalues; the result is expressed in terms of Root objects.
A machine-precision approximation to the result can be computed by using N.
The result can be returned in terms of radicals by setting the Cubics option to True.
Generalized Eigenvalues
For a pair of n×n matrices A and B, the generalized eigenvalues are the roots of the characteristic polynomial det(A − λ B). For each generalized eigenvalue λ_i, the nonzero vectors x_i that satisfy A.x_i == λ_i B.x_i are described as generalized eigenvectors.
Eigenvalues[{a,b}]    the generalized eigenvalues of a and b
Eigenvectors[{a,b}]    the generalized eigenvectors of a and b
Eigensystem[{a,b}]    the generalized eigensystem of a and b
Generalized eigenvalues and eigenvectors.
Here are two 2×2 matrices.
This computes the generalized eigenvalues.
If the two matrices share a common nonzero null space, there will be an indeterminate eigenvalue.
These two matrices share a common one-dimensional null space.
One of the generalized eigenvalues is Indeterminate.
Eigenvalue computations are solved with a number of different techniques that are specialized to particular problems. You can select between these using the Method option. In this way a uniform interface is provided to all the functionality that Mathematica provides.
The default setting of the Method option is Automatic. As is typical for Mathematica, this means that the system will make an automatic choice of method to use. For eigenvalue computation, when the input is a matrix of machine numbers and the number of eigenvalues requested is less than 20% of the matrix dimension, the Arnoldi method is used. Otherwise, if the input matrix is numerical, a method based on LAPACK routines is used. If the matrix is symbolic, then symbolic methods are used.
The details of the different methods are now described.
LAPACK is the default method for computing the entire set of eigenvalues and eigenvectors. When the matrix is unsymmetric, dgeev is used for real matrices and zgeev is used for complex matrices. For symmetric matrices, dsyevr is used for real matrices and zheevr is used for complex matrices. For generalized eigenvalues the routine dggev is used for real matrices and zggev for complex matrices. More information on LAPACK is available in the references.
It should be noted that the computation of eigenvalues for a symmetric matrix is faster because Mathematica detects this case and switches to the appropriate method.
If the input matrix uses arbitrary-precision numbers, LAPACK algorithms extended for arbitrary-precision computation are used.
The Arnoldi method is an iterative method used to compute a finite number of eigenvalues. The implementation of the Arnoldi method uses the ARPACK library. The most general problem that can be solved with this technique is to compute a few selected eigenvalues of a large matrix, including the generalized eigenvalue problem.
Because it is an iterative technique and only finds a few eigenvalues, it is often suitable for sparse matrices. One of the features of the technique is the ability to apply a shift σ and solve the shifted problem for a given eigenvector and eigenvalue. This is effective for finding eigenvalues near to σ.
The following suboptions can be given to the Arnoldi method.
BasisSize    the size of the Arnoldi basis
Criteria    which criterion to use for selecting eigenvalues; the default is Magnitude
MaxIterations    the maximum number of iterations
Shift    an Arnoldi shift
StartingVector    the initial vector to use
Tolerance    the tolerance to use to terminate the iteration
Suboptions of the Arnoldi method.
Here is a 3×3 sparse matrix.
This computes the largest eigenvalue, specifying that the Arnoldi method is used.
This computes the eigenvalue nearest to the given shift value.
If a shift is given that matches an eigenvalue an error may result.
This tries to find eigenvalues of a large tridiagonal matrix with 2 on the diagonal and constant nonzero entries off the diagonal. The algorithm does not converge.
When the technique does not converge it returns a result, but this may not be very accurate. One way to accelerate the convergence would be to use the results generated as a shift.
An alternative way to get convergence is to increase the number of iterations.
Alternatively, the convergence may be achieved by increasing the Arnoldi basis size in addition to the number of iterations. In this example, the increased basis size makes the algorithm converge in a smaller number of iterations. Even though each iteration is more expensive, the overall computation is faster.
The Arnoldi algorithm with a basis size of 30 is still often faster and less memory intensive than the dense eigenvalue computation. (It should be noted that specific timings on individual machines can vary.)
For many large sparse systems that occur in practical computations the Arnoldi algorithm is able to converge quite quickly.
Symbolic Methods
Symbolic eigenvalue computations work by interpolating the characteristic polynomial.
Matrix Decompositions
This section will discuss a number of standard techniques for working with matrices. These are often used as building blocks for solving matrix problems. The decompositions fall into the categories of factorizations, orthogonal transformations, and similarity transformations.
LU Decomposition
The LU decomposition of a matrix is frequently used as part of a Gaussian elimination process for solving a matrix equation. In Mathematica there is no need to use an LU decomposition to solve a matrix equation, because the function LinearSolve does this for you, as discussed in the section on solving linear systems. Note that if you want to save the factorization of a matrix, so that it can be used to solve the same left-hand side repeatedly, you can use the one-argument form of LinearSolve, as demonstrated in the section on saving the factorization.
The LU decomposition with partial pivoting of a square nonsingular matrix A involves the computation of the unique matrices L, U, and P such that P.A == L.U, where L is lower triangular with ones on the diagonal, U is upper triangular, and P is a permutation matrix. Once it is found, it can be used to solve the matrix equation A.x == b by finding the solution to the equivalent system
L.U.x == P.b
This can be done by first solving for y in
L.y == P.b
Then solve for x in
U.x == y
to get the desired solution.
Because both L and U are triangular matrices, it is particularly easy to solve these systems.
Despite the fact that in Mathematica you do not need to use the LU decomposition for solving matrix equations, Mathematica provides the function LUDecomposition for computing the LU decomposition.
LUDecomposition[m]    the LU decomposition with partial pivoting
LUDecomposition returns a list of three elements. The first element is a combination of the upper and lower triangular matrices, the second element is a vector specifying the rows used for pivoting (a permutation vector equivalent to the permutation matrix), and the third element is an estimate of the condition number.
In this example, the three results of the LUDecomposition computation are shown.
One of the features of the LU decomposition is that the lower and upper matrices are stored in the same matrix. The individual L and U factors can be obtained as follows. Note how the extraction uses a vectorized element-by-element multiplication with a sparse matrix. This makes the process efficient; these techniques are discussed further in the section on performance.
This generates the L factor.
Out[7]//MatrixForm=
This generates the U factor.
Out[11]//MatrixForm=
The original matrix was nonsingular and so this is a factorization. Therefore multiplying L and U recreates the original matrix permuted with the row pivot vector.
The following subtracts the product of the L and U matrices from the permuted original matrix. The difference is very close to zero.
The diagonal elements in the U matrix are known as the pivots. If any zero pivots emerge during the computation of the LU factorization, this means that the matrix is singular. In such a case LUDecomposition produces a warning.
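The following sketch assembles the factors from LUDecomposition and checks the factorization; the matrix is illustrative and the masks follow the element-by-element approach described above.
In[1]:= m = {{1, 2, 3}, {4, 5, 6}, {7, 8, 10}};
In[2]:= {lu, perm, cond} = LUDecomposition[m];
In[3]:= l = lu SparseArray[{i_, j_} /; j < i -> 1, {3, 3}] + IdentityMatrix[3];   (* strictly lower part plus unit diagonal *)
In[4]:= u = lu SparseArray[{i_, j_} /; j >= i -> 1, {3, 3}];                      (* upper triangular part *)
In[5]:= Normal[l . u] == m[[perm]]     (* L.U reproduces the row-permuted original matrix *)
Out[5]= True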
Cholesky Decomposition
The Cholesky decomposition of a symmetric positive definite matrix A is a factorization into a unique upper triangular matrix U such that A == Uᵀ.U.
This factorization has a number of uses, one of which is that, because it is a triangular factorization, it can be used to solve systems of equations involving symmetric positive definite matrices. (The way that a triangular factorization can be used to solve a matrix equation is shown in the section on the LU decomposition.) If a matrix is known to be of this form, the Cholesky factorization is preferred over the LU factorization because it is faster to compute. If you want to solve a matrix equation using the Cholesky factorization, you can do this directly from LinearSolve using the Cholesky method; this is described in a previous section.
The Cholesky factorization can be computed in Mathematica with the function CholeskyDecomposition.
CholeskyDecomposition[m]    the Cholesky decomposition of m
Out[3]//MatrixForm=
This reconstructs the original matrix.
If the matrix is not positive definite then the Cholesky decomposition does not exist.
There are a number of equivalent definitions of positive definite, such as the eigenvalues being all positive.
One way to test if a matrix is positive definite is to see if the Cholesky decomposition exists.
Note that if the matrix is complex, the factorization is defined by the conjugate transpose. The following computes the Cholesky decomposition of a complex matrix.
This demonstrates that the factorization is correct.
Out[11]//MatrixForm=
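A minimal CholeskyDecomposition sketch with an illustrative symmetric positive definite matrix:
In[1]:= m = {{4, 2}, {2, 3}};
In[2]:= u = CholeskyDecomposition[m]
Out[2]= {{2, 1}, {0, Sqrt[2]}}
In[3]:= Transpose[u] . u == m          (* the upper triangular factor reproduces the matrix *)
Out[3]= True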
Cholesky and LU Factorizations
The Cholesky factorization is related to the LU factorization. If a symmetric positive definite matrix has the factorization A == L.U, and D is the diagonal matrix of pivots (the diagonal of U), then the Cholesky factor of A is Transpose[L.Sqrt[D]].
This can be demonstrated as follows.
Out[4]//MatrixForm=
Now you can compute the L factor.
Out[9]//MatrixForm=
This step computes the matrix of pivots.
Out[12]//MatrixForm=
Finally, this computes L.Sqrt[D]; its transpose is equal to the Cholesky decomposition.
Out[13]//MatrixForm=
Orthogonalization
Orthogonalization generates an orthonormal basis from an arbitrary basis. An orthonormal basis is a basis, {q_1, q_2, …}, for which q_i.q_j == 0 when i ≠ j and q_i.q_i == 1.
In Mathematica a set of vectors can be orthogonalized with the function Orthogonalize.
Orthogonalize[{v1,v2,…}]    generate an orthonormal set from the given list of real vectors
This creates a set of three vectors that form a basis for R³.
A plot visualizes the vectors; they all tend to lie in the same direction.
This computes an orthonormal basis.
The orthonormal vectors are obviously much more spread out.
The resulting vectors are normalized; thus, the dot product of each vector with itself is 1.
In addition, the dot product of one vector with another vector is 0.
This uses Outer to compare all vectors with all other vectors.
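A compact sketch of Orthogonalize and the orthonormality check (the basis vectors are illustrative):
In[1]:= basis = {{1, 1, 0}, {1, 0, 1}, {0, 1, 1}};
In[2]:= q = Orthogonalize[basis];
In[3]:= Simplify[Outer[Dot, q, q, 1]]  (* all pairwise dot products give the identity matrix *)
Out[3]= {{1, 0, 0}, {0, 1, 0}, {0, 0, 1}}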
By default a Gram–Schmidt orthogonalization is computed, but a number of other orthogonalizations can be computed.
GramSchmidt
ModifiedGramSchmidt
Reorthogonalization
Householder
Orthogonalization methods.
This uses a different orthogonalization method.
QR Decomposition
The QR decomposition of a rectangular matrix A with linearly independent columns involves the computation of matrices Q and R such that A == Q.R, where the columns of Q form an orthonormal basis for the columns of A and R is upper triangular. It can be computed in Mathematica with the function QRDecomposition.
QRDecomposition[m]    the QR decomposition of the matrix m
This computes the QR decomposition of a sample matrix. The result is a list of the two matrices.
In fact the first element that is returned is a list of the orthonormal columns. To generate the Q matrix it is necessary to compute the transpose.
This is a matrix factorization, so the product of Q and R is equal to the original matrix.
In addition, the columns of Q are orthonormal. Because QRDecomposition returns the list of columns, they can be extracted with a single part index. The following two examples demonstrate the orthonormal properties of the columns.
The QR decomposition is also defined for rectangular matrices. For an m×n input matrix with m > n, that is, a matrix representing an overdetermined system of equations, QRDecomposition returns an n×m matrix q (the transpose of the m×n matrix Q, whose columns are orthonormal) and an n×n upper triangular matrix R. (Note that to generate the matrix Q, it is necessary to transpose the result from QRDecomposition.) This is demonstrated in the following example.
This is often referred to as the thin QR decomposition. If you wish to compute the full QR decomposition, it is possible to pad out R with extra rows of zeros and Q with extra orthonormal columns. This can be done with the following function.
The dimensions of the Q and R matrices computed by this function are m×m and m×n, respectively.
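A short QRDecomposition sketch on an illustrative 3×2 matrix, showing the thin factorization and the reconstruction:
In[1]:= m = {{1., 2.}, {3., 4.}, {5., 6.}};
In[2]:= {q, r} = QRDecomposition[m];
In[3]:= Dimensions /@ {q, r}           (* q stores 2 orthonormal rows; the Q matrix is Transpose[q] *)
Out[3]= {{2, 3}, {2, 2}}
In[4]:= Transpose[q] . r - m // Chop   (* the factorization reproduces the matrix *)
Out[4]= {{0, 0}, {0, 0}, {0, 0}}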
Solving Systems of Equations
For a square nonsingular matrix A, the QR decomposition can be used to solve the matrix equation A.x == b, as is also the case for the LU decomposition. However, when the matrix is rectangular, the QR decomposition is also useful for solving the matrix equation.
One particular application of the QR decomposition is to find least squares solutions to overdetermined systems, by solving the system of normal equations
Aᵀ.A.x == Aᵀ.b
Because A == Q.R, this can be written as
Rᵀ.Qᵀ.Q.R.x == Rᵀ.Qᵀ.b
Because the columns of Q are orthonormal, Qᵀ.Q is the identity. Thus the normal equations simplify to
Rᵀ.R.x == Rᵀ.Qᵀ.b
and because Rᵀ is nonsingular this simplifies to
R.x == Qᵀ.b
Because R is triangular, this is a particularly easy system to solve; a sketch of this technique is given below.
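The sketch below (with invented data, the same as in the least squares examples above) solves R.x == Qᵀ.b; it is an illustration of the technique rather than the tutorial's original sample code.
In[1]:= a = {{1., 1.}, {1., 2.}, {1., 3.}, {1., 4.}};
In[2]:= b = {2., 3., 5., 6.};
In[3]:= {q, r} = QRDecomposition[a];
In[4]:= x = LinearSolve[r, q . b]      (* q . b equals Transpose[Q].b because q stores the rows of Q-transpose *)
Out[4]= {0.5, 1.4}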
Singular Value Decomposition
The singular value decomposition of a rectangular matrix A involves the computation of orthogonal matrices U and V and a diagonal matrix Σ such that A == U.Σ.Vᵀ.
The diagonal elements of the matrix Σ are called the singular values of A. In Mathematica, the function SingularValueList computes the singular values and the function SingularValueDecomposition computes the full singular value decomposition.
SingularValueList[m]    a list of the nonzero singular values of m
SingularValueDecomposition[m]    the singular value decomposition of m
This is a factorization so the original matrix can be restored.
Out[5]//MatrixForm=
Because the matrices U and V are orthogonal, they can be used as an orthogonal transformation of the original matrix to generate the diagonal matrix with the singular values on the diagonal.
Out[6]//MatrixForm=
If the matrix is singular then some of the singular values will be zero.
Out[9]//MatrixForm=
SingularValueList only returns the nonzero singular values.
Note that if the matrix is complex, the definition of the singular value decomposition uses the conjugate transpose.
There are many applications of the singular value decomposition. The singular values of a matrix give information on the linear transformation represented by the matrix. For example, the action of the matrix on a unit sphere generates an ellipsoid with semi-axes given by the singular values. The singular values can be used to compute the rank: the number of nonzero singular values is equal to the rank. The singular values can also be used to compute the 2-norm of a matrix, and the columns of the V matrix that correspond to zero singular values span the null space of the matrix.
Certain applications of singular values do not require all of the singular values to be computed. Mathematica provides a mechanism for obtaining only some singular values.
SingularValueList[m,k]    the largest k singular values of m
SingularValueDecomposition[m,k]    the corresponding singular value decomposition of m
SingularValueList[m,-k]    the smallest k singular values of m
SingularValueDecomposition[m,-k]    the corresponding singular value decomposition of m
This returns the smallest singular value; because it is zero, this demonstrates that the matrix is not full rank.
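A minimal singular value sketch on an illustrative rank-deficient matrix:
In[1]:= m = {{1., 2.}, {2., 4.}};
In[2]:= {u, s, v} = SingularValueDecomposition[m];
In[3]:= Chop[u . s . Transpose[v] - m]            (* the factorization reproduces the matrix *)
Out[3]= {{0, 0}, {0, 0}}
In[4]:= SingularValueList[m]                      (* only the nonzero singular value is returned *)
Out[4]= {5.}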
Generalized Singular Values
For a pair of matrices A and B, the generalized singular values are given by a pair of factorizations A == U.Σ_A.W and B == V.Σ_B.W, in which U and V are orthogonal, Σ_A and Σ_B are diagonal, and W is invertible. The generalized singular values are the ratios of the corresponding diagonal elements of Σ_A and Σ_B.
SingularValueList[{a,b}]    the generalized singular values of a and b
SingularValueDecomposition[{a,b}]    the generalized singular value decomposition of a and b
These are some sample matrices.
This returns the generalized singular values of the matrices.
This computes the full generalized singular value decomposition of the matrices.
This demonstrates the identity for the first matrix.
Out[5]//MatrixForm=
This demonstrates the identity for the second matrix.
Out[6]//MatrixForm=
Finally, the singular values are computed by dividing the corresponding diagonal elements.
The functions SingularValueList and SingularValueDecomposition both take a Tolerance option.
Tolerance    the tolerance for treating small singular values as zero
The Tolerance option controls the size at which singular values are treated as zero. By default, values that are tol times smaller than the largest singular value are dropped, where tol is based on the machine epsilon for machine-number matrices and on the precision of the matrix for arbitrary-precision matrices.
The smallest singular value of this matrix is just larger than the default setting for the tolerance. It is not treated as being equivalent to a zero singular value.
Increasing the setting for the tolerance causes the small singular value to be treated as zero.
Schur Decomposition
The Schur decomposition of a square matrix A involves finding a unitary matrix Q that can be used for a similarity transformation of A to form a block upper triangular matrix T with 1×1 and 2×2 blocks on the diagonal (the 2×2 blocks correspond to complex conjugate pairs of eigenvalues for a real matrix A). A block upper triangular matrix of this form can be called upper quasi-triangular.
The Schur decomposition always exists, and so the similarity transformation of A to upper quasi-triangular form always exists. This contrasts with the eigenvector similarity transformation, used to diagonalize a matrix, which does not always exist.
SchurDecomposition[m]    the Schur decomposition of the matrix m
This computes the Schur decomposition of a sample matrix. The result is a list of the two matrices.
The matrix Q can be used for a similarity transformation on the matrix to generate the upper triangular result T.
Out[4]//MatrixForm=
For this particular matrix, a similarity transformation that generates an even simpler form can be found, because the matrix can be diagonalized.
Out[7]//MatrixForm=
This matrix cannot be diagonalized because the matrix of eigenvectors is singular.
However, the Schur decomposition can be found and the matrix transformed to an upper triangular form.
In this example, the matrix has complex eigenvalues.
Now the resulting matrix T has a 2×2 block on the diagonal.
The matrix Q can be used for a similarity transformation on the matrix to generate the upper quasi-triangular result. Note that an upper triangular result (with 1×1 blocks on the diagonal) that may involve complex numbers can be obtained by using the option RealBlockForm -> False. When the result is upper triangular (i.e., has 1×1 blocks on the diagonal), the eigenvalues of the matrix are always found on the diagonal.
Out[17]//MatrixForm=
Note that if the matrix is complex the definition of the Schur decomposition uses the conjugate transpose and returns an upper triangular result. This is demonstrated for the following complex matrix.
This demonstrates that the result satisfies the definition of the Schur decomposition.
Out[21]//MatrixForm=
The diagonal of the upper triangular matrix T contains the eigenvalues of the matrix.
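A minimal SchurDecomposition sketch with an illustrative real matrix that has complex eigenvalues:
In[1]:= m = {{0., 1.}, {-1., 0.}};
In[2]:= {q, t} = SchurDecomposition[m];
In[3]:= Chop[q . t . ConjugateTranspose[q] - m]   (* the unitary similarity reproduces m *)
Out[3]= {{0, 0}, {0, 0}}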
Generalized Schur Decomposition
For a pair of square matrices A and B, the generalized Schur decomposition finds unitary matrices Q and P such that ConjugateTranspose[Q].A.P is upper quasi-triangular and ConjugateTranspose[Q].B.P is upper triangular.
SchurDecomposition[{a,b}]    the generalized Schur decomposition of a and b
These are some sample matrices.
This returns the generalized Schur decomposition.
This demonstrates the results are consistent with the definition of the decomposition.
Out[4]//MatrixForm=
Out[5]//MatrixForm=
For real input, a result involving complex numbers and an upper triangular matrix can be obtained with the option RealBlockForm -> False.
SchurDecomposition takes two options.
Pivoting    whether to carry out pivoting and scaling
RealBlockForm    whether to return 2×2 real blocks for complex eigenvalues
Options for SchurDecomposition.
The Pivoting option can be used to allow pivoting and scaling to improve the quality of the result. When it is set to True, pivoting and scaling may be used, and an extra matrix that represents the changes to the original matrix is returned. With this extra matrix the definition of the Schur decomposition is adjusted accordingly.
The use of pivoting and scaling is now demonstrated for the following matrix. When Pivoting is set to True, pivoting and scaling are used if necessary, and an extra matrix (which here only represents scaling) is returned.
This demonstrates the transformation to an upper triangular form.
Out[4]//MatrixForm=
The diagonal elements of T are the eigenvalues of the matrix.
When the Schur decomposition is computed without pivoting and scaling, the diagonal elements of T are not as close to the eigenvalues of the matrix. This demonstrates the utility of the Pivoting option.
Out[7]//MatrixForm=
The RealBlockForm option controls the generation of the upper quasi-triangular result. If this is set to True, a result that may have 2×2 blocks on the diagonal is generated. If it is set to False, the result is upper triangular with 1×1 blocks on the diagonal (but may involve complex numbers). This is demonstrated for the following real matrix, which has complex eigenvalues and a Schur decomposition with 2×2 blocks on the diagonal.
Setting RealBlockForm -> False generates a matrix whose entries may be complex.
This generates the upper triangular result T.
Out[12]//MatrixForm=
Jordan Decomposition
The Jordan decomposition of a square matrix A involves finding a nonsingular matrix S that can be used for a similarity transformation of A to generate a matrix J == Inverse[S].A.S (known as the Jordan form) that has a particularly simple triangular structure.
The Jordan decomposition always exists, but it is hard to compute with floating-point arithmetic. However, computation with exact arithmetic avoids these problems.
JordanDecomposition[m]    the Jordan decomposition of the matrix m
This demonstrates the Jordan form for a sample matrix.
Out[3]//MatrixForm=
The Jordan form has the eigenvalues of the matrix along its diagonal. Any defective eigenvalues are grouped into blocks by 1s just above the diagonal. The Jordan form of the above matrix is shown here.
Out[4]//MatrixForm=
The Jordan form shows that there are two distinct eigenvalues. The first eigenvalue is repeated twice and has a complete set of eigenvectors. The eigenvalue 2 is repeated four times; it appears once with its own eigenvector, and then three times with only one full eigenvector. This is demonstrated when the eigensystem for the matrix is computed.
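A minimal JordanDecomposition sketch with an illustrative defective matrix:
In[1]:= m = {{3, 1}, {0, 3}};                     (* eigenvalue 3 is repeated but has only one eigenvector *)
In[2]:= {s, j} = JordanDecomposition[m];
In[3]:= j
Out[3]= {{3, 1}, {0, 3}}
In[4]:= s . j . Inverse[s] == m
Out[4]= True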
Functions of Matrices
The computation of functions of matrices is a general problem with applications in many areas such as control theory. The function f of a square matrix A is not the same as the application of the function to each element in the matrix. Clearly element-wise application would not maintain properties consistent with the application of the function to a scalar.
For example, each element of the following matrix is raised to the zero power.
However, a much better result would be the identity matrix. Mathematica has a function for raising a matrix to a power, and when a matrix is raised to the zero power the result is the identity matrix.
There are a number of ways to define functions of matrices; one useful way is to consider a series expansion. For the exponential function this works as follows.
e^A = I + A + A²/2! + A³/3! + …
One way to compute this series involves diagonalizing A, so that A == S.D.S⁻¹ and e^A == S.e^D.S⁻¹; because D is diagonal, e^D is obtained by exponentiating its diagonal entries. Therefore, the exponential of A can be computed from its eigensystem.
This technique can be generalized to other functions f by applying f to the eigenvalues of A. Note that while this is one way to define functions of matrices, it does not provide a good way to compute them.
Mathematica does not have a function for computing general functions of matrices, but it has some specific functions.
MatrixPower[m,n]    nth matrix power
MatrixExp[m]    matrix exponential
m1.m2    matrix multiplication
Inverse[m]    matrix inverse
Here is a sample matrix.
This raises the matrix to the fourth power.
The result is equivalent to squaring the square of the matrix.
This computes the exponential of the matrix.
It is equivalent to the computation that uses the eigensystem of the matrix. (It should be noted that this is not an efficient way to compute a matrix exponential; the example here is only for exposition.)
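A brief sketch of these functions and of the eigensystem-based computation of the exponential (the matrix is illustrative):
In[1]:= m = {{2, 1}, {1, 2}};
In[2]:= MatrixPower[m, 4]
Out[2]= {{41, 40}, {40, 41}}
In[3]:= MatrixExp[N[m]]
Out[3]= {{11.4019, 8.68363}, {8.68363, 11.4019}}
In[4]:= {vals, vecs} = Eigensystem[N[m]]; s = Transpose[vecs];
In[5]:= Chop[s . DiagonalMatrix[Exp[vals]] . Inverse[s] - MatrixExp[N[m]]]
Out[5]= {{0, 0}, {0, 0}}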
A technique for computing parametrized functions of matrices by solving differential equations is given elsewhere in the documentation.