Skip to content

On Companion Matrices

Let \(F\) be a field and \(F[x]\) be the polynomial ring in \(x\) over \(F\).

Let \(f(x)\in F[x]\) be a monic polynomial of degree \(\deg{f(x)}=n\), write

\[f(x)=x^n+a_{n-1}x^{n-1}+\cdots+a_0\quad (a_i\in F)\]

Define the companion matrix \(C\) of \(f(x)\) to be the following \(n\times n\) matrix

\[ C=\begin{bmatrix} 0 && 0 && \cdots && 0 && -a_0 \\ 1 && 0 && \cdots && 0 && -a_1 \\ 0 && 1 && \cdots && 0 && -a_2 \\ \vdots && \vdots && \ddots && \vdots && \vdots \\ 0 && \cdots && \cdots && 1 && -a_{n-1} \end{bmatrix} \]

and is denoted by \(C(f(x))\).

Let \(f(x)\in F[x]\) be a monic polynomial and \(C=C(f(x))\) be its companion matrix, its characteristic polynomial \(\chi_C(x)\) and minimal polynomial \(\mu_C(x)\) can be shown to be \(f(x)\).

To compute \(\chi_C(x)\), expand from the definition of \(\chi_C(x)=\det(xI - C)\) and use basic properties of determinant of matrices it can be deduced that \(\chi_C(x)=f(x)\). To prove that \(\mu_C(x) = f(x)\), suppose that \(\deg(\mu_C(x)) = d < n\), then a non-trivial linear combination of the standard basis in \(F^n\) can be constructed, which contradicts to the definition of a basis. Hence \(d=n\) and \(\mu_C(x) = \chi_C(x)\) as \(\chi_C(C) = O\) by direct verification where \(O = O_{n\times n}\) is the \(n\times n\) zero matrix$.

Speaking of the rank of a companion matrix \(C=C(f(x))\), it can be immediately summarized as follows:

  • If \(C\) is non-singular, then \(\text{rank}(C) = n\).
  • If \(C\) is singular (i.e., when \(f(0) = a_0 = 0\)), then \(\text{rank}(C) = n-1\).

Let \(C = C(f(x))\) be the companion matrix of a monic \(f(x)\in F[x]\), it can be argued that \(C\) is the matrix representation of linear operator \(T: V \to V\) on a finite-dimensional vector space \(V\) over a field \(F\) with respect to some ordered basis \(\beta\) in \(V\). The construction is: let \(V=F^n\) and \(T:V\to V\) defined by \(T(v)=Cv\), the left multiplication of \(v\) by \(C\). and let \(\beta=\{e_1, T(e_1), \cdots, T^{n-1}(e_1)\}\). By the definition of \(C\), \(beta\) is linearly independent over \(F\) and therefore is a basis for \(V\). By the definition of the matrix representation,

\[[T]_{\beta} = C(f(x))\]

Let \(V\) be a finite-dimensional vector space over \(F\) and \(T:V\to V\) be a linear operator. Let \(v\in V\). consider the \(T\)-annihilator \(\mu_v(x)\) of \(v\), the monic polynomial \(p(x)\in F[x]\) with minimal degree such that \(p(T)v=0\). Let \(\mu_v(x) = f(x)\) and \(\deg(f(x)) = n\). Consider the set \(\beta=\{v, T(v),\ldots,T^{n-1}(v)\}\). Define

\[C_v = \text{span}(\beta)\]

The subspace \(C_v\) of \(V\) will be called the cyclic subspace of \(v\) (in \(V\)). Note that \(C_v\) depends on the underlying \(T\) as well but is usually omitted as the linear operator \(T\) being discussed is fixed throught the context.

By the definition of \(C_v\), \(C_v\) is \(T\)-invariant subspace of \(V\). Consider \(T|_{C_v}\), the linear operator induced by \(T\) on \(C_v\). Elementary facts about \(T|_{C_v}\) are: the characteristic polynomial \(\chi_{T_{C_v}}(x)\) and minimal polynomial \(\mu_{T|_{C_v}}(x)\) of \(T|_{C_v}\) are the same and equal to \(\mu_v(x) = f(x)\). Also by the definition of \(\mu_v(x)\) and that of \(\mu_T(x)\) and \(\chi_T(x)\),

\[\mu_v(x)\mid \mu_T(x), \mu_v(x)\mid \chi_T(x)\]

The importance of companion matrices is that they appear as the building blocks in some matrix representation of a linear operator on a finite-dimensional vector space over a field with respect to a particular basis. Such matrix representations are the so-called Rational Canonical Forms, and the existence theorem of such bases is the Cyclic Decomposition Theorem.

In essence, to derive the Cyclic Decomposition Theorem, a special case where \(\mu_T(x)\) is a power of an irreducible polynomial is being considered first and such well-behaved basis can be obtained. Later one can reduce the general case (no restriction on \(\mu_T(x)\)) to that special case and assemble those bases together to one for the original vector space.

Let \(V\) be a finite-dimensional vector space over F and \(T:V \to V\) be a linear operator on \(V\). Suppose that \(\mu_T(x) = q(x)^m\) where \(q(x)\) is a monic irreducible polynomial in \(F[x]\). Then there are cyclic subspaces \(C_{v_1}, \cdots, C_{v_{\ell}}\) such that \(V=\bigoplus_{i=1}^{\ell}{C_{v_i}}\), and \(\chi_{T|_{C_{v_i}}}(x) = \mu_{T|_{C_{v_i}}}(x) = q(x)^{m_{i}}\) with \(m = m_1 \ge m_2 \ge \cdots \ge m_{\ell}\). Let \(\beta\) be the ordered set derived from the cyclic bases of \(V_{v_1}, \cdots, V_{v_{\ell}}\), then

\[ [T]_{\beta}=\begin{bmatrix} C(q(x)^{m_1}) && O && O && \cdots && O\\ O && C(q(x)^{m_2}) && O && \cdots && O\\ O && O && C(q(x)^{m_2}) && \cdots && O\\ \vdots && && && \ddots && \vdots \\ O && \cdots && \cdots && O && C(q(x)^{m_{\ell}}) \end{bmatrix} \]

To obtain the Cyclic Decomposition Theorem without assuming that \(\mu_T(x) = q(x)^m\), a modern proof is to use the Primary Decomposition Theorem to decompose \(V\) into a direct sum of \(T\)-invariant subspaces \(W_i\), and consider the linear operator \(T|_{W_i}\) induced by \(T\) on \(W_i\). This consideration recudes to the previous case as \(T|_{W_i}\) is now a power of an irreducible polynomial.