Preliminaries

Recall the \(X\), \(Y\), and \(Z\) Pauli matrices:

\[\begin{equation} X = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \end{equation}\]
\[\begin{equation} Y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \end{equation}\]
\[\begin{equation} Z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \end{equation}\]

The eigenvalues of all three matrices are \(\{+1, -1\}\). Calculate the eigenvectors yourself as an exercise.
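These can be checked numerically. A minimal sketch using NumPy (an assumption; any linear-algebra library would do):

```python
import numpy as np

# The three Pauli matrices
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

# Each Pauli matrix is Hermitian, so eigvalsh applies;
# each has eigenvalues -1 and +1.
for P in (X, Y, Z):
    vals = np.linalg.eigvalsh(P)
    print(np.round(vals, 10))  # [-1.  1.] for each Pauli
```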

Composite Systems

Consider what happens when the transformation \(X_1 Z_2\) is applied to a two-qubit system. \(X\) and \(Z\) are Pauli matrices, and the subscript 1 on \(X\) denotes that the matrix acts only on the first qubit; similarly, \(Z_2\) acts only on the second. Let's see how this transformation affects the four computational basis vectors:

\[\begin{equation} 00 \xrightarrow{Z_2} 00 \xrightarrow{X_1} 10 = \begin{pmatrix} 0 & 0 & 1 & 0 \end{pmatrix}^T \\ 01 \xrightarrow{Z_2} -01 \xrightarrow{X_1} -11 = \begin{pmatrix} 0 & 0 & 0 & -1 \end{pmatrix}^T \\ 10 \xrightarrow{Z_2} 10 \xrightarrow{X_1} 00 = \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix}^T \\ 11 \xrightarrow{Z_2} -11 \xrightarrow{X_1} -01 = \begin{pmatrix} 0 & -1 & 0 & 0 \end{pmatrix}^T \\ \end{equation}\]

The matrix corresponding to the transformation \(X_1 Z_2\) is thus obtained by placing these images of the basis vectors side by side as its columns:

\[\begin{equation} M = \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix} \end{equation}\]

Note that this matrix is equal to the tensor product of the \(X\) and \(Z\) matrices:

\[\begin{equation} M = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \otimes \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \end{equation}\]
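This identity can be verified with NumPy's `np.kron`, which computes the Kronecker (tensor) product — a quick sketch, assuming NumPy:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

# X_1 Z_2 on two qubits is the tensor (Kronecker) product X (x) Z
M = np.kron(X, Z)
print(M)
# [[ 0  0  1  0]
#  [ 0  0  0 -1]
#  [ 1  0  0  0]
#  [ 0 -1  0  0]]
```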

Suppose we want to find the matrix corresponding to \(X_2 Z_1\). Repeating the procedure above, we get

\[\begin{equation} 00 \xrightarrow{Z_1} 00 \xrightarrow{X_2} 01 = \begin{pmatrix} 0 & 1 & 0 & 0 \end{pmatrix}^T \\ 01 \xrightarrow{Z_1} 01 \xrightarrow{X_2} 00 = \begin{pmatrix} 1 & 0 & 0 & 0 \end{pmatrix}^T \\ 10 \xrightarrow{Z_1} -10 \xrightarrow{X_2} -11 = \begin{pmatrix} 0 & 0 & 0 & -1 \end{pmatrix}^T \\ 11 \xrightarrow{Z_1} -11 \xrightarrow{X_2} -10 = \begin{pmatrix} 0 & 0 & -1 & 0 \end{pmatrix}^T \\ \end{equation}\]

Thus

\[\begin{equation} M = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{pmatrix} \end{equation}\]

and we can check that this is the same as the tensor product of the \(Z\) and \(X\) matrices:

\[\begin{equation} M = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \otimes \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \end{equation}\]
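The same `np.kron` check works here, with the factors swapped — again a NumPy sketch:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

# For X_2 Z_1 the first tensor factor (acting on qubit 1) is Z
M = np.kron(Z, X)
print(M)
# [[ 0  1  0  0]
#  [ 1  0  0  0]
#  [ 0  0  0 -1]
#  [ 0  0 -1  0]]
```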

Observations

Let's go back to the \(X_1 Z_2\) matrix and understand what happens when a measurement is made after applying it. In this context the matrix is referred to as an observable. An observable is constrained to be a Hermitian operator:

Definition of Hermitian (aka self-adjoint) operator
\[\begin{equation} M = M ^ \dagger \end{equation}\]

where \(\dagger\) denotes the adjoint, formed by taking the element-wise complex conjugate followed by a transpose (which turns column vectors into row vectors). Note that a measurement operator need not be unitary.
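A quick numerical check of the Hermitian property for \(M = X \otimes Z\), sketched in NumPy; the projector at the end is a hypothetical illustration of an operator that is Hermitian but not unitary:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
M = np.kron(X, Z)

# Hermitian: M equals its adjoint (conjugate transpose)
assert np.array_equal(M, M.conj().T)

# Hermitian does not imply unitary: the projector |0><0|
# is Hermitian, but P P^dagger is not the identity.
P = np.array([[1, 0], [0, 0]], dtype=complex)
assert np.array_equal(P, P.conj().T)                   # Hermitian
assert not np.allclose(P @ P.conj().T, np.eye(2))      # not unitary
```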

What does the measurement process do? It does two things:

  • The measurement process collapses the wavefunction to one of the orthonormal eigenvectors (also known as eigenstates) of the matrix \(M\).

  • The result of measurement is an eigenvalue of \(M\). That is what the measurement device registers.

The collapse is given by:

Wavefunction collapse onto the i-th eigenvector \(\textbf{q}_i\)
\[\begin{equation} \Psi' = e^{i\theta} \textbf{q}_i \end{equation}\]

where \(e^{i\theta}\) is given by projecting the wavefunction onto \(\textbf{q}_i\) and normalizing the resulting complex number to unit length. That is, let

Projection of the wavefunction onto the i-th eigenvector \(\textbf{q}_i\)
\[\begin{equation} z_i = \textbf{q}_i ^ \dagger \Psi \end{equation}\]

Then \(e^{i\theta}\) is simply \(\frac{z_i}{||z_i||}\), and the probability with which the wavefunction collapses to \(\textbf{q}_i\) is given by

Probability of measuring i-th eigenvalue
\[\begin{equation} p_i = ||z_i||^2 \end{equation}\]

The above equation is also written as follows in bra-ket notation; the two are the same thing:

\[\begin{equation} p_i = \langle\Psi|\textbf{q}_i\rangle\langle\textbf{q}_i|\Psi\rangle \end{equation}\]
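The project-and-square recipe above can be sketched as a small helper, assuming NumPy; the \(|+\rangle\) state used below is just an illustrative example:

```python
import numpy as np

def measurement_probability(q_i, psi):
    """p_i = |q_i^dagger psi|^2, the probability of the
    wavefunction psi collapsing onto the eigenvector q_i."""
    z_i = np.vdot(q_i, psi)  # vdot conjugates its first argument
    return abs(z_i) ** 2

# Example: measuring |+> = (|0> + |1>)/sqrt(2) in the computational basis
plus = np.array([1, 1]) / np.sqrt(2)
ket0 = np.array([1, 0])
print(measurement_probability(ket0, plus))  # ~0.5
```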

By default, when we see circuit diagrams with a measurement gate, it is measuring with \(M=Z\), where \(Z\) is the Pauli \(Z\) matrix with eigenstates \(|0\rangle\) and \(|1\rangle\). This is also known as measurement in the computational basis.

What if we have repeated eigenvalues? Which eigenvector does the wavefunction collapse to? In this case the wavefunction collapses to its (normalized) projection onto the eigenspace spanned by the associated eigenvectors. Umesh Vazirani talks about this in this video.

An Example

Let's do the math to make this clear, taking \(M = X_1 Z_2\) as an example. First, we need to calculate the eigenvalues and eigenvectors of the matrix \(M\); do that as an exercise. The eigenvalues are given by:

\[\begin{equation} \lambda = \{ +1, -1, +1, -1 \} (\textrm{there are duplicate eigenvalues}) \end{equation}\]

and the orthonormal eigenvectors are given by columns of the following matrix:

\[\begin{equation} Q = \frac{1}{\sqrt 2} \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & -1 & 1 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix} \end{equation}\]

Verify:

\[\begin{equation} \begin{pmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{pmatrix} \cdot \frac{1}{\sqrt 2} \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & -1 & 1 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix} = \frac{1}{\sqrt 2} \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & -1 & -1 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix} \end{equation}\]

Above is nothing but:

\[\begin{equation} M Q = Q \Lambda \end{equation}\]
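This eigendecomposition can be verified numerically — a sketch assuming NumPy, with \(Q\) and \(\Lambda\) entered as given above:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
M = np.kron(X, Z)

# Orthonormal eigenvectors as columns of Q; eigenvalues on the
# diagonal of Lam, ordered to match the columns of Q.
Q = np.array([[1,  1,  0, 0],
              [0,  0, -1, 1],
              [1, -1,  0, 0],
              [0,  0,  1, 1]]) / np.sqrt(2)
Lam = np.diag([1, -1, 1, -1])

assert np.allclose(M @ Q, Q @ Lam)       # M Q = Q Lambda
assert np.allclose(Q.T @ Q, np.eye(4))   # columns are orthonormal
```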

As an exercise, take \(\Psi\) to be the Bell state \(\frac{1}{\sqrt 2}(|00 \rangle + |11 \rangle)\).

For \(i = 0\):

\[\begin{equation} \textbf{q}_0 \textbf{q}_0 ^ \dagger = \frac{1}{2} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} \begin{pmatrix} 1 & 0 & 1 & 0 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \end{equation}\]

and we can check that

\[\begin{equation} p_0 = \Psi ^ \dagger \textbf{q}_0 \textbf{q}_0 ^ \dagger \Psi = \frac{1}{4} \end{equation}\]

If you do the math, you will find that \(p_1, p_2, p_3\) are all \(\frac{1}{4}\) as well.
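All four probabilities can be computed in one shot by projecting \(\Psi\) onto every column of \(Q\) — a NumPy sketch:

```python
import numpy as np

Q = np.array([[1,  1,  0, 0],
              [0,  0, -1, 1],
              [1, -1,  0, 0],
              [0,  0,  1, 1]]) / np.sqrt(2)

# Bell state (|00> + |11>) / sqrt(2)
psi = np.array([1, 0, 0, 1]) / np.sqrt(2)

# p_i = |q_i^dagger psi|^2 for each eigenvector (column of Q)
p = np.abs(Q.T.conj() @ psi) ** 2
print(p)  # all four probabilities are 1/4, up to float error
```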

The average value of the observable is:

\[\begin{equation} \langle M \rangle = \sum_i \lambda_i p_i \end{equation}\]

and turns out to be the same as:

\[\begin{equation} \langle M \rangle = \Psi ^ \dagger M \Psi \end{equation}\]

which in this case is \(0\).
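Finally, the expectation value \(\langle M \rangle = \Psi^\dagger M \Psi\) for the Bell state can be checked numerically — again a NumPy sketch:

```python
import numpy as np

X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
M = np.kron(X, Z)

psi = np.array([1, 0, 0, 1]) / np.sqrt(2)  # Bell state

# <M> = psi^dagger M psi; equivalently sum_i lambda_i p_i
expectation = psi.conj() @ M @ psi
print(expectation)  # 0.0
```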