Nuclear Magnetic Resonance (nmr) is, as the name suggests, a resonant method that probes the intrinsic magnetic moments of atomic nuclei. It is not nuclear in the sense of fission/fusion, but rather in the sense that it involves matter on the scale of the nucleus. As a discipline it is very mature, having a long history reaching back to fundamental research in the 1930s. Since then, it has found use in a huge variety of fields, not least as a structural determination tool in protein biology and in the modern hospital Magnetic Resonance Imaging (mri) scanner.
This page is intended to give an introduction to the physical origin and theoretical treatment of nmr for those with a physics background. Both the 'semi-classical' and quantum approaches are presented. It is not intended to be exhaustive or completely rigorous (the level of detail presented for each topic varies somewhat arbitrarily), and the reader is encouraged to pursue the enormous variety of available books and lecture courses on nmr (and quantum mechanics more generally) for more detail.
As with any resonant method, nmr requires a strong response at a certain frequency that is dependent on some property of the system: classically because the system has natural dynamics strongly confined to that frequency, and quantum-mechanically because the system has energy levels with gaps corresponding to that frequency. We also expect energy-loss and relaxation effects to rear their heads. With this in mind, we can apply the general resonant recipe for investigating the nuclei:
For Step 1, we can use the coupling of the nuclear magnetic moment to an external magnetic field introduced by the experimenter. This gives the nucleus an energy dependent on the interaction of its magnetic moment with the external field, and is generally different for different isotopes. Typically the field used is very strong, but it is possible (and is done in some contexts) to use very weak magnetic fields, such as that of the Earth.
Step 3 can be achieved using radio pulses, since the typical energies involved correspond to photons with frequencies in the radio-range.
To observe the nuclei in Step 4, we can exploit the changing magnetic field created by the resonating moments, and measure this as a voltage across a pickup coil. Typically this is the same coil used to generate the radio-frequency pulses used in the previous step.
The processing in Step 5 involves some form of numerical fitting, or integral transforms such as the Fourier transform (for frequency-space) or the Laplace transform (for decay-space).
Finally, the methods used for Step 6 depend on what information we want to extract. In many applications this is how much of a certain nucleus there is in a sample, in which case we are concerned with the relative size of signals compared to some reference.
Classically, the potential energy $E$ of a nuclear magnetic moment $\boldsymbol{\mu}$ in a magnetic field with flux density $\mathbf{B}$ is $$E = -\boldsymbol{\mu}\cdot\mathbf{B}. \tag{1}$$ This means that it is energetically favourable for the moment to be aligned parallel to the field.
Consider now a non-ferromagnetic sample consisting of a large (here 'large' means thermodynamically large) number of nuclei with moments $\boldsymbol{\mu}_i$, such that it makes sense to consider the macroscopic magnetic moment $\mathbf{M} = \sum_i \boldsymbol{\mu}_i$. If the magnetic field is zero, then there is no preferred alignment, and we expect $\mathbf{M} = \mathbf{0}$. If the field is non-zero, then the energy argument above means that the distribution of the moment components parallel to the field should be skewed slightly in the positive sense, while the distribution of the components perpendicular to the field should be unchanged. The result of this is that we expect a small, non-zero net moment $$\mathbf{M} \parallel \mathbf{B}\qquad\mathrm{with}\qquad \frac{\mu_0}{V}|\mathbf{M}| \ll |\mathbf{B}|,\tag{2}$$ where $V$ is some proxy for the sample volume and $\mu_0$ is the permeability of free space.
An equivalent statement to Equation (1) is that the torque $\boldsymbol{\tau}$ on a moment $\boldsymbol{\mu}$ in a magnetic field $\mathbf{B}$ is $$\boldsymbol{\tau} = \boldsymbol{\mu} \times \mathbf{B},$$ which is the rotational analogue of force being the negative gradient of potential energy. Now comes the 'semi-classical' step. We assert, for now without justification, that the nuclear angular momentum $\mathbf{J}$ is parallel to the nuclear magnetic moment, that is $$\boldsymbol{\mu} = \gamma \mathbf{J},$$ where $\gamma$, called the gyromagnetic ratio, is a constant scalar for a particular isotope.^{1} This may be positive or negative. Applying the rotational form of Newton's Second Law, namely $\boldsymbol{\tau} = \mathrm{d}\mathbf{J}/\mathrm{d}t$, gives $$\frac{\mathrm{d}\boldsymbol{\mu}}{\mathrm{d}t} = \gamma\boldsymbol{\mu}\times\mathbf{B}.$$ Since this equation is linear in $\boldsymbol{\mu}$, we may extend it to say that the net magnetic moment evolves as $$\frac{\mathrm{d}\mathbf{M}}{\mathrm{d}t} = \gamma\mathbf{M}\times\mathbf{B}=-\gamma\mathbf{B}\times\mathbf{M}.\tag{3}$$ This is a description of precession (note the similarity to the perhaps more familiar gyroscopic precession equation $\mathbf{G}=\boldsymbol{\Omega}\times\mathbf{L}$): if $\mathbf{M}$ and $\mathbf{B}$ are parallel, as is the case in equilibrium, then $\mathbf{M}$ does not evolve, but if $\mathbf{M}$ is tilted away from $\mathbf{B}$, it rotates with unchanging magnitude around the magnetic field with a signed angular frequency $$\omega = -\gamma|\mathbf{B}|.$$ This is the Larmor angular frequency, and corresponds to a resonance at an absolute frequency $f = |\omega|/2\pi = |\gamma||\mathbf{B}|/2\pi$ in a directional sense given by the sign of $\omega$ (positive is anticlockwise about $\mathbf{B}$). The vectors and directions involved are illustrated in Figure 1 for the common case $\gamma > 0$.
Figure 1: Precession of the net magnetisation in the laboratory frame for $\gamma > 0$
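Equation (3) can be checked with a short numerical sketch. The following (with an illustrative gyromagnetic ratio and field, not physical values) integrates the precession equation and confirms rotation about $\mathbf{B}$ at the signed Larmor frequency $\omega = -\gamma|\mathbf{B}|$ with $|\mathbf{M}|$ conserved.

```python
import numpy as np

# Integrate dM/dt = gamma * M x B (Equation 3) with a midpoint (RK2) step.
# gamma, B and the initial tilt are illustrative values, not physical ones.
gamma = 2.0                          # rad s^-1 T^-1, illustrative
B = np.array([0.0, 0.0, 1.0])        # background field along z
M = np.array([1.0, 0.0, 1.0])        # magnetisation tilted away from B

dt, steps = 1e-4, 10_000             # integrate up to t = 1 s
for _ in range(steps):
    k1 = gamma * np.cross(M, B)
    k2 = gamma * np.cross(M + 0.5 * dt * k1, B)
    M = M + dt * k2

# analytic solution: the transverse part rotates at omega = -gamma*|B|,
# while the longitudinal part is untouched
t = steps * dt
omega = -gamma * np.linalg.norm(B)
Mx_exact, My_exact = np.cos(omega * t), np.sin(omega * t)
```

The transverse components trace out $\cos\omega t$ and $\sin\omega t$: clockwise here, since $\gamma > 0$ gives $\omega < 0$, matching the signed-frequency convention in the text.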
If a coil is placed perpendicular to the magnetic field, then the magnetic flux it threads will change as the magnetisation precesses. By Faraday's Law this will induce a potential difference across the ends of the coil alternating with the Larmor frequency, so we can detect the tilted magnetisation.
The previous section shows that for interesting dynamics, we must tilt the net moment away from the background magnetic field. This can be done by introducing a secondary magnetic field which has components perpendicular to the main field, because this temporarily changes the direction of the net field, and hence also the axis about which precession takes place.
At this point, it is useful to introduce the rotating frame treatment. Recall that the time derivatives in some reference 'laboratory' frame and a frame rotating with angular velocity $\boldsymbol{\Omega}$ relative to that frame are related by the operator equality $$\frac{\mathrm{d}}{\mathrm{d}t} = \left.\frac{\mathrm{d}}{\mathrm{d}t}\right|_\mathrm{rot} \quad + \quad \boldsymbol{\Omega}\times\phantom{0}.$$ Suppose then that we transform Equation (3), which is valid for the laboratory frame in which our machine is stationary, into such a rotating frame. We get $$\left.\frac{\mathrm{d}\mathbf{M}}{\mathrm{d}t}\right|_\mathrm{rot} = -\gamma\mathbf{B}\times\mathbf{M}\ -\ \boldsymbol{\Omega}\times\mathbf{M} = -(\boldsymbol{\Omega}+\gamma\mathbf{B})\times\mathbf{M}.$$ Define now the offset angular velocity as $\Delta\boldsymbol{\omega} = -[\boldsymbol{\Omega}-(-\gamma\mathbf{B})]$. If we choose $\boldsymbol{\Omega}=-\gamma\mathbf{B}$, then the offset is zero, and $\mathbf{M}$ does not evolve in that frame. This is to be expected - we have transformed into a frame that is rotating with the Larmor frequency, so we are tracking any precession of the magnetisation, and it will appear stationary.
An alternative way of thinking about this is that this new equation of motion looks exactly like Equation (3) if we define the reduced field $\mathbf{B}_\mathrm{red} = -\Delta\boldsymbol{\omega}/\gamma$. It is as if moving into the rotating frame reduces the background field, and in fact removes it entirely in the Larmor frame.
Now suppose we choose our laboratory axes such that the coil is aligned along the $x$-direction in the laboratory frame (and also such that the background magnetic field is in the $z$-direction). If we drive an alternating current with a signed angular frequency $\omega_1$ through the coil, an alternating, linearly-polarised magnetic field $\mathbf{B}_\mathrm{coil}$ with its only non-zero component in the $x$-direction (that is along the unit vector $\mathbf{e}_x$) will be generated. Taking the amplitude of this magnetic field as $2B_1 > 0$ for later convenience, and its phase at time zero as $\phi$, the field has the form $$\mathbf{B}_\mathrm{coil} = 2B_1\cos(\omega_1 t + \phi)\mathbf{e}_x.$$ This linearly-polarised field can be written as the sum of two circularly-polarised fields, each of amplitude $B_1$, counter-rotating about the $\mathbf{e}_z$ direction as $$\mathbf{B}_\mathrm{coil} = \underbrace{B_1[\cos(\omega_1 t + \phi)\mathbf{e}_x + \sin(\omega_1 t + \phi)\mathbf{e}_y]}_{\mathbf{B}_\curvearrowleft,\ \mathrm{anti-clockwise}} + \underbrace{B_1[\cos(\omega_1 t + \phi)\mathbf{e}_x - \sin(\omega_1 t + \phi)\mathbf{e}_y]}_{\mathbf{B}_\curvearrowright,\ \mathrm{clockwise}}.$$ Suppose we choose $\omega_1$ to match the Larmor frequency (without loss of generality, say that $\gamma < 0$ such that $\omega > 0$ for clarity). In a frame rotating with the Larmor frequency (that is the rotating frame from before in which the magnetisation appears stationary when only the background field is present), these two portions of the field will behave differently:
The average value of the projection of $\mathbf{B}_{\curvearrowright,\ \mathrm{rot}}$ along any direction in the $x'$-$y'$ plane is zero. If in addition the Larmor frequency is in some sense large, such that $2\pi/\omega$ is much shorter than any meaningful dynamical timescale of our system, then it is reasonable to replace this field portion by its zero average and consider only the effect of the stationary $\mathbf{B}_{\curvearrowleft,\ \mathrm{rot}}$. This is known as the rotating frame approximation (also commonly called the rotating wave approximation).
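The decomposition above can be verified numerically. In this sketch (all values illustrative), the two lab-frame circular components are transformed into the frame rotating anticlockwise at $\omega$: the anticlockwise part becomes the constant vector $B_1(\cos\phi, \sin\phi)$, while the clockwise part rotates at $2\omega$ and averages to (almost exactly) zero over a period.

```python
import numpy as np

# Illustrative angular frequency, amplitude and phase (omega_1 = omega here)
omega, B1, phi = 50.0, 1.0, 0.3
t = np.linspace(0.0, 2 * np.pi / omega, 10_001)   # one Larmor period

# the two circularly-polarised parts of the coil field in the lab frame
anti = B1 * np.stack([np.cos(omega * t + phi),  np.sin(omega * t + phi)])
clock = B1 * np.stack([np.cos(omega * t + phi), -np.sin(omega * t + phi)])

def to_rot(f):
    """Transform lab-frame x-y components into the frame rotating
    anticlockwise at omega (rotate the components back by omega*t)."""
    c, s = np.cos(omega * t), np.sin(omega * t)
    return np.stack([c * f[0] + s * f[1], -s * f[0] + c * f[1]])

anti_rot, clock_rot = to_rot(anti), to_rot(clock)
```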
In the rotating frame, the effect of applying this stationary field for a time $\tau$ is to rotate the magnetisation by an angle $\theta=-\gamma \tau B_1$ about the stationary field. Such an application is generally called a pulse and is usually written as $(\theta)_\phi$. Often $\phi$ may be replaced by the axis along which the stationary field is pointing, e.g. $\theta_{x'}$. When unambiguous, it is common to drop the primes too.
Two very important pulses are the $90^\circ$ or $\frac{\pi}{2}$ pulse, with $|\theta|=\frac{\pi}{2}$, and analogously the $180^\circ$ or $\pi$ pulse. Other names for these are respectively the excitation pulse (because it can be used to flip the equilibrium magnetisation into the transverse plane, where it will give maximum observable signal) and the inversion pulse (because it reverses the direction of the magnetisation). The action of these two pulses is illustrated in Figure 2.
Figure 2: Action of $\frac{\pi}{2}$ (left) and $\pi$ (right) pulses on the equilibrium magnetisation in the rotating frame for $\gamma < 0$
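Since an on-resonance pulse in the rotating frame is simply a rotation of $\mathbf{M}$ about the axis of the stationary field, its action can be sketched with ordinary rotation matrices (the vectors and the sign of the rotation here are one illustrative convention, not the only one):

```python
import numpy as np

def rot_x(theta):
    """Rotation by angle theta about the x'-axis of the rotating frame."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0,   c,  -s],
                     [0.0,   s,   c]])

M0 = np.array([0.0, 0.0, 1.0])        # equilibrium magnetisation along z

M_90 = rot_x(np.pi / 2) @ M0          # excitation: into the transverse plane
M_180 = rot_x(np.pi) @ M0             # inversion: flipped to -z
```

With this convention a $\frac{\pi}{2}$ pulse about $x'$ leaves the magnetisation along $-y'$, and two successive $\frac{\pi}{2}$ pulses about the same axis compose to a $\pi$ pulse.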
If the frequency of the applied field does not quite match the Larmor frequency, then the analysis is no longer so simple, because the background field is still non-zero in the frame rotating with the applied field. This changes the direction and magnitude of the overall stationary field $\mathbf{B}_\mathrm{eff}$ in this frame, and hence also the effective orientation and flip angle of the pulse, as shown in Figure 3.
Figure 3: Off-resonance effective field in the rotating frame for $\gamma < 0$
Note that the stationary field now has a slightly larger magnitude of $B_\mathrm{eff}=\sqrt{B_1^2 + \left|\frac{\Delta\boldsymbol{\omega}}{\gamma}\right|^2}$, so the precession during a pulse is slightly faster than at resonance. The precession axis is also tilted out of the transverse plane by an angle $\arctan\left(\frac{1}{B_1}\left|\frac{\Delta\boldsymbol{\omega}}{\gamma}\right|\right)$.
However, this effect can be neglected if the magnitude of the applied field is much greater than the remaining part of the background field. This leads to the concept of a hard pulse - if the pulse length is very short and the pulse amplitude sufficiently large, then it acts almost like an on-resonance pulse at the Larmor frequency. (A hard pulse is also useful because narrow distributions in time are wide in frequency, so it can excite spins with different Larmor frequencies, as is the case when spins of the same type are in slightly different chemical environments, or when we are using an inhomogeneous background field).
There are many, many other pulse types, and their optimisation has been a significant target of contemporary nmr research. Examples include soft pulses or selective pulses that excite only specific frequencies, as well as more complicated pulses generated using optimal control theory for achieving a specific spin distribution.
Once the magnetisation has been tipped away from equilibrium, it will not remain in that excited state indefinitely, but will relax back towards its equilibrium value. To a first approximation, this intrinsic relaxation can be divided into two main types: longitudinal (or spin-lattice) relaxation, the recovery of the magnetisation component parallel to the background field towards its equilibrium value, which involves the spins exchanging energy with their surroundings; and transverse (or spin-spin) relaxation, the decay of the magnetisation component perpendicular to the field, which requires no net exchange of energy.
The simplest model for these relaxation processes is that they are exponential with time constant $T_1$ for longitudinal relaxation, and $T_2$ for transverse relaxation. Let $M_z(t)$ be the longitudinal magnetisation as a function of time and similarly let $M_{xy}(t)=\sqrt{M_x^2(t)+M_y^2(t)}$ be the transverse magnetisation. Also let $M_0$ be the magnitude of the magnetisation in equilibrium, when it is parallel with the background magnetic field along the $z$-axis. We are then saying that the magnetisation will evolve as $$ \begin{aligned} \frac{\mathrm{d}}{\mathrm{d}t}M_z(t) &= -\frac{1}{T_1}[M_z(t)-M_0],\\ \frac{\mathrm{d}}{\mathrm{d}t}M_{xy}(t) &= -\frac{1}{T_2}M_{xy}(t). \end{aligned} $$ These phenomenological relations (and more general counterparts involving multiple types of spins and separation of $M_x$ and $M_y$) are usually called the Bloch equations. In this simple case, they have solutions $$ \begin{aligned} M_z(t) &= M_0 + [M_z(0)-M_0]\exp\left(-\frac{t}{T_1}\right),\\ M_{xy}(t) &= M_{xy}(0)\exp\left(-\frac{t}{T_2}\right), \end{aligned} $$ which simplify further if we consider the evolution immediately following an excitation pulse (such that $M_z(0)=0$ and $M_{xy}(0)=M_0$) to $$ \begin{aligned} M_z(t) &= M_0\left[1-\exp\left(-\frac{t}{T_1}\right)\right],\\ M_{xy}(t) &= M_0\exp\left(-\frac{t}{T_2}\right). \end{aligned} $$ In general, $T_2$ tends to be shorter than $T_1$. This is heuristically because a process that causes longitudinal relaxation effectively also contributes to transverse relaxation - graphically, rotating a moment towards the $z$-axis also reduces its component in the transverse plane.^{2}
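As a sanity check, the closed-form solutions can be compared against a direct integration of the simple Bloch relaxation equations; the time constants here are illustrative only.

```python
import numpy as np

M0, T1, T2 = 1.0, 1.5, 0.3            # a.u. and seconds, illustrative
dt, t_end = 1e-4, 1.0

# state immediately after an excitation pulse: Mz = 0, Mxy = M0
Mz, Mxy = 0.0, M0
for _ in range(int(round(t_end / dt))):
    # forward-Euler step of the Bloch relaxation equations above
    Mz += dt * (-(Mz - M0) / T1)
    Mxy += dt * (-Mxy / T2)

# closed-form solutions for the same initial conditions
Mz_exact = M0 * (1.0 - np.exp(-t_end / T1))
Mxy_exact = M0 * np.exp(-t_end / T2)
```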
Both the relaxation times are strong functions of the correlation time $\tau_\mathrm{c}$ for the chemical and physical environment the spin is located in. The correlation time can be thought of as the time taken for a tumbling molecule to perform one full rotation, and is hence roughly a measure of 'boundedness'. Small molecules lightly coupled to other species rotate quickly and have short correlation times, while large molecules strongly coupled to a matrix or other species rotate slowly and have long correlation times. A more complete treatment of relaxation requires application of Redfield theory, but we can make some semi-quantitative statements without doing too much mathematics.
As discussed briefly above, longitudinal relaxation involves the spins exchanging energy with the surroundings. To do this ultimately involves transitions between energy levels that correspond to emission or absorption of photons with the Larmor frequency. So this relaxation is most efficiently driven if the spin sees, in its own rest frame, a locally-fluctuating field with the Larmor frequency. Hence $T_1$ is shortest at a correlation time satisfying $\frac{2\pi}{\tau_\mathrm{c}} = \omega$ and increases for correlation times away from this.
On the other hand, transverse relaxation depends more smoothly on the timescale of the fluctuations in the rest frame. If the molecule has a very short correlation time, then the projection of the locally-fluctuating field in any direction will average to zero, and relaxation will be slow; if instead the correlation time is long, then the set of directions of the local field will be much more restricted in the rest frame, and relaxation will be faster. There is a small flattening of this relaxation curve near the Larmor correlation time, where the $T_1$ curve turns over, since we noted above that longitudinal relaxation also tends to cause transverse relaxation. Finally, applying our general rule that $T_1$ is longer than $T_2$ fixes the relative position of the relaxation curves.
These dependencies are illustrated schematically in Figure 4.
Figure 4: A schematic graph of the dependence of the two intrinsic relaxation times on the correlation time $\tau_\mathrm{c}$
There are other factors that can have a significant effect on relaxation. For example, paramagnetic impurities significantly shorten relaxation times. This is because the large magnetic moments of their unpaired electrons produce strong local dipole fields, which couple to the nuclear motion and add an additional relaxation pathway.
In the analysis above, we have assumed that every spin contributing to the overall magnetisation has exactly the same background magnetic field. This is true only if the field is completely homogeneous: in reality, it will have a magnitude that varies slightly across the spatial extent of the sample.
This means that each individual spin (or each isochromat, that is the maximal group of spins that can be considered to be in a locally homogeneous field) will precess at a slightly different rate after excitation. As a result, the transverse magnetisation will appear to dephase faster, and therefore also decay faster, than would be expected due to only intrinsic effects. It is usual to characterise this extra artificial dephasing with an additional exponential decay time $T_\mathrm{hom}$, such that the transverse magnetisation after an excitation pulse now evolves instead as $$M_{xy}(t) = M_{xy}(0)\exp\left(-\frac{t}{T_2}\right)\exp\left(-\frac{t}{T_\mathrm{hom}}\right) = M_{xy}(0)\exp\left(-\frac{t}{T_2^*}\right),$$ where $T_2^* = (1/T_2 + 1/T_\mathrm{hom})^{-1}$ is the new effective relaxation time.^{3}
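Because independent exponential decays multiply, their rates add, which is where the expression for $T_2^*$ comes from. A trivial numerical check, with illustrative time constants:

```python
import math

T2, T_hom = 0.100, 0.025                    # seconds, illustrative values
T2_star = 1.0 / (1.0 / T2 + 1.0 / T_hom)    # effective relaxation time

# the product of the two decays is a single exponential with time T2*
t = 0.05
combined = math.exp(-t / T2) * math.exp(-t / T_hom)
single = math.exp(-t / T2_star)
```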
There is another relaxation effect closely related to inhomogeneity that we have not yet considered. In all the above treatment, we have assumed that an individual isochromat stays in the same physical location, such that it always experiences the same local field. This is not the case if spins can diffuse throughout the measurement volume: a diffusing spin samples different local field strengths as it moves, so its offset frequency effectively becomes a random process in time.
The classic way to approach this mathematically, as was done by Carr and Purcell (the same as those of spin-echo fame - see below), is to consider diffusion on a small enough scale (or in a field with sufficiently high homogeneity) that the background field can be modelled as having some constant gradient. Then the diffusion of a single isochromat can be modelled by a random walk process and the statistical properties of the accumulated phase relative to an isochromat that does not diffuse can be calculated. In the limit of very small step size, this random walk approaches continuous diffusion, which allows us to link the discrete step size to the continuous diffusion coefficient $D$.
In the case of a cpmg spin-echo sequence (see below), this yields a further exponential decay rate $1/T_\mathrm{diff} = A \tau^2$, where $A \propto D$ is effectively a proxy for the diffusivity of the spins, and $\tau$ is a time parameter associated with the spin-echo sequence (specifically, it is the echo spacing). This can then be combined with the previous relaxation times to give an effective relaxation time, as was done above for inhomogeneity. Without going too much into the detail of spin echoes at this stage, note that we can effectively suppress diffusion by choosing our sequence to have the smallest possible $\tau$.
Let us temporarily ignore inhomogeneity (and therefore also diffusive) effects. The measured signal, which is effectively the magnetisation in the transverse plane in the lab frame, additionally has an oscillatory factor at the Larmor frequency. This decaying oscillating exponential is called the free induction decay (fid). The fid is shown in Figure 5, which additionally shows the real and imaginary parts of the Fourier transform of the fid. The real part of this profile in frequency space is that which is shown on a 'normal' nmr spectrum: the peak is centred at the Larmor frequency for the spin, and its width is determined by the relaxation time. A long $T_2$ gives a well-defined peak, while things that relax very quickly are smeared out - the area under the peak is a constant, so good resolution means lower amplitude and vice-versa.
Figure 5: The fid in the time (top) and frequency (bottom) domains
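The fid-to-spectrum relationship can be sketched numerically: the real part of the Fourier transform of a synthetic complex fid peaks at the fid frequency, a short $T_2$ gives a broad, low peak and a long $T_2$ a sharp, tall one, while the area under the spectrum is unchanged. Sampling rate, frequency and time constants below are all illustrative.

```python
import numpy as np

fs = 10_000.0                                 # sampling rate, Hz
t = np.arange(0.0, 2.0, 1.0 / fs)
f0 = 250.0                                    # fid offset frequency, Hz

def spectrum(T2):
    """Real part of the DFT of a synthetic fid with decay time T2."""
    fid = np.exp(2j * np.pi * f0 * t - t / T2)
    freqs = np.fft.fftfreq(t.size, 1.0 / fs)
    return freqs, (np.fft.fft(fid) / fs).real

freqs, broad = spectrum(0.05)                 # fast relaxation: broad peak
_, sharp = spectrum(0.50)                     # slow relaxation: sharp peak
df = fs / t.size                              # frequency-bin spacing
```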
Note that in most instrumentation, the measured frequency is not actually the Larmor frequency itself, but rather the smaller angular frequency $\omega_\mathrm{rel}$ relative to some 'spectrometer angular frequency' $\omega_\mathrm{spec}$. There is also a degeneracy in the sign of the frequency, in that if we measure only the projection of the magnetisation along one direction, which is what a single pickup coil does, then we cannot tell the difference between $\omega_\mathrm{rel}$ and $-\omega_\mathrm{rel}$. An analogy is looking at a bicycle wheel with a painted spot edge-on: it is not possible to infer the direction of rotation from the motion of the spot. However, if we look at the wheel face-on, such that we can see the projection in two directions, then it is easy to see the direction of rotation. The equivalent in a spectrometer would be adding a second pickup coil orthogonal to the first. What is actually done in practice is keeping a single pickup coil, and then multiplying the fid with a cosine and sine oscillating at $\omega_\mathrm{spec}$. These two are then combined as the real and imaginary parts of a notional complex signal. Since the two sinusoids are orthogonal (in the sense that their inner product is zero), this amounts to the same thing as having two coils, and we can infer the sign of $\omega_\mathrm{rel}$. More generally, this process is called quadrature detection.
The following is not exactly what is done in practice, but it gives a good sense of the procedure.
Ignoring relaxation for the sake of clarity, the signal we see in the pickup coil is proportional to $\cos(\omega t + \phi)$, where here $\omega$ is the Larmor angular frequency and $\phi$ is some 'machine phase' (assumed constant over the course of the measurement) set by various characteristics of the spectrometer. Multiplying with reference sinusoids at the spectrometer angular frequency $\Omega$ (taken to have zero phase for convenience), we get, starting with a cosine, $$ \begin{aligned} \cos\Omega t\, \cos(\omega t + \phi) &= \operatorname{Re}\exp(\text{i}\Omega t)\, \operatorname{Re}\exp[\text{i}(\omega t + \phi)]\\ &= \frac{1}{2}\operatorname{Re}\left\{\exp(\text{i}\Omega t)\exp[\text{i}(\omega t + \phi)] + \exp(-\text{i}\Omega t)\exp[\text{i}(\omega t + \phi)]\right\}\\ &= \frac{1}{2}\operatorname{Re}\left\{\exp[\text{i}(\omega+\Omega)t+\text{i}\phi] + \exp[\text{i}(\omega-\Omega)t+\text{i}\phi]\right\}\\ &= \frac{1}{2}\cos\left[(\omega + \Omega)t + \phi\right] + \frac{1}{2}\cos\left[(\omega - \Omega)t + \phi\right]. \end{aligned} $$ Low-passing that in some sense (perhaps by using some form of a modified integrator), we get $$\cos\Omega t\, \cos(\omega t + \phi) \underset{\text{low pass}}{\to} \frac{1}{2}\cos\left[(\omega - \Omega)t + \phi\right],$$ and doing a similar trick with the sine, we have $$\sin\Omega t\, \cos(\omega t + \phi) \underset{\text{low pass}}{\to} -\frac{1}{2}\sin\left[(\omega - \Omega)t + \phi\right].$$ Now interpret these as the real and imaginary parts of a notional complex signal (at least up to a sign in the imaginary part), giving $$\frac{1}{2}\cos\left[(\omega - \Omega)t + \phi\right] + \frac{1}{2}\text{i}\sin\left[(\omega - \Omega)t + \phi\right] = \frac{1}{2}\exp\left\{\text{i}\left[(\omega-\Omega)t+\phi\right]\right\}.$$ Notice how we now have information on $\operatorname{sgn}(\omega - \Omega)$ in the form of a complex phase evolving linearly in time - so in principle we can correct for this after pickup.
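This mixing-and-filtering procedure can be sketched numerically. Multiplying by the cosine and sine references and combining them as above is equivalent to multiplying by $\exp(-\text{i}\Omega t)$, which is done here in one go; the moving-average filter is a crude stand-in for a real spectrometer's low-pass stage, and all frequencies and values are illustrative.

```python
import numpy as np

f_sig, f_spec, phi = 1020.0, 1000.0, 0.4      # Hz, Hz, rad (illustrative)
omega, Omega = 2 * np.pi * f_sig, 2 * np.pi * f_spec
fs = 100_000.0                                # sampling rate, Hz
t = np.arange(0.0, 0.1, 1.0 / fs)

signal = np.cos(omega * t + phi)              # what the pickup coil sees
mixed = signal * np.exp(-1j * Omega * t)      # cos and -sin references at once

# crude low-pass: moving average over ~2 periods of the (omega+Omega) term
window = 100
low = np.convolve(mixed, np.ones(window) / window, mode="valid")

# the surviving term is (1/2)exp{i[(omega-Omega)t + phi]}: fitting the
# phase slope recovers the relative frequency, sign included (~ +20 Hz)
phase = np.unwrap(np.angle(low))
f_rel = np.polyfit(t[: low.size], phase, 1)[0] / (2 * np.pi)
```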
What is usually done is to apply a linear phase correction to remove the frequency offset, and then a constant phase correction to remove the machine phase $\phi$, such that in the ideal case, all of the signal lies in the real part. This may seem unnecessary - why not just take the absolute value of the complex signal $S$, that is $\sqrt{\operatorname{Re}^2 S + \operatorname{Im}^2 S}$? The point is that if the noise on the real and imaginary parts is zero-centred (which in the absence of systematic errors it should be, by appealing to the central-limit theorem), then it will remain zero-centred on phase correction, without bias. In contrast, the transformed noise on the absolute value will be biased upwards, since that operation is manifestly non-negative.
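The bias argument is easy to demonstrate with synthetic noise: on a signal-free region of a spectrum, the real channel averages to zero while the magnitude has a strictly positive mean (for Gaussian noise of standard deviation $\sigma$ on each channel, the noise-only magnitude is Rayleigh-distributed with mean $\sigma\sqrt{\pi/2}$). The sample size and seed here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma = 200_000, 1.0

# zero-centred Gaussian noise on the real and imaginary channels,
# as in a signal-free region of the spectrum
noise = rng.normal(0.0, sigma, n) + 1j * rng.normal(0.0, sigma, n)

real_mean = np.mean(noise.real)     # unbiased: stays near zero
mag_mean = np.mean(np.abs(noise))   # biased upwards: near sigma*sqrt(pi/2)
```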
There is no way to undo the intrinsic dephasing characterised by $T_2$.
On the other hand, provided that we neglect diffusion effects, there will be no exchange of spins between each isochromat created by the inhomogeneous field, and each one can be assigned its own unchanging offset frequency relative to the nominal Larmor frequency. Relative to the nominal isochromat, some isochromats will have positive offsets and gain phase, while some will have negative offsets and lose phase. This gained or lost phase is linear in the frequency offset, since the precessed angle is just the offset frequency multiplied by the time. This artificial dephasing can be reversed, in the sense that if we could spool the system backwards, the isochromats would come back into phase again.
Of course we cannot just reverse time - but we can do something almost equivalent by considering what state the isochromats would be in if we could do this. If, after a dephasing time $\tau$, we could change the sign of the phase gradient - that is make leading isochromats lag, and vice-versa - then the system would look exactly as it would have done at a time $\tau$ before the isochromats were in phase. Then precession for an additional period $\tau$ would bring the isochromats back into phase again. Changing the sign of the phase gradient equivalently means changing the sign of the angle of each isochromat relative to the nominal one (in the plane containing the isochromats). But this is what would happen if every isochromat was rotated $180^\circ$ around the nominal one, and we already know how to do this: apply a hard $\pi$ pulse in the direction of the nominal isochromat!
This sharp increase in observable signal is called a spin-echo, or sometimes a Hahn echo (after Erwin Hahn, who first discovered the phenomenon in 1950). Although the above treatment was for a $\frac{\pi}{2}$ excitation pulse followed by a $\pi$ pulse, in fact two pulses of almost any flip angle will create a spin echo, albeit with a reduced amplitude. The animation in Figure 6 illustrates the echo formation in the rotating frame for one variation of the usual $\frac{\pi}{2}$-$\pi$ sequence.
Figure 6: Formation of a spin-echo in the rotating frame for $\gamma > 0$
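The refocussing argument can be sketched with a collection of isochromats: each accumulates phase (offset × $\tau$), a hard $\pi$ pulse about the nominal direction negates every phase, and a further period $\tau$ brings them all back to zero. The offset distribution and timing below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
offsets = rng.normal(0.0, 2 * np.pi * 50.0, 5000)   # rad/s, illustrative
tau = 0.02                                          # s, dephasing time

phase = offsets * tau                               # dephase for tau
M_dephased = abs(np.mean(np.exp(1j * phase)))       # signal has decayed away

phase = -phase                                      # hard pi pulse: phi -> -phi
phase += offsets * tau                              # precess for a further tau
M_echo = abs(np.mean(np.exp(1j * phase)))           # full echo amplitude
```

With these values the transverse signal falls essentially to the noise floor after the first period, then refocusses completely at the echo (relaxation is ignored here, so the echo amplitude is exactly one).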
The magnetisation refocussed by the $\pi$ pulse will continue to dephase, just as the initial excited magnetisation did. This can be refocussed by a further $\pi$ pulse, and so on, leading to a whole train of refocussing pulses. The amplitudes of the spin-echoes in the train decay only with $T_2$: we have removed the effect of the inhomogeneous magnetic field.
There are several variations on such a pulse train. One of the most used is the Carr-Purcell-Meiboom-Gill (cpmg) sequence, which is shown schematically in Figure 7. This has the advantage that it self-corrects for slight errors in the flip angle of the nominal $\pi$ pulses by shifting the phase of these relative to the excitation pulse.
Figure 7: Schematic pulse-timing diagram for one variant of the cpmg-refocussing train, indicating also the echo locations
nmr is inherently a quantum phenomenon. This is true not only on the smallest scales, in that spin has no classical analogue, but also on the macroscopic scale of the ferromagnets used to generate the background field. By the Bohr–Van Leeuwen theorem, the classical thermodynamic average of a bulk magnetisation must vanish, and so in a purely classical world permanent magnets, as well as diamagnets and paramagnets, cannot exist.^{4}
Although the semi-classical approach taken above is perfectly capable of explaining the important parts of simple single-isotope spin-echo experiments by tracking the bulk magnetisation, it fails to explain many other aspects of nmr. This is especially true in cases involving the coupling of spins of different types (as in the dept experiment), in which quantum coherences develop, which cannot be interpreted classically.
From the quantum perspective, we must work with operators and expectation values rather than classical quantities.
In quantum mechanics, smooth transformations of states $|\psi\rangle \to |\psi'\rangle$ are represented by unitary operators $U$, that is $U^{-1}=U^\dagger$. This is a consequence of the fact that transforming an arbitrary state should keep it normalised:^{5} $$\langle\psi'|\psi'\rangle=\langle\psi|\psi\rangle \implies \langle\psi|U^\dagger U|\psi\rangle=\langle\psi|\psi\rangle.$$ The normalised angular momentum operator $\mathbf{L}/\hbar$ can be defined as the generator of rotations, in that the unitary operator $U(\boldsymbol{\alpha})$ rotating an eigenstate $\left|\mathbf{x}\right>$ of the position operator $\mathbf{X}$ about an axis through the origin along $\boldsymbol{\alpha}$ by an angle $\alpha = |\boldsymbol{\alpha}|$ is $$U(\boldsymbol{\alpha}) = \exp\left(-\frac{\mathrm{i}\boldsymbol{\alpha}\cdot\mathbf{L}}{\hbar}\right),$$ or, for a very small rotation $\delta\boldsymbol{\alpha}$, $$U(\delta\boldsymbol{\alpha}) \approx 1-\frac{\mathrm{i}\delta\boldsymbol{\alpha}\cdot\mathbf{L}}{\hbar}.$$ The equivalent classical statement is that the rotation $\mathcal{R}(\boldsymbol{\alpha})$ rotates a position vector $\mathbf{x}$ under the usual Rodrigues' rotation formula $$\mathcal{R}(\boldsymbol{\alpha})\mathbf{x} = (\mathbf{x}\cdot\hat{\boldsymbol{\alpha}})\hat{\boldsymbol{\alpha}} + \cos\alpha[\mathbf{x}-(\mathbf{x}\cdot\hat{\boldsymbol{\alpha}})\hat{\boldsymbol{\alpha}}]+\sin\alpha (\hat{\boldsymbol{\alpha}}\times\mathbf{x}),$$ or, again for a very small rotation, $$\mathcal{R}(\delta\boldsymbol{\alpha})\mathbf{x} \approx \mathbf{x} + \delta\boldsymbol{\alpha}\times\mathbf{x}.$$ There should be a homomorphism between the quantum $U$ and the associated classical $\mathcal{R}$.
Concretely, by considering how the expectation of an operator should evolve when the state is transformed, we require any vector operator $\mathbf{V}$ (that is, loosely, any operator whose components transform under rotations in the same way that position does) to obey $$U^{-1}(\boldsymbol{\alpha})\,\mathbf{V}\,U(\boldsymbol{\alpha}) = \mathcal{R}(\boldsymbol{\alpha})\mathbf{V},$$ which, for a small rotation, yields the commutators $$\left[\frac{\mathrm{i}\boldsymbol{\delta\alpha}\cdot\mathbf{L}}{\hbar},\mathbf{V}\right]=\delta\boldsymbol{\alpha}\times\mathbf{V}.$$ Since $\delta\boldsymbol{\alpha}$ is arbitrary, this gives the commutators (in suffix notation) between the Cartesian components of the angular momentum and the vector operator as $$[L_i, V_j] = \mathrm{i}\hbar\varepsilon_{ijk}V_k.$$ The angular momentum is itself a vector operator, so the specific case of $\mathbf{V}=\mathbf{L}$ recovers the commutators for the angular momentum components with each other as $$[L_i, L_j] = \mathrm{i}\hbar\varepsilon_{ijk}L_k,\tag{4}$$ which are explicitly $$[L_x, L_y] = \mathrm{i}\hbar L_z,\qquad[L_y, L_z] = \mathrm{i}\hbar L_x,\qquad[L_z, L_x] = \mathrm{i}\hbar L_y.$$ The total angular momentum operator is $L^2 = L_x^2 + L_y^2 + L_z^2$. From the component commutators, we find that any component of the angular momentum commutes with $L^2$, that is $[L_i, L^2]=0$. This is important as it means that there exists a simultaneous eigenbasis of $L^2$ and any component of $\mathbf{L}$.^{6} Since $L^2$ and $\mathbf{L}$ are Hermitian, this eigenbasis must be orthogonal.
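These commutation relations can be verified numerically in the smallest non-trivial case, spin-$\frac{1}{2}$, where (taking $\hbar = 1$) the components are half the Pauli matrices:

```python
import numpy as np

# Pauli matrices; L_i = sigma_i / 2 for spin-1/2 with hbar = 1
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
Lx, Ly, Lz = sx / 2, sy / 2, sz / 2

def comm(a, b):
    """Commutator [a, b] of two matrices."""
    return a @ b - b @ a

L2 = Lx @ Lx + Ly @ Ly + Lz @ Lz    # total angular momentum (squared)
```

Here $L^2 = \frac{3}{4}\mathbb{1} = \ell(\ell+1)\mathbb{1}$ with $\ell = \frac{1}{2}$, and each component commutes with it, as required for the simultaneous eigenbasis.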
Let us (arbitrarily) choose to find an eigenbasis of $L_z$ and $L^2$. Denote this eigenbasis as $|\beta, \alpha_m\rangle$ where $m$ indexes the $L_z/\hbar$ eigenvalue $\alpha_m$ and $\beta$ is the $L^2/\hbar^2$ eigenvalue, that is $$L_z |\beta, \alpha_m\rangle = \hbar\alpha_m |\beta, \alpha_m\rangle, \qquad L^2|\beta, \alpha_m\rangle = \hbar^2\beta |\beta, \alpha_m\rangle. \tag{5}$$ Now that we have the commutators, we can develop the algebra of this simultaneous eigenbasis. Introduce the raising and lowering operators $L_\pm = L_x \pm \mathrm{i} L_y$. Their commutators with the operators themselves are $$[L_\pm, L^2] = 0, \qquad [L_z, L_\pm] = \pm \hbar L_\pm,$$ from which it follows that they affect the eigenstates like $$L_zL_\pm |\beta,\alpha_m\rangle = \hbar (\alpha_m \pm 1)L_\pm|\beta, \alpha_m\rangle, \qquad L^2L_\pm|\beta, \alpha_m\rangle = \hbar^2 \beta L_\pm|\beta, \alpha_m\rangle,$$ that is leaving the total angular momentum unchanged, but adding/subtracting multiples of $\hbar$ to the expectation value in the $z$-direction. The angular momentum is quantised, in that its expectation cannot vary continuously, but is only allowed to take certain discrete values. In terms of our indexing, this implies $L_\pm |\beta, \alpha_m\rangle = C_\pm |\beta, \alpha_{m \pm 1}\rangle$ for some constants $C_\pm$ with $\alpha_{m\pm 1} = \alpha_m \pm 1$.
We cannot keep adding/subtracting arbitrary amounts of angular momentum, since the $z$-component of the angular momentum is bounded by the total set by the relevant eigenvalue of $L^2$, and so there must be some maximum/minimum values of $m$, that is $m_\mathrm{min} \le m \le m_\mathrm{max}$, at which the eigenstates are annihilated: $$L_-|\beta, \alpha_{m_\mathrm{min}}\rangle = 0, \qquad L_+|\beta, \alpha_{m_\mathrm{max}}\rangle = 0.$$ Now consider the norm of these extremal states. For the upper case, after some algebra^{7} we have $$0 = \left|L_+|\beta, \alpha_{m_\mathrm{max}}\rangle\right|^2 = \hbar^2[\beta - \alpha_{m_\mathrm{max}}(\alpha_{m_\mathrm{max}}+1)],$$ and hence $\beta = \alpha_{m_\mathrm{max}}(\alpha_{m_\mathrm{max}}+1)$. For the lower case, a similar treatment gives $\beta = \alpha_{m_\mathrm{min}}(\alpha_{m_\mathrm{min}}-1)$. These two values of $\beta$ must agree, and additionally we need $\alpha_{m_\mathrm{min}} \le \alpha_{m_\mathrm{max}}$, which forces $\alpha_{m_\mathrm{min}} = -\alpha_{m_\mathrm{max}}$. These conditions are consistent if we take $m_\mathrm{max} = \ell$, $\beta = \ell(\ell+1)$ and $\alpha_m = m$, that is $$ m=-\ell,\ -(\ell-1),\ -(\ell-2),\ ...,\ \ell-2,\ \ell-1,\ \ell.$$ Now we can relabel the eigenbasis as $|\beta, \alpha_m\rangle \to |\ell, m\rangle$, in which case Equation (5) becomes $$ L_z|\ell, m\rangle = m \hbar |\ell, m\rangle, \qquad L^2 |\ell, m\rangle = \ell(\ell+1) \hbar^2 |\ell, m\rangle.$$ Note that $2\ell$ must be an integer, since there are $2\ell$ steps from $|\ell,\ell\rangle$ down to $|\ell, -\ell\rangle$, and so $\ell$ must be half-integer or integer. How do we know that there are no other eigenstates for a given $\ell$? If there were, then we could lower or raise them to a state with $|m| > \ell$, which from the consideration of the norm above would imply a larger $\ell$ than that of the state we started with.
But this is a contradiction, because we know that the raising/lowering operators do not change the value of $\ell$, and hence there can be no such states.
Finally, note also that by considering the norm of an arbitrary non-extremal eigenstate and requiring $\langle\ell,m|\ell,m\rangle=1$, we can obtain the constants $C_\pm$, giving the normalisation $$L_\pm|\ell, m\rangle = \hbar\sqrt{\ell(\ell+1)-m(m\pm 1)}|\ell, m \pm 1\rangle.\tag{6}$$
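Equation (6), together with the diagonal form of $L_z$, is enough to build explicit matrices for any spin. As a sketch (assuming $\hbar=1$ and a basis ordered $m=\ell,\ldots,-\ell$; the helper name is ours), we can construct them and check the commutators and the $L^2$ eigenvalue numerically:

```python
import numpy as np

def spin_matrices(l):
    """L_z, L_+, L_- in the |l, m> basis (m = l, l-1, ..., -l), with hbar = 1,
    built from the normalisation of Equation (6)."""
    m = np.arange(l, -l - 1, -1)
    Lz = np.diag(m)
    # <l, m+1| L_+ |l, m> = sqrt(l(l+1) - m(m+1)), placed on the superdiagonal
    Lp = np.diag(np.sqrt(l * (l + 1) - m[1:] * (m[1:] + 1)), k=1)
    return Lz, Lp, Lp.conj().T

l = 1.5                                  # e.g. sodium-23 is spin 3/2
Lz, Lp, Lm = spin_matrices(l)
Lx, Ly = (Lp + Lm) / 2, (Lp - Lm) / 2j   # Cartesian components from the ladder ops
L2 = Lx @ Lx + Ly @ Ly + Lz @ Lz
```

Numerically, $[L_x, L_y]=\mathrm{i}L_z$, $[L_z, L_+]=L_+$, and $L^2$ comes out as $\ell(\ell+1)$ times the identity, exactly as derived above.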
Above, we developed the algebra of the angular momentum operator $\mathbf{L}$ by considering it to be the generator of rotations and then examining the consequences for small rotations. However, we somewhat glossed over one aspect.
Suppose we rotate the eigenstate $|\ell, m\rangle$ by $2\pi$ around the $z$-axis. This gives, using the usual power-series interpretation of a function of an operator, the result $$\exp\left(-\frac{\mathrm{i}2\pi L_z}{\hbar}\right)|\ell, m\rangle = \exp(-\mathrm{i}2\pi m)|\ell,m\rangle.$$ A complete rotation should just be the identity, so we must have $\exp(-\mathrm{i}2\pi m)=1$, which forces $m$ to be an integer - crucially ruling out the half-integer values. This in turn means that $\ell$ must also be an integer, and not half-integer as the algebra alone permitted.
If a body has internal structure, such that it makes sense to talk about it having an orientation relative to its own axis, then rotation about a point can be decomposed into two parts: a rotation of the body's position around that point (an 'orbital' part), and a rotation of the body's orientation about its own axis (an 'intrinsic' part).
This suggests that we can decompose the angular momentum operator into a sum of two operators corresponding to these.
This would not appear to change the previous argument: although there may be separate generators for each of these types of rotations, which will have the same algebraic structure by virtue of the previous treatment, each must individually have integer $\ell$ only. And additionally, if the body is point-like, then the second type of angular momentum should vanish.
However, this turns out to not be the case. Investigations such as the Stern-Gerlach experiment, where only the second kind of angular momentum is considered, demonstrate experimentally that half-integer spin quantum numbers do exist! This leads to the notion of spin: some bodies, even those that do not seem to have internal structure, have intrinsic angular momentum that has no relation at all to spatial rotations, with associated quantum numbers that can take any of the values predicted by the algebraic treatment above. Mathematically, the half-integer/integer clash corresponds to our states transforming like the Lie groups $\text{SU}(2)$ and $\text{SO}(3)$ respectively, which are not equivalent. (Note however that the associated Lie algebras $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ are equivalent, in the sense that they are isomorphic.)
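The double-valuedness is easy to exhibit numerically: a full $2\pi$ turn about $z$ gives $-1$ on a half-integer spin but $+1$ on an integer spin. A minimal sketch (with $\hbar=1$; the function name is ours):

```python
import numpy as np

def full_turn_about_z(ms):
    """exp(-i 2 pi L_z / hbar) for a diagonal L_z with the given m values."""
    return np.diag(np.exp(-1j * 2 * np.pi * np.asarray(ms, dtype=float)))

U_half = full_turn_about_z([0.5, -0.5])   # spin-1/2: a full turn gives -identity
U_one = full_turn_about_z([1, 0, -1])     # spin-1: a full turn gives +identity
```

The extra sign for spin-$\frac{1}{2}$ is precisely the $\text{SU}(2)$ versus $\text{SO}(3)$ distinction made above.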
There are some unfortunate notational clashes: physicists tend to write $\mathbf{L}$ for the orbital angular momentum, $\mathbf{S}$ for the spin, and $\mathbf{J} = \mathbf{L} + \mathbf{S}$ for the overall angular momentum. In nmr, where the orbital angular momentum is usually irrelevant, it is customary to write $\mathbf{L}$ for the spin instead, with the eigenstates being $|\ell, m\rangle$. That is the convention used here from now on (and also why we started by using $\mathbf{L}$ in the algebra above, even though there we were in effect referring to the physicist's $\mathbf{J}$!).
We have now established that nuclei can have intrinsic angular momentum in the form of spin, even though they are not rotating or 'spinning' in any sense. But what determines the spin of a given nucleus - that is, what $\ell$ we are working with - and can this be predicted given no further information? As well as being interesting in its own right, this is a vital question for practical nmr, since nuclei with no spin ($\ell=0$) have no magnetic moment, and hence are 'nmr-invisible'. Additionally, nuclei with $\ell > \frac{1}{2}$ are quadrupolar, which complicates solid-state experiments by coupling the nucleus to the local electric field gradient.
Consulting a datasheet reveals an enormous variation in nuclear spins, even amongst common isotopes. Some examples are hydrogen (${}^1\mathrm{H}$) with spin $\frac{1}{2}$, nitrogen-14 (${}^{14}\mathrm{N}$) with spin 1, and sodium-23 (${}^{23}\mathrm{Na}$) with spin $\frac{3}{2}$. By far the most common natural isotope of uranium, namely uranium-238 (${}^{238}\mathrm{U}$), has spin 0. Another unfortunate spin-0 nucleus is that of carbon-12 (${}^{12}\mathrm{C}$), which of course underpins the organic chemistry that is so important in biological systems. Here we are saved somewhat by the much less abundant carbon-13 being spin active - note that isotopes of the same element need not have the same spin.
So where do we start? The key is that the nucleus consists of charged protons and neutral neutrons (collectively termed nucleons). These are all attracted to each other by the strong nuclear force, while the charged protons repel each other. These forces combine to give some complicated energetic potential that is difficult to describe analytically, but which can be usefully modelled as a series of energy levels called nuclear shells (and then nuclear subshells etc.), which are loosely analogous to the electron shells used to describe the electronic structure of entire atoms.
Now consider that protons and neutrons are both spin-$\frac{1}{2}$ particles. That makes them fermions, and so they must separately obey the Pauli exclusion principle, which states that it is impossible for two identical fermions to have the same set of quantum numbers. The relevance for us is that if we imagine building up a nucleus by inserting nucleons into the lowest energy shell that is not yet filled, we must sometimes alternate the spin eigenstate ($m=\pm\frac{1}{2}$, that is 'spin-up' or 'spin-down') to maintain exclusion. This can be applied separately for the protons and neutrons (a proton is certainly not identical to a neutron), and the overall ground-state spin is then determined by the number of unpaired spins remaining. If all the spins are paired, then we have a spin-0 nucleus.
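A minimal sketch of this pairing rule (taking as given the shell-model fact that even-even nuclei, those with even proton and even neutron counts, have all spins paired and hence spin-0 ground states; the isotope data are just the examples quoted elsewhere in this text):

```python
def even_even(protons, neutrons):
    """True if both nucleon counts are even, implying all spins pair off
    and the ground-state nuclear spin is 0."""
    return protons % 2 == 0 and neutrons % 2 == 0

# (proton count, neutron count) for the example isotopes discussed above
examples = {
    "1H": (1, 0),       # spin 1/2
    "12C": (6, 6),      # spin 0
    "14N": (7, 7),      # spin 1
    "238U": (92, 146),  # spin 0
}
spin_zero = {name for name, (p, n) in examples.items() if even_even(p, n)}
```

This correctly flags ${}^{12}\mathrm{C}$ and ${}^{238}\mathrm{U}$ as 'nmr-invisible'; predicting the non-zero spins of odd nuclei requires the fuller shell-model bookkeeping described above.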
Now we can proceed to quantitatively relate the nuclear spin to the nuclear magnetic moment. Formally, this requires the use of some representation theory, specifically the Wigner-Eckart theorem, which in one form states that $$\langle\ell, m|T^{(k)}_q|\ell', m'\rangle = \langle \ell',m';k,q|\ell,m\rangle\langle \ell||\mathbf{T}^{(k)}||\ell'\rangle$$ where $T^{(k)}_q$ are the $q$-indexed components of a spherical tensor operator $\mathbf{T}^{(k)}$ of rank $k$, $\langle \ell||\mathbf{T}^{(k)}||\ell'\rangle$ is some 'intrinsic' numerical value associated with the operator but independent of the 'directions' $m$, $m'$, and $q$, and $\langle \ell',m';k,q|\ell,m\rangle$ is a Clebsch-Gordan coefficient for the coupling of spin-$\ell'$ to spin-$k$ to give spin-$\ell$.
We'll now outline a few points to give a better understanding of this theorem and its origin. A spherical tensor operator can be loosely thought of as an operator that transforms in the same way that the angular momentum eigenstates do under rotation. Namely, the action of a rotation operator $U(\mathcal{R})$ on a spherical tensor operator is, by definition, $$U^{-1}(\mathcal{R})\, T^{(\ell)}_m\, U(\mathcal{R}) = \sum_{m'} \langle \ell, m'|U^{-1}(\mathcal{R})|\ell,m\rangle\, T^{(\ell)}_{m'} = \sum_{m'} {\mathcal{D}^{(\ell)*}_{mm'}}(\mathcal{R})\, T^{(\ell)}_{m'} = \sum_{m'}\mathcal{D}^{(\ell)}_{m'm}(-\mathcal{R})\, T^{(\ell)}_{m'}\,$$ which can be compared with the similar action on the eigenstates $$U(\mathcal{R})|\ell,m\rangle = \sum_{m'} \langle\ell, m'|U(\mathcal{R})|\ell, m\rangle\, |\ell, m'\rangle = \sum_{m'} {\mathcal{D}^{(\ell)}_{m'm}}(\mathcal{R})\, |\ell, m'\rangle,$$ where $\mathcal{D}^{(\ell)}_{ij}(\mathcal{R}) = \langle\ell, i|U(\mathcal{R})|\ell, j\rangle$ (at least up to a transpose - there are several conventions) are the components of the Wigner $\mathcal{D}$-matrix associated with the Cartesian rotation $\mathcal{R}$. The appearance of $-\mathcal{R}$ in the tensor case reflects the usual concept of operators rotating in the opposite sense to states, which is itself related to the more familiar difference between passive and active rotations.
By a similar process to the earlier algebra (expand for a small rotation etc.), this eventually leads to the following equivalent definition of spherical tensor operators in terms of their commutators with the raising/lowering operators: $$\left[L_\pm, T^{(\ell)}_m\right] = \hbar\sqrt{\ell(\ell+1)-m(m\pm 1)}\, T^{(\ell)}_{m\pm 1}.$$ Note the similarity with Equation (6). If we contract this with $|\ell, m\rangle$ on the left and $|\ell, m'\rangle$ on the right, we eventually get a recursion relation for the elements $\langle\ell, m|T^{(k)}_q|\ell', m'\rangle$ of the tensor operator. The connection with the addition of angular momentum is that if we apply the raising/lowering operators to the definition of the Clebsch-Gordan coefficients for the coupling of spin-$\ell'$ to spin-$k$ to give spin-$\ell$, which is $$|\ell, m\rangle = \sum_{\ell',m',k,q}\langle \ell',m';k,q|\ell,m\rangle\, |\ell', m'\rangle \otimes |k, q\rangle,$$ then we get recursion relations for the coefficients $\langle \ell',m';k,q|\ell,m\rangle$ with the same structure (after some fiddling with index labels) as the ones for the spherical tensor operators. This fixes the tensor elements and the coefficients up to a constant relative to each other, giving the theorem.
This theorem looks intimidating, but it has a natural 'physical' reading: the Clebsch-Gordan coefficient carries all of the dependence on the 'directions' $m$, $m'$ and $q$, while the reduced matrix element carries everything that is specific to the operator itself.
Suppose now we choose two arbitrary spherical tensor operators of the same rank $k$, say $\mathbf{T}^{(k)}$ and $\mathbf{A}^{(k)}$. The above means that for a particular component $q$, we have $$ \langle\psi|T^{(k)}_q|\psi\rangle = \frac{\langle \ell||\mathbf{T}^{(k)}||\ell\rangle}{\langle \ell||\mathbf{A}^{(k)}||\ell\rangle}\cdot \langle\psi|A^{(k)}_q|\psi\rangle,$$ or, since this holds for all the components $$ \langle\psi|\mathbf{T}^{(k)}|\psi\rangle = \frac{\langle \ell||\mathbf{T}^{(k)}||\ell\rangle}{\langle \ell||\mathbf{A}^{(k)}||\ell\rangle}\cdot \langle\psi|\mathbf{A}^{(k)}|\psi\rangle.$$ Specialise now to the case $k=1$, and let $\mathbf{U}$, $\mathbf{V}$ be the vector operators constructed from the tensors $\mathbf{T}^{(1)}$, $\mathbf{A}^{(1)}$. From the linearity between the components of these established above, we get $$ \langle\psi|\mathbf{U}|\psi\rangle = \frac{\langle \ell||\mathbf{T}^{(1)}||\ell\rangle}{\langle \ell||\mathbf{A}^{(1)}||\ell\rangle}\cdot \langle\psi|\mathbf{V}|\psi\rangle.$$ This is a very strong statement. It tells us that within a subspace of constant $\ell$, the expectation values of any two vector operators are proportional to one another. In quantum mechanics, this is about as far as it makes sense to go in considering two quantities being parallel to each other, in that this appears to be true (on average) for any measurement that we can make.^{8}
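This proportionality can be illustrated numerically. In the sketch below (our own construction, not from the text) we take two coupled spin-$\frac{1}{2}$s and restrict the single-particle operator $S_{1z}$ to the total-spin-1 (triplet) subspace, where it turns out to be exactly $J_z/2$, i.e. proportional to the total angular momentum:

```python
import numpy as np

# Spin-1/2 building blocks, hbar = 1
sz = np.diag([0.5, -0.5])
I2 = np.eye(2)
up, dn = np.array([1.0, 0.0]), np.array([0.0, 1.0])

S1z = np.kron(sz, I2)                     # z-spin of particle 1 alone
Jz = np.kron(sz, I2) + np.kron(I2, sz)    # total z-spin

# Orthonormal basis of the total-spin-1 (triplet) subspace of two spin-1/2s
triplet = np.column_stack([
    np.kron(up, up),
    (np.kron(up, dn) + np.kron(dn, up)) / np.sqrt(2),
    np.kron(dn, dn),
])

# Restricted to this constant-l subspace, the vector operator S1 is
# proportional to J, as the Wigner-Eckart argument predicts
S1z_restricted = triplet.T @ S1z @ triplet
Jz_restricted = triplet.T @ Jz @ triplet
```

Within the singlet subspace the proportionality also holds trivially (both restrict to zero); the constant of proportionality differs between the two subspaces, which is exactly the statement that it is a property of the subspace.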
Of course the nuclear spin $\mathbf{L}$ is also a vector operator, as is the nuclear magnetic moment $\boldsymbol{\mu}$. So we are free to write $\boldsymbol{\mu}= \gamma \mathbf{L} \propto \mathbf{L}$, where we now see that the gyromagnetic ratio $\gamma$ arises as an intrinsic property of the subspace (what ground-state spin does the nucleus have, amongst other things) and the operators.
Another commutation relationship obeyed by spherical tensor operators is $$\left[L_z, T_m^{(\ell)}\right] = m\, \hbar\, T_m^{(\ell)},$$ that is to say, they are eigenoperators of commutation with $L_z$.
More generally, this means that the operator $T_m^{(\ell)}$ represents a quantum coherence of order $m$. Intuitively, this is a statement about the phase acquired by the operator when it is rotated about the $z$-axis. Coherences of order $\pm 1$ are of special interest since they represent observable magnetisation, and one approach to the design of pulse sequences is to follow the transfer of coherence between spins, adjusting the phase of pulses accordingly.
Higher-order spherical tensors can be constructed as products of lower-order ones (or, more properly, as Kronecker products in the usual case of operators on different Hilbert spaces, that is on different spins), and more specifically as products of the $k=1$ operators $V_z$, $V_\pm$ referred to in the previous section. Even more specifically, we can use combinations of $L_z$ and $L_\pm$ on different spins. Writing spin terms in this way makes it easier to identify the coherence order: it is the difference between the number of raising and lowering operators.
This can be seen by first considering the following recursive commutation identity on a single spin $$\left[L_z, A L_\pm B\right] = A L_\pm \left[L_z, B\right]\ +\ \left[L_z, A\right]L_\pm B\ \pm\ \hbar\, A L_\pm B $$ where $A$, $B$ are arbitrary combinations of operators, and we also used the previously-derived $[L_z, L_\pm] = \pm \hbar L_\pm$. The last term shows that the $L_\pm$ contributes a coherence order of $\pm 1$, and reapplying the identity itself to the commutators in the first two terms until no raising or lowering operators remain gives the result. (In fact, this result is often more general than we need, since most spin interactions are 'linear' in the spin operators and so powers of e.g. $L_z$ do not appear with regularity when evolving the system. This excludes 'quadratic' interactions such as quadrupolar coupling.)
This then extends to multiple spins by noting that for two spins $L$, $S$, the commutator between the total system $z$-spin operator and some arbitrary operator $A \otimes B$ is $$\left[(L_z\otimes\mathbf{1}_S) + (\mathbf{1}_L \otimes S_z), A \otimes B\right] = [L_z, A] \otimes B + A \otimes [S_z, B],$$ and so we can just use the above result on each of the spins.
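The coherence-order counting can be checked directly: commuting the total $z$ operator with a product of raising/lowering operators should return the product scaled by the difference between the raising and lowering counts. A sketch for two spin-$\frac{1}{2}$s ($\hbar=1$; the helper function is ours):

```python
import numpy as np

# Spin-1/2 operators, hbar = 1
lz = np.diag([0.5, -0.5])
lp = np.array([[0.0, 1.0], [0.0, 0.0]])   # raising operator L_+
lm = lp.T                                  # lowering operator L_-
I2 = np.eye(2)

Zt = np.kron(lz, I2) + np.kron(I2, lz)    # total z operator for two spins L and S

def coherence_order(op):
    """Return p such that [Z_total, op] = p * op (the coherence order)."""
    comm = Zt @ op - op @ Zt
    i = np.unravel_index(np.argmax(np.abs(op)), op.shape)
    p = (comm[i] / op[i]).real
    assert np.allclose(comm, p * op)       # op really is an eigenoperator
    return p

order_LpSz = coherence_order(2 * np.kron(lp, lz))   # one raising op: order +1
order_LpSp = coherence_order(2 * np.kron(lp, lp))   # two raising ops: order +2
order_LpSm = coherence_order(2 * np.kron(lp, lm))   # one of each: order 0
```

The orders come out as $+1$, $+2$ and $0$ respectively: the difference between the number of raising and lowering operators, as stated.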
The evolution of a state $|\psi\rangle$ under a Hamiltonian $H$ is given by the Schrödinger equation $$\mathrm{i}\hbar\frac{\partial}{\partial t}|\psi\rangle = H|\psi\rangle,$$ which, for a time-independent Hamiltonian and an initial state $|\psi_0\rangle$ at $t=0$, has the solution $$|\psi(t)\rangle = U(t)|\psi_0\rangle = \exp\left(-\frac{\mathrm{i} H t}{\hbar}\right)|\psi_0\rangle,$$ where $U(t)$ is the unitary time-propagator.
Let us explicitly recover the Larmor precession we saw in the semi-classical approach for a spin-$\frac{1}{2}$ particle. Suppose we have just applied an excitation pulse such that the expectation of the spin is maximally aligned along the $x$-direction, that is we have an initial state $\left|\uparrow_x\right\rangle = \frac{1}{\sqrt{2}}(\left|\uparrow\right\rangle + \left|\downarrow\right\rangle)$ with $\left\langle\uparrow_x\middle|L_x\middle|\uparrow_x\right\rangle = \frac{1}{2}\hbar$. The spin then evolves under the Hamiltonian giving the coupling to the background magnetic field, namely $H = -\gamma B L_z=\omega L_z$.
It is convenient to continue using Heisenberg's matrix approach. The matrix elements of the spin operators with respect to the $L_z$ eigenstates give matrix mappings $$\frac{L_x}{\hbar} \to \begin{bmatrix}0 & \frac{1}{2} \\ \frac{1}{2} & 0\end{bmatrix},\qquad \frac{L_y}{\hbar} \to \begin{bmatrix}0 & -\frac{1}{2}\mathrm{i} \\ \frac{1}{2}\mathrm{i} & 0\end{bmatrix}, \qquad \frac{L_z}{\hbar} \to \begin{bmatrix}\frac{1}{2} & 0 \\ 0 & -\frac{1}{2}\end{bmatrix},$$ and from our initial state we can read off $\left\langle\uparrow \middle| \uparrow_x \right\rangle = \frac{1}{\sqrt{2}}$ and $\left\langle\downarrow \middle| \uparrow_x \right\rangle = \frac{1}{\sqrt{2}}$. So the state vector corresponding to the state $|\psi \rangle$ is $$\frac{1}{\sqrt{2}}\begin{bmatrix}\exp\left(-\frac{\mathrm{i}\omega t}{2}\right) & 0 \\ 0 & \exp\left(\frac{\mathrm{i}\omega t}{2}\right)\end{bmatrix}\begin{bmatrix}1 \\ 1\end{bmatrix}=\frac{1}{\sqrt{2}}\begin{bmatrix}\exp\left(-\frac{\mathrm{i}\omega t}{2}\right) \\ \exp\left(\frac{\mathrm{i}\omega t}{2}\right)\end{bmatrix}.$$ Then the expectation of $L_x$ on the state can be evaluated as $$\langle \psi | L_x | \psi \rangle =\frac{1}{\sqrt{2}}\begin{bmatrix}\exp\left(\frac{\mathrm{i}\omega t}{2}\right) & \exp\left(-\frac{\mathrm{i}\omega t}{2}\right)\end{bmatrix}\, \hbar \begin{bmatrix}0 & \frac{1}{2} \\ \frac{1}{2} & 0\end{bmatrix}\, \frac{1}{\sqrt{2}}\begin{bmatrix}\exp\left(-\frac{\mathrm{i}\omega t}{2}\right) \\ \exp\left(\frac{\mathrm{i}\omega t}{2}\right)\end{bmatrix} = \frac{1}{2}\hbar \cos\omega t,$$ while a similar calculation would give $\langle \psi | L_y | \psi \rangle = \frac{1}{2}\hbar \sin\omega t$. So the expectations of the spin components in the transverse plane oscillate with the Larmor frequency, and consequently so do those of the magnetic moment, just as we saw before.
For arbitrary spin, in fact we can derive the same result by noticing that the time propagator has the form of a rotation propagator, and being careful to remember that operators rotate in the opposite sense to states: $$\left\langle \psi \middle| L_x \middle| \psi \right\rangle = \langle \psi_0 | \exp\left(\frac{\mathrm{i}\omega L_z t}{\hbar}\right)L_x\exp\left(-\frac{\mathrm{i}\omega L_z t}{\hbar}\right)|\psi_0\rangle = \langle\psi_0|L_x|\psi_0\rangle \cos\omega t - \left\langle\psi_0\middle|L_y\middle|\psi_0\right\rangle \sin\omega t,$$ and similarly for the other transverse component. So the bit in the transverse plane at the start precesses as before.
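These expectation values can be reproduced numerically. The sketch below (with $\hbar=1$ and an illustrative $\omega=2\pi$, so the Larmor period is one time unit) propagates the initial state $\left|\uparrow_x\right\rangle$ under $H=\omega L_z$ and evaluates the transverse expectations:

```python
import numpy as np

hbar = 1.0
Lx = hbar * np.array([[0, 0.5], [0.5, 0]], dtype=complex)
Ly = hbar * np.array([[0, -0.5j], [0.5j, 0]], dtype=complex)

omega = 2 * np.pi                          # illustrative Larmor frequency
psi0 = np.array([1.0, 1.0]) / np.sqrt(2)   # |up_x> after the excitation pulse

def expectation(op, t):
    """<psi(t)| op |psi(t)> under H = omega * L_z (U(t) is diagonal here)."""
    U = np.diag(np.exp(-1j * omega * np.array([0.5, -0.5]) * t))
    psi = U @ psi0
    return (psi.conj() @ op @ psi).real
```

Sampling a few times confirms $\langle L_x\rangle = \frac{\hbar}{2}\cos\omega t$ and $\langle L_y\rangle = \frac{\hbar}{2}\sin\omega t$: at $t=0$ the spin points along $x$, a quarter period later along $y$, and half a period later along $-x$.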
The density operator associated with a pure state $|\psi\rangle$, namely $\rho = |\psi\rangle\langle\psi|$ (note that there is an implicit Kronecker product here - we will continue to drop this when it is unambiguous), can be substituted into the Schrödinger equation to show that it evolves under $$\frac{\partial\rho}{\partial t}=-\frac{\mathrm{i}}{\hbar}[H,\rho],$$ which is the Liouville-von Neumann equation. Note that since this equation is linear, a general density operator for a mixed state will also evolve like this.
For a time-independent Hamiltonian, the density operator inherits its solutions from that of the state as $$\rho = U(t)|\psi_0\rangle\langle\psi_0|U^{-1}(t) = \exp\left(-\frac{\mathrm{i} \hat{H} t}{\hbar}\right)\rho_0,$$ where $\hat{H}$ is a commutation superoperator acting as $\hat{H}\bullet = [H,\bullet]$.^{9}
This gives some nice properties in relation to nmr, particularly if we are working with a spin-$\frac{1}{2}$ like hydrogen (which, given its sensitivity, is often the target isotope in more complicated pulse sequences involving eg. polarisation transfer). If $A$, $B$, $C$ are operators with cyclic commutation relations, such as the spin operators, then starting with a density operator $A$ and evolving it under a Hamiltonian $\alpha B$ for a time $t$ gives (after some algebra - see for example the exposition at Product Operator Formalism) $$\rho(t) = \cos \alpha t\, A - \sin \alpha t\, C,\tag{7}$$ which can be viewed as a rotation in the abstract space with the operators along the Cartesian axes. More formally, expanding a little on some earlier comments on group structure, we are dealing with a representation of the Lie group $\text{SU}(2)$ whose Lie algebra $\mathfrak{su}(2)$ has a representation with our operators as elements. The elements are the generators of rotations in the abstract space.^{10}
The next trick relies on the fact that the expectation of a general operator $V$ on the (in general mixed) state with associated density operator $\rho$ is $\operatorname{tr}(V \rho)$. (This statement has to be modified slightly if $V$ is not Hermitian, which we'll ignore for now.) For a spin-$\frac{1}{2}$ the density operator for eg. the upper state can be written in terms of the identity and the spin operators as $\left|\uparrow\middle\rangle\middle\langle\uparrow\right| = \frac{1}{2}\mathbf{1}+\frac{1}{\hbar}L_z$. Since the spin operators are traceless, the bit proportional to the identity has no effect on any observable expectation values, and so we can associate states with operators: $\left|\uparrow\right\rangle$ with $L_z$, $\left|\downarrow\right\rangle$ with $-L_z$, $\left|\uparrow_x\right\rangle = \frac{1}{\sqrt{2}}(\left|\uparrow\right\rangle + \left|\downarrow\right\rangle)$ with $L_x$, and so on. Even more powerfully, the treatment extends to two or more spins, where we can associate Kronecker products of operators on each spin with different coherences that cannot be captured by the classical treatment.
This treatment constitutes a powerful tool for quickly determining the effect of pulse sequences by tracking the density operator while abstracting away the underlying quantum mechanics as rotations. The cyclic subspaces of operators, both those associated with observable magnetisation and those associated with non-observable coherences, can be tabulated in advance. Then 'all' we need to do is repeatedly apply Equation (7) starting at the initial state, changing the operator $B$ as the acting Hamiltonian changes (eg. when a pulse is applied), and then look at the coefficients of the bits corresponding to observable magnetisation.
As a quick illustrative example (albeit one where the formalism is not really necessary), consider following a spin with offset frequency $\omega$ during a $\left(\frac{\pi}{2}\right)_x$ — $\tau$ — $(\pi)_x$ — $\tau$ refocussing sequence. We start with the equilibrium magnetisation being classically aligned with the $z$-axis, so the operator associated with the initial state is some multiple of $L_z$. (See the next section for a more quantitative statement.) Writing the currently-active pulse or Hamiltonian (multiplied by the acting time, that is like $\alpha t$ in Equation (7)) above the arrows between operators, we can follow the evolution as $$ \begin{aligned} L_z \overset{\left(\frac{\pi}{2}\right)_x}{\longrightarrow} & \quad {-L_y}\\ \overset{\omega L_z \tau}{\longrightarrow} & \quad {-L_y\cos\omega \tau} + L_x\sin\omega \tau\\ \overset{(\pi)_x}{\longrightarrow} & \quad L_y\cos\omega \tau + L_x\sin\omega \tau\\ \overset{\omega L_z \tau}{\longrightarrow} & \quad (L_y\cos\omega \tau - L_x\sin\omega \tau)\cos\omega \tau + (L_x \cos\omega \tau + L_y \sin\omega \tau)\sin\omega \tau\\ & = L_y(\cos^2\omega \tau + \sin^2\omega \tau) + L_x (-\sin\omega \tau \cos\omega \tau + \cos\omega \tau \sin\omega \tau)\\ & = L_y. \end{aligned} $$ So indeed the magnetisation gets completely refocussed regardless of the offset, as we saw with the semi-classical treatment too.
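This evolution is easy to verify numerically by conjugating the density operator with the relevant propagators. A sketch for a single spin-$\frac{1}{2}$ (the helper `evolve` is ours; the offset and delay are arbitrary, which is the point of the refocussing):

```python
import numpy as np

# Spin-1/2 operator matrices, hbar = 1
Lx = np.array([[0, 0.5], [0.5, 0]], dtype=complex)
Ly = np.array([[0, -0.5j], [0.5j, 0]], dtype=complex)
Lz = np.array([[0.5, 0], [0, -0.5]], dtype=complex)

def evolve(rho, generator, angle):
    """rho -> U rho U^dagger with U = exp(-i * angle * generator)."""
    vals, vecs = np.linalg.eigh(generator)
    U = vecs @ np.diag(np.exp(-1j * angle * vals)) @ vecs.conj().T
    return U @ rho @ U.conj().T

omega, tau = 0.7, 1.3                  # arbitrary offset frequency and delay
rho = Lz.copy()                        # equilibrium magnetisation along z
rho = evolve(rho, Lx, np.pi / 2)       # (pi/2)_x pulse: -> -Ly
rho = evolve(rho, Lz, omega * tau)     # free precession for tau
rho = evolve(rho, Lx, np.pi)           # (pi)_x refocussing pulse
rho = evolve(rho, Lz, omega * tau)     # free precession for tau: -> Ly
```

However $\omega$ and $\tau$ are varied, the final density operator comes out as $L_y$, matching the product-operator calculation line by line.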
Spin-echo sequences can also refocus heteronuclear $J$-coupling in addition to chemical shifts. Intuitively this makes sense - the coupling to the 'passive' spin shifts the resonant frequencies of the 'active' spins slightly, which effectively manifests as a chemical shift. The passive spins are not affected by any pulses since they are so far off the resonance of the active spin, and hence have no other effect on the dynamics.
Using the product operator formalism, we can show this more explicitly in the case of an active spin-$\frac{1}{2}$ $L$ coupled to a passive spin-$\frac{1}{2}$ $S$ (for example hydrogen-1 coupled to carbon-13 via a $\text{C-H}$ bond). Keeping only the weak $J$-coupling (what exactly constitutes 'weak' can be made more formal), this gives a Hamiltonian $H=\pi J \ 2L_z S_z$. Consider the sequence $\left(\frac{\pi}{2}\right)_y$ — $\tau$ — $\left(\pi\right)_x$ — $\tau$, which gives $$ \begin{aligned} L_z \overset{\left(\frac{\pi}{2}\right)_y}{\longrightarrow} & \quad L_x\\ \overset{\tau}{\longrightarrow} & \quad L_x\cos\theta + 2L_yS_z\sin\theta\\ \overset{\left(\pi\right)_x}{\longrightarrow} & \quad L_x\cos\theta - 2L_yS_z\sin\theta\\ \overset{\tau}{\longrightarrow} & \quad \cos\theta\,(L_x\cos\theta + 2L_yS_z\sin\theta) + \sin\theta\,\left(-2L_yS_z\cos\theta + L_x\sin\theta\right)\\ & = L_x\,(\underbrace{\cos^2\theta + \sin^2\theta}_1) + 2L_yS_z\,(\underbrace{\cos\theta\sin\theta - \sin\theta\cos\theta}_0)\\ & = L_x \end{aligned} $$ where $\theta = \pi J \tau$. Note that, playing a bit loose with units, we used the commutator $$ \begin{aligned} \left[L_x, 2 L_z S_z\right] & = 2[L_x, L_z]S_z\\ &= -\text{i}\, 2 L_y S_z \end{aligned} $$ and its next cyclic rotation $$ \begin{aligned} \left[2L_zS_z, -2L_yS_z\right] &= -4[L_z, L_y]S_z^2\\ &= \text{i}\, L_x \end{aligned} $$ where we used $S_z^2 = \frac{1}{4}\mathbf{1}_S$, which follows from the Cayley-Hamilton theorem and the eigenvalues of $S_z$ being $\pm\frac{1}{2}$ (in units with $\hbar=1$).
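The same numerical check works for the heteronuclear case, now with $4\times 4$ Kronecker-product operators. A sketch (illustrative $J$ and $\tau$; note the pulses act on the active spin only, mirroring the assumption that the passive spin is far off resonance):

```python
import numpy as np

# Single spin-1/2 matrices, hbar = 1
lx = np.array([[0, 0.5], [0.5, 0]], dtype=complex)
ly = np.array([[0, -0.5j], [0.5j, 0]], dtype=complex)
lz = np.array([[0.5, 0], [0, -0.5]], dtype=complex)
I2 = np.eye(2)

# Two-spin operators: L acts on the active spin, S on the passive one
Lx, Ly, Lz = np.kron(lx, I2), np.kron(ly, I2), np.kron(lz, I2)
Sz = np.kron(I2, lz)

def evolve(rho, generator, angle):
    """rho -> U rho U^dagger with U = exp(-i * angle * generator)."""
    vals, vecs = np.linalg.eigh(generator)
    U = vecs @ np.diag(np.exp(-1j * angle * vals)) @ vecs.conj().T
    return U @ rho @ U.conj().T

J, tau = 140.0, 0.004                 # illustrative coupling (Hz) and delay (s)
H = np.pi * J * 2 * (Lz @ Sz)         # weak J-coupling Hamiltonian from the text

rho = Lz.copy()
rho = evolve(rho, Ly, np.pi / 2)      # (pi/2)_y pulse on the active spin: -> Lx
rho = evolve(rho, H, tau)             # evolve under the coupling for tau
rho = evolve(rho, Lx, np.pi)          # (pi)_x pulse on the active spin only
rho = evolve(rho, H, tau)             # second tau: the coupling is refocussed
```

The chosen $\theta = \pi J \tau$ is deliberately not a special angle, yet the final density operator is $L_x$ regardless, as the product-operator algebra predicts.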
Armed with the quantisation of the magnetic moment given by the quantum treatment, let us make the statement in Equation (2) more quantitative by applying some classical thermodynamics to the density operator treatment. Writing the background magnetic field as $\mathbf{B}=B\mathbf{e}_z$, consider the behaviour of a single spin-$\ell$ nucleus in thermal equilibrium with its surroundings at temperature $T$. When in the pure state $|\ell,m\rangle$, the quantum expectation of the nucleus' magnetic moment in the $z$-direction is $\mu_z = \gamma m \hbar$, and so the quantum expectation of its potential energy is $E=-\gamma m \hbar B$. Considering the spin as part of a much larger thermodynamic ensemble, the probability $p_{\ell,m}$ of it being in a particular pure state is classically determined by the canonical ensemble, and the mixed state we are working with has density operator $$\rho = \sum_{m=-\ell}^\ell p_{\ell,m}|\ell,m\rangle\langle\ell,m|.$$ Hence the mean magnetic moment of the spin in equilibrium is given by $$\bar{\mu}_z = \sum_{m=-\ell}^\ell p_{\ell,m}\cdot \mu_z(m) = \frac{\sum_{m=-\ell}^\ell \exp\left(\frac{\gamma m \hbar B}{k T}\right)\gamma m \hbar}{\sum_{m=-\ell}^\ell \exp\left(\frac{\gamma m \hbar B}{k T}\right)}.$$ In nmr, typically the nuclei's potential energies are much smaller than their thermal energies: $|E| \ll kT$. This allows us to Taylor expand the exponentials in the sum. To first non-trivial order, some algebraic manipulation^{11} gives a result $$\bar{\mu}_z \approx \frac{\gamma^2 \hbar^2 B}{3 k T}\, \ell(\ell+1) \propto \gamma^2 B.$$ The key takeaway here is the dependence of the equilibrium magnetisation on the two things we have some control over: the gyromagnetic ratio (changes depending on the target isotope) and the magnetic field.
Note that we could also have computed the equilibrium operator itself by a similar process, which would have given to lowest non-trivial order^{12} $$\rho \approx \frac{1}{2 \ell +1}\mathbf{1} + \frac{\gamma B}{kT(2 \ell +1)}L_z.$$ This is consistent with the above if we then compute (noting that $L_z$ is traceless) $$\bar{\mu}_z = \operatorname{tr}(\gamma L_z \rho)=\frac{\gamma^2 B}{k T (2\ell+1)}\operatorname{tr}L_z^2.$$
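The high-temperature expansion can be sanity-checked by summing the Boltzmann-weighted moments exactly and comparing against the first-order formula. A sketch with hydrogen-like numbers (the constants are approximate SI values; the function names are ours):

```python
import math

hbar, k = 1.055e-34, 1.381e-23   # approximate SI values

def mean_mu_z(l, gamma, B, T):
    """Exact thermal average of mu_z over the 2l+1 Zeeman levels."""
    ms = [-l + i for i in range(int(2 * l) + 1)]
    weights = [math.exp(gamma * m * hbar * B / (k * T)) for m in ms]
    moments = [gamma * m * hbar for m in ms]
    return sum(w * mu for w, mu in zip(weights, moments)) / sum(weights)

def curie_approx(l, gamma, B, T):
    """First non-trivial order (high-temperature) result from the text."""
    return gamma**2 * hbar**2 * B * l * (l + 1) / (3 * k * T)

# 1H-like numbers: gamma ~ 2.675e8 rad s^-1 T^-1, 1 T, room temperature
exact = mean_mu_z(0.5, 2.675e8, 1.0, 300.0)
approx = curie_approx(0.5, 2.675e8, 1.0, 300.0)
```

At room temperature and 1 T the Zeeman energy is smaller than $kT$ by around five orders of magnitude, so the two agree to well beyond any experimental relevance.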
How large is the signal produced by such an equilibrium magnetisation in the pickup coil if we rotate it fully into the transverse plane? Faraday's law means that we pick up an extra factor of the Larmor frequency, and so the signal $S\propto |\gamma|^3 B^2$. As for the noise, the dominant contribution is from Johnson noise in the resonant circuit at the pickup frequency. The root-mean-square noise amplitude goes as the square root of the frequency bandwidth, which is itself, to a good approximation, proportional to the pickup frequency. Hence we expect a noise $N \propto \sqrt{|\gamma| B}$. Therefore, the signal-noise ratio should scale as $$\frac{S}{N}\propto\frac{|\gamma|^3B^2}{\sqrt{|\gamma|B}} = |\gamma|^\frac{5}{2}B^\frac{3}{2}.$$ This shows why 'traditional' nmr strives to maximise the field strength. In our case, where the field is effectively fixed by design considerations, the important observation is that $S/N$ scales quite strongly with the gyromagnetic ratio. This is what is meant when we talk about how 'sensitive' an isotope is: the sensitivity grows with absolute gyromagnetic ratio.
Note that the above has said nothing about how much of the isotope there is intrinsically present in the sample being measured, but compares only the overall sensitivity of different spins. That is to say that a larger sensitivity does not necessarily mean a greater absolute signal. In particular, a very sensitive isotope may have a very low natural abundance, in which case the obtained signal may typically be much smaller than that for a more abundant but less sensitive one.
The above treatment is applicable only for completely homogeneous magnetic fields, in that it only considers one value of the magnetic field strength. But it is intuitive that the signal quality should also depend on the homogeneity of the magnet. In the case of spin-echo experiments, each data point is effectively some weighted average across an echo, and the echo width is a function of the inhomogeneity. The more inhomogeneous the magnet, the narrower the echo.
As an illustrative example, suppose that we can approximate the echo shape in the time domain as a Gaussian of width $\sigma_\text{inhom}$. Each of the points $x_i$ making up this echo has normally-distributed radio noise of standard deviation $\sigma_\text{rad}$, which, to a good first approximation, is a constant set by the Johnson noise (see previous section) at the nominal pickup frequency. To extract an amplitude from the echo, suppose we multiply by some Gaussian weights $w_i$, corresponding to a window with width $\sigma_\text{w}$. The window is truncated such that it extends to $\pm \sigma_w$ and is normalised to give total weight unity. Then the amplitude $X$ we calculate can be expressed as $$X = \sum_i w_i\, x_i.$$ The noise on this amplitude is relatively easy to calculate. Assuming that neighbouring points in the echo are uncorrelated (in the sense of the noise on them, not their nominal values), then we have $$\operatorname{var}X = \sigma_\text{rad}^2\sum_i w_i^2 \approx \int_{-\sigma_\text{w}}^{+\sigma_\text{w}} w^2(x)\, \text{d}x \propto \frac{1}{\sigma_\text{w}},$$ where we approximate the discrete sum with an integral over continuous weights $w(x) \propto \frac{1}{\sigma_\text{w}}\exp\left(-\frac{x^2}{2 \sigma_\text{w}^2}\right)$. Hence we approximately have an echo-amplitude noise $N \propto \frac{1}{\sqrt{\sigma_\text{w}}}$. Note that this intuitively makes sense: increasing the window width effectively increases the number of points $n$ used to calculate the amplitude in proportion to the width, and a simple average over $n$ points reduces the error on the average by a factor of $\frac{1}{\sqrt{n}}$.
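The scaling of the noise with window width can be checked with a quick discrete sum (the sampling interval `dt` is an arbitrary illustration):

```python
import numpy as np

dt = 0.001                      # fixed sampling interval of the digitised echo

def sum_sq_weights(sigma_w):
    """Sum of squared weights for a Gaussian window truncated at +/- sigma_w
    and normalised to total weight one."""
    x = np.arange(-sigma_w, sigma_w + dt / 2, dt)
    w = np.exp(-x**2 / (2 * sigma_w**2))
    w /= w.sum()
    return (w**2).sum()

# var X is proportional to the sum of w_i^2, which should scale like 1/sigma_w:
# doubling the window width should halve the sum
ratio = sum_sq_weights(1.0) / sum_sq_weights(2.0)
```

The ratio comes out very close to 2, confirming $\operatorname{var}X \propto 1/\sigma_\text{w}$ for a fixed sampling rate.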
Now consider the signal amplitude, which is slightly more fiddly to calculate. We are now interested in $$\bar{X} = \sum_i w_i \bar{x}_i \approx \int_{-\sigma_\text{w}}^{+\sigma_\text{w}}w(y)\,\bar{x}(y)\, \text{d}y,$$ where again we have passed to a continuous approximation, and now $\bar{x}(y) \propto \exp\left(-\frac{y^2}{2 \sigma_\text{inhom}^2}\right)$. Note that there is no reciprocal factor of $\sigma_\text{inhom}$ in the expression for $\bar{x}(y)$: the 'height' of the echo should remain the same regardless of its width.
To evaluate this integral, consider what happens when we multiply two zero-centred Gaussian profiles with widths $\sigma_1$ and $\sigma_2$. The functional parts combine like $$\exp\left(-\frac{x^2}{2 \sigma_1^2}\right)\exp\left(-\frac{x^2}{2 \sigma_2^2}\right) = \exp\left[-\frac{x^2}{2}\left(\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}\right)\right],$$ which is also the shape of a Gaussian, but with a new width $\sigma'$ given by $\sigma' = \left(\frac{1}{\sigma_1^2} + \frac{1}{\sigma_2^2}\right)^{-\frac{1}{2}}$. Using this fact, and making sure to normalise properly, we find the signal amplitude becomes approximately $$S \propto \bar{X} \propto \frac{\sigma_\text{inhom}}{\sqrt{\sigma_\text{w}^2 + \sigma_\text{inhom}^2}}\operatorname{erf}\left[\frac{1}{\sqrt{2}}\sqrt{1+\frac{\sigma_\text{w}^2}{\sigma_\text{inhom}^2}}\right],$$ and so the signal-to-noise ratio scales approximately as $$\frac{S}{N} \propto \frac{\sigma_\text{inhom}\sqrt{\sigma_\text{w}}}{\sqrt{\sigma_\text{w}^2 + \sigma_\text{inhom}^2}}\operatorname{erf}\left[\frac{1}{\sqrt{2}}\sqrt{1+\frac{\sigma_\text{w}^2}{\sigma_\text{inhom}^2}}\right].$$ In the limit of good homogeneity (i.e. large $\sigma_\text{inhom}$), the signal contribution varies little with respect to $\sigma_\text{w}$ (the echo is very flat), but then drops off 'sharply' above some critical $\sigma_\text{w}$. This critical value gets larger as $\sigma_\text{inhom}$ increases. From above, we also know that the noise contribution is a decreasing function of $\sigma_\text{w}$. Putting this together, we see that as the homogeneity improves, the signal-to-noise ratio can also be improved by increasing the window width to follow the critical value.
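These scaling arguments are easy to sanity-check numerically. The sketch below (Python with NumPy; the function names are mine, and all overall constants are dropped) compares the closed-form $S/N$ above against a direct discrete evaluation of the windowed sum, and checks that the optimal window width scales linearly with $\sigma_\text{inhom}$:

```python
import math

import numpy as np

def snr_closed_form(sigma_w, sigma_inhom):
    """The approximate S/N derived above, up to an overall constant."""
    arg = math.sqrt((1.0 + (sigma_w / sigma_inhom) ** 2) / 2.0)
    return (sigma_inhom * math.sqrt(sigma_w)
            / math.sqrt(sigma_w**2 + sigma_inhom**2) * math.erf(arg))

def snr_discrete(sigma_w, sigma_inhom, dy=1e-3):
    """Direct version: truncated Gaussian weights, unit-height Gaussian echo,
    uncorrelated unit noise on each sample (fixed sample spacing dy)."""
    y = np.arange(-sigma_w, sigma_w + dy, dy)
    w = np.exp(-y**2 / (2.0 * sigma_w**2))
    w /= w.sum()                                    # total weight unity
    signal = np.sum(w * np.exp(-y**2 / (2.0 * sigma_inhom**2)))
    return signal / np.sqrt(np.sum(w**2))           # noise = sqrt(sum w_i^2)

# The two agree up to a constant factor across quite different regimes...
r = [snr_discrete(sw, si) / snr_closed_form(sw, si)
     for sw, si in [(0.5, 1.0), (2.0, 1.0), (1.0, 3.0)]]
assert max(r) / min(r) < 1.01

# ...and the optimal window width tracks the inhomogeneity width linearly.
grid = np.linspace(0.05, 20.0, 4000)
opt = [grid[np.argmax([snr_closed_form(s, si) for s in grid])]
       for si in (1.0, 2.0)]
assert abs(opt[1] / opt[0] - 2.0) < 0.05
```

The linear scaling of the optimum follows directly from rewriting the closed form as $S/N = \sqrt{\sigma_\text{inhom}}\, g(\sigma_\text{w}/\sigma_\text{inhom})$, so the numerical check is exact up to grid resolution.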
^{1}Classically, this can somewhat loosely be thought of as being related to the magnetic moment generated by a spinning spherical charge distribution being parallel to its angular velocity. This should not be taken literally: quantum spin, which as discussed later is the actual origin of the nuclear magnetic moment, has nothing to do with physical rotations of the nucleus! ←
^{2}In certain rare cases, such as very antisymmetric molecules whose structure gives the locally fluctuating field a significant component perpendicular to the main magnetic field, $T_2$ may be longer than $T_1$. However, there is an upper bound. We cannot create more magnetisation than there was at equilibrium; that is, we must always have $\sqrt{M_z^2 + M_{xy}^2} \le M_0$. The extremal case is at $T_2 = 2 T_1$. To see compactly that this is the critical value, consider the 'worst case', which is intuitively when we start by rotating all the equilibrium magnetisation into the $x$-$y$ plane. The subsequent evolution (in units of the equilibrium magnetisation) is, appealing to our solutions of the Bloch equations, $M_{xy} = \exp(-t/T_2)$ and $M_z = 1-\exp(-t/T_1)$. Put $\alpha = T_2/T_1$ with $\alpha > 0$ and $x = \exp(-t/T_1)$, which has $0 < x < 1$. Then the square magnitude of the magnetisation evolves like $$ M^2(t, T_1, T_2) = f(x, \alpha) = \left[\exp\left(-\frac{t}{T_1}\right)\right]^{\frac{2T_1}{T_2}} + \left[1-\exp\left(-\frac{t}{T_1}\right)\right]^2 = x^\frac{2}{\alpha} + (1-x)^2, $$ and the condition for non-physicality is $f > 1$, which is equivalently $x^{\frac{2}{\alpha}-1} > 2-x$. When the exponent on the lhs is non-negative, corresponding to $\alpha \le 2$, that side never gets larger than $1$, and the inequality is never satisfied. For negative exponent, that is $\alpha > 2$, that side goes to infinity as $x\to 0$, and the inequality is certainly satisfied. So $\alpha=2$ must be the critical value.
As a final aside, note that all of this applies only for 'isolated' spin pools that do not exchange polarisation with other spin pools via processes such as chemical exchange. If such exchange occurs, the system we have considered is no longer closed, and all bets are off regarding restrictions on the relaxation times for a single pool. ←
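The claim that $\alpha = T_2/T_1 = 2$ is the borderline can be verified numerically by evaluating $f(x, \alpha) = x^{2/\alpha} + (1-x)^2$ on a fine grid (a Python sketch; the names are mine):

```python
import numpy as np

def mag_sq(x, alpha):
    """|M|^2 (in units of M0^2) after tipping all equilibrium magnetisation
    into the x-y plane: x = exp(-t/T1) in (0, 1), alpha = T2/T1."""
    return x ** (2.0 / alpha) + (1.0 - x) ** 2

x = np.linspace(1e-6, 1.0 - 1e-6, 200001)
assert mag_sq(x, 0.5).max() <= 1.0 + 1e-12   # T2 < 2 T1: always physical
assert mag_sq(x, 2.0).max() <= 1.0 + 1e-12   # critical ratio: still physical
assert mag_sq(x, 2.5).max() > 1.0 + 1e-3     # T2 > 2 T1: |M| exceeds M0
```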
^{3}This is only really true with some caveats. By imposing that the relaxation is an exponential decay in the first place, we are saying that the relaxation profile in frequency-space is Lorentzian, because those two functions are a one-sided Fourier transform pair (at least with respect to their real parts). To deal with the extra artificial relaxation due to inhomogeneity, the magnet profile should be convolved with the relaxation profile in frequency-space. The inverse Fourier transform of a convolution of two functions is, up to a constant factor, the product of the inverse Fourier transforms of the individual functions. So in adding the inhomogeneity by multiplying with an extra exponential decay in the time domain, we are also restricting the inhomogeneity profile to be Lorentzian. ←
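The transform-pair claim is quick to verify numerically: the real part of the one-sided Fourier transform of $\exp(-t/T_2)$ is the Lorentzian $\frac{1/T_2}{(1/T_2)^2 + \omega^2}$. A Python sketch (parameter values arbitrary):

```python
import numpy as np

# Re \int_0^inf exp(-t/T2) exp(-i w t) dt = (1/T2) / ((1/T2)^2 + w^2)
T2, omega = 1.5, 3.0
t = np.linspace(0.0, 50.0 * T2, 200001)      # integrate far into the decay
dt = t[1] - t[0]
vals = np.exp(-t / T2) * np.cos(omega * t)
re_ft = dt * (vals.sum() - 0.5 * (vals[0] + vals[-1]))   # trapezoidal rule
lorentzian = (1.0 / T2) / ((1.0 / T2) ** 2 + omega ** 2)
assert abs(re_ft - lorentzian) < 1e-5
```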
^{4}The main argument runs as follows for a non-rotating system. The magnetic part of the Lorentz force is perpendicular to the velocity, so it does no work. Hence the system's energy cannot depend on the applied magnetic field and, appealing to the canonical distribution, neither can the equilibrium distribution of the particles' motions. In zero applied field, there can be no net moment, because the system has no net rotation. But the motions do not depend on the applied field, so there can be no net moment even in a non-zero field. Hence the system cannot exhibit magnetic properties if it is purely classical. ←
^{5}There is a bit more work to be done from here before concluding that $U^\dagger U$ is the identity. Loosely, we need to show that the operator has the same matrix elements between any two states: $$\langle\psi|U^\dagger U|\chi\rangle = \langle\psi|\chi\rangle \quad \text{for all } |\psi\rangle, |\chi\rangle.$$ This is closely related to saying that if operators $A$ and $B$ are representable by finite matrices, then $A=B$ if and only if they have the same matrix elements. ←
^{6}This is essentially the same idea as commuting matrices sharing the same set of eigenvectors. Indeed, taking the trace of the commutation relations in Equation (4) gives $\operatorname{tr}L_i = 0$ (the trace of any commutator vanishes), so the angular momentum operators can be represented by finite-dimensional traceless matrices. The Pauli matrices $\sigma_x$, $\sigma_y$, $\sigma_z$ for the $\ell=\frac{1}{2}$ case are probably the most familiar of these. ←
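For the $\ell = \frac{1}{2}$ case this can be checked directly (a NumPy sketch in units with $\hbar = 1$, so $L_i = \frac{1}{2}\sigma_i$):

```python
import numpy as np

# Pauli matrices and the spin-1/2 angular momentum operators (hbar = 1)
sigma_x = np.array([[0, 1], [1, 0]], dtype=complex)
sigma_y = np.array([[0, -1j], [1j, 0]], dtype=complex)
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)
Lx, Ly, Lz = sigma_x / 2, sigma_y / 2, sigma_z / 2

# Tracelessness, and the commutation relation [Lx, Ly] = i Lz
for L in (Lx, Ly, Lz):
    assert abs(np.trace(L)) < 1e-12
assert np.allclose(Lx @ Ly - Ly @ Lx, 1j * Lz)
```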
^{7}Making use of the operator equalities $L_\pm^\dagger = L_\mp$ and $L_-L_+ = L^2 - L_z(L_z + \hbar)$. ←
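These identities are easy to verify numerically in, say, the spin-1 representation (a NumPy sketch in units with $\hbar = 1$):

```python
import numpy as np

# Spin-1 ladder operators in the |1, m> basis, ordered m = 1, 0, -1 (hbar = 1)
s2 = np.sqrt(2.0)
Lp = np.array([[0, s2, 0], [0, 0, s2], [0, 0, 0]], dtype=complex)
Lm = Lp.conj().T                                  # L- = L+^dagger
Lx, Ly = 0.5 * (Lp + Lm), (Lp - Lm) / 2j
Lz = np.diag([1.0, 0.0, -1.0]).astype(complex)

L2 = Lx @ Lx + Ly @ Ly + Lz @ Lz                  # L^2 = l(l+1) = 2 on this sector
assert np.allclose(L2, 2 * np.eye(3))
assert np.allclose(Lm @ Lp, L2 - Lz @ (Lz + np.eye(3)))   # L-L+ = L^2 - Lz(Lz + hbar)
```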
^{8}At first glance, this might seem strange: what is the point of measuring anything if all the information about the system is tied up in the angular momentum anyway? And classical systems can certainly have non-parallel vector quantities, which means that we should expect quantum systems to have something corresponding to this. The key point is that the result only follows because we are working in a subspace with definite $\ell$, that is a definite total spin. Extended systems that we are used to dealing with - say a basketball - tend to have very well defined orientations, so the generalised uncertainty principle says that there must be a large uncertainty in their angular momenta, which kicks us out of the subspace. ←
^{9}The concept of a superoperator might be unfamiliar. An operator maps states in a Hilbert space to other states in the same space. A superoperator is the 'next level up' - it maps operators on that Hilbert space to other operators on the same space. ←
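As a concrete illustration (not from the main text): the commutator map $\rho \mapsto [H, \rho]$ is a superoperator, and once operators are flattened into vectors it becomes an ordinary matrix built from Kronecker products. A NumPy sketch using the row-major flattening convention, with an arbitrary $2\times 2$ example:

```python
import numpy as np

def commutator_superop(H):
    """Matrix of the superoperator rho -> [H, rho] acting on row-major
    flattened operators, using vec(A X B) = (A kron B^T) vec(X)."""
    n = H.shape[0]
    I = np.eye(n)
    return np.kron(H, I) - np.kron(I, H.T)

# Arbitrary Hermitian H and density-matrix-like rho, purely for illustration
H = np.array([[1.0, 0.3], [0.3, -1.0]], dtype=complex)
rho = np.array([[0.7, 0.2 - 0.1j], [0.2 + 0.1j, 0.3]])

lhs = (commutator_superop(H) @ rho.flatten()).reshape(2, 2)
assert np.allclose(lhs, H @ rho - rho @ H)
```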
^{10} The operators $A$, $B$, $C$ are orthogonal in the sense that $\operatorname{tr}(AB) = \operatorname{tr}(BC) = \operatorname{tr}(CA) = 0$, and so they span the three-dimensional abstract subspace. To see this, first consider the case in which the operators are just the spin operators. Consider now the unitary operator $U = \exp\left[+\frac{2\pi\text{i}}{3\sqrt{3}\hbar}\left(L_x+L_y+L_z\right)\right]$, which can be recognised from the previous exposition as that rotating a state by a $\frac{1}{3}$-turn about the direction $(1,1,1)$ in the negative sense. Remembering that operators rotate in the opposite sense to states, this gives a unitary transformation that cyclically permutes the spin operators in the sense that $U^\dagger\, L_x\, U = L_y$ and so on. Traces are invariant under unitary transformations, so this establishes that $\operatorname{tr}(L_x\,L_y) = \operatorname{tr}(L_y\,L_z) = \operatorname{tr}(L_z\,L_x)$. Now consider a representation in which $L_z$ is diagonal. It has purely real entries since those are the eigenvalues, which are real since it is Hermitian. Since $L_x = \frac{1}{2}(L_+ + L_-)$ and $L_y = \frac{1}{2\text{i}}(L_+ - L_-)$, contracting with $\left|\ell, m\right>$ gives that in this representation $L_x$ has purely real entries, and $L_y$ has purely imaginary entries. So $L_z\, L_x$ has purely real entries (and therefore purely real trace), while $L_y\,L_z$ has purely imaginary entries (and therefore purely imaginary trace). But the only number that is both purely real and purely imaginary simultaneously is zero, so all the traces vanish. This argument extends to more general operators on multiple spins since Kronecker products preserve Hermiticity: if $A^\dagger = A$ and $B^\dagger = B$, then $(A \otimes B)^\dagger = A^\dagger \otimes B^\dagger = A \otimes B$. ←
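This orthogonality can be confirmed numerically for a few spin sectors by building the spin matrices from the standard ladder-operator matrix elements (a NumPy sketch with $\hbar = 1$; the helper name is mine):

```python
import numpy as np

def spin_ops(l):
    """Spin matrices (Lx, Ly, Lz) for total spin l in the |l, m> basis,
    ordered m = l, ..., -l, in units with hbar = 1."""
    dim = round(2 * l + 1)
    m = l - np.arange(dim)                       # l, l-1, ..., -l
    Lz = np.diag(m).astype(complex)
    Lp = np.zeros((dim, dim), dtype=complex)
    for i in range(1, dim):                      # <l, m+1|L+|l, m> = sqrt(l(l+1) - m(m+1))
        Lp[i - 1, i] = np.sqrt(l * (l + 1) - m[i] * (m[i] + 1))
    Lm = Lp.conj().T
    return 0.5 * (Lp + Lm), (Lp - Lm) / 2j, Lz

# Pairwise trace-orthogonality holds in every spin sector
for l in (0.5, 1, 1.5, 2):
    Lx, Ly, Lz = spin_ops(l)
    for A, B in ((Lx, Ly), (Ly, Lz), (Lz, Lx)):
        assert abs(np.trace(A @ B)) < 1e-12
```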
^{11}This numerator is probably most easily evaluated by noticing that $$\sum_{m=-\ell}^\ell \exp\left(\frac{\gamma m \hbar B}{k T}\right)\gamma m \hbar = k T \frac{\partial}{\partial B}\sum_{m=-\ell}^\ell \exp\left(\frac{\gamma m \hbar B}{k T}\right).$$ After Taylor expansion this ends up being a sum that is proportional to $\sum_{m=1}^{\ell}m^2$ in the case of integer $\ell$ and to $\sum_{m=\frac{1}{2}}^{\ell}m^2$ (i.e. the sum over $m = \frac{1}{2},\frac{3}{2},...,\ell$) in the case of half-integer $\ell$, both of which evaluate to $\frac{1}{6}\ell(\ell+1)(2\ell+1)$. ←
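Both cases of this sum identity are easy to check exactly with rational arithmetic (a Python sketch; the function name is mine, and the argument is $2\ell$ to keep everything integral):

```python
from fractions import Fraction

def sum_m_squared(two_l):
    """Exact sum of m^2 over m = l, l-1, ..., down to 1 (integer l)
    or down to 1/2 (half-integer l), with the argument given as 2l."""
    m = Fraction(two_l, 2)
    total = Fraction(0)
    while m > 0:
        total += m * m
        m -= 1
    return total

# Integer and half-integer cases alike give l(l+1)(2l+1)/6
for two_l in range(1, 13):
    l = Fraction(two_l, 2)
    assert sum_m_squared(two_l) == l * (l + 1) * (2 * l + 1) / 6
```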
^{12}This is the more quantitative statement promised in the section on the product operator formalism: the factor in front of $L_z$ corresponding to the initial equilibrium magnetisation should be $\beta = \frac{\gamma B}{(2\ell+1)kT}$. In practice for single spins, we just write $L_z$ at the beginning and then multiply the observable coefficient by $\bar{\mu}_z$ at the end. If we have two spins with $\gamma_1 \sim \gamma_2$, so both have equilibrium polarisation of the same order, then the starting operator should instead be $L_z + \frac{\beta_2}{\beta_1} S_z$. ←