NMR Analysis, Processing and Prediction: November 2009

Monday 30 November 2009

Basis on qNMR: Integration Rudiments (Part I)

First, a quick recap. In my last post I put forward the idea that integration of NMR peaks is the basis of quantitative analysis. Before going any further, I would like to mention that peak heights can also be used for quantitation but, unless some special pre-processing is employed (see for example P. A. Hays, R. A. Thompson, Magn. Reson. Chem., 2009, 47, 819–824), measurement of peak areas is generally the recommended method for qNMR assays.

In this post I will cover some very basic rudiments of NMR peak area measurement, without going deep into complicated math, as my objective is just to lay the groundwork for upcoming, more advanced posts.

NMR Integration basic Rudiments

Peak areas may be determined in various ways. While I was still at school I learnt a very simple peak area calculation method which just required a good analytical balance and scissors. This was the so-called ‘cut & weigh method’ and is illustrated in the figure below.

By simply cutting out a rectangle of known value (for example, a known width in ppm or Hz on the x-axis and a known intensity on the y-axis), a calibration standard is obtained (in this case, 8 units of area). After cutting out and weighing this standard, the area of any peak can be determined by cutting the peak(s) out of the chart, weighing the paper and using this equation:

Area of Peak(s) = Area of standard * Weight of peak / Weight of standard
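The equation above is a simple proportion, and a minimal Python sketch of it could look as follows. The function name and the two weights are hypothetical; only the 8-unit standard area comes from the text.

```python
def area_from_weight(area_standard, weight_peak, weight_standard):
    """Area of a cut-out peak, scaled from a paper standard of known area."""
    if weight_standard <= 0:
        raise ValueError("weight of the standard must be positive")
    return area_standard * weight_peak / weight_standard

# Hypothetical weights in mg; the 8-unit standard is the one mentioned in the text.
peak_area = area_from_weight(area_standard=8.0, weight_peak=30.0, weight_standard=20.0)
print(peak_area)  # 12.0
```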

Despite its primitiveness, this technique was remarkably precise for the purpose for which it was intended (obviously, not for accurate NMR peak area measurement :-) ), although it did assume that the density of the paper was homogeneous.

There are other classical methods such as counting squares, planimeters or mechanical integrators, but in general they were subject to large errors. In the analog era, it was more convenient to measure the integral as a function of time, using an electronic integrator to sum the output voltage of the detector over the time of passage through the signals. In those old days, as described in [2], before the FT NMR epoch, the plotter was set to integral mode and the pen was swept through the peak or group of peaks as the pen level rose with the integrated intensity.

Enough about archaic methods. We are in the 21st century now and all NMR spectra are digitized, processed and analyzed by computers. As Richard Ernst once wrote [1], “Without Computers – no modern NMR”. How are NMR integrals measured? From a user's point of view, it’s very straightforward: the user selects the left and right limits of the peaks to be integrated and the software reports the area (most NMR software packages also have routines that select the spectral segments to be integrated automatically). For example, the figure below shows how this is done with our NMR software, Mnova.

Integration: What’s under the hood

But the question is: how is the computer actually calculating NMR peak areas? In order to answer this, let’s revisit some very simple integration concepts.

From the basic calculus we all learnt in school, we know that in order to compute the area under a function (e.g. f(x)) we simply need to calculate the integral of that function over a given interval (e.g. [a,b]).

If the function to be integrated (integrand) f(x) is known, we can analytically calculate the value of the area. For example, if the function has the simple quadratic expression

and we want to calculate the area under the curve over the interval [1,3], we just need to apply the well known Fundamental Theorem of Calculus so that the resulting area will be:
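As a worked illustration of this step, take f(x) = x^2. This particular quadratic is an assumption chosen for the example and is not necessarily the function intended above. Applying the Fundamental Theorem of Calculus over [1,3] then gives:

```latex
\int_{1}^{3} x^{2}\,dx
  = \left[\frac{x^{3}}{3}\right]_{1}^{3}
  = \frac{27}{3} - \frac{1}{3}
  = \frac{26}{3} \approx 8.67
```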

Unfortunately, real life is always more complex. Where NMR is concerned, the function f(x) is, in general, not known, so it cannot be integrated analytically using the Fundamental Theorem of Calculus as done before. I wrote ‘in general’ because theory does give us an analytical expression for an NMR signal (i.e. we know that, to a good approximation, NMR signals can be modeled as Lorentzian functions) but, for the moment, let’s consider the more general case in which the NMR signal has an unknown lineshape.

Furthermore, up until now we have assumed that f(x) is a continuous function. Obviously, this is not the case for computer generated NMR signals, which consist of discrete points as a result of the analog to digital conversion. Basically, the digitizer in the spectrometer samples the FID voltage, usually at regular time intervals, and assigns a number to the intensity. As a result, a tabulated list of numbers is stored in the computer. This is the so-called FID which, after a discrete Fourier Transform, yields the frequency domain spectrum. So how can a tabular set of data points (the discrete spectrum) be integrated?

A very naive method (yet, as we will see shortly, a very efficient one) is to use a very simple approximation for the area: basically, the integral is approximated by dividing the area into thin vertical blocks, as shown in the image below.

This method is known as a Riemann sum, after Bernhard Riemann.

Intuitively we can observe that the approximation gets better if we increase the number of rectangles (more on this in a moment). In practice, the number of rectangles is defined by the number of discrete points (digital resolution) in such a way that every point in the region of the spectrum to be integrated defines a rectangle.

For example, let’s consider the NMR peak shown in the figure below which I simulated using the spin simulation module of Mnova. It consists of a single Lorentzian peak with a line width at half height of 0.8531 points and a height of 100. With all this information we can know in advance the expected exact area calculated as follows:

In the spectrum shown in the image below we can see the individual digital points as crosses and the continuous trace, which has been constructed by connecting the crosses with straight lines (usually only these lines are shown in most NMR software packages; the capability of showing both the discrete points and the continuous curve is a special feature of Mnova).

If the simple Riemann method is applied, we obtain an area = 146, which represents an error of ca. 21% with respect to the true area value (184.12). It’s worth mentioning that the true value is calculated by integrating the function from minus infinity to plus infinity, whilst in the example above the integration interval is very narrow.

As mentioned above, the approximate area should get better if we increase the number of rectangles. This is very easy to achieve if we use some kind of interpolation to, for example, double the number of discrete points. We could use some basic linear interpolation directly in the frequency domain, although in NMR we know that a better approach is to extend the FID with zeroes via the so-called zero filling operation.
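Zero filling is easy to try out numerically. The sketch below uses NumPy and a synthetic FID whose parameters (number of points, dwell time, offset, decay constant) are all hypothetical; it relies on the fact that `np.fft.fft` pads the input with zeros when the requested length `n` exceeds the length of the data.

```python
import numpy as np

# Synthetic FID: one exponentially decaying complex sinusoid.
# All acquisition parameters below are hypothetical.
n_points = 256          # number of acquired complex points
dwell = 1.0e-3          # dwell time in seconds
freq = 100.0            # signal offset in Hz
t2 = 0.05               # decay constant in seconds

t = np.arange(n_points) * dwell
fid = np.exp(2j * np.pi * freq * t) * np.exp(-t / t2)

# Plain transform, then the same FID zero-filled to twice its length.
spectrum = np.fft.fft(fid)
spectrum_zf = np.fft.fft(fid, n=2 * n_points)

print(len(spectrum), len(spectrum_zf))

# Every second point of the zero-filled spectrum coincides with an original
# point; the points in between are the newly interpolated ones.
print(np.allclose(spectrum_zf[::2], spectrum))
```

In other words, zero filling once leaves the original spectral points untouched and places one interpolated point between each pair of them, which is what doubles the number of rectangles available for the area calculation.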

So if we double the number of digital points and thus the number of rectangles used for the area calculation we obtain a value of 258 (see image below). In this case, as the digital resolution is higher, the line width at half height is also higher, 1.7146 (in other words, we have more digital points per peak) so the true integral value will be 269.32:
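For an ideal Lorentzian of height H and full width at half height w, the area over the whole axis is (π/2)·H·w, which for H = 100 and w = 1.7146 gives about 269.3, in line with the value above. The following sketch checks this and compares it with simple point sums over windows of increasing width. The window sizes and the position of the peak relative to the digital grid are assumptions made for illustration, so the sums will not reproduce the figures quoted in this post.

```python
import numpy as np

def lorentzian(x, height, fwhm, center=0.0):
    """Lorentzian line with the given peak height and full width at half height."""
    half = fwhm / 2.0
    return height * half**2 / ((x - center) ** 2 + half**2)

height, fwhm = 100.0, 1.7146          # values quoted in the text (width in points)
true_area = np.pi / 2.0 * height * fwhm

# Sum of the digital points; the spacing is one point, so dx = 1.
# The window half-widths and the 0.3-point grid offset are assumptions.
areas = {}
for half_window in (5, 50, 500):
    x = np.arange(-half_window, half_window + 1, dtype=float)
    areas[half_window] = lorentzian(x, height, fwhm, center=0.3).sum()

print(f"analytic area: {true_area:.2f}")
for half_window, area in areas.items():
    shortfall = 100.0 * (true_area - area) / true_area
    print(f"+/-{half_window:3d} points: sum = {area:7.2f} ({shortfall:.1f}% below analytic)")
```

Because the tails of a Lorentzian decay slowly, a noticeable part of the shortfall in a narrow window comes from the area left outside the integration limits rather than from the rectangle approximation itself.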

Now the error is only about 4%. As a general rule it can be said that the better the digital resolution, the better the integration accuracy.

Mathematically, the Riemann method can be formulated as:
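In standard notation, with x_i the N digital points inside the integration limits [a, b] and Δx the spacing between neighbouring points (the symbol names are my own choice here), the rectangle sum reads:

```latex
A = \int_{a}^{b} f(x)\,dx \;\approx\; \sum_{i=1}^{N} f(x_i)\,\Delta x
```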

Considering that in almost all NMR experiments we are interested in relative areas, the spacing between data points, Δx, is a common factor and can be dropped from the formulas with no loss of generality.

This is exactly the method of choice of most NMR software packages for peak area calculations: NMR integrals are calculated by determining the running sum of all points in the integration segment.
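A minimal NumPy sketch of such a running sum is shown below. The function name, the toy spectrum and the segment limits are hypothetical; the last value of each running sum is the area of the segment, and the intermediate values trace the familiar integral curve.

```python
import numpy as np

def running_sum(spectrum, left, right):
    """Cumulative sum of the points between two indices (right one excluded)."""
    return np.cumsum(spectrum[left:right])

# Hypothetical spectrum: flat zero baseline with two rectangular "peaks".
spectrum = np.zeros(100)
spectrum[20:25] = 3.0   # 5 points of height 3
spectrum[60:65] = 1.0   # 5 points of height 1

trace_a = running_sum(spectrum, 10, 35)   # integral curve over the first segment
trace_b = running_sum(spectrum, 50, 75)   # integral curve over the second segment

area_a, area_b = trace_a[-1], trace_b[-1]  # final value of each trace = area
print(area_a, area_b, area_a / area_b)     # 15.0 5.0 3.0
```

Since only the ratio of the two areas matters here, the point spacing never enters the calculation.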

Other numeric integration methods

One important conclusion from the previous section is that in order to get more accurate areas we should increase the number of integration rectangles, something which is equivalent to increasing the number of digital points (e.g. by acquiring more points or using zero filling).

Instead of using the running sum of the simple individual rectangles, we can use some kind of polynomial interpolation between the limits defining each rectangle. The simplest method uses linear interpolation so that instead of rectangles we use trapezoids. This is the well known trapezoid rule which is formulated as:
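Written out for N equally spaced points x_1 … x_N with spacing Δx (notation assumed here), the composite trapezoid rule is:

```latex
A \;\approx\; \sum_{i=1}^{N-1} \frac{f(x_i) + f(x_{i+1})}{2}\,\Delta x
  \;=\; \Delta x \left[ \frac{f(x_1)}{2} + \sum_{i=2}^{N-1} f(x_i) + \frac{f(x_N)}{2} \right]
```

The second form shows that it differs from the simple running sum only in the weight given to the two end points.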

If instead of linear interpolation we use parabolic interpolation, the method is known as Simpson's rule, which is formulated as [3]:
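In its usual textbook form, for an odd number N of equally spaced points x_1 … x_N with spacing Δx (notation assumed here), the composite Simpson rule is:

```latex
A \;\approx\; \frac{\Delta x}{3} \left[ f(x_1)
  + 4 \sum_{j=1}^{(N-1)/2} f(x_{2j})
  + 2 \sum_{j=1}^{(N-3)/2} f(x_{2j+1})
  + f(x_N) \right]
```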

It is limited to situations where there is an even number of segments and thus an odd number of points. These 3 methods are summarized graphically in the figure below.
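To get a feeling for how the three rules compare, here is a small NumPy sketch that applies them to the same digitized peak. The functions are written out by hand rather than taken from a library, and the test peak (a Lorentzian of height 100 and a width at half height of 8 points, sampled at 201 points) is a hypothetical choice.

```python
import numpy as np

def rectangle_sum(y, dx=1.0):
    """Simple sum of all points (rectangle estimate)."""
    return dx * np.sum(y)

def trapezoid_rule(y, dx=1.0):
    """Trapezoid rule: like the sum, but the two end points count half."""
    return dx * (np.sum(y) - 0.5 * (y[0] + y[-1]))

def simpson_rule(y, dx=1.0):
    """Composite Simpson rule; requires an odd number of points."""
    if len(y) % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of points")
    return dx / 3.0 * (y[0] + 4.0 * np.sum(y[1:-1:2]) + 2.0 * np.sum(y[2:-1:2]) + y[-1])

# Hypothetical test peak: Lorentzian centred on the window.
x = np.arange(-100, 101, dtype=float)
half = 4.0                                  # half width at half height, in points
y = 100.0 * half**2 / (x**2 + half**2)

# Exact area over the same finite window, from the arctangent antiderivative.
exact = 2.0 * 100.0 * half * np.arctan(100.0 / half)

for name, rule in (("sum", rectangle_sum), ("trapezoid", trapezoid_rule), ("Simpson", simpson_rule)):
    print(f"{name:10s} {rule(y):10.3f}   exact {exact:.3f}")
```

With a peak digitized this generously, the three estimates should differ from each other by far less than 1% of the area.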

Other more sophisticated methods such as Romberg, Gaussian quadrature, etc, are beyond the scope of this post and can be found elsewhere.

Which integration method is more suitable for NMR?

This question will remain unanswered for now, open for discussion. Of the 3 integration methods discussed in this post, at first glance Simpson should be the most accurate. However, as explained in [3], this method is more sensitive to the integral limits (e.g. left and right boundaries) in such a way that if the limits are shifted one point to the left or to the right, the integral value will change significantly, while the other two approaches are more robust and the values are less affected.

In my experience, the difference between the simple sum and trapezoid method is small compared to other sources of errors (e.g. systematic and random errors, to be discussed in my next post) so using one approach or the other should not make any relevant difference.

Naturally, if very precise integral values are required, then more advanced methods based on deconvolution should be used. Of course, if you have any input, you’re more than welcome to leave your comments here.

Conclusions

There's a great deal more to NMR Integrals than reviewed here: I have simply scratched the surface. In my next post, I will follow up with the limits and drawbacks of standard NMR integration, introducing better approaches such as Line Fitting or Deconvolution.

References

[1] Ernst Richard R., Without computers - no modern NMR, in Computational Aspects of the Study of Biological Macromolecules by Nuclear Magnetic Resonance Spectroscopy, Edited by J.C.Hoch et al. Plenum Press 1991, pages 1-25

[2] Neil E. Jacobsen, NMR Spectroscopy Explained: Simplified Theory, Applications and Examples for Organic Chemistry and Structural Biology, N.J. : Wiley-Interscience, 2007

[3] Jeffrey C. Hoch and Alan S. Stern, NMR Data Processing, Wiley-Liss, New York (1996)

Sunday 22 November 2009

Basis on qNMR: Intramolecular vs Mixtures qNMR

A bit of historical background

NMR has won its reputation as a powerful tool for structure determination of organic molecules. In addition to the information provided by chemical shifts and coupling constants, the quantitative relationships existing between the peaks (or groups of peaks - multiplets) arising from the various nuclides in the sample have proven pivotal for the assignment and interpretation of NMR spectra.

Although the concept of quantitative NMR (qNMR) has been tied to NMR since the early 1950s, shortly after the technique's inception, it seems that NMR as an analytical tool for quantitative analysis was first mentioned in 1963 by Jungnickel and Forbes [Anal. Chem., 1963, 35 (8), pp 938–942], who determined the intramolecular proton ratios in 26 pure organic substances, and Hollis [Anal. Chem., 1963, 35 (11), pp 1682–1684], who analyzed the amount fractions of aspirin, phenacetin and caffeine in their respective mixtures.

From those pioneering works, many and varied studies on qNMR arose. As pointed out in J. Agric. Food Chem. 2002, 50, 3366-3374, qNMR is particularly suitable for the simultaneous determination of the percentage of active compounds and impurities in organic chemicals such as pharmaceuticals, agrochemicals and natural products, as well as vegetable oils, fuels and solvents, process monitoring, determination of enantiomeric excess, etc.

In what follows, I will use the term qNMR to refer to any quantitative measurement of NMR signals, regardless of whether the technique is employed as an analytical method (e.g. determination of the relative amounts of the components in a mixture) or as a tool for structure determination or conformational analysis.

What’s the deal with qNMR?

The basic principle of qNMR assays is that, ideally, the integral of the set of all peaks which can be assigned to a particular nucleus is proportional to the molar concentration of that nucleus in the sample. Theoretically, this holds quite well, though there are deviations from the rule in strongly coupled systems. An important point to keep in mind is the word “ideally”; this includes, for example, perfectly relaxed samples.
Even so there remain a number of problems which can be first of all divided into two categories:

Sources of statistical assessment errors (scatter)
Sources of systematic assessment deviations (bias)

I will cover these points in detail in separate posts.

Intramolecular vs Intermolecular (mixtures) qNMR

The most important fundamental concept of qNMR is that the absorption coefficient for the absorption of electromagnetic energy is the same for all nuclei of the same species, regardless of whether they belong to one or several molecules (e.g. a mixture). As a result, the NMR signal response (more precisely, the integrated signal area) is directly proportional to the number of nuclei contributing to the signal.
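This proportionality is what allows raw integrals to be rescaled into relative numbers of nuclei. As a minimal sketch, with a hypothetical function name and made-up integral values, one multiplet with a known number of nuclei serves as the reference and all the others are scaled accordingly:

```python
def normalize_integrals(integrals, reference, reference_nuclei):
    """Scale raw integrals so the reference multiplet equals its known nucleus count."""
    ref_value = integrals[reference]
    return {name: value * reference_nuclei / ref_value for name, value in integrals.items()}

# Hypothetical raw integrals (arbitrary units) for three multiplets of one molecule.
raw = {"CH3": 300.0, "CH2": 200.0, "OH": 100.0}

relative = normalize_integrals(raw, reference="CH3", reference_nuclei=3)
print(relative)  # {'CH3': 3.0, 'CH2': 2.0, 'OH': 1.0}
```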

For example, all organic chemists are very familiar with integrating the multiplets of a 1H spectrum to elucidate or confirm a particular molecular structure (see figure below).

This application can be classified as Intramolecular qNMR. NOE spectra, where the intensity is related to the distance between spins and which represent the main basis for NMR as a tool in structural molecular biology, are another application of Intramolecular qNMR (Note: In this context I’m not including Transfer-NOE, used e.g. to study the structure of a ligand in a complex under conditions of fast exchange).

Let’s consider now another example, Intermolecular qNMR:
The purity of a compound can be determined using an internal standard (is) of known purity. Assuming the instrumental parameters are properly set, it is given by the equation below (see for example, 10.1002/mrc.2464):

% purity by weight = W(is)/W(s) * A(s)/A(is) * MW(s)/MW(is) * H(is)/H(s) * P(is)

where W(s) and W(is) are the weights of the sample and the internal standard (ISTD), A(s) and A(is) are the integrals (areas) of the sample and ISTD peaks, MW(s) and MW(is) are the molecular weights of the sample and ISTD, H(s) and H(is) are the number of hydrogens represented by the integral for the sample and ISTD, respectively, and P(is) is the known purity of the ISTD in percent.
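As a minimal sketch, the purity calculation can be wrapped in a small Python function. The function and argument names are hypothetical, and so are the numbers in the example. The `p_is` argument is the purity of the internal standard in percent; it scales the product of the four ratios (weights, integrals, molecular weights, hydrogen counts) so that the result comes out in percent, and its default of 100 is an assumption of this sketch.

```python
def purity_percent(w_s, w_is, a_s, a_is, mw_s, mw_is, h_s, h_is, p_is=100.0):
    """Weight-percent purity of a sample measured against an internal standard.

    p_is is the purity of the internal standard in percent.
    """
    return (w_is / w_s) * (a_s / a_is) * (mw_s / mw_is) * (h_is / h_s) * p_is

# Hypothetical numbers, for illustration only.
result = purity_percent(
    w_s=10.0, w_is=5.0,       # weights in mg
    a_s=1.0, a_is=1.0,        # integrals (arbitrary units)
    mw_s=300.0, mw_is=150.0,  # molecular weights in g/mol
    h_s=1, h_is=1,            # hydrogens behind each integral
    p_is=99.0,                # purity of the internal standard in %
)
print(result)  # 99.0
```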

As a simple application, see Q-NMR for purity determination of macrolide antibiotic reference standards: Comparison with the mass balance method

Common to all qNMR studies is the calculation of NMR integrals. In my next post, I will cover the basic principles of NMR integration.

Saturday 21 November 2009

Basis on qNMR: Rudiments

When I started playing drums, many years ago, I kept hearing about so-called "Drum Rudiments". At that time, I was too young to realize how important they were and, to me, they appeared to be just boring and repetitive exercises. However, rudiments (the basic building blocks or "vocabulary" of drumming) are absolutely essential to master the drums (something I have to admit I never achieved :-) ).

In the last few years I’ve had the opportunity to meet and interact with many chemists who are using our NMR software. Some of them are NMR specialists with outstanding knowledge, from whom I have learnt a lot. On the other hand, other chemists use NMR on a daily basis simply to confirm the structure(s) they have just synthesized but do not have a deep grasp of the inner details of NMR theory and signal data processing. Whilst I understand that in general this is fine, I have noticed recently that many of these less-experienced NMR scientists are now getting involved in more advanced NMR studies and, in my humble opinion, the lack of some important rudiments can lead to an improper interpretation of the NMR data.

One interesting example is quantitative NMR (qNMR), a field which is being used increasingly in the pharmaceutical industry, for instance to quantify impurity levels, but which is also very important in the field of natural products (see for example J. Nat. Prod. 2007, 70, 589-595) and for the calibration of other quantitative techniques such as HPLC. Typically, qNMR is based on obtaining quantitative information through integral-based calculations so, in principle, it might seem that this is something trivial which does not require any additional effort. Whilst this is generally true, there are some very important rudiments which I think are worth pointing out.
The rudiments I will present in this series of articles will range from basic concepts on NMR Integration to more advanced deconvolution techniques, including our newly developed Global Spectral Deconvolution algorithm, GSD.
So if you have any interest in qNMR, watch this space. I promise to post these qNMR rudiments on a regular basis.

Thursday 19 November 2009

Micropost [OT]: NMR meets Football

Relaxation plays a major role in NMR spectroscopy – What’s better than playing sports to chill out and forget about everyday problems?

I reckon this is not the best football team you might find but at least I guarantee they are fun people (sponsored by a great company :-) ) with whom you can have a good time (and get a free t-shirt!) :-)

Mestrelab World of Sports - Free Mnova t-shirt quiz

Tuesday 3 November 2009

Windows 7

Windows 7 was released last week marking, in the opinion of many analysts, the beginning of the end of Windows Vista. Microsoft expects that Windows 7 will woo users who have resisted Vista by offering higher performance and compatibility as well as extra features. In fact, Windows 7 has been the biggest pre-order item in the history of Amazon UK.
If you are interested in making the switch, our preliminary tests indicate that Mnova 6.0.2 runs smoothly under Windows 7. Even so, we cannot exclude incompatibilities, as our tests on Windows 7 have not been as comprehensive as we would have liked (still working on it though).

So if you are running Windows 7 and find any problem with Mnova, we would really appreciate it if you could let us know.

NOTE: Some users have reported problems with version 5.2.5 Lite on Windows 7, although we have not been able to reproduce them on our computers. Rest assured that we are currently investigating this further.