Saturday 16 August 2008

SPECTRa

During the flight to Stockholm (where I’m having a great time btw), I read with great interest this article:

SPECTRa: The Deposition and Validation of Primary Chemistry Research Data in Digital Repositories
DOI: 10.1021/ci7004737

In my opinion, the project presented in this article is a great initiative which I can only applaud and support. There is, however, one point I would like to comment on because it is not fully clear to me: the way in which NMR spectra are stored in the repository.

The authors have decided to use JCAMP as the format for file input to their repositories. They do not specify which data is actually being saved in these JCAMP files: the processed spectrum, the FID, or both. I hope that they are saving the original FID and not only the processed spectrum; otherwise the preservation of the data will be incomplete. I think this is a very important point which deserves some further clarification. This is how I see it:

The most important piece of information in an NMR experiment is, for sure, the acquired FID, not the processed spectrum. A chemist could have processed an FID to produce a spectrum in such a way that some spectral features are lost. For example, he/she could have applied a very large line broadening function, which makes the analysis of the finer structure of some multiplets impossible. If the original FID has not been kept, re-processing the spectrum to recover those lost couplings will not be a viable option (in some cases, an inverse Fourier transform and/or some resolution enhancement procedures could help, but only to a very limited extent). In fact, there are many processing operations which could irreversibly alter either the qualitative or the quantitative information present in the NMR experiment.
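To make the line broadening example concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration, not data from a real spectrometer: a synthetic FID with two lines 4 Hz apart (at 100 and 104 Hz), a natural linewidth of 1 Hz, 1024 points, a 512 Hz sweep width, and a 10 Hz exponential line broadening. The function names are hypothetical and do not correspond to any NMR package.

```python
import cmath
import math


def simulate_fid(freqs_hz, linewidth_hz, n_points=1024, sweep_width_hz=512.0):
    """Synthetic complex FID: a sum of exponentially decaying lines."""
    dwell = 1.0 / sweep_width_hz
    times = [k * dwell for k in range(n_points)]
    decay_rate = math.pi * linewidth_hz  # gives a Lorentzian of this FWHM
    fid = [
        sum(cmath.exp(2j * math.pi * f * t) for f in freqs_hz)
        * math.exp(-decay_rate * t)
        for t in times
    ]
    return times, fid


def apply_line_broadening(times, fid, lb_hz):
    """Exponential apodization: adds lb_hz to the width of every line."""
    return [s * math.exp(-math.pi * lb_hz * t) for t, s in zip(times, fid)]


def absorption_spectrum(times, fid, axis_hz):
    """Real part of a direct Fourier sum, evaluated only on axis_hz."""
    return [
        sum(s * cmath.exp(-2j * math.pi * nu * t) for t, s in zip(times, fid)).real
        for nu in axis_hz
    ]


def count_peaks(spectrum, rel_threshold=0.5):
    """Count local maxima that rise above rel_threshold * tallest point."""
    top = max(spectrum)
    return sum(
        1
        for i in range(1, len(spectrum) - 1)
        if spectrum[i] > rel_threshold * top
        and spectrum[i] > spectrum[i - 1]
        and spectrum[i] > spectrum[i + 1]
    )


# Frequency window from 90 to 114 Hz in 0.25 Hz steps (illustrative choice).
axis_hz = [90.0 + 0.25 * i for i in range(97)]

times, fid = simulate_fid([100.0, 104.0], linewidth_hz=1.0)
sharp = absorption_spectrum(times, fid, axis_hz)
broad = absorption_spectrum(times, apply_line_broadening(times, fid, 10.0), axis_hz)

print("peaks without line broadening:", count_peaks(sharp))
print("peaks with 10 Hz line broadening:", count_peaks(broad))
```

With these assumed parameters the first spectrum should show two resolved maxima and the second only one, because two Lorentzian lines merge into a single maximum once their width is large compared with their separation. Whoever holds only the broadened spectrum cannot read the 4 Hz splitting off it any more, while whoever holds the FID can simply re-process it without the broadening.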

In short, I’m strongly convinced that any system aimed at preserving all the information contained in NMR experiments should keep the original acquired data points, the FID. This is something I have learnt during the many years of development of both MestReC and Mnova: Mnova keeps, in addition to the processed spectrum, all the original files as they were acquired on the spectrometer. And I know that iNMR does the same thing, though in a different way: Mnova packs all the files within a single binary file, whereas iNMR keeps all the files separately, together with a processing log file, so that the processed spectrum does not need to be saved. From my point of view, both approaches are equivalent and perfectly valid.