
Chemometrics and Intelligent Laboratory Systems 109 (2011) 171-177

Contents lists available at SciVerse ScienceDirect

Chemometrics and Intelligent Laboratory Systems


journal homepage: www.elsevier.com/locate/chemolab

Structural characterization of carbonyl compounds by IR spectroscopy and chemometrics data analysis


Nabiollah Mobaraki a,b, Bahram Hemmateenejad a,b,*

a Chemistry Department, Shiraz University, Shiraz, Iran
b Medicinal & Natural Products Chemistry Research Center, Shiraz University of Medical Sciences, Shiraz, Iran

Article info

Article history:
Received 18 March 2011
Received in revised form 25 August 2011
Accepted 29 August 2011
Available online 7 September 2011

Keywords: Carbonyl compound; IR spectroscopy; Chemometrics; Classification; ECVA

Abstract

Although carbonyl compounds show some distinctive peaks in the mid-IR region of the electromagnetic spectrum, it is very difficult to assign an FT-IR spectrum to a specific carbonyl functional group, because other functional groups in the molecule shift the positions of the distinctive peaks. Here, we analyzed the FT-IR spectra of a large set of carbonyl compounds by chemometrics methods to differentiate between the carbonyl functional groups. FT-IR spectra of 370 carbonyl compounds (149 carboxylic acids, 47 aldehydes, 110 esters and 64 ketones) were collected from the Spectral Database for Organic Compounds and converted to digital data using a home-made program. Extended canonical variate analysis combined with partial least squares discriminant analysis (ECVA-PLS-DA) was employed as a supervised classification method. It gave a classification model that discriminates between the different carbonyl compounds from their FT-IR spectra with a small error. The classification errors (reported as the percentage of misclassified compounds) were 1.8% and 7.8% for the training and test sets, respectively. Considering the high structural diversity of the studied compounds and the different methods used for acquiring the spectra (i.e., KBr disk, CCl4 solution, liquid film and Nujol mull), these errors are acceptable. Thus, it is concluded that, with the help of chemometrics methods, one can differentiate the carbonyl compounds using their IR spectra without the need for extra spectroscopic information. This can be considered a significant improvement in the structural characterization of organic compounds using only IR spectroscopy.

© 2011 Elsevier B.V. All rights reserved.

1. Introduction

Infrared (IR) spectroscopy with frequencies between 4000 and 400 cm⁻¹ (wavenumbers) is an invaluable tool in organic structure determination and verification. Radiation in this region is absorbed by interatomic bonds, and chemical bonds in different environments absorb with varying intensities at differing frequencies, so IR absorption bands can be used in organic structure determination. Thus, IR spectroscopy is certainly one of the most important analytical techniques available to today's scientists. An attractive feature of IR spectroscopy is that samples can be studied in any state, such as liquids, solutions, pastes, powders, films, fibers, gases and surfaces [1, 2].

The C=O stretching band of carbonyl compounds appears at about 1700 cm⁻¹. However, the exact position of this band depends on the type of carbonyl functional group (e.g., carboxylic acid, ester, aldehyde and ketone) and on the chemical environment of the C=O functional group (e.g., presence or absence

* Corresponding author. Tel.: +98 711 613 7360; fax: +98 711 228 6008. E-mail address: hemmatb@sums.ac.ir (B. Hemmateenejad).
0169-7439/$ - see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.chemolab.2011.08.011

of other functional groups) [3]. Although there are other absorbance bands (e.g., the C-H stretching of aldehydes at about 3000 cm⁻¹ and, for carboxylic acids, the broad band centered in the range of 2700-3300 cm⁻¹ caused by the O-H group and a band near 1400 cm⁻¹ arising from the C-O single bond), assigning an IR spectrum to a specified carbonyl functional group is still not a simple task. Thus, IR spectroscopy is usually combined with other spectroscopic methods, such as nuclear magnetic resonance (NMR) spectroscopy, or with a chemical test to distinguish between different carbonyl functional groups.

Band overlapping and the sensitivity of the absorption bands to the chemical environment of the functional groups are the major limitations to extensive application of IR spectroscopic techniques in studying complex samples. However, since the introduction of chemometrics in chemistry, and thanks to the resolving power of different chemometrics techniques, the problem of band overlapping has been diminished [4]. Thus, in recent years, the combination of IR spectroscopy with chemometrics methods has become a powerful analytical technique with significant quantitative and qualitative applications in various chemical fields, including the pharmaceutical [5], food [6, 7] and textile industries [8]. In addition, different IR spectroscopic techniques have found applications in disease detection [9-11]. Moreover,


characterization of crude oil and refinery products [12], monitoring of chemical and biotechnological processes [13, 14], and bacterial identification [15] by IR spectroscopy combined with chemometrics have been reported recently.

Multivariate data analysis methods have also been employed in the structural characterization of organic compounds by spectroscopic methods. Stallard developed a classification model to discriminate between alkanes, alcohols and ketones/aldehydes by chemometrics analysis of the mid-IR region [16]. However, the number of compounds used was too small. Mass spectral classification methods have also been reported for determining the presence or absence of functional groups in the structural characterization of organic compounds [17, 18]. In our opinion, recent chemometrics methods should be able to discriminate compounds of higher structural similarity. However, to the best of our knowledge, there is no report on the classification of structurally related compounds (e.g., carbonyl compounds) by IR spectroscopy.

In this work, we examined the application of IR spectroscopy combined with chemometrics methods for discrimination of the carbonyl functional groups. It was found that, with the help of chemometrics methods, one can differentiate the carbonyl compounds using their IR spectra without the need for extra spectroscopic information.

2. Experimental

2.1. Data set

In this study, a list of 370 carbonyl compounds composed of 149 carboxylic acids, 47 aldehydes, 110 esters and 64 ketones was used. The data of each group were randomly partitioned into training and test sets. The numbers of molecules in each subset are reported in Table 1. The compounds are listed in Table S1 of the Supplementary materials, from which one can see that the selected compounds have a large structural diversity.

2.2. IR spectra

The transmittance FT-IR spectra were collected from the website of the Spectral Database for Organic Compounds (SDBS), provided by the Japanese National Institute of Advanced Industrial Science and Technology (AIST), which can be accessed via http://riodb01.ibase.aist.go.jp/sdbs/cgi-bin/direct_frame_top.cgi. This website provides the IR spectra of a large number of organic compounds recorded by different techniques such as liquid film, CCl4 solution, KBr disk and Nujol mull. The database presents each IR spectrum as a picture in GIF format, which is not suitable for multivariate data analysis. To make the IR pictures usable in this study, the pictures (in JPEG, BMP or GIF format) had to be converted to digital data. To do so, a subroutine was written in the MATLAB environment (The MathWorks, Inc.).
Table 1
Summary of the classification errors obtained by ECVA-DA and PLS-DA.

Subset       Group      N (a)   ECVA-DA                PLS-DA
                                NM (b)   ME (%) (c)    NM (b)   ME (%) (c)
Training     Acid       112     2        1.8           2        1.8
             Aldehyde   38      1        2.6           3        7.9
             Ester      83      2        2.4           4        4.8
             Ketone     50      0        0             2        4.0
             Total      283     5        1.8           11       3.9
Prediction   Acid       37      2        5.4           2        5.4
             Aldehyde   9       2        22.2          4        44.4
             Ester      27      2        7.4           4        14.8
             Ketone     14      1        7.1           4        28.6
             Total      87      7        8.0           14       16.1
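As a small consistency check on Table 1, the misclassification error can be recomputed from the counts as ME (%) = 100 × NM / N. The sketch below is illustrative only and is not the authors' MATLAB code; the dictionary layout and the function names are assumptions made here, while the counts are transcribed from the table.

```python
# Counts transcribed from Table 1:
# (subset, group) -> (N, NM for ECVA-DA, NM for PLS-DA)
TABLE1 = {
    ("training", "acid"): (112, 2, 2),
    ("training", "aldehyde"): (38, 1, 3),
    ("training", "ester"): (83, 2, 4),
    ("training", "ketone"): (50, 0, 2),
    ("prediction", "acid"): (37, 2, 2),
    ("prediction", "aldehyde"): (9, 2, 4),
    ("prediction", "ester"): (27, 2, 4),
    ("prediction", "ketone"): (14, 1, 4),
}


def misclassification_error(n_misclassified, n_total):
    """Return ME (%) = 100 * NM / N, rounded to one decimal place."""
    return round(100.0 * n_misclassified / n_total, 1)


def subset_totals(subset):
    """Sum N and both NM columns over the four groups of one subset."""
    rows = [counts for (name, _), counts in TABLE1.items() if name == subset]
    return tuple(sum(column) for column in zip(*rows))


for subset in ("training", "prediction"):
    n, nm_ecva, nm_pls = subset_totals(subset)
    print(subset, n,
          misclassification_error(nm_ecva, n),
          misclassification_error(nm_pls, n))
```

The recomputed totals reproduce the "Total" rows of the table (1.8% and 3.9% for training, 8.0% and 16.1% for prediction).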

The subroutine digitizes the IR spectrum over a wavenumber interval defined by the user and thus produces a vector of transmittance (or absorbance) data for the spectrum of each compound. It saves the data in text format. Since the spectra are registered on the SDBS website in transmittance units, and logarithmic transformation affects the noise distribution (e.g., homoscedastic transmittance noise becomes strongly dependent on the concentration after logarithmic transformation to absorbance [19, 20], and multiplicative errors become additive after logarithmic transformation [21, 22]), we preferred to use the transmittance data first. However, as explained in the next sections, transmittance and absorbance spectra gave similar results.

2.3. Data analysis

The digitized transmittance (or absorbance) data of the IR spectra of all studied molecules were collected in a data matrix D of dimension (370 × nw), where nw is the number of transmittance readings per spectrum. Thus, each row of D (di) is the transmittance spectrum of a specified molecule. Three types of preprocessing methods (i.e., mean-centering, scaling to unit variance, and autoscaling to zero mean and unit variance) were considered for both transmittance and absorbance data. Dimension reduction and data visualization were performed by principal component analysis (PCA). Classification was achieved using extended canonical variate analysis followed by partial least squares-linear discriminant analysis (ECVA-PLS-DA). The eigenvectors and eigenvalues of the covariance matrix in PCA were calculated with the singular value decomposition (SVD) function of MATLAB. Calculation of the extended canonical variates (ECVs) in ECVA and data analysis by PLS-DA were performed using ECVA Toolbox version 2.02 (available at www.models.life.ku.dk).

3. Results and discussion

We investigated here the ability of IR spectroscopy in combination with chemometrics data analysis to discriminate between structurally related carbonyl compounds. As shown in Table S1 of the Supplementary materials, a total of 370 carbonyl compounds belonging to the carboxylic acid, ester, ketone and aldehyde functional groups were selected and their IR spectra were collected from the SDBS website. To maintain structural diversity and to account for the effects of other functional groups, we also tried to select carbonyl compounds possessing other functional groups and structural features (e.g., amines, halogens, nitro, aromatic, aliphatic, cyclic, and acyclic) in their structure. In addition, there is a large variation in the number of carbon atoms of the selected compounds.

3.1. Converting imaged spectra to digital spectra

The SDBS website gives pictures of IR spectra in GIF format. To convert the pictures to digital spectra, a subroutine with a graphical user interface (GUI) was written in MATLAB. Briefly, this program reads the pictured spectra in different formats (e.g., JPEG, BMP or GIF). It locates the pixels in the x and y directions where the spectrum line appears and then, using the user-defined axis limits, converts the pixel positions in the y and x directions of the picture plane to transmittance and wavenumber, respectively. Since the wavenumber axis of IR spectra is usually drawn with two different scales (i.e., the 4000-2000 cm⁻¹ range is drawn on a smaller scale than the 2000-400 cm⁻¹ range), an option was added to the program to handle this representation of the wavenumber axis. The subroutine also gives the user the option to smooth the generated spectra using the Savitzky-Golay algorithm [16]. Finally, it plots the digitized spectra and compares them with the original pictured spectra. As an example, the pictured IR spectra of four compounds are compared

Notes to Table 1 (Training and Prediction subsets): (a) number of molecules in each group/subset; (b) number of misclassified molecules; (c) misclassification error (percent of misclassified molecules).


with the digitized spectra in the Supplementary material section. Clearly, there is a high degree of similarity between the original and the digitized spectra. The wavenumber interval for sampling the digitized transmittance (or absorbance) data was 0.5 cm⁻¹, and hence 7200 readings were obtained for each spectrum. Since 370 compounds were studied in this work, the dimension of the data matrix D was (370 × 7200). This data matrix was used as the input of the different data analysis methods for classification of carbonyl compounds. The digitized spectral data of the compounds of this study are available upon request.

To get a quick snapshot of the IR spectra of the different carbonyl functional groups, the spectra of the compounds of each functional group were averaged. The results are shown in Fig. 1A. Some spectral regions differ between the functional groups. However, one can see from Fig. 1B that the standard deviations of the peak intensities of the compounds in each group are very large at the distinctive peaks. This uncertainty in peak shapes and intensities, which arises from the structural diversity of the organic compounds and the presence of other functional groups besides the carbonyl group, makes it difficult to assign a spectrum visually to a specified carbonyl functional group. Thus, the spectra were processed by chemometrics methods to extract the signature and fingerprint of the carbonyl functional groups.

In addition to the analysis of the original transmittance data, the effect of data preprocessing methods, including (i) mean-centering, (ii) scaling and (iii) autoscaling, was also investigated. Interestingly, the employed preprocessing methods did not affect the final classification results, and in some instances better results were obtained with the original data without preprocessing. Therefore, we first present the results of the analysis of the original data and then explain the effect of data preprocessing.
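The three preprocessing options named above act column-wise on the data matrix D, that is, on each wavenumber across all compounds. A minimal sketch follows, using only the Python standard library and a toy 3 × 2 matrix. The function names and the toy values are assumptions for illustration; the original work was carried out in MATLAB.

```python
from statistics import mean, stdev


def columns(matrix):
    """Return the columns of a row-major matrix as lists."""
    return [list(col) for col in zip(*matrix)]


def mean_center(matrix):
    """Subtract the column (wavenumber) mean from every entry."""
    mus = [mean(col) for col in columns(matrix)]
    return [[x - m for x, m in zip(row, mus)] for row in matrix]


def scale_unit_variance(matrix):
    """Divide every entry by its column sample standard deviation.

    A constant column has zero standard deviation and would need
    special handling; this sketch assumes no such column exists.
    """
    sds = [stdev(col) for col in columns(matrix)]
    return [[x / s for x, s in zip(row, sds)] for row in matrix]


def autoscale(matrix):
    """Mean-center, then scale to unit variance."""
    return scale_unit_variance(mean_center(matrix))


# Toy "spectra": 3 compounds x 2 wavenumbers (transmittance fractions).
D = [[0.9, 0.2],
     [0.7, 0.4],
     [0.5, 0.6]]
for row in autoscale(D):
    print([round(x, 3) for x in row])
```

After autoscaling, every column has zero mean and unit sample standard deviation, so intense and weak bands contribute on the same footing.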

3.2. Dimension reduction and visualization by PCA

Application of PCA to the spectral data matrix of the whole set of molecules revealed that the first three principal components explain 70.37% (54.17%, 9.66% and 6.54% for the first, second and third PCs, respectively) of the variance in the spectral data. This means that by projecting the 7200-dimensional spectra of the selected compounds into a 3-dimensional factor space, about 70% of the information is retained. This 3-dimensional factor space can be used to visualize the positions of the studied carbonyl compounds relative to each other based on the similarity between their IR spectra. Fig. 2 shows the relative positions of the different types of carbonyl compounds in the 3-dimensional space of the first three PCs. Acids and esters are observed at the two ends of the distribution pattern of the molecules and are completely separated from each other. Aldehydes and ketones are positioned between acids and esters, with the ketones relatively separated from acids and esters.
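The percentages quoted above follow the usual relation between singular values and explained variance: for a mean-centered data matrix, the share of PC i is s_i² / Σ s_j². The sketch below illustrates that relation and checks that the three reported shares add up to the stated total. The function name and the toy singular values are assumptions; only the three percentages come from the text.

```python
def explained_variance_percent(singular_values):
    """Percent of total variance carried by each component.

    Assumes the singular values come from a mean-centered data matrix,
    so that squared singular values are proportional to variances.
    """
    squared = [s * s for s in singular_values]
    total = sum(squared)
    return [100.0 * value / total for value in squared]


# Toy example with two hypothetical singular values.
print(explained_variance_percent([4.0, 3.0]))

# Shares reported in the text for the first three PCs.
reported = [54.17, 9.66, 6.54]
cumulative = round(sum(reported), 2)
print(cumulative)
```

If the data are not mean-centered before the decomposition, the same ratio describes the share of the total sum of squares rather than of the variance.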

[Fig. 1 graphic. Panel A: transmittance versus wavenumber (cm⁻¹, 4000 to 400) for the ketone, ester, aldehyde and acid groups. Panel B: relative standard deviation versus wavenumber (cm⁻¹, 4000 to 400) for the same groups.]
Fig. 1. (A) The averaged transmittance IR spectra of the compounds of different carbonyl functional groups and (B) the standard deviation between the transmittance of the compounds in each group. For better visualization, constant baselines were added to the averaged and standard deviation spectra.
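The group-averaged spectra and their spread shown in Fig. 1 amount to a column-wise mean and standard deviation taken within each functional group. A minimal sketch on toy data is given below. The function name, the labels and the numbers are illustrative assumptions, and the constant baselines added for plotting are omitted.

```python
from statistics import mean, stdev


def group_profiles(spectra, labels):
    """Return {group: (mean_spectrum, sd_spectrum)}, computed per wavenumber."""
    profiles = {}
    for group in sorted(set(labels)):
        rows = [s for s, g in zip(spectra, labels) if g == group]
        cols = list(zip(*rows))
        profiles[group] = ([mean(c) for c in cols],
                           [stdev(c) for c in cols])
    return profiles


# Toy data: 4 compounds x 2 wavenumbers (transmittance fractions).
spectra = [[0.8, 0.2],
           [0.6, 0.4],
           [0.9, 0.9],
           [0.7, 0.5]]
labels = ["acid", "acid", "ester", "ester"]

for group, (avg, sd) in group_profiles(spectra, labels).items():
    print(group, [round(x, 3) for x in avg], [round(x, 3) for x in sd])
```

A large standard deviation at a wavenumber where the group means differ is exactly the situation described for the distinctive peaks: the averages separate, but individual spectra overlap.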


Fig. 2. Distribution pattern of the studied carbonyl compounds in the 3-dimensional PCA-based factor space: ( ) acids, (+) aldehydes, (o) ketones and ( ) esters.

3.3. Classification by ECVA-PLS-DA

For classification purposes, instead of using the original data as the input of a discriminant analysis method such as linear discriminant analysis (LDA), the data are first transferred into a lower-dimensional space by projection methods such as PCA. However, in PCA the PCs are calculated solely from the absorbance/transmittance data, and thus they are not necessarily the components relevant for discrimination. On the other hand, methods such as partial least squares-discriminant analysis (PLS-DA) produce components that are more correlated with the data classes [17]. However, PLS-DA suffers from poor performance in situations not unlikely to occur in real data [18]. Canonical variate analysis (CVA) [23] is another supervised classification method, which maximizes the differences between the groups in the data according to a well-defined optimization criterion. However, CVA cannot deal with highly collinear data such as spectroscopic data (here IR spectra), where the number of variables is significantly larger than the number of samples. Nørgaard et al. suggested an alternative method to solve the problem of singular matrices that arises when analyzing collinear data with CVA [18]. The extended CVA (ECVA) method is based on standard CVA. However, by transforming the eigenvector problem into a regression problem, it becomes possible to use PLS in the inner part of CVA, thereby allowing the analysis of collinear data.

Among the different supervised classification methods, ECVA-PLS-DA was employed as a new and efficient classification method [24-26]. The data set was partitioned into training and test sets so that about 25% of each group was selected as the test set and the remainder was used as the training set to build the classification model. The model refinement procedure used 10-segment contiguous block cross-validation (CB-CV) to select the optimum number of latent variables of the PLS model used in ECVA.
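The optimization criterion of CVA mentioned in this section is not written out in the text. As background, the standard textbook form is sketched below; it is not a transcription of the equations in ref. [18]. The symbols S_b (between-group covariance), S_w (within-group covariance), w (canonical weight vector), λ and d are notation assumed here.

```latex
% Standard CVA (Fisher) criterion and the generalized eigenproblem it leads to
J(\mathbf{w}) = \frac{\mathbf{w}^{\mathsf{T}} \mathbf{S}_b \mathbf{w}}
                     {\mathbf{w}^{\mathsf{T}} \mathbf{S}_w \mathbf{w}},
\qquad
\mathbf{S}_b \mathbf{w} = \lambda\, \mathbf{S}_w \mathbf{w}

% Two-group case: S_b is proportional to d d^T with d the difference of group means,
% so S_b w is always parallel to d and the eigenproblem reduces to a linear system
\mathbf{S}_b \propto \mathbf{d}\,\mathbf{d}^{\mathsf{T}}
\;\Longrightarrow\;
\mathbf{S}_w \mathbf{w} \propto \mathbf{d},
\qquad
\mathbf{d} = \bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_2
```

With 7200 variables and a few hundred samples, S_w is singular, so the classical solution w ∝ S_w⁻¹ d does not exist. Treating S_w w = d as a regression of d on S_w and estimating w by PLS is the kind of eigenvector-to-regression transformation described above.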
The plot of the number of misclassified compounds versus the number of PLS latent variables is given in the Supplementary material section. The cross-validation curve shows a minimum at 13 latent variables, where the difference between the calibration and cross-validation results is also small. Beyond this point, no significant improvement is achieved in the cross-validation results. Thus, 13 latent variables were chosen as the optimum value for the prediction step.

The number of canonical directions is always one less than the number of groups in the data set, so for the 4-group case the solution is three-dimensional. In Figs. 3 and 4, the canonical variates for the ECVA method based on a 13-component PLS model in the inner relation are shown. In Fig. 5, the corresponding canonical weight vectors are presented. The extended canonical variates (ECVs) shown in Fig. 3 illustrate how they can be used for discrimination of carbonyl compounds. The ECVs in the first direction clearly discriminate ketones from the other functional groups, whereas those in the second direction discriminate aldehydes and esters from acids and ketones. For example, molecules having positive variates in the first direction and small negative variates in the second direction are ketones, and those having negative variates in both directions are most probably acids. On the other hand, molecules whose canonical variates in the first and second directions are negative and positive, respectively, can be most

Fig. 3. The extended canonical variates for acids (red), aldehydes (blue), esters (black) and ketones (green) obtained by the 13-component inner PLS model.


Fig. 4. Distribution pattern of the studied carbonyl compounds in the 3-dimensional canonical variate space: ( ) acids, (+) aldehydes, (o) ketones and ( ) esters.

Fig. 5. The extended canonical weights obtained by 13-component inner PLS model.

probably considered as aldehydes or esters. Finally, by using the ECVs in the third direction, one can discriminate aldehydes from esters. Fig. 4 illustrates the separation between the groups in the lower-dimensional ECV space. Clearly, the group separation in the ECV space is more pronounced than that found in the PC space (Fig. 2).

The canonical weight vectors shown in Fig. 5 indicate the wavenumbers that play the most significant role in the discrimination of the carbonyl functional groups. The weight vectors have their largest values at 1285, 1353, 1399, 1732, 1735, 2682, 2785 and 2987 cm⁻¹. These spectral regions are close to the regions where the distinctive peaks of carbonyl compounds appear.

Based on the 13-component PLS model, LDA was introduced as the classifier, with the canonical variates as the input and the number of misclassifications as the validation error. The cross-validation predicted classes of the different carbonyl functional groups in the training set and those of the prediction set are summarized in Table 1. It is observed that 5 molecules (2 acids, 2 esters and 1 aldehyde) out of 281 molecules in the training phase were not predicted correctly. This is a very small prediction error. The obtained model also showed good prediction ability for the test compounds, which did not contribute to the model development steps. Among the 89 molecules used in the prediction set, 82 molecules were correctly assigned to their respective classes and the class of 7 molecules was not predicted correctly. The details of class prediction by ECVA-DA are given in Table S1 of the Supplementary information. The names of the misclassified molecules are listed in Table 2. The table also reports the probabilities of class membership (i.e., the probabilities that each molecule belongs to a specific functional group). It is very difficult to explain why these compounds were not assigned to their classes correctly.

Table 2
The list of misclassified compounds by ECVA-DA.

                                                                      Class membership probability
Subset     Group      Molecule                       Classified as    Acid   Aldehyde   Ester   Ketone
Training   Acid       2-Amino-5-chlorobenzoic acid   Ketone           0.19   0.04       0.00    0.77
           Acid       o-Toluic acid                  Ketone           0.09   0.01       0.00    0.90
           Aldehyde   Pentamethyl benzaldehyde       Ketone           0.05   0.00       0.00    0.95
           Ester      Ethyl palmitate                Acid             0.56   0.00       0.44    0.00
           Ester      Ethyl hexanoate                Acid             0.43   0.00       0.41    0.16
Test       Acid       3,5-Dinitrosalicylic acid      Ester            0.12   0.00       0.52    0.36
           Acid       p-Tert-butyl benzoic acid      Ester            0.17   0.16       0.67    0.00
           Aldehyde   Trans-2-butenal                Ketone           0.14   0.07       0.00    0.79
           Aldehyde   p-Isobutylbenzaldehyde         Acid             0.55   0.25       0.00    0.20
           Ester      Methyl hexanoate               Ketone           0.01   0.00       0.22    0.77
           Ester      Hexyl hexanoate                Ketone           0.00   0.00       0.02    0.98
           Ketone     3-Octanone                     Ester            0.00   0.33       0.49    0.18
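The "Classified as" column of Table 2 is consistent with assigning each molecule to the class of highest membership probability. The sketch below transcribes the table and applies that rule. The maximum-probability rule and the function name are assumptions about how the assignment can be reproduced, not a statement of how the ECVA toolbox computes it.

```python
CLASSES = ("Acid", "Aldehyde", "Ester", "Ketone")

# (molecule, true group, probabilities in CLASSES order), from Table 2.
TABLE2 = [
    ("2-Amino-5-chlorobenzoic acid", "Acid", (0.19, 0.04, 0.00, 0.77)),
    ("o-Toluic acid", "Acid", (0.09, 0.01, 0.00, 0.90)),
    ("Pentamethyl benzaldehyde", "Aldehyde", (0.05, 0.00, 0.00, 0.95)),
    ("Ethyl palmitate", "Ester", (0.56, 0.00, 0.44, 0.00)),
    ("Ethyl hexanoate", "Ester", (0.43, 0.00, 0.41, 0.16)),
    ("3,5-Dinitrosalicylic acid", "Acid", (0.12, 0.00, 0.52, 0.36)),
    ("p-Tert-butyl benzoic acid", "Acid", (0.17, 0.16, 0.67, 0.00)),
    ("Trans-2-butenal", "Aldehyde", (0.14, 0.07, 0.00, 0.79)),
    ("p-Isobutylbenzaldehyde", "Aldehyde", (0.55, 0.25, 0.00, 0.20)),
    ("Methyl hexanoate", "Ester", (0.01, 0.00, 0.22, 0.77)),
    ("Hexyl hexanoate", "Ester", (0.00, 0.00, 0.02, 0.98)),
    ("3-Octanone", "Ketone", (0.00, 0.33, 0.49, 0.18)),
]


def assign_class(probabilities):
    """Return the class name with the largest membership probability."""
    best = max(range(len(CLASSES)), key=lambda i: probabilities[i])
    return CLASSES[best]


assigned = [assign_class(p) for _, _, p in TABLE2]
for (name, true_group, _), predicted in zip(TABLE2, assigned):
    print(f"{name}: {true_group} -> {predicted}")
```

Several rows are close calls (for example, ethyl hexanoate at 0.43 versus 0.41), which shows how near some of these molecules lie to the class boundary.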


Fig. 6. Results of interval ECVA analysis of the FT-IR spectral data. The bars show the number of misclassified compounds for each interval, and the horizontal line above the x-axis represents the number of misclassifications for the global model (5 for 13 LVs). Italic numbers are the optimal numbers of latent variables used for the interval models.
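The interval numbering behind Fig. 6 can be reconstructed under two assumptions that the text does not state: the 20 intervals have equal width over 4000-400 cm⁻¹, and they are numbered from the high-wavenumber end. The function names below are illustrative.

```python
def interval_bounds(k, n_intervals=20, high=4000.0, low=400.0):
    """Return (lower, upper) wavenumber bounds of interval k.

    Assumes equal-width intervals numbered from the high-wavenumber end.
    """
    if not 1 <= k <= n_intervals:
        raise ValueError("interval index out of range")
    width = (high - low) / n_intervals
    upper = high - (k - 1) * width
    return (upper - width, upper)


def points_per_interval(n_points=7200, n_intervals=20):
    """Number of spectral readings falling in each interval."""
    return n_points // n_intervals


for k in (7, 13, 15, 16):
    print(k, interval_bounds(k))
print(points_per_interval())
```

Under these assumptions each interval spans 180 cm⁻¹ (360 readings), and the grid reproduces the bounds quoted in the discussion of Fig. 6 for interval 7 (2740-2920 cm⁻¹) and for intervals 15 and 16 together (1120-1480 cm⁻¹). For interval 13 the same grid gives 1660-1840 cm⁻¹, whereas the discussion quotes 1360-1840 cm⁻¹.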

One reason for misclassification could be the use of 4 different methods of recording the IR spectra. However, the outliers do not belong to a specific type of spectral recording method. It is interesting to note that most of the misclassified compounds have been assigned to the ketone group. Fig. 4 indicates that the ketone samples are located in the middle of the samples of the other functional groups. Among the studied functional groups, ketones possess only the C=O group, whereas the remaining groups possess an extra discriminating group too (e.g., O=C-H in aldehydes, O=C-OH in carboxylic acids and O=C-OR in esters). In other words, the carbonyl functional group of the ketones is common to all carbonyl compounds.

To improve the prediction ability of the ECVA-LDA model, some different approaches were tried. In the first attempt, the transmittance data were converted to absorbance and then used as the input of ECVA. However, no improvement was obtained for any of the employed preprocessing methods, and in some cases the number of misclassified compounds was larger than that found using transmittance data. One may expect better results from absorbance spectra, as transmittance spectra are known to be non-linear compared to absorbance spectra. However, as noted previously, digital transformation of the pictured spectra added some noise to the transmittance data. Logarithmic transformation of the transmittance data affects the noise distribution (e.g., changing homoscedastic noise in transmittance into heteroscedastic noise [19-22]) and consequently affects the prediction results.

Interval ECVA (i-ECVA) was also used to see whether selecting some spectral regions from the whole spectral range would increase the prediction quality of ECVA. In i-ECVA, the whole spectral region is partitioned into intervals and ECVA is applied to each interval separately. The results of the 20-interval i-ECVA analysis of the IR spectra of the studied carbonyl compounds are presented in Fig. 6. No spectral interval gives prediction errors lower than those of the global model (i.e., that using the whole spectral region). This suggests that the model obtained from all spectral data performs best. A comparison between the numbers of misclassified compounds obtained from the different intervals reveals that intervals 7 (2740-2920 cm⁻¹), 13 (1360-1840 cm⁻¹), and 15 and 16 (1120-1480 cm⁻¹) resulted in the lowest prediction errors. These intervals are the spectral regions where the distinctive peaks of carbonyl compounds are observed (i.e., C-H and O-H vibrations, C=O vibrations and C-O vibrations, respectively). This suggests that although ECVA used all spectral regions for classification of carbonyl compounds, the regions of the distinctive peaks played the most significant role in the classification model.

3.4. Comparison between ECVA-DA and PLS-DA

It has been stated that in many situations ECVA-DA performs better than PLS-DA. To investigate the validity of this statement for our data set, the data were also analyzed by PLS-DA. The composition of the training and test sets was the same as for ECVA-DA. The change in the classification error (using 10-segment contiguous block CV) as a function of the number of PLS latent variables is depicted in the Supplementary material (Fig. S3). The lowest cross-validation misclassification error was obtained at 6 PLS latent variables, and at this number of latent variables the errors of calibration and cross-validation are the same. Thus, a 6-component PLS model was used for class prediction of the test set compounds. The distribution of the samples in the space of the PLS components is shown in Fig. 7. The class separation is better than that in the PC space but poorer than that in the ECV space. The classification results obtained by the 6-component PLS-DA model are reported in Table 1. Clearly, the classification errors of PLS-DA are higher than those of ECVA-DA: the percentages of misclassification obtained by PLS-DA are about twice those of ECVA-DA for both the training and test sets.
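The 10-segment contiguous block cross-validation used for both ECVA and PLS-DA can be sketched as follows. This is an assumed reading of "contiguous block": samples are taken in their stored order and cut into ten consecutive blocks of nearly equal size, each held out once. How the ECVA toolbox orders the samples before forming blocks is not stated in the text, and the function names are illustrative.

```python
def contiguous_blocks(n_samples, n_segments=10):
    """Split indices 0..n_samples-1 into consecutive blocks of near-equal size."""
    base, extra = divmod(n_samples, n_segments)
    blocks, start = [], 0
    for i in range(n_segments):
        size = base + (1 if i < extra else 0)
        blocks.append(list(range(start, start + size)))
        start += size
    return blocks


def cv_splits(n_samples, n_segments=10):
    """Yield (training_indices, validation_indices), one pair per held-out block."""
    for held_out in contiguous_blocks(n_samples, n_segments):
        held = set(held_out)
        training = [i for i in range(n_samples) if i not in held]
        yield training, held_out


# Example with the training-set size listed in Table 1.
blocks = contiguous_blocks(283)
print([len(b) for b in blocks])
```

Because the blocks are contiguous, the outcome depends on the row order of the data matrix: if samples are sorted by class, a held-out block may contain mostly one functional group.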

Fig. 7. Distribution pattern of the studied carbonyl compounds in the 3-dimensional PLS scores space: ( ) acids, (+) aldehydes, (o) ketones and ( ) esters.


3.5. Effect of scaling and mean-centering

As noted previously, data analyses were also performed on the preprocessed spectral data. The effects of scaling, mean-centering and autoscaling on the outputs of the PCA, ECVA and PLS methods are shown in the Supplementary section (Figs. S4-S6). The preprocessing method did not exert a significant effect on the results of the employed methods. Although moderate changes are observed in the relative positions of the samples in the plotted spaces for the different preprocessing methods, the class separation is not affected significantly. Moreover, when the ECVs and PLS scores were used as the input of a linear discriminant analysis method to build classification models, the percentages of misclassification were almost constant for the three preprocessing methods and were similar to those obtained from the original data.

4. Concluding remarks

The IR structural elucidation of carbonyl compounds (acids, esters, aldehydes and ketones) by means of chemometrics methods was investigated. PCA of the IR spectral data resulted in partial discrimination of the carbonyl compounds in the 3-dimensional factor space. However, analysis of the spectral data by ECVA gave high discrimination ability, so that among the 370 carbonyl compounds used in this study, 276 out of 281 in the training set and 82 out of 89 in the test set were correctly assigned to their own functional group. This study opens a new insight into the structural elucidation of organic compounds by IR spectroscopy. Research on extending the model to other functional groups, and then on finding an IR signature for each functional group, is under way in our research group.

Appendix A. Supplementary data

Supplementary data to this article can be found online at doi:10.1016/j.chemolab.2011.08.011.

References
[1] M.W. Crowther, NMR and IR spectroscopy for the structural characterization of edible fats and oils, J. Chem. Educ. 85 (2008) 1550-1554.
[2] B.H. Stuart, B. George, P. McIntyre, Modern Infrared Spectroscopy, John Wiley & Sons Ltd, Chichester, U.K., 1996.
[3] D.L. Pavia, G.M. Lampman, G.S. Kriz, J.R. Vyvyan, Introduction to Spectroscopy, Brooks/Cole Cengage Learning Ltd, U.S.A., 2009, p. 29.
[4] E.V. Anslyn, Supramolecular analytical chemistry, J. Org. Chem. 72 (2007) 687-699.
[5] A. Peinado, J. Hammond, A. Scott, Development, validation and transfer of a Near Infrared method to determine in-line the end point of a fluidised drying process for commercial production batches of an approved oral solid dose pharmaceutical product, J. Pharm. Biomed. Anal. 54 (2011) 13-20.
[6] H. Cen, Y. He, Theory and application of near infrared reflectance spectroscopy in determination of food quality, Trends Food Sci. Technol. 18 (2007) 72-83.
[7] N. Zandi-Atashbar, B. Hemmateenejad, M. Akhond, Determination of amylose in the Iranian rice by multivariate calibration of the surface plasmon resonance spectra of silver nanoparticle, The Analyst 136 (2011) 1760-1766.
[8] M. Blanco, J. Coello, J.M.G. Fraga, H. Iturriaga, S. Maspoch, Development and validation of methods for the determination of miokamycin in various pharmaceutical preparations by use of near infrared reflectance spectroscopy, The Analyst 124 (1999) 1089-1092.
[9] J. Backhaus, R. Mueller, N. Formanski, N. Szlama, H. Meerpohl, M. Eidt, P. Bugert, Diagnosis of breast cancer with infrared spectroscopy from serum samples, Vib. Spectrosc. 46 (2010) 173-177.
[10] M. Meurens, J. Wallon, J. Tong, H. Noël, J. Haot, Breast cancer detection by Fourier transform infrared spectrometry, Vib. Spectrosc. 10 (1996) 341-346.
[11] V.R. Kondepati, M. Keese, R. Mueller, B.C. Manegold, J. Backhaus, Application of near-infrared spectroscopy for the diagnosis of colorectal cancer in resected human tissue specimens, Vib. Spectrosc. 44 (2007) 236-242.
[12] X.L. Chu, Y.P. Xu, S.B. Tian, J. Wang, W.Z. Lu, Rapid identification and assay of crude oils based on moving-window correlation coefficient and near infrared spectral library, Chemometr. Intell. Lab. Syst. 107 (2011) 44-49.
[13] M. Brink, C.F. Mandenius, A. Skoglund, On-line predictions of the aspen fibre and birch bark content in unbleached hardwood pulp, using NIR spectroscopy and multivariate data analysis, Chemometr. Intell. Lab. Syst. 103 (2010) 53-58.
[14] C.G.E. da Silva, A.S.C. Machado, J.S. Oliveira, Monitoring of molecular transformations in acid-base reactions by evolving factor analysis of Fourier transform infrared spectra data, Talanta 43 (1996) 1443-1456.
[15] C. Marcott, A.E. Dowrey, J.V. Poppel, I. Noda, Infrared spectroscopic analysis of a series of blends of poly(lactic acid) and poly(3-hydroxybutyrate-co-3-hydroxyhexanoate), a bacterial copolyester, Vib. Spectrosc. 36 (2004) 221-225.
[16] R.G. Brereton, Chemometrics: Data Analysis for the Laboratory and Chemical Plant, John Wiley & Sons Ltd, England, 2003, p. 132.
[17] M. Barker, W. Rayens, Partial least squares for discrimination, J. Chemometrics 17 (2003) 166-173.
[18] L. Nørgaard, R. Bro, F. Westad, S.B. Engelsen, A modification of canonical variates analysis to handle highly collinear multivariate data, J. Chemometrics 20 (2006) 425-435.
[19] D.A. Cirovic, R.M. Jacobsen, R.G. Brereton, Instrumental noise distribution in electronic absorption spectroscopy, Anal. Comm. 33 (1996) 231-234.
[20] J. Toft, O.M. Kvalheim, Eigenstructure tracking analysis for revealing noise pattern and local rank instrumental profile: application to transmittance and absorbance IR spectroscopy, Chemom. Intell. Lab. Syst. 19 (1993) 65-73.
[21] H. Mark, J. Workman, Analysis of noise, Spectroscopy 17 (2002) 24-25.
[22] E. Limpert, W.A. Stahel, M. Abbt, Log-normal distributions across the sciences: keys and clues, BioScience 51 (2001) 341-352.
[23] E.I. Russell, L.H. Chiang, R.D. Braatz, Fault detection in industrial processes using canonical variate analysis and dynamic principal component analysis, Chemometr. Intell. Lab. Syst. 51 (2000) 81-93.
[24] H. Winning, N. Viereck, T. Salomonsen, J. Larsen, S.B. Engelsen, Quantification of blockiness in pectins - a comparative study using vibrational spectroscopy and chemometrics, Carbohydr. Res. 344 (2009) 1833-1841.
[25] L. Nørgaard, G. Soletormos, N. Harrit, M. Albrechtsen, O. Olsen, D. Nielsen, K. Kampmann, R. Bro, Fluorescence spectroscopy and chemometrics for classification of breast cancer samples - a feasibility study using extended canonical variates analysis, J. Chemometrics 21 (2007) 451-458.
[26] C.L. Hansen, F. van den Berg, M.A. Rasmussen, S.B. Engelsen, S. Holroyd, Detecting variation in ultrafiltrated milk permeates - infrared spectroscopy signatures and external factor orthogonalization, Chemometr. Intell. Lab. Syst. 104 (2010) 243-248.
