1. Back matter

Analyst, Volume 112, Issue 12, 1987, Pages 043-046
PDF (926 KB)
ISSN: 0003-2654
DOI: 10.1039/AN98712BP043
Publisher: RSC
Year: 1987
Data source: RSC
2. Front cover

Analyst, Volume 112, Issue 12, 1987, Pages 045-046
PDF (532 KB)
ISSN: 0003-2654
DOI: 10.1039/AN98712FX045
Publisher: RSC
Year: 1987
Data source: RSC
3. Contents pages

Analyst, Volume 112, Issue 12, 1987, Pages 047-048
PDF (347 KB)
ISSN: 0003-2654
DOI: 10.1039/AN98712BX047
Publisher: RSC
Year: 1987
Data source: RSC
4. Chemometrics in analytical chemistry. A review

Analyst, Volume 112, Issue 12, 1987, Pages 1635-1657
Richard G. Brereton
PDF (3127 KB)

Abstract:
ANALYST, DECEMBER 1987, VOL. 112, 1635

Chemometrics in Analytical Chemistry. A Review

Richard G. Brereton
School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK

Summary of Contents
1. Introduction
2. Sampling strategies
2.1 Sampling theory
2.2 Simplex methods
2.3 Factorial methods
3. Choosing and optimising analytical conditions
3.1 Choosing between different measurement techniques
3.2 Replicates
3.3 Analytical method development
3.4 Tuning instruments
3.5 Fourier methods
4. Getting the most out of instrumental signals
4.1 Information theory
4.2 Filters
4.3 Direct enhancement of analytical data
4.4 Maximum entropy methods
4.5 Fuzzy methods
4.6 Mixture analysis
4.7 Data reduction techniques
5. Processing sets of instrumental data
5.1 Classification and clustering
5.2 Relating different measurements
5.3 Pattern recognition
5.4 Time series analysis
6. Use and abuse of chemometrics
6.1 Importance of understanding all steps in the acquisition and processing of data
6.2 Abuse of chemometrics
7. The future
7.1 The automated laboratory
7.2 Mathematical chemistry
7.3 Education
8. Conclusion
9. References

Keywords: Chemometrics; analytical chemistry; methodology; review

1. Introduction

Over the last few years the term chemometrics has been increasingly used. This term was supposedly coined by the Swedish physical organic chemist S. Wold in 1972 when submitting a grant proposal for the application of statistical methods to chemical data. Together with the American analytical chemist B. R. Kowalski, Wold formed the International Chemometrics Society. Subsequently there have been several major reviews,1-4 ACS symposia,5,6 a National Bureau of Standards conference,7 a NATO school,8 a textbook9 and a series of monographs10 devoted to "chemometrics." Recently two international journals devoted to chemometrics have been established.11,12 There is, however, considerable debate as to the scope of the subject.
Similar methods to those employed by analytical chemists were used for many years by biologists,13,14 particularly numerical taxonomists. Arising from the biological literature, a large variety of clustering methods15 have been developed, originally as an aid to pharmaceutical and clinical researchers. Geologists have also seen the need for the development of methods for the interpretation of large multivariate datasets.16 Common to all these disciplines is the ability of modern analytical instrumentation to acquire large amounts of data rapidly. In a modern laboratory, measurements are cheap compared with samples: many parameters (e.g., chromatographic and spectroscopic peak intensities) can be obtained from each sample. Hence new approaches are needed to interpret what may be described as large multivariate data matrices, involving specialised statistical and computational methods. Thus chemometrics as a discipline can be regarded as originating as much in the biological and geological sciences as in chemical science.

Further debate has centred on the scope of the subject. Clearly chemometrics involves the application of statistical methods to analytical data. However, there is an increasing tendency to link statistical packages to expert systems, library searching, graphical and databasing routines. Hence many people argue that the subject should encompass a broad battery of methods. The central theme of chemometrics must, however, be the laboratory instrument, and how computational methods can increase the productivity of experiments. In this review the discussion will be largely restricted to statistical methods. There will be no attempt to derive results mathematically, but the reader is referred to citations where appropriate. This review will concentrate, principally, on gaining an understanding of how chemometrics can be useful in the modern analytical laboratory.

2. Sampling Strategies

A major task of the chemometrician is to help the experimenter use his time and the available instrumentation more efficiently. This problem applies equally to method development, e.g., optimising chromatographic conditions, and to increasing the useful throughput of data in a laboratory. The analyst samples a system. These samples may be physical entities (e.g., a lump of rock or a bottle of river water), analytical parameters (e.g., the resolution of two GC peaks under a given set of chromatographic conditions), or even portions of a trace (e.g., spectral peaks). Several statistical texts have been written over the last three decades aiming to help the experimenter optimise sampling methods.17-19 Biologists have needed efficient methods of sampling for many years, as typically they might need to deduce information about large naturally occurring populations in 20 or so experiments: so-called response surface methodology has been developed by biometricians.20 These and allied approaches can equally well be used in the analytical chemical laboratory.

2.1. Sampling Theory

Sampling theory in analytical chemistry has been described in detail by Kateman and Pijpers.21 The chemist frequently encounters continuous processes, often described as time series.
Examples are: Fourier transform spectroscopy (see below), where data about a spectrum are obtained by sampling magnetisation or another measure in time (the rate and method of sampling depend on the information required from the spectrum22); continuous industrial processes, where deviations from pre-set limits can result in poor quality of a product or sometimes industrial accidents23; naturally occurring time series, such as might be found in geochemistry when measuring compounds down a core,24,25 in environmental chemistry when monitoring seasonal or diurnal changes in composition26 and in clinical chemistry when monitoring biorhythms; and finally when following reaction kinetics by methods such as stopped-flow. It is clearly important to plan how frequently samples must be taken. This depends, of course, on the process under study and the desired information.

There are two principal reasons for sampling time series in chemistry. The first is to provide a description of a system. Such a case may arise in geochemistry, where it is often possible to come back later and take more samples. The second is to monitor or control a continuous process. Under such a circumstance it is not possible to return and take more samples. The first question the analyst should ask is "why is the system being sampled?" In some instances an exact future trend ("deterministic" component) is anticipated: this is normally a cyclical trend.
Cyclical time series are well known in electrical circuits and in spectroscopy, and a variety of methods called spectral analysis27-29 have been developed; similar approaches are used in economic forecasting30 and geological palaeoclimatic studies,31 where the effect of the changing orientation of the Earth's orbit around the sun is to change the sea surface temperature with time and so influence the proportions of various chemicals in geochemical samples of different ages. The most crucial points to note are, firstly, that data must be spaced regularly in time: it is common to compute an autocorrelogram27-29 or discrete Fourier transform,22,32-34 and these computational procedures require equally spaced data. It is possible to use Fourier transform methods to analyse experimental data that have been sampled irregularly, but under such circumstances it is necessary to interpolate the raw data. Naturally the method of interpolation (linear, spline, rectangular, etc.) has to be carefully considered beforehand, as has the distortion of the data introduced by the interpolation method. Secondly, the sampling frequency determines the range of distinguishable frequencies in the transform.22 A sampling rate of once per x min leads to a maximum distinguishable spectral frequency of 1/(2x) min^-1. This frequency is called the Nyquist frequency, and it can be shown that components oscillating at 1/(2x) + δ min^-1 will be indistinguishable from components oscillating at 1/(2x) - δ min^-1 in the resultant transform. A third consequence is in the resolution of components: the longer a system is sampled, the greater the information and so the easier it is to resolve close components.
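The Nyquist indistinguishability described above is easy to demonstrate numerically. The sketch below (plain NumPy; the sampling interval and frequency offset are arbitrary choices for illustration) samples two cosine components symmetric about the Nyquist frequency and shows that they coincide at every sample point.

```python
import numpy as np

# Sampling once every x min gives a Nyquist frequency of 1/(2x) min^-1.
x = 1.0                        # sampling interval in minutes (arbitrary choice)
f_nyq = 1.0 / (2.0 * x)        # Nyquist frequency
delta = 0.1                    # offset from the Nyquist frequency (arbitrary)

t = np.arange(0, 100, x)       # regularly spaced sample times

# Two cosine components symmetric about the Nyquist frequency...
s_above = np.cos(2 * np.pi * (f_nyq + delta) * t)
s_below = np.cos(2 * np.pi * (f_nyq - delta) * t)

# ...take identical values at every sample point, so no transform of the
# sampled data can tell them apart (aliasing).
print(np.allclose(s_above, s_below))  # True
```

Sampling twice as often (halving x) would double the Nyquist frequency and separate the two components.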
If the objective of a particular experiment is to distinguish close components (and also if quantitation is necessary) then it is crucial to continue sampling for a long time: if a system is sampled for y min, then there should be sufficient information to distinguish frequencies 2/y min^-1 apart.

Deterministic time series occur principally in spectroscopy or when studying natural cyclical events. In many instances the analyst has very little preconception as to how a system is likely to behave and wants to answer the question "is there any long-term trend?" One approach is to use autocorrelation functions.35 In such instances a high frequency of sampling is usually less crucial. Other methods include curve fitting. A long-term linear or exponential trend can be adequately studied by sparse sampling36; regularly spaced points are not even necessary in such a case, which involves least-squares curve fitting or deconvolution.

However, there is a further problem associated with many processes. Samples are often not discrete. In order to obtain enough material to analyse a process by instrumental methods, it is always necessary to sample over a defined and finite period of time. This period of time can be called a "lot," and the experimenter obtains a time average of the system over the sampling period. Clearly if a system changes non-linearly, the approximation that this average is equivalent to a discrete sample may break down. On the other hand, the smaller the sample the worse the signal to noise ratio, and so eventually instrumental errors counteract the advantage of small samples. Chemometric methods can be used to guide the experimenter as to the rate and size of samples. Many relevant methods were first developed by geologists37,38 using a process called Kriging.
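The underlying "sample coarsely, estimate the interpolation error, then refine" strategy can be caricatured in a few lines. This is only a toy stand-in for Kriging (which uses a geostatistical error model rather than the direct midpoint check used here), and the concentration profile is invented for illustration.

```python
import numpy as np

def profile(depth):
    # Hypothetical concentration profile down a core (stand-in for measurements).
    return np.exp(-((depth - 3.0) ** 2)) + 0.2 * depth

def refine_until(tol, n=5, max_n=1000):
    """Double the sampling density until linear interpolation at midpoints
    agrees with a direct 'measurement' to within tol (a crude surrogate for
    the error estimates a geostatistical method would supply)."""
    while n < max_n:
        depths = np.linspace(0.0, 10.0, n)
        mids = 0.5 * (depths[:-1] + depths[1:])
        interpolated = np.interp(mids, depths, profile(depths))
        err = np.max(np.abs(interpolated - profile(mids)))
        if err < tol:
            return n, err
        n *= 2
    return n, err

n, err = refine_until(0.01)
print(n, err)   # sampling density needed for the requested certainty
```

The loop stops at the coarsest density that meets the error tolerance, which is the economy the coarse-to-fine strategy aims for.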
If a geochemical map of an area is desired, samples have to be taken sufficiently frequently to enable a continuous picture to be built up; an efficient strategy is to first sample the system coarsely, estimate the errors in interpolated points and then increase the sampling frequency or size until a desired level of certainty is obtained. Obviously the optimum strategy must depend on the questions being asked about the system.

The problems faced when monitoring continuous processes are considerably different. Under such circumstances the experimenter wishes to know the deviation of an object from given limits: he needs to sample more frequently if the object appears to deviate. Quality control is a familiar process in engineering.21 Statistically, the sampling frequency can be predicted from the past history of the process. The controller needs to sample more frequently when the object is deviating from a pre-set mean, and will also wish to be alerted if problems are likely to occur.

2.2. Simplex Methods

Methods for sampling time series, where the parameters of a single variable are normally measured as a function of time, have been discussed under Sampling Theory. However, the analyst is often interested in more complex processes. These may be, for example, the yield of a synthetic reaction as a function of temperature, solvents, catalyst, etc.,39 the resolution of HPLC peaks as a function of flow-rate, solvents, etc.,40-42 or the performance of a chemical instrument as a function of magnetic field inhomogeneity.43 These problems require optimisation techniques. The traditional approach is to perform an experiment under certain conditions, measure the performance of one variable of interest (e.g., the resolution of two chromatographic peaks), change one experimental condition (e.g., flow-rate in chromatography) until the variable is optimised (e.g., the flow-rate that best resolves the peaks of interest) and then modify another variable (e.g., pH) until the response is further optimised, and so on. However, it is possible to model this process by considering the experiments as samples in multi-dimensional space, the dimensions being the number of independent parameters (e.g., flow-rate, pH, temperature). It is possible to understand the process of optimisation in more detail by envisaging the response (e.g., the resolution of two peaks) as a function of the variables. A response surface is a graph of this response against the experimental variables. Typically a three-dimensional surface is displayed. This gives the value of the response as a function of the change in two experimental conditions. Most response surfaces are not actually observed but are computed from the data. For example, chromatographic resolution may depend on four or five variables, so the observed response surface is multi-dimensional and cannot readily be visualised. Using regression analysis or similar techniques it is possible to predict how each variable affects the observed response. A computed surface for any pair of variables can then be displayed. Graphical methods are often used to examine the shape of the surface. Methods for studying these surfaces are well known in biometry.20 The chemometrician's role is to help the experimenter find the optimum response using the minimum number of experiments. The response surface is a multivariate surface; in other words, it depends on several variables, which are the experimental conditions.
The traditional method of finding this optimum, described above, may, however, completely miss the optimum: in many natural situations the response is not an independent function of each variable, hence it is not possible to separate the effects of each condition completely. An obvious example is a three-solvent HPLC system: clearly solvent molecules will interact with each other, so, for example, the resolution of two peaks in a combination of 0.1 M acetone, 0.1 M methylene chloride and a third solvent is not necessarily the sum of the resolution of the peaks in 0.1 M acetone, 0 M methylene chloride and the third solvent and the resolution of the peaks in 0 M acetone, 0.1 M methylene chloride and the third solvent (assuming the effects of the third solvent are negligible in this instance). Thus the "one factor at a time" approach in analytical chemistry is most unlikely to lead to an optimum, and the answer arrived at will depend, in part, on the order in which the factors were optimised.

An alternative approach is to measure the response at a large number of possible combinations of variables. For example, ten possible sampling points could be used for each of three factors, resulting in 1000 experiments. Clearly this approach is laborious and the experimenter is unlikely to have the time to perform all the necessary experiments. Hence a number of methods have been developed to help the analyst reach the desired optimum more efficiently. Simplex optimisation is a method for searching these surfaces,44-46 and is based on the observation that most experimental surfaces are likely to be relatively smooth and not to contain sudden discontinuities. The first step in the analysis is to define an allowed region within which to search for the response. For example, if the response is dependent on three concentrations (e.g., a three-solvent HPLC system), clearly the three concentrations are not entirely independent of each other as they must add up to 100%.
Other physical constraints of the system may also be taken into account. The second stage is to choose a step size, which is a reasonable amount by which to change each variable when searching for an optimum. The step size can be geometrically defined as a line in multi-dimensional space, the dimensions corresponding to each variable: scaling of variables (e.g., taking logarithms where appropriate) should be performed prior to defining this space. The third stage is to establish initial conditions. These are three combinations of factors at which the response is measured. These three points should be equidistant in the "response space." For a three-solvent system, the allowed response space is a triangle in three-dimensional space, with each side representing 0-100% concentrations of each of the three solvents. Each of the three sampling points should be separated by the step size. The response is measured at each of these three points (e.g., the resolution of two closely eluting peaks in a HPLC system under three different solvent regimes). The worst response is discarded and a new set of conditions is established equidistant from the best and the second best response. Of these three new conditions the worst is then discarded again and the procedure is repeated. If the new set of conditions yields a response that is worse than the two previous responses, the second worst response is discarded, and so on. The simplex terminates (i.e., has found an optimum response) when all possibilities of making extra measurements yield a worse response. Provided that the response surface is well behaved and the initial conditions and step size have been well chosen, simplex methods should yield the best conditions using a small number of experiments. The original applications of this method used a fixed step size. However, a more sophisticated technique is the modified simplex procedure.47,48 In this approach the step size is varied. Consider the case of two responses.
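As a concrete illustration of the fixed-step simplex, the sketch below searches a smooth two-variable response surface by repeatedly reflecting the worst vertex through the midpoint of the better two. The surface and all settings are invented for illustration, and the termination rule is simplified relative to the published procedures.

```python
import numpy as np

def response(v):
    # Hypothetical smooth response surface (e.g. resolution as a function of
    # two experimental conditions); its true optimum is at (2, 1).
    return -((v[0] - 2.0) ** 2 + (v[1] - 1.0) ** 2)

def fixed_step_simplex(f, start, step=0.5, iterations=60):
    """Basic fixed-step simplex for two variables: reflect the worst vertex
    through the centroid of the other two, keeping the simplex size fixed."""
    pts = [np.array(start),
           np.array(start) + [step, 0.0],
           np.array(start) + [0.0, step]]   # three initial conditions
    for _ in range(iterations):
        pts.sort(key=f)                     # pts[0] now holds the worst response
        centroid = (pts[1] + pts[2]) / 2.0
        reflected = centroid + (centroid - pts[0])
        if f(reflected) > f(pts[0]):        # keep the move only if it improves
            pts[0] = reflected
        else:
            break                           # no further improvement: terminate
    return max(pts, key=f)

best = fixed_step_simplex(response, [0.0, 0.0])
print(best)
```

For this well-behaved surface the search marches onto the true optimum at (2, 1); on real response surfaces the answer depends on the starting simplex and step size, as discussed above.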
If the third (and latest) point is worse than the previous two points a longer step size (in the opposite direction) is suggested; if the third point is better, clearly the simplex is homing in on an improved solution and a shorter step size is used. In the case of the third response being intermediate between the two previous responses, the step size is retained. A further sophistication is the super-modified simplex method,49 in which the new response position is not restricted to the normal equilateral triangle. There has been a great deal of literature comparing and contrasting various simplex methods.50-52 However, possibly the most important considerations are the choice of a good response space and initial conditions.

Probably the major use of simplex approaches is when it is most convenient to perform a single experiment at a time. For example, when tuning a scientific instrument it is only possible to record one spectrum at a time: it is not possible to perform more than one test on the system simultaneously. When several experiments can be run in parallel (e.g., control of batch processes), factorial methods as discussed below are often a more efficient use of the analyst's time.

2.3. Factorial Methods

Factorial approaches to experimental design contrast with simplex approaches in that several experiments can be performed simultaneously. Factorial methods are extremely common in biological research.53-55 There are many types of design, but the over-all philosophy is to sample a number of points on the response surface: regression analysis can then be used to fit this surface and find the maximum if there is one. There are many possible designs, but they are all dependent on an intuitive estimate of the shape of the response surface in advance. The region of most variability is that in which most experiments should occur. Several questions need to be asked. Firstly, how many factors (independent variables) are relevant to the experiment?
A four-factorial experiment is one in which four factors (which may be temperature, concentration, pH and pressure, for example) are to be varied. Secondly, what is the main region of interest? In most experimental designs variables are coded. Typically this coding is a logarithmic transformation: when only positive values of the directly measured parameters are physically meaningful (e.g., measurements of concentration, flow-rate, length, etc.) this transformation allows for meaningful negative values in the analysis, as negative values of the logarithm correspond to positive measured quantities. However, it is also important to estimate the "centre" of the experiment. This is the average value of each variable over the experiments and should ideally be the value at which the most variability in the response is expected. The process of coding can be thought of as a method of scaling. Thirdly, a model for the response surface should be set up. A typical model is a quadratic response model, but if other physical information is known about the dependence of the response on any of the variables then another model may be preferable.

Typically a design is presented to the analyst as a table of experimental conditions. Each experiment consists of a series of levels of each variable. Normally standard designs are expressed in terms of coded variables, which are related to actual concentrations according to the assumptions of the analyst. Each experiment is performed and the value of the response (which may be a yield from a reaction or the resolution of two chromatographic peaks) is recorded. Sometimes these responses are scaled further (e.g., logarithmically) according to the expected model. Then regression analysis is used to fit the response variables. Normally confidence intervals, goodness-of-fit and analysis of variance statistics guide the analyst as to how reliably the model has been obeyed.
The quantitative parameters can be used to yield two types of information. The first is the univariate response, i.e., how each single variable affects the response. This can often be important. In one recent example barley was grown in the presence of various toxic metals. The individual toxicity curves, which are the change in root and shoot length when the concentration of each metal of interest is varied alone, can be efficiently deduced from the response data.56 A second type of information is the interaction between parameters. Obviously if the effects of each variable on the observed response were independent there would be no interaction and no real need for elaborate optimisation methods. In practical terms a response (e.g., chromatographic resolution of peaks) is a complicated function of several parameters. Sometimes these interaction terms can be interpreted physically, such as in the competition of two metals for the same sites in a cell.56 Thirdly, the theoretical form of the response surface allows an accurate prediction of the minimum or maximum where relevant, and so can be used for optimisation in the same way as simplex methods.

There are a large number of designs and very careful thought must be given as to the optimum method, which often depends on the model, on the practicable number of experiments, on the coding of the concentrations and on the number of variables. Factorial methods have been used recently in HPLC41 and GC57 work. Standard methods involving spreadsheets can be employed to plan a small number of experiments to optimise a multi-solvent HPLC system, for example. Typically seven experiments may provide sufficient information to optimise three-solvent HPLC conditions. Often the response surface can be visualised either by three-dimensional graphics (for a two-factor example), or by contours (e.g., a triangular contour diagram for a three-solvent, two-factor chromatographic system).
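The design-then-regress workflow described above can be sketched for a two-factor case: measure a 3 × 3 full factorial in coded variables, then fit a quadratic response model by least squares. The "measured" response is a made-up function standing in for real data; with it, the fit recovers the univariate curvature and the interaction term exactly.

```python
import numpy as np

# Hypothetical two-factor design in coded variables (-1, 0, +1 levels),
# e.g. temperature and pH after coding: a 3x3 full factorial.
levels = [-1.0, 0.0, 1.0]
X = np.array([(a, b) for a in levels for b in levels])

def measured(a, b):
    # Stand-in for a measured response (e.g. chromatographic resolution)
    # with quadratic curvature and an a*b interaction term.
    return 5.0 - (a - 0.4) ** 2 - (b + 0.2) ** 2 - 0.5 * a * b

y = np.array([measured(a, b) for a, b in X])

# Quadratic response model: y = c0 + c1*a + c2*b + c3*a^2 + c4*b^2 + c5*a*b
A = np.column_stack([np.ones(len(X)), X[:, 0], X[:, 1],
                     X[:, 0] ** 2, X[:, 1] ** 2, X[:, 0] * X[:, 1]])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# The fitted surface exposes the interaction (a*b) coefficient alongside the
# univariate terms, which one-factor-at-a-time searching would not reveal.
print(np.round(coef, 3))
```

With the fitted coefficients, the stationary point of the quadratic surface can be located algebraically, which is the prediction-of-the-optimum use mentioned in the text.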
There have been several papers describing the geometry of such surfaces. Response surfaces can be very sensitive indicators of small differences in response between two different factors. Replicates can be used to compute upper and lower bounds to response surfaces.58

Factorial methods have the advantage over simplex approaches in that they enable the entire response surface to be constructed. They are not only used to find optima, but can be used to examine how variables interact, which conditions are the most crucial, and so on, and tend to give a much better idea of the over-all effect of experimental conditions on an analytical process. However, the experimenter must perform a set number of experiments, and the method can fail if the model or the coding of variables is inadequate. For simplex methods a prior model is unnecessary.

3. Choosing and Optimising Analytical Conditions

After deciding how to sample a system, the analyst wishes to choose the best method for measuring the system; he wants to compare instrumental methods, minimise the number of tests necessary and consider whether replicate analyses are required. He must then optimise the instrumental method. In all such instances there are established chemometric techniques.

3.1. Choosing Between Different Measurement Techniques

Procrustes analysis is an established technique59,60 recently used for the comparison of instrumental methods. The analyst is interested in what may be called a consensus configuration. For example, if the objective of a series of measurements is to detect outliers (e.g., in the quality control of food) then the question to be asked is "which instrumental method detects outliers most efficiently?" If several techniques are used, can the number of measurements be reduced because several provide similar information? Procrustes analysis is one approach to comparing the information in two or more sets of measurements.
For most applications in chemometrics the principal components of the sets of measurements for each technique are first computed prior to procrustes analysis. Principal component analysis (PCA) will be described in more detail later, but is essentially a method for reducing the dimensionality of data. For example, a naturally occurring mixture of compounds may be characterised by 20 GC peaks. However, if a set of samples (e.g., the chemical extracts from 40 bacteria) is analysed by GC, there are likely to be only certain principal trends, which may be dependent on species, growth environment, etc., that influence the appearance of the chromatographic trace. So instead of examining a table of 20 peaks, the main causes of variability could be described by two or three principal components. The analyst might be interested in whether GC or near-infrared analysis is the best method for classifying these bacteria. He can do this by computing the first few principal components for each of these methods. The next stage of the process can be graphically represented by rotation. Principal components can be visualised as lines in multi-dimensional space, and rotation merely involves rotating these lines. In procrustes analysis principal components from more than one dataset are rotated until they reach a maximum overlap. If, for example, two techniques measure identical properties, it is possible to superimpose graphically the rotated pictures, and the analyst knows he only needs to use one of the analytical techniques, the other being redundant. If techniques measure very different properties there will be very little consensus between methods. Quantitative estimations of consensus can be obtained by this approach. Thus it is possible to work out which analytical techniques measure which properties, and so increase efficiency.

Procrustes analysis need not be restricted to quantitative instrumental data.
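A minimal numerical sketch of the rotation step: given two sets of principal component scores for the same 40 samples (simulated here so that the two hypothetical techniques measure the same properties up to rotation and noise), orthogonal procrustes finds the rotation giving maximum overlap via a singular value decomposition. A small residual after rotation indicates strong consensus between the techniques.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical principal component scores: 40 samples by two techniques that
# in fact measure the same underlying properties, up to rotation and noise.
scores_gc = rng.normal(size=(40, 2))
angle = 0.7
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
scores_nir = scores_gc @ R + 0.01 * rng.normal(size=(40, 2))

# Orthogonal procrustes: the rotation maximising the overlap of the two
# configurations comes from an SVD of the cross-product matrix.
u, _, vt = np.linalg.svd(scores_nir.T @ scores_gc)
rotation = u @ vt
residual = np.linalg.norm(scores_nir @ rotation - scores_gc)
print(residual)   # small residual -> the two techniques are largely redundant
```

Here the residual is tiny relative to the spread of the scores, so one of the two simulated techniques would be redundant; unrelated techniques would leave a large residual however the scores were rotated.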
A good example is in bacteriology, where the investigator may apply binary tests for the classification of different strains and species, providing categorical information. An alternative approach is to use continuous data such as the concentration and occurrence of fatty acids.61 Which approach is most useful? A case study involving 56 binary tests and 14 fatty acids has been reported.62 Which tests are most useful? Which tests repeat essentially similar information? Clearly a great deal of the information is redundant, and procrustes analysis enables the investigator to decide which tests are most informative. This can be extended to commercial applications, for example, tests for the quality of jet fuels: how many are actually useful? Chemometrics helps us decide on the most efficient way forward.

At this point it is important to note that there are often many alternative approaches available to the chemometrician. For example, procrustes analysis has been much employed in sensory research,63 to link information such as chromatographic intensities to the judgement of a taste panel. An alternative approach is partial least squares (PLS).64 Canonical correlation65 could also be employed. Details of these methods and PCA are described under Processing Sets of Instrumental Data. It is very easy to become confused by the ready availability of software and the desirability of using a particular algorithm. For example, there are an enormous number of implementations of PLS, as some of the first work in the 1970s on chemical pattern recognition involved the exploitation of this method. However, generalised procrustes analysis, being a relatively recent addition to the toolbox of the chemometrician, is not widely available in user-friendly packages: the main implementation is in the GENSTAT programming language,62 which is certainly not generally accessible.
It is important, however, not to rely too heavily on readily available methods should they be inappropriate to the example in question.

A radically different approach to the comparison of techniques comes from information theory.66 Consider the problem of using analytical tests to distinguish between a series of closely related compounds; for example, there are several known isomers of gibberellins (plant hormones). Which test is most informative? The test may be chromatographic (e.g., GC or TLC) or spectroscopic, or a simpler means of assay. These hormones have very similar structures, so fairly sensitive analytical methods may be necessary. Information theory, discussed in more detail in Section 4.1, attempts to provide a measure of the information content of each technique. Most measurements are not infinitely precise, and the smallest distinguishable quantity is referred to as a "bit." This can most easily be visualised by considering the problems of analogue to digital conversion (ADC) in a computer, where continuous information is degraded to digitised quantities: if the ADC is 10 bits then there are 2^10 or 1024 precisely distinguishable levels of signal, no more and no less. With most continuous analytical methods, it is also possible to estimate the precision in measurement, the detection limit and the useful range over which practical measurements can be made. This can then be used to estimate information content, which is related to the concept of entropy, discussed in more detail in Section 4.1. One good example is a low-resolution mass spectrum, scanned over a given mass range. At each mass the entropy (or negative information content) can be calculated, and then summed over the entire mass range. Equivalently, a chromatogram is sampled at a certain number of discrete points, and so information from this analytical measurement can also be calculated. The less the entropy of the technique the more informative the technique.
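The bit-counting used in such arguments is simple arithmetic: the number of bits is the base-2 logarithm of the number of distinguishable levels. A sketch (the spectrum figures follow the crude mass spectrometry formulation discussed in the text):

```python
import math

def bits(levels):
    # Bits of information = log2(number of distinguishable levels).
    return math.log2(levels)

# A 10-bit ADC resolves 2**10 = 1024 distinguishable signal levels.
print(bits(1024))            # 10.0

# Crude spectrum estimate: 500 masses with ~16 distinguishable intensity
# levels (4 bits) per peak, assuming (unrealistically) independent peaks.
spectrum_bits = 500 * bits(16)
print(spectrum_bits)         # 2000.0

# Bits needed to index ~10**42 candidate organic structures.
print(math.log2(10 ** 42))   # ~139.5, i.e. about 140 bits
```

The comparison of 2000 bits against 140 bits is exactly the kind of argument criticised below: it stands or falls on the independence assumption.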
Thus two instrumental methods can be compared. This approach is also very useful when comparing qualitative (e.g., TLC) with quantitative (e.g., spectroscopic) techniques. However, the information theoretical approach has come under considerable criticism.67-70 A crude formulation of the information theoretical approach as applied to the mass spectrometry of unknown organic compounds is as follows. The analyst might scan over a mass range of 500. If the intensity of each peak is subject to about 3% error, then there will be around 16 distinguishable levels of information for each peak, or 4 bits, as 16 equals 2^4. Hence there are 2000 bits of information in the entire spectrum. There are thought to be about 10^42 possible organic compound structures, which could be represented by 140 bits of information. Similar arguments could be made for other techniques and measurements. However, this formulation assumes that the intensities at each peak are mutually independent: this is most unlikely. For certain types of measurement such as mass spectrometry, peak intensities at each mass are likely to be correlated, as compounds normally yield more than one ion; in chromatography, however, this is not so, as each peak arises from a pure compound. It is hard to estimate quantitatively the degree of likely correlation using information theoretical approaches, so care should be taken when reading information theoretical arguments. In Section 4.1 an approach to information theory that does take into account correlations between parameters is discussed, but this would be hard to apply to mass spectrometric data because of the inherent complexity of the problem. This method is an alternative to the multivariate approach to comparing measurement techniques, and considerably more work is likely to be performed in this field in the future.

3.2. Replicates

Most analysts routinely plan their work so as to take replicates.
The normal reason for replicate analysis is to provide confidence in the analytical method and sampling strategy. If replicate measurements are very similar then the analyst has confidence in his results; if not, he may well decide to modify the measurement procedure. Often standard deviations in measurement are cited. However, replicate information can often be used in far more subtle ways. One important use of replicates is when investigating inter-laboratory, inter-analyst and inter-machine variability. The ANOVA (or analysis of variance) method71 can be employed for the analysis of internal variability, and can be used to answer questions such as: which instrument is more reliable? A simple approach involves taking several replicates per instrument and comparing the variability on each instrument. A more sophisticated approach includes multi-dimensional ANOVA, where several measurements are replicated on the same system.72 Hierarchical or multi-level ANOVA is commonly employed in analytical chemistry. For example, consider the determination of the concentration of a compound in an extract by GC. It is possible to replicate the extraction procedure. For each extract replicate injections may be performed, and finally the process of integrating the chromatogram can be repeated several times. If each of these levels is replicated three times over then there will be 27 replicates in all. A hierarchical ANOVA analysis would try to separate out the systematic variability due to the extraction, injection and integration. An even more sophisticated approach is multivariate ANOVA (MANOVA).73 A chromatographer may be interested in the accuracy of the integration of several peaks. It is possible to perform a separate ANOVA analysis on the intensity of each peak independently, but the errors are likely to be multivariate. It is, of course, possible to go still further and perform hierarchical MANOVA.
There is very limited literature available on multivariate hierarchical ANOVA but this approach is certainly worth considering. Finally, ANOVA methods can be used to answer fairly basic questions. For example, given identical chromatographic data, which automated instrumental integration algorithms provide the most consistent answer across a variety of analysts? By transferring data from one instrument to another and getting the same analysts to integrate the chromatogram on each of two instruments, a feeling for the variability of software methods can be obtained.74 It is then possible to plan out whether more reliable data can be obtained by replicate analyses from one analyst or by using a different instrumental integration algorithm. One of the most important tasks of the analyst is to obtain quantitative instrumental data, so some empirical consideration of the errors in quantitation is beneficial before planning a project that involves extensive quantitative analyses. There are many possible ways of applying ANOVA methods, and the best use of such techniques probably depends on what is known about the data and what questions are to be asked of the data. For example, is ANOVA analysis to be used to find out which instrumental technique out of a range of techniques is most reliable? Or is it to be used to improve the reliability of a given analytical method? Finally, is it merely used to determine confidence limits in existing experimental results? As with many chemometric methods the choice of approach (hierarchical, multi-dimensional, multivariate, etc.) is subjective.
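The simple approach described above (several replicates per instrument, compare the variability) amounts to a one-way ANOVA, whose F ratio can be computed directly. A minimal sketch follows; the replicate values for the two instruments are invented for illustration.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of replicate groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = sum(sum(g) for g in groups) / n
    # Between-group sum of squares: how far each group mean sits from the
    # grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares: replicate scatter about each group mean.
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    df_b, df_w = k - 1, n - k
    F = (ss_between / df_b) / (ss_within / df_w)
    return F, df_b, df_w

instrument_a = [10.1, 10.3, 9.9, 10.2]   # hypothetical replicates
instrument_b = [10.8, 11.1, 10.9, 11.0]
F, df_b, df_w = one_way_anova([instrument_a, instrument_b])
print(F, df_b, df_w)
```

A large F (compared against tabulated F values for the given degrees of freedom) indicates that the difference between instruments exceeds the replicate scatter within each instrument.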
For example, the choice of whether to apply ten univariate ANOVA calculations to the errors associated with the replicate quantitation of ten GC peaks or whether to perform MANOVA depends on whether the errors associated with the quantitation of each peak are expected to be independent or whether the errors are multivariate: this in turn assumes some knowledge of the physics and the nature of the experimental errors. Hierarchical ANOVA (involving levels of error, e.g., a top level might be a laboratory, the next level the technique employed and the bottom level the analyst used) is preferred to multi-dimensional ANOVA (where the three factors in the example cited are treated as three dimensions of an error matrix) if the different sources of error are expected to be of a very different nature or significance: also, these two types of ANOVA analyses require differently designed strategies for replicate analysis, and so the choice may be dependent on the availability of manpower. After ANOVA the analyst can use replicate information to establish the accuracy of his methods and determine which techniques are the most efficient. Typical questions of interest are: how many replicates are necessary to increase the confidence of a measurement to a given degree of certainty? Which parameters of a system can most profitably be replicated? However, after being guided to the answers to such questions, the analyst usually throws away replicate information, preferring instead to use mean readings in subsequent analysis. Response surface methodology was discussed above. Consider a typical problem faced by the laboratory manager. He might have the resources to make 21 measurements. Is it better to observe 21 different points on the surface or to replicate seven points three times over? Can replicate information be used or should an average value of the response at each point on the surface be employed?
It is possible to obtain confidence limits for response surfaces using replicate information,75 so throwing away the replicate information and using only an average response loses information. Another common use of replicates is when spectra or chromatograms are replicated. This is common in pyrolysis studies,76 where analytical measurements are rapid, and so time is normally available to record the spectra of each sample several times. Thus replicate information can be used to improve multivariate separations.

3.3. Analytical Method Development

Having chosen one or more instrumental techniques and having decided on an experimental design, sampling and replicate strategy, it is important to maximise the information obtained from the instrument. Method development has been particularly important in chromatography.41,42,77 As discussed briefly in Sections 2.2 and 2.3, one example is HPLC optimisation. It is now very common for chromatographers to use simplex or factorial methods to optimise chromatographic separation. The success of such methods depends, though, on asking the right questions about the analytical technique in the first place. For example, is resolution or signal to noise the most important factor to be optimised? A function of how informative or useful a chromatogram is can then be developed, which may depend on the widths, separation and heights of selected peaks. Such models are obviously critically dependent on why the chromatogram is being run. After that it is important to select which factors are to be optimised, for example, flow-rate, temperature, solvent composition, and so on. Only after this can chemometrics be truly used to optimise conditions. An alternative approach might be to develop a physical model of how chromatographic peak shapes are affected by the various factors and to use calculus to determine the optimum response function.
However, this depends on having a very detailed understanding of the physical processes of adsorption on columns and has not, as yet, been widely accepted. Several recent applications in GC and HPLC have been reported. Optimisation is by no means restricted to chromatography, although this application has probably been the subject of most attention, especially because of the needs of the pharmaceutical industry. Atomic absorption spectrometry78 and flow injection analysis79 have recently been the subject of simplex optimisation studies. It is important to note that for certain designs the calculations can in fact be fairly trivial, and are possible using spreadsheets on microcomputers. It is not, therefore, essential to have a detailed understanding of statistics to employ experimental design as an aid to analytical method development.

3.4. Tuning Instruments

Similar approaches can be used where the instrument must be automatically optimised. This is common in spectroscopy: for example, the more homogeneous the magnetic field in an NMR spectrometer, the sharper the peaks are and the better the signal to noise ratio.43 Manually searching for the optimum often involves adjusting ten or more field controls, and can be a painstaking task for the operator. Autoshimming algorithms have been developed using simplex methods, whereby the homogeneity is automatically altered. Because only one parameter is measured, e.g., the width of a reference peak, and this measurement can be obtained in a few seconds, simplex methods often involving several hundred steps can be very effective here. Naturally the choice of simplex and factorial methods is dependent on how rapidly a given measurement can be performed. There are a variety of optimisation algorithms for instruments, and software can be fairly readily written to perform this function.

3.5.
Fourier Methods

So far, ways of optimising the appearance of sequential analytical data have been discussed. However, the chemometrician often wants to acquire information as fast as possible. Conventional approaches to chemical data acquisition are fairly inefficient. The chromatographer or spectroscopist records a large amount of information, but the positions, intensities and occasionally shapes of a selected number of peaks are all that are normally of interest. A typical spectrum or chromatogram, being the value of a response function with time, consists of several thousand digitised points. However, the analyst is typically interested in only a few parameters from the data. These might be the positions, integrals and occasionally widths of a few principal peaks. Thus only a couple of dozen numbers may be of interest and the remaining measurements can be discarded. Hence conventional approaches to the recording of instrumental data are highly inefficient because only a proportion of a trace contains useful information and the rest is noise between peaks. The majority of instrument time is normally spent acquiring unwanted noise. The Fourier solution22,80-82 is to acquire all the information all the time. This can be achieved by modifying instrumental conditions: in NMR all the peaks are excited simultaneously,83,84 whereas in infrared spectroscopy an interferogram is produced from the interference between two beams of radiation.85 Most spectroscopic techniques, including mass, UV - visible, Raman, and so on,80,81 can now be recorded in the Fourier transform (FT) mode. The key to this type of spectroscopy is that signal information is obtained more efficiently, hence the signal to noise ratio in a given time is higher.
Thus equivalent intensities can be obtained in NMR in a few seconds by Fourier methods, compared with several minutes by conventional approaches.86 Fourier methods are now commonplace in certain spectroscopic techniques, notably NMR. However, the raw data of the Fourier spectrometer contain information about all the signals of interest mixed up in each reading, and so the direct response from the instrument is not readily interpretable by the scientist. These raw data are normally referred to as time domain data, and a computational method called Fourier transformation is used to separate out the signals into a directly interpretable spectrum called the frequency domain. The latter resembles a directly recorded spectrum, and peak positions, intensities and widths are fairly visible. The Cooley - Tukey algorithm has made possible rapid on-line Fourier methods.87 Considerable extensions to the simple Fourier methods for acquiring data have been developed over the last decade, notably two-dimensional Fourier transform approaches.88,89 Fourier methods have not been widely applied in all spectroscopic techniques. There are two barriers. Firstly, there is the expense of instrumentation. Manufacturers have to be committed to mass-produce machines before they become affordable. Secondly, new instrumental methods need to be developed, and often these instruments are difficult to operate: naturally research is required to improve the tuning and operation of Fourier machines. Hence, although Fourier methods in spectroscopy are considerably more efficient than conventional frequency-sweep approaches, the choice must depend on the availability of suitable instrumentation and on how severe the signal to noise problem is. Fourier approaches to data acquisition are by no means restricted to spectroscopy, but can, for example, be used in electrochemistry90 and a variety of other instrumental techniques.
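The time domain/frequency domain relationship described above can be illustrated with a toy discrete Fourier transform: a decaying cosine (a crude stand-in for an NMR free induction decay) transforms into a peak at its oscillation frequency. The constants below are arbitrary choices for the sketch.

```python
import cmath
import math

def dft(x):
    # Plain O(N^2) discrete Fourier transform; the Cooley - Tukey FFT
    # computes the same result far faster.
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

N = 128
freq = 10  # cycles over the record (arbitrary)
# Time domain: all the signal information is mixed into every point.
fid = [math.exp(-0.02 * n) * math.cos(2 * math.pi * freq * n / N)
       for n in range(N)]
# Frequency domain: a directly interpretable peak at the signal frequency.
spectrum = [abs(v) for v in dft(fid)]
peak = spectrum.index(max(spectrum[:N // 2]))
print(peak)  # 10
```

The exponential decay in the time domain reappears as a broadened line shape around the peak in the frequency domain, which is the connection exploited by the filters of Section 4.2.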
What is important, however, is to distinguish between methods for acquiring data (the Fourier spectrometer) and methods for processing data by Fourier methods. There are many advantages in Fourier processing, and any data (e.g., chromatographic, titration, conventional spectra) may be treated by Fourier methods. This will be discussed in Section 4.

4. Getting the Most Out of Instrumental Signals

In the last section we discussed how to optimise instrumental conditions, how to choose the most appropriate techniques and how to plan the method for recording data. Once data are available, the analyst wants to interpret them. The first step is normally to obtain parameters from these data, normally quantitative parameters such as peak intensities, but also sometimes qualitative parameters such as whether a peak is present or whether a hump consists of one or more peaks. There are a huge number of approaches to this problem. However, it is very important to realise that data processing techniques are often an alternative to instrumental methods. For example, it is possible to resolve neighbouring chromatographic peaks either by computational deconvolution or by modifying experimental conditions: a similar choice must be made when attempting to optimise signal to noise ratios. Sometimes it is desirable to use both data processing and experimental design at the same time, but as the analyst's time and expertise are frequently limited, a choice must be made. It is, therefore, important to consider how information can be extracted from analytical data after they have been recorded.

4.1. Information Theory

The conventional approach to optimising data is to use curve fitting, multivariate analysis, filtering and similar data processing techniques. Sometimes a model of signals and noise is used, and foreknowledge of the system can be applied to advantage. However, a different approach comes from information theory.
Possibly the first major work in this area was by Shannon,91 who proposed entropy as a measure of information content. A great deal of further theory has been developed,92-95 especially by Eckschlager and co-workers. Information theory has been briefly described in Section 3.1, and a development based on the maximum entropy approach is discussed below in Section 4.4. Entropy is possibly the most important concept in information theory and is related to the concept of degeneracy. This concept is most easily understood by reference to discrete distributions. For example, an unbiased coin is tossed twice. Which distribution is most likely? Is it two heads, two tails or one head and one tail? There is only one way in which two heads can be achieved and one way in which two tails can be achieved, but two ways in which one head and one tail can be achieved. The latter distribution has higher degeneracy (two rather than one) and so is the most likely distribution. Degeneracy is quantified by entropy. Obviously such arguments can be extended to continuous measurements. A major aim of analytical measurements is the reduction of entropy, and the consequent increase in certainty (or reliability). For example, consider a quantitative measurement which might be the proportion of a solvent in a two-solvent system. Before the measurement there is complete uncertainty as to the proportion of the solvent, i.e., the proportion could be anywhere between 0 and 1.0. After measurement the uncertainty is reduced, i.e., instead of a flat distribution between 0 and 1.0, the new probability distribution will be centred on the mean reading with a standard deviation related to the accuracy of the reading. Thus the uncertainty or entropy of the system is reduced. If two measurement techniques are compared, that which reduces the entropy most is the more informative technique. These arguments can be extended further. The concept of bits of information introduced previously helps us to define entropy.
A bit can be related to the accuracy of a technique, so two resolved "levels" can be envisaged as two overlapping regions in measurement space. If sufficiently resolved these can be treated as discrete bits, and the distribution per bit can then be related to degeneracy. Information theory can be used to guide the experimenter. A good discrete example is the detection of a counterfeit coin that is known to be over- or underweight in a batch of coins,96 by a weighing procedure using no other form of knowledge. If there are 100 coins initially, is it best first to weigh 25 coins against 25? If these balance then the counterfeit coin is known to be in the other half of the coins. Using information theory it is possible to reach an optimum weighing strategy. Similarly, in analytical chemistry, is it best to take more replicates or to observe more points when calibrating a straight line? Using information theory we can define the information content we are aiming for and then calculate how many points are required given various constraints of the system. In an example cited by Kateman and Pijpers,97 at least two replicates per point are needed to fit reliably a straight line of five points to reduce entropy by a defined amount. Information theory can also be used for qualitative analyses. For example, consider the problem of distinguishing eight isomers from each other by thin-layer chromatography.98 A method that provides three bits of independent information (8 = 2^3) can do this. The RF values of each of the eight isomers can be measured for different eluting solvents. Consider initially a system where each compound has a discrete RF value, and several compounds elute at identical RF values. Let us assume that four compounds elute at RF = 0.2 and four at RF = 0.6 for a given solvent. Then in this solvent the information content is one bit.
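The information bookkeeping behind the counterfeit-coin example can be sketched in a few lines. The sketch assumes, for simplicity, that the counterfeit is known to be (say) overweight, so each weighing has three distinguishable outcomes: left pan heavy, right pan heavy, or balance.

```python
import math

# Locating one coin among 100 requires log2(100) bits of information;
# a three-outcome weighing can yield at most log2(3) bits.
bits_needed = math.log2(100)
bits_per_weighing = math.log2(3)
min_weighings = math.ceil(bits_needed / bits_per_weighing)
print(min_weighings)  # 5

def outcome_entropy(split, total=100):
    # Expected information (bits) from a weighing that splits the coins
    # into the given outcome groups; most informative when the split is even.
    return -sum((s / total) * math.log2(s / total) for s in split if s)

# Weighing 25 against 25 gives outcome groups 25/25/50, which is less
# even, and so less informative, than the near-optimal 33/33/34 split.
print(outcome_entropy([25, 25, 50]) < outcome_entropy([33, 33, 34]))  # True
```

The same accounting underlies the calibration example: each extra replicate or calibration point buys a quantifiable reduction in entropy, which can be compared with the target information content.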
Let us now take another solvent, in which the four compounds that eluted at RF = 0.2 now elute at RF = 0.6 and the other four compounds at RF = 0.8. The information content of this second solvent system is, again, one bit. However, there is no point running the compounds in both solvents as they both yield identical information. This can be formalised by saying that the information yielded by running the compounds in both solvent systems is the sum of the entropies minus the correlation coefficient (or 1 + 1 - 1.0), which takes into account the independent nature of the information. If both solvents had an information content of one bit but were not correlated exactly, then the summed information content would be between one and two. It is possible to extend the arguments and find a combination of solvent systems that yields three bits of independent information and so can uniquely distinguish the eight isomers. A more realistic example is where the RF value is not a discrete number but is modelled by a distribution. Then each isomer will probably have a unique central RF value modelled by a spread (often Gaussian) function. Information theoretical arguments can be extended to such continuous distributions. The main difficulty with information theory is the need to model a system. This is necessary as the simplest concepts are derived from the degeneracy of discrete distributions. Also, the problems of correlation are extremely difficult to take into account, especially in areas such as spectroscopy or chromatography where different peaks in a trace may be related to each other. Another problem is foreknowledge: sometimes related knowledge needs to be taken into account which is often intuitively obvious to the analyst (e.g., in spectroscopy or chromatography). A final area of debate is the definition of entropy, discussed below in the section on maximum entropy.
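The solvent-combination argument can be sketched by treating each solvent system as a partition of the eight isomers and computing entropies. The RF groupings for the first two solvents follow the hypothetical example above; the third, uncorrelated solvent is an additional invented one.

```python
import math

def entropy_bits(labels):
    # Shannon entropy (bits) of the partition a test induces on the compounds.
    n = len(labels)
    counts = {}
    for v in labels:
        counts[v] = counts.get(v, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

solvent1 = [0.2, 0.2, 0.2, 0.2, 0.6, 0.6, 0.6, 0.6]  # splits the isomers 4/4
solvent2 = [0.6, 0.6, 0.6, 0.6, 0.8, 0.8, 0.8, 0.8]  # the same split again
solvent3 = [0.1, 0.5, 0.1, 0.5, 0.1, 0.5, 0.1, 0.5]  # an uncorrelated split

both_12 = list(zip(solvent1, solvent2))  # fully correlated pair of tests
both_13 = list(zip(solvent1, solvent3))  # independent pair of tests

print(entropy_bits(solvent1), entropy_bits(solvent2))  # 1 bit each
print(entropy_bits(both_12))  # still 1 bit: the second solvent adds nothing
print(entropy_bits(both_13))  # 2 bits: independent information accumulates
```

Three mutually independent one-bit solvents would give the three bits needed to distinguish all eight isomers uniquely.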
However, information theory is likely to emerge as a powerful tool for the chemometrician which, if correctly applied, could solve many fundamental problems about which techniques are best and what is the most efficient manner in which to observe a system.

4.2. Filters

There is a huge amount of literature on filters,99-102 which were originally used in time series analysis. There are several problems that can be solved by filtering. An important problem is that of the resolution of peaks. Consider the example of two closely overlapping peaks in a chromatogram or spectrum, which might merge together to form a hump. Is an observed hump one peak or many? Obviously any single peak can have any shape, and if nothing more is known about the system there is no further evidence to resolve out overlapping peaks. However, as in the example of response surface methodology discussed above, reasonable intuitive assumptions are necessary before a simple model can be proposed. For example, it is likely that a single peak has a first derivative that crosses zero only once; more detailed knowledge such as noise and shape distribution can also be incorporated. If a hump consists of several peaks, what are the parameters (positions, intensities, etc.) characterising the individual components? There are many ways of dealing with this problem. One is by curve fitting or deconvolution.101 There are innumerable regression methods for this, which will be discussed below. However, in Fourier spectroscopy the spectrum (frequency domain) is related to a time domain by a Fourier transform. It can be shown that the broader the peak in a spectrum, the sharper (faster relaxing) its corresponding time domain signal is. Resolution is reflected in a spectrum by the separation of peaks: in the time domain resolution is indicated by interference patterns. The closer two peaks are in the frequency domain, the less frequent the interference fringes are in the time domain.
The more obvious the fringes, the easier it is to resolve two peaks. However, the broader two peaks are in the frequency domain, the faster decaying the corresponding time domain series becomes. If the time series decays very rapidly (broad peaks) the interference pattern becomes less noticeable. These arguments can be extended to suggest that the faster relaxing (time domain) or broader (frequency domain) peaks are relative to their separation, the lower the possibility of resolving the peaks. However, the resolvability of two peaks of identical separation increases as they are sharpened. This process is rather difficult to perform on the spectrum (although convolution offers a computationally slow method), but is simple in the corresponding time domain. If a decay mechanism in the time domain is described by e^(-at), where t is time and a is a positive constant, multiplication by a function of the form e^(bt), where b is a positive constant, reduces the apparent decay rate, and so sharpens the lines in the frequency domain spectrum, increasing resolvability. This is a simple example of a filter. However, this approach does have limitations. Superimposed on the signal is noise. In a typical time series the signal decays into the noise. Thus the end of a time series consists largely of noise. A filter of the kind described above amplifies the end of the time series, so increasing noise relative to signal, thus reducing the Fourier advantage. Hence a better filter is one in which the noise is reduced but the interference in the middle of the time series is amplified. A filter of the type e^(bt - ct^2), where b and c are positive constants, can have this property provided that b and c are correctly chosen. There are many other filters, but the considerations above lead us towards devising optimum filters.
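A numerical sketch of this trade-off follows, with illustrative (not optimised) constants: the e^(bt - ct^2) window amplifies the early and middle parts of a decaying signal while suppressing its noisy tail.

```python
import math

# A decaying time-domain signal exp(-a*t) multiplied by the filter
# exp(b*t - c*t^2): the exponential term slows the apparent decay
# (sharper frequency-domain lines), while the Gaussian term tames the
# tail instead of amplifying its noise. Constants are illustrative only.
a, b, c = 0.05, 0.03, 0.0005
t = range(200)
decay = [math.exp(-a * ti) for ti in t]
window = [math.exp(b * ti - c * ti * ti) for ti in t]
filtered = [d * w for d, w in zip(decay, window)]

print(filtered[50] > decay[50])    # early/middle points are amplified
print(filtered[199] < decay[199])  # the tail is suppressed, not blown up
```

A plain e^(bt) filter would instead grow without bound, amplifying exactly the region where the signal has already decayed into noise.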
Spectroscopists, particularly in the area of NMR spectroscopy,99 have worked extensively in this area and derived formulae for such filters depending on the signal to noise ratio and relaxation time of the signals. Resolution is optimised with minimum loss of signal to noise ratio. It is important to understand very carefully the reasoning behind the derivation of optimum filter formulae. A line shape function is necessary for the derivation of such a formula. In solution NMR spectra this function is generally represented by a Lorentzian or Cauchy distribution in the frequency domain, which can be shown to correspond to a negative exponential in the time domain. Hence an exponential filter will, if properly chosen, result in narrowing of lines without any distortion of their shape. However, such filters do not necessarily work properly for other line shapes, such as the Gaussians which occur in many other spectroscopic techniques. In fact, the formulae for optimum filters also assume a noise distribution, known signal to noise ratio and relaxation rate. The greater the knowledge of the spectral parameters, the better the filter. The best filter can be obtained when full knowledge is available about the spectrum. However, if everything is known about a spectrum there is clearly no need to process the data further. Therefore caution must be exercised when applying so-called optimum filters. Investigators have tried deriving optimum filters for various line shapes, but often with limited success. Lorentzian line shapes lend themselves to neat algebraic solutions, but many other line shapes such as Gaussians yield equations with no algebraic solutions, so the filters are essentially empirical. One fairly general class of "optimum" filter is the Wiener filter,103 where the power spectrum of the signal and of the noise need to be known, but apart from that the method is fairly general.
For a given noise and signal distribution it is possible to derive specific filters based on this criterion. It is not necessary, however, to acquire data in a Fourier spectrometer in order to use Fourier transform methods. It is important to distinguish between the Fourier advantage (obtaining data more efficiently) and merely using Fourier transforms to handle data (as is possible in chromatography). In the latter instance filters in the time domain are used largely for mathematical or computational convenience. It can be shown that there is a mathematical correspondence between techniques in the frequency domain (or domain of direct interest to the analyst, e.g., spectrum or chromatogram) and the time domain (or Fourier domain). In particular, the convolution theorem shows that convolution in one domain corresponds to multiplying the data by a function in the other domain.104 There have been several papers on the convolution of chromatographic data with a line shape function,105 but alternative methods have been developed in the Fourier domain,106 involving transforming the chromatogram to a time series equivalent to raw FTNMR or FTIR data. Fourier self-deconvolution (FSD)107 has been very effectively applied in Raman108 and IR109 spectroscopy. A difficult problem is when to work in which domain. No matter how data are acquired, there is always a corresponding domain related by a Fourier transform. A rule of thumb suggests that if peak shapes and noise distributions are well known in advance (e.g., in NMR), working in the time (or Fourier) domain is preferred to working in the frequency (or spectral) domain however data are acquired: computationally simple approaches such as filtering and zero-filling can be employed.
However, if line shape and noise functions are largely unknown, hard to model or irreproducible (e.g., in chromatography), then it is best to work in the frequency domain (or domain of direct interest to the analyst). It should be remembered that the definitions of frequency and time domain are sometimes reversed. Time-series analysis is a classical mathematical approach to the analysis of sequential data such as occur when observing the sales of a product with time, the change in concentration of a compound in a geological core or the oscillations in an electrical circuit. Using this terminology the spectrum or frequency domain is always the sequence of data of direct interest to the investigator (in the case of this review it may be either a chromatogram or a spectrum). Some analytical chemists reverse the definitions of time and frequency domain. There are several other simple uses of Fourier methods. The first is to remove base-line drift.110 Removing the first data point of a time series is equivalent to subtracting a sine wave from a spectrum, and is a computationally facile way of correcting a wavy base line. The second is as an alternative to interpolation. Zero-filling, or adding trailing zeroes to the time domain, is equivalent to a fairly sophisticated interpolation function in the frequency domain. Another use of Fourier methods is when analysing correlograms. The formal mathematical method of spectral analysis is defined as the Fourier transformation of a correlogram: it is possible to obtain the spectrum of long-term sales figures as well as of chemical molecules, so the mathematical term used to describe a general method employed by the chemometrician should not confuse the analyst. The autocorrelation function of a time series is the correlation coefficient of the time series with itself shifted by various amounts. Consider a time series of 1000 data points.
The correlation coefficient with itself will be unity. Then the time series is shifted and points 1 to 999 are correlated with points 2 to 1000, and so on. The more shifts (or lags), the more data points there are in the correlogram. However, as the number of lags increases, a correspondingly smaller proportion of the original data is used. For example, only half the data are used when the time series is shifted by 500 data points against itself. Therefore, it is necessary to balance the number of data points in the correlogram against the number of lag positions. This can be performed by reducing the weighting of the later lag positions. However, this is a similar concept to filtering. For example, multiplying the correlogram by a negative exponential function will perform this task. Correlograms are commonly employed in econometrics111 and occasionally used in analytical chemistry. A simple filter is a Bartlett window,112 which consists merely of multiplying the data by a square function, so that earlier data points are weighted by one and later data points by zero. This is equivalent to a cut-off filter. More elaborate filters such as Hanning windows,113 Hamming windows114 and gain functions115 are often employed. Chebyshev series have recently been effectively used. Parzen and Tukey filters (or windows) are often preferred to Bartlett filters in modern practice.116 In practice these filters increase both the apparent signal to noise ratio and the resolution in the resultant transform. It is also possible to compute confidence limits on the intensities of the peaks in the resultant transform if correlograms are used.117 However, caution should be exercised when performing such calculations. Inevitably some model of the data and noise distribution is required prior to applying window functions, and an incorrect application can lead to artifacts or meaningless confidence limits.
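A correlogram with a simple Bartlett-style cut-off (later lags, which use fewer points, are simply discarded) can be sketched as follows; the period-20 sine wave is an invented test signal.

```python
import math

def autocorr(x, lag):
    # Correlation of the series with itself shifted by `lag` points;
    # only len(x) - lag overlapping pairs contribute to the numerator.
    mean = sum(x) / len(x)
    num = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(len(x) - lag))
    den = sum((xi - mean) ** 2 for xi in x)
    return num / den

signal = [math.sin(2 * math.pi * i / 20) for i in range(200)]  # period 20
max_lag = 60  # Bartlett-style cut-off: later lags use too little data
correlogram = [autocorr(signal, k) for k in range(max_lag)]

print(round(correlogram[0], 2))   # 1.0: perfect correlation at zero lag
print(round(correlogram[20], 2))  # large positive: one full period later
print(round(correlogram[10], 2))  # negative: half a period out of phase
```

Fourier transformation of such a correlogram, rather than of the raw series, is the spectral analysis route described in the text.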
Fourier transformation of correlograms rather than of raw data is often preferred when correlated (or coloured) noise dominates the data. In spectroscopy most noise is of an uncorrelated (white, stationary) type and so straight Fourier transformation is perfectly acceptable. In chromatography correlograms can be employed more effectively to reduce noise levels. Cross-correlation of two time series with each other can also be performed. This technique has been used in correlation chromatography,118 where a reference peak is correlated with an observed chromatogram. This approach can provide useful information as to whether a given peak is actually present or not.

4.3. Direct Enhancement of Analytical Data

Methods involving Fourier transforms were discussed in the previous section. These approaches are probably most suitable when line shapes and noise are of known distributions. Obviously any data, however acquired, can be treated via Fourier methods. If, however, the shape of the signals is unknown, other methods often yield better results. A simple method for resolution enhancement is to take derivatives.119,120 Several close peaks add together to form a hump which has various inflection points. These inflection points are solutions to the first or higher derivative equations, and so by taking high enough derivatives indefinitely close peaks may theoretically be resolved. This process is limited as derivatives also amplify noise levels. More sophisticated approaches involve smoothing at the same time as computing the derivatives. The most widely used method is the Savitzky - Golay method.121,122 A polynomial of order M is fitted through N points using a weighted least-squares error criterion, so each data point is replaced by this moving-average smoothing function.
The Savitzky-Golay method uses a look-up table of pre-calculated coefficients: instead of performing a simultaneous calculation of derivatives plus fitting a smoothing function (which is time consuming), a weighted sum of N points is computed, the weights being the coefficients, which can be performed very rapidly. This approach is common in chromatography. It is not, of course, necessary to compute derivatives, and the Savitzky-Golay method can also be employed as a simple, rapid smoothing function. Curve fitting is another conventional approach to determining the numbers and intensities of peaks. There have been various recent papers on non-linear curve fitting,123,124 but essentially iterative approaches to determining the numbers and shapes of peaks are assessed via a statistical test such as χ2. Sometimes rapid filters "on-the-fly" are required. These are called "recursive" filters; in analytical chemistry the Kalman filter125,126 has been a particularly successful real-time filter. As data are recorded, the nature of the filter changes; hence the directly displayed data are already smoothed. This is in contrast to conventional methods where the data are first recorded and later filtered. Finally, the approximation of chromatograms by Chebyshev polynomials should be noted.127 These can be used to fit the experimental data, from which the new chromatogram is then drawn. There is a massive literature on the merits of curve fitting and alternative empirical methods for directly enhancing instrumental signals. There is no single optimum method that can be applied over a wide variety of examples.
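The look-up-table idea can be made concrete with a short sketch: the convolution weights come from a least-squares polynomial fit over the window, computed once and then applied as a rapid weighted sum (an illustration in modern numerical code; the noisy test trace is invented):

```python
import numpy as np
from math import factorial

def savgol_weights(window, order, deriv=0):
    """Pre-computed Savitzky-Golay convolution weights.

    A polynomial of the given order is fitted through `window` points
    by least squares; row `deriv` of the pseudo-inverse of the design
    matrix gives each point's weight in the fitted value (or the
    deriv-th derivative) at the centre of the window.
    """
    half = window // 2
    offsets = np.arange(-half, half + 1)
    # Columns of A are offsets**0, offsets**1, ..., offsets**order.
    A = np.vander(offsets, order + 1, increasing=True)
    return np.linalg.pinv(A)[deriv] * factorial(deriv)

# For a 5-point quadratic window the smoothing weights are the
# classical (-3, 12, 17, 12, -3)/35; smoothing is then one weighted
# sum per data point.
w = savgol_weights(5, 2)
noisy = (np.sin(np.linspace(0, 3, 200))
         + 0.1 * np.random.default_rng(0).standard_normal(200))
smoothed = np.convolve(noisy, w[::-1], mode="same")
```

Passing `deriv=1` or `deriv=2` to the same function gives smoothed-derivative weights, which is the combined smoothing-plus-differentiation discussed above.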
The most appropriate method depends on what questions are to be asked of the data, and how much is already known about the data. For example, if the main objective of the work is to obtain exact quantitative parameters such as peak integrals, methods such as taking derivatives may distort or lose such information, whereas curve fitting should preserve it. If peak shapes are known in advance (as is often so in many forms of spectroscopy but rarely the case in chromatography), deconvolution using an exact physical model can be applied. There is a need for more work on method selection in this area.

4.4. Maximum Entropy Methods

These approaches derive from information theory, and one of the first spectroscopic applications was in image processing,128 using infrared images of the sky. Recently the technique has been applied to various types of analytical data including NMR,129,130 X-ray131 and Raman spectroscopy.132 It is possible to define negative information content by entropy,91,133 as discussed in Section 4.1. The essence of such an approach to the enhancement of spectroscopic data is that it is possible to fit analytical data to a large number of models. It is also possible to derive a statistical test for goodness of fit, and very many solutions will give an equivalent goodness of fit statistic. The maximum entropy criterion selects the solution with least structure (most entropy) and thus least possibility of artifacts. When discussing information theoretical approaches to optimising measurement in the previous section, the principal concern was with minimising the entropy (or uncertainty) of the measurement process, so increasing the reliability of measurements. However, if there are several different spectra all of which are equally consistent with observed data, the solution with the maximum entropy or highest degeneracy is the most likely solution.
For example, consider the distribution of probabilities when a coin is tossed twice. As discussed previously, the most likely distribution is the one in which the probability of obtaining heads and tails is equal, which has higher degeneracy than the probability of obtaining only heads or only tails. In the absence of other knowledge (e.g., being told the coin is biased) we choose the solution with most degeneracy or least structure. This is typical of solutions in analytical chemistry. The measurements we make provide only a certain amount of information about a system: from limited measurements the analyst attempts to deduce more information about the system, and should choose the most likely solution, which is that with maximum entropy. Probably the best known application of maximum entropy in analytical chemistry is in NMR signal processing. Various approaches to filtering have already been discussed. These involve multiplying data by a windowing or filtering function. In turn this process involves assumptions about the structure of the data that may be unjustified. Maximum entropy methods reduce the need for assumptions about the data and, for equivalent foreknowledge, yield the most likely solution. An analytical trace can be thought of as a probability distribution. In the example of the toss of a coin there are only two data points (corresponding to heads and tails), but in a typical spectrum or chromatogram there may be several hundred or thousand data points, representing the number of sampling points in the spectrum or chromatogram. In the absence of other data the analyst has no reason to assign any specific structure to the trace and so should choose the solution with the least structure, which is a flat distribution. It can be shown that this distribution has maximum entropy and so is the most likely distribution.
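The coin-toss argument is small enough to compute directly, and makes the link between degeneracy and Shannon entropy explicit (a minimal arithmetic illustration):

```python
from math import comb, log

# Degeneracy of each outcome class for two tosses of a fair coin:
# "one head and one tail" can occur in two ways, while all-heads and
# all-tails can each occur in only one way.
degeneracy = {heads: comb(2, heads) for heads in range(3)}

# Shannon entropy -sum(p log p): the flat distribution has the most
# entropy (least structure), which is the selection criterion argued
# for above.
def shannon_entropy(p):
    return -sum(pi * log(pi) for pi in p if pi > 0)

h_flat = shannon_entropy([0.5, 0.5])     # equals log 2
h_biased = shannon_entropy([0.9, 0.1])   # lower: more structure
```

The flat distribution maximises the entropy, mirroring the claim that, absent other knowledge, the least structured trace is the most likely one.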
In practice, some evidence for peaks is obtained from the measurement process, and this evidence is used to impose structure on the maximum entropy solution. Maximum entropy methods are normally implemented iteratively. Consider, for example, a time series (e.g., raw Fourier NMR data). This time series consists of signals plus noise. The signals can be represented by a spectrum consisting, for example, of 1000 sequential intensity (magnetisation) readings. The analyst wants to know where the peaks are in the spectrum and what their intensities are. He would like an answer of the form "there are three peaks of intensities 50, 30 and 20 centred at positions 400, 620 and 800, respectively." The maximum entropy implementation is as follows. A noise-free spectrum is guessed. This might consist of equal intensities at each of the 1000 points in the spectrum. This is inverse Fourier transformed to give a time series. The time series is then compared with the true (observed) time series by calculating the residuals between these two data sets. If, for example, the spectrum actually did consist of 1000 peaks of equal intensities, the residuals would represent pure noise. Assuming white, stationary noise with a normal distribution, a statistical test such as the χ2 test can be used to test how well the residuals conform to this random distribution. The closer the residuals are to this distribution, the lower the value of χ2. This statistic can be reduced by computing the change in χ2 as the intensity at each spectral frequency is altered. For example, the value of χ2 might be reduced as the intensities at positions 1-20 in the spectrum are reduced, but increase if the intensities at positions 21-30 are reduced. This would suggest a peak between positions 21 and 30. However, for any given value of χ2 there are a large number of solutions and, in effect, if there are 1000 normalised frequencies, χ2 can be regarded as a function in 999-dimensional space.
So the chemometrician is faced with a fairly severe optimisation problem. The maximum entropy criterion is then employed. For any given value of χ2 the most likely solution is that with maximum entropy, or least possibility of artifacts. This is chosen as the new search direction and the iteration continues until the target value of χ2 is reached. Of course, there is no need to use χ2, and other statistics (according to what is expected or known of the noise distribution) may be employed. The iteration rate can also be altered. Finally, it is possible to include models of the data. For example, consider a spectrum consisting of lines of a known shape (distribution function). Each line would normally be characterised by a width, a position and an intensity. If the widths are known, the lines can then be reduced to single points in the spectrum, and so are completely resolved. This model building requires a further definition of entropy which differs from the original Shannon form.129 Intuitively this is fairly reasonable. The weakness of normal arguments about degeneracy is that other information about a system is ignored. For example, it might be known that a coin is biased. This then changes the likelihood of various probability distributions. Frequently in analytical chemistry the analyst does have further knowledge about the system. There are several other definitions of entropy,134,135 and much thought must go into the physical process being studied before trying to maximise the entropy of a system. There are various misunderstandings about this method. Maximum entropy itself is a well established criterion in the mathematical literature. However, the key to the method is an effective computational implementation. Because of the difficulties of convergence only one group, that in Cambridge,128 appears to have developed an effective computational implementation which properly converges during the iterations.
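The iteration described above — trial spectrum, inverse Fourier transform, residuals, χ2, adjustment under an entropy penalty — can be sketched numerically. The following is a toy illustration only, not the published Cambridge algorithm: the peak positions, step size and entropy weight are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma = 256, 0.05
true = np.zeros(n)
true[[40, 62, 80]] = [5.0, 3.0, 2.0]           # three hidden peaks
observed = (np.fft.ifft(true, norm="ortho")
            + sigma * rng.standard_normal(n))   # noisy time series

spectrum = np.full(n, true.sum() / n)           # flat (maximum entropy) guess
for _ in range(500):
    # Compare the trial spectrum's time series with the observed one.
    residual = np.fft.ifft(spectrum, norm="ortho") - observed
    chi2 = np.sum(np.abs(residual) ** 2) / sigma ** 2
    if chi2 <= n:          # target: about one noise unit per point
        break
    # Downhill direction for chi2, mapped back to the frequency
    # domain (overall scale folded into the step size below).
    grad = np.real(np.fft.fft(residual, norm="ortho"))
    # A small entropy term resists structure the data do not demand.
    p = np.maximum(spectrum, 1e-12) / spectrum.sum()
    grad = grad + 0.001 * np.log(p)
    spectrum = np.maximum(spectrum - 0.3 * grad, 1e-12)
```

When the loop stops, the three largest intensities sit at the true peak positions while the rest of the trial spectrum stays near the flat background, illustrating how structure is imposed only where the residuals require it.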
Maximum entropy methods are distinguished from conventional filters in that minimal assumptions are made about the data. A typical filter involves multiplying data by a function which introduces structure that is not part of the raw data. In turn, this structure can result in artifacts, or peaks that are not actually present in the data but appear in the transform if the filter is misapplied. The problem of foreknowledge and optimum filters has already been discussed. For equivalent assumptions about the data the maximum entropy method, if properly implemented, can always produce the most likely solution. Maximum entropy has limitations in that it is not easy to incorporate complicated knowledge about a particular system. For example, certain types of knowledge might very readily be coded into an expert system which could then be used for conventional data processing; this would not be easily implemented inside the maximum entropy algorithm. Another problem with maximum entropy is that it is computationally slow, and so the extra information obtained using this data processing approach should be weighed against the advantages of using extra instrument time or optimising the data acquisition system. In an application such as 13C NMR or infrared astronomy, where the signal to noise ratio is normally limited, maximum entropy probably has advantages. For spectroscopic techniques where data can be acquired very rapidly, maximum entropy methods are probably of little use. Certain recent advances in the application of maximum entropy methods have been made, principally in the area of FT NMR. In particular it is possible to combine Laplace transforms with Fourier transforms to produce a two-dimensional map from one-dimensional data136; the method has been extended to negative peaks,137 two-dimensional Fourier transforms138 and improved integration.139 Wider applications of this method to analytical data are likely to be seen in the future.

4.5. Fuzzy Methods

Another approach recently applied to analytical data derives from fuzzy theory.140,141 All the methods described in previous sections, apart from the information theoretical approaches, deal with matrices of exact numbers. Noise is modelled as a distribution superimposed on top of signals, but the actual numbers recorded by an instrument are assumed to be precise. A better model is that readings recorded by an instrument are subject to error. A fuzzy model assigns a membership value between zero and one to each observation. This number, in turn, can be modelled by a membership function. Consider, for example, linear calibration. There are likely to be standard errors in the response and also possibly in the measurement of the independent variable. These can be modelled by error distribution functions such as Gaussians. If a reading is recorded as (x1, y1), where y is the response and x the independent variable, then the point (x1, y1) is assigned a membership value of one. However, there is a finite probability that the true reading was, in fact, (x1 + δ1x, y1 + δ1y). This probability can be given by the membership function, which can be modelled as probability contours around the observed points. The criterion of the best fit straight line is normally given by minimising a (weighted) least-squares error estimator. In fuzzy theory the criterion is obtained by maximising the total membership function. The effectiveness of this approach has been demonstrated in linear calibration.141 The weakness of least-squares methods is that they are strongly influenced by outliers, which contribute disproportionately to the summed least-squares estimator, whereas the fuzzy approach, which sets the membership value of an outlier close to zero, is normally more robust.
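The contrast between the two criteria can be sketched for a straight-line calibration with one outlier. This is an illustration of the idea only, not the algorithm of ref. 141: the Gaussian membership width, the grid search and the data are all invented.

```python
import numpy as np

x = np.array([0.0, 1, 2, 3, 4, 5])
y = np.array([0.1, 1.0, 2.1, 2.9, 4.0, 9.0])   # last point is an outlier

def total_membership(slope, intercept, width=0.5):
    # Gaussian membership in [0, 1] for each point, given a candidate
    # line; an outlier's membership falls towards zero, so it barely
    # pulls the fit.
    r = y - (slope * x + intercept)
    return np.exp(-0.5 * (r / width) ** 2).sum()

# Crude grid search for the line maximising total membership.
slopes = np.linspace(0.5, 2.0, 151)
intercepts = np.linspace(-1.0, 1.0, 81)
score = np.array([[total_membership(a, b) for b in intercepts]
                  for a in slopes])
i, j = np.unravel_index(score.argmax(), score.shape)
fuzzy_slope = slopes[i]

lsq_slope = np.polyfit(x, y, 1)[0]   # dragged upwards by the outlier
```

The least-squares slope is pulled well above the trend of the five good points, while the membership-maximising slope stays close to it, which is the robustness claimed above.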
Obviously the effectiveness of this method depends somewhat on the prior model of the noise function and, as is usual for most chemometrics methods, improves the greater the foreknowledge available about the structure of the data. However, fuzzy methods take into account variability in both the x and y directions, as noise is modelled in a two- (or more) dimensional error space, and so should be particularly relevant where there is uncertainty in sampling intervals. Fuzzy methods have also been compared with conventional multivariate approaches for the classification of samples140 and appear, on the example data chosen, to classify samples correctly more often. Much probably depends, however, on the nature of the noise in the data and whether the fuzzy approaches correctly take the noise distribution into account.

4.6. Mixture Analysis

This area has recently been the subject of a great deal of attention. Typical analytical data consist of a spectrum or chromatogram arising from a mixture of compounds. The analyst may want to calculate the amounts of each compound in the mixture. This problem is by no means trivial. Fourier and deconvolution approaches have been discussed in earlier sections. Deconvolution can be applied most effectively when precise models for curve shapes are known, which is often so in areas such as NMR and electronic absorption spectroscopy but not in chromatography. Three other approaches to mixture analysis will be discussed: approaches based on multivariate calibration as promoted by Kowalski, Vandeginste, Wold and others; approaches based on mixture analysis developed by Windig and co-workers in the area of pyrolysis mass spectrometry; and finally methods based on quantitative library searching developed by Delaney and co-workers. When attempting to quantify the amount of a compound in a mixture, the analyst must use a different strategy according to the amount of information available and the method of detection.
Conceptually the simplest example is that of univariate calibration. Consider the problem of measuring the concentration of a known compound in a naturally occurring sample. If the pure compound is available it is possible to construct a calibration graph by measuring the response (e.g., absorbance) as the concentration of that compound is changed. If the response changes linearly, then linear regression may be used to calculate the best-fit parameters and confidence limits for the observed univariate calibration graph. Even this comparatively simple procedure has been the subject of a great deal of discussion,142 which will not be pursued in this review. A more complex and realistic example, however, is when the matrix interferes with the sample. For example, it might be desired to measure the concentration of a metal in a crude oil by absorption spectroscopy. Because the matrix (i.e., the oil) also influences the absorption spectrum, the observed responses for equivalent concentrations of metal on its own and metal trapped in the oil will be different. This problem can be solved using the standard addition method (SAM),143 in which known amounts of the pure analyte are added to the sample. Provided that a pure standard is available, the proportions added are known and the response is linear, it is possible to use this approach to determine the concentration of analyte in the matrix. In mixture analysis the amounts of several compounds in a sample are of interest. These can only be measured if several detectors are used. For example, if a mixture consists of four components, a minimum of four measurements of the response must be taken. In practice the number of detectors far exceeds the number of components in the mixture, as each sampling point in a chromatogram or spectrum can be regarded as a detector. Instead of univariate calibration, multivariate approaches must be used.
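A minimal multivariate sketch makes the idea concrete: with additive linear responses, the columns of a matrix hold the pure-component spectra at a handful of "detectors" (sampling points), and least squares recovers the component amounts from a mixture spectrum. All numbers here are invented for illustration.

```python
import numpy as np

# Columns of S: pure-component spectra of two compounds measured at
# four sampling points. A mixture with concentrations c gives S @ c.
S = np.array([[1.00, 0.10],
              [0.80, 0.30],
              [0.20, 0.90],
              [0.05, 1.10]])
c_true = np.array([2.0, 0.5])
mixture = S @ c_true

# Least squares recovers the concentrations; with more detectors than
# components the system is overdetermined, which averages out noise.
c_est, *_ = np.linalg.lstsq(S, mixture, rcond=None)
```

This is the calculation that multiple linear regression performs when the pure spectra (and hence the matrix effect) are known; when the matrix effect is unknown, generalised standard additions replace the known columns with spiked measurements.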
Multiple linear regression144 is used where the matrix effect is known or negligible, and the generalised standard addition method (GSAM)145 where the matrix effect is unknown. These two approaches are merely the multivariate equivalents of the univariate examples. There are, however, so many ways of obtaining suitable data for GSAM (how should the mixtures be varied to calibrate the response most efficiently?) that experimental design is often employed to guide the analyst in this area. The approaches outlined assume that the response of the analytes is linear and additive, i.e., the raw spectra or chromatograms can be added together. In practice, this is by no means always so. For example, fluorescence spectra are rarely additive, because the various components interfere with each other. In such an instance the calibration graph becomes non-linear and an exact algebraic model breaks down. The methods of partial least squares (PLS)146 and principal component regression have been used in such instances. PLS will be discussed in more detail later. A different situation arises when the pure components are not available. For example, a set of extracts might be available; these could be of geological sediments. It would be very likely that certain principal compounds are present in all these extracts, but the pure isolated compounds are not available or possibly not even identified. In such an instance the set of spectra or chromatograms is analysed by target transform factor analysis (TTFA). Essentially the series of measurements forms an n by m matrix, where n is the number of samples and m the number of measurements (data points along a spectrum). The m measurements can be modelled by a number of factors, ideally corresponding to the number of pure compounds in the mixture.
TTFA aims, firstly, to determine how many factors (i.e., how many components) are in the mixture; secondly, to determine what these factors are (i.e., the true spectra of each component); and, thirdly, to determine the proportion of each factor in each mixture spectrum (i.e., the amounts of each compound in each mixture). There are an enormous number of methods for determining the optimum number of components, of which two of the most cited are cross-validation and the Malinowski indicator function. These will be discussed in more detail in Section 5.3. A further sophistication is iterative target transform factor analysis148 (ITTFA). Above we considered first a univariate system with a single response. We then discussed methods for a multivariate response, as occurs when observing several data points along a chromatogram or spectrum. Even more informative are two-dimensional techniques such as diode array HPLC and gas chromatography - mass spectrometry. Some very sophisticated methods have been used in these areas,149,150 involving separating up to seven components in a mixture. There is, however, still considerable debate as to the best method to use in mixture analysis, and much interest is likely to be centred in the area of quantitative analysis of mixtures. Possibly the solution to the choice of method will lie with expert systems determining how much is known about the data in advance. A radically different approach has been developed in the area of pyrolysis mass spectrometry.151-153 This approach has the advantage that data can be acquired very rapidly, involving a few minutes or less per sample. However, it has the disadvantage that diagnostic information is not readily available and it is rarely possible to assign any given peak to a molecular fragment ion. Hence multivariate methods are virtually indispensable tools for the interpretation of these large data sets. The technique of graphical rotation has been employed in this area.
Consider a normalised mixture of three compounds (A, B and C). Factor analysis should yield two principal factors. A typical pyrolysis mass spectrum may consist of 50 peaks, each originating from a given compound. Let us assume that factor one corresponds to compound A. Then peaks arising entirely from fragmentation of compound A will ideally lie along this factor. This can be illustrated graphically: each peak can be represented by a point in the factor diagram. Some peaks may be due to background noise, and these will lie close to the origin. Peaks strongly representative of true compounds should lie further from the origin and so show positive loadings. The two-dimensional factor plot can be represented as a circle, which may be divided up into sections. A peak equally representative of factors one and two would lie at 45° to these two factors. In the variance diagram approach the circle is divided up into sections at 10° intervals. The loadings in each section are then summed. The next stage is to rotate the plot so that the maximum loading (variance) lies along the first principal factor. This approach can then be used to calculate the estimated spectra of the individual components in the mixture, and can be further applied to deducing the spectra of unknowns in a mixture. Clearly the loadings along factor one should be representative of the spectrum of the most abundant compound in the mixture, and those of factor two of the next most important compound. This approach has been used very effectively in deducing the spectra of unknowns in a series of geochemical mixtures.153 Interestingly, rotation in principal component space has been applied to a very different problem, that of multivariate geochemical time series, in which spectral analysis is performed on rotated components (as discussed in Section 5.4). Similarly, spectra (this time of harmonics of the Earth's orbit around the Sun) can be extracted from the data.
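The first of the three TTFA questions, how many factors underlie a set of mixture spectra, can be sketched with simulated data: spectra of two components in independent random amounts give a data matrix of effective rank two, so only two eigenvalues of the covariance matrix stand clear of the noise floor. All numbers here are invented for the illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
points = np.arange(100)
# Two hypothetical pure-component spectra (Gaussian peaks).
comp_a = np.exp(-0.5 * ((points - 30) / 5.0) ** 2)
comp_b = np.exp(-0.5 * ((points - 70) / 5.0) ** 2)

# Ten mixtures with independent random amounts of each, plus noise.
amounts_a = rng.uniform(0.2, 1.0, 10)
amounts_b = rng.uniform(0.2, 1.0, 10)
spectra = (np.outer(amounts_a, comp_a) + np.outer(amounts_b, comp_b)
           + 0.01 * rng.standard_normal((10, 100)))

# Eigenvalues of the covariance matrix, largest first: two dominate.
eigvals = np.linalg.eigvalsh(np.cov(spectra.T))[::-1]
n_factors = int((eigvals > 0.01 * eigvals[0]).sum())
```

A simple threshold is used here; cross-validation or the Malinowski indicator function, cited above, are the more principled ways of making the same decision.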
Sometimes the analyst wants to ask "what are the pure spectra" of a mixture, rather than "what are the proportions of compounds" in the mixture. Whereas some of the above methods can be applied to such problems, self-modelling curve resolution (SMCR)154,155 is an important approach when a mixture consists of unknown amounts of an unknown number of unknown compounds. Library searching can also be used in mixture analysis. There is not enough space in this review to discuss these methods in detail except where they impinge upon multivariate and other approaches for mixture analysis. Consider, for example, the case where a library of known compounds is available and the analyst wants only to detect which of the already characterised compounds are in the mixture. Optimisation can be combined with library searching to home in on the solutions.156 This approach has been shown to be very effective indeed for closely similar spectra.157 A further extension is where some spectra are available in the library and others are unknown; it is possible to predict the unknown spectra and add these to the library.158 More interesting is the case where library searching is combined with a multivariate approach such as self-modelling curve resolution.159 It would be possible to extend the discussion of mixture analysis to include more details on library searching methods, databasing and expert systems. Indeed, the modern tendency is to mix these approaches, which are important complements to each other. In the next few years the chemometrician will not merely have to understand statistical approaches but will undoubtedly need to expand his knowledge into computer science in general. Chemometrics in particular is concerned with methods: if one particular method is more efficient than another, then the analyst or chemometrician should use the better method.
If it happens that an expert system analyses a mixture more efficiently than a purely statistical method, it is preferable to use an expert system. The analysis of mixtures is a major area in chemometrics. However, at present workers in different fields tend to have developed their own methods. What is likely to be seen in the future is a great deal of work on the comparison of methods, and also on the combination of approaches such as library searching, multivariate methods, Fourier methods, expert systems and optimisation, all of which have been used to some degree in mixture analysis. At present, however, there are so many possible techniques that the unwary investigator is unlikely to be able to predict which approach is best for his own problem unless considerable time is spent studying the available methods first. It is always possible to obtain meaningless results when putting data through inappropriate packages.

4.7. Data Reduction Techniques

From a spectrometer or chromatograph, the analyst often requires only a few parameters, such as the positions and intensities of peaks. However, the raw data are frequently in the form of several hundred or thousand readings. Data reduction is important for three principal reasons. Firstly, computer storage space is limited. Secondly, reduced data are required for subsequent analysis (e.g., pattern recognition, databasing). Thirdly, often only assignable peaks are of interest. There are many problems associated with data reduction. In some techniques, such as gas chromatography - mass spectrometry, the amount of raw data is so enormous that it is only practicable to store information in a reduced form. Elaborate methods are then needed, taking into account noise levels, operator-determined thresholds, etc., to code the raw data in such a way as to reduce the data set in size without loss of information. The Environmental Protection Agency format160 is well established in such instances.
A simple method recently used for reducing the number of data points in a spectrum or chromatogram is called maximum entropy,161 but should not be confused with the technique described in Section 4.4. This procedure involves reducing the number of data points in an array by taking the average of unequal portions of the array. If the data are changing rapidly, only small regions are averaged, whereas if the data are flat, the average of several successive data points is taken. It is possible to predict an optimum compression factor before the information is reduced. The criterion for the optimum is indeed an entropy criterion, but the implementation and method are very different from the maximum entropy approach used to enhance the appearance of NMR and other spectroscopic data. Library searching has been discussed above. Many procedures have recently been developed and depend very much on foreknowledge, e.g., whether spectra of pure compounds are available or not. Naturally deconvolution and other methods reduce a data set still further into tables of peaks and positions. Of course, even comparatively simple processes such as methods for integration and base-line correction99,110 are critical stages in the data reduction process. The accuracy of measurement of integrals depends on sampling rate, ADC resolution, line widths, nature and type of noise, shape of peak and so on. Most machines print out integrals automatically without providing the user with any indication of the accuracy of integration. Some very sophisticated approaches have recently been applied to the integration problem, including that of maximum entropy,139 when estimating integrals in the presence of wavy base lines in 17O NMR spectra.
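The unequal-averaging idea can be sketched as follows. This is an illustration of the principle only, not the algorithm of ref. 161: segment boundaries are placed so that each segment carries an equal share of the total point-to-point change, so flat regions are averaged over many points and busy regions over few. The test trace is invented.

```python
import numpy as np

def reduce_adaptive(y, n_out):
    """Compress y to at most n_out averaged segments, spending more
    segments where the trace changes rapidly."""
    y = np.asarray(y, dtype=float)
    # "Activity" of each point: its change from the previous point
    # (a tiny floor keeps perfectly flat regions well defined).
    activity = np.abs(np.diff(y, prepend=y[0])) + 1e-12
    quota = np.cumsum(activity) / activity.sum()
    # Boundaries at equal quantiles of cumulative activity.
    edges = np.searchsorted(quota, np.linspace(0, 1, n_out + 1)[1:-1])
    segments = np.split(y, np.unique(edges))
    return np.array([s.mean() for s in segments if len(s)])

# A single Gaussian peak on a flat 1000-point baseline compresses to
# 50 numbers, with nearly all of them spent on the peak region.
trace = np.concatenate([np.zeros(400),
                        np.exp(-0.5 * ((np.arange(100) - 50) / 5.0) ** 2),
                        np.zeros(500)])
reduced = reduce_adaptive(trace, 50)
```

The long flat stretches collapse into a few near-zero averages while the peak survives at nearly full amplitude, which is the behaviour described above.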
A good aim of data reduction methods is to reduce the raw data to a table of numbers, but to retain sufficient extra information in this table to be able to re-create the original data within a given degree of accuracy (as can be visualised using difference techniques). Ideally sufficient information should be retained to allow accurate reconstructions. Data reduction is employed routinely by the chemometrician and indeed by almost anyone who uses quantitative instrumental data. Pattern recognition and other approaches are normally applied to matrices of reduced intensity data rather than raw instrumental data. It is probably not sufficiently realised how crucial simple techniques such as integration and base-line correction can be in the calculation of such matrices. The analyst should possibly examine his reduction methods in detail prior to using further methods. It is possible to examine the accuracy of different analysts and methods of data reduction using techniques such as ANOVA,74 and this is strongly recommended. The final stage of data reduction is normally referred to as pre-processing, which involves scaling tables or matrices of numbers prior to interpretation, often via statistical packages. Various methods for pre-processing will be discussed in Section 5.1 on cluster analysis, as the first step prior to clustering, but pre-processing is equally necessary in other areas of interpretative data analysis. The main lesson the analyst should learn from the above discussion is that even after the instrumental method has been optimised and the signals processed, there is yet further distortion in the computed parameters before a table of intensities, positions of peaks, numbers of peaks and so on is presented to the analyst. If insufficient care is taken, these errors can snowball during subsequent numerical analysis.

5. Processing Sets of Instrumental Analyses

Conventionally there is a tendency to confuse chemometrics with chemical pattern recognition.
It is hoped that this review, so far, has emphasised that pattern recognition is only a small part of chemometrics. However, most of the early chemists who described themselves as chemometricians were involved in pattern recognition studies. Some of the earliest work was done before the term "chemometrics" was coined, particularly by Kowalski and Bender in the early 1970s.162 The analyst is interested in sets of samples, which may be different foods, rocks, water samples or biological tissues, and ultimately wants to ask questions about these samples. For example, is a tissue cancerous? Is a beer flat? Is an oil productive? In most instances samples are complicated mixtures of chemicals, and within these mixtures are fingerprints which, if properly analysed, provide information about the nature of the source material. Earlier sections have described how to choose and optimise the analytical measurements, how to get the most out of the data and how to plan sampling strategies. If we follow the flow of this review, we will be left with a table of intensities and peaks, sometimes assigned to individual compounds, but sometimes just a table of numbers. Unless a problem is fairly simple, chemometrics methods can then be used to interpret these large multivariate data sets. Because the literature in this area is vast, only broad outlines are summarised in the following sections. Four excellent books on pattern recognition are available163-166 and together provide comprehensive summaries of the literature over the last 15 years.

5.1. Classification and Clustering

Classification of samples is one of the principal goals of pattern recognition. Methods for classification can be divided into unsupervised and supervised approaches. The difference is that supervised approaches require a training set: certain samples of known origin and classification must first be analysed to set up a model.
In unsupervised methods no prior training set is required. Cluster analysis (an unsupervised approach) was described in detail by Massart and Kaufman,15 to whose book the reader is referred for mathematical details. There are also texts oriented towards biologists.167 The most commonly employed unsupervised method for classification is hierarchical cluster analysis, and there are a very large number of computer packages available to perform this. The result is normally displayed graphically as a dendrogram. Objects that are most similar are joined together at the top levels of such a diagram, and clusters are often easy to visualise. However, it is essential to recognise that there are a very large number of possible ways of computing dendrograms. The stages of hierarchical cluster analysis are discussed in detail below. Similar systematic approaches are necessary when applying any multivariate pattern recognition approach, and considerations such as the method used for pre-processing are by no means unique to cluster analysis. However, this technique probably has the most complex set of options and so is a good example of the systematic choices encountered in chemometrics.

The first problem is the conversion of raw instrumental data to a form suitable for clustering. Typical data might be a table of 20 GC peak intensities for 30 water samples. Which water samples come from the same region? There are a variety of methods for pre-processing or scaling. One common method is to normalise the data so that intensities for each sample sum to a constant total: each peak then becomes a proportion rather than an absolute amount.
This method is commonly employed by the analytical chemist, but introduces the problem called closure.168 A second method of scaling is standardisation: this involves calculating the mean and standard deviation of the peak intensities over all 30 samples (in the example cited), and then subtracting the mean and dividing by the standard deviation. Standardising is a form of weighting. Each variable is equally weighted after standardising, and so the variation in the intensity of each GC peak is equally significant. Standardisation has the disadvantage that variables with a low absolute mean are often subject to greater relative experimental error, yet have equal influence on the analysis. It is, of course, possible to employ any weighting function to take into account the significance of each variable. Another possible method of scaling is to take the logarithms of the variables, often after normalising (if appropriate) and before standardising: this is equivalent to weighting small variables more than large variables. Logarithmic scales are commonly used by chemists in everyday life: for example, the pH scale is logarithmic. Is it best to use pH or [H+] as a measure of acidity? This depends on the problem being considered. It is, of course, possible to combine different methods of scaling. For example, it is common both to normalise and to standardise a data set: if the data set is represented by an n by m matrix corresponding to n peaks and m samples in the example cited above, then normalisation involves scaling the n peaks for each sample to a constant total and standardisation involves scaling over the m samples for each of the peaks.

After pre-processing to give a matrix of n columns (GC peak intensities in this example) and m rows (30 water samples) it is then necessary to compute a measure of (dis)similarity between samples. Clearly, the more similar two samples are, the more closely they are likely to be related (similar geographical origins, for example).
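The two basic scaling operations above can be sketched in a few lines. This is a minimal illustration only, with invented peak intensities; Python is used here purely for demonstration.

```python
# Hedged sketch: normalisation and standardisation as described above,
# applied to invented GC peak intensities (not data from the review).

def normalise(sample):
    """Scale one sample's peak intensities so that they sum to one."""
    total = sum(sample)
    return [x / total for x in sample]

def standardise(variable):
    """Subtract the mean and divide by the standard deviation of one
    variable (one peak measured over all samples)."""
    n = len(variable)
    mean = sum(variable) / n
    sd = (sum((x - mean) ** 2 for x in variable) / n) ** 0.5
    return [(x - mean) / sd for x in variable]

# Each peak becomes a proportion rather than an absolute amount
print(normalise([2.0, 6.0, 2.0]))   # [0.2, 0.6, 0.2]
# After standardising, each variable has mean 0 and unit variance
print(standardise([1.0, 3.0]))      # [-1.0, 1.0]
```

Logarithmic scaling, where appropriate, would simply be applied element by element between these two steps.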
There are a huge number of possible similarity measures. The most obvious is the Euclidean distance. Each sample can be represented as a point in n-dimensional space (or n - 1 dimensions if the readings have been normalised), each dimension representing one variable. The straight-line distance between two points is the geometric (Euclidean) distance. Another distance measure is the Manhattan or City block distance. This is the sum of the absolute distances along each axis. In two-dimensional space the City block distance of a point from the origin is the sum of the distances along the x and y axes, whereas the Euclidean distance is the vector distance. A more generalised distance measure is the Minkowski distance: the City block distance is the first-order Minkowski distance, the Euclidean distance the second-order Minkowski distance, and third- and higher order distances can also be computed. The Chebyshev distance is the largest of the distances along any single axis.

A different approach is to use correlation coefficients. The nearer the correlation coefficient is to one, the more closely the samples are related. Correlation coefficients are measures of similarity, whereas distances are measures of dissimilarity. The Pearson r value is commonly employed. An alternative is the cosine between the vectors for two points: at first sight this may not appear to be a correlation coefficient, but if two points are identical then the cosine between them is one. The latter similarity coefficient is a measure of direction rather than distance. Whether direction or distance is more relevant depends in part on how the original data are scaled. For example, if the raw data are unscaled and all the peaks in one sample are exactly twice the intensity of those in another sample, it is likely that the two samples come from identical sources.
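The distance and similarity measures just described can be sketched directly; the point coordinates below are invented for illustration.

```python
# Hedged sketches of the measures discussed above, for two samples
# represented as points in n-dimensional space.

def euclidean(a, b):
    """Straight-line (geometric) distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def city_block(a, b):
    """Manhattan distance: sum of absolute distances along each axis."""
    return sum(abs(x - y) for x, y in zip(a, b))

def minkowski(a, b, p):
    """Generalised distance: p = 1 gives City block, p = 2 Euclidean."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def chebyshev(a, b):
    """Largest distance along any single axis."""
    return max(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    """Measure of direction rather than distance: one for identical points."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norms

a, b = [0.0, 0.0], [3.0, 4.0]
print(euclidean(a, b), city_block(a, b), chebyshev(a, b))  # 5.0 7.0 4.0
```

Note that the cosine of the angle between [1, 2] and [2, 4] is one, even though the Euclidean distance between them is not zero: direction and distance answer different questions.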
Hence the choice of distance measure must depend on the questions being asked about the data and on the pre-processing stages. A difficulty with the distance measures described above is that they take no account of correlation between the variables. For example, in pyrolysis several peaks might arise from one compound. Compound A might yield ten peaks and compound B five peaks. Without any further knowledge of the system, compound A will contribute twice as much to the similarity matrix. This problem can be overcome by calculating the correlation coefficients between all the variables. A measure called the Mahalanobis distance takes this into account. There are several papers that discuss and define the Mahalanobis distance. A recent study involves the comparison of Mahalanobis and Euclidean distance spaces in the calibration of near infrared data,169 to which the reader is referred for further details of this method. Another distance measure is derived from information theory.

Finally, categorical rather than continuous data can be analysed by clustering techniques. Binary tests (e.g., in bacteriology), descriptions of colour, texture, and so on, often yield valuable information about the similarities between samples. Several coefficients have been proposed, including the Jaccard index, where the number of features common to two samples is divided by the total number of features present in either sample, and the simple matching coefficient, where the number of features either present in both or absent from both samples is divided by the total number of features. More details on these distance measures are provided elsewhere.15

The choice of which distance space to use depends very much on what is known about the structure of the data, and the reader is referred to the references cited above for more details. In particular, though, it is essential to remember how the data are scaled prior to choosing a distance measure.
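For binary (present or absent) data, the two coefficients above can be sketched as follows; the feature vectors are invented for illustration.

```python
# Hedged sketch: similarity coefficients for binary feature vectors
# (1 = feature present, 0 = feature absent).

def jaccard(a, b):
    """Features common to both samples / features present in either sample."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either

def simple_matching(a, b):
    """Features present in both or absent from both / total features."""
    agree = sum(1 for x, y in zip(a, b) if x == y)
    return agree / len(a)

a = [1, 1, 0, 0]
b = [1, 0, 1, 0]
print(simple_matching(a, b))  # 0.5 (one shared presence, one shared absence)
```

The difference between the two is whether shared absences count as evidence of similarity: Jaccard ignores them, the matching coefficient does not.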
Some combinations of distance measure and scaling are unsuitable, and it is necessary to read the appropriate manuals or texts prior to choosing a given measure of distance. Further, it is important to consider whether variables are likely to be correlated (in which case Mahalanobis distance measures should be considered) or independent. If an inappropriate distance measure is used it is possible to obtain meaningless output.

Once a similarity coefficient has been chosen, the original n by m matrix is converted to an m by m (square) matrix with both rows and columns representing samples. This is the raw input into clustering procedures. There are, yet again, very many methods for clustering. Consider three objects A, B and C. If the correlation coefficients between the scaled readings for these three objects are computed and the correlation between A and B = 0.9, A and C = 0.7 and C and B = 0.2, then clearly A and B are the most similar. A simple clustering method would join A and B in one group first and then join C later. Most clustering approaches differ in the method for computing clusters. If A is joined to B in the example above, how is the similarity between the cluster AB and the object C calculated? The 3 × 3 matrix is reduced to a 2 × 2 matrix, so new similarity measures must be calculated. This is typical of clustering methods: as each object in turn is linked to another object or cluster (two or more linked objects), the size of the (dis)similarity matrix is reduced and new distance measures are calculated. One of the simplest ways of doing this is to use the single linkage170 or nearest neighbour method. The distance of the new cluster from old clusters is given by the distance of the nearest object within the cluster, so the correlation between AB and C is 0.7 in the example above.
Alternatively, the complete linkage171 or furthest neighbour method takes the furthest distance, 0.2 in the example above. The average172 linkage method calculates the average distance between the new and old clusters, which is 0.45 in the example. Normally clusters will consist of several objects, and there are two types of average linkage method, the weighted pair group and the unweighted pair group. In the latter instance the average distance between two clusters is calculated no matter how many objects are in each cluster. For example, if the average similarity measure of a cluster of three objects is 0.5 and that of a cluster of two objects is 0.8, then the unweighted pair group method gives an average similarity of 0.65. However, the weighted pair group method weights the distances according to the number of objects in each cluster, which in the example cited yields a similarity measure of 0.62 rather than 0.65, because there are three objects in the cluster having a similarity measure of 0.5 but only two in the cluster having a similarity measure of 0.8. Another linkage method is the centroid172 method. The centroid is the geometric centre of a group (or cluster), and the distance between two clusters is given by the distance between the centres of each cluster. Again there are weighted and unweighted centroid methods.

All the linkage methods described above involve joining the two objects with the highest similarity (lowest dissimilarity), re-calculating the similarity coefficients and continuing. A slightly different approach is called Ward's method or the error sum of squares method.173,174 For each cluster, the error sum of squares can be defined as the sum of squares between the objects of a cluster and the centroid of that cluster. The lower this value, the better the clusters. Ward's method involves forming the clusters for which the sum of squares increases least.
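The worked example in the text (correlations AB = 0.9, AC = 0.7, BC = 0.2) can be reproduced in a few lines, showing how each linkage rule updates the similarity between the new cluster AB and the remaining object C. This is a sketch only; note that for similarities "nearest" means most similar, so single linkage takes the maximum.

```python
# Hedged sketch of the three basic linkage rules, using the toy
# similarity matrix quoted in the text.

sim = {("A", "B"): 0.9, ("A", "C"): 0.7, ("B", "C"): 0.2}

def similarity(x, y):
    return sim[(x, y)] if (x, y) in sim else sim[(y, x)]

cluster, outsider = ["A", "B"], "C"   # A and B merge first (similarity 0.9)
values = [similarity(member, outsider) for member in cluster]

single = max(values)                  # nearest neighbour: most similar member
complete = min(values)                # furthest neighbour: least similar member
average = sum(values) / len(values)   # average linkage

print(single, complete)               # 0.7 0.2, as in the text
```

Repeating the update rule each time two clusters merge, on a shrinking similarity matrix, is the whole of hierarchical agglomerative clustering.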
For example, consider a case in which objects A, B and C have been joined in one cluster and only objects D and E remain to be classified. The object closest to the centroid of ABC is the one clustered next. This method has many advantages but is computationally slow. It is also important to recognise the equivalence of clusters and the more familiar “dendrogram”: once objects are joined together in a dendrogram they are considered part of a cluster. Other clustering methods involve joining several clusters simultaneously. These have been extensively applied in geology175 but apparently not in analytical chemistry, and so are not discussed here. The majority of commercial software (e.g., CLUSTAN, SAS, SPSS) gives the user a wide variety of options for joining dendrograms together, and it is important to consider the choice of methods carefully.

Recently, however, more sophisticated approaches have been developed. The approaches above can be described as hierarchical agglomerative. MASLOC176-178 is a non-hierarchical agglomerative method based on location theory. The advantage of this system is that objects clustered at an early stage in the calculation do not necessarily remain together throughout all stages of the computation, allowing correction of errors and detection of outliers. CLUPLOT179 is a method for detection of the number of significant clusters rather than for linking of individual objects. Divisive procedures180 have only rarely been used in analytical chemistry. These approaches can be classified according to distance measure and means of calculating clusters, in the same way as hierarchical clustering techniques. However, the difference is that the first step in the analysis treats all the objects as one cluster. Then the least similar objects are split off, so that the original objects are split into two clusters.
These clusters, in turn, are split into smaller clusters until individual objects are separated out, so a dendrogram in reverse is computed. Finally, minimal spanning trees181,182 have been employed in analytical chemistry. The objects are joined up in distance space so that the total distance between the objects is minimal. Then the largest distance is broken (to form two clusters) and so on until a given criterion is reached.

Cluster analysis has been widely used, not only in chemotaxonomy, but also in computing similarities between spectra,183 in geochemistry,184 in the classification of stationary phases in gas chromatography,185 and so on. There is not room in this review to discuss all applications in detail, but it is very important to realise that there are several hundred possible combinations of methods available, and that interpretation of dendrograms can be fairly complex.

A different approach is supervised learning. In cluster analysis there was no need to assume, in advance, any structure in the data. However, sometimes the analyst has available samples of known origin: for example, he might have GC traces of several species of bacteria and know which traces belong to each species. This information forms a training set. After setting up a model for the various classes of objects (e.g., different species, or cancerous and non-cancerous tissue) the analyst then wants to test which class an unknown specimen fits best, or perhaps whether the specimen is actually a member of any of the previously known classes at all. At this point it is important to note that different authors use differing terminology and that “clusters,” “classes” and “groups” are effectively the same.

There are two main approaches to supervised learning, called soft modelling and hard modelling. The former type of modelling is normally associated with disjoint modelling. In soft modelling, each potential class is modelled independently of the others.
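The minimal spanning tree step can be sketched with Prim's algorithm; the point coordinates are invented, and breaking the longest edge yields the first two clusters.

```python
# Hedged sketch: build a minimal spanning tree, then break the longest
# edge to split the objects into two clusters.

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def minimal_spanning_tree(points):
    """Prim's algorithm: returns MST edges as (i, j, distance) tuples."""
    in_tree = {0}
    edges = []
    while len(in_tree) < len(points):
        candidates = [(i, j, euclidean(points[i], points[j]))
                      for i in in_tree
                      for j in range(len(points)) if j not in in_tree]
        best = min(candidates, key=lambda e: e[2])
        in_tree.add(best[1])
        edges.append(best)
    return edges

points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (6.0, 0.0)]
tree = minimal_spanning_tree(points)
longest = max(tree, key=lambda e: e[2])
print(longest)  # (1, 2, 4.0): breaking it gives clusters {0, 1} and {2, 3}
```

Removing successive longest edges continues the division until the chosen stopping criterion is reached.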
For example, GC traces of seven bacteria of one known species might be analysed and common trends in these data used to determine the most characteristic analytical features of this class of bacterium. A similar analysis might be performed on a completely different species. The analyst might then want to classify an unknown bacterium, and will do so by taking its GC trace and asking with which class (if any) the analytical data are most consistent. However, because the modelling of the classes has been performed independently, it is entirely possible that an analysis is consistent with both classes (i.e., ambiguous): soft modelling allows this possibility and does not expect a categorical answer. It has the advantage that extra classes can be added to the analysis without changing the existing models, and it also allows for outliers or samples that cannot be classified into existing classes. Hard modelling methods, in contrast, expect a training set consisting of all the classes of interest and then try to set up a model that classifies an unknown sample unambiguously into one of the already established classes.

Possibly the most famous example of soft modelling is the SIMCA (soft independent modelling of class analogy) approach.186-188 In SIMCA a principal component model is established for each class. Principal component analysis aims to reduce the data from a large number of original measurements (e.g., 20 GC peak areas) to a small number of principal trends (e.g., two or three main factors). The key to PCA is what is described as a variance-covariance matrix or a correlation matrix: this expresses the correlation of each variable (e.g., peak intensity) with each other variable.
The interested reader is referred to references already given in this review, but it should be noted that PCA (and the related factor analysis), although very common in chemistry, is not the only method for reducing the dimensionality of data: canonical variates and correspondence analysis are two other approaches referred to below. Principal components can be expressed as linear combinations of variables. For example, the first principal component might be 0.5x1 - 0.2x2 + . . . , where x1 is variable 1 and so on: the variables can be plotted graphically in principal component space, and this is referred to as a loadings plot. Similarly, samples can be plotted in principal component space: this is a scores plot.

The next stage in SIMCA is to decide how many principal components adequately describe each class: this is performed by a method called cross-validation. Each principal component is calculated in turn. Then a quarter of the readings are removed randomly. If the principal components remain the same within reasonable limits, the model is judged to have converged. An important extra feature of SIMCA is the ability to deal with a limited number of missing values, i.e., readings for which there is no information at all.

Once class models are established, various further information can be obtained. The modelling power of each variable for each class gives the analyst an indication of how significant the variable is for a given class model. If, for example, a class model is established from atomic absorption measurements on 20 elements, then the elements with the greatest modelling power are those that are most diagnostic of the given class, and so probably are the most useful elements if an analysis is used to ask whether a sample is a member of a given group or not.
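The extraction of a loading vector can be sketched by simple power iteration on mean-centred data, which is also the essence of the NIPALS algorithm mentioned below. The data set is hypothetical: two variables where the second is exactly twice the first, so a single component carries all the variability.

```python
# Hedged sketch: first principal component (loading vector) by power
# iteration, using invented two-variable data.

def mean_centre(data):
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[x - m for x, m in zip(row, means)] for row in data]

def first_loading(data, iterations=50):
    X = mean_centre(data)
    w = [1.0] * len(X[0])                 # initial guess for the loading
    for _ in range(iterations):
        # scores = X w, then loading = X' scores, renormalised each cycle
        scores = [sum(x * v for x, v in zip(row, w)) for row in X]
        w = [sum(X[i][j] * scores[i] for i in range(len(X)))
             for j in range(len(w))]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
    return w

data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]   # x2 = 2 * x1 exactly
loading = first_loading(data)                  # points along (1, 2) direction
```

Plotting the loading vector components against the variables would give the loadings plot described above; plotting the scores for each sample gives the scores plot.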
The discriminatory power of a variable between two classes gives an indication of how diagnostic the variable is when differentiating between the two classes. After class models have been established, unknown samples can be assigned to each class according to their distance from the class model.

SIMCA probably represents the most widespread method available specifically for chemometricians. There are, however, various misconceptions about the method. Firstly, SIMCA is a method, not a computer package, and there have been many implementations of SIMCA on microcomputers and mainframes, including in other packages such as SAS.189 Secondly, one original feature of SIMCA is that it is easily implemented on microcomputers. A method called the NIPALS algorithm190 is used to calculate principal components rapidly. There is, however, no need to use NIPALS when employing SIMCA, and more modern VAX-based versions are being developed that rely on more robust but slower PCA methods. Finally, SIMCA is by no means the only method for soft disjoint modelling. Other approaches include SPHERE,191,192 UNEQ193 and PRIMA.194 These methods model class boundaries and distances rather differently. In chemometrics it is important to distinguish between implementations (SIMCA has been available for many years and there are several user-friendly packages) and actual methods. Other approaches for soft modelling may be as applicable as SIMCA, but user-friendly software is not readily available commercially at present.

Hard modelling has been used by statisticians for over 50 years, the first classical work on linear discriminant analysis195 being that of Fisher. We will only briefly discuss the available methods in this review, as chemometricians now largely use soft modelling. In discriminant analysis196 the training set consists of representatives of all the classes expected by the analyst.
A discriminant function is one that divides the variable space into regions which represent different classes. An unknown is classified according to the region in space to which it belongs. For example, a very simple discriminant function is x = 0: positive x values are indicative of one class and negative values of another. The simplest form of discriminant analysis constructs linear boundaries in space, but quadratic or higher order functions can also be used. Finally, some authors have adopted the concept of a dead-space around the borders between two classes197: an object falling in this region can equally well be classified into either class. The weakness of hard modelling is that all relevant classes must be represented in the training set. Extra classes cannot be added unless the model is changed. Further, the method is usually used to provide categorical information about class membership, and it takes no account of the structure of each class: there is no allowance for overlapping class models. However, this approach has been employed for many years, so there is a large amount of literature on it, and it is implemented in nearly every major general purpose statistical package, unlike SIMCA. One further area of interest in discriminant analysis is the method used to determine the discriminant function from the training set: this is called a learning machine.198 The simplest is the linear learning machine,199 employed in chemical pattern recognition long before SIMCA approaches. Other methods include simplex,200 least-squares201 and feedback202 techniques.

A display method superficially similar to PCA can be used to visualise discriminant functions. Canonical variates203 may be used to represent graphically the information from linear discriminant analysis.
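The x = 0 example above amounts to taking the sign of a linear combination of the variables; a minimal sketch follows, in which the weights and samples are invented.

```python
# Hedged sketch of a linear discriminant function: the sign of a weighted
# sum of the variables assigns the class.

def make_discriminant(weights, threshold=0.0):
    def classify(sample):
        score = sum(w * x for w, x in zip(weights, sample))
        return "class 1" if score > threshold else "class 2"
    return classify

# Hypothetical boundary x1 - x2 = 0: samples with x1 > x2 fall in class 1
classify = make_discriminant([1.0, -1.0])
print(classify([3.0, 1.0]), classify([1.0, 3.0]))  # class 1 class 2
```

A dead-space variant would simply return "ambiguous" whenever the score falls within some small band either side of the threshold.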
Each axis is a linear combination of variables just as in PCA, but in this instance the axes are the discriminant functions: hence canonical variates are chosen to optimise class separations, whereas principal components are chosen to represent optimally the main features of variability within a data set.

Finally, a very different approach to classification is the K-nearest neighbour method.162,204 In the methods described earlier, a training set was modelled either for individual classes (e.g., SIMCA) or for all the classes of interest together (e.g., discriminant analysis). In the K-nearest neighbour approach the distance of an unknown from members of the training set is computed in the appropriate multi-dimensional space. If, for example, K = 3 and the three nearest neighbours all belong to a single class, clearly the unknown is classified into this class. In practice, samples that are difficult to classify will have neighbours belonging to different groups. As in the clustering methods described above, the choice of the number of neighbours and of the measure of distance is important and can influence the classification. This approach, although very simple, lacks the sophistication of other methods, e.g., in the detection of outliers (samples that fit no class) and in validation of the classification procedure.

The large number of methods available to the chemometrician for the classification of samples has been outlined above. It is impossible in this review to describe all the many applications. However, with the ready availability of user-friendly software packages, it is important, as stressed above, that the chemometrician does not confuse user-friendliness and commercial availability with the best approach. The chosen method must, ideally, relate to how much is known about the data, what training sets (if any) are available and what questions are to be asked about the data. Consideration must also be given to in-house expertise and the quality of the data.
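The K-nearest neighbour rule described above reduces to a majority vote among the K closest members of the training set; a minimal sketch with an invented training set:

```python
# Hedged sketch of K-nearest neighbour classification.
from collections import Counter

def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def knn_classify(training, unknown, k=3):
    """training is a list of (point, class label) pairs; the unknown is
    assigned to the class with most votes among its k nearest neighbours."""
    nearest = sorted(training, key=lambda item: euclidean(item[0], unknown))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

training = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"), ((0.2, 0.1), "A"),
            ((5.0, 5.0), "B"), ((5.1, 5.2), "B")]
print(knn_classify(training, (0.3, 0.3)))  # A
```

Changing k or the distance function can change the vote for borderline samples, which is precisely the sensitivity noted in the text.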
There is no one method that will magically solve the problems of classification of samples, and it is important to gain an understanding of what techniques are available before using multivariate methods for classification. If improperly used, completely meaningless results can come from readily available software packages.

5.2. Relating Different Measurements

Another problem frequently encountered by the analyst is that of relating different sets of data. For example, the taste of a portion of meat is likely to be related to its chemical composition. In diode array detection HPLC, electronic absorption is related to chromatographic elution times. This problem can be solved by multivariate calibration, and was briefly discussed in the section on mixture analysis. The analyst wants to calibrate one measurement (e.g., colour) against other measurements (e.g., chemical composition). The relationship is unlikely to be simple. There are a large number of techniques in this area, but PLS (partial least-squares) is the best known method,205-208 possibly because it has been developed by the proponents of the SIMCA package. There is a massive amount of literature on the applications of PLS. Recently the method has been extended to PLS2.209 A further sophistication is n-dimensional PLS.190 It is often desirable to relate several sets of measurements. For example, chemical composition might be related to temperature, pH, pressure and other factors. A multivariate model is fitted to several blocks of parameters, and these models are calibrated or linked together. For example, a multivariate model of chemical composition can be established; this, in turn, may be related to a multivariate model of colour or any related factor.

There is considerable debate as to whether PLS really is the best approach to multivariate calibration. Historically this method was one of the first chemometric techniques to become available as a user-friendly microprocessor-based package. However, there is no guarantee that the PLS approach extracts the most reliable information theoretically, and use of this method must depend in part on what is known about the data, the nature of the noise and signals and so on. Another approach to relating sets of data, discussed in Section 3.1, is Procrustes analysis. Other methods such as principal component regression and canonical correlation can also be employed, but are less readily available to the analyst. One of the possible weaknesses of PLS1 is that it was principally developed for microcomputer applications and so uses the NIPALS algorithm (applied first to classical SIMCA methodology) for calculating principal components. NIPALS has the advantage that it is fast when memory and computer speed are limited, and so it made PCA methods practicable on early microcomputers: however, with the advent of faster and cheaper computing power, this constraint is less crucial and should not be a major factor in the choice of methods.

There is a great deal of theoretical statistical literature comparing and explaining methods for multivariate calibration. The analyst does not need to understand these methods in detail but must be aware of the different approaches. Chemometrics attempts to make predictions from data when the answer is not immediately obvious by “eyeballing.” Therefore, PLS is only really useful if it predicts new trends in analytical data. However, if PLS is misapplied, then apparent trends could be artifacts of the data processing technique.

5.3. Pattern Recognition

Classification and clustering methods are part of chemical pattern recognition. However, not all problems of interest to the analyst are about classifying groups of samples. Consider a typical problem, which might be the analysis of water samples. The chemical composition of these samples might be influenced by various sources of pollution along a river.
They might also be influenced by how close the samples were to the estuary or to the source of the river. The analyst does not wish to classify samples into polluted and non-polluted, but rather to determine the main factors that influence the composition of the samples.

Factor analysis147,210,211 is an important area of chemometrics. It has briefly been discussed in the section on mixture analysis. In the discussion of supervised classification methods, principal component analysis was used largely to reduce the dimensionality of the data. However, often the factors are of interest in themselves. For example, the variability between samples may be due to a few main factors: in measurements of polluted water in a river, there may be two or three sources of pollution, and these could be represented as factors.

Factor analysis is a principal component based method. Firstly, PCA is performed on the data set, appropriately pre-processed and scaled as discussed elsewhere in this review. The next step is to determine the number of significant factors. If successful, these factors may correspond to physically meaningful processes or components. A good example, discussed above, is mixture analysis, where each factor ideally corresponds to one component in a mixture. Most factor analysis algorithms rank the factors according to their contribution to the over-all variability of the data set. A simple example is of a peak arising from a single compound. Although 20 measurements of intensity across the peak might be made, apart from experimental noise these measurements will maintain exactly the same proportions in each sample if the peak corresponds to one pure component. Hence, in the absence of noise, the 20 measurements should be reduced to one factor corresponding to 100% of the variability and 19 factors corresponding to 0% of the variability.
In practice there will be some noise, reflected in random fluctuations in peak intensity, so the actual picture will not be so clear. The first factor might correspond to 90% of the variability, the second to 5%, and so on. Can we unambiguously state that there is only one factor? As is usual in chemometrics, there are a large number of criteria for estimating the number of significant factors. A very simple test is a 95% cut-off, i.e., all factors are considered significant until their sum exceeds 95% of the total variability. This method does not allow for differing noise levels and distributions, and a more satisfactory method has been proposed by Malinowski.212,213 Several other criteria have also been used, but the best method depends on foreknowledge of the noise and signal distributions, in a similar way to optimum filters in signal processing: much work remains to be carried out in this area. A completely different criterion, already discussed briefly, is cross-validation,214 and this probably works best when the noise is hard to model. An analogy can be drawn between filters versus maximum entropy in signal processing and numerical criteria versus cross-validation for estimating the number of significant factors. Recently, seven methods for the estimation of the number of significant factors have been compared in the mass spectrometry and nuclear magnetic resonance spectroscopy of mixtures.215 The interested reader is referred to the references in the cited paper for several other techniques.

Once the number of significant factors has been estimated, it is often desired to interpret these factors in terms of physical processes. For example, in environmental pollution studies each factor might correspond to a source of pollution; in mixture analysis each factor should correspond to the spectrum of one component of the mixture. The simplest way to do this is by rotation. PCA is used to determine the dimensionality of the space.
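The simple 95% cut-off criterion can be sketched as follows; the variance contributions are invented, echoing the 90%/5% example above.

```python
# Hedged sketch: count factors as significant until their cumulative
# contribution reaches 95% of the total variability.

def significant_factors(variances, cutoff=0.95):
    total = sum(variances)
    running, count = 0.0, 0
    for v in sorted(variances, reverse=True):
        if running / total >= cutoff:
            break
        running += v
        count += 1
    return count

# Hypothetical variance contributions from a PCA of mixture spectra
print(significant_factors([90.0, 5.0, 3.0, 1.5, 0.5]))  # 2
```

As the text notes, this criterion ignores the noise structure entirely: a noisy second factor and a genuine second component are counted in exactly the same way.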
The principal components may then be rotated in PCA space. Orthogonal rotation preserves the geometry of the space, i.e., in practical terms scaling of the raw data is preserved. There are various criteria for rotation, but the main objective is to try to maximise the variability along each factor and the clustering of objects around a given new principal component. There are two principal types of orthogonal rotation, namely varimax216 and quartimax,217 differing according to the definition of variability. More flexible methods involve oblique rotations, where the geometry is not necessarily preserved. These include quartimin,218 oblimax219 and covarimin,216 among other methods.

Sometimes a physical model for a factor already exists. For example, is the proximity of a sampling site to a source of pollution a major influence on the chemical composition of river water samples? This can be approximately modelled as a target factor. If a factor already determined by PCA closely resembles this target factor (as estimated by a least squares criterion), then it is likely that this factor is a true source of variability. This approach is called target transform factor analysis.220 Finally, there may be little or no information as to the nature of the true factor, but it still might be suspected that a number of physical factors causing variability do exist. Consider the example of the polluted river again: it might be suspected that there are two main sources of pollution, but the positions of these sources along the river might be unknown. Iterative target transform factor analysis221 (ITTFA) can be employed in such instances, although the analysis is rather complex.
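The standard iterative algorithm for varimax, the first of the orthogonal rotations mentioned above, can be sketched as follows; the loading matrix here is invented for illustration and the routine is a minimal modern implementation, not code from the original review:

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonally rotate a loading matrix so as to maximise the
    variance of the squared loadings (Kaiser's varimax criterion)."""
    p, k = loadings.shape
    R = np.eye(k)
    var_old = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        # SVD step of the classical varimax iteration.
        u, s, vt = np.linalg.svd(
            loadings.T @ (L ** 3 - (gamma / p) * L * (L ** 2).sum(axis=0)))
        R = u @ vt
        if s.sum() - var_old < tol:
            break
        var_old = s.sum()
    return loadings @ R

# Two-factor loadings for four hypothetical variables.
A = np.array([[0.8, 0.3], [0.7, 0.4], [0.2, 0.9], [0.1, 0.8]])
B = varimax(A)
# An orthogonal rotation preserves the communalities (row sums of squares).
assert np.allclose((A ** 2).sum(axis=1), (B ** 2).sum(axis=1))
```

An oblique rotation such as quartimin would relax exactly this constraint: the rotated axes need no longer be orthogonal, so the communalities are not preserved.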
Another technique for pattern recognition that has recently been a subject of considerable interest is correspondence analysis.222-224 Apart from canonical variates, discussed above, all other multivariate methods of data display and reduction mentioned in this review so far are based on principal component methods. These methods start with a correlation (or variance-covariance) matrix of variables. In PCA, variables (e.g., peak intensities) are dealt with differently from objects (samples). In correspondence analysis, used greatly by French-speaking statisticians but hardly at all in the English-speaking world until recently, both variables and samples are treated equally. Consider the example of an n by m matrix, with n peak intensities and m samples; assume that the peak intensities are normalised for each sample. Consider the total normalised intensity for one peak over all the samples, and let us assume that this intensity is 10% of the total intensities for all n peaks over all samples. A similar argument can apply to the m samples. If peak intensities for each sample are normalised, then the proportion of the total intensity contributed by each sample is 1/m. Consider a peak that does not vary in intensity between samples: its proportion of the summed intensity over all samples and all peaks, in any one sample, will be 0.1/m. It is possible to replace all the readings with the expected reading that would occur if there were no variability between samples. The squared difference between this expected reading and the observed reading, divided by the expected reading, gives a value of χ2, and a contingency table can be drawn up. Subsequent calculation of principal factors can be performed as in normal principal component analysis. The greater the value of χ2, the more dependent the variables and samples are. However, unlike in PCA, both variables and samples are treated symmetrically.
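The χ2 calculation just described can be sketched numerically. The table below is a hypothetical 3 × 3 array of peak intensities (rows as peaks, columns as samples); the standardised residuals and their SVD, which plays the role of PCA here, follow the usual correspondence analysis construction rather than any recipe given in the original review:

```python
import numpy as np

# Invented intensities: rows = peaks, columns = samples.
N = np.array([[10.0, 12.0, 11.0],
              [30.0, 28.0, 31.0],
              [60.0, 60.0, 58.0]])

total = N.sum()
# Expected reading in each cell if there were no variability between
# samples: (row share) x (column share) x (grand total).
expected = np.outer(N.sum(axis=1), N.sum(axis=0)) / total
chi2 = ((N - expected) ** 2 / expected).sum()

# Standardised residuals, treating rows and columns symmetrically; an
# SVD of these gives the principal factors, and the squared singular
# values sum to chi2 / total (the total "inertia" of the table).
S = (N - expected) / np.sqrt(expected * total)
u, s, vt = np.linalg.svd(S, full_matrices=False)
assert np.isclose((s ** 2).sum(), chi2 / total)
```

A table with no dependence between peaks and samples gives χ2 near zero; the larger χ2, the more structure there is for the factors to recover.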
The analysis is identical if rows and columns are interchanged. Correspondence analysis has not so far been used very frequently in chemistry because the columns and rows are often of a very different nature: the columns may represent spectroscopic peak areas and the rows specimens of blood, for example. Very different questions are asked of the variables than of the samples. In correspondence analysis the variables and samples are often plotted on the same diagram, equivalent to superimposing PCA loadings and scores plots. There is, however, no need to do this. It is also possible in PCA to superimpose both types of graph, and this technique is called a biplot. There is sometimes some interest in which variables are related to which objects.

5.4. Time Series Analysis

In the examples discussed in the previous section, we have assumed that samples are sequentially unrelated to each other. In many instances samples can be ordered. This occurs, for example, in geochemical analyses down a core, where each successive sample is related in age and depth to previous samples. If multivariate measurements are recorded at every sampling position (e.g., the intensities of several GC peaks), the rows of the intensity matrix are related to each other in a defined sequence. This type of data can be described as a multivariate time series. In Sections 2.1, 3.5 and 4.2, approaches for the analysis of univariate time series, including spectral analysis and Fourier transforms, were discussed. It is entirely possible to use these approaches for the study of multivariate time series. A single parameter can be calculated at each sampling point. This parameter may be the concentration of a single compound, a ratio or a sum.
Several parameters can be compared.25 Subsequently, time series (Fourier or spectral) analysis can be performed on the data: the method depends very much on how much is known about the data in advance and on the nature of the noise (ARMA processes225 suggest spectral analysis).

When performing exploratory data analysis on naturally occurring time series, several difficulties arise. The most serious is irregularity of sampling. Often insufficient material is available to sample completely regularly. The analyst may have limited access to a geological core with an incomplete record: because considerable amounts of material are required for reliable analyses, only irregular samples are available. Another common situation arises when monitoring natural processes. For example, the analyst might want to sample sea water regularly in time; however, it might not be possible for a technician to sample absolutely regularly over extended periods. For time series analysis (where cyclical trends are expected), data must be regularly spaced in time. It is therefore necessary to interpolate the raw data.25 As usual, there are many interpolation methods to choose from. A decision as to the optimum method must be made according to the ratio of the expected inaccuracy of the sampling time to the mean sampling interval, what is known about the data and noise in advance, and what questions are to be asked of the data. An even more severe problem occurs when the measurement in the field needs to be calibrated to a time scale: for example, depth down a core is related to geological age, but not linearly. What is the best method of calibrating these two scales? Frequently exact depth-age curves are unknown, and ages are only available at certain isotopic events. The need for regularly spaced data is a severe drawback to conventional methods of time series analysis.
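Regularising an irregularly sampled record before spectral analysis can be as simple as linear interpolation onto an even grid. A minimal sketch, with invented sampling times and a single invented parameter (more sophisticated interpolation schemes follow the same pattern):

```python
import numpy as np

# Irregular sampling times (e.g., depths converted to ages) and the
# measured parameter at each time; both are invented for illustration.
t_irregular = np.array([0.0, 0.9, 2.1, 2.8, 4.2, 5.0])
y_irregular = np.sin(2 * np.pi * t_irregular / 2.5)

# Interpolate onto a regular grid spaced at the mean sampling interval.
dt = np.diff(t_irregular).mean()
t_regular = np.arange(t_irregular[0], t_irregular[-1] + dt / 2, dt)
y_regular = np.interp(t_regular, t_irregular, y_irregular)
```

As the text cautions, the choice between linear and higher-order interpolation should depend on the timing inaccuracy relative to the mean sampling interval and on what is known of the noise in advance.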
Some geologists have developed methods for studying periodicity without the need for exact dating (e.g., periodic regression),226-228 as the information that samples are sequentially related is sufficient to help the search for systematic variation. However, these methods remain to be exploited by the chemometrician.

Sometimes the connection between two variables in a time series is of interest. For example, two compounds may vary in a similar way along a series of readings. The technique of cross-spectral analysis looks for correspondence between two sets of readings.

The methods above do not, however, take into account the multivariate nature of the time series. A simple extension24,25 involves calculating the principal components of the multivariate matrix and subjecting these to time series analysis. This approach is likely to be superior to using univariate parameters such as ratios, as it takes into account the inherently more informative nature of the multivariate data. However, caution must be exercised: multivariate methods do not always yield a better answer. Consider, for example, the result of time series analysis on the most abundant compound in a chromatogram: the measurement of the intensity of this compound is likely to be subject to less error than that of the least abundant compound. Yet if both measurements are weighted equally in the multivariate time series, the analysis may actually be worse owing to the introduction of relatively more noise. The scaling and pre-processing steps have to be thought out carefully prior to employing multivariate methods for the analysis of time series.

It is possible to go even further when analysing multivariate time series if more than one principal component is taken into consideration.25 Spectral analysis can be performed on rotated principal components. These in turn yield a surface which, for a two-PC model, has as its axes intensity and the two principal components.
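The basic "PCA first, then time series analysis" step can be sketched with synthetic data. Here three correlated variables (invented "peak intensities") share one 8-sample cycle, and a raw periodogram of the scores on the first principal component recovers it; this is an illustrative construction, not an analysis from the original review:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64
signal = np.sin(2 * np.pi * np.arange(n) / 8.0)   # period of 8 samples
# Three correlated variables sharing the one cycle, plus noise.
X = np.column_stack([a * signal for a in (1.0, 0.6, 0.3)])
X += rng.normal(0.0, 0.05, X.shape)

# PCA via SVD of the column-centred matrix.
Xc = X - X.mean(axis=0)
u, s, vt = np.linalg.svd(Xc, full_matrices=False)
pc1 = Xc @ vt[0]                                  # first-PC scores

# Raw periodogram of the first-PC time series.
power = np.abs(np.fft.rfft(pc1)) ** 2
peak_bin = int(np.argmax(power[1:])) + 1          # skip the mean term
# 64 samples / period 8 = 8 cycles per record, i.e. rfft bin 8.
print(peak_bin)
```

Scaling matters here exactly as the text warns: had the noisiest variable been weighted up before the SVD, the periodogram would have been correspondingly degraded.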
In this way it is possible to resolve overlapping frequencies in the spectral analysis of the multivariate time series. There are undoubtedly many further techniques that can be applied to the analysis of multivariate time series in analytical chemistry that remain to be exploited. This area is potentially very challenging to the chemometrician.

6. Use and Abuse of Chemometrics

In the previous sections a formidable array of methods has been considered. This review is incomplete in that many methods have been omitted, and the emphasis in each section is subjective. However, the reader will have noted that chemometrics can aid the analyst in a large variety of different ways. As instruments become more sophisticated and data become easier to obtain, the analyst is confronted with an even more awesome task. Chemometrics is bound to become essential to the analytical laboratory of the 1990s. However, the reader must be aware that, if improperly applied, chemometrics yields meaningless results.

6.1. Importance of Understanding All Steps in the Acquisition and Processing of Data

One of the commonest problems to confront the chemometrician is poor data. The analyst normally tries to answer questions by measuring experimental variables. Afterwards he may wish to fit a response surface model or to classify his samples. Yet if the experimental strategy has not been thought out very carefully, there is often little more that can be done with the data. Chemometrics can only reveal the trends that are actually buried within the data. Typical problems are data sets with missing values: does a missing value mean that the variable was not recorded at all, or that it was at a level too low to be detectable? Another problem is the misinterpretation of output. An obvious example is the use of correlation coefficients to assess the goodness of fit to a given model.
If too few experiments are performed, it is possible to obtain almost unit correlation coefficients from any data. More seriously, many analysts wish to interpret multivariate data matrices, for example to classify samples. However, these data are obtained from instrumental analyses. Prior to the application of pattern recognition methods it is essential to have some feel for the quality of the data. Have replicate analyses been taken? Is absolute quantitation, although potentially more informative than relative quantitation, sufficiently free of error? What are the problems associated with the deconvolution of closely overlapping peaks? Will these affect the analysis? Can this be improved? It is not sufficient to present a table of intensities and expect meaningful analysis of the trends buried within the data unless some idea of the experimental error is available.

Often very careful thought must go into which instrumental technique is most suitable for the problem in question. A good example is the comparison of GC-MS, GC and pyrolysis GC. GC-MS contains the most directly interpretable structural information about samples, pyrolysis GC the least. However, data are acquired faster using pyrolysis methods, which also allow for more reliable quantitation, especially using replicate analyses. GC-MS is undoubtedly the least reproducible quantitative technique of the three because of the problem of calibrating intensities at different masses. So, of these techniques, should the analyst choose the most readily quantitative method or the method that contains the most directly interpretable structural information? The choice of method depends on what questions are to be asked of the data. Similarly, the subsequent data analysis technique must depend on the quality of the instrumental data. For any one application it is possible to compare instrumental techniques by methods such as Procrustes analysis or information theory, but these take time.
It is essential to consider the time available to the analyst. The choice of experimental method should depend very much on why the experiment is being performed. It is often best to target experiments so that they yield one type of data very reliably, even at the cost of losing other, less interesting features of the samples.

Experimental design is also vital to the analyst. Major projects are sometimes undertaken without proper thought to the design of the experiment, and the resultant data may be of very little use to the chemometrician. Even comparatively simple problems such as chromatographic optimisation can be radically improved by experimental design. A further problem faced by the chemometrician is to establish when data processing is an alternative to experimental design. For example, is the maximum entropy method an alternative to optimising instrumental resolution in NMR? A great deal depends on the nature of the individual problem, and generalisations are not possible.

Another major area is method selection. There are many methods for pre-processing, clustering, classification, filtering and so on. However, many papers are quoted out of context, and suggest that there is one optimum method for a given task. Careful reading of the literature normally indicates that a given optimum method is only applicable to a very narrowly defined problem or data set, and may not be optimum for the experiment under consideration. Again, it is essential to define the questions being asked, the nature of the data and the foreknowledge available prior to planning strategies. Finally, every problem poses unique questions and is solved by unique methods. The options for method selection, instrumental optimisation, pre-processing, post-processing, etc., must number many millions of combinations. There is no automatic solution to a large group of problems.

6.2. Abuse of Chemometrics

Many scientists will be excited by the possibilities of using chemometrics, and will appreciate how powerful a tool this new discipline can be. However, considerable thought must be given to the merits or otherwise of using chemometric methods. For example, user-friendly software is now being developed, particularly for pattern recognition in commercial instrumentation. This does not mean, however, that any particular package is automatically the best method for analysing the particular problem under consideration: user-friendliness and good marketing should not be confused with scientific truth. The laboratory manager will often face a difficult choice: whether to trust a readily available piece of software (often the cheapest solution in terms of manpower), to consult an expert, or to train one of his staff sufficiently that they in turn will be able to weigh up which method is best for the particular problem and possibly develop or use less friendly programs (undoubtedly the most expensive investment).

Another common misapprehension is that the chemometrician prefers computational approaches to traditional "eyeballing." If equivalent information can be obtained by visual inspection of the data or by simple methods of analysis, then sophisticated statistical methods of analysis are likely to be redundant. Chemometrics is only really useful where information not obvious at first sight can be obtained from data. Hence the analyst should not be surprised if the use of chemometrics results in different conclusions from those he might have predicted intuitively.

Finally, chemometrics is not magic. Chemometrics aims to increase the efficiency of the analytical process. If it has failed to save time, money, manpower, machine time or whatever, then it has been misapplied. Before turning to chemometric methods, the laboratory manager must very carefully consider his objectives.

7. The Future

7.1. The Automated Laboratory

One vision of the future is that chemometrics will be at the heart of the automated laboratory. As instruments acquire data more and more rapidly, so there will be the need to optimise the instrumental and interpretative functions.

This review has concentrated very largely on the statistical aspects of chemometrics. Some workers argue that chemometrics is restricted to the application of statistics to chemical data. However, the central philosophy of chemometrics is an open-mindedness as to which method is best for the problem under consideration. If a new approach can yield a more reliable answer more quickly, then that approach is the most useful. Increasingly, interest is turning towards expert systems (artificial intelligence; knowledge engineering). A very obvious area where they can help is in method selection. Expert systems can also be used as an alternative to statistical methods: for example, component spectra of mixtures can often be reconstructed by an expert system. Following on from this argument, library searching, clustering and data basing techniques are often valuable complements to expert systems, particularly in the area of spectroscopy. Another example is the automated optimisation of chromatographic separations. It has been shown how simplex methods can be used to improve resolution. Can this be performed automatically? Chemometrics, robotics and expert systems can combine to solve this problem.

It is likely that the mathematically minded chemometrician will have to learn some computer science if he is not to be left behind in the future. Most problems encountered in the everyday analytical laboratory are comparatively simple. Although techniques such as n-dimensional PLS and correspondence analysis have been discussed, how many real data sets are actually good enough to merit these methods?
Chemometrics can help the analyst at a much more basic level, for example, simply by quantitating a mixture more reliably. Automating the choice of methods and making these readily available, on-line, to a technician may involve many hundreds of man-years of work, a great deal of which might be mundane to the statistician but is essential to bringing chemometrics safely into everyday use.

7.2. Mathematical Chemistry

Another view of chemometrics is that it is part of mathematical chemistry. Of course, mathematics and statistics have been heavily used for solving problems in quantum mechanics, statistical mechanics, spectroscopy, kinetics, the structure of matter and so on. Chemometrics is only a small part of mathematical chemistry. It is also a fairly recent development: quantum chemists have known the binding energy of hydrogen for a century and so have had plenty of time to develop sophisticated techniques to account for this exact physical phenomenon. Analytical chemists have only recently had large data sets available to them, and so comparatively little time to develop their techniques. Thus chemometrics is moving slowly along lines parallel to those already established in physical chemistry. Matrix algebra and optimisation are the key to quantum mechanics and similarly are now being used in chemometrics. Information theory has been applied at a very sophisticated level in the study of the structure of liquids,229 yet there has been relatively little work in this area in analytical chemistry.

The chemical community is very heterogeneous and the literature is divided. Quantum chemists and chemometricians have developed different jargon and terminology for many of the same matrix operations. Because of the education system rather than any scientific barrier, these communities are still divided: theoretical chemists tend to be employed in universities whereas chemometricians work, for the most part, in industry. There are areas where both skills are now important.
A good example is molecular graphics, which is increasingly used in the pharmaceutical industry. Structure-energy calculations require a knowledge of quantum chemistry; however, handling the data on different compounds and searching for known structures often require data basing, cluster analysis and expert systems. Perhaps a subject of mathematical chemistry will grow up, with chemometrics as one cornerstone.

7.3. Education

More than ever, education in chemometrics is being stressed. The typical scientist in industry needing to use chemometrics was formally trained before it was possible to study chemometrics at higher educational establishments. In order properly to understand the methods he might want to use, intensive training is needed. It will take many years before an adequate supply of graduates reaches suitably senior positions in industry so, for the foreseeable future, education is almost inseparable from research. Indeed, much attention is focused on this area. There will be an urgent need for students trained in chemometrics. The students of today are the managers of 20 years' time, when a knowledge of chemometrics is likely to be a prerequisite for competent management of a modern analytical laboratory. However, within the higher education system there is a serious lack of suitably trained teachers of chemometrics, as these people too were schooled before the need to understand chemometrics arose. It will take many years before course structures are radically altered and enough staff members are recruited with knowledge and competence in this area. Meanwhile practising chemometricians will need not only to concentrate on solving problems for other people, but also on educating other people: the development of textbooks, audiovisual aids, tutorial software and so on is almost a prerequisite.
Indeed, a recent National Bureau of Standards conference voted educational aspects as the top seven of twelve categories on which the chemometrician should be concentrating.7 It is unusual for such an emphasis on education to be displayed in so new a research subject, but as chemometrics is very much about understanding and applying methods correctly, rather than merely cataloguing facts and rules, the general acceptance of this subject will be critically dependent on the time and effort spent on educational pursuits by its practitioners.

8. Conclusion

This review has discussed many of the most topical methods in chemometrics. The uninitiated reader will, if he has read this far, be bewildered by the variety of options available to him. What has been emphasised throughout, however, is that chemometrics is a very fluid subject. It is not possible simply to put data through an automatic package and obtain the correct answer. A great deal of understanding of the nature of the problem and the questions to be asked is essential prior to using chemometric methods. The future looks very exciting indeed, and the next decade should see the growth and development of chemometrics into a mature subject.

9. References

1. Kowalski, B. R., Anal. Chem., 1980, 52, 112R.
2. Frank, I. E., and Kowalski, B. R., Anal. Chem., 1982, 54, 232R.
3. Delaney, M. F., Anal. Chem., 1984, 56, 261R.
4. Ramos, L. S., Beebe, K. R., Carey, W. P., Sanchez, E., Erickson, B. C., Wilson, B. R., Wangen, L. E., and Kowalski, B. R., Anal. Chem., 1986, 58, 294R.
5. Kowalski, B. R., Editor, "Chemometrics: Theory and Application," ACS Symposium Series No. 52, American Chemical Society, Washington, DC, 1977.
6. Kurtz, D. A., Editor, "Chemometric Estimations of Sampling, Amount and Error," ACS Symposium Series No. 284, American Chemical Society, Washington, DC, 1985.
7. Spiegelman, C. H., Walters, R. L., and Sacks, J., Editors, J. Res. Natl. Bur. Stand., Special Issue, 1985, 90(6), 391.
8. Kowalski, B. R., Editor, "Chemometrics: Mathematics and Statistics in Chemistry," Reidel, Dordrecht, 1984.
9. Sharaf, M. A., Illman, D. L., and Kowalski, B. R., "Chemometrics," Wiley, New York, 1986.
10. Bawden, D., Editor, "Chemometrics Series," Research Studies Press, Letchworth.
11. Massart, D. L., Editor-in-Chief, Hopke, P. K., Spiegelman, C. H., and Wegscheider, W., Editors, Brereton, R. G., and Dessy, R. E., Associate Editors, Chemometrics and Intelligent Laboratory Systems, Elsevier, Amsterdam.
12. Kowalski, B. R., Editor-in-Chief, Brown, S. D., and Vandeginste, B. G. M., Associate Editors, Journal of Chemometrics, Wiley, Chichester.
13. Hartigan, J. A., "Clustering Algorithms," Wiley, New York, 1975.
14. Sneath, P. H. A., and Sokal, R. R., "Numerical Taxonomy," Freeman, San Francisco, 1973.
15. Massart, D. L., and Kaufman, L., "Interpretation of Analytical Data by the Use of Cluster Analysis," Wiley, New York, 1983.
16. Joereskog, K. G., Klovan, J. E., and Reyment, R. A., "Geological Factor Analysis," Elsevier, Amsterdam, 1976.
17. Cochran, W. G., and Cox, G. M., "Experimental Design," Second Edition, Wiley, New York, 1957.
18. Davies, O. L., "The Design and Analysis of Industrial Experiments," Oliver and Boyd, London, 1954.
19. Box, G. E. P., Hunter, W. G., and Hunter, J. S., "Statistics for Experimenters," Wiley, New York, 1978.
20. Myers, R. H., "Response Surface Methodology," Allyn and Bacon, Boston, 1976.
21. Kateman, G., and Pijpers, F. W., "Quality Control in Chemical Analysis," Wiley, New York, 1981.
22. Brereton, R. G., Chemometrics Intell. Lab. Syst., 1986, 1, 17.
23. Davies, O. L., and Goldsmith, P. L., "Statistical Methods in Research and Production," Oliver and Boyd, Edinburgh, 1972.
24. Brassell, S. C., Brereton, R. G., Eglinton, G., Grimalt, J., Liebezeit, G., Pflaumann, U., and Sarnthein, M., in Rullkoetter, J., and Leythaeuser, D., Editors, "Advances in Organic Geochemistry 12," Pergamon Press, Oxford, 1986, p. 649.
25. Brereton, R. G., Chemometrics Intell. Lab. Syst., 1987, 2, 177.
26. Cleveland, W. S., and Devlin, S. J., J. Am. Stat. Assoc., 1982, 77, 520.
27. Jenkins, G. M., "Spectral Analysis and Its Applications," Holden-Day, San Francisco, 1968.
28. Box, G. E. P., and Jenkins, G. M., "Time Series Analysis, Forecasting and Control," Holden-Day, San Francisco, 1970.
29. Koopmans, L. H., "The Spectral Analysis of Time Series," Academic Press, New York, 1974.
30. Davis, H. I., "The Analysis of Economic Time Series," Principia Press, Bloomington, IN, 1941.
31. Kennett, J. P., and Shackleton, N. J., Nature (London), 1976, 260, 513.
32. Harding, R. H., "Fourier Series and Transforms: A Computer Illustrated Text," Adam Hilger, Bristol, 1985.
33. Bracewell, R., "The Fourier Transform and Its Applications," McGraw-Hill, New York, 1985.
34. Brigham, E. O., "The Fast Fourier Transform," Prentice-Hall, Englewood Cliffs, NJ, 1974.
35. Davis, J. C., "Statistics and Data Analysis in Geology," Second Edition, Wiley, New York, 1986.
36. Bertero, M., Boccacci, P., and Pike, E. R., Proc. R. Soc. London, Ser. A, 1982, 382, 15.
37. Davis, J. C., and McCullogh, M., "Display and Analysis of Spatial Data," Wiley, New York, 1975.
38. David, M., "Geostatistical Ore Reserve Estimation," Elsevier, Amsterdam, 1977.
39. Chubb, F. L., Edward, J. T., and Wong, S. L., J. Org. Chem., 1980, 45, 2315.
40. Morgan, S. L., and Deming, S. N., J. Chromatogr., 1975, 112, 267.
41. Berridge, J. C., "Techniques for the Automated Optimization of HPLC Separations," Wiley, New York, 1985.
42. Schoenmakers, P. J., "Optimization of Chromatographic Selectivity, a Guide to Method Development," Journal of Chromatography Library, Volume 35, Elsevier, Amsterdam, 1986.
43. Ernst, R. R., Rev. Sci. Instrum., 1968, 39, 998.
44. Beveridge, G. S. G., and Schechter, R. S., "Optimisation, Theory and Practice," McGraw-Hill, New York, 1970.
45. Deming, S. N., and Morgan, S. L., Anal. Chem., 1973, 45, 279A.
46. Burton, K. W. C., and Nickless, G., Chemometrics Intell. Lab. Syst., 1987, 1, 135.
47. Nelder, J. A., and Mead, R., Comput. J., 1965, 7, 305.
48. Gustavsson, A., and Sundkvist, J. E., Anal. Chim. Acta, 1985, 167, 1.
49. Routh, M. W., Schwartz, P. A., and Denton, M. B., Anal. Chem., 1977, 49, 1422.
50. Betteridge, D., Wade, A. P., and Howard, A. G., Talanta, 1985, 32, 709.
51. Betteridge, D., Wade, A. P., and Howard, A. G., Talanta, 1985, 32, 723.
52. Parker, L. R., Cave, M. R., and Barnes, R. M., Anal. Chim. Acta, 1985, 175, 231.
53. Box, G. E. P., Biometrika, 1952, 39, 49.
54. Box, G. E. P., Biometrics, 1954, 10, 16.
55. Forthofer, R. N., and Koch, G. G., Biometrics, 1973, 29, 143.
56. Allus, M. A., Brereton, R. G., and Nickless, G., Chemometrics Intell. Lab. Syst., in the press.
57. Morgan, S. L., and Jacques, C. A., J. Chromatogr. Sci., 1978, 16, 500.
58. Deming, S. N., and Morgan, S. L., Clin. Chem., 1979, 25, 840.
59. Kristof, W., and Wingersky, B., "Proceedings of the 79th Annual Convention of the APA," 1971, p. 81.
60. Gower, J. C., Psychometrika, 1975, 40, 33.
61. O'Donnell, A. G., Hahaie, M. R., Goodfellow, M., Minnikin, D. E., and Hoyck, V., J. Gen. Microbiol., 1985, 131, 2033.
62. MacFie, H. J. H., in Meuzelaar, H., Editor, "Proceedings of the First Snowbird Symposium on Pattern Recognition Methods in Analytical Spectroscopy, Snowbird, UT, 16-18 June, 1987," Plenum Press, New York, in the press.
63. Williams, A. A., and Langron, S. P., J. Sci. Food Agric., 1984, 35, 558.
64. Martens, M., Martens, H., and Wold, S., J. Sci. Food Agric., 1983, 34, 715.
65. Lawley, D. N., Biometrika, 1959, 46, 59.
66. Reeve, D. R., and Crozier, A., in MacMillan, J., Editor, "Hormonal Regulation of Development I - Encyclopedia of Plant Physiology," Volume 9, Springer-Verlag, Berlin, 1980.
67. Scott, I. M., Plant Cell Environ., 1982, 5, 339.
68. Reeve, D. R., and Crozier, A., Plant Cell Environ., 1983, 6, 365.
69. Scott, I. M., Plant Cell Environ., 1983, 6, 367.
70. MacMillan, J., in Crozier, A., and Hillman, J. R., Editors, "The Biosynthesis and Metabolism of Plant Hormones," 1984, p. 1.
71. Caulcutt, R., and Boddy, R., "Statistics for Analytical Chemists," Chapman and Hall, London, 1983.
72. Scheffé, H., "The Analysis of Variance," Wiley, New York, 1959.
73. Cole, J. W. L., and Grizzle, J. E., Biometrics, 1966, 22, 810.
74. Yendle, P. W., and Brereton, R. G., Chemometrics Intell. Lab. Syst., submitted for publication.
75. Deming, S. N., in Kowalski, B. R., Editor, "Chemometrics: Mathematics and Statistics in Chemistry," Reidel, Dordrecht, 1984, p. 267.
76. Windig, W., Haverkamp, J., and Kistemaker, P. G., Anal. Chem., 1983, 55, 81.
77. Deming, S. N., Bower, J. D., and Bower, K. D., Adv. Chromatogr., 1984, 24, 35.
78. Janse, T. A. H. M., Van der Wiel, P. F. A., and Kateman, G., Anal. Chim. Acta, 1983, 57, 229.
79. Chen, L., Zhang, L., and Su, N. S., Fenxi Huaxue, 1984, 12, 124.
80. Griffiths, P. R., Editor, "Transform Techniques in Chemistry," Heyden, London, 1978.
81. Marshall, A. G., Editor, "Fourier, Hadamard and Hilbert Transforms in Chemistry," Plenum, New York, 1982.
82. Ramirez, R. W., "The FFT: Fundamentals and Concepts," Prentice-Hall, Englewood Cliffs, NJ, 1985.
83. Shaw, D., "Fourier Transform NMR Spectroscopy," Elsevier, Amsterdam, 1984.
84. Farrar, T. C., and Becker, E. D., "Pulse and Fourier Transform NMR Spectroscopy," Academic Press, New York, 1971.
85. Griffiths, P. R., "Chemical Infrared Fourier Transform Spectroscopy," Wiley, New York, 1975.
86. Ernst, R. R., Adv. Magn. Reson., 1966, 2, 1.
87. Cooley, J. W., and Tukey, J. W., Math. Comput., 1965, 19, 297.
88. Bax, A., "Two Dimensional Nuclear Magnetic Resonance in Liquids," Delft University Press, Delft, 1982.
89. Morris, G. A., Magn. Reson. Chem., 1986, 24, 371.
90. McCreery, R. C., and Ross, P., in Marshall, A. G., Editor, "Fourier, Hadamard and Hilbert Transforms in Chemistry," Plenum, New York, 1982, p. 527.
91. Shannon, C., Bell Syst. Tech. J., 1948, 27, 379.
92. Eckschlager, K., and Stepanek, V., "Information Theory as Applied to Chemical Analysis," Wiley, New York, 1984.
93. Eckschlager, K., and Stepanek, V., Anal. Chem., 1982, 54, 1115A.
94. Liteanu, C., and Rica, I., Anal. Chem., 1979, 51, 1986.
95. Massart, D. L., and Smits, R., Anal. Chem., 1974, 46, 283.
96. Kateman, G., and Pijpers, F. W., "Quality Control in Analytical Chemistry," Wiley, New York, 1981, pp. 152-153.
97. Kateman, G., and Pijpers, F. W., "Quality Control in Analytical Chemistry," Wiley, New York, 1981, pp. 150-151.
98. Massart, D. L., J. Chromatogr., 1973, 79, 157.
99. Lindon, J. C., and Ferrige, A. G., Prog. NMR Spectrosc., 1980, 14, 27.
100. Bozic, S. M., "Digital and Kalman Filtering," Edward Arnold, London, 1979.
101. Jansson, P. A., "Deconvolution with Applications to Spectroscopy," Academic Press, New York, 1984.
102. Howard, S. J., in Vanasse, G. A., Editor, "Improved Resolution of Spectral Lines Using Minimum Negativity and Other Constraints," Spectrometric Techniques, Volume III, Academic Press, New York, 1983, p. 234.
103. Goldman, S., "Information Theory," Dover, New York, 1953.
104. Lam, R. B., Wiebolt, R. C., and Isenhour, T. L., Anal. Chem., 1981, 53, 889A.
105. Smit, H. C., in Kowalski, B. R., Editor, "Chemometrics: Mathematics and Statistics in Chemistry," Reidel, Dordrecht, 1984, p. 225.
106. Wright, N. A., Villalanti, D. C., and Burke, M. F., Anal. Chem., 1982, 54, 1738.
107. Kauppinen, J. K., in Vanasse, G. A., Editor, "Fourier Self Deconvolution in Spectroscopy," Spectrometric Techniques, Volume III, Academic Press, New York, 1983, p. 199.
108. Bowley, H. J., Collin, S. M. H., Gerrard, D. L., James, D. I., Maddams, W. F., Tooke, P. B., and Wyatt, I. D., Appl. Spectrosc., 1985, 39, 1004.
109. Yang, W., Griffiths, P. R., Byler, D. M., and Susi, H., Appl. Spectrosc., 1985, 39, 282.
110. Marshall, A. G., "Fourier, Hadamard and Hilbert Transforms in Chemistry," Plenum, New York, 1982, p. 24.
111. Harvey, A. C., "The Econometric Analysis of Time Series," Philip Allan, Oxford, 1981.
112. Bartlett, M. S., Nature (London), 1948, 161, 686.
113. Chatfield, C., "The Analysis of Time Series: An Introduction," Third Edition, Chapman and Hall, London, 1984, p. 142.
114. Chatfield, C., "The Analysis of Time Series: An Introduction," Third Edition, Chapman and Hall, London, 1984, p. 143.
115. Koopmans, L. H., "The Spectral Analysis of Time Series," Academic Press, New York, 1974, p. 169.
116. Jones, R. H., Technometrics, 1965, 7, 531.
117. Jenkins, G. M., and Watts, D. G., "Spectral Analysis and Its Applications," Holden-Day, San Francisco, 1968, Section 6.4.2.
118. Laeven, J. M., Smit, H. C., and Kraak, J. C., Anal. Chim. Acta, 1983, 150, 253.
119. Maddams, W. F., Appl. Spectrosc., 1980, 34, 245.
120. Whitbeck, M. R., Appl. Spectrosc., 1980, 35, 93.
121. Savitzky, A., and Golay, M. J. E., Anal. Chem., 1964, 36, 1627.
122. Nevius, T. A., and Pardue, H. L., Anal. Chem., 1984, 56, 2249.
123. Grimalt, J., Iturriaga, H., and Tomas, X., Anal. Chim. Acta, 1982, 139, 155.
124. Scheeren, P. J. H., Klous, Z., Smit, H. C., and Doornbos, D. A., Anal. Chim. Acta, 1985, 171, 45.
125. Brown, S. D., Anal. Chim. Acta, 1986, 181, 1.
126. Rutan, S. C., J. Chemometrics, 1987, 1, 7.
127. Scheeren, P. J. H., Klous, Z., and Smit, H. C., Anal. Chim. Acta, 1985, 171, 45.
128. Skilling, J., and Bryan, R. K., Mon. Not. R. Astron. Soc., 1984, 211, 111.
129. Skilling, J., Sibisi, S., Brereton, R. G., Laue, E. D., and Staunton, J., Nature (London), 1984, 311, 466.
130. Laue, E. D., Skilling, J., Staunton, J., Sibisi, S., and Brereton, R.
G., J. Magn. Reson. , 1985, 62, 437. Livesey, A. K., and Skilling, J., Acta Crystallogr., 1985, 41, 113. Ni, F., and Sheraga, H. A., J. Raman Spectrosc., 1985,16,337. Shannon, C. E., and Weaver, W., “The Mathematical Theory of Communication,” University of Illinois Press, Urbana, IL, 1949. Burg, J. P., “Maximum Entropy Spectral Analysis,” PhD Thesis, Stanford, CA, 1975. Viti, V., Barone, P., Guidoni, L., and Massari, E., J. Magn. Reson., 1986, 67, 91. Brereton, R. G., Sibisi, S., Skilling, J., and Staunton, J., unpublished work.Laue, E. D., Skilling, J., and Staunton, J., J. Magn. Reson., 1985, 63, 418. 138. 139. 140. 141. 142. 143. 144. 145. 146. 147. 148. 149. 150. 151. 152. 153. 154. 155. 156. 157. 158. 159. 160. 161. 162. 163. 164. 165. 166. 167. 168. 169. 170. 171. 172. 173. 174. i75. 176. 177 178. 179. 180. 181. Laue, E. D., Mayger, M. R., Skilling, J., and Staunton, J., J. M a p . Reson. , 1986, 68, 14. Laue, E. D., Pollard, K. 0. B., Skilling, J., Staunton, J., and Sutkowski, A. C. J. M a p . Res., 1987, 72, 493. Otto, M., and Bandemer, H., Anal. Chim. Acta, 1986,184,21. Otto, M., and Bandemer, H., Chemometrics Intell. Lab. Syst., 1986, 1, 71. Mandel, J., and Linning, F. J., Anal. Chem., 1957,29,743. Bader, M., J. Chem. Educ., 1980, 57, 703. Ebel, S., Glaser, E., Abdulla, S., Steffens, U., and Walter, V., Fresenius 2.Anal. Chem., 1982,313, 24. Jochum, C., Jochum, P., and Kowalski, B. R., Anal. Chem., 1981, 53, 85. Lindberg, W., Persson, J. A,, and Wold, S., Anal. Chem., 1983, 55, 643. Malinowski, E. R., and Howery, D. G., “Factor Analysis in Chemistry,’’ Wiley, New York, 1980. Vandeginste, B. G. M., Derks, W., and Kateman, G., Anal. Chim. Acta, 1985, 173, 253. Lindberg, W., Ohman, J., and Wold, S., Anal. Chem., 1986, 58, 299. Vandeginste, B . , Essers, R., Bosman, T., Reijnen, J., and Kateman, G., Anal. Chem., 1985, 57, 971. Windig, W., and Meuzelaar, H., Anal. Chem., 1984, 56, 2297. Windig, W., Haverkamp, J., and Kistemaker, P. G., Anal. 
Chem., 1983,55, 81. Windig, W., Meuzelaar, H. L. C., Shafizadelu, F., and Kelsey, R. G., J. Anal. Appl. Pyrolysis, 1984,6, 233. Lawton, W. H., and Sylvestre, E. A., Technometrics, 1971,14, 617. Sharaf, M. A., and Kowalski, B. R., Anal. Chem., 1982, 54, 1291. Delaney, M. F., Warren, F. V., and Hallowell, J. R., Anal. Chem., 1983, 55, 1925. Delaney, M. F., Warren, F. V., and Hallowell, J. R., J. Chem. In$ Comput. Sci., 1985, 25, 27. Mauro, D. M., and Delaney, M. F., Anal. Chem., 1986, 58, 2622. Delaney, M. F., and Mauro, D. M., Anal. Chim. Acta, 1985, 172, 193. INCOS GC-MS Manual: MSDS System Reference, Volume 11, Finnigan, California, 1985. Karstang, T. V. , and Kvalheim, 0. M., personal communica- tion. Kowalski, B. R., and Bender, C. F., J. Am. Chem. SOC., 1972, 94, 5632. Jurs, P. C., and Isenhour, T. L., “Chemical Applications of Pattern Recognition,” Wiley, New York, 1975. Varmuza, K., “Pattern Recognition in Chemistry,’, Springer- Verlag, Berlin, 1980. Strouf, 0. , “Chemical Pattern Recognition,” Research Studies Press, Letchworth, 1986. Wolff, D. D., and Parsons, M. I. L., “Pattern Recognition Approach to Data Interpretation,” Plenum, New York, 1983. Goodfellow, M., Jones, D., and Priest, F. G., Editors, “Computer Assisted Bacterial Systematics,” Academic Press, London, 1985. Aitchison, J., J. Int. Assoc. Math. Geol., 1984, 16, 617. Naes, T., J. Chemometrics, 1987, 1, 121. Sneath, P. H. A., J. Gen. Microbiol., 1957, 17, 201. Johnson, S. C., Psychometrika, 1967,32,241. Sokal, R. R., and Michener, C. D., Kansas Univ. Sci. Bull., 1958, 38, 1409. Orloc, L., J. Ecol., 1966, 54, 193. Ward, J . H., J. Am. Stat. ASSOC., 1963, 58, 236. Davis, J. C., “Statistics and Data Analysis in Geology,” Wiley, New York, 1973, pp. 456-473. Massart, D. L., Kaufman, L., and Coomans, D., Anal. Chim. Acta, 1980, 122, 347. Massart-Leen, A-M., and Massart, D. L. , Biochem. J., 1981, 196, 61 1. Esbensen, K. H., Kaufman, L., and Massart, D. L., Meteor- itics, 1984, 19, 95. 
Coomans, D., and Massart, D. L., Anal. Chim. Acta, 1981, 133, 225. MacNaughton-Smith, P., Williams, W. T., Dale, M. B., and Mockett, L. G., Nature (London), 1964,202, 1034. Zahn, C. T., IEEE Trans. Comput., 1971, 20, 68.ANALYST, DECEMBER 1987, VOL. 112 1657 182. 183. 184. 185. 186. 187. 188. 189. 190. 191. 192. 193. 194. 195. 196. 197. 198. 199. 200. 201. 202. 203. 204. 205. Massart, D. L., and Kaufman, L., Anal. Chem., 1975, 47, 1244A. Willett, P., J. Chem. Inf. Comput. Sci., 1983, 23, 22. Vuchev, V., Sci. Terre. Ser. Inf. Geol., 1983, 16, 19. Huber, J. K. F., and Reich, G., J. Chromatogr., 1984,294,15. Wold, S., Technical Report No. 387, University of Wisconsin, 1974. Wold, S., Pattern Recognition, 1976, 8, 127. Wold, S., and Sjostrom, M., in Kowalski, B. R., Editor, “Chemometrics Theory and Practice,” Am. Chem. SOC. Symp. Ser. 52, 1977, p. 243. Yendle, P. W., Brereton, R. G., and Badham, S . J., SAS European Users Group lnternation Proceedings 1986, SAS Institute, Corey, NC, 1986, 21. Wold, S., Geladi, P., Esbensen, K., and Ohman, J., J. Chemometrics, 1987, 1, 41. Strouf, O., and Fusek, J., Collect. Czech. Chem. Commun., 1979, 44, 1370. Fusek, J., and Strouf, O., Collect. Czech. Chem. Commun., 1979,44, 1362. Derde, M. P., and Massart, D. L., Anal. Chim. Acta, 1986, 184, 33. Juricskay, V., and Veress, G. E., Anal. Chim. Acta, 1985,171, 61. Hand, D. J., “Discrimination and Classification,” Wiley, New York, 1981. Lachenbruch, P. A., “Discriminant Analysis,” Hafner, New York, 1975. Strouf, O., “Chemical Pattern Recognition,” Research Studies Press, Letchworth, 1986, p. 20. Nilsson, N. J., “Learning Machines,” McGraw-Hill, New York, 1965. Strouf, O., “Chemical Pattern Recognition,” Research Studies Press, Letchworth, 1986, p. 16. Gustavsson, A., and Sundkvist, J. E., Anal. Chim. Acta, 1985, 167, 1. Moriguchi, I., and Komatsu, K., Eur. J . Med. Chem. Clin. Ther., 1981, 16, 19. Ichise, M., Yamagishi, H., Oishi, H., and Kojima, T., J. Electroanal. Chem., 1980, 108, 213. 
MacFie, H. J. H., Gutteridge, C. S., and Norris, J. R., J. Gen. Microbiol., 1978, 104, 67. Wold, S., and Dunn, W. J . , J. Chem. Inf. Comput. Sci., 1983, 23,6. Wold, H., in Krishnaiah, P. R., Editor, “Multivariate Analy- sis,” Academic Press, New York, 1966. 206. 207. 208. 209. 210. 211. 212. 213. 214. 215. 216. 217. 218. 219. 220. 221. 222. 223. 224. 225. 226. 227. 228. 229. Wold, S., Ruhe, A., Wold, H., and Dunn, W., SIAM J. Stat. Cornput., 1984, 5 , 735. Naes, A., and Martens, H., Commun. Stat. Simul. Comput., 1985, 14, 735. Lorber, A., Wangen, L. E., and Kowalski, B. R., J. Chemometrics, 1987, 1, 19. Manne, R., Chemornetrics Intell. Lab. Syst., 1987, 2, 187. Horst, P., “Factor Analysis of Data Matrices,” Holt, Rinehart and Winston, New York, 1965. Rummel, R. J., “Applied Factor Analysis,” Northwestern University Press, Evanston, 1970, pp. 372-385. Malinowski, E. R., Anal. Chem., 1977, 49,612. Malinowski, E. R., J. Chemometrics, 1987, 1, 33. Wold, S., Technometrics, 1978, 20, 397. Hearmon, R. A., Scrivens, J. H., Jennings, K. R., and Farncombe, M. J., Chemometrics Intell. Lab. Syst., 1987, 1, 167. Kaiser, H. F., Psychometrika, 1958,23, 187. Harman, H. H., “Modern Factor Analysis,” Third Edition, University of Chicago Press, Chicago, 1967. Carroll, J. B., Psychometrika, 1958, 18, 187. Saunders, D. R., Psychometrika, 1960, 25, 199. Rozett, R. W., and Petersen, E. M., Anal. Chem., 1975, 47, 2377. Roscoe, B. A., and Hopke, P. K., Comput. Chem., 1981,5, 1. Greenacre, M. J., “Theory and Applications of Correspon- dence Analysis,” Academic Press, London, 1984. BenzCcri, J. P., “Histoire et Prehistoire de 1’Analyse des DonnCes,” Dunod, Paris, 1982. Everitt, B. S., “The Analysis of Contingency Tables,” Chap- man and Hall, London, 1984. Chatfield, C., “The Analysis of Time Series: An Introduction,” Third Edition, Chapman and Hall, London, 1984, p. 50. Birks, H. J. B., Palaeogeogr. Palaeoclimatol. Palaeoecol. , 1985,50, 107. Reyrnent, R. A., Stockholm Contrib. 
Geol., 1963, 10, 1. Birks, H. J. B., and Gordon, A. D., “Numerical Methods in Pollen Analysis,” Academic Press, London, 1985. Jaynes, E. T., “Papers on Probability, Statistics and Statistical Physics,” Reidel, Dordrecht, 1982. Paper A71200 Received May 19th, 1987 Accepted July 28th, I987
ISSN:0003-2654
DOI:10.1039/AN9871201635
Publisher: RSC
Year: 1987
Data source: RSC
|
5. |
Determination of chromium and molybdenum with 2-(5-bromopyridylazo)-5-diethylaminophenol by reversed-phase liquid chromatography |
|
Analyst,
Volume 112,
Issue 12,
1987,
Page 1659-1662
Chang-Shan Lin,
Preview
|
PDF (454KB)
|
|
Abstract:
ANALYST, DECEMBER 1987, VOL. 112 1659
Determination of Chromium and Molybdenum with 2-(5-Bromopyridylazo)-5-diethylaminophenol by Reversed-phase Liquid Chromatography
Chang-shan Lin and Xiao-song Zhang
Department of Applied Chemistry, University of Science and Technology of China, Hefei, Anhui, People's Republic of China
The reversed-phase liquid chromatographic determination of chromium(III) and molybdenum(VI) chelates with 2-(5-bromopyridylazo)-5-diethylaminophenol was investigated. The metal chelates in 50% ethanol solution were separated within 12 min by using methanol - tetrahydrofuran - water (10 + 15 + 75) containing 0.01 M lithium sulphate and 5 x 10-3 M Tris buffer (pH 7.7) as the mobile phase at a flow-rate of 1.0 ml min-1, and were detected at 600 nm. The detection limits for chromium and molybdenum are 0.066 and 0.12 ng, respectively. The method has been applied to the determination of chromium and molybdenum in alloy steel and waste water samples.
Keywords: Chromium determination; molybdenum determination; 2-(5-bromopyridylazo)-5-diethylaminophenol; high-performance liquid chromatography; alloy steel and waste water analysis
In recent years reversed-phase liquid chromatography (RPLC) has been shown to be a convenient means for the separation and determination of trace amounts of metal ions as chelates. Excellent reviews of this subject are available.1,2 Many reagents, including β-diketones,3 8-hydroxyquinoline,4 dithizone,5 dithiocarbamates6,7 and 4-(2-pyridylazo)resorcinol (PAR)8-14 have been used to chelate metal ions prior to RPLC separation. 2-(5-Bromopyridylazo)-5-diethylaminophenol (5-Br-PADAP) has the same chelating system as PAR but is more sensitive for the determination of most metal ions. Although 5-Br-PADAP has been used extensively for the spectrophotometric determination of transition metals15-17 and in thin-layer chromatography,18 no liquid chromatographic method using this reagent has been reported for metal ions.
5-Br-PADAP is an unselective chelating agent that forms water-insoluble chelates with most transition metals, and the high sensitivity of most 5-Br-PADAP chelates, which absorb in the visible region, suggests that they would be stable under RPLC conditions. In this study, the separation and simultaneous determination of chromium and molybdenum in the form of 5-Br-PADAP chelates by RPLC are described.
Experimental
Reagents and Solutions
All chemicals used were of analytical-reagent grade unless stated otherwise. Standard solutions of chromium(VI) and molybdenum(VI) were prepared by dissolving potassium chromate and ammonium molybdate, respectively, in water. Analytical-reagent grade 2-(5-bromopyridylazo)-5-diethylaminophenol was obtained from Beijing Chemicals Factory and dissolved in ethanol to give a 5 x 10-3 M concentration. Tris(hydroxymethyl)aminomethane (Tris) buffer solution (0.1 M) was prepared by dissolving the compound in water and adjusting to pH 7.7 with dilute sulphuric acid. The mobile phase was methanol - tetrahydrofuran - water (10 + 15 + 75, V/V) containing 0.01 M lithium sulphate and 5 x 10-3 M Tris buffer, adjusted to pH 7.7 by adding dilute sulphuric acid prior to the addition of methanol and tetrahydrofuran.
Apparatus
Liquid chromatography was carried out using a Shimadzu Model LC-4A HPLC instrument with a Model SPD-1 spectrophotometric detector and a Chromatopac C-R2A data processor. A 10-µm particle size Shim-pack PNH2 column (Shimadzu, 250 mm x 4.6 mm i.d.) was used for all experiments. A Shimadzu Model UV-240 recording spectrophotometer was used for spectral measurements. A Shanghai Model pH S-2 pH meter was also used.
Procedure
To a nearly neutral sample solution (not more than 2 ml) containing 0.5-30 µg of chromium(VI) and 0.1-60 µg of molybdenum(VI), 2.0 ml of 0.1 M Tris buffer solution (pH 7.7), 1.0 ml of 10% m/V hydroxylamine hydrochloride solution and 2.0 ml of 5 x 10-3 M ethanolic 5-Br-PADAP solution were added. The mixture was heated in boiling water for about 20 min. After cooling, 5 ml of ethanol were added and the solution was diluted to volume in a 10-ml calibrated flask. The solution was filtered through a 0.45-µm membrane and a 20-µl aliquot was injected on to the column. The 5-Br-PADAP chelates were eluted with the methanol - tetrahydrofuran - water mobile phase at a flow-rate of 1.0 ml min-1. The chelates in the eluate were detected at 600 nm, the sensitivity being set at 0.02 or 0.01 absorbance for full-scale deflection. The amount of each metal was determined by measuring the peak heights.
Results and Discussion
Effect of Concentrations of Methanol and Tetrahydrofuran
A number of combinations of organic solvents with water, such as methanol-, acetonitrile-, tetrahydrofuran-, ethanol-, acetone-, isopropyl alcohol- and ethyl acetate-water, were investigated as mobile phases, but none of the binary mobile phase systems was found to be satisfactory for the separation of the 5-Br-PADAP chelates of chromium(III) and molybdenum(VI). Therefore, several ternary mobile phase systems were examined. A methanol - tetrahydrofuran - water system (A = 40% methanol and B = 20% tetrahydrofuran in water, V/V) was found to be the most suitable for the separation of the chelates. A simple methanol - tetrahydrofuran - water ternary mobile phase, however, gave poor peak shapes and low sensitivities. When lithium sulphate and Tris buffer were added to the ternary mobile phase, excellent peak shapes and high sensitivities were obtained.
The effect of the concentration of methanol and tetrahydrofuran in the mobile phase on the retention of the chelates is shown in Fig. 1. The optimum results were obtained with 25% A and 75% B (methanol - tetrahydrofuran - water, 10 + 15 + 75, V/V).
Fig. 1. Effect of the concentration of methanol and tetrahydrofuran in the mobile phase on the retention times of the 5-Br-PADAP chelates with chromium(III) and molybdenum(VI). Mobile phase contains 0.01 M lithium sulphate and 5 x 10-3 M Tris buffer (pH 7.7). Flow-rate, 1.0 ml min-1; column, Shim-pack PNH2. A = 40% V/V methanol, B = 20% V/V tetrahydrofuran in water
Fig. 2. Effect of the concentration of lithium sulphate on the retention times of the 5-Br-PADAP chelates with chromium(III) and molybdenum(VI) using methanol - tetrahydrofuran - water (10 + 15 + 75, V/V) as mobile phase; all other conditions as in Fig. 1
Fig. 3. Effect of pH of Tris buffer in the mobile phase using the same conditions as in Fig. 2 except that the concentration of lithium sulphate = 0.01 M
Effect of Concentration of Lithium Sulphate
The effect of lithium sulphate on the retention of each chelate was examined by adding the salt to the mobile phase. As shown in Fig. 2, the retention of all the species increased with increasing salt concentration as a result of salting-out effects. The salt may suppress the decomposition of the chelates on the column. The optimum chromatogram was obtained with 0.01 M lithium sulphate in the mobile phase.
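As a check on the composition quoted above: mixing 25% of stock eluent A (40% V/V methanol in water) with 75% of stock eluent B (20% V/V tetrahydrofuran in water) reproduces the final methanol - tetrahydrofuran - water (10 + 15 + 75) proportions. A minimal sketch of this volume arithmetic (the function name is illustrative, not from the paper):

```python
# Mixing 25% of stock A (40% V/V methanol in water) with 75% of
# stock B (20% V/V THF in water) gives the 10 + 15 + 75 mobile phase.
def mix(frac_a, frac_b):
    methanol = frac_a * 40.0        # % V/V methanol contributed by stock A
    thf = frac_b * 20.0             # % V/V tetrahydrofuran contributed by stock B
    water = 100.0 - methanol - thf  # remainder is water
    return methanol, thf, water

print(mix(0.25, 0.75))  # -> (10.0, 15.0, 75.0)
```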
Effect of pH of Buffer Added to the Mobile Phase
To examine the optimum pH range of the mobile phase, two buffers (0.01 M) were tested, potassium dihydrogen phosphate - disodium hydrogen phosphate and Tris; the latter was found to be more suitable for the separation of the chelates and higher peak heights were obtained. The results obtained are shown in Fig. 3. It is evident that the retention of the molybdenum(VI) chelate remained unchanged in the pH range 7-8, but those of the chromium(III) chelate and unreacted 5-Br-PADAP increased in the pH range 7-8. A constant peak height for each chelate was obtained in the pH range 7-7.8.
Selection of Detection Wavelength
In order to choose the wavelength for the detection of the chromium(III) - and molybdenum(VI) - 5-Br-PADAP chelates, the absorption spectrum of each chelate in the mobile phase was measured by stopped-pump and wavelength-scanning methods at the respective retention time of the chelate. The absorption maxima of the chromium(III) and molybdenum(VI) chelates were exhibited at 595 and 605 nm, respectively, which were consistent with the data measured spectrophotometrically in 50% ethanol solution. The detection wavelength was set at 600 nm.
Pre-column Derivatisation of Chelates
Preliminary experiments indicated that molybdenum(VI) formed a red chelate with 5-Br-PADAP only in the presence of hydroxylamine hydrochloride in the appropriate pH range at 100 °C. Hydroxylamine hydrochloride has the dual functions of reducing chromium(VI) to chromium(III) [which reacts with 5-Br-PADAP to form the chromium(III) - 5-Br-PADAP chelate] and accelerating the colour development of both the molybdenum(VI) - and chromium(III) - 5-Br-PADAP chelates.
To examine the optimum pH range of the colour development of both chelates, a Tris - sulphuric acid - acetic acid buffer system was used. The peak heights of the chelates were almost constant in the pH range 4-8, and buffer concentrations <0.05 M did not affect the peak heights. The effect of varying the concentration of 5-Br-PADAP in the sample solutions, containing 5 µg each of chromium(VI) and molybdenum(VI), on the peak heights of their chelates was examined. Constant peak heights for the chromium(III) and molybdenum(VI) chelates were obtained in the concentration range 5 x 10-4 to 2.5 x 10-3 M. Hence, 1.0 ml of 10% m/V hydroxylamine hydrochloride solution, 2.0 ml of 0.1 M Tris buffer solution (pH 7.7) and 2.0 ml of a 5 x 10-3 M 5-Br-PADAP ethanol solution were used in subsequent determinations.
Neither chromium(VI) nor molybdenum(VI) reacted with 5-Br-PADAP at room temperature; therefore, the sample solution had to be heated in a boiling water-bath for 15-30 min to accelerate the reactions. Both chelates formed with 5-Br-PADAP under the above-mentioned conditions were stable for at least one month.
Fig. 4. Chromatogram of the chromium(III) - and molybdenum(VI) - 5-Br-PADAP chelates on a Shim-pack PNH2 column with a 20-µl injection of a solution containing 1 p.p.m. each of chromium and molybdenum and 1 x 10-3 M 5-Br-PADAP using methanol - tetrahydrofuran - water (10 + 15 + 75, V/V) containing 0.01 M lithium sulphate and 5 x 10-3 M Tris buffer (pH 7.7) as a mobile phase at a flow-rate of 1.0 ml min-1. Detector wavelength, 600 nm; detector sensitivity, 0.04 absorbance unit for full-scale deflection
Table 1. Coefficients of linear regression analysis for calibration of the chromatographic detector
Parameter                      Chromium(VI)   Molybdenum(VI)
Concentration range, p.p.m.    0.05-3.0       0.01-6.0
Range of amount injected/ng    1-60           0.2-120
Slope, A ml µg-1               0.031          0.014
Intercept, A                   -0.0016        -0.00008
Correlation coefficient        0.9998         0.9997
Chromatogram and Calibration Graphs
A typical chromatogram for the separation of the 5-Br-PADAP chelates of chromium(III) and molybdenum(VI) is shown in Fig. 4. The 5-Br-PADAP chelates were eluted in the order molybdenum then chromium, and the retention times for each chelate were 3.4 and 5.5 min, respectively, at a flow-rate of 1.0 ml min-1. Base-line separation of both chelates was achieved. The slopes and intercepts of the calibration graphs for the simultaneous determination of chromium and molybdenum, calculated by linear regression analysis of the peak height (absorbance) versus metal ion concentration (p.p.m.) data, are summarised in Table 1. The absolute detection limits, calculated as the amount injected that gave a signal twice the background noise (signal to noise ratio 2 : 1), were 0.066 ng for chromium and 0.12 ng for molybdenum (full-scale deflection = 0.01 A).
Effect of Foreign Ions
The interference effect of numerous ions was examined by using the method to determine 5 µg each of chromium and molybdenum in the presence of each foreign ion in turn. An ion was considered not to interfere if it caused a change in the peak heights of the chromium(III) and molybdenum(VI) chelates of less than ±5%. The amounts (in µg) found to be tolerable were as follows: aluminium(III), mercury(II), tungsten(II), zinc(II), 100; calcium(II), copper(II), lead(II), 50; iron(II), manganese(II), nickel(II), 20; cadmium(II), titanium(IV), 10. The cobalt(II) - 5-Br-PADAP chelate had an absorption band near the chromium(III) chelate band and overlapping peaks were obtained, interfering seriously with the determination of chromium. The vanadium(V) - 5-Br-PADAP chelate had a similar retention time to the molybdenum(VI) chelate and caused serious interference in its determination.
Applications to Real Samples
Alloy steel analysis
An alloy steel sample was dissolved with sulphuric acid - nitric acid. After appropriate dilution, chromium was oxidised with sodium persulphate in the presence of silver ions, and the excess of persulphate was destroyed by boiling. The solution was diluted to about 100 ml and adjusted to slightly acidic.
A 50-ml volume of 5% m/V sodium hydroxide solution was added to precipitate iron, nickel, cobalt, copper, titanium, manganese, etc. The precipitate was filtered and washed with 1% m/V sodium hydroxide solution. The filtrate was adjusted to be slightly acidic with sulphuric acid, transferred into a 250-ml calibrated flask and diluted to volume. A 2-ml aliquot of the solution was taken, and the chromium(III) - and molybdenum(VI) - 5-Br-PADAP chelates were derived and determined by liquid chromatography as described. The results showed good agreement with the certified values (Table 2).
Table 2. Determination of chromium and molybdenum in alloy steel
          Found,* % m/m            Certified value, % m/m
Sample    Chromium   Molybdenum    Chromium   Molybdenum
70-5      0.23       0.30          0.23       0.315
42        0.07       -             0.072      -
97        1.18       -             1.16       -
* Average of five analyses.
Waste water analysis
The results of the determination of chromium in chromium-plating electrolyte waste water and industrial waste water, by coupling a standard additions procedure with the liquid chromatographic separation of the chromium(III) - 5-Br-PADAP chelate, are shown in Table 3. The recoveries of chromium were 97-102%. Six replicate analyses of chromium-plating electrolyte waste water A and industrial waste water A gave average chromium concentrations of 7.4 and 1.73 mg l-1 with relative standard deviations of 3.5 and 4.2%, respectively.
Table 3. Determination of chromium in waste water
Sample                                        Sample volume taken/ml   Chromium added/µg   Chromium found*/µg   Recovery, %
Chromium-plating electrolyte waste water A    1.0                      0                   7.5                  -
                                              1.0                      5.0                 12.4                 98
Chromium-plating electrolyte waste water B    1.0                      0                   13.1                 -
                                              1.0                      5.0                 18.2                 102
Industrial waste water A                      2.0                      0                   3.5                  -
                                              2.0                      10.0                13.4                 99
Industrial waste water B                      2.0                      0                   1.5                  -
                                              2.0                      10.0                11.2                 97
* Average of two parallel determinations.
Conclusion
The determination of chromium and molybdenum as their 5-Br-PADAP chelates by pre-column derivatisation using reversed-phase high-performance liquid chromatography is sensitive and relatively free from interferences. The results of the determination for seven samples showed that the precision and accuracy are satisfactory. It is anticipated that numerous 5-Br-PADAP chelates, in addition to those of chromium(III) and molybdenum(VI), can probably be separated and determined by this method, employing different bonded-phase materials as stationary phases.
References
1. Schwedt, G., Chromatographia, 1979, 12, 613.
2. O'Laughlin, J. W., J. Liq. Chromatogr., 1984, 7, 127.
3. Gurira, R. C., and Carr, P. W., J. Chromatogr., 1982, 20, 461.
4. Berthod, A., Kolosky, M., Rocca, J. L., and Vittori, O., Analusis, 1979, 7, 395.
5. Ohashi, K., Iwai, S., and Horiguchi, M., Bunseki Kagaku, 1982, 31, E285.
6. Schwedt, G., Chromatographia, 1978, 11, 145.
7. Schwedt, G., Chromatographia, 1979, 12, 289.
8. Hoshino, H., Yotsuyanagi, T., and Aomura, K., Bunseki Kagaku, 1978, 27, 315.
9. Watanabe, E., Nakajima, H., Ebina, T., Hoshino, H., and Yotsuyanagi, T., Bunseki Kagaku, 1983, 32, 469.
10. Roston, D. A., Anal. Chem., 1984, 56, 241.
11. Hoshino, H., and Yotsuyanagi, T., Anal. Chem., 1985, 57, 625.
12. Zhang, X.-S., Zhu, X.-P., and Lin, C.-S., Talanta, 1986, 33, 838.
13. Noffsinger, J. B., and Danielson, N. D., J. Liq. Chromatogr., 1986, 9, 2165.
14. DiNunzio, J. E., Yost, R. W., and Hutchison, E. K., Talanta, 1985, 32, 803.
15. Shibata, S., in Flaschka, H. A., and Barnard, A. J., Editors, "Chelates in Analytical Chemistry," Volume 4, Marcel Dekker, New York, 1972, pp. 194-202.
16. Johnson, D. A., and Florence, T. M., Talanta, 1975, 22, 253.
17. Wang, X.-L., Wu, S.-S., and Zhou, B.-J., Lihua Jianyan Huaxue Fence, 1983, 19, 56.
18. Hu, Z.-D., and Zhang, Y.-S., Chin. J. Chromatogr., 1985, 3, 191.
Paper A7171
Received February 27th, 1987
Accepted June 8th, 1987
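The recoveries in Table 3 are plain standard-additions arithmetic: recovery = (found with spike − found without spike) / amount added × 100. A short sketch reproducing the tabulated values (the dictionary keys are illustrative labels; the numbers are taken from Table 3):

```python
# Standard-additions recovery check for the Table 3 waste water data.
# Each tuple: (found with no spike / ug, found with spike / ug, spike / ug).
samples = {
    "plating waste water A": (7.5, 12.4, 5.0),
    "plating waste water B": (13.1, 18.2, 5.0),
    "industrial waste water A": (3.5, 13.4, 10.0),
    "industrial waste water B": (1.5, 11.2, 10.0),
}

def recovery(found0, found_spiked, added):
    """Percentage of the added chromium that is recovered."""
    return round((found_spiked - found0) / added * 100)

for name, (f0, fs, add) in samples.items():
    print(name, recovery(f0, fs, add), "%")
# -> 98, 102, 99 and 97%, matching the 97-102% range reported.
```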
ISSN:0003-2654
DOI:10.1039/AN9871201659
Publisher: RSC
Year: 1987
Data source: RSC
|
6. |
Quantitative determination and stability of lisuride hydrogen maleate in pharmaceutical preparations using thin-layer chromatography |
|
Analyst,
Volume 112,
Issue 12,
1987,
Page 1663-1665
Monir Amin,
Preview
|
PDF (1556KB)
|
|
Abstract:
ANALYST, DECEMBER 1987, VOL. 112 1663 Quantitative Determination and Stability of Lisuride Hydrogen Maleate in Pharmaceutical Preparations Using Thin-layer Chromatography Monir Amin Faculty of Pharmacy, Department of Analytical Chemistry, A1 Azhar University, Cairo, Egypt A thin-layer chromatographic (TLC) method has been developed for the determination of lisuride hydrogen maleate (LHM). Lisuride hydrogen maleate [3-( 10,l Oa-didehydro-7-methyI-9a-ergolinyl)-l , I -diethylurea hydrogen maleate] is a substance which has an effect on the CNS. The extraction of the active ingredient from the drugs is performed in a fully automated, electronically controlled extraction apparatus within 2-5 min. The solutions are applied to pre-coated silica gel 60 F254 TLC plates and the separation of the active substance from the inactive ingredient is complete within 20 min.The separated spots are quantified by a thin-layer chromatogram spectrophotometer and the data are processed on-line using a CHB 66/80 computer. The developed chromatographic method is suitable for quality control and for long-term stability studies of LHM in pharmaceutical preparations and can be reproduced with a maximum coefficient of variation of 2.5%. The stability study of LHM in pharmaceutical preparations indicates that this substance is stable only at low temperatures (20 "C). At high temperatures (40,50 and 60 "C) degradation of LHM occurs after a short storage time. The prediction of the stability study shows that LHM is stable for at least 32 months at 20 "C (room temperature) in solid form.In solutions LHM was completely degraded after a very short time even at 20 "C. Keywords: Thin-layer chromatography; lisuride hydrogen maleate; pharmaceuticals The determination of the long-term stability and after 1, 3, 6, 9 and 12 months' storage of the tablets. 
LHM in pharmacokinetic data of pharmaceutical preparations solution was determined only after 1 month of storage because requires analytical methods that are reliable in the presence of it was completely degraded at temperatures of 40, 50 and mixtures of excipients :ji:d degradation products. In this work 60 "C. a fast, specific and sensitive thin-layer chromatographic (TLC) method has been developed for the determination of lisuride hydrogen maleate (LHM) . The separation efficiency, accuracy and simplicity of TLC in the determination of active compounds in pharmaceutical preparations are well estab- lished.1-12 Experimental Materials and Reagents Preparation A. Tablets with 25 pg of lisuride hydrogen Preparation B. Powder mixture containing 10 mg of lisuride Preparation C. Methanolic solution containing 100 pg ml-l Pure substance. The drug examined was 10 mg of lisuride maleate per tablet. hydrogen maleate per 100 mg. of lisuride hydrogen maleate. hydrogen maleate dissolved in 100 ml of methanol. Procedures Sample preparation Preparations A and B. The active ingredient was extracted from the pulverised tablet and the powder mixture with methanol. A fully automated and electronically controlled extraction apparatus (produced by W.Krannich K.G., Gottingen, FRG) (Fig. 1) was used. All the operations such as the addition of the extraction solvent, stirring, heating and filtration were controlled by a computer as described previously. 13,14 In the absence of the extraction apparatus the active ingredient can be extracted from the pulverised tablets or from the powder mixtures with methanol. The filtered solutions of the active ingredient were diluted or concentrated so that 1 ml of the solution contained 0.1 mg of lisuride hydrogen maleate. For the stability study of LHM the tablets and solutions were stored at temperatures of 20, 40, 50 and 60 "c. The contents of the active ingredient in tablets were determined Fig. 1. Amin et al. 
Thin-layer chromatography
Pre-coated silica gel 60 F254 plates, 20 × 20 cm with a layer thickness of 0.25 mm (E. Merck, Darmstadt, FRG), divided into 12 1.5-cm bands, were used. The standard and sample solutions were applied in the form of spots with automatic spotting micropipettes (1.2 and 5 µl, according to Dr. Barrolier). The sample solutions were usually applied twice and standard solutions only once; hence a total of four different samples could be spotted on one plate. To obtain a calibration graph, amounts of LHM between 0.5 and 2 µg per spot were applied. After saturation of the chamber with the mobile phase (chloroform - methanol, 85 + 15) LHM was separated. The length of the run was about 15 cm and the developing time was approximately 20 min. The determination must be carried out with the exclusion of light at <20 °C.
Measurement and evaluation
The lisuride hydrogen maleate spots were measured directly on the TLC plates with a PMQ II thin-layer chromatogram spectrophotometer. The quantitation of the spots was carried out either on-line, using a CHB 66/80 computer, on the basis of the calibration lines set up on the same TLC plate, or with reference to the degree of remission of a reference spot of known concentration.4
Results
Figs. 2 and 3 show the reflection spectrum and the linearity of the relationship between the concentration and the remission for LHM. Fig. 4 shows the separation of LHM from the pharmaceutical preparations. Typical chromatograms obtained by TLC for pharmaceutical preparations are shown in Figs. 5 and 6.
Prediction of Stability
The prediction of the stability of lisuride hydrogen maleate in tablets was determined using the temperature coefficient Q (Fig. 7) of the substance. The temperature coefficient was estimated graphically from the degradation curves of lisuride hydrogen maleate at 40 and 50 °C and/or at 50 and 60 °C (Fig.
7) using the relationship
t(x - 10) = Q × t(x)
where t(x) = time for 10% decomposition at x °C, so that extrapolation over n steps of 10 °C gives t(x - 10n) = Q^n × t(x).
Fig. 2. Reflection spectrum of 2 µg of LHM on a silica gel plate (λmax = 245 nm)
Fig. 3. Calibration line in reflected light of LHM measured on a silica gel plate (λmax = 245 nm)
Fig. 4. Separation of LHM from pharmaceutical preparations on a silica gel plate. S = standard solution; A, B, C = solutions of preparations A, B and C, respectively. Amount applied = 2 µg per spot. The photograph was taken with a Reprostar camera at 365 nm
Fig. 5. Typical thin-layer chromatogram of LHM on a silica gel plate. S = standard solution; A, B, C = solutions of preparations A, B and C, respectively. Amount applied = 2 µg per spot
Fig. 6. Typical thin-layer chromatogram of LHM on a silica gel plate (calibration graph). Amounts applied: 0.5, 1, 1.5 and 2 µg per spot
If 10% of the lisuride hydrogen maleate was degraded at 50 °C after a storage time of 3 months (Q = 2.31), then the same amount of degradation, 10% of the active ingredient, will occur (i) at 40 °C after 2.31 × 3 = 6.9 months, (ii) at 30 °C after 2.31² × 3 = 16.0 months and (iii) at 20 °C after 2.31³ × 3 = 36.9 months. Similarly, if 10% of lisuride hydrogen maleate was degraded at 40 °C after a storage time of 6.7 months and Q = 2.23, then
Fig. 7. Degradation of LHM in tablets after storage at 20, 40, 50 and 60 °C for different periods of time
Table 1. Reproducibility of the determination of lisuride hydrogen maleate using the pure substance. Results are the average of nine measurements
Amount applied per spot . . . . . . 2 µg
Arithmetic mean of the remission* (x̄) . . 81.61%
Standard deviation of the individual values (S.D.) . . 1.01%
Coefficient of variation . .
1.24%
* Relative degree of remission (%) = (RF/RU) × 100, where RF = degree of remission of the compound spot and RU = degree of remission of the lower stratum of the plate.
the same amount of degradation, 10% of the active ingredient, will occur (i) at 30 °C after 2.23 × 6.7 = 14.9 months and (ii) at 20 °C after 2.23² × 6.7 = 33.3 months.
Table 2. Determination of lisuride hydrogen maleate in pharmaceutical preparations A, B and C. Results are the means of 11 determinations
Preparation   Amount of active ingredient   Mean       S.D.       Coefficient of
              in preparation                found                 variation, %
A             25 µg per tablet              24.85 µg   0.516 µg   2.07
B             10 mg per 100 mg              10.02 mg   0.234 mg   2.35
C             100 µg ml-1                   99.80 µg   1.945 µg   1.95
Table 3. Results of studies on the stability of lisuride hydrogen maleate in pharmaceutical preparations. Results are means of six samples; value after preparation but before storage, 99.4 ± 2.07%; results quoted as percentage lisuride hydrogen maleate in tablets or solution
LHM in tablets:
Storage time/months   20 °C        40 °C       50 °C       60 °C
1                     99.3 ± 2.1   99.1 ± 1.8  98 ± 1.6    93 ± 1.8
3                     99.0 ± 2.1   96 ± 1.7    90 ± 2.5    80 ± 1.9
6                     97.5 ± 2.2   91 ± 2.4    75 ± 3.0    62 ± 2.6
9                     96.0 ± 2.2   86 ± 2.7    60 ± 2.9    43 ± 2.5
12                    93.5 ± 2.9   81 ± 2.7    45 ± 2.7    25 ± 3.0
LHM in solution:
1                     50 ± 2.5     4 ± 1.3     0*          0*
* Active ingredient was completely degraded.
Discussion
The TLC method described is sensitive, rapid and accurate for the determination of lisuride hydrogen maleate as the pure substance and in pharmaceutical preparations such as tablets and solutions. It is suitable for quality control (Tables 1 and 2) and can be reproduced with a maximum coefficient of variation of 2.35%. The method is also suitable for long-term stability studies (Table 3) and is applicable to the separation of the degradation product of LHM.
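The shelf-life extrapolation used under Prediction of Stability is easy to reproduce; the sketch below is illustrative (the function name is invented), but the numbers are those quoted in the text.

```python
def predicted_shelf_life(t10_months, temp_c, target_c, q):
    """Extrapolate the time to 10% decomposition from temp_c down to
    target_c, using the temperature coefficient Q per 10 degree step:
    t(x - 10n) = Q**n * t(x)."""
    n_steps = (temp_c - target_c) / 10
    return t10_months * q ** n_steps

# 10% degradation after 3 months at 50 C with Q = 2.31 (values from the text)
print(predicted_shelf_life(3, 50, 40, 2.31))    # ~6.9 months
print(predicted_shelf_life(3, 50, 20, 2.31))    # ~37 months
# 10% degradation after 6.7 months at 40 C with Q = 2.23
print(predicted_shelf_life(6.7, 40, 20, 2.23))  # ~33.3 months
```

These reproduce the 6.9-, 36.9- and 33.3-month figures quoted in the text to within rounding.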
The degradation product occurs as a fluorescent spot (excitation wavelength 365 nm, λmax = 440 nm) and has a smaller RF value than the active substance. This degradation product and the pathway of its formation are still under investigation. The stability studies on LHM show that this substance degrades after short storage times at 50 and 60 °C. After 12 months' storage at 50 and 60 °C it was found that about 55 and 75% of the initial dose were degraded, respectively. The tablets and powder containing this active ingredient should be kept at temperatures no greater than 20 °C in order to achieve a shelf-life of 2 years. In solution, the substance demonstrates considerable instability even at 20 °C. The degradation was such that at 20 °C after 1 month's storage, only 50% of the active ingredient remained. With storage at 40 °C this fell to 4%, and at 50 and 60 °C no active ingredient remained in solution after 1 month. It is therefore recommended that pharmaceutical preparations containing LHM must be solid and should not be used more than 2 years after the production date. To achieve a shelf-life of longer than 3 years the drug (tablets, capsules or powder) should be refrigerated at 4 °C.
References
1. Filke, W. W., Anal. Chem., 1966, 38, 1967.
2. Schlemmer, W., J. Chromatogr., 1971, 63, 121.
3. Oesterling, T. O., Morozowich, W., and Roseman, T. J., J. Pharm. Sci., 1972, 61, 1861.
4. Amin, M., and Jakobs, U., Fresenius Z. Anal. Chem., 1974, 268, 119.
5. Amin, M., J. Chromatogr., 1974, 101, 387.
6. Martin, J. L., Duncombe, R. E., and Shaw, W. H. C., Analyst, 1975, 100, 243.
7. Amin, M., J. Chromatogr., 1975, 108, 313.
8. Amin, M., Fresenius Z. Anal. Chem., 1987, 328, 114.
9. Tauber, U., Amin, M., Fuchs, P., and Speck, U., Arzneim. Forsch. (Drug Res.), 1976, 26, 1492.
10. Amin, M., and Sepp, W., J. Chromatogr., 1976, 118, 225.
11. Amin, M., and Jakobs, U., J. Chromatogr., 1977, 131, 391.
12. Touchstone, J. C., and Sherma, J., Editors, "Densitometry in Thin Layer Chromatography," Wiley-Interscience, New York, 1979.
13. Amin, M., Korbakis, Z., and Petrick, D., Fresenius Z. Anal. Chem., 1976, 279, 283.
14. Amin, M., in Bertsch, W., Hara, S., Kaiser, R., and Zlatkis, A., Editors, "Instrumental HPTLC," Hüthig, Heidelberg, 1980, pp. 9-37.
Paper A61261
Received August 4th, 1986
Accepted July 3rd, 1987
ISSN:0003-2654
DOI:10.1039/AN9871201663
Publisher: RSC
Year: 1987
Data source: RSC
|
7. |
Isolation of codeine and norcodeine from microbial transformation liquors by preparative high-performance liquid chromatography |
|
Analyst,
Volume 112,
Issue 12,
1987,
Page 1667-1670
Mark Gibson,
Preview
|
PDF (582KB)
|
|
Abstract:
ANALYST, DECEMBER 1987, VOL. 112 1667 Isolation of Codeine and Norcodeine From Microbial Transformation Liquors by Preparative High-performance Liquid Chromatography Mark Gibson,* Terry M. Jefferies† and Colin J. Soper School of Pharmacy and Pharmacology, University of Bath, Claverton Down, Bath BA2 7AY, UK
Codeine and norcodeine have been extracted together from a 7-l batch of microbial transformation liquor using an XAD-4 resin column. The extract was divided equally and used to compare normal-phase (NP) and reversed-phase (RP) preparative high-performance liquid chromatography for the recovery of pure codeine and norcodeine. The conditions for each column were optimised for maximum sample throughput with complete resolution of both compounds. The aqueous fractions of codeine and norcodeine from the RP column were also passed through an XAD-4 resin column to extract the compounds finally, whereas the organic solvent fractions from the NP column did not require this step. The RP procedure gave recoveries of 71.25% of codeine (98.6% purity) and 80.69% of norcodeine (97.5% purity) and the NP procedure gave recoveries of 79.20% of codeine (96.7% purity) and 83.97% of norcodeine (97.2% purity). Owing to its superior sample throughput and lower operating costs, the RP procedure is recommended for routine use.
Keywords: Codeine and norcodeine isolation; preparative high-performance liquid chromatography; microbial transformation liquors; XAD-4 resin extraction
N-Demethylation is often an important intermediate step in drug synthesis and manufacture, but such reactions can be difficult to achieve by chemical means, requiring toxic reagents and often resulting in low or variable yields. The use of micro-organisms may offer a potentially safer and more efficient means of preparing N-demethylated drug intermediates.1 It has been demonstrated that certain species of the fungus Cunninghamella are capable of N-demethylating a variety of pharmaceutically important drug molecules. For example, in a model transformation system, the tertiary amine codeine is selectively converted by Cunninghamella to its N-demethylated metabolite norcodeine, rather than to the O-demethylated metabolite morphine.1 The codeine to norcodeine transformation has been scaled up from shake flasks to a laboratory fermenter with a 7-l capacity in order to produce workable amounts of the desired norcodeine transformation product. After the transformation period, pure norcodeine and codeine were recovered by an initial extraction of the two compounds on to an XAD-4 resin column, followed by their separation using preparative high-performance liquid chromatography. Both normal-phase (NP) and reversed-phase (RP) columns were optimised for maximum sample throughput with complete resolution, and a comparison was made of the recoveries achieved using half the XAD-4 resin extract for each column. The yield and purity of codeine and norcodeine were determined by analytical HPLC after each stage of the process.
Experimental
Apparatus
The analytical HPLC system consisted of an LDC Constametric III pump, a Rheodyne 7125 injection valve fitted with a 100-µl loop and a Pye Unicam UV detector coupled to a Perkin-Elmer Sigma 10 data station. A 150 × 4.6 mm i.d. stainless-steel column was slurry packed with 5-µm ODS-Hypersil.
* Present address: R and D Laboratories, Cyanamid of Great Britain Ltd., Fareham Road, Gosport, Hampshire PO13 0AS, UK.
† To whom correspondence should be addressed.
The preparative HPLC system consisted of a Model 830 preparative liquid chromatograph (Du Pont, Stevenage, UK), which used a Haskel pneumatic amplifier pump of 70 ml stroke capacity, a Rheodyne 7125 injection valve fitted with either a 2-ml or 10-ml sample loop, a three-port eluent fraction collection assembly, a Du Pont UV - visible detector fitted with a preparative flow cell and a Servogor 220 recorder. The injection port assembly, HPLC column and mobile phase were maintained at 25 °C by an integral oven. The fermenter used to produce the transformation material was a 2000 Series of 7-l capacity supplied by L. H. Engineering (Stoke Poges, UK).2 The microbial transformation medium has been described previously.3
Reagents and Materials
Codeine phosphate was purchased from MacFarland Smith (Edinburgh, UK). Norcodeine base reference standard was prepared from codeine by the method of Montzka et al.4 Acetonitrile, methanol, chloroform, 1,2-dichloroethane and propan-2-ol were all of HPLC grade and all other solvents were of laboratory-reagent grade from Fisons (Loughborough, UK). De-ionised, glass doubly distilled water was used for the preparation of reagent solutions and HPLC aqueous mobile phases. Potassium dihydrogen phosphate (KH2PO4), disodium hydrogen phosphate (Na2HPO4.2H2O) and ammonium acetate were of analytical-reagent grade and triethylamine was of laboratory-reagent grade, from BDH (Poole, UK). Glacial acetic acid and pentanesulphonic acid sodium salt (HPLC grade) were from Fisons, UK. Amberlite XAD-4 resin material (BDH) was washed thoroughly before use with methanol followed by distilled water. All HPLC stationary phases were obtained from Jones Chromatography (Glamorgan, UK).
Procedures
Analytical HPLC
The mobile phase consisted of 5 mM pentanesulphonic acid sodium salt in 4.4 mM Sorensen's phosphate buffer (pH 6.5) - acetonitrile (84 + 16).
Detection was at 240 nm at a flow-rate of 1.5 ml min-1. A series of calibration solutions were prepared in the mobile phase from a stock solution containing 100 mg each of codeine and norcodeine to produce a calibration range of 50-250 µg ml-1. Each solution was assayed in triplicate and the peak heights were used for the calibrations. Samples were determined by an external standardisation procedure in which standard solutions (100 µl), containing codeine and norcodeine in the mobile phase, were injected alternately with extracted medium samples, or transformation medium that had been diluted with the mobile phase if necessary to fit the calibration range.
Table 1. Accuracy and precision data for 10 replicate injections of codeine and norcodeine at 50 µg ml-1 in the mobile phase and fermentation liquor, using the reversed-phase analytical HPLC assay
                                      Mobile phase            Transformation liquor
Parameter                          Codeine   Norcodeine       Codeine   Norcodeine
Mean peak height/mm                 41.6       64.2            40.8       63.7
Standard deviation/mm               0.765      1.129           0.799      1.159
Relative standard deviation
  (RSD), %                          1.84       1.76            1.96       1.82
Range of error for duplicate
  injections,* %                    2.94       2.81            3.13       2.91
Accuracy,† %                        -          -               98.1       99.2
* t × RSD/√n, where t = 2.26 at 10 - 1 degrees of freedom and n = 2.
† Mean concentration in transformation liquor divided by mean concentration in mobile phase.
Table 2. Recovery of codeine and norcodeine from spiked fermentation liquor using the XAD-4 resin procedure
         Codeine added    Norcodeine added        Recovery, %
Flask    in 250 ml/mg     in 250 ml/mg        Codeine   Norcodeine
1            100                0               84.9        -
2             70               30               83.6       96.5
3             50               50               82.5       95.8
4             30               70               83.1       97.3
5              0              100                -         97.7
                                       Mean:    83.5       96.8
Extraction with Amberlite XAD-4 resin
A 500 × 25 mm i.d. glass column with tap was packed with 100 g of Amberlite XAD-4 resin in distilled water. The pH of media samples containing codeine and norcodeine was adjusted to pH 9.5 with 2 M NaOH and the media were pumped through the column at 2 ml min-1 using a peristaltic pump. The alkaloids were desorbed from the column using acetone at 2 ml min-1, dried with anhydrous MgSO4, filtered (Whatman No. 1) and evaporated to dryness under vacuum using a rotary evaporator.
Preparative HPLC
Columns were slurry-packed in chloroform - methanol (4 + 1) using 0.7 g of packing material per cm3 of column volume at 5000 lb in-2, using the equipment and procedures described by Hamilton and Sewell.5 The mass of material packed into each column was determined by the collection, drying and weighing of excess material. Residues to be injected were dissolved in the minimum volume of mobile phase, filtered (0.45 µm Millipore Type HA) and introduced using either a 2- or 10-ml loop. Fractions were collected by monitoring the UV detector and manually operating a four-way valve connected to the detector outlet. This permitted three fractions to be collected individually or the mobile phase to be recycled, if required.
The normal-phase system consisted of a 250 × 22 mm i.d. 15-25 µm LiChroprep Si 60 column with a mobile phase consisting of 95% ethanol - 1,2-dichloroethane - acetonitrile (50 + 45 + 5) and containing 4 × 10-5% of triethylamine. The flow-rate was optimised at 25 ml min-1 at 25 °C and detection at 260 nm. The reversed-phase system consisted of a 500 × 22 mm i.d. 25-40 µm LiChroprep RP-18 column with a mobile phase consisting of methanol - phosphate buffer at pH 7.0 (40 + 60). The optimum flow-rate was found to be 20 ml min-1 at 25 °C and detection at 260 nm.
Results and Discussion
Analytical HPLC
A satisfactory procedure was required to determine the initial amounts of codeine and norcodeine in transformation liquor and also to determine the yield and purity of these compounds at each stage of the extraction scheme. Reversed-phase chromatography was selected for development because it was envisaged that these compounds could be determined directly from the transformation liquor without the prior need for an organic extraction. This was achieved by using the ion-pairing reagent pentanesulphonic acid sodium salt to increase the retention of codeine and norcodeine away from a large unretained peak caused by components in the transformation medium. The external standard method described under Procedures gave linear relationships for both codeine and norcodeine over the concentration range 50-250 µg ml-1 (r > 0.998). The precision of the method was tested by the addition of codeine and norcodeine at 50 µg ml-1 to the mobile phase and to the transformation liquor, and this gave a relative standard deviation for replicate injections (n = 10) of <2%. The accuracy in measuring these compounds directly in the transformation liquor was better than 98% (Table 1).
Extraction with Amberlite XAD-4 Resin
Previous workers have demonstrated that the extraction of drug molecules from body fluids with Amberlite resins is comparable to, if not better than, the conventional liquid - liquid extraction techniques in terms of time, efficiency and cost.6-8
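The mean recoveries quoted in Table 2 are straightforward to verify; the snippet below is only an illustrative check (variable names are invented), using the per-flask values from the table.

```python
# Recoveries from Table 2 (flasks 1 and 5 contain a single compound,
# so each mean is taken over the four spiked levels).
codeine = [84.9, 83.6, 82.5, 83.1]      # % recovered, flasks 1-4
norcodeine = [96.5, 95.8, 97.3, 97.7]   # % recovered, flasks 2-5

def mean(values):
    return sum(values) / len(values)

def spread(values):
    # max - min, as a rough measure of composition dependence
    return max(values) - min(values)

print(round(mean(codeine), 1), round(mean(norcodeine), 1))    # 83.5 96.8
print(round(spread(codeine), 1), round(spread(norcodeine), 1))
```

The small spreads (a few per cent) are consistent with the statement that recovery is largely independent of the codeine to norcodeine ratio.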
The resin extraction procedure involves single adsorption and desorption steps, and the over-all resin extraction efficiency is dependent on factors such as the pH of the sample to be extracted, operating temperature, flow-rate of sample through the column, polarity of the eluting solvent and flow-rate of the eluent.8 Preliminary investigations of three column materials were conducted on a 1-g scale to compare their extraction performances for codeine and norcodeine (10 mg of each) from the transformation media. Partisil 10 ODS-2 was chosen because of its high carbon loading (22%) and its performance determined under different conditions of temperature, flow-rate, pH of transformation medium and polarity of eluting solvent. The maximum recovery of codeine and norcodeine occurred at 4 °C and 5 ml min-1 when the medium pH was 8.5 and contained 5% methanol. Acetone was the eluting solvent. Extraction efficiencies for codeine and norcodeine under these conditions were 70 and 68%, respectively, or 6.9 and 6.7 mg of test compound per gram of packing material. Amberlite XAD-2 and XAD-4 resins were examined at 25 °C and 1 ml min-1 with the same amounts of the two compounds as before, finally eluting with acetone. The maximum extraction efficiency from XAD-2 occurred at pH 8.5 for codeine (71%) and pH 9.5 for norcodeine (87%). Although XAD-4 behaved similarly towards pH, extraction efficiencies were higher, and at pH 9.5 were 76 and 96%, respectively; hence XAD-4 was selected for further studies.
In microbial transformation studies variable concentrations of substrate and transformation products are encountered, with the product concentration in the transformation liquor increasing at the expense of the substrate. Therefore, the recovery of codeine and norcodeine was determined over a range of concentrations employing a 100-g XAD-4 resin column as described under Procedures. The results in Table 2 show that the mean recoveries are high, at 83.5 and 96.8% for codeine and norcodeine, respectively, and that these recoveries are largely independent of the relative proportions of codeine and norcodeine present in the transformation liquor.
Fig. 1. Preparative HPLC separations obtained under maximum throughput conditions, as described under Procedures. (a) 15-25 µm LiChroprep Si 60, 250 × 22 mm i.d. column; 2-ml injection containing 135 mg each of (A) codeine and (B) norcodeine bases; flow-rate, 25 ml min-1. (b) 25-40 µm LiChroprep RP-18, 500 × 22 mm i.d. column; 10-ml injection containing (A) 552 and (B) 450 mg of the codeine and norcodeine bases, respectively; flow-rate, 20 ml min-1
Fig. 2. Comparison of extraction procedures employing normal-phase and reversed-phase preparative HPLC. The fermentation liquor containing codeine and norcodeine is basified to pH 9.5 (5 M NaOH), filtered (Whatman No. 1) and the cells washed (distilled water); after XAD-4 extraction (A) the acetone eluate is divided equally, evaporated to dryness and dissolved in the appropriate mobile phase for normal-phase or reversed-phase preparative HPLC. The organic mobile phase (normal phase) or methanol (reversed phase) is then evaporated; the aqueous reversed-phase codeine and norcodeine fractions require a further XAD-4 step (B) followed by evaporation of the acetone eluate.
Preparative HPLC
Thin-layer chromatography was used to optimise the normal-phase separation of codeine and norcodeine using HPLC-compatible solvents and silica gel layers. The mobile phase which gave RF values of 0.33 and 0.15 for these compounds was tested on a 150 × 4.6 mm i.d. silica column and gave k' values of 2.0 and 9.2, respectively, with the resolution calculated as 7.3. This system was then scaled up to the 250 × 22 mm i.d. column and the conditions given in Procedures were used. The optimum flow-rate was determined using 4 mg ml-1 each of codeine and norcodeine and an injection volume of 2 ml at flow-rates from 15 to 40 ml min-1. Column efficiencies were calculated as reduced plate height (h) and plotted against mobile-phase velocity. The minimum values of 4.0 and 7.0 for h were achieved for norcodeine and codeine, respectively, at a flow-rate of 25 ml min-1. The maximum sample loading which could be used to produce complete resolution (100% purity) at the optimum flow-rate was then determined and found to be 135 mg each of codeine and norcodeine in a 2-ml injection volume.
Two parameters that are commonly used to describe the performance of a preparative HPLC system9 are the "loading capacity," defined as the maximum amount of sample that can be applied without impairing the separation efficiency, and the "throughput," defined as the amount of pure substance that can be separated per unit time. Using these definitions the silica column had a sample capacity of 270 mg, equivalent to 5.12 mg of sample per gram of silica, and a throughput of 9 mg of alkaloid per minute using a 30-min run time per sample. Fig. 1(a) is a chromatogram obtained under these conditions of maximum throughput.
For the reversed-phase system, a 100 × 4.6 mm i.d. 10-µm Partisil ODS-2 column was used to examine the conditions suitable for scaling up. Methanol, propan-1-ol, tetrahydrofuran and acetonitrile were compared for their selectivity using ammonium acetate - acetic acid as buffer at various pHs. A methanol - acetate buffer of pH 5.5 (30 + 70) gave k' values of 3.4 and 7.0 for norcodeine and codeine, respectively, with a resolution of 1.8. This was then scaled up to a 500 × 22 mm i.d. column of LiChroprep RP-18 and the methanol concentration increased to reduce the analysis time.
Phosphate buffer was found to increase the retention of codeine compared with that obtained in acetate buffer, so that with a 2-ml loop the resolution increased from 1.75 to 2.51, although the analysis time also increased from 25 to 45 min. The optimum flow-rate was determined and minimum h values of 7 and 12.8 were obtained for codeine and norcodeine, respectively, at a flow-rate of 20 ml min-1. The loading capacity was then studied at this flow-rate using a 10-ml sample loop with the detector sensitivity set to a minimum (2.56 a.u.f.s.) at 260 nm. Increasing amounts of codeine and norcodeine were introduced and the fractions collected and analysed for purity. It was found that at a purity of >98%, 552 mg of codeine base and 450 mg of norcodeine base could be separated, equivalent to a sample loading of 9.48 mg g-1 of packing material and a loading capacity of 1002 mg. Under these conditions the column throughput was 22.3 mg of alkaloid base per minute using a 45-min run time per sample. Fig. 1(b) is a chromatogram obtained under these conditions of maximum throughput.
Comparison of Normal and Reversed-phase Preparative Systems
Fig. 2 summarises the two procedures used for the recovery of pure codeine and norcodeine from the transformation liquor using XAD-4 resin with either normal-phase or reversed-phase preparative HPLC. The advantage of using an organic mobile phase with the normal-phase column is the ease of compound recovery, and HPLC-grade solvents were used to avoid contamination of the compound residues. The reversed-phase system required an additional extraction step (B) to follow the HPLC, which increased the time of the complete procedure. The total amount of codeine and norcodeine in the 7-l fermentation run exceeded the extraction capacity of the XAD-4 column and so was extracted as three separate volumes. The three acetone extracts obtained were combined and then divided into two equal volumes so that the silica and reversed-phase systems could be compared directly. The content of codeine and norcodeine in the fermentation liquor was determined at each stage of the two procedures and the results are given in Table 3.
Table 3. Recoveries of codeine and norcodeine from a 7-l batch of fermentation liquor at each stage of the two extraction schemes given in Fig. 2
                                               Codeine                    Norcodeine
Stage of procedure                    Amount recovered/g  Recovery, %  Amount recovered/g  Recovery, %
Starting transformation mixture,
  6900 ml                                 1.9044             100           0.8942            100
XAD-4 (100 g) extraction, (A)             1.6592             87.12         0.8246            92.2
Sample divided into two equal
  fractions                           0.8296 + 0.8296        87.12     0.4123 + 0.4123       92.2
Reversed-phase preparative HPLC           0.7946             80.95         0.3973            88.86
XAD-4 (100 g) extraction, (B)             0.6785             71.25         0.3608            80.69
Normal-phase preparative HPLC             0.7542             79.20         0.3755            83.97
Purity of final product: reversed phase, 98.6% (codeine) and 97.5% (norcodeine); normal phase, 96.7% (codeine) and 97.2% (norcodeine).
They show that both systems are very suitable for the isolation of codeine and norcodeine from the fermentation liquor. The normal-phase system gave higher recoveries for codeine (79.2%) and norcodeine (84%) compared with those from the reversed-phase system, which were 71.3 and 80.7%, respectively. This was entirely due to the additional XAD-4 extraction step (B) required. Calculations based on the data in Table 3 show that this step is the least efficient in the scheme; the recoveries of 87.1 and 92.2% for codeine and norcodeine at step A are repeated at step B with recoveries of 85.4 and 90.81%, respectively. Similar calculations can be made for the extraction efficiencies of the two HPLC systems, which show that the recoveries of codeine and norcodeine using the reversed-phase method are 95.8 and 96.4%, respectively, compared with 90.9 and 91.1% for the normal-phase method. The extraction scheme should therefore be based on the reversed-phase system and the efficiency of the XAD-4 step improved.
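The overall recoveries quoted above are simply products of the individual stage efficiencies, and the throughput figures follow from loading capacity divided by run time. A short sketch of both checks (function and variable names are invented):

```python
def overall_recovery(stage_efficiencies):
    """Chain per-stage recoveries (as fractions) into an overall yield."""
    result = 1.0
    for eff in stage_efficiencies:
        result *= eff
    return result

# Reversed-phase route for codeine: XAD-4 (A) 87.12%, preparative
# HPLC 95.8%, XAD-4 (B) 85.4% -- stage figures quoted in the text.
print(round(100 * overall_recovery([0.8712, 0.958, 0.854]), 1))  # ~71.3%

# Throughput = loading capacity / run time (mg of alkaloid per minute)
silica_throughput = 270 / 30        # 9.0 mg per minute
rp_throughput = (552 + 450) / 45    # ~22.3 mg per minute
print(silica_throughput, round(rp_throughput, 1))
```

Both chained products agree with the 71.25 and 80.69% overall recoveries in Table 3 to within rounding, and the throughput ratio makes the case for the reversed-phase column.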
This improvement could probably be achieved by using the recently introduced polymeric HPLC materials, which have similar polystyrene - divinylbenzene structures and properties to XAD-4 but much higher efficiency owing to their small (8-10 µm) particle size. A 300 × 7.5 mm i.d. Polymer PLRP-S column (Polymer Laboratories, Shropshire, UK) has been used by one of us for the large-scale separation of lipids from organochlorine pesticides and polychlorinated biphenyls.10 This would mean of course that each extraction scheme now contained two HPLC steps, unless the polymeric column could also be used to separate codeine from norcodeine. However, the investigation of such a column is outside the scope of this paper.
The superior throughput of the reversed-phase column (22.3 mg min-1 of alkaloid base) compared with the silica column (9.0 mg min-1 of alkaloid base) also indicates that reversed-phase HPLC should be used. Even if the silica column were doubled in length to equal the reversed-phase column, its increased throughput would only be 18 mg min-1, assuming the mobile phase composition could be adjusted to retain the same analysis time. Finally, the cost of using the reversed-phase system is less than that for the normal-phase system because of the lower cost of the aqueous - organic mobile phase and the long life of the reversed-phase material. On the other hand, silica columns gradually lose efficiency, and hence throughput, and are difficult to re-activate.
References
1. Sewell, G. J., Soper, C. J., and Parfitt, R. T., Appl. Microbiol. Biotechnol., 1984, 19, 247.
2. Sewell, G. J., PhD Thesis, Bath University, 1982.
3. Gibson, M., Soper, C. J., Parfitt, R. T., and Sewell, G. J., Enzyme Microb. Technol., 1984, 6, 471.
4. Montzka, T. A., Matiskella, J. D., and Partyka, R. A., Tetrahedron Lett., 1974, 14, 1325.
5. Hamilton, R. J., and Sewell, P. A., Editors, "Introduction to HPLC," Chapman and Hall, London and New York, 1982, p. 90.
6. Miller, W. L., Kullberg, M. P., Banning, B. E., Brown, L. D., and Doctor, B. P., Biochem. Med., 1973, 7, 145.
7. Kullberg, M. P., Miller, W. L., McGowan, F. J., and Doctor, B. P., Biochem. Med., 1973, 7, 323.
8. Weisman, N., Lowe, M. L., Beattie, J. M., and Demetriou, J. A., Clin. Chem., 1971, 17, 875.
9. DeJong, A. W. J., Poppe, H., and Kraak, J. C., J. Chromatogr., 1978, 148, 127.
10. Seymour, M. P., Jefferies, T. M., and Notarianni, L. J., Analyst, 1986, 111, 1203.
Paper A71207
Received May 21st, 1987
Accepted July 9th, 1987
ISSN:0003-2654
DOI:10.1039/AN9871201667
Publisher: RSC
Year: 1987
Data source: RSC
|
8. |
Determination of stanozolol in tablets by derivative ultraviolet spectrophotometry and high-performance liquid chromatography |
|
Analyst,
Volume 112,
Issue 12,
1987,
Page 1671-1674
Vanni Cavrini,
Preview
|
PDF (481KB)
|
|
Abstract:
1671 ANALYST, DECEMBER 1987, VOL. 112 Determination of Stanozolol in Tablets by Derivative Ultraviolet Spectrophotometry and High-performance Liquid Chromatography* Vanni Cavrini, A. Maria Di Pietra, M. Augusta Raggi and M. Grazia Maioli Department of Pharmaceutical Sciences, University of Bologna, via Belmeloro 6, 40 126 Bologna, Italy Two rapid assay procedures based on high-performance liquid chromatography (HPLC) and derivative UV spectrophotometry have been developed for the specific determination of the anabolic steroid stanozolol in pharmaceutical formulations (tablets). The HPLC determination was carried out on a reversed-phase C18 column using a mobile phase consisting of methanol - 0.05 M aqueous ammonium dihydrogen phosphate (85 + 15) at a flow-rate of 1.0 ml min-1 with UV detection at 230 nm.The procedures based on first- and second-derivative spectrophotometry have been shown to be able to suppress the background absorption due to excipients. The described HPLC and derivative spectrophotometric methods were com parable in terms of accuracy and precision and are a convenient alternative to the lengthy official USP methods for the rapid and reliable quality control of commercial stanozolol dosage forms. Keywords: High-performance liquid chromatography; derivative spectroscopy; stanozolol determination; pharmaceutical formulations; ultraviolet spectrophotometry Stanozolol, 17a-methyl-2’H-5a-androst-2-eno[3,2-c]pyrazol- 17p-01, is a heterocyclic steroid displaying pronounced ana- bolic properties with low androgenic effects.’ It is used therapeutically to accelerate recovery from protein deficiency and protein-wasting disorders (e.g., osteoporosis).Few methods are available for determining stanozolol in its dosage forms. 
The pharmacopoeial methods2 include a difference spectrophotometric procedure for the determination of stanozolol in tablets and a spectrophotometric method, based on the determination of the stanozolol - bromocresol purple complex, for tablet dissolution tests. Other reported methods include gas chromatographic3 and spectrophotometric4 procedures. Combined gas chromatography - mass spectrometry has been used for the determination of stanozolol in urine samples for doping control.5,6

Therefore, it was considered desirable to develop additional assay methods suitable for the rapid and reliable quality control of stanozolol pharmaceutical formulations. The versatility of high-performance liquid chromatography (HPLC) in steroid analysis is well known7,8; on the other hand, derivative ultraviolet (UV) spectrophotometry has been successfully used for the correction of the background absorption due to the inactive components in the analysis of pharmaceutical steroid formulations.9,10 In this study both HPLC and derivative spectrophotometric approaches were followed to develop rapid alternative procedures to the lengthy official methods for the specific determination of stanozolol in commercial dosage forms.

Experimental

Materials
Stanozolol and starch, coloured with erythrosine (E127), were kindly supplied by Zambon (Italy). The internal standard, 4-acetylbiphenyl, was purchased from Janssen Chimica (Belgium). All other chemicals were RPE grade from Carlo Erba (Italy). Methanol (HPLC grade) (Carlo Erba) and de-ionised, doubly distilled water were used for the chromatographic procedure. A stanozolol stock solution (80.76 µg ml-1) was prepared in ethanol - 0.1 M hydrochloric acid (7 + 3) and the internal standard, 4-acetylbiphenyl, in methanol (115 µg ml-1).
* Presented in part at the 2nd International Symposium on Drug Analysis, Brussels, 27-30 May, 1986.

Apparatus
A Varian HPLC system consisting of a Model 5020 chromatograph, a UV-50 variable-wavelength detector and a Model 4290 integrator was used. Manual injections were made using a Rheodyne 7125 injection valve (10-µl loop). The spectrophotometric studies were performed on a Varian Model DMS 90 double-beam spectrophotometer using 1-cm quartz cells with a slit width of 1 nm and a scan speed of 50 nm min-1 over the range 300-200 nm. A Varian Model 9176 recorder was used with a full-scale deflection of 5 mV × 100 (first-derivative spectra) or 1 mV × 100 (second-derivative spectra) and a chart speed of 2 cm min-1.

HPLC Method
Chromatographic conditions
Chromatographic separations were performed at ambient temperature on a reversed-phase Hypersil RP-18, 10-µm column (30 cm × 4 mm i.d.) using a mobile phase consisting of methanol - 0.05 M ammonium dihydrogen phosphate (85 + 15) at a flow-rate of 1.0 ml min-1. The detector wavelength was set at 230 nm with a sensitivity of 0.05 a.u.f.s.

Calibration graphs
Working standard solutions containing 16-40 µg ml-1 of stanozolol and 11.5 µg ml-1 of internal standard were prepared by transferring 2-5 ml of stanozolol stock solution into separate 10-ml calibrated flasks, adding 1.0 ml of internal standard solution and adjusting the volume with methanol. The peak heights were measured and the peak-height ratios of the drug to internal standard were plotted against the respective mass ratios to obtain the calibration graphs.

Spectrophotometric Method
The notation for amplitude measurements in the derivative domain was made according to Fasanmade and Fell.11

Calibration graphs
Working standard solutions containing 40-100 µg ml-1 of stanozolol were prepared in ethanol - 0.1 M HCl (7 + 3). The derivative UV spectra were recorded against a solvent blank.
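The derivative measurements described above rely on the fact that differentiation suppresses a flat background: a constant excipient offset cancels in the derivative, so the peak-to-zero and peak-to-peak amplitudes are unchanged. A minimal numerical sketch with a synthetic Gaussian band (all values illustrative, not taken from the paper):

```python
import math

def first_derivative(spectrum, step=1.0):
    """Central-difference first derivative of a sampled spectrum."""
    return [(spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
            for i in range(1, len(spectrum) - 1)]

# Synthetic absorbance band (Gaussian centred at "239 nm") plus a
# constant background standing in for excipient absorption.
wavelengths = list(range(200, 301))
band = [math.exp(-((w - 239) ** 2) / (2 * 8.0 ** 2)) for w in wavelengths]
background = 0.3  # flat excipient absorbance (illustrative)
with_bg = [a + background for a in band]

d_pure = first_derivative(band)
d_with_bg = first_derivative(with_bg)

# The constant background cancels in the difference, so derivative
# amplitude readings such as 1D239 are unaffected by it.
assert max(abs(a - b) for a, b in zip(d_pure, d_with_bg)) < 1e-12
```

A linearly sloping background is similarly removed by the second derivative, which is why both 1D and 2D amplitudes are used in the paper.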
The peak-to-zero amplitudes 1D239 and the peak-to-peak amplitudes 2D247,231 were measured (in millimetres) and plotted against the corresponding concentrations to obtain the respective calibration graphs.

Stanozolol Assay
An amount of powdered tablet equivalent to approximately 3.36 mg of the drug was extracted twice with 20-ml portions of ethanol - 0.1 M HCl (7 + 3) with magnetic stirring, and the extracts obtained were filtered, combined in a 50-ml calibrated flask and diluted to volume with the extracting solvent. For HPLC analyses, a 5-ml aliquot of the resulting solution was transferred into a 10-ml calibrated flask, 1.0 ml of internal standard solution was added and the volume was adjusted with methanol. A 10-µl volume was injected into the chromatograph in triplicate. The sample solutions were chromatographed concurrently with the appropriate standard solution (32 µg ml-1) and the peak-height ratios (drug to internal standard) were used for the determination of stanozolol in each sample.

For the derivative spectrophotometric method, the solutions from the drug extraction were determined directly as described under Calibration graphs and the concentration of stanozolol in each sample was determined by interpolation of the appropriate calibration graph. The same solutions were also analysed by a conventional spectrophotometric procedure; the absorbance values were measured at 230 nm and the concentration of stanozolol was calculated by direct comparison with standard solutions.

Results and Discussion
Chromatography
A reversed-phase HPLC method was developed to provide a specific procedure suitable for the rapid quality control of stanozolol formulations and as a reference method for the derivative UV spectrophotometric procedures. A mobile phase consisting of methanol - 0.05 M aqueous ammonium dihydrogen phosphate (85 + 15) was chosen after several trials with acetonitrile - water and methanol - water.
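The sample-preparation arithmetic above can be checked directly: ~3.36 mg dissolved in 50 ml and a 5-ml aliquot diluted to 10 ml gives the 33.6 µg ml-1 working concentration that appears in Fig. 1, bracketed by the 32 µg ml-1 standard. A sketch (function name hypothetical):

```python
def diluted_concentration(mass_mg, volume_ml, aliquot_ml, final_ml):
    """Concentration (ug ml-1) after dissolving mass_mg in volume_ml
    and diluting an aliquot_ml portion to final_ml."""
    stock = mass_mg * 1000.0 / volume_ml   # 3.36 mg / 50 ml = 67.2 ug ml-1
    return stock * aliquot_ml / final_ml

# Tablet extract: ~3.36 mg stanozolol in 50 ml, then 5 ml diluted to 10 ml.
conc = diluted_concentration(3.36, 50.0, 5.0, 10.0)
assert abs(conc - 33.6) < 1e-9  # ug ml-1, mid-range of the 16-40 ug ml-1 graph
```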
The described chromatographic system allows adequate resolution (Rs = 2.2) between stanozolol (tR = 5.3 min; k' = 1.52) and the internal standard, 4-acetylbiphenyl (tR = 4.2 min; k' = 1.0), in a reasonable time (Fig. 1) (Rs = resolution; tR = retention time; k' = capacity factor). For quantitative determinations a linear calibration graph (y = 0.2572x - 0.448; r = 0.9996; n = 5, where y and x are response ratios and mass ratios, respectively) was obtained over the working concentration range 16-40 µg ml-1. The relative standard deviation (0.65%) of the peak-height ratio of stanozolol to internal standard, derived from replicate (n = 8) analyses of a single stanozolol solution, illustrates the precision of the chromatographic procedure. The specificity of the chromatographic system was ascertained by a separate chromatographic analysis of the extracts of excipient mixtures without stanozolol; no interfering peaks at the retention times of the drug and internal standard peaks were observed.

Fig. 1. Typical chromatograms of (1) 4-acetylbiphenyl, the internal standard (11.5 µg ml-1) and (2) stanozolol (33.6 µg ml-1). Column, Hypersil C18; mobile phase, methanol - 0.05 M ammonium dihydrogen phosphate (85 + 15) at a flow-rate of 1.0 ml min-1; and UV detection at 230 nm, 0.05 a.u.f.s.

Derivative Ultraviolet Spectrophotometry
A mixture of ethanol - 0.1 M hydrochloric acid (7 + 3) was used for extracting stanozolol from powdered tablet samples. Fig. 2 shows the absorption (zero-order) UV spectra of (a) a stanozolol standard solution, (b) an extract of commercial stanozolol tablets and (c) an extract of an excipient mixture lacking stanozolol. As can be seen, the excipients co-extracted with stanozolol and would, therefore, significantly interfere in a conventional spectrophotometric determination, as indicated by the upward displacement of the drug spectral band.
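The chromatographic figures of merit quoted for the HPLC system can be reproduced from the retention data. The sketch below assumes a column dead time of 2.1 min, back-calculated from the reported k' values rather than stated in the paper, and inverts the printed calibration line to convert a response ratio into a mass ratio:

```python
def capacity_factor(t_r, t_0):
    """k' = (tR - t0) / t0."""
    return (t_r - t_0) / t_0

T0 = 2.1  # min; assumed dead time, consistent with the reported k' values

assert round(capacity_factor(5.3, T0), 2) == 1.52  # stanozolol, tR = 5.3 min
assert round(capacity_factor(4.2, T0), 2) == 1.0   # 4-acetylbiphenyl, tR = 4.2 min

def mass_ratio_from_response(y):
    """Invert the reported calibration line y = 0.2572x - 0.448
    (coefficients as printed in the paper)."""
    return (y + 0.448) / 0.2572
```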
However, the application of the derivative spectrophotometric technique allowed complete elimination of the background absorption due to the excipients. In fact, the amplitude of the peak at λ = 239 nm to the zero line, 1D239, in the first-derivative spectrum (Fig. 3) and the peak-to-peak amplitude 2D247,231 in the second-derivative spectrum (Fig. 3) of the drug appear not to be affected by the co-extracted excipients. Therefore, the measurement of both the amplitudes 1D239 and 2D247,231 was the basis for the development of a specific and simple procedure for the analysis of stanozolol tablets. Linear calibration graphs, 1D239 = 0.6500c - 0.500 (r = 0.9999; n = 5) and 2D247,231 = 0.8543c - 0.643 (r = 0.9998; n = 5), were obtained between the measured amplitudes and the drug concentration (c = 40-100 µg ml-1) in ethanol - 0.1 M hydrochloric acid (7 + 3), the extracting solvent system chosen. The relative standard deviations of the amplitudes, derived from replicate (n = 8) recordings of the derivative spectra, all fell in the range 0.65-1.00%.

Fig. 2. Zero-order spectra of (A) stanozolol (80 µg ml-1), (B) stanozolol tablet extract and (C) excipient extract. Solvent, ethanol - 0.1 M hydrochloric acid (7 + 3). Solutions were equimolar in stanozolol and excipients. Excipient composition was that of product B, Table 1

Fig. 3. First- and second-derivative spectra of stanozolol (solid line) and excipient extract (broken line). Solvent and concentration as in Fig. 2

Table 1. Assay results for the determination of stanozolol in commercial tablets by HPLC, derivative UV spectrophotometry and conventional absorption spectrophotometry. Results are the average of five determinations and are expressed as a percentage of the claimed content. RSD = relative standard deviation

              HPLC             First-derivative   Second-derivative  Absorption (λ = 230 nm)
Product*      Found   RSD, %   Found   RSD, %     Found   RSD, %     Found   RSD, %
A . . . .     101.38  1.47     101.60  0.86       100.24  1.50       108.40  0.92
B . . . .     103.60  1.60     104.60  1.23       104.20  1.54       112.10  1.26

* Product A contains stanozolol (2 mg per tablet), calcium hydrogen phosphate, lactose, magnesium stearate and starch coloured with erythrosine. Product B contains stanozolol (3 mg per tablet), cellulose, lactose, magnesium stearate and talc.

Table 2. Recovery values obtained for the determination of stanozolol in synthetic mixtures. Results are the average of five determinations and are expressed as a percentage of stanozolol added

                        HPLC             First-derivative   Second-derivative  Absorption (λ = 230 nm)
Synthetic preparation*  Found   RSD, %   Found   RSD, %     Found   RSD, %     Found   RSD, %
A . . . .               100.07  0.88     99.35   1.23       98.70   1.58       107.17  1.62
B . . . .               99.81   0.96     100.48  0.86       99.20   1.17       106.72  1.03

* The composition was identical with that of the corresponding commercial formulation (Table 1).

Fig. 4. Effect of the excipient to stanozolol ratio on the drug determination. Methods: absorption (zero-order); first-derivative and second-derivative spectra; and HPLC. The ratio values on the abscissa are relative to those of the commercial formulations A and B examined. (a) Formulation A; (b) formulation B

Analysis of Stanozolol Tablets
Commercially available stanozolol tablets were analysed by the described HPLC and derivative UV spectrophotometric methods. The results obtained are summarised in Table 1. As can be seen, the data for product A were in agreement with the labelled amounts, whereas for product B a significantly higher drug level was found. No significant differences were found between the results obtained by HPLC and those obtained by first- and second-derivative spectrophotometry for the same batch at the 95% confidence level (Student's t-test; tabulated value of t = 2.31). For comparison, a conventional absorption method (λ = 230 nm) was also applied and the results, as expected, were unacceptably high owing to the contribution of the excipient absorption.

Accuracy
In order to demonstrate the validity and applicability of the proposed methods, recovery studies were performed by analysing synthetic mixtures which reproduced the composition of the commercial formulations. The results obtained (Table 2) showed both the liquid chromatographic and the derivative spectrophotometric methods to be satisfactorily accurate and precise. Additional experiments were carried out to verify the effect of a variation in the excipient to stanozolol ratio on the accuracy of the proposed methods. Fig. 4 shows the recovery values obtained by the various analytical procedures (HPLC, conventional and derivative UV spectrophotometry) for the analysis of two different synthetic formulations (A and B) of stanozolol within a relatively wide span of excipient to stanozolol ratios (0.5-3.0 times that of the corresponding commercial formulation). In both instances, the conventional UV determination was affected by a significant error (ca. +5%) even at a low excipient to drug ratio and the recovery values were found to increase with excipient concentration. Conversely, the first- and second-derivative procedures were able to compensate for the absorption of the inactive ingredients, giving results comparable to those obtained by the specific HPLC method.

The specificity of the proposed procedures for the determination of stanozolol in the presence of its 17-dehydro analogue, the only reported degradation product of stanozolol,3 has not been investigated in this study because the compound was not available. It is likely, however, that the HPLC procedure may offer advantages over spectrophotometric methods.
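The batch comparison above uses a two-sample Student's t-test; the tabulated critical value of 2.31 corresponds to 8 degrees of freedom (two groups of five determinations) at the two-sided 95% level. A pooled-variance sketch with illustrative replicate data (the paper does not report the raw replicates):

```python
import math

def pooled_t(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    sp2 = ss / (na + nb - 2)                      # pooled variance
    return (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))

T_CRIT = 2.31  # two-sided 95%, 8 degrees of freedom (n = 5 + 5)

# Illustrative sets of five determinations (%, not the paper's raw data).
hplc = [100.1, 102.5, 101.0, 101.9, 101.4]
deriv = [100.8, 102.4, 101.6, 101.0, 102.2]
no_difference = abs(pooled_t(hplc, deriv)) < T_CRIT
assert no_difference  # methods agree at the 95% level for these data
```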
Conclusion
A conventional spectrophotometric determination (λ = 230 nm) was found to be inapplicable to the direct determination of stanozolol in commercial tablets owing to interfering excipient absorption. The conversion of the zero-order UV spectrum into higher order (first- and second-derivative) spectra resulted in complete elimination of the non-specific matrix interferences and considerably improved the accuracy of the method. Hence, the proposed derivative spectrophotometric procedures were found to be suitable, in terms of accuracy and precision, for the reliable and rapid quality control of commercial stanozolol tablets and are a feasible and convenient alternative to the rather laborious official (USP) difference spectrophotometric methods. The reversed-phase HPLC method, developed as a useful reference method, also represents a sensitive and flexible analytical tool which could be used for a specific stanozolol assay and for tablet dissolution tests.

The authors thank Dr. M. Paciotti, Zambon Farmaceutici, Italy, for the samples supplied. This work was supported (40%) by the Ministero della Pubblica Istruzione, Italy.

References
1. Reynolds, J. E., and Prasad, A. B., Editors, "Martindale. The Extra Pharmacopoeia," Twenty-eighth Edition, Pharmaceutical Press, London, 1982, p. 1433.
2. "United States Pharmacopeia, XXI Revision, National Formulary, XVIth Edition," United States Pharmacopeial Convention, Rockville, MD, 1985, p. 982.
3. Magin, D. F., J. Chromatogr., 1975, 115, 687.
4. Shingbal, D. M., and Barad, U. G., J. Assoc. Off. Anal. Chem., 1985, 68, 98.
5. Ward, R. I., Lawson, A. M., and Shackleton, C. H., in Frigerio, A., and Ghisalberti, E. L., Editors, "Proceedings of the International Symposium on Mass Spectrometry in Drug Metabolism 1976," Plenum Press, New York, 1977, p. 465; Chem. Abstr., 1978, 88, 34061.
6. Bertrand, R., Masse, R., and Dugal, R., Farm. Tijdschr. Belg., 1978, 55, 85; Chem. Abstr., 1979, 90, 115928j.
7. Kautsky, M. P., Editor, "Steroid Analysis by HPLC: Recent Applications," Chromatographic Science Series, Volume 16, Marcel Dekker, New York, 1981.
8. Gorog, S., "Quantitative Analysis of Steroids," Studies in Analytical Chemistry, Volume 5, Elsevier, Amsterdam, 1983, p. 99.
9. Traveset, J., Such, V., Gonzalo, R., and Gelpi, E., J. Pharm. Sci., 1980, 69, 629.
10. Fell, A. F., Anal. Chem. Symp. Ser., 1982, 10, 495.
11. Fasanmade, A. A., and Fell, A. F., Analyst, 1985, 110, 1117.

Paper A71225
Received June 4th, 1987
Accepted July 15th, 1987
ISSN:0003-2654
DOI:10.1039/AN9871201671
Publisher: RSC
Year: 1987
Data source: RSC
|
9. |
Determination of five components in a pharmaceutical formulation using near infrared reflectance spectrophotometry |
|
Analyst,
Volume 112,
Issue 12,
1987,
Page 1675-1679
Pierre Dubois,
Preview
|
PDF (555KB)
|
|
Abstract:
ANALYST, DECEMBER 1987, VOL. 112 1675

Determination of Five Components in a Pharmaceutical Formulation Using Near Infrared Reflectance Spectrophotometry

Pierre Dubois, Jean-Rene Martinez and Pierre Levillain
Departement de Chimie Analytique, Faculte des Sciences Pharmaceutiques, 2 bis Boulevard Tonnelle, 37042 Tours Cedex, France

Near infrared reflectance spectrophotometry was applied to the determination of five components in a liquid pharmaceutical formulation. No dilution, separation, extraction or chemical reaction was necessary. The instrument was calibrated with a limited number of samples (25). The accuracy and reproducibility of the measurements were excellent for phenazone (0.2%), glycerol (1%) and ethanol (3%), limited for lidocaine hydrochloride (5.7%) and poor for sodium thiosulphate (>10%).

Keywords: Near infrared reflectance spectrophotometry; pharmaceuticals; multi-component analysis

The quality control of pharmaceuticals, either during production or in the final dosage form, is costly for the manufacturers. Methods currently used are time consuming and complex and often require numerous steps (dissolution, extraction, separation, assay of individual components, etc.), each increasing the risk of errors. The recent improvement in quality control procedures may be considered a result of progress in spectrometric techniques such as near infrared,1-5 Raman,6,7 Fourier transform infrared8-10 and photoacoustic11 spectroscopy.

The near infrared reflectance analysis (NIRA) technique developed by Norris1 and recently discussed by Wetzel4 has been widely used in fields such as food and agricultural, polymer and detergent analysis. In this technique the sample can be directly introduced into the instrument and its composition obtained in less than 1 min without any preliminary treatment.
In spite of this advantage, its application to pharmaceutical quality control is only just being developed.12 The requirements for pharmaceutical quality control are more severe than in other fields; drug control needs excellent accuracy, specificity and precision. Further, as active components are often present in low amounts in pharmaceutical formulations, the methods must be very sensitive. Absorption in the near infrared region is generally weak, which is an advantage for major components (it is not necessary to dilute samples) but concentrations of minor components are often near to the detection limit.

Near infrared analysis frequently requires a large number of samples for calibration (100 or more).4,13 There are two main reasons for this. Firstly, for biological compounds calibration samples cannot be prepared by weighing pure substances as the results are inaccurate, and secondly, NIRA allows the quantitation of solid samples, which undergo considerable interference by specular reflection depending on physical parameters (e.g., particle size). A large number of samples are therefore necessary to take account of this imprecision.

For this work we selected Otipax ear-drops, a liquid pharmaceutical formulation containing two active components (phenazone, 4 g; lidocaine, 1 g), two solvents (ethanol, 24 g; glycerol, 70.9 g) and one antioxidant (sodium thiosulphate, 0.1 g), to investigate the application of near infrared analysis to pharmaceutical quality control. The concentrations of the five components vary within a broad range (70-0.1%). The use of a synthetic liquid allowed us to test whether it was possible to limit the number of samples required for calibration. Standards can be prepared by weighing and a liquid medium eliminates the specular interferences encountered with solids. For this purpose we prepared only 25 samples to calibrate the instrument.
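One way to build such a calibration set is to vary each component independently and at random about its nominal mass (the Experimental section uses a spread of ±25%). A sketch with the nominal masses taken from the formulation above; the random-design code itself is an assumption, not the authors' procedure:

```python
import random

NOMINAL_G = {            # masses per batch, from the formulation above
    "phenazone": 4.0,
    "lidocaine": 1.0,
    "ethanol": 24.0,
    "glycerol": 70.9,
    "sodium thiosulphate": 0.1,
}

def calibration_design(n_samples=25, spread=0.25, seed=0):
    """Vary every component independently by up to +/-spread (hypothetical design)."""
    rng = random.Random(seed)
    return [{name: nominal * (1 + rng.uniform(-spread, spread))
             for name, nominal in NOMINAL_G.items()}
            for _ in range(n_samples)]

samples = calibration_design()
assert len(samples) == 25
assert all(0.75 * 4.0 <= s["phenazone"] <= 1.25 * 4.0 for s in samples)
```

Varying the components independently is what prevents the matrix composition from being correlated with any single analyte during calibration.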
Experimental
Starting Materials
The various products used (glycerol, 95% ethanol, phenazone, lidocaine hydrochloride monohydrate and sodium thiosulphate pentahydrate) complied with the standards of the French or European Pharmacopoeia.

Samples Examined
Using the starting materials, the following were prepared: a mixture of glycerol - 95% ethanol (65 + 35, V/V); solutions of phenazone, lidocaine and sodium thiosulphate at various concentrations (up to 30 mg per 100 ml) in the above mixture; and 25 samples in which the concentrations of the five components varied from the theoretical value by ±25%. In order to avoid the determination of each component being affected by the matrix, we randomly varied the percentage of the five components in the 25 samples. These solutions were used for calibration, each standard being determined twice. The calibration therefore involved 50 points. We also examined ten samples prepared as for the calibration and six commercial samples supplied by the manufacturer (Biocodex, France).

Procedure
All measurements were performed on an InfraAlyser 500C (Technicon, Domont, France). To obtain the spectra of the solid starting materials, the samples were spread in a cup so that the surface was smooth, thereby eliminating specular reflection. Liquid samples were injected into a thermostated cell (20 °C) with automatic draining. The bottom of the cell contained a ceramic diffusion element, allowing work in the transmission - reflectance mode.

Multiple linear regressions were calculated with a Hewlett-Packard HP 1000 computer using statistical software designed by Technicon for quantitative applications. A good calibration14 will have a multiple correlation coefficient (R) near to one and a standard deviation of the range (SD range) greater than or equal to ten times the standard error of estimation (SEE). Good accuracy14 will have a standard error of prediction (SEP) lower than or equal to the SEE.
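The quality criteria just stated (R near one, SD range at least ten times the SEE for the calibration, and SEP no greater than the SEE for accuracy) can be encoded as simple checks. A sketch; the R threshold of 0.999 is an illustrative stand-in for "near to one", and the figures in the assertions are those reported later for glycerol and sodium thiosulphate:

```python
def calibration_ok(r, sd_range, see, r_min=0.999):
    """Good calibration: R near one and SD range >= 10 x SEE
    (r_min is an illustrative threshold, not from the paper)."""
    return r >= r_min and sd_range >= 10.0 * see

def accuracy_ok(sep, see):
    """Good accuracy: standard error of prediction <= SEE."""
    return sep <= see

assert calibration_ok(0.9999, 3.8821, 0.0567)       # glycerol: passes
assert not calibration_ok(0.9625, 0.1153, 0.0323)   # sodium thiosulphate: fails
```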
The statistical values were calculated by the computer. The diffuse reflectance data, Rλ, collected between 1100 and 2500 nm with a 4-nm scanning increment, were transformed into the absorbance mode, log(1/Rλ), to establish linear relationships between the reflectance data and the concentration of each constituent in the formulation according to the relationship

C% = F0 + F1 log(1/Rλ1) + F2 log(1/Rλ2) + ... + Fn log(1/Rλn)

where C% = concentration of the component C in the mixture (m/m); F0 = the constant calibration value dependent on the apparatus; and F1, F2, ... Fn = specific calibration coefficients for the component C or for the matrix at the wavelengths λ1, λ2, ... λn. Positive values correspond to the characteristic wavelengths of the component and negative values enable matrix effects to be corrected.

Results and Discussion
Selection of Working Wavelengths
Spectra of pure products15
The most intense bands of the near infrared spectra were harmonics or combination bands of IR bands situated around 3300 nm (3000 cm-1) (vibrations of CH, OH, NH). The harmonics of other vibrations were of an order too high to present a sufficient intensity and were thus difficult to discern.

Phenazone and lidocaine (Fig. 1) gave spectra (Fig. 2) characteristic of aromatic compounds with two weak bands around 1200 nm (second harmonic) and 1400 nm (first harmonic) and a broad band beyond 2100 nm corresponding to combination bands of basic CH vibrations towards 3300 nm. Additional bands characteristic of water of hydration (1900 and 1400 nm) and NH bands of an amide group (combination band around 2000 nm and first harmonic towards 1500 nm) could be found in the spectrum of lidocaine.

Ethanol and glycerol (Fig. 3) showed bands characteristic of the OH group, i.e., combination bands around 2300 and 2100 nm (the first more intense and the second weaker for ethanol than for glycerol), a first harmonic around 1400 nm and a second harmonic towards 1100 nm. As in the mid-range IR, the bands of glycerol were broader than those of ethanol owing to intramolecular association phenomena resulting from hydrogen bonding.

Bands (Fig. 4) were clearly seen at 1450 nm (first harmonic) and 1900 nm (combination band of the water OH vibration) in the spectrum of sodium thiosulphate pentahydrate. The S-O vibration band towards 9100 nm (1100 cm-1 in conventional IR), although intense, could not be detected in the near IR as it corresponds to a minimum at the fourth harmonic, with very low intensity. In addition, reflectance measurements carried out with crystallised or ground samples of sodium thiosulphate showed a considerable difference, providing evidence for the importance of the form and size of the solid particles analysed.

Fig. 1. Formulae of phenazone and lidocaine hydrochloride monohydrate

Fig. 2. NIR spectra of (A) phenazone and (B) lidocaine hydrochloride monohydrate

Fig. 3. NIR spectra of (a) 95% ethanol and (b) glycerol

Fig. 4. NIR spectra of sodium thiosulphate pentahydrate. (A) Crystallised sample and (B) ground sample

Spectra in glycerol - ethanol
Solutions of each of the three compounds (lidocaine, phenazone and sodium thiosulphate) were examined in a mixture of glycerol - 95% ethanol (65 + 35, V/V). For each solution, spectra were obtained and the computer selected four characteristic wavelengths for the determination of each component (Table 1). The four wavelengths (Table 1) selected for phenazone and lidocaine were the maxima of the spectra of the pure products but not the absorption maxima seen in the mixture (1490, 1580 and 2090 nm), although at these maxima there was an increased absorption, more sensitive than at the wavelengths selected, resulting from the increased concentration of phenazone and lidocaine. The observed increases were due more to a de-structuring action of the hydrogen bond acceptors on the internal bonds of the glycerol - ethanol mixture than to an actual absorption of the compounds, the spectrum of which would have been modified by bonds formed with the solvent. For sodium thiosulphate, however (Table 1), the intense band around 1900-2000 nm was practically unchanged in solution, allowing the selection of characteristic wavelengths.

Mixtures used for calibration
In the mixtures containing the components at concentrations close to those in the pharmaceutical formulation, the computer selected three characteristic wavelengths for the determination of each component (Table 2). The wavelengths for ethanol and glycerol were not the maxima of the substances in the pure state. Measurements at 1888 nm were used to determine ethanol, and also to correct the matrix effect in the determination of glycerol; measurements at 2128 nm were used to determine glycerol and to correct the matrix effect in the determination of ethanol. The characteristic wavelengths of phenazone (2000 nm) and lidocaine (2228 nm) were zones of intense absorbance but were not the wavelengths of maximum absorbance. The proximity of these two wavelengths suggests the possibility of interference leading to a loss of specificity. The wavelength used for sodium thiosulphate was in a zone of very low absorption and did not appear to be favourable for the determination.

The wavelengths selected for the mixtures used for calibration were very different from those of the single-component solutions. There are two reasons for this.
Firstly, the concentrations of ethanol and glycerol varied in the mixtures whereas their proportions were constant in the single-component solutions. Because of this, the absorption of the matrix varied in the first instance but was identical for all the samples in the second. Secondly, phenazone has an intense band at 2428 nm, the characteristic wavelength of lidocaine, forcing the computer to select a different working wavelength.

Table 1. Computer selection of the four characteristic wavelengths for the determination of each component in a mixture of ethanol - glycerol (35 + 65, V/V)

                      Characteristic wavelengths   Wavelengths of
Component             of component/nm              matrix effects/nm
Phenazone . . . .     2152  2072                   1736  1576
Lidocaine . . . .     2428  2124                   1596  1184
Sodium thiosulphate   1776  1736                   2004  1448

Calibration Results
Twenty-five solutions were measured in duplicate for the calibration. The resulting calibration equations were:

glycerol: C% = 63.176 + 87.691 log(1/Rλ1) + 250.547 log(1/Rλ2) - 196.162 log(1/Rλ3)
ethanol: C% = 157.532 + 46.538 log(1/Rλ1) - 101.625 log(1/Rλ2) - 144.274 log(1/Rλ3)
phenazone: C% = 19.851 + 53.022 log(1/Rλ1) - 80.539 log(1/Rλ2) - 10.865 log(1/Rλ3)
lidocaine: C% = 28.897 + 29.713 log(1/Rλ1) - 16.470 log(1/Rλ2) - 66.673 log(1/Rλ3)
sodium thiosulphate: C% = 38.801 + 185.933 log(1/Rλ1) - 87.338 log(1/Rλ2) - 133.154 log(1/Rλ3)

Table 2 shows the three wavelengths used for calibration and the values of the multiple correlation coefficient (R), the standard deviation of the range (SD range) and the standard error of estimation (SEE) for each component of the solution. The statistical results (R = 1 and SD range >> SEE) are excellent for glycerol and ethanol, which are used as solvents and are therefore present in large concentrations in the final dosage form. These results are satisfactory for phenazone. The statistical values for lidocaine are within the confidence limit (R = 0.9962) but the experimental points show an increased dispersion around the calibration graph over phenazone (Fig. 5). Finally, the results are poor for sodium thiosulphate, as expected from its very weak NIRA absorption and low concentration in the solution.

Table 2. Computer selection of three characteristic wavelengths for calibration and statistics of the regression models. R = multiple correlation coefficient; SD range = standard deviation of the range; SEE = standard error of estimation

Component             λ1/nm   R       SD range  SEE
Glycerol . . . .      2128    0.9999  3.8821    0.0567
Ethanol . . . .       1888    0.9998  3.344     0.0524
Phenazone . . . .     2220    0.9989  0.5660    0.0271
Lidocaine . . . .     2228    0.9962  0.1822    0.0164
Sodium thiosulphate   1264    0.9625  0.1153    0.0323

Fig. 5. Calibration graph for (a) lidocaine and (b) phenazone

Control Samples
We used ten samples prepared as calibration solutions as control samples for testing the calibration equations. Table 3 shows, for each solution, the real concentration, the predicted values measured by the instrument and the standard error of prediction (SEP). The repeatability for duplicate measurements is excellent for glycerol, ethanol and phenazone, lower for lidocaine and poor for sodium thiosulphate. The accuracy of this method is excellent for glycerol, ethanol and phenazone (SEP < SEE), but not for lidocaine (SEP = 4 × SEE) and worse for sodium thiosulphate. The poor results obtained for lidocaine could be due to the fact that its concentration is around the sensitivity limit of this method.

Commercial Samples
We examined six commercial samples from the same batch, supplied and controlled by the manufacturer (Laboratoire Biocodex). The results obtained (Table 4) show that the repeatability of the determination was excellent for four of the five components (phenazone, lidocaine, ethanol and glycerol), the results with sodium thiosulphate exhibiting an important dispersion, and that the accuracy of the method was excellent for glycerol, ethanol and phenazone. It was less acceptable for lidocaine and sodium thiosulphate.

Table 4. Comparison of InfraAlyser, Biocodex and theoretical values

Component             InfraAlyser average value/g   Biocodex/g   Theoretical/g
Phenazone . . . .     3.993 ± 0.015                 3.9          4
Lidocaine . . . .     0.934 ± 0.013                 0.99         1
Ethanol, 95% . . .    23.24 ± 0.12                  22.81        24
Glycerol . . . .      71.60 ± 0.17                  70.8         70.9
Sodium thiosulphate   0.085 ± 0.029                 0.1          0.1

For lidocaine and sodium thiosulphate, the results found were much lower than those obtained with the method used by the Laboratoire Biocodex (Table 4) and were, in fact, outside the acceptable limits as given by the manufacturer. The errors associated with lidocaine (5.7%) and sodium thiosulphate (15%) may be explained by the large dispersion of the results obtained during calibration.

Conclusions
We have shown that a reduced number of samples is sufficient to obtain a high precision and good accuracy in the determination of the main components in a synthetic liquid mixture. The large number of standards generally used for calibration in NIRA is more a consequence of the biological origin and physical parameters of the standard mixtures than of the imprecision of quantitation by the method itself. The NIRA method is well adapted to the quantitation of components whose concentration is high in pharmaceutical formulations. For instance, in Otipax ear-drops the accuracy and precision are excellent for two solvents (ethanol and glycerol) and one active compound (phenazone).
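Each reported calibration equation is a three-wavelength linear model in log(1/R). The sketch below evaluates the glycerol equation, assuming base-10 logarithms (conventional for log(1/R) absorbance data) and using illustrative reflectance values; the coefficients are those reported in the text:

```python
import math

def glycerol_percent(r1, r2, r3):
    """Reported glycerol calibration (base-10 logs assumed):
    C% = 63.176 + 87.691 log(1/R1) + 250.547 log(1/R2) - 196.162 log(1/R3)."""
    return (63.176
            + 87.691 * math.log10(1.0 / r1)
            + 250.547 * math.log10(1.0 / r2)
            - 196.162 * math.log10(1.0 / r3))

# With all reflectances equal to one, every log term vanishes and the
# prediction reduces to the intercept F0.
assert abs(glycerol_percent(1.0, 1.0, 1.0) - 63.176) < 1e-12
```

The positive coefficients correspond to wavelengths characteristic of the component, while the negative term corrects a matrix effect, as the text explains.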
For ethanol, our results are better than those obtained by the reference method (some loss of this compound during distillation can result from its strong interaction with glycerol). The reproducibility is good for the second active constituent (lidocaine) but the accuracy is insufficient, as our results were outside the range of determination; its concentration (1 g%) is at the detection limit of the method. Finally, the structure and concentration of sodium thiosulphate are not compatible with this method. Therefore, although the method is fast and easy to use, and in spite of the simplified calibration step, NIRA cannot be used systematically for pharmaceutical quality control, as low concentrations of active constituents can lead to imprecise and inaccurate results.

We thank Mr. Vincent of Laboratoire Biocodex for providing the samples and Mr. Chaillot of Technicon Domont, France, for the use of the InfraAlyser 500 C.

Paper A7/129
Received February 2nd, 1987
Accepted July 16th, 1987
ISSN:0003-2654
DOI:10.1039/AN9871201675
Publisher: RSC
Year: 1987
Data source: RSC
|
10. |
Spectrophotometric determination of hydrogen cyanide in air and biological fluids |
|
Analyst,
Volume 112,
Issue 12,
1987,
Page 1681-1683
Preeti Kaur,
Preview
|
PDF (369KB)
|
|
Abstract:
ANALYST, DECEMBER 1987, VOL. 112, 1681

Spectrophotometric Determination of Hydrogen Cyanide in Air and Biological Fluids

Miss Preeti Kaur, Miss Sweta Upadhyay and V. K. Gupta*
Department of Chemistry, Ravishankar University, Raipur 492 010, India

A spectrophotometric method is proposed for the determination of hydrogen cyanide in air. Hydrogen cyanide present in air is absorbed in 0.002 M sodium hydroxide solution, which is then treated with bromine. The cyanogen bromide so formed reacts with pyridine to form glutaconic aldehyde. The latter is condensed with anthranilic acid to form a yellowish orange polymethine dye which shows maximum absorbance at 400 nm. Beer's law is obeyed in the range 0.4-4 p.p.m. Other analytical parameters have also been investigated. The method has been successfully applied to the determination of hydrogen cyanide in biological samples, e.g., cysteine and whole blood.

Keywords: Hydrogen cyanide determination; air; anthranilic acid; spectrophotometry; biological fluids

Hydrogen cyanide is highly toxic and is used widely as a fumigant in agriculture and in ship cargoes. It is also emitted into the atmosphere by the coal tar and plastics industries during the manufacture of acrylonitrile and various resins.1 Electroplating and the purification of blast-furnace gases are also important sources of hydrogen cyanide in air. The reported TLV for hydrogen cyanide in air is 0.01 mg m-3.2 Minute amounts of cyanide produce adverse toxicological effects in man, and the entry of this substance into the body by almost any route results in life-threatening conditions.3 Sensitive methods are therefore required for its determination at p.p.m. levels. Various methods have been proposed for the determination of cyanide.4-11 Hydrogen cyanide is most frequently determined in air by spectrophotometric methods based on König's reaction,12 which has reasonably high sensitivity and reproducibility.
In this paper, a method has been developed for the determination of hydrogen cyanide in air, making use of König's reaction, which consists of the conversion of cyanide into cyanogen bromide, which then reacts with pyridine to form glutaconic aldehyde. The latter readily reacts with anthranilic acid to form a coloured polymethine dye. This method was used previously by Gupta et al.11 for the determination of cyanide in water. It has now been suitably modified for the determination of hydrogen cyanide in air and biological fluids. Hydrogen cyanide in air was absorbed in a dilute solution of sodium hydroxide in an impinger attached to an air sampling train3 and later analysed by the proposed method. Various other analytical parameters, e.g., the strength, flow-rate and stability of the absorbing solution, have been studied. The method has been applied to the determination of hydrogen cyanide in biological samples (cysteine and whole blood).

Experimental

Apparatus
A Carl Zeiss Spekol spectrophotometer with matched silica cells of 1-cm path length was used for all spectral measurements. Midget impingers of 35-ml capacity were used for the absorption of hydrogen cyanide from air. A calibrated PIMCO rotameter was used for air flow measurements.

Reagents
All chemicals used were of analytical-reagent grade and solutions were prepared with de-ionised water.

* To whom correspondence should be addressed.

Standard cyanide solution. A stock solution containing 1 mg ml-1 of CN- was prepared in de-ionised water. Appropriate dilution gave a working standard of 10 µg ml-1.
Pyridine reagent.6 Concentrated HCl (3 ml) was mixed with 18 ml of freshly distilled pyridine and 12 ml of de-ionised water were added.
Sodium arsenite solution, 1.5% m/V.
Anthranilic acid solution, 0.1% m/V.
Bromine water. A saturated solution was used.
Absorbing solution, 0.002 M sodium hydroxide.
Tris - HCl buffer,3 0.2 M, pH 7.6.
Prepared by dissolving 0.60 g of tris(hydroxymethyl)aminomethane and 2.36 g of tris(hydroxymethyl)aminomethane hydrochloride in 100 ml of water.
Cysteine solution.3 Prepared by placing 50 mg of L-cysteine hydrochloride monohydrate in a test-tube, after which one drop of bromocresol green indicator solution is added. NaOH (0.5 M) is added dropwise until all of the cysteine has dissolved. Tris - HCl buffer (10 ml) is added to the neutralised cysteine and thoroughly mixed. The pH of the solution should be 7.4.

Procedure

Collection of sample
For the generation of hydrogen cyanide, a few millilitres of 5 M sulphuric acid were added to a flask containing 5 ml of standard cyanide solution and the flask was stoppered immediately. The flask was connected to two impingers, each containing 10 ml of the absorbing solution, connected in series to the source of suction. Immediately after the addition of sulphuric acid, pure air was passed through the impingers for 30 min.

Analysis
After sampling, aliquots of the absorbed solution were transferred into 10-ml calibrated flasks. To each flask was added 0.3 ml of bromine water and the mixture was allowed to stand for 1 min so that bromination was complete. The excess of bromine was decolorised by the dropwise addition of sodium arsenite solution. Then 0.4 ml of pyridine reagent followed by 1 ml of anthranilic acid solution were added. The mixture was allowed to stand until full colour development had occurred (ca. 10 min). The volume was made up to the mark with a very dilute solution of sodium hydroxide to maintain a pH of ca. 7.2, and the absorbance was measured at 400 nm using distilled water as the reference. The same procedure was followed for the blank, which gave no colour under these conditions.

Results and Discussion

Generation of Hydrogen Cyanide
Hydrogen cyanide was generated by the reaction of sulphuric acid with potassium cyanide.
The liberated hydrogen cyanide was efficiently trapped in a 0.002 M solution of sodium hydroxide by passing purified air through the solution at a rate of 0.5 l min-1. The absorbed hydrogen cyanide was then determined by the proposed procedure. It was found that the generation of hydrogen cyanide was quantitative. The standard deviation and relative standard deviation of the absorbance of 20 µg of cyanide per 10 ml of NaOH solution were ±0.008 and 2.0%, respectively, showing the generation to be reproducible.

Absorption Efficiency
Two midget impingers (capacity 35 ml), each containing 10 ml of 0.002 M NaOH solution, were connected in series. An air sample was passed through them for various lengths of time at different flow-rates. After sampling, hydrogen cyanide was determined by the proposed procedure. It was found that 99-100% of the hydrogen cyanide was absorbed by the first impinger; the second impinger gave a negative result for cyanide. Concentrations of sodium hydroxide solution from 0.001 to 0.01 M had no effect on the absorption efficiency.

Table 1. Effect of concentration of sodium hydroxide solution on absorption efficiency. Flow-rate = 0.5 l min-1

Sample No.   NaOH concentration/M   CN- passed/µg   CN- found in first impinger/µg   Absorption, %
1            0.001                  10               9.85                            98.5
                                    20              20.00                           100.0
                                    30              29.90                            99.6
                                    40              39.95                            99.9
2            0.004                  10              10.00                           100.0
                                    20              19.95                            99.7
                                    30              29.80                            99.3
                                    40              39.90                            99.7
3            0.008                  10               9.95                            99.5
                                    20              20.00                           100.0
                                    30              29.93                            99.8
                                    40              39.82                            99.5
4            0.01                   10               9.85                            98.5
                                    20              19.92                            99.6
                                    30              29.95                            99.8
                                    40              39.70                            99.2

Table 2. Effect of flow-rate on absorption efficiency. Concentration of NaOH solution, 0.002 M

Sample No.   Flow-rate/l min-1   CN- passed/µg   CN- found in first impinger/µg   Absorption, %
1            0.25                10               9.95                            99.5
                                 20              19.80                            99.0
                                 30              29.92                            99.7
                                 40              39.50                            98.7
2            0.5                 10               9.80                            98.0
                                 20              19.75                            98.7
                                 30              29.95                            99.8
                                 40              39.35                            99.8
3            1.0                 10              10.00                           100.0
                                 20              19.75                            98.7
                                 30              29.82                            99.4
                                 40              39.90                            99.7
4            2.0                 10               9.95                            99.5
                                 20              19.68                            98.4
                                 30              29.75                            99.2
                                 40              39.50                            98.7

Even a variation of the flow-rate from 0.25 to 2 l min-1, and of the temperature from 15 to 35 °C during collection of the sample, had no effect on the absorption efficiency. Over sampling times of 20-60 min, no change in absorption efficiency was noted. The results for absorption efficiency are presented in Tables 1 and 2.

Stability of the Collected Hydrogen Cyanide Samples
The hydrogen cyanide absorbed in 0.002 M sodium hydroxide solution was found to be stable for many days.

Effect of Varying Reaction Conditions
The amount of bromine water needed for bromination was checked by adding various amounts of saturated bromine water. A minimum of 0.2 ml of bromine water was needed for the complete bromination of cyanide to cyanogen bromide. Any excess of bromine was decolorised by the dropwise addition of sodium arsenite solution; no change in the absorbance was observed for 0.2-1 ml of 1.5% sodium arsenite solution. A minimum of 0.2 ml of pyridine reagent was needed for the conversion of cyanogen bromide into glutaconic aldehyde. The addition of up to 1 ml of pyridine reagent had no noticeable effect on the absorbance, but at volumes greater than 1 ml there was a decrease in absorbance. Constant absorbance values were obtained with the addition of 1-5 ml of 0.1% anthranilic acid solution.

The effects of time and temperature on the colour development were studied. It was observed that 2-3 min were sufficient for full colour development and that the colour remained stable for 15 min in the range 15-35 °C. At higher temperatures there was a decrease in the absorbance.

Beer's Law and Sandell's Sensitivity
Beer's law was found to be obeyed between 4 and 40 µg of cyanide per 10 ml. Sandell's sensitivity of the colour reaction was found to be 0.005 µg cm-2.

Table 3. Effect of foreign species. Concentration of cyanide, 4 p.p.m.

Foreign species   Tolerance limit*, p.p.m.
Benzene           2000
Phenol            1000
Benzaldehyde       800
Ethanol           1200
Aniline            500
Nitrobenzene       900
Zn2+               100
Cd2+               200
Pb2+               150
Hg2+               100
Fe2+               250
Cu2+               200
K+                 450
Na+                500

* Amount of foreign species that causes a ±2% error.

Table 4. Determination of generated hydrogen cyanide in air

Sample No.   HCN found (present method)/µg   HCN found (Aldridge method)/µg
1            18.0                            18.3
2             6.5                             6.2
3            31.3                            31.5

Table 5. Recovery of hydrogen cyanide from cysteine and whole blood samples

Sample               HCN added/µg   HCN found* (present method)/µg   Recovery, %   HCN found* (pyrazolone method)/µg   Recovery, %
Cysteine (3 ml)       8              7.9                             98.7           7.6                                95.0
                     15             14.8                             98.6          14.9                                99.3
                     25             24.5                             98.0          25.0                               100.0
                     30             29.7                             99.0          29.5                                98.3
Whole blood (3 ml)    5              4.90                            98.0           4.95                               99.0
                     15             14.80                            98.6          14.75                               98.3
                     20             19.95                            99.8          19.85                               99.2
                     35             34.90                            99.7          34.90                               99.7

* Mean of three repetitive determinations.

Effect of Foreign Species
No interference was observed from organic pollutants, e.g., benzene, phenol and aniline, or from metal ions such as zinc, cadmium, lead, mercury and iron. Oxidising and reducing agents, if present in small amounts, are removed by the sodium arsenite and bromine water, respectively, and hence do not interfere. The tolerance limits are shown in Table 3.

Application of the Method

Determination of hydrogen cyanide in air
Hydrogen cyanide was generated in a fume cupboard by the gradual addition of sulphuric acid to a solution of potassium cyanide.13 The air from the fume cupboard was trapped in sodium hydroxide solution with the help of a suction pump placed outside the chamber.
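The two figures of merit reported above, a linear Beer's law range of 0.4-4 p.p.m. and a Sandell sensitivity of 0.005 µg cm-2, are linked through the calibration slope: in a 1-cm cell, 1 µg ml-1 of analyte corresponds to 1 µg cm-2 in the light path, so Sandell's sensitivity is the concentration giving A = 0.001. The sketch below checks that consistency; the absorbance and replicate values are synthetic numbers chosen to match the reported figures, not the paper's measured data.

```python
from statistics import mean, stdev

# Synthetic calibration points: concentration (ug ml-1) vs. absorbance,
# consistent with a slope of ~0.2 absorbance units per ug ml-1.
conc = [0.4, 1.0, 2.0, 3.0, 4.0]
absb = [0.081, 0.199, 0.402, 0.598, 0.801]

# Ordinary least-squares slope through the calibration points.
cbar, abar = mean(conc), mean(absb)
slope = sum((c - cbar) * (a - abar) for c, a in zip(conc, absb)) / \
        sum((c - cbar) ** 2 for c in conc)

# Sandell's sensitivity: mass per cm^2 giving A = 0.001 in a 1-cm cell,
# i.e. the concentration (ug ml-1) at which A reaches 0.001.
sandell = 0.001 / slope  # ug cm-2

# Repeatability: relative standard deviation of replicate absorbances,
# the statistic quoted for the generation-reproducibility check.
replicates = [0.402, 0.408, 0.395, 0.410, 0.399]
rsd_percent = 100.0 * stdev(replicates) / mean(replicates)

print(f"slope = {slope:.3f} A per ug ml-1")            # prints: slope = 0.200 A per ug ml-1
print(f"Sandell sensitivity = {sandell:.4f} ug cm-2")  # prints: Sandell sensitivity = 0.0050 ug cm-2
print(f"RSD of replicates = {rsd_percent:.1f}%")       # prints: RSD of replicates = 1.5%
```

A slope near 0.2 absorbance units per µg ml-1 reproduces the reported 0.005 µg cm-2, so the quoted range and sensitivity are mutually consistent.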
The air was sampled for 30 min and the solution was then analysed by the recommended procedure; the results were compared with those of the standard benzidine method given by Aldridge.4 The results obtained by the two methods were in close agreement (Table 4).

Determination of hydrogen cyanide in cysteine and whole blood
The method has also been applied to the determination of cyanide in cysteine and whole blood samples. It has been reported that cysteine reacts with cyanide in the body and helps in the detoxification of cyanide; the determination of cyanide in cysteine is therefore important from a biological point of view.1 To 3 ml of cysteine solution were added known amounts of cyanide, and purified air was passed through the solution. The liberated hydrogen cyanide was absorbed in 0.002 M sodium hydroxide solution and determined by the proposed procedure. The results in Table 5 show a ca. 100% recovery of hydrogen cyanide from cysteine, in agreement with the results of the pyrazolone method.3 To determine the recovery of cyanide from whole blood, known amounts of cyanide were added to blood and the cyanide was determined by the proposed procedure. The results are given in Table 5.

Conclusion
The proposed method for the determination of cyanide is rapid and simple. No use is made of carcinogenic compounds, and the method can be applied to the detection of hydrogen cyanide in biological fluids. The sensitivity of the method is comparable to or better than that of other reported methods.

The authors are grateful to the Head of the Department of Chemistry, Ravishankar University, Raipur, for providing laboratory facilities. One of them (S. U.) thanks CSIR for providing a senior research fellowship.

References
1. Patty, F. A., "Industrial Hygiene and Toxicology of Industrial Inorganic Poisons," Volume 22, Interscience, New York, 1967, pp. 1997 and 1994.
2. Vol'berg, N. Sh., and Kuzmina, T. A., Russ. J. Anal. Chem., 1983, 597; Zh. Anal. Khim., 1983, 38, 797.
3. Johnson, D. J., and Williams, H. L., Anal. Lett., 1985, 18 (B7), 855.
4. Aldridge, W. N., Analyst, 1944, 69, 262.
5. Epstein, J., Anal. Chem., 1947, 19, 272.
6. Bark, L. S., and Higson, H. G., Talanta, 1964, 11, 621.
7. Dagnall, R. M., El Gnamry, M. T., and West, T. S., Talanta, 1968, 15, 167.
8. Hirio, K., Anal. Lett., 1973, 6, 761.
9. Wei, F. S., Liu, Y. Q., Yul, F., and Shen, N. K., Talanta, 1981, 28, 694.
10. Xiang, G., Fenxi Huaxue, 1984, 12, 1959; Anal. Abstr., 1985, 47, 1H50.
11. Upadhyay, S., and Gupta, V. K., Analyst, 1984, 109, 1619.
12. König, W., J. Prakt. Chem., 1907, 70, 19.
13. Jacobs, M. B., "The Analytical Toxicology of Industrial Inorganic Poisons," Volume 22, Interscience, New York, 1967, p. 723.

Paper A7/192
Received May 18th, 1987
Accepted July 27th, 1987
ISSN:0003-2654
DOI:10.1039/AN9871201681
Publisher: RSC
Year: 1987
Data source: RSC
|
|