The results from Principal Component Analysis are interesting and enlightening, but one concern should be addressed. For the PC analyses, the data were used in the form of absolute element concentrations in the samples. Some researchers (e.g. Aspinall and Feather, 1972; Murray, 1994) make the point that absolute concentrations may not be as informative for provenance studies as the ratios of element concentrations within the samples. As was shown above, the geochemical trend associated with the PC1 scores appears to be a function of an overall increase in trace element content from northwest to southeast. This overall increase in concentrations might mask other, more subtle geochemical trends of changing element ratios.
The use of element ratios in the statistical analyses might be approached in two different ways. One is to create a data set of individual element ratio pairings (e.g. Ce/La, Mg/Fe, etc.), as was done for a few of the Rare Earth Elements in Discriminant Analysis (discussed below). The drawback to this approach is that a second data set, composed of element ratio pairs, must be created from the first. Pairing each variable with every other one makes the data set swell immensely. With fifteen element variables, a complete data set of all the element ratios, counting one ratio per pair of elements, would have 15 × 14 / 2 = 105 variables.
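The count of 105 ratio variables can be checked with a short Python sketch. The element labels here are placeholders: the text names only Ce, La, Mg and Fe as examples, so the remaining eleven labels (E5 to E15) are hypothetical stand-ins, and `ratio_names` is an illustrative helper rather than anything used in the original analysis.

```python
from itertools import combinations

# Hypothetical labels: only Ce, La, Mg and Fe appear in the text; the other
# eleven names are placeholders so that the list has fifteen entries.
elements = ["Ce", "La", "Mg", "Fe"] + [f"E{i}" for i in range(5, 16)]

def ratio_names(names):
    """Return one 'A/B' label per unordered pair of element names."""
    return [f"{a}/{b}" for a, b in combinations(names, 2)]

pairs = ratio_names(elements)
print(len(elements), "elements give", len(pairs), "ratio variables")
```

Each unordered pair is counted once, since B/A carries the same information as A/B; counting both orderings would double the total.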
To avoid this growth in the number of variables, another way to analyze the element ratios in a data set is to use element concentrations calculated as a percentage of the total trace elements for each sample, rather than the absolute concentrations. One way to do this might be to convert the data set to percentages and enter these new data into Principal Component Analysis. However, there is a different statistical tool which is specifically designed to work with data as percentages, and which automatically calculates the percentages when a "regular" data set is used as input. This statistical method is known as Correspondence Analysis.
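The conversion to percentages described above can be illustrated with a minimal Python sketch. The concentrations are invented for illustration and are not values from the Prairie du Chien data set; `to_percentages` is a hypothetical helper name.

```python
def to_percentages(sample):
    """Express each element as a percentage of the sample's trace-element total."""
    total = sum(sample.values())
    return {element: 100.0 * value / total for element, value in sample.items()}

# Invented concentrations (arbitrary units), for illustration only.
sample_a = {"Ce": 10.0, "La": 5.0, "Mg": 25.0, "Fe": 10.0}
# A second sample with three times the overall concentration but the same ratios.
sample_b = {element: 3.0 * value for element, value in sample_a.items()}

profile_a = to_percentages(sample_a)
profile_b = to_percentages(sample_b)

for element in sample_a:
    print(element, round(profile_a[element], 2), round(profile_b[element], 2))
```

The two samples differ threefold in absolute concentration yet have the same percentage profile, which is the sense in which percentages remove an overall increase in trace element content and leave only the relative proportions.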
Correspondence Analysis works very similarly to Principal Component Analysis, calculating eigenvectors from the data matrix which can then be used to produce a two-dimensional view of the multi-dimensional data set. In Principal Component Analysis the axes defined by the eigenvectors with the two largest eigenvalues are referred to as PC1 and PC2; in Correspondence Analysis they are DIM1 and DIM2. A Correspondence Analysis of the Prairie du Chien data set was made using the SAS procedure CORRESP. To simplify the output, the samples were divided into six geographical groups (see Figure 8) rather than plotting each sample location individually. The output, as shown in Figure 9, appears to reveal little more information than did Principal Component Analysis. The six groups of sample locations overlap a great deal. In fact, there is nearly complete overlap between five of the groups, with only Group 6 relatively distinct from the rest. This probably reflects the fact that Group 6 is geographically well separated from the main body of sample locations (Figure 8).
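For readers who want to see the calculation behind DIM1 and DIM2, the following Python sketch shows one common textbook formulation of Correspondence Analysis, based on a singular value decomposition of the standardized residuals. It is not the SAS CORRESP procedure used for this study and is not guaranteed to reproduce its output or scaling conventions. It assumes NumPy is available; the small data matrix is invented, and `correspondence_analysis` is a hypothetical function name.

```python
import numpy as np

def correspondence_analysis(table, n_dims=2):
    """Return row coordinates, column coordinates and principal inertias."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                 # proportions of the grand total
    r = P.sum(axis=1)               # row masses (one per sample)
    c = P.sum(axis=0)               # column masses (one per element)
    expected = np.outer(r, c)       # proportions expected under independence
    S = (P - expected) / np.sqrt(expected)   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    rows = (U * sv) / np.sqrt(r)[:, None]    # principal row coordinates
    cols = (Vt.T * sv) / np.sqrt(c)[:, None] # principal column coordinates
    return rows[:, :n_dims], cols[:, :n_dims], sv[:n_dims] ** 2

# Invented data: four samples (rows) by three elements (columns).
# The second sample is the first one scaled by three.
data = [[10, 5, 25],
        [30, 15, 75],
        [12, 20, 8],
        [5, 30, 5]]

rows, cols, inertia = correspondence_analysis(data)
print("DIM1, DIM2 row coordinates:")
print(np.round(rows, 4))
print("principal inertias:", np.round(inertia, 4))
```

In this sketch the first two samples plot at the same point, because Correspondence Analysis compares row profiles (proportions) rather than absolute amounts. The first two columns of `rows` play the role of the DIM1 and DIM2 coordinates.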
Figure 8: Sample location groups for Correspondence Analysis and Discriminant Analysis.
Figure 9: Plot of DIM1 versus DIM2, from Correspondence Analysis. Ellipsoidal shapes show the distribution of sample values from each of the six sample location groups. Outliers are marked by a number corresponding to the group to which they belong.