I use spectroscopy to analyse chemical compounds generated from cooking oils. My data set comprises 6 oils, 5 material treatments and 3 timepoints. There appear to be several possible analyses here:

(a) Relating category membership (X matrix) to either the spectra themselves or spectra-derived compound concentrations (Y matrix), i.e. an inverse regression. I assume this kind of category description is not appropriate for PLS-DA, where membership constitutes the Y matrix instead. What contrast coefficients should I use for matrices combining time and material, or time, material and oil? Do I in fact exclude one column from each set of category-membership dummy variables, as in more typical regression analyses? Is the use of Q2 and R2 and 70/30 training/validation sets appropriate here?

(b) Spectra (X) versus % fatty acid distribution or generated compound concentrations (Y; both are easily obtained from the spectra for the calibration set). Here I could compare the PLS1 results obtained for each response individually with those derived simultaneously (PLS2) from all of them.

(c) A PLS-DA of spectral data (X) versus class membership (Y) based on, e.g., acceptable ingestible compound levels, i.e. toxic (“1”) or non-toxic (“0”). Again, do I exclude one of these category membership columns?
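
To make the dummy-coding question in (a) concrete, here is a minimal sketch in Python with NumPy. It assumes a balanced design with one sample per oil/treatment/timepoint combination and integer-coded levels; only the counts 6, 5 and 3 come from my data, and the helper names (`one_hot`, `build_x`, `centered_rank`) are made up for illustration. It builds the X matrix both with every dummy column and with one reference column dropped per factor, then compares the rank after mean-centring, since PLS is usually run on centred data.

```python
import itertools
import numpy as np

# Only these counts come from the question; the integer level codes are hypothetical.
N_OILS, N_TREATMENTS, N_TIMES = 6, 5, 3
FACTOR_SIZES = (N_OILS, N_TREATMENTS, N_TIMES)

# Assumption: one row per oil x treatment x timepoint combination (no replicates).
design = np.array(list(itertools.product(*(range(n) for n in FACTOR_SIZES))))


def one_hot(levels, n_levels, drop_first=False):
    """Return a 0/1 indicator matrix for integer-coded factor levels."""
    indicators = np.eye(n_levels)[levels]
    # Dropping the first column makes level 0 the reference category.
    return indicators[:, 1:] if drop_first else indicators


def build_x(drop_first):
    """Stack the dummy blocks for oil, treatment and timepoint side by side."""
    blocks = [
        one_hot(design[:, j], n, drop_first) for j, n in enumerate(FACTOR_SIZES)
    ]
    return np.hstack(blocks)


def centered_rank(x):
    """Rank of the column-centred matrix, i.e. what a centred PLS fit sees."""
    return np.linalg.matrix_rank(x - x.mean(axis=0))


x_full = build_x(drop_first=False)  # all 6 + 5 + 3 dummy columns
x_ref = build_x(drop_first=True)    # one reference column dropped per factor

print("full coding:     ", x_full.shape, "centred rank", centered_rank(x_full))
print("reference coding:", x_ref.shape, "centred rank", centered_rank(x_ref))
```

Under these assumptions both codings carry the same information once centred: the full coding has 14 columns and the reference coding 11, but the centred rank is 11 in both cases. So the extra columns are redundant rather than informative, which is why they have to go in ordinary least squares; whether they also need to go in PLS, which is designed to cope with collinear X columns, is exactly what I am unsure about.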