1H NMR Analysis (SIMCA)
Excel Processing
Secondary observation ID labels can be added by inserting a blank second row and filling in the desired labels. This does NOT affect the raw data. It merely makes “viewing” easier in the SIMCA output.
Use the standard method to remove instrument noise. Click here [1] for details.
For PLS-DA analysis, the y-variable values are placed in a row below the row containing the last NMR integral bucket value. When the spreadsheet is transposed in SIMCA, this last row becomes the last column.
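The effect of the transposition on the y-variable row can be illustrated outside SIMCA. The following is a minimal sketch in plain Python; the table contents, the sample names, and the `transpose` helper are hypothetical and only mirror the spreadsheet layout described above (one column per spectrum, y-variable row below the last bucket).

```python
# Hypothetical integral table laid out as in the spreadsheet:
# one column per spectrum, one row per bucket, y-variable row placed last.
table = [
    ["bucket", "wt_1", "wt_2", "mut_1"],  # observation IDs
    ["B1", 0.12, 0.10, 0.31],             # NMR integral bucket values
    ["B2", 0.05, 0.07, 0.02],
    ["class", 0, 0, 1],                   # y-variable row (below the last bucket)
]


def transpose(rows):
    """Swap rows and columns of a rectangular table."""
    return [list(col) for col in zip(*rows)]


transposed = transpose(table)
for row in transposed:
    print(row)
```

After transposition each spectrum occupies one row, and the y-variable values sit in the last column.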
After noise removal, the data should be autoscaled. By default, SIMCA-P applies "UV" (unit variance) scaling to the imported data set.
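For reference, unit variance scaling mean-centers each variable and divides it by its standard deviation. The sketch below shows this for a single bucket; the `uv_scale` helper and the example values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption rather than a statement about SIMCA's internals.

```python
from statistics import mean, stdev


def uv_scale(column):
    """Mean-center one variable and divide by its sample standard deviation."""
    m = mean(column)
    s = stdev(column)  # sample SD (n - 1 denominator); an assumption here
    if s == 0:
        # A variable with no variance carries no information; return zeros.
        return [0.0 for _ in column]
    return [(x - m) / s for x in column]


# Hypothetical integral values of one bucket across three spectra.
bucket = [2.0, 4.0, 6.0]
scaled = uv_scale(bucket)
print(scaled)
```

After scaling, every bucket has mean 0 and standard deviation 1, so intense and weak signals contribute equally to the model.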
PCA Analysis
1. Start a “new” project by opening the desired integral table spreadsheet
2. Click on new project icon
3. Select the desired Excel file
4. Click “Open”
5. Click on the comma option. The data formatting should now look correct.
6. Click “OK”
7. Project type should be SIMCA-P Project
8. Click “next” button
9. Respond “No” to the question regarding the one row that is “empty or contains only text”
It is the row of labels that was inserted using Excel. You can delete the blank row to avoid this.
10. Use the “Commands” button (bottom left) to “transpose” the data set. Each row should now correspond to a single NMR spectrum. Each row is an “observation”. The integral values contained in each row are referred to as “variables”.
11. Click on the button at the top of the first column to set it as the “primary observation IDs”. It may already be labeled as primary.
12. Click on the “Observation IDs primary” button. The observation IDs should now be color coded with the observation ID primary color. If secondary observation IDs were inserted using Excel, then click on the second column button and then click on “Observation IDs secondary”. Again, the column should be color coded to the correct color (light yellow).
13. Click on the button for the first row (it may already be labeled as primary) and then click on the “Variable IDs primary” button to color code the first row (green). Next, repeat for the second row containing the ppm ranges that define each bucket. These will be the “Variable IDs secondary” and are turquoise. Click the “Next” button in the lower right.
14. Click the “Finish” button
15. Exclude the solvent region and “ends” of the spectra
16. Highlight the desired rows by dragging the cursor along the top set of buttons and then click the “exclude” button (along the left edge)
17. Repeat for each desired region
18. Click “Next” then click “Finish”
19. Using the menu bar, click on the “autofit” button.
This should calculate the first and second principal components. Additional components can be calculated using the “Calculate next component” button.
20. To view the results, click on the “Create four overview plots” button. This produces the “Score Scatter” plot in the upper-left corner. The lower-left corner contains the “Loading Scatter” plot.
21. Expand the score scatter plot for better viewing
22. Click on a data point
23. Right-click and choose “Properties”
24. Choose the color tab and choose coloring type by “identifiers”. The default then is to color by secondary observation IDs. This uses the labels inserted using Excel. If desired, change the default colors.
25. Click “Apply”
26. Click “OK”
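The exclusion of the solvent region and the spectral “ends” in the list above amounts to dropping every bucket whose ppm range overlaps an unwanted window. The sketch below illustrates that logic only; the bucket IDs, the ppm limits, and the `is_excluded` helper are hypothetical, and the windows must be adapted to the actual solvent and spectral width.

```python
# Hypothetical buckets: (bucket ID, ppm start, ppm end).
buckets = [
    ("B1", 10.00, 9.96),
    ("B2", 5.00, 4.96),
    ("B3", 4.72, 4.68),
    ("B4", 1.00, 0.96),
    ("B5", -0.50, -0.54),
]

# Assumed exclusion windows as (low ppm, high ppm); these values are
# illustrative and are not taken from the protocol.
excluded_regions = [
    (4.60, 4.90),   # residual solvent (water) region
    (9.50, 20.00),  # downfield end of the spectrum
    (-5.00, 0.00),  # upfield end of the spectrum
]


def is_excluded(start, end, regions):
    """Return True if the bucket's ppm range overlaps any exclusion window."""
    lo, hi = min(start, end), max(start, end)
    return any(lo < r_hi and hi > r_lo for r_lo, r_hi in regions)


kept = [b[0] for b in buckets if not is_excluded(b[1], b[2], excluded_regions)]
print(kept)
```

Only the buckets listed in `kept` would remain as variables in the model; the excluded ones stay in the raw data set untouched, as in SIMCA.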
PLS Analysis
1. The data are prepared in an Excel file as described above.
2. The Excel file is opened in SIMCA as described above for PCA analysis.
3. The opened file should then be transposed. Again, the commands button found in the lower left provides access to this command.
4. The label information for observations and variables should be processed as described for PCA analysis above.
5. The difference between the PCA approach and the PLS approach occurs with labeling of the variables (i.e., the bucketed intensities and the discriminator values).
6. The region containing the bucketed intensities is highlighted. These values are labeled as “x-variables” by clicking on the VARIABLE button in the left hand frame and choosing x-variables.
7. Next, the discriminator values should be set as 1 or 0.
8. Selected data may still be excluded prior to the statistical analysis.
9. The dataset is “fit”; additional components can be added or subtracted, and the results are visualized using the same commands as described above for PCA analysis.
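The 1/0 discriminator values can be generated from the group labels before the file is imported. The sketch below is one way to do this in plain Python; the labels, the `discriminator` helper, and the choice of 0 for the control group and 1 for the other group are assumptions made for illustration.

```python
def discriminator(labels, control_label):
    """Map group labels to 0 (control) or 1 (any other group)."""
    return [0 if label == control_label else 1 for label in labels]


# Hypothetical group labels, one per spectrum (observation).
secondary_ids = ["wt", "wt", "mut", "mut"]
y = discriminator(secondary_ids, control_label="wt")
print(y)
```

The resulting list can be pasted as the y-variable row of the spreadsheet, in the same order as the spectra.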
OPLS-DA Analysis
1. This approach applies orthogonal signal correction prior to PLS analysis. This manual is for SIMCA-P+ 12.0 or higher.
2. Start from an Excel file that has NOT been autoscaled. Any outliers should be removed prior to OPLS analysis.
3. The data file should first be analyzed using either PCA or PLS as described above to find two separated groups for comparison. Wild-type or control groups should be assigned as "0"; mutant or treatment groups should be assigned as "1". This column of "0" and "1" should be taken as the Y values.
4. Follow the same procedure as PCA to import the data.
5. Click on Y to change the state of these to Y values. Click on the “Next >” button. There may be a message regarding exclusion of variables with no variance. A new Workset window will appear.
6. Click "Yes to All"; another workset window will then pop up with "model number", "type of model", etc.
7. Double-click it and click the "Workset" icon on that pop-up window.
8. Select all the variables and use "Par" for scaling. Set the model type to "OPLS/O2PLS".
9. Click the Autofit button on the toolbar.
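The "Par" option refers to Pareto scaling, in which each variable is mean-centered and divided by the square root of its standard deviation. The sketch below shows the calculation for one bucket; the `pareto_scale` helper and the example values are hypothetical, and the use of the sample standard deviation is an assumption rather than a description of SIMCA's implementation.

```python
from math import sqrt
from statistics import mean, stdev


def pareto_scale(column):
    """Mean-center one variable and divide by the square root of its sample SD."""
    m = mean(column)
    s = stdev(column)  # sample SD (n - 1 denominator); an assumption here
    if s == 0:
        # A variable with no variance cannot be scaled; return zeros.
        return [0.0 for _ in column]
    return [(x - m) / sqrt(s) for x in column]


# Hypothetical integral values of one bucket across three spectra.
bucket = [1.0, 5.0, 9.0]
scaled = pareto_scale(bucket)
print(scaled)
```

Compared with unit variance scaling, Pareto scaling reduces the dominance of intense signals without inflating low-intensity buckets to the same extent.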