5  Pre-processing HRMS data

library(readxl)
library(knitr)



5.1 Raw data conversion

5.1.1 MSConvert

What Code Comment
Website https://proteowizard.sourceforge.io/tools/msconvert.html NA
Docker: Pull from docker repo docker pull chambm/pwiz-skyline-i-agree-to-the-vendor-licenses NA
Docker: Help docker run -it –rm chambm/pwiz-skyline-i-agree-to-the-vendor-licenses wine msconvert –help NA
Docker: convert a file docker run -it –rm -e WINEDEBUG=-all -v /your/data:/data chambm/pwiz-skyline-i-agree-to-the-vendor-licenses wine msconvert /data/file.raw docker run -it –rm -e WINEDEBUG=-all -v /home/ORUNET.ORU.SE/twg/Windows_home/Raw_Data/Guo_Feng/100.RAW:/data chambm/pwiz-skyline-i-agree-to-the-vendor-licenses wine msconvert /data/100.raw
Docker: convert all the files in a folder docker run -it –rm -e WINEDEBUG=-all -v /your/data/path/*.RAW:/data chambm/pwiz-skyline-i-agree-to-the-vendor-licenses wine msconvert /data/ docker run -it –rm -e WINEDEBUG=-all -v ~/Windows_home/Raw_Data/Kallinge/New_analysis_20200414/test/*.mzML:/data chambm/pwiz-skyline-i-agree-to-the-vendor-licenses wine msconvert /data/
use vendor centroiding for msLevels in [1,2] msconvert data.RAW –filter “peakPicking true [1,2]” NA
Docker: use vendor centroiding for msLevels in [1,2] docker run -it –rm -e WINEDEBUG=-all -v /home/ORUNET.ORU.SE/twg/Raw_data/Kallinge/New_analysis_20200414/centroid/:/data chambm/pwiz-skyline-i-agree-to-the-vendor-licenses wine msconvert /data/*.mzML –filter “peakPicking true [1,2]” –mzXML Works now but need to copy files to the tla-01/02 server to be ok. From Yu Miao: The only issue is that if you convert mzML to another mzML, you will get an error about “Output filepath is the same as input filepath”. So I added ‘–mzXML’ to give another name. The only issue is that if you convert mzML to another mzML, you will get an error about “Output filepath is the same as input filepath”. So I added ‘–mzXML’ to give another name. This command contain two parts. The first part mapping your local mzML folder to the docker image’s ‘data’ folder. The second part is convert mzML files in this folder into new files.

5.1.2 Waters MSe

https://ccms-ucsd.github.io/GNPSDocumentation/fileconversion_waters/

5.2 Peak picking


PFOS isotopic patterns


Mass traces of PFOS and labelled standards



## Alignment



## Componentization



## Proprietary and standardized MS data format
See: https://en.wikipedia.org/wiki/Mass_spectrometry_data_format



5.3 Open tools for preprocessing

5.4 OpenMS and Toppview

xx


https://www.youtube.com/watch?v=GuK1daIc6vo&list=PL2u38g_AG4MH7yCMF06N2VW7eZOJcglh7&index=5



## MSConvert xx

5.5 MsnBase and XCMS

xx

5.5.1 XCMS

Check: A scalable workflow to characterize the human exposome, SI-11, https://www.nature.com/articles/s41467-021-25840-9#MOESM14

IPO optimization

5.5.2 CAMERA

http://www.metabolomics-forum.com/index.php?topic=278.0 calcCiS: Calculate correlation inside samples That means correlation across the peak = is it really coeluting or not? It is correlation inside the sample; not inside a sample group. This means that camera goes back to the raw data and compares extracted ion chromatograms. The illutration in Carsten’s paper show this: http://pubs.acs.org/doi/abs/10.1021/ac202450g This will fail if compounds are perfectly coeluting.

calcCaS: Calculate correlation accross samples They are correlated if high intensity of feature A means high intensity of feature B. The study design or sample groups are not used for this information. Look at these plots. Each dot is a sample.

5.6 PatRoon

xx

https://www.researchsquare.com/article/rs-36675/v1
https://rickhelmus.github.io/patRoon/articles/tutorial.html

OpenMS performs peak picking and isotope grouping in one step (any idea how to turn off this?), while XCMS does not perform this step here. From Rick: For OpenMS: you can use a trick for this by setting localMZRange=0, this way OpenMS won’t be able to look for isotopes. Note that its detection was more developed for ‘natural compounds’ (eg proteomics, metabolomics) in mind, and that it isn’t really good in grouping halogenated peaks anyway.

PatRoon Docker:
In linux, run:
docker run –rm -p 8787:8787 -u 0 -e PASSWORD=yourpasswordhere -v /home/ORUNET.ORU.SE/twg/Raw_data:/home/rstudio/Raw_data patroonorg/patroonrs /init

5.7 MS-DIAL

MS2Dec:
Sigma window value: 0.1 - 1.0. Smaller value will avoid clustering of peaks that are far from each other (false positive grouping).
In GC-HRMS, try 0.8 to separate and deconvolute very closely coeluting peak in GC (+-1). These parameters should be tested with e.g. QC samples with known compounds to get best deconvolution parameter for your samples.
MS/MS abundance cut off: increase to remove background noise



5.8 Output formats


- MSnExp and XCMSnExp
- MSP
- MFG
- CSV (for both output data or metadata)