Type: | Package |
Title: | Spectral Entropy for Mass Spectrometry Data |
Version: | 0.1.4 |
Date: | 2023-08-07 |
Description: | Clean the MS/MS spectrum, calculate spectral entropy, unweighted entropy similarity, and entropy similarity for mass spectrometry data. The entropy similarity is a novel similarity measure for MS/MS spectra which outperforms the widely used dot product similarity in compound identification. For more details, please refer to the paper: Yuanyue Li et al. (2021) "Spectral entropy outperforms MS/MS dot product similarity for small-molecule compound identification" <doi:10.1038/s41592-021-01331-z>. |
License: | Apache License (== 2.0) |
Depends: | R (≥ 3.5.0), Rcpp (≥ 1.0.10) |
Suggests: | testthat |
LinkingTo: | Rcpp |
RoxygenNote: | 7.2.3 |
Encoding: | UTF-8 |
URL: | https://github.com/YuanyueLi/MSEntropy |
NeedsCompilation: | yes |
Packaged: | 2023-08-07 22:58:36 UTC; yli |
Author: | Yuanyue Li [aut, cre] |
Maintainer: | Yuanyue Li <liyuanyue@gmail.com> |
Repository: | CRAN |
Date/Publication: | 2023-08-07 23:10:02 UTC |
Entropy similarity between two spectra
Description
Calculate the entropy similarity between two spectra
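For orientation, the measure published in Li et al. (2021) can be summarized as below. This is a sketch of the published definition, on the assumption that the package follows it; the symbols S, p, I and w are notation chosen here and do not appear in the package documentation.

```latex
\begin{aligned}
S(X) &= -\sum_{i} p_i \ln p_i, \qquad p_i = \frac{I_i}{\sum_j I_j} \\
\text{similarity}(A, B) &= 1 - \frac{2\,S_{AB} - S_A - S_B}{\ln 4} \\
I_i' &= I_i^{\,w}, \qquad
w = \begin{cases} 0.25 + 0.25\,S & \text{if } S < 3 \\ 1 & \text{otherwise} \end{cases}
\end{aligned}
```

Here S_AB is the entropy of the 1:1 mixture of the two normalized spectra after peaks within the MS2 tolerance have been merged. The last line is the intensity weighting applied to each spectrum before the similarity is computed; leaving it out gives the unweighted variant.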
Usage
calculate_entropy_similarity(
  peaks_a,
  peaks_b,
  ms2_tolerance_in_da,
  ms2_tolerance_in_ppm,
  clean_spectra,
  min_mz,
  max_mz,
  noise_threshold,
  max_peak_num
)
Arguments
peaks_a |
A matrix of spectral peaks, with two columns: mz and intensity |
peaks_b |
A matrix of spectral peaks, with two columns: mz and intensity |
ms2_tolerance_in_da |
The MS2 tolerance in Da, set to -1 to disable |
ms2_tolerance_in_ppm |
The MS2 tolerance in ppm, set to -1 to disable |
clean_spectra |
Whether to clean the spectra before calculating the entropy similarity, see clean_spectrum() |
min_mz |
The minimum mz value to keep, set to -1 to disable |
max_mz |
The maximum mz value to keep, set to -1 to disable |
noise_threshold |
The noise threshold, set to -1 to disable. All peaks with intensity < noise_threshold * max_intensity are removed |
max_peak_num |
The maximum number of peaks to keep, set to -1 to disable |
Value
The entropy similarity
Examples
mz_a <- c(169.071, 186.066, 186.0769)
intensity_a <- c(7.917962, 1.021589, 100.0)
mz_b <- c(120.212, 169.071, 186.066)
intensity_b <- c(37.16, 66.83, 999.0)
peaks_a <- matrix(c(mz_a, intensity_a), ncol = 2, byrow = FALSE)
peaks_b <- matrix(c(mz_b, intensity_b), ncol = 2, byrow = FALSE)
calculate_entropy_similarity(peaks_a, peaks_b,
                             ms2_tolerance_in_da = 0.02, ms2_tolerance_in_ppm = -1,
                             clean_spectra = TRUE, min_mz = 0, max_mz = 1000,
                             noise_threshold = 0.01,
                             max_peak_num = 100)
Calculate spectral entropy of a spectrum
Description
Calculate spectral entropy of a spectrum
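As a rough illustration of what this quantity is, the following base-R sketch computes the Shannon entropy of the normalized intensities. The helper name sketch_spectral_entropy is hypothetical and not part of the package, and the use of the natural logarithm is an assumption taken from Li et al. (2021); the packaged function may differ in detail.

```r
# Hypothetical helper, NOT part of the package: Shannon entropy of the
# intensity column after normalizing it to sum to 1.
sketch_spectral_entropy <- function(peaks) {
  intensity <- peaks[, 2]
  intensity <- intensity[intensity > 0]  # zero-intensity peaks contribute nothing
  p <- intensity / sum(intensity)
  -sum(p * log(p))                       # natural logarithm (assumption)
}

peaks <- cbind(mz = c(100.212, 300.321, 535.325),
               intensity = c(37.16, 66.83, 999.0))
entropy <- sketch_spectral_entropy(peaks)

# Four equally intense peaks give the maximum entropy for four peaks, log(4).
uniform <- sketch_spectral_entropy(cbind(mz = 1:4, intensity = rep(1, 4)))
```

A spectrum dominated by a single peak has an entropy close to 0, while a spectrum with many peaks of equal intensity approaches the logarithm of its peak count.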
Usage
calculate_spectral_entropy(peaks)
Arguments
peaks |
A matrix of peaks, with two columns: m/z and intensity. |
Value
A double value of spectral entropy.
Examples
mz <- c(100.212, 300.321, 535.325)
intensity <- c(37.16, 66.83, 999.0)
peaks <- matrix(c(mz, intensity), ncol = 2, byrow = FALSE)
calculate_spectral_entropy(peaks)
Unweighted entropy similarity between two spectra
Description
Calculate the unweighted entropy similarity between two spectra
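The idea behind the unweighted measure can be sketched in base R for the simplified case where the two spectra are already aligned peak by peak. Both helper names below are hypothetical and not part of the package, the formula is the one published in Li et al. (2021), and the peak matching within the MS2 tolerance that the packaged function performs is skipped entirely.

```r
# Hypothetical helpers, NOT part of the package.
# Shannon entropy of a non-negative intensity vector (zeros contribute 0).
shannon_entropy <- function(x) {
  p <- x / sum(x)
  p <- p[p > 0]
  -sum(p * log(p))
}

# Unweighted entropy similarity for two intensity vectors that are ALREADY
# aligned peak by peak (use 0 where a peak is absent in one spectrum).
sketch_unweighted_similarity <- function(int_a, int_b) {
  a <- int_a / sum(int_a)
  b <- int_b / sum(int_b)
  s_ab <- shannon_entropy((a + b) / 2)  # entropy of the 1:1 mixed spectrum
  1 - (2 * s_ab - shannon_entropy(a) - shannon_entropy(b)) / log(4)
}

same <- sketch_unweighted_similarity(c(1, 2, 3), c(1, 2, 3))  # identical spectra
none <- sketch_unweighted_similarity(c(1, 0), c(0, 1))        # no shared peaks
```

Identical spectra score 1 and spectra without any shared peak score 0 (up to floating-point rounding), which is the range the similarity is designed to cover.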
Usage
calculate_unweighted_entropy_similarity(
  peaks_a,
  peaks_b,
  ms2_tolerance_in_da,
  ms2_tolerance_in_ppm,
  clean_spectra,
  min_mz,
  max_mz,
  noise_threshold,
  max_peak_num
)
Arguments
peaks_a |
A matrix of spectral peaks, with two columns: mz and intensity |
peaks_b |
A matrix of spectral peaks, with two columns: mz and intensity |
ms2_tolerance_in_da |
The MS2 tolerance in Da, set to -1 to disable |
ms2_tolerance_in_ppm |
The MS2 tolerance in ppm, set to -1 to disable |
clean_spectra |
Whether to clean the spectra before calculating the entropy similarity, see clean_spectrum() |
min_mz |
The minimum mz value to keep, set to -1 to disable |
max_mz |
The maximum mz value to keep, set to -1 to disable |
noise_threshold |
The noise threshold, set to -1 to disable. All peaks with intensity < noise_threshold * max_intensity are removed |
max_peak_num |
The maximum number of peaks to keep, set to -1 to disable |
Value
The unweighted entropy similarity
Examples
mz_a <- c(169.071, 186.066, 186.0769)
intensity_a <- c(7.917962, 1.021589, 100.0)
mz_b <- c(120.212, 169.071, 186.066)
intensity_b <- c(37.16, 66.83, 999.0)
peaks_a <- matrix(c(mz_a, intensity_a), ncol = 2, byrow = FALSE)
peaks_b <- matrix(c(mz_b, intensity_b), ncol = 2, byrow = FALSE)
calculate_unweighted_entropy_similarity(peaks_a, peaks_b,
                                        ms2_tolerance_in_da = 0.02, ms2_tolerance_in_ppm = -1,
                                        clean_spectra = TRUE, min_mz = 0, max_mz = 1000,
                                        noise_threshold = 0.01,
                                        max_peak_num = 100)
Clean a spectrum
Description
Clean a spectrum
This function cleans the peaks in the following steps:
1. Remove empty peaks (mz <= 0 or intensity <= 0).
2. Remove peaks with mz >= max_mz or mz < min_mz.
3. Centroid the spectrum by merging peaks within min_ms2_difference_in_da or min_ms2_difference_in_ppm.
4. Remove peaks with intensity < noise_threshold * max_intensity.
5. Keep only the top max_peak_num peaks.
6. Normalize the intensity to sum to 1.
Note: Only one of min_ms2_difference_in_da and min_ms2_difference_in_ppm should be positive.
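The filtering steps above can be sketched in base R as follows. The helper sketch_clean is hypothetical and not part of the package. It covers steps 1, 2, 4, 5 and 6 only: centroiding (step 3) is left out, the -1 "disable" values are not handled, and reading "top" in step 5 as "most intense" is an assumption.

```r
# Hypothetical helper, NOT part of the package. Centroiding (step 3) is
# deliberately omitted and the -1 "disable" sentinel is not supported.
sketch_clean <- function(peaks, min_mz = 0, max_mz = 1000,
                         noise_threshold = 0.01, max_peak_num = 100) {
  mz <- peaks[, 1]
  intensity <- peaks[, 2]

  # Steps 1 and 2: drop empty peaks and peaks outside [min_mz, max_mz).
  keep <- mz > 0 & intensity > 0 & mz >= min_mz & mz < max_mz
  peaks <- peaks[keep, , drop = FALSE]

  # Step 4: drop peaks below noise_threshold * max_intensity.
  if (nrow(peaks) > 0) {
    cutoff <- noise_threshold * max(peaks[, 2])
    peaks <- peaks[peaks[, 2] >= cutoff, , drop = FALSE]
  }

  # Step 5: keep the max_peak_num most intense peaks (assumed meaning of "top").
  if (nrow(peaks) > max_peak_num) {
    top <- order(peaks[, 2], decreasing = TRUE)[seq_len(max_peak_num)]
    peaks <- peaks[top, , drop = FALSE]
  }

  # Return the peaks ordered by m/z (a choice made for this sketch).
  peaks <- peaks[order(peaks[, 1]), , drop = FALSE]

  # Step 6: normalize the intensities to sum to 1.
  if (nrow(peaks) > 0) {
    peaks[, 2] <- peaks[, 2] / sum(peaks[, 2])
  }
  peaks
}

peaks <- cbind(mz = c(100.212, 169.071, 300.321),
               intensity = c(0.3716, 7.917962, 66.83))
cleaned <- sketch_clean(peaks)
```

In this example the peak at m/z 100.212 falls below 1% of the most intense peak and is removed, so two peaks remain and their intensities sum to 1.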
Usage
clean_spectrum(
  peaks,
  min_mz,
  max_mz,
  noise_threshold,
  min_ms2_difference_in_da,
  min_ms2_difference_in_ppm,
  max_peak_num,
  normalize_intensity
)
Arguments
peaks |
A matrix of spectral peaks, with two columns: mz and intensity |
min_mz |
The minimum mz value to keep, set to -1 to disable |
max_mz |
The maximum mz value to keep, set to -1 to disable |
noise_threshold |
The noise threshold, set to -1 to disable. All peaks with intensity < noise_threshold * max_intensity are removed |
min_ms2_difference_in_da |
The minimum mz difference in Da to merge peaks, set to -1 to disable, any two peaks with mz difference < min_ms2_difference_in_da will be merged |
min_ms2_difference_in_ppm |
The minimum mz difference in ppm to merge peaks, set to -1 to disable, any two peaks with mz difference < min_ms2_difference_in_ppm will be merged |
max_peak_num |
The maximum number of peaks to keep, set to -1 to disable |
normalize_intensity |
Whether to normalize the intensity to sum to 1 |
Value
A matrix of spectral peaks, with two columns: mz and intensity
Examples
mz <- c(100.212, 169.071, 169.078, 300.321)
intensity <- c(0.3716, 7.917962, 100., 66.83)
peaks <- matrix(c(mz, intensity), ncol = 2, byrow = FALSE)
clean_spectrum(peaks, min_mz = 0, max_mz = 1000, noise_threshold = 0.01,
               min_ms2_difference_in_da = 0.02, min_ms2_difference_in_ppm = -1,
               max_peak_num = 100, normalize_intensity = TRUE)
Calculate spectral entropy similarity between two spectra
Description
msentropy_similarity() calculates the spectral entropy similarity between two spectra (Li et al. 2021). It is a wrapper function that defines defaults for the parameters and calls calculate_entropy_similarity() or calculate_unweighted_entropy_similarity() to perform the calculation.
Usage
msentropy_similarity(
  peaks_a,
  peaks_b,
  ms2_tolerance_in_da = 0.02,
  ms2_tolerance_in_ppm = -1,
  clean_spectra = TRUE,
  min_mz = 0,
  max_mz = 1000,
  noise_threshold = 0.01,
  max_peak_num = 100,
  weighted = TRUE,
  ...
)
Arguments
peaks_a |
A two-column numeric matrix with the m/z and intensity values for peaks of one spectrum. |
peaks_b |
A two-column numeric matrix with the m/z and intensity values for peaks of one spectrum. |
ms2_tolerance_in_da |
The MS2 tolerance in Da, set to -1 to disable. Defaults to 0.02. |
ms2_tolerance_in_ppm |
The MS2 tolerance in ppm, set to -1 to disable. Defaults to -1. |
clean_spectra |
Whether to clean the spectra before calculating the entropy similarity, see clean_spectrum(). Defaults to TRUE. |
min_mz |
The minimum mz value to keep, set to -1 to disable. Defaults to 0. |
max_mz |
The maximum mz value to keep, set to -1 to disable. Defaults to 1000. |
noise_threshold |
The noise threshold, set to -1 to disable. All peaks with intensity < noise_threshold * max_intensity are removed. Defaults to 0.01. |
max_peak_num |
The maximum number of peaks to keep, set to -1 to disable. Defaults to 100. |
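weighted |
Whether to calculate the weighted entropy similarity (calculate_entropy_similarity()) or the unweighted one (calculate_unweighted_entropy_similarity()). Defaults to TRUE. |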
... |
Optional additional parameters (currently ignored) |
Value
The entropy similarity
References
Li, Y., Kind, T., Folz, J. et al. (2021) Spectral entropy outperforms MS/MS dot product similarity for small-molecule compound identification. Nat Methods 18, 1524-1531. doi: 10.1038/s41592-021-01331-z.
Examples
peaks_a <- cbind(mz = c(169.071, 186.066, 186.0769),
                 intensity = c(7.917962, 1.021589, 100.0))
peaks_b <- cbind(mz = c(120.212, 169.071, 186.066),
                 intensity = c(37.16, 66.83, 999.0))
msentropy_similarity(peaks_a, peaks_b, ms2_tolerance_in_da = 0.02)