Rearranging classes to fit within the Transcriptomics Structure #743

Merged
merged 84 commits into from
Dec 8, 2023
Changes from 78 commits
Commits
a6b1639
correct Within calculation
Nov 18, 2021
fa4da8b
update unit tests
Nov 18, 2021
3246567
conflicts resolved back to upstream
Feb 4, 2022
a018d4d
Merge remote-tracking branch 'upstream/master'
Feb 15, 2022
15a37d0
Merge remote-tracking branch 'upstream/master'
Feb 17, 2022
892fa45
this is the spot
Feb 18, 2022
211013c
Merge remote-tracking branch 'upstream/master'
Feb 25, 2022
68104ee
Merge branch 'master' of https://github.com/trishorts/mzLib
trishorts Mar 9, 2022
d715a08
Merge remote-tracking branch 'upstream/master'
Mar 16, 2022
3565522
Merge remote-tracking branch 'upstream/master'
Mar 23, 2022
72e7b53
Merge remote-tracking branch 'upstream/master'
Mar 29, 2022
593872a
Merge remote-tracking branch 'upstream/master'
trishorts Apr 13, 2022
42dd034
Merge branch 'master' of https://github.com/trishorts/mzLib
trishorts Apr 13, 2022
fbeaec0
Merge remote-tracking branch 'upstream/master'
trishorts Jun 1, 2022
614ded7
Merge remote-tracking branch 'upstream/master'
Jun 14, 2022
47307c8
Merge branch 'master' of https://github.com/trishorts/mzLib
Jun 14, 2022
28e05ae
Merge remote-tracking branch 'upstream/master'
Jul 6, 2022
0a7c609
Merge remote-tracking branch 'upstream/master'
Jul 26, 2022
630d8c7
Merge remote-tracking branch 'upstream/master'
trishorts Jul 27, 2022
f6a386b
Merge branch 'master' of https://github.com/trishorts/mzLib
trishorts Jul 27, 2022
d673800
Merge remote-tracking branch 'upstream/master'
Sep 11, 2022
675a0ae
Merge branch 'master' of https://github.com/trishorts/mzLib
Sep 11, 2022
15d4baf
Merge remote-tracking branch 'upstream/master'
Sep 27, 2022
03ca9f7
Merge remote-tracking branch 'upstream/master'
Oct 4, 2022
d0a4c79
Merge remote-tracking branch 'upstream/master'
Jan 30, 2023
894b998
Merge remote-tracking branch 'upstream/master'
Mar 15, 2023
88269a1
Merge remote-tracking branch 'upstream/master'
trishorts Apr 24, 2023
9a9b24a
Merge remote-tracking branch 'upstream/master'
trishorts Jun 29, 2023
b4ad231
add space
trishorts Jun 29, 2023
bc59b38
Merge remote-tracking branch 'upstream/master'
trishorts Oct 10, 2023
f3c83ae
first move
trishorts Nov 6, 2023
d6d934b
psmFromTsv unit tests
trishorts Nov 6, 2023
2db71cd
moved library spectrum
trishorts Nov 6, 2023
562f69d
empty unit test for library spectrum
trishorts Nov 6, 2023
d3dcbe9
m
trishorts Nov 6, 2023
2c4334a
library spectrum unit tests
trishorts Nov 7, 2023
a86d68e
lib spec unit tests
trishorts Nov 7, 2023
c7ce32d
PSMTSV unit tests
trishorts Nov 7, 2023
c610791
add tests for variants and localized glycans
trishorts Nov 7, 2023
5e09c14
capitalization convention
trishorts Nov 7, 2023
9055644
read internal ions test
trishorts Nov 7, 2023
74b80ad
uncomment lines
trishorts Nov 7, 2023
d1bc75c
moved fragmentation and library spectrum to new project Omics
trishorts Nov 8, 2023
cec311a
Revert "moved fragmentation and library spectrum to new project Omics"
trishorts Nov 9, 2023
8d88b32
someInterfaces
trishorts Nov 9, 2023
df0f605
good midpont
trishorts Nov 9, 2023
cad0d1c
omics classes and interfaces seem tobe working
trishorts Nov 9, 2023
8991e14
move LibrarySpectrum class to Omics. Create SpectrumMatchFromTsvHeade…
trishorts Nov 10, 2023
02bf807
not working
trishorts Nov 15, 2023
b7d15d6
Fixed up the PR
nbollis Nov 15, 2023
2502322
Merge pull request #2 from trishorts/tempPsmFromTsv
trishorts Nov 16, 2023
924e99f
fix broken test
trishorts Nov 16, 2023
10f53a2
some unit tests
trishorts Nov 16, 2023
d0a55b2
dhg
trishorts Nov 16, 2023
81f9338
Expanded test coverage on file classes
nbollis Nov 16, 2023
382c0da
new header and xlink psmtsv reader unit tests
trishorts Nov 20, 2023
1c779e6
Merge branch 'master' into PsmFrmTsv
trishorts Nov 24, 2023
b1df755
Merge branch 'master' into PsmFrmTsv
nbollis Nov 27, 2023
5833d0f
space update
trishorts Nov 27, 2023
3f1ee5e
update nuspec for omics and added peptide folder to omics fragmentatkion
trishorts Nov 27, 2023
015ec82
Merge branch 'master' into PsmFrmTsv
trishorts Nov 27, 2023
540f449
Moved around most everything that wil need to be for Transcriptomics …
nbollis Nov 27, 2023
b91f11a
Made all tests pass
nbollis Nov 27, 2023
d1c7035
Moved a few methods out of PeptideWithSetModifications and into IBioP…
nbollis Nov 27, 2023
a915405
Moved methods from ProteolyticPeptide to LysisProduct
nbollis Nov 28, 2023
0a2cd99
Marked RNase.tsv to copy always
nbollis Nov 28, 2023
9571870
Cleaned up the code quite a bit
nbollis Nov 28, 2023
179ac3c
Updated product class equalit members
nbollis Nov 28, 2023
d81132a
Updated product class equalit members
nbollis Nov 28, 2023
022d63e
Merge branch 'RNA_FirstIncorporation' of https://github.com/nbollis/m…
nbollis Nov 28, 2023
1a7356a
This one method keeps fighting me
nbollis Nov 28, 2023
706ab75
Removed AnyCPU
nbollis Nov 29, 2023
6b1f48d
Added tests to ChemicalFormual operators
nbollis Nov 30, 2023
c309a10
Merged in Master
nbollis Nov 30, 2023
e3d3c21
Update mzLib.sln
nbollis Nov 30, 2023
7825ae8
Updated Nuspec
nbollis Nov 30, 2023
a7c729d
Merge branch 'RNA_FirstIncorporation' of https://github.com/nbollis/m…
nbollis Nov 30, 2023
3e19f13
Changed naming convention in Digestion Agent
nbollis Dec 7, 2023
2c7fd75
Added comment to clarify base loss ions from 5' end
nbollis Dec 7, 2023
4de474e
Added Comments to BioPolymerWithSetModsExtensions
nbollis Dec 7, 2023
d443092
Renamed LysisProduct to DigestionProduct
nbollis Dec 7, 2023
3027ccb
changed name on one method in DigestionProduct
nbollis Dec 7, 2023
78d6d81
Renamed get fixed mods method in DigestionProduct
nbollis Dec 7, 2023
5fb3270
Merge branch 'master' into RNA_FirstIncorporation
nbollis Dec 7, 2023
42 changes: 38 additions & 4 deletions mzLib/Chemistry/ChemicalFormula.cs
@@ -31,7 +31,7 @@
/// Formula can change!!! If isotopes or elements are changed.
/// </summary>
[Serializable]
- public sealed class ChemicalFormula : IEquatable<ChemicalFormula>
+ public sealed class ChemicalFormula : IEquatable<ChemicalFormula>, IHasChemicalFormula
{
// Main data stores, the isotopes and elements

@@ -59,12 +59,14 @@
Elements = new Dictionary<Element, int>();
}

- public ChemicalFormula(ChemicalFormula capFormula)
+ public ChemicalFormula(IHasChemicalFormula capFormula)
{
- Isotopes = new Dictionary<Isotope, int>(capFormula.Isotopes);
- Elements = new Dictionary<Element, int>(capFormula.Elements);
+ Isotopes = new Dictionary<Isotope, int>(capFormula.ThisChemicalFormula.Isotopes);
+ Elements = new Dictionary<Element, int>(capFormula.ThisChemicalFormula.Elements);
}

public ChemicalFormula ThisChemicalFormula => this;

/// <summary>
/// Gets the average mass of this chemical formula
/// </summary>
@@ -523,5 +525,37 @@
otherParts.Sort();
return s + string.Join("", otherParts);
}

public override string ToString()
{
return $"{ThisChemicalFormula.Formula} : {MonoisotopicMass}";
}

Codecov / codecov/patch warning: added lines #L530-L532 of mzLib/Chemistry/ChemicalFormula.cs were not covered by tests.

public static ChemicalFormula operator -(ChemicalFormula left, IHasChemicalFormula right)
{
if (left == null)
return right == null ? null : new ChemicalFormula(right);
Alexander-Sol marked this conversation as resolved.
if (right == null)
return new ChemicalFormula(left);


ChemicalFormula newFormula = new ChemicalFormula(left);
newFormula.Remove(right);
return newFormula;
}

public static ChemicalFormula operator +(ChemicalFormula left, IHasChemicalFormula right)
{
// if left is null, return right. If right is null, return left. If both are null, return null. If both are not null, add them
if (left == null)
return right == null ? null : new ChemicalFormula(right);
if (right == null)
return new ChemicalFormula(left);

ChemicalFormula newFormula = new ChemicalFormula(left);
newFormula.Add(right);
return newFormula;

}
}
}
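
The null handling in the new `+` operator, and the way `IHasChemicalFormula` lets any formula-bearing object act as the right operand, can be illustrated with a small standalone sketch. `MiniFormula` and `IHasMiniFormula` are hypothetical stand-ins invented for this example (they are not mzLib types), and element counts are keyed by plain strings rather than `Element` objects.

```csharp
using System.Collections.Generic;

// Hypothetical counterpart of IHasChemicalFormula (not an mzLib type).
public interface IHasMiniFormula
{
    MiniFormula ThisFormula { get; }
}

// Hypothetical stand-in for ChemicalFormula: element symbol -> atom count.
public sealed class MiniFormula : IHasMiniFormula
{
    public Dictionary<string, int> Elements { get; }

    public MiniFormula(Dictionary<string, int> elements)
    {
        // Copy the counts so the new formula never aliases the caller's dictionary.
        Elements = new Dictionary<string, int>(elements);
    }

    public MiniFormula(IHasMiniFormula source)
    {
        // Copy constructor that accepts anything exposing a formula.
        Elements = new Dictionary<string, int>(source.ThisFormula.Elements);
    }

    // A formula trivially "has" itself, mirroring ThisChemicalFormula => this.
    public MiniFormula ThisFormula => this;

    public static MiniFormula operator +(MiniFormula left, IHasMiniFormula right)
    {
        // A null operand yields a copy of the other operand; two nulls yield null.
        if (left is null)
            return right is null ? null : new MiniFormula(right);
        if (right is null)
            return new MiniFormula(left);

        // Both operands present: copy the left side, then add the right side's counts.
        var result = new MiniFormula(left);
        foreach (KeyValuePair<string, int> pair in right.ThisFormula.Elements)
        {
            result.Elements.TryGetValue(pair.Key, out int current); // 0 when absent
            result.Elements[pair.Key] = current + pair.Value;
        }
        return result;
    }
}
```

With `water` built from `{ "H": 2, "O": 1 }`, the expression `water + water` returns a new object holding H = 4 and O = 2 and leaves `water` unchanged, because the operator always works on a copy.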
131 changes: 131 additions & 0 deletions mzLib/Omics/BioPolymerWithSetModsExtensions.cs
@@ -0,0 +1,131 @@
using System.Text;
using Chemistry;
using Omics.Modifications;

namespace Omics;

public static class BioPolymerWithSetModsExtensions
{
public static string FullSequenceWithMassShift(this IBioPolymerWithSetMods withSetMods)
{
var subsequence = new StringBuilder();

// modification on peptide N-terminus
if (withSetMods.AllModsOneIsNterminus.TryGetValue(1, out Modification mod))
{
subsequence.Append('[' + mod.MonoisotopicMass.RoundedDouble(6).ToString() + ']');
}

for (int r = 0; r < withSetMods.Length; r++)
{
subsequence.Append(withSetMods[r]);

// modification on this residue
if (withSetMods.AllModsOneIsNterminus.TryGetValue(r + 2, out mod))
{
if (mod.MonoisotopicMass > 0)
{
subsequence.Append("[+" + mod.MonoisotopicMass.RoundedDouble(6).ToString() + ']');
}
else
{
subsequence.Append("[" + mod.MonoisotopicMass.RoundedDouble(6).ToString() + ']');
}

Codecov / codecov/patch warning: added lines #L31-L33 of mzLib/Omics/BioPolymerWithSetModsExtensions.cs were not covered by tests.
}
}

// modification on peptide C-terminus
if (withSetMods.AllModsOneIsNterminus.TryGetValue(withSetMods.Length + 2, out mod))
{
if (mod.MonoisotopicMass > 0)
{
subsequence.Append("[+" + mod.MonoisotopicMass.RoundedDouble(6).ToString() + ']');
}
else
{
subsequence.Append("[" + mod.MonoisotopicMass.RoundedDouble(6).ToString() + ']');
}
}
return subsequence.ToString();
}

/// <summary>
/// This method returns the full sequence with mass shifts INSTEAD OF PTMs in brackets []
/// Some external tools cannot parse PTMs, instead requiring a numerical input indicating the mass of a PTM in brackets
/// after the position of that modification
/// N-terminal mass shifts appear in brackets before the first amino acid and are written without the + sign
/// </summary>
/// <returns></returns>
public static string EssentialSequence(this IBioPolymerWithSetMods withSetMods,
IReadOnlyDictionary<string, int> modstoWritePruned)
{
nbollis marked this conversation as resolved.
string essentialSequence = withSetMods.BaseSequence;
if (modstoWritePruned != null)
{
var sbsequence = new StringBuilder();

// variable modification on peptide N-terminus
if (withSetMods.AllModsOneIsNterminus.TryGetValue(1, out Modification pep_n_term_variable_mod))
{
if (modstoWritePruned.ContainsKey(pep_n_term_variable_mod.ModificationType))
{
sbsequence.Append('[' + pep_n_term_variable_mod.ModificationType + ":" + pep_n_term_variable_mod.IdWithMotif + ']');
}
}
for (int r = 0; r < withSetMods.Length; r++)
{
sbsequence.Append(withSetMods[r]);
// variable modification on this residue
if (withSetMods.AllModsOneIsNterminus.TryGetValue(r + 2, out Modification residue_variable_mod))
{
if (modstoWritePruned.ContainsKey(residue_variable_mod.ModificationType))
{
sbsequence.Append('[' + residue_variable_mod.ModificationType + ":" + residue_variable_mod.IdWithMotif + ']');
}
}
}

// variable modification on peptide C-terminus
if (withSetMods.AllModsOneIsNterminus.TryGetValue(withSetMods.Length + 2, out Modification pep_c_term_variable_mod))
{
if (modstoWritePruned.ContainsKey(pep_c_term_variable_mod.ModificationType))
{
sbsequence.Append('[' + pep_c_term_variable_mod.ModificationType + ":" + pep_c_term_variable_mod.IdWithMotif + ']');
}
}

essentialSequence = sbsequence.ToString();
}
return essentialSequence;
}

public static string DetermineFullSequence(this IBioPolymerWithSetMods withSetMods)
{
var subSequence = new StringBuilder();

// modification on peptide N-terminus
if (withSetMods.AllModsOneIsNterminus.TryGetValue(1, out Modification mod))
{
subSequence.Append('[' + mod.ModificationType + ":" + mod.IdWithMotif + ']');
}

for (int r = 0; r < withSetMods.Length; r++)
{
subSequence.Append(withSetMods[r]);

// modification on this residue
if (withSetMods.AllModsOneIsNterminus.TryGetValue(r + 2, out mod))
{
subSequence.Append('[' + mod.ModificationType + ":" + mod.IdWithMotif + ']');
}
}

// modification on peptide C-terminus
if (withSetMods.AllModsOneIsNterminus.TryGetValue(withSetMods.Length + 2, out mod))
{
subSequence.Append('[' + mod.ModificationType + ":" + mod.IdWithMotif + ']');
}

return subSequence.ToString();
}
}
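
All three extension methods above rely on the same key convention in `AllModsOneIsNterminus`: key 1 is the N-terminus, zero-based residue `r` is stored under key `r + 2`, and the C-terminus is stored under `Length + 2`. The sketch below isolates that convention. `ModIndexSketch.Annotate` is a hypothetical helper written for this illustration, and plain string labels stand in for `Modification` objects.

```csharp
using System.Collections.Generic;
using System.Text;

public static class ModIndexSketch
{
    // Hypothetical helper: 'mods' maps a one-is-N-terminus index to a label.
    public static string Annotate(string baseSequence, IReadOnlyDictionary<int, string> mods)
    {
        var sb = new StringBuilder();

        // Key 1 is reserved for the N-terminus.
        if (mods.TryGetValue(1, out string label))
            sb.Append('[').Append(label).Append(']');

        for (int r = 0; r < baseSequence.Length; r++)
        {
            sb.Append(baseSequence[r]);

            // Zero-based residue r lives under key r + 2.
            if (mods.TryGetValue(r + 2, out label))
                sb.Append('[').Append(label).Append(']');
        }

        // The C-terminus follows the last residue key, at Length + 2.
        if (mods.TryGetValue(baseSequence.Length + 2, out label))
            sb.Append('[').Append(label).Append(']');

        return sb.ToString();
    }
}
```

For the base sequence `PEK` with labels at keys 1, 3, and 5 (`Ac`, `Ox`, `Am`), the helper returns `[Ac]PE[Ox]K[Am]`: key 3 annotates the second residue, and key 5 (that is, 3 + 2) annotates the C-terminus.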
@@ -1,4 +1,4 @@
- namespace Proteomics.ProteolyticDigestion
+ namespace Omics.Digestion
{
public enum CleavageSpecificity
{
107 changes: 107 additions & 0 deletions mzLib/Omics/Digestion/DigestionAgent.cs
@@ -0,0 +1,107 @@
using Omics.Modifications;
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

namespace Omics.Digestion
{
public abstract class DigestionAgent
{
protected DigestionAgent(string name, CleavageSpecificity cleavageSpecificity, List<DigestionMotif> motifList, Modification cleavageMod)
{
Name = name;
CleavageSpecificity = cleavageSpecificity;
DigestionMotifs = motifList ?? new List<DigestionMotif>();
CleavageMod = cleavageMod;
}

public string Name { get; init; }
public CleavageSpecificity CleavageSpecificity { get; init; }
public List<DigestionMotif> DigestionMotifs { get; init; }
public Modification CleavageMod { get; set; }

public override string ToString()
{
return Name;
}

Codecov / codecov/patch warning: added lines #L26-L28 of mzLib/Omics/Digestion/DigestionAgent.cs were not covered by tests.

/// <summary>
/// Is length of given peptide okay, given minimum and maximum?
/// </summary>
/// <param name="length"></param>
/// <param name="minLength"></param>
/// <param name="maxLength"></param>
/// <returns></returns>
protected static bool ValidLength(int length, int minLength, int maxLength)
{
return ValidMinLength(length, minLength) && ValidMaxLength(length, maxLength);
}

/// <summary>
/// Is length of given peptide okay, given minimum?
/// </summary>
/// <param name="length"></param>
/// <param name="minLength"></param>
/// <returns></returns>
protected static bool ValidMinLength(int length, int minLength)
{
return length >= minLength;
}

/// <summary>
/// Is length of given peptide okay, given maximum?
/// </summary>
/// <param name="length"></param>
/// <param name="maxLength"></param>
/// <returns></returns>
protected static bool ValidMaxLength(int? length, int maxLength)
{
return !length.HasValue || length <= maxLength;
}

/// <summary>
/// Gets the indices after which this protease will cleave a given protein sequence
/// </summary>
/// <param name="sequence"></param>
/// <returns></returns>
public List<int> GetDigestionSiteIndices(string sequence)
{
var indices = new List<int>();

for (int r = 0; r < sequence.Length; r++)
{
var cutSiteIndex = -1;
bool cleavagePrevented = false;

foreach (DigestionMotif motif in DigestionMotifs)
{
var motifResults = motif.Fits(sequence, r);
bool motifFits = motifResults.Item1;
bool motifPreventsCleavage = motifResults.Item2;

if (motifFits && r + motif.CutIndex < sequence.Length)
{
cutSiteIndex = Math.Max(r + motif.CutIndex, cutSiteIndex);
}

if (motifPreventsCleavage) // if any motif prevents cleavage
{
cleavagePrevented = true;
}
}

// if no motif prevents cleavage
if (!cleavagePrevented && cutSiteIndex != -1)
{
indices.Add(cutSiteIndex);
}
}

indices.Add(0); // The start of the protein is treated as a cleavage site to retain the N-terminal peptide
indices.Add(sequence.Length); // The end of the protein is treated as a cleavage site to retain the C-terminal peptide
return indices.Distinct().OrderBy(i => i).ToList();
}
}
}
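
The site-finding loop in `GetDigestionSiteIndices` can be followed on a concrete input with a reduced sketch. It assumes one hard-coded, trypsin-like rule (cut after K or R unless the next residue is P) in place of the `DigestionMotif` list. `CleavageSketch` and `SiteIndices` are hypothetical names for this example, not part of mzLib.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class CleavageSketch
{
    // Hypothetical simplified rule: cut after K or R unless followed by P.
    public static List<int> SiteIndices(string sequence)
    {
        var indices = new List<int>();

        for (int r = 0; r < sequence.Length; r++)
        {
            bool fits = sequence[r] == 'K' || sequence[r] == 'R';
            bool prevented = fits && r + 1 < sequence.Length && sequence[r + 1] == 'P';

            // A cut index of 1 means "cleave after the matched residue".
            int cutSiteIndex = r + 1;

            // As in the diff, a cut falling at the very end is skipped here
            // because the sequence end is added unconditionally below.
            if (fits && !prevented && cutSiteIndex < sequence.Length)
                indices.Add(cutSiteIndex);
        }

        indices.Add(0);               // start of the sequence
        indices.Add(sequence.Length); // end of the sequence
        return indices.Distinct().OrderBy(i => i).ToList();
    }
}
```

For `AKRPGK` the result is `0, 2, 6`: the K at position 1 yields a cut after index 2, the R is blocked by the following P, and the final K coincides with the sequence end.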
Original file line number Diff line number Diff line change
@@ -1,9 +1,7 @@
- using MzLibUtil;
- using System.Collections.Generic;
- using System.Linq;
- using System.Text.RegularExpressions;
+ using System.Text.RegularExpressions;
+ using MzLibUtil;

- namespace Proteomics.ProteolyticDigestion
+ namespace Omics.Digestion
{
public class DigestionMotif
{
15 changes: 15 additions & 0 deletions mzLib/Omics/Digestion/IDigestionParams.cs
@@ -0,0 +1,15 @@
using Omics.Fragmentation;

namespace Omics.Digestion
{
public interface IDigestionParams
{
int MaxMissedCleavages { get; set; }
int MinLength { get; set; }
int MaxLength { get; set; }
int MaxModificationIsoforms { get; set; }
int MaxMods { get; set; }
DigestionAgent Enzyme { get; }
Contributor:
This might not be an enzyme; cyanogen bromide, for example.

Member Author:
Renamed Enzyme to DigestionAgent

FragmentationTerminus FragmentationTerminus { get; }
}
}