Rearranging classes to fit within the Transcriptomics Structure #743

Merged
merged 84 commits into from
Dec 8, 2023

nbollis
Member

@nbollis nbollis commented Nov 28, 2023

Review after PsmFromTsv is in MzLib

Moved several classes into the Omics project and created parent classes and interfaces that work with both proteins and RNA.

Omics - novel and moved code

Digestion

  • Moved CleavageSpecificity and DigestionMotif to Omics
  • DigestionProduct - Created a base class for ProteolysisProduct
    • Handles sequence information for digestion products
  • IDigestionParams - interface for digestion parameters
  • DigestionAgent - Base class for Protease and RNase
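
The shared base class for Protease and RNase can be pictured with a small sketch. This is only an illustration of the inheritance idea, not the mzLib code: the type names `DigestionAgentSketch`, `ProteaseSketch`, and `RnaseSketch`, the `CleaveAfter` set, and the simplified `GetCleavageIndices` helper are all hypothetical, and the rule "cut after a listed residue" ignores real motif details such as proline blocking.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified shapes; the real classes carry more members.
public abstract class DigestionAgentSketch
{
    protected DigestionAgentSketch(string name, IEnumerable<char> cleaveAfter)
    {
        Name = name;
        CleaveAfter = new HashSet<char>(cleaveAfter);
    }

    public string Name { get; }
    public HashSet<char> CleaveAfter { get; }

    // Shared logic for both polymers: boundaries of the digestion products,
    // cutting after every residue listed in CleaveAfter.
    public List<int> GetCleavageIndices(string sequence)
    {
        var indices = new List<int> { 0 };
        for (int i = 0; i < sequence.Length - 1; i++)
        {
            if (CleaveAfter.Contains(sequence[i]))
                indices.Add(i + 1);
        }
        indices.Add(sequence.Length);
        return indices;
    }
}

// Protein-specific agent; only the constructor is needed in this sketch.
public sealed class ProteaseSketch : DigestionAgentSketch
{
    public ProteaseSketch(string name, IEnumerable<char> cleaveAfter) : base(name, cleaveAfter) { }
}

// RNA-specific agent; it reuses the same cleavage logic unchanged.
public sealed class RnaseSketch : DigestionAgentSketch
{
    public RnaseSketch(string name, IEnumerable<char> cleaveAfter) : base(name, cleaveAfter) { }
}
```

With this shape, `new ProteaseSketch("trypsin-like", "KR").GetCleavageIndices("PEPKTIDER")` and `new RnaseSketch("T1-like", "G").GetCleavageIndices("GUACUG")` both run through the same base-class method.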

Fragmentation

  • Added ProductTypes specific to RNA
    • Ensured they could not be called by the Peptide DissociationTypeCollection
  • Added DissociationTypeCollection specific to RNA
    • Defines the fragment types from each dissociation type
    • Defines mass shift on terminal ends of fragmentation products
    • Defines which terminus each fragment type belongs to
  • FragmentationTerminus - Added 5' and 3'

Modifications

  • Modification - Moved to Omics
  • ModificationLocalization
    • Moved to Omics
    • Generalized for RNA
  • ModificationMotif - Moved to Omics
  • SilacLabel - Moved to Omics

Interfaces

  • IBioPolymer - Interface for Proteins and NucleicAcids
    • Contains minimal set of properties for shared operations to be performed successfully
    • Defines a Digest method

  • IBioPolymerWithSetMods - Interface for digestion products with localized mods
    • Defines the minimal set of properties for shared operations to be performed successfully
    • Defines a Fragment and FragmentInternally method
    • Base implementations for the methods below are found in IBioPolymerWithSetModsExtensions
      • Determine Full Sequence
      • Determine Full Sequence with mass shifts instead of mod names
      • Get Essential sequence given a set of modifications to contain
      • Get Base Sequence from Full Sequence
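
The pattern of keeping shared behavior in an extensions class can be sketched as follows. This is a hedged illustration rather than the mzLib implementation: `IPolymerWithModsSketch`, `ModsOneIsFirstTerminus`, and `DetermineFullSequenceSketch` are hypothetical names, modifications are reduced to plain strings, and the key convention (1 for the N or 5' terminus, `i + 2` for the residue at zero-based index `i`, `length + 2` for the C or 3' terminus) is an assumption of the example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical minimal interface shared by peptides and oligos.
public interface IPolymerWithModsSketch
{
    string BaseSequence { get; }
    Dictionary<int, string> ModsOneIsFirstTerminus { get; }
}

public static class PolymerWithModsSketchExtensions
{
    // One implementation serves every type that implements the interface.
    public static string DetermineFullSequenceSketch(this IPolymerWithModsSketch polymer)
    {
        var sb = new StringBuilder();

        // Modification on the first terminus (key 1).
        if (polymer.ModsOneIsFirstTerminus.TryGetValue(1, out string mod))
            sb.Append('[').Append(mod).Append(']');

        for (int r = 0; r < polymer.BaseSequence.Length; r++)
        {
            sb.Append(polymer.BaseSequence[r]);

            // Modification on this residue (key r + 2).
            if (polymer.ModsOneIsFirstTerminus.TryGetValue(r + 2, out mod))
                sb.Append('[').Append(mod).Append(']');
        }

        // Modification on the last terminus (key length + 2).
        if (polymer.ModsOneIsFirstTerminus.TryGetValue(polymer.BaseSequence.Length + 2, out mod))
            sb.Append("-[").Append(mod).Append(']');

        return sb.ToString();
    }
}

public sealed class PeptideSketch : IPolymerWithModsSketch
{
    public PeptideSketch(string sequence, Dictionary<int, string> mods)
    {
        BaseSequence = sequence;
        ModsOneIsFirstTerminus = mods;
    }

    public string BaseSequence { get; }
    public Dictionary<int, string> ModsOneIsFirstTerminus { get; }
}

public sealed class OligoSketch : IPolymerWithModsSketch
{
    public OligoSketch(string sequence, Dictionary<int, string> mods)
    {
        BaseSequence = sequence;
        ModsOneIsFirstTerminus = mods;
    }

    public string BaseSequence { get; }
    public Dictionary<int, string> ModsOneIsFirstTerminus { get; }
}
```

Because the method hangs off the interface, a call such as `peptide.DetermineFullSequenceSketch()` keeps the same call shape that existing callers of an instance method would use, which is presumably why this layout leaves legacy calls intact.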

Adjustments made to existing code

  • ChemicalFormula - now implements interface IHasChemicalFormula
  • Protein
    • Moved several get-only properties to expression-bodied members
    • Implemented IBioPolymer interface
      • Digestion can occur through the interface method or the preexisting method
  • DigestionParams
    • Implemented IDigestionParams interface
      • Left legacy naming convention intact through pointers in dedicated region

  • PeptideWithSetMods
    • Implemented IBioPolymerWithSetMods interface
    • Replaced DigestionParams with IDigestionParams in constructor
    • Excised several methods to IBioPolymerWithSetModsExtensions to leave legacy calls intact
  • Protease
    • Inherits from DigestionAgent
    • Moved several functions to DigestionAgent
  • ProteolyticPeptide
    • Inherits from LysisProduct
    • Moved almost all fields to LysisProduct
      • Left legacy naming convention intact through pointers in dedicated region
    • Excised several methods to LysisProduct
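
The "legacy naming through pointers" approach mentioned above for DigestionParams and ProteolyticPeptide might look roughly like this. It is a sketch under assumptions: every type name here ends in `Sketch` and is hypothetical, and `MaxPeptideLength` and `Protease` stand in for whichever legacy member names the real class keeps.

```csharp
using System;

public class AgentSketch
{
    public AgentSketch(string name) { Name = name; }
    public string Name { get; }
}

public class ProteaseAgentSketch : AgentSketch
{
    public ProteaseAgentSketch(string name) : base(name) { }
}

// Generalized contract usable for proteins and RNA alike.
public interface IDigestionParamsSketch
{
    int MaxLength { get; set; }
    AgentSketch DigestionAgent { get; }
}

public class DigestionParamsSketch : IDigestionParamsSketch
{
    public DigestionParamsSketch(ProteaseAgentSketch protease, int maxLength)
    {
        DigestionAgent = protease;
        MaxLength = maxLength;
    }

    // Generalized members required by the interface.
    public int MaxLength { get; set; }
    public AgentSketch DigestionAgent { get; }

    #region Legacy names kept for existing callers

    // These members hold no state of their own; they point at the
    // generalized members above so old call sites keep compiling.
    public ProteaseAgentSketch Protease => (ProteaseAgentSketch)DigestionAgent;

    public int MaxPeptideLength
    {
        get => MaxLength;
        set => MaxLength = value;
    }

    #endregion
}
```

Since the legacy members only forward, a value set through `MaxPeptideLength` is visible through `MaxLength` and the two names cannot drift apart.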

Transcriptomics

  • Defined interfaces for Nucleic Acids and Nucleotides
  • Added Rnase.tsv

MICHAEL SHORTREED and others added 30 commits November 18, 2021 12:30
@nbollis nbollis self-assigned this Dec 5, 2023
mzLib/Chemistry/ChemicalFormula.cs (outdated, resolved)
mzLib/Omics/Digestion/LysisProduct.cs (outdated, resolved)
@@ -235,32 +219,51 @@ public string GetEnsemblFastaHeader()

/// <summary>
/// Gets peptides for digestion of a protein
/// Legacy
/// </summary>
public IEnumerable<PeptideWithSetModifications> Digest(DigestionParams digestionParams,
Contributor

Obsolete tag needed?

Member Author

I do not think so. If one wanted to digest but only cared about proteins (such as in the XL or Glyco search task) this method would be called.

int MaxLength { get; set; }
int MaxModificationIsoforms { get; set; }
int MaxMods { get; set; }
DigestionAgent Enzyme { get; }
Contributor

Might not be an enzyme; for example, cyanogen bromide.

Member Author

Renamed Enzyme to DigestionAgent

}
}

protected Dictionary<int, Modification> GetFixedModsOneIsNterminusOrFivePrime(int peptideLength,
Contributor

GetFixedModsOneIsNorFivePrimeTerminus?

Member Author

Made this change

{
switch (mod.LocationRestriction)
{
case "5'-terminal.":
Contributor

terminus not terminal?

Member Author

This maintains the naming convention used for terminal mods in proteomics.

{
case "5'-terminal.":
case "Oligo 5'-terminal.":
case "N-terminal.":
Contributor

Need comments here and below that clearly distinguish peptide and protein terminus mods (and also something that distinguishes cleaved oligo termini from the parent oligo termini).

Member Author

@nbollis nbollis Dec 7, 2023

There is zero distinction between how modifications are handled on 5'-terminal vs N-terminal when applying fixed modifications to a DigestionProduct.
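
A reduced sketch of that point: the 5'-terminal and N-terminal restrictions fall through to the same branch, so a fixed modification lands on the same key either way. The helper name `GetTerminalKey`, the class `FixedModPlacementSketch`, and the key convention (1 for the first terminus, `length + 2` for the last) are assumptions for illustration, and the `"3'-terminal."` and `"C-terminal."` strings are included by analogy and do not appear in the diff excerpt above.

```csharp
using System;

public static class FixedModPlacementSketch
{
    // Returns the one-based dictionary key at which a terminal fixed
    // modification would be stored, or null for non-terminal restrictions.
    public static int? GetTerminalKey(string locationRestriction, int productLength)
    {
        switch (locationRestriction)
        {
            // Oligo and peptide "first terminus" restrictions share one branch.
            case "5'-terminal.":
            case "Oligo 5'-terminal.":
            case "N-terminal.":
                return 1;

            // Assumed counterparts for the opposite terminus.
            case "3'-terminal.":
            case "C-terminal.":
                return productLength + 2;

            default:
                return null;
        }
    }
}
```

Under this layout, any comment that separates peptide, protein, and oligo termini would document intent only, because the branch taken is identical.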



private static IEnumerable<int[]> GetVariableModificationPatterns(List<KeyValuePair<int, List<Modification>>> possibleVariableModifications,
int unmodifiedResiduesDesired, int[] variableModificationPattern, int index)
Contributor

I'm not familiar with unmodifiedResiduesDesired. Is there an explainer comment?

Member Author

This was copied and pasted from ProteolyticPeptide, I have no idea what unmodifiedResiduesDesired does, but it worked for peptides and has simply been moved
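
One plausible reading of the parameter, offered as an assumption rather than a description confirmed in this thread: the recursion enumerates every assignment of variable modifications in which exactly `unmodifiedResiduesDesired` of the modifiable sites are left unmodified. The sketch below implements that reading with hypothetical names (`VariableModPatternSketch`, `Patterns`) and with each site reduced to a count of candidate modifications; it is not the moved mzLib method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class VariableModPatternSketch
{
    // candidateCounts[i] is the number of candidate mods at modifiable site i.
    // Each yielded array holds 0 for "unmodified" or k >= 1 for the k-th candidate.
    public static IEnumerable<int[]> Patterns(IReadOnlyList<int> candidateCounts, int unmodifiedSitesDesired)
    {
        return Recurse(candidateCounts, unmodifiedSitesDesired, new int[candidateCounts.Count], 0);
    }

    private static IEnumerable<int[]> Recurse(IReadOnlyList<int> counts, int unmodifiedLeft, int[] pattern, int index)
    {
        if (index == counts.Count)
        {
            // Only complete patterns that used up the unmodified quota are emitted.
            if (unmodifiedLeft == 0)
                yield return (int[])pattern.Clone();
            yield break;
        }

        int sitesRemaining = counts.Count - index;

        // Option 1: leave this site unmodified while quota remains.
        if (unmodifiedLeft > 0)
        {
            pattern[index] = 0;
            foreach (var p in Recurse(counts, unmodifiedLeft - 1, pattern, index + 1))
                yield return p;
        }

        // Option 2: modify this site, but only if the later sites can still
        // absorb the remaining unmodified quota.
        if (unmodifiedLeft < sitesRemaining)
        {
            for (int k = 1; k <= counts[index]; k++)
            {
                pattern[index] = k;
                foreach (var p in Recurse(counts, unmodifiedLeft, pattern, index + 1))
                    yield return p;
            }
        }
    }
}
```

For two sites with one and two candidates and one site left unmodified, `Patterns(new[] { 1, 2 }, 1)` yields `[0, 1]`, `[0, 2]`, and `[1, 0]`. Calling it for a decreasing number of unmodified sites would visit lightly modified forms first, which would fit a cap on modification isoforms, though that link is also an assumption.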

@@ -0,0 +1 @@
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Chemistry;
using MassSpectrometry;

namespace Omics.Fragmentation.Oligo
{
    /// <summary>
    /// Methods dealing with specific product type for RNA molecules
    /// </summary>
    public static class DissociationTypeCollection
    {
        /// <summary>
        /// Product Ion types by dissociation method
        /// </summary>
        private static readonly Dictionary<DissociationType, List<ProductType>> ProductsFromDissociationType =
            new Dictionary<DissociationType, List<ProductType>>()
            {
                { DissociationType.Unknown, new List<ProductType>() },
                {
                    DissociationType.CID,
                    new List<ProductType>
                    {
                        ProductType.aBaseLoss, ProductType.c, ProductType.dWaterLoss, ProductType.w,
                        ProductType.y, ProductType.yWaterLoss, ProductType.M
                    }
                },
                { DissociationType.LowCID, new List<ProductType>() { } },
                { DissociationType.IRMPD, new List<ProductType>() { } },
                { DissociationType.ECD, new List<ProductType> { } },
                {
                    DissociationType.PQD,
                    new List<ProductType>
                    {
                        ProductType.a, ProductType.aBaseLoss, ProductType.b, ProductType.c, ProductType.d,
                        ProductType.dWaterLoss, ProductType.w, ProductType.x, ProductType.y,
                        ProductType.yWaterLoss, ProductType.d, ProductType.M
                    }
                },
                { DissociationType.ETD, new List<ProductType> { } },
                {
                    DissociationType.HCD,
                    new List<ProductType>
                    {
                        ProductType.w, ProductType.y, ProductType.aBaseLoss, ProductType.dWaterLoss, ProductType.M
                    }
                },
                { DissociationType.AnyActivationType, new List<ProductType> { } },
                { DissociationType.EThcD, new List<ProductType> { } },
                { DissociationType.Custom, new List<ProductType> { } },
                { DissociationType.ISCID, new List<ProductType> { } }
            };

        /// <summary>
        /// Returns list of products types based upon the dissociation type
        /// </summary>
        /// <param name="dissociationType"></param>
        /// <returns></returns>
        public static List<ProductType> GetRnaProductTypesFromDissociationType(this DissociationType dissociationType) =>
            ProductsFromDissociationType[dissociationType];

        /// <summary>
        /// Mass to be added or subtracted
        /// </summary>
        private static readonly Dictionary<ProductType, ChemicalFormula> FragmentIonCaps =
            new Dictionary<ProductType, ChemicalFormula>
            {
                { ProductType.a, ChemicalFormula.ParseFormula("H") },
                { ProductType.aWaterLoss, ChemicalFormula.ParseFormula("H-1O-1") },
                { ProductType.b, ChemicalFormula.ParseFormula("OH") },
                { ProductType.bWaterLoss, ChemicalFormula.ParseFormula("H-1") },
                { ProductType.c, ChemicalFormula.ParseFormula("O3H2P") },
                { ProductType.cWaterLoss, ChemicalFormula.ParseFormula("O2P") },
                { ProductType.d, ChemicalFormula.ParseFormula("O4H2P") },
                { ProductType.dWaterLoss, ChemicalFormula.ParseFormula("O3P") },

                { ProductType.w, ChemicalFormula.ParseFormula("H") },
                { ProductType.wWaterLoss, ChemicalFormula.ParseFormula("H-1O-1") },
                { ProductType.x, ChemicalFormula.ParseFormula("O-1H") },
                { ProductType.xWaterLoss, ChemicalFormula.ParseFormula("O-2H-1") },
                { ProductType.y, ChemicalFormula.ParseFormula("O-3P-1") },
                { ProductType.yWaterLoss, ChemicalFormula.ParseFormula("O-4H-2P-1") },
                { ProductType.z, ChemicalFormula.ParseFormula("O-4P-1") },
                { ProductType.zWaterLoss, ChemicalFormula.ParseFormula("O-5H-2P-1") },

                //fragment - Base chemical formula is the corresponding fragment chemical formula subtracting 1 H as H is lost when base is removed
                { ProductType.aBaseLoss, ChemicalFormula.ParseFormula("H-2") }, // "H-1" -H
                { ProductType.bBaseLoss, ChemicalFormula.ParseFormula("O1H-2") }, //"OH1" -H
                { ProductType.cBaseLoss, ChemicalFormula.ParseFormula("O3H-1P") }, //"O3P" -H
                { ProductType.dBaseLoss, ChemicalFormula.ParseFormula("O4H-1P") }, //"O4H2P" -H

                { ProductType.wBaseLoss, ChemicalFormula.ParseFormula("H-2") }, //"H"-H
                { ProductType.xBaseLoss, ChemicalFormula.ParseFormula("O-1H-2") }, //"O-1H" -H
                { ProductType.yBaseLoss, ChemicalFormula.ParseFormula("O-3H-2P-1") }, //"O-3P-1" -H
                { ProductType.zBaseLoss, ChemicalFormula.ParseFormula("O-4H-3P-1") }, //"O-4H-1P-1" -1

                { ProductType.M, new ChemicalFormula() }
            };

        /// <summary>
        /// Returns mass shift by product type
        /// </summary>
        /// <param name="type"></param>
        /// <returns></returns>
        public static double GetRnaMassShiftFromProductType(this ProductType type) =>
            FragmentIonCaps[type].MonoisotopicMass;

        public static FragmentationTerminus GetRnaTerminusType(this ProductType fragmentType)
        {
            switch (fragmentType)
            {
                case ProductType.a:
                case ProductType.aWaterLoss:
                case ProductType.aBaseLoss:
                case ProductType.b:
                case ProductType.bWaterLoss:
                case ProductType.bBaseLoss:
                case ProductType.c:
                case ProductType.cWaterLoss:
                case ProductType.cBaseLoss:
                case ProductType.d:
                case ProductType.dWaterLoss:
                case ProductType.dBaseLoss:
                    return FragmentationTerminus.FivePrime;

                case ProductType.w:
                case ProductType.wWaterLoss:
                case ProductType.wBaseLoss:
                case ProductType.x:
                case ProductType.xWaterLoss:
                case ProductType.xBaseLoss:
                case ProductType.y:
                case ProductType.yWaterLoss:
                case ProductType.yBaseLoss:
                case ProductType.z:
                case ProductType.zWaterLoss:
                case ProductType.zBaseLoss:
                    return FragmentationTerminus.ThreePrime;

                case ProductType.M:
                    return FragmentationTerminus.None;

                case ProductType.aStar:
                case ProductType.aDegree:
                case ProductType.bAmmoniaLoss:
                case ProductType.yAmmoniaLoss:
                case ProductType.zPlusOne:
                case ProductType.D:
                case ProductType.Ycore:
                case ProductType.Y:
                default:
                    throw new ArgumentOutOfRangeException(nameof(fragmentType), fragmentType, null);
            }
        }

        /// <summary>
        /// Product ion types by Fragmentation Terminus
        /// </summary>
        private static readonly Dictionary<FragmentationTerminus, List<ProductType>> ProductIonTypesFromSpecifiedTerminus =
            new Dictionary<FragmentationTerminus, List<ProductType>>
            {
                {
                    FragmentationTerminus.FivePrime,
                    new List<ProductType>
                    {
                        ProductType.a, ProductType.aWaterLoss, ProductType.aBaseLoss,
                        ProductType.b, ProductType.bWaterLoss, ProductType.bBaseLoss,
                        ProductType.c, ProductType.cWaterLoss, ProductType.cBaseLoss,
                        ProductType.d, ProductType.dWaterLoss, ProductType.dBaseLoss,
                    }
                },
                {
                    FragmentationTerminus.ThreePrime,
                    new List<ProductType>
                    {
                        ProductType.w, ProductType.wWaterLoss, ProductType.wBaseLoss,
                        ProductType.x, ProductType.xWaterLoss, ProductType.xBaseLoss,
                        ProductType.y, ProductType.yWaterLoss, ProductType.yBaseLoss,
                        ProductType.z, ProductType.zWaterLoss, ProductType.zBaseLoss,
                    }
                },
                {
                    FragmentationTerminus.Both,
                    new List<ProductType>
                    {
                        ProductType.a, ProductType.aWaterLoss, ProductType.aBaseLoss,
                        ProductType.b, ProductType.bWaterLoss, ProductType.bBaseLoss,
                        ProductType.c, ProductType.cWaterLoss, ProductType.cBaseLoss,
                        ProductType.d, ProductType.dWaterLoss, ProductType.dBaseLoss,
                        ProductType.w, ProductType.wWaterLoss, ProductType.wBaseLoss,
                        ProductType.x, ProductType.xWaterLoss, ProductType.xBaseLoss,
                        ProductType.y, ProductType.yWaterLoss, ProductType.yBaseLoss,
                        ProductType.z, ProductType.zWaterLoss, ProductType.zBaseLoss,
                        ProductType.M
                    }
                }
            };

        public static List<ProductType> GetRnaTerminusSpecificProductTypes(
            this FragmentationTerminus fragmentationTerminus)
        {
            return ProductIonTypesFromSpecifiedTerminus[fragmentationTerminus];
        }

        /// <summary>
        /// Returns all product ion types based upon specified terminus
        /// </summary>
        /// <param name="dissociationType"></param>
        /// <param name="fragmentationTerminus"></param>
        /// <returns></returns>
        public static List<ProductType> GetRnaTerminusSpecificProductTypesFromDissociation(
            this DissociationType dissociationType, FragmentationTerminus fragmentationTerminus)
        {
            var terminusSpecific = fragmentationTerminus.GetRnaTerminusSpecificProductTypes();
            var dissociationSpecific = dissociationType.GetRnaProductTypesFromDissociationType();
            return terminusSpecific.Intersect(dissociationSpecific).ToList();
        }
    }
}
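
The last method in this file selects fragments by intersecting the terminus list with the dissociation list. A reduced, self-contained sketch of that step follows; the enums `FragSketch` and `TerminusSketch` and the class `RnaFragmentSelectionSketch` are hypothetical stand-ins, the CID list copies the CID entry from the diff, and the terminus lists are trimmed to the members relevant here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for ProductType and FragmentationTerminus.
public enum FragSketch { aBaseLoss, c, dWaterLoss, w, y, yWaterLoss, M }
public enum TerminusSketch { FivePrime, ThreePrime }

public static class RnaFragmentSelectionSketch
{
    // Mirrors the CID entry shown in the diff.
    private static readonly List<FragSketch> CidProducts = new List<FragSketch>
    {
        FragSketch.aBaseLoss, FragSketch.c, FragSketch.dWaterLoss,
        FragSketch.w, FragSketch.y, FragSketch.yWaterLoss, FragSketch.M
    };

    // Trimmed terminus lists; note that M belongs to neither single terminus.
    private static readonly Dictionary<TerminusSketch, List<FragSketch>> ByTerminus =
        new Dictionary<TerminusSketch, List<FragSketch>>
        {
            {
                TerminusSketch.FivePrime,
                new List<FragSketch> { FragSketch.aBaseLoss, FragSketch.c, FragSketch.dWaterLoss }
            },
            {
                TerminusSketch.ThreePrime,
                new List<FragSketch> { FragSketch.w, FragSketch.y, FragSketch.yWaterLoss }
            }
        };

    // Same shape as the real method: terminus list intersected with dissociation list.
    public static List<FragSketch> CidProductsFor(TerminusSketch terminus) =>
        ByTerminus[terminus].Intersect(CidProducts).ToList();
}
```

Two consequences of using `Intersect` carry over to the real lists: the result follows the order of the terminus list, and duplicates are dropped. In the real dictionaries `ProductType.M` appears only under `FragmentationTerminus.Both`, so a single-terminus query would not return it.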
Contributor

Something weird showing up.

Member Author

For some reason, GitHub thinks there is zero whitespace in this class.

Adjusted Chemical Formula subtraction when null - value
Made LysisProduct an abstract class and its constructor protected
Contributor

@trishorts trishorts left a comment

let's see how this works in MM

@trishorts trishorts merged commit 167fa9b into smith-chem-wisc:master Dec 8, 2023
3 checks passed