SEP V017: Protein, Deprecated Macromolecule, and Alternative Small Molecules #67

Closed
jakebeal opened this issue Jun 2, 2019 · 49 comments
Labels
Accepted final An SEP has been approved and incorporated into the SPEC SEP

Comments

@jakebeal
Contributor

jakebeal commented Jun 2, 2019

Proteins are currently represented by the Macromolecule glyph, which looks much like the "shmoo" shape that people often use to represent yeast cells. This SEP proposes to deprecate the "shmoo", represent proteins explicitly with the "pill" glyph, and allow a family of different simple shapes to represent simple chemicals.

Full SEP at: https://github.com/SynBioDex/SBOL-visual/blob/master/SEPs/SEP_V017.md

@cjmyers
Contributor

cjmyers commented Jun 2, 2019

This SEP appears to me to fairly accurately represent the general consensus from the discussion. One thing that would be nice to add to the SEP is examples of small molecule / protein complex. I would assume that the small circle (or other shape) overlaid on the protein (pill) glyph would be the rendering for this.

@jakebeal
Contributor Author

jakebeal commented Jun 2, 2019

You are exactly correct about what it would look like.

@jakebeal
Contributor Author

jakebeal commented Jun 2, 2019

I've updated the SEP and its associated branch to have updated examples for complex.

@cjmyers
Contributor

cjmyers commented Jun 3, 2019 via email

@cjmyers
Contributor

cjmyers commented Jun 3, 2019 via email

@jakebeal
Contributor Author

jakebeal commented Jun 3, 2019

The SEP does leave the original macromolecule glyph, it just demotes it from standard to alternative and says it's "deprecated" meaning it might go away in the future. Do you think it shouldn't be deprecated and/or shouldn't be demoted to alternative?

@cjmyers
Contributor

cjmyers commented Jun 3, 2019 via email

@jakebeal
Contributor Author

jakebeal commented Jun 3, 2019

I've updated to make the complex examples use some of the old versions.

@cjmyers
Contributor

cjmyers commented Jun 3, 2019 via email

@jakebeal
Contributor Author

jakebeal commented Jun 3, 2019

Not yet on releasing the paper because 1) small molecule would still need to change, and 2) we need more people than just you and me to weigh in.

@cjmyers
Contributor

cjmyers commented Jun 3, 2019 via email

@jakebeal
Contributor Author

jakebeal commented Jun 4, 2019

@shyambhakta @rsc3 @JS3xton @ReneeLizena @bbartley @mikebissell @jamesscottbrown @oerbilgin @hsauro @graik

Tagging you all again: would anybody besides myself and Chris care to express an opinion on the potential compromise I have proposed?
We had many voices in the discussion before, and I think we should hear from more people before this goes to a vote.

@graik

graik commented Jun 5, 2019

Thanks for writing this up! Some concerns / suggestions:

(1) I think protein symbol and small molecule should really be put into separate SEPs. This would help keep comments and discussion focused.

(2) The yeast shape should be deprecated (as originally suggested). I cannot imagine many biodesign diagrams where a random yeast symbol is not causing confusion.

For the protein shape, I would suggest adding two diagrams as alternatives (which set the stage for more detailed protein symbols).
(3a) alternative generic protein symbol
protein-specification-withline
(3b) representation of actual domain architecture
protein-specification-multidomain

(4) I don't think the hexagon was a consensus from the small molecule shape discussion. My impression was rather that "circle" was what most people still preferred. I personally very much like the idea of explicitly small geometric shapes (especially if they come in multiple copies) with labels recommended to be outside the shape.

@cjmyers
Contributor

cjmyers commented Jun 5, 2019 via email

@jakebeal
Contributor Author

jakebeal commented Jun 5, 2019

Thank you @graik. My thought regarding these:

  1. The reason that I wanted to keep them in the same SEP is that the new protein symbol forces a change of the small molecule symbol --- while "circle" and "pill" may be considered different, we cannot have two different things both defined as "pill", and small molecule has no alternate to fall back on.
  2. I'd like to hear more people speak on this, one way or another. I'd be happy to see an option of whether or not to deprecate be part of the vote.
  3. I personally don't like 3(a) for the reasons already discussed, but am receptive to it if it becomes clear that many want it. 3(b) is definitely another SEP, as there are other discussions that need to be had about harmonizing the rest of the protein language with the DNA language.
  4. My reason for putting hexagon there was to offer one "large" shape; the alternatives are certainly fine to use. I was unclear on how to write up the "multiple small shapes" effectively without going too broad, and was hoping to at least partially cover that with the molecular structure option.

@bbartley

bbartley commented Jun 5, 2019

I don't feel strongly one way or the other about deprecating shmoo.

Like @graik, my impression also was that "circle" was what most people recommended and that hexagon was an acceptable alternative. Restating my earlier concern, I think hexagon would not be recommended for a pentose, while circle is universal for all small molecules, and thus should be the recommended glyph.

However, this is a minor issue and would not prevent me from voting for approval.

@rsc3
Contributor

rsc3 commented Jun 5, 2019 via email

@rsc3
Contributor

rsc3 commented Jun 6, 2019

"Strong concerns were raised regarding the potential confusion between ellipse, pill, and circle. Setting simple chemicals to a small polygon was also not found acceptable in discussion due to the fact that any given small polygon is a mismatch for the molecular structure of many small molecules."

-->

"Strong concerns were raised regarding the potential confusion between ellipse, pill, and circle. Setting simple chemicals to a specific small polygon was also not found acceptable in discussion due to the fact that any given small polygon is a mismatch for the molecular structure of many small molecules."

@jakebeal
Contributor Author

jakebeal commented Jun 9, 2019

I have updated the SEP based on the comments both here and on the mailing list, with the following changes:

  • Noted three choices to be voted on with the SEP: 1) deprecation of shmoo, 2) protein-with-lines alternative, 3) hexagon vs. small circle as recommended simple chemical
  • Added notes explaining compatibility with SBGN (we actually can remain compatible, by using Macromolecule and circle).
  • Added explanations and recommendations regarding the alternatives for simple chemical.
  • Added labels for Complex examples.
  • Added some representative examples in Section 3
  • Updated discussion.

Please look and see if you have any further concerns before this goes forward for a vote; if no more significant issues arise by Wednesday, I will move this for a vote.

@graik

graik commented Jun 9, 2019

Thanks for the update.
I find it confusing that the yeast shape is still used in most of the examples. Is it supposed to be used as a generic macromolecule shape? That would rather defeat the purpose of this change.

@cjmyers
Contributor

cjmyers commented Jun 9, 2019 via email

@graik

graik commented Jun 9, 2019

[the yeast shape] is primarily used in the complexes. This is actually my concern with the new pill for protein. It does not look as good for complexes as shmoo does, at least in my opinion.

This shape is immediately and specifically recognized as "yeast cell" by (I presume) a majority of biologists. Using this as a symbol for whatever macromolecule in whatever context is extremely confusing. There are plenty of examples where the complex between yeast cells and proteins or other molecules is itself part of a bioengineering design.
Example 1: https://www.pnas.org/content/pnas/108/28/11399/F1.large.jpg
Example 2: https://images.app.goo.gl/P2x1AoYsWwfFn1JK7
Example 3: http://2012.igem.org/File:Washington_ysdexplanation_Screen_shot_2012-10-03_at_2.45.23_PM.png

Therefore, no amount of context will prevent a large share of bioengineers from falsely interpreting the examples in the current SEP draft as "yeast interacting with something".

For this reason, we MUST deprecate the use of this yeast symbol as an icon for macromolecule, regardless of context. This was the whole point of this SEP.

Besides, I cannot see why complexes composed of pill shape + DNA / small molecule or anything else should look bad in any way. After all, SBOLv specifies something like a technical drawing. Using a clean and basic shape for the most common building block of our designs only makes sense.
It is always a bad idea to have many different ways of saying the same thing. Sometimes, it is unavoidable but here we can avoid it and, I think, we have to.

@jakebeal
Contributor Author

jakebeal commented Jun 9, 2019

@graik @cjmyers Since there is no consensus in the discussion, the SEP leaves to the voters on the SEP the question of whether to deprecate the shmoo or not.

  • If people vote to deprecate, then the shmoo will be a SHOULD NOT alternative, and will be removed at some point in the future. Examples will be updated accordingly.
  • If people vote not to deprecate, then the shmoo will be left as an alternative, with just the note that you should not use it in any context involving yeast.

Similarly, the stretchable hexagon is a question being left to voters; examples will be updated as needed following the vote.

I do take the point that SBGN round-rect might be simply used for protein without adding a new glyph at all, and suggest this too can be polled from the voters.

I've updated text to clarify accordingly.

@cjmyers
Contributor

cjmyers commented Jun 9, 2019 via email

@jakebeal
Contributor Author

jakebeal commented Jun 9, 2019

@cjmyers In my view, this is the final SEP, and it presents all of the alternatives sufficiently. The only update after voting will be a) noting which combination achieved consensus, and b) adjusting the actual implementation pull request to reflect the consensus vote.

@graik

graik commented Jun 9, 2019

@cjmyers If anyone ever complains, just show them this thread and, I think, it will be clear that we really didn't take this lightly :)

Jake suggested a sensible deprecation path -- we can merely discourage the use of the yeast symbol for now and say it will be officially deprecated in a later version.

I would suggest removing those "yeast complex" examples, as they imply that this symbol should continue to be used. If we can agree on that on this thread, this would make the voting a lot less complex. Let's wait some days for comments as there are holidays in lots of places right now (Happy Pentecost).

@cjmyers
Contributor

cjmyers commented Jun 9, 2019 via email

@jakebeal
Contributor Author

jakebeal commented Jun 9, 2019

Please remember that the SEP is not the same as the spec change: we've had SEPs with options for the vote before. Also please remember that many more people have participated in this discussion at a lower frequency. I really don't know which way the vote will go.

Given the lack of consensus, I plan to leave both the examples and the rest of the SEP as is unless I hear significantly more voices giving reason to do otherwise.

@graik

graik commented Jun 9, 2019

I'll shut up after this, but the "Complex" section has three out of four figures depicting protein by the yeast shape, which is declared deprecated (or not) further up. This is really confusing. Also, the hexagon shape for small molecule is given much more prominence than the circle that many people preferred. The choices are not always clear.

For the sake of simplifying things, I would be fine to withdraw the "pillshape + lines" version from this SEP so that, at least, the recommended protein symbol is pillshape without need for further questions.

The questions can then be simplified to:
(1) Should the "shmoo" (yeast) symbol remain a valid alternative for macromolecule or should it be deprecated (in favor of SBGN rounded rectangle)?
(2) Is the pill shape adopted as the recommended protein symbol?
(3) Should the preferred symbol for small molecule be (a) circle, (b) hexagon, (c) ?

@cjmyers
Contributor

cjmyers commented Jun 9, 2019 via email

@jakebeal
Contributor Author

Thank you @graik for the offer to defer the stadium-with-lines alternate; per your suggestion I would indeed like to take you up on that in order to reduce the complexity.

I have updated the wording slightly based on these comments and have shifted so 3/4 of the complex examples use "pill" proteins. In reviewing the discussion, however, I found that hexagon also has many people supporting it (they have just spoken fewer times per person), so I am going to keep the hexagon vs. circle on small molecule the way it is.

@cjmyers
Contributor

cjmyers commented Jun 10, 2019 via email

@rsc3
Contributor

rsc3 commented Jun 10, 2019 via email

@DavidZong

To answer @graik 's 3 questions:

  1. yes the yeast shape should be deprecated because it is needlessly confusing. I got in a discussion with my PI today about a figure I had made and had to explain that the shapes are proteins, not yeast.

  2. I agree with the small molecule shape here. Simple polygons represent these things well. Especially since the old small molecule glyph had the problem of looking too much like rod shaped bacteria.

  3. However, this new macromolecule shape raises a different issue of looking like rod shaped bacteria.

I realize this may be a subset of users, but in population level gene circuits, the rod shape is useful for indicating populations of bacteria.
image

For instance, in Chen et al. 2015, the pill shape represents populations of bacteria that interact with each other to form an emergent oscillator circuit.

The lines at the end of the pill might not be distinct enough to differentiate protein vs a drawing of rod shaped bacterium. Perhaps if the pill shape were to have a slightly different aspect ratio, so as to not look like a bacterium?

@rsc3
Contributor

rsc3 commented Jun 10, 2019 via email

@DavidZong

I agree that a shmoo is more iconic in describing yeast than the pill/stadium is in describing bacteria. What is unsatisfying to me is that, while the previous macromolecule symbol looks obviously like a yeast cell, the new proposed symbol looks obviously like E. coli. I'm concerned that at some point in the future we might be having another discussion about changing the macromolecule symbol again because it looks too much like another very common model organism.

So I suppose reliance on context will be crucial for interpreting figures that choose to use the pill shape for proteins and bacteria. Users can differentiate by using the pill with line for proteins and pill/"cluster of pills" for bacteria (or a modified pill shape as @rsc3 suggests), for example.

Therefore, if the community votes to deprecate the shmoo, I would want to have at least one alternative to the stadium/pill. I think the stadium/pill with the line is suitable as an alternative. But, in my opinion, the stadium/pill with the line should be recommended and the plain stadium/pill should be the alternative. The stadium/pill with the lines is less ambiguous. The trade-off here is that the pill/stadium alone is a simpler shape.

@jakebeal
Contributor Author

@DavidZong Thank you for your contributions! Two thoughts I have that I would like to hear your responses on:

  1. A distinction that may be of use: I have noticed that, unlike with the yeast diagrams, most uses of a pill to indicate rod-shaped bacteria involve multiple bacteria clustered together (as in your example), presumably because colonies grow in this way. This is a visually distinct usage, as their adjacency is unlike both the isolated use for a protein and the overlapping usage for a complex. Rotations are also often involved.

  2. If these proposals go forward, there will still be an alternative to the pill for protein, in the form of the SBGN macromolecule symbol (rounded rectangle).

@cjmyers
Contributor

cjmyers commented Jun 11, 2019 via email

@DavidZong

@jakebeal

  1. You're right in most cases bacteria are arranged in such a way that it looks like a microscope image of them. Like I said previously, if the pill is to be adopted, context will be paramount when a user designs a figure.

  2. I agree with @cjmyers on adopting the rounded rectangle, which is less ambiguous. What were the objections to the rounded rectangle?

@jakebeal
Contributor Author

@DavidZong Good that we're on the same page with respect to (1).

With respect to rounded rectangle: there is no objection to it, and it is already adopted for Macromolecule. The proposal is to enhance this with a "pill" shape for protein that matches both one strand of common usage and the protein diagram language proposed in https://pubs.acs.org/doi/abs/10.1021/acssynbio.6b00286

@cjmyers
Contributor

cjmyers commented Jun 11, 2019 via email

@jakebeal
Contributor Author

I am still opposed to "pill with lines" due to the visual conflict of the lines with arrows. I'm OK to bring the option back in, however.

@shyambhakta

shyambhakta commented Jun 12, 2019

I see what @DavidZong is saying about stadium ≅ bacillus. Circle ≅ coccus. (And rounded rectangle ≅ Haloquadratum. 😂 https://en.wikipedia.org/wiki/Haloquadratum)
Standards don't guarantee one can't make a confusing figure using them. I imagine proteins are generally going to be labelled as such, making a clearer distinction. And perhaps they'll be in a context that looks genetic/molecular, alongside circuits.
But yes, I like rounded rectangles as an option.

@jakebeal by this did you mean that the stadium/pill is a kind of rounded rectangle, sort of how I imagine in the next post? If not, perhaps a pill can be defined in terms of a stadium with the short line of zero-length.

  1. If these proposals go forward, there will still be an alternative to the pill for protein, in the form of the SBGN macromolecule symbol (rounded rectangle).

I thought a "pill with lines" would look like a pill sitting on a DNA backbone, in addition to interfering with arrows. In the literature, pills and non/rounded-rectangles sitting on a line aren't uncommonly used to illustrate genes or other features on a DNA strand. That's why I wouldn't like a pill or rounded rectangles with a line.
If the purpose of the line is to show peptide linkage between protein domains, then the linkers ought to either have the same specs as the domains themselves (e.g. pill/rounded rectangle) (b), or maybe a curve instead of a line (c). I see (c) a lot in protein cartoons.
image

Regarding hand-drawability, I will say that we're not exactly trained to distinguish hand-drawn rounded and sharp-cornered polygons. Hand-drawn shapes that are rounded are sort of… innately recognized as an approximation of the sharp-cornered counterpart. A pill (which looks like an oval approximation when hand-drawn) can easily be chosen by hand-drawers to distinguish it from a true rectangle. Another reason why pill and rounded rectangle may be desired alternatives together, not one or the other.

@shyambhakta

Just playing in PowerPoint, adjusting the curvature of a rounded rectangle (a) to encompass the short edge makes it into a stadium/pill (b). Shortening the stadium (b) or adjusting the curvature of a rounded square (c) makes a circle (d).
So here's a "far-out" idea that unites the stadium, rounded rectangle, and circle: I wonder if a protein can be represented as any polygon with rounded corners and straight side lengths ≥0. This would allow a circle (x,y=0), stadium/pill (x=0, y>0), rounded rectangle (x,y>0), or even other convex or concave rounded polygons that better facilitate representation of how proteins may multimerize or interact with other components (e). I showed such restricted to 90° angles, which may be a desirable constraint preventing rounded triangles, hexagons, trapezoids, etc.
Just an idea. Maybe having a broad definition like "rounded 90° polygon" may be too lenient.
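
The unification described above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the SEP: the function names `classify_rounded_shape` and `svg_outline`, the corner radius parameter `r`, and the choice to treat either zero side length as a stadium (so that a rotated pill counts too) are all assumptions made for the example. Here `x` is the straight length of the vertical sides and `y` the straight length of the horizontal sides.

```python
def classify_rounded_shape(x: float, y: float) -> str:
    """Name the rounded shape given its straight side lengths (hypothetical helper).

    x: straight length of the vertical sides, y: straight length of the
    horizontal sides. Both must be >= 0, as in the idea sketched in the post.
    """
    if x < 0 or y < 0:
        raise ValueError("straight side lengths must be >= 0")
    if x == 0 and y == 0:
        return "circle"
    if x == 0 or y == 0:
        # Assumption: a pill rotated by 90 degrees is still a stadium.
        return "stadium"
    return "rounded rectangle"


def svg_outline(x: float, y: float, r: float = 1.0) -> str:
    """Return SVG path data for the outline: four quarter-circle corners of
    radius r joined by straight runs of length y (horizontal) and x (vertical)."""
    classify_rounded_shape(x, y)  # reuse the non-negativity check
    if r <= 0:
        raise ValueError("corner radius must be > 0")
    return (
        f"M {r},0 h {y} a {r},{r} 0 0 1 {r},{r} "
        f"v {x} a {r},{r} 0 0 1 {-r},{r} "
        f"h {-y} a {r},{r} 0 0 1 {-r},{-r} "
        f"v {-x} a {r},{r} 0 0 1 {r},{-r} z"
    )


if __name__ == "__main__":
    for x, y in [(0, 0), (0, 3), (2, 3)]:
        print(classify_rounded_shape(x, y), "->", svg_outline(x, y, 1))
```

With both straight lengths at zero, the four quarter arcs close into a circle; lengthening one pair of sides gives the stadium/pill, and lengthening both gives the rounded rectangle, which is why the three glyphs can blur into each other when the corner radius is large relative to the sides.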

image

@jakebeal
Contributor Author

@shyambhakta In exploring the potential blurring of boundaries, you've actually explained quite well why I've never really liked "rounded rectangle", because of its similarity to other forms. We put it in there for SBGN compatibility, however, and in a computer-rendered diagram it can be made distinct with appropriate care.

For the more general questions of protein description: you've got the general idea put forth in the protein paper, and I like your suggestion of a potential generalizable language of blocks. However, in order to contain scope and let this SEP move forward, I'd like to defer the more general language of proteins for a later SEP. Also: since we've been talking about this for a while but didn't have an issue open for it yet, I've made one now and seeded it with the paper and your last suggestion: #68

@graik

graik commented Jun 12, 2019

I agree that a minimal set of protein features should be discussed in their own SEP.

As to @DavidZong's comment that the pill shape can be mistaken for E. coli: this is a valid concern, although I would also say that the pill shape is perhaps not really "iconic" in this respect. Rounded rectangle is just as common. Bacteria are indeed often symbolized with either shape, but then with several copies in random orientation (as in the example). Another popular E. coli symbol is a pill shape / rounded rectangle with one or two cilia. Example:
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0093317
image

Another reason why a protein pillshape is not too likely to be mistaken for E. coli is the fact that the protein name will almost always be inside of the pillshape.

I think it would be a good idea though to also start wondering about a good symbol for cell and bacteria. (I would suggest any shape with a double line as a generic symbol for cell or organelle).

@jakebeal
Contributor Author

Thank you, @graik.

My sense is that we've reasonably converged on the SEP and the set of questions to be asked, so unless something else appears as a notable blocker, I will soon move this for a vote.

@palchicz palchicz added Accepted and removed Draft labels Jul 11, 2019
@palchicz
Contributor

palchicz commented Jul 11, 2019

The resolution of this SEP is captured in the following voting results.

  • Rounded Rectangle becomes RECOMMENDED macromolecule glyph (90%)
  • Shmoo is deprecated (75%)
  • Small molecule becomes Small circle (68%) and Small Polygon (89%) but not stretchable hex (47%) or molecular diagram (58%)
  • "Pill" is adopted as the symbol for protein (79%).
  • "small polygon" becomes the new recommended glyph for small molecule (63%).

@jakebeal
Contributor Author

I have updated the SEP to reflect the results of the final vote.

@jakebeal jakebeal added the final An SEP has been approved and incorporated into the SPEC label Sep 24, 2019

8 participants