You solicited suggestions in your tweet #1

Closed
rsharris opened this issue May 31, 2024 · 3 comments

Comments

@rsharris

Definitely a handy thing to have in one's toolbox.

First (and don't take this as a criticism), I'd suggest a different name, something that makes it clear you're generating test sequences.

What follows are things I'll suggest based on a (private) python package I've put together over the past decade, for a similar use case. However, I don't find any API documentation currently in the repo (maybe I don't know where to look), so you may also have some of this functionality. I tried to glean what functionality exists from a cursory scan through the source code.

Skewed alphabets. Most organisms don't have a 50/50 distribution of AT vs GC.
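As a rough illustration of the skewed-alphabet idea, here is a minimal Python sketch that draws bases with a chosen GC fraction. The names `random_dna` and `gc_content` are hypothetical, made up for this example, and are not taken from either project discussed here.

```python
import random


def random_dna(length, gc_content=0.5, rng=None):
    """Return a random DNA string whose expected GC fraction is gc_content.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= gc_content <= 1.0:
        raise ValueError("gc_content must be between 0 and 1")
    rng = rng or random.Random()
    # Split the GC mass evenly between G and C, and the rest between A and T.
    gc_weight = gc_content / 2
    at_weight = (1.0 - gc_content) / 2
    weights = [at_weight, gc_weight, gc_weight, at_weight]  # order: A, C, G, T
    return "".join(rng.choices("ACGT", weights=weights, k=length))
```

Passing a seeded `random.Random` instance as `rng` makes the output reproducible, which tends to matter for test data.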

Mutation. Much of what I was generating sequence for was to test an aligner's ability to discover sequences that have evolved. Thus my sequence object has a method that returns another sequence object that's a mutated copy.
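A mutated copy could be produced along these lines. This is only a sketch under simple assumptions (independent per-base events, uniform substitutions, indels of length one); the function name `mutate` and its parameters are hypothetical and do not describe any existing API.

```python
import random


def mutate(seq, sub_rate=0.01, indel_rate=0.0, rng=None):
    """Return a mutated copy of seq; the input string is left unchanged.

    Each base is independently substituted with probability sub_rate,
    or deleted / followed by a random inserted base with probability
    indel_rate / 2 each. Hypothetical helper for illustration only.
    """
    rng = rng or random.Random()
    out = []
    for base in seq:
        r = rng.random()
        if r < sub_rate:
            # Substitution: pick one of the other three bases.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        elif r < sub_rate + indel_rate / 2:
            # Deletion: skip this base.
            continue
        elif r < sub_rate + indel_rate:
            # Insertion: keep the base and add a random one after it.
            out.append(base)
            out.append(rng.choice("ACGT"))
        else:
            out.append(base)
    return "".join(out)
```

An aligner test would then compare the original against `mutate(original, ...)` at increasing rates to see where detection starts to fail.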

Constructors. A random sequence, but also a sequence read from a file.

Operators. (I don't know the rustic attitude toward overloaded operators.) I implemented "+" as sequence concatenation, unary "-" as reverse complement, "*" by integer (on either side) as concatenated repeat, "<<" and ">>" as shifts (one end is lost). Ordinal operators like "<" etc. "in" as a substring detector.
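The operators listed above map onto Python's special methods roughly as follows. This is a minimal sketch, not the actual package: the class name `Seq` is hypothetical, and the shift semantics are an assumption (here `<<` drops bases from the left end and `>>` drops them from the right end, with no padding).

```python
from functools import total_ordering

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


@total_ordering
class Seq:
    """Hypothetical DNA sequence wrapper illustrating operator overloading."""

    def __init__(self, bases):
        self.bases = str(bases)

    def __repr__(self):
        return f"Seq({self.bases!r})"

    def __len__(self):
        return len(self.bases)

    def __eq__(self, other):
        if not isinstance(other, Seq):
            return NotImplemented
        return self.bases == other.bases

    def __lt__(self, other):
        # Lexicographic ordering; total_ordering derives <=, >, >=.
        if not isinstance(other, Seq):
            return NotImplemented
        return self.bases < other.bases

    def __hash__(self):
        return hash(self.bases)

    def __add__(self, other):
        # "+" is concatenation.
        if not isinstance(other, Seq):
            return NotImplemented
        return Seq(self.bases + other.bases)

    def __neg__(self):
        # Unary "-" is the reverse complement.
        return Seq(self.bases.translate(_COMPLEMENT)[::-1])

    def __mul__(self, count):
        # "*" by an integer is a concatenated repeat.
        if not isinstance(count, int):
            return NotImplemented
        return Seq(self.bases * count)

    __rmul__ = __mul__  # allows the integer on either side

    def __lshift__(self, n):
        # Assumed semantics: the first n bases are lost.
        if n < 0:
            raise ValueError("shift count must be non-negative")
        return Seq(self.bases[n:])

    def __rshift__(self, n):
        # Assumed semantics: the last n bases are lost.
        if n < 0:
            raise ValueError("shift count must be non-negative")
        return Seq(self.bases[:max(0, len(self.bases) - n)])

    def __contains__(self, item):
        # "in" tests for a substring.
        needle = item.bases if isinstance(item, Seq) else str(item)
        return needle in self.bases
```

With this sketch, `-Seq("AACG")` gives the reverse complement and `Seq("CG") in Seq("ACGT")` is a substring test. In Rust the closest analogues would be the traits in `std::ops`, though there is no direct counterpart to `in`.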

I should probably make my private package public. Will look at doing that next week.

@natir
Owner

natir commented May 31, 2024

Thank you. You can find the stable documentation here, and the development documentation by clicking the documentation button in the README.

First (and don't take this as a criticism), I'd suggest a different name, something that makes it clear you're generating test sequences.

Very good point. Indeed, I'll have to come up with a better name.

Skewed alphabets could be implemented fairly easily.

As for Mutation and Operators, I think they are out of scope for this crate: this functionality is too complex, and I want the API to stay small. What's more, this kind of thing is already covered by other crates or tools.

As for Constructors, maybe that is covered by the create method; otherwise, could you clarify what you mean?

@rsharris
Author

Thank you. You can find the stable documentation [here], and the development documentation by clicking the documentation button in the README.

Thanks. Indeed, I was correct when I said I didn't know where to look.

As for Constructors, maybe that is covered by the [create method]; otherwise, could you clarify what you mean?

Ignore my suggestion. I was thinking of a package that created DNA strings and operated on them. In that realm, it can be useful to be able to create the objects by several different means, one of them being to read an existing file.

As I understand the current constructor, the only type of sequence it creates is a random one. That's probably all you need since it seems like the focus of the package is just to write those to a file.

@natir natir closed this as completed May 31, 2024
@rsharris
Author

@natir FWIW, my Python module is now public at https://github.com/rsharris/echydna . Lightly documented with a couple of examples.
