When I extract FASTA from highquality_clust30, I get the following headers:
>ESMFOLD V0 PREDICTION FOR MGYP000138429313
>ESMFOLD V0 PREDICTION FOR MGYP001595280761
...
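I use FoldComp for a downstream application. By the usual FASTA convention, the sequence ID is the first whitespace-delimited token of the header, so every sequence here ends up with the ID ESMFOLD, which is not unique. The unique ID is stored in the rest of the header, which parsers treat as a comment.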
I can run sed on it, but this solution feels hacky.
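For reference, the header rewrite can also be done in a few lines of Python instead of sed. This is only a sketch of the workaround, assuming every header follows the `>ESMFOLD V0 PREDICTION FOR <ID>` layout shown above and that the ID always starts with `MGYP`; the names `fix_header` and `rewrite_fasta` are made up for illustration and are not part of FoldComp.

```python
import re

# Assumed header layout, based only on the two example headers above:
#   >ESMFOLD V0 PREDICTION FOR MGYP000138429313
HEADER_RE = re.compile(r"^>.*\b(MGYP\d+)\s*$")


def fix_header(line: str) -> str:
    """Rewrite a matching header to '>MGYP...'; return other lines unchanged."""
    match = HEADER_RE.match(line)
    if match:
        return ">" + match.group(1) + "\n"
    return line


def rewrite_fasta(src_path: str, dst_path: str) -> None:
    """Copy a FASTA file, replacing each header with the bare MGYP ID."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            # Only header lines start with '>'; sequence lines pass through.
            dst.write(fix_header(line) if line.startswith(">") else line)
```

Headers that do not match the assumed pattern are written out unchanged, so unexpected input is not silently mangled.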
The highquality_clust30.lookup looks appropriate:
0 MGYP002174220927 0
1 MGYP000064029927 0
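To make the point about the lookup file concrete, here is a small Python sketch that reads such lines into an index-to-ID mapping. It assumes whitespace-separated columns with a numeric index first and the ID second, as in the two lines above; the meaning of the third column is not stated in this thread, so it is ignored. `parse_lookup_lines` is a hypothetical helper, not a FoldComp function.

```python
from typing import Dict, Iterable


def parse_lookup_lines(lines: Iterable[str]) -> Dict[int, str]:
    """Map column 1 (numeric index) to column 2 (entry ID) for each line."""
    ids: Dict[int, str] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        ids[int(fields[0])] = fields[1]
    return ids


def read_lookup(path: str) -> Dict[int, str]:
    """Read a .lookup file from disk (column layout assumed as above)."""
    with open(path) as handle:
        return parse_lookup_lines(handle)
```

Every value in the resulting mapping is a bare MGYP ID, which is what the FASTA headers would ideally contain.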
Do you have recommendations on how to get proper FASTA headers?
Cheers
V
Sorry for the late response. In 412c7a8 I changed the default so that extracting sequences uses the id/filename as the header, and I introduced a use-title flag for cases where the title is needed.