
Having trouble installing flash attention but still want to use it? A workaround is to run inside a Docker container: the official NVIDIA PyTorch [containers](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch) ship with all the dependencies needed to build flash attention.
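For example, you could start from a recent NGC PyTorch image and install flash attention inside it. A minimal sketch; the image tag below is only an illustration, substitute any current tag from the catalog:

```bash
# Illustrative tag only: check the NGC catalog for the latest release.
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:24.07-py3

# Inside the container, the CUDA/PyTorch toolchain is already set up:
pip install flash-attn --no-build-isolation
```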

3. Install FAESM from GitHub:

```bash
# if you want to use flash attention
# Step 5: star the repo if the code works for you!
```

## ESM-C

Right after EvolutionaryScale released [ESM-C](https://www.evolutionaryscale.ai/blog/esm-cambrian), we followed up with a flash attention version of ESM-C in FAESM. You can run ESM-C with the following code:

```python
from faesm.esmc import ESMC

sequence = ['MPGWFKKAWYGLASLLSFSSFI']
model = ESMC.from_pretrained("esmc_300m", use_flash_attn=True).to("cuda")
input_ids = model.tokenizer(sequence, return_tensors="pt")["input_ids"].to("cuda")
output = model(input_ids)
print(output.sequence_logits.shape)
print(output.embeddings.shape)
```
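If you want one embedding per sequence rather than per residue, a common recipe is to average the per-residue embeddings. A minimal sketch continuing from the snippet above, assuming `output.embeddings` is laid out as `(batch, sequence_length, hidden_dim)`:

```python
# Mean-pool over the sequence dimension to get a single vector per sequence.
# This simple version averages over special tokens too; mask them out first
# if you need a residue-only average.
per_sequence_embedding = output.embeddings.mean(dim=1)
print(per_sequence_embedding.shape)  # (batch, hidden_dim)
```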

## ProGen2

For autoregressive protein language models like ProGen2:

```python
import torch
from transformers import AutoTokenizer
from faesm.progen2 import ProGenForCausalLM  # import path assumed from the FAESM package layout

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = ProGenForCausalLM.from_pretrained("jinyuan22/ProGen2-small").to(torch.float16).to(device).eval()
tokenizer = AutoTokenizer.from_pretrained("jinyuan22/ProGen2-small")

sequence = "2GFLPFRGADEGLAAREAATLAARGTAARAYREDSWAVPVPRGLLGDLTARVAALGAASPPPADPLAVTLDLHHVTAEVALTTVLDAATLVHGQTRVLSAEDAAEAATAAAAATEAYLERLQDFVLFMSASVRVWRRGNAAGATGPEWDQWYTVADRDALGSAPTHLAVLGRQADALCHFVLDRVAWGTCGTPLWSGDEDLGNVVATFAGYADRLATAPRDLIM1"
sequence = "2GFLPFRGADM1"

inputs = tokenizer(sequence, return_tensors="pt").to(device)
target = inputs.input_ids[0, ...]

with torch.no_grad():
    logits = model(inputs.input_ids, labels=inputs.input_ids).logits[0, ...]

# Shift so that logits at position i are scored against the token at position i+1.
logits = logits[:-1, ...]
target = target[1:]

# Drop a trailing BOS/EOS token if present.
bos_token, eos_token = 3, 4
if target[-1] in [bos_token, eos_token]:
    logits = logits[:-1, ...]
    target = target[:-1]

# Remove unused (special-token) logits and re-index the targets accordingly.
first_token, last_token = 5, 29
logits = logits[:, first_token:(last_token + 1)]
target = target - first_token

ce_eval = torch.nn.functional.cross_entropy(
    input=logits.view(-1, logits.size(-1)),
    target=target.view(-1),
    reduction="mean",
).item()
print(ce_eval)
assert abs(ce_eval - 2.4) < 0.1  # 2.4 is the reference CE for the official ProGen2-small
```
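Since `ce_eval` is the mean per-residue cross-entropy (in nats) over the retained vocabulary, exponentiating it gives the per-residue perplexity, which is often the more familiar number to report:

```python
import math

# Per-residue perplexity corresponding to the cross-entropy computed above.
perplexity = math.exp(ce_eval)
print(f"Perplexity: {perplexity:.2f}")  # roughly exp(2.4) ≈ 11 for ProGen2-small
```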
