ggml: implement quantized KV cache for FA (#7372) #57
| Job | Run time |
|---|---|
| | 2m 53s |
| | 1m 59s |
| | 2m 22s |
| | 8m 18s |
| | 11m 36s |
| | 2m 20s |
| | 1m 27s |
| | 2m 37s |
| | 2m 29s |
| | 2m 22s |
| | 2m 13s |
| | 3m 44s |
| | 15m 15s |
| | 1m 18s |
| | 4m 56s |
| | 7m 58s |
| | 2m 14s |
| | 5m 59s |
| | 1m 14s |
| | 1m 11s |
| | 3m 33s |
| | 1m 58s |
| | 5m 44s |
| | 9m 10s |
| | 11m 16s |
| | 5m 10s |
| | 22m 31s |
| | 5m 50s |
| | 6m 33s |
| | 25m 34s |
| | 9m 59s |
| | 6m 21s |
| | 14m 18s |
| | 6m 29s |
| | 8m 16s |
| | 5m 56s |
| | 8m 7s |
| | 20m 4s |
| | 5m 9s |
| | 6m 45s |
| | 6m 12s |
| | 4m 44s |
| | 3m 28s |
| | 3m 3s |
| | 21s |
| Total | 4h 50m 56s |