https://github.com/facebookresearch/llama/issues/79
M2 Max + 96GB unified memory == 7B @ 10 token/s ( https://github.com/facebookresearch/llama/issues/79#issuecomment-1460500315)
12700k + 128GB RAM + 8GB 3070Ti == 65B @ 0.01 token/s ( https://github.com/facebookresearch/llama/issues/79#issuecomment-1460464011)
Ryzen 5800X + 32GB RAM + 16GB 2070 == 7B @ 1 token/s ( https://github.com/facebookresearch/llama/issues/79#issuecomment-1457172578)
2x 8GB 3060 == 7B @ 3 token/s ( https://github.com/facebookresearch/llama/issues/79#issuecomment-1457437793)
8x 24GB 3090 == 65B @ 500 token/s ( https://github.com/facebookresearch/llama/issues/79#issuecomment-1455284428)
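A rough way to sanity-check the configurations above is to estimate how much memory just the weights need. This is an illustrative sketch, not from the thread: it assumes dense storage at a given precision and ignores activations, KV cache, and framework overhead, which all add more on top.

```python
# Weight-memory estimate for the LLaMA parameter counts mentioned above.
SIZES_B = {"7B": 7, "13B": 13, "30B": 30, "65B": 65}

def weight_gb(params_billion: float, bits_per_weight: int) -> float:
    """Gigabytes needed to hold params_billion parameters at bits_per_weight each."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

for name, b in SIZES_B.items():
    print(f"{name}: fp16 {weight_gb(b, 16):.1f} GB, "
          f"int8 {weight_gb(b, 8):.1f} GB, "
          f"int4 {weight_gb(b, 4):.1f} GB")
```

By this estimate 7B in fp16 needs about 14 GB, which is why it fits in 96GB of unified memory but spills out of an 8GB GPU, and 65B in fp16 needs about 130 GB, roughly matching the 8x 24GB 3090 setup.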
1
tool2d 2023-03-09 10:03:16 +08:00
Frustrating. AI chat's hardware requirements are a whole tier higher than early AI image generation's.
2
fantasyjm OP That entry of mine is wrong.
It should be: Ryzen 5800X + 32GB RAM + 8GB 2070s == 65B @ 0.02 token/s ( https://github.com/facebookresearch/llama/issues/79#issuecomment-1457172578)
3
agagega 2023-03-09 10:19:14 +08:00 via iPhone
Looks like it depends heavily on VRAM?
4
netdcy 2023-03-11 18:02:18 +08:00
What do "B" and "token" mean here?
5
Champa9ne 2023-03-13 00:49:23 +08:00
It feels like even though the M1 architecture has the advantage of easily reaching large "VRAM" capacities, the raw compute isn't strong enough. I heard it only matches a 1050 Ti? Then one 65B inference pass would take three to five years O-o
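On the VRAM-vs-compute question raised in the comments: for single-stream autoregressive decoding, every generated token streams the full weight matrix through memory once, so memory bandwidth (not just FLOPS) often sets the ceiling. A back-of-envelope sketch, with the bandwidth and weight-size figures below as illustrative assumptions rather than measured values:

```python
# Upper bound on decode speed when generation is memory-bandwidth bound:
# each token reads all the weights once, so tokens/s <= bandwidth / weight size.
def bandwidth_bound_tokens_per_s(weight_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound on tokens/s given weight footprint (GB) and memory bandwidth (GB/s)."""
    return bandwidth_gb_s / weight_gb

# Example: a 7B model in fp16 is ~14 GB of weights; on a chip with ~400 GB/s
# of memory bandwidth (roughly the M2 Max class) the ceiling is about:
print(f"{bandwidth_bound_tokens_per_s(14, 400):.0f} tokens/s upper bound")
```

Under these assumptions the bound is around 29 tokens/s, which is consistent in order of magnitude with the ~10 token/s reported for 7B on the M2 Max, and it explains why configurations that must page 65B weights through system RAM or over PCIe collapse to hundredths of a token per second.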