• brucethemoose@lemmy.world
      25 days ago

      Soldered is better! It’s sometimes faster, and definitely faster if it happens to be LPDDR.

      But TBH the only thing that really matters is “how much VRAM do you have,” and Qwen 32B slots in at 24GB, or maybe 16GB if the GPU is totally empty and you tune your quantization carefully. And the cheapest way to get there (until 2025) is a used MI60, P40 or 3090.
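For anyone who wants to sanity-check the 24GB vs 16GB numbers, here is a rough back-of-the-envelope sketch. It assumes a round 32 billion parameters, that weight size is simply parameters times bits per weight, and a flat 2 GiB allowance for KV cache and runtime buffers. That allowance and the function names are my own placeholders, not measured values; real usage depends on context length and the backend.

```python
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits(params_billion: float, bits_per_weight: float,
         vram_gib: float, overhead_gib: float = 2.0) -> bool:
    """True if weights plus an assumed fixed overhead fit in the given VRAM.

    overhead_gib is a guessed allowance for KV cache and buffers,
    not a measured figure.
    """
    return weight_gib(params_billion, bits_per_weight) + overhead_gib <= vram_gib


# Compare a few bit widths against 16 GiB and 24 GiB cards.
for bits in (16, 8, 5, 4, 3):
    size = weight_gib(32, bits)
    print(f"{bits:>2} bits/weight: ~{size:5.1f} GiB weights | "
          f"16 GiB: {fits(32, bits, 16)} | 24 GiB: {fits(32, bits, 24)}")
```

Under these assumptions, roughly 4 to 5 bits per weight fits comfortably in 24 GiB, while 16 GiB only works out at around 3 bits per weight, which is why it takes careful quantization tuning and an otherwise empty GPU.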