curry@programming.dev to Selfhosted@lemmy.world • "Consumer GPUs to run LLMs" (15 days ago): I tried to run Gemma 3 27B Q4K and was surprised how quickly the VRAM requirements blew up in proportion to the context window, especially compared to other quantized models of similar size like QwQ 32B.
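Most of that context-proportional growth is the KV cache, whose size scales linearly with context length and with the number of layers and KV heads. A minimal back-of-the-envelope sketch, assuming the usual 2 × layers × kv_heads × head_dim × context × bytes formula and approximate architecture numbers for each model (check each model's config.json rather than trusting these):

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_elem=2):
    """Estimate KV cache size: 2 tensors (K and V) per layer, fp16 by default."""
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

# Illustrative configs (assumed, approximate -- verify against the model cards):
gemma_27b = dict(n_layers=62, n_kv_heads=16, head_dim=128)
qwq_32b   = dict(n_layers=64, n_kv_heads=8,  head_dim=128)

for name, cfg in [("Gemma 3 27B", gemma_27b), ("QwQ 32B", qwq_32b)]:
    gib = kv_cache_bytes(context_len=32_768, **cfg) / 2**30
    print(f"{name}: ~{gib:.1f} GiB KV cache at 32k context")
```

Under these assumed configs, Gemma 3's roughly doubled KV-head count is what makes its cache grow about twice as fast per token of context. Gemma 3 also uses sliding-window attention on most layers, which can shrink the cache substantially, but only if the inference runtime actually exploits it.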
curry@programming.dev to Microblog Memes@lemmy.world • "Your first error was going to a website" (1 month ago): Bold of you to assume people know the difference between an app and a website.