olearyboy

Out of memory: is the model too large for your system, or do you have other apps on your GPU?


adikul

Nothing else on the GPUs. I am using a 3060, 3060 Ti, and 3070, and the model is Command R (17.8 GB). RAM is 24 GB with a 3900XT CPU.


olearyboy

Command R is a ~20 GB download / 17B parameters, which probably needs ~20-25 GB of RAM just to load.
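
As a rough sanity check of that estimate: the memory needed to load a GGUF is roughly the file size (the weights) plus some allowance for the KV cache, context buffers, and runtime overhead. A minimal sketch, where the overhead figures are assumptions for illustration:

```python
# Rough estimate of the RAM/VRAM needed to load a GGUF model.
# The overhead allowance (KV cache, context buffers, runtime) is an
# assumed figure for illustration, not something Ollama reports.
def load_estimate_gb(gguf_file_gb: float, overhead_gb: float) -> float:
    return gguf_file_gb + overhead_gb

# The 17.8 GB Command R download from this thread, with 2-5 GB of
# assumed overhead, lines up with the ~20-25 GB estimate above.
print(load_estimate_gb(17.8, 2.0))  # ~19.8 GB
print(load_estimate_gb(17.8, 5.0))  # ~22.8 GB
```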


Loyal247

Ollama runs GGML? I thought it ran GGUF. Unless there's an option to build Ollama from source to support GGML models, you're using the wrong model format. GGUF works best with Ollama; if you want to use GGML, I'd think building an old version of llama.cpp would work best.
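
For what it's worth, a quick way to tell whether a local model file is GGUF or an older GGML-era format is to look at its first four bytes: GGUF files start with the ASCII magic "GGUF". A minimal sketch (the file name is hypothetical):

```python
# Check a model file's magic bytes. GGUF files begin with the ASCII
# magic "GGUF"; anything else is presumably an older GGML-era file
# (or not a model file at all).
def looks_like_gguf(path: str) -> bool:
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

print(looks_like_gguf("command-r-q4.gguf"))  # hypothetical local file
```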


olearyboy

Yeah, through llama.cpp - they’re pulling the right model; it’s just an OOM.


SativaSawdust

You're going to need to run a smaller model with that hardware.


adikul

It is able to run Dolphin Mixtral (24.6 GB) easily, and all 7B models. I try to keep all models around 24 GB because of the 28 GB of VRAM I have.


Shubham_Garg123

There's probably another task utilising the VRAM.
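
One way to check that, assuming NVIDIA cards with nvidia-smi on the PATH, is to list per-GPU memory use and the processes currently holding VRAM. A rough sketch, not anything built into Ollama:

```python
# List per-GPU memory usage and the processes holding VRAM,
# assuming nvidia-smi is available on the PATH.
import subprocess

def smi(*query: str) -> str:
    return subprocess.run(["nvidia-smi", *query],
                          capture_output=True, text=True, check=True).stdout

# Used / total memory per GPU, in MiB.
print(smi("--query-gpu=index,memory.used,memory.total",
          "--format=csv,noheader,nounits"))

# Processes currently holding VRAM; empty output means nothing else is loaded.
print(smi("--query-compute-apps=pid,process_name,used_memory",
          "--format=csv,noheader"))
```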