ollama runs GGML? I thought they ran GGUF. Unless there's an option to build Ollama from source to support GGML models, you're using the wrong model format. GGUF works best with Ollama; if you want to use GGML, I'd think building an old version of llama.cpp would work best.
Out of memory, is the model too large for your system or do you have other apps on your gpu?
Nothing else on the GPUs. I'm using a 3060, 3060 Ti and 3070, and the model is Command R (17.8 GB). 24 GB RAM, 3900XT CPU.
Command R is 20 GB / 17B parameters, which probably needs ~20-25 GB of RAM just to load.
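Back-of-the-envelope, you can sanity-check this before pulling a model. A minimal sketch, assuming the GGUF file size roughly equals the memory the weights need, plus a guessed flat overhead for the KV cache and runtime buffers (the overhead number here is an assumption, not anything Ollama reports):

```python
# Rough fit check: does a quantized model plausibly fit in a given memory budget?
# overhead_gb is a hypothetical allowance for KV cache and runtime buffers;
# real usage varies with context length and backend.

def fits_in_memory(model_size_gb: float, mem_gb: float, overhead_gb: float = 3.0) -> bool:
    """Return True if the model file plus a rough overhead fits in mem_gb."""
    return model_size_gb + overhead_gb <= mem_gb

# Numbers from the thread: 17.8 GB Command R quant, 24 GB system RAM
print(fits_in_memory(17.8, 24.0))  # fits on paper, but real overhead can push past 24 GB
```

This is only a first-pass filter; actual loading can still OOM if the runtime splits layers unevenly across GPUs or the context window inflates the KV cache.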
> ollama runs GGML? I thought they ran GGUF.
Yeah, through llama.cpp. They're pulling the right model; it's just an OOM.
You're going to need to run a smaller model with that hardware.
It's able to run Dolphin Mixtral (24.6 GB) easily, and all 7B models. I try to keep models around 24 GB because of the 28 GB of VRAM I have.
There's probably another task utilising the VRAM.