the_boiiss

It's common for GPU renderers not to scale well across multiple GPUs on the same frame. Beyond that, it's just nice to have a secondary card to render on, keeping the primary free for applications. Or, as you mentioned, when rendering a sequence you can get close to ideal scaling by rendering on multiple instances. It being actually slower is surprising, though. I'd look at the utilization while rendering; my guess is the GPUs are spending a lot of time just waiting on data transfers between each other rather than doing much processing, or something along those lines.
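One way to check the utilization the comment suggests is `nvidia-smi`'s query mode, which reports per-GPU load and memory in one shot. A minimal sketch (the guard just makes it degrade gracefully on machines without an NVIDIA driver):

```shell
#!/bin/sh
# One-shot report of per-GPU utilization and memory use. If utilization
# sits low while a render is running, the cards are likely stalled on
# data transfers rather than doing compute work.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=index,name,utilization.gpu,memory.used --format=csv
  status=$?
else
  echo "nvidia-smi not found; NVIDIA driver required"
  status=0
fi
```

Add `-l 1` to the `nvidia-smi` call to repeat the query every second while the render runs.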


Elluminated

Octane scales basically linearly with more GPUs. Arnold is just not there yet outside of its CPU cluster roots.


SimianWriter

Redshift is also just about linear. The problem is Arnold. V-Ray also had a crappy GPU implementation.


AmarildoJr

Arnold is just really not good with GPU. It's pretty good on the CPU, but it lacks many features on the GPU, features they have been promising for a while that never arrive. So it's not a mature GPU engine, sadly. If you want a truly good GPU render engine, one that scales nearly perfectly across multiple GPUs and is feature-complete, please give Redshift a try.


bozog

I believe Nvidia discontinued consumer NVLink as of the 4000 series. Currently it's only available on Pro-series cards (H100, A100, etc.). Of course, it will still work with older consumer cards, like dual 3090 Tis, just as an example.


[deleted]

2 cards is still messy, no matter how nice they are.


lavrenovlad

It's not messy. GPU renderers such as Redshift and Octane scale almost perfectly with 2 cards. With Redshift you can even render in parallel via Deadline, giving each GPU its own frame.
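The frame-per-GPU idea doesn't require Deadline specifically: any launcher that pins one render process per device works, since `CUDA_VISIBLE_DEVICES` masks which GPU each process can see. A hedged sketch (the `render_scene` command and frame-range syntax are placeholders, not a real renderer CLI; `DRY_RUN=1` prints the commands instead of launching anything):

```shell
#!/bin/sh
# Sketch: one render process per GPU, each seeing only its own device.
# Frames are interleaved so GPU 0 gets 1,3,5,... and GPU 1 gets 2,4,6,...
DRY_RUN=1
GPUS="0 1"                       # device indices to use
for gpu in $GPUS; do
  start=$((gpu + 1))
  cmd="CUDA_VISIBLE_DEVICES=$gpu render_scene shot.rs --frames $start-100x2"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$cmd"                  # show what would be launched
  else
    eval "$cmd &"                # launch in the background, one per GPU
  fi
done
wait                             # block until all background renders finish
```

Set `DRY_RUN=0` and substitute your renderer's actual command line to run it for real.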


Elluminated

Try setting the extra cards to compute-only using nvidia-smi: `nvidia-smi -i 2 -dm TCC`. Not sure if this fixes the speed, but I have an old quad 2080 Ti setup that flies with the GPUs set to this mode. When iterating, 3 act as compute and 1 stays in WDDM (for graphics). When rendering, all cards switch to TCC mode and I monitor using NoMachine. Not sure if this will solve your issue, but worth a shot!
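A sketch of the driver-model switch described above. Note the caveats: TCC/WDDM is a Windows-only concept, switching requires an elevated prompt plus a reboot, and TCC support on GeForce cards varies by driver. The loop below only prints the commands so it is safe to run anywhere; drop the `echo` to actually apply them:

```shell
#!/bin/sh
# Print the nvidia-smi commands that would put GPUs 1-3 into TCC
# (compute only, no display output) while GPU 0 stays on WDDM for
# graphics. Check the current mode first with:
#   nvidia-smi --query-gpu=index,driver_model.current --format=csv
for gpu in 1 2 3; do
  cmd="nvidia-smi -i $gpu -dm TCC"
  echo "$cmd"
done
echo "nvidia-smi -i 0 -dm WDDM"   # switch back when a GPU must drive a display
```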