Mon Jul 25, 2016 6:29 pm
Also wanted to chime in with corroborating evidence about the GPU overheating glitches on nMPs. We have 3 maxed-out systems (12c, D700, 1TB SSD, 64GB RAM) in our studio, and all have exhibited the same overheating glitches in both Resolve and Adobe apps when pushed beyond a certain threshold (I run a temp monitor and usually see glitches start around 80°C as measured at the GPU).
We've had Apple replace two of them with completely new machines, but have seen no difference in the pattern – it's very, very easy to reproduce (which is the problem!). At this point we're not going to waste more time going back to them to ask for another of the same poorly designed machine on the off chance we get lucky with a pair of higher-binned D700s...
So after a lot of struggling with it I've developed two workflow solutions in Resolve that are not ideal but will at least prevent the glitches:
1.) Turn off idle background caching in General Options. By default it's on, and if you're using the Render Cache it will attempt to cache when the machine is idle. It does this at maximum speed, which of course ramps the GPU temp up beyond the safe limits. If you turn this off and cache only during playback, I've seen no glitches, even with R3Ds at 4K with TNR, OFX, etc. My guess is that either the natural compute bottleneck per frame is keeping temps low enough to be stable, or limiting the GPUs to work on only the displayed frame (TNR aside) refreshes VRAM regularly enough to alleviate some of the stress on the cards. Of course you lose some of the efficiency of caching by turning this off, but at least you can put a client back in your room without having to apologize for all the funny lines / colors on their film during a session...
2.) Limit Deliver speeds to 10 fps. I imagine this works for the same reasons as above, and it's obviously a pain if you are trying to deliver something on a deadline, but I used to have to watch renders like a hawk to spot GPU glitches and then render dozens of patches to fix those shots. At least now I can leave them unattended with reasonable confidence that they'll work.
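To put a rough number on what the 10 fps cap in option 2 costs on a deadline, here's a quick back-of-the-envelope sketch. The 90-minute, 24 fps timeline is just an assumed example, not one of our actual jobs, and the function name is made up for illustration:

```python
def render_time_seconds(duration_s: float, timeline_fps: float, render_fps: float) -> float:
    """Wall-clock seconds needed to render a timeline when output is capped at render_fps."""
    total_frames = duration_s * timeline_fps  # number of frames in the timeline
    return total_frames / render_fps          # frames divided by frames rendered per second


# Assumed example: a 90-minute timeline at 24 fps, rendered at the 10 fps cap.
capped = render_time_seconds(90 * 60, 24, 10)
print(f"{capped / 3600:.1f} hours")
```

So anything capped at 10 fps renders at less than half of real time for a 24 fps timeline, which is worth budgeting for before you leave it unattended.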
"It's amazing what you can do when you don't know you can't do it."
Systems:
R16.2.3 | Win10 | i9 7940X | 128GB RAM | 1x RTX Titan | 960Pro cache disk
R16.2.3 | Win10 | i9 7940X | 128GB RAM | 1x 2080 Ti | 660p cache disk