R3D FullRes Premium decode broken after Resolve Studio 14.01

ChristopherSeguine

  • Posts: 49
  • Joined: Thu May 02, 2013 5:00 pm
  • Location: California

PostFri Mar 02, 2018 11:36 am

Poor R3D decode performance: Full Res Premium decode is broken in Resolve Studio 14.1, 14.2 and 14.3. It works as expected in 14.01.

Workstation: 60 cores / 128 GB / 2x GTX 1080 Ti / Decklink 12G Extreme / 900 MB/s RAID

Source: 6144x3160 5:1 R3D 23.976
Decode Quality: Full Res Premium / 16

Timeline 4096x2160 23.976
Decklink 12G extreme UHD 2160p/23.976
Output Scaling 3840x2160
Minimize interface updates - Disabled
Hide UI overlays - Disabled


Resolve 14.01 / R3D SDK 6.2.2 works as expected

Playback with multiple-node correction is real time at full resolution and frame rate, with 60-70% CPU usage and 30% GPU utilization.

Decode uses all available cores.
The config.dat DT.Manager/Dt.Manager.Red thread settings are respected: increasing them from the default of 8 to the maximum of 32 increases performance on the 60-core machine.

Resolve 14.1,14.2,14.3 / R3D SDK 7.05/7.06 does NOT work as expected

R3D decode uses only 16 threads/cores; CPU utilization is below 20% and GPU utilization is 10%.
Playback is only 5 fps.
config.dat thread settings have no effect.
Red.GPU.Enable in config.dat has no effect; it is not possible to switch between CPU and GPU decode.
Full-fps playback is only possible at R3D Half Res Good, with other settings unchanged.

RedCine 50.2/ R3d SDK 7.07 works as expected

Source: 6144x3160 5:1 R3D 23.976
Full Res Premium
Decklink 12G extreme UHD 2160p/23.976

Real time/full resolution/framerate, with 70-80% CPU usage

---


If your timeline resolution is less than 4k, r3d compression is > 10:1, or decode is not Full Res Premium, you will probably not notice the problem.

When Resolve switched from Red SDK 6.2.2 to SDK 7.05/7.06 (IPP2 support), something broke: decode performance for Full Res Premium is horrible due to lack of CPU utilization. Switching between Color Science Version 2 and Version 3 has no effect, and setting Color Science to Original produces a black frame.

14.01 playback is as expected, 14.1-14.3 playback is broken with all else being the same.

Arri65, Phantom .cine and DJI DNG raw files work OK; only R3D decode is broken in versions after 14.01.
Copying the Red libraries from 14.01 to 14.3 does not work: Resolve does not launch.

Logs sent to tech support.

14.3 playback from Color page:
Image

14.01 playback from Color page:
Image
Primary: HP DL580G8 (4x E7-8895 - 60 cores, HT disabled), Decklink Extreme 12G, 2x 1080TI, Arcea 1883ix24
OS: Win 10 for Workstations 17134, nvidia 397.64
Primary source: Red 8k .r3d, Phantom .cine, secondary footage: ArriRaw, DJI X5 CDNG

ChristopherSeguine

PostSat Mar 03, 2018 10:24 pm

Easy to replicate the problem:

On a machine with more than 16 cores, set the CPU affinity of Resolve to use only 16 physical cores, using Task Manager or Process Lasso.
Play an R3D clip at Full Res Premium on a 4K timeline.

Resolve 14.01: limiting Resolve to 16 cores decreases performance/playback fps. This is the expected behavior.

Resolve 14.1 - 14.3: limiting Resolve to 16 cores results in no decrease in performance. Broken.

It is not a hardware problem or a Red SDK problem; it is a performance problem with Resolve's implementation of the Red 7.05+ SDK.
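
For anyone repeating the affinity test, here is a minimal sketch of the bitmask that corresponds to "16 physical cores". It assumes that with Hyper-Threading on, logical processors are listed as adjacent sibling pairs, so every other logical processor is a separate physical core. With HT off the stride is 1. The function name is made up for illustration and nothing here talks to Resolve itself.

```python
def physical_core_mask(num_physical_cores: int, hyperthreading: bool = True) -> int:
    """Return an affinity bitmask with one logical CPU set per physical core.

    Assumption: with Hyper-Threading on, logical CPUs 2k and 2k+1 share
    physical core k. With it off, logical CPU k is physical core k.
    """
    if num_physical_cores < 1:
        raise ValueError("need at least one core")
    stride = 2 if hyperthreading else 1
    mask = 0
    for core in range(num_physical_cores):
        # Set the bit of the first logical CPU on each physical core.
        mask |= 1 << (core * stride)
    return mask


# 16 physical cores on an HT-disabled machine: the low 16 bits.
print(hex(physical_core_mask(16, hyperthreading=False)))
# 16 physical cores with HT enabled: every other bit of the low 32.
print(hex(physical_core_mask(16, hyperthreading=True)))
```

The set bits are the logical processors to tick in the Task Manager affinity dialog.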

Roger Singh

  • Posts: 25
  • Joined: Thu Aug 06, 2015 4:56 am

PostSun Mar 04, 2018 2:12 pm

Hi Christopher,

Thanks for posting this issue in a very detailed manner. I believe I've been observing the same issue. I have an 18 core xeon, and I noticed the lack of cpu/GPU usage as well.

I see no performance difference when changing the raw settings from full res premium to half res premium to half res good.

The only option I've seen that sped up playback was switching on proxy mode to half res.

I generally have performance mode disabled, I remember it was affecting the colours when it was enabled.

Michael Lindsay

  • Posts: 4
  • Joined: Tue Mar 21, 2017 4:43 pm

PostSun Mar 04, 2018 6:04 pm

Hi Christopher

We have the exact same problem on a z8 with 28 (56) cores which we reported a month ago... actually our example of the problem is even more acute. We can get to 95+% CPU usage in 14.01 creating editorial files but this drops to about 32% with 14.3.. ..and interestingly the latest redcine which is basically the same SDK with a GUI can make the CPUs sit at 100%..

14.01 is fine but everything later does not decode R3Ds properly. We did detailed testing and fed everything back (through our dealer) to DaVinci. Unfortunately they say they can't recreate it, which I appreciate makes fixes difficult (we have now recreated it on 2 other machines), but I think they are using Linux with their big-core Z8 and Linux may work differently. I am slightly worried that NAB is approaching, so unless there is some significant noise this may get sidelined.

My problem is that we own a Red Monstro and need the new SDK. The same SDK in other software is not a problem, so I am tending towards this being a DaVinci issue.

I even spent more money on loads more memory (to populate every memory channel) and better GPU cards to try and solve this. Nothing helped.

Delighted to see your detailed post! As you note, this only affects Red files. 14.3 is super fast with ARRI and Sony raw, but as an owner of Red cameras who just sold an RRX, this is really not great for us.

Ps I have a 7 year old z800 that can deal with red files faster in 14.01 than a very expensive z8 in 14.1+

ChristopherSeguine

PostMon Mar 05, 2018 5:36 pm

Notice that the "Use GPU for Red Debayer" option disappeared from the Preferences/Video and Audio I/O page after Resolve 14.01, as did the option of setting different decode/play quality settings in camera raw settings.

The Red SDK allows GPU processing to be done either in the existing application pipeline or using the Red SDK's own GPU processing. My guess is that Resolve switched Red decode methods and broke performance. When using the Red GPU processing you must specify the number of threads to use with setDecompressionThreadCount(size_t decompressionThreadCount); it doesn't look like Resolve is doing this properly.

If Performance Mode/Automatic or Performance Mode/Optimize Decode Quality is enabled, Resolve drops the R3D decode from Full Res Premium/16-bit to what looks like Half Res Good/8-bit. What exactly Performance Mode does is not documented; maybe Rohit or Peter could explain. Others may not be complaining as much because they have Performance Mode enabled, are not monitoring in 4K, and do not notice the drop in quality.

To be clear, we've tested this on our HP, Asus, and Supermicro dual/quad cpu systems - all exhibit this problem.  14.01 performance is normal, install 14.1 through 14.3 without any other system changes, R3D Full Resolution performance is unusable.

We have not received a reply to support requests from BM yet.

ChristopherSeguine

PostThu Mar 08, 2018 6:13 pm

Puget Systems also did a test which shows the problem:

https://www.pugetsystems.com/labs/articles/DaVinci-Resolve-14-GPU-Scaling-Core-i9-vs-Xeon-W-vs-Dual-Xeon-SP-1121/

2: RED footage at "Full Res." decode quality is... weird
This was not as much of an issue with 4K RED footage, but with 6K and especially 8K RED footage trying to use "Full Res." decode quality resulted in very odd results. Not only did we simply see lower playback FPS compared to using "Half Res." but in many cases using "Full Res." decode resulted in a performance drop when we increased the number of GPUs. It may be that we are hitting a CPU or storage bottleneck, but given the fact that we saw the same thing with the Dual Xeon CPUs and are using a very fast storage drive (3,500 MB/s read) we think this is more of an issue with DaVinci Resolve itself.

Dwaine Maggart

Blackmagic Design

  • Posts: 4138
  • Joined: Wed Aug 22, 2012 2:53 pm

PostFri Mar 09, 2018 2:25 am

I've done a test here and am not seeing a big difference between Resolve 14.0.1 and 14.3 and RedCine-X 50.2 in CPU utilization.

I'm using a Dell Precision 7920 work station with dual Xeon Gold 6136 12 core 3.0 GHz CPUs, with an NVIDIA GP100 GPU. Running Windows 10 Pro.

I'm using a Red 8192x3456 file that's using REDcode 12:1.

Note that timeline resolution isn't going to make much difference in CPU utilization. Nor are GPU's, as long as you have something half decent.

Assuming you don't have a Red Rocket-X card, the CPU processing load comes from decompressing the Red files. And using Full (or Premium) decoding mode has the heaviest CPU decompression load. In Resolve, when you select Half Res Good decode mode, that dramatically lowers the CPU decompression load. But Half and Full Res Premium decode mode exerts basically the same CPU decompression load.

The GPU's are used for debayering the Red footage. Without a Red Rocket-X card, you are probably never going to run out of GPU (assuming decent GPU's) for debayering, because the CPU decompression will always be the bottleneck.

Below are screenshots of CPU usage from the above conditions, with Resolve 14.0.1, Resolve 14.3 and RedCine-X 50.2.

Resolve 14.0.1: [screenshot: 14_0_1 red 8k clip play full premium decode.PNG]

Resolve 14.3: [screenshot: 14_3 red 8k clip play full premium decode.PNG]

RedCine-X 50.2: [screenshot: RedCine-X Full Quality 8K Playback .PNG]


Note that all 24 real cores are being pretty much fully utilized in all 3 tests. I'm not sure what's going on with the HyperThread cores. But it seems similar in all 3 tests.

In Christopher's test, it seems that HyperThreading is not being used. Is that because Windows 10 doesn't deal well with 120 cores? Maybe the fact you have 4 CPU's has something to do with what you are seeing. But that doesn't explain why Michael apparently sees something similar with 56 cores on the Z8.

If either of you would care to share a Resolve export .drp file of the tests you used for your results, I'd be interested to try them on the system here and see what happens. Maybe you've got some Preference or Project setting different from what I'm using.
Dwaine Maggart
Blackmagic Design DaVinci Support

Chip.Murphy

  • Posts: 104
  • Joined: Wed Mar 08, 2017 5:59 pm

PostFri Mar 09, 2018 6:55 am

My meager 1950x has no problem doing 16 cores /32 threads with Red.
Attachment: 20180309_015456.png

ChristopherSeguine

PostFri Mar 09, 2018 8:03 am

Thanks Dwaine,

PM sent with files

Your Task Manager screenshot doesn't show that Resolve is skipping the hyper-threaded cores; it shows that Resolve is not using the 2nd CPU. Cores are displayed in logical order, and every other core is a physical core.

Resolve doesn't support Processor Groups, so it can use at most 64 threads. That is why I have hyper-threading disabled on the quad 15-core.
We can't use Red Rockets as they compromise quality.
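
The 64-thread ceiling mentioned above comes from Windows processor groups, each of which holds at most 64 logical processors. A rough sketch of the arithmetic for the quad 15-core box follows. The function names are hypothetical, and the real split depends on how Windows assigns NUMA nodes to groups, so treat the "usable" figure as an upper bound only.

```python
import math

# Windows places at most 64 logical processors in one processor group.
MAX_LOGICAL_PER_GROUP = 64


def min_processor_groups(logical_cpus: int) -> int:
    """Smallest number of processor groups that can hold all logical CPUs."""
    return math.ceil(logical_cpus / MAX_LOGICAL_PER_GROUP)


def usable_logical_cpus(logical_cpus: int, group_aware: bool = False) -> int:
    """Upper bound on logical CPUs a single process can be scheduled on.

    A process that is not group-aware stays inside one group, so it can
    never see more than 64 logical CPUs (and may see fewer, depending on
    how the OS split the machine).
    """
    if group_aware:
        return logical_cpus
    return min(logical_cpus, MAX_LOGICAL_PER_GROUP)


# Quad 15-core with Hyper-Threading on: 120 logical CPUs, at least 2 groups.
print(min_processor_groups(120), usable_logical_cpus(120))
# Same machine with Hyper-Threading off: 60 logical CPUs fit in one group.
print(min_processor_groups(60), usable_logical_cpus(60))
```

With HT off, all 60 physical cores sit in a single group, which is why that configuration is used here.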

The problem may be specific to systems with 28 or more physical cores (dual 14-core and up): the Puget Systems test is dual 14-core, my tests are dual 14-core, dual 16-core and quad 15-core, and Michael's tests are dual 14-core. I can't think why that would be.

Do you have a dual 14 core system there to test?

I tried limiting Resolve to only 24 cores through Task Manager, but it does not help.

Did you have Performance Mode enabled? That would drop R3D decode to Half Res Good.

Are you able to play one of the sample clips I uploaded at Full Res Premium on a 4096x2160 timeline at 24 fps, with Performance Mode disabled, out of a Decklink Extreme to a UHD monitor? Bottom line: I can in 14.01 with CPU to spare, and cannot in 14.3.

Michael Lindsay

PostMon Mar 12, 2018 4:00 pm

Hi Dwaine

Thanks for doing some quick testing. But are you showing CPU usage just for playback? That is not relevant; it must be measured while rendering out files at full quality, otherwise the better result you show may just be capped by the base frame rate (you would never see 100% usage on my setups, even in 14.01, if I am just playing back). Reading from and writing to separate drive arrays also helps max things out (even if the drive is so fast that one may deem that irrelevant). Also, to reiterate: in your 14.3 image one can clearly see the 2nd chip is ignored, whereas in 14.01 it is just being underused, which could be due to bandwidth limitations elsewhere or a cap from the playback rate. And as Christopher stated, if Performance Mode is engaged it would completely skew the results.

On 3 machines (a 2x6c, a 2x12c and the 2x14c) we are seeing performance crippled by anything later than 14.01...

Like for like, Resolve used to be fast; now it is not!

I am not a super expert in all aspects of Resolve, but I have been using it for a long time for 2 very particular tasks: R3D decoding for editorial, and finishing. Due to the sometimes painful computational strain of R3D files, and the fact that I have had Red cameras since the very beginning, I am an expert in how long it takes to decode R3D files on every machine we have ever owned. The current big pain for us is editorial file creation, not finishing.

In a well-balanced setup built specifically for Red, decoding R3Ds for editorial in either Redcine or Resolve will max out EVERY CPU CORE with hyper-threading turned on. Not 60% but 95%+, often sitting at 100% and only dropping a little at the end and beginning of new files. This performance is essential for editorial file creation! The apparent R3D bug in Resolve 14.3 makes a brand-new Z8 slower than a 7-year-old Z800 using 14.01!

I believe my dealer has sent all my details to you guys?

thank you

Michael

ChristopherSeguine

PostMon Mar 19, 2018 4:14 pm

After 14.01, Resolve is not using the Red SDK's setDecompressionThreadCount() and setConcurrentImageCount() to set the correct number of threads based on system hardware. The result is very poor Full Res Premium decode performance on multi-CPU systems.

These should be set to LogicalCores (including hyper-threaded) - 1, or preferably be a user-adjustable setting in the Red camera raw tab.
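
As a minimal sketch of the "logical cores minus one" rule suggested above: the helper below is hypothetical and does not call the Red SDK. It only computes the value that, per the suggestion, would be passed to setDecompressionThreadCount() and setConcurrentImageCount().

```python
import os


def suggested_decode_threads(logical_cores=None):
    """Return logical cores (hyper-threaded ones included) minus one.

    If no count is given, fall back to what the OS reports. The result is
    clamped so that a single-core machine still gets one decode thread.
    """
    if logical_cores is None:
        # os.cpu_count() can return None on unusual platforms.
        logical_cores = os.cpu_count() or 1
    return max(1, logical_cores - 1)


# 60 logical cores (quad 15-core, HT disabled) leaves one core free.
print(suggested_decode_threads(60))
# Whatever this machine reports.
print(suggested_decode_threads())
```

Leaving one logical core free keeps some headroom for the UI and I/O threads, which is presumably the reason for the "- 1".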

You can check this bug by monitoring the threads Resolve is using in Task Manager: set Resolve's CPU affinity to fewer cores and it still uses the same number of threads when playing R3Ds. Resolve does not adjust based on available cores.

You can reproduce the performance problem outside Resolve by using Redline to convert an R3D to DNxHR with no scaling:

redline --i M:\TEST_001.R3D --o D:\TEST.MXF --format 204 --colorSciVersion 3 --gammaCurve 34 --colorSpace 25 --gpuPlatform 2 --cudaDeviceIndexes 0 1 --numGraphs 60 --logFile D:\MXFTest.txt

If you omit --numGraphs 60, which specifies the number of threads, you will see the same poor performance as in Resolve 14.3
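
To compare the two cases side by side, a small wrapper along these lines could time the conversion with and without --numGraphs. This is only a sketch. Every flag and value is copied from the command above and has not been checked against the REDline documentation. The helper names are made up, and the redline executable is assumed to be on PATH.

```python
import subprocess
import time


def redline_args(src, dst, num_graphs=None, log_file=None):
    """Build the redline argument list quoted in the post.

    Flags and values are taken verbatim from the post. --numGraphs and
    --logFile are only added when a value is supplied.
    """
    args = [
        "redline", "--i", src, "--o", dst,
        "--format", "204",
        "--colorSciVersion", "3",
        "--gammaCurve", "34",
        "--colorSpace", "25",
        "--gpuPlatform", "2",
        "--cudaDeviceIndexes", "0", "1",
    ]
    if num_graphs is not None:
        args += ["--numGraphs", str(num_graphs)]
    if log_file is not None:
        args += ["--logFile", log_file]
    return args


def timed_run(args):
    """Run one conversion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(args, check=True)
    return time.perf_counter() - start


# Hypothetical usage on a machine with redline installed:
#   fast = timed_run(redline_args(r"M:\TEST_001.R3D", r"D:\TEST.MXF", num_graphs=60))
#   slow = timed_run(redline_args(r"M:\TEST_001.R3D", r"D:\TEST.MXF"))
```

Passing the arguments as a list avoids shell quoting problems with Windows paths.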


Additional Red problems as of Resolve 14.3:

D.E.B. should function in ColorScience3 mode (stage 1 processing), it does not

Exposure control should function in ColorScience3 mode (stage 1 processing), it does not. 

CameraRaw/Red/Tint in ColorScience3 should be in the section with White Balance and Use Camera Metadata selectable as they are related.

Performance Mode does not work on Red 6K clips, only 8K. Expected behavior: drop from Full Res Premium to Half Res Good. Actual behavior: nothing (4K timeline).

Other 14.3 problems:

Hide GUI overlays disables the Viewer LUT.

Scopes are always affected by the Viewer LUT, even when No LUT is specified. Only Hide GUI overlays disables the GUI LUT applied to the scopes.