6.2K decode problem on Nvidia


AXYZE9

  • Posts: 4
  • Joined: Fri May 19, 2023 4:09 pm
  • Real Name: Piotr Plenzler

6.2K decode problem on Nvidia

PostFri May 19, 2023 4:35 pm

Hey,

For 5 months I've been trying to fix issues with 6.2K playback of videos from a Fujifilm X-H2S.
The issue: the Nvidia GPU isn't being used to decode the video, which makes editing completely unusable.

Problem exists on these configs:
Ryzen 7 3700X, 2x16GB, 500GB SSD, RTX 3060Ti, Windows 11 clean install
Ryzen 7 3700X, 2x16GB, 2TB NVMe, RTX 3060Ti, Windows 10
Ryzen 7 3700X, 2x16GB, 2TB NVMe, RTX 2060, Windows 10
Core i7 10850H, 2x16GB, 2TB NVMe, RTX 2070 Mobile, Windows 11

Problem doesn't exist on this config:
Ryzen 9 5900X, 2x16GB, 2TB NVMe, RTX 3060 12GB, Windows 10

Blackmagic support tried to help me, unsuccessfully, and today they responded that the Nvidia SDK is likely to blame... and they recommend getting an RTX 3090... but Adobe Premiere uses the same SDK and there is no problem: files are decoded in hardware, so it's smooth as butter, whereas in Resolve I get only ~14 fps at most. Other software such as PotPlayer also uses NVDEC; the problem is only in Resolve.

To narrow down the problem I've tried:
- Tweaking resizable BAR
- Trying out Resolve 18.5 Public Beta 2
- Nvidia drivers from 2021-2023
- Clean install of Windows, clean install of drivers, clean install of Resolve Studio.
- Changing CUDA to OpenCL in Resolve settings
- Enabling/Disabling H264/H265 decode in Resolve settings
- Disabling Gsync, changing monitor res and frequency
- Updating BIOS, resetting BIOS
- Replacing CUDA DLLs with newer ones (Resolve uses 11.0, tried 11.1-11.5 and it doesn't fix problem, 11.6 crashes Resolve)

Today I investigated a little more, and by poking around in Resolve's memory I saw that the NVDEC DLL isn't being called at all when these 6.2K files are played. I can unload the NVDEC DLL from Resolve's memory during playback and Resolve doesn't crash (!). So it's not that NVDEC refuses to play the file; nothing in the code path even communicates with this module, because otherwise Resolve would crash.
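For anyone who wants to repeat that check without a memory tool, here is a rough sketch (assuming Python 3 with the third-party psutil package, and assuming the NVDEC/NVENC entry points are nvcuvid.dll and nvEncodeAPI64.dll, as on most Windows installs) that reports whether those DLLs are mapped into the Resolve process:

# Sketch: list Nvidia video DLLs mapped into the Resolve process.
# Assumes Windows, Python 3 and the third-party "psutil" package.
import psutil

TARGET_DLLS = ("nvcuvid.dll", "nvencodeapi64.dll")  # assumed NVDEC / NVENC entry points

for proc in psutil.process_iter(["name", "pid"]):
    if proc.info["name"] and proc.info["name"].lower().startswith("resolve"):
        try:
            mapped = [m.path for m in proc.memory_maps()]
        except psutil.AccessDenied:
            print(f"PID {proc.info['pid']}: access denied (run as administrator)")
            continue
        hits = [p for p in mapped if p.lower().endswith(TARGET_DLLS)]
        print(f"PID {proc.info['pid']}: {hits or 'no NVDEC/NVENC DLLs mapped'}")

If nvcuvid.dll never shows up while a 6.2K clip is playing, that matches what I'm seeing: the decoder module simply isn't being engaged.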

In January I filed a ticket with Blackmagic in which I blamed GPU whitelisting in Resolve... and it seems I was right all these months, but Blackmagic Support blames Nvidia for it (once again, Premiere, using the same SDK and the same DLLs, works fine!).

I'm thinking it could be whitelisting based on VRAM capacity; that would explain why the RTX 3060 12GB works and the RTX 3060 Ti doesn't.

So, a request to everyone reading: please get this video file, play it, and see if it's smooth. You can also check whether it's decoded by Nvidia (Task Manager -> Performance -> GPU -> Video Decode). Respond with your result and your PC specs so we can be sure what is causing it and be more knowledgeable in the future. 6K+ is becoming mainstream, so this issue is very bad. Even if Blackmagic won't fix it, together we can check whether VRAM capacity is indeed the problem and, if so, at which capacity the problem goes away (10GB? 12GB?).

Video file: https://drive.google.com/file/d/1z7Q1QY ... sp=sharing
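If you'd rather log it than watch Task Manager, here's a minimal sketch (assuming Python with the pynvml / nvidia-ml-py bindings installed and a normal Nvidia driver) that polls the NVDEC utilization and free VRAM once per second while you hit play:

# Sketch: poll NVDEC utilization and free VRAM while Resolve is playing.
# Assumes the "nvidia-ml-py" (pynvml) package and a driver exposing NVML.
import time
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)  # first GPU; adjust if you have several

try:
    for _ in range(30):  # ~30 seconds of samples
        dec_util, _period = pynvml.nvmlDeviceGetDecoderUtilization(handle)
        mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
        print(f"decoder {dec_util:3d}%  |  free VRAM {mem.free / 2**30:5.2f} GiB")
        time.sleep(1)
finally:
    pynvml.nvmlShutdown()

If the decoder stays at 0% during playback while the CPU is pegged, Resolve isn't touching NVDEC for that clip.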

Btw, 8GB of VRAM is plenty for 6.2K editing without VFX, so this whitelisting doesn't make sense anymore; it's probably a leftover in the code from the GTX era to reduce crashes on lower-end cards.
I'm sure a lot of people have had this problem but are used to transcoding high-resolution footage, so they don't report it. Or they just use high-end GPUs with 16GB of VRAM. Blackmagic told me that only one other person has noticed this problem in all the months since January... Please help narrow down this issue and hopefully help the Blackmagic team fix it.

Jim Simon

  • Posts: 36022
  • Joined: Fri Dec 23, 2016 1:47 am

Re: 6.2K decode problem on Nvidia

PostFri May 19, 2023 9:03 pm

So this is weird.

When I tested it from Media Storage, it played fine.
When I imported it into a project and then added it to a timeline, it sometimes played fine.


Studio 18.5b2 on Windows 10
i5-2500K
RTX 3060 12 GB
Gaming Driver 531.79


It is being decoded by the GPU.
Last edited by Jim Simon on Fri May 19, 2023 9:09 pm, edited 1 time in total.
My Biases:

You NEED training.
You NEED a desktop.
You NEED a calibrated (non-computer) display.

roger.magnusson

  • Posts: 3874
  • Joined: Wed Sep 23, 2015 4:58 pm

Re: 6.2K decode problem on Nvidia

PostFri May 19, 2023 9:07 pm

It's a known fact that Resolve only uses the hardware decoder if there is sufficient VRAM left in addition to what the image processing itself would use. What isn't known (at least I don't think it's in the manual) is where the limits are.

For 1080p material in a 1080p timeline it used to be 3.5GB VRAM minimum to enable hardware decoding, according to old replies from BMD in the forum.

AXYZE9

  • Posts: 4
  • Joined: Fri May 19, 2023 4:09 pm
  • Real Name: Piotr Plenzler

Re: 6.2K decode problem on Nvidia

PostFri May 19, 2023 9:23 pm

roger.magnusson wrote:It's a known fact that Resolve only uses the hardware decoder if there is sufficient VRAM left in addition to what the image processing itself would use. What isn't known (at least I don't think it's in the manual) is where the limits are.

If it's a known fact, then why doesn't the Blackmagic team know about it, and why, after months of tickets, do they blame it on the Nvidia Codec SDK?

Can anyone from the BM team comment on why there is such whitelisting? It doesn't make any sense, as 8GB is plenty for editing 6.2K footage, including grading. I see the VRAM usage on the RTX 3060 12GB; it's below 6GB on a 15-minute project.

If this limit does make sense, then please at least state how much VRAM is needed. Is an RTX 3080 10GB enough to get past this artificial limit?

Funny fact: the integrated GPU on a Ryzen 9 7900X decodes it fine. An integrated GPU that is 4x slower than a $500 laptop and doesn't have ANY dedicated memory, just up to 16GB shared, decodes fine, while an RTX 3070 8GB doesn't...

Happily, I'm within the return period and can get an RTX 3060 12GB.

VMFXBV

  • Posts: 804
  • Joined: Wed Aug 24, 2022 8:41 pm
  • Real Name: Andrew I. Veli

Re: 6.2K decode problem on Nvidia

PostFri May 19, 2023 9:23 pm

I get 29.97 playback on an 8K timeline with my config with two nodes and a CST.

16GB 6800XT though.
14.9GB - 15.1GB VRAM used with only Resolve open
90-99% decoder usage

One note though: Premiere doesn't convert everything into 32-bit float. Maybe that, combined with what Roger said above, is the issue.
AMD Ryzen 5800X3D
AMD Radeon 7900XTX
Ursa Mini 4.6K
Pocket 4K

4EvrYng

  • Posts: 768
  • Joined: Sat Feb 19, 2022 12:45 am
  • Real Name: Alexander Dali

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 1:20 am

AXYZE9 wrote:So, a request to everyone reading: please get this video file, play it, and see if it's smooth. You can also check whether it's decoded by Nvidia (Task Manager -> Performance -> GPU -> Video Decode). Respond with your result and your PC specs so we can be sure what is causing it and be more knowledgeable in the future. 6K+ is becoming mainstream, so this issue is very bad. Even if Blackmagic won't fix it, together we can check whether VRAM capacity is indeed the problem and, if so, at which capacity the problem goes away (10GB? 12GB?).

Nvidia 2080 Super with 8GB here

Playback is choppy even with zero adjustment nodes; Task Manager shows the GPU isn't used for decoding, only the CPU. Which is IMHO ridiculous because:

1. mpv.net plays it smoothly, consuming only 3.5GB
2. I can play 6K BRAW files (see the 'Girl in Milk Bath' sample) smooth as butter.

So I wondered whether a workaround for this issue could be to start using proxies. I created a full-size HQX proxy, placed it on a 4K timeline, and was able to play it smoothly.

Uli Plank

  • Posts: 25449
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 2:58 am

I think you are missing the hardware decoder needed for that resolution in 10 bit.
It plays perfectly smooth on Apple silicon.
My disaster protection: export a .drp file to a physically separated storage regularly.
www.digitalproduction.com

Studio 19.1.3
MacOS 13.7.4, 2017 iMac, 32 GB, Radeon Pro 580 + eGPU
MacBook M1 Pro, 16 GPU cores, 32 GB RAM, MacOS 14.7.2
SE, USM G3

4EvrYng

  • Posts: 768
  • Joined: Sat Feb 19, 2022 12:45 am
  • Real Name: Alexander Dali

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 3:41 am

Uli Plank wrote:I think you are missing the hardware decoder needed for that resolution in 10 bit.

Unless I am missing something, Nvidia supports H.265 4:2:0 10-bit 8K decoding on the 20 series, which mine is. If that is any indicator, mpv.net has no problem playing this file with GPU acceleration on my computer.

https://developer.nvidia.com/video-codec-sdk

roger.magnusson

  • Posts: 3874
  • Joined: Wed Sep 23, 2015 4:58 pm

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 8:51 am

Well mpv.net only has to decode the video stream, it doesn't have to buffer frames converted to 32-bit float RGBA in VRAM like Resolve does for its processing. That takes up a lot more space.
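To put a rough number on that (a back-of-the-envelope sketch, assuming a 6240x4160 open-gate 6.2K frame; the exact figure depends on the clip's real dimensions):

# Sketch: rough per-frame memory for a 6.2K frame expanded to 32-bit float RGBA,
# versus the decoder's native 10-bit 4:2:0 output. Frame size is an assumption.
width, height = 6240, 4160           # assumed X-H2S 6.2K open-gate dimensions
float_rgba = width * height * 4 * 4  # 4 channels x 4 bytes per channel
p010_420 = width * height * 3        # ~3 bytes/pixel for 10-bit 4:2:0 (P010-style)

print(f"32-bit float RGBA: {float_rgba / 2**20:.0f} MiB per frame")
print(f"10-bit 4:2:0 (decoder output): {p010_420 / 2**20:.0f} MiB per frame")

Buffer even a handful of those float frames and it's easy to see why an 8GB card gets tight once the rest of the processing pipeline needs VRAM too.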

AXYZE9

  • Posts: 4
  • Joined: Fri May 19, 2023 4:09 pm
  • Real Name: Piotr Plenzler

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 11:57 am

Uli Plank wrote:I think you are missing the hardware decoder needed for that resolution in 10 bit.
It plays perfectly smooth on Apple silicon.


No. It is supported by the hardware, and there would still be half of its processing capacity left (it can do 8K x 8K at 30 fps).

The problem is GPU whitelisting by Resolve, and the support team blames NVIDIA for it (???).

AXYZE9

  • Posts: 4
  • Joined: Fri May 19, 2023 4:09 pm
  • Real Name: Piotr Plenzler

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 12:08 pm

roger.magnusson wrote:Well mpv.net only has to decode the video stream, it doesn't have to buffer frames converted to 32-bit float RGB in VRAM like Resolve does for its processing. That takes up a lot more space.


I totally understand that Resolve requires more, but:
On short projects the RTX 3060 12GB uses less than 8GB of VRAM with 6.2K clips.
If VRAM is insufficient then system RAM can be used, just as it is on the RTX 3060 12GB with a 1-hour project with VFX.
Now we have the RTX 4060 Ti, still with 8GB and a slow 128-bit bus, but the system has very fast DDR5 and PCIe 4.0 NVMe. Such a PC cannot decode 6.2K right now.

The integrated GPU in the Ryzen 7900X, which has 1/3 the power of a $500 laptop, decodes the video easily while having 0MB of dedicated VRAM, just up to 16GB shared from system RAM.

The M1 MacBook Air also decodes it fine. 0MB dedicated VRAM. Just shared.

It's ridiculous.

4EvrYng

  • Posts: 768
  • Joined: Sat Feb 19, 2022 12:45 am
  • Real Name: Alexander Dali

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 6:05 pm

roger.magnusson wrote:Well mpv.net only has to decode the video stream, it doesn't have to buffer frames converted to 32-bit float RGBA in VRAM like Resolve does for its processing. That takes up a lot more space.

Yes, but that part of my post was to point out that the hardware's and driver's capability to handle that file seems to be there, so the hardware and driver do not seem to be the culprit in his experience.

I'm sure VRAM is consumed when doing NVDEC decoding of H.265, but it also is when doing hardware decoding of BRAW, yet 6K BRAW is hardware accelerated and plays smooth as butter on my 2080, while DR relegates 6K H.265 to the CPU.

roger.magnusson

  • Posts: 3874
  • Joined: Wed Sep 23, 2015 4:58 pm

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 6:56 pm

BRAW uses DCT compression which is much easier in terms of compute power to decompress than H.264/H.265. There might be some synergies in how the memory is managed when using BRAW in Resolve as BMD makes the whole chain.

4EvrYng

  • Posts: 768
  • Joined: Sat Feb 19, 2022 12:45 am
  • Real Name: Alexander Dali

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 7:01 pm

roger.magnusson wrote:BRAW uses DCT compression which is much easier in terms of compute power to decompress than H.264/H.265. There might be some synergies in how the memory is managed when using BRAW in Resolve as BMD makes the whole chain.

Easier in terms of compute power, or easier in terms of VRAM consumption? I thought VRAM consumption would be the same between the two, and that VRAM consumption is what made BMD decide not to hardware accelerate 6K H.265 decoding?

roger.magnusson

  • Posts: 3874
  • Joined: Wed Sep 23, 2015 4:58 pm

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 7:35 pm

Easier in terms of compute power. Sorry for the tangent but since you mentioned it plays smoothly I just wanted to note that in terms of compute power they are two completely different things. Decoding DCT isn't very compute intensive, the opposite of H.264/H.265.

4EvrYng

  • Posts: 768
  • Joined: Sat Feb 19, 2022 12:45 am
  • Real Name: Alexander Dali

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 7:59 pm

roger.magnusson wrote:Easier in terms of compute power. Sorry for the tangent but since you mentioned it plays smoothly I just wanted to note that in terms of compute power they are two completely different things. Decoding DCT isn't very compute intensive, the opposite of H.264/H.265.

I guess I didn't communicate well what I meant, so allow me to rephrase:

1. If 6K BRAW on a 4K Resolve timeline has the same VRAM consumption as 6K H.265, then logic implies there should be no reason to gate hardware decoding of 6K H.265 behind a whitelist (with the caveat that we know nowhere near everything that is going on behind the scenes), and

2. With VRAM consumption (potentially) not being an issue, if mpv.net can play hardware-decoded 6K H.265 smoothly, then logic implies Resolve should be able to as well.

In any case, the issue seems to become a non-issue once proxies are used, and using proxies for H.265 seems to be good practice to start with, so ...

roger.magnusson

  • Posts: 3874
  • Joined: Wed Sep 23, 2015 4:58 pm

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 8:23 pm

I think you said it right. ;) "We don't know".

I looked up the 3.5GB limit, a BMD employee mentioned that in reference to encoding so it might not apply for decoding.

But I think it's safe to say it's not a conspiracy, BMD don't usually limit these things unless they have to.

4EvrYng

  • Posts: 768
  • Joined: Sat Feb 19, 2022 12:45 am
  • Real Name: Alexander Dali

Re: 6.2K decode problem on Nvidia

PostSat May 20, 2023 9:29 pm

roger.magnusson wrote:But I think it's safe to say it's not a conspiracy, BMD don't usually limit these things unless they have to.

Conspiracy? Highly unlikely. Done for a valid reason? Always possible; the majority of the time things are done for a good reason. Or it could be that the code that decides what should be accelerated and what shouldn't hasn't been updated in a long time and uses only resolution as the criterion, not the generation of the chip as well. As Uli Plank points out, NVDEC didn't always support H.265 decoding above 4K. But as the link I posted indicates, that support was added some years ago.

CougerJoe

  • Posts: 598
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 12:14 am

With 24GB of VRAM, and after renaming this test file multiple times, I added 4 copies of the video to 4 tracks. The timeline is the actual resolution of the clips, and the tracks are blended so Resolve must play all of the videos and can't use an intelligent method of disregarding a video that is hiding under another track.

1 track activated and playing = 51% of 24GB VRAM in use
2 tracks activated and playing = 78% of 24GB VRAM in use
3 tracks activated and playing = 89% of 24GB VRAM in use
4 tracks activated and playing = 83% of 24GB VRAM in use (choppy)

Something goes wrong with 4 activated tracks playing. I don't see an increase in CPU use, which might be expected if 3 tracks were being decoded by the GPU and 1 track by the CPU; instead everything becomes laggy and not really editable.

So that is a surprising finding, and maybe a bug?
On a 4K timeline the playback results are similar: 3 active tracks play fine, 4 are laggy, the difference being that 3 active playing tracks use 72% VRAM and 4 active tracks 62%, so there is the same reduction in VRAM use when adding the 4th track.
[Attachment: Task Manager.png]

This is what it looks like when playing on a 4K timeline; halfway through playback I activate the 4th track. The GPU decoder appears to turn off, then continuously spikes while giving poor playback, while at the same time VRAM use goes down. The first half of playback shows very smooth GPU decoder activity.

To test whether the Nvidia GPU decoder really was to blame, I played the video in a media player and kept opening new players until I noticed frame drops. To my eye playback looked smooth with simultaneous playback of 5 videos. 6 videos caused noticeable frame dropping, but nowhere near as bad as Resolve playback with 4 simultaneous playbacks. With 6 videos the GPU decoder was at 100% activity; with 5 videos decode was in the 90+ percent range.

So it's not a fault with the Nvidia GPU decoder; it's capable of 5 smooth simultaneous playbacks of this test file in a media player.
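If anyone wants to reproduce that test without manually opening media players, here's a rough sketch (assuming Python plus an ffmpeg build with CUDA/NVDEC hwaccel support on the PATH; the filename is a placeholder for the clip from the Google Drive link) that launches N hardware-accelerated decodes of the clip in parallel and times them:

# Sketch: run N parallel NVDEC decodes of the test clip with ffmpeg and time them.
# Assumes ffmpeg with CUDA hwaccel support is on PATH; CLIP is a placeholder path.
import subprocess
import time

CLIP = "X-H2S_6.2K_test.MOV"   # placeholder: point this at the test clip
N = 4                          # number of simultaneous decodes to attempt

cmd = ["ffmpeg", "-v", "error", "-hwaccel", "cuda", "-i", CLIP, "-f", "null", "-"]

start = time.time()
procs = [subprocess.Popen(cmd) for _ in range(N)]
for p in procs:
    p.wait()
print(f"{N} parallel decodes finished in {time.time() - start:.1f} s")

If the wall-clock time is at or below the clip duration, the decoder handled all N streams at least in real time, outside of Resolve.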

Uli Plank

  • Posts: 25449
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 2:04 am

You can’t compare a player to DR.
DR is decoding into a huge 32 bit float color depth, needing much more VRAM.
My disaster protection: export a .drp file to a physically separated storage regularly.
www.digitalproduction.com

Studio 19.1.3
MacOS 13.7.4, 2017 iMac, 32 GB, Radeon Pro 580 + eGPU
MacBook M1 Pro, 16 GPU cores, 32 GB RAM, MacOS 14.7.2
SE, USM G3

CougerJoe

  • Posts: 598
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 2:15 am

Uli Plank wrote:You can’t compare a player to DR.
DR is decoding into a huge 32 bit float color depth, needing much more VRAM.


The decoding work done by the GPU's decoding silicon is the same when decoding the full frame; it doesn't change depending on the software. What the software does with that data will be different, which is why Resolve has very high VRAM use with 3 videos playing simultaneously, while a media player can play 5 videos at full resolution and only use 60% VRAM.

The point being made is that there's nothing wrong with the Nvidia GPU decoder, which apparently Blackmagic is blaming (according to the user).
Last edited by CougerJoe on Sun May 21, 2023 2:17 am, edited 1 time in total.

4EvrYng

  • Posts: 768
  • Joined: Sat Feb 19, 2022 12:45 am
  • Real Name: Alexander Dali

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 2:16 am

Uli Plank wrote:You can’t compare a player to DR.
DR is decoding into a huge 32 bit float color depth, needing much more VRAM.

Great point, I completely wasn't thinking about that. Is that one 32 bit float (4 bytes) per each of the YRGB values for total of 16 bytes per pixel?

... on the second thought, aren't decoded values stored in system memory, not VRAM?

CougerJoe

  • Posts: 598
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 2:19 am

4EvrYng wrote:
Uli Plank wrote:You can’t compare a player to DR.
DR is decoding into a huge 32 bit float color depth, needing much more VRAM.

Great point, I completely wasn't thinking about that. Is that one 32 bit float (4 bytes) per each of the YRGB values for total of 16 bytes per pixel?

... on the second thought, aren't decoded values stored in system memory, not VRAM?


It's not a great point; it's unrelated to the problem of the GPU decoder not working in Resolve even though I have plenty of VRAM left.

This is the playback test on a 720p timeline, so VRAM is not an issue. Again I play back 3 blended videos and playback is very smooth; halfway through I activate the 4th video, the GPU decoder turns off and then keeps spiking, giving poor, lagged playback, and again, instead of VRAM increasing with the 4th track, VRAM goes down.
[Attachment: 720p.png]

4EvrYng

  • Posts: 768
  • Joined: Sat Feb 19, 2022 12:45 am
  • Real Name: Alexander Dali

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 2:45 am

CougerJoe wrote:
4EvrYng wrote:
Uli Plank wrote:You can’t compare a player to DR.
DR is decoding into a huge 32 bit float color depth, needing much more VRAM.

Great point, I completely wasn't thinking about that. Is that one 32 bit float (4 bytes) per each of the YRGB values for total of 16 bytes per pixel?

... on the second thought, aren't decoded values stored in system memory, not VRAM?


It's not a great point; it's unrelated to the problem of the GPU decoder not working in Resolve even though I have plenty of VRAM left.

I felt it was a great point because once decoded to 32-bit float a frame will consume more VRAM. Then I realized that rests on the assumption that more than one decoded frame is kept buffered in VRAM (because even at 16 bytes per pixel a 6.2K frame consumes "only" 300-ish MB), but then 6K BRAW would face the same issue, as it too is decoded to 32-bit float. Thus I edited my post to say "... on the second thought".

CougerJoe

  • Posts: 598
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 2:57 am

4EvrYng wrote:I felt it was a great point because once decoded to 32-bit float a frame will consume more VRAM. Then I realized that rests on the assumption that more than one decoded frame is kept buffered in VRAM (because even at 16 bytes per pixel a 6.2K frame consumes "only" 300-ish MB), but then 6K BRAW would face the same issue, as it too is decoded to 32-bit float. Thus I edited my post to say "... on the second thought".


Maybe there's a pipeline within Resolve that gets saturated by the GPU decoder, but I had thought the downside of keeping all the data in VRAM is that you need more of it, while the upside is that if you have enough VRAM, performance should be faster than NLEs that don't operate the same way, such as Premiere Pro.

I can try Premiere at a later date to see whether it can handle playback of 5 simultaneous videos the way a media player can, or whether it will also lag. I think it probably will lag, but for reasons different from Resolve. Resolve has a much more efficient playback engine due to keeping everything in VRAM, and that is generally the case.

Uli Plank

  • Posts: 25449
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 2:58 am

It's a combination of VRAM use and hardware decoders needs.

Even if this is only a laptop and the Nvidia card is weaker than the one with the same name in a desktop, you can see what happens when rendering something demanding on page 58:
https://www.dropbox.com/s/l6znbhmtjc3mm ... 0.pdf?dl=0

Try to observe your GPU with GPU-Z for free when it starts stuttering.
My disaster protection: export a .drp file to a physically separated storage regularly.
www.digitalproduction.com

Studio 19.1.3
MacOS 13.7.4, 2017 iMac, 32 GB, Radeon Pro 580 + eGPU
MacBook M1 Pro, 16 GPU cores, 32 GB RAM, MacOS 14.7.2
SE, USM G3

CougerJoe

  • Posts: 598
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 3:14 am

I turned off GPU decoding to check how my CPU plays a single video; it plays it poorly, much like what happens when I attempt to play 4 videos simultaneously. So I think the behavior is as expected: when Resolve turned off the GPU decoder for the 4th track it was still on for the first 3 tracks, but my CPU is not fast enough to decode 6.2K HEVC, so everything slows down.

I still don't understand the need to turn off GPU decoding for the 4th track/video on a 720P timeline when I am not short of Vram. Would be interesting if someone with higher than 24GB Vram try the same and if GPU decoder remains on for all tracks. Then we would know Resolve is operating to an internal rule and is capable of more.

4EvrYng

  • Posts: 768
  • Joined: Sat Feb 19, 2022 12:45 am
  • Real Name: Alexander Dali

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 3:34 am

CougerJoe wrote:Maybe there's a pipeline within Resolve that gets saturated by the GPU decoder, but I had thought the downside of keeping all the data in VRAM is that you need more of it, while the upside is that if you have enough VRAM, performance should be faster than NLEs that don't operate the same way, such as Premiere Pro.

Keeping frames in VRAM (a buffer) makes things smoother, but there still has to be swapping with system memory, and the app has to do a memory-management balancing act one way or the other.

VMFXBV

  • Posts: 804
  • Joined: Wed Aug 24, 2022 8:41 pm
  • Real Name: Andrew I. Veli

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 11:00 am

CougerJoe wrote:
I still don't understand the need to turn off GPU decoding for the 4th track/video on a 720P timeline when I am not short of Vram. Would be interesting if someone with higher than 24GB Vram try the same and if GPU decoder remains on for all tracks. Then we would know Resolve is operating to an internal rule and is capable of more.


Its an nvidia feature. Consumer cards are limited to 3 concurrent nvenc streams (until 2020 it used to be two). If you want more you buy a Quadro / A card.

https://techgage.com/wp-content/uploads ... Matrix.png
AMD Ryzen 5800X3D
AMD Radeon 7900XTX
Ursa Mini 4.6K
Pocket 4K

CougerJoe

  • Posts: 598
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: 6.2K decode problem on Nvidia

PostSun May 21, 2023 11:24 pm

VMFXBV wrote:
CougerJoe wrote:
I still don't understand the need to turn off GPU decoding for the 4th track/video on a 720P timeline when I am not short of Vram. Would be interesting if someone with higher than 24GB Vram try the same and if GPU decoder remains on for all tracks. Then we would know Resolve is operating to an internal rule and is capable of more.


Its an nvidia feature. Consumer cards are limited to 3 concurrent nvenc streams (until 2020 it used to be two). If you want more you buy a Quadro / A card.

https://techgage.com/wp-content/uploads ... Matrix.png


It is 5 simultaneous NVENC encodes now, but NVDEC is unlimited (NVENC is the encoder, NVDEC the decoder). Currently there's no reasonable explanation for why 3 videos play fine in Resolve and 4 don't.

VMFXBV

  • Posts: 804
  • Joined: Wed Aug 24, 2022 8:41 pm
  • Real Name: Andrew I. Veli

Re: 6.2K decode problem on Nvidia

PostMon May 22, 2023 11:53 pm

CougerJoe wrote:
VMFXBV wrote:
CougerJoe wrote:
I still don't understand the need to turn off GPU decoding for the 4th track/video on a 720P timeline when I am not short of Vram. Would be interesting if someone with higher than 24GB Vram try the same and if GPU decoder remains on for all tracks. Then we would know Resolve is operating to an internal rule and is capable of more.


Its an nvidia feature. Consumer cards are limited to 3 concurrent nvenc streams (until 2020 it used to be two). If you want more you buy a Quadro / A card.

https://techgage.com/wp-content/uploads ... Matrix.png


It is 5 simultaneous NVENC encodes now, but NVDEC is unlimited (NVENC is the encoder, NVDEC the decoder). Currently there's no reasonable explanation for why 3 videos play fine in Resolve and 4 don't.


Unlimited means nothing when the hardware can't handle it. And it probably can't handle decoding 4x6K streams at once. Putting 6K H265 10bit on a 720p timeline still needs Resolve to decode and downscale 6K to 720p as H265 isn't capable of being decoded at lower resolutions like RAW.

This is a quote from the Nvidia forums, from a VR developer:

"All the Nvidia 10xx/20xx/30xx have the same video decoding cap: 8192x8192@30 = 8192x4096@60 = 5760x5760@60 = 6688x3344@90 = 5792x2896@120 FPS"

Another indication that there is a limit to what the cards can do at higher resolutions. And it seems it's 3 streams at once for 6K, and much lower than this for the low-tier cards.
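Taking that quoted cap at face value, a quick back-of-the-envelope sketch (assuming the clip is 6240x4160 at 29.97 fps; adjust for the real file) shows how many such streams would fit in the budget:

# Sketch: compare the quoted decode cap (8192x8192 @ 30 fps) with 6.2K streams.
# Clip dimensions and frame rate are assumptions; the cap is the figure quoted above.
cap_pixels_per_sec = 8192 * 8192 * 30          # ~2.0 Gpixel/s per the quoted cap
stream_pixels_per_sec = 6240 * 4160 * 29.97    # one assumed 6.2K 29.97p stream

print(f"cap:    {cap_pixels_per_sec / 1e9:.2f} Gpixel/s")
print(f"stream: {stream_pixels_per_sec / 1e9:.2f} Gpixel/s")
print(f"streams that fit: {cap_pixels_per_sec / stream_pixels_per_sec:.1f}")

By that figure roughly 2-3 such streams would saturate a single decoder, which is in the same ballpark as the 3-vs-4 track behaviour reported here, though the smooth 5-player test earlier in the thread suggests the quoted cap may be conservative for some cards.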
AMD Ryzen 5800X3D
AMD Radeon 7900XTX
Ursa Mini 4.6K
Pocket 4K

Uli Plank

  • Posts: 25449
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: 6.2K decode problem on Nvidia

PostMon May 22, 2023 11:57 pm

VMFXBV wrote:… Resolve to decode and downscale 6K to 720p as H265 isn't capable of being decoded at lower resolutions like RAW.


Not to confuse newbies here: that only applies to wavelet compression, other methods profit little or not at all.
My disaster protection: export a .drp file to a physically separated storage regularly.
www.digitalproduction.com

Studio 19.1.3
MacOS 13.7.4, 2017 iMac, 32 GB, Radeon Pro 580 + eGPU
MacBook M1 Pro, 16 GPU cores, 32 GB RAM, MacOS 14.7.2
SE, USM G3

VMFXBV

  • Posts: 804
  • Joined: Wed Aug 24, 2022 8:41 pm
  • Real Name: Andrew I. Veli

Re: 6.2K decode problem on Nvidia

PostTue May 23, 2023 12:09 am

Uli Plank wrote:
VMFXBV wrote:… Resolve to decode and downscale 6K to 720p as H265 isn't capable of being decoded at lower resolutions like RAW.


Not to confuse newbies here: that only applies to wavelet compression, other methods profit little or not at all.


It applies to RAW afaik, be it wavelet or DCT. You can debayer BRAW and newRED (which are both DCT) at lower resolutions / lower quality to "gain performance". You can't do that with H.264 or H.265. Putting 6K on a 720p timeline means you downscale / crop it to the timeline resolution, and you only gain performance in the sense that processing (grading) is done at 720p instead of 6K, but the original streams are still decoded at 6K. Unless that process is different somehow.
AMD Ryzen 5800X3D
AMD Radeon 7900XTX
Ursa Mini 4.6K
Pocket 4K

CougerJoe

  • Posts: 598
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: 6.2K decode problem on Nvidia

PostTue May 23, 2023 1:04 am

VMFXBV wrote:Unlimited means nothing when the hardware can't handle it. And it probably can't handle decoding 4x6K streams at once.

Another indication that there is a limit to what the cards can do at higher resolutions. And it seems it's 3 streams at once for 6K, and much lower than this for the low-tier cards.


This is what the decoder looks like when playing this test file 4 times simultaneously at full resolution. The peaks are each time the video auto-restarts.
[Attachment: Task Manager.png]


Decoder is not overloaded, nor is it with 5x playbacks, at 6x playback the GPU decoder becomes saturated, but decoding doesn't zig zag all over the place as seen in Resolve it sits at 100% while playback is reduced to what might be about 15fps.

In Resolve, if you activate the 4th track the GPU decoder turns off, then seems to keep turning on and off, creating the zig-zag pattern. Alternatively, it may be decoding 3 tracks and not the 4th, which causes a slowdown in processing and reduces the output of the decoder due to slow timeline playback.

There should be no good reason for Resolve to turn off the GPU decoder while I have 12GB of VRAM unused. The ideas that this is related to not having enough VRAM, to overloading the GPU decoder, or to too many simultaneous sessions all appear to be false.

I can't see this thread while submitting so I don't know what sort of GPU you have, but you could do a similar test to what I'm doing, scaled to your GPU's VRAM using lower-resolution files if you don't have 24GB, and see if you find anything interesting.

Uli Plank

  • Posts: 25449
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: 6.2K decode problem on Nvidia

PostTue May 23, 2023 1:06 am

Wavelet allows decoding of only every second, fourth, and so on pixel, with massive performance gains. It doesn't work that way with DCT; try it for yourself. There's Red demo footage available in either codec.
My disaster protection: export a .drp file to a physically separated storage regularly.
www.digitalproduction.com

Studio 19.1.3
MacOS 13.7.4, 2017 iMac, 32 GB, Radeon Pro 580 + eGPU
MacBook M1 Pro, 16 GPU cores, 32 GB RAM, MacOS 14.7.2
SE, USM G3

VMFXBV

  • Posts: 804
  • Joined: Wed Aug 24, 2022 8:41 pm
  • Real Name: Andrew I. Veli

Re: 6.2K decode problem on Nvidia

PostTue May 23, 2023 9:01 am

Uli Plank wrote:Wavelet allows decoding of only every second, fourth, and so on pixel, with massive performance gains. It doesn't work that way with DCT; try it for yourself. There's Red demo footage available in either codec.


So does DCT...with DCT just being less CPU/GPU intensive. But it does work the same way. 4K DCI BRAW (DCT) decoded at 512x270 consumes about 8% compute; switching that to 4096x2160 consumes about 14% on an RCM 4K timeline.

It might not mean much since BRAW is easy on resources, but it's quite a leap from 8% to 14% resource-wise.
AMD Ryzen 5800X3D
AMD Radeon 7900XTX
Ursa Mini 4.6K
Pocket 4K

VMFXBV

  • Posts: 804
  • Joined: Wed Aug 24, 2022 8:41 pm
  • Real Name: Andrew I. Veli

Re: 6.2K decode problem on Nvidia

PostTue May 23, 2023 9:07 am

CougerJoe wrote:
Decoder is not overloaded, nor is it with 5x playbacks, at 6x playback the GPU decoder becomes saturated, but decoding doesn't zig zag all over the place as seen in Resolve it sits at 100% while playback is reduced to what might be about 15fps.



This to me sounds exactly like overload. If its because the hardware can't handle it or an artificial cap I don't know. But it doesn't seem Resolve related.
AMD Ryzen 5800X3D
AMD Radeon 7900XTX
Ursa Mini 4.6K
Pocket 4K

Uli Plank

  • Posts: 25449
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: 6.2K decode problem on Nvidia

PostTue May 23, 2023 9:10 am

VMFXBV wrote:So does DCT...with DCT just being less CPU/GPU intensive.


Interesting. I found the gain on footage out of Komodo far less than from an Epic. But that might be a weakness in Red's SDK then.
My disaster protection: export a .drp file to a physically separated storage regularly.
www.digitalproduction.com

Studio 19.1.3
MacOS 13.7.4, 2017 iMac, 32 GB, Radeon Pro 580 + eGPU
MacBook M1 Pro, 16 GPU cores, 32 GB RAM, MacOS 14.7.2
SE, USM G3

VMFXBV

  • Posts: 804
  • Joined: Wed Aug 24, 2022 8:41 pm
  • Real Name: Andrew I. Veli

Re: 6.2K decode problem on Nvidia

PostTue May 23, 2023 9:16 am

Uli Plank wrote:
VMFXBV wrote:So does DCT...with DCT just being less CPU/GPU intensive.


Interesting. I found the gain on footage out of Komodo far less than from an Epic. But that might be a weakness in Red's SDK then.


Komodo is DCT and Epic is wavelet; maybe that's why. The gains are smaller because wavelet was very resource-heavy to begin with?

I've also noticed that decoding wavelet (from cameras pre-DCT) in Resolve using the Legacy options really tanks performance, but it's very good using IPP2.
AMD Ryzen 5800X3D
AMD Radeon 7900XTX
Ursa Mini 4.6K
Pocket 4K

CougerJoe

  • Posts: 598
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: 6.2K decode problem on Nvidia

PostTue May 23, 2023 11:11 pm

VMFXBV wrote:
CougerJoe wrote:
Decoder is not overloaded, nor is it with 5x playbacks, at 6x playback the GPU decoder becomes saturated, but decoding doesn't zig zag all over the place as seen in Resolve it sits at 100% while playback is reduced to what might be about 15fps.



This to me sounds exactly like overload. If its because the hardware can't handle it or an artificial cap I don't know. But it doesn't seem Resolve related.


Outside of Resolve, playing this file 5 times simultaneously does not overload the decoder; at 6 the GPU decoder is at 100% and a slowdown is seen. Inside of Resolve, 3 simultaneous playbacks are fine and 4 aren't, and there is no reason for this that I can see: not a lack of GPU power, not an overload of the GPU decoder, nor an overload of CPU or RAM.
Blackmagic staff, can you comment?
