Blackmagic Forum

Sun May 02, 2021 1:43 pm

Hello,
I've been using Resolve for about 3 months and I'm experiencing very poor performance rendering or playing back my projects. After some experiments, it looks like other users are sharing exactly the same problems, which you can read here:
https://forum.blackmagicdesign.com/viewtopic.php?f=21&t=116334 (Fusion page: Major performance issue with MediaIn v Loader)

https://forum.blackmagicdesign.com/viewtopic.php?f=21&t=114641 (Very low CPU usage and crashes in Fusion)

https://forum.blackmagicdesign.com/viewtopic.php?f=21&t=98426 (Why do titles slow performance?)

https://forum.blackmagicdesign.com/viewtopic.php?f=21&t=137249 (Computer very laggy with Resolve)

Some of these threads are quite old and the problems still exist, so I'm not sure Blackmagic is getting the message.

I use a lot of PNG images and text (Text+ nodes) in my Fusion compositions, and unfortunately there seem to be serious performance issues in Resolve. For the PNG images (a very serious issue), you'll find more in the first link I posted.

About the Text+ issue, I decided to run a little experiment:

Create a new project, add a 1 minute Fusion composition. In the Fusion Composition:
1) create a single background node (solid color) and render the clip.
This takes on average 7s on my machine. The render options are QuickTime H.264, 1920x1080 HD, 24 fps. This is interesting: rendering a clip where absolutely nothing happens takes 7s. The generated file is about 11 Mb.
2) add 3 Text+ nodes and the corresponding Merge nodes. The Text+ nodes are not animated. Render time is consistently 7s. Fair enough, no overhead compared to the previous case.
3) Add 2 more Text+ nodes. Render time is between 7 and 8 seconds. So far so good.
4) Animate the 5 Text+ nodes using the Write On tool. Animations are present only in the first and last 10 seconds, so for 40s out of 60s nothing happens. Render time goes up to 15s on average. For a Write On animation on 5 Text+ nodes, it really seems way too much. GPU utilization never goes above 75%.
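To put the numbers above in per-frame terms, here is a rough back-of-the-envelope calculation. It assumes the static 5-node case takes 7.5s (the midpoint of the 7 to 8 seconds measured in step 3) and that all of the extra time in step 4 is spent on the 20 seconds of animated frames; both are assumptions, not measurements.

```python
# Rough per-frame arithmetic for the render times reported above.
FPS = 24
DURATION_S = 60
total_frames = FPS * DURATION_S           # frames in the 1 minute clip

background_only_s = 7.0                   # step 1
static_text_s = 7.5                       # step 3, assumed midpoint of 7 to 8 s
animated_s = 15.0                         # step 4

# Write On runs only in the first and last 10 seconds.
animated_frames = 2 * 10 * FPS

background_fps = total_frames / background_only_s
animated_fps = total_frames / animated_s

# Assumption: the whole slowdown is caused by the animated frames.
extra_ms_per_animated_frame = (animated_s - static_text_s) * 1000 / animated_frames

print(f"total frames: {total_frames}")
print(f"background only: {background_fps:.1f} frames/s")
print(f"with Write On:   {animated_fps:.1f} frames/s")
print(f"extra cost per animated frame: {extra_ms_per_animated_frame:.3f} ms")
```

Under these assumptions, each animated frame costs roughly 15 to 16 ms more than a static one.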

Now you can imagine what happens combining more nodes, images, masks, and so on. Playback becomes super laggy. And this is happening with no footage or 3D animations at all!

Out of curiosity, I ran NVIDIA Nsight Systems on scenario 4.
The profiling report is full of DMA packets and the GPU seems to sit idle quite a lot.

Why is all of this necessary for just a Background node and 5 Text+ nodes with a simple Write On animation? Playback and rendering of this kind of composition should take almost no time.

My system:
Resolve: 17.1.1 build 9
OS: Windows 10 Enterprise, version 20H2
CPU: Xeon E3-1545M @ 2.9 GHz, 4x2 cores
GPU: NVIDIA Quadro P4000, driver 27.21.14.5241
Ram: 64 GB
SSD: SK hynix SC311 SATA 512GB

Sun May 02, 2021 4:31 pm

While Text+ has been demonstrated to be broken, or rather to suffer from an abnormal slowdown, you are asking for some things that can't (currently) happen in the real world. For example, there is no "nothing is happening" switch in most encoders, so whether or not all frames are value-identical, they still must be pushed through the encoding process. This takes time. From the software's perspective, detecting whether something is happening is usually done with one simple method: all parameters that contribute to the result are collected (hashed together) for each frame, and comparing these frame hashes tells whether two frames produce the same output without needing to actually produce that output. If they do, good software can grab the frame from the cache if it has stored it. Some food for thought: which is faster, retrieving a solid color generator's frame from the cache or generating it from scratch?
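As an illustration of the parameter-hashing idea, here is a minimal sketch. It is not how Fusion implements its cache; `frame_key`, `FrameCache` and `render_solid` are hypothetical names, and the "renderer" is just a stand-in that fills a tiny buffer with one color.

```python
import hashlib
import json


def frame_key(params: dict) -> str:
    """Hash every parameter that can change the rendered result."""
    canonical = json.dumps(params, sort_keys=True)  # key order must not matter
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class FrameCache:
    """Hypothetical cache: renders only when the parameter hash is new."""

    def __init__(self):
        self._store = {}
        self.renders = 0

    def get_frame(self, params, render):
        key = frame_key(params)
        if key not in self._store:
            self._store[key] = render(params)
            self.renders += 1
        return self._store[key]


def render_solid(params):
    # Stand-in renderer: one RGB triple repeated for every pixel.
    return bytes(params["color"]) * (params["width"] * params["height"])


cache = FrameCache()
static_params = {"color": [255, 0, 0], "width": 4, "height": 2}

# 60 s at 24 fps with parameters that never change between frames.
frames = [cache.get_frame(static_params, render_solid) for _ in range(1440)]
print(cache.renders)
```

Only the first frame is rendered here; the other 1439 lookups hit the cache. Note that every lookup still pays for hashing the parameters, which is the point of the question above about very cheap generators.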

Sun May 02, 2021 9:27 pm

If something doesn't work for you with the integrated version of Fusion, I'd recommend trying the standalone one.
Then use VFX Connect to get it into DR.

Mon May 03, 2021 9:29 am

Hendrik Proosa wrote:... If they do, good software can grab the frame cache if it has stored it. Some food for thought: which is faster, retrieving a cache for solid color generator or generating it from scratch?

I really think retrieving it from the cache is faster, and it has to happen only once if many consecutive frames are all the same. I imagine rendering a frame works like this:
- prepare data on the CPU (those DMA packets maybe)
- send the data to the GPU
- process the data on the GPU
- send the result back to the CPU

If the CPU can detect the next frame is identical to the current one, all the 4 steps can be skipped.
Sending data from the CPU to the GPU and vice versa is usually the bottleneck. It doesn't matter how many cores you have or how fast your GPU is: this data transfer takes a lot of time.

If my Fusion composition is made of a single static Background node for one minute, the encoding should process one frame on the GPU, not 1440 (at 24 fps) and the rendering should be immediate, not taking 8 seconds.

Hendrik Proosa wrote:While Text+ has been demonstrated to be broken, or better to say have somewhat abnormal speed slowdown

I'm very sad to read this. Why hasn't Blackmagic Design addressed this issue yet? Some posts are more than one year old.

Mon May 03, 2021 4:30 pm

rinsim wrote: I imagine rendering a frame works like this:
- prepare data on the CPU (those DMA packets maybe)
- send the data to the GPU
- process the data on the GPU
- send the result back to the CPU

If the CPU can detect the next frame is identical to the current one, all the 4 steps can be skipped.

If you skip all these steps, how will the data get to the H.264 encoder chip in the GPU, and how will the encoded frames get back to be saved into the final mov file? Files do not materialize out of thin air. To get something into the file, you must:
- generate or retrieve from ram/disk cache the frame data;
- send that data to GPU;
- process that data in the encoding hardware;
- send the result back to CPU, stuff it into mov frame packet and write to file on disk.

Take it with a grain of salt, but I'm pretty sure that currently encoded frames are not reused, even if they are known to be produced from the same source and to be equal. If you know of any encoder that does this (has a setting for it), I'd like to know. At best, the same framebuffer in the GPU can currently be used as the source for all consecutive frames. Retrieving the data and storing it in the container must still be done.
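A minimal sketch of that best case: the source frame is generated once and reused while its parameters stay the same, but every output frame is still encoded and stored. The `export` function is hypothetical, and `zlib.compress` is only a stand-in for a real video encoder.

```python
import zlib


def export(frame_params_list, render, encode):
    """Reuse the framebuffer while parameters are unchanged; encode every frame."""
    packets = []
    last_params = None
    framebuffer = None
    renders = 0
    for params in frame_params_list:
        if params != last_params:
            # Parameters changed: generate a new source frame.
            framebuffer = render(params)
            last_params = params
            renders += 1
        # The encoder still runs once per output frame.
        packets.append(encode(framebuffer))
    return packets, renders


def render_solid(params):
    # Stand-in renderer: one RGB triple repeated for every pixel.
    return bytes(params["color"]) * (params["width"] * params["height"])


static = {"color": (0, 0, 255), "width": 4, "height": 2}
packets, renders = export([static] * 1440, render_solid, zlib.compress)
print(renders, len(packets))
```

For the static one minute clip this generates a single source frame but still produces 1440 encoded packets that have to be written to the file.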

rinsim wrote:...rendering should be immediate, not taking 8 seconds...

How can sending frames to the GPU, encoding them, getting them back and writing them to a file on disk be immediate? In the physical world everything takes time.

Mon May 03, 2021 9:14 pm

Try a Quantum computer, once there is a native version for Resolve.

Text+ and Fusion performance. Once again.
