Puget Systems test multiple TitanXp in Resolve

Offline

Dylan Evans

  • Posts: 66
  • Joined: Mon May 27, 2013 8:57 am

Puget Systems test multiple TitanXp in Resolve

PostMon Oct 16, 2017 11:06 am

Spoiler: the setup they used shows virtually no improvement for most tasks, and only minor improvement in some.

https://www.pugetsystems.com/labs/artic ... n-Xp-1060/

Thoughts?
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostMon Oct 16, 2017 11:53 am

Hmm, interesting.
I wonder though. I would think the RED debayer should be a lot faster. It's much faster in Redcine X with 2 GPUs than it is with one. In fact it's twice as fast when I use 2 Titan X (previous gen). Maybe the Titan Xp is so fast that it's no longer the bottleneck and the CPU is the bottleneck instead.
/Andreas
VFX Director - Visual Forest Ltd
Offline

Glenn Venghaus

  • Posts: 1358
  • Joined: Wed Jan 01, 2014 9:56 pm
  • Location: Amsterdam , The Netherlands

Re: Pudget Systems test multiple TitanXp in Resolve

PostMon Oct 16, 2017 1:45 pm

Actually the testing shows exactly where it benefits most to have multiple GPUs: GPU-heavy FX. Duhhhh :mrgreen:
For standard LGG etc., as long as you can play back fast enough with a single GPU, you are good to go with more nodes than is good for you and won't see much benefit from multiple GPUs, so a bit of wasted test energy, or at least focus, I would say. But the moment you start messing with NR and, currently more relevant with DR14, OFX, and combinations of these, that's where you see the benefits and can prevent your system from crumbling down. All the stuff that taxes Resolve more than standard grading.
So the next tests should be more focused on these things.
Sure, you can state that 15 NR is not directly representative, but it "IS" reasonably representative of the ability to cope with a combination of several GPU-accelerated NR + OFX plugins. And it makes it easier to standardise the measurements.
So maybe a new candle test with a healthy mix of NR + 5-10 different, but standard, ResolveOFX plugins.
Beatstep & APC-40 Resolve Edition Controllers https://posttools.tachyon-consulting.com
Test Rig : 2xXeon (24c) | UNRAID KVM OSX VM's | 128GB | 5700XT | 40Gbe
Prod Rig : i9-7940X (14c) | OSX 10.15 | 64GB | 2xVega 56 | 40Gbe | Tb3 | V:Eizo | A:5.1RME
Offline

Larry Li

  • Posts: 37
  • Joined: Tue Jan 24, 2017 6:15 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 17, 2017 1:55 pm

The CPU in the testing setup seems to be the bottleneck; you won't see any benefit from installing more GPUs in that case.

I can't see that they show hardware utilization percentages in the testing. Most likely the multiple GPUs are idling most of the time, while the CPU is near 100%.
Offline

Dylan Evans

  • Posts: 66
  • Joined: Mon May 27, 2013 8:57 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 17, 2017 6:31 pm

I thought that; however, I would expect (without having access to such high-end hardware) that the test encoding RAWTIFF to DNX4k should still show an improvement as you add GPUs, as there is very little decoding to do on the CPU (and actually, weirdly, when the 4th Titan Xp is added there is a small jump, but not for the previous 2).

You may well be right, however: if the CPU is being utilised 100% on the encode for the h.264 or DNX4k, then it could be the limiting factor. But at the moment there is no way to alleviate this bottleneck, as these are the fastest CPUs out there in terms of the balance of clock speed and core count.
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 17, 2017 8:28 pm

It also shows that when you're not going to a dual Xeon setup, using multiple GPUs is very debatable.
I also think that the choice of RED as source caused the CPU to be a bottleneck. They should also use DNxHR etc. as source (I'm not sure if they actually didn't). And I agree that they should show CPU/GPU load during tests.
Offline

PeterMoretti

  • Posts: 928
  • Joined: Sat Aug 03, 2013 12:12 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 17, 2017 8:56 pm

Andrew, I totally agree that dual GPUs make more sense with dual CPUs. It just seems that the load sharing is more straightforward for the system to accomplish.

I also have to say that there are times when I'm rendering that nothing (CPU, GPU or disk) is maxed out. So I've come to believe that it's more "complicated" than I had originally thought, where one aspect of the system is pegging and hamstringing the other ones. That does happen in some cases of course. But in others, there is not a *clear* bottleneck that's slowing down the system.
Last edited by PeterMoretti on Wed Oct 18, 2017 1:57 am, edited 1 time in total.
Resolve 14.3 Studio. GTX 970 with GeForce 390.77 driver. Desktop Video 10.9.10. Intensity Shuttle USB 3.0. Windows 10 Pro.
Offline

Larry Li

  • Posts: 37
  • Joined: Tue Jan 24, 2017 6:15 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostWed Oct 18, 2017 1:28 am

Dylan Evans wrote:but at the moment there is no way to alleviate this bottleneck as these are the fastest CPU's out there in terms of clock speed core balance.


You could use uncompressed output instead, but only if the output drive has fast enough write speed. An NVMe SSD will do the job, and if that is still not fast enough, a RAID of 2x NVMe SSDs should be.
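
As a rough check on the write speed this implies, the sketch below estimates the sustained data rate of an uncompressed image sequence. The resolution, frame rate and 4 bytes per pixel (10-bit RGB packed into a 32-bit word, as in DPX) are illustrative assumptions, not figures from the Puget article.

```python
def uncompressed_rate_mb_s(width, height, fps, bytes_per_pixel=4):
    """Sustained data rate in MB/s (1 MB = 10**6 bytes) for an uncompressed sequence."""
    return width * height * bytes_per_pixel * fps / 1e6

# Assumed example: UHD frames at 24 fps, 4 bytes per pixel.
uhd_24 = uncompressed_rate_mb_s(3840, 2160, 24)
print(f"UHD 24 fps: {uhd_24:.0f} MB/s")
```

Under these assumptions the rate comes out near 800 MB/s, which is above the roughly 600 MB/s ceiling of SATA III and consistent with the suggestion to use NVMe storage for the output.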
Offline

Dylan Evans

  • Posts: 66
  • Joined: Mon May 27, 2013 8:57 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostWed Oct 18, 2017 7:07 am

Andrew Kolakowski wrote: They should use DNxHR etc as source also ( I'm not sure if they actually didn't). And I agree that they should show CPU/GPUs load during tests.



Yes, they did use DNX and ProRes as sources. They should make it a little clearer, but they transcoded to a few different formats and then ran the tests on these transcoded files.

They also used uncompressed TIFF as a source, which shouldn't tax the CPU much at all.
Offline

Chip.Murphy

  • Posts: 167
  • Joined: Wed Mar 08, 2017 5:59 pm

Re: Pudget Systems test multiple TitanXp in Resolve

PostWed Oct 18, 2017 1:31 pm

Poor test, especially considering that not all of the cards are operating at 16x mode.
Offline

Dylan Evans

  • Posts: 66
  • Joined: Mon May 27, 2013 8:57 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 9:51 am

Chip.Murphy wrote:Poor test, especially considering that not all of the cards are operating at 16x mode.


I disagree that it is poor on this basis.

As far as I am aware, no motherboards other than dual-Xeon boards would allow this in the first place. In addition, if the cards were capable of adding any significant performance, they would do so in x8 mode; switching to x16 mode would make no difference in this case. The test tells you everything you need to know, or can know, in this respect.
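
A back-of-the-envelope sketch of the x8 versus x16 question, assuming PCIe 3.0 links (8 GT/s per lane with 128b/130b encoding) and a hypothetical workload of UHD frames moved as 32-bit float RGBA at 24 fps. Whether Resolve actually moves frames in that form is an assumption, and real links deliver less than the theoretical figure, so this only indicates orders of magnitude.

```python
# PCIe 3.0: 8 GT/s per lane, 128b/130b encoding, per direction (theoretical).
PCIE3_GB_S_PER_LANE = 8 * (128 / 130) / 8  # about 0.985 GB/s

def link_bandwidth_gb_s(lanes):
    """Theoretical one-direction bandwidth of a PCIe 3.0 link in GB/s."""
    return lanes * PCIE3_GB_S_PER_LANE

def frame_traffic_gb_s(width, height, fps, bytes_per_pixel):
    """Data rate in GB/s needed to move every frame across the link once."""
    return width * height * bytes_per_pixel * fps / 1e9

# Hypothetical workload: UHD, 16 bytes per pixel (float RGBA), 24 fps.
needed = frame_traffic_gb_s(3840, 2160, 24, 16)
for lanes in (8, 16):
    bw = link_bandwidth_gb_s(lanes)
    print(f"x{lanes}: {bw:.2f} GB/s theoretical, workload uses {needed / bw:.0%}")
```

With these numbers a single upload stream fits comfortably in an x8 link, but the sketch ignores downloads, multiple streams and protocol overhead, so it does not settle the argument either way.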
Offline

Hendrik Proosa

  • Posts: 3015
  • Joined: Wed Aug 22, 2012 6:53 am
  • Location: Estonia

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 11:28 am

Looks like a test in the lines of "I bought a new computer and my cat videos still run at 25fps and it takes 2 hours to watch a movie, why so? I expected it to run way faster..."
I do stuff.
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 1:06 pm

We always run from red raw files which takes a lot of CPU and GPU to decode. That would be a more realistic test case for us at least.

Their system has slightly too few CPU cores from what I can see. Normally, to work with RED files you need around 24 cores or more to get good speed. That is at least how it was before.
I'm going to do some tests over the weekend comparing 2x Titan X and a single GeForce 1080 Ti and see how they stack up.
/Andreas
VFX Director - Visual Forest Ltd
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 3:28 pm

No, if you want to test GPUs you want formats which are light on the CPU, to avoid it being the bottleneck.
At the same time, this shows that for RED 6K or 8K sources you need more than 1 CPU before you start thinking about adding more GPUs.
8K or 6K RED is probably a bit overkill for these CPUs. Decoding RED itself on the CPU will be the limiting factor.
But they also used TIFF or ProRes as source, so multiple GPUs had room to shine, especially when exporting back to DNxHR etc. (not h264). We need to see CPU and GPU load during these tests.
They average all the results, so this is not detailed enough to judge properly.
Last edited by Andrew Kolakowski on Thu Oct 19, 2017 3:37 pm, edited 1 time in total.
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 3:34 pm

PeterMoretti wrote:Andrew, I totally agree that dual GPUs make more sense with dual CPUs. It just seems that the load sharing is more straight forward for the system to accomplish.

I also have to say that there are times when I'm rendering that nothing (CPU, GPU or disk) is maxed out. So I've come to believe that it's more "complicated" than I had originally thought, where one aspect of the system is pegging and hamstringing the other ones. That does happen in some cases of course. But in others, there is not a *clear* bottleneck that's slowing down the system.


Keeping many-core machines running at 100% all the time is not easy in the real world. Only some processes allow for this, like 3D rendering etc. If you go above 50% (so all real cores are used), this is already quite good.
Offline

Jean Claude

  • Posts: 2973
  • Joined: Sun Jun 28, 2015 4:41 pm
  • Location: France

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 3:49 pm

In my humble opinion: to really stress the GPU and/or CPU, create a virtual disk in RAM. Copy a clip to this new disk and create the project with this clip.

That way there are no bottlenecks at the HDD level, and we see more clearly whether the bottleneck is at the GPU and/or CPU level.
"Saying it is good, but doing it is better! "
Win10-1809 | Resolve Studio V16.1 | Fusion Studio V16.1 | Decklink 4K Extreme 6G | RTX 2080Ti 431.86 NSD driver! |
Offline

Hendrik Proosa

  • Posts: 3015
  • Joined: Wed Aug 22, 2012 6:53 am
  • Location: Estonia

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 4:18 pm

Andrew Kolakowski wrote:But they also used TIFF or ProRes as source so multiple GPUs had place to shine, specially when exporting back to DNxHR etc. (not h264). W need to see CPU and GPUs load during these tests.

Why would any export show anything about GPU? None of the encodings is done on GPU as far as I know and if any, h264 would be the best candidate, and even this is encoded in dedicated hardware, not through GPGPU pipeline.
I do stuff.
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 9:38 pm

Not talking about GPU encoding but how much Resolve uses GPU for processing. If CPU is close to 100% and GPUs very low then most likely CPU is bottleneck. If CPU is at eg. 30% and multiple GPUs at 30%, but speed remains the same (compared to single GPU) then it shows that multi GPUs is not doing much good.
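
The reading of load figures described above can be written down as a small sketch. The function name and the thresholds (90% as "busy", 40% as "mostly idle", 1.1x as "no real speedup") are illustrative assumptions, not values from the thread or from Resolve.

```python
def classify_bottleneck(cpu_pct, gpu_pcts, speedup_vs_single_gpu=None,
                        busy=90.0, idle=40.0):
    """Rough interpretation of utilisation figures; thresholds are assumptions."""
    max_gpu = max(gpu_pcts)
    # CPU pegged while the GPUs wait: the CPU is the likely limit.
    if cpu_pct >= busy and max_gpu < idle:
        return "CPU is most likely the bottleneck"
    # Several GPUs, CPU not pegged, yet no faster than one GPU.
    if (speedup_vs_single_gpu is not None and len(gpu_pcts) > 1
            and speedup_vs_single_gpu < 1.1 and cpu_pct < busy):
        return "extra GPUs are not helping; look beyond the CPU"
    if max_gpu >= busy:
        return "GPU-bound"
    return "no clear bottleneck"

print(classify_bottleneck(98, [15, 10]))
print(classify_bottleneck(30, [30, 30], speedup_vs_single_gpu=1.0))
```

The last branch matters in practice: as noted earlier in the thread, there are renders where nothing is maxed out and no single component can be blamed.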
Offline

Dylan Evans

  • Posts: 66
  • Joined: Mon May 27, 2013 8:57 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 10:48 pm

Hendrik Proosa wrote:
Andrew Kolakowski wrote:But they also used TIFF or ProRes as source so multiple GPUs had place to shine, specially when exporting back to DNxHR etc. (not h264). W need to see CPU and GPUs load during these tests.

Why would any export show anything about GPU? None of the encodings is done on GPU as far as I know and if any, h264 would be the best candidate, and even this is encoded in dedicated hardware, not through GPGPU pipeline.


Encoding very much uses the GPU, and when working with uncompressed formats the GPU does most of the work.
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 10:53 pm

Do you mean export process itself?

There is almost nothing in the Resolve exporter which is GPU encoded. When it is GPU encoded, it's done by separate dedicated hardware on the GPU, not the main GPU processors.
The GPU (and actually the CPU) does next to nothing when you read/export DPX, TIFF etc. All GPU load comes from the Resolve processing engine. That's why these are the best formats when you are looking for max performance, of course at the cost of the crazy fast storage needed to handle them.
When you read e.g. DPX, all the CPU has to do is unpack its 30-bit data from a 32-bit word and pass pixels to the GPU. I think in some cases DPX frames can be pushed directly to the GPU, as some GPUs support their structure natively.
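
To show how little work that unpacking is, here is a minimal sketch for 10-bit RGB DPX. It assumes the common "filled, method A" layout (three 10-bit components in the upper 30 bits, 2 padding bits at the bottom) and big-endian storage; other DPX packings exist, and the function names are made up for illustration.

```python
import struct

def unpack_dpx_word(word):
    """Split one 32-bit word into three 10-bit components (R, G, B)."""
    r = (word >> 22) & 0x3FF
    g = (word >> 12) & 0x3FF
    b = (word >> 2) & 0x3FF   # lowest 2 bits are padding
    return r, g, b

def pack_dpx_word(r, g, b):
    """Inverse of unpack_dpx_word, padding bits left at zero."""
    return (r << 22) | (g << 12) | (b << 2)

# Round trip through big-endian bytes, as assumed for the file on disk.
raw = struct.pack(">I", pack_dpx_word(1023, 512, 64))
(word,) = struct.unpack(">I", raw)
print(unpack_dpx_word(word))
```

Each pixel costs three shifts and three masks, which is why uncompressed sources put so little load on the CPU compared with decoding a compressed codec.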
Offline

Dylan Evans

  • Posts: 66
  • Joined: Mon May 27, 2013 8:57 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 19, 2017 11:24 pm

I think perhaps I am just misunderstanding the terminology here.

The GPU is observably used in the transcoding process: when you take an uncompressed RAW source and transcode to (for example) DNX, this process barely stresses the CPU but very much stresses the GPU.
Offline

Hendrik Proosa

  • Posts: 3015
  • Joined: Wed Aug 22, 2012 6:53 am
  • Location: Estonia

Re: Pudget Systems test multiple TitanXp in Resolve

PostFri Oct 20, 2017 9:01 am

DNx codecs are not encoded on GPU, neither is Prores. Take an uncompressed dpx sequence and transcode it to some DNx flavor without applying any grading ops, how much GPU use do you see?

To remove all IO bottlenecks a ram disk with uncompressed footage would be a good idea as Jean Claude wrote. To get a good impression of pure GPU speed, exporting a clip and measuring the time would be better than visually noting the fps in timeline playback.
I do stuff.
Offline

Dylan Evans

  • Posts: 66
  • Joined: Mon May 27, 2013 8:57 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostFri Oct 20, 2017 11:42 pm

I think I see what you are saying now, at the very least anything I have put through Resolve in the past has had a simple LUT applied to one node. From an ArriRAW timeline to DNX this would strain the GPU and not the CPU - is this all down to the one node LUT?

Running on a touchbar MBP, the answer to how much GPU usage I see going from 1080p DPX to 1080p DNX is roughly 78%. But there would be several bottlenecks elsewhere and the GPU is only passable, so it's not a good test case.
Offline

Peter Chamberlain

Blackmagic Design

  • Posts: 13875
  • Joined: Wed Aug 22, 2012 7:08 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostSat Oct 21, 2017 2:19 am

Camera raw debayer and image processing is in GPU, decoding and encoding the compression is cpu.
DaVinci Resolve Product Manager
Offline

John Richard

  • Posts: 382
  • Joined: Thu Aug 23, 2012 3:25 pm

Re: Pudget Systems test multiple TitanXp in Resolve

PostSat Oct 21, 2017 5:03 pm

So very thankful to you all and Puget Systems for this testing and educated commentary. It's so very helpful to those of us less technically qualified when making hardware purchases. A big improvement over the Candle Test and outdated hardware guides, given the resources it takes to keep up with current fast-changing hardware advancements.

All this reporting and testing help match hardware investment to the project file types people usually work with. A big THANK YOU to you all. Keep it going.
Last edited by John Richard on Sun Oct 22, 2017 3:07 pm, edited 1 time in total.
Offline

Dylan Evans

  • Posts: 66
  • Joined: Mon May 27, 2013 8:57 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostSun Oct 22, 2017 11:53 am

Peter Chamberlain wrote:Camera raw debayer and image processing is in GPU, decoding and encoding the compression is cpu.


Can I clarify a couple of things with you Peter,

1- Does this mean that transcoding from an uncompressed format is not the same as transcoding from RAW itself. So the Pudget methodology of using TIFF is not equivalent to using ArriRAW or the like?

2- How do you define 'image processing'? Does this happen in all transcoding operations, or just when image effects (LUT/FX etc) are applied?
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostSun Oct 22, 2017 1:16 pm

Of course reading RAW is very different from reading TIFF, DPX etc. RAW needs debayering and this is quite GPU intensive. TIFF etc. is just pushing ready RGB data into the GPU (in some cases it needs unpacking, but this is a rather fast process). That's why Puget used RED, TIFF, ProRes... as sources, so they covered basically all scenarios.
1. You read data: either a simple push to GPU, or debayer, or decode (or a combination of these 2, as in the case of e.g. RED).
2. Once this is done and the data is in the GPU as 32-bit float, you have processing: all LUTs, grading nodes, blur, noise reduction etc.
3. Export: the final data after processing is encoded to the final format (mostly pushed back to the CPU).
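
The three stages above can be modelled very roughly as a pipeline whose throughput is set by its slowest stage. The sketch assumes the stages overlap perfectly, and the per-stage frame rates are hypothetical numbers chosen for illustration, not measurements.

```python
def pipeline_fps(stage_fps):
    """Return (fps, stage name) of the slowest stage in the pipeline."""
    stage = min(stage_fps, key=stage_fps.get)
    return stage_fps[stage], stage

# Hypothetical per-stage rates in frames per second.
one_gpu = {"read/decode (CPU)": 14.0, "process (GPU)": 20.0, "encode (CPU)": 30.0}
# Assume a second GPU nearly doubles only the processing stage.
two_gpu = dict(one_gpu, **{"process (GPU)": 38.0})

print(pipeline_fps(one_gpu))
print(pipeline_fps(two_gpu))
```

In this made-up case both configurations are limited by the CPU decode stage, so the second GPU changes nothing, which is the pattern several posters suspect for the RED results.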
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostSun Oct 22, 2017 3:56 pm

I think from a personal point of view what I would like to know is:
If you have strong CPUs such as Dual Xeon systems does the Dual GPU offer any benefits?

Myself these are the data sources I would normally work with:
- Red 5k Raw. Sorry no 6k Dragon and 8k Helium, the upgrade was just a bit too expensive :D
- H265 from Phantom4Pro. Yes, maybe we have to transcode them into more edit friendly format, but a big problem for us is that the data becomes HUGE if we move it into DNxHR 4k. Space is always a big issue for us with our videos.
- Motion Jpeg from our Canon 1DX. Resolve plays this pretty well actually.

In Redcine X playing back red files is much faster with 2xTitan X. It's actually twice as fast so we hit something like 3x realtime.

To be fair though, I find that playback is sometimes realtime and sometimes not in Davinci with RED raw.

/Andreas
VFX Director - Visual Forest Ltd
Offline

Jean Claude

  • Posts: 2973
  • Joined: Sun Jun 28, 2015 4:41 pm
  • Location: France

Re: Pudget Systems test multiple TitanXp in Resolve

PostSun Oct 22, 2017 4:36 pm

Take a very noisy 5K clip on a 5K timeline, apply a 5-frame temporal denoise, then render a 1-minute delivery in 5K with forced debayer: how long does it take?
"Saying it is good, but doing it is better! "
Win10-1809 | Resolve Studio V16.1 | Fusion Studio V16.1 | Decklink 4K Extreme 6G | RTX 2080Ti 431.86 NSD driver! |
Offline

Dylan Evans

  • Posts: 66
  • Joined: Mon May 27, 2013 8:57 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostSun Oct 22, 2017 5:20 pm

Andrew Kolakowski wrote:Of course reading RAW is very different than reading TIFF, DPX etc. RAW needs debayering and this is quite GPU intensive. TIFF etc is just push ready RGB data into GPU (in some case it needs unpacking, but this is rathe fast process). That's why Pudget used as source: RED, TIFF, ProREs... so they covered basically all scenarios.
1.You read data- either simple push to GPU, or debayer or decode (or combination of these 2 like in case of eg. RED).
2. Once you have this done and data is now in GPU as 32bit float you have processing, so all LUTs, grading nodes, blur, noise reduction etc
3. Export- final data after processing is encoded to final format (mostly pushed back to CPU).


Right, so this is why I am asking this, they may show debayer of a Red source, but not of an ArriRAW source for example which stresses the hardware in very different ways. I would say this means that they haven't covered all scenarios as their test does not tell me what a basic uncompressed RAW > DNX pipeline does to the hardware, or if multiple TitanXp cards might help in this scenario.
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostSun Oct 22, 2017 5:45 pm

Ok, I did the test. Myself I use Neat Video for noise reduction so don't know much about the correct settings for noise reduction.

Hardware for this test. 2xTitan X and 2x14 Core 2.6GHz (turbo can run at 3.1GHz continuously without overheating). I can maybe do a test with a single 1080 Ti later.
Timeline 5200x2700.
I saved into DNxHR 444 format onto an SSD Disk with 500MB/s write speed. Material was from Raid of about 800MB/s read.

Test 1)
Not sure what the correct settings are. The first test was Lum and color threshold was at 0. Movement at 10.7
It was playing back in 10.5fps
CPU was around 45%
GPU 1 30%
GPU 2 20%
Export Time. 2min 26seconds.

2)
If I change lum and color threshold to 3 and Motion is still 10.7
CPU: 20% (I think the CPU is mainly used for decoding the Red files, so its less used when it needs to wait for the noise reduction which is done on the GPU).
GPU 1 up to 25-100%
GPU 2 usually around 10-70%
Playback 4fps
Export: 5.30min

3) Only Red 5.2k debayer with no noise reduction.
Playback 13.5 (while exporting)
CPU 54%
GPU 1 46%
GPU 2 20-30%
No noise reduction
Export time: 2min2sec
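
For comparison, the three export times can be put on a common scale relative to the debayer-only run. The sketch reads "5.30min" as 5 minutes 30 seconds, which is an assumption about the notation.

```python
def to_seconds(minutes, seconds):
    """Convert a minutes/seconds pair to seconds."""
    return minutes * 60 + seconds

# Export times reported above.
baseline = to_seconds(2, 2)    # test 3: debayer only, no noise reduction
nr_light = to_seconds(2, 26)   # test 1: lum and color thresholds at 0
nr_heavy = to_seconds(5, 30)   # test 2: thresholds at 3 (assumed 5 min 30 s)

for name, t in [("test 1", nr_light), ("test 2", nr_heavy)]:
    print(f"{name}: {t} s, {t / baseline:.2f}x the debayer-only export")
```

On that reading, test 1 takes about 1.2 times as long as the baseline and test 2 about 2.7 times as long.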

Andreas
VFX Director - Visual Forest Ltd
Offline

Dermot Shane

  • Posts: 2720
  • Joined: Tue Nov 11, 2014 6:48 pm
  • Location: Vancouver, Canada

Re: Pudget Systems test multiple TitanXp in Resolve

PostSun Oct 22, 2017 9:28 pm

Dylan Evans wrote:As far as I am aware there are no motherboards that are not dual Xeon which would allow this in the first place,


so maybe run the test with dual xeon machine config'd as per BMD's spec's? That's the tool of choice for heavy lifting anyway
Last edited by Dermot Shane on Mon Oct 23, 2017 5:41 pm, edited 1 time in total.
Offline

Marc Wielage

  • Posts: 10901
  • Joined: Fri Oct 18, 2013 2:46 am
  • Location: Hollywood, USA

Re: Pudget Systems test multiple TitanXp in Resolve

PostMon Oct 23, 2017 3:01 am

AndreasOberg wrote:H265 from Phantom4Pro. Yes, maybe we have to transcode them into more edit friendly format, but a big problem for us is that the data becomes HUGE if we move it into DNxHR 4k. Space is always a big issue for us with our videos.

I think managing media space is an important part of any workflow situation. My advice would be to cut down on the total amount of material you're looking at -- for example, if it's Phantom Flex files, chop out the run-up and the run-out and only keep the material that actually could be used in the final show. Take that and convert it to a better edit format like ProRes or DNxHR. To me, drives are a lot cheaper than computer hardware... unless you're shooting 100TB a day of 8K 120fps material.
marc wielage, csi • VP/color & workflow • chroma | hollywood
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostMon Oct 23, 2017 12:51 pm

I think managing media space is an important part of any workflow situation. My advice would be to cut down on the total amount of material you're looking at -- for example, if it's Phantom Flex files, chop out the run-up and the run-out and only keep the material that actually could be used in the final show. Take that and convert it to a better edit format like ProRes or DNxHR. To me, drives are a lot cheaper than computer hardware... unless you're shooting 100TB a day of 8K 120fps material.


Hi Marc, sound advice. The way we currently work is that our main computer is also our main storage space for data. We are only really editing and grading our own material, so our setup is probably a bit different from that of bigger houses.

Currently we have this setup (could maybe be useful for others)
- 64TB enterprise RAID 10. This gives fast reads and is also pretty safe, since every disk is mirrored. Also several other drives that we don't use as much.
- Various 8TB external drives for offsite backup/in a safe. All our video folders are in 8TB chunks so they are easier to offload.
- 2x4TB enterprise SSDs when we are out in the field. This has been a godsend and saved us tons of time.

So far what I have found to be the most efficient solution (not saying its the best, still lots to learn!)
- Trim all material. This can usually reduce your data by at least 30%, but sometimes a lot more depending on how strict you were when shooting.
- Avoid creating Optimised media (meaning not proxies, but high-quality media that replaces the source).
- Create proxies if it's a more advanced editing project where you need to be quick. If it's a simpler project, proxies may not be needed since Davinci is pretty fast with raw files. This works for RED and Canon 1DX Motion JPEG, which are fast enough to work with.
- H265: currently my workflow is to transcode, but I will try with GeForce 1080 Ti cards. If the 1080's hardware acceleration makes it possible to view them, and maybe work with them, that would make the workflow near perfect.
- Maybe optimise the whole timeline or cache it for the color grading, but so far that's usually not needed.

I use a program called GoodSync to do all the backups. It's really good since it detects changes and can handle renames without copying the files again.

Cheers
/Andreas
VFX Director - Visual Forest Ltd
Offline

Peter Chamberlain

Blackmagic Design

  • Posts: 13875
  • Joined: Wed Aug 22, 2012 7:08 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostMon Oct 23, 2017 3:31 pm

AndreasOberg wrote:Ok, I did the test. Myself I use Neat Video for noise reduction so don't know much about the correct settings for noise reduction.

Hardware for this test. 2xTitan X and 2x14 Core 2.6GHz (turbo can run at 3.1GHz continuously without overheating). I can maybe do a test with a single 1080 Ti later.
Timeline 5200x2700.
I saved into DNxHR 444 format onto an SSD Disk with 500MB/s write speed. Material was from Raid of about 800MB/s read.

Tets 1)
Not sure what the correct settings are. The first test was Lum and color threshold was at 0. Movement at 10.7
It was playing back in 10.5fps
CPU was around 45%
GPU 1 30%
GPU 2 20%
Export Time. 2min 26seconds.

2)
If I change lum and color threshold to 3 and Motion is still 10.7
CPU: 20% (I think the CPU is mainly used for decoding the Red files, so its less used when it needs to wait for the noise reduction which is done on the GPU).
GPU 1 up to 25-100%
GPU 2 usually around 10-70%
Playback 4fps
Export: 5.30min

3) Only Red 5.2k debayer with no noise reduction.
Playback 13.5 (while exporting)
CPU 54%
GPU 1 46%
GPU 2 20-30%
No noise reduction
Export time: 2min2sec

Andreas



Only a little interesting as you don’t define the motherboard, which slots are used for which cards and if any other cards are in slots, Decklink or drive control etc.

Measuring CPU or GPU performance still comes back to the lowest common denominator. You can have fast cpus and slow system memory, fast raid but card in a x4 slot etc.

I’m not saying your test is invalid, just an incomplete story just like the original Pudget tests.
DaVinci Resolve Product Manager
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostMon Oct 23, 2017 8:57 pm

Only a little interesting as you don’t define the motherboard, which slots are used for which cards and if any other cards are in slots, Decklink or drive control etc.

Measuring CPU or GPU performance still comes back to the lowest common denominator. You can have fast cpus and slow system memory, fast raid but card in a x4 slot etc.

I’m not saying your test is invalid, just an incomplete story just like the original Pudget tests.


Hiya Peter. Here is the rest of the information:
Supermicro MBD-X10DAI-O C610
Intel Xeon E5-2697 V3 s2011-3 (14 cores with 2.7GHz with turboboost to 3.1GHz, Noctua coolers so they never throttle down but can run in 3.1GHz as long as needed)
2x64GB Cruc DDR4 Server (4x16GB)
2x12GB Asus GTX TITAN X 12GD5 (all sitting in PCI-E 16x)
480GB SAMSUNG PM863 SSD SATA
8x8TB HITACHI SATA HDD (in Raid 10)
Areca ARC-1882i Dual Core PCIe
No decklink cards.

OS. Windows 10.0.15063
Davinci: 14.01

Personally, I think there are other details that would need to be improved with the "test" such as:
- Using the same red clip.
- Also use different types of hardware, otherwise you have little to compare with
- Also this primarily tests noise reduction which is of course only useful if that is what you need to do, ideally you would have tests similar to what you actually work with.

If I would benchmark for myself this is how I would do it:
- benchmark playback of Red files on a 4k timeline with a 4k monitor, 5k is the highest resolution we have.
- Test some formats like Motion Jpeg, ProRes, DNxHR and H265, image sequences.
- Do a few example scenes with various amounts of grading, everything from 5 nodes to 30
- Do some scenes with stabilization since I use this feature a lot and it's very demanding on the hardware.
- Benchmark an export since it has real world usage.
- Use both 1 card or 2 cards. Could be interesting with 1 or 2 CPUs, but not sure how to turn one of them off without taking it out.

What could be useful is to have a built in benchmark in Resolve. Maxwell Render has this and I always found it very useful. Your harddrive test is my favorite hdd benchmark program.

Best,
Andreas
VFX Director - Visual Forest Ltd
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostMon Oct 23, 2017 11:06 pm

Since you have fast CPUs you can test RED as source, but you have to keep the same project etc. Don't use any 3rd party filtering like NeatVideo; it is slow and no good for this test. We want to test Resolve's internal engine, not a 3rd party one.

Make complicated project with many nodes, some blur etc. and export your RED source into DNxHR with 1GPU and 2 GPUs set for computing. Check export times and note load for CPU and GPU in both cases.
Do the same test with easy on CPU source, so DNxHR (exporting back to DNxHR).
What you want to see is with 2 GPUs export time going down a lot (at least 1.5x) and both GPUs being used at 50%+ (and CPUs also 50%+).
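
The pass criteria in the last sentence can be expressed as a small check. The 1.5x speedup and 50% load thresholds come from the post above; the function name and the sample figures are hypothetical.

```python
def multi_gpu_worthwhile(t_one_gpu, t_two_gpu, gpu_loads, cpu_load,
                         min_speedup=1.5, min_load=50.0):
    """Return (speedup, verdict) for a 1-GPU versus 2-GPU export comparison."""
    speedup = t_one_gpu / t_two_gpu
    ok = (speedup >= min_speedup
          and all(load >= min_load for load in gpu_loads)
          and cpu_load >= min_load)
    return speedup, ok

# Hypothetical export times in seconds and loads in percent.
print(multi_gpu_worthwhile(300, 180, [65, 60], 70))
print(multi_gpu_worthwhile(300, 280, [30, 20], 45))
```

Running the same project and source through this check for both a CPU-heavy source (RED) and a CPU-light one (DNxHR) would show whether a failed result is due to the GPUs or to the decode stage.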
Last edited by Andrew Kolakowski on Mon Oct 23, 2017 11:26 pm, edited 3 times in total.
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostMon Oct 23, 2017 11:15 pm

Dylan Evans wrote:
Andrew Kolakowski wrote:Of course reading RAW is very different than reading TIFF, DPX etc. RAW needs debayering and this is quite GPU intensive. TIFF etc is just push ready RGB data into GPU (in some case it needs unpacking, but this is rathe fast process). That's why Pudget used as source: RED, TIFF, ProREs... so they covered basically all scenarios.
1.You read data- either simple push to GPU, or debayer or decode (or combination of these 2 like in case of eg. RED).
2. Once you have this done and data is now in GPU as 32bit float you have processing, so all LUTs, grading nodes, blur, noise reduction etc
3. Export- final data after processing is encoded to final format (mostly pushed back to CPU).


Right, so this is why I am asking this, they may show debayer of a Red source, but not of an ArriRAW source for example which stresses the hardware in very different ways. I would say this means that they haven't covered all scenarios as their test does not tell me what a basic uncompressed RAW > DNX pipeline does to the hardware, or if multiple TitanXp cards might help in this scenario.


RED heavily stresses both CPU and GPU, ArriRAW mainly the GPU. In many cases it's the CPU that can't keep up with decoding RED, so the GPUs have to wait. This could be a potential issue with the Puget test with a RED source (but then again, they used different sources and some of them were light on the CPU).
ArriRAW doesn't really bring anything special here. You could argue that, for example, the RED SDK has a poorly written debayer, so the GPUs are not used well, but I don't think this is the case. It would be very visible against all the other tests (with different source formats), but Puget didn't mention such a case.
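
A rough sketch of the "GPUs have to wait" point: if the stages run in parallel, the slowest one sets the frame rate and the others sit partly idle. This is a simplified model that assumes perfect overlap between stages; the stage names and capacities are made up for illustration.

```python
def pipeline_report(stage_fps):
    """Given the max frames/s each stage could sustain on its own, return
    (bottleneck stage, overall fps, utilisation of each stage)."""
    # The slowest stage limits the whole pipeline.
    bottleneck = min(stage_fps, key=stage_fps.get)
    overall_fps = stage_fps[bottleneck]
    # Every other stage is only busy for a fraction of the time.
    utilisation = {name: overall_fps / fps for name, fps in stage_fps.items()}
    return bottleneck, overall_fps, utilisation


# Hypothetical capacities, not measured values.
stages = {"cpu_decode": 30.0, "gpu_debayer_and_grade": 60.0, "encode": 90.0}
bottleneck, overall_fps, utilisation = pipeline_report(stages)
print(f"bottleneck: {bottleneck}, overall: {overall_fps:.0f} fps")
for name, used in utilisation.items():
    print(f"  {name}: {used:.0%} busy")
```

In this model, adding a second GPU doubles the capacity of a stage that is already waiting, so the overall frame rate does not change.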
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 24, 2017 12:08 am

Andrew Kolakowski wrote:What you want to see is with 2 GPUs export time going down a lot (at least 1.5x) and both GPUs being used at 50%+ (and CPUs also 50%+).

Hiya. Sounds like a good test. Is this speed improvement something you have seen on your own system, or do you have a different setup?

I did a quick test in Redcine X to see what the load is when only playing Red files at full debayer. In Redcine X I get 100% CPU load, and the GPU load is 50-60% on both cards on a 1080p monitor. This indicates that the CPU is the bottleneck but that the GPUs are used quite heavily. It can easily play back the files at maybe 1.5x realtime speed.
Using the latest version 50.43840
Cheers,
Andreas
VFX Director - Visual Forest Ltd
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 24, 2017 9:35 am

I don't have such a machine, it's just how it should be.
Resolve should give similar results, because in the case of RED everything has to go through the RED SDK, due to the fact that RED files are encrypted.
This is quite a good result from Redcine X and proves their SDK is fairly well written.
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 24, 2017 11:16 am

Andrew Kolakowski wrote:I don't have such a machine, it's just how it should be.
Resolve should give similar results, because in the case of RED everything has to go through the RED SDK, due to the fact that RED files are encrypted.
This is quite a good result from Redcine X and proves their SDK is fairly well written.


At first I was a bit surprised that the CPU was 100% used in Redcine X, but then I realised what it's doing. It's basically rendering as fast as possible until the first bottleneck is hit, which is the CPU. If you played back in realtime (and not faster than realtime), I would think less CPU usage would be needed, which is probably what Resolve does (unless rendering).
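
As a back-of-the-envelope check of that idea, using the Redcine X figures from the earlier post (100% CPU and 50-60% GPU at about 1.5x realtime). This assumes load scales linearly with playback speed, which is only an approximation; the function name is made up.

```python
def load_at_realtime(load_pct, speed_factor):
    """Estimate the load at 1.0x playback from the load measured at
    `speed_factor` x realtime, assuming load scales linearly with speed."""
    return load_pct / speed_factor


speed = 1.5  # playback ran at roughly 1.5x realtime
cpu_rt = load_at_realtime(100.0, speed)
gpu_rt_low = load_at_realtime(50.0, speed)
gpu_rt_high = load_at_realtime(60.0, speed)
print(f"estimated CPU load at realtime: {cpu_rt:.0f}%")
print(f"estimated GPU load at realtime: {gpu_rt_low:.0f}-{gpu_rt_high:.0f}%")
```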

/Andreas
VFX Director - Visual Forest Ltd
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 24, 2017 11:34 am

Yes, and this is actually a very good test (in some cases you may need to put the data on a RAM disk for this test).
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 24, 2017 12:16 pm

Andrew Kolakowski wrote:Yes, and this is actually a very good test (in some cases you may need to put the data on a RAM disk for this test).

True,
I can test at work as well. We have 2x10 cores at 3.1GHz that turbo boost to 3.4 or 3.6GHz. Sadly only one 1080 Ti card. From my experience the playback is considerably slower on this machine than on what I have at home. Redcine X is a bit tricky because it has some settings for the graphics card. You can increase the number of frames you send in one batch. It gets faster but can stutter, especially if you jump around in time.
I don't think the hard drive is a limit in any way for the Red files. The data bandwidth for realtime is around 150MB/s, which the RAID can easily deal with. I did notice that a fast drive such as an M.2 SSD with 3200MB/s read helped a lot when jumping in time. I guess the RAID needs to rotate all the disks to find the file.

The interesting finding here is that, for my case in Redcine X, faster GPUs may not give any additional speed.
/Andreas
VFX Director - Visual Forest Ltd
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 24, 2017 2:03 pm

Yes, having M.2 or SSD helps even for compressed files. I noticed that in Edius: MP4 files scrub better and overall timeline responsiveness is better when the media is on SSD/M.2. It's all down to seek time I assume, especially for long GOP formats where many frames need to be fetched in advance.

When you try to test e.g. 4K TIFF/DPX then a RAM disk may be needed to saturate the GPUs :)
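
To put a rough number on why uncompressed sequences need such fast storage, here is a small data-rate estimate. The frame size (4096x2160), frame rate (24 fps) and pixel sizes (10-bit RGB DPX packed into 4 bytes per pixel, 16-bit RGB TIFF at 6 bytes per pixel) are assumptions chosen for illustration, and file headers are ignored.

```python
def uncompressed_rate_mb_s(width, height, bytes_per_pixel, fps):
    """Sustained read rate in MB/s (1 MB = 10**6 bytes) needed for realtime
    playback of an uncompressed image sequence."""
    return width * height * bytes_per_pixel * fps / 1e6


# Assumed formats, see the note above.
dpx_10bit = uncompressed_rate_mb_s(4096, 2160, 4, 24)
tiff_16bit = uncompressed_rate_mb_s(4096, 2160, 6, 24)
print(f"4K 10-bit DPX  @ 24 fps: {dpx_10bit:.0f} MB/s")
print(f"4K 16-bit TIFF @ 24 fps: {tiff_16bit:.0f} MB/s")
```

That is only the realtime rate. A test that tries to keep the GPUs saturated has to feed frames faster than realtime, so the storage requirement goes up by the same factor.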
Offline

Jean Claude

  • Posts: 2973
  • Joined: Sun Jun 28, 2015 4:41 pm
  • Location: France

Re: Pudget Systems test multiple TitanXp in Resolve

PostTue Oct 24, 2017 5:37 pm

AndreasOberg wrote:.../...
Could be interesting with 1 or 2 CPUs, but not sure how to turn one of them off without taking it out.

Best,
Andreas


Try :
- cmd (command prompt)
- enter msconfig
- go to the Boot tab
- Advanced options... => new popup window
- Check 'Number of processors'
- select '1'
- OK and restart Windows
- run the test

After the test, reverse it: same way but 'Number of processors' = 2 + restart

Edit : Do not forget to post the results: for comparison => I only have one CPU and 2 GPU :)
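
A quick way to confirm after the restart that the limit really took effect is to print how many logical processors the OS exposes. As far as I know, the msconfig setting counts logical processors (cores and threads) rather than physical CPU packages, so it is worth checking the number before running the benchmark. The helper name below is made up.

```python
import os


def visible_logical_cpus():
    """Return the number of logical processors the OS reports.
    os.cpu_count() can return None if the count cannot be determined,
    in which case 0 is returned here."""
    return os.cpu_count() or 0


count = visible_logical_cpus()
print(f"logical processors visible to the OS: {count}")
```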
"Saying it is good, but doing it is better! "
Win10-1809 | Resolve Studio V16.1 | Fusion Studio V16.1 | Decklink 4K Extreme 6G | RTX 2080Ti 431.86 NSD driver! |
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostWed Oct 25, 2017 11:26 am

Try :
- cmd (command prompt)
- enter msconfig
- go to the Boot tab
- Advanced options... => new popup window
- Check 'Number of processors'
- select '1'
- OK and restart Windows
- run the test

After the test, reverse it: same way but 'Number of processors' = 2 + restart

Edit : Do not forget to post the results: for comparison => I only have one CPU and 2 GPU :)


Thanks that is a great tip.
I'm busy this weekend but will do some more benchmarks after.
My friend just bought a 16-core Threadripper CPU. It would be very interesting to see how this fares, considering it's about 1000 dollars instead of 4000-5000 for a dual Xeon setup.
/Andreas
VFX Director - Visual Forest Ltd
Offline

AndreasOberg

  • Posts: 452
  • Joined: Wed Sep 25, 2013 9:09 am

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 26, 2017 10:10 am

Talking about benchmarks: my friend just bought an AMD 16-core Threadripper. He has a 4.5K Red Raven and a GeForce GTX 1080.
I was a bit surprised to see that it played in realtime, maybe at 1.1x speed (compression 1:6). The GPU was used about 30-40% while the CPU was at 100%.
The cores were running at about 3.7GHz, which is very cool for so many cores.
Will test the Threadripper in DaVinci Resolve, and with 5.2K Red material as well. If you guys have any 6K Dragon or 8K Helium clips we can test them as well.
I also want to test this with a 4K monitor.
/Andreas
VFX Director - Visual Forest Ltd
Offline

Andrew Kolakowski

  • Posts: 9209
  • Joined: Tue Sep 11, 2012 10:20 am
  • Location: Poland

Re: Pudget Systems test multiple TitanXp in Resolve

PostThu Oct 26, 2017 1:59 pm

Well, the Threadripper 1950X has 16 cores and they are highly clocked, so why so surprised? Even if single core performance is lower in some cases compared to Intel, there are still 16 of them. If a process can use them well then there should be no big surprise.
The CPU has 16 x 3.7 = 59.2 (cores x GHz). If you have 2x 10-core Xeons at 3GHz this is 60. Both systems should perform similarly.
In the case of RED all the CPU power is "wasted" on decoding that 4K, 6K or 8K JPEG2000 data. RED could use something faster to decode. JPEG2000 is a real CPU killer.
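
The cores x clock comparison above written out as a small calculation. It is a crude estimate only: it ignores per-clock performance differences between the architectures, turbo behaviour and memory bandwidth. The function name is made up.

```python
def core_ghz(cores_per_cpu, clock_ghz, sockets=1):
    """Aggregate 'core-GHz' figure: sockets x cores x clock."""
    return sockets * cores_per_cpu * clock_ghz


threadripper = core_ghz(16, 3.7)          # 16 cores at about 3.7GHz
dual_xeon = core_ghz(10, 3.0, sockets=2)  # 2 x 10 cores at 3GHz
print(f"Threadripper 1950X: {threadripper:.1f} core-GHz")
print(f"Dual 10-core Xeon:  {dual_xeon:.1f} core-GHz")
print(f"ratio: {threadripper / dual_xeon:.2f}")
```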
