CPU for editing 10 bit 4:2:2 h.264/265


dantastic

  • Posts: 6
  • Joined: Sun Apr 07, 2019 5:07 am
  • Real Name: Daniel Spiller

CPU for editing 10 bit 4:2:2 h.264/265

PostMon Jun 28, 2021 8:06 am

Hi there, I'm thinking about building a new PC, but I'm really confused about how to improve the editing of the files from my a7S III. It seems that newer Intel CPUs can hardware-decode h.264 and h.265, but:

- Does Resolve take advantage of CPU hardware decoding?

- Do the newer Intel chips decode h.264 and h.265 10 bit 4:2:2? (Some places say it's only 4:4:4 and 4:2:0 but not 4:2:2.)

- From what I've read there aren't mainstream GPUs that can do this, but am I wrong about that?

Any advice is much appreciated :)

Uli Plank

  • Posts: 25450
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostMon Jun 28, 2021 9:09 am

The newest xe versions of Intel iGPUs and Apple's M1 can decode such formats in hardware under Studio.
My disaster protection: export a .drp file to a physically separated storage regularly.
www.digitalproduction.com

Studio 19.1.3
MacOS 13.7.4, 2017 iMac, 32 GB, Radeon Pro 580 + eGPU
MacBook M1 Pro, 16 GPU cores, 32 GB RAM, MacOS 14.7.2
SE, USM G3

Carsten Sellberg

  • Posts: 1471
  • Joined: Fri Jun 16, 2017 9:13 am

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostMon Jun 28, 2021 1:56 pm

dantastic wrote: Does resolve take advantage of CPU hardware decoding?


Hi.

Here is a link to Intel Quick Sync Video, which is Intel's video encoding and decoding hardware for NON XEON CPU's:

https://en.wikipedia.org/wiki/Intel_Quick_Sync_Video

As you can see, its capabilities depend on which version of Quick Sync your CPU has.
And yes, the paid Studio version of Resolve can use it.

Regards Carsten.
URSA Mini 4.6K

SteveMullen

  • Posts: 136
  • Joined: Thu Sep 15, 2016 3:08 am

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostMon Jun 28, 2021 8:39 pm

Uli Plank wrote:The newest xe versions of Intel iGPUs and Apple's M1 can decode such formats in hardware under Studio.


The question states "10-bit" and "422". From what I know, neither h.264 nor h.265 supports 4:2:2. And unless the h.264 or h.265 was generated using the Main10 Profile, the data are 8-bit. So I'm wondering why the question states "10 bit 4:2:2."

With the M1 there are three "hardware" units available for decoding and encoding video:

1) the CPU -- which would likely be the slowest, so I don't understand the question.

2) the GPU -- which I assume is used.

3) the NP -- which might be used.

4) Which unit is used?

5) Are both decode and encode supported?

If one doesn't have the Studio version, what are the answers for the same five questions?

peterjackson

  • Posts: 1182
  • Joined: Sat Aug 18, 2018 7:12 pm
  • Real Name: Peter Jackson

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostTue Jun 29, 2021 7:33 am

SteveMullen wrote:
Uli Plank wrote:The newest xe versions of Intel iGPUs and Apple's M1 can decode such formats in hardware under Studio.


The question states "10-bit" and "422". From what I know, neither h.264 nor h.265 supports 4:2:2. And unless the h.264 or h.265 was generated using the Main10 Profile, the data are 8-bit. So I'm wondering why the question states "10 bit 4:2:2."


How about at least doing a simple Google search before making random claims?

Both H264 and H265 support 10 bit with 422 just fine. H264 allows up to 12 bit in 444.

H265 supports 420 at 8/10/12, 422 at 8/10/12 and 444 at 8/10/12/16 bit depths.

Both support lossless and all intra as well.
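For anyone who wants to check what a clip actually contains: ffprobe prints a pix_fmt string for each video stream, and the planar YUV names encode both the chroma subsampling and the bit depth. Below is a minimal Python sketch of reading such a name, assuming ffprobe is installed. The helper name describe_pix_fmt is made up for illustration, and it only recognizes the common planar yuv* names, nothing else.

```python
import re

# Get the string to parse with, for example:
#   ffprobe -v error -select_streams v:0 \
#       -show_entries stream=codec_name,profile,pix_fmt \
#       -of default=noprint_wrappers=1 clip.mp4
#
# Matches planar YUV pixel-format names such as "yuv420p", "yuvj420p"
# or "yuv422p10le". Anything else is treated as unrecognized.
_PIX_FMT = re.compile(r"^yuvj?(420|422|444)p(\d+)?(?:le|be)?$")


def describe_pix_fmt(pix_fmt):
    """Return (chroma_subsampling, bit_depth), or None if the name is not recognized."""
    match = _PIX_FMT.match(pix_fmt)
    if match is None:
        return None
    digits, depth = match.groups()
    chroma = ":".join(digits)  # "422" -> "4:2:2"
    # Names without an explicit depth suffix (e.g. "yuv420p") are 8-bit.
    return (chroma, int(depth) if depth else 8)


if __name__ == "__main__":
    for name in ("yuv420p", "yuv420p10le", "yuv422p10le"):
        print(name, describe_pix_fmt(name))
```

A clip reported as yuv422p10le is the 10-bit 4:2:2 case this thread is about; yuv420p10le is 10-bit 4:2:0.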
5950x, 3090, 128GB.

SteveMullen

  • Posts: 136
  • Joined: Thu Sep 15, 2016 3:08 am

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostTue Jun 29, 2021 8:41 pm

Unless your camera uses h.265 main10 you are not going to get 10-bit source files. And, how many cameras record 4:2:2? And, how many cameras record 4:2:2 with 10-bits?

So the question still makes no sense to me. Perhaps if he had stated why he asked the question.

John Paines

  • Posts: 6327
  • Joined: Tue Jul 28, 2015 4:04 pm

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostTue Jun 29, 2021 8:57 pm

SteveMullen wrote:And, how many cameras record 4:2:2 with 10-bits?


Panasonic S1, S1H, S5, GH5, Canon C70, Sony A7S III, etc. In a word, many. And of course all BMD cameras, but that wouldn't be h264/5.

peterjackson

  • Posts: 1182
  • Joined: Sat Aug 18, 2018 7:12 pm
  • Real Name: Peter Jackson

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostWed Jun 30, 2021 6:55 am

Also, virtually all recent smartphones record H265 10bit 420 using Filmic or mcpro. In the case of Samsung, at up to 500 Mbit/s in UHD at 60 fps. Even back to the S9 from half a decade ago.


10bit HEVC is pretty much the standard for anyone filming with smartphones.

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostThu Jan 23, 2025 5:11 pm

Uli Plank wrote:The newest xe versions of Intel iGPUs and Apple's M1 can decode such formats in hardware under Studio.


Sorry for digging this up.

But it seems my base M1 does not decode H264 10bit 4:2:2 via the media engines. It seems it's all going through the CPU, and it's very laggy. H265 10bit 4:2:2, however, seems to be decoded by the media engines and is quite flawless to edit.

Do you have any Apple documentation that describes which codecs are decoded by the media engines?

I've also heard that the M4 media engines are a new iteration, but I haven't seen any details on what is better :(.

Bruce Phung

  • Posts: 37
  • Joined: Sun Jun 18, 2023 11:42 pm
  • Location: United States
  • Warnings: 1
  • Real Name: Bruce Phung

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostThu Jan 23, 2025 7:18 pm

SteveMullen wrote:Unless your camera uses h.265 main10 you are not going to get 10-bit source files. And, how many cameras record 4:2:2? And, how many cameras record 4:2:2 with 10-bits?

So the question still makes no sense to me. Perhaps if he had stated why he asked the question.


Obviously, you have no clue what you are talking about. You should not give misinformation on a subject that you do not know or understand. You had better just sit back, read and learn.
CPU: i9 Core Ultra 285K OCed @5.6Ghz
MBO: MSI Z890 MEG ACE
RAM: 48GB RGB DDR5 8200mhz
GPU: RTX 5080 16GB Triple fan OCed 3200mhz
NVMe: 2TB T705 Gen5 OS, 4TB Gen4 storage
OS: Windows 11 Pro. Custom built hard tube watercooling

kinvermark

  • Posts: 764
  • Joined: Tue Apr 16, 2019 5:04 pm
  • Real Name: Mark Wilson

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostThu Jan 23, 2025 9:52 pm

Take note of the post dates.... quite old.


AFAIK there is no h.264 10 bit 422 hardware decoder (either Intel Quick Sync or discrete GPU) available for Windows PCs. If there is a hardware solution, please specify.

I am curious whether or not any of the new Macs have this? (i.e. h.264, 10bit, 422 hardware decoder)
Windows 11 laptop. Intel i7-10750H, 32GB RAM, Nvidia 4070 ti Super eGPU, SSD disks. Resolve Studio (latest)

Uli Plank

  • Posts: 25450
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostThu Jan 23, 2025 11:27 pm

If anybody has a test file, I’m ready to compare with HEVC out of a Sony A7IV.

@Bruce
Why so rude without checking the date?

Bruce Phung

  • Posts: 37
  • Joined: Sun Jun 18, 2023 11:42 pm
  • Location: United States
  • Warnings: 1
  • Real Name: Bruce Phung

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostThu Jan 23, 2025 11:36 pm

Ouch. I did not realize this was an old thread. But why do old threads keep popping up from time to time on the forum? Just curious.

Uli Plank

  • Posts: 25450
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostThu Jan 23, 2025 11:40 pm

Because some people use the search function, which generally is the right thing to do.

CougerJoe

  • Posts: 599
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostThu Jan 23, 2025 11:47 pm

kinvermark wrote:Take note of the post dates.... quite old.


AFAIK there is no h.264 10 bit 422 hardware decoder (either Intel Quick Sync or discrete GPU) available for Windows PCs. If there is a hardware solution, please specify.

I am curious whether or not any of the new Macs have this? (i.e. h.264, 10bit, 422 hardware decoder)


The Nvidia 50 series has decode of 4:2:2 10-bit for both H.264 and H.265, and encode of H.265 4:2:2 10-bit. Decode of H.264 is said to be twice as fast. Decode of H.265 is possibly up to 3x faster on the 5090, although I don't know whether the decoders work in parallel or cumulatively.

roger.magnusson

  • Posts: 3874
  • Joined: Wed Sep 23, 2015 4:58 pm

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 12:03 am

Have they said that 4:2:2 is for both H.264 and H.265? Nvidia hasn't updated their NVENC/NVDEC support matrix page yet.

CougerJoe

  • Posts: 599
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 12:16 am

roger.magnusson wrote:Have they said that 4:2:2 is for both H.264 and H.265? Nvidia hasn't updated their NVENC/NVDEC support matrix page yet.


At around the 9-minute mark, this reviewer says it decodes 4:2:2 10-bit H.264, which was a surprise to me.


kinvermark

  • Posts: 764
  • Joined: Tue Apr 16, 2019 5:04 pm
  • Real Name: Mark Wilson

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 2:40 am

Well spotted, Bob ! Thanks!

I will still look for official confirmation from Nvidia, but the reviewer is quite clear about it.

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 10:23 am

I've seen that video too, but I wanted to check the official source, and on the Nvidia page there is nothing about it. The NVDEC and NVENC pages have not been updated yet :(.

But the price of the 5090 is kinda high, so I guess I will choose a Mac (Mini, Studio or MBP), but I do not have clear confirmation that they decode H264 LGOP 10bit 4:2:2 via the media engines. My M1 MBA seems to decode them on the CPU and struggles hard.

Uli Plank

  • Posts: 25450
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 11:26 am

Post a test clip to a cloud service to try.

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 1:15 pm

Uli Plank wrote:Post a test clip to a cloud service to try.


I will upload a few clips in a second :D. Thx!

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 2:31 pm

Uli Plank wrote:Post a test clip to a cloud service to try.


Hi Uli,

I've posted 12 clips here:

https://drive.google.com/drive/folders/ ... sp=sharing

There are 4 H264 LGOP 10bit 4:2:2 clips, 4 H265 10bit 4:2:0 clips and 4 H265 10bit 4:2:2 clips.

There is also a small, simple timeline with these clips. It's the way I usually use them (and it's great for testing).

So that's 4 clips playing at the same time, each reduced to 0.5 size. None of my computers is able to play this without hiccups (especially the H264 clips). I've tested on a Ryzen with an Nvidia RTX 4080, an M1 MacBook Air and an Intel machine (13th Gen i5) with Quick Sync and a Mobile RTX 4060. There are also indicators on the timeline (titles) stating which format it is and whether it's graded or not.

I've also attached a sound track, because it's the easiest indicator of whether the machine is able to play it; if not, the sound is really choppy :D

I hope the Mac has hardware decode support for H264 10bit 4:2:2. I called Apple support today with this question and they were unable to answer it :D. They said it's a very hard and complicated question and they do not know :D

Thanks for your help!

BR,
Lukasz

Uli Plank

  • Posts: 25450
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 2:52 pm

On the road, I’ll report tomorrow.

Steve Alexander

  • Posts: 5648
  • Joined: Mon Mar 23, 2015 2:15 am

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 3:52 pm

I suggest providing a drp (project) rather than just a timeline (drt) to ensure that the playback issue is not the result of a particular project-level setting such as optical flow...


Add - Your H264 10-bit files are 25fps while the others are 23.976fps - Are you using all of these files on the same timeline? (ignore - see below).

Add2 - I brought in your drt and on my MacBook with 19.1.3 Studio and playback set to show all video frames, I don't see any hiccups, really - maybe one or two when playing over the Text+ but not repeatable. What are you seeing? Also - since you have mixed frame rates, what is your project frame interpolation mode?
aka Barkinmadd
Resolve Studio 20 | Fusion Studio 20 | 16" MacBook Pro M1 MAX, 32 GPU cores, 64 GB RAM, 2 TB SSD, Sequoia 15.4.1

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 4:39 pm

Steve Alexander wrote:I suggest providing a drp (project) rather than just a timeline (drt) to ensure that the playback issue is not the result of a particular project-level setting such as optical flow...

Add - Your H264 10-bit files are 25fps while the others are 23.976fps - Are you using all of these files on the same timeline?


I have the default optical flow settings, but I've included the project file as well.
H264 All-I plays OK, but for the All-Intra codec on Sony you need either V90 SD cards or CFexpress cards, both of which are 10x more expensive than V30. V30 cards are OK for LGOP 10bit 4:2:2.

As for the frame rate settings, the timeline does not matter. I always shoot in 25p (because I'm located in a 50Hz country), but my camera does not allow 25p in H265, so I have to use 24p.

But no matter what the frame rate of the timeline is, they are choppy (like 6-18fps playback depending on the codec). So it's not that there is a bit of choppiness because the timeline frame rate is different from the footage frame rate. It's just really skippy.

I know about proxies and reduced timeline playback resolution, but that does not always work either.
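On the card-speed point above: the V number on an SD card is its guaranteed minimum sustained write speed in MB/s, so a recording bitrate in Mbit/s can be checked against it by dividing by 8. Here is a rough Python sketch of that arithmetic. The function name and the example bitrates are illustrative only and are not taken from any camera's spec sheet; a camera may also demand a faster card than the raw arithmetic suggests.

```python
# Guaranteed minimum sustained sequential write speed, in MB/s,
# for each SD Video Speed Class.
VIDEO_SPEED_CLASSES = {"V6": 6, "V10": 10, "V30": 30, "V60": 60, "V90": 90}


def min_video_speed_class(bitrate_mbit_s):
    """Return the lowest Video Speed Class that covers the bitrate, or None if none does."""
    required_mb_s = bitrate_mbit_s / 8  # 8 bits per byte
    for name, mb_s in sorted(VIDEO_SPEED_CLASSES.items(), key=lambda kv: kv[1]):
        if mb_s >= required_mb_s:
            return name
    return None


if __name__ == "__main__":
    for bitrate in (200, 600):  # hypothetical bitrates in Mbit/s
        print(bitrate, "Mbit/s ->", min_video_speed_class(bitrate))
```

The sketch ignores container overhead and burst behaviour, so treat the result as a lower bound.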

Steve Alexander

  • Posts: 5648
  • Joined: Mon Mar 23, 2015 2:15 am

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 4:55 pm

Set interpolation to nearest and try that - optical flow is heavy and won't play back in real time when the frame rate of your media does not match the frame rate of your timeline.

Also - if you have 422 10-bit and you don't have an Intel CPU with Quick Sync, it will be slow to play back, I believe.

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 8:48 pm

Yes, but Quick Sync only supports 10bit 4:2:2 in H265, not in H264, unfortunately :(.

The most interesting case is H264 10bit 4:2:2 anyway, which is actually 25p on a 25p timeline.

roger.magnusson

  • Posts: 3874
  • Joined: Wed Sep 23, 2015 4:58 pm

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostFri Jan 24, 2025 9:08 pm

Playback of your 10-bit long GOP 4:2:2 H.264 file is absolutely fine without issues on an M1 Pro (I didn't try the .drt), but when scrubbing you can tell that it's long GOP and not as responsive as you might want. Probably not accelerated. Apple doesn't really make it clear anywhere what the exact capabilities are of the accelerated codecs and it's a shame.

joema4

  • Posts: 436
  • Joined: Wed Feb 03, 2021 3:26 pm
  • Real Name: Joe Marler

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 12:06 am

lukasz.jozwinski wrote: I've posted 12 clips here...There are 4 H264 LGOP 10bit 4:2:2 clips, 4 H265 10bit 4:2:0 clips and 4 H265 10bit 4:2:2 clips.

There is also small simple timeline with these clips. It's the way I usually use them (and it's great for testing).

So 4 clips at the same time reduced to .5 size. None of my computers is able to play this without hiccups (especially H264 clips). I've tested on Ryzen with Nvidia RTX 4080, M1 Macbook Air and Intel (13th Gen I5) with Quick Sync and Mobile RTX4060. There are also indicators on the timeline (titles) stating which format it is and if it's graded or not...

I downloaded everything and played it on my M1 Ultra Mac Studio, running Resolve Studio 19.1.3 and MacOS Sequoia 15.2. All caching was disabled.

Forward playback: It played them all smoothly at 1x and 2x speed. The H.264 10-bit 4:2:2 clips were a little jerky at 4x forward speed, but the H.265 clips were all smooth at 4x forward speed.

Reverse playback: All clips played smoothly at 1x reverse speed. The 10-bit H.264 clips were a little jerky at 2x reverse speed, but the 10-bit H.265 clips were smooth at 2x reverse speed. They were a little jerky at 4x reverse speed.

It is definitely using video acceleration for all of the cases. That is easy to test -- just go to Resolve Preferences>System>Decode Options and turn off "Decode H.264/H.265 using hardware acceleration", relaunch Resolve and try it. The difference is vast.

While the Apple Silicon decode accelerators work well for 10-bit H.264 4:2:2, they definitely work better on H.265 (HEVC). The old view that H.265 is more sluggish to edit because it's more compressed is no longer correct -- provided you have the right hardware acceleration.

The M1 and M2 Ultra have four hardware decoders, which I think are all used in parallel on a "quad split" timeline like your test. The multiple *encoders* can apparently be used on single-stream Long GOP output, at least in the cases I've tested. Due to that difference, the M1 Max MacBook Pro can export to Long GOP faster than the M4 Pro Mac Mini, since the M1 Max has two encoders and the M4 Pro only has one.

That said, the M1 Ultra isn't that fast anymore; it's about equal to the M4 Max MacBook Pro.

roger.magnusson

  • Posts: 3874
  • Joined: Wed Sep 23, 2015 4:58 pm

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 12:18 am

You might be right but there's an unknown variable here. Does turning off "Decode H.264/H.265 using hardware acceleration" make Resolve switch from the Apple Video Toolbox framework to Resolve's own decoder? Or does it still use Video Toolbox but without hardware? I'm thinking it's the former. That still makes it a bit unclear to me whether 10-bit long GOP 4:2:2 H.264 is hardware accelerated. Like you said it's slower than the other ones.

CougerJoe

  • Posts: 599
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 12:42 am

joema4 wrote:
lukasz.jozwinski wrote:

It is definitely using video acceleration for all of the cases. That is easy to test -- just go to Resolve Preferences>System>Decode Options and turn off "Decode H.264/H.265 using hardware acceleration", relaunch Resolve and try it. The difference is vast.

While the Apple Silicon decode accelerators work well for 10-bit H.264 4:2:2, they definitely work better on H.265 (HEVC). The old view that H.265 is more sluggish to edit because it's more compressed is no longer correct -- provided you have the right hardware acceleration.



Could it be using hybrid decoding rather than full GPU decoding, possibly due to patents? Whatever the cause of the weak decoding, it's unusual.

joema4

  • Posts: 436
  • Joined: Wed Feb 03, 2021 3:26 pm
  • Real Name: Joe Marler

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 3:09 am

roger.magnusson wrote:You might be right but there's an unknown variable here. Does turning off "Decode H.264/H.265 using hardware acceleration" make Resolve switch from the Apple Video Toolbox framework to Resolve's own decoder? Or does it still use Video Toolbox but without hardware? I'm thinking it's the former. That still makes it a bit unclear to me whether 10-bit long GOP 4:2:2 H.264 is hardware accelerated. Like you said it's slower than the other ones.


I did two XCode Instruments traces during playback of the H264 LGOP 10 bit 4:2:2 test file, one with Resolve hardware decode acceleration on and another with it off. It seemed to show calls to VideoToolbox (where MacOS h/w acceleration is accessed) were active when h/w accel was on, and overall CPU levels were low.

The one with Resolve h/w accel off showed no calls to VideoToolbox, and CPU levels were high, with multiple Resolve threads doing software decoding in libavcodec.60.dylib, which is a library in the Resolve package. See attached.
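A rough cross-check outside Resolve is to time a decode-only pass with ffmpeg, once with its VideoToolbox hwaccel and once without. The sketch below assumes an ffmpeg build with VideoToolbox support; the helper names are made up for illustration. It exercises ffmpeg's decode path, not Resolve's, so it only hints at what the OS can accelerate. ffmpeg may also fall back to software decoding if the hardware path refuses the stream, so read its log before trusting a timing.

```python
import subprocess
import time


def decode_benchmark_cmd(clip_path, hwaccel=None):
    """Build an ffmpeg command that decodes the video stream and discards the frames."""
    cmd = ["ffmpeg", "-hide_banner", "-benchmark"]
    if hwaccel:
        # -hwaccel is an input option, so it has to come before -i.
        cmd += ["-hwaccel", hwaccel]
    # -an drops audio; "-f null -" throws the decoded frames away.
    cmd += ["-i", clip_path, "-an", "-f", "null", "-"]
    return cmd


def time_decode(clip_path, hwaccel=None):
    """Run the decode pass and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(decode_benchmark_cmd(clip_path, hwaccel), check=True, capture_output=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    clip = "clip.mp4"  # hypothetical file name
    print("software:    ", time_decode(clip))
    print("videotoolbox:", time_decode(clip, hwaccel="videotoolbox"))
```

A large gap between the two timings, together with low CPU load in the second run, would be consistent with the traces described above.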
Attachments
XCodeInstruments_DR_HW_Accel_On_vs_Off.jpg

roger.magnusson

  • Posts: 3874
  • Joined: Wed Sep 23, 2015 4:58 pm

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 3:17 am

Cool, thank you for doing those.

Uli Plank

  • Posts: 25450
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 3:28 am

I did some testing too, and for me it confirms that hardware decoding behaves the same for H.264 and H.265 on Apple silicon with an M1 Pro.

Test methodology (all files on a fast external TB4 drive to exclude a bottleneck there):

- Set all clips to 25 fps in the Clip Attributes, the timeline too.
Result: playing fine, as expected. It gets up to speed in the Edit page in less than a second; in Color it takes 1-2 seconds. That's what I know from all long GOP codecs.

- Set all clips to 60 fps in the Clip Attributes, the timeline too.
Result: pretty much identical behaviour.

- Set all clips to 120 fps in the Clip Attributes, the timeline too.
Result: now we see the limits of this 2021 machine. Both types of clips reach about 99 to 99.5 fps. GPU cores are pretty much at full load, CPU cores not challenged.

- Switched hardware decoding off, restarted DR, same test.
Result: all clips play at 70 to 75 fps, CPU cores pretty much fully loaded, GPU cores busy, but not at full load.

I'd call this pretty consistent behaviour for both codecs. I'd expect M4 Pro to be fine at 120 fps.
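To put those numbers in perspective, the achieved playback rate can be divided by the clip's native frame rate to get a real-time headroom factor. This is a small Python sketch of that arithmetic. It assumes a 25 fps native rate, which is what was reported earlier in the thread for the H.264 test clips; the function name is just an illustration.

```python
def realtime_headroom(achieved_fps, source_fps):
    """How many times faster than real time the clip was decoded and played."""
    if source_fps <= 0:
        raise ValueError("source_fps must be positive")
    return achieved_fps / source_fps


if __name__ == "__main__":
    native = 25  # assumed native frame rate of the test clips
    print("hardware decode:", realtime_headroom(99, native))
    print("software decode:", realtime_headroom(70, native))
```

Anything comfortably above 1.0 leaves room for grading or additional streams; how much room depends on the rest of the pipeline.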

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 8:37 am

Thanks Uli and joema4 for the tests.

Great ideas with backwards playback and speeding up the clips in the Clip Attributes section :D

It seems that the base M1 either does not have media engines or has a different version of them.

I monitored mine with asitop during the tests, with "Decode H.264/H.265 using hardware acceleration" turned on, and during H264 playback the GPU was almost idle and the CPU was 100% saturated. Playback was quite choppy at 1x.

During H265 playback the CPU was 40-60% saturated and the GPU 50-60% on ungraded clips and 80-90% on color graded ones. All were playing 25p at 1x forward speed.

So it seems that the safe choice nowadays for timelines like that is an Apple Silicon M1 Pro/Max or something newer.
Or a PC based on the new Nvidia RTX 50 Series cards. The 5090 has two 6th Gen NVDECs and 32GB of VRAM; the 5080 also has two, but half the VRAM at 16GB (and almost 2x slower VRAM). The others in the series seem to have one 6th Gen NVDEC.
The 5070 Ti has 16GB of VRAM and the 5070 just 12GB. It looks like the 5080 might be the best choice for video editing in terms of performance per dollar.

I'm kinda leaning towards Apple Silicon because of how energy efficient it is.

My current PC with an RTX 4080 takes ~100W while idling and 300W+ during renders.
My M1 MBA takes 2W tops while idling and 20W+ during renders. The M2 I tested some time ago used 30W tops during renders and had similar render speed to an RTX 2070 (maybe a bit slower).

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 9:00 am

I also tested the same timeline (4 clips at full 4K resolution) on other PC configs:

Ryzen 7 2700X + RTX 4080 PC:

H265 10bit 4:2:0 - played at full speed (1x forward). The GPU did the decoding.
H265 10bit 4:2:2 and H264 LGOP 10bit 4:2:2 - not playable (about 4 fps each). The CPU was trying to decode them while NVDEC was idle.
8bit LGOP 4:2:0 - played smoothly (NVDEC doing the work).
H264 All-Intra 10bit 4:2:2 - able to play 4 clips at once, even graded (but CPU load was up to 71%, NVDEC idle).

13th Gen i5-13500H (so with Quick Sync) + RTX 4060 Mobile laptop, with both Quick Sync and Nvidia NVDEC allowed in the preferences:

H265 10bit 4:2:0 - decoded by NVDEC and played smoothly (1x forward), both ungraded and graded.
H265 10bit 4:2:2 - decoded by Intel Quick Sync; ungraded clips played smoothly, but graded clips did not :(
H264 8bit 4:2:0 LGOP - decoded by NVDEC and smooth at 1x forward.
H264 10bit 4:2:2 LGOP - CPU doing all the decoding. It was not able to play 4 clips at once at 1x forward. 2 clips were OK (67% CPU utilization) and 3 clips were on the verge ... all ungraded.
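The results above can be collected into a small lookup table, which makes it easier to see which formats fell back to the CPU on each machine. This Python sketch only restates what was reported in this post; the machine keys and the helper name are made-up labels, and none of it is an official support matrix.

```python
# (machine, codec, bit depth, chroma) -> (decoder reported, smooth with 4 ungraded clips at 1x?)
# All entries restate observations from this thread, nothing more.
REPORTED = {
    ("ryzen2700x_rtx4080", "H265", 10, "4:2:0"): ("NVDEC", True),
    ("ryzen2700x_rtx4080", "H265", 10, "4:2:2"): ("CPU", False),
    ("ryzen2700x_rtx4080", "H264-LGOP", 10, "4:2:2"): ("CPU", False),
    ("ryzen2700x_rtx4080", "H264-AllIntra", 10, "4:2:2"): ("CPU", True),
    ("i5_13500h_rtx4060m", "H265", 10, "4:2:0"): ("NVDEC", True),
    ("i5_13500h_rtx4060m", "H265", 10, "4:2:2"): ("Quick Sync", True),
    ("i5_13500h_rtx4060m", "H264-LGOP", 8, "4:2:0"): ("NVDEC", True),
    ("i5_13500h_rtx4060m", "H264-LGOP", 10, "4:2:2"): ("CPU", False),
}


def software_decoded(machine):
    """List the formats that were reported as decoded on the CPU for one machine."""
    return sorted(
        f"{codec} {depth}bit {chroma}"
        for (m, codec, depth, chroma), (decoder, _smooth) in REPORTED.items()
        if m == machine and decoder == "CPU"
    )


if __name__ == "__main__":
    for machine in ("ryzen2700x_rtx4080", "i5_13500h_rtx4060m"):
        print(machine, "->", software_decoded(machine))
```

On both machines, H264 LGOP 10bit 4:2:2 is the format that ended up on the CPU and did not play smoothly.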

Uli Plank

  • Posts: 25450
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 9:35 am

Hi Lukasz,

if you write "MBA" that's a MacBook Air, right? I think those were kind of stripped down regarding such hardware, maybe because they have no fans. Pun intended, as long as we refer to UHD video ;-)

Regarding PCs, your results pretty much confirm what we have observed: until the newest generation they didn't decode all long GOP varieties in dedicated hardware. Massive Threadrippers can play the unsupported ones on the CPU.

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 10:57 am

Yes. MBA is MacBook Air ... mine is M1. I wonder if all MacBook Airs are stripped down like that. They usually offer two versions of the base M processor for MBAs: binned and not binned.

But I had an M2 base Mac Mini (which I sent back after 10 days), and it was not decoding LGOP smoothly. Even 8bit 4:2:0 from a Sony A6600 was quite choppy (but it was a different version of DR, so maybe DR is much better optimised now). But maybe all base Apple Silicon processor versions (M1-M4) have a stripped-down media engine.

I have to search around to confirm this. Apple claims on their MacBook Air page that they have media engines, but maybe they are limited to 8bit codecs, and that's why Apple does not share their Media Engines specs ... Apple is known to use tricks like that to upsell higher models and configs.

kfriis

  • Posts: 600
  • Joined: Sun Oct 10, 2021 10:14 am
  • Real Name: Kurt Friis Hansen

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 1:58 pm

SteveMullen wrote:Unless your camera uses h.265 main10 you are not going to get 10-bit source files. And, how many cameras record 4:2:2? And, how many cameras record 4:2:2 with 10-bits?

So the question still makes no sense to me. Perhaps if he had stated why he asked the question.


Why not?

Any Apple Silicon Mx (including Pro, Max and Ultra) has hardware support for both h264 and h265, and M1 Pro or newer also supports ProRes 422 in various variants.

Many (real) cameras allow recording in h265 10-bit 4:2:2 - one is the really small, light and relatively low cost travel camera Fujifilm X-M5, which I used in December 2024 to record 6.2k open gate (internally) footage of "Christmas Lights" and events in Córdoba and Madrid, Spain.

6.2k will probably need Resolve Studio, but the 4k footage should work fine on the free version (I have Studio, so it would be nice if someone else could confirm the free version's usability). Final Cut Pro 10.x or 11.0 digests 6.2k open gate without a hiccup on an ancient 14" MacBook Pro with M1 Pro.

Apple's iPhone 15 and 16 Pro also allow recording h265 4k 10-bit 4:2:2 (I use Cinema P3 - not free, but reasonably priced) in 200 megabit/sec HLG or Log format, so... these are just a few of many options (and the iPhones are probably "out there" in far more than a hundred million hands by now).

I have recorded h265 footage in South Korea (Incheon and Seoul, in-flight, airports and about) and edited plus rendered in both Davinci Resolve Studio up to and including 19.1.3 and Final Cut Pro up to and including version 11.0.

No problems on that front.

By the way: h265 4k 10-bit 4:2:2 HLG at 200 megabit/sec allows for both high quality and quick turnaround, if required. Rendering to h265 is no problem in either FCPX or Resolve Studio (someone else must vouch for the free version).

So... just go for it.

Regards

P.S. Just discovered, that I answered an old thread. Sorry. But the information stands.

Uli Plank

  • Posts: 25450
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 3:09 pm

The thread may be old, Kurt, but it was recently reactivated by the question of which hardware supports H.264 in 10 bit.

joema4

  • Posts: 436
  • Joined: Wed Feb 03, 2021 3:26 pm
  • Real Name: Joe Marler

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 5:52 pm

lukasz.jozwinski wrote:...Although apple on their Macbook Air page claims they have media engines, but maybe they are limited to 8bit codecs, and that's why Apple does not share their Media Engines specs ... Apple is known to use tricks like that to upsell higher models and configs.


The base M1 has a media engine that accelerates H.264 and H.265 encode/decode. This is essentially the same media engine as the M1 Pro, M1 Max and M1 Ultra, except those have more units.

I think the M1 does *not* have a ProRes engine, which is distinct from the regular media engine, but the M1 Pro has one ProRes engine, the M1 Max has two and the M1 Ultra has four ProRes engines, which are all separate from the "Media Engines" that handle Long GOP.

The M1 CPU used in MacBooks is the same chip (with the same media engines) used in the M1 iPad Pro and 5th gen. iPad Air. Apple's own site says those have an encode and decode engine.

Back when the M1 was first released, there were some incorrect 3rd-party articles published that claimed the base M1 had no media engine whatsoever. Those were wrong because die shots were published, and you can see it with your own eyes. See attached.

If you have an M1 that doesn't seem to decode 10-bit 4:2:2 using Resolve Studio 19.x, do a test with all effects turned off (incl. retiming, rate conforming, etc.), once with Resolve's hardware decode acceleration turned on and once with it turned off. There should be a huge difference in performance.
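A similar on/off comparison can be run outside Resolve with FFmpeg. This is only a sketch: it times FFmpeg's decoder, not Resolve's, so it shows what the OS and hardware can do rather than what Resolve actually uses. The helper names are made up for illustration. `videotoolbox` is FFmpeg's hardware acceleration name on macOS; whether your FFmpeg build accelerates a given 10-bit 4:2:2 file is an assumption to verify, and FFmpeg may quietly fall back to software decoding when it cannot, in which case both timings will look alike.

```python
import shutil
import subprocess
import sys
import time


def build_decode_cmd(path, hwaccel=None):
    """Build an ffmpeg command that decodes the video and discards the frames."""
    cmd = ["ffmpeg", "-hide_banner", "-v", "error"]
    if hwaccel:
        # -hwaccel is an input option, so it has to come before -i
        cmd += ["-hwaccel", hwaccel]
    cmd += ["-i", path, "-an", "-f", "null", "-"]
    return cmd


def time_decode(path, hwaccel=None):
    """Return the wall-clock seconds ffmpeg needs to decode the whole file."""
    start = time.perf_counter()
    subprocess.run(build_decode_cmd(path, hwaccel), check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Usage: python decode_timing.py clip.mov
    if len(sys.argv) == 2 and shutil.which("ffmpeg"):
        clip = sys.argv[1]
        print("software decode, seconds:", time_decode(clip))
        print("videotoolbox decode, seconds:", time_decode(clip, "videotoolbox"))
```

On a Windows machine with an NVIDIA card, `cuda` would be the hwaccel name to try instead of `videotoolbox`.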
Attachment: M1_M1Pro_M1Max_DieShots.jpg (780.01 KiB)

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSat Jan 25, 2025 8:04 pm

@joema4

The attached picture shows that the M1 Pro and M1 Max have H264/H265 decoding explicitly listed for their media engines, while the M1 does not. I'm not claiming it lacks that functionality, but it definitely cannot decode 4 streams of H264 at the same time in DR :D, so it doesn't really matter whether it has hardware decode or not, since it can't decode the 4 streams I'm looking for :D. But since it's missing ProRes decoding, maybe it's also missing 10bit 4:2:2 decode for H264?

It looks like the RTX 5080 has two 6th-gen NVDECs like the RTX 5090, and at a retail price of $999 it might be the best choice for hardware H264/H265 decoding. The power specs of the 5080 seem similar to the power requirements of the RTX 4080, so it should be possible to just swap them without changing the whole PC.

BTW, I'm actually encoding almost every video to H265 with my RTX 4080 (this means Main10, so probably 4:2:0 10bit) to save space, because my internet connection is so slow that uploading H264 takes a whole day :D.

RTCool

  • Posts: 3
  • Joined: Mon Jun 15, 2020 10:21 pm
  • Real Name: Robert Coolidge

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSun Jan 26, 2025 10:13 pm

Pardon me for asking a side question: I've been using the free version of DR for a while, starting on an Intel MacBook Pro, and then some months back, I upgraded to an M1 Max with 64GB of RAM. Do I need to upgrade to the Studio (paid) version to get full use of the hardware capabilities?

Uli Plank

  • Posts: 25450
  • Joined: Fri Feb 08, 2013 2:48 am
  • Location: Germany and Indonesia

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostTue Jan 28, 2025 1:51 am

Not for decoding or encoding.
But then, many advanced features will either be available only in Studio or be faster. IMHO, it doesn't make that much sense to buy such a pretty expensive machine and not spend those extra 300 bucks to get the most out of it.
To work right away with codec flavours not available in the free version on Windows, just go ahead.

joema4

  • Posts: 436
  • Joined: Wed Feb 03, 2021 3:26 pm
  • Real Name: Joe Marler

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostTue Jan 28, 2025 3:52 pm

lukasz.jozwinski wrote:....my basic M1 does not decode H264 10bit 4:2:2 via media engines...but I do not have a clear confirmation they decode H264 LGOP 10bit 4:2:2 via media engines. My M1 MBA seems to decode them via CPU and struggles hard...definitely cannot decode 4 streams of H264 at the same time in DR...so it actually does not matter if it has the hardware decode or not since it's not able to decode 4 streams I'm looking for

I was confused about your questions, because for several posts you only asked whether the base M1 can do hardware decoding of H264 10-bit 4:2:2. After we spent lots of time running many tests, you then said the issue (I think) is can the M1 hardware decode *four* simultaneous streams of 4k/23.98 H264 10-bit 4:2:2. Or did you mean can the M4 do that? Was that a specific requirement for your planned workflow, or was that just a test? If you plan on doing any multicam editing, I think it's a good test.

There is a big difference between whether a given CPU can do hardware decoding of a given codec vs whether it can decode 4 streams concurrently with good performance.
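To put a rough number on that difference, here is a back-of-the-envelope sketch of how the decode load scales with stream count and playback speed. The function name is made up for illustration, and it assumes "4k" means 3840x2160. Real decoder limits also depend on codec, bit depth and chroma format, so treat this as a scaling illustration only.

```python
def pixel_rate(width, height, fps, streams=1, speed=1.0):
    """Pixels per second the decoder must deliver for smooth playback."""
    return width * height * fps * streams * speed


if __name__ == "__main__":
    one = pixel_rate(3840, 2160, 23.976)
    four_at_2x = pixel_rate(3840, 2160, 23.976, streams=4, speed=2.0)
    print("one stream, pixels/s:", one)
    print("four streams at 2x, pixels/s:", four_at_2x)
    print("ratio:", four_at_2x / one)
```

Four streams at 2x speed ask the decoder for eight times the work of a single stream at normal speed, which is why "can it hardware decode this codec" and "can it play my multicam timeline" are separate questions.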

I just tested an M4 Pro Mac Mini, and it can do smooth hardware decoding of all four scaled 4k/23.98 Long GOP streams in your test project at 2x forward speed, on Resolve Studio 19.1.3.

I agree the RTX 5080 might be a good choice, provided the OS, driver and application software layers support parallel decode and encode. Just having the hardware doesn't mean the software stack supports that. E.g., the M1 Ultra Mac Studio had four decode/encode units, but it wasn't until years later that macOS, FCP (and I think Resolve) supported parallel decode/encode of a single stream.

I believe multi-stream parallel decode/encode was supported earlier, but you didn't see the benefit unless running a scenario like your test. So whether the 5080 on Resolve will support that upon release, I don't know.

CougerJoe

  • Posts: 599
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostWed Jan 29, 2025 2:16 am

joema4 wrote:
I believe multi-stream parallel decode/encode was supported earlier, but you didn't see the benefit unless running a scenario like your test. So whether the 5080 on Resolve will support that upon release, I don't know.


Nvidia advertises 8 simultaneous streams of 4K30 4:2:2 10-bit per NVDEC decoder, but as with anything from Nvidia, their numbers can't necessarily be believed. That may be true for a certain HEVC encoding but not necessarily for H.264. They're saying H.264 decoding is 2x faster now, so maybe the 8 simultaneous streams figure holds for both H.264 and H.265. The 5070 Ti/5080/5090 have 2 decoders, so they are potentially twice as fast.

When playing back a single file, equal amounts of decoding come from both decoders, so multiple decoders look to be transparent to the operating system.

joema4

  • Posts: 436
  • Joined: Wed Feb 03, 2021 3:26 pm
  • Real Name: Joe Marler

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostWed Jan 29, 2025 2:27 pm

CougerJoe wrote:...When playing back a single file, equal amounts of decoding come from both decoders, so multiple decoders look to be transparent to the operating system.

What scenario are you talking about here? A demonstration someone did?

In general, it is technically very difficult to parallelize decoding of a single Long GOP video stream. H.264 and H.265 streams can be encoded with either open or closed GOPs. In an open GOP, some frames reference frames in an adjacent GOP, so the GOPs cannot be decoded independently.

Even if a parser determines they are closed GOPs and locates the GOP boundaries, the next stage would require an application thread per decoder. The driver does not magically handle that. The application layer must do thread management, synchronization and exercise caution on all thread safety issues.
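As a toy illustration of that application-side work, here is a sketch in Python. Everything in it is hypothetical: `fake_decode` stands in for a real hardware decode session, and the frames are just `(frame_id, is_key)` pairs in decode order. It assumes closed GOPs, so each GOP can be handed to a worker on its own.

```python
from concurrent.futures import ThreadPoolExecutor


def split_closed_gops(frames):
    """Group (frame_id, is_key) pairs, given in decode order, into GOPs.

    Each keyframe starts a new GOP. This is only valid for closed GOPs,
    where no frame references anything outside its own GOP.
    """
    gops = []
    for frame_id, is_key in frames:
        if is_key or not gops:
            gops.append([])
        gops[-1].append(frame_id)
    return gops


def fake_decode(gop):
    """Hypothetical stand-in for one decoder working through one GOP."""
    return [f"decoded:{frame_id}" for frame_id in gop]


def decode_parallel(frames, num_decoders=2):
    """Decode GOPs on a pool of workers and return frames in original order."""
    gops = split_closed_gops(frames)
    with ThreadPoolExecutor(max_workers=num_decoders) as pool:
        # pool.map yields results in input order, whichever worker finishes first
        results = list(pool.map(fake_decode, gops))
    return [frame for gop in results for frame in gop]


if __name__ == "__main__":
    stream = [(0, True), (1, False), (2, False), (3, True), (4, False)]
    print(split_closed_gops(stream))
    print(decode_parallel(stream))
```

Even in this toy version the application has to find the GOP boundaries, hand out the work and put the results back in order. A real implementation would also have to manage decoder sessions, memory and seeking.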

Note NVDEC is different than NVENC. On the encode side, there is more automation and the driver by itself can do more to offload the app.

To verify parallel NVDEC activity, you'd need to use a lower-level tool like Nsight Systems. It is free but might require a free NVIDIA developer account to download it. It is vaguely similar to the Xcode Instruments tool I used to post the previous graphs. I'd use Nsight myself, but I don't have a Windows machine: https://developer.nvidia.com/nsight-systems

CougerJoe

  • Posts: 599
  • Joined: Wed Sep 18, 2019 5:15 am
  • Real Name: bob brady

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostWed Jan 29, 2025 2:46 pm

joema4 wrote:
CougerJoe wrote:...When playing back a single file, equal amounts of decoding come from both decoders, so multiple decoders look to be transparent to the operating system.

What scenario are you talking about here? A demonstration someone did?

In general, it is technically very difficult to parallelize decoding of a single Long GOP video stream. H.264 and H.265 streams can be encoded with either open or closed GOPs. In an open GOP, some frames reference frames in an adjacent GOP, so the GOPs cannot be decoded independently.



This guy tries out some different files in Resolve with a 5090. Why do you think the 4k30 file didn't use GPU decode? Would it be open GOP, given your explanation of why that should not work with a dual decoder? Although shouldn't it then fall back to using a single decoder? Maybe a Resolve bug?


joema4

  • Posts: 436
  • Joined: Wed Feb 03, 2021 3:26 pm
  • Real Name: Joe Marler

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostWed Jan 29, 2025 4:39 pm

CougerJoe wrote:This guy tries out some different files in Resolve with a 5090. Why do you think the 4k30 file didn't use GPU decode? Would it be open GOP, given your explanation of why that should not work with a dual decoder? Although shouldn't it then fall back to using a single decoder? Maybe a Resolve bug?..


Excellent question. If you or anyone else wants to check whether your Long GOP videos use open or closed GOPs, here is a Python script for that. It requires FFmpeg and ffprobe. This is needed because no other metadata utility tells you that: https://www.dropbox.com/scl/fi/5m6h8nil ... 3ldol&dl=0

Based on that one test you linked, it appears the multiple encoders on the new RTX 50-series GPU work well (at least on Premiere). However, parallel decoding on a single Long GOP file might only work for "closed" GOPs. If so, this Python script should show whether the GOP structure is open or closed.

There are instructions in the .py file for setup on macOS and Windows. I tested it on macOS and it works, but I can't test it on Windows.

If it shows an error, let me know, and I'll try to fix it.
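For readers who only want the general idea, here is a much simpler heuristic. It is not the linked script, and its function names are made up for illustration. It assumes you have already dumped packet timestamps in decode order with something like `ffprobe -v error -select_streams v:0 -show_entries packet=pts,flags -of csv=p=0 clip.mp4`. The assumption is that a non-key packet which follows a keyframe in decode order but is displayed before it is a sign of an open GOP. It can misjudge some streams, so treat the result as a hint.

```python
def parse_packets(csv_text):
    """Parse 'pts,flags' lines into (pts, is_key) tuples, kept in decode order."""
    packets = []
    for line in csv_text.splitlines():
        fields = line.strip().split(",")
        if len(fields) < 2 or not fields[0].lstrip("-").isdigit():
            continue  # skip blank lines and packets without a numeric pts
        # ffprobe marks keyframe packets with a 'K' in the flags field
        packets.append((int(fields[0]), "K" in fields[1]))
    return packets


def looks_open_gop(packets):
    """Heuristic: True if any non-key packet that comes after a keyframe in
    decode order has an earlier presentation time than that keyframe."""
    last_key_pts = None
    for pts, is_key in packets:
        if is_key:
            last_key_pts = pts
        elif last_key_pts is not None and pts < last_key_pts:
            return True
    return False


if __name__ == "__main__":
    sample = "0,K_\n3,__\n1,__\n2,__\n6,K_\n4,__\n5,__\n"
    print("open GOP suspected:", looks_open_gop(parse_packets(sample)))
```

The sample at the bottom is synthetic data, not output from a real file.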

lukasz.jozwinski

  • Posts: 17
  • Joined: Sat Feb 01, 2020 10:10 pm
  • Real Name: Lukasz Jozwinski

Re: CPU for editing 10 bit 4:2:2 h.264/265

PostSun Feb 02, 2025 1:59 pm

joema4 wrote:I was confused about your questions, because for several posts you only asked whether the base M1 can do hardware decoding of H264 10-bit 4:2:2. After we spent lots of time running many tests, you then said the issue (I think) is can the M1 hardware decode *four* simultaneous streams of 4k/23.98 H264 10-bit 4:2:2. Or did you mean can the M4 do that? Was that a specific requirement for your planned workflow, or was that just a test? If you plan on doing any multicam editing, I think it's a good test.

There is a big difference between whether a given CPU can do hardware decoding of a given codec vs whether it can decode 4 streams concurrently with good performance.

I just tested an M4 Pro Mac Mini, and it can do smooth hardware decoding of all four scaled 4k/23.98 Long GOP streams in your test project at 2x forward speed, on Resolve Studio 19.1.3.

I agree the RTX 5080 might be a good choice, provided the OS, driver and application software layers support parallel decode and encode. Just having the hardware doesn't mean the software stack supports that. E.g., the M1 Ultra Mac Studio had four decode/encode units, but it wasn't until years later that macOS, FCP (and I think Resolve) supported parallel decode/encode of a single stream.

I believe multi-stream parallel decode/encode was supported earlier, but you didn't see the benefit unless running a scenario like your test. So whether the 5080 on Resolve will support that upon release, I don't know.


I asked generally whether the base M1 can decode LGOP 10bit 4:2:2 via media engines. I guess I have to use the method you used (Xcode) to see if it calls the media engines or a software library. Not sure how to do it yet :D. I have zero experience with Xcode :D.

But I really often use 4 or more streams on the screen at the same time, usually to compare lighting with different lights/modifiers, or to compare different focal lengths (stuff like that). The most streams I have had on a screen was 10 :D.

The tests you both ran were really useful for me. Now I know I need an RTX 5080 or a Mac with M1 Pro/Max or better :D.
At the moment the 5080 seems to be scarce; I see it listed for 2x Nvidia's suggested price. I will wait a bit and see. If the prices don't drop, a used M1 Max MacBook will be the better choice. At $999 the RTX 5080 is better for me (cheaper).

I think reviewers got some Nvidia documentation together with their sample cards, because they were spewing some specs which are not available on Nvidia's official website yet.
It seems 6th-gen NVDECs should be able to decode 4 streams of 4k25p easily, and the 5080/5090 have 2 of them. If I remember correctly, even the 20 series was able to decode 7-8 streams of LGOP at once, but that was 8bit 4:2:0. Maybe it was able to decode more, but my storage was saturated when I tested this.

Someone said the 5070 Ti has 2 decoders ... Nvidia's page says it has ONE. Only the 5090 and 5080 have two.