Decklink 8k pro blackmagic-io: AER: can't recover


andrshab

  • Posts: 2
  • Joined: Fri Mar 22, 2024 10:46 am
  • Real Name: Andrew Shab

Decklink 8k pro blackmagic-io: AER: can't recover

Posted: Fri Mar 22, 2024 11:06 am

Hi!

Our setup:

[Attached image: 2024-03-22 14.01.35.jpg]


The video source, PC and monitor are grounded.

Issue:

We randomly see kernel errors like these:

- First error type:

Code: Select all
pcieport 0005:00:00.0: AER: Corrected error received: 0005:00:00.0


- Second error type:

Code: Select all
pcieport 0005:00:00.0: AER: Uncorrected (Fatal) error received: 0005:00:00.0
pcieport 0005:00:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Receiver ID)
blackmagic-io 0005:01:00.0: AER: can't recover (no error_detected callback)



After the first type of error there is no effect on the capture card; it continues to work.
After the second type of error the capture card stops capturing, and only a PC reboot helps.
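
To tell the two cases apart automatically, something like the following could scan the kernel log. This is only a sketch based on the message formats quoted above; the function name `classify_aer_line` and the labels it returns are our own choices for illustration, not part of any kernel or Blackmagic API.

```python
import re
from typing import Optional

# Patterns copied from the kernel messages quoted above.
CORRECTED = re.compile(r"AER: (?:Multiple )?Corrected error received")
UNCORRECTED = re.compile(
    r"AER: (?:Multiple )?Uncorrected \((Fatal|Non-Fatal)\) error received"
)
CANT_RECOVER = re.compile(r"AER: can't recover")


def classify_aer_line(line: str) -> Optional[str]:
    """Return a hypothetical label for one kernel log line, or None."""
    if CANT_RECOVER.search(line):
        return "cant_recover"
    m = UNCORRECTED.search(line)
    if m:
        return "fatal" if m.group(1) == "Fatal" else "nonfatal"
    if CORRECTED.search(line):
        return "corrected"
    return None


def capture_probably_dead(lines) -> bool:
    """True if any line reports that a driver could not recover.

    Assumption: in our logs this message coincides with the card
    no longer capturing; that is an observation, not a guarantee.
    """
    return any(classify_aer_line(line) == "cant_recover" for line in lines)
```

In our case the "corrected" label would only be counted, while `capture_probably_dead` returning True would be the point where we currently have to reboot.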


We have observed several cases that lead to these errors:

1) The frequency of these errors depends on the PCIe riser. With some risers they occur several times per second, with others about once every several days.
2) With the SDI cable connected, errors can occur at random after the video source is enabled.
3) With the SDI cable disconnected from the video source but connected to the DeckLink, errors can occur when we bring the cable close to the video source without even connecting it. This is more likely if we walk on carpet first.
4) Our setup has a metal enclosure. The DeckLink is screwed to it via a metal mounting plate that fits over the array of SDI connectors. If we touch the enclosure by hand, these errors can appear.

We have some hypotheses about what the problem could be:

1) Poor DeckLink grounding. How should the DeckLink 8K Pro be grounded properly?
2) A ground loop. Could you please suggest how to avoid one?
3) The SDI cable has an unsuitable impedance. Which impedance should we use, and could this be the cause?
4) The video source is electrically incompatible with the DeckLink. However, the error also occurs when we use a Blackmagic UpDownCross HD as the video source. Could this be the cause?

We want to use the DeckLink 8K Pro in a medical application, so a stable device is very important. Could you please help us resolve this issue?



Decklink info:

0005:00:00.0 PCI bridge: NVIDIA Corporation Device 229a (rev a1)
0005:01:00.0 Multimedia video controller: Blackmagic Design DeckLink 8K Pro

Here is the full log with these errors. Note that device 0005:00:00.0 reports 'device recovery successful', but blackmagic-io 0005:01:00.0 does not.

Code: Select all
Mar 22 11:48:09 [11625.216954] pcieport 0005:00:00.0: AER: Multiple Corrected error received: 0005:00:00.0
Mar 22 11:48:09 [11625.216981] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Mar 22 11:48:09 [11625.226962] pcieport 0005:00:00.0:   device [10de:229a] error status/mask=00000001/0000e000
Mar 22 11:48:09 [11625.235770] pcieport 0005:00:00.0:    [ 0] RxErr
Mar 22 11:48:09 [11625.242151] pcieport 0005:00:00.0: AER: Multiple Corrected error received: 0005:00:00.0
Mar 22 11:48:09 [11625.242181] pcieport 0005:00:00.0: AER: can't find device of ID0000
Mar 22 11:48:10 [11625.899847] pcieport 0005:00:00.0: AER: Corrected error received: 0005:00:00.0
Mar 22 11:48:10 [11625.899859] pcieport 0005:00:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Mar 22 11:48:10 [11625.909730] pcieport 0005:00:00.0:   device [10de:229a] error status/mask=00000001/0000e000
Mar 22 11:48:10 [11625.918322] pcieport 0005:00:00.0:    [ 0] RxErr
Mar 22 11:48:10 [11625.950376] pcieport 0005:00:00.0: AER: Uncorrected (Non-Fatal) error received: 0005:00:00.0
Mar 22 11:48:10 [11625.950389] pcieport 0005:00:00.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
Mar 22 11:48:10 [11625.967262] pcieport 0005:00:00.0:   device [10de:229a] error status/mask=00004020/00400000
Mar 22 11:48:10 [11625.975844] pcieport 0005:00:00.0:    [ 5] SDES
Mar 22 11:48:10 [11625.982124] pcieport 0005:00:00.0:    [14] CmpltTO                (First)
Mar 22 11:48:10 [11625.989116] blackmagic-io 0005:01:00.0: AER: can't recover (no error_detected callback)
Mar 22 11:48:10 [11625.989152] pcieport 0005:00:00.0: AER: device recovery failed
Mar 22 11:48:10 [11625.989154] pcieport 0005:00:00.0: AER: Multiple Uncorrected (Fatal) error received: 0005:00:00.0
Mar 22 11:48:10 [11625.989176] pcieport 0005:00:00.0: PCIe Bus Error: severity=Uncorrected (Fatal), type=Transaction Layer, (Requester ID)
Mar 22 11:48:10 [11626.000291] pcieport 0005:00:00.0:   device [10de:229a] error status/mask=00004020/00400000
Mar 22 11:48:10 [11626.008883] pcieport 0005:00:00.0:    [ 5] SDES
Mar 22 11:48:10 [11626.015172] pcieport 0005:00:00.0:    [14] CmpltTO                (First)
Mar 22 11:48:10 [11626.022170] blackmagic-io 0005:01:00.0: AER: can't recover (no error_detected callback)
Mar 22 11:48:11 [11627.043821] pcieport 0005:00:00.0: AER: Root Port link has been reset
Mar 22 11:48:11 [11627.043881] pcieport 0005:00:00.0: AER: device recovery successful
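
For comparing risers it may help to summarize a log like the one above instead of reading it by eye. The following is a rough sketch, assuming the line formats shown in this log; `decode_status` and `summarize` are hypothetical helper names, and only the bit names that actually appear in our log (RxErr, SDES, CmpltTO) are mapped, with any other bit reported by number rather than guessed.

```python
import re
from collections import Counter

# Bit names exactly as printed in the log above.
CORRECTABLE_BITS = {0: "RxErr"}
# SDES = Surprise Down Error, CmpltTO = Completion Timeout.
UNCORRECTABLE_BITS = {5: "SDES", 14: "CmpltTO"}

SEVERITY_RE = re.compile(r"PCIe Bus Error: severity=([^,]+),")
STATUS_RE = re.compile(r"error status/mask=([0-9a-fA-F]{8})/([0-9a-fA-F]{8})")


def decode_status(status_hex, names):
    """List the names of the bits set in a hex AER status word."""
    status = int(status_hex, 16)
    return [names.get(bit, f"bit{bit}") for bit in range(32) if status & (1 << bit)]


def summarize(lines):
    """Count events per severity and per status bit.

    Each status/mask line is attributed to the most recent severity line.
    """
    severities = Counter()
    flags = Counter()
    current = None
    for line in lines:
        m = SEVERITY_RE.search(line)
        if m:
            current = m.group(1)
            severities[current] += 1
            continue
        m = STATUS_RE.search(line)
        if m and current is not None:
            names = CORRECTABLE_BITS if current == "Corrected" else UNCORRECTABLE_BITS
            flags.update(decode_status(m.group(1), names))
    return severities, flags
```

For example, `decode_status("00004020", UNCORRECTABLE_BITS)` gives `["SDES", "CmpltTO"]`, which matches the two flags the kernel prints under that status word.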

Re: Decklink 8k pro blackmagic-io: AER: can't recover

Posted: Fri Mar 22, 2024 4:08 pm

Also, how can we reset the DeckLink 8K Pro programmatically after such errors? We have tried:
Code: Select all
echo 1 | sudo tee /sys/bus/pci/devices/0005:01:00.0/remove
echo 1 | sudo tee /sys/bus/pci/rescan

but this did not help, and the OS hung after several remove/rescan cycles.
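
For reference, this is roughly how the same sequence could be scripted so that each attempt at least reports whether the device comes back. It is only a sketch of the two sysfs writes above, not a fix: the function name `remove_and_rescan` and the `sysfs_root`, `settle_s` and `timeout_s` parameters are our own illustrative choices, the delays are guesses, and on our system the underlying sequence still led to hangs.

```python
import os
import time


def remove_and_rescan(bdf, sysfs_root="/sys", settle_s=1.0, timeout_s=5.0):
    """Replay the remove/rescan writes and report whether the device reappears.

    Needs root on a real system. sysfs_root exists only so the function
    can be pointed at a test directory instead of the real /sys.
    """
    device_dir = os.path.join(sysfs_root, "bus", "pci", "devices", bdf)

    # Same as: echo 1 > /sys/bus/pci/devices/<bdf>/remove
    with open(os.path.join(device_dir, "remove"), "w") as f:
        f.write("1")

    time.sleep(settle_s)  # arbitrary pause before rescanning

    # Same as: echo 1 > /sys/bus/pci/rescan
    with open(os.path.join(sysfs_root, "bus", "pci", "rescan"), "w") as f:
        f.write("1")

    # Poll until the device directory is present again or we give up.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if os.path.isdir(device_dir):
            return True
        time.sleep(0.1)
    return os.path.isdir(device_dir)
```

On the real machine this would be called as `remove_and_rescan("0005:01:00.0")`.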
