HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Thu Nov 07, 2019 1:21 am
by Adriano Castaldini
I bought a HighPoint SSD7101A-1 to build an NVMe RAID0 with 4x 970 EVOs.
I correctly built the RAID with the WebGUI app and formatted it in HFS+.
After a while (even when the machine is idle) the RAID fails unexpectedly: the alarm sounds and WebGUI reports “Disk 'Samsung SSD 970 EVO 2TB-S464NB0M400143A' at Controller1-Enclosure1-Device3 failed”. One could call it a drive failure, but every time the problem happens WebGUI reports a different failed drive! And if I simply restart the OS, the RAID is back again (with all the files) and seems to work fine. (I can only suppose that if it were actually a drive failure, I would hear the alarm immediately after restarting the OS.)
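
To check whether the reported drive really changes from one failure to the next, the WebGUI messages can be tallied per slot. This is only a minimal sketch, assuming the events have been copied out as plain text lines in the same format as the message quoted above. Only the first sample line comes from this post; the other lines, and the function name `count_failures_by_slot`, are made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the single WebGUI message quoted in the post.
EVENT_RE = re.compile(
    r"Disk '(?P<disk>[^']+)' at "
    r"Controller(?P<controller>\d+)-Enclosure(?P<enclosure>\d+)-Device(?P<device>\d+) failed"
)


def count_failures_by_slot(lines):
    """Return a Counter mapping (controller, enclosure, device) to failure count."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            slot = (
                int(match["controller"]),
                int(match["enclosure"]),
                int(match["device"]),
            )
            counts[slot] += 1
    return counts


# The first line is the message from the post; the rest are hypothetical examples.
sample_events = [
    "Disk 'Samsung SSD 970 EVO 2TB-S464NB0M400143A' at Controller1-Enclosure1-Device3 failed",
    "Disk 'Samsung SSD 970 EVO 2TB-EXAMPLE0000001' at Controller1-Enclosure1-Device1 failed",
    "Disk 'Samsung SSD 970 EVO 2TB-S464NB0M400143A' at Controller1-Enclosure1-Device3 failed",
    "Some unrelated event line",  # hypothetical non-failure line, ignored by the pattern
]

failures = count_failures_by_slot(sample_events)
for slot, n in sorted(failures.items()):
    print(f"Controller{slot[0]}-Enclosure{slot[1]}-Device{slot[2]}: {n} failure(s)")
```

If the failures are spread over several slots rather than concentrated on one, that points away from a single bad drive.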

So, what can it be? A problem with the controller's firmware/BIOS? A loose connection of the NVMe plugs inside the controller? Is the controller damaged? Or is one of the drives actually damaged?

I can't figure out how to solve this issue.
My hardware is a hackintosh with 2x Radeon VII, OSX on a 1TB 970 EVO, footage on the PCIe RAID (4x 2TB 970 EVOs on the HighPoint), 64GB RAM, GA-X99P-SLI, i7-6950X, DeckLink Mini Monitor 4K, two 8-drive TB2 RAIDs (G-Speed), Davinci 16.1.1 on Mojave.

Can someone please help me?

Thanks a lot.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sat Nov 09, 2019 12:03 pm
by Mark Foster
maybe the drives are running too hot

what does the WebGUI say under SHI?

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sun Nov 10, 2019 5:28 pm
by Adriano Castaldini
Thanks Mark for your reply.
I monitored SHI over the last couple of days. I would have liked to see a disk failure in order to know at which temperature it happens, but “unfortunately” the disks never failed this time! BUT... above 120°F apps tend to crash. (WebGUI sets the temperature threshold at 140°F.)
Anyway, these are the temps:
1. Without the cover heatsink, during a heavy routine the disks easily reach 129°F (by heavy routine I mean: Blackmagic Speed Test in a loop + VLC playing a ProRes 4444 file + writing a 256GB RAW video file + playing it with MlRawViewer);
2. With the cover heatsink, during the same heavy routine, the disks never exceed 105°F;
3. With the cover heatsink, Davinci previews a heavily graded RAW on the fly (with massive DeNoise, nodes, layers, and some effects) and plays it in a loop in real time (24fps) thanks to a couple of Radeon VII cards, and after about 10 minutes SHI reports 123°F for the disks.
After this last routine, Davinci didn't quit correctly: I had to restart the OS, so I consider this a crash-like event.
Also, after the first routine VLC froze and I needed to restart the OS.
My conclusion is that, in spite of the 140°F threshold set in SHI, above 120°F the disks don't fail but the system does!
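
For anyone more used to Celsius, the readings above can be converted and compared against the WebGUI threshold. This is just a small sketch; the labels and the helper names `f_to_c` and `summarize` are my own, and only the temperatures come from the list above.

```python
def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32) * 5 / 9


WEBGUI_THRESHOLD_F = 140  # threshold set by WebGUI, as stated in the post

# Readings reported in the post, in degrees Fahrenheit.
readings_f = {
    "no heatsink, heavy routine": 129,
    "heatsink, heavy routine": 105,
    "heatsink, Davinci playback": 123,
}


def summarize(readings, threshold_f=WEBGUI_THRESHOLD_F):
    """Return (label, temp_f, temp_c, margin_to_threshold_f) for each reading."""
    rows = []
    for label, temp_f in readings.items():
        rows.append((label, temp_f, round(f_to_c(temp_f), 1), threshold_f - temp_f))
    return rows


print(f"threshold: {WEBGUI_THRESHOLD_F}°F = {f_to_c(WEBGUI_THRESHOLD_F):.1f}°C")
for label, temp_f, temp_c, margin in summarize(readings_f):
    print(f"{label}: {temp_f}°F = {temp_c}°C ({margin}°F below threshold)")
```

All three readings stay below the threshold, which matches the observation that the disks themselves did not fail during these tests.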

Are your temperatures similar to mine?

I'm also curious because I had to replace the original white thermal pad (provided with the HighPoint controller) with a slimmer one: the provided pad was too thick (1.5mm), and with the disks mounted inside the card I noticed that the PCB bowed when I screwed the cover on. So I replaced it with a slim 0.5mm thermal pad (Alphacool 14W).

Did you experience the same problem? Did your PCB bow because of the thermal pad thickness?

Thanks a lot.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Tue Nov 12, 2019 5:06 pm
by Mark Foster
the heat pad should sit snugly against the body
and my temps look similar.

and never use it without the heat cover!

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Fri Apr 24, 2020 11:44 am
by Juergen Engelke
Hi,
I bought the same adapter with 4x Samsung 970 EVO Plus, but on a Windows PC with Win10 1909, and I have the same problem.
My SHI lists a normal idle temperature of 111°F, and under load it goes up to 145°F.
The devices fail with an I/O error, but the worst problem is that the RAID Mgmt. SW removes the logical drive.

I have been in contact with HighPoint Global Support for a month now and they have not found a solution.
I'm at the point of sending it back to my dealer.

What did you do? Did you contact HighPoint?

Regards
Juergen

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Fri Apr 24, 2020 12:12 pm
by Adriano Castaldini
Hi everyone,
I can only report my experience: I noticed that every time one of the NVMes dropped out, the ethernet cable was connected (and active). Once, a drop-out happened with nothing running, only the screensaver on, BUT the ethernet cable was connected and active. Since then, I activate the internet connection only when I strictly need it, and I close the connection when I'm not online.
Up to now, with this method, I have had no more drop-outs...
[Edit: I had a looong chat with the manufacturer, but they didn't solve the problem.]

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Fri Apr 24, 2020 2:31 pm
by Juergen Engelke
[Attachment: 6.1 Event .PNG]
Hi Adriano,
thanks for your answer - in my case I cannot see a problem related to the Ethernet connection.

Here is an event example from my case. What I CANNOT UNDERSTAND is why the HPT RAID Mgmt. SW deletes the logical drive without user interaction.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Fri Apr 24, 2020 3:46 pm
by Adriano Castaldini
Yes, I see... Very similar to my old situation...
I'm not a technician; I can only suggest that this problem could come from a conflict between the HPT and something in your hardware. I simply noticed that in my PCI window, under the NVM Express Controllers list, the first entry says Slot: Ethernet. As I said, I'm not a technician, but disabling ethernet access when using the HPT solved my problem (it seems, up to now...)

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Fri Apr 24, 2020 3:57 pm
by Juergen Engelke
Thanks for the answer!
Ah, I understand: you had one NVMe drive too many, one not related to the RAID controller?
But I do not know much about Macs.
I will rethink my config.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sat Apr 25, 2020 8:51 am
by Juergen Engelke
I am now convinced that the HPT RAID SW does not expect additional SSDs besides the ones on the RAID controller.
In my case I have a Samsung SSD 960 Pro in addition to the 4x 970 EVO Plus on the HPT controller.

But I cannot remove that drive because it contains my triple-boot Win10 system!

So my conclusion is: the HighPoint HW/SW is not compatible with additional SSDs installed.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sat Apr 25, 2020 1:36 pm
by Adriano Castaldini
Possible... but I would not be so sure: my HPT is the SSD7101A-1, which is declared NON-bootable (even if some say it works as a boot device in some systems), while another model, the SSD7103, IS bootable. So if a model is NON-bootable, it must expect a different drive to be the boot drive!
You could say: OK, but for the HPT SW's sake, the boot drive must not be an SSD!
...That would mean HighPoint has in mind a system with an HDD as the boot drive? That is not realistic, since their home page shows a new 2019 Mac Pro as an example...
BUT!!! Every system is a system unto itself! For example: my system is a hackintosh (a PC that runs OSX Mojave) and you can see I have (had) a problem very similar to yours, while in the community of genuine old Mac Pros I have never heard of problems like that. My personal opinion is that HPT is optimized for Mac HW, and “also” works on “some” PC HW with some unpredictable problems that “could” happen (and that you have to face somehow on your own...)
From my long chats with HighPoint, I realized that once a problem appears you can't permanently solve it, but you can minimize it with some “strategy” that you work out after a lot of tests.
The only thing we can do for the community is share our experiences to hopefully help others solve their problems.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sat Apr 25, 2020 2:53 pm
by Juergen Engelke
Yes, mine is the same non-bootable 7101A-1. I don't need the 7103 because I boot from an additional 960 Pro NVMe.
In today's system builds it's quite normal to use an SSD as the system device.
So HighPoint should assume that someone who buys the SSD RAID controller runs it alongside other NVMe SSDs.
Nowhere in the product description does it say that this is not possible.
So from my point of view it is an HPT SW error which must be fixed.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sat Apr 25, 2020 3:00 pm
by Adriano Castaldini
Mmm... Another point against your theory is that I tested the HPT without its SW, using the good old RAID Assistant in macOS Disk Utility, and the problem happened anyway!
More than this: I also tested other controllers like the Sonnet 4x4 (same thing), even with a different kind of NVMe (Sabrent Rocket... a nightmare!)
In the end I suspect that the problem is the motherboard and/or some conflict with the ports...
As I said, up to now the disabled-ethernet solution seems to be working fine... who knows...

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sat Apr 25, 2020 6:14 pm
by Mark Foster
have a look at the 970 firmware
if they are on 1B2QEXM7 they need a firmware upgrade to 2B2QEXM7

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sat Apr 25, 2020 6:16 pm
by Mark Foster
the 7101 is bootable under macOS!

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sat Apr 25, 2020 6:47 pm
by Adriano Castaldini
Mark Foster wrote: have a look at the 970 firmware
if they are on 1B2QEXM7 they need a firmware upgrade to 2B2QEXM7

AH! That's interesting!!! (And why did the HighPoint support guys NEVER tell me that???)
May I ask how you know that? And how can I check the current firmware of my SSDs (without removing them from the HPT)?
Thanks.
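
One possible way to read the firmware version without pulling the drives is to parse the output of `smartctl -i` from smartmontools, which prints a "Firmware Version:" line for NVMe devices. This is only a sketch under assumptions: whether smartctl can see the drives behind the HPT on a given OS is not confirmed here, the device path passed to `read_firmware` is hypothetical, and the sample text below is an illustrative excerpt, not real output from my system.

```python
import re
import subprocess

OLD_FIRMWARE = "1B2QEXM7"  # version Mark says needs upgrading
NEW_FIRMWARE = "2B2QEXM7"  # version Mark says to upgrade to


def parse_firmware(smartctl_output):
    """Extract the 'Firmware Version' value from `smartctl -i` text, or None."""
    match = re.search(r"^Firmware Version:\s*(\S+)", smartctl_output, re.MULTILINE)
    return match.group(1) if match else None


def needs_upgrade(version):
    """True if the drive reports the older firmware named in this thread."""
    return version == OLD_FIRMWARE


def read_firmware(device):
    """Run smartctl on a device path (hypothetical, e.g. '/dev/nvme0').

    Requires smartmontools to be installed and usually root privileges.
    """
    result = subprocess.run(
        ["smartctl", "-i", device], capture_output=True, text=True, check=False
    )
    return parse_firmware(result.stdout)


# Illustrative excerpt only; serial number taken from the WebGUI message earlier.
sample = """\
Model Number:                       Samsung SSD 970 EVO 2TB
Serial Number:                      S464NB0M400143A
Firmware Version:                   1B2QEXM7
"""

version = parse_firmware(sample)
status = "needs upgrade" if needs_upgrade(version) else "no upgrade needed per this thread"
print(f"{version}: {status}")
```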

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sun Apr 26, 2020 8:05 am
by Juergen Engelke
[Attachment: Magician SSD4 after removing IO Error.PNG]

That was the first thing I did with my 970s after getting the I/O error.
All are on the latest FW, checked with the latest Magician, and all diagnostic scans showed that the drives are technically OK.
Most of them are brand new.
My motherboard, an ASUS X99-E 10G WS, is listed as compatible in HighPoint's product description of the 7101A-1.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sun Apr 26, 2020 9:21 am
by Juergen Engelke
[Attachment: Build 3 HPT Physical.PNG]

Just to stop further speculation: yes, the card is supported by 16 lanes (x16) on the motherboard.
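
For context on why x16 matters, here is a rough back-of-the-envelope calculation. It assumes each M.2 NVMe drive uses a PCIe 3.0 x4 link (an assumption on my side, not something read from the screenshot) and uses the PCIe 3.0 figures of 8 GT/s per lane with 128b/130b encoding; these are theoretical one-direction numbers, not measured throughput.

```python
# PCIe 3.0 signalling: 8 GT/s per lane, 128b/130b line encoding.
TRANSFERS_PER_SECOND = 8e9
ENCODING_EFFICIENCY = 128 / 130


def lane_bandwidth_gb_s():
    """Theoretical one-direction bandwidth of a single PCIe 3.0 lane in GB/s."""
    bits_per_second = TRANSFERS_PER_SECOND * ENCODING_EFFICIENCY
    return bits_per_second / 8 / 1e9


LANES_PER_DRIVE = 4  # assumption: each M.2 NVMe drive uses a x4 link
DRIVES = 4           # four drives on the controller, as in this thread

lanes_needed = LANES_PER_DRIVE * DRIVES
slot_bandwidth = lanes_needed * lane_bandwidth_gb_s()

print(f"lanes needed: {lanes_needed}")
print(f"per lane: {lane_bandwidth_gb_s():.3f} GB/s")
print(f"x{lanes_needed} slot: {slot_bandwidth:.2f} GB/s theoretical")
```

With fewer than 16 lanes available to the slot, four x4 drives could not all run at full link width at once, which is why the lane count was worth confirming.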

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Sun Apr 26, 2020 1:43 pm
by Adriano Castaldini
Mark Foster wrote: have a look at the 970 firmware
if they are on 1B2QEXM7 they need a firmware upgrade to 2B2QEXM7

OK, I checked with WEB RAID Management and all the drives have the newer firmware. So my “old” problem AND Juergen's very similar problem are both unrelated to a firmware issue...
Juergen, the only thing I can ask is that once you solve your issue, please share your method here.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Tue Apr 28, 2020 8:40 am
by Juergen Engelke
Adriano,
yes, I will.
At the moment it seems they are still thinking it over at HPT.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Tue May 12, 2020 11:52 am
by Juergen Engelke
Highpoint Global Web Support proposed to initiate the RMA Process.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Tue May 12, 2020 6:16 pm
by Adriano Castaldini
Juergen Engelke wrote:Highpoint Global Web Support proposed to initiate the RMA Process.

Thanks for your update. RMA... interesting, because they never proposed it to me... But more than this, it's seriously interesting because now you can get a strong confirmation: either your HighPoint was actually damaged, or the new one will give you the same problems.
Let us know when the new toy is delivered to you. Good luck ;)

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Mon Jun 29, 2020 3:44 pm
by Juergen Engelke
[Attachment: T-Highpoint 2019 Performance RAID0-2.PNG]
Hi Adriano,
here I am back again. It took some time for my dealer to exchange the controller.
The new one seems to have been manufactured in 2019, judging by the serial number. I suspect the old one was manufactured in 2017.
I built 2 RAID 0 arrays with 2 TB each as render source and render target.
I think the performance is sufficient even for 8K content creation (see attachment).
No I/O errors so far. All looks fine.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Mon Jun 29, 2020 8:20 pm
by Adriano Castaldini
I'm really happy for you, Juergen!
Very good! So, the good news is that the 2019 controllers work!
Thanks for the update.

Re: HighPoint NVMe M.2 RAID (EVO 970) fails unexpectedly

PostPosted: Tue Jun 30, 2020 8:22 am
by Juergen Engelke
Addendum:
I am working with video files, mostly gigabytes in size.
My system now has 4 logical RAID 0 based storage areas.

Copy performance:
from HDD RAID to SSD RAID = 700 MB/s
from SSD RAID to HDD RAID = 700 MB/s
from SSD RAID to SSD RAID = 1.25 GB/s

I think that is more than sufficient for large multi-GB files, and it supports my 8K setup well.
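
To put the copy rates above into perspective, here is a small calculation of how long one large file would take at each rate. The 256 GB file size is just an illustrative assumption, decimal units are assumed (700 MB/s = 0.7 GB/s), and `copy_seconds` is a made-up helper name.

```python
def copy_seconds(size_gb, rate_gb_s):
    """Time in seconds to copy size_gb gigabytes at rate_gb_s GB/s."""
    return size_gb / rate_gb_s


# Copy rates from the list above, converted to GB/s (decimal units assumed).
rates_gb_s = {
    "HDD RAID -> SSD RAID": 0.7,
    "SSD RAID -> HDD RAID": 0.7,
    "SSD RAID -> SSD RAID": 1.25,
}

FILE_SIZE_GB = 256  # illustrative file size, an assumption for this example

for route, rate in rates_gb_s.items():
    seconds = copy_seconds(FILE_SIZE_GB, rate)
    print(f"{route}: {seconds:.1f} s ({seconds / 60:.1f} min)")
```

So at these rates a file of that size moves in roughly three and a half to six minutes.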