mirceabanu wrote: All functions good, but when the second user connects, it gets an error after ~10-20 seconds (I believe it is trying to communicate with the other user):
Yeah that bit I put in bold from your quote is the important part, from what I can tell.
I have (for now at least) "solved" our issues -- it had to do with subnet priority on the different workstations, because our machines all run on several connections for different purposes:
1.) 10GbE for direct (unswitched) access to shared storage
2.) 1GbE for (switched) communication with the server (Resolve, FlexLM, etc), renderfarm, intranet
3.) WiFi for internet (completely separate gateway)
It's not well documented so I'll just lay out my best understanding of the current Resolve 16 Project Server architecture:
**Reading database for project access**
Machine1 ==>> Project Server <<== Machine2
For simple access to the project, the machines can be on whatever subnet they want. So:
Machine1: WiFi, connected to server via router on 192.168.1.0/24
Machine2: Ethernet, connected to server via switch on 172.16.85.0/24
From Machine1, the server has the static address 192.168.1.50
From Machine2, the server has the static address 172.16.85.50
Both server addresses are valid: access keys just need to use the correct one for each workstation.
**Communicating user activity/bin locks/status**
Machine1 ..>> Project Server <<.. Machine2
Machine1 <<=====>> Machine2
For the second part of the equation, machine-to-machine communication, the Project Server doesn't negotiate communication at all (as far as I can tell). The two machines need to resolve an IP address for each other, tell each other what they are doing, then report that back to the Project Server independently for committing writes to the database (which might be why Live Save is so slow in collaboration?).
So let's say
Machine1 is parked at 192.168.1.101 and
Machine2 is 172.16.85.102 and... WAIT!
They're on different subnets! So even though they can both see the server, they can't see each other!
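To make that concrete, here is a minimal Python sketch (my own illustration, not anything Resolve does internally) that checks whether two machines have an interface on a common subnet. The /24 masks come from the subnets listed above; the second Machine1 address, 172.16.85.101, is a made-up example of what its Ethernet NIC might have.

```python
import ipaddress

def shared_subnets(addrs_a, addrs_b):
    """Return the set of networks on which both machines have an interface.

    Each argument is a list of "address/prefix" strings, one per NIC.
    """
    nets_a = {ipaddress.ip_interface(a).network for a in addrs_a}
    nets_b = {ipaddress.ip_interface(b).network for b in addrs_b}
    return nets_a & nets_b

# Addresses from the example above (WiFi-only Machine1, Ethernet Machine2).
machine1_wifi_only = ["192.168.1.101/24"]
machine2 = ["172.16.85.102/24"]
print(shared_subnets(machine1_wifi_only, machine2))  # set()

# Hypothetical: Machine1 also has an Ethernet NIC on the switched network.
machine1_both = ["192.168.1.101/24", "172.16.85.101/24"]
print(shared_subnets(machine1_both, machine2))  # {IPv4Network('172.16.85.0/24')}
```

An empty set means the two machines have no subnet in common, which is the situation that (as far as I can tell) produces the error.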
What I realized was that each workstation was prioritizing a different NIC in our setup. So when one pair of machines was in the same project it might be fine because they were both prioritizing the 1GbE network, but if one machine was favoring the WiFi we'd get that generic error message.
As of Resolve 16 that shouldn't be an issue anymore -- on page 2941 of the manual there's a passage alluding to the new Project Server ability to negotiate different subnets. So if Resolve 16 can collaborate across multiple subnets, why are we still getting an error? Well... I'm sure there are complications with machines running on multiple subnets, but it might also be a slightly "unfinished" feature.
My solution was fixing the NIC priorities to be the same everywhere, keeping everyone on a single subnet for Resolve traffic. Since then we haven't seen the error message.
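If you want a quick way to see which NIC a workstation is currently favoring, one rough check is to ask the OS which local address it would use to reach a given peer. The Python sketch below does that with the standard UDP "connect" trick, which only consults the routing table and sends no packet. The function names, the port number and the peer address in the usage part are my own placeholders, and this shows the OS routing choice only; I can't confirm it matches the address Resolve actually advertises to other machines.

```python
import ipaddress
import socket

def local_address_for(peer_ip, port=9):
    """Return the local IP the OS would use as the source for traffic to peer_ip.

    Connecting a UDP socket sends nothing; it just makes the OS pick a route.
    Raises OSError if there is no route to the peer at all.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((peer_ip, port))  # port is arbitrary, nothing is transmitted
        return s.getsockname()[0]

def uses_subnet(peer_ip, subnet):
    """True if traffic to peer_ip would leave from an address inside subnet."""
    source = ipaddress.ip_address(local_address_for(peer_ip))
    return source in ipaddress.ip_network(subnet)

if __name__ == "__main__":
    # Placeholder values taken from the example above; substitute your own.
    peer = "172.16.85.102"
    try:
        print(peer, "reached from", local_address_for(peer))
        print("on the 1GbE subnet:", uses_subnet(peer, "172.16.85.0/24"))
    except OSError as exc:
        print("no route to", peer, "-", exc)
```

Running this on each workstation against the others should show the same subnet everywhere once the NIC priorities match.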
Plenty of other collaboration bugs and crashes though... and really, there is like half of one page devoted to this whole topic in the manual, and a single error message to cover all possible issues. I think this is one place where the documentation & logging could stand to be a bit more verbose!
So people like me can stop trying to guess about it on the forums...
~~
This may not be quite the solution to your problem, but hopefully it helps with your search -- I think if you can't get all the collaborating machines onto a subnet together (doesn't matter which one) you're going to have issues, regardless of what the manual says.