Hi,
Simplifying as much as possible:
You have to think about GPU calculations somewhat differently. The goal is to
parallelize the calculations.
So two GTX 1080 cards give (2 x 8) = 16 GB of VRAM "addressable space", if we balance the threads optimally between the two GPUs.
In a perfect world that is better than a single GTX 1080 Ti with 11 GB of VRAM. In reality it is not so simple.
In summary, for GPU computing: when processing an image, the more blocks a kernel is launched with, the more threads perform the calculations in parallel.
First of all, you have to see the GPU(s) as a multi-dimensional grid made up of independent calculation blocks.
A block is a multi-dimensional matrix of threads; the developer chooses the dimensions he needs.
In the GPU(s):
A block contains 'n' threads. Each thread executes an instance of a kernel and has coordinates within its block that identify it.
On the other hand, at a given moment, before launching new calculations (for example on a new 'IN image'), you have to retrieve the results of all the threads and consolidate them into an 'OUT image'. This resynchronization is limited by the slowest GPU and by the transfer of data from the GPU(s) to the software sitting in RAM, over the information bus. It is a bottleneck between GPU(s), bus, CPU(s), RAM, etc.
For the balancing between GPUs: more GPUs means more parallel calculations, but only if the GPUs are homogeneous: ideally the same compute capability, the same addressable memory, and the same bus transfer speed.
That is why it is recommended to use identical GPUs and bus speeds in a multi-GPU configuration, ideally in PCI-E slots with the same bandwidth (x16).
The GPU with the smallest computing capacity then becomes the common denominator (and the GPU driving the GUI loses some of its capacity to the GUI).
It is not the developer who decides everything: the hardware, via interrupts, manages the rendering sync with the software, which retrieves the image in memory as nothing more than an array of floats.
@JIM
1. A single GTX 1080: 2560 CUDA cores, 256-bit bus, 8 GB memory
2. Two GTX 1060s: 1280 CUDA cores, 192-bit bus, 6 GB memory each
Without hesitation, option 1, the GTX 1080:
- as many cores as two GTX 1060s
- important, and not often talked about: the wider the bus, the more bits the hardware can transfer over it at once (exchanges between RAM / CPU / GPU / bridge / etc.)
One GPU:
- less power draw: smaller PSU
- less heat
Hope this helps.