We start with our goals
We have a VM that should get as much CPU power as possible from the host.
But how should we configure the VM to make the most of it? You have read about NUMA nodes and what they mean, right? If not, look at the bottom of this post, where you will find some nice links.
If you really want to deep dive into this, read anything that Frank Denneman has written, great stuff!
Also take a look at https://notesfrommwhite.net/ from @mwVme, great stuff with great summaries of VMware topics.
Questions
Should I make the vCPU count equal to the number of logical processors?
Should I make the vCPU count match the physical sockets?
What are sockets on a VM?
How many cores per socket on the VM?
So many questions. Let's try the different configurations and see!
I have to answer a few questions during the tests
How many vCPUs does the VM see?
How many NUMA nodes does the VM see?
What is the performance of the VM?
Let's begin
We start with the stuff that I cannot change.
We have a server with 2 physical CPUs.
12 cores per CPU, 24 cores in total.
Then we have Hyper-Threading enabled, so we get 48 logical processors.
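The arithmetic behind those numbers, as a quick sketch. The variable names are only illustrative, they are not something ESXi reports under these names.

```python
# Physical host topology from the post (variable names are illustrative).
sockets = 2            # physical CPUs
cores_per_socket = 12  # cores per physical CPU
threads_per_core = 2   # Hyper-Threading enabled

physical_cores = sockets * cores_per_socket
logical_processors = physical_cores * threads_per_core

print(f"Physical cores:     {physical_cores}")
print(f"Logical processors: {logical_processors}")
```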
Host configuration:

So let's see which configurations we want to try out for this VM.
The VM runs Windows Server 2019 with 8 GB of memory. The picture says 4, but I changed that after the screenshot.
vCPU | Cores per socket | Sockets |
48 | 1 | 48 |
48 | 24 | 2 |
48 | 48 | 1 |
24 | 1 | 24 |
24 | 12 | 2 |
24 | 24 | 1 |
12 | 1 | 12 |
12 | 6 | 2 |
12 | 12 | 1 |
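The test matrix follows a simple pattern: for every vCPU count, try 1 core per socket, a 2-socket split, and one wide socket. A minimal sketch that rebuilds it, assuming that pattern is all there is to it (the helper name `layouts` is made up for this example):

```python
def layouts(vcpus):
    """Return (vcpus, cores_per_socket, sockets) tuples for one vCPU count."""
    result = []
    # 1 core per socket, a 2-socket split, and a single wide socket.
    for cores_per_socket in (1, vcpus // 2, vcpus):
        sockets = vcpus // cores_per_socket
        result.append((vcpus, cores_per_socket, sockets))
    return result


matrix = [row for vcpus in (48, 24, 12) for row in layouts(vcpus)]

for vcpus, cores, sockets in matrix:
    # Sockets times cores per socket must always give back the vCPU count.
    assert sockets * cores == vcpus
    print(f"{vcpus:>4} | {cores:>5} | {sockets:>7} |")
```

Whatever layout you pick, sockets times cores per socket always equals the vCPU count. Only the way the CPUs are presented to the guest changes.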
The test VM has 8 GB of RAM and a 40 GB disk, with a freshly installed Windows Server 2019. The pictures are from the test with 48 vCPU and 48 cores per socket.
Server config

So first off, what does Windows see?

And how many NUMA nodes can Windows see?
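If you would rather check this from a script inside the guest than from screenshots, something like the sketch below could work. It is not how the pictures in this post were made. It assumes Python is installed in the VM and calls the Win32 function `GetNumaHighestNodeNumber` through `ctypes`. On anything other than Windows the NUMA part just returns `None`.

```python
import ctypes
import os
import platform


def logical_processors():
    """Number of logical processors the OS reports to this process."""
    return os.cpu_count()


def numa_nodes():
    """NUMA node count on Windows, or None on other platforms."""
    if platform.system() != "Windows":
        return None
    highest = ctypes.c_ulong(0)
    # GetNumaHighestNodeNumber fills in the highest node number (zero-based).
    ok = ctypes.windll.kernel32.GetNumaHighestNodeNumber(ctypes.byref(highest))
    if not ok:
        return None
    return highest.value + 1


print("Logical processors:", logical_processors())
print("NUMA nodes:", numa_nodes())
```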

Then we run CPU-Z just to verify:

Now to performance. We use PassMark PerformanceTest to measure it.

I don't want to bore you with a lot of pictures, so you will have to trust the Excel sheet.
The winner in my case was 48 vCPU with 24 cores per socket. That gives 2 sockets on the VM and 2 NUMA nodes.
vCPU | Cores per socket | Sockets | Virtual processors in VM | Sockets in VM | NUMA nodes in VM | CPU Mark | Integer Math | Floating Point | Prime Numbers | SSE | Compression | Encryption | Physics | Sorting | Single Thread | Cross-platform | Total |
48 | 1 | 48 | 48 | 48 | 2 | 21358 | 99308 | 44627 | 107 | 24308 | 404217 | 6522 | 1244 | 56132 | 1652 | 52738 | 712213 |
48 | 24 | 2 | 48 | 2 | 2 | 22922 | 92605 | 50481 | 107 | 28282 | 431481 | 8166 | 1098 | 57940 | 1652 | 53816 | 748550 |
48 | 48 | 1 | 48 | 1 | 2 | 17121 | 103058 | 53062 | 103 | 9509 | 250094 | 6835 | 1233 | 55052 | 1668 | 54289 | 552024 |
24 | 1 | 24 | 24 | 24 | 2 | 21734 | 79003 | 58336 | 186 | 28258 | 365648 | 5979 | 1823 | 51786 | 1678 | 61231 | 675662 |
24 | 12 | 2 | 24 | 2 | 2 | 21798 | 78712 | 57328 | 173 | 28341 | 364276 | 6147 | 1802 | 51361 | 1685 | 59729 | 671352 |
24 | 24 | 1 | 24 | 1 | 2 | 21788 | 79391 | 57852 | 179 | 29129 | 361213 | 6116 | 1831 | 51918 | 1645 | 60664 | 671726 |
12 | 1 | 12 | 12 | 12 | 1 | 12048 | 40685 | 27247 | 95 | 14651 | 182278 | 2987 | 1256 | 25113 | 1669 | 30541 | 338570 |
12 | 6 | 2 | 12 | 2 | 1 | 12720 | 41196 | 30045 | 104 | 15852 | 196008 | 3134 | 1301 | 27784 | 1673 | 32882 | 362699 |
12 | 12 | 1 | 12 | 1 | 1 | 12705 | 40564 | 29850 | 103 | 15553 | 193320 | 3214 | 1263 | 27248 | 1671 | 32222 | 357713 |
20 | 2 | 10 | 20 | 10 | 2 | 19596 | 68056 | 49869 | 182 | 25632 | 318011 | 5240 | 1875 | 45402 | 1674 | 54491 | 590028 |
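To double-check the Total column and the ranking, here is a small sketch with the numbers copied from the table. It assumes that Total is simply the sum of the eleven score columns, CPU Mark included, which the assertion in the loop confirms for every row.

```python
# Each row: (vCPU, cores per socket, sockets, individual scores, reported total).
# The numbers are copied from the results table.
rows = [
    (48, 1, 48, [21358, 99308, 44627, 107, 24308, 404217, 6522, 1244, 56132, 1652, 52738], 712213),
    (48, 24, 2, [22922, 92605, 50481, 107, 28282, 431481, 8166, 1098, 57940, 1652, 53816], 748550),
    (48, 48, 1, [17121, 103058, 53062, 103, 9509, 250094, 6835, 1233, 55052, 1668, 54289], 552024),
    (24, 1, 24, [21734, 79003, 58336, 186, 28258, 365648, 5979, 1823, 51786, 1678, 61231], 675662),
    (24, 12, 2, [21798, 78712, 57328, 173, 28341, 364276, 6147, 1802, 51361, 1685, 59729], 671352),
    (24, 24, 1, [21788, 79391, 57852, 179, 29129, 361213, 6116, 1831, 51918, 1645, 60664], 671726),
    (12, 1, 12, [12048, 40685, 27247, 95, 14651, 182278, 2987, 1256, 25113, 1669, 30541], 338570),
    (12, 6, 2, [12720, 41196, 30045, 104, 15852, 196008, 3134, 1301, 27784, 1673, 32882], 362699),
    (12, 12, 1, [12705, 40564, 29850, 103, 15553, 193320, 3214, 1263, 27248, 1671, 32222], 357713),
    (20, 2, 10, [19596, 68056, 49869, 182, 25632, 318011, 5240, 1875, 45402, 1674, 54491], 590028),
]

# The Total column is the plain sum of the eleven score columns.
for vcpus, cores, sockets, scores, total in rows:
    assert sum(scores) == total

# Rank the configurations by total, best first.
ranked = sorted(rows, key=lambda row: row[4], reverse=True)
for vcpus, cores, sockets, _, total in ranked:
    print(f"{vcpus:>2} vCPU, {cores:>2} cores/socket, {sockets:>2} sockets: {total}")

best = ranked[0]
```

Because Total is a plain sum, the Compression score, which is by far the largest number in each row, weighs heavily in the ranking.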
I could not get the table to look nice in WordPress; you cannot be good at everything. So here is a screenshot.


Some sizing rules from the VMware blog
The picture below shows how to size a VM when the VM has less memory than one physical socket (NUMA node) has.

The picture below shows how to size a VM when the VM has more memory than one physical socket (NUMA node) has.
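Read together, the two sizing cases boil down to roughly this: keep the VM inside one NUMA node as long as both its vCPUs and its memory fit, otherwise spread the vCPUs evenly over as few nodes as possible. The sketch below is one reading of that rule of thumb, not VMware's own algorithm. The function name and parameters are hypothetical, and the 128 GB per node in the examples is a made-up value because the post does not state the host memory.

```python
import math


def suggest_layout(vcpus, vm_mem_gb, cores_per_node, mem_per_node_gb):
    """Return (sockets, cores_per_socket) for a VM, per the rule of thumb."""
    # How many NUMA nodes the VM needs for its vCPUs and for its memory.
    nodes_for_cpu = math.ceil(vcpus / cores_per_node)
    nodes_for_mem = math.ceil(vm_mem_gb / mem_per_node_gb)
    sockets = max(1, nodes_for_cpu, nodes_for_mem)
    if vcpus % sockets != 0:
        raise ValueError("vCPU count does not divide evenly across NUMA nodes")
    return sockets, vcpus // sockets


# Hypothetical host: 12 physical cores and 128 GB (assumed) per NUMA node.
print(suggest_layout(12, 8, 12, 128))    # fits in one node
print(suggest_layout(24, 8, 12, 128))    # needs two nodes for the vCPUs
print(suggest_layout(12, 200, 12, 128))  # needs two nodes for the memory
```

The sketch counts physical cores only, so the 48 vCPU case from the test above, which also uses the hyper-threads, is outside what it covers.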

Conclusions:
If you have a single VM on a host, the best configuration is the one that matches the physical setup. In my case that was
48 vCPU and 24 cores per socket
I did not look at the application side; that can be my next project if I feel like it.
I also only used PassMark PerformanceTest for the test. As you can see, the best result in this test was 48 vCPU with 24 cores per socket, which results in 2 sockets. So it matches the physical hardware.
Some individual numbers are better on other configurations, but I have only looked at the totals. If you want to calculate prime numbers, the best configuration would be 24 vCPU with 1 core per socket.
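The same kind of lookup works for any single column. As an example, here is a sketch for the prime number score only, with the values copied from the results table:

```python
# Prime number score per (vCPU, cores per socket, sockets), from the table.
prime_scores = {
    (48, 1, 48): 107, (48, 24, 2): 107, (48, 48, 1): 103,
    (24, 1, 24): 186, (24, 12, 2): 173, (24, 24, 1): 179,
    (12, 1, 12): 95, (12, 6, 2): 104, (12, 12, 1): 103,
    (20, 2, 10): 182,
}

# Pick the configuration with the highest score in this one test.
best_prime = max(prime_scores, key=prime_scores.get)
print(best_prime, prime_scores[best_prime])
```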
Over-allocation is also something to think about when you have multiple VMs on the same host.
If you enable CPU Hot Add on a VM, vNUMA is disabled for that VM.
Regarding the sockets, it is purely a matter of how the CPUs are presented to the guest. Whether you choose two sockets with a single core each or a single socket with two cores, vSphere treats those the same: you get 2 cores of CPU time.
Links:
https://frankdenneman.nl/2016/12/12/decoupling-cores-per-socket-virtual-numa-topology-vsphere-6-5/
https://www.passmark.com/products/performancetest/download.php
CPU Benchmarks value:
https://www.cpubenchmark.net/cpu_test_info.html
Thank you for reading this far,
Keep hacking!
//Roger