Testing vCPU configurations to get the best performance out of a VM. Which configuration shall I use? #vCPU @vExpert @VMware #virtualmachine @FrankDenneman #numa @mwVme

We start with what our goals are

We have a VM that wants to take as much CPU power as possible from the host.

But how shall we configure the VM to make the most of it? You have read about NUMA nodes and what they mean, right? If not, look at the bottom of this post, where you will find some nice links.

If you really want to deep dive into this, read anything that Frank Denneman has written, great stuff!

Also take a look at https://notesfrommwhite.net/ from @mwVme: great stuff, with great summaries of VMware topics.

Questions

Shall I use vCPUs and make them equal to the logical processors?
Shall I use vCPUs and make them equal to the physical sockets?
What are sockets on a VM?
How many cores per socket on the VM?
So many questions. Let's try the different configurations and see!

I have to answer a few questions during the tests:

How many vCPUs does the VM see?
How many NUMA nodes does the VM see?
What is the performance of the VM?

Let's begin

We start with the stuff that I cannot change.

We have a server with 2 physical CPUs.
12 cores per CPU, in total 24 cores.
Hyper-threading is enabled, so we get 48 logical processors.
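The host arithmetic above can be written out as a small sketch. The function name and its parameters are my own, chosen for illustration, and the default of 2 threads per core assumes hyper-threading is enabled as on this host.

```python
def logical_processors(sockets: int, cores_per_socket: int, threads_per_core: int = 2) -> int:
    """Return how many logical processors the host exposes."""
    return sockets * cores_per_socket * threads_per_core


# The host in this post: 2 physical CPUs with 12 cores each.
physical_cores = 2 * 12
host_logical = logical_processors(sockets=2, cores_per_socket=12)

print(f"Physical cores: {physical_cores}")
print(f"Logical processors with hyper-threading: {host_logical}")
```

With hyper-threading turned off you would pass `threads_per_core=1` and end up with the physical core count.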

Host configuration:

So let's see which configurations we want to try out for this VM.

The server runs Windows Server 2019 with 8 GB of memory. The picture says 4 GB, but I changed that after taking the screenshot.

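| vCPU | Cores per socket | Sockets |
|---|---|---|
| 48 | 1 | 48 |
| 48 | 24 | 2 |
| 48 | 48 | 1 |
| 24 | 1 | 24 |
| 24 | 12 | 2 |
| 24 | 24 | 1 |
| 12 | 1 | 12 |
| 12 | 6 | 2 |
| 12 | 12 | 1 |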
The different configurations to try.
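The nine configurations follow one pattern: for each vCPU count, try one core per socket, a split that matches the two host sockets, and everything in a single socket. Here is a minimal sketch of that pattern. The helper name `candidate_layouts` and the constant `HOST_SOCKETS` are only for illustration.

```python
HOST_SOCKETS = 2  # physical CPUs in the host used in this post


def candidate_layouts(vcpus: int) -> list[tuple[int, int, int]]:
    """Return (vCPU, cores per socket, sockets) layouts to test for one vCPU count."""
    if vcpus % HOST_SOCKETS != 0:
        raise ValueError("vCPU count must divide evenly across the host sockets")
    return [
        (vcpus, 1, vcpus),                             # one core per socket
        (vcpus, vcpus // HOST_SOCKETS, HOST_SOCKETS),  # mirror the physical sockets
        (vcpus, vcpus, 1),                             # everything in one socket
    ]


layouts = [layout for vcpus in (48, 24, 12) for layout in candidate_layouts(vcpus)]

for vcpus, cores_per_socket, sockets in layouts:
    # The vCPU count is always cores per socket multiplied by sockets.
    assert cores_per_socket * sockets == vcpus
    print(f"{vcpus} vCPU = {cores_per_socket} cores per socket x {sockets} sockets")
```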

The test VM has 8 GB of RAM and a 40 GB disk, with a newly installed Windows Server 2019. The pictures are from the test with 48 vCPU and 48 cores per socket.

Server config

VM config with 48 vCPU and 48 Cores per socket = 1 socket.

So first of all, what does Windows see?

CPU in Windows

And how many NUMA nodes can Windows see?

NUMA nodes
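If you would rather check this from a script inside the guest than from screenshots, something like the sketch below could work. It is not what I used for the pictures. It assumes Python is installed in the Windows guest, it calls the Win32 function `GetNumaHighestNodeNumber` through `ctypes`, and it assumes node numbers are consecutive so that the highest number plus one gives the node count. On other operating systems the NUMA helper simply returns `None`.

```python
import ctypes
import os
import sys


def visible_logical_cpus() -> int:
    """Number of logical processors the guest OS reports to Python."""
    return os.cpu_count() or 1


def visible_numa_nodes():
    """NUMA node count seen by a Windows guest, or None on other systems."""
    if sys.platform != "win32":
        return None
    highest = ctypes.c_ulong(0)
    ok = ctypes.windll.kernel32.GetNumaHighestNodeNumber(ctypes.byref(highest))
    if not ok:
        raise ctypes.WinError()
    # Assumption: nodes are numbered 0..highest without gaps.
    return highest.value + 1


print(f"Logical processors: {visible_logical_cpus()}")
print(f"NUMA nodes: {visible_numa_nodes()}")
```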

Then we run CPU-Z just to verify:

CPU-Z

Now to performance. We use PassMark's program to test performance.

I don't want to bore you with a lot of pictures, so you have to believe the Excel sheet.

The winner in my case was 48 vCPU and 24 cores per socket. That gives 2 sockets on the VM and 2 NUMA nodes.

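| vCPU | Cores per socket | Sockets | Virtual proc. in VM | Sockets in VM | NUMA nodes in VM | CPU Mark | Integer Math | Floating Point | Prime Nr | SSE | Compression | Encryption | Physics | Sorting | Single Thread | Cross-platform | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 48 | 1 | 48 | 48 | 48 | 2 | 21358 | 99308 | 44627 | 107 | 24308 | 404217 | 6522 | 1244 | 56132 | 1652 | 52738 | 712213 |
| 48 | 24 | 2 | 48 | 2 | 2 | 22922 | 92605 | 50481 | 107 | 28282 | 431481 | 8166 | 1098 | 57940 | 1652 | 53816 | 748550 |
| 48 | 48 | 1 | 48 | 1 | 2 | 17121 | 103058 | 53062 | 103 | 9509 | 250094 | 6835 | 1233 | 55052 | 1668 | 54289 | 552024 |
| 24 | 1 | 24 | 24 | 24 | 2 | 21734 | 79003 | 58336 | 186 | 28258 | 365648 | 5979 | 1823 | 51786 | 1678 | 61231 | 675662 |
| 24 | 12 | 2 | 24 | 2 | 2 | 21798 | 78712 | 57328 | 173 | 28341 | 364276 | 6147 | 1802 | 51361 | 1685 | 59729 | 671352 |
| 24 | 24 | 1 | 24 | 1 | 2 | 21788 | 79391 | 57852 | 179 | 29129 | 361213 | 6116 | 1831 | 51918 | 1645 | 60664 | 671726 |
| 12 | 1 | 12 | 12 | 12 | 1 | 12048 | 40685 | 27247 | 95 | 14651 | 182278 | 2987 | 1256 | 25113 | 1669 | 30541 | 338570 |
| 12 | 6 | 2 | 12 | 2 | 1 | 12720 | 41196 | 30045 | 104 | 15852 | 196008 | 3134 | 1301 | 27784 | 1673 | 32882 | 362699 |
| 12 | 12 | 1 | 12 | 1 | 1 | 12705 | 40564 | 29850 | 103 | 15553 | 193320 | 3214 | 1263 | 27248 | 1671 | 32222 | 357713 |
| 20 | 2 | 10 | 20 | 10 | 2 | 19596 | 68056 | 49869 | 182 | 25632 | 318011 | 5240 | 1875 | 45402 | 1674 | 54491 | 590028 |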
Test results with PassMark

I did not get the table to look nice in WordPress. You cannot be good at everything, so here is a screenshot.
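To make the comparison easier to follow, here is a small sketch that ranks the configurations by the Total column. The numbers are copied from the test results above. The variable names are just for illustration.

```python
# (vCPU, cores per socket, sockets, PassMark Total) copied from the test results
results = [
    (48, 1, 48, 712213),
    (48, 24, 2, 748550),
    (48, 48, 1, 552024),
    (24, 1, 24, 675662),
    (24, 12, 2, 671352),
    (24, 24, 1, 671726),
    (12, 1, 12, 338570),
    (12, 6, 2, 362699),
    (12, 12, 1, 357713),
    (20, 2, 10, 590028),
]

# Highest total first.
ranked = sorted(results, key=lambda row: row[3], reverse=True)
best_total = ranked[0][3]

for vcpus, cores_per_socket, sockets, total in ranked:
    share = 100 * total / best_total
    print(f"{vcpus:>2} vCPU, {cores_per_socket:>2} cores per socket, "
          f"{sockets:>2} sockets: {total:>6} ({share:.0f}% of best)")
```

Sorting this way puts 48 vCPU with 24 cores per socket at the top and 12 vCPU with 1 core per socket at the bottom.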

Some sizing rules from the VMware blog

The picture below shows how to size a VM if the VM has less memory than one of the sockets.

Picture from: https://blogs.vmware.com/performance/2017/03/virtual-machine-vcpu-and-vnuma-rightsizing-rules-of-thumb.html

The picture below shows how to size a VM if the VM has more memory than one of the sockets.

Picture from: https://blogs.vmware.com/performance/2017/03/virtual-machine-vcpu-and-vnuma-rightsizing-rules-of-thumb.html

Conclusions:

If you have a single VM on the host, the best configuration is to match the physical setup. In my case that was:

48 vCPU and 24 cores per socket

I did not look at the application side. That can be my next project if I feel like it.

I also only used PassMark's application to do the test. As you can see, the best result in this test was 48 vCPU with 24 cores per socket, which results in 2 sockets. So it matches the physical hardware.

Some individual numbers are better on other configurations, but I have only looked at the total numbers. If you want to calculate prime numbers, the best configuration would be 24 vCPU with 1 core per socket.
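To see which configuration wins each individual test, the results can be run through a short sketch like the one below. The scores are copied from the test results, the configuration labels are written as vCPU/cores per socket/sockets, and the helper name `best_per_metric` is only for illustration. As a side check, the Total column turns out to be the plain sum of the eleven score columns.

```python
METRICS = ["CPU Mark", "Integer Math", "Floating Point", "Prime Nr", "SSE",
           "Compression", "Encryption", "Physics", "Sorting",
           "Single Thread", "Cross-platform"]

# "vCPU/cores per socket/sockets" -> (scores in METRICS order, Total)
RESULTS = {
    "48/1/48": ([21358, 99308, 44627, 107, 24308, 404217, 6522, 1244, 56132, 1652, 52738], 712213),
    "48/24/2": ([22922, 92605, 50481, 107, 28282, 431481, 8166, 1098, 57940, 1652, 53816], 748550),
    "48/48/1": ([17121, 103058, 53062, 103, 9509, 250094, 6835, 1233, 55052, 1668, 54289], 552024),
    "24/1/24": ([21734, 79003, 58336, 186, 28258, 365648, 5979, 1823, 51786, 1678, 61231], 675662),
    "24/12/2": ([21798, 78712, 57328, 173, 28341, 364276, 6147, 1802, 51361, 1685, 59729], 671352),
    "24/24/1": ([21788, 79391, 57852, 179, 29129, 361213, 6116, 1831, 51918, 1645, 60664], 671726),
    "12/1/12": ([12048, 40685, 27247, 95, 14651, 182278, 2987, 1256, 25113, 1669, 30541], 338570),
    "12/6/2": ([12720, 41196, 30045, 104, 15852, 196008, 3134, 1301, 27784, 1673, 32882], 362699),
    "12/12/1": ([12705, 40564, 29850, 103, 15553, 193320, 3214, 1263, 27248, 1671, 32222], 357713),
    "20/2/10": ([19596, 68056, 49869, 182, 25632, 318011, 5240, 1875, 45402, 1674, 54491], 590028),
}


def best_per_metric(results):
    """Map each metric name to the configuration with the highest score."""
    winners = {}
    for index, metric in enumerate(METRICS):
        winners[metric] = max(results, key=lambda cfg: results[cfg][0][index])
    return winners


winners = best_per_metric(RESULTS)
for metric, config in winners.items():
    print(f"{metric}: {config}")

# Side check: every Total equals the sum of its eleven scores.
totals_match = all(sum(scores) == total for scores, total in RESULTS.values())
print(f"Totals match the sum of the scores: {totals_match}")
```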

Over-allocation is also something to think about when you have multiple VMs on the same host.

If you enable CPU Hot Add on a VM, vNUMA is disabled.

Regarding the sockets, it is purely about how the CPUs are presented to the guest. Whether you choose dual sockets with a single core each, or a single socket with dual cores, vSphere treats those as the same: you get 2 cores of CPU time.

Links:

https://frankdenneman.nl/2016/12/12/decoupling-cores-per-socket-virtual-numa-topology-vsphere-6-5/

https://www.passmark.com/products/performancetest/download.php

https://blogs.vmware.com/performance/2017/03/virtual-machine-vcpu-and-vnuma-rightsizing-rules-of-thumb.html

CPU Benchmarks value:

https://www.cpubenchmark.net/cpu_test_info.html

Thank you for reading this far,

Keep hacking!

//Roger
