September 2018 Sprint

The September sprint includes:

Budgetary Quote UI:

Users no longer have to download an Excel file to see the pricing on a quote. The new interactive UI lets you see list prices and apply discounts to see the sales/net price. You can iterate through the entire process (size, generate a budgetary quote, and apply discounts) and download the quote when fully satisfied.

Enhancements to the Financial assumptions:

– Support levels have been sorted

– Monthly support terms (14 months, 26 months, etc.) have been added

– A brief description has been added next to each support level

– Ability to apply discounts to Budgetary Quote has been moved to the new Budgetary Quote UI.

Regular/Robo Models

There was confusion about how ROBO models like the 1175S should be sized. Under Regular models, which are used in data centers, PM originally wanted to restrict the 1175S to backup usage only. Under ROBO, it could be used as an application node in clusters of 1, 2, or 3+ nodes.

That caused confusion when an SE wanted to size a small cluster with, say, five 1175S nodes. It could be done in ROBO, but the SE would point out that it is not a ROBO customer project.

So I worked with PM to streamline the rules. Models like the 1175S (a lot of vendors have similar ones) are now treated as fully functional models, as follows:

Regular Model Rules

  • All models included
  • All use cases are allowed – main cluster application, remote cluster application and remote snapshots
  • 3+ nodes are recommended for any model. It is considered good data center practice to have a minimum of 3 nodes.

ROBO Model Rules

  • All models are included, but only some models like the 1175S can be sized for 1 or 2 nodes, while the others require a minimum of 3 nodes
  • All use cases – main cluster application, remote cluster application and remote snapshots
  • All models can go to 3+ nodes depending on sizing requirements

So what should you do?

  • Stay with Regular models for most of your sizing needs. It is the default, and models like the 1175S can indeed run applications. Any recommendation will be 3+ nodes.
  • Only go to ROBO if you need a 1- or 2-node 1175S cluster.

For more info go to the Sizer wiki –  ROBO-Regular

Rack Awareness:

Nutanix has always had node awareness and, for a long time, block awareness, both of which are in Sizer. With this release, Sizer also has rack awareness, where data availability is maintained even in the event of an entire rack or top-of-rack switch failure.

Supported Use Cases

  • Heterogeneous solution made up of several different homogeneous blocks

–    For example, you could have several 3460 blocks and several 3360 blocks

–    Here the 3460 blocks are all identical, while all the 3360 blocks are identical.  Given that, the 3460 block is homogeneous and the 3360 block is homogeneous.

–    These homogeneous blocks are dispersed across sufficient racks to meet rack awareness

  • We will support a cluster-wide setting for rack awareness and assume all workloads in that cluster must adhere to rack awareness

Other things

  • Share notice in email. Sharing has been around forever, and shared scenarios appear in your Shared Scenarios, but now when a person shares a scenario, an email goes to the recipient. This is great for increasing collaboration, and we will do more.
  • We have all Sizer and “non-Sizer” parts for all HP and Cisco models. Non-Sizer parts are things Sizer does not analyze, such as the storage controller, boot drives, etc. Where before we asked you to click a link to the HFCL, these parts are now in the BOM.
  • Lots of product updates
  • Inspur is a new vendor
  • 3060-G6 NVMe is there
  • We allow “no modification” as an option for VDI when selecting the desktop or Office version. Sometimes people need to size for a given MHz or number of cores per user and don’t want the modification. In the future, Collector will use that option too when a Collector output is uploaded.

 

 

Roadmap

Attached is the current 12-month roadmap. We group innovation by product initiatives, which are areas where Sizer needs to excel.

Product – Considerable development and product management resources are put into keeping up with the latest products from all the various vendors. Sizer needs to keep up with their innovation or it won’t be the trusted advisor on what to sell.

Sizing – Equal in priority to keeping up with new products is adopting the latest sizing approaches. We work with both Engineering for cluster sizing and workload experts for workload sizing.

Usability – The tool is getting more complex with more products and use cases as well as user requests. So we look for opportunities to make it a better user experience.

New Use Cases – We target new use cases where Sizer can help you enter new markets with Nutanix.

Tool Integration – We strongly feel that to really improve sizing and define the truly optimal solution (not too big or too small), customer workload data acquired over a reasonable time is needed. Nutanix Collector is our premier offering to gather the right data.

Technical Debt – In any product it is important to invest for the future in terms of performance, availability, scalability, etc. Sizer is a critical tool for Nutanix.

 

Storage Calculator

Storage Calculator is both a standalone tool as well as a Sizer feature.  Either way it is used to determine the Extent Store and Effective Capacity of a configuration the user defines.  It is NOT tied to the workloads or the recommendation in the sizing scenario.

Access as a Standalone Tool

This is available on the Internet without login, the same as DesignBrewz.

https://services.nutanix.com/#/storage-capacity-calculator

Access as a Sizer Feature

This is accessed by clicking on Storage Calculator in the upper right corner of the Sizer user interface.

Storage Calculator

Here is Storage Calculator.

The purpose of Storage Calculator is to determine either the Extent Store or the Effective Capacity of a configuration.  As mentioned it is not tied to a sizing scenario.

  • Extent Store is the amount of storage remaining after discounting for the CVM. This is the amount available for customer workloads.
  • Effective Capacity is then Extent Store * Storage Efficiency + the Erasure Coding savings you expect. Storage Efficiency is either none, 1.5:1, or 2:1. Examples of storage efficiency techniques are compression and dedupe.

Defining the Configuration and Input Settings

Here are the inputs

  • SSD Size – Pulldown with common SSDs currently available in various vendor models
  • SSD is downstroked – If selected, each drive loses 80 GB for downstroking. Sizer does that in its sizing for regular SSDs but assumes no downstroking is needed for encrypted drives
  • SSD quantity – The number of SSDs you expect in the model you are sizing. The minimum is 1, as an SSD is always needed for parts of the CVM
  • HDD Size – Pulldown with common HDDs currently available in various vendor models
  • HDD quantity – The number of HDDs you expect in the model you are sizing. The minimum is 0, in the case of All Flash
  • Node Count – Number of nodes you expect
  • Replication Factor – Can be RF2 or RF3
  • ECX – If selected, the % of Cold Data input appears
  • % of Cold Data – If you select ECX, this input appears; it is the percentage of cold data you are expecting
  • Storage Efficiency – The factor you expect for storage efficiency; it can be none, 1.5:1, or 2:1
  • Calculate Button – NOTE: you must click Calculate whenever you make any changes above

Storage Calculator Charts

Total Usage

  • The left donut chart shows the Extent Store and the CVM. Extent Store is adjusted for either RF2 or RF3 depending on the input selection. So here the Extent Store is adjusted for RF2 and is 7.26 TiB. The total amount of Extent Store is 2x that amount, or 14.52 TiB. The adjustment is made so the customer sees the amount of storage they have given the Replication Factor they prefer.
  • The right donut breaks out all the CVM pieces, whether stored on HDD or SSD.
  • Effective Capacity is above the charts. It is Extent Store * Storage Efficiency Factor + ECX savings. Again we adjust for the RF level. This capacity then represents the storage available to customers at their preferred RF level, including the expected benefits from storage efficiency as well as ECX.
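
As a rough illustration of the two chart values above, here is a minimal Python sketch. The function names are hypothetical and are not part of Sizer. Dividing by 3 for RF3 is an assumption by analogy with the RF2 example, and the ECX savings are assumed to be supplied already in RF-adjusted TiB.

```python
def rf_adjusted_extent_store(total_extent_store_tib, rf):
    """Extent Store as the customer sees it at the chosen Replication Factor.

    RF2 divides by 2, as in the 14.52 TiB -> 7.26 TiB example above.
    Dividing by 3 for RF3 is an assumption made by analogy.
    """
    if rf not in (2, 3):
        raise ValueError("Replication Factor must be 2 or 3")
    return total_extent_store_tib / rf


def effective_capacity(extent_store_tib, efficiency_factor=1.0, ecx_savings_tib=0.0):
    """Extent Store * Storage Efficiency factor + ECX savings.

    efficiency_factor is 1.0 (none), 1.5 (1.5:1) or 2.0 (2:1).
    ecx_savings_tib is assumed to be given in the same RF-adjusted units.
    """
    return extent_store_tib * efficiency_factor + ecx_savings_tib


if __name__ == "__main__":
    adjusted = rf_adjusted_extent_store(14.52, rf=2)
    print("RF2-adjusted Extent Store (TiB):", adjusted)
    print("Effective Capacity at 1.5:1 (TiB):", effective_capacity(adjusted, 1.5))
```

The actual Storage Calculator may apply the RF adjustment and ECX savings in a different order; this only mirrors the formula as stated in the text.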

SSD Usage

This is a supplemental graph from Total Usage.  It breaks out just the SSD portion of the Total Usage.

  • Top graph shows SSD CVM and SSD Extent Store adjusted for either RF2 or RF3
  • Lower graph shows all the SSD CVM elements.

HDD Usage

This is a supplemental graph from Total Usage.  It breaks out just the HDD portion of the Total Usage.

  • Top graph shows HDD CVM and HDD Extent Store adjusted for either RF2 or RF3
  • Lower graph shows all the HDD CVM elements.

 

What do the letters in the SSD drive indicate?

The letters indicate different levels of endurance in terms of Drive Writes Per Day (DWPD). For example, 3 DWPD means you can rewrite all the data on the drive 3 times a day for the entire warranted life of the drive.
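
To make the DWPD definition concrete, here is a small Python sketch that converts a DWPD rating into the total data that can be written over the warranty period. The drive size and the 5-year warranty in the example are illustrative assumptions, not values taken from any Sizer model.

```python
def lifetime_writes_tb(capacity_tb, dwpd, warranty_years):
    """Total data (in TB) that can be written over the warranted life.

    One "drive write" is a full rewrite of the drive's capacity, so the
    total is capacity * writes per day * days in the warranty period.
    """
    days = 365 * warranty_years
    return capacity_tb * dwpd * days


if __name__ == "__main__":
    # Hypothetical 1.92 TB drive rated at 3 DWPD with a 5-year warranty.
    print(lifetime_writes_tb(1.92, 3, 5))
```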

VDI Sizing (Frame/HorizonView/Citrix Desktops)

VDI Profiles  used in Sizer

Sizer relies on Login VSI profiles and tests.  Here are descriptions about the profiles and applications run

Task Worker Workload

  • The Task Worker workload runs fewer applications than the other workloads (mainly Excel and Internet Explorer with some minimal Word activity, Outlook, Adobe, copy and zip actions) and starts/stops the applications less frequently. This results in lower CPU, memory and disk IO usage.

Below is the profile definition for a Task Worker:

Knowledge Worker Workload

  • The Knowledge Worker workload is designed for virtual machines with 2vCPUs. This workload contains the following applications and activities:
    •  Outlook, browse messages.
    •  Internet Explorer, browse different webpages and a YouTube style video (480p movie trailer) is opened three times in every loop.
    •  Word, one instance to measure response time, one instance to review and edit a document.
    •  Doro PDF Printer & Acrobat Reader, the Word document is printed and exported to PDF.
    •  Excel, a very large randomized sheet is opened.
    •  PowerPoint, a presentation is reviewed and edited.
    •  FreeMind, a Java based Mind Mapping application.
    •  Various copy and zip actions.

Below is the profile definition for a Knowledge Worker:

Power Worker Workload

  • The Power Worker workload is the most intensive of the standard workloads. The following activities are performed with this workload:
    •  Begins by opening four instances of Internet Explorer which remain open throughout the workload.
    •  Begins by opening two instances of Adobe Reader which remain open throughout the workload.
    •  There are more PDF printer actions in the workload as compared to the other workloads.
    •  Instead of 480p videos a 720p and a 1080p video are watched.
    •  The idle time is reduced to two minutes.
    •  Various copy and zip actions.

Below is the profile definition for a Power Worker:

Developer Worker Type

Sizer also offers a Developer profile, which assumes 1 core per user (2 vCPUs, vCPU:pCore = 2). Use that for super heavy user demands.

Below is the profile definition for a Developer:

What are the strengths and weaknesses of the profiles

Strengths

  • LoginVSI is the de facto industry standard VDI performance testing suite. That offers the ability to use common terms like “knowledge worker”.
  • The test suite was run on a Nutanix-based cluster to find the number of users supported with reasonable performance. From there we could build out the profile definitions in Sizer, so they are based on lab results.
  • Things were set up optimally. Hyperthreading is turned on and the cluster is set up using best practices.
  • It does a good job of not only having a mix of applications but also varying workload activity as more users are added, for example how frequently applications are opened. So it does simulate having multiple users in a real environment.
  • It is essentially the “best game in town” for getting consistent sizing.

Weaknesses

  • In the end, VDI is a shared environment and sizing will depend on the activities of the users. So if three companies each have 1000 task workers, each company could have different sizing requirements, as what the users do and when they do it will vary.

What other factors does Sizer consider for VDI sizing:

Common VDI sizing parameters:  (Across all VDI Brokers)

Windows desktop OS and Office version:

Depending on the OS and Office version type, there are performance implications and cores are adjusted accordingly.

The below table has the adjustment factors for cores depending on the Windows OS:

Version                   Factor
No adjustment             1
Windows 11 – 22H2         1.3915
Windows 11 – 21H2         1.334
Windows 10 – 22H2         1.1845
Windows 10 – 21H2         1.219
Windows 10 – 20H2         1.219
Windows 10 – 2004         1.15
Windows 10 – 1903/1909    1.135
Windows 10 – 1803/1809    1.1
Windows 10 – 1709         1.05

The factors above include performance hits from Spectre and Meltdown updates.

Similarly, the below table has the adjustment factors for cores depending on the Windows Office version:

Version             Factor
Office 2010         0.75
Office 2013         1
Office 2016/2019    1

Display Protocol:                   

Depending on the VDI broker, there are the following Display Protocols:

VMware Horizon View:

  • Blast (default)
  • PCoIP

Citrix Virtual Desktop:

  • ICA (default)

Frame:  

  • Frame Remote Protocol (FRP)

There are adjustments to cores depending on the selected protocol for the respective VDI brokers, as follows:

Protocol    Factor
ICA         1
PCoIP       1.15
Blast       1.38
Frame       1.45
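
The adjustment tables above can be expressed as simple lookups. The Python sketch below is only an illustration: the dictionary and function names are hypothetical, only a few of the Windows versions are listed to keep it short, and multiplying the three factors together is an assumption, since the text only says that cores are adjusted by each factor.

```python
# Factors copied from the tables above (subset of Windows versions).
OS_FACTOR = {
    "No adjustment": 1.0,
    "Windows 11 - 22H2": 1.3915,
    "Windows 10 - 22H2": 1.1845,
    "Windows 10 - 1709": 1.05,
}
OFFICE_FACTOR = {
    "Office 2010": 0.75,
    "Office 2013": 1.0,
    "Office 2016/2019": 1.0,
}
PROTOCOL_FACTOR = {
    "ICA": 1.0,
    "PCoIP": 1.15,
    "Blast": 1.38,
    "Frame": 1.45,
}


def adjusted_cores(base_cores, os_version, office_version, protocol):
    """Apply the OS, Office and display protocol factors to a core count.

    Assumption: the factors are independent and combine by multiplication.
    """
    return (
        base_cores
        * OS_FACTOR[os_version]
        * OFFICE_FACTOR[office_version]
        * PROTOCOL_FACTOR[protocol]
    )


if __name__ == "__main__":
    print(adjusted_cores(100, "Windows 10 - 1709", "Office 2010", "PCoIP"))
```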

Sizing equations for Cores/RAM/Storage:

Cores: 

Cores = users * vCPUs per user * (1 / vCPUs per pCore) * 125% if V2V/P2V * 85% if 2400 MHz DIMM
Note: If the provisioning type is V2V/P2V, cores need to be increased by 25%. The default is now the Thinwire video protocol, which causes a 25% hit. With H.264 there is no hit. We assume the Thinwire default is used, as the Sizer user probably does not know.
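
The cores equation can be sketched in Python as follows. The function and parameter names are hypothetical. The 125% and 85% multipliers are applied only when the corresponding condition holds, which is how the equation above reads.

```python
def vdi_cores(users, vcpus_per_user, vcpus_per_pcore,
              v2v_p2v=False, dimm_2400mhz=False):
    """Cores = users * vCPUs per user / vCPUs per pCore, with adjustments."""
    cores = users * vcpus_per_user / vcpus_per_pcore
    if v2v_p2v:
        cores *= 1.25  # 25% increase for V2V/P2V provisioning
    if dimm_2400mhz:
        cores *= 0.85  # 85% factor for 2400 MHz DIMMs, as stated above
    return cores


if __name__ == "__main__":
    # Developer-style profile: 2 vCPUs per user at vCPU:pCore = 2,
    # which works out to 1 core per user.
    print(vdi_cores(1000, vcpus_per_user=2, vcpus_per_pcore=2))
    print(vdi_cores(1000, 2, 2, v2v_p2v=True))
```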

RAM: 

RAM = (users * RAM in GiB per user * 1/1024 TiB/GiB) + (64 MB * users * conversion from MB to TiB)

Notes:

a. The first part finds the RAM for user data.

b. The second part calculates the per-VM requirement, where each user is one VM.

Note: Hypervisor RAM will be added to CVM RAM, as there is one hypervisor per node.
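
A minimal Python sketch of the RAM equation follows. The names are hypothetical. The text does not say whether the 64 MB per VM is decimal or binary, so treating it as decimal megabytes (10^6 bytes) is an assumption here. Hypervisor and CVM RAM are not included.

```python
MB_TO_TIB = 1e6 / 2**40  # assumption: 1 MB = 10^6 bytes, 1 TiB = 2^40 bytes
PER_VM_OVERHEAD_MB = 64


def vdi_ram_tib(users, ram_gib_per_user):
    """RAM in TiB: user RAM plus a 64 MB per-VM requirement (one VM per user)."""
    user_ram_tib = users * ram_gib_per_user / 1024
    per_vm_tib = PER_VM_OVERHEAD_MB * users * MB_TO_TIB
    return user_ram_tib + per_vm_tib


if __name__ == "__main__":
    print(vdi_ram_tib(users=1024, ram_gib_per_user=4))
```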

SSD:

For VDI workload, the rule to calculate SSD is as follows:

SSD = hotTierDataPerNode * estNodes + goldImageCapacity * estNodes + numUsers * requiredSSD

where:

  • hotTierDataPerNode = 0.3 GB, converted to GiB
  • estNodes = max(1, cores / 20), where cores is the calculated cores
  • goldImageCapacity is as per the selected profile
  • numUsers is as received from the UI
  • requiredSSD = 2.5 GiB for Task Worker, 3.3 GiB for Knowledge Worker/Epic Hyperspace/Hyperspace + Nuance Dragon, 5 GiB for Power User/Developer

With unit conversions, in TiB:

SSD = (0.3 GB * 0.931323 GiB/GB * estNodes + goldImageCapacity in GiB * estNodes + numUsers * requiredSSD in GiB) * 1/1024 TiB/GiB
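
The SSD rule can be sketched in Python as shown below. The function names and dictionary keys are hypothetical. The text does not say whether the estimated node count is rounded, so it is left as the unrounded value of max(1, cores / 20).

```python
GB_TO_GIB = 0.931323
HOT_TIER_GB_PER_NODE = 0.3

# Required SSD per user in GiB, by worker type (values from the rule above).
REQUIRED_SSD_GIB = {
    "task": 2.5,
    "knowledge": 3.3,
    "power": 5.0,
    "developer": 5.0,
}


def est_nodes(cores):
    """Estimated node count: max(1, cores / 20), not rounded (assumption)."""
    return max(1, cores / 20)


def vdi_ssd_tib(num_users, cores, gold_image_gib, worker_type):
    """SSD requirement in TiB for a VDI workload."""
    nodes = est_nodes(cores)
    total_gib = (
        HOT_TIER_GB_PER_NODE * GB_TO_GIB * nodes
        + gold_image_gib * nodes
        + num_users * REQUIRED_SSD_GIB[worker_type]
    )
    return total_gib / 1024


if __name__ == "__main__":
    print(vdi_ssd_tib(num_users=100, cores=10, gold_image_gib=20, worker_type="task"))
```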

HDD:

For VDI workload, the rule to calculate HDD is as follows: 

HDD = VDI – SSD if VDI > SSD, else HDD = 0

where:

  • VDI = numUsers * actPerUserCap
  • numUsers is as received from the UI
  • actPerUserCap = goldImageCapacity + userDataCap if provisionType is V2V/P2V or Full Clone
  • actPerUserCap = userDataCap if provisionType is not V2V/P2V or Full Clone
  • goldImageCapacity and userDataCap are as received from the UI
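
The HDD rule can be sketched in Python as follows. The names are hypothetical, all capacities are assumed to be in the same unit, and "Linked Clone" is used only as an illustrative example of a provisioning type other than V2V/P2V or Full Clone.

```python
FULL_COPY_TYPES = ("V2V/P2V", "Full Clone")


def vdi_hdd(num_users, user_data_cap, gold_image_cap, provision_type, ssd):
    """HDD = VDI - SSD when VDI exceeds SSD, otherwise 0.

    Each user carries a full copy of the gold image only for V2V/P2V
    or Full Clone provisioning.
    """
    if provision_type in FULL_COPY_TYPES:
        act_per_user_cap = gold_image_cap + user_data_cap
    else:
        act_per_user_cap = user_data_cap
    vdi = num_users * act_per_user_cap
    return max(0, vdi - ssd)


if __name__ == "__main__":
    print(vdi_hdd(100, 10, 20, "Full Clone", ssd=500))
    print(vdi_hdd(100, 10, 20, "Linked Clone", ssd=1500))
```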

VDI Sizing – July 2018 sprint

  • Dell completed extensive VDI testing using LoginVSI profiles and the test suite on a Nutanix cluster using their Skylake models. So we now have the most extensive lab testing results to update Sizer profiles, and we updated Sizer VDI workload sizing accordingly. The key reasons:
    • This was run on Skylake models and so includes any enhancements in that architecture
    • The latest AOS version was used
    • Best practices were used by VDI experts in setting up the cluster. For example, hyperthreading is turned ON
    • The latest LoginVSI suite was used
  • Here is a summary of the results:
    • The big change is task workers. In the old days of Windows 7 and Office 2010, we saw 10 task workers per core as a common ratio. However, both Windows 10 and Office 2016 are very expensive resource-wise. In the lab tests we only get about 6 users per core. As a result, we are seeing a big bump in core counts for task workers: for example, an 18% increase in cores for XenApp task workers and 28% for Horizon task workers. A customer’s actual usage will vary.
    • Windows 7 is estimated to need 60% of the cores that Windows 10 needs.
    • Office 2010 is estimated to need 75% of the cores that Office 2016 needs.
    • Knowledge workers for either the View or XenDesktop broker did not change much
    • Power users on View did not change much
    • Power users for XenDesktop increased by 21%, as the profile changed from 5 users per core to just 4 users per core.


Usable Capacity

Usable Remaining Capacity is the amount of storage that is available to the customer AFTER workloads, RF, storage savings are applied.  It represents what they should have remaining once deployed.

Sizer presents the values in both RF2 and RF3.

Usable Remaining Capacity (Assuming RF2)

  • HDD Usable Remaining  Capacity = (Raw + Compression Savings + Dedupe Savings + ECX Savings – Workload – RF Overhead – CVM overhead ) / 2
  • SSD Usable Remaining  Capacity =  (Raw + Compression Savings + Dedupe Savings + ECX Savings – Workload – RF Overhead – CVM overhead + Oplog ) / 2
  • Notes:
    • Usable capacity is basically raw capacity plus storage savings from data reduction techniques like compression, less workload, RF overhead, and CVM overhead.
    • If All Flash, the Compression Savings, Dedupe Savings, ECX Savings, RF Overhead, and CVM overhead that would be attributed to HDDs are applied to SSDs.
    • For SSD capacity, Oplog is included as part of the CVM overhead for SSDs but is also added back, as it is a write log and so is available for user data.
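
The two RF2 formulas above can be sketched in Python with a single function. The names are hypothetical, and all inputs are assumed to be in the same capacity unit. The divisor defaults to 2 as in the formulas. Passing 3 for RF3 is an assumption, since the RF3 formula is not spelled out here.

```python
def usable_remaining_capacity(raw, compression_savings, dedupe_savings,
                              ecx_savings, workload, rf_overhead,
                              cvm_overhead, oplog=0.0, rf=2):
    """Usable Remaining Capacity for one tier.

    For HDD leave oplog at 0. For SSD pass the Oplog size, which is
    added back because it is available for user data.
    """
    total = (
        raw
        + compression_savings
        + dedupe_savings
        + ecx_savings
        - workload
        - rf_overhead
        - cvm_overhead
        + oplog
    )
    return total / rf


if __name__ == "__main__":
    # Illustrative numbers only.
    print(usable_remaining_capacity(100, 10, 5, 5, 40, 40, 10))           # HDD tier
    print(usable_remaining_capacity(100, 10, 5, 5, 40, 40, 10, oplog=2))  # SSD tier
```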

Extent Store and Effective Capacity

Extent Store

This is a concept that is used in the Nutanix Bible.  This is RAW capacity less CVM.  It represents the capacity that is available to a customer

 

Effective Capacity

Used in Storage Calculator or DesignBrewz. This is the Extent Store * the Storage Efficiency setting in Storage Calculator. So if the Extent Store is 10 TiB and the Storage Efficiency factor is set to 1.5:1, then the Effective Capacity is 15 TiB. The Storage Efficiency factor is the expected benefit of storage reduction approaches like compression, dedupe, and ECX. Effective Capacity, then, is what is hoped to be available with these reduction techniques.