Sizer Wiki Production – Page 12 – Just another WordPress site

March 18, 2019

N+0, N+1, N+2 Failover Indicator

This is a BIG sizing improvement in Sizer where Sizer will always tell you if you are at N+0, N+1 or N+2 failover level for all resources (CPU, RAM, HDD, SSD) for each cluster.

Now, as you make changes in automatic or manual sizing, you always know whether you have adequate failover. Best practice is N+1, so you can take down any one node (e.g. take a node offline for an upgrade) and customer workloads can still run.

This can be very hard to figure out on your own. ECX savings, for example, vary by node count. Heterogeneous clusters mean you have to find the largest node for each resource. Multiple clusters mean you have to look at each one separately. Sizer does this for you!

Here is what you need to know.

Let’s take a two-cluster scenario. One, called Cluster-1, is a Nutanix cluster running 900 VDI users and the Files deployment that supports those users. The other is a standalone cluster for Files Pro with 100TB of user data.

All clusters:

In a multi-cluster scenario, All Clusters just provides a summary. Here it shows the two clusters and their hardware. The N+1 indicator on the lower left shows the level of the worst cluster. Both clusters are N+1, so you see N+1. Had either cluster been N+0, then N+0 would be shown. It is a great indicator that there is an issue with one of the clusters.

File cluster

This is the standalone cluster for Files. You see the hardware used in the cluster and the failover level for each resource (CPU, RAM, HDD, SSD). N+2 indicates that the cluster could possibly run with less of that resource, but product options often force more anyhow. This is a cold-storage-intensive workload, so HDD is the worst case.

Cluster-1

This is the Nutanix cluster for the VDI users. You see the hardware used in the cluster and the failover level for each resource (CPU, RAM, HDD, SSD). This is a core-intensive workload, so CPU is the worst case.
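To make the indicator concrete, here is a minimal Python sketch of how a failover level could be worked out by hand. It is not Sizer's actual algorithm: the function names, the node capacities and the demand figures are all hypothetical, and it ignores effects such as ECX savings changing with node count. It assumes a resource is at N+k if the workload still fits after removing the k largest nodes for that resource, that a cluster is reported at the level of its worst resource, and that All Clusters reports the worst cluster.

```python
def failover_level(node_capacities, demand, max_level=2):
    """Largest k (0..max_level) such that demand still fits after losing
    the k largest nodes. Returns -1 if the demand does not fit at all."""
    ordered = sorted(node_capacities, reverse=True)  # largest nodes first
    level = -1
    for k in range(min(max_level, len(ordered)) + 1):
        if sum(ordered[k:]) >= demand:
            level = k
        else:
            break
    return level


def cluster_level(resources):
    """resources maps a resource name to (per-node capacities, demand).
    Returns the per-resource levels and the worst (lowest) of them."""
    per_resource = {
        name: failover_level(capacities, demand)
        for name, (capacities, demand) in resources.items()
    }
    return per_resource, min(per_resource.values())


# Hypothetical heterogeneous cluster: one node has more cores than the rest.
vdi_cluster = {
    "CPU cores": ([32, 32, 32, 48], 90),
    "RAM GiB": ([512, 512, 512, 512], 900),
}
levels, worst = cluster_level(vdi_cluster)
print(levels)  # {'CPU cores': 1, 'RAM GiB': 2}
print(worst)   # 1, so this cluster would be reported as N+1
```

In this made-up example CPU is the limiting resource, which mirrors the core-intensive VDI cluster described above.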

February 5, 2019 (updated May 20, 2025)

Sizing Recommendations for Objects

General Information on Objects

Understanding how Nutanix Objects works provides a useful context for any sizing. To read about the architecture check out the Objects Tech Note: https://portal.nutanix.com/page/documents/solutions/details?targetId=TN-2106-Nutanix-Objects:TN-2106-Nutanix-Objects

To understand the current maximums visit: https://portal.nutanix.com/page/documents/configuration-maximum/list?software=Nutanix%20Objects

Nutanix Objects falls under Nutanix Unified Storage (NUS) licensing. For an overview of NUS licensing visit: https://www.nutanix.com/products/cloud-platform/software-options#nus

Performance vs. Capacity Workloads

In the past object storage solutions were only concerned with capacity; performance was barely a consideration. However, modern workloads such as AI/ML and data analytics leverage S3 compatible storage, and these very often have significant performance demands. Nutanix Objects has been internally benchmarked with both hybrid and all flash systems (see https://portal.nutanix.com/page/documents/solutions/details?targetId=TN-2098-Nutanix-Objects-Performance-INTERNAL-ONLY:TN-2098-Nutanix-Objects-Performance-INTERNAL-ONLY) and as a result we have a good understanding of Objects’ performance capabilities with a variety of workload profiles. Because Objects I/O performance scales linearly, these results can be reliably extrapolated to model performance scaling. Importantly, Sizer uses the empirical data from the benchmark testing to determine the minimum number of Objects workers – and therefore nodes (Objects enforces 1 worker per node per object store for HA reasons) – needed to deliver a certain level of performance.

It should also be noted that there are factors outside the object store, such as network speed and number of client connections, that play a significant role in achieving the best possible performance from Nutanix Objects. In particular, each node/worker needs 60-75 concurrent client connections driving I/O for its maximum performance potential to be realized.

More commonly, node count will be driven by capacity requirements. Even in these cases however, the minimum Objects worker count needed for the required performance should still be noted, especially in mixed deployments (discussed further below).

Configurations

While there is no difference in NUS licensing between dedicated deployments (the AOS cluster is dedicated to NUS) and mixed deployments (NUS resides on the same cluster as application VMs), sizing considerations in each scenario vary to a degree. These are discussed below.

Information about hardware models suitable for Objects (and Files) can be found at: https://www.nutanix.com/products/hardware-platforms/specsheet?platformProvider=Nutanix&useCase=Files%20and%20Objects. The link points to Nutanix NX models, but you can easily change the hardware vendor as required. At the time of writing, HPE provides the node with the highest storage density (DX4120-G11). Make sure ‘Files and Objects’ is selected as the use case.

Worker count and HDD spindle count (or SSD count if All Flash)

In scenarios where a certain level of performance must be met, Sizer will look at the number of workers needed and the number of HDDs (or SSDs in the case of all flash) needed to deliver the throughput entered*. A high performance worker on an all flash node can, in most scenarios, deliver substantially more throughput than a standard worker on a hybrid node.

Disk-wise, in RF2 configs Sizer assumes a single SATA HDD can deliver reads:100MB/s and writes:50MB/s. Sizer assumes a single SSD can deliver reads:500MB/s and writes:250MB/s. So for example, a hybrid node with 10*HDDs can deliver 1GB/s for a workload consisting entirely of reads, or 500MB/s for a workload consisting entirely of writes. If RF3 or FT1n/2d is selected the write throughput figure is increased by 50% to account for the additional disk write IO. Note that FT1n/2d is strongly recommended for storage dense configs.

*For performance sensitive sizings please do not just accept the default values in the performance profile section (“Profile Info” box in the top right of the Workload page). The default values are purely arbitrary. You must determine the actual average object size, R/W (get/put) split and throughput that the customer needs to achieve and enter that customer-specific data into the Profile Info section.
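The disk arithmetic above can be sketched in a few lines of Python. This is an illustration only, not Sizer's implementation: the function name and parameters are hypothetical, it assumes each disk splits its time between reads and writes, and it interprets the 50% adjustment for RF3 or FT1n/2d as 1.5x the disk write load compared with RF2.

```python
import math

# Per-disk throughput assumed by Sizer in RF2 configs (MB/s), from the text above.
DISK_MBPS = {
    "hdd": {"read": 100, "write": 50},
    "ssd": {"read": 500, "write": 250},
}


def disks_needed(throughput_mbps, pct_puts, disk="hdd", dual_failure_protection=False):
    """Estimate the cluster-wide disk count for a target throughput.

    throughput_mbps         customer throughput in MB/s (gets + puts)
    pct_puts                percentage of that throughput that is writes
    dual_failure_protection True for RF3 or FT1n/2d (assumed 1.5x write load)
    """
    rates = DISK_MBPS[disk]
    writes = throughput_mbps * pct_puts / 100
    reads = throughput_mbps - writes
    if dual_failure_protection:
        writes *= 1.5
    return math.ceil(reads / rates["read"] + writes / rates["write"])


print(disks_needed(1000, 0))    # 10 HDDs for 1 GB/s of pure reads
print(disks_needed(500, 100))   # 10 HDDs for 500 MB/s of pure writes
print(disks_needed(1000, 20))   # 12 HDDs for a mixed 80/20 read/write workload
```

The first two results match the 10*HDD example in the text. Always feed in the customer's real throughput and get/put split rather than default profile values.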

For an Objects dedicated configuration (hybrid)

Objects is supported on all models and platforms that can run AOS (NCI). However, if you’re sizing for a dedicated hybrid Objects cluster with 100TiB or above, we recommend the HPE DX4120-G11, NX-8155-G9 or equivalent for the best performance. Such models are ideal due to their high HDD spindle count (as discussed in the previous section), though any model will work as long as it matches the minimum configurations listed below.

CPU: dual-socket 12-core CPU (minimum) for hybrid configs with 4 or more HDDs
- Dual-socket 10-core CPU is acceptable for hybrid configs in use cases that do not require fast performance
Memory: 128GB per node (minimum)
Disk:
- Avoid hybrid configurations that have only 2 HDDs per node.
- For hybrid configurations that need to deliver good throughput, systems with 10+ HDDs are highly recommended. On an NX-8155, for example, go for 2*SSD + 10*HDD rather than 4*SSD + 8*HDD. This is explained further in the section “Why 10+ HDDs in a dedicated hybrid config?” below.
- If a system with 10 or more HDDs is not available, configure the system with the highest number of HDDs possible.
- Erasure Coding: inline enabled (set by default during deployment)
  - Note inline EC has a 10-15% impact on write performance (accounted for if you choose “inline EC” in Sizer)
- FT choice: choosing between RF2, FT1N/2D and RF3 has an impact on the system’s deliverable write throughput (see previous section “Worker count and HDD spindle count (or SSD count if All Flash)”)
Network: dual 25GbE generally recommended (but check calculation in “Network” section)

NOTE: In Sizer to force a hybrid cluster output make sure “Hybrid” is selected under “Worker Node”. Sizer does not automatically choose between hybrid or all-flash for you.

Licensing: NUS Starter covers any Objects deployment on a hybrid system (whether shared or dedicated).

Why 10+ HDDs in a dedicated hybrid config?

In the majority of today’s use cases objects tend to be large (>1.5MiB), meaning they manifest as sequential I/O on the Nutanix cluster. In response to this, Objects architecture is tuned to take full advantage of the HDD tier. If there are HDDs in a node, Objects will automatically write sequential data directly to them, while leveraging the SSDs purely for metadata (if there are any objects under 1.5MB these will land in the SSD tier). 

There are 3 reasons for this:

1. Solid sequential I/O performance can be achieved with HDDs, assuming there are enough of them.
2. Objects deployments can be up to petabytes in size. At that sort of scale, cache or SSD hits are unlikely, so using SSDs in hopes of achieving accelerated performance through caching would provide little return on the additional costs. To keep the solution cost-effective, Objects minimizes SSD requirements by using SSDs for metadata, and using them for data only if required.
3. Since we recommend a dual-socket 12-core CPU configuration, fewer SSDs also helps to avoid system work that would otherwise be incurred by having to frequently move data between tiers – the result is less stress on the reduced CPU count.

If, however, the workload is made up of mostly small objects, all-flash systems are significantly better at catering for the resulting random I/O, particularly if the workload is performance intensive.

For an Objects dedicated configuration (all-flash)

If all-flash is the preference, any system with 3 or more SSD/NVMe devices is generally fine, although the calculation described earlier must be performed based on actual throughput requirements (Sizer does this). If the all-flash nodes must also be storage dense we recommend the NX-8150-G9. From a compute standpoint, all-flash Objects clusters should have a minimum of:

CPU: dual-socket 20-core CPU (minimum) for all-flash configs – importantly, this allows the “Performance Config” to be selected at deployment
Memory: 128GB per node (minimum)
Disk: For all flash configurations, systems with 3 SSDs/NVMes (or more) are recommended.
Erasure Coding: inline enabled (set by default during deployment)
- Note inline EC has a 10-15% impact on write performance (accounted for if you choose “inline EC” in Sizer)
FT choice: choosing between RF2, FT1N/2D and RF3 has an impact on the system’s deliverable write throughput (see previous section “Worker count and HDD spindle count (or SSD count if All Flash)”)
Network: quad 25GbE, dual 40GbE or higher generally recommended, and for very high performance requirements dual 100GbE will be needed (check calculation in “Network” section)

NOTE: In Sizer to force an all-flash cluster output make sure “All Flash” is selected under “Worker Node”. Sizer does not automatically choose between hybrid or all-flash for you.

Licensing: NUS Pro covers any Objects deployment on an all flash system (whether shared or dedicated).

For a mixed configuration (Objects coexisting with User VMs)

Objects is supported on any model and any platform as long as it matches the minimum configurations listed below.

CPU: at least 12 vCPUs are available per node
- All node types with dual-socket CPUs are supported and preferred, though single CPUs with at least 24 cores are also supported

Memory: at least 36GB available to Objects per node
Disk: avoid hybrid configurations with only 2 HDDs per node and bear in mind that more HDD spindles means better performance.
- Erasure Coding: Inline enabled (set by default during deployment)
  - Note inline EC has a 10-15% impact on write performance (accounted for if you choose “inline EC” in Sizer)
- FT choice: choosing between RF2, FT1N/2D and RF3 has an impact on the system’s deliverable write throughput (see previous section “Worker count and HDD spindle count (or SSD count if All Flash)”)
Network: dual 25GbE recommended (but check calculation in “Network” section)

Both the NUS Starter and Pro licenses allow one User VM (UVM) per node. If taking advantage of this, ensure that there are enough CPU cores and memory on each node to cater for both an Objects worker and the UVM – and potentially also a Prism Central (PC) VM, unless PC will be located on a different cluster. It’s important to understand that Nutanix Objects cannot be deployed without there being a Prism Central present somewhere in the environment.

Network

This section provides information on working out the network bandwidth (NIC speed and quantity) needed per node, given the customer’s throughput requirement and the number of load balancers in the deployment. Conversely, it can be used to work out how many load balancers are needed, particularly if the customer is limited to a particular speed of network. At the end of this section is a link to a spreadsheet that helps you perform these calculations.

Note that Sizer does not perform these calculations. Sizer will statically configure all-flash Objects nodes with 4 x 25GbE ports (two dual port cards). However, that might not be enough, so it’s important that you do the performance calculations below and, if necessary, manually increase the NIC speed and/or quantity in Sizer.

1. Firstly it’s important to be aware that for each put (write) request received by Objects there is 4x network amplification. The write path is as follows:

Client > Load Balancer (1) > Worker (2) > CVM (3) > RF write to another CVM (4)

If RF3 or FT1N/2D is selected, this increases to 5x network amplification.

For each get (read) request received there is 3x amplification. The read path is as follows:

CVM > Worker (1) > Load Balancer (2) > Client (3)

If EC is selected (EC is default if there are enough nodes) read amplification could, for many gets, increase to 4x if parts of the EC strip need to be read from other CVMs.

So the total network bandwidth needed for the object store is determined by the customer’s requested throughput multiplied by these factors in the correct proportions (R/W). The resulting overall bandwidth requirement is then spread across the load balancers – a relatively even distribution is assumed.

2. Take whatever % of the customer’s throughput is write IO (puts) – this is typically expressed in MB/s or GB/s – and multiply by 4 (or 5 – see above) to account for the write amplification. Next, take whatever % of the customer’s throughput is read IO (gets) and multiply that by 3 (or 4 – see above) to account for the read amplification. Combine the results and you have the overall throughput requirement to/from the cluster.

Example:

Customer requirement:

Throughput = 5 GB/s

% puts = 20

Write throughput = 1 GB/s x 4 (write amplification) = 4 GB/s

Read throughput = 4 GB/s x 3 (read amplification) = 12 GB/s

Total bandwidth to/from object store = 4 GB/s + 12 GB/s = 16 GB/s

3. Divide the overall throughput figure by the number of load balancers you plan to deploy. The result is the amount of network bandwidth needed per physical node.

Example:

4 Load Balancers

16 GB/s / 4 = 4 GB/s per node

4. Map this figure to the real world limits of NICs of varying speeds. These are listed below for your convenience. Note that when 2 links are aggregated using LACP you do not get twice the bandwidth of a single link due to overheads. With 2 links in LACP you can assume ~20% bandwidth loss, with 4 you can assume ~40% loss. Further to that, and before LACP overhead is accounted for, a NIC’s advertised bandwidth is never fully achievable due to general networking overheads (protocol and other real world factors).

Usable bandwidth (achievable GB/s) by NIC speed and number of links in LACP:

NIC speed	1 link (no aggregation)	2 links	4 links
10GbE	1.1	1.8	2.7
25GbE	2.8	4.4	6.6
40GbE	4.4	7.0	10.4
100GbE	10.5	*12.5 (not 16.8)	*12.5 (not 25.2)

*At the time of writing OVS, the virtual switch architecture used by AHV/KVM, has a limit of 100Gbps – this means the maximum network throughput a single node can handle is 12.5 GB/s (100/8). The configurations affected by this are 2x and 4x 100GbE links in LACP. There are future plans to lift this limit (roadmap item).

Example:

4 GB/s per node is needed.

Each node needs 2 x 25GbE NICs (in LACP), which can do 4.4GB/s

This spreadsheet may help with the network bandwidth and load balancer calculations.
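For a quick check without the spreadsheet, steps 1 to 4 above can be sketched in Python. The function names are hypothetical and this is not how Sizer or the spreadsheet is implemented; the amplification factors and the achievable GB/s figures are copied from this section, and a relatively even spread across load balancers is assumed, as above.

```python
import math

# Achievable GB/s per node: {NIC speed in GbE: {links in LACP: GB/s}}.
# 100GbE with 2 or 4 links is capped at 12.5 GB/s by the OVS limit noted above.
ACHIEVABLE_GBPS = {
    10: {1: 1.1, 2: 1.8, 4: 2.7},
    25: {1: 2.8, 2: 4.4, 4: 6.6},
    40: {1: 4.4, 2: 7.0, 4: 10.4},
    100: {1: 10.5, 2: 12.5, 4: 12.5},
}


def object_store_bandwidth(throughput_gbps, pct_puts,
                           dual_failure_protection=False, ec_remote_reads=False):
    """Total network bandwidth (GB/s) to/from the object store.

    Puts are amplified 4x (5x with RF3 or FT1N/2D); gets are amplified 3x
    (4x if EC strips must be read from other CVMs)."""
    write_amp = 5 if dual_failure_protection else 4
    read_amp = 4 if ec_remote_reads else 3
    puts = throughput_gbps * pct_puts / 100
    gets = throughput_gbps - puts
    return puts * write_amp + gets * read_amp


def nic_options(per_node_gbps):
    """All (NIC speed, link count) combinations that can carry per_node_gbps."""
    return [
        (speed, links)
        for speed, by_links in ACHIEVABLE_GBPS.items()
        for links, gbps in by_links.items()
        if gbps >= per_node_gbps
    ]


def load_balancers_needed(total_gbps, nic_speed_gbe, links):
    """Load balancers needed when the network speed is fixed."""
    return math.ceil(total_gbps / ACHIEVABLE_GBPS[nic_speed_gbe][links])


total = object_store_bandwidth(5, 20)       # 16.0 GB/s, as in the example
per_node = total / 4                        # 4 load balancers -> 4.0 GB/s each
print(nic_options(per_node)[0])             # (25, 2): 2 x 25GbE in LACP
print(load_balancers_needed(total, 10, 2))  # 9 if limited to 2 x 10GbE per node
```

The last line shows the reverse calculation: with the network speed fixed, the same figures give the number of load balancers required.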

Sizing Use Cases

Use Case: Backup

Below is a Backup workload in Objects Sizer. In this scenario Nutanix Objects is used as a target to store backups sent from backup clients (i.e. the backup app).

Note that Nutanix Objects should not be located on the same physical cluster as the source data (i.e. the data being backed up).

Considerations when sizing a backup workload

Initial capacity – estimated initial capacity that will be consumed by backups stored on Nutanix Objects.
Capacity growth – % growth of the backup data per time unit (e.g. years) over an overall specified length of time.
Be cautious and do not attempt to cater for too long a growth period, otherwise the amount of capacity required due to growth could dwarf the amount of storage required on day one. Specifying a (for example) 10-year growth period undermines our fundamental pay-as-you-grow value. Plus of course growth predictions may not be entirely accurate in any case. 3 years is a typical growth period to size for.
Do not enable Nutanix deduplication on any Objects workloads.
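To see why a long growth period can dwarf the day-one requirement, here is a small sketch. It assumes growth compounds once per year, which may not match how Sizer applies the growth percentage, and the function name and figures are hypothetical.

```python
def projected_capacity(initial_tib, growth_pct_per_year, years):
    """Capacity after compound annual growth (assumption: yearly compounding)."""
    return initial_tib * (1 + growth_pct_per_year / 100) ** years


day_one = 100  # TiB, hypothetical initial backup capacity
for years in (3, 5, 10):
    print(years, round(projected_capacity(day_one, 20, years), 1))
# At 20% per year: about 172.8 TiB after 3 years, but about 619.2 TiB after 10.
```

In this made-up case a 10-year horizon asks the customer to buy more than six times the day-one capacity up front, which is the outcome the advice above warns against.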

Profile Info:
- All values can be customized as required.
- Write (PUT) traffic usually dominates these environments as backups occur more regularly than restores (GETs). Furthermore, when restores do occur they are usually just reading a small subset of the backup.
  - That said, more and more customers are becoming increasingly concerned with how fast all their data could be restored in the event of a ransomware attack – so do check this with the customer
- Backups usually result in sequential I/O so the requirement is expressed as MB/s throughput. Veeam is the one exception to this rule – discussed further below.
- Backups usually consist of large objects (with the exception of Veeam – discussed further below)
- “Sustained” only applies to small object (<1.5MB) puts. In a hybrid system, when the hot tier fills up the application I/O must wait while the data is drained from SSD/NVMe to HDD. This is why sustained small object put throughput is slower than burst small object put throughput.
Replication Factor
- When using nodes with large disks (12TB+) to achieve high storage density it’s recommended you use RF3 or, better still, 1N/2D if there are 100 or more disks in a single fault domain. This provides a higher level of resilience against disk failure. Disk failure is more likely in this scenario for two reasons:
  - The more disk hardware you have the greater the risk of a disk failure
  - Disks take longer to rebuild because they contain more data, thus the window of vulnerability is extended (to days rather than hours)
- The larger drive capacities also mean there is a greater chance of encountering a latent sector error (LSE) during rebuild
- This drives a real need for protection against dual disk failure – true regardless of whether the disks are HDD or SSD/NVMe.
- 1N/2D coupled with wider EC strip sizes is preferred to RF3 due to it being more storage efficient
- If you wish to stick with RF2, consider using multiple Objects clusters.
  - However each cluster will have its own N+1 overhead.

Special rules for Veeam

Veeam is different from other backup apps in that it does not write large objects. With Veeam the default object size is ~768KB, about a tenth (or less) of the size of objects generated by other backup apps. Therefore, for Veeam opportunities the specialized “Backup – Veeam” use case in Sizer should be selected. Note that small object performance requirements must be expressed in Sizer in requests/sec rather than MB/sec. Therefore some conversion may be required if the customer has provided a throughput number (the contrasting I/O gauges are discussed in the cloud-native apps section).

Because small objects will always hit the SSD/NVMe tier there is a danger the hot tier will fill up quickly causing Veeam to wait while the data is periodically drained to the HDDs. For this reason all-flash Objects is a better solution for Veeam, and is the default when the “Backup – Veeam” use case is selected.

Please see the Sizing Nutanix Object for Veeam guidance document.
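If the customer has given a throughput figure, the conversion to requests/sec is a simple division by the object size. The sketch below is an illustration with a hypothetical function name; it assumes 1 MB = 1024 KB, so adjust it if the customer's figures use decimal units.

```python
def throughput_to_requests_per_sec(throughput_mb_per_sec, object_size_kb):
    """Convert MB/s into requests/sec for a given object size (1 MB = 1024 KB assumed)."""
    return throughput_mb_per_sec * 1024 / object_size_kb


# Hypothetical Veeam requirement: 750 MB/s with the default ~768KB object size.
print(throughput_to_requests_per_sec(750, 768))  # 1000.0 requests/sec
```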

Special rules for Commvault

If Commvault is the backup app, check whether the customer wishes to use both Commvault’s deduplication and WORM. If this is the case (and it often is), the storage requirement must be increased by 2.4x.

Please see the Sizing Nutanix Object for Commvault guidance document.

Use Case: Archive

Archive is very similar to Backup and so the same advice applies. The profile values aren’t quite the same however, as you can see below. As with Backup though, these can be customized to the customer’s specific workload needs.

Use Case: Cloud-Native Apps

Cloud-native is a broad category covering a wide range of workload profiles. The correct I/O profile here depends on whether (and how) a containerized application will leverage Objects, or whether Objects is being deployed to support K8s management functions (or both). Object storage is commonly used in a K8s supportive role as an image registry, a log target and/or a backup repository. However, the cloud-native category can also include live application data, including anything from containerized big data/analytics application data to vector database indexes and logs used in AI inference, all of which have intensive I/O requirements. For this reason, the default profile in Sizer (shown below) reflects a workload that’s performance sensitive in nature. Object size can also vary greatly in this category, but with many cloud-native workloads the object size will be much smaller than with traditional backup and archive workloads, so the profile defaults to a small object size. Smaller objects result in random I/O rather than sequential, and when this is the case all flash nodes are a far better choice than hybrid. Note that this random I/O value is expressed in Sizer in requests/sec, rather than the MB/sec throughput metric that’s used to represent large object sequential I/O. These metrics are consistent with how random and sequential I/O respectively are normally gauged in the industry.

When sizing Objects for a cloud-native app, it’s important to try to find out from the customer what the I/O profile for the app is, so that you can edit the I/O profile settings accordingly. This is especially important given the wide variance of cloud-native workload types out there. In the absence of such information, all flash is the safe choice.

There is also a “Number of Objects (in millions)” field for all workload types. This is often relevant to cloud-native workloads (though not exclusively so), which can result in billions of objects needing to be stored and addressed. This value is used to determine how many Objects workers are needed to address the number of objects that will be stored. Thus, it could be that an Objects cluster sizing is constrained not by performance nor by capacity, but by metadata requirements.

What’s Missing from Sizer Today?

There are some sizing scenarios that are not currently covered by Objects Sizer. These are listed below, together with advice about what to do.

Sizing for intensive list activity

Sizer cannot currently account for list activity. However, if you have been given a list requirement that you need to factor into your sizing, note that we have done benchmarking against list activity – the results can be viewed here.

Work with your local NUS SA to extrapolate these benchmarks to your customer’s requirement.

Objects sizes not currently represented in Sizer

Sizer currently only represents 128KB objects (small) and 8MB+ objects (large) – another object size is included (768KB) but it’s specifically for Veeam.

Small and large object workloads have very different performance profiles.

Objects of 8MB and above have a consistent performance profile, so select 8MB+ whenever you need to represent objects larger than 8MB and the output will be accurate. In Sizer, object size doesn’t matter above 8MB because you simply enter the overall throughput required (rather than requests/sec), together with the % puts (writes).

However, object sizes from 1KB right up to just under 8MB have logarithmically different performance profiles, meaning it is not easy to predict the performance of (for example) a 768KB object workload given what we know about 128KB performance and 8MB performance. Fortunately engineering has benchmark data for various object sizes other than 128KB and 8MB and this data can be used to identify a configuration that’s a closer fit to your customer’s specific object size. Work with your local NUS SA if you have this requirement. More object sizes will be added to Sizer in the future.

It’s again worth pointing out that objects >1.5MiB in size are classed by AOS as sequential I/O and will go straight to the HDD tier. Objects of 1.5MB or less, on the other hand, are classed as random I/O and will go straight to the SSD/NVMe tier. Knowing your customer’s object size in light of this fact is a significant factor (though not the only one) in helping you understand whether hybrid or all-flash is likely to be the better option.
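As a rough first check of where a workload's data will land on a hybrid cluster, the rule above can be written as a small function. This is a sketch with hypothetical names; it takes the threshold to be 1.5 MiB (the text uses both MiB and MB) and does not capture the other factors that influence the hybrid versus all-flash decision.

```python
SEQUENTIAL_THRESHOLD_BYTES = int(1.5 * 1024 * 1024)  # 1.5 MiB, units assumed


def landing_tier(object_size_bytes, all_flash=False):
    """Tier an object is expected to land on, based on its size alone."""
    if all_flash:
        return "SSD/NVMe"  # no HDD tier exists on all-flash nodes
    if object_size_bytes > SEQUENTIAL_THRESHOLD_BYTES:
        return "HDD"       # classed as sequential I/O
    return "SSD/NVMe"      # classed as random I/O


KIB = 1024
for size_kib in (128, 768, 8 * 1024):
    print(size_kib, landing_tier(size_kib * KIB))
# 128 KiB and 768 KiB objects land on SSD/NVMe; 8 MiB objects go to HDD.
```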

Veeam and Commvault

These backup apps have additional considerations that can significantly affect the Objects cluster specification. You should not expect a straightforward ‘vanilla’ Backup sizing to be appropriate for these. Veeam is less of a challenge to size given that Sizer has a specialist category for Veeam workloads (“Backup – Veeam”). We are hoping to add a specialist Commvault category to Sizer in the future. In any case, please refer to the below documents when sizing Veeam or Commvault.

Visit the Sizing Nutanix Object for Veeam guidance document for more details.

Visit the Sizing Nutanix Object for Commvault guidance document.

If you have any doubts or difficulties sizing Objects, don’t hesitate to contact your local NUS Solution Architect (SA) for assistance. The SAs are listed here – https://ntnx-intranet–simpplr.vf.force.com/apex/simpplr__app?u=/site/a0xf4000004zeZ7AAI/dashboard

February 5, 2019

January 2019 Sprint

January Sprint 1

Key enhancements

We heard from the field and partners that creating either budgetary quotes or real quotes is really hard (I can’t repeat what we heard in interviews as I want to keep the channel G-rated). Partners were saying it can take a week to get either a budgetary quote or a real quote going. Distributors were saying they retype what is in the Sizer BOM to create a quote. Nutanix SEs were saying this is really hard with CBL.

We knew we could help, so we took it on big time over the last three months. In summary, a partner or NTNX field person can now create either a budgetary or real quote for any business model, be it appliance, disaggregated Nutanix, XC Core or CBL SW sale. Attached is the matrix with details. I believe this is about improving sales velocity.

We now have the new Files licenses going with the standalone cluster. So for Files Pro you can create the Files cluster with only the Files SKUs attached (no AOS licenses). More changes are coming, but this is a big step.
Data Center and ROBO Solutions. These are add-ons for your HCI recommendation where we add the right amount of things like Prism Pro or Flow.

Product Updates

We always pull the latest from SFDC for Nutanix products
Implemented several Nutanix product rules like allowed CPUs for 3070 if GPU is desired.
We always pull the latest from HCL for SW only vendors

January Sprint 2

Key enhancements

Include ECX in auto sizing. We always applied the savings after the recommendation was determined, so the HDD utilization was accurate. However, we hadn’t taken the savings into account in determining the recommendation itself. Now that we have really large workloads for Files, and soon Buckets, this became an issue. So now the Sizer recommendation is accurate.
Clone Workload Feature. This is cool. Define a workload, clone it, and then just modify what you want. For example, you want five different Server Virtualization workloads that are all similar but different. Define one and clone/edit the rest.
Thick VM sizing logic improved when uploading Collector or RVTools outputs. Now no compression is taken. A subtle but important sizing improvement.

UX improvements

Budgetary Quote – Added Hardware Support quote line
Making List view as default dashboard view instead of grid.
Implement Open/Closed Opportunity Filter – UX

Product Updates

New model – Fujitsu XF8050 HY/AF
HPE models are DL380/360 again instead of DX. There was a problem with the HCL, but it has been addressed.
We always pull the latest from SFDC for Nutanix products
Implemented several Nutanix product rules like allowed CPUs for 3070 if GPU is desired.
We always pull the latest from HCL for SW only vendors

Coming soon

Buckets !! Should come out this week. I’ll announce it later but you can start thinking about Buckets by sizing different opportunities

January 24, 2019 (updated April 3, 2019)

Introduction

Sizer users can size for different types of business requirements, which may require sizing for:

NX Appliances
NX Core
Software Only Vendors

Sizer users, both Nutanix employees and partners (resellers and distributors), can generate the following documents with the help of Sizer for any of the above-mentioned scenarios:

BOM
Budgetary Quote
Salesforce Quotes

The contents of these documents (BOM, Budgetary Quote and Salesforce Quote) change significantly based on whether one is sizing for NX-Appliance, NX-Core or CBL (Software Only vendors, Dell XC, HX Certified). Most of these changes are related to license and support products. Sizer uses information present in the Salesforce Account and Opportunity to determine the correct license and support options. The following flowchart explains how we make such decisions:

January 24, 2019

Partners: BOM and Salesforce Quotes

Requirements

In order to generate a correct BOM, Budgetary Quote and Salesforce Quote, the user must enter the Opportunity ID or Deal Registration Approval ID.
In the case of Nutanix models, the Opportunity value helps Sizer determine whether the sizing needs to be done for appliance or disaggregated hardware. Based on the Opportunity value, the user will be presented with the support and license selection page applicable to software choice or appliance sizing.
The Opportunity value plays a significant role when sizing with non-Nutanix Software Only vendors. If the Opportunity is marked for Software Only, the user will be presented with the support and license selection page applicable to CBL; otherwise the user won’t see any support and license selection page.
Opportunity is a required field when creating a Salesforce Quote. It also ensures that Sizer isn’t sending quotes created for the velocity program to non-velocity opportunities.
If the user doesn’t have the opportunity information but would still like to size for CBL/Software Choice, they can do so by selecting the “Software Choice Only” option on the create scenario page.

Steps to push BOM & Quote to Salesforce

Linking a scenario to an opportunity or deal reg. Two features are enabled for distributors:
- Upload BoM to Salesforce from Sizer
- Generate Salesforce Quote from Sizer
In Salesforce, BoM and Quote are dependent on an opportunity, so to upload a BoM or to generate a Salesforce Quote, Sizer needs to know which opportunity to use. A user can provide opportunity information on the scenario creation page. Any of the following IDs/numbers is acceptable:
- 15- or 18-character Opportunity ID
- Deal registration approval ID
The Opportunity ID and Deal registration approval ID can be found on the DQT opportunity.
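As a rough illustration of the accepted ID shapes, the sketch below checks whether a value looks like a 15- or 18-character Opportunity ID. It assumes Salesforce record IDs are purely alphanumeric; the helper name and the sample values are hypothetical, and the format of a Deal registration approval ID is not described here, so the check does not attempt to recognize one.

```python
import re

# Assumption: a Salesforce record ID is 15 alphanumeric characters,
# optionally followed by a 3-character suffix (18 characters in total).
_OPPORTUNITY_ID = re.compile(r"[A-Za-z0-9]{15}(?:[A-Za-z0-9]{3})?")


def looks_like_opportunity_id(value: str) -> bool:
    """Return True if value has the shape of a 15- or 18-character ID."""
    return _OPPORTUNITY_ID.fullmatch(value.strip()) is not None


if __name__ == "__main__":
    # Made-up sample values, for illustration only.
    for sample in ("006000000000001", "006000000000001AAA", "0060000001"):
        print(sample, looks_like_opportunity_id(sample))
```

A value that fails this check is not necessarily wrong; it may be a Deal registration approval ID, which would need a different check.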

Once the sizing is completed by adding the workloads and selecting and modifying the financial assumptions, proceed with the actions: Generate BoM, Generate Budgetary Quote, Push BoM to Salesforce, Generate Salesforce Quote.
These actions can be found on the left panel/sidebar of the Scenario detail page.

Clicking on “Push BoM to Salesforce” will launch a modal/pop-up. The user can confirm the opportunity and push the BoM to Salesforce. A successful upload will close the modal and display a message on the page.

Clicking on “Generate Salesforce Quote” will launch a modal/pop-up. The user can confirm the opportunity and the license/support options before generating the Salesforce Quote. A successful upload will close the modal and display a message on the page.

The opportunity information can be modified (removed, updated) only by clicking on the “Edit Scenario” action.
Email notifications, containing a link to the opportunity in the case of a BoM upload and a link to the quote in the case of a Quote, are sent to users (Distributor, Opportunity Owner, Primary SE).

January 23, 2019 (updated February 4, 2019)

Advice for Large File workloads

Please contact vikram.gupta@nutanix.com for assistance

Here is the Nutanix Files Sizing Guide

Nutanix Files Sizing Guide

Adjusting nodes manually

If you are doing a manual sizing, make sure it still meets N+1 resiliency.

This is easy to check and adjust if needed.

Go to manual sizing and decrement the node count.

Two things to keep in mind:

First, the minimum number of nodes for Files is 4, as there is an FSVM on three nodes and a fourth node is needed for N+1 (one node can be taken offline and there are still 3 nodes to run the 3 FSVMs). So 4 nodes are needed independent of capacity.
Second, as with any Nutanix cluster, you want to make sure you are still at N+1. The table below shows the maximum HDD utilization (Files is an HDD-heavy workload) that still assures N+1. For example, if you have 6 nodes and the HDD utilization is under 75%, you can be assured that you are at N+1. Here the N+0 target (utilization after losing a node) is 90%, meaning that with one node offline the utilization is 90% or less.

Node	N+0 Utilization Target	Max Threshold for N+1
4	90%	67.50%
5	90%	72.00%
6	90%	75.00%
7	90%	77.14%
8	90%	78.75%
9	90%	80.00%
10	90%	81.00%
11	90%	81.82%
12	90%	82.50%
13	90%	83.08%
14	90%	83.57%
15	90%	84.00%
16	90%	84.38%
17	90%	84.71%
18	90%	85.00%
19	90%	85.26%
20	90%	85.50%
21	90%	85.71%
22	90%	85.91%
23	90%	86.09%
24	90%	86.25%
25	90%	86.40%
26	90%	86.54%
27	90%	86.67%
28	90%	86.79%
29	90%	86.90%
30	90%	87.00%

January 19, 2019 (updated February 4, 2019)

Adjusting Nutanix File node count in Manual

Right now Sizer’s automatic sizing does not take Erasure Coding savings into account when it works out the recommendation. Fortunately, the savings are taken into account in the HDD utilization.

So the usable capacity and the utilization are accurate, but the recommended node count could be high, depending on the settings for compression among other things.

This is easy to check and adjust if needed.

Go to manual sizing and decrement the node count.

Two things to keep in mind:

First, the minimum number of nodes for Files is 4, as there is an FSVM on three nodes and a fourth node is needed for N+1 (one node can be taken offline and there are still 3 nodes to run the 3 FSVMs). So 4 nodes are needed independent of capacity.
Second, as with any Nutanix cluster, you want to make sure you are still at N+1. The table below shows the maximum HDD utilization (Files is an HDD-heavy workload) that still assures N+1. For example, if you have 6 nodes and the HDD utilization is under 75%, you can be assured that you are at N+1. Here the N+0 target (utilization after losing a node) is 90%, meaning that with one node offline the utilization is 90% or less.

Node	N+0 Utilization Target	Max Threshold for N+1
4	90%	67.50%
5	90%	72.00%
6	90%	75.00%
7	90%	77.14%
8	90%	78.75%
9	90%	80.00%
10	90%	81.00%
11	90%	81.82%
12	90%	82.50%
13	90%	83.08%
14	90%	83.57%
15	90%	84.00%
16	90%	84.38%
17	90%	84.71%
18	90%	85.00%
19	90%	85.26%
20	90%	85.50%
21	90%	85.71%
22	90%	85.91%
23	90%	86.09%
24	90%	86.25%
25	90%	86.40%
26	90%	86.54%
27	90%	86.67%
28	90%	86.79%
29	90%	86.90%
30	90%	87.00%
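The thresholds in the table are consistent with a simple rule: the N+0 target multiplied by (N − 1) / N. The sketch below reproduces them under the assumption that every node contributes the same HDD capacity (a heterogeneous cluster would need the largest node removed instead). The function names are illustrative and are not part of Sizer.

```python
MIN_FILES_NODES = 4  # 3 FSVM nodes plus one node for N+1, as described above


def max_utilization_for_n_plus_1(nodes: int, n0_target: float = 0.90) -> float:
    """Highest HDD utilization that still leaves the cluster at or below
    n0_target after one node is lost, assuming equal-capacity nodes."""
    if nodes < 2:
        raise ValueError("need at least 2 nodes to lose one")
    return n0_target * (nodes - 1) / nodes


def is_n_plus_1(nodes: int, hdd_utilization: float, n0_target: float = 0.90) -> bool:
    """Check a manual Files sizing: minimum node count and utilization threshold."""
    if nodes < MIN_FILES_NODES:
        return False
    return hdd_utilization <= max_utilization_for_n_plus_1(nodes, n0_target)


if __name__ == "__main__":
    for n in (4, 6, 12, 30):
        print(n, f"{max_utilization_for_n_plus_1(n):.2%}")
```

When decrementing the node count in manual sizing, re-check the HDD utilization against the threshold for the new node count, since the threshold drops as the cluster gets smaller.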

January 7, 2019

December 2018 Sprint

In the month of December we made the following enhancements:

Partners can now generate a Salesforce Quote for Nutanix Appliance. This is the first time they can create the quote themselves; their distributor then applies discounts and requests approvals in the Distributor Quote Tool. We will enable quoting for Nutanix Disaggregated, XC Core, HX Core and CBL in January.
Sizer now integrates with the Hardware Compatibility List for all SW vendors, including HP, Cisco and Dell PE. The benefit is that we can keep Sizer in sync with changes more consistently.
Nutanix Files enhancements

Increased maximum capacity to 10 PB (Petabytes)

Support for Nutanix Files licenses. The budgetary quote, quote and BOM include the new licenses.

HX Core support
Intel server support (new vendor)
NX-5155-G6 in Sizer
ECX and Block Awareness together supported
Sizer Wiki Search – find information faster
Budgetary Quote changes for CBL
Oracle AWR tool support

December 14, 2018

Sizer BOT

SizerBot is here to assist the Sizer team in answering questions that are repetitive in nature. This guide breaks down our approach for Phase 1 and our plan to keep building on the feedback generated.

Types of questions on the Sizer Channel

We classify questions on the Sizer channel into two buckets:

The first set of questions typically has answers that depend on unique configurations and combinations of sizing.
The second set of questions is more generic, and the answers can be found in the wiki.

What questions does the bot address?

Extensive research in the field of ‘conversational AI’ is underway to mine the first set of questions. This requires larger data sets as well as the ability to interpret the context of a question, which we are exploring for the next few phases of the bot.

For the first phase we are targeting a small subset of repetitive questions, which we are building up using content from this channel as well as from the Sizer product managers.

How does the bot work?

Based on a stored list of questions, we use a matching algorithm to identify whether a question similar to the one asked on the Slack channel exists in the database. While we could have the bot respond to every question asked on the Slack channel, we avoid this by raising the threshold to 90%. This means that unless the probability of the match is greater than 90%, no answer is provided. This is often referred to as the trade-off between precision and recall, which addresses the question: do we want the bot to provide more answers, or do we want the bot to provide answers more accurately?
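The thresholding idea can be sketched in a few lines. This is not the bot's actual matching algorithm, which is not described here; it substitutes Python's standard `difflib.SequenceMatcher` similarity ratio as a stand-in score, and the stored question and answer are invented for illustration.

```python
from difflib import SequenceMatcher
from typing import Optional

# Hypothetical stored questions and answers, for illustration only.
STORED_QA = {
    "how do i check if my cluster is at n+1?":
        "See the N+0, N+1, N+2 Failover Indicator page on the wiki.",
}


def answer(question: str, threshold: float = 0.90) -> Optional[str]:
    """Return the stored answer for the best-matching question,
    or None when the best similarity score is not above the threshold."""
    asked = question.strip().lower()
    best_reply, best_score = None, 0.0
    for stored_question, reply in STORED_QA.items():
        # Stand-in similarity score in [0, 1]; the real bot may use something else.
        score = SequenceMatcher(None, asked, stored_question).ratio()
        if score > best_score:
            best_reply, best_score = reply, score
    return best_reply if best_score > threshold else None


if __name__ == "__main__":
    print(answer("How do I check if my cluster is at N+1?"))
    print(answer("Which NIC should I pick for an HX node?"))
```

Raising `threshold` makes the bot stay silent more often but answer more accurately when it does speak (higher precision); lowering it produces more answers at the cost of more wrong ones (higher recall).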

Improving the bot’s performance

We currently have two ways of improving the bot’s performance:

When a question is answered by the bot – the feedback provided is used to upvote or downvote the bot’s response.
When a question is not answered by the bot – we first identify whether the question is repetitive in nature; if so, the question is automatically logged to a database through a separate Slack channel.

Suggestions

We are always open to more suggestions and recommendations on how to improve this process. You can reach out to revathi.anilkumar@nutanix.com with your feedback.

December 7, 2018

November 2018 Sprint

November Sprint

Collector 1.1 Support – Sizer now includes median values from Collector. We mine the vCenter data and get actual core usage. THIS IS VERY COOL AS IT CAN SAVE CBL CORES, WHICH COST LOTS OF $$.
Velocity models for partners and Nutanix users – Velocity models are the NX-1065-G6 and can be up to 8 nodes but offer better discounts.
XC Core support. Quotes and BOM show CBL and XC Core. Budgetary quote coming soon.
Applied Spectre adjustment for VDI and XenApp – Windows 7 – 20% hit, Windows 10 version 1709 – 7% hit, Windows 10 version 1803 – 11% hit
Performance Improvements. Homogeneous sizing typically reduced by 50%, while heterogeneous sizing reduced by 30%
File Services updates like setting Erasure Coding to ON by default
Budgetary quote improvements – can now apply discounts for SW Only quotes or NX Disaggregated quotes
New product updates for various vendors
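To make the Spectre adjustment in the list above concrete, here is a minimal sketch of how such a percentage hit could be applied. How Sizer applies it internally is not stated; this assumes the hit simply inflates the CPU core demand of the workload by that percentage, and the names used are hypothetical.

```python
# Hit percentages taken from the sprint notes above.
SPECTRE_HIT = {
    "Windows 7": 0.20,
    "Windows 10 1709": 0.07,
    "Windows 10 1803": 0.11,
}


def adjusted_core_demand(cores: float, os_version: str) -> float:
    """Assumed model: inflate the core demand by the Spectre hit for the OS."""
    return cores * (1.0 + SPECTRE_HIT[os_version])


if __name__ == "__main__":
    for os_version in SPECTRE_HIT:
        print(os_version, round(adjusted_core_demand(100, os_version), 1))
```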