Files

Introduction – Please Read First

These questions are here to help ensure you gather the necessary information from a customer/prospect to put together an appropriate solution that meets their requirements, in addition to capturing specific metrics from tools like Collector or RVTools.

This list is not exhaustive, but it should be used as a guide to make sure you've done proper and thorough discovery.  It is also imperative that you understand why each question is being asked rather than just asking it.  We've structured this document so that each entry includes not only the question to ask, but also why we're asking the customer for that answer and why it matters for providing an optimal solution.

Questions marked with an asterisk (*) will likely require reaching out to a specialist/Solution Architect resource at Nutanix to go deeper with the customer on that topic/question.  Make sure you use the answers to these questions in the Scenario Objectives in Sizer when you create a new Scenario.  These questions should help guide you as to what the customer's requirements, constraints, assumptions, and risks are for your opportunity.

This is a living document, and questions will be expanded and updated over time.


Files

1.  Is this replacing a current solution, or is this a net new project?
     a.  What’s the current solution?

Why ask? This question helps us understand the use case, any current expectations, and what the competitive landscape may look like.

2.  Using an existing Nutanix cluster (with existing workload) or net new Nutanix cluster?

Why ask?  If we're sizing into an existing cluster, we need to understand the current hardware and current workload.  For licensing purposes, adding Files to an existing cluster means using the Files for AOS license.  A common scenario has been to add storage-only nodes to an existing cluster to support the new Files capacity.  If sizing into a new cluster, we can potentially dedicate that cluster to Files and use Files Dedicated licensing.

3.  Is this for NFS, SMB or both?
     a.  Which protocol versions (SMB 3.0, NFSv4, etc)?

Why ask?  We need to understand the protocol first to validate that the customer is using supported clients.  Supported clients are documented in the release notes of each Files version.  Concurrent SMB connections also impact sizing with respect to the compute resources the FSVMs need to handle those clients; maximum concurrent connections are likewise documented in the release notes of each version.  Protocol also helps us validate supported authentication methods.  For SMB, we require Active Directory at the 2008 domain functional level or higher (there is no local user or group support for Files).  For NFSv4 we support AD with Kerberos, LDAP, and unmanaged (no auth) shares.  For NFSv3 we support LDAP and unmanaged.

4.  Is there any explicit performance requirement from the customer?  Do they require specific IOPS, throughput, or latency numbers?

Why ask?  Every FSVM has an expected performance envelope.  A sizing guide and performance tech note on the Nutanix Portal give a relative expectation of the maximum read and write throughput per FSVM and the maximum read and write IOPS per FSVM.  Read and write throughput is integrated into Nutanix Sizer and will impact the recommended number of FSVMs. https://portal.nutanix.com/page/documents/solutions/details?targetId=TN-2117-Nutanix-Files-Performance:TN-2117-Nutanix-Files-Performance

5.  Do they have any current performance collection from their existing environment?
      a.  Windows File Server = Perfmon
      b.  NetApp = Perfstat
      c.  Dell = DPACK / Live Optics

Why ask?  Seeing data from an existing solution can help validate the performance numbers so that we size accurately for performance. 
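If the customer hands you a raw Perfmon export, a quick summary of peak and 95th-percentile values is usually enough for sizing.  Below is a minimal Python sketch that does this against a Perfmon CSV; the file name and counter column name are hypothetical examples and must be matched to whatever counters were actually captured.

import csv

def summarize(csv_path, column):
    """Return max and 95th-percentile values for one Perfmon counter column."""
    values = []
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            try:
                values.append(float(row[column]))
            except (KeyError, ValueError):
                continue  # skip blank samples or missing counters
    values.sort()
    if not values:
        return 0.0, 0.0
    p95 = values[max(int(len(values) * 0.95) - 1, 0)]
    return values[-1], p95

if __name__ == "__main__":
    # Hypothetical export and counter names -- adjust to the real capture.
    peak, p95 = summarize("fileserver_perfmon.csv",
                          r"\\FILESRV01\LogicalDisk(_Total)\Disk Transfers/sec")
    print(f"Peak IOPS: {peak:.0f}, 95th percentile: {p95:.0f}")

Sizing to the 95th percentile rather than the absolute peak avoids overbuilding for a one-off spike, but confirm with the customer which behavior they expect.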

6.  What are the specific applications using the shares?
       a.  VDI (Home Shares)
       b.  PACS (Imaging)
       c.  Video (Streaming)
       d.  Backup (Streaming)

Why ask?  When sizing for storage space utilization, the application performing the writes can impact storage efficiency.  Backup, video, and imaging data are most commonly already compressed by the application.  For those applications we should not include compression savings when sizing, only Erasure Coding.  For general-purpose shares with various document types, assume some level of compression savings.

7.  Are they happy with performance or looking to improve performance?

Why ask?  If the customer has existing performance data, it’s good to understand if they are expecting equivalent or better performance from Files.  This could impact sizing, including going from a hybrid to an all flash cluster. 

 8.  How many expected concurrent user connections?

Why ask? Concurrent SMB connections are a required sizing parameter.  Each FSVM needs enough memory assigned to support a given number of users.  A standard share is owned by one FSVM; a distributed share is owned by all FSVMs and is load balanced based on top-level directories.  We need to ensure any one FSVM can support all concurrent clients to the standard share or top-level directory with the highest expected connections.  We should also ensure that sizing for concurrent connections takes N-1 redundancy into account for node maintenance, failure, etc.
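As a rough illustration of the N-1 math, here is a minimal Python sketch.  The per-FSVM connection capacity below is a placeholder; always use the figure from the release notes/sizing guide for the FSVM memory size you plan to deploy, and remember that a single busy standard share or top-level directory must still fit on one FSVM.

import math

def fsvms_needed(concurrent_connections, connections_per_fsvm):
    # Size so the load still fits when one FSVM's node is down (N-1).
    n = math.ceil(concurrent_connections / connections_per_fsvm) + 1
    return max(n, 3)  # Files deploys with a minimum of 3 FSVMs

# Placeholder values: 4,000 concurrent users, assumed 1,500 connections per FSVM.
print(fsvms_needed(concurrent_connections=4000, connections_per_fsvm=1500))  # -> 4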

9.  Will the underlying hardware config support larger or more FSVMs if additional throughput or performance is required?

Why ask? Files is both a scale-out and a scale-up workload, so you need to know what growth in the environment could look like.

 10.  Current share configuration including number of shares?

Why ask?  Files has a soft (recommended) limit of 100 shares per FSVM.

11.  Directory structure:
       a.  Large number of folders in share root?

Why ask?  This indicates a large number of top-level directories, making a distributed share a good choice for load balancing and data distribution.

       b.  Files in share root?

Why ask?  Distributed shares cannot store files in the share root.  If an application must store files in the root then you should plan for sizing using standard shares.  Alternatively, a nested share can be used. 

       c.  Total size of largest single directories?

Why ask?  Nutanix supports standard shares up to 140 TB, and top-level directories in a distributed share up to 140 TB.  These limits are based on the volume group supporting the standard share or top-level directory.  We need to ensure no single folder or share (if using a standard share) surpasses 140 TB.  Files compression can also yield more usable storage per share.  See the Nutanix Files – Deployment and Upgrade FAQ: https://portal.nutanix.com/page/documents/kbs/details?targetId=kA00e000000LMXpCAO

       d.  Largest number of files/folders in a single folder?

Why ask?  Nutanix Files is designed to store millions of files within a single share and billions of files across a multi-node cluster with multiple shares.  To achieve fast response times in high file and directory count environments, it's necessary to give some thought to directory design.  Placing millions of files or directories into a single directory makes the file enumeration that must occur before file access very slow.  The optimal approach is to branch out from the share root with leaf directories up to a width (directory or file count in a single directory) no greater than 100,000; subdirectories should have a similar width.  If file or directory counts get very wide within a single directory, clients and applications can see slow response times.  Increasing FSVM memory up to 96 GB to cache metadata can help improve performance for these environments, especially if the directory and file design guidance above is followed.
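To help the customer answer the directory-structure questions above, a quick survey of the existing share can be scripted.  This is a minimal Python sketch assuming the share is mounted locally at an illustrative path; on very large shares the walk can take a long time, so run it against a representative subset if needed.

import os

def survey(share_root):
    """Count top-level directories, files in the share root, and the widest directory."""
    top_dirs, root_files = 0, 0
    widest_path, widest_count = "", 0
    for entry in os.scandir(share_root):
        if entry.is_dir(follow_symlinks=False):
            top_dirs += 1
        else:
            root_files += 1
    for path, dirs, files in os.walk(share_root):
        width = len(dirs) + len(files)
        if width > widest_count:
            widest_path, widest_count = path, width
    return top_dirs, root_files, widest_path, widest_count

# "/mnt/share" is an illustrative mount point for the existing file server.
top_dirs, root_files, widest_path, width = survey("/mnt/share")
print(f"Top-level directories: {top_dirs}, files in share root: {root_files}")
print(f"Widest directory: {widest_path} with {width} entries "
      f"({'OK' if width <= 100_000 else 'exceeds the ~100,000 guideline'})")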

12.  Total storage and compute requirements including future growth?

Why ask?  Core sizing question to ensure adequate storage space is available with the initial purchase and over the expected timeframe. 
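When the customer can only give you a current footprint and a growth rate, a simple compound-growth projection is usually sufficient for the Sizer input.  A minimal sketch with illustrative numbers:

def projected_capacity_tib(current_tib, annual_growth_pct, years):
    """Project usable capacity needed after compounding annual growth."""
    return current_tib * (1 + annual_growth_pct / 100) ** years

# Illustrative: 200 TiB today, growing 20% per year, sized for a 3-year purchase.
print(f"{projected_capacity_tib(200, 20, 3):.0f} TiB")  # -> 346 TiB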

13.  Does your sizing include the resources to run File Analytics?

Why ask?  FA is a key differentiator for Files and drives a lot of customer delight and insight into their data.  Every SE and every sizing should assume that any Files customer will want to run FA as well – don't present it as an optional component.

14.  Percent of data considered to be active/hot?

 Why ask?  Understanding the expected active dataset can help with sizing the SSD tier for a hybrid solution.  Performance and statistical collection from an existing environment may help with this determination.

 15.  Storage change rate?

Why ask?  Change rate influences snapshot overheads based on retention schedules.  Nutanix Sizer will ask what the change rate is for the dataset to help with determining the storage space impact of snapshot retention.
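For a rough back-of-the-envelope view of why change rate matters, here is a minimal sketch that mirrors the inputs Sizer asks for.  The numbers are illustrative, and the math ignores overlap between changed blocks across snapshots, so treat it as a simple upper-bound estimate rather than Sizer's actual calculation.

def snapshot_overhead_tib(dataset_tib, daily_change_pct, retained_daily_snaps):
    """Rough snapshot space overhead from daily change rate and retention count."""
    return dataset_tib * (daily_change_pct / 100) * retained_daily_snaps

# Illustrative: 100 TiB dataset, 2% daily change, 14 daily snapshots retained.
print(f"{snapshot_overhead_tib(100, 2, 14):.0f} TiB")  # -> 28 TiB of overhead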

 16.  Any storage efficiency details from the current environment (dedup, compression, etc.)?

Why ask?  Helps to determine if data reduction techniques like dedup and compression are effective against the customer's data.  Files does not support the use of deduplication today, so any dedup savings should not be taken into account when sizing for Files.  If the data is compressible in the existing environment, it should also be compressible with Nutanix compression.

 17.  Block size of current solution (if known)?

Why ask?  Block size can impact storage efficiency.  A solution which has many small files with a fixed block size may show different space consumption when migrated to Files, which uses variable block lengths based on file size.  For files over 64 KB in size, Files uses a 64 KB block size.  In some cases, datasets with a large number of large files have been slightly less efficient when moved to Nutanix Files.  Understanding this up front can help explain differences following migration.
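To illustrate the rounding effect of a 64 KB block size (the size noted above for files larger than 64 KB), here is a minimal sketch; it is only an approximation to help explain small space differences after migration, not an exact model of on-disk layout.

import math

BLOCK = 64 * 1024  # 64 KB block size for files larger than 64 KB

def allocated_bytes(logical_bytes):
    """Round the logical file size up to whole 64 KB blocks."""
    return math.ceil(logical_bytes / BLOCK) * BLOCK

# Illustrative: a 1,000,000-byte file rounds up to 16 blocks = 1,048,576 bytes.
logical = 1_000_000
alloc = allocated_bytes(logical)
print(alloc, f"{(alloc / logical - 1) * 100:.1f}% overhead")  # -> ~4.9% overhead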

18.  Self Service Restore (SSR) requirements (share level snapshots)?

Why ask?  Nutanix Files uses two levels of snapshots; SSR snapshots occur at the file share level via ZFS.  These snapshots have their own schedule, and Sizer asks for their frequency and change rate under “Nutanix Files Snapshots.”  The schedule and retention periods associated with SSR will impact overall storage consumption.  Nutanix Files snapshots increase both the amount of licensing required and the total storage required, so it's important to get this right during the sizing process.

 19.  Data Protection/Disaster Recovery requirements (File Server Instance snapshots):
         a.  Expected snapshot frequency and retention schedule (hourly, daily, weekly, etc.)?

Why ask? Data Protection snapshots occur at the AOS (protection domain) level via the NDSF.  The schedule and retention policy are managed against the protection domain for the file server instance and will impact overall storage consumption.  Sizer asks for the local and remote snapshot retention under “Data Protection.”
Files supports a 1-hour RPO today and will support NearSync in the AOS 5.11.1 release in conjunction with Files 3.6.  Keep node density (raw storage) in mind when determining RPO.  Both 1-hour and NearSync RPO require hybrid nodes with 40 TB or less raw, or all-flash nodes with 48 TB or less raw; denser configurations can only support a 6-hour RPO.  These requirements will likely change, so double-check the latest guidance when sizing dense storage nodes, and confirm that the underlying nodes and configs support NearSync per the latest AOS requirements if NearSync will be used.  A quick sanity check of this guidance is sketched at the end of this question.

        b.  Active/Active requirements (Peer Software)?

Why ask?  If the customer needs active/active file shares in different sites that represent the same data, we need to position a third party called Peer Software.  Peer performs near-real-time replication of data between heterogeneous file servers.  Peer utilizes Windows VMs which consume CPU and memory that you may want to size into the Nutanix clusters intended for Files.
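Referring back to 19a, here is a minimal sketch of the node-density check described there (hybrid up to 40 TB raw or all-flash up to 48 TB raw for the lower RPO; denser nodes limited to 6 hours).  These thresholds are quoted from the guidance above and may change, so verify against the latest documentation before relying on the output.

def min_supported_rpo_hours(raw_tb_per_node, all_flash):
    """Return the lowest RPO (hours) supported at this node density, per the guidance above."""
    limit_tb = 48 if all_flash else 40
    return 1 if raw_tb_per_node <= limit_tb else 6

print(min_supported_rpo_hours(raw_tb_per_node=36, all_flash=False))  # -> 1
print(min_supported_rpo_hours(raw_tb_per_node=80, all_flash=True))   # -> 6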

 20.  Feature Requirements:
         a.  Auditing? Which vendors?

Why ask?  Nutanix is working to integrate with three main third-party auditing vendors today: Netwrix (supported and integrated with Files), Varonis (working on integration), and Stealthbits (not yet integrated).  Nutanix Files also has a native auditing solution in File Analytics.
Along with ensuring audit vendor support, a given solution may require a certain amount of CPU, memory, and storage (to hold auditing events).  Be sure to include any vendor-specific sizing in the configuration.  File Analytics, for example, could require 8 vCPU, 48 GB of memory, and 3 TB of storage.

         b.  Antivirus? Which vendors?

Why ask? Files supports five main antivirus vendors today with respect to ICAP integration: McAfee, Symantec, Kaspersky, Sophos, and Bitdefender.  If centralized virus scan servers are to be used, you will want to include their compute requirements in sizing the overall solution.

         c.  Backup? Which vendors?

Why ask?  Files has full changed file tracking (CFT) support with HYCU and Commvault.  Veritas, Rubrik, and Veeam are working on, or will soon begin working on, integration.  Other vendors can also be supported outside of CFT.  If including a backup vendor on the same platform, you may need to size for any virtual appliance which may also run on Nutanix.

         d.  Multiprotocol? SMB + NFS? * (Engage with a Files Specialist/Solutions Architect if this is a customer requirement)

Why ask?  Multiprotocol is challenging, and often behaves differently than a customer imagines it will. One protocol is defined as authoritative and the other protocol maps onto it. If the customer does not already use multiprotocol shares and have a strong command of the technology, engage your SA to assist on the design to ensure success.

21.  Using DFS (Distributed File System) Namespaces (DFS-N)?

Why ask?  This is less about sizing and more about implementation.  Prior to Files 3.5.1, Files could only support distributed shares with DFS-N.  Starting with 3.5.1, both distributed and standard shares are fully supported as folder targets with DFS-N.

 22.  Tiering requirements?

Why ask?  Files is targeting support for tiering in 1H CY21.  Tiering in this context means automatically moving data off Nutanix Files to an S3-compliant object service, either on-premises or in the cloud.  In scoping future requirements, customers may size for a given amount of on-premises storage and a larger amount of tiered storage for longer-term archive.

23.  Access-Based Enumeration (ABE) Requirements?

Why ask?  Nutanix Files supports Access-Based Enumeration (ABE).  Ask whether it is a requirement to hide objects (files and folders) on a network shared folder from users who don't have the NTFS permissions (Read or List) needed to access them.  If so, we fully support it.

24.  Reality Check: Files Mixed vs Dedicated clusters

Why ask?  Always double-check the cost of Dedicated vs. Mixed clusters.  Dedicated can often be more cost effective and accommodates larger FSVM sizes, since the FSVMs are able to use the full amount of compute resources available to the cluster.

25.  Reality Check: Dedicated Cluster hardware minimums

Why ask?  Remember that Files is still a virtualized workload, so don't assume the minimum possible hardware spec.  Use 12-core 4214 CPUs as a reasonable minimum, or 14 cores if NearSync requirements dictate.  128 GB of memory per node will not allow for AHV + CVM + maximum-size FSVM deployments, so consider 192 GB, or nodes that can expand to 192 GB after deployment.
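As a quick sanity check of the memory point above, here is a minimal sketch of a per-node memory budget.  The AHV and CVM reservations used here are assumptions for illustration only; substitute the actual CVM memory and FSVM size from your design.

def fsvm_headroom_gb(node_memory_gb, ahv_gb=8, cvm_gb=32):
    """Memory left for an FSVM after assumed hypervisor and CVM reservations."""
    return node_memory_gb - ahv_gb - cvm_gb

for node_gb in (128, 192):
    print(f"{node_gb} GB node -> {fsvm_headroom_gb(node_gb)} GB left for an FSVM")
# With these assumed reservations, a 128 GB node leaves ~88 GB -- short of a
# maximum-size (96 GB) FSVM -- while a 192 GB node leaves comfortable headroom.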

26.  Reality Check: Implementation

Why ask?  Have a high-level design of how you've designed and sized Files in your solution, and communicate that design to the installer.  Poor implementation, or implementation that doesn't match the planned design, is one of the leading causes of customer satisfaction issues for Files.

27.  Reality Check: Files Prerequisites

Why ask?  Ensure that you've reviewed the relevant prerequisites and shared them with the customer before deploying: Active Directory is required if using SMB; AHV/ESXi only (no Hyper-V); a second VLAN is needed if the customer wants iSCSI isolation; and if using two networks, watch that backup clients like Rubrik aren't deployed in the wrong subnet.

28.  Reality Check: Clients

Why ask?  Review the list of supported Files clients and share it with the customer.  Laptops and desktops are rarely a problem, but document senders/multifunction printers used to scan paper and save PDFs to a file share are often capped at SMBv1, which Files does not and will never support.

Resources:

Xpert Storage Team Page:  https://sites.google.com/nutanix.com/americas-xpert-program/storage?authuser=1

Files Sales Enablement Page: https://sites.google.com/nutanix.com/files/home?authuser=1

Calls to action/next steps:

For a peer review of a sizing, or to request meeting support after the Files first call is completed: create an SFDC opportunity and request a Storage Solutions Architect on the opportunity.

Test Drive – Storage: https://www.nutanix.com/one-platform?type=tddata

Files Bootcamps: https://confluence.eng.nutanix.com:8443/display/SEW/Bootcamps (Internal Only)