Scaling guidance
F5 NGINXaaS for Azure (NGINXaaS) supports manual and automatic scaling of your deployment, allowing you to adapt to application traffic demands while controlling cost.
This feature is only available for Standard plan(s).
An NGINXaaS deployment can be scaled out to increase the capacity (increasing the cost) or scaled in to decrease the capacity (reducing the cost). Capacity is measured in NGINX Capacity Units (NCU).
In this document you will learn:
- What an NGINX Capacity Unit (NCU) is
- How to manually scale your deployment
- How to enable autoscaling on your deployment
- What capacity restrictions apply for your Marketplace plan
- How to monitor capacity usage
- How to estimate the amount of capacity to provision
An NGINX Capacity Unit (NCU) quantifies the capacity of an NGINX instance based on the underlying compute resources. This abstraction allows you to specify the desired capacity in NCUs without having to consider the regional hardware differences.
An NGINX Capacity Unit consists of the following parameters:
- CPU: an NCU provides 20 Azure Compute Units (ACUs)
- Bandwidth: an NCU provides 60 Mbps of network throughput
- Concurrent connections: an NCU provides 400 concurrent connections. This performance is not guaranteed when F5 WAF for NGINX is used with NGINXaaS.
To update the capacity of your deployment using the Azure Portal:
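As a rough illustration of the figures above, the following Python sketch multiplies the per-NCU values by a number of NCUs. The function name `capacity_for` and the dictionary keys are hypothetical and chosen only for this example; they are not part of any NGINXaaS API.

```python
# Per-NCU figures taken from the list above.
ACUS_PER_NCU = 20
MBPS_PER_NCU = 60
CONNECTIONS_PER_NCU = 400


def capacity_for(ncus: int) -> dict:
    """Return the nominal resources that `ncus` NCUs provide (hypothetical helper)."""
    return {
        "acus": ncus * ACUS_PER_NCU,
        "bandwidth_mbps": ncus * MBPS_PER_NCU,
        "concurrent_connections": ncus * CONNECTIONS_PER_NCU,
    }


print(capacity_for(10))
# {'acus': 200, 'bandwidth_mbps': 600, 'concurrent_connections': 4000}
```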
- Select NGINXaaS scaling in the left menu.
- Select Manual.
- Set the desired number of NCUs. Scale increases in 10 NCU intervals (10, 20, 30, and so on).
- Select Submit to update your deployment.
There’s no downtime while an NGINXaaS deployment changes capacity.
With autoscaling enabled, the size of your NGINXaaS deployment will automatically adjust based on traffic requirements without the need to guess how many NCUs to provision. You must specify a minimum and maximum NCU count. NGINXaaS will maintain the size of the deployment ensuring the number of provisioned NCUs does not fall below the set minimum NCUs and does not grow beyond the maximum NCUs. Refer to the Capacity Restrictions when setting the minimum and maximum capacity.
When creating a new NGINXaaS deployment with autoscaling enabled, the initial size of the deployment will match the minimum NCU count.
To enable autoscaling using the Azure Portal:
- Select NGINXaaS scaling in the left menu.
- Select Autoscale.
- Specify the minimum and maximum NCU count.
- Select Submit to enable NGINXaaS deployment autoscaling.
NGINXaaS automatically adjusts the number of NCUs based on “scaling rules.” A scaling rule defines when to scale, what direction to scale, and how much to scale. NGINXaaS will evaluate the following scaling rules, in order, based on the percentage capacity consumed metric and the provisioned NCU metric.
- Moderate Increase Rule: Over the last 5 minutes, if the average capacity consumed is greater than or equal to 70% of the average provisioned NCUs, increase capacity by 20%.
- Urgent Increase Rule: Over the last minute, if the capacity consumed is greater than or equal to 85% of the number of provisioned NCUs, increase capacity by 20%.
- Decrease Rule: Over the last 10 minutes, if the average capacity consumed is less than or equal to 60% of the average provisioned NCUs, decrease capacity by 10%.
To avoid creating a loop between scaling rules, NGINXaaS will not apply a scaling rule if it predicts that doing so would immediately trigger an opposing rule. For example, if the “Urgent Increase Rule” is triggered due to a sudden spike in traffic, but the new capacity would cause the “Decrease Rule” to trigger immediately afterward, the autoscaler will not increase capacity. This prevents the deployment’s capacity from increasing and decreasing erratically.
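The rule ordering described above can be sketched in Python as follows. This is only an illustration of the three thresholds, not the service's implementation: the function name `select_scaling_rule`, its parameters, and the return shape are assumptions, and the loop-prevention prediction is deliberately not modeled because the document does not specify how it is computed.

```python
from typing import Optional, Tuple


def select_scaling_rule(
    pct_1m: float, pct_5m: float, pct_10m: float
) -> Optional[Tuple[str, float]]:
    """Pick the first matching rule, in the order the rules are listed.

    Each argument is capacity consumed as a percentage of provisioned NCUs,
    averaged over the last 1, 5, and 10 minutes respectively (assumed inputs).
    Returns (rule name, capacity multiplier) or None if no rule applies.
    """
    if pct_5m >= 70.0:
        return ("moderate_increase", 1.20)  # increase capacity by 20%
    if pct_1m >= 85.0:
        return ("urgent_increase", 1.20)  # increase capacity by 20%
    if pct_10m <= 60.0:
        return ("decrease", 0.90)  # decrease capacity by 10%
    return None


# A sudden spike: the 1-minute value is high, the longer averages are not yet.
print(select_scaling_rule(pct_1m=90, pct_5m=50, pct_10m=40))
# ('urgent_increase', 1.2)
```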
The following table outlines constraints on the specified capacity based on the chosen Marketplace plan, including the minimum capacity required for a deployment to be highly available, the maximum capacity, and what value the capacity must be a multiple of. By default, an NGINXaaS for Azure deployment will be created with the corresponding minimum capacity.
| Marketplace Plan | Minimum Capacity (NCUs) | Maximum Capacity (NCUs) | Multiple of | 
|---|---|---|---|
| Standard plan(s) | 10 | 500 | 10 | 
If you need a higher maximum capacity, please open a request and specify the Resource ID of your NGINXaaS deployment, the region, and the desired maximum capacity you wish to scale to.
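A capacity value can be checked against the table above before it is submitted. The Python sketch below assumes the Standard plan defaults from the table (minimum 10, maximum 500, multiple of 10); the helper name `validate_capacity` is hypothetical, and a raised maximum granted through a support request would need to be passed in explicitly.

```python
from typing import List


def validate_capacity(
    ncus: int, minimum: int = 10, maximum: int = 500, multiple: int = 10
) -> List[str]:
    """Return a list of problems with a requested capacity (empty if valid)."""
    problems = []
    if ncus < minimum:
        problems.append(f"{ncus} NCUs is below the minimum of {minimum}")
    if ncus > maximum:
        problems.append(f"{ncus} NCUs is above the maximum of {maximum}")
    if ncus % multiple != 0:
        problems.append(f"{ncus} NCUs is not a multiple of {multiple}")
    return problems


print(validate_capacity(20))  # []
print(validate_capacity(15))  # ['15 NCUs is not a multiple of 10']
```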
- NGINXaaS only supports the epoll connection processing method when using the use directive, as NGINXaaS is based on Linux.
NGINXaaS provides metrics for visibility of the current and historical capacity values. These metrics, in the NGINXaaS Statistics namespace, include:
- NCUs Requested: ncu.requested – how many NCUs have been requested using the API. This is the goal state of the system at that point in time.
- NCUs Provisioned: ncu.provisioned – how many NCUs have been successfully provisioned by the service.
  - This is the basis for billing.
  - This may differ from ncu.requested temporarily during scale-out/scale-in events or during automatic remediation for a hardware failure.
- Capacity Percentage: nginxaas.capacity.percentage – the percentage of the current workload’s total capacity that is being used.
  - If this is over 70%, consider scaling out; otherwise, requests may fail or take longer than expected. Alternatively, enable autoscaling, so your deployment can automatically scale based on the amount of capacity consumed.
See the Metrics Catalog for a reference of all metrics.
These metrics aren’t visible unless monitoring is enabled. See how to Enable Monitoring for details.
The ncu.consumed metric is now deprecated and is on the path to retirement. Please change any alerting on this metric to use the new Capacity Percentage metric.
To calculate how many NCUs to provision, take the highest value across the parameters that make up an NCU:
- CPU
- Bandwidth
- Concurrent connections
Example 1: “I need to support 2,000 concurrent connections but only 4 Mbps of traffic. I need 52 ACUs.” Max(52/20, 4/60, 2000/400) = Max(2.6, 0.07, 5) = 5, so you would need at least 5 NCUs.
Example 2: “I don’t know any of these yet!” Either start with the minimum and adjust capacity with the iterative approach described below, or enable autoscaling.
In addition to the maximum capacity needed, we recommend adding a 10% to 20% buffer of additional capacity to account for unexpected spikes in traffic. Monitor the Capacity Percentage metric over time to determine your peak usage levels and adjust your requested capacity accordingly.
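The calculation from Example 1, together with the optional buffer, can be written as a short Python sketch. The function name `estimate_ncus` and its parameters are hypothetical; the divisors are the per-NCU figures given earlier (20 ACUs, 60 Mbps, 400 concurrent connections). The result is a raw estimate and does not apply any plan-specific capacity restrictions.

```python
def estimate_ncus(
    acus: float, bandwidth_mbps: float, connections: float, buffer: float = 0.0
) -> float:
    """Estimate NCUs as the largest of the three per-parameter requirements.

    `buffer` is an optional fraction of extra capacity, e.g. 0.2 for 20%.
    """
    base = max(acus / 20, bandwidth_mbps / 60, connections / 400)
    return base * (1 + buffer)


# Example 1 from the text: 52 ACUs, 4 Mbps, 2,000 concurrent connections.
print(estimate_ncus(acus=52, bandwidth_mbps=4, connections=2000))  # 5.0
```

Passing `buffer=0.2` to the same call adds the recommended upper-end buffer, giving roughly 6 NCUs.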
1. Make an estimate by either:
   - using the Usage and Cost Estimator
   - comparing to a reference workload
2. Observe the nginxaas.capacity.percentage metric of your workload in Azure Monitor.
3. Decide what headroom factor you wish to have.
4. Multiply the observed capacity percentage by the headroom factor, then by the provisioned NCUs, to get the target NCUs.
5. Adjust capacity to the target NCUs.
6. Repeat from step 2. It is always good to check back after making a change.
Example:
- I was really unsure what size I needed, so I just specified the default capacity, 20 NCUs.
- I observe that my nginxaas.capacity.percentage is currently at 90%.
- This is early-morning traffic. I think midday traffic could be 3x what it is now.
- 90% * 3 = 270%. 2.7 * 20 NCUs = 54 NCUs. 54 NCUs is my target capacity.
- I can see that I need to scale by multiples of 10, so I’m going to scale out to 60 NCUs.
- At midday I can see that I overestimated the traffic I would be getting, but it was still a busy day. We peaked at 68% of capacity, so let me scale in to 50 NCUs to match the workload.
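The arithmetic in this example can be reproduced with a small Python sketch. The function name `target_ncus` and its parameters are hypothetical, and rounding up to the next multiple of 10 is an assumption based on the scaling interval described for the Standard plan.

```python
import math


def target_ncus(
    capacity_pct: float,
    provisioned_ncus: int,
    headroom_factor: float = 1.0,
    multiple: int = 10,
) -> int:
    """Compute a target capacity, rounded up to the next allowed multiple."""
    raw = capacity_pct * headroom_factor * provisioned_ncus / 100
    return math.ceil(raw / multiple) * multiple


# Morning: 90% of 20 NCUs, expecting 3x the traffic at midday (54 NCUs raw).
print(target_ncus(90, 20, headroom_factor=3))  # 60

# Midday: peaked at 68% of 60 NCUs, no extra headroom (40.8 NCUs raw).
print(target_ncus(68, 60))  # 50
```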
These reference workloads were all measured with a simplistic NGINX config proxying requests to an upstream. Keepalive between NGINX and upstream is enabled. Minimal request matching or manipulation is done.
| TLS? | Conn/s | Req/s | Response Size | Throughput | NCU | 
|---|---|---|---|---|---|
| no | 12830 | 13430 | 0KB | 23Mbps | 18.8 | 
| no | 12080 | 13046 | 1KB | 125Mbps | 19 | 
| no | 12215 | 12215 | 10KB | 953Mbps | 21 | 
| no | 1960 | 1690 | 100KB | 1295Mbps | 23.6 |