Troubleshooting guide
Many users run into difficulties when getting started with NGINXaaS for Azure, or when debugging performance or run-time problems. This guide explains how to resolve common problems.
Certificates are a common source of confusion and errors. It is important to understand that certificates are integral to your deployment’s health. Once you have added a certificate as a resource to your deployment, our service will always attempt to fetch that certificate when applying your NGINX configuration. It doesn’t matter whether the certificate is referenced by the NGINX configuration or not: we will always try to fetch it as long as it is added to the deployment.
- Ensure that all certificate resources are needed by the deployment and that they are not about to expire.
- If a certificate on the deployment expires in your Azure Key Vault, we will not be able to fetch it and your deployment will enter a degraded state. Our service will then be unable to apply the latest software updates to your dataplane.
It is easier to renew a certificate than it is to begin again with a brand new certificate which needs to be added to the NGINXaaS deployment. Our service always attempts to pull the latest version of your certificates, so automatically renewing certificates is a convenient way of avoiding problems with expiration.
Many users struggle to get their first certificates added to the deployment. Please ensure the following:
- Check that you have a managed identity added to your NGINXaaS deployment.
- Ensure that the managed identity has the correct role assignments to your Azure Key Vault.
Our service uses the managed identity delegated to the deployment when fetching your certificates from your Azure Key Vault, so it is important that the managed identity has the role assignments that it needs over the key vault.
If you wish to reference certificates in a private Azure Key Vault, you will need to configure a Network Security Perimeter so that our service has the necessary permissions to fetch the certificates.
When performing an end-to-end test on your NGINXaaS deployment, if you are unable to access one of the locations defined in your NGINX configuration, it may appear that traffic is not flowing through the system properly.
In this situation:
- Check that your deployment’s Network Security Group settings are not causing inbound traffic on certain ports to be blocked.
Please see the Azure Network Security Group documentation for guidance.
Another reason why traffic may not flow through the system as expected is that the upstreams referenced in your NGINX configuration are either not accessible to our service or are not in a healthy state.
Check that the upstream is accessible to our service.
- Create an Azure Virtual Machine within the same subnet as your deployment.
- SSH into the VM and then ping or curl the upstream.
- You should be successful. If not, this indicates that the upstream is not accessible to our service.
If there are other problems routing traffic to your upstreams, check the logs of your upstream servers to gain further insights into the problem.
Once your upstream servers are accessible to our service, NGINX gives you various tools to manage the health of your upstreams.
We recommend that users enable both passive and active health checks on their deployments.
- Active health checks proactively monitor the responsiveness of your upstreams and allow your NGINXaaS deployment to stop routing traffic to unhealthy backend servers.
See this document for steps to configure active and passive health checks through your NGINX configuration.
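As a rough illustration of how the two kinds of health check can sit together in one configuration, here is a minimal sketch of an http-context fragment. The upstream name `backend`, the server addresses, the zone size, and all thresholds are assumptions chosen for the example, not recommended values. Passive checks are expressed through the `max_fails` and `fail_timeout` parameters on each `server`, and the active check uses the `health_check` directive, which requires the upstream group to define a shared memory `zone`.

```nginx
# Sketch only: names, addresses, and thresholds are hypothetical.
http {
    upstream backend {
        # Shared memory zone; needed for active health checks.
        zone backend 64k;

        # Passive checks: after 3 failed attempts within 30s,
        # the server is considered unavailable for 30s.
        server 10.0.0.4:8080 max_fails=3 fail_timeout=30s;
        server 10.0.0.5:8080 max_fails=3 fail_timeout=30s;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;

            # Active check: probe every 5s; mark unhealthy after 3
            # failed probes and healthy again after 2 passing probes.
            health_check interval=5s fails=3 passes=2;
        }
    }
}
```

The remaining parts of a full configuration (for example the `events` block) are omitted here for brevity.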
NGINXaaS gives users two main tools to monitor the performance of their deployments.
NGINX access logs give you information about the requests received by your NGINXaaS deployment and the error logs tell you about errors encountered by NGINX when handling incoming requests. This information is extremely important when debugging problems with your NGINX configuration. Many problems that arise when handling traffic can be fixed by modifying your NGINX configuration.
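To make the logs more useful when debugging, it can help to record request and upstream timings. The following is a sketch under assumptions: the log file paths, the format name `timed`, and the chosen fields are illustrative, and your deployment may require specific log locations. The directives and variables themselves (`log_format`, `access_log`, `error_log`, `$request_time`, `$upstream_response_time`, `$upstream_addr`) are standard NGINX.

```nginx
# Sketch only: paths and the format name "timed" are hypothetical.

# Error log in the main context; "error" is the severity threshold.
error_log /var/log/nginx/error.log error;

http {
    # Access log format that adds total request time and the
    # address and response time of the upstream that served it.
    log_format timed '$remote_addr [$time_local] "$request" $status '
                     'rt=$request_time urt=$upstream_response_time '
                     'ua=$upstream_addr';

    access_log /var/log/nginx/access.log timed;
}
```

A large gap between `rt` and `urt` for the same request suggests time spent outside the upstream, while consistently high `urt` values point at the upstream servers themselves.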
The metrics published by NGINXaaS to Azure Monitor offer a powerful tool for evaluating system performance. Our metrics catalog provides a description of each metric. Taken together, these metrics provide insights into your deployment’s performance. If you add the zone directive, which defines a shared memory zone, to the upstreams in your NGINX configuration, we will be able to provide metrics on upstream health.
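As a sketch of the upstream setup mentioned above, the `zone` directive gives an upstream group a shared memory zone. The upstream name, zone name, zone size, and server addresses below are assumptions for illustration only.

```nginx
# Sketch only: names, size, and addresses are hypothetical.
http {
    upstream backend {
        # Shared memory zone that holds the group's state
        # so it can be shared across worker processes.
        zone backend 64k;

        server 10.0.0.4:8080;
        server 10.0.0.5:8080;
    }
}
```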
The nginxaas.capacity.percentage metric indicates how much capacity your deployment is consuming, expressed as a percentage of the total. We recommend configuring an alert on this metric so that you know to scale out your deployment manually when it exceeds a chosen threshold. If you have autoscaling enabled, our service will scale your deployment automatically.
There are a variety of ways to boost the performance of your NGINXaaS deployment. This blog post offers a roundup of the top ten tips to get the most out of NGINX.
Enabling HTTP keepalives is a tip that deserves added emphasis. Keepalive settings allow connections between client and server to be reused efficiently, which can significantly increase the speed of your system.
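A minimal sketch of keepalive settings on both sides of the proxy might look like the following. The upstream name, address, timeout, and connection count are assumptions for the example and should be tuned for your workload. `keepalive_timeout` applies to client connections, while the `keepalive` directive in the upstream block keeps idle connections to the backend open; upstream keepalive also needs HTTP/1.1 and a cleared `Connection` header on proxied requests.

```nginx
# Sketch only: names, addresses, and values are hypothetical.
http {
    # Client side: keep idle client connections open for up to 65s.
    keepalive_timeout 65s;

    upstream backend {
        zone backend 64k;
        server 10.0.0.4:8080;

        # Upstream side: keep up to 32 idle connections to the
        # backend servers open per worker process.
        keepalive 32;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://backend;

            # Required for upstream keepalive to take effect.
            proxy_http_version 1.1;
            proxy_set_header Connection "";
        }
    }
}
```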