Last time we took the new architecture to its theoretical limit and pushed more of the load toward the database. This time …
What we changed (Using 2 Load Test Suites)
- Turned on Output Caching on the Proxy Server, which defaults to caching js, css, and images. This works really well with really old sites.
- We also lowered the number of users as the Backend Services ramped up to 100%.
- Forced Test Agents to run in 64-bit mode. This resolved an Out Of Memory exception that we were getting when the Test Agents ran into the 2 GB memory cap of their 32-bit processes.
- Found a problem with the Test Suite that was allowing all tests to complete without hitting the backend service. (This really affected the number of calls that made it to the Impacted Backend Services.)
- Added a second Test Suite which also used the same database. The load on this suite wasn’t very high; it just added more real world requests.
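To illustrate why output caching on the proxy helps with static content, here is a minimal Python sketch. It is a toy model, not how IIS implements Output Caching. The `CachingProxy` class, the `fetch_from_backend` callback, and the exact extension list are hypothetical; the only detail taken from the list above is that js, css, and images are cached.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict

# Extensions treated as static content. The post only says "js, css, and images";
# the specific image extensions listed here are an assumption.
STATIC_EXTENSIONS = {".js", ".css", ".png", ".jpg", ".gif"}


class CachingProxy:
    """Toy proxy that keeps static responses in memory (hypothetical, not IIS)."""

    def __init__(self, fetch_from_backend: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_backend
        self._cache: Dict[str, bytes] = {}
        self.backend_requests = 0  # requests that actually reached the web app servers

    def get(self, path: str) -> bytes:
        is_static = PurePosixPath(path).suffix.lower() in STATIC_EXTENSIONS
        if is_static and path in self._cache:
            return self._cache[path]  # answered by the proxy, no backend connection
        self.backend_requests += 1
        body = self._fetch(path)
        if is_static:
            self._cache[path] = body
        return body


if __name__ == "__main__":
    proxy = CachingProxy(lambda path: f"content of {path}".encode())
    for _ in range(1000):
        proxy.get("/images/logo.gif")  # static: only the first request goes to the backend
        proxy.get("/default.asp")      # dynamic: every request goes to the backend
    print(proxy.backend_requests)      # 1001 instead of 2000
```

In this toy run the image is fetched from the backend once and the page 1000 times, so only the dynamic requests keep opening connections to the web app servers.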
Test Setup
- Constant Load Pattern
- 1000 users
- 7 Test Agents (64-bit mode)
- Main Proxy
  - 4 vCPU / 8 vCore
  - 24 GB RAM
  - AppPool Queue Length: 50,000
  - WebFarm Request Timeout: 120 seconds
  - Output Caching (js, css, images)
- Impacted Web App Server
  - 3 VMs
  - AppPool Queue Length: 50,000
- Impacted Backend Service Server
  - 8 VMs
- Classic ASP App
  - CDNs used for 4 JS files and 1 CSS file
  - Custom JS and CSS coming from Impacted Web App
  - Images still coming from Impacted Web App
  - JS is minified
- VS 2017 Test Suite
  - WebTest Caching Enabled
- A 2nd Test Suite which Impacts other applications in the environment is also run. (This is done off a different VS 2017 Test Controller)
Test Results
- Main Proxy
  - CPU: 28% (down 37)
  - Max Concurrent Connections: – (Didn’t Record)
- Impacted Web App
  - CPU: 56% (down 10)
- Impacted Backend Service
  - CPU: 100% (up 50)
- DB
  - CPU: 30% (down 20)
- VS 2017 Test Suite
  - Total Tests: 95,000 (down 30,000)
  - Tests/Sec: 869 (down 278)
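The deltas in parentheses are relative to the previous test run. As a quick sanity check, the previous values and the relative changes can be recovered from the numbers above. The short Python sketch below does only that arithmetic. It assumes the CPU deltas are percentage points, and the dictionary layout and helper names are illustrative.

```python
# (current value, signed change vs. the previous run), copied from the results above.
results = {
    "Main Proxy CPU %": (28, -37),
    "Impacted Web App CPU %": (56, -10),
    "Impacted Backend Service CPU %": (100, +50),
    "DB CPU %": (30, -20),
    "Total Tests": (95_000, -30_000),
    "Tests/Sec": (869, -278),
}


def previous_value(current: float, change: float) -> float:
    """Value in the previous run, given the current value and the signed change."""
    return current - change


def relative_change(current: float, change: float) -> float:
    """Change expressed as a fraction of the previous run's value."""
    return change / previous_value(current, change)


for name, (current, change) in results.items():
    prev = previous_value(current, change)
    print(f"{name}: {prev} -> {current} ({relative_change(current, change):+.1%})")
```

Both throughput numbers come out at roughly a 24% drop (125,000 to 95,000 total tests, 1,147 to 869 tests/sec), while the Main Proxy CPU falls from 65% to 28%.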
This more “real world” test really highlighted that the impacted systems weren’t going to have a huge impact on the database shared by the other systems which will be using it at the same time.
We had successfully moved the load from the Main Proxy onto the backend services, but not all the way to the database. With some further testing we found that adding CPUs and new VMs to the Impacted Backend Servers had a direct 1:1 relationship with handling more requests. The unfortunate side of that is that we weren’t comfortable with the cost of the CPUs compared to the increased performance.
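To make the 1:1 relationship concrete, here is a rough Python sketch of what linear scaling would imply. It assumes the backend service is the only bottleneck and that the 869 tests/sec from this run were served by the 8 backend VMs at 100% CPU. The function names and the 1,000 tests/sec target are hypothetical, and this is a back-of-the-envelope estimate rather than a validated capacity model.

```python
import math

BACKEND_VMS = 8      # from the test setup
TESTS_PER_SEC = 869  # from the test results, with backend CPU at 100%


def per_vm_throughput(tests_per_sec: float, vms: int) -> float:
    """Throughput per VM under the linear-scaling assumption."""
    return tests_per_sec / vms


def vms_needed(
    target_tests_per_sec: float,
    tests_per_sec: float = TESTS_PER_SEC,
    vms: int = BACKEND_VMS,
) -> int:
    """Smallest VM count that reaches the target if scaling stays 1:1."""
    return math.ceil(target_tests_per_sec / per_vm_throughput(tests_per_sec, vms))


if __name__ == "__main__":
    print(per_vm_throughput(TESTS_PER_SEC, BACKEND_VMS))  # 108.625 tests/sec per VM
    print(vms_needed(1000))  # hypothetical target of 1,000 tests/sec -> 10 VMs
```

Under these assumptions every extra ~109 tests/sec costs another VM, which is the kind of trade-off that has to be weighed against the price of the additional CPUs.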
The real big surprise was the significant CPU utilization decrease that came from turning on Output Caching on the Main Proxy.
And, with that good news, we called it a day.
So, the final architecture looks like this …
What we learned …
- SSL Encryption/Decryption can put a significant load on your main proxy/load balancer server. The number of requests processed by that server will directly scale into CPU utilization. You can reduce this load by moving static content to CDNs.
- Even if your main proxy/load balancer does SSL offloading and requests to the backend services aren’t SSL encrypted, the extra socket connections still have an impact on the servers’ CPU utilization. You can lower this impact on both the main proxy and the Impacted Web App servers by using Output Caching for static content (js, css, images).
- We didn’t have the need to use bundling and we didn’t have the ability to do spriting; but we would strongly encourage anyone to use those if they are an option.
- Moving backend service requests to an internal proxy doesn’t significantly lower the number of requests through the main proxy. It’s really images that create the most requests when rendering a web page (especially with an older Classic ASP site).
- In Visual Studio, double check that your suite of web tests is doing exactly what you think it is doing. Also, go the extra step and check that the HTTP Status Code returned on each request is the code that you expect. If you expect a 302, check that it’s a 302 instead of considering a 200 to be satisfactory.
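As an illustration of that last point outside of Visual Studio, the Python sketch below requests a path without following redirects and compares the status code with the one expected. `http.client` does not follow redirects on its own, so a 302 stays visible instead of turning into the 200 of the page it redirects to. `fetch_status`, `find_mismatches`, and any host or path you pass in are hypothetical helpers and placeholders, not part of any test framework.

```python
import http.client
from typing import Callable, Iterable, List, NamedTuple, Tuple


class Mismatch(NamedTuple):
    path: str
    expected: int
    actual: int


def fetch_status(host: str, path: str, port: int = 80, timeout: float = 10.0) -> int:
    """GET the path and return the raw status code; redirects are not followed."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        return conn.getresponse().status
    finally:
        conn.close()


def find_mismatches(
    expectations: Iterable[Tuple[str, int]],
    fetch: Callable[[str], int],
) -> List[Mismatch]:
    """Return every request whose status code differs from the expected one."""
    mismatches = []
    for path, expected in expectations:
        actual = fetch(path)
        if actual != expected:  # a 200 counts as a failure when a 302 was expected
            mismatches.append(Mismatch(path, expected, actual))
    return mismatches
```

A call such as `find_mismatches([("/login.asp", 302)], lambda p: fetch_status("your-host", p))` would return an empty list only if the page really answers with a 302; the host and path here are placeholders.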
