IIS Proxy & App Web Performance Optimizations Pt. 1

on Monday, March 5, 2018

We’re ramping up towards a day when our web farm fields around 40 times its normal load. That isn’t much compared to truly popular websites, but it’s a lot more than what we normally deal with: roughly 50,000 people trying to use the system within an hour, with the majority of them hitting it in the first 15 minutes.
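To put those numbers in perspective, here is a rough back-of-the-envelope sketch of the arrival rates. The 50,000 users and the 15 minute window come from the figures above. The 60% share is purely an assumption standing in for “the majority”, not a measured value, and the function and variable names are just for illustration.

```python
# Back-of-the-envelope arrival rates for the event hour.
TOTAL_USERS = 50_000      # from the post: roughly 50,000 users in one hour
PEAK_SHARE = 0.60         # ASSUMPTION: "the majority" taken as 60%, not measured
PEAK_WINDOW_S = 15 * 60   # the first 15 minutes, in seconds
HOUR_S = 60 * 60          # the full hour, in seconds


def arrivals_per_second(users: float, window_s: float) -> float:
    """Average number of users arriving per second over a window."""
    return users / window_s


average_rate = arrivals_per_second(TOTAL_USERS, HOUR_S)
peak_rate = arrivals_per_second(TOTAL_USERS * PEAK_SHARE, PEAK_WINDOW_S)

print(f"hour average: {average_rate:.1f} users/s")   # hour average: 13.9 users/s
print(f"first 15 min: {peak_rate:.1f} users/s")      # first 15 min: 33.3 users/s
```

Under that assumption the first 15 minutes run at more than twice the hourly average, which is why we size the tests for the burst rather than the average.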

So, of course, we tried to simulate more than the expected load in our test environment to see what sort of changes we could make to ensure stability and responsiveness.

A quick note: This won’t be very applicable to Azure/cloud-based infrastructure. A lot of this is done for you in the cloud.

Web Farm Architecture

These systems run in a private Data Center. So, the servers and software don’t have a lot of the very cool features that the cloud offers.

The servers are all Windows Server 2012 R2, running IIS 8.5 with ARR 3.0, URL Rewrite 7.2, and Web Farm Framework 1.1.

Normally, the layout of the systems is similar to this diagram. It gives a general idea of the pieces involved in this yearly event: a front-end proxy, a number of applications, backend services, and a database. A single Web App takes most of the hit, and its main supporting Backend Service is also significantly hit. That Backend Service is shared by the other Web Apps involved in the event, but they are not its main clients during that hour.

image

Testing Setup

For testing we are using Visual Studio 2017 with a Test Controller and several Agents. It’s a very simple web test suite with a single scenario. This is the main use case during that hour. A user logs in to check their status, and then may take a few actions on other web applications.

Starting Test Load

  • Step Pattern
  • 100 users, 10 user step every 10 seconds, max 400 users
  • 1 Agent

We eventually got to this Test Load

  • Step Pattern
  • 1000 users, 200 user step every 10 seconds, max 2500 users
  • 7 agents
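To make the two step patterns easier to compare, here is a small sketch of how the user count ramps over time. It is not how Visual Studio implements the pattern internally. It assumes the first step is added after one full step interval and that the count is clamped at the maximum, and the function names are made up for this example.

```python
import math


def step_load(t_seconds: float, initial: int, step: int,
              step_every_s: int, max_users: int) -> int:
    """Concurrent virtual users at time t under a step pattern.

    ASSUMPTION: the first step is added after one full interval and the
    user count is clamped at max_users.
    """
    steps_taken = int(t_seconds // step_every_s)
    return min(initial + steps_taken * step, max_users)


def seconds_to_max(initial: int, step: int,
                   step_every_s: int, max_users: int) -> int:
    """Seconds until the pattern first reaches max_users."""
    steps_needed = math.ceil((max_users - initial) / step)
    return steps_needed * step_every_s


# The two patterns listed above.
starting = dict(initial=100, step=10, step_every_s=10, max_users=400)
final = dict(initial=1000, step=200, step_every_s=10, max_users=2500)

print(seconds_to_max(**starting))  # 300
print(seconds_to_max(**final))     # 80
print(step_load(45, **final))      # 1800
```

Under these assumptions the starting pattern takes 5 minutes to reach its 400 user ceiling, while the final pattern reaches 2500 users in under a minute and a half, so the servers get very little warm-up before full load.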

We found that going over 2500 concurrent users would result in SocketExceptions on the Agent machines. Our belief is that each Agent attempts to run the max user load defined by the test, and that the Agent process runs out of some resource (sockets?) it needs to spawn new users and make calls. To alleviate the issue, we added more Agents to the Controller and lowered the maximum number of concurrent users.

SocketExceptions on VS 2017 Test Agents can be prevented by lowering the maximum number of concurrent users. (You can then add in more Agents to the Test Controller in order to get the numbers back up.)

Initial Architecture Change

We’ve been through this load for many years so we already have some standard approaches that we take every year to help with the load:

  • Add more Impacted Backend Service servers
  • Add more CPU/Memory to the Impacted Web App

This year we went further by

  • Adding another proxy server to ensure Backend Service Calls from the Impacted Web App don’t route through the Main Proxy to the Impacted Backend Services. This helps reduce the number of connections through the Main Proxy.
  • Adding 6 more Impacted Backend Service servers. These are the servers that always take the worst hit. These servers don’t need sticky sessions, so they can easily spread the load between them.
  • Adding a second Impacted Web App server. This server usually doesn’t have the same level of high CPU load that the Proxy and Impacted Backend Services do. These servers do require sticky sessions, so there are potential issues with the load not being balanced.

If you don’t have to worry about sticky sessions, adding more processing servers can always help distribute the load. That’s why cloud-based services with “Sliders” are fantastic!
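To illustrate why sticky sessions can leave a load unbalanced, here is a toy simulation. It is not how ARR routes traffic internally. It assumes plain round-robin balancing, two servers, and a handful of made-up sessions with very different request counts. With sticky sessions only the first request of a session is balanced, and every later request follows it to the same server.

```python
from collections import Counter
from itertools import cycle
from typing import Mapping


def requests_per_server(sessions: Mapping[str, int], num_servers: int,
                        sticky: bool) -> Counter:
    """Count the requests each server receives.

    sessions maps a session id to the number of requests it makes.
    ASSUMPTION: round-robin balancing. When sticky is True, a session is
    pinned to whichever server handled its first request.
    """
    next_server = cycle(range(num_servers))
    pinned = {}
    load = Counter({server: 0 for server in range(num_servers)})
    for session_id, request_count in sessions.items():
        for _ in range(request_count):
            if sticky:
                if session_id not in pinned:
                    pinned[session_id] = next(next_server)
                server = pinned[session_id]
            else:
                server = next(next_server)
            load[server] += 1
    return load


# Hypothetical sessions: a couple of heavy users and several light ones.
sessions = {"a": 40, "b": 2, "c": 35, "d": 3, "e": 3, "f": 1}

print(dict(requests_per_server(sessions, 2, sticky=True)))   # {0: 78, 1: 6}
print(dict(requests_per_server(sessions, 2, sticky=False)))  # {0: 42, 1: 42}
```

Both runs handle the same 84 requests, but with affinity the two heavy sessions happen to land on the same server. Without affinity every request is free to go wherever there is capacity, which is why the Backend Service servers are the easy ones to scale out.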

image

Next Time …

In the next section we’ll look at the initial testing results and the lessons learned on each testing iteration.
