Self-Signed Certificates for Win10

on Friday, November 24, 2017

Browsers have implemented all sorts of new security measures to ensure that certificates are actually valid, so using a self-signed certificate today is more difficult than it used to be. Also, IIS on Win8/10 gained the ability to use a Central Certificate Store. So, here are some scripts that:

  • Create a Self-Signed Cert
    • Creates a self-signed cert with a DNS Name (browsers don’t like it when the Subject Alternative Name doesn’t list the DNS Name).
    • Creates a Shared SSL folder on disk and adds permissions so the account IIS’s Central Certificate Store runs as can read the certs.
    • Exports the cert to the Shared SSL folder as a .pfx.
    • Reimports the cert into the machine’s Trusted Root Authority store (needed for browsers to verify the cert is trusted)
    • Adds the 443/SSL binding to the site (if it exists) in IIS
  • Re-Add Cert to Trusted Root Authority
    • Before Win10, Microsoft implemented a background task that periodically checks the certs installed in your Machine Trusted Root Authority, finds the ones that are self-signed, and removes them. So, this script re-installs them.
    • It looks through the Shared SSL folder created by the previous script and re-adds any certs that are missing from the local Machine Trusted Root Authority.
  • Re-Add Cert to Trusted Root Authority Scheduled Task
    • Registers a scheduled task that re-runs the script (the example below uses a daily run at 01:00)
### Create-SelfSignedCert.ps1

$name = "site.name.com" # only need to edit this


# get the shared ssl password for dev - this will be applied to the cert
# (Export-PfxCertificate and Import-PfxCertificate expect a SecureString)
$pfxPassword = ConvertTo-SecureString "your pfx password" -AsPlainText -Force

# you can only create a self-signed cert in the \My store
$certLoc = "Cert:\LocalMachine\My"
$cert = New-SelfSignedCertificate `
            -FriendlyName $name `
            -KeyAlgorithm RSA `
            -KeyLength 4096 `
            -CertStoreLocation $certLoc `
            -DnsName $name

# ensure the path the directory for the central certificate store is setup with permissions
# NOTE: This assumes that IIS is already set up with the Central Certificate Store, where
#       1) The user account is "Domain\AccountName"
#       2) The Certificate Private Key Password is the $pfxPassword value above
$sharedPath = "D:\AllContent\SharedSSL\Local"
if((Test-Path $sharedPath) -eq $false) {
    mkdir $sharedPath

    $acl = Get-Acl $sharedPath
    $objUser = New-Object System.Security.Principal.NTAccount("Domain\AccountName") 
    $rule = New-Object System.Security.AccessControl.FileSystemAccessRule($objUser, "ReadAndExecute,ListDirectory", "ContainerInherit, ObjectInherit", "None", "Allow")
    $acl.AddAccessRule($rule)
    Set-Acl $sharedPath $acl
}


# export from the \My store to the Central Cert Store on disk
$thumbprint = $cert.Thumbprint
$certPath = "$certLoc\$thumbprint"
$pfxPath = "$sharedPath\$name.pfx"
if(Test-Path $pfxPath) { del $pfxPath }
Export-PfxCertificate `
    -Cert $certPath `
    -FilePath $pfxPath `
    -Password $pfxPassword


# reimport the cert into the Trusted Root Authorities
$authRootLoc = "Cert:\LocalMachine\AuthRoot"
Import-PfxCertificate `
    -FilePath $pfxPath `
    -CertStoreLocation $authRootLoc `
    -Password $pfxPassword `
    -Exportable


# delete it from the \My store
del $certPath # removes from cert:\localmachine\my


# if the website doesn't have the https binding, add it
Import-Module WebAdministration

if(Test-Path "IIS:\Sites\$name") {
    $httpsBindings = Get-WebBinding -Name $name -Protocol "https"
    $found = $httpsBindings |? { $_.bindingInformation -eq "*:443:$name" -and $_.sslFlags -eq 3 }
    if($found -eq $null) {
        New-WebBinding -Name $name -Protocol "https" -Port 443 -IPAddress "*" -HostHeader $name -SslFlags 3
    }
}
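
To sanity-check the result, you can confirm the cert landed in the Trusted Root store and that the https binding exists (this reuses the $name value from the script above):

Get-ChildItem Cert:\LocalMachine\AuthRoot | Where-Object { $_.Subject -eq "CN=$name" }
Get-WebBinding -Name $name -Protocol "https"
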
### Add-SslCertsToAuthRoot.ps1

$Error.Clear()

Import-Module PowerShellLogging
$name = "Add-SslCertsToAuthRoot"
$start = [DateTime]::Now
$startFormatted = $start.ToString("yyyyMMddHHmmss")
$logdir = "E:\Logs\Scripts\IIS\$name"
$logpath = "$logdir\$name-log-$startFormatted.txt"
$log = Enable-LogFile $logpath

try {

    #### FUNCTIONS - START ####
    Function Get-X509Certificate {
    Param (
        [Parameter(Mandatory=$True)]
        [ValidateScript({Test-Path $_})]
        [String]$PfxFile,
        [Parameter(Mandatory=$True)]
        [string]$PfxPassword=$null
    )

        # Create new, empty X509 Certificate (v2) object
        $X509Certificate = New-Object System.Security.Cryptography.X509Certificates.X509Certificate2

        # Call class import method using password
        try {
            $X509Certificate.Import($PfxFile,$PfxPassword,"PersistKeySet")
            Write-Verbose "Successfully accessed Pfx certificate $PfxFile."
        } catch {
            Write-Warning "Error processing $PfxFile. Please check the Pfx certificate password."
            Return $false
        }

        Return $X509Certificate
    }

    # http://www.orcsweb.com/blog/james/powershell-ing-on-windows-server-how-to-import-certificates-using-powershell/
    Function Import-PfxCertificate {
    Param(
        [Parameter(Mandatory = $true)]
        [String]$CertPath,
        [ValidateSet("CurrentUser","LocalMachine")]
        [String]$CertRootStore = "LocalMachine",
        [String]$CertStore = "My",
        $PfxPass = $null
    )
        Process {
            $pfx = new-object System.Security.Cryptography.X509Certificates.X509Certificate2
            if ($pfxPass -eq $null) {$pfxPass = read-host "Enter the pfx password" -assecurestring}
            $pfx.import($certPath,$pfxPass,"Exportable,PersistKeySet")

            $store = new-object System.Security.Cryptography.X509Certificates.X509Store($certStore,$certRootStore)

            $serverName = [System.Net.Dns]::GetHostName();
            Write-Warning ("Adding certificate " + $pfx.FriendlyName + " to $CertRootStore/$CertStore on $serverName. Thumbprint = " + $pfx.Thumbprint)
            $store.open("MaxAllowed")
            $store.add($pfx)
            $store.close()
            Write-Host ("Added certificate " + $pfx.FriendlyName + " to $CertRootStore/$CertStore on $serverName. Thumbprint = " + $pfx.Thumbprint)
        }
    }
    #### FUNCTIONS - END ####


    #### SCRIPT - START ####
    $sharedPath = "D:\AllContent\SharedSSL\Local"
    $authRootLoc = "Cert:\LocalMachine\AuthRoot"
    
    $pfxPassword = "your password" # need to set this

    $pfxs = dir $sharedPath -file -Filter *.pfx
    foreach($pfx in $pfxs) {    
        $cert = Get-X509Certificate -PfxFile $pfx.FullName -PfxPassword $pfxPassword
        $certPath = "$authRootLoc\$($cert.Thumbprint)"
        if((Test-Path $certPath) -eq $false) {
            # call the Import-PfxCertificate function defined above (it shadows the built-in cmdlet)
            $null = Import-PfxCertificate -CertPath $pfx.FullName -CertRootStore "LocalMachine" -CertStore "AuthRoot" -PfxPass $pfxPassword
            Write-Host "$($cert.Subject) ($($cert.Thumbprint)) Added"
        } else {
            Write-Host "$($cert.Subject) ($($cert.Thumbprint)) Already Exists"
        }
    }
    #### SCRIPT - END ####

} finally {
    foreach($er in $Error) { $er }

    Disable-LogFile $log
}
### Install-Add-SslCertsToAuthRoot.ps1

$yourUsername = "your username" # needs local admin rights on your machine (you probably have it)
$yourPassword = "your password"

$name = "Add-SslCertsToAuthRoot"
$filename = "$name.ps1"
$fp = "D:\AllContent\Scripts\IIS\$filename"
$taskName = $name
$fp = "powershell $fp"

$found = . schtasks.exe /query /tn "$taskName" 2>$null
if($found -ne $null) {
    . schtasks.exe /delete /tn "$taskName" /f
    $found = $null
}
if($found -eq $null) {
    . schtasks.exe /create /ru $yourUsername /rp $yourPassword /tn "$taskName" /sc daily /st "01:00" /tr "$fp"
    . schtasks.exe /run /tn "$taskName"
}
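
To confirm the task registered (and check its last run result), query it with the same task name:

. schtasks.exe /query /tn "Add-SslCertsToAuthRoot" /v /fo LIST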

Wnf Kernel Memory Leak

on Friday, November 17, 2017

Back in 2015, we started using Win2012 R2 servers and within a day of Production usage we started seeing Out of Memory errors on the servers. Looking at the Task Manager, we could easily see that a massive amount of Kernel Memory was being used. But why?

Using some forum posts, SysInternals, and I think a Scott Hanselman blog entry, we were able to use PoolMon.exe to see that the pool tag using all the Kernel Memory was Wnf. We had no idea what it was and went down some rabbit holes before finding this forum post.

Microsoft Support would later tell us the problem had something to do with a design change to Remote Registry and how it deals with going idle, and another design change in Windows Server 2012 R2 about how it chooses which services to make idle. Anyways, the fix was easy to implement (just a real pain to find):

If you want the service to not stop when Idle, you can set this registry key:
key  : HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\RemoteRegistry
name : DisableIdleStop
type : REG_DWORD
data : 1
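
If you’d rather script the change, a PowerShell equivalent is roughly this (run from an elevated prompt):

$key = "HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\RemoteRegistry"
if (-not (Test-Path $key)) { New-Item -Path $key | Out-Null }   # create the key if it isn't there yet
# DisableIdleStop = 1 keeps the Remote Registry service from being stopped when idle
New-ItemProperty -Path $key -Name "DisableIdleStop" -PropertyType DWord -Value 1 -Force | Out-Null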

Here’s what it looks like when the leak is happening:

[image: Task Manager showing Kernel Memory usage ballooning during the leak]

How to Crash Exchange Using IIS Healthchecks

on Saturday, September 23, 2017

So, I had a bad week. I crashed a multiple server, redundant, highly available Exchange Server setup using the IIS Healthchecks of a single website in Dev and Test (not even Prod).

How did I do this? Well …

  • Start with a website that is only in Dev & Test; and hasn’t moved to Prod.
    • All of the database objects are only in Dev & Test.
  • Do a database refresh from Prod and overlay Dev & Test.
    • The database refresh takes 2 hours, but for the next 17 hours the Dev & Test environments don’t have the database objects available to them, because those objects weren’t a part of the refresh.
  • So, now you have 19 hours of a single website being unable to properly make a database call.
  • Why wasn’t anyone notified? Well, that’s all on me. It was the Dev & Test version of the website, and I was ignoring those error messages (those many, many error messages).
  • Those error messages were from ELMAH. If you use ASP.NET and don’t know ELMAH; then please learn about it, it’s amazing!
    • In this case, I was using ELMAH with WebAPI, so I was using the Elmah.Contrib.WebAPI package. I’m not singling them out as a problem, I just want to spread the word that WebAPI applications need to use this package to get error reporting.
  • Finally, you have the IIS WebFarm Healthcheck system.
    • The IIS WebFarm healthcheck system is meant to help a WebFarm route requests to healthy application servers behind a proxy. If a single server is having a problem, then requests are no longer routed to it and only the healthy servers are sent requests to process. It’s a really good idea.
    • Unfortunately, … (You know what? … I’ll get back to this below)
    • Our proxy servers have around 215 web app pools.
    • The way IIS healthchecks are implemented, every one of those web app pools will run the healthchecks on every web farm. So, this one single application gets 215 healthchecks every 30 seconds (the default healthcheck interval).
    • That’s 2 healthchecks per minute, by 215 application pools …
    • Or 430 healthchecks per minute … per server
    • Times 3 servers (1 Dev & 2 Test Application Servers) … 1290 healthchecks per minute
    • Times 60 per hour, times 19 hours … 1,470,600 healthchecks in 19 hours.
  • Every one of the 1,470,600 healthchecks produced an error, and ELMAH diligently reported every one of those errors. (First email type)
  • Now for Exchange
    • Even if we didn’t have a multi-server, redundant, highly available Exchange server, 1.5 million emails would have probably crashed it.
    • But, things got crazier because we have a multiple server, redundant, highly available setup.
    • So, the error emails went to a single recipient, me.
    • And, eventually my Inbox filled up (6 GB limit on my Inbox), which started to produce response emails saying “This Inbox is Full”. (Second email type)
    • Well … those response emails went back to the sender … which was a fake email address I used for the website (it’s never supposed to be responded to).
    • Unfortunately, that fake email address has the same domain as my account (@place.com), which sent all the responses back to the same Exchange server.
    • Those “Inbox is Full” error messages then triggered Exchange to send back messages that said “This email address doesn’t exist”. (Third email type)
    • I’m not exactly sure how this happened, but there were a number of retry attempts on the first email type, which again re-triggered the second and third email types. I call the retries the (Fourth email type).
    • Once all of the error messages get factored into the equation, the 1.5 million healthcheck emails generated about 4.5 million healthcheck and SMTP error emails.
    • Way before we hit the 4.5 million mark, our Exchange server filled up …
      • Its database
      • The disk on the actual Exchange servers

So, I don’t really understand Exchange too well. I’m trying to understand this diagram a little better. One thing that continues to puzzle me is why the Exchange server sent out error emails to “itself”. (My email address is my.name@place.com and the ELMAH emails were from some.website@place.com … so the error emails were sent to @place.com, which that Exchange server owns). Or does it …

  • So, from the diagram, consultation, and my limited understanding … our configuration is this:
    • We have a front end email firewall that owns the MX record (DNS routing address) for @place.com.
      • The front end email firewall is supposed to handle external email DDOS attacks and ridiculous spam emails.
    • We have an internal Client Access Server / Hub Transport Server which takes in the ELMAH emails from our applications and routes them into the Exchange Servers.
    • We have 2 Exchange servers with 2 Databases behind them, which our email inboxes are split across.
    • So, the flow might be (again, I don’t have this pinned down)
      • The application sent the error email to the Client Access Server
      • The Client Access Server queued the error email and determined which Exchange server to process it with (let’s say Exchange1)
      • Exchange1 found that the mailbox was full and, per SMTP, needed to send an “Inbox is full” error message. Exchange1 looked up the MX record to find where to send it and found that it should go to the Email Firewall. It sent it …
      • The Email Firewall then found that some.website@place.com wasn’t an actual address and maybe sent it to Exchange2 for processing?
      • Exchange2 found it was a fake address and sent back a “This address doesn’t exist email”, which went back to the Email Firewall.
      • The Email Firewall forwarded the email or dropped it?
      • And, somewhere in all this mess, the emails that couldn’t be delivered to my real address my.name@place.com because my “Inbox was full” got put into a retry queue … in case my inbox cleared up. And, this helped generate more “Inbox is full” and “This address doesn’t exist” emails.
  • Sidenote: I said above “One thing that continues to puzzle me is why the Exchange server sent out error emails to ‘itself’.”
    • I kinda get it. Exchange does an MX lookup for @place.com and finds the Email Firewall as the IP address, which isn’t itself. But …
    • Shouldn’t Exchange know that it owns @place.com? Why does it need to send the error email?

So … the biggest problem in this whole equation is me. I knew that IIS had this healthcheck problem beforehand. And, I had even created a support ticket with Microsoft to get it fixed (which they say has been escalated to the Product Group … but nothing has happened for months).

I knew of the problem, I implemented ELMAH, and I completely forgot that the database refresh would wipe out the db objects which the applications would need.

Of course, we/I’ve now gone about implementing fixes, but I want to dig into this IIS Healthcheck issue a little more. Here’s how it works.

  • IIS has a feature called ARR (Application Request Routing)
    • It’s used all the time in Azure. You may have set up a Web App, which requires an “App Service”. The App Service is actually a proxy server that sits in front of your Web App. The proxy server uses ARR to route the requests to your Web App. But, in Azure they literally create a single proxy server for your single web application server. If you want to scale up and “move the slider”, more application servers are created behind the proxy. BUT, in Azure, the number of Web Apps that can sit behind an App Service/Proxy Service is very limited (less than 5). <rant>Nowhere in the IIS documentation do they tell you to limit yourself to 5 applications; and the “/Build conference” videos from the IIS team make you believe that IIS is meant to handle hundreds of websites. </rant>
  • We use ARR to route requests for all our custom made websites (~215) to the application servers behind our proxy.
  • ARR uses webfarms to determine where to route requests. The purpose of the webfarms is to have multiple backend Application Servers, which handle load balancing.
  • The webfarms have a Healthcheck feature, which allows the web farms to check if the application servers behind the proxy are Healthy. If one of the application servers isn’t healthy then it’s taken out of the pool until it’s healthy again.
    • I really like this feature and it makes a lot of sense.
  • The BIG PROBLEM with this setup is that the WEBFARMS AREN’T DIRECTLY LINKED TO APPLICATION POOLS.
    • So, every application pool that runs on the frontend proxy server, loads the entire list of webfarms into memory.
    • If any of those webfarms happens to have a healthcheck url, then that application pool will consider itself the responsible party to check that healthcheck url.
    • So, if a healthcheck url has a healthcheck interval of 30 seconds …
    • And a proxy server has 215 application pools on it; then that is 215 healthchecks every 30 seconds.

I think the design of the Healthcheck feature is great. But, the IMPLEMENTATION is flawed. HEALTHCHECKS ARE NOT DESIGNED THE WAY THEY ARE IMPLEMENTED.

Of course I’ve worked on other ways to prevent this problem in the future. But, IIS NEEDS TO FIX THE WAY HEALTHCHECKS ARE IMPLEMENTED.

I get bothered when people complain without a solution, so here’s the solution I propose:

  • Create a new xmlnode in the <webfarm> section of applicationHost.config which directly links webfarms to application pools.
  • Example (sorry, I’m having a lot of problems getting code snippets to work in this version of my LiveWriter)
<webFarm name="wf_johndoe.place.com_lab" enabled="true">
  <applicationPool name="johndoe.place.com_lab" />
  <server address="wa100.place.com" enabled="true" />
  <applicationRequestRouting>
    <protocol timeout="00:00:30" reverseRewriteHostInResponseHeaders="false">
      <cache enabled="false" queryStringHandling="Accept" />
    </protocol>
    <affinity useCookie="true" cookieName="ARRAffinity_johndoe.place.com_lab" />
    <loadBalancing algorithm="WeightedRoundRobin" />
  </applicationRequestRouting>
</webFarm>

Healthchecks Should Not Be Pings

on Saturday, December 17, 2016

I had a long-held belief that health checks should just be pings. “Is the website up?” And for years, that was right. Not anymore.

Recently, a developer asked me if he should use health checks to ensure that the Entity Framework cache stays in memory. It took me a while to disassociate health checks from pings, but he was right. YES, you should use health checks to ensure the health of your site.

You should use health checks to do this:

  • Ensure your site is up and running (ping)
  • Ensure all cached values are available and, if possible, at the latest value.
  • Ensure Entity Framework’s cache is hit before your first user
    • EF is a total hog of resources and complete slowdown on first hit
  • Same thing for WCF
  • Cache any application specific values needed before first hit

Health checks should not be pings. They should check the entire health of the site and its responsiveness: the cache, the database connectivity, and everything that makes a website work. It’s a “health check”, not a ping.
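
A cheap way to put this into practice is a warm-up probe that hits your health check endpoint after a deploy or an app pool recycle, so EF/WCF pay their first-hit cost before a real user does. A minimal sketch (the URL and timeout are placeholders for whatever your health check endpoint looks like):

$healthUrl = "https://site.name.com/healthcheck"   # placeholder endpoint
$sw = [System.Diagnostics.Stopwatch]::StartNew()
$response = Invoke-WebRequest -Uri $healthUrl -UseBasicParsing -TimeoutSec 120
$sw.Stop()
Write-Host "$($response.StatusCode) returned in $([int]$sw.Elapsed.TotalSeconds) seconds"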

Tyk in Docker on Windows 10

on Sunday, October 16, 2016

I’m very new to all this technology so, please take this with a grain of salt. The reason I’m writing it is because I couldn’t find another guide that had end-to-end setup on Tyk in Docker on Windows 10.

Tyk is an API Gateway product that can be used to help manage a centralized location of many services/micro services. It is a product which is built on top of the nginx web server. And, nginx is really only supported as a “server” product on *nix based systems. Their Windows build is considered a beta.

So, there are already some good guides for each of the next steps, I’m just gonna pull them all together, and add one extra piece at the end.

Install Docker

There are a couple ways to get around the limitation of nginx only being “production ready on *nix”, but I chose to try out Tyk on Docker. Docker is the multiplatform container host that has created a lot of buzz within the cloud space. But, it also seems pretty awesome at setting up small containers on your local machine too.

Note: At this time, 2016-10-16, if you download a Docker for Windows installer, use the Beta Channel. The stable channel has a bug when trying to mount volumes into containers.

The docker installation wizard is pretty straightforward, so no worries there. Once installed, right-click on the Docker systray icon and select Open Kitematic …

[image: the Docker systray menu with the Open Kitematic option]

A pop-up window should come up containing instructions on how to download and install Kitematic. It was amazingly simple and gave a nice GUI interface over the command line.

Follow Tyk’s Installer Instructions

Tyk provides instructions to setup the API gateway & dashboard with Docker on their website. I would suggest getting an account at Docker Hub. I don’t remember when in the process I created one, but I needed it to access … something.

In Step 2. Get the quick start compose files, you’ll need to git clone the files to a folder under your C:\Users\XXXX folder. For me, Docker had a permissions restriction that only allowed containers to mount volumes from folders under my user folder. (So, that could be interesting if you run a container on a server under a service account.)

The silver lining about this set of containers is that they only need to use config files from your local drive. So, it’s not like your C:\Users folder is going to store a database.

In Step 4. Bootstrap your dashboard and portal, if you have bash available to you I would suggest trying it when you run ./setup.sh. I haven’t installed Win10 Anniversary Update, Git Bash, or Cygwin so I didn’t have bash available to run setup.sh.

However, I do feel somewhat comfortable in PowerShell, and the setup.sh script didn’t look too long. Below is the PowerShell conversion, which you should save in the same directory as setup.sh; run .\setup.ps1 from the PowerShell ISE with the arguments that you want.

After that, I had a running Tyk API Gateway.

[image: the Tyk API Gateway/Dashboard running locally]

Other Thoughts

Since this was all new technology, I ran into a lot of errors and read through a lot of issue/forum posts, which makes me think this might not be the best idea for a production setup. If you’re able to use Linux servers within your production environment, I would strongly suggest that.

Because I made so many mistakes, I got used to three commands that really helped recreate the environment whenever I messed things up (sketched below). I hope this helps.
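
The gist of that recreate cycle, assuming the docker-compose.yml from Tyk’s quick start (adjust the file and container names to your setup):

docker-compose -f docker-compose.yml stop      # stop the running Tyk containers
docker-compose -f docker-compose.yml rm -f     # remove them so they get recreated cleanly
docker-compose -f docker-compose.yml up -d     # recreate everything and start it in the background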

Get-FullDomainAccount

on Friday, April 1, 2016

In a previous post I forgot to include the PowerShell code for Get-FullDomainAccount. Sorry about that.

Here it is:

$env:USERDOMAIN = "<your domain>"
<#
.SYNOPSIS
	Ensures that the given domain account also has the domain prefix. For example,
	if the -DomainAccount is "IUSR_AbcXyz" then "<your domain>\IUSR_AbcXyz" would most likely
	be returned. The domain is pulled from the current user's domain, $env:USERDOMAIN.

	If -Environment is provided, this will also run the -DomainAccount through
	Get-EnvironmentDomainAccount to replace any environment specific information.

.LINK
	Get-EnvironmentDomainAccount
		Used to apply environment specific value to the domain account

.EXAMPLE
    $result = Get-FullDomainAccount -DomainAccount "IUSR_AbcXyz"
    $result -eq "<your domain>\IUSR_AbcXyz"
#>
Function Get-FullDomainAccount {
[CmdletBinding()]
Param (
    [Parameter(Mandatory=$true)]
    [string] $DomainAccount,
    [string] $Environment = ""
)
    $accountName = $DomainAccount;

    if($Environment -ne "") {
        $accountName = Get-EnvironmentDomainAccount -Environment $Environment -DomainAccount $DomainAccount;
    }

    if($accountName -match "ApplicationPoolIdentity") {
        $accountName = "IIS AppPool\$accountName"
    }

    if($accountName -match "LocalSystem") {
        $accountName = "$($env:COMPUTERNAME)\$accountName"
    }

    if($accountName -notmatch "\\") {
        $accountName = $env:USERDOMAIN + "\" + $accountName;
    }
    return $accountName;
}
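
A few quick examples of the special cases handled above (assuming $env:USERDOMAIN is "<your domain>"):

Get-FullDomainAccount -DomainAccount "IUSR_AbcXyz"              # "<your domain>\IUSR_AbcXyz"
Get-FullDomainAccount -DomainAccount "ApplicationPoolIdentity"  # "IIS AppPool\ApplicationPoolIdentity"
Get-FullDomainAccount -DomainAccount "LocalSystem"              # "<computer name>\LocalSystem"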

WebAdministration Not Loaded Correctly on Remote

on Friday, November 14, 2014

When making remote calls that use the WebAdministration module you can sometimes get this error, inconsistently:

ERROR: Get-WebSite : Could not load file or assembly 'Microsoft.IIS.PowerShell.Framework' or one of its dependencies. The system cannot find the file specified.

It’s a really tricky error because it’s inconsistent. But, there is a workaround that will prevent the error from giving you too much trouble. From the community troubleshooting on this, the problem seems to occur on the first call that uses the WebAdministration module. If you wrap that call in a try/catch, then subsequent calls will work correctly.

$scriptBlock = {
    Import-Module WebAdministration

    try {
        $sites = Get-WebSite
    } catch {
        # http://help.octopusdeploy.com/discussions/problems/5172-error-using-get-website-in-predeploy-because-of-filenotfoundexception
        $sites = Get-WebSite
    }
}

Invoke-Command -ScriptBlock $scriptBlock -ComputerName Remote01

