The first thing is, Don’t Do This.
I needed to put our main proxy servers IIS logs into a database in order to calculate out the total bytes sent and total bytes received over time. The reason for the analytics was to estimate the expected cost of running a similar setup in AWS with an Application Load Balancer.
To load the data, I copied a powershell script from this blog (Not so many…), which was a modification of the original script from the same blog. The script is below. It is meant to be run as scheduled task and to log the details of each run using PowerShellLogging. The script is currently setup to only import a single day of data, but it can be altered to load many days without much effort.
But I want to focus on the size of the database and the time it takes to load.
IIS Log Information
90 Sites
38 days of logs
58.8 GB of IIS Logs
Database Configuration Information
My Personal Work Machine (not an isolated server)
MS SQL Server 2016 SP 1
1TB SSD Drive
Limited to 5GB Memory
Core i7-6700
Windows 10 1803
First Attempt – An Unstructured Database
This was “not thinking ahead” in a nutshell. I completely ignored the fact that I needed to query the data afterwards and simply loaded all of into a table which contained no Primary Key or Indexes.
The good news was it loaded “relatively” quickly.
Stats
- 151 Million Records
- 161 GB of Disk Space (that’s a 273% increase)
- 7h 30m Running Time
The data was useless as I couldn’t look up anything without a full table scan. I realized this problem before running my first query, so I have no data on how long that would have taken; but I figure it would have been a long time.
First Attempt Part 2 – Adding Indexes (Bad Idea)
Foolishly, I thought I could add the indexes to the table. So, I turned on Simple Logging and tried to add a Primary Key.
Within 1h 30m the database had grown to over 700 GB and a lot of error messages started popping up. I had to forcefully stop MSSQL Server and delete the .mdf/.ldf files by hand.
So, that was a really bad idea.
Second Attempt – Table with Indexes
This time I created a table with 9 indexes (1 PK, 8 IDX) before loading the data. Script below.
With the additional indexes and a primary key having a different sort order than the way the data was being loaded, it took significantly longer to load.
Stats
- 151 Million Records
- 362 GB of Disk Space (that’s a 615% increase)
- 77 GB Data
- 288 GB Indexes
- 25h Running Time
I was really surprised to see the indexes taking up that much space. It was a lot of indexes, but I wanted to be covered for a lot of different querying scenarios.
Daily Imports
Stats
- 3.8 Million Records
- 6 GB of Disk Space
- 30m Running Time
Initial Thoughts …
Don’t Do This.
There are a lot of great log organizing companies out there: Splunk, New Relic, DataDog, etc. I have no idea how much they cost, but the amount of space and the amount of time it takes to organize this data for querying absolutely justifies the need for their existence.
############## https://notsomany.wordpress.com/2017/12/08/iis-logs-to-sql-database-using-powershell-and-log-parser/ | |
$dbserver = "$env:COMPUTERNAME,1433" | |
$database = "iislogs" | |
$proxyserver = "<your-web-server>" | |
$ErrorActionPreference = "Stop" | |
$Error.Clear() | |
$name = "Import-IisLogFiles" | |
$start = [DateTime]::Now | |
$startFormatted = $start.ToString("yyyyMMddHHmmss") | |
$logpath = "E:\Logs\Scripts\IIS\$name\$name-log-$startFormatted.txt" | |
$logoutput = Enable-LogFile $logpath | |
$logParser = "${env:ProgramFiles(x86)}\Log Parser 2.2\LogParser.exe" | |
try { | |
#Import-Module Pscx -EA 0 | |
Write-Host "Start: $start" -ForegroundColor Yellow | |
function Import-IisLogFilesToDb( | |
[string] $LogPath, | |
[string] $IisServer, | |
[string] $IisSiteName, | |
[string] $DbServer, | |
[string] $DbName | |
) | |
{ | |
if ([string]::IsNullOrEmpty($LogPath) -eq $true) | |
{ | |
throw "The log path must be specified." | |
} | |
if ((Test-Path $LogPath) -eq $false) | |
{ | |
throw "The log file doesn't exist, $LogPath." | |
} | |
$query = ` | |
"SELECT" ` | |
+ " TO_TIMESTAMP(date, time) AS EntryTime" ` | |
+ ", LogFilename" ` | |
+ ", RowNumber" ` | |
+ ", '$IisSiteName' as SiteName" ` | |
+ ", '$IisServer' as ServerName" ` | |
+ ", s-ip AS sIp" ` | |
+ ", cs-method AS csMethod" ` | |
+ ", cs-uri-stem AS csUriStem" ` | |
+ ", cs-uri-query AS csUriQuery" ` | |
+ ", s-port AS sPort" ` | |
+ ", TO_STRING(cs-username) AS csUsername" ` | |
+ ", c-ip AS cIp" ` | |
+ ", cs(User-Agent) AS csUserAgent" ` | |
+ ", cs(Referer) AS csReferer" ` | |
+ ", sc-status AS scStatus" ` | |
+ ", sc-substatus AS scSubstatus" ` | |
+ ", sc-win32-status AS scWin32Status" ` | |
+ ", sc-bytes AS scBytes" ` | |
+ ", cs-bytes AS csBytes" ` | |
+ ", time-taken AS timeTaken" ` | |
+ " INTO IisLog" ` | |
+ " FROM $LogPath" | |
$connectionString = "Driver={SQL Server Native Client 11.0};Server=$DbServer;Database=$DbName;Trusted_Connection=yes;" | |
[string[]] $parameters = @() | |
$parameters += $query | |
$parameters += "-i:W3C" | |
$parameters += "-e:-1" | |
$parameters += "-o:SQL" | |
$parameters += "-createTable:ON" | |
$parameters += "-oConnString:`"$connectionString`"" | |
Write-Debug "Parameters: $parameters" | |
Write-Host ("[{0}] Importing $LogPath to $database.$dbserver ..." -f [DateTime]::Now) | |
$elapsed = [System.Diagnostics.Stopwatch]::StartNew() | |
## this is the important line | |
$result = . $logParser $parameters | |
$elapsed.Stop() | |
$recordCount = $result |% { if($_ -match "Elements output") { $_.Split(" ", [StringSplitOptions]::RemoveEmptyEntries)[2] } } | |
Write-Host ("[{0}] Finished importing $LogPath to $DbName.$DbServer (Running Time: {1}, Records: {2})" -f [DateTime]::Now, $elapsed.Elapsed, $recordCount) | |
} | |
## run the program | |
$sites = dir -Path "\\$proxyserver\Logs\IIS\" -Directory | |
$todaysLog = ("u_ex{0}.log" -f [DateTime]::UtcNow.ToString("yyMMdd")) | |
$yesterdaysLog = ("u_ex{0}.log" -f [DateTime]::UtcNow.AddDays(-1).ToString("yyMMdd")) | |
foreach($site in $sites) { | |
if(Test-Path "$($site.FullName)\LogFiles") { | |
$w3Name = (dir -Path "$($site.FullName)\LogFiles" -Directory).Name | |
$logs = dir -Path "$($site.FullName)\LogFiles\$w3Name" -File -Filter *.log; | |
$logs = $logs |? { $_.Name -match $yesterdaysLog } | |
foreach($log in $logs) { | |
#Write-Host $log.FullName | |
Import-IisLogFilesToDb ` | |
-LogPath $log.FullName ` | |
-IisServer $server ` | |
-IisSiteName $site.Name ` | |
-DbServer $dbserver ` | |
-DbName $database | |
} | |
} | |
} | |
} finally { | |
foreach($er in $Error) { | |
$er | |
} | |
$end = [DateTime]::Now | |
Write-Host "End: $end" -ForegroundColor Yellow | |
Write-Host "Total: $($end - $start)" | |
$logoutput | Disable-LogFile | |
if(@($Error).Count -gt 0) { | |
# send an email to steven | |
$priority = [System.Net.Mail.MailPriority]::Normal; | |
if($environment -eq "prod") { $priority = [System.Net.Mail.MailPriority]::High } | |
$messageParams = @{ | |
Subject = "[Error] $name Error ($environment, $($env:COMPUTERNAME))"; | |
Priority = $priority; | |
BodyAsHtml = $true; | |
Body = "<pre>`n" + ((Get-Content $logoutput.Path) -join "`n") + "`n</pre>"; | |
From = "noreply@sa.ucsb.edu"; | |
To = @("Your Name <your.name@your.domain.com>") | |
SmtpServer = "smtp.your.domain.com"; | |
}; | |
Send-MailMessage @messageParams; | |
} | |
} |
SET NUMERIC_ROUNDABORT OFF | |
GO | |
SET ANSI_PADDING, ANSI_WARNINGS, CONCAT_NULL_YIELDS_NULL, ARITHABORT, QUOTED_IDENTIFIER, ANSI_NULLS ON | |
GO | |
SET XACT_ABORT ON | |
GO | |
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE | |
GO | |
BEGIN TRANSACTION | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
PRINT N'Creating [dbo].[IisLog]' | |
GO | |
CREATE TABLE [dbo].[IisLog] | |
( | |
[EntryTime] [datetime] NOT NULL, | |
[LogFilename] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL, | |
[RowNumber] [int] NOT NULL, | |
[SiteName] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, | |
[ServerName] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, | |
[sIp] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, | |
[csMethod] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, | |
[csUriStem] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, | |
[csUriQuery] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, | |
[sPort] [int] NULL, | |
[csUsername] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, | |
[cIp] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, | |
[csUserAgent] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, | |
[csReferer] [varchar] (255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL, | |
[scStatus] [int] NULL, | |
[scSubstatus] [int] NULL, | |
[scWin32Status] [int] NULL, | |
[scBytes] [int] NULL, | |
[csBytes] [int] NULL, | |
[timeTaken] [int] NULL | |
) | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
PRINT N'Creating primary key [PK_IisLog] on [dbo].[IisLog]' | |
GO | |
ALTER TABLE [dbo].[IisLog] ADD CONSTRAINT [PK_IisLog] PRIMARY KEY CLUSTERED ([EntryTime], [LogFilename], [RowNumber]) | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
PRINT N'Creating index [IX_EntryTime_LogFilename_RowNumber_csUriStem] on [dbo].[IisLog]' | |
GO | |
CREATE NONCLUSTERED INDEX [IX_EntryTime_LogFilename_RowNumber_csUriStem] ON [dbo].[IisLog] ([EntryTime], [LogFilename], [RowNumber], [csUriStem]) | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
PRINT N'Creating index [IX_EntryTime_LogFilename_RowNumber_csUriStem_timeTaken] on [dbo].[IisLog]' | |
GO | |
CREATE NONCLUSTERED INDEX [IX_EntryTime_LogFilename_RowNumber_csUriStem_timeTaken] ON [dbo].[IisLog] ([EntryTime], [LogFilename], [RowNumber], [csUriStem], [timeTaken]) | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
PRINT N'Creating index [IX_EntryTime_LogFilename_RowNumber_siteName_cIp_csUriStem] on [dbo].[IisLog]' | |
GO | |
CREATE NONCLUSTERED INDEX [IX_EntryTime_LogFilename_RowNumber_siteName_cIp_csUriStem] ON [dbo].[IisLog] ([EntryTime], [LogFilename], [RowNumber], [SiteName], [cIp], [csUriStem]) | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
PRINT N'Creating index [IX_EntryTime_LogFilename_RowNumber_siteName_csUriStem] on [dbo].[IisLog]' | |
GO | |
CREATE NONCLUSTERED INDEX [IX_EntryTime_LogFilename_RowNumber_siteName_csUriStem] ON [dbo].[IisLog] ([EntryTime], [LogFilename], [RowNumber], [SiteName], [csUriStem]) | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
PRINT N'Creating index [IX_EntryTime_LogFilename_RowNumber_siteName_csUriStem_scStatus] on [dbo].[IisLog]' | |
GO | |
CREATE NONCLUSTERED INDEX [IX_EntryTime_LogFilename_RowNumber_siteName_csUriStem_scStatus] ON [dbo].[IisLog] ([EntryTime], [LogFilename], [RowNumber], [SiteName], [csUriStem], [scStatus]) | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
PRINT N'Creating index [IX_EntryTime_LogFilename_RowNumber_siteName_csUriStem_scStatus_scSubstatus_timeTaken] on [dbo].[IisLog]' | |
GO | |
CREATE NONCLUSTERED INDEX [IX_EntryTime_LogFilename_RowNumber_siteName_csUriStem_scStatus_scSubstatus_timeTaken] ON [dbo].[IisLog] ([EntryTime], [LogFilename], [RowNumber], [SiteName], [csUriStem], [scStatus], [scSubstatus], [timeTaken]) | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
PRINT N'Creating index [IX_EntryTime_LogFilename_RowNumber_siteName_csUriStem_scStatus_timeTaken] on [dbo].[IisLog]' | |
GO | |
CREATE NONCLUSTERED INDEX [IX_EntryTime_LogFilename_RowNumber_siteName_csUriStem_scStatus_timeTaken] ON [dbo].[IisLog] ([EntryTime], [LogFilename], [RowNumber], [SiteName], [csUriStem], [scStatus], [timeTaken]) | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
PRINT N'Creating index [IX_EntryTime_LogFilename_RowNumber_siteName_ServerName_csUriStem] on [dbo].[IisLog]' | |
GO | |
CREATE NONCLUSTERED INDEX [IX_EntryTime_LogFilename_RowNumber_siteName_ServerName_csUriStem] ON [dbo].[IisLog] ([EntryTime], [LogFilename], [RowNumber], [SiteName], [ServerName], [csUriStem]) | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
COMMIT TRANSACTION | |
GO | |
IF @@ERROR <> 0 SET NOEXEC ON | |
GO | |
DECLARE @Success AS BIT | |
SET @Success = 1 | |
SET NOEXEC OFF | |
IF (@Success = 1) PRINT 'The database update succeeded' | |
ELSE BEGIN | |
IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION | |
PRINT 'The database update failed' | |
END | |
GO |
0 comments:
Post a Comment