TheGeekery

The Usual Tech Ramblings

Cleaning up remote directories with PowerShell

Part of our application at work sends email notifications to various users when certain criteria are met. This is great, except testing is difficult, as you really don’t want to send mail to real customers. So we disable outbound relaying, and the local mail server chomps on each message for a bit, decides not to deliver it, and drops it to the local file system. This is also great, except that in a matter of days we have 100k message files. That becomes entirely unmanageable with Windows, so PowerShell is here to help…

There are plenty of articles on file management with PowerShell, pretty much all of them revolving around the Get-ChildItem cmdlet. This usually works pretty well, but it is terrible on memory when handling this many files and filtering them (say, only selecting files that are X days old).

This is when my brain clicked back to a PowerShell.com tip from a few weeks ago about Windows Management Instrumentation (WMI) based filters. I won’t go into the dark recesses of my brain and try to explain why I have a weird ability to remember some weird function that I used 5 years ago, but I’ll save you some work and introduce you to the WMI class CIM_DataFile. The cool thing about using WMI for this is that you can use server-side filters to do most of the dirty work for you. So let’s see the example:

# In WQL the backslash is the escape character, so every "\" in the path is doubled
Get-WmiObject CIM_DataFile -Computer yourremoteserver `
    -Filter "Drive='C:' and Path='\\InetPub\\MailRoot\\BadMail\\'"

Okay, that was pretty easy… right? That just gets all the files in the folder C:\inetpub\mailroot\badmail on the server yourremoteserver. Now we want to limit it so we keep only the mail from the last 3 days. Before I dig into the code that does this, note that the date format on CIM_DataFile through PowerShell is a little ‘odd’. Don’t believe me? Try this:

Get-WmiObject CIM_DataFile `
    -Filter "Drive='C:' and Path='\\Temp\\'" | Select-Object FileName, CreationDate

You will probably end up with something that looks like this:

FileName                                 CreationDate
--------                                 ------------
process_usage                            20100407083839.775225-300
proxy1                                   20100517150112.509957-300
proxy2                                   20100517150105.506804-300
Registration                             20100131221914.239533-360
sql_job_status                           20100401160936.442674-300

Okay, not too odd; it’s really just the date in the format YYYYMMDDhhmmss.nnnnnn-tzo. n is microseconds, and tzo is the timezone offset from UTC in minutes, with its sign in front. In the above example, 300 would be 5 hours and 360 is 6 hours. This means the query needs a slight modification.
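To make that format concrete, here is a minimal sketch that pulls one of these strings apart by hand. ConvertFrom-DmtfDate is a made-up helper name, not something from the original tip, and it assumes the value always has the full 25-character shape shown in the listing.

```powershell
# Hypothetical helper (not part of WMI or PowerShell): parse a WMI date
# string such as 20100407083839.775225-300 into a DateTimeOffset.
function ConvertFrom-DmtfDate {
    param([Parameter(Mandatory = $true)][string]$Value)

    # 14 digits of date/time, 6 digits of microseconds, a sign, 3 digits of minutes
    if (-not ($Value -match '^(\d{14})\.(\d{6})([+-])(\d{3})$')) {
        throw "Not a WMI date string: $Value"
    }
    $stamp   = $Matches[1]
    $micros  = [int]$Matches[2]
    $minutes = [int]$Matches[4]
    if ($Matches[3] -eq '-') { $minutes = -$minutes }

    $local = [datetime]::ParseExact($stamp, 'yyyyMMddHHmmss', [System.Globalization.CultureInfo]::InvariantCulture)
    $local = $local.AddTicks($micros * 10)   # 1 microsecond = 10 ticks

    New-Object -TypeName System.DateTimeOffset -ArgumentList $local, ([TimeSpan]::FromMinutes($minutes))
}

$parsed = ConvertFrom-DmtfDate '20100407083839.775225-300'
```

For the first row of the listing, that comes out as 7 April 2010, 08:38:39, with an offset of minus five hours. Back to the query, which needs its cutoff written in this same format: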

$date = Get-Date
$date = $date.AddDays(-3)
$sDate = "{0:0000}{1:00}{2:00}000000.000000-000" -f $date.Year, $date.Month, $date.Day  # midnight of the cutoff day, in WMI's date format
Get-WmiObject CIM_DataFile -Computer yourremoteserver `
    -Filter "Drive='C:' and Path='\\InetPub\\MailRoot\\BadMail\\' and CreationDate <= '$sDate'"

You might be wondering what that weird stuff on the third line is: the -f operator is the PowerShell equivalent of C#’s string.Format. It takes the date we had and reformats it nicely for WMI to use. This is much nicer, since it gets WMI to do all the grunt work and is much friendlier on the server (memory/CPU). Next, the delete…
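One thing that format string glosses over: it pins the cutoff to midnight with a -000 offset, which as far as I can tell WMI reads as UTC rather than server-local time. For a rough 3-day cleanup that hardly matters, but if you want the cutoff to carry a real offset, a sketch like the one below builds it. ConvertTo-DmtfDate and its -OffsetMinutes parameter are my own illustrative names, and the offset defaults to that of the machine running the script, which is only an assumption if the server sits in another timezone.

```powershell
# Hypothetical helper: format a DateTime as a WMI date string.
function ConvertTo-DmtfDate {
    param(
        [Parameter(Mandatory = $true)][datetime]$Date,
        [int]$OffsetMinutes   # assumption: falls back to the local UTC offset
    )
    if (-not $PSBoundParameters.ContainsKey('OffsetMinutes')) {
        $OffsetMinutes = [int][TimeZoneInfo]::Local.GetUtcOffset($Date).TotalMinutes
    }
    $sign = '+'
    if ($OffsetMinutes -lt 0) { $sign = '-' }

    $stamp  = $Date.ToString('yyyyMMddHHmmss', [System.Globalization.CultureInfo]::InvariantCulture)
    # Ticks are 100 ns each, so the sub-second ticks divided by 10 give microseconds
    $micros = [int][Math]::Floor([double]($Date.Ticks % 10000000) / 10)

    '{0}.{1:000000}{2}{3:000}' -f $stamp, $micros, $sign, [Math]::Abs($OffsetMinutes)
}

$cutoff = ConvertTo-DmtfDate -Date ([datetime]'2010-05-17') -OffsetMinutes -300
# $cutoff is 20100517000000.000000-300 and can be dropped into the filter
# in place of $sDate.
```

Either way you end up with a cutoff string, and the delete just tacks one more stage onto the pipeline: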

$date = Get-Date
$date = $date.AddDays(-3)
$sDate = "{0:0000}{1:00}{2:00}000000.000000-000" -f $date.Year, $date.Month, $date.Day
Get-WmiObject CIM_DataFile -Computer yourremoteserver `
    -Filter "Drive='C:' and Path='\\InetPub\\MailRoot\\BadMail\\' and CreationDate <= '$sDate'" | ForEach-Object { $_.Delete() }
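One thing worth knowing: Delete() on a CIM_DataFile hands back a result object, and a failed delete (access denied, file in use, and so on) typically shows up as a non-zero ReturnValue on that object rather than as an error. Here is a rough sketch of a wrapper that surfaces those. Remove-WmiFile is a hypothetical name of mine, not a built-in cmdlet.

```powershell
# Hypothetical wrapper: call Delete() on each piped object and report the
# ones whose ReturnValue is not 0 (0 means the delete succeeded).
function Remove-WmiFile {
    param([Parameter(ValueFromPipeline = $true)]$File)
    process {
        $result = $File.Delete()
        if ($result.ReturnValue -ne 0) {
            Write-Warning ("Could not delete {0} (ReturnValue {1})" -f $File.Name, $result.ReturnValue)
            $File.Name   # emit the name so the failures can be collected
        }
    }
}

# Usage, assuming $sDate was built as in the snippet above:
# $failed = Get-WmiObject CIM_DataFile -Computer yourremoteserver `
#     -Filter "Drive='C:' and Path='\\InetPub\\MailRoot\\BadMail\\' and CreationDate <= '$sDate'" |
#     Remove-WmiFile
```

Anything left in $failed is a file that is still sitting on the server.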

As I mentioned at the beginning, it’s possible to use the Get-ChildItem cmdlet to do the same thing (except it’s not remote), so I figured I’d show how to do that too…

$date = Get-Date
$date = $date.AddDays(-3)
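# FileInfo exposes CreationTime (there is no CreationDate property); skip subfolders
Get-ChildItem -Path 'C:\inetpub\mailroot\badmail' |
    Where-Object { -not $_.PSIsContainer -and $_.CreationTime -le $date } |
    ForEach-Object { $_.Delete() }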

The problem, as you can see here, is that the filtering is done after the items are fetched, rather than restricting the fetched list to begin with. It also cannot be executed remotely (unless you use network shares), so running this from a central job server is a little more tricky.

I’m sticking with the WMI method for now; it’s fast, memory and CPU friendly, and does the job just as nicely. How do you guys manage remote files with PowerShell? Any hints and tips?

Edit: Thanks to Steven Klassen, I edited some of the code. Apparently the CodeColorer plugin was escaping my escape and only showing a single backslash on the end of some of the commands.
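Since the doubled backslashes are exactly the kind of thing that goes wrong, here is a small sketch that builds the filter from an ordinary folder path instead of hand-escaping it. Get-DataFileFilter is a hypothetical helper name, and it assumes a plain drive-letter path like C:\Temp with no quotes in it (no UNC paths).

```powershell
# Hypothetical helper: turn a folder path such as C:\InetPub\MailRoot\BadMail
# into a WQL filter for CIM_DataFile. Assumes a drive-letter path, not UNC,
# and no single quotes in the folder name.
function Get-DataFileFilter {
    param([Parameter(Mandatory = $true)][string]$Folder)

    if (-not ($Folder -match '^([A-Za-z]:)(\\.*)$')) {
        throw "Expected a path like C:\Folder, got: $Folder"
    }
    $drive = $Matches[1]
    $path  = $Matches[2].TrimEnd('\') + '\'   # Path has to end with a backslash
    $path  = $path.Replace('\', '\\')         # WQL writes each "\" as "\\"

    "Drive='$drive' and Path='$path'"
}

$filter = Get-DataFileFilter 'C:\InetPub\MailRoot\BadMail'
# $filter is: Drive='C:' and Path='\\InetPub\\MailRoot\\BadMail\\'
```

The result can be passed straight to -Filter, with the CreationDate condition appended as before.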
