I just stumbled upon a utility today that is going to save me a LOT of time — Microsoft Log Parser 2.2.
The past week has presented me with two different problems that I knew a SQL query would solve but my data was in text files and I wanted to keep it that way. I’m still a SQL newbie; I can write some queries to squeeze out the information I need but thats about it.
Let’s try MS Query
The first problem involved one text file with directory names and sizes. The second text file was a list of users with both username and userid as fields. I needed to find out which directories matched up with userid’s.
I found the MS Query* tool included with Office. Supposedly you can use it to treat both text files and Excel files as databases using SQL queries. I could never get it to work right for either file type. I shelved the problem for later.
That Didn’t Work, Now What?
On to the next puzzle. We have a field called employeeID added to the schema of Active Directory. I needed to find all accounts that have duplicate employeeID’s. I knew I could do this with a SQL query. I already had a CSV file containing a dump of several AD attributes for all users. I planned on importing the CSV file into a SQL table and working with it. In searching to figure out how to build the query, I found Log Parser.
Log Parser is a utility written by Microsoft originally to parse IIS logs. It does that and much, much more. It will parse just about any structured text file, the event log, the file system, and the registry. It’s not a supported utility but this site has an active user community.
I was able to use this command to pull up a nice GUI table format:
logparser file:myquery.sql -i:csv -o:datagrid
This command exported the same information to a CSV file:
logparser file:myquery.sql -i:csv -o:CSV -headers:ON > duplicate-employeeid.csv
My myquery.sql file looked like this:
SELECT employeeID,sAMAccountName,sn,givenName FROM 'users.csv' WHERE employeeID IN (SELECT employeeID FROM 'users.csv' GROUP BY employeeID HAVING (COUNT(employeeID) > 1)) ORDER BY employeeID
Then I discovered that Log Parser would query Active Directory! I knew how to use a simple SELECT statement in VBScript but it didn’t support all the SQL keywords I needed to use. This utility does that beautifully.
I changed my query file to this:
SELECT employeeID,sAMAccountName,sn,givenName FROM LDAP://domain.org WHERE employeeID IN (SELECT employeeID FROM LDAP://domain.org GROUP BY employeeID HAVING (COUNT(employeeID) > 1)) ORDER BY employeeID
And ran this command and voila! A complex SQL query of Active Directory!
logparser file:myquery.sql -i:ADS -objectClass:User -o:CSV -headers:ON > duplicate-employeeid.csv
The documentation that comes with Log Parser is very useful and comprehensive. I also found an article (part1, part2) describing how to use Log Parser to view all the different types of Exchange logs and pull information into charts and graphs. I’ll be checking that out for sure.
Download Log Parser here: http://www.microsoft.com/download/en/details.aspx?id=24659
Apparently there’s a book, too: Microsoft Log Parser Toolkit: A complete toolkit for Microsoft’s undocumented log analysis tool
* MS Query located here for Office 2010: c:\Program Files\Microsoft Office\Office14\MSQRY32.EXE.
I found today that you can’t use JOIN in the queries for Log Parser. You CAN use the IN keyword to work around this. For a more complicated JOIN workaround, I suppose you could do one query, export it to a file and then do another query based on that.
Sample IN query:
SELECT DIR As UserID, SIZE_MB As Dir_Size_MB FROM 'directory-size.txt' WHERE UserID IN (SELECT USERID FROM 'users.txt') ORDER BY Dir_Size_MB DESC