After deploying a site, I needed to compare site directories and files on the front-ends in a load-balanced web server farm to make sure these sites contained identical files and the same directory structure. When an administrator applied a hotfix, all too often files were not deployed properly, and the site behaved inconsistently because of differences in files or configuration between front-ends. To diagnose these kinds of problems more quickly, I developed the PowerShell command Compare-Directory to compare a reference directory with one or more difference directories.
Learning from Compare-Object
When designing Compare-Directory, I took a good look at PowerShell's own Compare-Object, which does a hell of a job comparing two objects:
Compare-Object -ReferenceObject $ref -DifferenceObject $diff -ExcludeDifferent -IncludeEqual `
    -PassThru -CaseSensitive -SyncWindow $sw
After playing with Compare-Object, requirements arose for Compare-Directory:
- I wanted to support ExcludeDifferent, IncludeEqual and PassThru in Compare-Directory as well.
- I wanted to be able to exclude files in the comparison.
- I wanted to be able to exclude directories in the comparison.
- I wanted to recurse the comparison to subdirectories.
- I wanted to compare a reference directory with one or more difference directories.
The output of Compare-Directory should be similar to Compare-Object.
But how do you compare files and directories in the reference directory with those in the difference directory? If I use Get-ChildItem to gather the files and (sub)folders from the reference directory, their full paths are bound to differ from those of the files and (sub)folders in the difference directory. To illustrate, consider this example with reference directory “FrontEnd1-Site” and difference directory “FrontEnd2-Site”, both subdirectories of “CompareObjectTest”:
Reference directory                   Difference directory
C:\TestFrontEnd1-Site\bin\site.dll    C:\TestFrontEnd2-Site\bin\site.dll
C:\TestFrontEnd1-Site\help.txt
C:\TestFrontEnd1-Site\index.htm       C:\TestFrontEnd2-Site\index.htm
C:\TestFrontEnd1-Site\web.config      C:\TestFrontEnd2-Site\web.config
Using Compare-Object to compare the results of Get-ChildItem “C:\TestFrontEnd1-Site” -Recurse with the results of Get-ChildItem “C:\TestFrontEnd2-Site\” -Recurse would report every item as different, because the full paths never match. Comparing on FullName, Name or BaseName would therefore not produce the desired result.
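The problem can be reproduced with two small throwaway directories. The following is a minimal sketch, not part of the original script; the directory names and file contents are made up for the illustration.

```powershell
# Build two temporary directories (hypothetical names) with identical content.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("CompareDemo-" + [guid]::NewGuid())
$null  = New-Item -ItemType Directory -Path $root
$site1 = New-Item -ItemType Directory -Path (Join-Path $root 'Site1')
$site2 = New-Item -ItemType Directory -Path (Join-Path $root 'Site2')
foreach ($site in $site1, $site2) {
    Set-Content -Path (Join-Path $site.FullName 'index.htm')  -Value 'hello'
    Set-Content -Path (Join-Path $site.FullName 'web.config') -Value '<configuration />'
}

$ref  = Get-ChildItem -Path $site1.FullName -Recurse
$diff = Get-ChildItem -Path $site2.FullName -Recurse

# Comparing on FullName: every path differs because the base directory differs.
$byFullName = @(Compare-Object -ReferenceObject $ref -DifferenceObject $diff -Property FullName)

# Comparing on Name: no differences are reported, but location and content are ignored.
$byName = @(Compare-Object -ReferenceObject $ref -DifferenceObject $diff -Property Name)

"Differences by FullName: $($byFullName.Count)"
"Differences by Name:     $($byName.Count)"

# Clean up the temporary directories.
Remove-Item -Path $root -Recurse -Force
```

Comparing on FullName flags all items on both sides, while comparing on Name would also treat two same-named files in different subdirectories, or with different content, as equal.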
Comparing files and directories
Two directories should be compared on two aspects: directory structure and file content. Beginning with the latter, comparing files is simple: compare the file content (without taking file dates into account). The easiest way is to calculate a hash for each file and compare the hashes. Enter Get-MD5.
function Get-MD5
{
    [CmdletBinding(SupportsShouldProcess=$false)]
    param
    (
        [Parameter(Mandatory=$true, ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true, HelpMessage="file(s) to create hash for")]
        [Alias("File", "Path", "PSPath", "String")]
        [ValidateNotNull()]
        $InputObject
    )

    begin
    {
        $cryptoServiceProvider = [System.Security.Cryptography.MD5CryptoServiceProvider]
        $hashAlgorithm = new-object $cryptoServiceProvider
    }

    process
    {
        $hashByteArray = ""

        $item = Get-Item $InputObject -ErrorAction SilentlyContinue
        if ($item -is [System.IO.DirectoryInfo]) { throw "Cannot create hash for directory" }
        if ($item) { $InputObject = $item }

        if ($InputObject -is [System.IO.FileInfo])
        {
            $stream = $null;
            $hashByteArray = $null
            try
            {
                $stream = $InputObject.OpenRead();
                $hashByteArray = $hashAlgorithm.ComputeHash($stream);
            }
            finally
            {
                if ($stream -ne $null)
                {
                    $stream.Close();
                }
            }
        }
        else
        {
            $utf8 = new-object -TypeName "System.Text.UTF8Encoding"
            $hashByteArray = $hashAlgorithm.ComputeHash($utf8.GetBytes($InputObject.ToString()));
        }

        Write-Output ([BitConverter]::ToString($hashByteArray)).Replace("-","")
    }
}
Get-MD5 “C:\TestFrontEnd1-Site\bin\site.dll” would return the MD5 hash for the file:
PS> Get-MD5 "C:\TestFrontEnd1-Site\bin\site.dll"
8FDBD9537FC1D6BA4FD2F1B8944A2686
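On PowerShell 4.0 and later, the built-in Get-FileHash cmdlet can produce the same kind of MD5 string without a custom function. This is an alternative to Get-MD5, not what the script in this article uses; the temporary file below exists only for the demonstration.

```powershell
# Create a small throwaway file to hash (hypothetical name and content).
$file = Join-Path ([System.IO.Path]::GetTempPath()) ("hashdemo-" + [guid]::NewGuid() + ".txt")
[System.IO.File]::WriteAllText($file, 'hello')

# Get-FileHash returns an object; its Hash property holds the hex string
# (uppercase, no dashes), the same format Get-MD5 writes to the pipeline.
$hash = (Get-FileHash -Path $file -Algorithm MD5).Hash
$hash

# Clean up.
Remove-Item -Path $file
```

Unlike Get-MD5, Get-FileHash only hashes files (or streams); it has no fallback that hashes an arbitrary string.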
The hash of a file in the reference directory can now easily be compared with the hash of the same file in the difference directory. The first step in the compare process is collecting the files and directories with Get-ChildItem and calculating the hash for each file. And this is why I love PowerShell: I add the hash as a NoteProperty member named “MD5Hash” to the FileInfo object returned by Get-ChildItem; note the Add-Member call that does this.
Get-ChildItem -Path $DirectoryPath -Exclude $ExcludeFile -Recurse:$Recurse | foreach {
    $hash = ""
    if (!$_.PSIsContainer)
    {
        $hash = Get-MD5 $_
    }

    # Add the MD5 hash as a new property to the DirectoryInfo/FileInfo object
    $item = $_ | Add-Member -Name "MD5Hash" -MemberType NoteProperty -Value $hash -PassThru
}
Now how do we compare the file and directory structure? We need to compare files and directories relative to the base of the reference and difference directories. To illustrate:
Reference directory    Difference directory
\bin\site.dll          \bin\site.dll
\help.txt
\index.htm             \index.htm
\web.config            \web.config
For this purpose, I need to add another member, “RelativeBaseName”, to the FileInfo and DirectoryInfo objects returned by Get-ChildItem. A function called Get-Files retrieves the files and folders and adds the two new members:
PS> Get-Files "C:\Test\FrontEnd1-Site" -Recurse | select fullname, RelativeBaseName, MD5Hash `
    | ft -AutoSize

FullName                            RelativeBaseName MD5Hash
--------                            ---------------- -------
C:\Test\FrontEnd1-Site\bin          \bin\
C:\Test\FrontEnd1-Site\help.txt     \help.txt        9E267E67EDC0AAA2D4DFDEDA885635BB
C:\Test\FrontEnd1-Site\index.htm    \index.htm       B0C46E58F3FDC897F20697FFA7726A0A
C:\Test\FrontEnd1-Site\web.config   \web.config      CC45796B7296B5A2D3EC6DED46626144
C:\Test\FrontEnd1-Site\bin\site.dll \bin\site.dll    4F750B7EFCB6520AE01E01D082D7D476
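The relative name is simply the full path with the base directory stripped off. A minimal sketch of that idea follows, assuming a throwaway directory with made-up files; it uses the platform's directory separator so it also runs outside Windows, whereas Get-Files itself appends a literal backslash.

```powershell
# Build a small temporary tree (hypothetical names): index.htm and bin\site.dll.
$basePath = Join-Path ([System.IO.Path]::GetTempPath()) ("RelDemo-" + [guid]::NewGuid())
$baseDir  = New-Item -ItemType Directory -Path $basePath
$binDir   = New-Item -ItemType Directory -Path (Join-Path $baseDir.FullName 'bin')
Set-Content -Path (Join-Path $baseDir.FullName 'index.htm') -Value 'hello'
Set-Content -Path (Join-Path $binDir.FullName 'site.dll')   -Value 'binary'

# Everything after the base path is the relative name.
$prefixLength = $baseDir.FullName.Length

$items = Get-ChildItem -Path $baseDir.FullName -Recurse | ForEach-Object {
    $relative = $_.FullName.Substring($prefixLength)
    # Mark directories with a trailing separator, as Get-Files does.
    if ($_.PSIsContainer) { $relative += [System.IO.Path]::DirectorySeparatorChar }
    $_ | Add-Member -MemberType NoteProperty -Name RelativeBaseName -Value $relative -PassThru
}

$items | Sort-Object RelativeBaseName | Select-Object RelativeBaseName

# Clean up.
Remove-Item -Path $baseDir.FullName -Recurse -Force
```

Because the base path is removed by length, the base directory should be passed without a trailing separator; otherwise the relative names lose their leading separator.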
Now get a similar file/directory set for the difference directory and Compare-Object can be used to compare the two sets on properties RelativeBaseName and MD5Hash!
PS> $referencefiles = Get-Files "C:\Test\FrontEnd1-Site" -Recurse
PS> $differencefiles = Get-Files "C:\Test\FrontEnd2-Site" -Recurse
PS> Compare-Object -ReferenceObject $referencefiles -DifferenceObject $differencefiles `
    -Property RelativeBaseName, MD5Hash

RelativeBaseName MD5Hash                          SideIndicator Item
---------------- -------                          ------------- ----
\index.htm       3B47F479A6075169531A4B34DF3154A6 =>            index.htm
\help.txt        9E267E67EDC0AAA2D4DFDEDA885635BB <=            help.txt
\index.htm       B0C46E58F3FDC897F20697FFA7726A0A <=            index.htm
Hey! It looks like file index.htm is different and help.txt can only be found in the reference directory (it is missing in the difference directory). I’ve included Compare-Directory.ps1 and the sample front-end site directories in a ZIP, bundled with this article.
Without further ado, here’s the syntax of the Compare-Directory script cmdlet:
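How the SideIndicator values translate into "changed" versus "missing" can be sketched with a few hand-made objects. The property names match the ones used above, but the hash values are short placeholders and the Status wording is my own, not output of Compare-Directory.

```powershell
# Stand-ins for the Get-Files results (placeholder hashes, not real MD5 values).
$reference = @(
    [pscustomobject]@{ RelativeBaseName = '\help.txt';   MD5Hash = 'AAAA' }
    [pscustomobject]@{ RelativeBaseName = '\index.htm';  MD5Hash = 'BBBB' }
    [pscustomobject]@{ RelativeBaseName = '\web.config'; MD5Hash = 'CCCC' }
)
$difference = @(
    [pscustomobject]@{ RelativeBaseName = '\index.htm';  MD5Hash = 'DDDD' }
    [pscustomobject]@{ RelativeBaseName = '\web.config'; MD5Hash = 'CCCC' }
)

# An item is equal only when both the relative name and the hash match.
$result = Compare-Object -ReferenceObject $reference -DifferenceObject $difference `
    -Property RelativeBaseName, MD5Hash

# A name that shows up on both sides has different content;
# a name on one side only is missing from the other directory.
$summary = $result | Group-Object RelativeBaseName | ForEach-Object {
    $sides = $_.Group.SideIndicator
    $status = if ($sides -contains '<=' -and $sides -contains '=>') {
        'content differs'
    } elseif ($sides -contains '<=') {
        'missing in difference directory'
    } else {
        'only in difference directory'
    }
    [pscustomobject]@{ RelativeBaseName = $_.Name; Status = $status }
}
$summary
```

The unchanged \web.config does not appear at all, because Compare-Object only reports differences by default.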
Compare-Directory [-ReferenceDirectory] <DirectoryInfo> [-DifferenceDirectory] <DirectoryInfo[]> [-Recurse] [-ExcludeFile <String[]>] [-ExcludeDirectory <String[]>] [-ExcludeDifferent] [-IncludeEqual] [-PassThru] [<CommonParameters>]
The ExcludeFile and ExcludeDirectory parameters list the names of files and directories to exclude from the comparison. Specifying ExcludeDifferent displays only the characteristics of compared objects that are equal; IncludeEqual displays characteristics of files that are equal. By default, only characteristics that differ between the reference and difference files are displayed. PassThru passes the compared files themselves to the pipeline, instead of the SideIndicator view (as above).
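Since these three switches are handed straight to Compare-Object, their effect is easiest to see on plain strings. The sketch below uses made-up values; it combines -ExcludeDifferent with -IncludeEqual because the Windows PowerShell documentation states that -ExcludeDifferent on its own produces no output (newer PowerShell versions infer -IncludeEqual).

```powershell
$ref  = 'a', 'b', 'c'
$diff = 'b', 'c', 'd'

# Default: only the differences ('a' on the reference side, 'd' on the difference side).
$default = @(Compare-Object -ReferenceObject $ref -DifferenceObject $diff)

# -IncludeEqual: differences plus the equal items ('b' and 'c' marked '==').
$withEqual = @(Compare-Object -ReferenceObject $ref -DifferenceObject $diff -IncludeEqual)

# -IncludeEqual -ExcludeDifferent: only the equal items.
$equalOnly = @(Compare-Object -ReferenceObject $ref -DifferenceObject $diff -IncludeEqual -ExcludeDifferent)

# -PassThru: the original objects instead of the SideIndicator wrapper objects.
$passThru = @(Compare-Object -ReferenceObject $ref -DifferenceObject $diff -PassThru)

"Default: $($default.Count), with equal: $($withEqual.Count), equal only: $($equalOnly.Count)"
"PassThru items: $($passThru -join ', ')"
```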
# Compare-Directory.ps1
# Compare files in one or more directories and return file difference results
# Victor Vogelpoel <victor.vogelpoel@macaw.nl>
# Sept 2013
# Compare-Directory -ReferenceDirectory "C:\Compare-Directory\FrontEnd1-Site" -DifferenceDirectory "C:\Compare-Directory\FrontEnd2-Site"

function global:Compare-Directory
{
    [CmdletBinding()]
    param
    (
        [Parameter(Mandatory=$true, position=0, ValueFromPipelineByPropertyName=$true, HelpMessage="The reference directory to compare one or more difference directories to.")]
        [System.IO.DirectoryInfo]$ReferenceDirectory,

        [Parameter(Mandatory=$true, position=1, ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true, HelpMessage="One or more directories to compare to the reference directory.")]
        [System.IO.DirectoryInfo[]]$DifferenceDirectory,

        [Parameter(Mandatory=$false, ValueFromPipelineByPropertyName=$true, HelpMessage="Recurse the directories")]
        [switch]$Recurse,

        [Parameter(Mandatory=$false, ValueFromPipelineByPropertyName=$true, HelpMessage="Files to exclude from the comparison")]
        [String[]]$ExcludeFile,

        [Parameter(Mandatory=$false, ValueFromPipelineByPropertyName=$true, HelpMessage="Directories to exclude from the comparison")]
        [String[]]$ExcludeDirectory,

        [Parameter(Mandatory=$false, ValueFromPipelineByPropertyName=$true, HelpMessage="Displays only the characteristics of compared objects that are equal.")]
        [switch]$ExcludeDifferent,

        [Parameter(Mandatory=$false, ValueFromPipelineByPropertyName=$true, HelpMessage="Displays characteristics of files that are equal. By default, only characteristics that differ between the reference and difference files are displayed.")]
        [switch]$IncludeEqual,

        [Parameter(Mandatory=$false, ValueFromPipelineByPropertyName=$true, HelpMessage="Passes the objects that differed to the pipeline.")]
        [switch]$PassThru
    )

    begin
    {
        function Get-MD5
        {
            [CmdletBinding(SupportsShouldProcess=$false)]
            param
            (
                [Parameter(Mandatory=$true, ValueFromPipeline=$true, ValueFromPipelineByPropertyName=$true, HelpMessage="file(s) to create hash for")]
                [Alias("File", "Path", "PSPath", "String")]
                [ValidateNotNull()]
                $InputObject
            )

            begin
            {
                $cryptoServiceProvider = [System.Security.Cryptography.MD5CryptoServiceProvider]
                $hashAlgorithm = new-object $cryptoServiceProvider
            }

            process
            {
                $hashByteArray = ""

                $item = Get-Item $InputObject -ErrorAction SilentlyContinue
                if ($item -is [System.IO.DirectoryInfo]) { throw "Cannot create hash for directory" }
                if ($item) { $InputObject = $item }

                if ($InputObject -is [System.IO.FileInfo])
                {
                    $stream = $null;
                    $hashByteArray = $null
                    try
                    {
                        $stream = $InputObject.OpenRead();
                        $hashByteArray = $hashAlgorithm.ComputeHash($stream);
                    }
                    finally
                    {
                        if ($stream -ne $null)
                        {
                            $stream.Close();
                        }
                    }
                }
                else
                {
                    $utf8 = new-object -TypeName "System.Text.UTF8Encoding"
                    $hashByteArray = $hashAlgorithm.ComputeHash($utf8.GetBytes($InputObject.ToString()));
                }

                Write-Output ([BitConverter]::ToString($hashByteArray)).Replace("-","")
            }
        }

        function Get-Files
        {
            [CmdletBinding(SupportsShouldProcess=$false)]
            param
            (
                [string]$DirectoryPath,
                [String[]]$ExcludeFile,
                [String[]]$ExcludeDirectory,
                [switch]$Recurse
            )

            $relativeBasenameIndex = $DirectoryPath.ToString().Length

            # Get the files from the first deploypath
            # and ADD the MD5 hash for the file as a property
            # and ADD a filepath relative to the deploypath as a property
            Get-ChildItem -Path $DirectoryPath -Exclude $ExcludeFile -Recurse:$Recurse | foreach {
                $hash = ""
                if (!$_.PSIsContainer)
                {
                    $hash = Get-MD5 $_
                }

                # Added two new properties to the DirectoryInfo/FileInfo objects
                $item = $_ |
                    Add-Member -Name "MD5Hash" -MemberType NoteProperty -Value $hash -PassThru |
                    Add-Member -Name "RelativeBaseName" -MemberType NoteProperty -Value ($_.FullName.Substring($relativeBasenameIndex)) -PassThru

                # Test for directories and files that need to be excluded because of ExcludeDirectory
                if ($item.PSIsContainer) { $item.RelativeBaseName += "\" }
                if ($ExcludeDirectory | where { $item.RelativeBaseName -like "\$_\*" })
                {
                    Write-Verbose "Ignore item `"$($item.Fullname)`""
                }
                else
                {
                    Write-Verbose "Adding `"$($item.Fullname)`" to result set"
                    Write-Output $item
                }
            }
        }

        $referenceDirectoryFiles = Get-Files -DirectoryPath $referenceDirectory -ExcludeFile $ExcludeFile -ExcludeDirectory $ExcludeDirectory -Recurse:$Recurse
    }

    process
    {
        if ($DifferenceDirectory -and $referenceDirectoryFiles)
        {
            foreach($nextPath in $DifferenceDirectory)
            {
                $nextDifferenceFiles = Get-Files -DirectoryPath $nextpath -ExcludeFile $ExcludeFile -ExcludeDirectory $ExcludeDirectory -Recurse:$Recurse

                ###################################################
                # Compare the contents of the two file/directory arrays and return the results
                $results = @(Compare-Object -ReferenceObject $referenceDirectoryFiles -DifferenceObject $nextDifferenceFiles -ExcludeDifferent:$ExcludeDifferent -IncludeEqual:$IncludeEqual -PassThru:$PassThru -Property RelativeBaseName, MD5Hash)

                if (!$PassThru)
                {
                    foreach ($result in $results)
                    {
                        $path = $ReferenceDirectory
                        $pathFiles = $referenceDirectoryFiles
                        if ($result.SideIndicator -eq "=>")
                        {
                            $path = $nextPath
                            $pathFiles = $nextDifferenceFiles
                        }

                        # Find the original item in the files array
                        $itemPath = (Join-Path $path $result.RelativeBaseName).ToString().TrimEnd('\')
                        $item = $pathFiles | where { $_.fullName -eq $itemPath }

                        $result | Add-Member -Name "Item" -MemberType NoteProperty -Value $item
                    }
                }

                Write-Output $results
            }
        }
    }

<#
.SYNOPSIS
    Compares a reference directory with one or more difference directories.

.DESCRIPTION
    Compare-Directory compares a reference directory with one or more difference directories.
    Files and directories are compared both on filename and contents using an MD5 hash.
    Internally, Compare-Object is used to compare the directories. The behavior and results
    of Compare-Directory are similar to Compare-Object.

.PARAMETER ReferenceDirectory
    The reference directory to compare one or more difference directories to.

.PARAMETER DifferenceDirectory
    One or more directories to compare to the reference directory.

.PARAMETER Recurse
    Include subdirectories in the comparison.

.PARAMETER ExcludeFile
    File names to exclude from the comparison.

.PARAMETER ExcludeDirectory
    Directory names to exclude from the comparison. Directory names are relative to the Reference or Difference Directory path.

.PARAMETER ExcludeDifferent
    Displays only the characteristics of compared files that are equal.

.PARAMETER IncludeEqual
    Displays characteristics of files that are equal. By default, only characteristics that differ
    between the reference and difference files are displayed.

.PARAMETER PassThru
    Passes the objects that differed to the pipeline. By default, this cmdlet does not generate any output.

.EXAMPLE
    Compare-Directory -reference "D:\TEMP\CompareTest\path1" -difference "D:\TEMP\CompareTest\path2" -ExcludeFile "web.config" -recurse

    Compares directories "D:\TEMP\CompareTest\path1" and "D:\TEMP\CompareTest\path2" recursively, excluding "web.config".
    Only differences are shown. Results:

    RelativeBaseName MD5Hash                          SideIndicator Item
    ---------------- -------                          ------------- ----
    bin\site.dll     87A1E6006C2655252042F16CBD7FB41B =>            D:\TEMP\CompareTest\path2\bin\site.dll
    index.html       02BB8A33E1094E547CA41B9E171A267B =>            D:\TEMP\CompareTest\path2\index.html
    index.html       20EE266D1B23BCA649FEC8385E5DA09D <=            D:\TEMP\CompareTest\path1\index.html
    web_2.config     5E6B13B107ED7A921AEBF17F4F8FE7AF <=            D:\TEMP\CompareTest\path1\web_2.config
    bin\site.dll     87A1E6006C2655252042F16CBD7FB41B =>            D:\TEMP\CompareTest\path2\bin\site.dll
    index.html       02BB8A33E1094E547CA41B9E171A267B =>            D:\TEMP\CompareTest\path2\index.html
    index.html       20EE266D1B23BCA649FEC8385E5DA09D <=            D:\TEMP\CompareTest\path1\index.html
    web_2.config     5E6B13B107ED7A921AEBF17F4F8FE7AF <=            D:\TEMP\CompareTest\path1\web_2.config

.EXAMPLE
    Compare-Directory -reference "D:\TEMP\CompareTest\path1" -difference "D:\TEMP\CompareTest\path2" -ExcludeFile "web.config" -recurse -IncludeEqual

    Compares directories "D:\TEMP\CompareTest\path1" and "D:\TEMP\CompareTest\path2" recursively, excluding "web.config".
    Results include the items that are equal:

    RelativeBaseName MD5Hash                          SideIndicator Item
    ---------------- -------                          ------------- ----
    bin                                               ==            D:\TEMP\CompareTest\path1\bin
    bin\site2.dll    98B68D681A8D40FA943D90588E94D1A9 ==            D:\TEMP\CompareTest\path1\bin\site2.dll
    bin\site3.dll    9408C4B29F82260CBBA528342CBAA80F ==            D:\TEMP\CompareTest\path1\bin\site3.dll
    bin\site4.dll    0616E1FBE12D468F611F07768D70C2EE ==            D:\TEMP\CompareTest\path1\bin\site4.dll
    ...
    bin\site8.dll    87A1E6006C2655252042F16CBD7FB41B =>            D:\TEMP\CompareTest\path2\bin\site8.dll
    index.html       02BB8A33E1094E547CA41B9E171A267B =>            D:\TEMP\CompareTest\path2\index.html
    index.html       20EE266D1B23BCA649FEC8385E5DA09D <=            D:\TEMP\CompareTest\path1\index.html
    web_2.config     5E6B13B107ED7A921AEBF17F4F8FE7AF <=            D:\TEMP\CompareTest\path1\web_2.config

.EXAMPLE
    Compare-Directory -reference "D:\TEMP\CompareTest\path1" -difference "D:\TEMP\CompareTest\path2" -ExcludeFile "web.config" -recurse -ExcludeDifferent

    Compares directories "D:\TEMP\CompareTest\path1" and "D:\TEMP\CompareTest\path2" recursively, excluding "web.config".
    Results only include the files that are equal; different files are excluded from the results.

.EXAMPLE
    Compare-Directory -reference "D:\TEMP\CompareTest\path1" -difference "D:\TEMP\CompareTest\path2" -ExcludeFile "web.config" -recurse -Passthru

    Compares directories "D:\TEMP\CompareTest\path1" and "D:\TEMP\CompareTest\path2" recursively, excluding "web.config",
    and returns NO comparison results, but the different files themselves!

    FullName
    --------
    D:\TEMP\CompareTest\path2\bin\site3.dll
    D:\TEMP\CompareTest\path2\index.html
    D:\TEMP\CompareTest\path1\index.html
    D:\TEMP\CompareTest\path1\web_2.config

.LINK
    Compare-Object
#>
}
You can also download the PowerShell script Compare-Directory.ps1 at this gist, or download the script and sample bogus files in this archive: Compare-Directory.ZIP.
Well, have fun with Compare-Directory and make sure to let me know how you’re using it.
December 2, 2013 at 21:03
Hey Victor, thanks for this. It’s a nice piece of work. I just had one question. I’m not seeing any way to specify the exclusion of a file in a specific directory. In my exclude list I’d like to be explicit about which file with a certain name is excluded, in case there are multiple files with the same name in different directories.
Thanks.
December 3, 2013 at 08:45
Sean, thank you for your comment.
File exclusion is handled by Get-ChildItem, so if you specify a file name to exclude, Get-ChildItem will exclude every file with that name in every subdirectory. Compare-Directory is currently not designed to handle your requirement.
What you can do is modify Compare-Directory to remove the Get-Files functionality and feed the reference and difference file arrays to the function to work out the differences. You’ll have to figure out how to exclude specific files in specific directories. You could use Get-ChildItem to get all files from a directory and remove specific files from the resulting FileInfo array before feeding it to the modified Compare-Directory function… (I would call the function “Compare-Files” by now 😉)
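A rough sketch of that filtering step, under stated assumptions: the temporary directory, its files and the $excludeRelative list are all hypothetical, and the filtered array would still need to be fed into a modified comparison function that does not exist in this article.

```powershell
# Build a throwaway tree with two files named web.config (hypothetical layout).
$basePath = Join-Path ([System.IO.Path]::GetTempPath()) ("ExcludeDemo-" + [guid]::NewGuid())
$baseDir  = New-Item -ItemType Directory -Path $basePath
$binDir   = New-Item -ItemType Directory -Path (Join-Path $baseDir.FullName 'bin')
Set-Content -Path (Join-Path $baseDir.FullName 'index.htm')  -Value 'hello'
Set-Content -Path (Join-Path $baseDir.FullName 'web.config') -Value 'root config'
Set-Content -Path (Join-Path $binDir.FullName 'web.config')  -Value 'bin config'

# Relative paths of the specific files to leave out: only the copy in bin.
$excludeRelative = @(Join-Path 'bin' 'web.config')

# Collect all files, then drop the ones whose relative path is on the list.
$files = @(Get-ChildItem -Path $baseDir.FullName -Recurse -File | Where-Object {
    $relative = $_.FullName.Substring($baseDir.FullName.Length).TrimStart('\', '/')
    $excludeRelative -notcontains $relative
})

$files | Select-Object FullName

# Clean up.
Remove-Item -Path $baseDir.FullName -Recurse -Force
```

The web.config in the root survives the filter, while the one in bin is removed, which is the distinction that -Exclude on Get-ChildItem cannot make.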
December 4, 2013 at 06:37
Yep. That makes sense. Thanks for the response.
May 14, 2014 at 21:25
Reblogged this on rsr72 and commented:
PowerShell compare directories and MD5 hash
July 30, 2020 at 10:05
Hi Victor,
thanks a lot for your work. Even some years later the script helped me. 🙂
But when entering relative paths, the script does not react as expected.
One suggestion for improvement: Before using the two path variables, I added this code
# problem, with different working directories (Windows / Powershell):
[Environment]::CurrentDirectory = $ExecutionContext.SessionState.Path.CurrentFileSystemLocation
# resolve relative paths
$ReferenceDirectory = [System.IO.Path]::GetFullPath($ReferenceDirectory)
for ($i=0; $i -lt $DifferenceDirectory.Length; $i++) {
$DifferenceDirectory[$i] = [System.IO.Path]::GetFullPath($DifferenceDirectory[$i])
}
August 7, 2020 at 09:53
Thanks for the update, Erich
November 16, 2020 at 18:42
Hi,
Wanted to thank you for the script and include a couple of things that helped me implement it.
To call the script from another worker script:
# load PS functions (Note the space between the period and the path)
. C:\Powershell\Compare-Directory.ps1
How to exclude a list of specific files:
# files that are known to change (note: the backtick (`) is only there to spread the list over multiple lines instead of one long row)
$ExcludeFile = "cert.pfx",`
"mediregs.html",`
"RadUploadTestFile",`
"RadUploadTemp",`
"App_Data"
November 16, 2020 at 18:59
hello Chris.
You’re welcome! Comparing directories is still a very popular subject after 7 years and I see much traffic going to this page.
As for the $ExcludeFile list: favor the array subexpression @() and not the backtick, which is easily forgotten when you extend the list. Your list would become:
$ExcludeFile = @("cert.pfx",
"mediregs.html",
"RadUploadTestFile",
"RadUploadTemp",
"App_Data")