Diff XML, via sorting XML elements and attributes

Have you ever had to diff XML files? Attribute order is never significant in XML, and for many formats element order doesn’t matter to the consuming application either, so it’s often hard to diff changes in XML, especially if the changes have been generated by tooling. A colleague and I came up with a quick PowerShell script that sorts all elements and attributes in an XML file for an easier diff. Only use it on files where element order carries no meaning.

Note: this will overwrite the file without warning. We chose this option because we’re using source control for the XML files, which makes it easy to diff files against their previous versions after they have been cleaned up and modified by other tooling. You may want to change this behaviour so that you can specify a different file name to save to.

Edit: I’ve updated the script to include ValueFromPipeline for easier piping.

<#
.SYNOPSIS
    Sorts an XML file by element and attribute names. Useful for diffing XML files.
#>

param (
    [Parameter(Mandatory=$true,ValueFromPipeline=$true)]
    # The path to the XML file to be sorted
    [string]$XmlPath
)

process {
    if (-not (Test-Path $XmlPath)) {
        Write-Warning "Skipping $XmlPath, as it was not found."
        # 'return' moves on to the next pipeline item; 'continue' has no enclosing loop here
        return
    }

    $fullXmlPath = (Resolve-Path $XmlPath)
    [xml]$xml = Get-Content $fullXmlPath
    Write-Output "Sorting $fullXmlPath"

    function SortChildNodes($node, $depth = 0, $maxDepth = 20) {
        if ($node.HasChildNodes -and $depth -lt $maxDepth) {
            foreach ($child in $node.ChildNodes) {
                SortChildNodes $child ($depth + 1) $maxDepth
            }
        }

        $sortedAttributes = $node.Attributes | Sort-Object { $_.Name }
        $sortedChildren = $node.ChildNodes | Sort-Object { $_.OuterXml }

        $node.RemoveAll()

        foreach ($sortedAttribute in $sortedAttributes) {
            [void]$node.Attributes.Append($sortedAttribute)
        }

        foreach ($sortedChild in $sortedChildren) {
            [void]$node.AppendChild($sortedChild)
        }
    }

    SortChildNodes $xml.DocumentElement

    $xml.Save($fullXmlPath)
}
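
To see the effect on a small input, here is a minimal sketch that applies the same sorting idea to an in-memory document instead of a file. The sample XML and the Sort-NodeDemo function name are made up for illustration and are not part of the script above; treat it as a rough demonstration rather than a replacement.

```powershell
# Hypothetical sample document; element and attribute names are invented.
[xml]$doc = '<settings><add value="2" key="b" /><add value="1" key="a" /></settings>'

# Hypothetical helper that mirrors the sorting approach of the script above.
function Sort-NodeDemo($node) {
    # Sort the descendants first so each child's OuterXml is already normalised.
    foreach ($child in @($node.ChildNodes)) {
        Sort-NodeDemo $child
    }

    # Snapshot attributes and children in sorted order before clearing the node.
    $sortedAttributes = @($node.Attributes | Sort-Object { $_.Name })
    $sortedChildren = @($node.ChildNodes | Sort-Object { $_.OuterXml })

    $node.RemoveAll()

    foreach ($attribute in $sortedAttributes) {
        [void]$node.Attributes.Append($attribute)
    }
    foreach ($sortedChild in $sortedChildren) {
        [void]$node.AppendChild($sortedChild)
    }
}

Sort-NodeDemo $doc.DocumentElement

# Emit the normalised XML as a single string.
$doc.OuterXml
```

In the result, each add element lists key before value, and the element with key="a" comes before the one with key="b". Two files that differ only in ordering should therefore normalise to the same text.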

5 thoughts on “Diff XML, via sorting XML elements and attributes”

  1. William Hug January 24, 2018 / 05:48

    What would the command be to use this in Visual Studio Team Foundation server (Tools>Options>Source Control>Visual Studio Team Foundation Server>Configure User Tools)?

  2. Zvi Zemel April 21, 2018 / 04:46

    Is this code under any sort of license? We’d like to use the code as-is in a utility script but there’s no information about licensing.

    • Daniel Šmon April 21, 2018 / 10:20

      Hi Zvi,

      There is no license, so happy to mark it as “public domain”, with no warranty etc. Feel free to use this snippet. I hope you get some value from it :).

      Please be cautious, as it’s entirely possible that this code will break your XML. I only tested it on a small set of data, so your mileage may vary.

      Regards,
      Daniel

  3. Tomislav May 18, 2018 / 19:55

    Thanks. The script helped me a lot in comparing unsorted xml files.
