Remove Duplicates From XML with XSLT
http://forums.asp.net/t/1781907.aspx/1?Remove+Duplicates+From+XML+with+XSLT
Tue, 20 Mar 2012 01:54:14 -0400

<p>I have a sitemap.xml and am trying to figure out how to remove duplicate &lt;loc&gt; entries. If a &lt;loc&gt; is a duplicate, I want to remove the containing &lt;url&gt; element and all of its contents. My XSLT is partly working: it drops a duplicate &lt;loc&gt;, but still outputs the leftover &lt;lastmod&gt;, &lt;priority&gt;, and &lt;changefreq&gt; values for each duplicate. Any help would be greatly appreciated. Jake</p>
<p>Here is my XSLT code:</p>
<p>&lt;?startSampleFile ?&gt;<br>
&lt;!-- xq495.xsl: converts xq494.xml into xq496.xml --&gt;<br>
&lt;xsl:stylesheet xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;<br>
version=&quot;1.0&quot;<br>
xmlns:s=&quot;http://www.sitemaps.org/schemas/sitemap/0.9&quot;<br>
exclude-result-prefixes=&quot;s&quot;&gt;<br>
<br>
&lt;xsl:output method=&quot;xml&quot; omit-xml-declaration=&quot;yes&quot;/&gt;<br>
<br>
&lt;xsl:key name=&quot;kloc&quot; match=&quot;s:loc&quot;<br>
use=&quot;.&quot; /&gt;<br>
&lt;xsl:key name=&quot;klastmod&quot; match=&quot;s:lastmod&quot;<br>
use=&quot;../s:loc&quot;/&gt;<br>
&lt;xsl:key name=&quot;kpriority&quot; match=&quot;s:priority&quot;<br>
use=&quot;../s:loc&quot;/&gt;<br>
&lt;xsl:key name=&quot;kchangefreq&quot; match=&quot;s:changefreq&quot;<br>
use=&quot;../s:loc&quot;/&gt;<br>
<br>
&lt;xsl:template match=&quot;/*&quot;&gt;</p>
<p>&lt;xsl:for-each select=<br>
&quot;*/s:loc[generate-id()<br>
=<br>
generate-id(key('kloc',.)[1])]&quot;&gt;<br>
&lt;xsl:value-of select=&quot;.&quot;/&gt;<br>
&lt;xsl:for-each select=&quot;key('klastmod', .)&quot;&gt;<br>
&lt;xsl:value-of select=&quot;.&quot;/&gt;<br>
&lt;/xsl:for-each&gt;<br>
&lt;xsl:for-each select=&quot;key('kpriority', .)&quot;&gt;<br>
&lt;xsl:value-of select=&quot;.&quot;/&gt;<br>
&lt;/xsl:for-each&gt;<br>
&lt;xsl:for-each select=&quot;key('kchangefreq', .)&quot;&gt;<br>
&lt;xsl:value-of select=&quot;.&quot;/&gt;<br>
&lt;/xsl:for-each&gt;<br>
&lt;/xsl:for-each&gt;<br>
&lt;/xsl:template&gt;</p>
<p>&lt;/xsl:stylesheet&gt;</p>
<p>Existing XML:</p>
<p>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;<br>
&lt;urlset xmlns=&quot;http://www.sitemaps.org/schemas/sitemap/0.9&quot;&gt;<br>
&lt;url&gt;<br>
&lt;loc&gt;http://www.example.com/first&lt;/loc&gt;<br>
&lt;lastmod&gt;2012-01-27T10:18:26+00:00&lt;/lastmod&gt;<br>
&lt;priority&gt;0.90&lt;/priority&gt;<br>
&lt;changefreq&gt;daily&lt;/changefreq&gt;<br>
&lt;/url&gt;<br>
&lt;url&gt;<br>
&lt;loc&gt;http://www.example.com/second&lt;/loc&gt;<br>
&lt;lastmod&gt;2012-03-16T07:38:42+00:00&lt;/lastmod&gt;<br>
&lt;priority&gt;0.60&lt;/priority&gt;<br>
&lt;changefreq&gt;weekly&lt;/changefreq&gt;<br>
&lt;/url&gt;<br>
&lt;url&gt;<br>
&lt;loc&gt;http://www.example.com/second&lt;/loc&gt;<br>
&lt;lastmod&gt;2012-03-16T08:36:44+00:00&lt;/lastmod&gt;<br>
&lt;priority&gt;0.60&lt;/priority&gt;<br>
&lt;changefreq&gt;weekly&lt;/changefreq&gt;<br>
&lt;/url&gt;<br>
&lt;url&gt;<br>
&lt;loc&gt;http://www.example.com/third&lt;/loc&gt;<br>
&lt;lastmod&gt;2012-03-16T08:37:09+00:00&lt;/lastmod&gt;<br>
&lt;priority&gt;0.60&lt;/priority&gt;<br>
&lt;changefreq&gt;weekly&lt;/changefreq&gt;<br>
&lt;/url&gt;<br>
&lt;/urlset&gt;</p>
<p>Desired output:</p>
<p>&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;<br>
&lt;urlset xmlns=&quot;http://www.sitemaps.org/schemas/sitemap/0.9&quot;&gt;<br>
&lt;url&gt;<br>
&lt;loc&gt;http://www.example.com/first&lt;/loc&gt;<br>
&lt;lastmod&gt;2012-01-27T10:18:26+00:00&lt;/lastmod&gt;<br>
&lt;priority&gt;0.90&lt;/priority&gt;<br>
&lt;changefreq&gt;daily&lt;/changefreq&gt;<br>
&lt;/url&gt;<br>
&lt;url&gt;<br>
&lt;loc&gt;http://www.example.com/second&lt;/loc&gt;<br>
&lt;lastmod&gt;2012-03-16T07:38:42+00:00&lt;/lastmod&gt;<br>
&lt;priority&gt;0.60&lt;/priority&gt;<br>
&lt;changefreq&gt;weekly&lt;/changefreq&gt;<br>
&lt;/url&gt;<br>
&lt;url&gt;<br>
&lt;loc&gt;http://www.example.com/third&lt;/loc&gt;<br>
&lt;lastmod&gt;2012-03-16T08:37:09+00:00&lt;/lastmod&gt;<br>
&lt;priority&gt;0.60&lt;/priority&gt;<br>
&lt;changefreq&gt;weekly&lt;/changefreq&gt;<br>
&lt;/url&gt;<br>
&lt;/urlset&gt;</p>
2012-03-18T02:08:57-04:00

Re: Remove Duplicates From XML with XSLT
http://forums.asp.net/p/1781907/4888460.aspx/1?Re+Remove+Duplicates+From+XML+with+XSLT

<p>Hello</p>
<p>In fact, I think you can simply use LINQ to XML for this: check whether each &lt;loc&gt; value is a duplicate and, if it is, remove the containing &lt;url&gt; before writing the output. Sample code looks like this:</p>
<pre class="prettyprint">using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace MyTest
{
    class Program
    {
        static void Main(string[] args)
        {
            XDocument doc = XDocument.Load(&quot;abc.xml&quot;);
            XNamespace xname = XNamespace.Get(&quot;http://www.sitemaps.org/schemas/sitemap/0.9&quot;);

            // Track every loc value seen so far, so duplicates are caught
            // even when they are not adjacent in the file.
            var seen = new HashSet&lt;string&gt;();

            // Materialize the list first: removing nodes while enumerating
            // the live Descendants() query is not safe.
            List&lt;XElement&gt; urls = doc.Descendants(xname + &quot;url&quot;).ToList();
            foreach (XElement url in urls)
            {
                XElement loc = url.Element(xname + &quot;loc&quot;);
                // HashSet.Add returns false when the value is already present.
                if (loc != null &amp;&amp; !seen.Add(loc.Value))
                {
                    url.Remove();
                }
            }

            doc.Save(&quot;c:\\try.xml&quot;);
        }
    }
}</pre>
2012-03-20T01:54:14-04:00
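<p>Since the question asks for XSLT specifically, the same removal can also be sketched in XSLT 1.0. This is a minimal sketch, not the original poster's or the replier's code. It assumes the input uses the sitemap namespace shown in the question, and the key name kUrlByLoc is made up for illustration. The stylesheet indexes each &lt;url&gt; by the text of its &lt;loc&gt; child, copies everything through an identity template, and suppresses any &lt;url&gt; that is not the first one in its key group, so a duplicate is dropped together with its &lt;lastmod&gt;, &lt;priority&gt;, and &lt;changefreq&gt; children.</p>
<pre class="prettyprint">&lt;xsl:stylesheet version=&quot;1.0&quot;
    xmlns:xsl=&quot;http://www.w3.org/1999/XSL/Transform&quot;
    xmlns:s=&quot;http://www.sitemaps.org/schemas/sitemap/0.9&quot;&gt;

  &lt;xsl:output method=&quot;xml&quot; encoding=&quot;UTF-8&quot; indent=&quot;yes&quot;/&gt;
  &lt;!-- Drop whitespace-only text nodes so removed entries leave no blank lines. --&gt;
  &lt;xsl:strip-space elements=&quot;*&quot;/&gt;

  &lt;!-- Hypothetical key name: index every url element by its loc text. --&gt;
  &lt;xsl:key name=&quot;kUrlByLoc&quot; match=&quot;s:url&quot; use=&quot;s:loc&quot;/&gt;

  &lt;!-- Identity template: copy every node and attribute unchanged by default. --&gt;
  &lt;xsl:template match=&quot;@*|node()&quot;&gt;
    &lt;xsl:copy&gt;
      &lt;xsl:apply-templates select=&quot;@*|node()&quot;/&gt;
    &lt;/xsl:copy&gt;
  &lt;/xsl:template&gt;

  &lt;!-- Empty template: output nothing for a url that is not the first
       one in document order with its loc value. --&gt;
  &lt;xsl:template match=&quot;s:url[generate-id() != generate-id(key('kUrlByLoc', s:loc)[1])]&quot;/&gt;

&lt;/xsl:stylesheet&gt;</pre>
<p>The difference from the stylesheet in the question is what gets keyed. Keying on s:url rather than on s:loc lets one empty template discard a whole duplicate entry, whereas separate keys on s:lastmod, s:priority, and s:changefreq return the values from every &lt;url&gt; that shares the same &lt;loc&gt;, which is where the leftover values come from. The identity template also preserves the &lt;urlset&gt; and &lt;url&gt; elements, which xsl:value-of alone does not.</p>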