On occasion I need to track down duplicate entries in a file. Without going through a lot of mechanics, I found this approach useful and, most importantly, easy. First, we will create some dummy data and store it in a temp file:
# Create temp file with dummy data including duplicate lines
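The original setup code is not shown, so here is a minimal sketch; `$tempfile` is an assumed variable name, and the sample values are chosen to match the duplicate counts shown in the output later in this post:

```powershell
# Sketch (assumed setup): create a temp file and fill it with
# duplicate-heavy lines (four 1s, three 2s, two 3s).
$tempfile = New-TemporaryFile
1,1,1,1,2,2,2,3,3 | Set-Content -Path $tempfile
```

`New-TemporaryFile` requires PowerShell 5.0 or later; on older versions, `[System.IO.Path]::GetTempFileName()` does the same job.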
Next, we get the data into an array. Interestingly, Get-Content does this for you without any extra work:
# Get file contents into an array
$filecontents = Get-Content -Path $tempfile
Once we have the data, we can verify that it is an array with this command:
$filecontents.GetType()
IsPublic IsSerial Name     BaseType
-------- -------- ----     --------
True     True     Object[] System.Array
we can pipe it through Group-Object (or its group alias) and Where-Object (or its where alias) to find groupings with more than one entry. In essence, this surfaces the lines (array entries) that appear more than once:
# Find duplicates
$filecontents |
Group |
Where {$_.count -gt 1}
When run, this produces the following results:
Count Name Group
----- ---- -----
    4 1    {1, 1, 1, 1}
    3 2    {2, 2, 2}
    2 3    {3, 3}
To finalize this sample, remove the temp file:
# Clean up
Remove-Item -Path $tempfile
While such a simple example may seem artificial, I am working on a way to reference the actual lines where duplicates appear. That may "break" the simple Group usage shown above, but if you are in a hurry, these steps can save you time with minimal effort.
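One possible approach to locating the actual lines (a sketch, not necessarily the method I will end up with) is to pair each line with its line number before grouping. The `$filecontents` sample values below are assumed for illustration:

```powershell
# Sketch: tag each line with its 1-based line number, then group by value
# so each duplicate group carries the line numbers where it appears.
$filecontents = '1','2','1','3','2','1'   # assumed sample data
$i = 0
$filecontents |
    ForEach-Object { [pscustomobject]@{ Line = ++$i; Value = $_ } } |
    Group-Object -Property Value |
    Where-Object { $_.Count -gt 1 } |
    ForEach-Object { '{0} appears on lines {1}' -f $_.Name, ($_.Group.Line -join ', ') }
```

With the sample data above, this reports that "1" appears on lines 1, 3, and 6, and "2" on lines 2 and 5, while the unique "3" is filtered out.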
About lamsim