Posted in : Microsoft Av Stina Perbo Utas Översätt med Google ⟶
5 years ago
Recently I walked passed a collegue with an error on his Powershell console which peaked my interest. He quickly noted that he had chosen the wrong delimiter for the Csv he imported which resulted in errors in the code, I then jokingly said ”Why don’t you just use the ’Get-CsvDelimiter’ cmdlet?” and we had a quick laugh.
30 minutes later the ”Get-CsvDelimiter” function was born, but first,
Let’s dive into how Import-Csv, ConvertFrom-Csv & delimiters really work
When used correctly Import-Csv & ConvertFrom-Csv creates an array object with each row as a PSCustomObject with named content based on headers.
Import-Csv has two constructors, one with Path as a required and the other with Path & Delimiter as required.
If you omit the delimiter parameter the Path constructor will be used, now this constructor has a required parameter (according to the documentation) that is auto filled. Instead of using Delimiter we can use ”UseCulture” which takes the current culture delimiter as input,
”To find the list separator for a culture, use the following command: (Get-Culture).TextInfo.ListSeparator
.” –docs.microsoft
If we interpret the documentation this is a required parameter, but it is not really required, as we are able to pass only the -Path parameter and delimiter will autofill
Assume we’re using the following CSV as input ($File)
Name,Title,Department Vikingur Saemundsson,Technical Specialist,Specialists Isabella Tinnberg,Office Administrator,Business Management
the command ”Import-Csv $File” will output a valid object. Now in my opinion it should fail. As my culture specifies semicolon (;) as the default listseparator, while forcing ”-UseCulture” can’t produce a correct csv object. This basically means that UseCulture is not a required parameter and the default unless specified is always a comma (,)
Import-Csv $File Name Title Department ---- ----- ---------- Vikingur Saemundsson Technical Specialist Specialists Isabella Tinnberg Office Administrator Business Management (Get-Culture).TextInfo.ListSeparator ; Import-Csv $File -UseCulture Name,Title,Department --------------------- Vikingur Saemundsson,Technical Specialist,Specialists Isabella Tinnberg,Office Administrator,Business Management
The exact same goes for ConvertFrom-Csv.
Presenting Get-CsvDelimiter
The below function will search and find the most probable delimiter used in a Csv file.
Let’s see how it works.
$File is a csv file with the below data.
Name;Title;Department Vikingur Saemundsson;Technical Specialist;Specialists Isabella Tinnberg;Office Administrator;Business Management
If we run Import-Csv $File we’ll get a incorrect table object, every row will have a single property containing the row data.
If we instead specify a delimiter as shown below, we’ll calculate the most probable delimiter and use that to produce a correct CSV table object without having to inspect the csv culture or assume anything.
Import-Csv $File -Delimiter (Get-CsvDelimiter -file $File) Name Title Department ---- ----- ---------- Vikingur Saemundsson Technical Specialist Specialists Isabella Tinnberg Office Administrator Business Management
And to prove it works, the below code outputs only the Name column in the Csv
(Import-Csv $File -Delimiter (Get-CsvDelimiter -file $File)).Name Vikingur Saemundsson Isabella Tinnberg
If we have a stringobject and need to convert it that is also possible.
$Raw = @' Name;Title;Department Vikingur Saemundsson;Technical Specialist;Specialists Isabella Tinnberg;Office Administrator;Business Management '@ $Raw | ConvertFrom-Csv -Delimiter (Get-CsvDelimiter -file $File) Name Title Department ---- ----- ---------- Vikingur Saemundsson Technical Specialist Specialists Isabella Tinnberg Office Administrator Business Management
With this method we can use virtually any delimiter we want, assuming that the Csv is formatted correctly. Again we’re using the same csv input but we change the delimiter to a ”greater than” (>) symbol and see how the function performs.
Name>Title>Department Vikingur Saemundsson>Technical Specialist>Specialists Isabella Tinnberg>Office Administrator>Business Management
Now when we run the function standalone we’ll get a Greater than symbol as the return
Get-CsvDelimiter -file $File >
And finally, when running the previous code we get the same output as when we were using a semicolon as the delimiter.
Import-Csv $File -Delimiter (Get-CsvDelimiter -file $File) Name Title Department ---- ----- ---------- Vikingur Saemundsson Technical Specialist Specialists Isabella Tinnberg Office Administrator Business Management
Get-CsvDelimiter
Function Get-CsvDelimiter{ Param( $file, $precision = 10 ) $content = Get-Content $File -First $precision $potentialDelimiters = @() $i = 0 #Select the top $precison rows and read the chars, group them and input them into a variable for further checking $content | ForEach-Object{ $row = $_ $potentialDelimiters += [pscustomobject] @{ Row=$i Chars=[char[]]$row | Group-Object | sort count -Descending } $i++ } #We're looking for consistency here, so we take the header row (0) and assign a scoretable based on the unique chars here and use it as an golden standard #Scores are assigned by counting the each unique char appears per row, and if the number matches the first row we add one point to the char score $charScoreArray = @() $i = 0 $potentialDelimiters | ForEach-Object{ $thisDelimObject = $_ If($thisDelimObject.Row -eq 0){ $thisDelimObject.Chars | ForEach-Object{ $charScoreArray += [pscustomobject]@{ Char=$_.Name Score=1 } } } Else{ $oldDelimObject = $potentialDelimiters[$i-1] $thisDelimObject.Chars | ForEach-Object{ $thisCharObject = $_ $OldDelimCharGroup = $oldDelimObject.Chars | Where-Object{$_.Name -clike $thisCharObject.Name} If($_.Name -eq $OldDelimCharGroup.Name -and $_.Count -eq $OldDelimCharGroup.Count -and $charScoreArray.Char -ccontains $_.Name -and $charScoreArray.Char -ccontains $OldDelimCharGroup.Name){ ($charScoreArray | Where-Object{$_.Char -clike $thisCharObject.Name}).Score++ } } } $i++ } #If multiple values have the same score, then we select the the top ones and sort based on char (ascii number, lowest first and return the top 1) $topPoints = $charScoreArray | Sort Score -Descending | Select -ExpandProperty Score -First 1 Return $charScoreArray | Where-Object{$_.Score -eq $topPoints} | sort Char | select -ExpandProperty Char -First 1 }
Tags : ConvertFrom-Csv, CSV, Get-CsvDelimiter, import-csv, PowerShell
Personlig rådgivning
Vi erbjuder personlig rådgivning med författaren för 1400 SEK per timme. Anmäl ditt intresse i här så återkommer vi så snart vi kan.
Add comment