Posted in : Microsoft

2 years ago

Recently I walked passed a collegue with an error on his Powershell console which peaked my interest. He quickly noted that he had chosen the wrong delimiter for the Csv he imported which resulted in errors in the code, I then jokingly said ”Why don’t you just use the ’Get-CsvDelimiter’ cmdlet?” and we had a quick laugh.
30 minutes later the ”Get-CsvDelimiter” function was born, but first,

Let’s dive into how Import-Csv, ConvertFrom-Csv & delimiters really work

When used correctly Import-Csv & ConvertFrom-Csv creates an array object with each row as a PSCustomObject with named content based on headers.
Import-Csv has two constructors, one with Path as a required and the other with Path & Delimiter as required.
If you omit the delimiter parameter the Path constructor will be used, now this constructor has a required parameter (according to the documentation) that is auto filled. Instead of using Delimiter we can use ”UseCulture” which takes the current culture delimiter as input,
”To find the list separator for a culture, use the following command: (Get-Culture).TextInfo.ListSeparator.” –docs.microsoft
If we interpret the documentation this is a required parameter, but it is not really required, as we are able to pass only the -Path parameter and delimiter will autofill
Assume we’re using the following CSV as input ($File)

Name,Title,Department
Vikingur Saemundsson,Technical Specialist,Specialists
Isabella Tinnberg,Office Administrator,Business Management

the command ”Import-Csv $File” will output a valid object. Now in my opinion it should fail. As my culture specifies semicolon (;) as the default listseparator, while forcing ”-UseCulture” can’t produce a correct csv object. This basically means that UseCulture is not a required parameter and the default unless specified is always a comma (,)

Import-Csv $File
Name                 Title                Department
----                 -----                ----------
Vikingur Saemundsson Technical Specialist Specialists
Isabella Tinnberg    Office Administrator Business Management
(Get-Culture).TextInfo.ListSeparator
;
Import-Csv $File -UseCulture
Name,Title,Department
---------------------
Vikingur Saemundsson,Technical Specialist,Specialists
Isabella Tinnberg,Office Administrator,Business Management

The exact same goes for ConvertFrom-Csv.

Presenting Get-CsvDelimiter

The below function will search and find the most probable delimiter used in a Csv file.
Let’s see how it works.
$File is a csv file with the below data.

Name;Title;Department
Vikingur Saemundsson;Technical Specialist;Specialists
Isabella Tinnberg;Office Administrator;Business Management

If we run Import-Csv $File we’ll get a incorrect table object, every row will have a single property containing the row data.
If we instead specify a delimiter as shown below, we’ll calculate the most probable delimiter and use that to produce a correct CSV table object without having to inspect the csv culture or assume anything.

Import-Csv $File -Delimiter (Get-CsvDelimiter -file $File)
Name                 Title                Department
----                 -----                ----------
Vikingur Saemundsson Technical Specialist Specialists
Isabella Tinnberg    Office Administrator Business Management

And to prove it works, the below code outputs only the Name column in the Csv

(Import-Csv $File -Delimiter (Get-CsvDelimiter -file $File)).Name
Vikingur Saemundsson
Isabella Tinnberg

If we have a stringobject and need to convert it that is also possible.

$Raw = @'
Name;Title;Department
Vikingur Saemundsson;Technical Specialist;Specialists
Isabella Tinnberg;Office Administrator;Business Management
'@
$Raw | ConvertFrom-Csv -Delimiter (Get-CsvDelimiter -file $File)
Name                 Title                Department
----                 -----                ----------
Vikingur Saemundsson Technical Specialist Specialists
Isabella Tinnberg    Office Administrator Business Management

With this method we can use virtually any delimiter we want, assuming that the Csv is formatted correctly. Again we’re using the same csv input but we change the delimiter to a ”greater than” (>) symbol and see how the function performs.

Name>Title>Department
Vikingur Saemundsson>Technical Specialist>Specialists
Isabella Tinnberg>Office Administrator>Business Management

Now when we run the function standalone we’ll get a Greater than symbol as the return

Get-CsvDelimiter -file $File
>

And finally, when running the previous code we get the same output as when we were using a semicolon as the delimiter.

Import-Csv $File -Delimiter (Get-CsvDelimiter -file $File)
Name                 Title                Department
----                 -----                ----------
Vikingur Saemundsson Technical Specialist Specialists
Isabella Tinnberg    Office Administrator Business Management

 

Get-CsvDelimiter

Function Get-CsvDelimiter{
Param(
    $file,
    $precision = 10
)
    $content = Get-Content $File -First $precision
    $potentialDelimiters = @()
    $i = 0
    #Select the top $precison rows and read the chars, group them and input them into a variable for further checking
    $content | ForEach-Object{
        $row = $_
        $potentialDelimiters += [pscustomobject] @{
            Row=$i
            Chars=[char[]]$row | Group-Object | sort count -Descending
        }
        $i++
    }
    #We're looking for consistency here, so we take the header row (0) and assign a scoretable based on the unique chars here and use it as an golden standard
    #Scores are assigned by counting the each unique char appears per row, and if the number matches the first row we add one point to the char score
    $charScoreArray = @()
    $i = 0
    $potentialDelimiters | ForEach-Object{
        $thisDelimObject = $_
        If($thisDelimObject.Row -eq 0){
            $thisDelimObject.Chars | ForEach-Object{
                $charScoreArray += [pscustomobject]@{
                    Char=$_.Name
                    Score=1
                }
            }
        }
        Else{
            $oldDelimObject = $potentialDelimiters[$i-1]
            $thisDelimObject.Chars | ForEach-Object{
                $thisCharObject = $_
                $OldDelimCharGroup = $oldDelimObject.Chars | Where-Object{$_.Name -clike $thisCharObject.Name}
                If($_.Name -eq $OldDelimCharGroup.Name -and $_.Count -eq $OldDelimCharGroup.Count -and $charScoreArray.Char -ccontains $_.Name -and $charScoreArray.Char -ccontains $OldDelimCharGroup.Name){
                    ($charScoreArray | Where-Object{$_.Char -clike $thisCharObject.Name}).Score++
                }
            }
        }
        $i++
    }
    #If multiple values have the same score, then we select the the top ones and sort based on char (ascii number, lowest first and return the top 1)
    $topPoints = $charScoreArray | Sort Score -Descending | Select -ExpandProperty Score -First 1
    Return $charScoreArray | Where-Object{$_.Score -eq $topPoints} | sort Char | select -ExpandProperty Char -First 1
}

Tags : ConvertFrom-Csv, CSV, Get-CsvDelimiter, import-csv, PowerShell

Add comment

Your comment will be revised by the site if needed.