Methods

NAME

Text::CSV::Hash - hash based CSV file handing, including comparisons

SYNOPSIS

Text::CSV::Hash provides simple hash based methods for loading, saving, and comparing CSV files. Will handle standard quoting and entries with newlines.

DESCRIPTION

Methods

$csv->config(): Allows various configuration options to be set. Should be passed a hash with any of the following:
$csv->dump($row, $row2): If called with one csv row reference displays each field label and its values. If called with two, displays the second's value where it differs from the first.
$csv->labels(): Return the current set of standard labels for current csv as an array.
$csv->labels_add(@labels): Add @labels to the set of standard labels for all csv rows. The standard labels are used by dump(), save() etc.
$csv->labels_set(@labels): Set the standard labels for all csv rows to @labels. The standard labels are used by dump(), save() etc.
$csv->linenolabel: The name of the field containing the file linenumber for each row. This is set by $csv->read automatically. It defaults to 'lineno', but if that conflicts with a label in the file it will prepend '_' until a unique name is found.
$csv->load($file, $conf);: Load csv file '$file'. If $conf is present it can specify any of the options valid for $csv->config.; The first line of the file is expected to contain column labels, except where overridden by name_row_offset or skip_blank_lines.; Returns undef on error and error value can be displayed with $csv->error().
$csv->lookup($row, $minscore, $lookupscores);: Lookup row $row in csv. Other parameters are optional, but if specified:
$csv->match($row, @fields);: Return a list of references to rows in $csv which match $row for all fields in @fields. Typically $row would be from a different csv object. match() builds a hash cache internally to speedup repeated lookups on the same csv object.; If the returned list has one entry it is an exact match, more than one indicates an ambiguous match, and zero no matches.; For more sophisticated matching see lookup().
new Text::CSV::Hash;: Create a new $csv object. If passed a reference to an array of hashes, set the labels to the keys in the first entry, and the rows to the contents of the array.
$csv->rows(): Returns the reference to an array of all rows in the csv. Each row is a hash of label => value. If passed a reference to an array of hashes will set the rows to the contents of the array.
$csv->save($file, $conf);: Save csv to file. If $map was specified on load, saved column names are the values after mapping. If $file is not specified it will default to the file used in load().; Returns undef on error and error value can be displayed with $csv->error().; If present, $conf can specify characteristics of the file. Note these will default to those set when the file was loaded.

EXAMPLE1

    # Sample script to load file.csv, double the 'cost' columns, and save
    my($csv);

    $csv = new Text::CSV::Hash;

    $csv->load('file.csv') || die $csv->error();
    foreach my $row (@{$csv->rows()})
        { $row->{cost} *= 2; }
    $csv->save() || die $csv->error();

EXAMPLE2

    # Sample script to produce merged versions of file1.csv and file2.csv
    #   - matching on 'first name', 'last name', and 'country' columns
    #   - add two columns 'results', and 'mismatches'
    #   - allowing for any one of the three fields to mismatch
    #   - trimming spaces from each field.

    my($csv1, $csv2, %merge_match);

    $csv1 = new Text::CSV::Hash;
    $csv2 = new Text::CSV::Hash;

    %merge_match = ('first name' => 1, 'last name' => 1, 'country' => 1);

    $csv1->load('file1.csv', {trim => 1}, \%lookups) || die $csv1->error();
    $csv2->load('file2.csv', {trim => 1}, \%lookups) || die $csv2->error();

    $csv2->labels_add('results', 'mismatches');

    foreach my $row2 (@{$csv2->rows()})
        {
        my($match1, $ambig, $mismatches, $resfield, $misfield);

        ($match1, $ambig, $mismatches) = $csv1->lookup($row2, 2);
        if ($ambig)
            { $resfield = 'Ambiguous'; }
        elsif (!$match1)
            { $resfield = 'Not Found'; }
        elsif ($mismatches)
            {
            $resfield = "Mismatch ".join(' + ', @${mismatches});
            foreach my $mis (@${mismatches})
                { $misfield .= qq#("$row2->{$mis}" - "$match1->{$mis}") #; }
            chop $misfield;
            }
        else
            { $resfield = 'Match all'; }

        $row2->{results} = $resfield;
        $row2->{mismatches} = $misfield;

        if ($match1)
            {
            # Merge fields both ways

            foreach($csv1->labels())
                { $row2->{$_} ||= $match1->{$_}; }

            foreach($csv2->labels())
                { $match1->{$_} ||= $csv2->{$_}; }
            }
        }

    $csv1->labels_add($csv2->labels());
    $csv1->save('merged_'.$csv1->{filename}) || die $csv1->error();

    $csv2->labels_add($csv1->labels());
    $csv2->save('merged_'.$csv2->{filename}) || die $csv2->error();