How can I tell DBD::CSV to use a comma as the decimal separator?

Question

I'm trying to use a German-style CSV file with DBI and DBD::CSV. This, in turn, uses Text::CSV to parse the file. I want to query the data in that file using SQL.

Let's look at the file first. It is separated by semicolons (;), and the numbers in it look like this: 5,23, which is equivalent to the English 5.23.

Here's what I've got so far:

use strict;
use warnings;
use feature 'say';   # say() is used below
use Carp qw(croak);  # croak() is used if connect fails
use DBI;

# create the database handle
my $dbh = DBI->connect(
  'dbi:CSV:',
  undef, undef,
  {
    f_dir => '.',
    f_schema => undef,
    f_ext => '.csv',
    f_encoding => 'latin-1',
    csv_eol => "\n",
    csv_sep_char => ';',
    csv_tables => {
      foo => {
        file => 'foo.csv',
        #skip_first_row => 0,
        col_names => [ map { "col$_" } (1..3)  ], # see annotation below
      },
    },
  },
) or croak $DBI::errstr;

my $sth = $dbh->prepare(
  'SELECT col3 FROM foo WHERE col3 > 80.50 ORDER BY col3 ASC'
);
$sth->execute;

while (my $res = $sth->fetchrow_hashref) {
  say $res->{col3};
}

Now, this looks quite nice. The problem is that the SQL layer (meaning SQL::Statement, which sits somewhere downstream of DBI and DBD::CSV) does not treat the data in col3, a floating-point value with a comma as the decimal separator, as a float. Instead, it treats the column as an integer, because it doesn't understand the comma.

Here's some example data:

foo;foo;81,90
bar;bar;80,50
baz;baz;80,70

With this data, the code above produces one line of output: 81,90. Of course, that is wrong. The comparison used only the int() part of col3, which is technically correct behavior, but not what I want.
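The single-row result matches what plain Perl does when such strings are used as numbers. The sketch below assumes (I have not verified this in the SQL::Statement source) that the WHERE clause ends up as an ordinary Perl numeric comparison. It uses only core Perl and the sample values from above.

```perl
use strict;
use warnings;
no warnings 'numeric';   # silence "isn't numeric" warnings for this demo

# Sample col3 values from the file above
my @values = ('81,90', '80,50', '80,70');

# In numeric context Perl stops parsing at the comma,
# so '80,70' is compared as 80 and '81,90' as 81.
my @matches = grep { $_ > 80.50 } @values;

print "$_\n" for @matches;
```

Only 81,90 survives the filter, because 80,50 and 80,70 are both compared as 80.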

Question: How can I tell it to treat the numbers with the comma as floats?

Things I've thought about:

I've not found any built-in way in Text::CSV to do this. I'm not sure where in Text::CSV I could hook this in, or whether Text::CSV has a mechanism for such hooks at all.
I don't know whether it is a problem that DBD::CSV wants to use Text::CSV_XS if possible.
Maybe I can do it later, after the data has been read and is already stored away somewhere, but I'm not yet sure where the right access point is.
I understand that the data is stored in SQL::Statement. I don't yet know where. This could be handy somehow.

Changing the source CSV file to have dots instead of commas is not an option.

I'm open to all kinds of suggestions. Other approaches to the whole CSV-via-SQL thing are welcome, too. Thanks a lot.

Answer 1

You need to write a user-defined function using SQL::Statement::Functions (already loaded as part of DBD::CSV).

This program does what you want. Adding 0.0 to the transformed string is not strictly necessary, but it makes the purpose of the subroutine clear. (Note also your typo in the f_encoding parameter to the connect call.)

use strict;
use warnings;
use feature 'say';   # say() is used below
use Carp qw(croak);  # croak() is used if connect fails

use DBI;

my $dbh = DBI->connect(
  'dbi:CSV:',
  undef, undef,
  {
    f_dir => '.',
    f_schema => undef,
    f_ext => '.csv',
    f_encoding => 'latin-1',
    csv_eol => "\n",
    csv_sep_char => ';',
    csv_tables => {
      foo => {
        file => 'test.csv',
        #skip_first_row => 0,
        col_names => [ map { "col$_" } (1..3)  ], # see annotation below
      },
    },
  },
) or croak $DBI::errstr;

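# Register the Perl sub comma_float (defined below) as an SQL function
$dbh->do('CREATE FUNCTION comma_float EXTERNAL');

# User-defined functions receive the function object and the statement
# handle first, followed by the arguments from the SQL call
sub comma_float {
  my ($self, $sth, $n) = @_;
  $n =~ tr/,/./;      # "80,70" becomes "80.70"
  return $n + 0.0;    # force numeric context
}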

my $sth = $dbh->prepare(
  'SELECT col3 FROM foo WHERE comma_float(col3) > 80.50 ORDER BY col3 ASC'
);
$sth->execute;

while (my $res = $sth->fetchrow_hashref) {
  say $res->{col3};
}

Output

80,70
81,90
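To check the conversion logic without DBI or a CSV file, here is a minimal core-Perl sketch. The name comma_to_number is hypothetical; it does the same tr/,/./ conversion as comma_float but takes only the value, since the ($self, $sth) arguments exist only when SQL::Statement makes the call. The sort is done numerically in Perl here and does not claim to show how ORDER BY behaves inside SQL::Statement.

```perl
use strict;
use warnings;

# Hypothetical helper: same conversion as comma_float above,
# without the ($self, $sth) arguments.
sub comma_to_number {
    my ($n) = @_;
    $n =~ tr/,/./;   # modifies the local copy only
    return $n + 0;   # force numeric context
}

# Sample col3 values from the question
my @col3 = ('81,90', '80,50', '80,70');

# Keep values strictly greater than 80.50, then sort numerically
my @matches = sort { comma_to_number($a) <=> comma_to_number($b) }
              grep { comma_to_number($_) > 80.50 } @col3;

print "$_\n" for @matches;
```

This prints 80,70 and then 81,90, the same two rows as the query above; 80,50 is excluded because the comparison is strict.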

Answer 1
13 accepted 2012-12-06 12:38:57