How can I extract field names from SQL with Perl?
I have a series of select statements in a text file and I need to extract the field names from each select query. This would be easy if some of the fields didn't use nested functions like to_char() etc.
Given select statement fields that could have several nested parentheses like:
ltrim(rtrim(to_char(base_field_name, format))) renamed_field_name,
Or the simple case of just base_field_name as a field, what would the regex look like in Perl?
Don't try to write a regex parser (although Perl regexes can handle nested patterns like this); use SQL::Statement::Structure instead.
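To illustrate how a Perl regex can cope with nested parentheses, here is a minimal sketch using a recursive named group (Perl 5.10 or later). The helper name field_names and the sample table name are made up for this example, and the approach rests on loud assumptions: the select list ends at the first FROM, and no string literal contains an unbalanced parenthesis. It is not a SQL parser, which is why a real parsing module remains the safer choice.

```perl
use strict;
use warnings;

# One select-list item: text outside parentheses or balanced groups,
# ending at a top-level comma or at the end of the list.
my $ITEM = qr{
    \G \s*
    (?<item>
        (?: [^(),]++                                      # text outside parentheses
          | (?<paren> \( (?: [^()]++ | (?&paren) )* \) )  # a balanced group, recursively
        )+
    )
    (?: , | \z )
}x;

# Hypothetical helper: return the output name of each field in one SELECT.
# Assumes the select list is everything between SELECT and the first FROM.
sub field_names {
    my ($sql) = @_;
    my ($list) = $sql =~ /\bselect\b(.*?)\bfrom\b/is or return;
    my @names;
    while ($list =~ /$ITEM/g) {
        my $item = $+{item};
        $item =~ s/\s+\z//;
        # The alias, or the bare column name, is the trailing identifier;
        # an unaliased expression such as count(*) is kept whole.
        push @names, $item =~ /(\w+)\z/ ? $1 : $item;
    }
    return @names;
}

print join(', ', field_names(
    'select ltrim(rtrim(to_char(base_field_name, format))) renamed_field_name, '
  . 'base_field_name from some_table'
)), "\n";
```

A subselect in the select list (which contains its own FROM) would defeat the first regex, so treat this as a demonstration of nested matching rather than a complete solution.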
Why not ask the target database itself how it would interpret the queries?
In Perl, one can use the DBI to query the prepared representation of a SQL query. Sometimes this is database-specific: some drivers (under the Perl DBD:: namespace) support their RDBMS' idea of describing statements in ways analogous to the RDBMS' native C or C++ API.
It can be done generically, however, as the DBI will put the names of result columns in the statement handle attribute NAME. The following, for example, has a good chance of working on any DBI-supported RDBMS:
use strict;
use warnings;
use DBI;

use constant DSN => 'dbi:YouHaveNotToldUs:dbname=we_do_not_know';

# Replace the ... with the username and password for your database.
my $dbh = DBI->connect(DSN, ..., { RaiseError => 1 });

my $sth;
while (<>) {
    next unless /^SELECT/i; # SELECTs only, assume whole query on one line
    chomp;
    my $sql = /\bWHERE\b/i ? "$_ AND 1=0" : "$_ WHERE 1=0"; # XXX ugly!
    eval {
        $sth = $dbh->prepare($sql); # some drivers don't know column names
        $sth->execute();            # until after a successful execute()
    };
    if ($@) {                       # oops, problem with that one
        print $@;                   # report it before skipping the query
        next;
    }
    print join(', ', @{$sth->{NAME}}), "\n";
}
The XXX ugly! bit there tries to append an always-false condition on the SELECT, so that the SQL engine doesn't have to do any real work when you execute(). It's a terribly naive approach: that /\bWHERE\b/i test identifies a SQL WHERE clause no more correctly than simple regexes parse out SELECT field names. For example, a query with no WHERE that ends in ORDER BY or GROUP BY gets WHERE 1=0 appended after that clause, which is a syntax error. Still, it is likely to work.
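One way to sidestep the WHERE guessing, offered here as an alternative technique and not as part of the answer above, is to wrap the whole query in a derived table and put the always-false condition on the outer query. The helper name zero_row_wrapper and the alias q are invented for this sketch. It assumes the target RDBMS accepts a derived table with an alias; some engines restrict what a derived table may contain (for example ORDER BY, or duplicate column names), so it is no more universal than the original trick.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a SELECT so that it returns no rows
# while keeping the same result columns.
sub zero_row_wrapper {
    my ($sql) = @_;
    $sql =~ s/\s*;?\s*\z//;    # strip trailing whitespace and an optional semicolon
    return "SELECT * FROM ($sql) q WHERE 1=0";    # "q" is an arbitrary alias
}

print zero_row_wrapper('SELECT a, b FROM t GROUP BY a, b;'), "\n";
```

The returned string could then be passed to prepare() in place of the query with the appended condition.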
In a somewhat related problem at the office I used:
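my @SqlKeyWordList = qw/select from where .../; # (1)
my @Candidates = split(/\s+/, $SqlSelectQuery); # (2)
my %FieldHash; # (3)
for my $Word (@Candidates) {
    next unless length $Word;                          # skip an empty leading field
    next if grep { lc($Word) eq $_ } @SqlKeyWordList;  # skip SQL keywords
    $FieldHash{$Word}++;                               # count everything else
}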
Comments: 评论:
my @Candidates = split(/[\s()+,*\/\-=]+/, $SqlSelectQuery); # (2) also split on punctuation and operators
How about splitting each line into terms (replace every parenthesis, comma and space with a newline), then sorting:
perl -ne's/[(), ]/\n/g; print' < textfile | sort -u
You'll end up with a lot of content like:
fieldname1
fieldname1
formatstring
ltrim
rtrim
to_char
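The same idea can be kept inside Perl, with a stoplist to discard the function names and keywords that otherwise clutter the output. This is a minimal sketch: the helper name candidate_terms, the contents of the stoplist, and the sample query are all assumptions for illustration, and format strings or table names will still slip through.

```perl
use strict;
use warnings;

# Assumed stoplist: SQL keywords and function names to discard (extend as needed).
my %stop = map { $_ => 1 } qw(select from where as ltrim rtrim to_char);

# Hypothetical helper: split on parentheses, commas, and whitespace,
# then drop empty strings, stoplisted words, and repeats.
sub candidate_terms {
    my ($text) = @_;
    my %seen;
    my @terms = grep { length($_) && !$stop{lc $_} && !$seen{$_}++ }
                split /[(),\s]+/, $text;
    return sort @terms;
}

print "$_\n" for candidate_terms(
    'select ltrim(rtrim(to_char(fieldname1, formatstring))) fieldname2 from t'
);
```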