npgsql: How to select multiple rows (with multiple column values) with npgsql in one command with a collection as a parameter?
I have two tables defined below, supplier_balances and supplier_balance_items (by the way, there is a 1[supplier_balance]:N[supplier_balance_items] relationship between the two):
CREATE TABLE IF NOT EXISTS sch_brand_payment_data_lake_proxy.supplier_balances (
/* id is here for joining purposes with items table, instead of joining with the 4 columns used for sake
of making sure a record is deemed as unique */
id bigserial NOT NULL,
accounting_document text NOT NULL,
accounting_document_type text NOT NULL,
company_code text NOT NULL,
document_date_year int4 NOT NULL,
accounting_doc_created_by_user text,
accounting_clerk text,
assignment_reference text,
document_reference_id text,
original_reference_document text,
payment_terms text,
supplier text,
supplier_name text,
document_date timestamp,
posting_date timestamp,
net_due_date timestamp,
created_on timestamp default NULL,
modified_on timestamp default NULL,
pushed_on timestamp default NULL,
is_modified bool GENERATED ALWAYS AS (modified_on IS NOT NULL AND modified_on > created_on) STORED,
is_pushed bool GENERATED ALWAYS AS (pushed_on IS NOT NULL AND pushed_on > modified_on) STORED,
CONSTRAINT supplier_balances_pkey PRIMARY KEY (id),
/* accounting_document being the field of the composite unique index -> faster querying */
CONSTRAINT supplier_balances_unique UNIQUE (
accounting_document,
accounting_document_type,
company_code,
document_date_year)
);
/* Creating other indexes for querying of those as well */
CREATE INDEX IF NOT EXISTS supplier_balances_accounting_document_type_idx
ON sch_brand_payment_data_lake_proxy.supplier_balances (accounting_document_type);
CREATE INDEX IF NOT EXISTS supplier_balances_company_code_idx
ON sch_brand_payment_data_lake_proxy.supplier_balances (company_code);
CREATE INDEX IF NOT EXISTS supplier_balances_document_date_year_idx
ON sch_brand_payment_data_lake_proxy.supplier_balances (document_date_year);
CREATE TABLE IF NOT EXISTS sch_brand_payment_data_lake_proxy.supplier_balance_items
(
supplier_balance_id bigserial NOT NULL,
posting_view_item text NOT NULL,
posting_key text,
amount_in_company_code_currency numeric,
amount_in_transaction_currency numeric,
cash_discount_1_percent numeric,
cash_discount_amount numeric,
clearing_accounting_document text,
document_item_text text,
gl_account text,
is_cleared bool,
clearing_date timestamp,
due_calculation_base_date timestamp,
/* uniqueness is basically the posting_view_item for a given supplier balance */
CONSTRAINT supplier_balance_items_pkey PRIMARY KEY (supplier_balance_id, posting_view_item),
/* 1(supplier balance):N(supplier balance items) */
CONSTRAINT supplier_balance_items_fkey FOREIGN KEY (supplier_balance_id)
REFERENCES sch_brand_payment_data_lake_proxy.supplier_balances (id)
ON DELETE CASCADE
ON UPDATE CASCADE
);
Note: for the sake of simplicity, I'm only filling in the columns that cannot be NULL.
INSERT INTO
sch_brand_payment_data_lake_proxy.supplier_balances
(accounting_document, accounting_document_type, company_code, document_date_year)
VALUES
('A', 'B', 'C', 0),
('A', 'B', 'C', 1),
('A', 'B', 'C', 2),
('A', 'B', 'C', 3),
('A', 'B', 'C', 4),
('A', 'B', 'C', 5)
RETURNING id;
Output:
ID |
---|
1 |
2 |
3 |
4 |
5 |
6 |
INSERT INTO
sch_brand_payment_data_lake_proxy.supplier_balance_items
(supplier_balance_id, posting_view_item)
VALUES
(1, 'A'),
(1, 'B'),
(3, 'A'),
(3, 'B'),
(2, 'A'),
(1, 'C');
SELECT
id,
accounting_document,
accounting_document_type,
company_code,
document_date_year
FROM sch_brand_payment_data_lake_proxy.supplier_balances;
Output:
id | accounting_document | accounting_document_type | company_code | document_date_year |
---|---|---|---|---|
1 | A | B | C | 0 |
2 | A | B | C | 1 |
3 | A | B | C | 2 |
4 | A | B | C | 3 |
5 | A | B | C | 4 |
6 | A | B | C | 5 |
SELECT
supplier_balance_id,
posting_view_item
FROM sch_brand_payment_data_lake_proxy.supplier_balance_items;
Output:
supplier_balance_id | posting_view_item |
---|---|
1 | A |
1 | B |
3 | A |
3 | B |
2 | A |
1 | C |
Now, if we want to select on multiple sets of values in a JOIN, we can do it in raw SQL:
SELECT
id,
accounting_document,
accounting_document_type,
company_code,
document_date_year,
posting_view_item
FROM sch_brand_payment_data_lake_proxy.supplier_balances
LEFT OUTER JOIN sch_brand_payment_data_lake_proxy.supplier_balance_items
ON supplier_balances.id = supplier_balance_items.supplier_balance_id
WHERE (accounting_document, accounting_document_type, company_code, document_date_year)
IN (('A', 'B', 'C', 1), ('A', 'B', 'C', 2))
Output:
id | accounting_document | accounting_document_type | company_code | document_date_year | posting_view_item |
---|---|---|---|---|---|
2 | A | B | C | 1 | A |
3 | A | B | C | 2 | A |
3 | A | B | C | 2 | B |
https://github.com/npgsql/npgsql/issues/1199
Now, when using npgsql in C#, reproducing the query above is straightforward:
using System.Data;
using Npgsql;
var connectionStringBuilder = new NpgsqlConnectionStringBuilder
{
Host = "localhost",
Port = 5432,
Username = "brand_payment_migration",
Password = "secret",
Database = "brand_payment"
};
using var connection = new NpgsqlConnection(connectionStringBuilder.ToString());
connection.Open();
using var command = connection.CreateCommand();
command.CommandText =
"SELECT id, accounting_document, accounting_document_type, company_code, document_date_year, posting_view_item " +
"FROM sch_brand_payment_data_lake_proxy.supplier_balances " +
"LEFT OUTER JOIN sch_brand_payment_data_lake_proxy.supplier_balance_items " +
"ON supplier_balances.id = supplier_balance_items.supplier_balance_id " +
"WHERE (accounting_document, accounting_document_type, company_code, document_date_year) " +
"IN (('A', 'B', 'C', 1), ('A', 'B', 'C', 2));";
using var reader = command.ExecuteReader();
using var dataTable = new DataTable();
dataTable.Load(reader);
var cols = dataTable.Columns.Cast<DataColumn>().ToArray();
Console.WriteLine(string.Join(Environment.NewLine, cols.Select((x, i) => $"Col{i} = {x}")));
Console.WriteLine(string.Join("\t", cols.Select((_, i) => $"Col{i}")));
foreach (var dataRow in dataTable.Rows.Cast<DataRow>())
{
Console.WriteLine(string.Join("\t", dataRow.ItemArray));
}
Which, as expected, outputs:
Col0 = id
Col1 = accounting_document
Col2 = accounting_document_type
Col3 = company_code
Col4 = document_date_year
Col5 = posting_view_item
Col0 Col1 Col2 Col3 Col4 Col5
2 A B C 1 A
3 A B C 2 A
3 A B C 2 B
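Now, what I would like to achieve is this: instead of passing a raw string for (('A', 'B', 'C', 1), ('A', 'B', 'C', 2));, I would love to use an NpgsqlParameter with a collection of value sets (i.e. one value per column).
So I changed the C# snippet above and added a parameter: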
// ...
"WHERE (accounting_document, accounting_document_type, company_code, document_date_year) " +
"IN @values;";
var parameter = command.CreateParameter();
parameter.ParameterName = "@values";
parameter.NpgsqlDbType = NpgsqlDbType.Array;
parameter.NpgsqlValue = new object[,]
{
{ "A", "B", "C", 1 },
{ "A", "B", "C", 2 }
};
// Note: the same kind of issue arises when using tuples, i.e.
// ( "A", "B", "C", 1 )
// ( "A", "B", "C", 2 )
command.Parameters.Add(parameter);
using var reader = command.ExecuteReader();
// ...
And then I got this exception:
Unhandled exception. System.ArgumentOutOfRangeException: Cannot set NpgsqlDbType to just Array, Binary-Or with the element type (e.g. Array of Box is NpgsqlDbType.Array | NpgsqlDbType.Box). (Parameter 'value')
at Npgsql.NpgsqlParameter.set_NpgsqlDbType(NpgsqlDbType value)
at Program.<Main>$(String[] args) in C:\Users\natalie-perret\Desktop\Personal\playground\csharp\CSharpPlayground\Program.cs:line 25
I then tried to work around that error with:
parameter.NpgsqlDbType = NpgsqlDbType.Array | NpgsqlDbType.Unknown;
But then I got another exception:
Unhandled exception. System.ArgumentException: No array type could be found in the database for element .<unknown>
at Npgsql.TypeMapping.ConnectorTypeMapper.ResolveByNpgsqlDbType(NpgsqlDbType npgsqlDbType)
at Npgsql.NpgsqlParameter.ResolveHandler(ConnectorTypeMapper typeMapper)
at Npgsql.NpgsqlParameterCollection.ValidateAndBind(ConnectorTypeMapper typeMapper)
at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior)
at Program.<Main>$(String[] args) in C:\Users\natalie-perret\Desktop\Personal\playground\csharp\CSharpPlayground\Program.cs:line 32
It seems that the type needs to be registered for some reason; indeed, if I don't specify the type at all:
Unhandled exception. System.NotSupportedException: The CLR type System.Object isn't natively supported by Npgsql or your PostgreSQL. To use it with a PostgreSQL composite
you need to specify DataTypeName or to map it, please refer to the documentation.
at Npgsql.TypeMapping.ConnectorTypeMapper.ResolveByClrType(Type type)
at Npgsql.TypeMapping.ConnectorTypeMapper.ResolveByClrType(Type type)
at Npgsql.NpgsqlParameter.ResolveHandler(ConnectorTypeMapper typeMapper)
at Npgsql.NpgsqlParameter.Bind(ConnectorTypeMapper typeMapper)
at Npgsql.NpgsqlParameterCollection.ValidateAndBind(ConnectorTypeMapper typeMapper)
at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior, Boolean async, CancellationToken cancellationToken)
at Npgsql.NpgsqlCommand.ExecuteReader(CommandBehavior behavior)
at Program.<Main>$(String[] args) in C:\Users\natalie-perret\Desktop\Personal\playground\csharp\CSharpPlayground\Program.cs:line 31
[EDIT]
The temporary solution I ended up with relies on the jsonb support, in particular the jsonb_to_recordset function (see the PostgreSQL documentation section on JSON functions):
using System.Data;
using System.Text.Json;
using Npgsql;
using NpgsqlTypes;
var connectionStringBuilder = new NpgsqlConnectionStringBuilder
{
Host = "localhost",
Port = 5432,
Username = "brand_payment_migration",
Password = "secret",
Database = "brand_payment"
};
using var connection = new NpgsqlConnection(connectionStringBuilder.ToString());
connection.Open();
using var command = connection.CreateCommand();
command.CommandText =
"SELECT id, accounting_document, accounting_document_type, company_code, document_date_year, posting_view_item " +
"FROM sch_brand_payment_data_lake_proxy.supplier_balances " +
"LEFT OUTER JOIN sch_brand_payment_data_lake_proxy.supplier_balance_items " +
"ON supplier_balances.id = supplier_balance_items.supplier_balance_id " +
"WHERE (accounting_document, accounting_document_type, company_code, document_date_year) " +
"IN (SELECT * FROM jsonb_to_recordset(@values) " +
"AS params (accounting_document text, accounting_document_type text, company_code text, document_date_year integer));";
var parameter = command.CreateParameter();
parameter.ParameterName = "@values";
parameter.NpgsqlDbType = NpgsqlDbType.Jsonb;
parameter.NpgsqlValue = JsonSerializer.Serialize(new []
{
new Params("A", "B", "C", 1),
new Params("A", "B", "C", 2)
});
command.Parameters.Add(parameter);
using var reader = command.ExecuteReader();
using var dataTable = new DataTable();
dataTable.Load(reader);
var cols = dataTable.Columns.Cast<DataColumn>().ToArray();
Console.WriteLine(string.Join(Environment.NewLine, cols.Select((x, i) => $"Col{i} = {x}")));
Console.WriteLine(string.Join("\t", cols.Select((_, i) => $"Col{i}")));
foreach (var dataRow in dataTable.Rows.Cast<DataRow>())
{
Console.WriteLine(string.Join("\t", dataRow.ItemArray));
}
public record Params(
string accounting_document,
string accounting_document_type,
string company_code,
int document_date_year);
Output:
Col0 = id
Col1 = accounting_document
Col2 = accounting_document_type
Col3 = company_code
Col4 = document_date_year
Col5 = posting_view_item
Col0 Col1 Col2 Col3 Col4 Col5
2 A B C 1 A
3 A B C 2 A
3 A B C 2 B
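But this comes at the cost of an extra JSON serialization step when passing the parameters. So, short of building an extremely long string, I'm a bit baffled that there is no way to pass the actual values directly to the NpgsqlParameter.NpgsqlValue property without that extra step.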
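To make that extra step concrete, here is a minimal sketch of the payload that ends up in the @values parameter. It uses anonymous objects in place of the Params record, an assumption made only to keep the snippet free of type declarations. The property names have to match the column names declared in the AS params (...) clause of jsonb_to_recordset.

```csharp
using System;
using System.Text.Json;

// Hypothetical lookup keys; the anonymous objects stand in for the Params record.
var keys = new[]
{
    new { accounting_document = "A", accounting_document_type = "B", company_code = "C", document_date_year = 1 },
    new { accounting_document = "A", accounting_document_type = "B", company_code = "C", document_date_year = 2 }
};

// By default System.Text.Json keeps property names exactly as declared,
// so the JSON keys line up with the column names used on the SQL side.
var payload = JsonSerializer.Serialize(keys);

// This string is what gets assigned to parameter.NpgsqlValue.
Console.WriteLine(payload);
```

The resulting string is a JSON array with one object per key tuple, which is the shape jsonb_to_recordset expects.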
[EDIT 2]
Added a DbFiddle
[EDIT 3]
The same jsonb "trick" can be used to feed the data (albeit with the same issue I already mentioned above):
INSERT INTO sch_brand_payment_data_lake_proxy.supplier_balances
(accounting_document, accounting_document_type, company_code, document_date_year)
SELECT * FROM jsonb_to_recordset(
'[{"accounting_document":"E","accounting_document_type":"B","company_code":"C","document_date_year":1},
{"accounting_document":"E","accounting_document_type":"B","company_code":"C","document_date_year":2}]'::jsonb)
AS params (accounting_document text, accounting_document_type text, company_code text, document_date_year integer)
RETURNING id;
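[EDIT 4] Another way is to use jsonb_populate_recordset, passing the relevant NULL::table-full-name as the first parameter (which defines the columns) and the relevant jsonb as the second parameter (similar to the first parameter of jsonb_to_recordset).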
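As a rough sketch of the jsonb_populate_recordset variant, the command below reuses the table and parameter names from the question. It only builds the command and does not execute it: attaching an open NpgsqlConnection is left out and is an assumption about how it would be run.

```csharp
using System;
using System.Text.Json;
using Npgsql;
using NpgsqlTypes;

// The column list comes from the table's row type (NULL::<table>),
// so no explicit AS params (...) clause is needed.
const string sql =
    "SELECT id, accounting_document, accounting_document_type, company_code, document_date_year " +
    "FROM sch_brand_payment_data_lake_proxy.supplier_balances " +
    "WHERE (accounting_document, accounting_document_type, company_code, document_date_year) " +
    "IN (SELECT accounting_document, accounting_document_type, company_code, document_date_year " +
    "FROM jsonb_populate_recordset(NULL::sch_brand_payment_data_lake_proxy.supplier_balances, @values));";

// Hypothetical lookup keys, serialized the same way as for jsonb_to_recordset.
var json = JsonSerializer.Serialize(new[]
{
    new { accounting_document = "A", accounting_document_type = "B", company_code = "C", document_date_year = 1 },
    new { accounting_document = "A", accounting_document_type = "B", company_code = "C", document_date_year = 2 }
});

using var command = new NpgsqlCommand(sql);
command.Parameters.Add(new NpgsqlParameter("@values", NpgsqlDbType.Jsonb) { Value = json });

// Assumption: set command.Connection to an open NpgsqlConnection before calling ExecuteReader().
Console.WriteLine(command.CommandText);
```

Selecting the four key columns explicitly in the subquery matters here, because jsonb_populate_recordset returns every column of the table's row type, not just the ones present in the JSON.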
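Basically, there are 3 main ways to achieve what I want (DbFiddle updated accordingly):
Note: this might become easier with PostgreSQL 15 and its json_table feature.
[EDIT 3] This article sums things up pretty well: https://dev.to/forbeslindesay/postgres-unnest-cheat-sheet-for-bulk-operations-1obg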
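[EDIT 2]
Following up on the issue I raised earlier today, https://github.com/npgsql/npgsql/issues/4437#issuecomment-1113999994
I settled on the solution/workaround mentioned by @dhedey in another, somewhat related issue:
In case it helps others, I have found a nice workaround for these kinds of queries using the UNNEST command, which can take multiple array parameters and zip them together into columns, which can then be joined with the table to filter down to the relevant rows. In some cases the join is also more performant than the ANY/IN pattern.
SELECT * FROM table WHERE (itemauthor, itemtitle) = ANY (('bob', 'hello'), ('frank', 'hi')...)
can be expressed as: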
var authorsParameter = new NpgsqlParameter("@authors", NpgsqlDbType.Array | NpgsqlDbType.Varchar)
{
    Value = authors.ToList()
};
var titlesParameter = new NpgsqlParameter("@titles", NpgsqlDbType.Array | NpgsqlDbType.Varchar)
{
    Value = titles.ToList()
};
var results = dbContext.Set<MyRow>()
    .FromSqlInterpolated($@"
        SELECT t.*
        FROM UNNEST({authorsParameter}, {titlesParameter}) params (author, title)
        INNER JOIN table t ON t.author = params.author AND t.title = params.title
    ");
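Note - Varchar can be replaced with other types for parameters that are arrays of other types (e.g. Bigint) - take a look at the NpgsqlDbType enum for more details.
I then rewrote some of the code I originally posted, and the unnest PostgreSQL function solution seems to work like a charm. This is the answer I accept for the time being; it looks neater than Json / JsonB, which requires further postgresql-json specific mapping shenanigans or extraction.
I'm still not quite sure about the performance implications, though:
unnest requires you to map the values to one array per column.
jsonb_to_recordset requires an additional .NET JSON serialization step and, in some cases, an explicit mapping of the output of jsonb_to_recordset to the relevant columns.
Neither is free. But I like that with unnest you state explicitly, for each column (i.e. each set of values taken from a collection of a bigger .NET type such as tuples, records, classes, structs, etc.) passed to the NpgsqlParameter.NpgsqlValue property, which type is going to be used, via the NpgsqlDbType enum: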
using System.Data;
using Npgsql;
using NpgsqlTypes;
var connectionStringBuilder = new NpgsqlConnectionStringBuilder
{
Host = "localhost",
Port = 5432,
Username = "brand_payment_migration",
Password = "secret",
Database = "brand_payment"
};
using var connection = new NpgsqlConnection(connectionStringBuilder.ToString());
connection.Open();
var selectStatement =
"SELECT * FROM sch_brand_payment_data_lake_proxy.supplier_balances " +
"WHERE (accounting_document, accounting_document_type, company_code, document_date_year) " +
"IN (SELECT * FROM unnest(" +
"@accounting_document_texts, " +
"@accounting_document_types, " +
"@company_codes, " +
"@document_date_years" +
"))";
var insertStatement =
"INSERT INTO sch_brand_payment_data_lake_proxy.supplier_balances " +
"(accounting_document, accounting_document_type, company_code, document_date_year) " +
"SELECT * FROM unnest(" +
"@accounting_document_texts, " +
"@accounting_document_types, " +
"@company_codes, " +
"@document_date_years" +
") RETURNING id;";
var parameters = new (string Name, NpgsqlDbType DbType, object Value)[]
{
("@accounting_document_texts", NpgsqlDbType.Array | NpgsqlDbType.Text, new[] {"G", "G", "G"}),
("@accounting_document_types", NpgsqlDbType.Array | NpgsqlDbType.Text, new[] {"Y", "Y", "Y"}),
("@company_codes", NpgsqlDbType.Array | NpgsqlDbType.Text, new[] {"Z", "Z", "Z"}),
("@document_date_years", NpgsqlDbType.Array | NpgsqlDbType.Integer, new[] {1, 2, 3})
};
connection.ExecuteNewCommandAndWriteResultToConsole(insertStatement, parameters);
connection.ExecuteNewCommandAndWriteResultToConsole(selectStatement, parameters);
public static class Extensions
{
public static void AddParameter(this NpgsqlCommand command, string name, NpgsqlDbType dbType, object value)
{
var parameter = command.CreateParameter();
parameter.ParameterName = name;
parameter.NpgsqlDbType = dbType;
parameter.NpgsqlValue = value;
command.Parameters.Add(parameter);
}
public static NpgsqlCommand CreateCommand(this NpgsqlConnection connection,
string text,
IEnumerable<(string Name, NpgsqlDbType DbType, object Value)> parameters)
{
var command = connection.CreateCommand();
command.CommandText = text;
foreach (var (name, dbType, value) in parameters)
{
command.AddParameter(name, dbType, value);
}
return command;
}
public static void ExecuteAndWriteResultToConsole(this NpgsqlCommand command)
{
Console.WriteLine($"Executing command... {command.CommandText}");
using var reader = command.ExecuteReader();
using var dataTable = new DataTable();
dataTable.Load(reader);
var cols = dataTable.Columns.Cast<DataColumn>().ToArray();
Console.WriteLine(string.Join(Environment.NewLine, cols.Select((x, i) => $"Col{i} = {x}")));
Console.WriteLine(string.Join("\t", cols.Select((_, i) => $"Col{i}")));
foreach (var dataRow in dataTable.Rows.Cast<DataRow>())
{
Console.WriteLine(string.Join("\t", dataRow.ItemArray));
}
}
public static void ExecuteNewCommandAndWriteResultToConsole(this NpgsqlConnection connection,
string text,
IEnumerable<(string Name, NpgsqlDbType DbType, object Value)> parameters)
{
using var command = connection.CreateCommand(text, parameters);
command.ExecuteAndWriteResultToConsole();
}
}
Output:
Executing command... INSERT INTO sch_brand_payment_data_lake_proxy.supplier_balances (accounting_document, accounting_document_type, company_code, document_date_year) SELECT * FROM unnest(@accounting_document_texts, @accounting_document_types, @company_codes, @document_date_years) RETURNING id;
Col0 = id
Col0
28
29
30
Executing command... SELECT * FROM sch_brand_payment_data_lake_proxy.supplier_balances WHERE (accounting_document, accounting_document_type, company_code, document_date_year) IN (SELECT * FROM unnest(@accounting_document_texts, @accounting_document_types, @company_codes, @document_date_years))
Col0 = id
Col1 = accounting_document
Col2 = accounting_document_type
Col3 = company_code
Col4 = document_date_year
Col5 = accounting_doc_created_by_user
Col6 = accounting_clerk
Col7 = assignment_reference
Col8 = document_reference_id
Col9 = original_reference_document
Col10 = payment_terms
Col11 = supplier
Col12 = supplier_name
Col13 = document_date
Col14 = posting_date
Col15 = net_due_date
Col16 = created_on
Col17 = modified_on
Col18 = pushed_on
Col19 = is_modified
Col20 = is_pushed
Col0 Col1 Col2 Col3 Col4 Col5 Col6 Col7 Col8 Col9 Col10 Col11 Col12 Col13 Col14 Col15 Col16 Col17 Col18 Col19 Col20
28 G Y Z 1 False False
29 G Y Z 2 False False
30 G Y Z 3 False False
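In practice the keys usually start out as a collection of rows rather than as ready-made per-column arrays. A minimal sketch of that transposition step follows; the tuple field names are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical rows to look up, one value tuple per composite key.
var rows = new (string Document, string DocumentType, string CompanyCode, int Year)[]
{
    ("G", "Y", "Z", 1),
    ("G", "Y", "Z", 2),
    ("G", "Y", "Z", 3)
};

// Transpose the rows into one array per column. unnest(a, b, c, d) zips the
// arrays back together by position, so every array must keep the row order.
string[] documents = rows.Select(r => r.Document).ToArray();
string[] documentTypes = rows.Select(r => r.DocumentType).ToArray();
string[] companyCodes = rows.Select(r => r.CompanyCode).ToArray();
int[] years = rows.Select(r => r.Year).ToArray();

Console.WriteLine(string.Join(",", documents));
Console.WriteLine(string.Join(",", years));
```

Each array can then be passed as the value of the matching parameter (@accounting_document_texts, @accounting_document_types, @company_codes, @document_date_years) with the corresponding NpgsqlDbType.Array | element type.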
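[EDIT 1]
Since @Charlieface pointed out that this is not a proper answer, I figured it would be best to get an answer/information from the npgsql maintainers/contributors.
Hence I filed an issue on their GitHub repository: https://github.com/npgsql/npgsql/issues/4437
Original answer:
As of today there is no way to pass, among other things, tuples or collections as a composite "type" or via a positional-slash-implicit "definition" (which could then be used in a collection passed to the parameter value property). npgsql requires a prior PostgreSQL type definition (and tuples and nested collections still can't work, since the maintainers, or at least one of them, deem it not safe enough). https://github.com/npgsql/npgsql/issues/2154
As the exception says, the corresponding composite is required in the database. This is because anonymous types are not mapped to records.
So you should create a type, and a struct which must be mapped to that type.
FYI, there is a similar issue #2097 to track mapping composites to value tuples.
But that would require some other related developments in npgsql, such as #2097, which the author/main contributor deemed too brittle in https://github.com/dotnet/efcore/issues/14661#issuecomment-462440199
Note that after discussion in npgsql/npgsql#2097 we decided to drop this idea. C# value tuples don't have names, so any mapping to PostgreSQL composites would rely on field definition ordering, which seems quite dangerous/brittle.
I finally decided to settle for the jsonb alternative. I'm not a huge fan, but at least it allows passing collections in a relatively safe way (as long as the serialization that produces the jsonb is under control).
But what I initially had in mind is not something that can be done today.
One more thing I learned in the process of writing this post: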