
SQL Server bulk insert special characters / Unicode from C# generated CSV

Scenario: A CSV text file generated from C# is to be imported into a SQL Server database using BULK INSERT. Some fields contain non-ASCII (Unicode) characters.

Problem: The special characters display correctly in the text file, but are not being saved correctly in the database.

Edit: Example of correct text is "Khālid Muḥammad ʻAlī al-Ḥājj" and incorrect text is "Kha¯lid Muh?ammad ?Ali¯ al-H?a¯jj".
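To see which characters are at risk, a rough check is to list everything in the string above U+00FF. This is only a sketch: the class and method names are made up for illustration, the sample name is written with escapes in its precomposed form (an assumption about how the text is stored), and the U+00FF cutoff only approximates what a single-byte code page such as 1252 can hold.

```csharp
using System;
using System.Collections.Generic;

public static class NonLatin1Finder
{
    // Hypothetical helper: returns the UTF-16 code units above U+00FF,
    // i.e. those a Latin-1 style single-byte column cannot hold as-is.
    public static List<string> Find(string text)
    {
        var found = new List<string>();
        foreach (char c in text)
        {
            if (c > '\u00FF')
            {
                found.Add("U+" + ((int)c).ToString("X4"));
            }
        }
        return found;
    }

    public static void Main()
    {
        // Escapes spell out the precomposed form of the sample name.
        string name = "Kh\u0101lid Mu\u1E25ammad \u02BBAl\u012B al-\u1E24\u0101jj";
        Console.WriteLine(string.Join(", ", Find(name)));
    }
}
```

Every code point this reports is one that gets replaced or approximated when the data passes through a non-Unicode file or a VARCHAR column.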

I pieced together the answer to this from several sources, so here is the whole thing in one place.

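(1) The text file must be written as a Unicode (UTF-16) file. In .NET, Encoding.Unicode is UTF-16 little-endian, and StreamWriter puts a byte order mark at the start of a new file.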

StreamWriter writer = new StreamWriter(fileName, append: false, encoding: Encoding.Unicode);
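A slightly fuller sketch of step (1), assuming a pipe-delimited file with a single header row. The class name, method name, column names and temp-file path are all hypothetical. The explicit CRLF line ending is also an assumption on my part: as I understand the documentation, BULK INSERT treats a '\n' row terminator as CRLF.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class UnicodeCsvWriter
{
    // Hypothetical helper: writes a header plus one pipe-delimited line per row.
    public static void WriteCsv(string fileName, string header, IEnumerable<string[]> rows)
    {
        // Encoding.Unicode is UTF-16 little-endian; a new file starts with the BOM FF FE.
        using (var writer = new StreamWriter(fileName, append: false, encoding: Encoding.Unicode))
        {
            writer.NewLine = "\r\n"; // keep CRLF row endings regardless of platform
            writer.WriteLine(header);
            foreach (string[] fields in rows)
            {
                writer.WriteLine(string.Join("|", fields));
            }
        }
    }

    public static void Main()
    {
        string fileName = Path.Combine(Path.GetTempPath(), "names.csv"); // illustrative path
        var rows = new List<string[]> { new[] { "1", "Khālid Muḥammad ʻAlī al-Ḥājj" } };
        WriteCsv(fileName, "Id|Name", rows);

        // Sanity check: the first two bytes should be the UTF-16 LE byte order mark.
        byte[] bytes = File.ReadAllBytes(fileName);
        Console.WriteLine(bytes[0] == 0xFF && bytes[1] == 0xFE);
    }
}
```

The using block matters here: the writer has to be flushed and closed before anything else tries to read the file.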

(2) The database columns must use Unicode data types. For example, NVARCHAR instead of VARCHAR. Use one of the data types that starts with "N".

Side note: you can mark string literals in T-SQL as Unicode strings by prefixing them with an N.

INSERT INTO myTable (Name) VALUES (N'special characters');

(3) Specify DATAFILETYPE='widechar' in the BULK INSERT command. (If the file is not marked Unicode, this will throw an error.)

BULK INSERT dbo.myTable FROM 'C:\path\fileName.csv' 
WITH (DATAFILETYPE ='widechar', FIRSTROW=2, FIELDTERMINATOR='|', ROWTERMINATOR='\n');
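Since the file comes from C#, the same program may also want to issue the import. Below is a minimal sketch that only builds the statement text; the class and method names are hypothetical, and the table name is assumed to come from trusted code because it is not escaped. The resulting string could then be assigned to a SqlCommand's CommandText.

```csharp
using System;

public static class BulkInsertSql
{
    // Hypothetical helper: builds the statement text; it does not run it.
    public static string Build(string table, string filePath)
    {
        // Double any single quotes so the path stays inside the T-SQL string literal.
        string escapedPath = filePath.Replace("'", "''");
        return "BULK INSERT " + table + " FROM '" + escapedPath + "' " +
               "WITH (DATAFILETYPE = 'widechar', FIRSTROW = 2, " +
               "FIELDTERMINATOR = '|', ROWTERMINATOR = '\\n');";
    }

    public static void Main()
    {
        Console.WriteLine(Build("dbo.myTable", @"C:\path\fileName.csv"));
    }
}
```

Keep in mind that the path is resolved on the machine running SQL Server, not on the client that sends the command.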

(4) Use database collation SQL_Latin1_General_CP1_CI_AS.

