We receive a .tar.gz file from a client every day and I am rewriting our import process using SSIS. One of the first steps in my process is to unzip the .tar.gz file which I achieve via a Python script.
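For context, the unzip step might look something like the sketch below. This is not my actual script; the function name `extract_archive` and the paths in the usage comment are made up for illustration. It only uses the standard library `tarfile` module.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir and return the extracted files.

    Hypothetical helper for illustration only.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:gz" opens the archive for reading with gzip decompression.
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(dest)
    # Return every regular file found under the destination folder.
    return sorted(p for p in dest.rglob("*") if p.is_file())


# Example usage (paths are placeholders):
# files = extract_archive(r"C:\incoming\daily.tar.gz", r"C:\incoming\extracted")
```

Note that `tarfile` writes the member bytes unchanged, so whatever row delimiter the client used inside the CSV files survives the extraction.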
After unzipping we are left with a number of CSV files which I then import into SQL Server. As an aside, I am loading using the CozyRoc DataFlow Task Plus.
Most of my CSV files load without issue but I have five files which fail. By reading the log I can see that the process is reading the header and first line as though there were no HeaderRow Delimiter (i.e. it is trying to import the column header as ColumnHeader1ColumnValue1).
I took one of these CSVs, copied the top 5 rows into Excel, used Text-To-Columns to delimit the data, then saved that as a new CSV file. This version imported successfully.
That makes me think that somehow the original CSV isn't using {CR}{LF} as the row delimiter but I don't know how to check. Any suggestions?
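Since Python is already part of the process, one way to check could be to read the file in binary mode and count the byte sequences directly. The sketch below is only an illustration; `count_line_endings` is a made-up name and the path in the usage comment is a placeholder.

```python
def count_line_endings(path):
    """Count CRLF, bare LF and bare CR line endings in a file.

    Hypothetical helper for illustration only.
    """
    # Binary mode is essential: text mode would translate the
    # line endings before we get to look at them.
    with open(path, "rb") as fh:
        data = fh.read()
    crlf = data.count(b"\r\n")
    # Every CRLF also contains one LF and one CR, so subtract them
    # to get the number of bare occurrences.
    lf = data.count(b"\n") - crlf
    cr = data.count(b"\r") - crlf
    return {"CRLF": crlf, "LF": lf, "CR": cr}


# Example usage (path is a placeholder):
# print(count_line_endings(r"C:\incoming\extracted\failing_file.csv"))
```

A file that reports only LF counts is using Unix-style row delimiters. A mix of counts would suggest inconsistent row delimiters, or line breaks embedded inside field values.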
I ended up using the suggestion commented by @vahdet because I already had Notepad++ installed. I can't find the same option in EmEditor but it may exist.
For those who are curious, the files are using {LF} which is consistent with the other files. My investigation continues...
Seeing that you have EmEditor, you can use EmEditor to find the eol character in two ways:
Some other things you could try checking for are: file encoding, wrong type of data for a field and an inconsistent number of columns.
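As a rough sketch of the column-count and encoding checks, something like the following could be run against one of the failing files. The function name `column_count_profile` and its defaults (UTF-8, comma delimiter) are assumptions for illustration; adjust them to whatever the client actually sends.

```python
import csv
from collections import Counter


def column_count_profile(path, encoding="utf-8", delimiter=","):
    """Return a mapping of {number of columns: number of rows}.

    Hypothetical helper for illustration only. A UnicodeDecodeError
    raised here suggests the file is not in the assumed encoding.
    """
    counts = Counter()
    # newline="" lets the csv module handle row delimiters itself,
    # including line breaks inside quoted fields.
    with open(path, newline="", encoding=encoding) as fh:
        for row in csv.reader(fh, delimiter=delimiter):
            counts[len(row)] += 1
    return dict(counts)


# Example usage (path is a placeholder):
# print(column_count_profile(r"C:\incoming\extracted\failing_file.csv"))
```

If every row has the same number of columns the result has a single key. More than one key points to rows with missing or extra delimiters, which are worth inspecting by hand.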