I have a lot of logfiles that I want to load into a SQL Server database via a Talend job so that I can preserve the logs without keeping the files in the file system (we generate over 3k log files every day). I do not want to change the jobs to write to the database directly as I don't want the jobs to be dependent on the DB server being available.
I have created a Talend job that can read a log file and convert it to an XML document which I can then send to a SQL Server (2016) stored procedure. The stored procedure is executed, however the contents of the log file are NOT being stored in the database.
I have ruled out the following possible causes:
The stored procedure is:
ALTER procedure [myschema].[myProc]
@logXML as xml,
@logFile as varchar(max),
@result as varchar(max) OUT
as
begin
declare @myAction varchar(max);
begin try
begin transaction
set @myAction = 'inserting log file ' + @logFile;
insert into [myDB].[mySchema].[myTable]
(
JobName,
LogName,
RunDate,
LogLineNum,
LogLine
)
(select logs.value('JobName[1]', 'varchar(500)') as JobName,
logs.value('LogName[1]', 'varchar(500)') as LogName,
logs.value('runDate[1]', 'varchar(20)') as runDate,
lines.value('Number[1]', 'integer') as LogLineNum,
lines.value('Content[1]', 'varchar(max)') as LogLine
from @logXML.nodes('/LogFileContents') as l1(logs),
@logXML.nodes('/LogFileContents/LogLines/Line') as l2(lines)
);
commit transaction;
set @result = 'SUCCESS';
end try
begin catch
if @@TRANCOUNT > 0
begin
rollback
set @result = 'Error ' + @myAction + '. Error code: ' + @@ERROR;
end
end catch
end
The XML document generated by the Talend job looks like the following:
<?xml version="1.0" encoding="UTF-8"?> // I do not have control of this - I get this "free" from Talend
<LogFileContents>
<JobName>myTask</JobName>
<LogName>myLogFile</LogName>
<runDate>YYYYMMDDHHmiss</runDate>
<LogLines>
<Line>
<Number>1</Number>
<Content>1st Log Message</Content>
</Line>
<Line>
<Number>2</Number>
<Content>2nd Log Message</Content>
</Line>
...
<Line>
<Number>Last</Number>
<Content>Last Log Message</Content>
</Line>
</LogLines>
</LogFileContents>
In Talend I have tried passing the XML document to the tDBSP as a document (which results in a Talend Error). It is currently configured to pass the XML document as a String - I had to add the "sendStringParametersAsUnicode=false" to my connection string (additional parameters) to eliminate the "unable to switch encoding" error. When I check the result from the call to the stored proc I get "SUCCESS". I have even tried having the stored proc pass back a non-sense value for the result instead of "SUCCESS".
In the stored proc I have tried changing the input parameter to a varchar(max) and then converting that to an XML variable via
@logXML as varchar(max),
...
declare @myXML xml = convert(xml, @logXML)
...
from @myXML.nodes(...) as l1(logs)
@myXML.nodes(...) as l2(lines)
which also did not load the document into the database.
However if I execute the stored procedure via SSMS using the XML Document (only difference is the missing <?xml...> document header):
<LogFileContents>
<JobName>myTask</JobName>
<LogName>myLogFile</LogName>
<runDate>YYYYMMDDHHmiss</runDate>
<LogLines>
<Line>
<Number>1</Number>
<Content>1st Log Message</Content>
</Line>
<Line>
<Number>2</Number>
<Content>2nd Log Message</Content>
</Line>
...
<Line>
<Number>Last</Number>
<Content>Last Log Message</Content>
</Line>
</LogLines>
</LogFileContents>
The entire contents loads as expected (The log file I am testing with has 40 lines and all 40 show up in the database, exactly as I expect).
I have also reviewed the following links - but they are in regards to processing XML files, not XML strings.
Parsing XML Data Into SQL Server
Importing XML file into SQL Server 2000 using OPENROWSET
I feel like there's something not quite configured right in Talend but I'm not sure where that would be. Any input/guidance would be greatly appreciated.
Your XML shredding has no issues outside of performance.
Please see below.
So, it seems that the issue is some kind of environment related (Talend?.).
SQL
DECLARE @logXML XML =
'<?xml version="1.0" encoding="UTF-8"?>
<LogFileContents>
<JobName>myTask</JobName>
<LogName>myLogFile</LogName>
<runDate>YYYYMMDDHHmiss</runDate>
<LogLines>
<Line>
<Number>1</Number>
<Content>1st Log Message</Content>
</Line>
<Line>
<Number>2</Number>
<Content>2nd Log Message</Content>
</Line>...
<Line>
<Number>10000</Number>
<Content>Last Log Message</Content>
</Line>
</LogLines>
</LogFileContents>';
SELECT logs.value('(JobName/text())[1]', 'varchar(500)') as JobName
, logs.value('(LogName/text())[1]', 'varchar(500)') as LogName
, logs.value('(runDate/text())[1]', 'varchar(20)') as runDate
, lines.value('(Number/text())[1]', 'integer') as LogLineNum
, lines.value('(Content/text())[1]', 'varchar(max)') as LogLine
FROM @logXML.nodes('/LogFileContents') as l1(logs)
, @logXML.nodes('/LogFileContents/LogLines/Line') as l2(lines);
Output
JobName | LogName | runDate | LogLineNum | LogLine |
---|---|---|---|---|
myTask | myLogFile | YYYYMMDDHHmiss | 1 | 1st Log Message |
myTask | myLogFile | YYYYMMDDHHmiss | 2 | 2nd Log Message |
myTask | myLogFile | YYYYMMDDHHmiss | 10000 | Last Log Message |
So it seems that, even though I have a commit in the stored procedure, I still need an external (in Talend) commit (tDBCommit). From what I've been able to gather it has to do with the fact that I use a tDBConnection to establish the DB connection instead of configuring the tDBSP to make a connection to the DB.
Although I haven't tried it, I suspect that I could also set the "Auto Commit" property of the tDBConnection and that would work as well.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.