
Parsing Java Server Exception logs

I need to read all "history" exception logs from a WebSphere server and load them into Hive. Below is what a typical log entry looks like, although the message sometimes extends over 4-5 lines. I do not care about the stack trace, but I do need the Timestamp, ThreadId, Short name, Event Type and full error message, each in its own column.

[5/20/16 22:35:39:841 CDT] 00233723 SystemOut     O 22:35:39,840 ERROR [com.xxx.app.yyy.hms.jms.receivers.impl.B2bTonnn278InReceiverImpl] 
xxxRuntimeException{errorVO=com.xxx.app.yyy.nnn.mmm.data.mmmCompleteIntakeErrorVO(diagnosesMessagesExist:false, mmmMessagesExist:false, incrementedKey:null, numPagesWithMessages:1, primaryKeyFields:[], providersMessagesExist:false, requiredFields:[], servicesMessagesExist:true, changeDateTime:05-20-2016 10:35:39:840 PM CDT, changeUserID:SYSTEM, createDateTime:null, createUserID:null, dataSecured:false, dataSecurityTypeList:null, globalMessages:[], historyID:0, messages:{procedureUnitCount=[Field For Label: procedureUnitCount Message ID: 'ERR0010', Message Arguments: '[]']}, trackChanges:false, updateVersion:-1, messages={procedureUnitCount=[Field For Label: procedureUnitCount Message ID: 'ERR0010', Message Arguments: '[]']})}
    at com.xxx.app.yyy.nnn.mmm.businesslogic.impl.mmmImpl.completemmm(mmmImpl.groovy:612)
    at sun.reflect.GeneratedMethodAccessor4988.invoke(Unknown Source)

I tried reading one line at a time and parsing each line with a regex, which failed miserably: only about 20% of the data matched the regex, and the quality of what did match was poor. I do not know how to proceed, or which delimiter to use to split the exception string into columns (I already tried \\t, which did not work either).

Any help or a pointer in the right direction?

Use Logstash to read and parse the WebSphere logs and post them into Elasticsearch for further processing (i.e. use the ELK Stack).

Read the related discussion here.

With Logstash, you can use Grok to parse any crappy unstructured log data into something structured and queryable.
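If you want to see the idea outside Logstash first, here is a rough Python sketch of the same approach: treat a line that starts with the bracketed timestamp as the beginning of a new record, append any following lines to that record's message, and drop the "at ..." stack frames. This is only a sketch under assumptions taken from the sample in the question: the thread ID is 8 hex digits, the short name contains no spaces, and the event type is a single uppercase letter. The names HEADER, STACK_FRAME, parse_records and to_hive_row are made up for illustration. The \x01 (Ctrl-A) separator is Hive's default field delimiter for text tables; change it if your table is declared differently.

```python
import re

# Header of a record, modelled on the sample line in the question:
# [5/20/16 22:35:39:841 CDT] 00233723 SystemOut     O 22:35:39,840 ERROR [...]
# Assumptions: 8 hex digit thread ID, short name without spaces,
# single uppercase letter as event type.
HEADER = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+"
    r"(?P<thread_id>[0-9a-fA-F]{8})\s+"
    r"(?P<short_name>\S+)\s+"
    r"(?P<event_type>[A-Z])(?:\s+|$)"
    r"(?P<message>.*)$"
)

# Stack trace frames look like "    at com.xxx.Foo.bar(Foo.java:12)".
STACK_FRAME = re.compile(r"^\s+at\s")


def parse_records(lines):
    """Yield one dict per log record, joining continuation lines."""
    current = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        match = HEADER.match(line)
        if match:
            # A new header closes the previous record.
            if current is not None:
                yield current
            current = match.groupdict()
            current["message"] = current["message"].strip()
        elif current is not None and line.strip() and not STACK_FRAME.match(line):
            # Continuation of the message: append it, skipping blank lines
            # and stack frames.
            current["message"] = (current["message"] + " " + line.strip()).strip()
    if current is not None:
        yield current


def to_hive_row(record, sep="\x01"):
    """Join the five columns with sep; any sep inside a value becomes a space."""
    fields = ("timestamp", "thread_id", "short_name", "event_type", "message")
    return sep.join(record[name].replace(sep, " ") for name in fields)
```

To use it, open the log file, pass the file object to parse_records, and write to_hive_row(record) plus a newline for each record. Because each record ends up on a single line, the multi-line messages no longer break a line-oriented load.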

grep -A 1 SystemOut LogFile | awk 'NR%3{printf "%s ", $0; next}2' | awk '{print $2" "$4" "$8" "$10}'
