
How to read XML files and extract data from them?

I have a folder which contains around 15,103 xml files.

An example of an xml file within the folder is the following.

000010000.img.xml

Here is a snippet of the part of the XML I want to focus on.

<imgdir name="000010000.img">
   <imgdir name="info">
      <int name="version" value="10" />
      <int name="cloud" value="0" />
      <int name="town" value="0" />
      <float name="mobRate" value="1.0" />
      <string name="bgm" value="Bgm34/MapleLeaf" />
      <int name="returnMap" value="10000" />
      <string name="mapDesc" value="" />
      <int name="hideMinimap" value="0" />
      <int name="forcedReturn" value="999999999" />
      <int name="moveLimit" value="0" />
      <string name="mapMark" value="MushroomVillage" />
      <int name="swim" value="0" />
      <int name="fieldLimit" value="8260" />
      <int name="VRTop" value="-892" />
      <int name="VRLeft" value="-1064" />
      <int name="VRBottom" value="915" />
      <int name="VRRight" value="1334" />
      <int name="fly" value="0" />
      <int name="noMapCmd" value="0" />
      <string name="onFirstUserEnter" value="" />
      <string name="onUserEnter" value="go10000" />
      <int name="standAlone" value="0" />
      <int name="partyStandAlone" value="0" />
      <string name="fieldScript" value="" />
   </imgdir>

   </imgdir>
   <imgdir name="portal">
      <imgdir name="0">
         <string name="pn" value="sp" />
         <int name="pt" value="0" />
         <int name="x" value="-389" />
         <int name="y" value="183" />
         <int name="tm" value="999999999" />
         <string name="tn" value="" />
      </imgdir>
      <imgdir name="1">
         <string name="pn" value="sp" />
         <int name="pt" value="0" />
         <int name="x" value="-416" />
         <int name="y" value="185" />
         <int name="tm" value="999999999" />
         <string name="tn" value="" />
      </imgdir>
      <imgdir name="2">
         <string name="pn" value="sp" />
         <int name="pt" value="0" />
         <int name="x" value="-450" />
         <int name="y" value="183" />
         <int name="tm" value="999999999" />
         <string name="tn" value="" />
      </imgdir>
      <imgdir name="3">
         <string name="pn" value="out00" />
         <int name="pt" value="2" />
         <int name="x" value="1080" />
         <int name="y" value="541" />
         <int name="tm" value="20000" />
         <string name="tn" value="in00" />
         <string name="script" value="" />
         <int name="hideTooltip" value="0" />
         <int name="onlyOnce" value="0" />
         <int name="delay" value="0" />
      </imgdir>
   </imgdir>

I do not know how to code this (I have never done XML parsing before), but I think it may be possible to do in a .bat file.

I need to automatically go into each XML file, extract all the portal information and the map ID, and put it all into one text file.

Here is an example of the output I need (using the XML snippet above as a reference):

[10000] // <int name="returnMap" value="10000" />
total=4 // total amount of portals (4 below)
sp 0 -389 183 999999999 // <imgdir name="0">
sp 0 -416 185 999999999 // <imgdir name="1">
sp 0 -450 183 999999999 // <imgdir name="2">
out00 2 1080 541 20000 // <imgdir name="3">

I need a program to go into each xml, extract the information above and put it consistently into a single text file.

All the XML files have the same structure and follow pretty much the same style and imgdir names, but each contains a different number of portals.

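You should improve your search skills. In any case, I did a quick search and found this: http://msdn.microsoft.com/en-us/library/87274khy(v=vs.110).aspx, which will help you with parsing XML, and this: http://msdn.microsoft.com/zh-cn/library/2kzb96fk.aspx, which will help you iterate over directories and files.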
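For illustration only, here is a minimal sketch of the same two steps (parse each file with an XML parser, then walk the folder) written in Python with the standard library rather than the .NET classes linked above. It assumes every file is well-formed XML, that the `info` and `portal` sections are direct children of the root element, and that the map ID is the `returnMap` value, as in the question's example. The helper names `child_named`, `extract_map` and `extract_folder` are made up for this sketch.

```python
import glob
import os
import xml.etree.ElementTree as ET

# Columns wanted per portal line, in output order (taken from the question).
PORTAL_FIELDS = ("pn", "pt", "x", "y", "tm")


def child_named(parent, name):
    """Return the first direct child whose name attribute equals `name`, else None."""
    if parent is None:
        return None
    for child in parent:
        if child.get("name") == name:
            return child
    return None


def extract_map(root):
    """Build the output lines for one parsed map file."""
    return_map = child_named(child_named(root, "info"), "returnMap")
    portals = child_named(root, "portal")
    entries = list(portals) if portals is not None else []
    # Assumption: a file without returnMap gets an empty map ID.
    map_id = return_map.get("value") if return_map is not None else ""
    lines = ["[%s]" % map_id, "total=%d" % len(entries)]
    for entry in entries:
        # Map each field name (pn, pt, x, ...) to its value for this portal.
        values = {field.get("name"): field.get("value") for field in entry}
        lines.append(" ".join(values.get(name, "") for name in PORTAL_FIELDS))
    return lines


def extract_folder(folder, out_path):
    """Write the lines of every *.xml file in `folder` to one text file."""
    with open(out_path, "w", encoding="utf-8") as out:
        for path in sorted(glob.glob(os.path.join(folder, "*.xml"))):
            root = ET.parse(path).getroot()
            out.write("\n".join(extract_map(root)) + "\n")
```

Calling `extract_folder(".", "output.txt")` from the folder that holds the XML files would then write one block per map. Because a real parser is used, values containing spaces and portals with extra fields (such as `script` or `delay`) should not affect the result.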

This type of problem is unpleasant to solve because you did not describe the steps required to solve it; you just said "this is the data, this is the wanted result, solve it!". This means that you pass to us the task of correctly analyzing the data and producing the right procedure to get the result...

The Batch file below is a way to solve this problem; I assumed that the parts after the // in the output example are not required.

@echo off
setlocal EnableDelayedExpansion

(for %%a in (*.xml) do call :processFile "%%a") > output.txt
goto :EOF


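:processFile
rem Reset per-file state so counts never carry over from the previous file
set "returnMap="
set "portal="
set "i=0"
rem Token 3 is the quoted name attribute, token 5 the quoted value attribute
for /F "tokens=3,5 delims==> " %%a in ('findstr /C:"<int name=" /C:"<imgdir name=" /C:"<string name=" %1') do (
   if not defined returnMap (
      rem Phase 1: look for the returnMap entry and print the map ID
      if %%a equ "returnMap" (
         echo [%%~b]
         set returnMap=true
         set "portal="
      )
   ) else (
      if not defined portal (
         rem Phase 2: wait for the portal section to open
         if %%a equ "portal" set portal=true & set /A i=0, skip=1
      ) else (
         if !skip! equ 1 (
            rem Skip the imgdir line that opens each portal entry
            set /A skip-=1
            set "line="
         ) else if %%a neq "tn" (
            set "line=!line! %%~b"
         ) else (
            rem tn closes the pn pt x y tm group, so store the finished line
            set /A i+=1, skip=1
            set "line[!i!]=!line:~1!"
            rem Heuristic: stop at the first portal whose tn is not empty
            if %%b neq "" goto endPortals
         )
      )
   )
)
:endPortals
echo total=%i%
for /L %%i in (1,1,%i%) do echo !line[%%i]!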

Output:

[10000]
total=4
sp 0 -389 183 999999999
sp 0 -416 185 999999999
sp 0 -450 183 999999999
out00 2 1080 541 20000
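To see why `tokens=3,5 delims==> ` picks out the name and the value, here is a rough Python emulation of that splitting. It is only an approximation of `for /F` for illustration, not part of the Batch solution, and `for_f_tokens` is a made-up helper name. It treats runs of `=`, `>` and space as a single separator and returns the requested 1-based tokens.

```python
import re


def for_f_tokens(line, delims="=> ", wanted=(3, 5)):
    """Split on runs of delimiter characters and return the wanted 1-based tokens."""
    pattern = "[" + re.escape(delims) + "]+"
    parts = [part for part in re.split(pattern, line) if part]
    # Tokens past the end of the line come back empty, as with an unset %%b.
    return tuple(parts[i - 1] if i <= len(parts) else "" for i in wanted)


# Token 3 is the quoted name, token 5 the quoted value.
print(for_f_tokens('      <int name="returnMap" value="10000" />'))
# An imgdir line has no value attribute, so the second token is empty.
print(for_f_tokens('   <imgdir name="portal">'))
```

One consequence of splitting on spaces is that a value containing a space (a hypothetical `value="Bgm 34"`, say) would be cut at the space, so this approach relies on the portal values never containing spaces or `=` characters.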


 