Processing XML files with bash scripting

I have an XML file that contains numerous <Episode></Episode> elements, each structured like this:

<Episode>
  <id>4195462</id>
  <Combined_episodenumber>8</Combined_episodenumber>
  <Combined_season>2</Combined_season>
  <DVD_chapter></DVD_chapter>
  <DVD_discid></DVD_discid>
  <DVD_episodenumber></DVD_episodenumber>
  <DVD_season></DVD_season>
  <Director>Jay Karas</Director>
  <EpImgFlag>2</EpImgFlag>
  <EpisodeName>Karl's Wedding</EpisodeName>
  <EpisodeNumber>8</EpisodeNumber>
  <FirstAired>2011-11-08</FirstAired>
  <GuestStars>Katee Sackhoff|Carla Gallo</GuestStars>
  <IMDB_ID></IMDB_ID>
  <Language>en</Language>
  <Overview>Karl Hevacheck, aka the Human Genius, gets married.</Overview>
  <ProductionCode>209</ProductionCode>
  <Rating>7.6</Rating>
  <RatingCount>20</RatingCount>
  <SeasonNumber>2</SeasonNumber>
  <Writer>Kevin Etten</Writer>
  <absolute_number></absolute_number>
  <filename>episodes/211751/4195462.jpg</filename>
  <lastupdated>1362547148</lastupdated>
  <seasonid>471254</seasonid>
  <seriesid>211751</seriesid>
</Episode>

I've figured out how to pull the value out of a single tag like so:

  value=$(grep -m 1 "<Rating>" path_to_file | sed 's/<.*>\(.*\)<\/.*>/\1/')

but I can't find a way to verify that I am looking at the correct episode, i.e. to check that this is the <Episode> element with <Combined_season>2</Combined_season> and <EpisodeNumber>8</EpisodeNumber> before saving the values of specific elements. I know this can somehow be done using a combination of sed and awk, but I can't seem to figure it out. Any help on how I can do this would be greatly appreciated.

Use a proper XML parser, not sed or awk. You can still call your XML parser from your bash script, just like you would with sed or awk. Using sed or awk is a bad idea because XML is a structured format, while sed and awk typically work with line-oriented text: an element may span several lines or share a line with others, and a line-based match has no notion of which <Episode> a given <Rating> belongs to. You will just give yourself a headache by using the wrong tool for the job. I suggest using a dedicated tool or a language such as php, python or perl (or any other language, not necessarily one starting with p) that has libraries for parsing XML.
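
As a rough illustration of that advice, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The function name episode_field, the SAMPLE data, and the <Data> root element are assumptions made for the example (the question only shows a single <Episode> element); adjust them to the real file.

```python
import sys
import xml.etree.ElementTree as ET

# Hypothetical sample data. The <Data> root element and the first episode
# are assumptions; the question only shows one <Episode> element.
SAMPLE = """<Data>
  <Episode>
    <Combined_season>2</Combined_season>
    <EpisodeNumber>7</EpisodeNumber>
    <Rating>7.1</Rating>
  </Episode>
  <Episode>
    <Combined_season>2</Combined_season>
    <EpisodeNumber>8</EpisodeNumber>
    <Rating>7.6</Rating>
  </Episode>
</Data>"""


def episode_field(root, season, episode, field):
    """Return the text of `field` for the matching episode, or None."""
    # iter() walks the whole tree, so the name of the root element
    # does not matter.
    for ep in root.iter("Episode"):
        if (ep.findtext("Combined_season") == str(season)
                and ep.findtext("EpisodeNumber") == str(episode)):
            return ep.findtext(field)
    return None


if __name__ == "__main__":
    # Usage: python3 episode_field.py FILE SEASON EPISODE FIELD
    if len(sys.argv) == 5:
        path, season, episode, field = sys.argv[1:5]
        value = episode_field(ET.parse(path).getroot(), season, episode, field)
    else:
        # No arguments given: fall back to the built-in sample.
        value = episode_field(ET.fromstring(SAMPLE), 2, 8, "Rating")
    print(value if value is not None else "")
```

Saved under a name of your choosing (episode_field.py here is hypothetical), it can be called from bash in the same way as the grep/sed pipeline, for example value=$(python3 episode_field.py path_to_file 2 8 Rating). Because the season and episode checks are made on the same <Episode> element that the value is read from, the result cannot come from a different episode.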
