
Parse this kind of file into a proper format

I want to parse this file into a list of hashes, like [{}, {}].

Each hash should hold one section enclosed by the STREAM tags.

Is there a library that can do this? Thanks.

[STREAM]
index=0
codec_name=h264
codec_long_name=H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
has_b_frames=2
sample_aspect_ratio=0:1
display_aspect_ratio=0:1
pix_fmt=yuv420p
level=11
timecode=N/A
id=N/A
r_frame_rate=15000/1001
avg_frame_rate=15000/1001
time_base=1/15000
start_pts=0
start_time=0.000000
duration_ts=76076
duration=5.071733
[/STREAM]
[STREAM]
index=1
codec_name=aac
codec_long_name=AAC (Advanced Audio Coding)
channels=1
[/STREAM]

Unless your file is in a standard format (e.g. CSV), there probably isn't a library that already knows how to parse it. However, a format this simple isn't hard to write a parse loop for. I saved your example to streams.txt and ran this:

import pprint

streams = []
stream = {}
with open("streams.txt", "r") as streams_file:
    for line in streams_file:
        # Remove whitespace and newline character
        line = line.strip()
        if line == "[STREAM]":
            # Start of new stream
            stream = {}
        elif line == "[/STREAM]":
            # End of stream, add it to streams list
            streams.append(stream)
        else:
            # Item of stream
            key_value_pair = line.split("=")
            # Skip blank lines and anything that is not exactly one key=value pair
            if len(key_value_pair) == 2:
                stream[key_value_pair[0]] = key_value_pair[1]

pprint.pprint(streams)

This produced the following output:

[{'avg_frame_rate': '15000/1001',
  'codec_long_name': 'H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10',
  'codec_name': 'h264',
  'display_aspect_ratio': '0:1',
  'duration': '5.071733',
  'duration_ts': '76076',
  'has_b_frames': '2',
  'id': 'N/A',
  'index': '0',
  'level': '11',
  'pix_fmt': 'yuv420p',
  'r_frame_rate': '15000/1001',
  'sample_aspect_ratio': '0:1',
  'start_pts': '0',
  'start_time': '0.000000',
  'time_base': '1/15000',
  'timecode': 'N/A'},
 {'channels': '1',
  'codec_long_name': 'AAC (Advanced Audio Coding)',
  'codec_name': 'aac',
  'index': '1'}]

You could run into problems if the file is malformed: a line that contains more than one equals sign, or one with no equals sign at all (so that key_value_pair ends up with only one entry), will not be parsed as intended. With your example, though, this ran fine.
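If those cases matter for your real files, here is a minimal sketch of a more defensive version. It splits on the first equals sign only, so a value may itself contain "=", and it skips blank lines, lines without "=", and anything outside a [STREAM] block. The function name parse_streams and the extra sample lines (tag=a=b and the line with no equals sign) are made up for illustration; they are not part of your file.

```python
def parse_streams(lines):
    """Parse [STREAM]...[/STREAM] blocks into a list of dicts.

    parse_streams is a hypothetical helper name, not a library function.
    """
    streams = []
    current = None  # dict for the block being read, or None when outside a block
    for raw_line in lines:
        line = raw_line.strip()
        if line == "[STREAM]":
            current = {}
        elif line == "[/STREAM]":
            if current is not None:
                streams.append(current)
            current = None
        elif current is not None and "=" in line:
            # partition splits on the first "=" only, so the value may contain "="
            key, _, value = line.partition("=")
            current[key] = value
        # Anything else (blank lines, lines without "=", text outside a block) is skipped
    return streams


# Illustrative input: "tag=a=b" and the line without "=" are invented test cases
sample = """\
[STREAM]
index=0
codec_name=h264

tag=a=b
not a key value line
[/STREAM]
[STREAM]
index=1
codec_name=aac
[/STREAM]
""".splitlines()

print(parse_streams(sample))
# [{'index': '0', 'codec_name': 'h264', 'tag': 'a=b'}, {'index': '1', 'codec_name': 'aac'}]
```

To use it on a file, pass the open file object directly, since iterating over a file yields its lines: open streams.txt in a with block and call parse_streams(streams_file). Silently skipping bad lines is a design choice; if you would rather find out about malformed input, raise a ValueError in those branches instead.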
