Extracting tab data with Regex

I'm writing a web script to scrape data from guitar tabs and convert it into MIDI notes.

This is what the extracted tab data looks like in string form:

[tab]e|------------------------------------------------------------------------|\r
\nB|----------3------5-------3----------------------------------------------|\r
\nG|---2-4---------------4----------4-2-0-----------------------------------|\r
\nD|---------------------------------------0---------------0-0--------------|\r
\nA|-----------------------------------------------------2------------------|\r
\nE|------------------------------------------------------------------------

I need to convert this visual representation of a guitar neck into a dataframe with 6 rows (e, B, G, D, A, E).

I'm looking for a way to use regex to extract the relevant tab information and convert it into a dataframe.

For example,

  1. Select all characters between "e|" and "|"
  2. Store these characters in a 1xn dataframe

I cannot figure out how to do this for the life of me, and it's very frustrating.

Please and thank you!

I would go for the following:

import re
import pandas

txt="""
[tab]e|------------------------------------------------------------------------|\r
\nB|----------3------5-------3----------------------------------------------|\r
\nG|---2-4---------------4----------4-2-0-----------------------------------|\r
\nD|---------------------------------------0---------------0-0--------------|\r
\nA|-----------------------------------------------------2------------------|\r
\nE|------------------------------------------------------------------------"""
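# txt is a normal (non-raw) string, so each \r above is a real carriage return.
# Match everything after "e|" up to the next "|" that is followed by that \r.
pattern=r"(?<=e\|)[^|]*(?=\|\r)"

# Only the high-e string matches: the search is case-sensitive and the low-E line has no closing "|".
matches=re.findall(pattern,txt)
# Split the captured string into characters to get a 1xn dataframe, one column per tab position.
dataframe=pandas.DataFrame([list(matches[0])])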
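
The pattern above captures only the high-e string. To get all six rows, one possible extension is sketched below. It assumes the tab is a single block in which each line starts with the string name followed by "|", and that every character after it is one tab position (so a two-digit fret such as 12 would span two columns). The names LINE_RE and tab_to_frame are made up for illustration.

```python
import re
import pandas as pd

# One match per string line: the string name, then "|", then everything
# up to the next "|" or line break. Works whether or not the line has a
# closing "|", which the low-E line in the question lacks.
LINE_RE = re.compile(r"([eBGDAE])\|([^|\r\n]*)")

def tab_to_frame(text):
    """Return a DataFrame with one row per guitar string, one column per tab position."""
    # Assumption: each string name appears once; a repeated name would overwrite the earlier row.
    rows = {name: list(cells) for name, cells in LINE_RE.findall(text)}
    # orient="index" turns the dict keys into row labels.
    return pd.DataFrame.from_dict(rows, orient="index")

# The tab from the question, joined with real CRLF line breaks.
tab = "\r\n".join([
    "[tab]e|------------------------------------------------------------------------|",
    "B|----------3------5-------3----------------------------------------------|",
    "G|---2-4---------------4----------4-2-0-----------------------------------|",
    "D|---------------------------------------0---------------0-0--------------|",
    "A|-----------------------------------------------------2------------------|",
    "E|------------------------------------------------------------------------",
])

df = tab_to_frame(tab)
print(df)
```

With this layout, df.loc["G"] returns the G string as a Series, and comparing it with "-" shows which positions hold a fret number.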
