Regex problems with special characters (e.g. #, /) in a chord dictionary (Python)

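I'm writing a chord dictionary, and for that I need to group the different types of chords into smaller groups.

However, I'm having trouble with chords that contain # (e.g. C#, C#m) and with slash-chord variants such as D7/F# and A/B, which I would like to classify under the other groups.

I believe the cause is some regex detail, which I confess I'm not very familiar with.

Here is the code I've written so far: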

import pandas as pd

triadeMaior = pd.DataFrame({'triadeMaior': ['C','C#','Db','D','D#','Eb','E','F','F#','Gb','G','G#','Ab','A','A#','Bb','B']
})

triadeMenor = pd.DataFrame({'triadeMenor': ['Cm','C#m','Dbm','Dm','D#m','Ebm','Em','Fm','F#m','Gbm','Gm','G#m','Abm','Am','A#m','Bbm','Bm']
})

triadeDiminuta = pd.DataFrame({'triadeDiminuta':['Cdim','C#dim','Dbdim', 'Ddim', 'D#dim', 'Ebdim', 'Edim', 'Fdim', 'F#dim', 'Gbdim','Gdim', 
'G#dim', 'Abdim', 'Adim', 'A#dim', 'Bbdim', 'Bdim']
})

triadeAumentada = pd.DataFrame({'triadeAumentada':['Caug','C#aug','Dbaug','Daug','D#aug','Ebaug','Eaug','Faug','F#aug','Gbaug','Gaug','G#aug','Abaug','Aaug','A#aug','Bbaug','Baug' ]
})

setima = pd.DataFrame({'setima':['C7','C#7','Db7','D7','D#7','Eb7','E7','F7','F#7','Gb7','G7','G#7','Ab7','A7','A#7','Bb7','B7']
})

setimaMenor = pd.DataFrame({'setimaMenor':['Cm7','C#m7','Dbm7','Dm7','D#m7','Ebm7','Em7','Fm7','F#m7','Gbm7','Gm7','G#m7','Abm7','Am7','A#m7','Bbm7','Bm7']
})

setimaMaior = pd.DataFrame({'setimaMaior':['Cmaj7', 'C#maj7', 'Dbmaj7', 'Dmaj7', 'D#maj7', 'Ebmaj7', 'Emaj7', 'Fmaj7', 'F#maj7','Gbmaj7','Gmaj7', 'G#maj7','Abmaj7','Amaj7','A#maj7','Bbmaj7','Bmaj7']
})

setimaMenorQuinta = pd.DataFrame({'setimaMenorQuinta':['Cm7b5','C#m7b5', 'Dbm7b5', 'Dm7b5', 'D#m7b5', 'Ebm7b5','Em7b5', 'Fm7b5', 'F#m7b5', 'Gbm7b5', 'Gm7b5', 'G#m7b5', 'Abm7b5', 'Am7b5', 'A#m7b5', 'Bbm7b5', 'Bm7b5']
})

sexta= pd.DataFrame({'sexta':['C6','C#6','Db6','D6','D#6','Eb6','E6','F6','F#6','Gb6','G6','G#6','Ab6','A6','A#6','Bb6','B6']
})

sextaMenor = pd.DataFrame({'sextaMenor': ['Cm6','C#m6','Dbm6','Dm6','D#m6','Ebm6','Em6','Fm6','F#m6','Gbm6','Gm6','G#m6','Abm6','Am6','A#m6',
'Bbm6','Bm6']
})
triadeMaior_pat = fr"\b({'|'.join(triadeMaior['triadeMaior'])})\b"
triadeMenor_pat = fr"\b({'|'.join(triadeMenor['triadeMenor'])})\b"
triadeDiminuta_pat = fr"\b({'|'.join(triadeDiminuta['triadeDiminuta'])})\b"
triadeAumentada_pat = fr"\b({'|'.join(triadeAumentada['triadeAumentada'])})\b"
setima_pat = fr"\b({'|'.join(setima['setima'])})\b"
setimaMenor_pat = fr"\b({'|'.join(setimaMenor['setimaMenor'])})\b"
setimaMaior_pat = fr"\b({'|'.join(setimaMaior['setimaMaior'])})\b"
setimaMenorQuinta_pat = fr"\b({'|'.join(setimaMenorQuinta['setimaMenorQuinta'])})\b"
sexta_pat = fr"\b({'|'.join(sexta['sexta'])})\b"
sextaMenor_pat = fr"\b({'|'.join(sextaMenor['sextaMenor'])})\b"
df['chordType'] = df['chords'].replace({triadeMaior_pat: 'triadeMaj',
                                   triadeMenor_pat: 'triadeMen',
                                   triadeDiminuta_pat: 'triadeDim',
                                   triadeAumentada_pat: 'triadeAug',
                                   setima_pat: 'setima',
                                   setimaMenor_pat: 'setimaMen', 
                                   setimaMaior_pat: 'setimaMaj',
                                   setimaMenorQuinta_pat : 'setimaMenQui',
                                   sexta_pat:'sexta',
                                   sextaMenor_pat: 'sextaMen',                               
                                   r'\b(?!triadeMaj|triadeMen|triadeDim|triadeAug|setima|setimaMen|setimaMen|setimaMaj|sexta|sextaMen\b)\w+': 'outros'}, 
                                  regex=True)

Here are some example results:

Row 1
chords:    C#, E7, Abm, Amaj7, E, Abm, C#m, E
chordType: triadeMaj#, setima, triadeMen, setimaMaj, triadeMaj, triadeMen, triadeMaj#outros, triadeMaj

Row 2
chords:    E, A7, G6, D/F#, F6, E, Em, D7/F#, Fmaj7, E, A7, G6, D7/F#, F6, Em, D, Dm7, E
chordType: triadeMaj, setima, sexta, triadeMaj/triadeMaj#, sexta, triadeMaj, triadeMen, setima/triadeMaj#, setimaMaj, triadeMaj, setima, sexta, setima/triadeMaj#, sexta, triadeMen, triadeMaj, setimaMen, triadeMaj

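As you can see, for chords containing # and /, the current code treats them as two parts instead of one.

Does anyone know how to fix this? Also, as I mentioned, I don't have much regex experience, so I don't know whether the code can be shortened and made more robust and cleaner.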
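
The splitting described above is consistent with how `\b` works: it only matches between a word character (`\w`) and a non-word character, and both `#` and `/` are non-word characters. The sketch below reproduces the effect with a cut-down pattern and then shows one possible workaround using lookarounds. The two-chord list and the names `broken`, `fixed_pat` and `fixed` are illustrative assumptions, not part of the original code.

```python
import re

text = "C#, E7, D7/F#, C#m"

# Simplified version of the question's pattern: \b sees '#' as a separator,
# so 'C' matches on its own and the '#' is left behind.
broken = re.sub(r"\b(C|C#)\b", "triadeMaj", text)
print(broken)

# r"\bC#\b" cannot match 'C#' when it is followed by ',': there is no word
# boundary between '#' and ',' because both are non-word characters.
print(re.search(r"\bC#\b", "C#, E7"))

# Possible workaround (an assumption, not the asker's code): replace \b with
# lookarounds that also treat '#' and '/' as part of a chord symbol.
chords = ["C", "C#"]
alternatives = "|".join(
    re.escape(c) for c in sorted(chords, key=len, reverse=True)
)
fixed_pat = rf"(?<![\w#/])(?:{alternatives})(?![\w#/])"
fixed = re.sub(fixed_pat, "triadeMaj", text)
print(fixed)
```

With these lookarounds a slash chord such as D7/F# is no longer matched piecewise, so it would need its own entry in the pattern list or be split on `/` before matching.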

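Actually, this might be cleaner without regex.

This example uses only a small part of your data, but you can fill in the chord_types dictionary with all of the mappings.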

import pandas as pd

chord_types = {'C': 'triadeMaj', 'C#': 'triadeMaj', 'C7': 'setima'} # Add as required
df = pd.DataFrame(['C, C7', 'C, C#'], columns=('chords',)) # Toy example

map_fn = lambda cs: ', '.join((chord_types.get(c, 'outros') for c in cs))
df['chordType'] = df['chords'].str.replace(' ', '').str.split(',').apply(map_fn)

print(df)

This gives:

  chords             chordType
0  C, C7     triadeMaj, setima
1  C, C#  triadeMaj, triadeMaj
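
To fill in the whole dictionary without typing every chord, the mapping could be generated from the root notes and suffixes that appear in the question. This is only a sketch: `ROOTS`, `SUFFIXES`, `classify` and `classify_row` are hypothetical names, and it assumes that a slash chord such as D7/F# should be classified by the part before the slash, which the question does not state explicitly.

```python
import pandas as pd

# Root notes taken from the question's lists.
ROOTS = ['C', 'C#', 'Db', 'D', 'D#', 'Eb', 'E', 'F', 'F#',
         'Gb', 'G', 'G#', 'Ab', 'A', 'A#', 'Bb', 'B']

# Chord suffix -> label, using the labels from the question.
SUFFIXES = {
    '': 'triadeMaj', 'm': 'triadeMen', 'dim': 'triadeDim', 'aug': 'triadeAug',
    '7': 'setima', 'm7': 'setimaMen', 'maj7': 'setimaMaj',
    'm7b5': 'setimaMenQui', '6': 'sexta', 'm6': 'sextaMen',
}

# Every root combined with every suffix gives the full lookup table.
chord_types = {root + suffix: label
               for root in ROOTS
               for suffix, label in SUFFIXES.items()}


def classify(chord):
    """Hypothetical helper: label one chord symbol, 'outros' if unknown."""
    # Assumption: a slash chord is classified by the chord before the
    # slash, ignoring the bass note after it.
    base = chord.strip().split('/')[0]
    return chord_types.get(base, 'outros')


def classify_row(chords):
    """Label every chord in a comma-separated string."""
    return ', '.join(classify(c) for c in chords.split(','))


df = pd.DataFrame({'chords': ['C#, E7, Abm, Amaj7, C#m',
                              'D/F#, D7/F#, Dm7, Xsus4']})
df['chordType'] = df['chords'].apply(classify_row)
print(df)
```

Because each chord is looked up as a whole string, # needs no special handling, and anything not in the table (here the made-up Xsus4) falls back to 'outros'.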
