Text with \n matching in regex and OpenRefine
I am trying to filter text that contains new lines in OpenRefine.
The input is:
Them Spanish girls love me like I'm Aventura
I'm the man, y'all don't get it, do ya?
Type of money, everybody acting like they knew ya
Go Uptown, New York City, bitch
Them Spanish girls love me like I'm Aventura
Tell Uncle Luke I'm out in Miami, too
Them Spanish girls love me like I'm Aventura
The expected result would be:
Type of money, everybody acting like they knew ya
Go Uptown, New York City, bitch
Them Spanish girls love me like I'm Aventura
I am trying to get the line containing the keyword, plus the lines before and after it.
My standard regex looks like this:
/((.*\n){2})^.*\b(New York)\b.*((.*\n){3})/m
But this does not work in OpenRefine. I tried the following, but it only returns "null":
value.match(/.*(\\New York)/.*)
Does anyone know how I can do this? I really need to keep the line breaks, so I can't do replace(/\n/,'') before the match.
The brand-new OpenRefine 3 has a find() function, which is friendlier than match().
I think this regex will do the trick:
value.find(/(.*\n){1}.+New York.+(\n.*){1}/).join('\n')
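To check the pattern outside OpenRefine, here is a minimal Python sketch that runs the same regex against the input from the question. It assumes Python's re module treats this particular pattern the same way as the Java regex engine behind GREL, and the variable names (sample, pattern, result) are only illustrative.

```python
import re

# Input text from the question, one lyric line per list element.
sample = "\n".join([
    "Them Spanish girls love me like I'm Aventura",
    "I'm the man, y'all don't get it, do ya?",
    "Type of money, everybody acting like they knew ya",
    "Go Uptown, New York City, bitch",
    "Them Spanish girls love me like I'm Aventura",
    "Tell Uncle Luke I'm out in Miami, too",
    "Them Spanish girls love me like I'm Aventura",
])

# Same pattern as the GREL expression:
# one full line before, the line containing the keyword, one line after.
pattern = re.compile(r"(.*\n){1}.+New York.+(\n.*){1}")

# group(0) is the whole match; the capture groups themselves are not needed.
matches = [m.group(0) for m in pattern.finditer(sample)]
result = "\n".join(matches)
print(result)
```

Because "." does not match a newline by default, each ".*" or ".+" stays within a single line, which is what limits the match to exactly one line on either side of the keyword.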
If for some reason you prefer to use OpenRefine 2.8, Python / Jython offers an alternative:
import re
matches = re.findall(r".+?\n.+New York.+\n.+", value)
return "\n".join(matches)
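If you want to avoid RegEx entirely and simply read the text and write out the matching line together with the lines before and after it, you can put the text in cell A1 in Excel and run this: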
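Public Sub TestMe()

    ' Text to search, read from cell A1
    Dim inputString As String
    inputString = Range("A1")

    Dim lookForWord As String
    lookForWord = "New York"

    ' One array element per line of the cell text
    Dim inputArr As Variant
    inputArr = Split(inputString, vbLf)

    Dim line As Variant
    Dim previousLine As String
    Dim foundWord As Boolean
    Dim linesAfter As Long: linesAfter = 1   ' lines to keep after the match

    For Each line In inputArr
        If InStr(1, line, lookForWord) Then
            ' Keyword line: append it to the line remembered before it
            previousLine = previousLine & vbCrLf & line
            foundWord = True
        Else
            If foundWord And linesAfter Then
                ' Still collecting lines after the match
                previousLine = previousLine & vbCrLf & line
                linesAfter = linesAfter - 1
            ElseIf linesAfter Then
                ' No match yet: remember only the most recent line
                previousLine = line
            End If
        End If
    Next line

    ' Print only when the keyword was found ("Not linesAfter" is bitwise on a Long)
    If foundWord Then Debug.Print previousLine

End Sub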
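Split() parses the text into an array with one element per line, and the linesAfter variable tells you how many lines should be shown after the word.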
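The same no-regex idea can also be sketched in Python: split the text into lines, find the lines that contain the keyword, and slice out their neighbours. This is only an illustration, not the answer's own code; the function name lines_around and its before/after parameters are hypothetical.

```python
def lines_around(text, keyword, before=1, after=1):
    """Return one block per line containing `keyword`,
    each block including `before` lines above and `after` lines below."""
    lines = text.split("\n")
    blocks = []
    for i, line in enumerate(lines):
        if keyword in line:
            # Clamp the window so it never runs past either end of the text.
            start = max(0, i - before)
            end = min(len(lines), i + after + 1)
            blocks.append("\n".join(lines[start:end]))
    return blocks


# Input text from the question.
sample = "\n".join([
    "Them Spanish girls love me like I'm Aventura",
    "I'm the man, y'all don't get it, do ya?",
    "Type of money, everybody acting like they knew ya",
    "Go Uptown, New York City, bitch",
    "Them Spanish girls love me like I'm Aventura",
    "Tell Uncle Luke I'm out in Miami, too",
    "Them Spanish girls love me like I'm Aventura",
])

for block in lines_around(sample, "New York"):
    print(block)
```

Unlike the VBA loop, this version handles several keyword lines in one text and lets the number of lines before the match be configured as well.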