
Find the index of the first matching string from a string array or list in C#

I have a string in the form

var dummyString = $@"SIGNED APPLICATION AND AFFIDAVIT REQUIRED  LOCATION:  BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND";

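What I am trying to do is extract the location/address from this string. I can easily find the index of LOCATION: but cannot think of an efficient way to find the index at which I should terminate the string. The simplest option is to loop through the list and find the index of a state code, but that is not a very efficient way of doing it.

What I thought would solve this problem is to use a list of US state codes and then find the index of the first match of any state code after the index of the LOCATION: substring, preceded by a whitespace, so I can find the complete state code and its index.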

// A List<string> cannot be declared const in C#; static readonly is the closest equivalent.
public static readonly List<string> USStateCodes = new List<string> { "AL", "AK", "AS", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FM", "FL", "GA", "GU", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MH", "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "MP", "OH", "OK", "OR", "PW", "PA", "PR", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VI", "VA", "WA", "WV", "WI", "WY" };

Any ideas on how to proceed from here?

The output I want is:

BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ

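The problem stated here is part of a larger piece of logic in which I use a regex to find the index of the zip code (5 digits) as the terminator, but in some cases the zip code may be missing from the address (user error). I must still be able to extract the address.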

You can use

var dummyString = @"SIGNED APPLICATION AND AFFIDAVIT REQUIRED  LOCATION:  BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND";
var USStateCodes = new List<string> { "AL", "AK", "AS", "AZ", "AR", "CA", "CO", "CT", "DE", "DC", "FM", "FL", "GA", "GU", "HI", "ID", "IL", "IN", "IA", "KS", "KY", "LA", "ME", "MH", "MD", "MA", "MI", "MN", "MS", "MO", "MT", "NE", "NV", "NH", "NJ", "NM", "NY", "NC", "ND", "MP", "OH", "OK", "OR", "PW", "PA", "PR", "RI", "SC", "SD", "TN", "TX", "UT", "VT", "VI", "VA", "WA", "WV", "WI", "WY" };
var result = Regex.Match(dummyString, $@"LOCATION:\s*(.*?\b(?:{string.Join("|", USStateCodes)}))\b")?.Groups[1].Value;

See the C# demo. The result output: BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ
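For readers who want to run the snippet as a whole program, here is a minimal sketch, not the answerer's exact code. The class name LocationExtractor and the method TryExtractLocation are hypothetical, and the state list is shortened for readability. Regex.Match never returns null, so the ?. operator in the one-liner cannot detect a failed match; this sketch checks Match.Success instead. Regex.Escape is only a precaution in case an entry ever contains a regex metacharacter.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class LocationExtractor
{
    // Shortened list for illustration; the full list from the question works the same way.
    private static readonly List<string> USStateCodes =
        new List<string> { "NY", "NJ", "PA", "CA", "TX" };

    // Hypothetical helper: captures the text after "LOCATION:" up to and including
    // the first whole-word state code. Returns false when there is no match.
    public static bool TryExtractLocation(string input, out string location)
    {
        // Escape each entry so it is always treated as literal text in the pattern.
        string alternation = string.Join("|", USStateCodes.Select(Regex.Escape));
        Match match = Regex.Match(input, $@"LOCATION:\s*(.*?\b(?:{alternation}))\b");

        location = match.Success ? match.Groups[1].Value : string.Empty;
        return match.Success;
    }

    public static void Main()
    {
        var dummyString = @"SIGNED APPLICATION AND AFFIDAVIT REQUIRED  LOCATION:  BLK 99, LOT 9 AND BLK 100 LOT 9, 10, 11, 12 & 13 RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND";

        if (TryExtractLocation(dummyString, out string location))
            Console.WriteLine(location);
        else
            Console.WriteLine("(no location found)");
    }
}
```

Returning a bool keeps the "no state code found" case explicit, which matters here because the question says the input may be malformed.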

The resulting pattern is

LOCATION:\s*(.*?\b(?:AL|AK|AS|AZ|AR|CA|CO|CT|DE|DC|FM|FL|GA|GU|HI|ID|IL|IN|IA|KS|KY|LA|ME|MH|MD|MA|MI|MN|MS|MO|MT|NE|NV|NH|NJ|NM|NY|NC|ND|MP|OH|OK|OR|PW|PA|PR|RI|SC|SD|TN|TX|UT|VT|VI|VA|WA|WV|WI|WY))\b

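See the regex demo.

Details

  • LOCATION: - a fixed starting string
  • \s* - 0+ whitespace characters
  • (.*?\b(?:{string.Join("|", USStateCodes)})) - Group 1 (the result is captured in this group):
    • .*? - any 0 or more characters other than a newline (use RegexOptions.Singleline to match newlines as well), as few as possible
    • \b - a word boundary
    • (?:{string.Join("|", USStateCodes)}) - an alternation group built from the state codes (like (?:AL|AK|AS|...|WY)) that matches any one of the alternatives
  • \b - a word boundary.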
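To see why the two \b word boundaries matter, the sketch below runs the same pattern with and without them on a shortened version of the sample string. The class BoundaryDemo and its Capture method are hypothetical names used only for this illustration, and the alternation is cut down to two codes that both appear in the question's list.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundaryDemo
{
    // Shortened alternation for illustration; ND and NJ are both in the question's list.
    private const string Codes = "ND|NJ";

    // Returns Group 1 of the pattern, or an empty string when there is no match.
    public static string Capture(string input, bool useWordBoundaries)
    {
        string b = useWordBoundaries ? @"\b" : "";
        Match m = Regex.Match(input, $@"LOCATION:\s*(.*?{b}(?:{Codes})){b}");
        return m.Success ? m.Groups[1].Value : "";
    }

    public static void Main()
    {
        const string input =
            "LOCATION:  BLK 99, LOT 9 AND BLK 100 RT 38 EAST HAINESPORT, NJ  BASED ON:  VACANT LAND";

        // With boundaries, only the standalone word NJ can end the match.
        Console.WriteLine(Capture(input, useWordBoundaries: true));
        // prints: BLK 99, LOT 9 AND BLK 100 RT 38 EAST HAINESPORT, NJ

        // Without boundaries, the lazy .*? stops at the "ND" inside "AND".
        Console.WriteLine(Capture(input, useWordBoundaries: false));
        // prints: BLK 99, LOT 9 AND
    }
}
```

Note that the boundaries only rule out matches inside longer words. Because .*? is lazy, a standalone word in the address that happens to equal a state code (for example IN, OR, or LA) would still end the capture early.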

