Identifying strings and manipulating them correctly
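As a preface: I am pulling records from a database. The CaseNumber column holds a unique identifier. However, multiple cases related to ONE incident will have very similar case numbers, in which the last two digits are the next number in sequence. Example: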
TR42X2330789
TR42X2330790
TR42X2330791
TR51C0613938
TR51C0613939
TR51C0613940
TR51C0613941
TR51C0613942
TR52X4224749
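As you can see, these records would have to be split into three groups. My current function is really messy, and I have not accounted for the scenario where one group of case numbers is immediately followed by another group of case numbers. I was wondering if anyone had suggestions on how to tackle this. I was thinking about putting all the case numbers in an array.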
int i = 1;
string firstCaseNumber = string.Empty;
string previousCaseNumber = string.Empty;
if (i == 1)
{
    firstCaseNumber = texasHarrisPublicRecordInfo.CaseNumber;
    i++;
}
else if (i == 2)
{
    string previousCaseNumberCode = firstCaseNumber.Remove(firstCaseNumber.Length - 3);
    int previousCaseNumberTwoCharacters = Int32.Parse(firstCaseNumber.Substring(Math.Max(0, firstCaseNumber.Length - 2)));
    string currentCaseNumberCode = texasHarrisPublicRecordInfo.CaseNumber.Remove(texasHarrisPublicRecordInfo.CaseNumber.Length - 3);
    int currentCaselastTwoCharacters = Int32.Parse(texasHarrisPublicRecordInfo.CaseNumber.Substring(Math.Max(0, texasHarrisPublicRecordInfo.CaseNumber.Length - 2)));
    int numberPlusOne = previousCaseNumberTwoCharacters + 1;
    if (previousCaseNumberCode == currentCaseNumberCode && numberPlusOne == currentCaselastTwoCharacters)
    {
        //Group offense here
        i++;
        needNewCriminalRecord = false;
    }
    else
    {
        //NewGRoup here
    }
    previousCaseNumber = texasHarrisPublicRecordInfo.CaseNumber;
    i++;
}
else
{
    string beforeCaseNumberCode = previousCaseNumber.Remove(previousCaseNumber.Length - 3);
    int beforeCaselastTwoCharacters = Int32.Parse(previousCaseNumber.Substring(Math.Max(0, previousCaseNumber.Length - 2)));
    string currentCaseNumberCode = texasHarrisPublicRecordInfo.CaseNumber.Remove(texasHarrisPublicRecordInfo.CaseNumber.Length - 3);
    int currentCaselastTwoCharacters = Int32.Parse(texasHarrisPublicRecordInfo.CaseNumber.Substring(Math.Max(0, texasHarrisPublicRecordInfo.CaseNumber.Length - 2)));
    int numberPlusOne = beforeCaselastTwoCharacters + 1;
    if (beforeCaseNumberCode == currentCaseNumberCode && numberPlusOne == currentCaselastTwoCharacters)
    {
        i++;
        needNewCriminalRecord = false;
    }
    else
    {
        needNewCriminalRecord = true;
    }
}
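If you are not concerned about performance, you can use the LINQ .GroupBy() and .ToDictionary() methods and build a dictionary of lists. Something like the following: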
string[] values =
{
    "TR42X2330789",
    "TR42X2330790",
    "TR42X2330791",
    "TR51C0613938",
    "TR51C0613939",
    "TR51C0613940",
    "TR51C0613941",
    "TR51C0613942",
    "TR52X4224749"
};
Dictionary<string, List<string>> groupedValues = values
    .GroupBy(
        v => new string(v.Take(9).ToArray()), // key: first 9 chars
        v => v)                               // value: the case number itself
    .ToDictionary(g => g.Key, g => g.ToList());
foreach (var item in groupedValues)
{
    Console.WriteLine(item.Key + " " + item.Value.Count);
}
Output:
TR42X2330 3
TR51C0613 5
TR52X4224 1
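One caveat with a fixed-length prefix key: two consecutive numbers that straddle a boundary in the last three digits (for example ...0999 and ...1000) end up under different keys, while non-consecutive numbers that share the prefix end up under the same key. A minimal sketch of an alternative follows. It assumes every case number is a five-character code followed by a seven-digit sequence number, which is inferred from the sample data only, and that there are no duplicate case numbers. It groups by the code plus the difference between the sequence number and the item's position in the sorted list; that difference stays constant within a run of consecutive numbers. `CaseGrouping` and `GroupRuns` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CaseGrouping
{
    // Assumed layout (from the sample data): 5-character code + 7-digit sequence.
    private const int CodeLength = 5;

    public static List<List<string>> GroupRuns(IEnumerable<string> caseNumbers)
    {
        return caseNumbers
            // Ordinal sort keeps equal codes together and orders the
            // fixed-width sequence numbers numerically.
            .OrderBy(c => c, StringComparer.Ordinal)
            .Select((c, index) => new
            {
                CaseNumber = c,
                Code = c.Substring(0, CodeLength),
                // Sequence minus position is constant within a consecutive run.
                Offset = long.Parse(c.Substring(CodeLength)) - index
            })
            .GroupBy(x => new { x.Code, x.Offset })
            .Select(g => g.Select(x => x.CaseNumber).ToList())
            .ToList();
    }
}
```

Calling `CaseGrouping.GroupRuns(values)` on the nine sample values should give three lists, one per incident, without relying on how many trailing digits the numbers happen to share.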
I would create a general-purpose extension method:
// Must be declared inside a static class to work as an extension method.
static IEnumerable<IEnumerable<T>> GroupByConsecutiveKey<T, TKey>(
    this IEnumerable<T> list,
    Func<T, TKey> keySelector,
    Func<TKey, TKey, bool> areConsecutive)
{
    using (var enumerator = list.GetEnumerator())
    {
        TKey previousKey = default(TKey);
        var currentGroup = new List<T>();
        while (enumerator.MoveNext())
        {
            TKey currentKey = keySelector(enumerator.Current);
            // Close the current group when the run of consecutive keys is broken.
            if (!areConsecutive(previousKey, currentKey) && currentGroup.Count > 0)
            {
                yield return currentGroup;
                currentGroup = new List<T>();
            }
            currentGroup.Add(enumerator.Current);
            previousKey = currentKey;
        }
        // Emit the last group, if any.
        if (currentGroup.Count != 0)
        {
            yield return currentGroup;
        }
    }
}
Now you would use it like this:
var grouped = data.GroupByConsecutiveKey(item => item, (k1, k2) => Consecutive(k1, k2));
A quick hack for the areConsecutive predicate could be:
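public static bool Consecutive(string s1, string s2)
{
    if (s1 == null || s2 == null)
        return false;
    // Everything except the last two characters must match.
    if (s1.Substring(0, s1.Length - 2) != s2.Substring(0, s2.Length - 2))
        return false;
    var end1 = s1.Substring(s1.Length - 2, 2);
    var end2 = s2.Substring(s2.Length - 2, 2);
    // Neither ends in '0': the first character must match and the last characters must differ by one.
    if (end1[1] != '0' && end2[1] != '0')
        return end1[0] == end2[0] && Math.Abs((int)end1[1] - (int)end2[1]) == 1;
    // One value ends in '0' (e.g. 89 and 90): compare the two-digit numbers.
    return Math.Abs(int.Parse(end1) - int.Parse(end2)) == 1;
}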
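Note that I am assuming the key can take any form. If the alphanumeric code always follows the same pattern, you can make this method a lot prettier or simply use a regular expression.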
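If the codes do follow one fixed pattern, the regular-expression route could look roughly like the sketch below. It assumes the pattern seen in the sample data (two letters, two digits, one letter, then seven digits), which the thread does not confirm. `CaseNumbers`, `CaseNumberPattern` and `AreConsecutive` are hypothetical names, and unlike the quick hack above it accepts ascending order only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaseNumbers
{
    // Assumed format, inferred from the sample data only:
    // 2 letters, 2 digits, 1 letter (the code), then a 7-digit sequence.
    private static readonly Regex CaseNumberPattern =
        new Regex(@"^(?<code>[A-Z]{2}[0-9]{2}[A-Z])(?<seq>[0-9]{7})$");

    // True when both values share the same code and 'next' directly follows 'previous'.
    public static bool AreConsecutive(string previous, string next)
    {
        if (previous == null || next == null)
            return false;

        Match a = CaseNumberPattern.Match(previous);
        Match b = CaseNumberPattern.Match(next);
        if (!a.Success || !b.Success)
            return false;

        return a.Groups["code"].Value == b.Groups["code"].Value
            && int.Parse(b.Groups["seq"].Value) - int.Parse(a.Groups["seq"].Value) == 1;
    }
}
```

Because it has the same shape as the predicate expected by GroupByConsecutiveKey, it can be passed as `(k1, k2) => CaseNumbers.AreConsecutive(k1, k2)`. Comparing the whole seven-digit sequence also covers rollovers such as 0999 to 1000, which a check on the last two characters alone would miss.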