Formatting the string by rewriting the delimiters
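I'm working with some legacy data that stores each record in one big string (one string = one record).
Within each string, delimiters separate the fields, and each delimiter also carries a meaning, for example: \vToyota\cBlue\cRed\cWhite\s200mph\oAndrew\oJohn
\v means vehicle, \c is color, \s is speed, \o is owner, and so on.
My task is to reformat the data so that when a feature has multiple fields, they are rewritten with a counter, for example: \vToyota\cBlue\c2Red\c3White\s200mph\oAndrew\o2John
Edit: OK, @DarrenYoung's suggestion works! I now have the set vToyota cBlue cRed cWhite s200mph oAndrew oJohn. I tested the same approach on other data and it works there too. Now I just need help finding a way to rewrite the first letter of each string every time it repeats.
Thanks!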
I found this an interesting little puzzle to see what I could do with LINQ. The following seems to work:
private string FixIt(string foo)
{
    var newFoo = "\\" + string.Join("\\",
        foo.Split(new[] {'\\'}, StringSplitOptions.RemoveEmptyEntries)
           .GroupBy(s => s[0],
                    (c, g) =>
                    {
                        // cnt is per group: the first item is kept as-is,
                        // later items get the running count after the tag letter
                        var cnt = 0;
                        return g.Select(x => cnt++ == 0
                            ? x
                            : x[0] + cnt.ToString() + x.Substring(1));
                    })
           .SelectMany(g => g));
    return newFoo;
}
Input: \vToyota\cBlue\cRed\cWhite\s200mph\oAndrew\oJohn
Output: \vToyota\cBlue\c2Red\c3White\s200mph\oAndrew\o2John
SelectMany is a handy thing to remember.
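One thing to keep in mind with the LINQ version: GroupBy yields groups in the order each key first appears, so fields that share a tag but are not adjacent in the input would be pulled together in the output. If the original field order matters, a plain loop with a per-tag counter is one alternative. The following is only a minimal sketch under that assumption; the class name DelimiterRewriter and the method Rewrite are made up for illustration and are not part of the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class DelimiterRewriter
{
    // Hypothetical helper: numbers repeated one-letter tags in place,
    // leaving every field at its original position in the record.
    public static string Rewrite(string record)
    {
        var counts = new Dictionary<char, int>();
        var result = new StringBuilder();

        foreach (var field in record.Split(new[] { '\\' }, StringSplitOptions.RemoveEmptyEntries))
        {
            char tag = field[0];
            int seen;
            counts.TryGetValue(tag, out seen);   // seen is 0 when the tag is new
            seen++;
            counts[tag] = seen;

            result.Append('\\').Append(tag);
            if (seen > 1) result.Append(seen);   // c2, c3, ... from the second occurrence on
            result.Append(field, 1, field.Length - 1);
        }
        return result.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Rewrite(@"\vToyota\cBlue\cRed\cWhite\s200mph\oAndrew\oJohn"));
        // prints \vToyota\cBlue\c2Red\c3White\s200mph\oAndrew\o2John
    }
}
```

For input whose repeated tags are already adjacent, as in the sample record, this should produce the same string as FixIt; the two only differ when equal tags are interleaved with others.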
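Because I found this problem interesting, I wrote a program that does what I consider a reasonable solution. I started with a few basic assumptions:
In my solution I also created separate classes for readability, so the code can be looked over quickly. I wouldn't call it production-ready, but the following should go a long way toward solving the problem. Good luck!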
// Just paste the rest of this into a new console application to see it work!
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public class Program
{
    // The three lists are parallel: the same position describes the same token.
    private static readonly List<string> TOKENS = new List<string> {@"\v", @"\c", @"\o", @"\s"};
    private static readonly List<string> DISPLAY = new List<string> {"Vehicle", "Color", "Owner", "Speed"};
    private static readonly List<bool> ALLOW_MULTIPLE = new List<bool> {false, true, true, false};

    private class RecordEntry
    {
        public string Value { get; set; }
        public int Index { get; set; }
        public string DataType { get; set; }
        public override string ToString() { return DataType + ": " + Value; }
    }

    private class ParsedRecord
    {
        private List<RecordEntry> entries = new List<RecordEntry>();
        public List<RecordEntry> Entries { get { return entries; } }
    }

    public static void Main(string[] args)
    {
        // sample records (the second contains \m, which is not a recognized token;
        // FindNextToken only looks for known tokens, so "\mWhite" stays attached
        // to the value that precedes it)
        var records = new[] {@"\vToyota\cBlue\c2Red\c3White\s200mph\oAndrew\o2John",
                             @"\vChevy\c2Orange\cGreen\s50mph\o2Bob\mWhite"};
        var parsedData = new List<ParsedRecord>();
        foreach (var record in records)
        {
            // character by character parsing
            var currentParseRecord = new ParsedRecord();
            parsedData.Add(currentParseRecord);
            var currentRecord = new StringBuilder(record);
            var currentToken = new StringBuilder();
            for (var parseIdx = 0; parseIdx < currentRecord.Length; parseIdx++)
            {
                currentToken.Append(currentRecord[parseIdx]);
                var recordIdx = 0;
                var index = TOKENS.IndexOf(currentToken.ToString());
                if (index < 0) continue;

                // current char is used up now (was part of the token)
                parseIdx++;
                if (ALLOW_MULTIPLE[index] && currentRecord.Length > parseIdx + 1)
                {
                    // assuming less than 10 records max - if more, would need to pull multiple numeric values here
                    if (!Int32.TryParse(currentRecord[parseIdx] + "", out recordIdx)) recordIdx = 0;
                    else parseIdx++;
                }

                // find the next token or end of string
                int valueLength = FindNextToken(currentRecord, parseIdx) - parseIdx;
                if (valueLength <= 0) valueLength = currentRecord.Length - parseIdx;
                currentParseRecord.Entries.Add(new RecordEntry
                {
                    DataType = DISPLAY[index],
                    Index = recordIdx,
                    Value = currentRecord.ToString(parseIdx, valueLength)
                });

                // skip past the value; the loop increment then lands on the next token
                parseIdx += valueLength - 1;
                currentToken.Clear();
            }
        }

        // print what was parsed, one record at a time
        foreach (var parsed in parsedData)
        {
            foreach (var entry in parsed.Entries) Console.WriteLine(entry);
            Console.WriteLine();
        }
    }

    // Returns the position of the next recognized token at or after currentIndex, or -1 if there is none.
    private static int FindNextToken(StringBuilder value, int currentIndex)
    {
        for (var searchIdx = currentIndex; searchIdx < value.Length; searchIdx++)
        {
            if (TOKENS.Any(checkToken => value.Length > searchIdx + checkToken.Length &&
                                         value.ToString(searchIdx, checkToken.Length) == checkToken))
            {
                return searchIdx;
            }
        }
        return -1;
    }
}
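The parser's comment notes that it assumes fewer than 10 repeats per token, because it reads only a single digit after the tag. One way that limit might be lifted is to read a whole run of digits instead. The sketch below is an assumption-laden illustration, not part of the answer: IndexParsing and ReadIndex are hypothetical names, overflow is not handled, and it shares the original's ambiguity when a value itself begins with a digit.

```csharp
using System;

public static class IndexParsing
{
    // Hypothetical helper: reads consecutive decimal digits starting at 'start'.
    // Returns the parsed number (0 when no digit is present) and sets 'next'
    // to the position of the first character after the digits.
    public static int ReadIndex(string record, int start, out int next)
    {
        int value = 0;
        next = start;
        while (next < record.Length && record[next] >= '0' && record[next] <= '9')
        {
            value = value * 10 + (record[next] - '0');
            next++;
        }
        return value;
    }

    public static void Main()
    {
        var record = @"\c12Teal";
        int next;
        int index = ReadIndex(record, 2, out next);   // position 2 is just past the \c tag
        Console.WriteLine(index + " " + record.Substring(next));
        // prints 12 Teal
    }
}
```

In the parser above, a helper like this could stand in for the single-character Int32.TryParse call, with parseIdx set to the returned next position.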