
Get certain value in the string from text file

I have this in my text file:


000000000:Carrots:$1.99:214:03/11/2015:03/11/2016:$0.99
000000001:Bananas:$1.99:872:03/11/2015:03/11/2016:$0.99
000000002:Chocolate:$2.99:083:03/11/2015:03/11/2016:$1.99
000000003:Spaghetti:$3.99:376:03/11/2015:03/11/2016:$2.99
000000004:Tomato Sauce:$1.99:437:03/11/2015:03/11/2016:$0.99
000000005:Lettuce:$0.99:279:03/11/2015:03/11/2016:$0.99
000000006:Orange Juice:$2.99:398:03/11/2015:03/11/2016:$1.99
000000007:Potatoes:$2.99:792:03/11/2015:03/11/2016:$1.99
000000008:Celery:$0.99:973:03/11/2015:03/11/2016:$0.99
000000009:Onions:$1.99:763:03/11/2015:03/11/2016:$0.99
000000010:Chicken:$8.99:345:03/11/2015:03/11/2016:$7.99

000000010:Chicken:$8.99:345:03/11/2015:03/11/2016:$7.99

I need to get the "quantity" value of each line, i.e. the fourth colon-separated field (345 in the line above).

Edit: I also want to compare the values I get and give an error if the quantity is low.

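Here is a solution with minimal memory consumption for large input data. Note that it does not handle incorrect data in the "quantity" column; to do that, just replace the int.Parse call.

Here are a few extension methods that process the file data with LINQ expressions: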

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;

    internal static class MyExtensions
    {
        /// <summary>Lazily reads the stream line by line, so only one line is held in memory at a time.</summary>
        /// <exception cref="OutOfMemoryException">There is insufficient memory to allocate a buffer for the returned string. </exception>
        /// <exception cref="IOException">An I/O error occurs. </exception>
        /// <exception cref="ArgumentException"><paramref name="stream" /> does not support reading. </exception>
        /// <exception cref="ArgumentNullException"><paramref name="stream" /> is null. </exception>
        public static IEnumerable<string> EnumerateLines(this Stream stream)
        {
            using (var reader = new StreamReader(stream))
            {
                string line;
                while ((line = reader.ReadLine()) != null)
                {
                    yield return line;
                }
            }
        }

        /// <summary>Splits a line into its colon-separated fields.</summary>
        /// <exception cref="ArgumentNullException"><paramref name="line"/> is <see langword="null" />.</exception>
        public static IEnumerable<string> ChunkLine(this string line)
        {
            if (line == null) throw new ArgumentNullException("line");
            return line.Split(':');
        }

        /// <summary>Returns the field at the given zero-based index, or null if the line has too few fields.</summary>
        /// <exception cref="ArgumentNullException"><paramref name="chunkedData"/> is <see langword="null" />.</exception>
        /// <exception cref="ArgumentException">Index should be not negative value</exception>
        public static string GetColumnData(this IEnumerable<string> chunkedData, int columnIndex)
        {
            if (chunkedData == null) throw new ArgumentNullException("chunkedData");
            if (columnIndex < 0) throw new ArgumentException("Column index should be >= 0", "columnIndex");
            return chunkedData.Skip(columnIndex).FirstOrDefault();
        }
    }

Here is a usage example:

    private void button1_Click(object sender, EventArgs e)
    {
        var values = EnumerateQuantityValues("largefile.txt");
        // do whatever you need
    }

    private IEnumerable<int> EnumerateQuantityValues(string fileName)
    {
        const int columnIndex = 3;
        using (var stream = File.OpenRead(fileName))
        {
            IEnumerable<int> enumerable = stream
                .EnumerateLines()
                .Select(x => x.ChunkLine().GetColumnData(columnIndex))
                .Select(int.Parse);

            foreach (var value in enumerable)
            {
                yield return value;
            }                
        }
    }
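
As noted above, int.Parse throws on a malformed quantity. One possible replacement is sketched below, assuming that lines with too few fields or a non-numeric quantity should simply be skipped. The names QuantityParsing and ParseQuantities are hypothetical and not part of the answer.

```csharp
using System;
using System.Collections.Generic;

public static class QuantityParsing
{
    // Hypothetical helper: yields the quantity (fourth field) of each line,
    // skipping lines that are too short or whose quantity is not an integer.
    public static IEnumerable<int> ParseQuantities(IEnumerable<string> lines)
    {
        const int columnIndex = 3;
        foreach (var line in lines)
        {
            var fields = line.Split(':');
            if (fields.Length <= columnIndex) continue; // too few fields

            int quantity;
            if (int.TryParse(fields[columnIndex], out quantity))
            {
                yield return quantity;
            }
        }
    }
}
```

Because the method is an iterator, it can be fed from a lazy source such as File.ReadLines("largefile.txt") without loading the whole file into memory.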

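Just consider whether you can store all of these lines in a string array or a list.

You can apply the following code to get the collection of quantities as an IEnumerable<string>.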

var quantity = arr.Select(c =>
{
    var temp = c.Split('$');
    if (temp.Length > 1)
    {
        temp = temp[1].Split(':');
        if (temp.Length > 1)
        {
            return temp[1];
        }
    }
    return null;
}).Where(c => c != null);

Update

Check the fiddle: https://dotnetfiddle.net/HqKdeI

You just need to split the string:

string data = @"000000000:Carrots:$1.99:214:03/11/2015:03/11/2016:$0.99
000000001:Bananas:$1.99:872:03/11/2015:03/11/2016:$0.99
000000002:Chocolate:$2.99:083:03/11/2015:03/11/2016:$1.99
000000003:Spaghetti:$3.99:376:03/11/2015:03/11/2016:$2.99
000000004:Tomato Sauce:$1.99:437:03/11/2015:03/11/2016:$0.99
000000005:Lettuce:$0.99:279:03/11/2015:03/11/2016:$0.99
000000006:Orange Juice:$2.99:398:03/11/2015:03/11/2016:$1.99
000000007:Potatoes:$2.99:792:03/11/2015:03/11/2016:$1.99
000000008:Celery:$0.99:973:03/11/2015:03/11/2016:$0.99
000000009:Onions:$1.99:763:03/11/2015:03/11/2016:$0.99
000000010:Chicken:$8.99:345:03/11/2015:03/11/2016:$7.99";

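// Split on both '\r' and '\n' and drop the empty entries a "\r\n" pair would produce
string[] rows = data.Split(new[] { '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);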

foreach(var row in rows)
{
    string[] cols = row.Split(':');
    var quantity = cols[3];
}

You can use String.Split to do this.

// Read all lines into an array
string[] lines = File.ReadAllLines(@"C:\path\to\your\file.txt");

// Loop through each one
foreach (string line in lines)
{
    // Split into an array based on the : symbol
    string[] split = line.Split(':');

    // Get the column based on index
    Console.WriteLine(split[3]);
}
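
The question's edit also asks for an error when a quantity is low. A minimal sketch of that check, built on the same String.Split approach; the StockCheck class, the FindLowStock method and the minimumQuantity threshold are all hypothetical, since the question does not say what counts as "low":

```csharp
using System;
using System.Collections.Generic;

public static class StockCheck
{
    // Returns one message per line whose quantity (fourth field) is below
    // minimumQuantity. Lines that do not look like product lines are ignored.
    public static List<string> FindLowStock(IEnumerable<string> lines, int minimumQuantity)
    {
        var warnings = new List<string>();
        foreach (var line in lines)
        {
            var fields = line.Split(':');
            if (fields.Length < 4) continue; // not a product line

            int quantity;
            if (!int.TryParse(fields[3], out quantity)) continue; // bad quantity

            if (quantity < minimumQuantity)
            {
                warnings.Add(fields[1] + ": quantity " + quantity + " is below " + minimumQuantity);
            }
        }
        return warnings;
    }
}
```

With the sample data and an assumed threshold of 100, only Chocolate would be reported, because "083" parses as 83.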

Have a look at the sample code below. The string you care about is called theValueYouWantInTheString.

char[] delimiterChar = { ':' };
string input = @"000000010:Chicken:$8.99:345:03/11/2015:03/11/2016:$7.99";
string[] values = input.Split(delimiterChar);
string theValueYouWantInTheString = values[3];

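When you have a problem, use regular expressions. Now you have two problems.

Here is a program that uses your input as a txt file. The function GetQuantity returns a list of ints containing the quantities. With this approach you can define more groups to extract further information from each line.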

using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

namespace RegExptester
{
    class Program
    {

        private static List<int> GetQuantity(string txtFile)
        {
            string tempLineValue;
            // [^:]+ rather than [a-zA-Z]* so that names containing spaces ("Tomato Sauce") also match
            Regex regex = new Regex(@"^[0-9]+:[^:]+:\$[0-9]+\.[0-9]+:([0-9]+):", RegexOptions.Compiled);
            List<int> retValue = new List<int>();
            using (StreamReader inputReader = new StreamReader(txtFile))
            {
                while (null != (tempLineValue = inputReader.ReadLine()))
                {
                    Match match = regex.Match(tempLineValue);
                    if (match.Success)
                    {
                       if(match.Groups.Count == 2)
                       {
                           int numberValue;
                           if (int.TryParse(match.Groups[1].Value, out numberValue))
                               retValue.Add(numberValue);
                       }
                    }
                }
            }
            return retValue;
        } 

        static void Main(string[] args)
        {
            var tmp = GetQuantity("c:\\tmp\\junk.txt");
        }
    }
}
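
The remark about defining more groups could look like the following sketch, which uses .NET named groups to pull out the product name together with the quantity. The ProductLineParser class, its TryParse method and the group names are hypothetical, and the pattern assumes every line follows the layout of the sample data.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ProductLineParser
{
    // One named group per field of interest; [^:]+ allows spaces in names.
    private static readonly Regex LinePattern = new Regex(
        @"^(?<id>[0-9]+):(?<name>[^:]+):\$(?<price>[0-9]+\.[0-9]+):(?<quantity>[0-9]+):",
        RegexOptions.Compiled);

    // Returns false when the line does not match the expected layout
    // or the quantity does not fit into an int.
    public static bool TryParse(string line, out string name, out int quantity)
    {
        name = null;
        quantity = 0;

        Match match = LinePattern.Match(line);
        if (!match.Success || !int.TryParse(match.Groups["quantity"].Value, out quantity))
        {
            return false;
        }

        name = match.Groups["name"].Value;
        return true;
    }
}
```

Named groups keep the call site readable (match.Groups["quantity"] instead of match.Groups[1]) and do not shift when another group is added to the pattern.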

Apparently you want the part between the third and the fourth colon of each line. LINQ can do this for you:

using (var textReader = new StreamReader(fileName))
{
    // read all text and divide into lines:
    var allText = textReader.ReadToEnd();
    var allLines = allText.Split(new char[] {'\r','\n'}, StringSplitOptions.RemoveEmptyEntries);

    // split each line based on ':', and take the fourth element
    var myValues = allLines.Select(line => line.Split(new char[] {':'})
        .Skip(3)
        .FirstOrDefault());
}

If you want less readability, you can of course combine these statements into a single line.

