
How to increase the size of an array or free memory after each iteration? Error: "Index was outside the bounds of the array" (C#)

I read data from a 27 MB text file that contains 10001 rows, so I need to handle large data. I process each row and then write the result to another text file. This is the code I am using:

        StreamReader streamReader = System.IO.File.OpenText("D:\\input.txt");
        string lineContent = streamReader.ReadLine();
        int count = 0;
        using (StreamWriter writer = new StreamWriter("D:\\ft1.txt"))
        {

            do
            {
                if (lineContent != null)
                {
                    string a = JsonConvert.DeserializeObject(lineContent).ToString();
                    string b = "[" + a + "]";
                    List<TweetModel> deserializedUsers = JsonConvert.DeserializeObject<List<TweetModel>>(b);
                    var CreatedAt = deserializedUsers.Select(user => user.created_at).ToArray();

                    var Text = deserializedUsers.Where(m => m.text != null).Select(user => new
                    {
                        a = Regex.Replace(user.text, @"[^\u0000-\u007F]", string.Empty)
                        .Replace(@"\/", "/")
                        .Replace("\\", @"\")
                        .Replace("\'", "'")
                        .Replace("\''", "''")
                        .Replace("\n", " ")
                        .Replace("\t", " ")
                    }).ToArray();
                    var TextWithTimeStamp = Text[0].a + " (timestamp:" + CreatedAt[0] + ")";
                    writer.WriteLine(TextWithTimeStamp);
                }
                lineContent = streamReader.ReadLine();

            }
            while (streamReader.Peek() != -1);
            streamReader.Close();
        }

This code works for the first 54 iterations, as I get 54 lines in the output file. After that it throws the error "Index was outside the bounds of the array." at this line:

var TextWithTimeStamp = Text[0].a + " (timestamp:" + CreatedAt[0] + ")";

I am not sure whether the maximum capacity of the array has been exceeded. If so, how can I increase it? Or can I write each line as it is encountered in the loop through

writer.WriteLine(TextWithTimeStamp);

and then clear the storage, or do something else that solves this issue? I tried using a list instead of an array, but the issue is the same. Please help.

Change this line

var TextWithTimeStamp = Text[0].a + " (timestamp:" + CreatedAt[0] + ")";

to

var TextWithTimeStamp = (Text.Any() ? Text.First().a : string.Empty) + 
            " (timestamp:" + (CreatedAt.Any() ? CreatedAt.First() : string.Empty) + ")";

This is not a capacity problem: each iteration builds new Text and CreatedAt collections, and either of them can be empty (zero items). For example, Text is empty whenever every deserialized user on a line has a null text, because the Where clause filters those out.

In those cases, Text[0] and CreatedAt[0] fail. So before using the first element, check whether the collection contains any items. The LINQ method Any() does exactly that.
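
To make the failure mode concrete, here is a minimal self-contained sketch. The Tweet class and the method names are hypothetical stand-ins (the real TweetModel is not shown in the question), and the JSON step is left out so that only the filtering and indexing remain.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for TweetModel; only the two fields used in the question.
public class Tweet
{
    public string text;
    public string created_at;
}

public static class EmptyArrayDemo
{
    // Same shape as the question's code: indexes [0] without checking.
    // Throws IndexOutOfRangeException when every tweet has a null text.
    public static string FormatUnsafe(List<Tweet> tweets)
    {
        var createdAt = tweets.Select(t => t.created_at).ToArray();
        var text = tweets.Where(t => t.text != null).Select(t => t.text).ToArray();
        return text[0] + " (timestamp:" + createdAt[0] + ")";
    }

    // Same logic with the Any() guard from the answer applied.
    public static string FormatSafe(List<Tweet> tweets)
    {
        var createdAt = tweets.Select(t => t.created_at).ToArray();
        // Where() drops tweets whose text is null, so this array can be empty.
        var text = tweets.Where(t => t.text != null).Select(t => t.text).ToArray();
        return (text.Any() ? text.First() : string.Empty)
            + " (timestamp:" + (createdAt.Any() ? createdAt.First() : string.Empty) + ")";
    }
}
```

Calling FormatUnsafe with a single tweet whose text is null throws, no matter how small the input is, while FormatSafe returns a line with an empty text part. FirstOrDefault() would be another way to write the guard.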

Update

If you want to skip the lines that do not contain text, change these lines

var TextWithTimeStamp = Text[0].a + " (timestamp:" + CreatedAt[0] + ")";
writer.WriteLine(TextWithTimeStamp);

to

if (Text.Any())
{
    var TextWithTimeStamp = Text.First().a + " (timestamp:" + CreatedAt.First() + ")";
    writer.WriteLine(TextWithTimeStamp);
}
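
Placed inside the read loop, the skip might look like the sketch below. It is an outline under stated assumptions, not the asker's actual program: TweetLine and the parse delegate are hypothetical placeholders for TweetModel and the Json.NET calls, and TextReader/TextWriter are used so the loop can be tried on in-memory strings. Reading with while ((line = reader.ReadLine()) != null) also handles the final line, which the question's do ... while (streamReader.Peek() != -1) loop reads but never processes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical stand-in for TweetModel.
public class TweetLine
{
    public string text;
    public string created_at;
}

public static class SkipEmptyLines
{
    // parse is a placeholder for the deserialization done in the question.
    // Returns the number of lines written to the output.
    public static int Process(TextReader reader, TextWriter writer,
                              Func<string, List<TweetLine>> parse)
    {
        int written = 0;
        string line;
        // ReadLine() returns null at end of input, so each line is handled exactly once.
        while ((line = reader.ReadLine()) != null)
        {
            var tweets = parse(line);
            var createdAt = tweets.Select(t => t.created_at).ToArray();
            var text = tweets.Where(t => t.text != null).Select(t => t.text).ToArray();

            if (!text.Any())
            {
                continue; // no text on this line, so nothing is written
            }

            // text is non-empty here, which implies tweets (and createdAt) is too.
            writer.WriteLine(text.First() + " (timestamp:" + createdAt.First() + ")");
            written++;
        }
        return written;
    }
}
```

With real files, the reader and writer would be a StreamReader and a StreamWriter, each wrapped in its own using block so both are closed even if a line fails to parse.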

Update 2

To include all the strings from CreatedAt rather than only the first one, you can join the values into a single comma-separated string. A general example:

var strings = new List<string> { "a", "b", "c" };
var allStrings = string.Join(",", strings); //"a,b,c"
