
C# OutOfMemoryException with Regex

I'm getting an OutOfMemoryException at

if (Regex.IsMatch(output, @"^\\d"))

But I'm unsure what's causing it. My program had been running for about 4 minutes, reading text files (a lot of them) and bulk inserting them into SQL. The output string at the time contained nothing special, just a small piece of text read from a .txt file.

I'm assuming this is happening because of the number of times the regex check has to run; after 4 minutes it had run millions of times. Is there a way to prevent the memory problem, such as disposing or clearing something before I start looping? If so, how do you do that?

EDIT: I'm not reading one big file, I'm reading a lot of files. At the time it failed it had already read around 6666 files (5 folders), but it needs to read 60 folders in total, which is 80,361 .txt files.

EDIT: Added the source code, hoping to clarify.

UPDATE:

I added this method:

static void DisposeAll(IEnumerable set)
{
    foreach (Object obj in set)
    {
        IDisposable disp = obj as IDisposable;
        if (disp != null) { disp.Dispose(); }
    }
}

I call it at the end of each folder iteration:

DisposeAll(ListExtraInfo);
DisposeAll(ListFouten);
ListFouten.Clear();
ListExtraInfo.Clear();

The error location changed: the exception is no longer thrown at the Regex check but at ListFouten. It still happens at around 6666 .txt files read.

Exception of type 'System.OutOfMemoryException' was thrown.

static void Main(string[] args)
        {
            string pathMMAP = @"G:\HLE13\Resultaten\MMAP";
            string[] entriesMMAP = Directory.GetDirectories(pathMMAP);
            List<string> treinNamen = new List<string>();

            foreach (string path in entriesMMAP)
            {
                string TreinNaam = new DirectoryInfo(path).Name;
                treinNamen.Add(TreinNaam);
                int IdTrein = 0;
                ListExtraInfo = new List<extraInfo>();
                ListFouten = new List<fouten>();
                readData(TreinNaam, IdTrein, path);
             }
        }


        static void readData(string TreinNaam, int IdTrein, string path)
        {
            using (SqlConnection sourceConnection = new SqlConnection(GetConnectionString()))
            {
                sourceConnection.Open();


                try
                {
                    SqlCommand commandRowCount = new SqlCommand(
                 "SELECT TreinId FROM TestDatabase.dbo.Treinen where Name = " + TreinNaam,
                 sourceConnection);
                    IdTrein = Convert.ToInt16(commandRowCount.ExecuteScalar());

                }
                catch (Exception ex)
                {


                }

            }

            string[] entriesTreinen = Directory.GetDirectories(path);
            foreach (string rapport in entriesTreinen)
            {

                string RapportNaam = new DirectoryInfo(rapport).Name;
                FileInfo fileData = new System.IO.FileInfo(rapport);

                leesTxt(rapport, TreinNaam, GetConnectionString(), IdTrein);

            }
        }
        public static string datum;
        public static string tijd;
        public static string foutcode;
        public static string absentOfPresent;
        public static string teller;
        public static string omschrijving;
        public static List<fouten> ListFouten;
        public static List<extraInfo> ListExtraInfo;
        public static string textname;
        public static int referentieGetal = 0;


        static void leesTxt(string rapport, string TreinNaam, string myConnection, int TreinId)
        {
            foreach (string textFilePath in Directory.EnumerateFiles(rapport, "*.txt"))
            {

                textname = Path.GetFileName(textFilePath);
                textname = textname.Substring(0, textname.Length - 4);

                using (StreamReader r = new StreamReader(textFilePath))
                {
                    for (int x = 0; x <= 10; x++)
                        r.ReadLine();

                    string output;

                    Regex compiledRegex = new Regex(@"^\d", RegexOptions.Compiled);
                    string[] info = new string[] { };
                    string[] datumTijdelijk = new string[] { };

                    while (true)
                    {

                        output = r.ReadLine();
                        if (output == null)
                            break;


                        if (compiledRegex.IsMatch(output))
                        {
                            info = output.Split(' ');
                            int kolom = 6;
                            datum = info[0];
                            datumTijdelijk = datum.Split(new[] { '/' });


                            try
                            {
                                datum = string.Format("{2}/{1}/{0}", datumTijdelijk);
                                tijd = info[1];
                                foutcode = info[2];
                                absentOfPresent = info[4];
                                teller = info[5];
                                omschrijving = info[6];
                            }
                            catch (Exception ex)
                            {

                            }


                            while (kolom < info.Count() - 1)
                            {
                                kolom++;
                                omschrijving = omschrijving + " " + info[kolom];
                            }
                            referentieGetal++;


                            ListFouten.Add(new fouten { Date = datum, Time = tijd, Description = omschrijving, ErrorCode = foutcode, Module = textname, Name = TreinNaam, TreinId = TreinId, FoutId = referentieGetal });

                        }


                        if (output == string.Empty)
                        {
                            output = " ";
                        }
                        if (Char.IsLetter(output[0]))
                        {
                            ListExtraInfo.Add(new extraInfo { Values = output, FoutId = referentieGetal });
                        }

                    }

                }

            }

        }

Could it be that your code is recompiling the regular expression every time it is used? Try using a compiled Regex instead. Outside your foreach loop, store a compiled Regex in a variable:

Regex compiledRegex = new Regex(@"^\d", RegexOptions.Compiled);

Then, when checking for the match, use:

if (compiledRegex.IsMatch(output))

Edit: this answer is not valid. Though the Regex documentation states that Regex expressions encountered in instance methods would be recompiled, this is not the case: they are cached.

The regex operations are not at fault here; the real problem is the data that ends up being stored around the regex processing.

The analogy is driving a car and saying "It ran out of gas while I had the radio on". It is not the radio's fault...

I recommend that you identify why such copious amounts of data are being stored and resolve that.
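One way to start that investigation is to log how large the two lists and the managed heap are after each folder. The following is only a minimal sketch: `MemoryProbe`, `ManagedMegabytes`, and `Report` are hypothetical names introduced here, not part of the question's code, and `GC.GetTotalMemory` gives an estimate rather than an exact figure.

```csharp
using System;
using System.Globalization;

// Hypothetical diagnostic helper (not from the original post).
static class MemoryProbe
{
    // Managed heap size in megabytes, as currently estimated by the GC.
    // Passing false means no collection is forced before measuring.
    public static double ManagedMegabytes()
    {
        return GC.GetTotalMemory(false) / (1024.0 * 1024.0);
    }

    // Builds a one-line report: the folder name, the item counts of the
    // two lists, and the current managed heap size.
    public static string Report(string folder, int foutenCount, int extraInfoCount)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "{0}: ListFouten={1}, ListExtraInfo={2}, heap={3:F1} MB",
            folder, foutenCount, extraInfoCount, ManagedMegabytes());
    }
}
```

Calling something like `Console.WriteLine(MemoryProbe.Report(path, ListFouten.Count, ListExtraInfo.Count));` after each `readData` call should show whether the lists keep growing from folder to folder.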


There are better ways of processing and analyzing information than throwing everything in memory. I believe that you will need to rewrite the logic to achieve the end goal.

Why are you collecting, and more importantly saving information about every line of 6000+ files? That might be the real issue here....
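If the rows only need to reach the database, one possible alternative is to buffer a bounded number of them and write each batch out as soon as it is full, instead of keeping every parsed line in `ListFouten` and `ListExtraInfo`. This is a sketch under assumptions: `BatchBuffer<T>` is a hypothetical helper, the batch size is arbitrary, and the flush callback stands in for whatever bulk insert the program already performs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the original post): buffers rows and hands
// them to a flush callback whenever the batch size is reached, so the number
// of rows held in memory never exceeds batchSize.
sealed class BatchBuffer<T>
{
    private readonly int _batchSize;
    private readonly Action<IReadOnlyList<T>> _flush;
    private readonly List<T> _items;

    public BatchBuffer(int batchSize, Action<IReadOnlyList<T>> flush)
    {
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));
        if (flush == null) throw new ArgumentNullException(nameof(flush));
        _batchSize = batchSize;
        _flush = flush;
        _items = new List<T>(batchSize);
    }

    // Adds one row and flushes automatically when the batch is full.
    public void Add(T item)
    {
        _items.Add(item);
        if (_items.Count >= _batchSize) Flush();
    }

    // Call once at the end to write the remaining partial batch.
    // The callback must not keep a reference to the list it receives,
    // because the list is cleared and reused afterwards.
    public void Flush()
    {
        if (_items.Count == 0) return;
        _flush(_items);   // assumption: the bulk insert into SQL happens here
        _items.Clear();
    }
}
```

In `leesTxt`, `ListFouten.Add(...)` would then become an `Add` call on such a buffer, with a final `Flush()` once all files in the folder have been read.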


Otherwise be proactive with these steps
