C# better way to do this?

Hi, I have the code below and am looking for a prettier or faster way to do the same thing.

Thanks!

string value = "HelloGoodByeSeeYouLater";
string[] y = new string[]{"Hello", "You"};

foreach(string x in y)
{
    value = value.Replace(x, "");
}

You could do:

y.ToList().ForEach(x => value = value.Replace(x, ""));

Although I think your variant is more readable.

Forgive me, but someone's gotta say it,

value = Regex.Replace( value, string.Join("|", y.Select(Regex.Escape)), "" );

Possibly faster, since it creates fewer strings.

EDIT: Credit to Gabe and lasseespeholt for Escape and Select.
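
A minimal sketch of the same Regex idea wrapped in a helper, in case the pattern is reused. The names RegexRemover, BuildPattern and RemoveAll are made up for illustration. One assumption worth flagging: .NET alternation tries alternatives left to right, so the sketch sorts longer terms first to keep a short term such as "You" from shadowing a longer one such as "YouLater".

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class RegexRemover
{
    // Builds one alternation pattern from the terms (hypothetical helper).
    // Longer terms are listed first because alternatives are tried left to right.
    public static Regex BuildPattern(params string[] terms)
    {
        var escaped = terms
            .Where(term => !string.IsNullOrEmpty(term))
            .OrderByDescending(term => term.Length)
            .Select(Regex.Escape);
        return new Regex(string.Join("|", escaped));
    }

    // Removes every occurrence of every term in a single pass over value.
    public static string RemoveAll(string value, params string[] terms)
    {
        return BuildPattern(terms).Replace(value, "");
    }
}
```

With the question's data, RegexRemover.RemoveAll("HelloGoodByeSeeYouLater", "Hello", "You") gives "GoodByeSeeLater". Whether it actually beats the foreach depends on input size and term count, so measure before switching.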

While not any prettier, there are other ways to express the same thing.

In LINQ:

value = y.Aggregate(value, (acc, x) => acc.Replace(x, ""));

With String methods:

value = String.Join("", value.Split(y, StringSplitOptions.None));

I don't think anything is going to be faster in managed code than a simple Replace in a foreach though.

It depends on the size of the string you are searching. The foreach example is perfectly fine for small inputs, but because strings are immutable, each iteration can allocate a new string. It also rescans the whole string from the start once per search term.
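
If the intermediate strings are the concern, one option is to do the replacements in a StringBuilder and materialize a string only once at the end. This is a sketch under that assumption, not a benchmarked improvement; BuilderRemover and RemoveAll are hypothetical names, and the buffer is still scanned once per term.

```csharp
using System.Text;

static class BuilderRemover
{
    // Applies each removal to a mutable buffer instead of creating
    // a new string per term; only ToString() allocates the result.
    public static string RemoveAll(string value, params string[] terms)
    {
        var sb = new StringBuilder(value);
        foreach (string term in terms)
        {
            // StringBuilder.Replace throws on an empty search string, so skip those.
            if (!string.IsNullOrEmpty(term))
                sb.Replace(term, "");
        }
        return sb.ToString();
    }
}
```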

The basic solutions have all been proposed. The LINQ examples provided are good if you are comfortable with that syntax; I also liked the suggestion of an extension method, although that is probably the slowest of the proposed solutions. I would avoid a Regex unless you have an extremely specific need.

So let's explore more elaborate solutions and assume you needed to handle a string that was thousands of characters in length and had many possible words to be replaced. If this doesn't apply to the OP's need, maybe it will help someone else.

Method #1 is geared towards large strings with few possible matches.

Method #2 is geared towards short strings with numerous matches.

Method #1

I have handled large-scale parsing in C# using char arrays and pointer math with intelligent seek operations that are optimized for the length and potential frequency of the term being searched for. It follows the methodology of:

  • Extremely cheap Peeks one character at a time
  • Only investigate potential matches
  • Modify output when match is found

For example, you might read through the whole source array once and copy characters to the output only when they are NOT part of a matched term. This would remove the need to keep reallocating strings.
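
The peek, investigate, and copy steps above can be sketched in safe managed code. This is not the pointer-based implementation described here, only an illustration of the same shape; SinglePassRemover and RemoveAll are hypothetical names, and matching is ordinal (case-sensitive).

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class SinglePassRemover
{
    public static string RemoveAll(string source, params string[] terms)
    {
        // Cheap peek: the set of first characters of all terms.
        var firstChars = new HashSet<char>();
        foreach (string term in terms)
            if (!string.IsNullOrEmpty(term)) firstChars.Add(term[0]);

        var output = new StringBuilder(source.Length);
        int i = 0;
        while (i < source.Length)
        {
            int matched = 0;
            // Only investigate positions whose character could start a term.
            if (firstChars.Contains(source[i]))
            {
                foreach (string term in terms)
                {
                    if (!string.IsNullOrEmpty(term) && term.Length > matched &&
                        string.CompareOrdinal(source, i, term, 0, term.Length) == 0)
                        matched = term.Length; // keep the longest match at this position
                }
            }
            if (matched > 0) i += matched;      // skip the matched term
            else output.Append(source[i++]);    // copy a non-matching character
        }
        return output.ToString();
    }
}
```

One behavioral difference from the foreach version: the output is never rescanned, so text that only becomes a match after an earlier removal is left alone. For instance, "YHelloou" with the terms "Hello" and "You" yields "You" here, while chained Replace calls would yield an empty string.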

A simple example of this technique is looking for a closing HTML tag in a DOM parser. For example, I may read an opening STYLE tag and want to skip through (or buffer) thousands of characters until I find a closing STYLE tag.

This approach provides very high performance, but it is needless complexity if you don't need it (plus you need to be well-versed in memory manipulation/management or you will create all sorts of bugs and instability).

I should note that the .NET string libraries are already very efficient, but you can optimize this approach for your own specific needs and achieve better performance (and I have validated this firsthand).

Method #2

Another alternative involves storing search terms in a Dictionary containing Lists of strings. Basically, you decide how long your search prefix needs to be, and read characters from the source string into a buffer until you reach that length. Then you look up that prefix in the dictionary. If an entry is found, you explore further by iterating through its List; if not, you know that you can discard the buffer and continue.

Because the Dictionary looks keys up by hash, each lookup takes roughly constant time regardless of how many terms are stored, which makes it well suited to handling a large number of possible matches.
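
A rough sketch of that idea applied to the removal problem from the question. PrefixDictionaryRemover and RemoveAll are hypothetical names, and the choice of the shortest term's length as the prefix length is an assumption made so that every term has a full-length key. The Substring call per position allocates, which a tuned version would avoid.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

static class PrefixDictionaryRemover
{
    public static string RemoveAll(string source, params string[] terms)
    {
        var usable = terms.Where(term => !string.IsNullOrEmpty(term)).ToList();
        if (usable.Count == 0) return source;

        // Assumption: key every term by a prefix as long as the shortest term.
        int prefixLength = usable.Min(term => term.Length);

        var byPrefix = new Dictionary<string, List<string>>(StringComparer.Ordinal);
        foreach (string term in usable)
        {
            string key = term.Substring(0, prefixLength);
            if (!byPrefix.TryGetValue(key, out List<string> bucket))
                byPrefix[key] = bucket = new List<string>();
            bucket.Add(term);
        }

        var output = new StringBuilder(source.Length);
        int i = 0;
        while (i < source.Length)
        {
            int matched = 0;
            // Hash lookup on the next prefixLength characters; only a hit is explored further.
            if (i + prefixLength <= source.Length &&
                byPrefix.TryGetValue(source.Substring(i, prefixLength), out List<string> candidates))
            {
                foreach (string candidate in candidates)
                    if (candidate.Length > matched &&
                        string.CompareOrdinal(source, i, candidate, 0, candidate.Length) == 0)
                        matched = candidate.Length;
            }
            if (matched > 0) i += matched;      // drop the matched term
            else output.Append(source[i++]);    // no term starts here, keep the character
        }
        return output.ToString();
    }
}
```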

I'm using this methodology to allow instantaneous (<1ms) searching of every airfield in the US by name, state, city, FAA code, etc. There are 13K airfields in the US, and I've created a map of about 300K permutations (again, a Dictionary with prefixes of varying lengths, each corresponding to a list of matches).

For example, Phoenix, Arizona's main airfield is called Sky Harbor with the short ID of KPHX. I store:

KP KPH KPHX

Ph Pho Phoe

Ar Ari Ariz

Sk Sky

Ha Har Harb

There is a cost in terms of memory usage, but string interning probably reduces this somewhat and the resulting speed justifies the memory usage on data sets of this size. Searching happens as the user types and is so fast that I have actually introduced an artificial delay to smooth out the experience.
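
To make the prefix map concrete, here is a small sketch of how such an index might be built and queried. It is not the actual airfield implementation: the PrefixIndex class, its Add and Lookup methods, and the 2 to 4 character prefix range are all assumptions chosen to match the example prefixes listed above.

```csharp
using System;
using System.Collections.Generic;

sealed class PrefixIndex
{
    private const int MinPrefix = 2; // assumed shortest stored prefix
    private const int MaxPrefix = 4; // assumed longest stored prefix

    private readonly Dictionary<string, List<string>> map =
        new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);

    // Registers every prefix of each keyword (e.g. "KP", "KPH", "KPHX") under the given id.
    public void Add(string id, IEnumerable<string> keywords)
    {
        foreach (string keyword in keywords)
        {
            int longest = Math.Min(MaxPrefix, keyword.Length);
            for (int len = MinPrefix; len <= longest; len++)
            {
                string prefix = keyword.Substring(0, len);
                if (!map.TryGetValue(prefix, out List<string> ids))
                    map[prefix] = ids = new List<string>();
                if (!ids.Contains(id)) ids.Add(id);
            }
        }
    }

    // Returns candidate ids for what the user has typed so far.
    // Input longer than MaxPrefix is truncated, so callers still need
    // to filter the candidates against the full text.
    public IReadOnlyList<string> Lookup(string typed)
    {
        string key = typed.Length > MaxPrefix ? typed.Substring(0, MaxPrefix) : typed;
        return map.TryGetValue(key, out List<string> ids)
            ? ids
            : (IReadOnlyList<string>)Array.Empty<string>();
    }
}
```

Usage would look like: create a PrefixIndex, call Add("KPHX", new[] { "KPHX", "Phoenix", "Arizona", "Sky", "Harbor" }), and then Lookup("pho") or Lookup("Sk") each return a list containing "KPHX".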

Send me a message if you have the need to dig into these methodologies.

Extension method for elegance

(arguably "prettier" at the call level)

I'll implement an extension method that allows you to call your implementation directly on the original string as seen here.

value = value.Remove(y);
// or
value = value.Remove("Hello", "You");
// effectively
string value = "HelloGoodByeSeeYouLater".Remove("Hello", "You");

The extension method is callable on any string value in fact, and therefore easily reusable.

Implementation of Extension method:
I'm going to wrap your own implementation (shown in your question) in an extension method for pretty or elegant points and also employ the params keyword to provide some flexibility in passing the arguments. You can substitute somebody else's faster implementation body into this method.

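static class StringExtensions {
    // Named Remove to match the calls above. The built-in string.Remove overloads
    // take int arguments, so calls with string arguments bind to this extension method.
    public static string Remove(this string thisString, params string[] arrItems) {
        // Whatever implementation you like:
        if (thisString == null || arrItems == null)
            return thisString;
        var temp = thisString;
        foreach (string x in arrItems) {
            // Replace throws on a null or empty search string, so skip those.
            if (!string.IsNullOrEmpty(x))
                temp = temp.Replace(x, "");
        }
        return temp;
    }
}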

That's the brightest idea I can come up with right now that nobody else has touched on.
