I have the following code:
public IList<Tweet> Match(IEnumerable<Tweet> tweetStream, IList<string> match, IList<string> exclude)
{
var tweets = from f in tweetStream
from m in match
where f.Text.ToLowerInvariant().Contains(m)
select f;
var final = from f in tweets
from e in exclude
where !f.Text.ToLowerInvariant().Contains(e.ToLowerInvariant())
select f;
return final.Distinct().ToList<Tweet>();
}
I've been building the tests up incrementally. Before I added the final
result set (the exclude step), matching worked happily. Now that I've added the exclude step, all items are removed
whenever the IList<string> exclude is empty.
So this test passes as it should:
[TestMethod]
public void Should_exclude_items_from_exclude_list()
{
IEnumerable<Tweet> twitterStream = new List<Tweet>
{
new Tweet("I have a Mazda car"),
new Tweet("I have a ford"),
new Tweet("Mazda Rules"),
new Tweet("My Ford car is great"),
new Tweet("My renault is brill"),
new Tweet("Mazda cars are great")
};
IList<string> matches = new List<string>{"mazda","car"};
IList<string> exclude = new List<string>{"ford"};
Matcher target = new Matcher();
IList<Tweet> actual = target.Match(twitterStream, matches, exclude);
Assert.AreEqual(3, actual.Count);
}
but this test now fails:
[TestMethod]
public void Should_match_items_either_mazda_or_car_but_no_duplicates()
{
IEnumerable<Tweet> twitterStream = new List<Tweet>
{
new Tweet("I have a Mazda car"),
new Tweet("I have a ford"),
new Tweet("Mazda Rules"),
new Tweet("My Ford car is great"),
new Tweet("My renault is brill"),
new Tweet("Mazda cars are great")
};
IList<string> matches = new List<string>{"mazda","car"};
IList<string> exclude = new List<string>();
Matcher target = new Matcher();
IList<Tweet> actual = target.Match(twitterStream, matches, exclude);
Assert.AreEqual(4, actual.Count);
}
I know I'm missing something really simple, but after staring at the code for an hour it's not coming to me.
Well, I know why it's failing: it's this clause:
from e in exclude
That's going to be an empty collection, so there are no entries to even hit the where clause.
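To make the behaviour concrete, here is a minimal sketch of the same query shape using plain strings instead of Tweet. The class and method names are made up for this illustration and are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EmptyFromDemo
{
    // Same shape as the original exclude query, but over plain strings.
    public static List<string> CrossJoinFilter(IEnumerable<string> texts, IEnumerable<string> exclude)
    {
        var query = from t in texts
                    from e in exclude      // produces one (t, e) pair per exclude term
                    where !t.Contains(e)   // only ever runs for pairs that exist
                    select t;
        return query.ToList();
    }
}
```

With an empty exclude list there are no (t, e) pairs at all, so the result is empty regardless of the input. The same shape has a second problem when exclude holds more than one term: a text is emitted once for every term it does not contain, so "a ford" still gets through if the list is { "ford", "renault" }.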
Here's an alternative approach:
var final = from f in tweets
let lower = f.Text.ToLowerInvariant()
where !exclude.Any(e => lower.Contains(e.ToLowerInvariant()))
select f;
Although I considered msarchet's approach as well, the nice thing about this one is that it only ends up evaluating tweetStream
once - so even if that reads from the network or does something else painful, you don't need to worry. Where possible (and convenient) I try to avoid evaluating LINQ streams more than once.
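As a rough way to see the difference, the sketch below counts how often a source is enumerated. Everything in it (the class, the counter, the hard-coded "car" match) is hypothetical and only stands in for an expensive tweetStream.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumerationCountDemo
{
    // Incremented each time Source() is enumerated from the start.
    public static int Enumerations;

    // Hypothetical stand-in for an expensive stream (network, database, ...).
    public static IEnumerable<string> Source()
    {
        Enumerations++;
        yield return "I have a Mazda car";
        yield return "My Ford car is great";
    }

    // Single query with Any: the source is walked once.
    public static int SingleQuery(IList<string> exclude)
    {
        Enumerations = 0;
        var result = (from t in Source()
                      let lower = t.ToLowerInvariant()
                      where lower.Contains("car")
                      where !exclude.Any(e => lower.Contains(e.ToLowerInvariant()))
                      select t).ToList();
        return Enumerations;
    }

    // Lazy query used on both sides of Except: the source is walked twice.
    public static int ExceptOnLazyQuery(IList<string> exclude)
    {
        Enumerations = 0;
        var tweets = Source().Where(t => t.ToLowerInvariant().Contains("car"));
        var excluded = from t in tweets
                       from e in exclude
                       where t.ToLowerInvariant().Contains(e.ToLowerInvariant())
                       select t;
        var result = tweets.Except(excluded).ToList();
        return Enumerations;
    }
}
```

On LINQ to Objects I would expect SingleQuery to report 1 and ExceptOnLazyQuery to report 2, because the lazy tweets query is re-run for each side of Except.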
Of course, you can make the whole thing one query very easily:
var tweets = from f in tweetStream
let lower = f.Text.ToLowerInvariant()
where match.Any(m => lower.Contains(m.ToLowerInvariant()))
where !exclude.Any(e => lower.Contains(e.ToLowerInvariant()))
select f;
I'd consider that even cleaner, to be honest :)
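For completeness, here is a sketch of how that single query might sit inside the Match method. The Tweet class is an assumption inferred from the tests in the question (a string constructor and a Text property); the real one may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Tweet, inferred from the question's tests.
public class Tweet
{
    public Tweet(string text) { Text = text; }
    public string Text { get; private set; }
}

public class Matcher
{
    public IList<Tweet> Match(IEnumerable<Tweet> tweetStream, IList<string> match, IList<string> exclude)
    {
        var tweets = from f in tweetStream
                     let lower = f.Text.ToLowerInvariant()
                     // Keep the tweet if it contains at least one match term...
                     where match.Any(m => lower.Contains(m.ToLowerInvariant()))
                     // ...and none of the exclude terms (true when exclude is empty).
                     where !exclude.Any(e => lower.Contains(e.ToLowerInvariant()))
                     select f;
        return tweets.ToList();
    }
}
```

Because each tweet is tested once rather than once per term, this form no longer produces duplicates on its own. I have left out Distinct() on the assumption that the incoming stream does not repeat tweets; add it back if it might.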
So what is happening is this:
var final = from f in tweets
from e in exclude
where !f.Text.ToLowerInvariant().Contains(e.ToLowerInvariant())
select f;
Since the second from is over an empty collection, if I am correct the rest of the statement is never evaluated, so your select never happens.
Try this instead:
var excludeTheseTweets = from f in tweets
from e in exclude
where f.Text.ToLowerInvariant().Contains(e.ToLowerInvariant())
select f;
return tweets.Except(excludeTheseTweets).Distinct().ToList<Tweet>();
That builds a list of tweets to exclude (so if there is nothing to exclude, it will be empty) and then removes those items from the original list.
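Put together, the approach might look like the sketch below. The Tweet class is assumed from the question's tests, ExceptMatcher is a made-up name, and the ToList() on the first query is my own addition so that the source stream is only read once.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Tweet, inferred from the question's tests.
public class Tweet
{
    public Tweet(string text) { Text = text; }
    public string Text { get; private set; }
}

public static class ExceptMatcher
{
    public static IList<Tweet> Match(IEnumerable<Tweet> tweetStream, IList<string> match, IList<string> exclude)
    {
        // Materialize the matches once; a tweet appears once per matching term.
        var tweets = (from f in tweetStream
                      from m in match
                      where f.Text.ToLowerInvariant().Contains(m.ToLowerInvariant())
                      select f).ToList();

        // Tweets containing any exclude term; empty when exclude is empty.
        var excludeTheseTweets = from f in tweets
                                 from e in exclude
                                 where f.Text.ToLowerInvariant().Contains(e.ToLowerInvariant())
                                 select f;

        // Except is a set difference, so the result has no duplicates.
        return tweets.Except(excludeTheseTweets).ToList();
    }
}
```

Since Except already returns distinct elements, the extra Distinct() call is harmless but not strictly needed.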