MonoTouch on iPad: How to make text search faster?

I need to do a text search based on user input in a relatively large list (about 37K lines of 50 to 100 characters each). The search runs after each character is entered, and the result is shown in a UITableView. This is my current code:

if (input.Any(x => Char.IsUpper(x)))
    return _list.Where(x => x.Desc.Contains(input));
else
    return _list.Where(x => x.Desc.ToLower().Contains(input));

It performs okay on a MacBook running the simulator, but it is too slow on an iPad.

One interesting thing I observed is that it takes longer and longer as the input grows. For example, take "examin" as input. It takes about 1 second after entering e, 2 seconds after x, 5 seconds after a, but 28 seconds after m, and so on. Why is that?

I hope there is a simple way to improve it.

Always take care to avoid memory allocations in time-sensitive code.

For example, we often write code that allocates strings without realizing it, e.g.

x => x.Desc.ToLower().Contains(input)

That will allocate a string to return from ToLower. From your description, this will occur many times. You can easily avoid this by using:

x => x.Desc.IndexOf (input, StringComparison.OrdinalIgnoreCase) != -1

Note: just select the StringComparison.*IgnoreCase value that matches your need.

Also, LINQ is nice, but it hides allocations in many cases. Maybe not in your case, but measuring is key to making things faster. In that case, using another algorithm (as suggested in another answer) could give you much better results (but keep the allocations in mind ;-)
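As a rough illustration of the advice above, here is a minimal sketch of the filter written as a plain loop that reuses one result list and never calls ToLower. The Item class, the TextSearch class and the Filter method are hypothetical names made up for this example; only the Desc property and the upper-case rule come from the question. Whether it is fast enough on a device would still have to be measured.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type; only the Desc property comes from the question.
public sealed class Item
{
    public string Desc { get; set; }
}

public static class TextSearch
{
    // Fills 'results' with every item whose Desc contains 'input'.
    // Mirrors the question's rule: an upper-case letter in the input
    // makes the search case-sensitive, otherwise case is ignored.
    public static void Filter(IList<Item> list, string input, List<Item> results)
    {
        results.Clear(); // reuse the caller's list instead of allocating a new one

        bool caseSensitive = false;
        foreach (char c in input)
        {
            if (char.IsUpper(c)) { caseSensitive = true; break; }
        }

        StringComparison comparison = caseSensitive
            ? StringComparison.Ordinal
            : StringComparison.OrdinalIgnoreCase;

        for (int i = 0; i < list.Count; i++)
        {
            // IndexOf with a StringComparison compares in place;
            // no lower-cased copy of Desc is created per item.
            if (list[i].Desc.IndexOf(input, comparison) >= 0)
                results.Add(list[i]);
        }
    }
}
```

The caller would keep a single results list alive between keystrokes and pass it in each time, so the only growth is the list's own backing array.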

UPDATE:

Mono's Contains(string) will call, after a few checks, the following:

CultureInfo.CurrentCulture.CompareInfo.IndexOf (this, value, 0, length, CompareOptions.Ordinal);

which, combined with your ToLower requirement, means that using StringComparison.OrdinalIgnoreCase is the perfect (i.e. identical) match for your existing code (which did not do any culture-specific comparison).

Generally I've found that 'contains' operations are not preferable for search, so I'd recommend you take a look at the Mastering Core Data session video (login required) on the WWDC 2010 page (around the 10 minute mark). Apple knows that 'contains' is terrible with SQLite on mobile devices, so you can essentially do what Apple does to sort of "hack" FTS on the version of SQLite they ship.

Essentially they do prefix matching by creating a table like:

[[ pk_id || input || normalized_input ]]

Where input and normalized_input are both indexed explicitly. Then they prefix match against the normalized value. So for instance, if a user is searching for 'snuggles' and so far they've typed in 'snu', the prefix matching query would look like:

normalized_input >= 'snu' and normalized_input < 'snv'

Not sure if this translates to your use case, but I thought it was worth mentioning. Hope it's helpful!
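For an in-memory list, the same range trick can be approximated without SQLite. The sketch below is only an analogue of the indexed normalized_input column, not what Core Data does internally: it lower-cases the keys once, sorts them ordinally, and then finds all keys that are >= the prefix and < the prefix with its last character incremented, using two binary searches. PrefixIndex, Build, LowerBound and FindRange are hypothetical names, and the code assumes the last character of the prefix is not char.MaxValue.

```csharp
using System;
using System.Collections.Generic;

public static class PrefixIndex
{
    // In-memory stand-in for the indexed normalized_input column:
    // keys are lower-cased once, then sorted in ordinal order.
    public static string[] Build(IEnumerable<string> inputs)
    {
        var keys = new List<string>();
        foreach (string s in inputs)
            keys.Add(s.ToLowerInvariant());
        keys.Sort(StringComparer.Ordinal);
        return keys.ToArray();
    }

    // Index of the first key that is >= value in ordinal order.
    static int LowerBound(string[] keys, string value)
    {
        int lo = 0, hi = keys.Length;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (string.CompareOrdinal(keys[mid], value) < 0) lo = mid + 1;
            else hi = mid;
        }
        return lo;
    }

    // Half-open range [start, end) of keys that begin with 'prefix'.
    // For "snu" this is: key >= "snu" and key < "snv".
    // Assumption: the last character of 'prefix' is not char.MaxValue.
    public static void FindRange(string[] keys, string prefix, out int start, out int end)
    {
        if (prefix.Length == 0) { start = 0; end = keys.Length; return; }

        char last = prefix[prefix.Length - 1];
        string upper = prefix.Substring(0, prefix.Length - 1) + (char)(last + 1);

        start = LowerBound(keys, prefix);
        end = LowerBound(keys, upper);
    }
}
```

The expensive work (lower-casing and sorting) happens once at build time; each keystroke then costs two binary searches. Like the SQL version, this only matches at the start of a key, so it is not a drop-in replacement for Contains.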

You need to use a trie. See http://en.wikipedia.org/wiki/Trie
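To make the suggestion concrete, here is a minimal trie sketch, assuming each key is stored together with an integer id (for example, its index in the original list). The Trie class and its Add and FindByPrefix methods are hypothetical names for this example, not an existing library API.

```csharp
using System;
using System.Collections.Generic;

public sealed class Trie
{
    sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public readonly List<int> Ids = new List<int>(); // ids of keys that end at this node
    }

    readonly Node root = new Node();

    // Inserts 'key' and associates it with 'id'.
    public void Add(string key, int id)
    {
        Node node = root;
        foreach (char c in key)
        {
            Node next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new Node();
                node.Children.Add(c, next);
            }
            node = next;
        }
        node.Ids.Add(id);
    }

    // Returns the ids of all keys that start with 'prefix' (order not specified).
    public List<int> FindByPrefix(string prefix)
    {
        var results = new List<int>();
        Node node = root;
        foreach (char c in prefix)
        {
            // Walk down one character at a time; stop if the path does not exist.
            if (!node.Children.TryGetValue(c, out node))
                return results;
        }
        Collect(node, results);
        return results;
    }

    // Gathers the ids stored in 'node' and in every node below it.
    static void Collect(Node node, List<int> results)
    {
        results.AddRange(node.Ids);
        foreach (Node child in node.Children.Values)
            Collect(child, results);
    }
}
```

Two caveats apply. A trie answers prefix queries, so matching text in the middle of a description would require inserting each word (or each suffix) as its own key. A dictionary per node also costs memory, which may matter for 37K lines on an iPad, so this should be measured before adopting it.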
