What is the fastest way to get lists of files and search through file lists repeatedly?
Situation:
Environment:
C# and .NET 4.0 on a Windows PC.
Is this the fastest way?
Is the fastest way to use a dictionary, with FileName as the key (lowercase) and Path as the value (original case)? That way I can get the index/Path in the same pass when I search for the filename. The FileName and Path are split up front when populating the list.
if (d.TryGetValue("key", out value))
{
    // Log "key" and value to table (only does one lookup)
}
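A minimal sketch of how such a dictionary could be populated, assuming the full paths are already available as strings. `FileNameIndex` and `Build` are hypothetical names. Note that it swaps in `StringComparer.OrdinalIgnoreCase` in place of lowercasing the keys, which is a substitution and not what the question describes.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper; the name FileNameIndex is not from the question.
public static class FileNameIndex
{
    // Builds a file-name -> directory map from full paths.
    // OrdinalIgnoreCase makes lookups case-insensitive without lowercasing keys.
    public static Dictionary<string, string> Build(IEnumerable<string> fullPaths)
    {
        var index = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        foreach (string fullPath in fullPaths)
        {
            string name = Path.GetFileName(fullPath);
            // The indexer overwrites on a duplicate name, so the last path wins.
            index[name] = Path.GetDirectoryName(fullPath);
        }
        return index;
    }
}
```

A lookup is then a single `TryGetValue` call on the returned dictionary, as in the snippet above. Because the indexer silently overwrites, this sketch hides duplicate file names rather than reporting them.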
Note: I am a bit concerned that I will probably have duplicate keys per FileType. If I run into this scenario, what type of list and access method should I use? Maybe in these rare cases I should populate another list with the duplicate keys, because I will need to do at least one of log/copy/delete on the files in any path.
I would use a Dictionary<string,string> with the FullName (path+file+ext) changed to lower case as the key and the unchanged FullName as the value. Then split out the required parts using the static methods GetDirectoryName and GetFileName of the System.IO.Path class before inserting them into the table.
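The approach could be sketched as follows. `FullNameTable`, `Build` and `Split` are hypothetical names, and `ToLowerInvariant` is an assumption about how the lowercasing is done.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper illustrating the lowercased-FullName dictionary.
public static class FullNameTable
{
    public static Dictionary<string, string> Build(IEnumerable<string> fullNames)
    {
        var table = new Dictionary<string, string>();
        foreach (string fullName in fullNames)
        {
            // Key: lowercased full name; value: the original casing.
            table[fullName.ToLowerInvariant()] = fullName;
        }
        return table;
    }

    // Splits a stored full name into (directory, file name).
    public static KeyValuePair<string, string> Split(string fullName)
    {
        return new KeyValuePair<string, string>(
            Path.GetDirectoryName(fullName),
            Path.GetFileName(fullName));
    }
}
```

Since the key is the whole path, two files with the same name in different folders get different keys, but a search by file name alone is no longer a single lookup.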
EDIT: The GetFiles method of the DirectoryInfo class returns an array of FileInfo. FileInfo has a FullName property returning path+file+ext. You could just as well store this FileInfo as the value in your dictionary if memory consumption is not an issue. FileInfo has a DirectoryName and a Name property returning the two parts you need.
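A small sketch of the FileInfo variant, assuming the whole tree under one root folder should be indexed. `FileInfoTable` and `Build` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper that stores FileInfo objects as dictionary values.
public static class FileInfoTable
{
    public static Dictionary<string, FileInfo> Build(string rootDirectory)
    {
        var table = new Dictionary<string, FileInfo>();
        var root = new DirectoryInfo(rootDirectory);
        // SearchOption.AllDirectories also descends into subfolders.
        foreach (FileInfo file in root.GetFiles("*", SearchOption.AllDirectories))
        {
            // Key: lowercased full name; value: the FileInfo itself.
            table[file.FullName.ToLowerInvariant()] = file;
        }
        return table;
    }
}
```

After a lookup, `DirectoryName` and `Name` can be read straight from the stored FileInfo. A recursive GetFiles call throws if it reaches a folder the process may not read, so real code would probably need to handle UnauthorizedAccessException.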
EDIT: Here is my implementation of a multimap which does the Dictionary<TKey,List<TValue>> stuff:
/// <summary>
/// Represents a collection of keys and values. Multiple values can have the same key.
/// </summary>
/// <typeparam name="TKey">Type of the keys.</typeparam>
/// <typeparam name="TValue">Type of the values.</typeparam>
public class MultiMap<TKey, TValue> : Dictionary<TKey, List<TValue>>
{
    public MultiMap()
        : base()
    {
    }

    public MultiMap(int capacity)
        : base(capacity)
    {
    }

    /// <summary>
    /// Adds an element with the specified key and value into the MultiMap.
    /// </summary>
    /// <param name="key">The key of the element to add.</param>
    /// <param name="value">The value of the element to add.</param>
    public void Add(TKey key, TValue value)
    {
        List<TValue> valueList;
        if (TryGetValue(key, out valueList)) {
            valueList.Add(value);
        } else {
            valueList = new List<TValue>();
            valueList.Add(value);
            Add(key, valueList);
        }
    }

    /// <summary>
    /// Removes the first occurrence of an element with a specified key and value.
    /// </summary>
    /// <param name="key">The key of the element to remove.</param>
    /// <param name="value">The value of the element to remove.</param>
    /// <returns>true if an element is removed; false if the key or the value were not found.</returns>
    public bool Remove(TKey key, TValue value)
    {
        List<TValue> valueList;
        if (TryGetValue(key, out valueList)) {
            if (valueList.Remove(value)) {
                // Drop the key once its last value is gone.
                if (valueList.Count == 0) {
                    Remove(key);
                }
                return true;
            }
        }
        return false;
    }

    /// <summary>
    /// Removes all occurrences of elements with a specified key and value.
    /// </summary>
    /// <param name="key">The key of the elements to remove.</param>
    /// <param name="value">The value of the elements to remove.</param>
    /// <returns>Number of elements removed.</returns>
    public int RemoveAll(TKey key, TValue value)
    {
        List<TValue> valueList;
        int n = 0;
        if (TryGetValue(key, out valueList)) {
            while (valueList.Remove(value)) {
                n++;
            }
            if (valueList.Count == 0) {
                Remove(key);
            }
        }
        return n;
    }

    /// <summary>
    /// Gets the total number of values contained in the MultiMap.
    /// </summary>
    public int CountAll
    {
        get
        {
            int n = 0;
            foreach (List<TValue> valueList in Values) {
                n += valueList.Count;
            }
            return n;
        }
    }

    /// <summary>
    /// Determines whether the MultiMap contains an element with a specific key / value pair.
    /// </summary>
    /// <param name="key">Key of the element to search for.</param>
    /// <param name="value">Value of the element to search for.</param>
    /// <returns>true if the element was found; otherwise false.</returns>
    public bool Contains(TKey key, TValue value)
    {
        List<TValue> valueList;
        if (TryGetValue(key, out valueList)) {
            return valueList.Contains(value);
        }
        return false;
    }

    /// <summary>
    /// Determines whether the MultiMap contains an element with a specific value.
    /// </summary>
    /// <param name="value">Value of the element to search for.</param>
    /// <returns>true if the element was found; otherwise false.</returns>
    public bool Contains(TValue value)
    {
        // Linear scan over every value list.
        foreach (List<TValue> valueList in Values) {
            if (valueList.Contains(value)) {
                return true;
            }
        }
        return false;
    }
}
I would probably use a dictionary with the lowercased filename as the key. The value would be a class with the needed extra information. I would also search it like in your example. If this was slow, I would probably also try searching with LINQ just to see if it was faster. There is however one problem here: this requires that all files across all folders are uniquely named. That might be the case for you, but it could also be a problem if you haven't already considered it ;)
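The uniqueness requirement can be made visible while the index is built. The sketch below is one possible way to do it; `FileEntry`, `UniqueNameIndex` and the `duplicates` parameter are hypothetical names, and the properties on `FileEntry` are only an assumption about what the "needed extra information" might be.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value class holding the extra per-file information.
public class FileEntry
{
    public string Name { get; set; }
    public string Directory { get; set; }
    public long Length { get; set; }
}

public static class UniqueNameIndex
{
    // Indexes entries by lowercased file name. An entry whose name is already
    // present is not indexed; it is appended to 'duplicates' instead.
    public static Dictionary<string, FileEntry> Build(
        IEnumerable<FileEntry> entries, List<FileEntry> duplicates)
    {
        var index = new Dictionary<string, FileEntry>();
        foreach (FileEntry entry in entries)
        {
            string key = entry.Name.ToLowerInvariant();
            if (index.ContainsKey(key))
                duplicates.Add(entry);
            else
                index.Add(key, entry);
        }
        return index;
    }
}
```

If `duplicates` stays empty after the build, the single-value dictionary is safe for that data set. Calling `Dictionary.Add` directly with a key that already exists would throw an ArgumentException, which is why the sketch checks first.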
Remember that you can also use a FileSystemWatcher object to keep the in-memory dictionary/list synchronized with the disk contents if they are subject to change. If the contents are static, I would probably store it all in a database table and search that instead; startup of your program would then be instantaneous.
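The .NET class for watching a folder is System.IO.FileSystemWatcher. A rough sketch of keeping an in-memory index in step with it might look like this; `WatchedIndex` and its members are hypothetical names, and the sketch assumes that indexing by full path with a case-insensitive comparer is acceptable.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;

// Hypothetical wrapper that mirrors one directory tree into memory.
public sealed class WatchedIndex
{
    // Watcher events arrive on thread-pool threads, hence the concurrent map.
    private readonly ConcurrentDictionary<string, string> _paths =
        new ConcurrentDictionary<string, string>(StringComparer.OrdinalIgnoreCase);

    public int Count { get { return _paths.Count; } }

    public void Add(string fullPath) { _paths[fullPath] = fullPath; }

    public void Remove(string fullPath)
    {
        string ignored;
        _paths.TryRemove(fullPath, out ignored);
    }

    public bool Contains(string fullPath) { return _paths.ContainsKey(fullPath); }

    // Starts watching 'root' first, then loads the files already present,
    // so files created during the initial scan are not missed.
    // The caller owns the returned watcher and should dispose it.
    public FileSystemWatcher Watch(string root)
    {
        var watcher = new FileSystemWatcher(root);
        watcher.IncludeSubdirectories = true;
        // Created and Renamed also fire for folders, so check for a file.
        watcher.Created += (s, e) => { if (File.Exists(e.FullPath)) Add(e.FullPath); };
        watcher.Deleted += (s, e) => Remove(e.FullPath);
        watcher.Renamed += (s, e) =>
        {
            Remove(e.OldFullPath);
            if (File.Exists(e.FullPath)) Add(e.FullPath);
        };
        watcher.EnableRaisingEvents = true;

        foreach (string path in Directory.EnumerateFiles(root, "*", SearchOption.AllDirectories))
            Add(path);
        return watcher;
    }
}
```

This sketch does not handle a renamed folder (the files below it keep their old paths in the index) or the watcher's Error event, which is raised when its internal buffer overflows. Both would need attention before relying on it.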
Edit: Just now noticed your concern about duplicates. If that's a problem, I would create a List<fileclass>, where fileclass is a class containing the needed information on the files. Then search the list using LINQ, as that could give you zero, one or more hits. I think that would be more efficient than a dictionary with a list as value, where the list would contain one or more items (duplicates).
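A sketch of the list-plus-LINQ search, for illustration only. `FileRecord` stands in for "fileclass", and `FileListSearch` and `FindByName` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stands in for "fileclass"; the properties are assumptions.
public class FileRecord
{
    public string Name { get; set; }
    public string Directory { get; set; }
}

public static class FileListSearch
{
    // Returns every record whose name matches, ignoring case:
    // zero, one or more hits.
    public static List<FileRecord> FindByName(IEnumerable<FileRecord> files, string name)
    {
        return files
            .Where(f => string.Equals(f.Name, name, StringComparison.OrdinalIgnoreCase))
            .ToList();
    }
}
```

Each call walks the whole list, so the cost of a search grows with the number of files. Whether this beats a dictionary of lists for repeated searches is worth measuring on real data.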