
C - Reading multiple files

I just had a general question about how to approach a problem I'm facing. I'm fairly new to C, so bear with me. Say I have a folder with 1000+ text files. The files are not named in any numbered order, but they are alphabetical. The files contain stock data, and each file is named after the company's ticker. I want to write a program that opens each file, reads the data, finds the historical low, compares it to the current price, calculates the percent change, and prints it. Searching and calculating are not a problem; the problem is getting the program to go through and open each file. The only way I can see to attack this is to create a text file containing all of the ticker symbols, have the program read that into an array, and then run a loop that opens the first filename in the array, performs the calculations, prints the output, closes the file, and then moves on to the second element (the next ticker symbol). This would be fairly simple to set up (I think), but I'd really like to avoid typing over a thousand file names into a text file. Is there a better way to approach this? I'm not really asking for code (unless there is some amazing function in C that will do this for me ;) ), just some advice from more experienced C programmers.

Thanks :)

Edit: This is on Linux, sorry I forgot to mention that!

Under Linux/Unix (BSD, OS X, POSIX, etc.) you can use opendir / readdir to go through the directory structure. There is no need to generate static files that have to be kept up to date when the file system already has the information you want. If you only want a subset of stocks at a given time, then using glob would be quicker; there is also scandir.
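To make the opendir / readdir suggestion concrete, here is a minimal sketch, assuming a flat directory of text files. The names process_directory and count_lines are made up for illustration, and counting lines only stands in for the real work of finding the historical low.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <sys/stat.h>

/* Placeholder for the real per-file work (finding the low and so on).
   Here it only counts the lines in the file. */
static long count_lines(FILE *fp)
{
    long lines = 0;
    int c;
    while ((c = fgetc(fp)) != EOF)
        if (c == '\n')
            lines++;
    return lines;
}

/* Open every regular file in dir_path (not recursive) and print its name
   and line count. Returns the number of files read, or -1 if the
   directory cannot be opened. */
int process_directory(const char *dir_path)
{
    DIR *dir = opendir(dir_path);
    if (dir == NULL)
        return -1;

    int processed = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        char path[4096];
        int n = snprintf(path, sizeof path, "%s/%s", dir_path, entry->d_name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;               /* path does not fit, skip it */

        struct stat st;
        if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
            continue;               /* skips ".", ".." and subdirectories */

        FILE *fp = fopen(path, "r");
        if (fp == NULL)
            continue;
        printf("%s: %ld lines\n", entry->d_name, count_lines(fp));
        fclose(fp);
        processed++;
    }
    closedir(dir);
    return processed;
}
```

Calling process_directory("/path/to/text/files") from main would visit each file once. Note that readdir returns entries in no particular order, so sort the names yourself if the order matters.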

I don't know what the Win32 (Windows / Platform SDK) functions are called, if you are developing with Visual C++ as your C compiler. Searching the MSDN Library should help you.

Assuming you're running on Linux...

ls /path/to/text/files > names.txt

is exactly what you want.

There are no functions in standard C that have any notion of a "directory". You will need to use some kind of platform-specific function to do this. For some examples, take a look at this post from Cprogramming.com.

Personally, I prefer using the opendir() / readdir() approach as shown in the second example. It works natively under Linux, and also on Windows if you are using Cygwin.

Approach 1) I would just have a specific directory that holds ONLY the files containing the ticker data and nothing else. I would then use the readdir API to list all files in the directory and iterate over them, performing the data processing that you require on each one. Which ticker a file applies to is determined only by the filename.

Pros: Easy to code.

Cons: It really depends on where the files are stored and where they come from.
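Since Approach 1 takes the ticker from the filename alone, a small helper like the one below could do the extraction. This is only a sketch: ticker_from_filename is a made-up name, and it assumes the files may carry an extension such as ".txt", which the question does not actually say. If your files have no extension but some tickers contain a dot, drop the extension handling.

```c
#include <string.h>

/* Copy the ticker part of a file name into out: everything after the
   last '/' and before the last '.'. "data/AAPL.txt" gives "AAPL";
   a name without an extension is copied whole. Returns 0 on success,
   -1 if the result is empty or does not fit in out_size bytes. */
int ticker_from_filename(const char *path, char *out, size_t out_size)
{
    const char *base = strrchr(path, '/');
    base = (base != NULL) ? base + 1 : path;

    /* A leading dot (hidden file) is not treated as an extension. */
    const char *dot = strrchr(base, '.');
    size_t len = (dot != NULL && dot != base) ? (size_t)(dot - base)
                                              : strlen(base);

    if (len == 0 || len >= out_size)
        return -1;
    memcpy(out, base, len);
    out[len] = '\0';
    return 0;
}
```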

Approach 2) Change the file format so that each ticker file starts with a magic code identifying it as a ticker file, followed by a string containing the name. As before, use readdir to iterate through all files in the folder. Open each file, check that the magic code is present, read the ticker name from the file, and process the data as before.

Pros: More flexible than before. The filename needn't reflect the name of the ticker.

Cons: Harder to code, and the file format may be fixed.
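As a rough illustration of Approach 2, the check could look like the sketch below. The header layout is purely an assumption: a first line of the form "TICK <name>", shorter than 128 characters, with the data following on later lines. The answer does not specify a format, and TICKER_MAGIC and read_ticker_header are made-up names.

```c
#include <stdio.h>
#include <string.h>

#define TICKER_MAGIC "TICK"   /* hypothetical magic code */

/* Read the first line of fp and check that it looks like "TICK <name>".
   On success copy the name into name and return 1, leaving fp positioned
   at the start of the data. Return 0 if the header is missing or the
   name is empty or too long for name_size bytes. */
int read_ticker_header(FILE *fp, char *name, size_t name_size)
{
    char line[128];
    if (name_size == 0 || fgets(line, sizeof line, fp) == NULL)
        return 0;

    size_t magic_len = strlen(TICKER_MAGIC);
    if (strncmp(line, TICKER_MAGIC, magic_len) != 0 || line[magic_len] != ' ')
        return 0;

    const char *start = line + magic_len + 1;
    size_t len = strcspn(start, "\r\n");     /* drop the line ending */
    if (len == 0 || len >= name_size)
        return 0;

    memcpy(name, start, len);
    name[len] = '\0';
    return 1;
}
```

Files that fail the check can simply be skipped, which is what lets the directory hold other files as well.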

but I'd really like to avoid typing out over a thousand file names into a text file. Is there a better way to approach this?

I solved the exact same problem a while back, albeit for personal use :)

What I did was use the OS shell commands to generate a list of those files, redirect the output to a text file, and have my program run through them.
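A sketch of the program side of that approach, assuming the list holds one file name per line (which is what ls writes when its output is redirected). process_list is a made-up name, and printing the first line of each file is just a stand-in for the real processing.

```c
#include <stdio.h>
#include <string.h>

/* Read file names from list_path (one per line), open each one inside
   dir_path and print its first line. Returns the number of files
   opened, or -1 if the list itself cannot be opened. */
int process_list(const char *list_path, const char *dir_path)
{
    FILE *list = fopen(list_path, "r");
    if (list == NULL)
        return -1;

    int opened = 0;
    char name[256];
    while (fgets(name, sizeof name, list) != NULL) {
        name[strcspn(name, "\r\n")] = '\0';   /* strip the newline */
        if (name[0] == '\0')
            continue;                          /* skip blank lines */

        char path[4096];
        int n = snprintf(path, sizeof path, "%s/%s", dir_path, name);
        if (n < 0 || (size_t)n >= sizeof path)
            continue;                          /* path does not fit */

        FILE *fp = fopen(path, "r");
        if (fp == NULL) {
            fprintf(stderr, "cannot open %s\n", path);
            continue;
        }
        char first[256];
        if (fgets(first, sizeof first, fp) != NULL) {
            first[strcspn(first, "\r\n")] = '\0';
            printf("%s: %s\n", name, first);
        }
        fclose(fp);
        opened++;
    }
    fclose(list);
    return opened;
}
```

One thing to watch: if the list is written into the same directory it describes, the shell creates the list file before ls runs, so the list ends up containing its own name. Writing it somewhere else avoids that.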

In pseudocode it would look like this. I won't write actual code, as I'm not 100% sure this is the correct approach...

for each directory entry
    scan the filename
    extract the ticker name from the filename
    open the file
        read the data
        create a record consisting of the filename, data.....
    close the file
    add the record to a list/array...
sort the list/array into alphabetical order based on
  the ticker name in the filename...

You could vary it slightly if you wish: scan the filenames in the directory entries and sort them first, building a record for each filename, then go back to the start of the list/array and open each file in turn, reading the data and putting it into the record....
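For the "sort the filenames first" variation, the POSIX scandir function together with alphasort can do the scan and the sort in one call. The sketch below is one possible way to set it up, not necessarily what the answer had in mind; sorted_entries, free_entries, print_sorted and not_hidden are made-up names, and skipping names that start with a dot is my own assumption.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

/* scandir filter: keep everything except hidden entries, which also
   drops "." and "..". */
static int not_hidden(const struct dirent *entry)
{
    return entry->d_name[0] != '.';
}

/* Fill *out with the entries of dir_path sorted by alphasort.
   Returns the number of entries, or -1 on error. */
int sorted_entries(const char *dir_path, struct dirent ***out)
{
    return scandir(dir_path, out, not_hidden, alphasort);
}

/* Release an array returned through sorted_entries. */
void free_entries(struct dirent **entries, int n)
{
    for (int i = 0; i < n; i++)
        free(entries[i]);
    free(entries);
}

/* Print the sorted names; this loop is where each file would be opened
   and read. Returns the number of entries, or -1 on error. */
int print_sorted(const char *dir_path)
{
    struct dirent **entries;
    int n = sorted_entries(dir_path, &entries);
    if (n < 0)
        return -1;
    for (int i = 0; i < n; i++)
        printf("%s\n", entries[i]->d_name);
    free_entries(entries, n);
    return n;
}
```

alphasort compares names with strcoll, so the exact order follows the current locale; in the default "C" locale that is plain byte order.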

Hope this helps, best regards, Tom.

On UNIX, there's the handy glob function:

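/* needs <glob.h>, <stdio.h> and <string.h> */
glob_t results;
size_t i;
memset(&results, 0, sizeof(results));
if (glob("*.txt", 0, NULL, &results) == 0) {   /* 0 means at least one match */
    for (i = 0; i < results.gl_pathc; i++)
        printf("%s\n", results.gl_pathv[i]);
}
globfree(&results);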

On Linux or a related system, you could use the fts library, which is designed for traversing file hierarchies (see man fts), or even something as simple as readdir.
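For completeness, a small sketch of what an fts traversal might look like. Unlike a plain readdir loop it descends into subdirectories. walk_tree is a made-up name, and the choice of the FTS_PHYSICAL and FTS_NOCHDIR options is my own assumption about what suits this case.

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/stat.h>
#include <fts.h>
#include <stdio.h>

/* Walk the tree rooted at root and print every regular file with its
   size. Returns the number of regular files found, or -1 if the
   traversal cannot be started. */
int walk_tree(const char *root)
{
    char *const paths[] = { (char *)root, NULL };

    /* FTS_PHYSICAL: do not follow symbolic links.
       FTS_NOCHDIR: leave the working directory alone. */
    FTS *tree = fts_open(paths, FTS_PHYSICAL | FTS_NOCHDIR, NULL);
    if (tree == NULL)
        return -1;

    int files = 0;
    FTSENT *node;
    while ((node = fts_read(tree)) != NULL) {
        if (node->fts_info == FTS_F) {        /* regular file */
            printf("%s (%lld bytes)\n", node->fts_path,
                   (long long)node->fts_statp->st_size);
            files++;
        }
    }
    fts_close(tree);
    return files;
}
```

Inside the loop, node->fts_path is the path you would pass to fopen and node->fts_name is the bare file name.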

If on Windows, you can use the Directory Management APIs. More specifically, use the FindFirstFile function with wildcards, in conjunction with FindNextFile.
