Reading a specific number of lines from a file in C (scanf, fseek, fgets)
I have a master process that spawns N child processes, which communicate with the parent through unnamed pipes. I must be able to:
My problem does not concern the OS concepts, only the file operations :S
Perhaps fseek? I can't mmap the log file (some are larger than 1 GB).
I would appreciate some ideas. Thank you in advance.
EDIT: I'm trying to make the children read their respective lines without using fseek and the chunk values, so could someone please tell me if this is valid?
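// somewhere in the parent process:
FILE *logFile = fopen(filename, "r");
if (logFile == NULL) { perror("fopen"); exit(EXIT_FAILURE); }
// first pass: count the lines (a line longer than 1023 bytes is counted more than once)
while (fgets(line, 1024, logFile) != NULL) {
    num_lines++;
}
rewind(logFile);
int prev = -1; // last line number handed out so far
for (i = 0; i < maps_nr; i++) {
    struct send_to_Map request;
    // NOTE: a FILE* inherited through fork() shares its file offset with the
    // parent and the other children; each child should fopen() the file itself.
    request.fp = logFile;
    request.lower = lowLimit;
    request.upper = highLimit;
    request.minLine = prev + 1;
    if (i != maps_nr - 1)
        request.maxLine = request.minLine + num_lines / maps_nr - 1;
    else
        request.maxLine = num_lines - 1; // the last child also takes the remainder
    prev = request.maxLine;
    // write this structure to the respective pipe
}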
// child process:
while (1) {
    ...
    // reads the structure from the pipe (and so knows which lines to read)
    int n = 0, counter = 0;
    // stop as soon as the last assigned line has been read
    while (n <= maxLine && fgets(line, 1024, logFile) != NULL) {
        if (n >= minLine)
            counter += process(line); // returns 1 if an IP between the low and high limits was found in that line
        n++;
    }
    // (...)
}
I don't know if it's going to work; I just want to make it work! Even like this, is it possible to outperform a single process that reads the whole file and prints the total number of IPs found in the log file?
If you don't care about dividing the file exactly evenly, and the line lengths are distributed fairly evenly across the file, you can avoid having the parent read the entire file up front.
... start and reads chunk_size bytes. That's a rough sketch of the strategy.
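The strategy needs a chunk_size to start from. A minimal sketch of one way to derive it from the file size is shown here; file_size() and chunk_size_for() are hypothetical helper names, not part of the answer above, and the sketch assumes a POSIX-like system where seeking to the end of a regular file works. Where long is 32 bits, files over 2 GB would need fseeko/ftello instead.

```c
#include <assert.h>
#include <stdio.h>

/* Returns the size of an open file in bytes, or -1 on error.
   The stream is rewound to the start afterwards. */
long file_size(FILE *f)
{
    long size;
    assert(f != NULL);
    if (fseek(f, 0L, SEEK_END) != 0)
        return -1;
    size = ftell(f);   /* offset of the end of the file == its size */
    rewind(f);
    return size;
}

/* Hypothetical helper: approximate number of bytes per child, rounded up
   so that n_children chunks always cover the whole file. */
long chunk_size_for(long size, int n_children)
{
    assert(size >= 0 && n_children > 0);
    return (size + n_children - 1) / n_children;
}
```

The parent would call file_size() once and pass the result of chunk_size_for() on as chunk_size; the actual chunk ends are then moved forward to the next newline.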
Edited to simplify things a bit.
Edit: here's some code for step 3, with step 4 below. This is all untested, and I haven't been careful about off-by-one errors, but it gives you an idea of how to use fseek and ftell, which sounds like what you are looking for.
// Assume FILE* f is open to the file, chunk_size is the average expected size,
// child_num is the id of the current child, spawn_child() is a function that
// handles the logic of spawning a child and telling it where to start reading,
// and how much to read. child_chunks[] is an array of structs to keep track of
// where the chunks start and how big they are; child_chunks[0].start is 0.
// Seek to the approximate end of this child's chunk, then advance to the next newline.
if (fseek(f, (long)(child_num + 1) * chunk_size, SEEK_SET) != 0) { handle_error(); }
int ch;
while ((ch = fgetc(f)) != EOF && ch != '\n')
{ /* empty */ }
// FIXME: needs to handle EOF properly.
child_chunks[child_num].end = ftell(f); // FIXME: needs error check.
// ftell() is now just past the '\n', which is the first byte of the next chunk.
child_chunks[child_num + 1].start = child_chunks[child_num].end;
spawn_child(child_num);
Then in your child (step 4), assume the child has access to child_chunks[] and knows its child_num:
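void this_is_the_child(int child_num)
{
    /* ... */
    fseek(f, child_chunks[child_num].start, SEEK_SET); // FIXME: handle error
    // Check the position before reading, so the last line of the chunk is not dropped.
    while (ftell(f) < child_chunks[child_num].end && fgets(...))
    {
        /* process the line */
    }
}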
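To make the child's loop concrete, here is a minimal self-contained sketch that reads every line inside a byte range and counts them. count_lines_in_range() is a hypothetical name, the 1024-byte buffer is an assumption carried over from the question, and a real child would run its own per-line check instead of just counting.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads every line that begins in [start, end) and
   returns how many were read, or -1 if the seek fails.
   Assumes start is the first byte of a line, end falls on a line boundary,
   and no line is longer than the buffer. */
long count_lines_in_range(FILE *f, long start, long end)
{
    char line[1024];
    long count = 0;
    assert(f != NULL && start >= 0 && start <= end);
    if (fseek(f, start, SEEK_SET) != 0)
        return -1;
    /* Test the offset before each read so the final line is not skipped. */
    while (ftell(f) < end && fgets(line, sizeof line, f) != NULL) {
        /* A real child would examine the line here. */
        count++;
    }
    return count;
}
```

Each child should call this on a stream it opened itself, since a FILE* inherited across fork() shares its offset with the other processes.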
#include <stdio.h>
#include <stdlib.h>

/* get an array with line-startpositions (file-offsets); returns the number of lines */
/* long offsets are used because fpos_t need not be an arithmetic type */
long readLineBegins(FILE *f, long **begins)
{
    int ch = 0;
    long mark = 0, num = 0;
    *begins = NULL;
    do {
        if (ch == '\n')
        {
            *begins = realloc(*begins, ++num * sizeof(long)); /* FIXME: check for NULL */
            (*begins)[num-1] = mark;
            mark = ftell(f);
        }
    } while ((ch = fgetc(f)) != EOF);
    if (mark < ftell(f)) /* last line has no trailing newline */
    {
        *begins = realloc(*begins, ++num * sizeof(long)); /* FIXME: check for NULL */
        (*begins)[num-1] = mark;
    }
    return num;
}

/* output linenumber beg...end */
void workLineBlocks(FILE *f, long *begins, long beg, long end)
{
    while (beg <= end)
    {
        int ch;
        fseek(f, begins[beg], SEEK_SET); /* set linestart-position */
        printf("%ld:", ++beg);
        while ((ch = fgetc(f)) != EOF && ch != '\n' && ch != '\r')
            putchar(ch);
        puts("");
    }
}

int main(void)
{
    FILE *f = fopen("file.txt", "rb");
    long *lineBegins; /* Array with line-startpositions */
    long lb;
    if (f == NULL) { perror("file.txt"); return 1; }
    lb = readLineBegins(f, &lineBegins); /* get number of lines */
    if (lb >= 2)
    {
        workLineBlocks(f, lineBegins, lb-2, lb-1); /* out last two lines */
        workLineBlocks(f, lineBegins, 0, 1); /* out first two lines */
    }
    fclose(f);
    free(lineBegins);
    return 0;
}
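With an index of line start positions, handing each child a block of lines comes down to arithmetic on line numbers. The following is only a sketch of one even split; struct line_range and range_for_worker() are hypothetical names, and the ranges are inclusive so they can be passed straight to workLineBlocks().

```c
#include <assert.h>

/* Inclusive range of line numbers; the range is empty when last < first. */
struct line_range { long first; long last; };

/* Hypothetical helper: splits lines [0, num_lines) among n_workers as evenly
   as possible. The first (num_lines % n_workers) workers get one extra line. */
struct line_range range_for_worker(long num_lines, int n_workers, int worker)
{
    struct line_range r;
    long base, extra;
    assert(num_lines >= 0 && n_workers > 0 && worker >= 0 && worker < n_workers);
    base = num_lines / n_workers;    /* lines every worker gets */
    extra = num_lines % n_workers;   /* leftover lines to spread out */
    r.first = worker * base + (worker < extra ? worker : extra);
    r.last = r.first + base + (worker < extra ? 1 : 0) - 1;
    return r;
}
```

For 10 lines and 3 workers this gives the ranges 0..3, 4..6 and 7..9. Spreading the remainder over the first workers keeps the blocks within one line of each other, instead of giving the whole remainder to the last child.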
I think this may help you: Reading a specific range of lines from a text file