
C# FileStream.Read having problems with StreamReader.EndOfStream

As the title says, I found a problem. A little backstory first: we have a file.txt that looks like this:

aaaabb
ccccddd
eeeefffffff

There are many ways to read this text line-by-line, one of which is this:

StreamReader sr = new StreamReader("file.txt");
while(!sr.EndOfStream)
{
    string s = sr.ReadLine();
}
sr.Close();

Works. s gets each line. Now I need the first 4 letters as bytes and the rest as string. After looking up things and experimenting a little, I found that the easiest way is this:

FileStream fs = new FileStream("file.txt", FileMode.Open);
StreamReader sr = new StreamReader(fs);
byte[] arr = new byte[4];
fs.Read(arr, 0, 4);
string s = sr.ReadLine();
sr.Close();
fs.Close();

Works. arr contains the first 4 letters as bytes and the rest of the line is saved in s. This is only a single line. If we add the while loop:

FileStream fs = new FileStream("file.txt", FileMode.Open);
StreamReader sr = new StreamReader(fs);
while(!sr.EndOfStream)
{
    byte[] arr = new byte[4];
    fs.Read(arr, 0, 4);
    string s = sr.ReadLine();
} 
sr.Close();
fs.Close();

Now there's a problem. Now arr doesn't get anything and s reads the whole line, including the first 4 letters. Even stranger, if I use while(true) (and, I assume, anything else that is not the EndOfStream check), then it works as intended: 4 characters as bytes and the rest as a string, and this is the same for every line.

What am I missing? Why is this happening? How do I solve it? Or is it possible that this is a bug?

The problem here is simply buffering. When you attach a StreamReader to the FileStream, the reader's first read (here, the call to EndOfStream) consumes a whole block from the file, advancing the current Position of the FileStream. With your example file and the default buffer size, the StreamReader pulls the entire file into its buffer, leaving the FileStream at EOF. When you then try to read 4 bytes from the FileStream directly through your fs reference, there is nothing left to consume. The following ReadLine still works on your sr reference because it reads from the buffered file content.

Here's a step-by-step breakdown of what's happening:

  1. fs opens the file and sits at Position 0.
  2. sr wraps fs, and the call to EndOfStream ends up consuming (in this case) 27 bytes into its internal buffer. At this point, the fs Position sits at EOF.
  3. You attempt to read from fs directly, but it's at EOF with no more bytes.
  4. sr.ReadLine reads from the buffer it built up in step #2, so it still returns the whole line.
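The steps above can be observed directly by checking fs.Position before and after the first EndOfStream call. The following is only a minimal sketch: the class and method names (BufferingDemo, PositionsAroundEndOfStream) are made up for illustration, and it assumes the file is smaller than the reader's internal buffer.

```csharp
using System;
using System.IO;

static class BufferingDemo
{
    // Hypothetical helper: returns the FileStream position before and after
    // the first EndOfStream check on a StreamReader that wraps it.
    public static (long Before, long After) PositionsAroundEndOfStream(string path)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        using (StreamReader sr = new StreamReader(fs))
        {
            long before = fs.Position;    // constructing the reader has not read anything yet
            bool atEnd = sr.EndOfStream;  // forces the reader to fill its internal buffer
            long after = fs.Position;     // for a small file, the stream is now at its end
            return (before, after);
        }
    }
}
```

For the small sample file, After should equal the file length, which is why the subsequent fs.Read finds nothing left.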

To fix your specific error case, you could change your byte array to a char array and use sr.Read instead, i.e.:

char[] arr = new char[4];
sr.Read(arr, 0, 4);
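Put into the original loop, that suggestion might look roughly like the sketch below. The names PrefixReader and ReadAll are hypothetical, and the code assumes ASCII text in which every line has at least 4 characters; a shorter line would let Read run past the line break.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class PrefixReader
{
    // Reads each line as (first 4 characters as bytes, rest of the line as string).
    // Everything goes through the StreamReader, so its buffer is never bypassed.
    public static List<(byte[] Prefix, string Rest)> ReadAll(string path)
    {
        var result = new List<(byte[] Prefix, string Rest)>();
        using (var sr = new StreamReader(path, Encoding.ASCII))
        {
            char[] arr = new char[4];
            while (!sr.EndOfStream)
            {
                int n = sr.Read(arr, 0, 4);                   // can return fewer than 4 characters
                string rest = sr.ReadLine() ?? string.Empty;  // remainder of the current line
                // Assumption: ASCII content, so one character maps to one byte.
                result.Add((Encoding.ASCII.GetBytes(arr, 0, n), rest));
            }
        }
        return result;
    }
}
```

Converting the characters back to bytes with Encoding.ASCII is a choice made for this sketch; a file in another encoding would need the matching Encoding in both places.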

"Now there's a problem. Now arr doesn't get anything and s reads the whole line including the first 4 letters."

Yes, that seems very plausible. StreamReader maintains a buffer - when you ask it to read a line of text, it may well read more from the stream than that single line, using that buffered data when it's next asked for information.

Fundamentally, I would strongly advise against reading directly from the stream that the StreamReader is reading from. It's going to be very fiddly to get right even where it's possible, and in some cases the API may just not let you do what you want.

If you want to remove the first four characters from each line, it would be much simpler to read the whole line and then use Substring.
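A minimal sketch of the Substring approach follows. LineSplitter, Split and SplitFile are illustrative names, and the use of Encoding.ASCII to get the first four characters as bytes is an assumption about what the question needs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class LineSplitter
{
    // Splits one line into its first 4 characters (as ASCII bytes) and the remainder.
    public static (byte[] Prefix, string Rest) Split(string line)
    {
        int cut = Math.Min(4, line.Length);  // guard against lines shorter than 4 characters
        byte[] prefix = Encoding.ASCII.GetBytes(line.Substring(0, cut));
        string rest = line.Substring(cut);
        return (prefix, rest);
    }

    // Applies Split to every line of a file, reading whole lines only.
    public static List<(byte[] Prefix, string Rest)> SplitFile(string path)
    {
        var result = new List<(byte[] Prefix, string Rest)>();
        foreach (string line in File.ReadLines(path))
        {
            result.Add(Split(line));
        }
        return result;
    }
}
```

Because only whole lines are read, nothing here depends on how much the reader happens to buffer.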
