StreamReader's BaseStream position using Mono
I am trying to write some simple code to index some Wikipedia XML pages. The idea was to get the byte offset of each character by reading a character with StreamReader, then saving the position of the underlying byte stream so that I could return to that position later.
I am using a short test file that contains just "感\na\nb" (8 bytes), with a newline after each character. I then tried this code in the Main function:
using System;
using System.IO;

namespace indexer
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            StreamReader sr = new StreamReader(@"/home/chris/Documents/len.txt");
            Console.Out.WriteLine(" length of file is " + sr.BaseStream.Length + " bytes ");
            sr.Read(); // read the first character (not the first byte)
            Console.Out.WriteLine(" current position is " + sr.BaseStream.Position);
            sr.Close();
        }
    }
}
This gives the output:
length of file is 8 bytes
current position is 8
I expected the position to be 3, since only the first character should have been read. If I call sr.Read() again, I do get the next character correctly, but the position remains 8.
Am I misunderstanding how this should work, or have I discovered a bug of some sort?
Thank you.
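No, it is not a bug. StreamReader uses a 1 KB internal buffer, which is filled when you call StreamReader.Read(). BaseStream.Position therefore reports how far the underlying stream has been read to fill that buffer, not the position of the last character returned. Your 8-byte file fits entirely in the buffer, so the position is already 8 after the first read.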
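The buffering can be seen without a file on disk. The following is only a sketch: the class and method names (BufferDemo, PositionAfterOneRead, ReadCharAt) are made up for illustration, and a MemoryStream holding the same 8 bytes stands in for the test file. It also shows that, if you seek the base stream yourself, StreamReader.DiscardBufferedData() should be called so the reader does not keep serving stale buffered characters.

```csharp
using System;
using System.IO;
using System.Text;

static class BufferDemo
{
    // Reads a single character and reports how far the base stream advanced.
    public static long PositionAfterOneRead(StreamReader reader)
    {
        reader.Read(); // returns one character, but fills the reader's buffer first
        return reader.BaseStream.Position;
    }

    // Seeks the base stream to a known byte offset and reads the character there.
    // The offset is assumed to be the start of a character, not the middle of one.
    public static char ReadCharAt(StreamReader reader, long byteOffset)
    {
        reader.BaseStream.Position = byteOffset;
        reader.DiscardBufferedData(); // drop characters buffered before the seek
        return (char)reader.Read();
    }

    static void Main()
    {
        // Same content as the 8-byte test file: 3 bytes for the first character,
        // then five single-byte characters.
        byte[] data = Encoding.UTF8.GetBytes("感\na\nb\n");

        using (var reader = new StreamReader(new MemoryStream(data), Encoding.UTF8))
        {
            Console.WriteLine(PositionAfterOneRead(reader)); // 8, not 3
            Console.WriteLine(ReadCharAt(reader, 4));        // a
        }
    }
}
```

So the base stream position is only useful for seeking to offsets you have computed some other way; it does not track the reader's logical position.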
To get the number of bytes in a character or string that has been read, call the Encoding.GetByteCount() method. The current encoding is available from StreamReader.CurrentEncoding.
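For the indexing task in the question, one way to apply this is to keep a running byte count yourself instead of asking the base stream. This is a minimal sketch, not a definitive implementation: OffsetIndexer and CharOffsets are hypothetical names, and it assumes the input has no byte order mark and is valid text in the given encoding. Surrogate pairs are counted together, because a lone surrogate would not be measured correctly by GetByteCount().

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class OffsetIndexer
{
    // Returns the byte offset at which each character (or surrogate pair) starts.
    // Assumes: no byte order mark, and valid text in the given encoding.
    public static List<long> CharOffsets(Stream stream, Encoding encoding)
    {
        var offsets = new List<long>();
        var chars = new char[2];
        long position = 0;

        using (var reader = new StreamReader(stream, encoding, false))
        {
            int c;
            while ((c = reader.Read()) != -1)
            {
                chars[0] = (char)c;
                int length = 1;

                // Keep both halves of a surrogate pair together when counting bytes.
                if (char.IsHighSurrogate(chars[0]) && reader.Peek() != -1)
                {
                    chars[1] = (char)reader.Read();
                    length = 2;
                }

                offsets.Add(position);
                position += reader.CurrentEncoding.GetByteCount(chars, 0, length);
            }
        }
        return offsets;
    }
}

class Demo
{
    static void Main()
    {
        // Same content as the 8-byte test file from the question.
        byte[] data = Encoding.UTF8.GetBytes("感\na\nb\n");
        List<long> offsets = OffsetIndexer.CharOffsets(new MemoryStream(data), Encoding.UTF8);
        Console.WriteLine(string.Join(", ", offsets)); // 0, 3, 4, 5, 6, 7
    }
}
```

The recorded offsets can later be used to seek the underlying stream directly. If the file starts with a byte order mark, the running count would need to start at the length of the preamble rather than 0.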