Displaying large text files with WPF C#

I'm trying to write a WPF application that displays potentially large log files (50 MB to 2 GB) in a way that makes them easier to read. I tried loading a 5 MB file with about 75k lines into a GridView with TextBlocks, but it was really slow. I don't need any editing capabilities.

I came across GlyphRun, but I couldn't figure out how to use it. I imagine I would have to fill a canvas or image with a GlyphRun for each line of my log file. Could anyone tell me how to do this? Unfortunately, the documentation on GlyphRun is not very helpful.

I have this file reading algorithm from a proof-of-concept application (which was also a log file viewer/diff viewer). The implementation requires C# 8.0 (.NET Core 3.x or .NET 5). I removed some indexing, cancellation, etc. to reduce noise and to show the core of the algorithm.
It performs quite fast and compares very well with editors like Visual Studio Code. It can't get much faster. To keep the UI responsive, I highly recommend using UI virtualization. If you implement UI virtualization, the bottleneck will be the file reading operation. You can tweak the algorithm's performance by using different partition sizes (you could implement some smart partitioning to calculate them dynamically).
The key parts of the algorithm are:

  • asynchronous implementation of the Producer-Consumer pattern using Channel
  • partitioning of the source file into blocks of n bytes
  • parallel processing of the file partitions (concurrent file reading)
  • merging the resulting document blocks and the lines that overlap partition boundaries
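
The first three points can be tried in isolation with a small console sketch. This is not the implementation shown further down but a simplified variation: it assumes the producer knows the file length up front, so it can complete the channel itself instead of being stopped by a consumer. The temporary file, the 16-byte partition size and all variable names are made up for the demo, and the code is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical demo input: a small temporary file and a tiny partition size.
const int partitionSize = 16;
string expectedText = string.Concat(Enumerable.Range(1, 20).Select(i => $"line {i}\n"));
string path = Path.GetTempFileName();
File.WriteAllText(path, expectedText);
long fileLength = new FileInfo(path).Length;

var channel = Channel.CreateBounded<(long Lower, long Upper)>(
  new BoundedChannelOptions(Environment.ProcessorCount) { SingleWriter = true });
var blocks = new ConcurrentBag<(long Rank, byte[] Bytes)>();

// Consumers: each one opens its own stream and reads the partitions it receives.
Task[] consumers = Enumerable.Range(0, Environment.ProcessorCount).Select(_ => Task.Run(async () =>
{
  await using var file = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read);
  await foreach (var partition in channel.Reader.ReadAllAsync())
  {
    var buffer = new byte[partition.Upper - partition.Lower];
    file.Seek(partition.Lower, SeekOrigin.Begin);

    // ReadAsync may return fewer bytes than requested, so loop until the buffer is full.
    int total = 0;
    int read;
    while (total < buffer.Length
      && (read = await file.ReadAsync(buffer, total, buffer.Length - total)) > 0)
    {
      total += read;
    }

    blocks.Add((partition.Lower, buffer));
  }
})).ToArray();

// Producer: emit [lower, upper) byte ranges until the whole file is covered.
for (long lower = 0; lower < fileLength; lower += partitionSize)
{
  await channel.Writer.WriteAsync((lower, Math.Min(lower + partitionSize, fileLength)));
}
channel.Writer.Complete();
await Task.WhenAll(consumers);

// Blocks arrive in arbitrary order; the lower bound serves as the rank to restore file order.
byte[] merged = blocks.OrderBy(block => block.Rank).SelectMany(block => block.Bytes).ToArray();
Console.WriteLine(Encoding.UTF8.GetString(merged) == expectedText);
File.Delete(path);
```

Because the blocks complete in arbitrary order, each one carries the lower bound of its partition as a rank, which is the same role Rank plays in DocumentBlock below.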

DocumentBlock.cs
The result struct that holds the lines of a processed file partition.

using System.Collections.Generic;

public readonly struct DocumentBlock
{
  public DocumentBlock(long rank, IList<string> content, bool hasOverflow)
  {
    this.Rank = rank;
    this.Content = content;
    this.HasOverflow = hasOverflow;
  }

  public long Rank { get; }
  public IList<string> Content { get; }
  public bool HasOverflow { get; }
}
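
How HasOverflow is meant to be consumed may be easier to see on a tiny example. The sketch below is only an illustration under assumptions: plain tuples stand in for DocumentBlock, MergeBlocks is a hypothetical helper, and the three blocks are what 8-byte partitions of the text "alpha\nbeta\ngamma\n" would produce. It is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Joins blocks in rank order and stitches together lines that were cut at a partition boundary.
List<string> MergeBlocks(IEnumerable<(long Rank, List<string> Content, bool HasOverflow)> source)
{
  var document = new List<string>();
  string pendingFragment = string.Empty;
  bool isMergePending = false;

  foreach (var block in source.OrderBy(b => b.Rank))
  {
    var lines = new List<string>(block.Content);

    // The previous block ended mid-line: its fragment is the beginning of this block's first line.
    if (isMergePending && lines.Count > 0)
    {
      lines[0] = pendingFragment + lines[0];
      isMergePending = false;
    }

    // This block ends mid-line: hold the fragment back until the next block is processed.
    if (block.HasOverflow && lines.Count > 0)
    {
      pendingFragment = lines[lines.Count - 1];
      lines.RemoveAt(lines.Count - 1);
      isMergePending = true;
    }

    document.AddRange(lines);
  }

  return document;
}

// Blocks in the arbitrary order in which parallel consumers might have finished.
var blocks = new List<(long Rank, List<string> Content, bool HasOverflow)>
{
  (16, new List<string> { "" }, false),
  (0, new List<string> { "alpha", "be" }, true),
  (8, new List<string> { "ta", "gamma" }, true),
};

List<string> mergedLines = MergeBlocks(blocks);
Console.WriteLine(string.Join("|", mergedLines));   // prints "alpha|beta|gamma"
```

Note that the held-back fragment is prepended to the first line of the following block, since it is the start of that line.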

ViewModel.cs
The entry point is the public ViewModel.ReadFileAsync member.

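using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.ComponentModel;
using System.IO;
using System.Linq;
using System.Runtime.CompilerServices;
using System.Text;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

class ViewModel : INotifyPropertyChanged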
{
  public ViewModel() => this.DocumentBlocks = new ConcurrentBag<DocumentBlock>();

  // TODO::Make reentrant 
  // (for example cancel running operations and 
  // lock/synchronize the method using a SemaphoreSlim)
  public async Task ReadFileAsync(string filePath)
  {
    using var cancellationTokenSource = new CancellationTokenSource();

    this.DocumentBlocks.Clear();    
    this.EndOfFileReached = false;

    // Create the channel (Producer-Consumer implementation)
    BoundedChannelOptions channeloptions = new BoundedChannelOptions(Environment.ProcessorCount)
    {
      FullMode = BoundedChannelFullMode.Wait,
      AllowSynchronousContinuations = false,
      SingleWriter = true
    };

    var channel = Channel.CreateBounded<(long PartitionLowerBound, long PartitionUpperBound)>(channeloptions);

    // Create consumer threads
    var tasks = new List<Task>();
    for (int threadIndex = 0; threadIndex < Environment.ProcessorCount; threadIndex++)
    {
      Task task = Task.Run(async () => await ConsumeFilePartitionsAsync(channel.Reader, filePath, cancellationTokenSource));
      tasks.Add(task);
    }

    // Produce document byte blocks
    await ProduceFilePartitionsAsync(channel.Writer, cancellationTokenSource.Token);    
    await Task.WhenAll(tasks);    
    CreateFileContent();
    this.DocumentBlocks.Clear();
  }

  private void CreateFileContent()
  {
    var document = new List<string>();
    string overflowingLineContent = string.Empty;
    bool isOverflowMergePending = false;

    var orderedDocumentBlocks = this.DocumentBlocks.OrderBy(documentBlock => documentBlock.Rank);
    foreach (var documentBlock in orderedDocumentBlocks)
    {
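      if (isOverflowMergePending)
      {
        // The fragment cut off at the end of the previous block is the beginning of this block's first line
        documentBlock.Content[0] = overflowingLineContent + documentBlock.Content[0];
        isOverflowMergePending = false;
      }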

      if (documentBlock.HasOverflow)
      {
        overflowingLineContent = documentBlock.Content.Last();
        documentBlock.Content.RemoveAt(documentBlock.Content.Count - 1);
        isOverflowMergePending = true;
      }

      document.AddRange(documentBlock.Content);
    }

    this.FileContent = new ObservableCollection<string>(document);
  }

  private async Task ProduceFilePartitionsAsync(
    ChannelWriter<(long PartitionLowerBound, long PartitionUpperBound)> channelWriter, 
    CancellationToken cancellationToken)
  {
    var iterationCount = 0;
    while (!this.EndOfFileReached)
    {
      try
      {
        var partition = (iterationCount++ * ViewModel.PartitionSizeInBytes,
          iterationCount * ViewModel.PartitionSizeInBytes);
        await channelWriter.WriteAsync(partition, cancellationToken);
      }
      catch (OperationCanceledException)
      {}
    }
    channelWriter.Complete();
  }

  private async Task ConsumeFilePartitionsAsync(
    ChannelReader<(long PartitionLowerBound, long PartitionUpperBound)> channelReader, 
    string filePath, 
    CancellationTokenSource waitingChannelWriterCancellationTokenSource)
  {
    // Each consumer opens its own stream so that partitions can be read concurrently
    await using var file = File.OpenRead(filePath);

    await foreach ((long PartitionLowerBound, long PartitionUpperBound) filePartitionInfo
      in channelReader.ReadAllAsync())
    {
      if (filePartitionInfo.PartitionLowerBound >= file.Length)
      {
        // Tell the producer to stop and release it in case it is waiting on a full channel
        this.EndOfFileReached = true;
        waitingChannelWriterCancellationTokenSource.Cancel();
        return;
      }

      var documentBlockLines = new List<string>();
      file.Seek(filePartitionInfo.PartitionLowerBound, SeekOrigin.Begin);
      var filePartition = new byte[filePartitionInfo.PartitionUpperBound - filePartitionInfo.PartitionLowerBound];

      // ReadAsync may return fewer bytes than requested, so read until the buffer is full or the file ends
      int totalBytesRead = 0;
      int bytesRead;
      while (totalBytesRead < filePartition.Length
        && (bytesRead = await file.ReadAsync(filePartition, totalBytesRead, filePartition.Length - totalBytesRead)) > 0)
      {
        totalBytesRead += bytesRead;
      }

      // The last partition is usually shorter than the buffer: drop the unused tail
      Array.Resize(ref filePartition, totalBytesRead);

      // Extract lines
      bool isLastLineComplete = ExtractLinesFromFilePartition(filePartition, documentBlockLines); 

      bool documentBlockHasOverflow = !isLastLineComplete && file.Position != file.Length;
      var documentBlock = new DocumentBlock(filePartitionInfo.PartitionLowerBound, documentBlockLines, documentBlockHasOverflow);
      this.DocumentBlocks.Add(documentBlock);
    }
  }  

  private bool ExtractLinesFromFilePartition(byte[] filePartition, List<string> resultDocumentBlockLines)
  {
    bool isLineFound = false;
    for (int bufferIndex = 0; bufferIndex < filePartition.Length; bufferIndex++)
    {
      // Scan forward to the next line feed (or to the end of the partition)
      isLineFound = false;
      int lineBeginIndex = bufferIndex;
      while (bufferIndex < filePartition.Length
        && !(isLineFound = filePartition[bufferIndex] == (byte)'\n'))
      {
        bufferIndex++;
      }

      int lineByteCount = bufferIndex - lineBeginIndex;
      if (lineByteCount == 0)
      {
        resultDocumentBlockLines.Add(string.Empty);
      }
      else
      {
        // Note: a multi-byte UTF-8 character that straddles a partition boundary is not handled here
        string lineContent = Encoding.UTF8.GetString(filePartition, lineBeginIndex, lineByteCount).TrimEnd('\r');
        resultDocumentBlockLines.Add(lineContent);
      }
    }      

    // False if the partition ended in the middle of a line
    return isLineFound;
  }

  protected virtual void OnPropertyChanged([CallerMemberName] string propertyName = "") 
    => this.PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));

  public event PropertyChangedEventHandler PropertyChanged;
  private const long PartitionSizeInBytes = 100000;
  private bool EndOfFileReached { get; set; }
  private ConcurrentBag<DocumentBlock> DocumentBlocks { get; }

  private ObservableCollection<string> fileContent;
  public ObservableCollection<string> FileContent
  {
    get => this.fileContent;
    set
    {
      this.fileContent = value;
      OnPropertyChanged();
    }
  }
}
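
The TODO comment in ReadFileAsync suggests making the method reentrant by cancelling the running operation and serializing calls with a SemaphoreSlim. One possible shape of that idea is sketched below, as a standalone console snippet rather than a change to the ViewModel. LoadAsync and SimulatedReadAsync are hypothetical names, the 200 ms delay stands in for the real file reading, and the sketch assumes that the actual read accepts a CancellationToken (the ReadFileAsync shown above does not). It is written as top-level statements (C# 9 or later).

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

var gate = new SemaphoreSlim(1, 1);            // lets only one load run at a time
var current = new CancellationTokenSource();   // token source of the most recent request
var completedLoads = new List<string>();

// Hypothetical stand-in for the actual file reading work.
Task SimulatedReadAsync(string filePath, CancellationToken token) => Task.Delay(200, token);

// Assumption: always called from the same (UI) thread, so swapping 'current' needs no lock.
async Task LoadAsync(string filePath)
{
  var mine = new CancellationTokenSource();
  var previous = current;
  current = mine;
  previous.Cancel();               // ask the load that is still running to stop

  await gate.WaitAsync();          // wait until that load has actually finished
  try
  {
    // A newer request may have arrived while this one was waiting for the gate.
    mine.Token.ThrowIfCancellationRequested();
    await SimulatedReadAsync(filePath, mine.Token);
    completedLoads.Add(filePath);
  }
  catch (OperationCanceledException)
  {
    // Superseded by a newer request: nothing to publish.
  }
  finally
  {
    gate.Release();
  }
}

Task first = LoadAsync("a.log");
Task second = LoadAsync("b.log");   // cancels the first load
await Task.WhenAll(first, second);
Console.WriteLine(string.Join(", ", completedLoads));   // prints "b.log"
```

In the view model, the same pattern would wrap the body of ReadFileAsync, with the token passed on to the producer and the consumers.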

To implement a very simple UI virtualization, this example uses a plain ListBox, where all mouse effects are removed from the ListBoxItem elements to get rid of the ListBox look and feel (an indeterminate progress indicator is highly recommended). You can enhance the example to allow multi-line text selection (e.g., to copy text to the clipboard).

MainWindow.xaml

<Window>
  <Window.DataContext>
    <ViewModel />
  </Window.DataContext>

  <ListBox ScrollViewer.VerticalScrollBarVisibility="Visible" 
           ItemsSource="{Binding FileContent}" 
           Height="400" >
    <ListBox.ItemContainerStyle>
      <Style TargetType="ListBoxItem">
        <Setter Property="Template">
          <Setter.Value>
            <ControlTemplate TargetType="ListBoxItem">
              <ContentPresenter />
            </ControlTemplate>
          </Setter.Value>
        </Setter>
      </Style>
    </ListBox.ItemContainerStyle>
  </ListBox>
</Window>

If you are more advanced, you can implement your own powerful document viewer, e.g., by extending VirtualizingPanel and using low-level text rendering. This allows you to increase performance if you are interested in text search and highlighting (in this context, stay far away from RichTextBox (or FlowDocument), as it is too slow).

At least you now have a well-performing text file reading algorithm that you can use to generate the data source for your UI implementation.

If this viewer is not your main product but a simple development tool to help you process log files, I don't recommend implementing your own log file viewer. There are plenty of free and paid applications out there.
