
C# asynchronous LCD write

So I'm working on a project that involves an LCD screen that can update 60 times per second. It uses a BitmapFrame and I need to copy those pixels to a library that updates the screen. Currently I'm getting about 30-35 FPS, which is too low. So I'm trying to use multi-threading, but this creates a lot of problems.

The DisplayController already creates a thread to do all the work on, like so:

public void Start()
{
    _looper = new Thread(Loop);
    _looper.IsBackground = true;
    _looper.Start();
}

private void Loop()
{
    while (_IsRunning)
    {
        renderScreen();
    }
}

This calls the renderScreen method, which draws all the elements and copies the pixels to the BitmapFrame. But this process takes too long, so my FPS drops. My attempt to solve this problem was to create a Task that draws, copies and writes the pixels. But this solution uses a lot of CPU and causes glitches on the screen.

public void renderScreen()
{
    Task.Run(() =>
    {
        Monitor.Enter(_object);

        // Push screen to LCD
        BitmapFrame bf = BitmapFrame.Create(screen);
        RenderOptions.SetBitmapScalingMode(bf, BitmapScalingMode.LowQuality);
        bf.CopyPixels(new Int32Rect(0, 0, width, height), pixels, width * 4, 0);

        DisplayWrapper.USBD480_DrawFullScreenBGRA32(ref disp, pixels);

        Monitor.Exit(_object);
    });
}

I've been reading a lot about concurrent queues for C#, but that's not what I need. And when I use two threads, I get an error saying that the object is owned by another thread.

How can I concurrently render a new bitmap and write that bitmap 60 times per second to the LCD?

I assume that USBD480_DrawFullScreenBGRA32 is what actually writes to the LCD, and the rest of the code just prepares the image. I think your key to better performance is preparing the next image while the previous image is being written.

I think your best solution is to use two threads and a ConcurrentQueue as a buffer for what needs to be written. One thread prepares the images and puts them into the ConcurrentQueue, and the other thread pulls them off the queue and writes them to the LCD. This way you don't have the overhead of calling Task.Run each time around.

It might also be wise to limit how many frames are written to the queue, so it doesn't get too far ahead and take up unnecessary memory.
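
A minimal sketch of that idea, assuming the screen, width, height, disp and _IsRunning members from the question (ProduceLoop and ConsumeLoop are hypothetical names). A BlockingCollection wraps a ConcurrentQueue and lets you cap the number of buffered frames:

// Requires System.Collections.Concurrent.
// Bounded to 2 frames so the producer can never run far ahead of the LCD.
private readonly BlockingCollection<int[]> _frames =
    new BlockingCollection<int[]>(new ConcurrentQueue<int[]>(), boundedCapacity: 2);

private void ProduceLoop() // runs on its own background thread
{
    while (_IsRunning)
    {
        var buffer = new int[width * height]; // one int per BGRA32 pixel
        BitmapFrame bf = BitmapFrame.Create(screen);
        RenderOptions.SetBitmapScalingMode(bf, BitmapScalingMode.LowQuality);
        bf.CopyPixels(new Int32Rect(0, 0, width, height), buffer, width * 4, 0);

        // TryAdd with a zero timeout drops the frame instead of blocking when the queue is full.
        _frames.TryAdd(buffer, 0);
    }
    _frames.CompleteAdding();
}

private void ConsumeLoop() // runs on its own background thread
{
    // GetConsumingEnumerable blocks until a frame is available and ends after CompleteAdding.
    foreach (var buffer in _frames.GetConsumingEnumerable())
    {
        DisplayWrapper.USBD480_DrawFullScreenBGRA32(ref disp, buffer);
    }
}

You would start ProduceLoop and ConsumeLoop on two dedicated background threads, much like the existing Loop thread in the question.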

I think you should have two threads (and two only):

  1. one that continuously creates the bitmap; and
  2. one that continuously takes the most recent bitmap and pushes it to LCD.

Here's my naive implementation.

I used a shared array that contains the latest produced image because it keeps the number of allocations low. With a shared array we can get away with three array objects in total (the shared buffer plus one local buffer per thread).

public class Program
{
    public class A
    {
        private readonly object pixelsLock = new object();

        Array shared = ...;

        public void Method2()
        {
            Array myPixels = (...);
            while (true)
            {
                // Prepare image
                BitmapFrame bf = BitmapFrame.Create(screen);
                RenderOptions.SetBitmapScalingMode(bf, BitmapScalingMode.LowQuality);
                bf.CopyPixels(new Int32Rect(0, 0, width, height), myPixels, width * 4, 0);

                lock (pixelsLock)
                {
                    // Copy the hard work to shared storage
                    Array.Copy(sourceArray: myPixels, destinationArray: shared, length: myPixels.Length);
                }
            }
        }

        public void Method1()
        {
            Array myPixels = (...);
            while (true)
            {
                lock (pixelsLock)
                {
                    // Make a local copy
                    Array.Copy(sourceArray: shared, destinationArray: myPixels, length: myPixels.Length);
                }
                DisplayWrapper.USBD480_DrawFullScreenBGRA32(ref disp, myPixels);
            }
        }
    }


    public static void Main(string[] args)
    {
        var a = new A();
        new Thread(new ThreadStart(a.Method1)).Start();
        new Thread(new ThreadStart(a.Method2)).Start();
        Console.ReadLine();
    }
}

You could consider using the robust, performant, and highly configurable TPL Dataflow library, which allows you to construct a pipeline of data. You will be posting raw data into the first block of the pipeline, and the data will be transformed while flowing from one block to the next, before finally being rendered at the last block. All blocks will be working in parallel. In the example below there are three blocks, all configured with the default MaxDegreeOfParallelism = 1, so at most 3 threads will be concurrently busy doing work. I have configured the blocks with an intentionally small BoundedCapacity, so that if the incoming raw data is more than what the pipeline can process, the excess input will be dropped.

var block1 = new TransformBlock<Stream, BitmapFrame>(stream =>
{
    BitmapFrame bf = BitmapFrame.Create(stream);
    RenderOptions.SetBitmapScalingMode(bf, BitmapScalingMode.LowQuality);
    return bf;
}, new ExecutionDataflowBlockOptions()
{
    BoundedCapacity = 5
});

var block2 = new TransformBlock<BitmapFrame, int[]>(bf =>
{
    var pixels = new int[width * height * 4];
    bf.CopyPixels(new Int32Rect(0, 0, width, height), pixels, width * 4, 0);
    return pixels;
}, new ExecutionDataflowBlockOptions()
{
    BoundedCapacity = 5
});

var block3 = new ActionBlock<int[]>(pixels =>
{
    DisplayWrapper.USBD480_DrawFullScreenBGRA32(ref disp, pixels);
}, new ExecutionDataflowBlockOptions()
{
    BoundedCapacity = 5
});

The pipeline is created by linking the blocks together:

block1.LinkTo(block2, new DataflowLinkOptions() { PropagateCompletion = true });
block2.LinkTo(block3, new DataflowLinkOptions() { PropagateCompletion = true });

And finally the loop takes the form below:

void Loop()
{
    while (_IsRunning)
    {
        block1.Post(GetRawStreamData());
    }
    block1.Complete();
    block3.Completion.Wait(); // Optional, to wait for the last data to be processed
}

In this example two types of blocks are used: two TransformBlocks and one ActionBlock at the end. ActionBlocks do not produce any output, so they are frequently found at the end of TPL Dataflow pipelines.

An alternative to TPL Dataflow is the recently introduced Channels library (System.Threading.Channels), a small library that is easy to learn. It includes the interesting option BoundedChannelFullMode, for selecting which items are dropped when the queue is full:

DropNewest: Removes and ignores the newest item in the channel in order to make room for the item being written.
DropOldest: Removes and ignores the oldest item in the channel in order to make room for the item being written.
DropWrite: Drops the item being written.
Wait: Waits for space to be available in order to complete the write operation.

In contrast, TPL Dataflow has only two options. It can either drop the item being written by using the demonstrated block1.Post(...), or wait for space to become available by using the alternative block1.SendAsync(...).Wait().

Channels are not a complete replacement for TPL Dataflow though, since they deal only with the queuing of the work items, and not with their actual processing.
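
Here is a rough sketch of how the Channels approach might look, assuming a recent System.Threading.Channels package and the same disp, width, height, screen and _IsRunning members from the question. The channel is bounded and uses DropOldest, so the renderer never waits on a slow LCD:

var channel = Channel.CreateBounded<int[]>(new BoundedChannelOptions(5)
{
    FullMode = BoundedChannelFullMode.DropOldest, // discard stale frames instead of blocking
    SingleReader = true,
    SingleWriter = true
});

// Producer: render frames and write them to the channel.
var producer = Task.Run(async () =>
{
    while (_IsRunning)
    {
        var pixels = new int[width * height]; // one int per BGRA32 pixel
        BitmapFrame bf = BitmapFrame.Create(screen);
        RenderOptions.SetBitmapScalingMode(bf, BitmapScalingMode.LowQuality);
        bf.CopyPixels(new Int32Rect(0, 0, width, height), pixels, width * 4, 0);
        await channel.Writer.WriteAsync(pixels);
    }
    channel.Writer.Complete();
});

// Consumer: push each frame to the LCD as soon as it becomes available.
var consumer = Task.Run(async () =>
{
    while (await channel.Reader.WaitToReadAsync())
    {
        while (channel.Reader.TryRead(out var pixels))
        {
            DisplayWrapper.USBD480_DrawFullScreenBGRA32(ref disp, pixels);
        }
    }
});

With DropOldest the WriteAsync call effectively never blocks, so the rendering loop keeps running at full speed while the consumer always receives the most recent frames still in the queue.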
