Best way to improve performance in WPF application

I'm currently working on a WPF application that was built using Entity Framework (database-first) to access data in a SQL Server database.

In the past, the database was on an internal server, and I did not notice any performance problems with the application even though the database is very badly implemented (only tables; no views, indexes, or stored procedures). I'm the one who created it, but it was my first job and I was not very good with databases, so I felt like Entity Framework was the best approach to let me focus mainly on code.

However, the database is now on another server which is much slower. As you can guess, the application now has big performance issues (more than 10 seconds to load a dozen rows, the same to insert new rows, ...).

  • Should I stay with Entity Framework but try to improve performance by altering the database, adding views and stored procedures?
  • Should I get rid of Entity Framework and use only "basic" code (and improve the database at the same time)?
  • Is there a simple ORM I could use instead of EF?

Time is not an issue here; I can take all the time I want to improve the application, but I can't seem to decide on the best way to make my application evolve.

The database is quite simple (around 10 tables); the only thing that could complicate things is that I store files in it, so I'm not sure I can really use whatever I want. And I don't know if it's important, but I need to display quite a few calculated fields. Any advice?

Feel free to ask any relevant questions.

First you need to run a performance profiler and find out what the bottleneck is. It could be the database, the Entity Framework configuration, the Entity Framework queries, and so on.

In my experience, Entity Framework is a good option for this kind of application, but you need to understand how it works.

Also, which version of Entity Framework are you using? The latest version is 6.2, and it has some performance improvements that older versions do not have, so if you are using an old one I suggest updating.

Based on the comments I am going to hazard a guess that it is mostly a bandwidth issue.

You had an application that was working fine when it was co-located with its database: perhaps a single switch, gigabit Ethernet, and 200 m of cabling.

Now that application is trying to send or retrieve data to/from a remote server, probably over the public internet through an unknown number of internal proxies in contention with who knows what other traffic, and it doesn't perform as well.

You also mention that you store files in the database, and your schema has fields like Attachment.data and Doc.file_content. This suggests that you could be trying to transmit large quantities (perhaps megabytes) of data for a simple query, and that is where you are falling down.

Some general pointers:

  • Add indexes on any columns you join on or commonly query against.
  • Be aware of the difference between lazy and eager loading in Entity Framework. There is no right or wrong answer, but you should know which approach you are using and why (see the sketch after this list).
  • Split any file content into its own table, with the same primary key as the main table, or play with different EF classes to make sure you only retrieve files when you need to use them.
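
As a minimal sketch of the lazy vs. eager difference (the Order/Customer entities here are hypothetical, assuming EF6 with virtual navigation properties):

// Lazy loading: Customer is a virtual navigation property, so each
// access in the loop below can trigger one extra query per order
// (the classic N+1 problem).
var orders = context.Orders.Where(o => o.Total > 100).ToList();
foreach (var order in orders)
    Console.WriteLine(order.Customer.Name);

// Eager loading: requires "using System.Data.Entity;" for .Include().
// One joined query loads the orders and their customers up front.
var ordersWithCustomers = context.Orders
    .Include(o => o.Customer)
    .Where(o => o.Total > 100)
    .ToList();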

For performance profiling, the first place I recommend looking is a SQL profiler. This can capture the exact SQL statements that EF is running and help identify possible performance culprits. I cover a few of these here. The schema issues are probably the most relevant place to start. The title targets MVC, but most of the items apply to WPF and any other application.

A good, simple profiler that I use for SQL Server is ExpressProfiler (https://github.com/OleksiiKovalov/expressprofiler).

With the move to a new server, where data is now sent over the wire rather than pulled from a local database, the performance issues you're noticing will most likely fall under the category of "loading too much, too often". Now you won't only be waiting for the database to load the data, but also for it to be packaged up and sent over the wire. Also, does the new database hold the same data volume, and does it serve only a single client or now multiple clients? Another catch for developers is "works on my machine", where local test databases are smaller and don't deal with concurrent queries from other clients (where locks and such can impact performance).

From here, run a copy of the application against an isolated database server (no other clients hitting it, to reduce "noise") with the profiler running. The things to look out for:

Lazy Loading - This is where you have queries to load data, but then see lots (dozens to hundreds) of additional queries being spun off. Your code may say "run this query and populate this data", which you expect to be one SQL query, but touching lazy-loaded properties can spin off a great many other queries. The solution to lazy loading: if you need the extra data, eager load it with .Include(). If you only need some of the data, look into using .Select() to project view models / DTOs of just the data you need rather than relying on complete entities. This will eliminate lazy-load scenarios, but may require some significant changes to your code to work with view models/DTOs. Tools like AutoMapper can help greatly here. Read up on .ProjectTo() to see how AutoMapper can work with IQueryable to eliminate lazy-load hits.
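
As a rough sketch of that projection approach with AutoMapper (Order and OrderDto are assumed names, and this uses the AutoMapper.QueryableExtensions namespace):

// Configure the mapping once at startup.
var config = new MapperConfiguration(cfg => cfg.CreateMap<Order, OrderDto>());

// ProjectTo<T>() builds the SELECT clause from the mapping, so only the
// columns OrderDto needs are queried, and no lazy loads are triggered.
var dtos = context.Orders
    .Where(o => o.CustomerId == customerId)
    .ProjectTo<OrderDto>(config)
    .ToList();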

Reading too much - Loading entities can be expensive, especially if you don't need all of that data. A common performance culprit is excessive use of .ToList(), which will materialize entire entity sets where only a subset of the data, a simple exists check, or a count would suffice. For example, I've seen code that does stuff like this:

var data = context.MyObjects.SingleOrDefault(x => x.IsActive && x.Id == someId);
return (data != null);

This should be:

var isData = context.MyObjects.Where(x => x.IsActive && x.Id == someId).Any();
return isData;

The difference between the two is that in the first example EF will effectively do a SELECT * operation, so where data is present it returns all columns into an entity, only to then check whether the entity exists. The second statement runs a faster query that simply returns whether a row exists or not. Materializing entities too early is a related culprit, as in this example:

var myDtos = context.MyObjects.Where(x => x.IsActive && x.ParentId == parentId)
  .ToList()
  .Select(x => new ObjectDto
  {
    Id = x.Id,
    Name = x.FirstName + " " + x.LastName,
    Balance = calculateBalance(x.OrderItems.ToList()),
    Children = x.Children.ToList()
      .Select(c => new ChildDto
      {
        Id = c.Id,
        Name = c.Name
      }).ToList()
  }).ToList();

Statements like this can go on and get rather complex, but the real problem is the .ToList() before the .Select(). Often these creep in because devs try to do something that EF doesn't understand, like calling a method (i.e. calculateBalance()), and it "works" once they first call .ToList(). The problem is that you materialize the entire entity at that point and switch to LINQ to Objects. This means that any "touches" on related data, such as .Children, will now trigger lazy loads, and further .ToList() calls pull yet more data into memory that could otherwise have been trimmed down in the query. The culprit to look out for is .ToList() calls: try removing them, select simpler values before calling .ToList(), and then feed that data into view models where the view models can calculate the resulting data.
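
As a sketch, the query above could be reworked so the projection happens server-side first (the Price and Quantity fields used for the balance are assumptions for illustration):

var myDtos = context.MyObjects
  .Where(x => x.IsActive && x.ParentId == parentId)
  .Select(x => new
  {
    x.Id,
    x.FirstName,
    x.LastName,
    // project only the raw values the balance calculation needs
    ItemAmounts = x.OrderItems.Select(i => i.Price * i.Quantity),
    Children = x.Children.Select(c => new { c.Id, c.Name })
  })
  .ToList() // single round trip; everything below is LINQ to Objects
  .Select(x => new ObjectDto
  {
    Id = x.Id,
    Name = x.FirstName + " " + x.LastName,
    Balance = x.ItemAmounts.Sum(),
    Children = x.Children
      .Select(c => new ChildDto { Id = c.Id, Name = c.Name })
      .ToList()
  }).ToList();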

The worst culprit like this that I've seen was due to a developer wanting to use a function in a Where clause:

var data = context.MyObjects.ToList().Where(x => calculateBalance(x) > 0).ToList();

That first ToList() call will attempt to saturate the whole table into entities in memory. Beyond the time/memory/bandwidth needed to load all of this data, a big performance impact is simply the number of locks the database must take to reliably read/write data. The fewer rows you "touch", and the shorter the time you touch them, the better your queries will play with concurrent operations from multiple clients. These problems magnify greatly as systems transition to being used by more users.
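
If the balance can be expressed in terms EF understands, the filter can stay in SQL; a sketch, assuming the balance derives from Price and Quantity on OrderItems:

// The condition now translates to SQL, so filtering happens on the
// server and only matching rows come back. The (decimal?) cast avoids
// the materialization error that SQL's NULL sum over an empty set
// would otherwise cause.
var data = context.MyObjects
  .Where(x => x.OrderItems.Sum(i => (decimal?)(i.Price * i.Quantity)) > 0)
  .ToList();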

Provided you've eliminated the extra lazy loads and unnecessary queries, the next thing to look at is query performance. For operations that seem slow, copy the SQL statement out of the profiler and run it in the database while reviewing the execution plan. This can provide hints about indexes you could add to speed up queries. Again, using .Select() can greatly increase query performance by using indexes more efficiently and reducing the amount of data the server needs to pull back.

For file storage: are these stored as columns in the relevant table, or in a separate table linked to the relevant record? What I mean by this is: if you have an Invoice record and also keep a copy of the invoice file saved in the database, is it:

Invoices

  • InvoiceId
  • InvoiceNumber
  • ...
  • InvoiceFileData

or

Invoices

  • InvoiceId
  • InvoiceNumber
  • ...

InvoiceFile

  • InvoiceId
  • InvoiceFileData

It is a better structure to keep large, seldom-used data in separate tables rather than combined with commonly used data. This keeps the queries that load entities small and fast, while that expensive data can be pulled on demand when needed.
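
In EF this split can be modeled as a one-to-one relationship sharing the primary key; a code-first flavored sketch (names are hypothetical, and a database-first EDMX can be shaped the same way):

public class Invoice
{
  public int InvoiceId { get; set; }
  public string InvoiceNumber { get; set; }
  public virtual InvoiceFile File { get; set; } // only loaded when touched
}

public class InvoiceFile
{
  public int InvoiceId { get; set; } // PK, shared with Invoice
  public byte[] InvoiceFileData { get; set; }
}

// Inside the DbContext (EF6 fluent API):
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
  modelBuilder.Entity<InvoiceFile>().HasKey(f => f.InvoiceId);
  modelBuilder.Entity<Invoice>()
    .HasOptional(i => i.File) // an invoice may have no stored file
    .WithRequired();          // a file row always belongs to an invoice
}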

If you are using GUIDs for keys (as opposed to ints/longs), are you leveraging newsequentialid()? (assuming SQL Server) Keys set to use newid(), or Guid.NewGuid() in code, will lead to index fragmentation and poor performance. If you populate the IDs via database defaults, switch them over to use newsequentialid() to help reduce the fragmentation. If you populate IDs via code, have a look at writing a GUID generator that mimics newsequentialid() (SQL Server) or a pattern suited to your database. SQL Server and Oracle store/index GUID values differently, so having the "static-like" part of the UUID bytes in the higher-order vs. lower-order bytes of the data will aid indexing performance. Also consider index maintenance and other database maintenance jobs to help keep the database server running efficiently.
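
A minimal sketch of such a generator for SQL Server, following the well-known "COMB" GUID pattern (SQL Server compares the last 6 bytes of a uniqueidentifier first, so putting a timestamp there keeps new keys roughly sequential):

public static class SequentialGuid
{
  public static Guid NewSequentialGuid()
  {
    byte[] guidBytes = Guid.NewGuid().ToByteArray();

    // Milliseconds since the Unix epoch, reordered to big-endian so the
    // most significant bytes sort first.
    long ms = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();
    byte[] msBytes = BitConverter.GetBytes(ms);
    if (BitConverter.IsLittleEndian)
      Array.Reverse(msBytes);

    // Overwrite the last 6 GUID bytes (the ones SQL Server sorts on
    // first) with the low 6 bytes of the timestamp.
    Array.Copy(msBytes, 2, guidBytes, 10, 6);

    return new Guid(guidBytes);
  }
}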

When it comes to index tuning, the database server's reports are your friends. After you've eliminated most, or at least the most serious, performance offenders from your code, the next step is to look at real-world use of your system. The best way to learn where to target your code/index investigations is from the most-used and problem queries that the database server identifies. Where these are EF queries, you can usually reverse-engineer which EF query is responsible based on the tables being hit. Grab these queries and feed them through the execution plan to see if there is an index that might help. Indexing is something developers either forget about or get prematurely concerned with: too many indexes can be just as bad as too few. I find it's best to monitor real-world usage before deciding on which indexes to add.

This should hopefully give you a start on things to look for and kick the speed of that system up a notch. :)
