
C# application profiling gives different results

I'm new to profiling. I'm trying to profile a C# application that connects to an SQLite database and retrieves data. The database contains 146856400 rows, and the select query returns 428800 rows.

On the first execution the main thread takes 246686 ms.

On the second execution of the same code the main thread takes only 4296 ms.

After restarting the system:

On the first execution the main thread takes 244533 ms.

On the second execution of the same code the main thread takes only 4053 ms.

Questions:

1) Why is there such a big difference between the first and the second execution timings?

2) After restarting the system, why am I not getting the same results?

Please help.

You are seeing the difference between cold and warm execution of your query. Cold means the first invocation; warm means all subsequent invocations of your db query. The first time, everything is "cold":

  • The OS file system cache is empty.
  • The SQLite cache is empty.
  • ORM dynamic query compilation has not been done and cached yet.
  • The ORM mapper cache is empty.
  • The garbage collector still needs to tune your working set.
  • ...

When you execute your query a second time, all of these first-time initializations (caching) are already done. You are then measuring the effects of the different cache levels, as long as there is enough memory available to cache a substantial amount of the requested data.
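To see the two regimes side by side, it helps to time the same query several times in one process and report each run separately instead of averaging. The following is a minimal sketch, written in Python with the standard `sqlite3` module rather than C#; the `samples` table, its columns, and the helper names are assumptions made up for illustration and are not taken from the question.

```python
import os
import sqlite3
import tempfile
import time


def build_demo_db(path, rows=50_000):
    """Create a small stand-in table; the real database is far larger."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE samples (id INTEGER PRIMARY KEY, grp INTEGER, value REAL)"
    )
    con.executemany(
        "INSERT INTO samples (grp, value) VALUES (?, ?)",
        ((i % 100, i * 0.5) for i in range(rows)),
    )
    con.commit()
    con.close()


def timed_query(con, sql, params=()):
    """Run a query, fetch every row, and return (row_count, elapsed_ms)."""
    start = time.perf_counter()
    rows = con.execute(sql, params).fetchall()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return len(rows), elapsed_ms


def measure_runs(path, sql, params=(), runs=3):
    """Time the same query several times on one connection.

    Run 1 is the "cold" run for this process; later runs are "warm".
    """
    con = sqlite3.connect(path)
    try:
        return [timed_query(con, sql, params) for _ in range(runs)]
    finally:
        con.close()


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    build_demo_db(db_path)
    query = "SELECT * FROM samples WHERE grp = ?"
    for n, (count, ms) in enumerate(measure_runs(db_path, query, (7,)), 1):
        print(f"run {n}: {count} rows in {ms:.2f} ms")
```

Note that the demo file has just been written, so it is probably already in the OS file cache and the first run here is only cold with respect to this connection. A gap as large as the one in the question should only be expected when the file has to be read from disk, for example on the first run after a reboot.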

A performance difference between 4 minutes and 4 s is impressive. Both numbers are valid. Measuring something is easy. Telling someone else what exactly you have measured, and how the performance can be improved by changing this or that, is much harder.

The performance game often goes like this:

Customer: It is slow 
Dev:      I cannot repro your issue.
Customer: Here is my scenario .... 
Dev:      I still cannot repro it. Can you give me the data set you use and the exact steps you performed?
Customer: Sure. Here is the data and the test steps.
Dev:      Ahh I see. I can make it 10 times faster.
Customer: That is great. Can I have the fix?
Dev:      Sure here it is.
Customer: **Very Angry** It has become faster yes. But I cannot read my old data!
Dev:      Oops. We need to migrate all your old data to the new, much more efficient format. 
          We need to develop a conversion tool, which will take 3 weeks, and your site will 
          have 3 days of downtime while the conversion tool is running. 
          Or 
          We keep the old inefficient data format. But then we can make it only 9 times faster.
Customer: I want to access my data faster without data conversion!
Dev:      Here is the fix which is 10% slower with no schema changes. 
Customer: Finally. The fix does not break anything but it has not become faster?
Dev:      I have measured your use case. It is only slow the first time. 
          All later data retrievals are 9 times faster than before. 
Customer: Did I mention that in my use case I always read different data?
Dev:      No you did not. 
Customer: Fix it!
Dev:      That is not really possible without a major rewrite of large portions of our software.
Customer: The data I want to access is stored in a list. I want to process it sequentially.
Dev:      In that case we can preload the data in the background while you are working on the current data set. You will only experience a delay for the first data set on each working day.
Customer: Can I have the fix?
Dev:      Sure here it is.
Customer: Perfect. It works!
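
The background preload that ends the dialogue can be sketched roughly as follows. This is an illustration in Python, not the fix from the story: `process_with_prefetch`, `load` and `process` are hypothetical names, and the caller is assumed to supply both callables.

```python
from concurrent.futures import ThreadPoolExecutor


def process_with_prefetch(keys, load, process):
    """Process data sets in order while the next one loads in the background.

    `load(key)` and `process(data)` are hypothetical callables supplied
    by the caller; `load` runs on a single worker thread.
    """
    results = []
    keys = list(keys)
    if not keys:
        return results
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Only this first load has to be waited for in full.
        pending = pool.submit(load, keys[0])
        for next_key in keys[1:] + [None]:
            # Blocks only if the preload has not finished yet.
            data = pending.result()
            if next_key is not None:
                # Start loading the next data set right away.
                pending = pool.submit(load, next_key)
            # Work on the current data set while the next one loads.
            results.append(process(data))
    return results
```

If `load` reads from SQLite, it should open its own connection inside the worker thread, because connections from Python's `sqlite3` module are bound to the thread that created them by default.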

Performance is hard to grasp because most of the time you deal with perceived performance, which is subjective. Bringing it down to quantitative measurements is a good start, but you need to tune your metrics to reflect actual customer use cases, or you will likely optimize in the wrong places, as in the dialogue above. A complete understanding of customer requirements and use cases is a must. On the other hand, you need to understand your complete system (profile it like hell) to be able to tell the difference between cold and warm query execution and to see where you can tune the whole thing. These caches become useless if you query for different data all of the time (not likely). Perhaps you need a different index to speed up queries, or you buy an SSD, or you keep all of the data in memory and run all subsequent queries in memory...
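
Whether a different index would help can be checked before changing anything else by asking SQLite for its query plan. The sketch below uses Python's `sqlite3` module and an in-memory database; the `samples` table and the `idx_samples_grp` index are made-up stand-ins, since the question does not show the real schema or query.

```python
import sqlite3


def query_plan(con, sql, params=()):
    """Return the 'detail' column of EXPLAIN QUERY PLAN as a list of strings."""
    return [row[3] for row in con.execute("EXPLAIN QUERY PLAN " + sql, params)]


# Hypothetical stand-in schema and data.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, grp INTEGER, value REAL)")
con.executemany(
    "INSERT INTO samples (grp, value) VALUES (?, ?)",
    ((i % 100, i * 0.5) for i in range(10_000)),
)

sql = "SELECT value FROM samples WHERE grp = ?"

# Without an index on grp, SQLite has to scan the whole table.
plan_before = query_plan(con, sql, (7,))

# With the index, the plan should switch to an index search.
con.execute("CREATE INDEX idx_samples_grp ON samples (grp)")
plan_after = query_plan(con, sql, (7,))

print(plan_before)
print(plan_after)
```

The exact wording of the plan text varies between SQLite versions, but a plan that mentions a scan of the table before and the index name after is a reasonable sign that the index is being used. An index reduces how many pages a cold query has to read from disk; it does not remove the cold/warm difference itself.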


 