
When does continuously testing the production environment make sense?

Original post: 2015-07-03 12:41:14 8 1 testing/ continuous-integration/ continuous-testing

Let's say I have a bunch of unit tests, integration tests, and e2e tests that cover my app. Does it make sense to run them continuously against prod, e.g. every 10 minutes?

I'm thinking no, and here's why: my tests are already run after every prod deploy. If they passed and no code has changed since, they should continue to pass, so running them again doesn't make sense.

What I really want to test continuously is my infrastructure: is it still running? In that case, running an API integration test every 10 minutes to check that my API is still working makes sense. So I'm dealing with a subset of my test suites: the ones that test infrastructure availability (integration + e2e), as opposed to the ones that test single bits of code (unit tests). So in practice, would I have separate test suites for testing prod uptime, distinct from the suites used to test pre/post deploy?
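One way to picture the "separate suite" idea is to tag each check and let the scheduled prod job select only the uptime-tagged ones. The following is a minimal sketch, not a recommendation of any particular framework: `Check`, `select`, `run_suite`, the tag names, and the three placeholder checks are all hypothetical, and real checks would call the deployed API instead of returning `True`.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List


@dataclass(frozen=True)
class Check:
    name: str
    run: Callable[[], bool]   # returns True when the check passes
    tags: FrozenSet[str]      # hypothetical tags, e.g. {"unit"} or {"e2e", "uptime"}


def select(checks: List[Check], tag: str) -> List[Check]:
    """Return only the checks carrying the given tag, in their original order."""
    return [c for c in checks if tag in c.tags]


def run_suite(checks: List[Check]) -> Dict[str, bool]:
    """Run each check; an exception is recorded as a failure."""
    results: Dict[str, bool] = {}
    for check in checks:
        try:
            results[check.name] = bool(check.run())
        except Exception:
            results[check.name] = False
    return results


# Placeholder checks (assumptions for illustration only).
ALL_CHECKS = [
    Check("parse_config", lambda: True, frozenset({"unit"})),
    Check("api_health", lambda: True, frozenset({"integration", "uptime"})),
    Check("login_flow", lambda: True, frozenset({"e2e", "uptime"})),
]


def uptime_results() -> Dict[str, bool]:
    """What a job scheduled every 10 minutes against prod might run."""
    return run_suite(select(ALL_CHECKS, "uptime"))
```

The post-deploy job would run `run_suite(ALL_CHECKS)`, while the scheduler would call `uptime_results()`, so both draw from a single list of checks rather than two diverging suites.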

1 Answer

Such "redundant" verifications (they can include building as well, BTW, not only testing) offer additional data points that increase the monitoring precision for your actual production process.

Depending on the complexity of your production environment, even the simple "is it up/running?" question might not have a simple answer, and subset/shortcut versions of the verifications might not cut it: you'd only cover those versions, not the actual production ones.

For example, just because a build server is up doesn't mean it's also capable of building the product successfully. You'd need to check every aspect of the build itself: availability of every tool, storage, dependencies, OS resources, etc. For complex builds it's probably simpler to just perform the build itself than to maintain code that reliably checks whether the build would be feasible ;)

There are 2 production process attributes that would benefit from more precise monitoring (and for which subset/shortcut verifications won't be suitable either):

reliability/stability - the types, occurrence rates and root causes of intermittent failures (yes, those nasty surprises that could make the difference between meeting the release date or not)
performance - the avg/min/max durations of various verifications; especially important if verifications are expensive in terms of duration/resources involved; trending could be desired for planning, budgeting, production ETAs, etc.
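To make the two attributes concrete, here is a rough sketch of how the outcomes of repeated verification runs could be reduced to a failure rate and avg/min/max durations. The `Run` record and `summarize` function are hypothetical names introduced for illustration; in practice the data would come from whatever your CI system records.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass(frozen=True)
class Run:
    passed: bool      # outcome of one verification run
    seconds: float    # wall-clock duration of that run


def summarize(runs: List[Run]) -> Dict[str, float]:
    """Reduce a history of runs to reliability and performance figures."""
    if not runs:
        raise ValueError("no runs recorded")
    durations = [r.seconds for r in runs]
    failures = sum(1 for r in runs if not r.passed)
    return {
        "runs": len(runs),
        "failure_rate": failures / len(runs),   # reliability/stability
        "avg_seconds": mean(durations),         # performance
        "min_seconds": min(durations),
        "max_seconds": max(durations),
    }
```

Since no code changes between runs, failures counted here point at intermittent or environmental causes, which is the extra information the repeated runs are meant to provide. Comparing summaries over successive time windows would give the trend.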

Dunno if any of these are applicable to, or have acceptable cost/benefit ratios for, your context, but they are definitely important for most very large/complex software projects.