Identifying repeated code within a PHP project
I have a single PHP file within a legacy project that is at least a few thousand lines long. It is predominantly split into a number of conditional blocks by a switch statement with about 10 cases. Within each case there is what appears to be a very similar - if not exact duplicate - block of code. What methods are available for identifying these blocks of code as being the same - or close to the same - so I can abstract that code out and begin to refactor the entire file? I know this can be done manually (separate each case statement into its own file and diff them), but I'm interested in what tools I could use to speed this process up.
Thanks.
You can use PHPUnit's PMD (Project Mess Detector) to detect duplicated blocks of code.
It can also compute the cyclomatic complexity of your code.
[Screenshot: the PMD tab in phpuc]
See our PHP Clone Detector tool.
This finds both exact copies and near misses, in spite of reformatting, insertion/deletion of comments, replacement of variable names, addition/replacement of subblocks, etc.
As far as I can tell, PHPCPD finds only (token) sequences which are exactly the same. That misses a lot of clones, since the most common operation after copy-paste is edit-to-customize. So it would miss the very clones the OP is trying to find.
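To make the exact-match versus near-miss distinction concrete, here is a rough Python sketch. It is not how PHPCPD or the Clone Detector actually work; `normalize` and `similarity` are hypothetical helpers made up for illustration, and the regex-based comment stripping is naive (it would mangle a string literal containing `//`). The idea is only that once comments are dropped and every variable is renamed to a placeholder, two copy-paste-edited snippets can compare as identical even though their raw text differs.

```python
import re
from difflib import SequenceMatcher


def normalize(php_source):
    """Hypothetical helper: drop comments, rename every variable to $V,
    and split what is left into rough tokens (not a real PHP lexer)."""
    no_block = re.sub(r"/\*.*?\*/", " ", php_source, flags=re.S)
    no_line = re.sub(r"(//|#)[^\n]*", " ", no_block)
    renamed = re.sub(r"\$[A-Za-z_]\w*", "$V", no_line)
    return re.findall(r"\$V|\w+|[^\s\w]", renamed)


def similarity(a, b):
    """Ratio in [0, 1] between the normalized token lists of two snippets."""
    return SequenceMatcher(None, normalize(a), normalize(b), autojunk=False).ratio()


# Made-up snippets: the second is the first after a typical "edit-to-customize".
original = "$total = $price * $qty; // line total\necho $total;"
edited = "$sum   =   $cost * $count;\necho $sum;"

print("raw text equal:        ", original == edited)
print("normalized tokens equal:", normalize(original) == normalize(edited))
print("similarity:            ", similarity(original, edited))
```

Pairs that score high but below 1.0 are the interesting ones: those are likely clones that have drifted apart after being pasted.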
You could put the blocks in separate files and just run diff on them?
However, I think in the end you will need to go through everything manually anyway, since it sounds like this code requires a lot of refactoring, and even if there are differences you will probably need to evaluate whether this is intentional or a bug.
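If you do go the diff route, comparing every pair of cases by hand gets tedious with about 10 of them (45 pairs). A minimal sketch of automating that with Python's standard `difflib` is below. The `blocks` dictionary is made-up sample data standing in for the case bodies you have split out, and the 0.6 threshold is an arbitrary assumption to tune. It only ranks and prints differences; deciding whether each difference is intentional is still up to you.

```python
from difflib import SequenceMatcher, unified_diff
from itertools import combinations

# Hypothetical sample data: in practice each value would be the body of one
# case, read from the files you split out by hand.
blocks = {
    "case_a": "$row = fetch($id);\n$row['type'] = 'a';\nsave($row);\n",
    "case_b": "$row = fetch($id);\n$row['type'] = 'b';\nsave($row);\n",
    "case_c": "log_visit($id);\nredirect('/home');\n",
}


def compare_blocks(blocks, threshold=0.6):
    """Return (name1, name2, ratio, diff_text) for each pair of blocks whose
    line-level similarity ratio is at or above the threshold."""
    results = []
    for (n1, t1), (n2, t2) in combinations(blocks.items(), 2):
        l1 = t1.splitlines(keepends=True)
        l2 = t2.splitlines(keepends=True)
        ratio = SequenceMatcher(None, l1, l2, autojunk=False).ratio()
        if ratio >= threshold:
            diff = "".join(unified_diff(l1, l2, fromfile=n1, tofile=n2))
            results.append((n1, n2, ratio, diff))
    return results


for n1, n2, ratio, diff in compare_blocks(blocks):
    print(f"{n1} vs {n2}: {ratio:.2f}")
    print(diff)
```

With the sample data only `case_a` and `case_b` are reported, and the diff shows the single line that differs between them, which is the part you would turn into a parameter when extracting a shared function.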