
How to grep to make the list of all PHP classes from all the files?

I'm trying to grep all the files containing class names, for example:

$ grep -ER "(^| )class [^ ]*" .

I thought (^| ) would match either the start of a line or a space, but it also matches every occurrence of the word class inside comments, such as:

./includes/bootstrap.inc: *   The name of the class to check or load.

And the following example:

grep -ER "(^|\?abstract)class [^ ]*" .

actually doesn't include any abstract class files.

Basically, I'm trying to write a regex that matches:

  1. The start of a line or a space.
  2. The optional word abstract.
  3. The keyword class.
  4. A space after it.
  5. The actual class name (a word).

So the regex would match the following lines:

  class Entity {
class Entity {
class NoFieldsException extends Exception {}
abstract class CacheArray implements ArrayAccess {

But not these:

 * This class should be extended by systems that need to cache large amounts

Setting up the testing environment:

$ curl -o- http://ftp.drupal.org/files/projects/drupal-7.31.tar.gz | tar zxf - && cd drupal-7*
$ grep -ER "(^| )class [^ ]*" . | less
$ grep -ER "(^|\?abstract)class [^ ]*" . | wc -l
653 # But this doesn't include ./includes/bootstrap.inc:abstract class DrupalCacheArray

Example use cases:

  • When the Drupal CMS class registry is broken, the table needs to be rebuilt, so all available files containing classes need to be imported into the SQL table.

An alternative to my other answer is to let PHP handle it, without actually executing the code, by using the tokenizer. This way you avoid the negative side effects of an incorrect regex, since PHP does all the parsing.

test1.php:

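<?php

$source = file_get_contents('test2.php');
$tokens = token_get_all($source);

foreach ($tokens as $token) {
    // Named tokens are arrays of [id, text, line]; single characters such
    // as "{" are plain strings and are skipped. T_STRING covers every bare
    // identifier, including class names. Use the constant rather than its
    // numeric value (308 here), which differs between PHP versions.
    if (is_array($token) && $token[0] === T_STRING) {
        // $token[1] contains the identifier text
        echo $token[1] . "\n";
    }
}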

test2.php:

<?php
  class Entity { }
class                                             Entity {
}
class NoFieldsException extends Exception {}
abstract
    class
             CacheArray implements ArrayAccess { }
// But not for these:

 /* This class should be extended by systems that need to cache large amounts */

Output:

Entity
Entity
NoFieldsException
Exception
CacheArray
ArrayAccess

Note that it may not be desirable for Exception and ArrayAccess to show up here too, since one is a native PHP class being extended and the other is an interface being implemented. If so, try print_r($tokens); and experiment with the result to get what you want. You could also collect the names in an array and call array_unique() on it to get the unique values.
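If PHP is not at hand, a grep-based pipeline can approximate the same list while keeping only the declared names and dropping duplicates. This is a rough sketch rather than a replacement for the tokenizer: it assumes each declaration sits on one line, and the temporary directory and sample.php file are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample input standing in for a PHP source tree.
dir=$(mktemp -d)
cat > "$dir/sample.php" <<'EOF'
<?php
  class Entity { }
class Entity {
}
class NoFieldsException extends Exception {}
abstract class CacheArray implements ArrayAccess { }
 /* This class should be extended by systems that need to cache large amounts */
EOF

# -h drops file names and -o prints only the matched text.
# sed strips everything up to and including the "class" keyword,
# and sort -u removes duplicate names.
names=$(grep -ERho '^[[:space:]]*(abstract[[:space:]]+)?class[[:space:]]+[A-Za-z_][A-Za-z0-9_]*' "$dir" \
  | sed -E 's/^.*class[[:space:]]+//' \
  | sort -u)

printf '%s\n' "$names"
rm -rf "$dir"
```

Unlike the tokenizer, this misses a declaration split across several lines, such as the abstract / class / CacheArray example in test2.php.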

As arkascha mentions, there are probably too many possibilities to handle this reliably with a regex. However, given the examples you have, this should work for them:

/^(\s+)?(abstract\s+)?class\s+(\S+)/igm

See it in action: http://regex101.com/r/qM7iO0/4
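To use this with grep itself, the pattern can be rewritten as a POSIX extended regex, since not every grep understands \s and the /igm flags have no direct equivalent. The following is a minimal sketch checked only against the example lines from the question. The sample file and variable names are hypothetical, and it only handles declarations that fit on one line.

```shell
#!/bin/sh
# Optional leading whitespace, optional "abstract", then "class" and a name.
pattern='^[[:space:]]*(abstract[[:space:]]+)?class[[:space:]]+[^[:space:]]+'

# Hypothetical sample file holding the lines from the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
  class Entity {
class Entity {
class NoFieldsException extends Exception {}
abstract class CacheArray implements ArrayAccess {
 * This class should be extended by systems that need to cache large amounts
 *   The name of the class to check or load.
EOF

# Only the four declarations should be printed; the comment lines are skipped
# because "class" must be the first word (after an optional "abstract").
matches=$(grep -E "$pattern" "$sample")
printf '%s\n' "$matches"
rm -f "$sample"

# Against a real source tree, something like this lists file names and lines:
#   grep -ERn "$pattern" .
```

In the original attempt, \? matches a literal question mark under grep -E, and there is no space between abstract and class, so (^|\?abstract)class can never match an abstract class line. Adding -i would mirror the i flag of the regex above, since PHP keywords are case-insensitive.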

