
Tool for finding C-style Casts

Does anyone know of a tool that I can use to find explicit C-style casts in code? I am refactoring some C++ code and want to replace C-style casts wherever possible.

An example C-style cast would be:

Foo foo = (Foo) bar;

In contrast, examples of C++-style casts would be:

Foo foo = static_cast<Foo>(bar);
Foo foo = reinterpret_cast<Foo>(bar);
Foo foo = const_cast<Foo>(bar);

If you are using gcc/g++, just enable the warning for C-style casts:

g++ -Wold-style-cast ...
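On a large code base the warnings can get buried among other diagnostics, so it may help to boil them down to a list of locations. A minimal sketch, assuming the usual GCC diagnostic shape `file:line:col: warning: message [-Wold-style-cast]` (the exact message wording varies between GCC versions, and the file names below are made up for illustration):

```python
import re

# Assumed diagnostic shape: file:line[:col]: warning: ... [-Wold-style-cast]
DIAGNOSTIC = re.compile(
    r'^(?P<file>[^:\n]+):(?P<line>\d+):(?:(?P<col>\d+):)?\s*warning:'
    r'.*\[-Wold-style-cast\]\s*$',
    re.MULTILINE,
)


def old_style_cast_locations(compiler_output):
    """Return sorted, de-duplicated (file, line) pairs for old-style-cast warnings."""
    hits = {
        (m.group('file'), int(m.group('line')))
        for m in DIAGNOSTIC.finditer(compiler_output)
    }
    return sorted(hits)


# Illustrative compiler output; foo.cpp and bar.cpp are hypothetical files.
sample = """\
foo.cpp: In function 'int main()':
foo.cpp:12:19: warning: use of old-style cast to 'Foo' [-Wold-style-cast]
bar.cpp:7:10: warning: unused variable 'x' [-Wunused-variable]
bar.cpp:30:5: warning: use of old-style cast to 'void*' [-Wold-style-cast]
"""

if __name__ == '__main__':
    for path, line in old_style_cast_locations(sample):
        print(f'{path}:{line}')
```

The compiler writes diagnostics to stderr, so in practice you would redirect stderr to a file (or capture it from your build system) and feed that text to `old_style_cast_locations`.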

The fact that such casts are so hard to search for is one of the reasons new-style casts were introduced in the first place. And if your code is working, this seems like a rather pointless bit of refactoring - I'd simply change them to new-style casts whenever I modified the surrounding code.

Having said that, the fact that you have C-style casts at all in C++ code would indicate problems with the code which should be fixed - I wouldn't just do a global substitution, even if that were possible.

Searching for the regular expression \)\w gives surprisingly good results.
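To get a feel for what that pattern does and does not catch, here is a small sketch using Python's `re` module (the sample C++ lines are invented for illustration). It finds casts written without a space after the closing parenthesis, misses `(Foo) bar`, and also flags non-casts such as `if (ready)return;`:

```python
import re

# ')' immediately followed by a word character, e.g. "(Foo)bar".
CAST_HINT = re.compile(r'\)\w')


def lines_with_cast_hint(source):
    """Return 1-based line numbers where ')' is directly followed by a word character."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if CAST_HINT.search(line)
    ]


# Invented sample; lines 1 and 4 are casts, line 2 is a cast with a space,
# line 5 is a false positive.
sample = """\
Foo a = (Foo)bar;
Foo b = (Foo) bar;
int n = f(x) + 1;
void* p = (void*)q;
if (ready)return;
"""

if __name__ == '__main__':
    print(lines_with_cast_hint(sample))  # [1, 4, 5]
```

So the results still need a manual pass, but the list of candidates is usually short enough to review.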

The Offload C++ compiler supports options to report as a compile time error all such casts, and to restrict the semantics of such casts to a safer equivalence with static_cast.

The relevant options are:

-cp_nocstylecasts   

The compiler will issue an error on all C-style casts. C-style casts in C++ code can potentially be unsafe and lead to undesired or undefined behaviour (for example casting pointers to unrelated struct/class types). This option is useful for refactoring to find all those casts and replace them with safer C++ casts such as static_cast.

-cp_c2staticcasts   

The compiler applies the more restricted semantics of C++ static_cast to C-style casts. Compiling code with this option switched on ensures that C-style casts are at least as safe as C++ static_casts.

This option is useful if existing code has a large number of C-style casts and refactoring each cast into C++ casts would be too much effort.

A tool that can analyze C++ source code accurately and carry out automated custom changes (e.g., your cast replacement) is the DMS Software Reengineering Toolkit.

DMS has a full C++ parser, builds ASTs and symbol tables, and can thus navigate your code to reliably find C style casts. By using pattern-directed matches and rewrites, you can provide a set of rules that would convert all such C-style casts into your desired C++ equivalents.

DMS has been used to carry out massive automated C++ reengineering tasks for Boeing and General Dynamics, each involving thousands of files.

One issue with C-style casts is that, since they rely on parentheses, which are heavily overloaded, they're not trivial to spot. Still, a regex such as (e.g., in Python syntax):

r'\(\s*\w+\s*\)'

is a start -- it matches a single identifier in parentheses with optional whitespace inside the parentheses. But of course that won't catch, e.g., (void*) casts -- to get trailing asterisks as well, use:

r'\(\s*\w+[\s*]*\)'

You could also start with an optional const to broaden the net still further, etc, etc.
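Here is a quick sketch that runs the two patterns above, plus one possible way to add the optional const (that third pattern is my own guess at what "start with an optional const" would look like, not something the answer spelled out), against a few invented lines:

```python
import re

# The two patterns from the answer.
SIMPLE = re.compile(r'\(\s*\w+\s*\)')
WITH_POINTERS = re.compile(r'\(\s*\w+[\s*]*\)')
# Assumption: one way to allow a leading "const".
WITH_CONST = re.compile(r'\(\s*(?:const\s+)?\w+[\s*]*\)')


def find_casts(pattern, source):
    """Return (line_number, matched_text) pairs for every match of pattern."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        hits.extend((number, match.group()) for match in pattern.finditer(line))
    return hits


# Invented sample; line 4 is a function call, not a cast.
sample = """\
Foo foo = (Foo) bar;
void* p = (void*) q;
const char* s = (const char*) buf;
int n = f(x);
"""

if __name__ == '__main__':
    for name, pattern in [('simple', SIMPLE),
                          ('pointers', WITH_POINTERS),
                          ('const', WITH_CONST)]:
        print(name, find_casts(pattern, sample))
```

Each broader pattern picks up one more of the casts, and all three report `(x)` from the function call on line 4, so expect to filter out false positives by hand.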

Once you have a good RE, many tools (from grep to vim, from awk to sed, plus perl, python, ruby, etc.) let you apply it to identify all of its matches in your source.

If you use some kind of Hungarian-style notation (e.g. iInteger, pPointer, etc.) then you can search for e.g. )p and ) p and so on.

It should be possible to find all those places in reasonable time even for a large code base.
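The same search can be done in one pass with a regex. A rough sketch, assuming the prefixes are single lowercase letters followed by a capitalized name (the prefix list and sample lines are made up; adjust them to your own naming convention):

```python
import re


def hungarian_cast_pattern(prefixes):
    """Build a regex for ')' + optional whitespace + prefix + capitalized name."""
    alternatives = '|'.join(re.escape(prefix) for prefix in prefixes)
    return re.compile(r'\)\s*(?:' + alternatives + r')[A-Z]\w*')


def find_hungarian_casts(source, prefixes):
    """Return (line_number, matched_text) pairs for likely casts."""
    pattern = hungarian_cast_pattern(prefixes)
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        hits.extend((number, match.group()) for match in pattern.finditer(line))
    return hits


# Invented sample; only the first two lines contain casts.
sample = """\
Foo* f = (Foo*)pBar;
int n = (int) iCount;
if (ok) print();
int m = f(x) + plain;
"""

if __name__ == '__main__':
    print(find_hungarian_casts(sample, ['p', 'i']))
    # [(1, ')pBar'), (2, ') iCount')]
```

Requiring an uppercase letter after the prefix is what keeps `) print` from matching.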

I already answered once with a description of a tool that will find and change all the casts if you want it to.

If all you want to do is find such casts, there's another tool that will do this easily, and in fact is the extreme generalization of all the "regular expression" suggestions made here. That is the SD Source Code Search Engine. This tool enables one to search large code bases in terms of the language elements that make up each language. It provides a GUI allowing you to enter queries, see individual hits, and show the file text at the hit point with one mouse click. One more click and you can be in your editor [for many editors] on a file. The tool will also record a list of hits in context so you can revisit them later.

In your case, the following search engine query is likely to get most of the casts:

'(' I ')'  | '(' I ... '*' ')'

which means: find a sequence of tokens, the first being '(', the second being any identifier, and the third being ')', or a similar sequence involving something that ends in '*'.

You don't specify any whitespace management, as the tool understands the language whitespace rules; it will even ignore a comment in the middle of a cast and still match the above.
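To illustrate why matching on tokens rather than characters helps, here is a toy sketch of the idea. It is not how the Search Engine is implemented and does not use its query language; it is a deliberately simplified tokenizer (no string literals, numbers, or preprocessor handling) with hypothetical function names:

```python
import re

TOKEN = re.compile(r'''
    (?P<comment>//[^\n]*|/\*.*?\*/)   # line and block comments (skipped)
  | (?P<space>\s+)                    # whitespace (skipped)
  | (?P<ident>[A-Za-z_]\w*)           # identifiers and keywords
  | (?P<punct>.)                      # any other single character
''', re.VERBOSE | re.DOTALL)


def tokenize(source):
    """Return (kind, text) tokens, dropping comments and whitespace."""
    return [
        (match.lastgroup, match.group())
        for match in TOKEN.finditer(source)
        if match.lastgroup not in ('comment', 'space')
    ]


def find_cast_candidates(source):
    """Find '(' ident ')' and '(' ident ... '*' ')' token sequences."""
    tokens = tokenize(source)
    hits = []
    for i, (_, text) in enumerate(tokens):
        if text != '(' or i + 1 >= len(tokens) or tokens[i + 1][0] != 'ident':
            continue
        # Scan to the closing ')', giving up on nested parentheses or ';'.
        j = i + 2
        while j < len(tokens) and tokens[j][1] not in ('(', ')', ';'):
            j += 1
        if j >= len(tokens) or tokens[j][1] != ')':
            continue
        inner = tokens[i + 1:j]
        if len(inner) == 1 or inner[-1][1] == '*':
            hits.append(' '.join(token_text for _, token_text in tokens[i:j + 1]))
    return hits


# Invented sample: a cast with a comment inside it, a pointer cast, and a call.
sample = """\
Foo foo = ( /* why? */ Foo ) bar;
void* p = (const void *) q;
int n = g(a, b);
"""

if __name__ == '__main__':
    print(find_cast_candidates(sample))
```

Because comments and whitespace are dropped before matching, the cast with a comment in the middle is still found. A single-argument call such as `g(x)` would still be reported, since it has the same token shape as a cast.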

[I'm the CTO at the company that supplies this.]

I used this regular expression in the Visual Studio (2010) Find in Files search box: :i\):i

Thanks to sth for the inspiration (his post).
