
Parsing a command line string into argv format

I need to parse a command line string into argv format so that I can pass it to execvpe. Basically, I want a Linux equivalent of CommandLineToArgvW() from Windows. Is there a function or library I could call to do this, or do I have to write my own parser? (I was hoping I could steal from Bash if I needed to, since my program is GPL...)

Example: I have three variables:

const char* file = "someapplication";
const char* parameters = "param1 -option1 param2";
const char* environment[] = { "Something=something", NULL };

and I want to pass it to execvpe:

execvpe(file, /* parsed parameters */, environment);

PS: I do not want filename expansion, but I do want quoting and escaping.
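To make the requirement concrete, here is a minimal sketch of a splitter that honours single quotes, double quotes, and backslash escapes but never expands filenames. It is not taken from Bash or from any answer in this thread: the name splitCommandLine is hypothetical, and the rules are a simplified subset of shell quoting (inside double quotes a backslash is assumed to escape only `"` and `\`, and an unterminated quote is closed at the end of the line).

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the thread): splits a command line into
// arguments using a simplified subset of shell quoting rules.
// No globbing, variable expansion, or command substitution is performed.
std::vector<std::string> splitCommandLine(const std::string& line)
{
    std::vector<std::string> args;
    std::string current;
    bool inToken = false;   // true once the current argument has started
    char quote = '\0';      // active quote character, or '\0' if none

    for (std::string::size_type i = 0; i < line.size(); ++i) {
        const char c = line[i];

        if (quote == '\'') {
            // Inside single quotes everything is literal until the closing quote.
            if (c == '\'') quote = '\0';
            else current += c;
        } else if (quote == '"') {
            if (c == '"') {
                quote = '\0';
            } else if (c == '\\' && i + 1 < line.size() &&
                       (line[i + 1] == '"' || line[i + 1] == '\\')) {
                // Assumption: inside double quotes a backslash escapes only " and \.
                current += line[++i];
            } else {
                current += c;
            }
        } else if (c == '\'' || c == '"') {
            // An opening quote starts (or continues) an argument, even an empty one.
            quote = c;
            inToken = true;
        } else if (c == '\\' && i + 1 < line.size()) {
            // Outside quotes a backslash makes the next character literal.
            current += line[++i];
            inToken = true;
        } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
            // Unquoted whitespace ends the current argument, if there is one.
            if (inToken) {
                args.push_back(current);
                current.clear();
                inToken = false;
            }
        } else {
            current += c;
            inToken = true;
        }
    }

    // Flush the last argument; an unterminated quote is closed implicitly.
    if (inToken) args.push_back(current);
    return args;
}
```

Because adjacent quoted and unquoted pieces are appended to the same argument, an input such as `foo"bar baz"` yields the single argument `foobar baz`, and a pattern such as `*.txt` is passed through unchanged.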

I used the link given by rve in the comments ( http://bbgen.net/blog/2011/06/string-to-argc-argv ) and that solved my problem. Upvote his comment, not my answer!

Use my nargv procedure (nargv means New Argument Vectors). I've literally beaten this question to death with this answer ( https://stackoverflow.com/a/10071763/735796 ). It supports everything a shell would with regard to parsing a string into separate elements. For example, it supports double quotes, single quotes, and string concatenation.

The following is my first attempt at duplicating what the CommandLineToArgvW() function in the Windows shell library does. It uses only standard functions and types, and does not use Boost. Except for one call to strdup() , it is platform independent and works in both Windows and Linux environments. It handles arguments that are single quoted or double quoted.

// Snippet copied from a larger file.  I hope I added all the necessary includes.
#include <stdlib.h>   // malloc
#include <string>
#include <string.h>   // strdup, strchr, strpbrk
#include <vector>

using namespace std;

char ** CommandLineToArgv( string const & line, int & argc )
{
    typedef vector<char *> CharPtrVector;
    char const * WHITESPACE_STR = " \n\r\t";
    char const SPACE = ' ';
    char const TAB = '\t';
    char const DQUOTE = '\"';
    char const SQUOTE = '\'';
    char const TERM = '\0';


    //--------------------------------------------------------------------------
    // Copy the command line string to a character array.
    // strdup() uses malloc() to get memory for the new string.
#if defined( WIN32 )
    char * pLine = _strdup( line.c_str() );
#else
    char * pLine = strdup( line.c_str() );
#endif


    //--------------------------------------------------------------------------
    // Crawl the character array and tokenize in place.
    CharPtrVector tokens;
    char * pCursor = pLine;
    while ( *pCursor )
    {
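        // Whitespace.  Uses the same set as WHITESPACE_STR so that a stray
        // newline or carriage return never becomes the start of a token.
        if ( *pCursor == SPACE || *pCursor == TAB || *pCursor == '\n' || *pCursor == '\r' )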
        {
            ++pCursor;
        }

        // Double quoted token.
        else if ( *pCursor == DQUOTE )
        {
            // Begin of token is one char past the begin quote.
            // Replace the quote with whitespace.
            tokens.push_back( pCursor + 1 );
            *pCursor = SPACE;

            char * pEnd = strchr( pCursor + 1, DQUOTE );
            if ( pEnd )
            {
                // End of token is one char before the end quote.
                // Replace the quote with terminator, and advance cursor.
                *pEnd = TERM;
                pCursor = pEnd + 1;
            }
            else
            {
                // End of token is end of line.
                break;
            }
        }

        // Single quoted token.
        else if ( *pCursor == SQUOTE )
        {
            // Begin of token is one char past the begin quote.
            // Replace the quote with whitespace.
            tokens.push_back( pCursor + 1 );
            *pCursor = SPACE;

            char * pEnd = strchr( pCursor + 1, SQUOTE );
            if ( pEnd )
            {
                // End of token is one char before the end quote.
                // Replace the quote with terminator, and advance cursor.
                *pEnd = TERM;
                pCursor = pEnd + 1;
            }
            else
            {
                // End of token is end of line.
                break;
            }   
        }

        // Unquoted token.
        else
        {
            // Begin of token is at cursor.
            tokens.push_back( pCursor );

            char * pEnd = strpbrk( pCursor + 1, WHITESPACE_STR );
            if ( pEnd )
            {
                // End of token is one char before the next whitespace.
                // Replace whitespace with terminator, and advance cursor.
                *pEnd = TERM;
                pCursor = pEnd + 1;
            }
            else
            {
                // End of token is end of line.
                break;
            }
        }
    }


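    //--------------------------------------------------------------------------
    // Fill the argv array.  One extra slot holds the terminating NULL that
    // execvpe() and the other exec functions expect.
    // Note: the tokens point into pLine, so that buffer must stay allocated
    // for as long as argv is in use.
    argc = static_cast<int>( tokens.size() );
    char ** argv = static_cast<char **>( malloc( ( argc + 1 ) * sizeof( char * ) ) );
    int a = 0;
    for ( CharPtrVector::const_iterator it = tokens.begin(); it != tokens.end(); ++it )
    {
        argv[ a++ ] = (*it);
    }
    argv[ argc ] = NULL;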


    return argv;
}
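If the arguments are already held in a std::vector<std::string>, the hand-off to an exec-style function can be sketched as below. This is not part of the answer above: makeArgv is a hypothetical helper, and it assumes C++11 or later so that each std::string exposes a writable, NUL-terminated buffer.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: builds a NULL-terminated pointer array over `args`.
// The pointers refer into `args`, so `args` must outlive the result and
// must not be modified while the result is in use.
std::vector<char*> makeArgv(std::vector<std::string>& args)
{
    std::vector<char*> argv;
    argv.reserve(args.size() + 1);
    for (std::string& arg : args) {
        argv.push_back(&arg[0]);   // writable, NUL-terminated buffer (C++11)
    }
    argv.push_back(nullptr);       // exec functions require a NULL terminator
    return argv;
}
```

The result's data() pointer can then be passed as the argv parameter of execvpe(). By convention argv[0] is the program name, so the file name would normally be placed first in args before the parsed parameters.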

char *strtok(char *s, const char *delim) is what you are looking for.

Here char *s is the input string (the command line) and char *delim is " ".
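A minimal sketch of the strtok() approach follows. The helper name splitWithStrtok is hypothetical, and the delimiter set is assumed to be space and tab. Note that strtok() only splits on delimiters: quotes and backslashes are not interpreted, so a quoted argument containing a space is still broken in two.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: splits on spaces and tabs with strtok().
// Quotes and backslashes are NOT interpreted.
std::vector<std::string> splitWithStrtok(const std::string& line)
{
    // strtok() modifies its input, so work on a writable, NUL-terminated copy.
    std::vector<char> buffer(line.begin(), line.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;
    for (char* tok = std::strtok(buffer.data(), " \t");
         tok != nullptr;
         tok = std::strtok(nullptr, " \t")) {
        tokens.push_back(tok);   // copies the token out of the buffer
    }
    return tokens;
}
```

strtok() also keeps hidden static state between calls, so it is not safe to use from several threads or in nested tokenizing loops; POSIX provides strtok_r() for those cases.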

Maybe I'm missing something, but why don't you just pass the &argv[1] as parameters and the environment obtained using getenv() as environment?

EDIT: If you want a different separator, you can use the environment variable IFS (internal field separator) to achieve this.

