简体   繁体   中英

How to get const char * from external dll written in C++ into C#

I am integrating CMU's PocketSphinx with Unity by compiling my own C++ project in Visual Studio 2010 as a DLL, which I am calling from a C# script from within Unity Pro. I know that the dll works because I have made another project as an exe with the very same code, compiled it, and it works perfectly as a standalone program. I'm using the pocketsphinx_continuous project example, which gets microphone inputs and outputs text to the console. I have customized this code to be called from inside Unity and it is supposed to output back to my C# code as a string rather than to the console. I feel that I almost have this working, but the const char * is just not making it back as a string. I will end up getting access violation errors if I use this declaration:

private static extern string recognize_from_microphone();

so, I have tried to use this one:

private static extern IntPtr recognize_from_microphone();

and then, I use this line of code to try to print the output of that function:

print("you just said " + Marshal.PtrToStringAnsi(recognize_from_microphone()));

but, then I get only "you just said" in return. I can manage to get a memory address back if I do this: print("you just said " + recognize_from_microphone()); So, I know that something is getting returned.

Here is my C++ code (much of this was written originally in C as the example code from pocketsphinx):

char* MakeStringCopy (const char* str) 
{
  if (str == NULL) return NULL;
  char* res = (char*)malloc(strlen(str) + 1);
  strcpy(res, str);
  return res;
}


extern __declspec(dllexport) const char * recognize_from_microphone()
{
//this is a near complete duplication of the code from main()
char const *cfg;
config = cmd_ln_init(NULL, ps_args(), TRUE,
"-hmm", MODELDIR "\\hmm\\en_US\\hub4wsj_sc_8k",
"-lm", MODELDIR "\\lm\\en\\turtle.DMP",
"-dict", MODELDIR "\\lm\\en\\turtle.dic",
NULL);

if (config == NULL)
{
   return "config is null";
}

ps = ps_init(config);
if (ps == NULL)
{
   return "ps is null";
}

ad_rec_t *ad;
int16 adbuf[4096];
int32 k, ts, rem;
char const *hyp;
char const *uttid;
cont_ad_t *cont;
char word[256];
char words[1024] = "";
//char temp[] = "hypothesis";
//hyp = temp;

if ((ad = ad_open_dev(cmd_ln_str_r(config, "-adcdev"),
                      (int)cmd_ln_float32_r(config, "-samprate"))) == NULL)
    E_FATAL("Failed to open audio device\n");

/* Initialize continuous listening module */
if ((cont = cont_ad_init(ad, ad_read)) == NULL)
    E_FATAL("Failed to initialize voice activity detection\n");
if (ad_start_rec(ad) < 0)
    E_FATAL("Failed to start recording\n");
if (cont_ad_calib(cont) < 0)
    E_FATAL("Failed to calibrate voice activity detection\n");

for (;;) {
    /* Indicate listening for next utterance */
    //printf("READY....\n");
    fflush(stdout);
    fflush(stderr);

    /* Wait data for next utterance */
    while ((k = cont_ad_read(cont, adbuf, 4096)) == 0)
        sleep_msec(100);

    if (k < 0)
        E_FATAL("Failed to read audio\n");

    /*
     * Non-zero amount of data received; start recognition of new utterance.
     * NULL argument to uttproc_begin_utt => automatic generation of utterance-id.
     */
    if (ps_start_utt(ps, NULL) < 0)
        E_FATAL("Failed to start utterance\n");

    ps_process_raw(ps, adbuf, k, FALSE, FALSE);
    //printf("Listening...\n");
    fflush(stdout);

    /* Note timestamp for this first block of data */
    ts = cont->read_ts;

    /* Decode utterance until end (marked by a "long" silence, >1sec) */
    for (;;) {

        /* Read non-silence audio data, if any, from continuous listening module */
        if ((k = cont_ad_read(cont, adbuf, 4096)) < 0)
            E_FATAL("Failed to read audio\n");
        if (k == 0) {
            /*
             * No speech data available; check current timestamp with most recent
             * speech to see if more than 1 sec elapsed.  If so, end of utterance.
             */
            if ((cont->read_ts - ts) > DEFAULT_SAMPLES_PER_SEC)
                break;
        }
        else {
            /* New speech data received; note current timestamp */
            ts = cont->read_ts;
        }

        /*
         * Decode whatever data was read above.
         */
        rem = ps_process_raw(ps, adbuf, k, FALSE, FALSE);

        /* If no work to be done, sleep a bit */
        if ((rem == 0) && (k == 0))
            sleep_msec(20);
    }

    /*
     * Utterance ended; flush any accumulated, unprocessed A/D data and stop
     * listening until current utterance completely decoded
     */
    ad_stop_rec(ad);
    while (ad_read(ad, adbuf, 4096) >= 0);
    cont_ad_reset(cont);
    fflush(stdout);
    /* Finish decoding, obtain and print result */
    ps_end_utt(ps);

    hyp = ps_get_hyp(ps, NULL, &uttid);
    fflush(stdout);

    /* Exit if the first word spoken was GOODBYE */
   //actually, for unity, exit if any word was spoken at all! this will avoid an infinite loop of doom!
    if (hyp) {
        /*sscanf(hyp, "%s", words);
        if (strcmp(word, "goodbye") == 0)*/
            break;
    }
   else
     return "nothing returned";
    /* Resume A/D recording for next utterance */
    if (ad_start_rec(ad) < 0)
        E_FATAL("Failed to start recording\n");
}
cont_ad_close(cont);
ad_close(ad);
ps_free(ps);
const char *temp = new char[1024];
temp = MakeStringCopy(hyp);
return temp;}

If change return temp; to return "some string here"; Then I see the text appear inside Unity. That's not helpful, though, because I don't need hardcoded text, I need the output of the speech recognition code, which ends up getting stored inside the hyp variable.

Can anyone help me figure out what I'm doing wrong? Thanks!

The problem is that you are not supposed to allocate raw memory in C++ and consume it in C# this way, who is going to get rid of the memory you allocated inside the function MakeStringCopy ?

Try something like this instead:

[DllImport("MyLibrary.dll")]
[return: MarshalAs(UnmanagedType.LPStr)] 
public static extern string GetStringValue();

This way you are telling the marshaler that CLR owns the memory that came as result of calling the function and it will take care of the de-allocation.

Additionally, .Net strings contain unicode chars , which is why you were getting access violation errors when you tried to assign ANSI chars to it. Using the attribute UnmanagedType.LPStr also tells the marshaler the type of chars it should expect so that it can do the conversion for you.

Finally, for memory allocation in C++ side, according to this sample in MSDN you should use the function CoTaskMemAlloc instead of malloc inside your function MakeStringCopy .

Finally got this working! I ended up having to pass a stringbuilder object into the C++ function and get the string from that object in C# just like I found in this post: http://www.pcreview.co.uk/forums/passing-and-retrieving-string-calling-c-function-c-t1367069.html

The code is slower than I'd like, but at least it works now. Here was my final code:

C#:

[DllImport ("pocketsphinx_unity",CallingConvention=CallingConvention.Cdecl,CharSet = CharSet.Ansi)]
private static extern void recognize_from_microphone(StringBuilder str);StringBuilder mytext= new StringBuilder(1000);
recognize_from_microphone(mytext);
print("you just said " + mytext.ToString());

C++:

extern __declspec(dllexport) void recognize_from_microphone(char * fromUnity){
static ps_decoder_t *ps;
static cmd_ln_t *config;

config = cmd_ln_init(NULL, ps_args(), TRUE,
"-hmm", MODELDIR "\\hmm\\en_US\\hub4wsj_sc_8k",
"-lm", MODELDIR "\\lm\\en\\turtle.DMP",
"-dict", MODELDIR "\\lm\\en\\turtle.dic",
NULL);

if (config == NULL)
{
    //return "config is null";
}

ps = ps_init(config);
if (ps == NULL)
{
    //return "ps is null";
}

ad_rec_t *ad;
int16 adbuf[4096];
int32 k, ts, rem;
char const *hyp;
char const *uttid;
cont_ad_t *cont;
//char word[256];
char * temp;

if ((ad = ad_open_dev(cmd_ln_str_r(config, "-adcdev"),
                      (int)cmd_ln_float32_r(config, "-samprate"))) == NULL)
    printf("Failed to open audio device\n");

/* Initialize continuous listening module */
if ((cont = cont_ad_init(ad, ad_read)) == NULL)
    printf("Failed to initialize voice activity detection\n");
if (ad_start_rec(ad) < 0)
    printf("Failed to start recording\n");
if (cont_ad_calib(cont) < 0)
    printf("Failed to calibrate voice activity detection\n");

for (;;) {
    /* Indicate listening for next utterance */
    //printf("READY....\n");
    fflush(stdout);
    fflush(stderr);

    /* Wait data for next utterance */
    while ((k = cont_ad_read(cont, adbuf, 4096)) == 0)
        sleep_msec(100);

    if (k < 0)
        printf("Failed to read audio\n");

    /*
     * Non-zero amount of data received; start recognition of new utterance.
     * NULL argument to uttproc_begin_utt => automatic generation of utterance-id.
     */
    if (ps_start_utt(ps, NULL) < 0)
        printf("Failed to start utterance\n");

    ps_process_raw(ps, adbuf, k, FALSE, FALSE);
    //printf("Listening...\n");
    fflush(stdout);

    /* Note timestamp for this first block of data */
    ts = cont->read_ts;

    /* Decode utterance until end (marked by a "long" silence, >1sec) */
    for (;;) {

        /* Read non-silence audio data, if any, from continuous listening module */
        if ((k = cont_ad_read(cont, adbuf, 4096)) < 0)
            printf("Failed to read audio 2nd\n");
        if (k == 0) {
            /*
             * No speech data available; check current timestamp with most recent
             * speech to see if more than 1 sec elapsed.  If so, end of utterance.
             */
            if ((cont->read_ts - ts) > DEFAULT_SAMPLES_PER_SEC)
                break;
        }
        else {
            /* New speech data received; note current timestamp */
            ts = cont->read_ts;
        }

        /*
         * Decode whatever data was read above.
         */
        rem = ps_process_raw(ps, adbuf, k, FALSE, FALSE);

        /* If no work to be done, sleep a bit */
        if ((rem == 0) && (k == 0))
            sleep_msec(20);
    }

    /*
     * Utterance ended; flush any accumulated, unprocessed A/D data and stop
     * listening until current utterance completely decoded
     */
    ad_stop_rec(ad);
    while (ad_read(ad, adbuf, 4096) >= 0);
    cont_ad_reset(cont);
    fflush(stdout);
    /* Finish decoding, obtain and print result */
    ps_end_utt(ps);

    hyp = ps_get_hyp(ps, NULL, &uttid);
    fflush(stdout);

    /* Exit if the first word spoken was GOODBYE */
    //actually, for unity, exit if any word was spoken at all! this will avoid an infinite loop of doom!
    if (hyp) {
        strcpy(fromUnity,hyp);
        break;               
    }
    else
        //return "nothing returned";
    /* Resume A/D recording for next utterance */
    if (ad_start_rec(ad) < 0)
        printf("Failed to start recording\n");
}

cont_ad_close(cont);
ad_close(ad);
ps_free(ps);
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM