
Watson Conversation Unity - Add third-party API data to the Watson Conversation/Assistant dialog answer

Is it possible to send information from another API, requested in Unity, to the Watson Conversation service and include that data in the answer of a particular dialog node?
For example: I have connected the Watson Conversation, speech-to-text, and text-to-speech services to Unity. Also in Unity, I have made an API call to openweathermap.org. The flow that's currently set up is: when the user says "What's the weather in Berlin?", the bot recognizes the intent #QuestionWeather and the entity @city: Berlin. It hits the dialog node about the weather and the bot responds with: The current temperature in Berlin is ... Through the OpenWeatherMap API call I get the temperature and can show it in Unity.
But what I actually want is to send the temperature variable to the Watson Conversation service and add that piece of information to the output text, so the user will also hear the temperature instead of reading it off the screen.
I was thinking about creating a $context variable to send the API temperature info to. Is that possible? Or is what I'm trying to achieve not yet possible with Watson Conversation and Unity in C#?
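For reference, this is roughly how I get the temperature in Unity (a minimal sketch only; the WeatherFetcher class name, the myApiKey constant, and the callback are placeholders, not part of the script below, and the endpoint is OpenWeatherMap's current-weather API):

using System;
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.Networking;
using MiniJSON;

// Sketch only: class name, API key and callback are placeholders.
public class WeatherFetcher : MonoBehaviour
{
    private const string myApiKey = "YOUR_OPENWEATHERMAP_KEY"; // placeholder

    public IEnumerator GetTemperature(string city, Action<string> onTemperature)
    {
        string url = "https://api.openweathermap.org/data/2.5/weather?q="
            + UnityWebRequest.EscapeURL(city) + "&units=metric&appid=" + myApiKey;

        using (UnityWebRequest request = UnityWebRequest.Get(url))
        {
            yield return request.SendWebRequest();

            if (request.isNetworkError || request.isHttpError)
            {
                Debug.LogError("Weather request failed: " + request.error);
                yield break;
            }

            // "main.temp" holds the temperature in °C (because of units=metric).
            var json = Json.Deserialize(request.downloadHandler.text) as Dictionary<string, object>;
            var main = json["main"] as Dictionary<string, object>;
            onTemperature(main["temp"].ToString());
        }
    }
}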
This is my script for the Watson services:

using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using IBM.Watson.DeveloperCloud.Services.TextToSpeech.v1;
using IBM.Watson.DeveloperCloud.Services.Conversation.v1;
//using IBM.Watson.DeveloperCloud.Services.Assistant.v1;
using IBM.Watson.DeveloperCloud.Services.ToneAnalyzer.v3;
using IBM.Watson.DeveloperCloud.Services.SpeechToText.v1;
using IBM.Watson.DeveloperCloud.Logging;
using IBM.Watson.DeveloperCloud.Utilities;
using IBM.Watson.DeveloperCloud.Connection;
using IBM.Watson.DeveloperCloud.DataTypes;
using MiniJSON;
using UnityEngine.UI;
using FullSerializer;

public class WatsonAgent : MonoBehaviour {

public string literalEntityCity;
public Text DestinationField;
public Text DepartureField;
public Text DepartureTime;
public Text StayingPeriod;
public Text AdultsField;
public Text KidsField;
public Text RoomsField;
public Text TransportationField;
public string destinationCity;
public string departureCity;

[SerializeField]
private fsSerializer _serializer = new fsSerializer();

[System.Serializable]
public class CredentialInformation
{
    public string username, password, url;
}

[System.Serializable]
public class Services
{
    public CredentialInformation
        textToSpeech, 
        languageTranslator, 
        personality_insights, 
        conversation, 
        speechToText, 
        toneAnalyzer;
}

[Header("Credentials")]
[Space]
public Services
    serviceCredentials;

[Space]
[Header("Agent voice settings")]
[Space]
public AudioSource
    voiceSource;

public VoiceType
    voiceType;

[Space]
[Header("Conversation settings")]
[Space]
public string
    workspaceId;

[Space]
[Header("Feedback fields")]
[Space]
public Text
    speechToTextField;
public Text 
    conversationInputField;
public Text
    conversationOutputField;

[System.Serializable]
public class Emotion
{
    public string 
        emotionId;
    public float 
        power;
}

[Space]
[Header("Emotions (read only)")]
[Space]
public List<Emotion> 
    emotions = new List<Emotion>();

public enum SocialState
{
    idle, listening, thinking, talking
}

[Space]
[Header("Agent social behaviour (read only)")]
[Space]
public SocialState
    characterState;

// services
SpeechToText
    speechToText;

private int 
    recordingRoutine = 0,
    recordingBufferSize = 1,
    recordingHZ = 22050;

private string 
    microphoneID = null;

private AudioClip 
    recording = null;

TextToSpeech 
    textToSpeech;

Conversation
    conversation;

private Dictionary<string, object> 
    conversationContext = null;

ToneAnalyzer
    toneAnalyzer;


private void Start()
{
    PrepareCredentials();
    Initialize();
}

void PrepareCredentials()
{
    speechToText = new SpeechToText(GetCredentials(serviceCredentials.speechToText));
    textToSpeech = new TextToSpeech(GetCredentials(serviceCredentials.textToSpeech));
    conversation = new Conversation(GetCredentials(serviceCredentials.conversation));
    toneAnalyzer = new ToneAnalyzer(GetCredentials(serviceCredentials.toneAnalyzer));
}

Credentials GetCredentials (CredentialInformation credentialInformation)
{
    return new Credentials(credentialInformation.username, credentialInformation.password, credentialInformation.url);
}

void Initialize ()
{
    conversation.VersionDate = "2017-05-26";
    toneAnalyzer.VersionDate = "2017-05-26";
    Active = true;
    StartRecording();
}

// speech to text
public bool Active
{
    get { return speechToText.IsListening; }
    set
    {
        if (value && !speechToText.IsListening)
        {
            speechToText.DetectSilence = true;
            speechToText.EnableWordConfidence = true;
            speechToText.EnableTimestamps = true;
            speechToText.SilenceThreshold = 0.01f;
            speechToText.MaxAlternatives = 0;
            speechToText.EnableInterimResults = true;
            speechToText.OnError = OnSpeechError;
            speechToText.InactivityTimeout = -1;
            speechToText.ProfanityFilter = false;
            speechToText.SmartFormatting = true;
            speechToText.SpeakerLabels = false;
            speechToText.WordAlternativesThreshold = null;
            speechToText.StartListening(OnSpeechRecognize);
        }
        else if (!value && speechToText.IsListening)
        {
            speechToText.StopListening();
        }
    }
}

private void StartRecording()
{
    if (recordingRoutine == 0)
    {
        UnityObjectUtil.StartDestroyQueue();
        recordingRoutine = Runnable.Run(RecordingHandler());
    }
}

private void StopRecording()
{
    if (recordingRoutine != 0)
    {
        Microphone.End(microphoneID);
        Runnable.Stop(recordingRoutine);
        recordingRoutine = 0;
    }
}

private void OnSpeechError(string error)
{
    Active = false;

    Log.Debug("ExampleStreaming.OnError()", "Error! {0}", error);
}

private IEnumerator RecordingHandler()
{
    recording = Microphone.Start(microphoneID, true, recordingBufferSize, recordingHZ);
    yield return null;      // let recordingRoutine get set

    if (recording == null)
    {
        StopRecording();
        yield break;
    }

    bool bFirstBlock = true;
    int midPoint = recording.samples / 2;
    float[] samples = null;

    while (recordingRoutine != 0 && recording != null)
    {
        int writePos = Microphone.GetPosition(microphoneID);
        if (writePos > recording.samples || !Microphone.IsRecording(microphoneID))
        {
            Debug.Log("Microphone disconnected.");
            StopRecording();
            yield break;
        }

        if ((bFirstBlock && writePos >= midPoint) || (!bFirstBlock && writePos < midPoint))
        {
            // front block is recorded, make a RecordClip and pass it onto our callback.
            samples = new float[midPoint];
            recording.GetData(samples, bFirstBlock ? 0 : midPoint);

            AudioData record = new AudioData();
            record.MaxLevel = Mathf.Max(Mathf.Abs(Mathf.Min(samples)), Mathf.Max(samples));
            record.Clip = AudioClip.Create("Recording", midPoint, recording.channels, recordingHZ, false);
            record.Clip.SetData(samples, 0);

            speechToText.OnListen(record);

            bFirstBlock = !bFirstBlock;
        }
        else
        {
            // calculate the number of samples remaining until we're ready for a block of audio,
            // and wait the amount of time it will take to record them.
            int remaining = bFirstBlock ? (midPoint - writePos) : (recording.samples - writePos);
            float timeRemaining = (float)remaining / (float) recordingHZ;

            yield return new WaitForSeconds(timeRemaining);
        }
    }

    yield break;
}

private void OnSpeechRecognize(SpeechRecognitionEvent result)
{
    if (result != null && result.results.Length > 0)
    {
        foreach (var res in result.results)
        {
            foreach (var alt in res.alternatives)
            {

                string text = string.Format("{0} ({1}, {2:0.00})\n", alt.transcript, res.final ? "Final" : "Interim", alt.confidence);
                // Log.Debug("ExampleStreaming.OnRecognize()", text);

                if (speechToTextField != null)
                {
                    speechToTextField.text = text;
                }

                if (res.final)
                {
                    if (characterState == SocialState.listening)
                    {
                        Debug.Log("WATSON | Speech to text recorded: \n" + alt.transcript);
                        StartCoroutine(Message(alt.transcript));
                    }
                } else
                {
                    if(characterState == SocialState.idle)
                    {
                        characterState = SocialState.listening;
                    }
                }
            }
        }
    }
}


// text to speech
private IEnumerator Synthesize(string text)
{
    Debug.Log("WATSON CALL | Synthesize input: \n" + text);

    textToSpeech.Voice = voiceType;
    bool doSynthesize = textToSpeech.ToSpeech(HandleSynthesizeCallback, OnFail, text, true);

    if(doSynthesize)
    {
        StartCoroutine(Analyze(text));
        characterState = SocialState.talking;
    }

    yield return null;
}

void HandleSynthesizeCallback(AudioClip clip, Dictionary<string, object> customData = null)
{
    if (Application.isPlaying && clip != null)
    {
        Invoke("ResumeIdle", clip.length);
        voiceSource.clip = clip;
        voiceSource.Play();
    }
}

void ResumeIdle()
{
    characterState = SocialState.idle;
}

// conversation
private IEnumerator Message (string text)
{
    Debug.Log("WATSON | Conversation input: \n" + text);

    MessageRequest messageRequest = new MessageRequest()
    {
        input = new Dictionary<string, object>()
        {
            { "text", text }
        },
        context = conversationContext
    };
    bool doMessage = conversation.Message(HandleMessageCallback, OnFail, workspaceId, messageRequest);

    if(doMessage)
    {
        characterState = SocialState.thinking;

        if (conversationInputField != null)
        {
            conversationInputField.text = text;
        }
    }

    yield return null;
}

void HandleMessageCallback (object resp, Dictionary<string, object> customData)
{
    // Keep the returned context so it can be sent back with the next message.
    object _tempContext = null;
    (resp as Dictionary<string, object>).TryGetValue("context", out _tempContext);

    if (_tempContext != null)
        conversationContext = _tempContext as Dictionary<string, object>;

    Dictionary<string, object> dict = Json.Deserialize(customData["json"].ToString()) as Dictionary<string, object>;

    // load output --> text: the answer(s) from the JSON node
    Dictionary<string, object> output = dict["output"] as Dictionary<string, object>;

    var context = dict["context"] as Dictionary<string, object>;
    // Indexing a Dictionary throws when the key is missing, so check
    // ContainsKey before reading each context variable.
    if (context.ContainsKey("destination_city") && context["destination_city"] != null)
    {
        destinationCity = context["destination_city"].ToString();
        Debug.Log("Destination city: " + destinationCity);
        DestinationField.text = "Destination: " + destinationCity;
    }
    if (context.ContainsKey("departure_city") && context["departure_city"] != null)
    {
        departureCity = context["departure_city"].ToString();
        DepartureField.text = "Departure: " + departureCity;
    }
    if (context.ContainsKey("DateBegin") && context.ContainsKey("DateEnd"))
    {
        string dateBegin = context["DateBegin"].ToString();
        string dateEnd = context["DateEnd"].ToString();
        StayingPeriod.text = "Stay: " + dateBegin + " - " + dateEnd;
    }
    if (context.ContainsKey("PeriodNumber") && context.ContainsKey("PeriodDate") && context.ContainsKey("DateEnd"))
    {
        string periodNumber = context["PeriodNumber"].ToString();
        string periodDate = context["PeriodDate"].ToString();
        string dateEnd = context["DateEnd"].ToString();
        StayingPeriod.text = "Stay: " + periodNumber + " " + periodDate + " - " + dateEnd;
    }
    if (context.ContainsKey("time") && context["time"] != null)
    {
        string timeInfo = context["time"].ToString();
        DepartureTime.text = "Time: " + timeInfo;
    }

    List<object> text = output["text"] as List<object>;
    string answer = text[0].ToString(); //returns only the first response

    Debug.Log("WATSON | Conversation output: \n" + answer);

    if (conversationOutputField != null)
    {
        conversationOutputField.text = answer;
    }

    fsData fsdata = null;
    fsResult r = _serializer.TrySerialize(resp.GetType(), resp, out fsdata);
    if (!r.Succeeded)
    {
        throw new WatsonException(r.FormattedMessages);
    }

    // convert fsdata to MessageResponse
    MessageResponse messageResponse = new MessageResponse();
    object obj = messageResponse;
    r = _serializer.TryDeserialize(fsdata, obj.GetType(), ref obj);
    if (!r.Succeeded)
    {
        throw new WatsonException(r.FormattedMessages);
    }

    if (resp != null)
    {
        //Recognize intents & entities
        if (messageResponse.intents.Length > 0 && messageResponse.entities.Length > 0)
        {
            string intent = messageResponse.intents[0].intent;
            string entity = messageResponse.entities[0].entity;
            string literalEntity = messageResponse.entities[0].value;
            if (entity == "city")
            {
                literalEntityCity = literalEntity;
            }
            if (intent == "weather" && entity == "city")
            {
                literalEntityCity = literalEntity;
            }
        }
        if (messageResponse.intents.Length > 0)
        {
            string intent = messageResponse.intents[0].intent;
            //intent name
            Debug.Log("Intent: " + intent);
        }
        if (messageResponse.entities.Length > 0)
        {
            string entity = messageResponse.entities[0].entity;
            //entity name
            Debug.Log("Entity: " + entity);
            string literalEntity = messageResponse.entities[0].value;
            //literal spoken entity
            Debug.Log("Entity Literal: " + literalEntity);
            if (entity == "city")
            {
                literalEntityCity = literalEntity;
            }
        }
    }

    StartCoroutine(Synthesize(answer));
}

// tone analyzer
private IEnumerator Analyze (string text)
{
    Debug.Log("WATSON | Tone analyze input: \n" + text);

    bool doAnalyze = toneAnalyzer.GetToneAnalyze(HandleAnalyzeCallback, OnFail, text);

    yield return null;
}

private void HandleAnalyzeCallback(ToneAnalyzerResponse resp, Dictionary<string, object> customData)
{
    Dictionary<string, object> dict = Json.Deserialize(customData["json"].ToString()) as Dictionary<string, object>;

    Dictionary<string, object> document_tone = dict["document_tone"] as Dictionary<string, object>;

    List<object> tone_categories = document_tone["tone_categories"] as List<object>;

    string debugOutput = "";

    foreach (object tone in tone_categories)
    {
        Dictionary<string, object> category = (Dictionary<string, object>)tone;

        List<object> newTone = category["tones"] as List<object>;

        foreach (object insideTone in newTone)
        {
            Dictionary<string, object> tonedict = (Dictionary<string, object>)insideTone;

            float score = float.Parse(tonedict["score"].ToString());
            string id = tonedict["tone_id"].ToString();

            bool emotionAvailable = false;

            foreach(Emotion emotion in emotions)
            {
                if(emotion.emotionId == id)
                {
                    emotionAvailable = true;
                    emotion.power = score;
                    debugOutput += emotion.emotionId + " : " + emotion.power.ToString() + " - ";
                    break;
                }
            }

            if(!emotionAvailable)
            {
                Emotion newEmotion = new Emotion();
                newEmotion.emotionId = id;
                newEmotion.power = score;
                emotions.Add(newEmotion);
                debugOutput += newEmotion.emotionId + " : " + newEmotion.power.ToString() + " - ";
            }
        }
    }

    Debug.Log("WATSON | Tone analyze output: \n" + debugOutput);
}

private void OnFail(RESTConnector.Error error, Dictionary<string, object> customData)
{
    Log.Error("WatsonAgent.OnFail()", "Error received: {0}", error.ToString());
}
}

I don't have knowledge of Unity/C#, but independent of the language you are using, it's possible to manipulate the information that the Conversation/Assistant service will have by changing the context object before sending it back: create a new index, like "context.temperature", and then in the dialog you can use something like "The current temperature in Berlin is $temperature".
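Translated into the WatsonAgent script from the question, that means extending the stored conversationContext dictionary before the next Message call. A minimal sketch (the MessageWithTemperature name and the "temperature" index are just examples, not SDK API):

// Sketch: push the OpenWeatherMap value into the context before the
// next message, so the dialog can reference it as $temperature.
private IEnumerator MessageWithTemperature(string text, string temperature)
{
    // Reuse the context from the previous response; create one on the first turn.
    if (conversationContext == null)
        conversationContext = new Dictionary<string, object>();

    conversationContext["temperature"] = temperature;

    MessageRequest messageRequest = new MessageRequest()
    {
        input = new Dictionary<string, object>() { { "text", text } },
        context = conversationContext
    };

    conversation.Message(HandleMessageCallback, OnFail, workspaceId, messageRequest);
    yield return null;
}

With that in place, a dialog node answer like "The current temperature in @city is $temperature" comes back with the value filled in, so the existing Synthesize call will speak it as well.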

The Conversation/Assistant service is stateless, so everything the system knows about your dialog is in the context object; that's why you always need to send it back with each new request. So any time you need to add information to the flow from another source, all you need to do is create a new index in the context, as in the sketch below.
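In terms of the callback in the question's script, the round trip looks like this (sketch only; the hard-coded "21.5" stands in for the real OpenWeatherMap result):

void OnMessageResponse(object resp, Dictionary<string, object> customData)
{
    // The context you receive in one response is the context you enrich
    // and send back with the next request.
    object ctx = null;
    (resp as Dictionary<string, object>).TryGetValue("context", out ctx);
    conversationContext = ctx as Dictionary<string, object>;

    if (conversationContext != null)
        conversationContext["temperature"] = "21.5"; // placeholder value
    // ...then pass conversationContext as `context` in the next MessageRequest.
}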

I feel like I saw you post on another forum as well...but just in case that wasn't you, here is the pattern for making an external callout. You can either do it client side or server side via IBM Cloud Functions: https://console.bluemix.net/docs/services/conversation/dialog-actions.html
