Open Forms with Google Cloud Dialogflow CX

Create an open form type to use with the Form or Text Form node and Google Cloud Dialogflow CX.

The open form type connects to a natural language processing (NLP) engine and matches JSON name-value pairs of interest.

Use the open form type to transcribe language responses in real time. Capture a caller's inputs in a natural way and in multiple languages.

Note:

Open Forms use an enhanced phone model when using Google Cloud Dialogflow and US English as the recognition language. This model provides up to 40% improvement in transcription accuracy over previous models.

Prerequisite

Grant Studio access to Dialogflow Virtual Agents

Note:

Studio cannot process Google Cloud Dialogflow parameters with an underscore (_) in the key name. Try camelCase instead of snake_case.
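Because Studio cannot process keys that contain an underscore, it can help to rewrite parameter keys before sending them. This is an illustrative sketch, not part of Studio; the helper names are assumptions:

```python
def to_camel_case(key: str) -> str:
    """Convert a snake_case key to camelCase, e.g. order_size -> orderSize."""
    parts = key.split("_")
    return parts[0] + "".join(p.title() for p in parts[1:])

def camelize_keys(obj):
    """Recursively rewrite dict keys from snake_case to camelCase."""
    if isinstance(obj, dict):
        return {to_camel_case(k): camelize_keys(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [camelize_keys(v) for v in obj]
    return obj
```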

Name

Give the open form type a name.

Description

Enter a description for the open form type.

NLP Engine

Select Google Cloud Dialogflow CX.

Agent Name

Retrieve the agent name from the Google Cloud Dialogflow CX project.

  1. Log in to Google Cloud Dialogflow.

  2. Select the project.

  3. From the menu to the right of the agent, click Copy Name.

    This copies the agent name to the clipboard.

  4. Paste the copied agent name into Studio.

Example:

The agent name is in this format:

projects/<project-name>/locations/<location>/agents/<agent-identifier>

Sample name:

projects/studio-project/locations/global/agents/c615c032-aabf-48ab-b0ef-813e6af1e28a
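If you need to check an agent name programmatically, the format above can be validated with a simple pattern. This is a hypothetical helper, shown only to illustrate the structure of the name:

```python
import re

# Matches the Dialogflow CX agent name format shown above:
# projects/<project-name>/locations/<location>/agents/<agent-identifier>
AGENT_NAME_RE = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/agents/(?P<agent>[^/]+)$"
)

def parse_agent_name(name: str) -> dict:
    """Split a full agent name into its project, location, and agent parts."""
    match = AGENT_NAME_RE.match(name)
    if match is None:
        raise ValueError(f"Not a valid agent name: {name!r}")
    return match.groupdict()
```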

Test Language

Select the language for the test query.

Test Query Text

Type a brief, typical response that you expect to receive.

Example:

Four skinny lattes to go.

This is a coffee order. It communicates the number of coffees (4), the type of coffee (latte), extras (skinny milk), and the delivery method (to-go).

The maximum query size is 2,000 characters. If the query exceeds this limit, Studio warns you and the open form configuration does not save.
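The length limit can be checked before submitting. A minimal sketch, assuming the 2,000-character maximum stated above; the function name is illustrative:

```python
MAX_QUERY_CHARS = 2000  # maximum test query size

def validate_test_query(query: str) -> str:
    """Return the query unchanged, or raise if it exceeds the limit."""
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError(
            f"Test query is {len(query)} characters; "
            f"the maximum is {MAX_QUERY_CHARS}."
        )
    return query
```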

Test Query Parameters

Pass query parameters in JSON format.

Example:
{
  "timeZone": "America/New_York",
  "geoLocation": {"latitude": 12, "longitude": 85},
  "sessionEntityTypes": [
    {
      "name": "snack",
      "entityOverrideMode": "ENTITY_OVERRIDE_MODE_OVERRIDE",
      "entities": [{"value": "Tea", "synonyms": ["tea", "Tea"]}]
    }
  ],
  "currentPage": "projects/<Project ID>/locations/<Location ID>/agents/<Agent ID>/flows/<Flow ID>/pages/<Page ID>",
  "disableWebhook": false,
  "analyzeQueryTextSentiment": true
}
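Parameters like these can also be assembled in code and serialized to JSON before being pasted into the field. This is an illustrative sketch; the helper name is an assumption, not part of Studio or the Dialogflow API:

```python
import json

def build_query_parameters(time_zone: str, latitude: float, longitude: float) -> dict:
    """Assemble a subset of the query parameters shown above."""
    return {
        "timeZone": time_zone,
        "geoLocation": {"latitude": latitude, "longitude": longitude},
        "disableWebhook": False,
        "analyzeQueryTextSentiment": True,
    }

# Serialize to the JSON format the Test Query Parameters field expects.
params_json = json.dumps(build_query_parameters("America/New_York", 12, 85))
```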

Learn more about query parameters with Dialogflow CX.

API Version

Select the Google Cloud Dialogflow CX API version.

Note:

Support for V3 and V3Beta1 is limited to text channels.

Response

Click Preview Response to see the JSON name-value pairs.

Assign Values to Variables

Match JSON name-value pairs of interest.

  1. Click a JSON name-value pair from the preview response.

    The path to the selected name-value pair displays in the JSON Path field.

    The value displays in the black strip under the path.

    Note:

    To edit the JSON path, select Editable JSON. For more information, see Editable JSON.

  2. Give a name to the JSON name-value pair in the Assign path to Return value name field.

    This name represents the JSON name-value pair in the form node.

  3. Click Add Return Value.

    The JSON path and name are added to the table of return values.

  4. Repeat these steps for each additional JSON name-value pair.

Table of Return Values

The form node has access to items in the table of return values. In the form node, return values from matched pairs are assigned to variables.

Click Test to show the value assigned to a JSON path. The value shows in the black strip under the JSON Path field.

Advanced ASR Settings

These settings are applied as initial values in the Form node, where they can be tuned further.

Do not tune these settings unless you clearly understand how they affect speech recognition. In most cases, the default settings work best. To return a setting to its default value, clear the field and click outside it.

Timeout Settings

No Input Timeout

Wait time, in milliseconds, from when the prompt finishes playing to when the system directs the call to the No Input Event Handler because it has not detected the caller's speech.

Speech Complete Timeout

Speech Incomplete Timeout

Use these settings for responses that contain pauses, to ensure the system listens until the caller finishes speaking.

Speech Complete Timeout measures wait time, in milliseconds, from when the caller stops talking to when the system initiates an end-of-speech event. It should be longer than the Inter Result Timeout to prevent overlaps. To customize Speech Complete Timeout, disable Single Utterance.

Speech Incomplete Timeout measures wait time, in milliseconds, from when incoherent background noise begins and continues uninterrupted to when the system initiates an end-of-speech event.

Speech Start Timeout

Wait time, in milliseconds, from when the prompt starts to play to when the system begins to listen for caller input. Because the system listens while the prompt is still playing, this behaves similarly to enabling barge in.

Inter Result Timeout

Wait time, in milliseconds, from when the caller stops talking to when the system initiates an end-of-speech event because it has not detected interim results.

The typical use case would be for a caller reading out numbers. The caller might pause between the digits.

It is recommended to keep the value shorter than Speech Complete Timeout to avoid overlaps.

Inter Result Timeout does not reset when there is background noise, whereas Speech Complete Timeout does. In noisy environments, Inter Result Timeout can therefore be more reliable for determining when the caller's speech is complete.

Set the value to 0 to disable this timeout. Otherwise, set a value from 500 ms to 3000 ms based on the longest pause expected in the caller's response. To customize Inter Result Timeout, disable Single Utterance.
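The relationships described above (a 500–3000 ms range, and Inter Result Timeout shorter than Speech Complete Timeout) can be sketched as a validation check. This is an illustrative helper, not part of Studio:

```python
def check_timeouts(inter_result_ms: int, speech_complete_ms: int) -> None:
    """Raise if the Inter Result Timeout setting conflicts with the
    Speech Complete Timeout, per the guidance above. 0 disables it."""
    if inter_result_ms == 0:
        return  # Inter Result Timeout is disabled; nothing to check.
    if not 500 <= inter_result_ms <= 3000:
        raise ValueError("Inter Result Timeout should be 500-3000 ms.")
    if inter_result_ms >= speech_complete_ms:
        raise ValueError(
            "Inter Result Timeout should be shorter than "
            "Speech Complete Timeout to avoid overlaps."
        )
```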

Barge In Sensitivity

Applicable when Barge In is enabled. Raising the sensitivity requires the caller to speak louder to be heard above background noise.

Single Utterance

A single utterance is a string of speech without long pauses. It can be as short as yes or no, or a request such as Can I book an appointment? or I need help with support.

The single utterance setting is selected by default.

Disable Single Utterance to customize Speech Complete Timeout and Inter Result Timeout.

You may decide to disable the single utterance setting if the caller is expected to pause as part of the conversation. For example, the caller may read out a sequence of numbers and pause in appropriate places.