cantonese.ai API Reference

Score Pronunciation

Score pronunciation across Cantonese, English, and Mandarin. This unified endpoint accepts an audio recording, a target text, and a language parameter, then returns a score indicating how closely the pronunciation matches the expected text.

Request Parameters

This endpoint requires multipart/form-data for file uploads.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| api_key | string | Yes | Your API key for authentication. |
| audio | file | Yes | Audio file of the user's pronunciation. Supported formats: wav, mp3, m4a, flac, ogg. Max size: 10MB. |
| text | string | Yes | The target text to compare pronunciation against. |
| language | string | No (defaults to "cantonese") | The language to evaluate. One of: cantonese, english, or mandarin. |
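The constraints in the parameter table can be checked on the client before uploading, which avoids a round trip that would likely end in a 400 response. The following is a minimal sketch, not part of the API: the function name `validate_request` is hypothetical, and it assumes that "10MB" means 10 * 1024 * 1024 bytes and that the format can be inferred from the file extension. The server remains the authority on what it accepts.

```python
from pathlib import Path

# Values copied from the parameter table.
SUPPORTED_FORMATS = {"wav", "mp3", "m4a", "flac", "ogg"}
SUPPORTED_LANGUAGES = {"cantonese", "english", "mandarin"}
# Assumption: "10MB" is interpreted as 10 * 1024 * 1024 bytes.
MAX_AUDIO_BYTES = 10 * 1024 * 1024


def validate_request(filename: str, audio_size: int, text: str,
                     language: str = "cantonese") -> list[str]:
    """Return a list of problems; an empty list means the inputs look valid."""
    problems = []
    # Assumption: the audio format is judged by file extension only.
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension not in SUPPORTED_FORMATS:
        problems.append(f"unsupported audio format: {extension or 'none'}")
    if audio_size > MAX_AUDIO_BYTES:
        problems.append("audio file exceeds 10MB")
    if not text.strip():
        problems.append("text must not be empty")
    if language not in SUPPORTED_LANGUAGES:
        problems.append(f"unsupported language: {language}")
    return problems
```

For example, `validate_request("hello.ogg", 1024, "你好嗎")` returns an empty list, while an `.aac` file or a language such as `"french"` produces one message per problem.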

Example Request

The example below scores a Cantonese recording with curl.

curl -X POST "https://cantonese.ai/api/score-pronunciation" \
  -F "api_key=YOUR_API_KEY" \
  -F "text=你好嗎" \
  -F "language=cantonese" \
  -F "audio=@YOUR_AUDIO_FILE.ogg;type=audio/ogg"
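The same multipart/form-data request can be assembled in Python with only the standard library. This is a sketch under stated assumptions: `build_score_request` is a hypothetical helper, the field names and URL are taken from the curl example and the parameter table, and the default content type of `audio/ogg` mirrors the curl example. The function builds the request without sending it, so the body can be inspected first.

```python
import urllib.request
import uuid

# Endpoint URL as shown in the curl example.
API_URL = "https://cantonese.ai/api/score-pronunciation"


def build_score_request(api_key: str, text: str, audio_bytes: bytes,
                        filename: str, language: str = "cantonese",
                        content_type: str = "audio/ogg") -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST request."""
    boundary = uuid.uuid4().hex
    parts = []
    # Plain text fields, in the same order as the curl example.
    for name, value in (("api_key", api_key), ("text", text), ("language", language)):
        parts.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    # The audio part carries a filename and its own content type.
    parts.append(
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"\r\n'
        f'Content-Type: {content_type}\r\n\r\n'.encode("utf-8")
        + audio_bytes + b"\r\n"
    )
    # Closing boundary marks the end of the form.
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return urllib.request.Request(
        API_URL,
        data=b"".join(parts),
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

To send it, pass the returned object to `urllib.request.urlopen` and decode the response body with `json.load`. A third-party HTTP client would shorten the multipart handling considerably; the standard library is used here only to keep the sketch dependency-free.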

Response

On success, the response returns a JSON object with the pronunciation score. The response fields vary depending on the language.

Common Response Fields

| Field | Type | Description |
| --- | --- | --- |
| success | boolean | Whether the request was processed successfully. |
| score | number | Pronunciation score from 0 to 100. |
| passed | boolean | Whether the pronunciation passed. Cantonese: score >= 90. English/Mandarin: score >= 70. |
| language | string | The language that was evaluated. |
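The pass thresholds differ by language, so the same score can pass in English or Mandarin and fail in Cantonese. The sketch below restates the documented rule in code; `PASS_THRESHOLDS` and `is_passing` are hypothetical names, and the API's own `passed` field should be treated as authoritative.

```python
# Thresholds as documented for the "passed" field.
PASS_THRESHOLDS = {"cantonese": 90, "english": 70, "mandarin": 70}


def is_passing(score: float, language: str) -> bool:
    """Apply the documented rule: pass when score >= the language's threshold."""
    return score >= PASS_THRESHOLDS[language]


# A score of 88 passes in English but not in Cantonese.
print(is_passing(88, "english"))    # True
print(is_passing(88, "cantonese"))  # False
```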

Additional Cantonese Fields

| Field | Type | Description |
| --- | --- | --- |
| expectedJyutping | string | The correct jyutping romanization for the target text. |
| transcribedJyutping | string | The jyutping transcribed from the audio recording. |

Additional English/Mandarin Fields

| Field | Type | Description |
| --- | --- | --- |
| expectedText | string | The target text that was expected. |
| transcribedText | string | The text recognized from the audio recording. |
| wordScores | array | Per-word/character scores. Each entry has word (string) and score (number). |
| fluencyScore | number | Fluency score (0-100). |
| integrityScore | number | Completeness/integrity score (0-100). |
| pronunciationScore | number | Pronunciation accuracy score (0-100). |

Cantonese Response Examples

High score (pronunciation matches the target text):

{
  "success": true,
  "score": 95,
  "expectedJyutping": "nei5 hou2 maa3",
  "transcribedJyutping": "nei5 hou2 maa3",
  "passed": true,
  "language": "cantonese"
}

Low score (pronunciation does not match the target text):

{
  "success": true,
  "score": 42,
  "expectedJyutping": "nei5 soeng2 dim2 aa3",
  "transcribedJyutping": "nei5 hou2 maa3",
  "passed": false,
  "language": "cantonese"
}
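Because Cantonese responses carry both `expectedJyutping` and `transcribedJyutping`, a client can show the learner which syllables differed. The following sketch assumes syllables are separated by single spaces, as in the examples above, and uses Python's `difflib` to align the two sequences. The function name `jyutping_differences` is hypothetical, and the alignment is a client-side heuristic rather than the way the API computes its score.

```python
from difflib import SequenceMatcher


def jyutping_differences(expected: str, transcribed: str) -> list[tuple[str, str]]:
    """Return (expected, heard) pairs for every stretch of syllables that differs."""
    # Assumption: syllables are whitespace-separated, as in the response examples.
    expected_syllables = expected.split()
    heard_syllables = transcribed.split()
    matcher = SequenceMatcher(a=expected_syllables, b=heard_syllables, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((" ".join(expected_syllables[i1:i2]),
                                " ".join(heard_syllables[j1:j2])))
    return differences


# Low-score example from above: only the first syllable matches.
print(jyutping_differences("nei5 soeng2 dim2 aa3", "nei5 hou2 maa3"))
# [('soeng2 dim2 aa3', 'hou2 maa3')]
```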

English Response Example

{
  "success": true,
  "score": 88,
  "passed": true,
  "expectedText": "Hello, good morning.",
  "transcribedText": "Hello, good morning.",
  "wordScores": [
    {
      "word": "Hello",
      "score": 90
    },
    {
      "word": "good",
      "score": 85
    },
    {
      "word": "morning",
      "score": 88
    }
  ],
  "fluencyScore": 92,
  "integrityScore": 100,
  "pronunciationScore": 88,
  "language": "english"
}

Mandarin Response Example

{
  "success": true,
  "score": 82,
  "passed": true,
  "expectedText": "你好世界",
  "transcribedText": "你好世界",
  "wordScores": [
    {
      "word": "你",
      "score": 85
    },
    {
      "word": "好",
      "score": 80
    },
    {
      "word": "世",
      "score": 78
    },
    {
      "word": "界",
      "score": 84
    }
  ],
  "fluencyScore": 85,
  "integrityScore": 100,
  "pronunciationScore": 82,
  "language": "mandarin"
}
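For English and Mandarin, the `wordScores` array makes it possible to point the learner at the weakest words or characters. A minimal sketch, assuming the response has already been parsed into a Python dict; `weakest_words` is a hypothetical helper, and it returns an empty list for Cantonese responses, which do not include `wordScores`.

```python
def weakest_words(response: dict, limit: int = 2) -> list[tuple[str, float]]:
    """Return up to `limit` (word, score) pairs, lowest score first."""
    entries = response.get("wordScores", [])
    ranked = sorted(entries, key=lambda entry: entry["score"])
    return [(entry["word"], entry["score"]) for entry in ranked[:limit]]


# Word scores copied from the Mandarin response example.
mandarin_response = {
    "wordScores": [
        {"word": "你", "score": 85},
        {"word": "好", "score": 80},
        {"word": "世", "score": 78},
        {"word": "界", "score": 84},
    ]
}
print(weakest_words(mandarin_response))  # [('世', 78), ('好', 80)]
```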

Status Codes

The API returns standard HTTP status codes to indicate the success or failure of requests.

| Status Code | Description |
| --- | --- |
| 200 | Success - Pronunciation scored successfully |
| 400 | Bad Request - Missing audio file, target text, or invalid language parameter |
| 401 | Unauthorized - Invalid or missing API key |
| 405 | Method Not Allowed - Only POST requests are accepted |
| 429 | Too Many Requests - Rate limit exceeded |
| 500 | Internal Server Error - Server encountered an unexpected condition |
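A client typically treats these codes differently: 400, 401, and 405 indicate a problem with the request itself, while 429 and 500 may succeed on a later attempt. The sketch below encodes that split. Retrying 429 and 500 with exponential backoff is an assumption made for illustration, not documented API behavior, and `retry_delay` is a hypothetical helper.

```python
from typing import Optional

# Assumption: only these two documented codes are worth retrying.
RETRYABLE_STATUS_CODES = {429, 500}


def retry_delay(status_code: int, attempt: int,
                base_seconds: float = 1.0) -> Optional[float]:
    """Return seconds to wait before the next attempt, or None if retrying won't help.

    `attempt` is zero-based, so delays grow as 1x, 2x, 4x, ... of `base_seconds`.
    """
    if status_code not in RETRYABLE_STATUS_CODES:
        return None
    return base_seconds * (2 ** attempt)


print(retry_delay(429, 0))  # 1.0
print(retry_delay(500, 2))  # 4.0
print(retry_delay(401, 0))  # None
```

In practice the caller would also cap the number of attempts so that a persistent 500 does not loop indefinitely.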