Why we score your pitch, not just your grammar

雨 and 飴 are built from the same two sounds, a and me. One means rain, the other means candy. The only thing that tells them apart is pitch: rain falls from high to low, candy rises from low to high. Get the pitch wrong and you have said a different word. This is why SayLocal scores your pitch, not just your grammar.

Pitch accent is not optional politeness

English speakers are trained to hear pitch as emotion, not meaning. In Japanese, pitch is part of the word itself. Tokyo Japanese gives most words a fixed high-low shape, and getting it wrong does more than sound foreign: it can change which word a listener hears, or briefly stall the whole sentence while they work out what you meant.

Figure 1. Same two morae, different word. SayLocal scores your contour against the standard high-low pattern for each word, the same convention dictionaries use. Example pair: 雨 (atamadaka, H–L) vs 飴 (heiban, L–H).

English speakers hear pitch as feeling. In Japanese it is part of the word. That is the habit we are retraining.

The good news: it is trainable

Adults are often told that their ear for a new language is fixed after childhood. The evidence says otherwise. In a landmark study, Logan, Lively, and Pisoni trained Japanese adults to hear the English /r/–/l/ distinction, a contrast notoriously hard for them, by drilling it across many different speakers and word positions rather than a single clean recording. Perception improved and the gains held up.^[1] The key was variety: hearing the sound from many voices forced the listener to learn the category, not memorize one example.

That same high-variability approach has since been pointed straight at our problem. Shport trained native English listeners to identify Tokyo Japanese pitch-accent patterns using multiple words, talkers, and sentence contexts, and found that accuracy improved and generalized to new words they had never heard in training.^[2] Pitch accent, the thing learners are told to give up on, responds to the right kind of practice.

How SayLocal trains it

Each vocabulary card carries its standard pitch pattern. When you speak, the app extracts your pitch contour, buckets it into morae, and scores your high-low sequence against the expected one, so the feedback is about the specific word rather than a vague “sounds off.” Because the research is clear that variety is what teaches the category, you hear and practise each pattern across different words and voices rather than one canned clip.
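To make the scoring step concrete, here is a minimal sketch of the idea in Python. It is an illustration under stated assumptions, not SayLocal's actual code: it assumes a pitch track has already been extracted as one F0 value per frame (with None for unvoiced frames), that mora boundaries are already known, and that a simple midpoint threshold is enough to call a mora high or low. The names mora_levels and score_pattern are hypothetical.

```python
from statistics import median


def mora_levels(f0_hz, mora_bounds):
    """Reduce a frame-level pitch track to one H/L label per mora.

    f0_hz: list of F0 values in Hz, one per frame; None marks unvoiced frames.
    mora_bounds: list of (start, end) frame indices, end exclusive.
    Assumption: a mora is "H" if its median F0 is at or above the midpoint
    between the highest and lowest mora in this word, otherwise "L".
    """
    per_mora = []
    for start, end in mora_bounds:
        voiced = [f for f in f0_hz[start:end] if f is not None]
        per_mora.append(median(voiced) if voiced else None)

    known = [m for m in per_mora if m is not None]
    if not known:
        return ["?"] * len(per_mora)  # nothing voiced, nothing to label

    midpoint = (max(known) + min(known)) / 2
    return ["?" if m is None else ("H" if m >= midpoint else "L")
            for m in per_mora]


def score_pattern(observed, expected):
    """Fraction of morae whose H/L label matches the expected pattern."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    matches = sum(o == e for o, e in zip(observed, expected))
    return matches / len(expected)


# Made-up pitch track for a two-mora word that falls from high to low.
track = [220, 215, 210, 160, 150, 145]
bounds = [(0, 3), (3, 6)]
levels = mora_levels(track, bounds)

rain_score = score_pattern(levels, ["H", "L"])   # compared with 雨 (H-L)
candy_score = score_pattern(levels, ["L", "H"])  # compared with 飴 (L-H)
```

With this made-up falling contour, the attempt matches the 雨 pattern on both morae and the 飴 pattern on neither. A real scorer would need more care than this, for example with words where every mora sits at a similar pitch, which the midpoint rule above cannot label meaningfully.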

Shadowing works the same way for rhythm. The app lines up your attempt against a native track and scores how closely your timing and emphasis match, because sounding natural is as much about the music of a sentence as the individual sounds.
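One simple way to picture a timing score is to compare how each speaker divides the sentence among its morae, regardless of overall speed. The sketch below does only that and makes several assumptions: both recordings are already segmented into the same morae, durations are given in milliseconds, and emphasis is ignored entirely. The function name rhythm_score is hypothetical, and this is a simplified stand-in rather than a description of how the app aligns audio.

```python
def rhythm_score(native_ms, learner_ms):
    """Compare relative mora durations of two renditions of one sentence.

    Each input is a list of mora durations in milliseconds, same order and
    same length. Durations are converted to proportions of the whole
    utterance, so speaking slower overall is not penalised.
    Returns a value from 0.0 (completely different) to 1.0 (identical rhythm).
    """
    if len(native_ms) != len(learner_ms):
        raise ValueError("both renditions must have the same number of morae")
    native_total = sum(native_ms)
    learner_total = sum(learner_ms)
    if native_total <= 0 or learner_total <= 0:
        raise ValueError("durations must sum to a positive value")

    native_share = [d / native_total for d in native_ms]
    learner_share = [d / learner_total for d in learner_ms]

    # Half the summed absolute difference between two sets of proportions
    # lies between 0 and 1, so subtracting it from 1 gives a similarity.
    mismatch = 0.5 * sum(abs(n - l) for n, l in zip(native_share, learner_share))
    return 1.0 - mismatch


# Same rhythm at half speed: the proportions are identical.
same_rhythm = rhythm_score([100, 100, 200], [200, 200, 400])

# Long and short morae swapped: half of the timing is misplaced.
swapped = rhythm_score([100, 300], [300, 100])
```

Normalising by total duration is the design choice that matters here: a learner who shadows slowly but keeps the long and short morae in proportion still scores well, while one who rushes a long vowel does not.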

What this means for you

You will not just learn that 雨 means rain. You will learn to say it so a listener hears rain, and you will be told, word by word, when your pitch drifts. It is the part most apps skip because it is hard to measure, and it is a big part of the gap between being understood and sounding native.

References

1. Logan, J. S., Lively, S. E., & Pisoni, D. B. (1991). Training Japanese listeners to identify English /r/ and /l/: A first report. Journal of the Acoustical Society of America, 89(2), 874–886.
2. Shport, I. A. (2016). Training English listeners to identify pitch-accent patterns in Tokyo Japanese. Studies in Second Language Acquisition, 38(4), 739–769.