What does “ambivalent” really mean? The potential of transformers for cross-cultural translation.
Epistemic status: Story-telling; maybe a hypothesis; at least a little bit tongue-in-cheek. All views are obviously my own.
The story
“I am deeply ambivalent”, my sister said confidently.
She’s a therapist, so she can get away with saying stuff like that. She knows all about the process of swimming in uncertainty, and she refuses to let anyone impose binary judgements on her.
The Finnish person that she was talking to was annoyed. They had wanted a direct answer.
“Which one is it?”
But my sister held her ground, firmly seated on the fence, maintaining strong arguments for this, and the other.
“Yes”, she said, “deeply ambivalent.”
“I am deeply ambivalent”, I tried.
“Yeah…” my American friend nodded in understanding. “That’s pretty clear then.”
I was trying to express my feelings about a job I was considering applying for. I, too, wanted to take a confident seat on the fence, clearly expressing that I had considered the full reality of this exciting, demanding, scary opportunity. But it seemed that the sounds I had uttered arrived in my friend’s auditory processing stream in a much altered form. What they understood was that I was solidly rejecting this atrocious excuse for a career path.
To tell the truth, these conversations didn’t play out exactly that way. To begin with, my sister never said, “I am deeply ambivalent.” What she said was, “Jag är djupt ambivalent”, which is the word-for-word translation of the same expression in Swedish. In retrospect, I understand that that is totally different.
This happens to me all the time. When I applied to grad school, my then-mentor apparently received several phone calls from professors that I had interviewed with, with an inquiry of approximately this form: “Kata is very qualified, but… …it’s not clear that she’s very excited about our work.”
(I’m from Finland. We don’t do “excited”. In fact, in my native language, one of the greatest compliments that you can pay to a friend for, say, a meal that they’ve cooked for you is: “He fäjla ast.” This literally translates to: “It lacked little.”¹ We really don’t do “excited”.)
“No, no, no!” my mentor would reply. “You see, she’s from Finland! What she means is that she couldn’t be more excited!”
And thus, I went to grad school.
You can’t always count on somebody else to act as your on-demand cross-cultural interpreter. One day many years ago, I ran into this brilliant blog post. Over time, it’s become a habit for me to send the link along to Americans that I meet and talk to, you know, as an apology, an explanation, a Rosetta Stone.
What’s transformers got to do with it?
As part of an on-going deep learning journey, I’ve recently been playing around with some language models. There’s a cool, public repo called `transformers` from Hugging Face that I recommend that you check out. Take, as an example, a pre-trained model for sentiment analysis. Sentiment analysis means that the model can take a written sentence as input, and output a value for whether it thinks that sentence is positive or negative. Pre-trained means that it’s already seen lots of data, and it should be good at doing the task. This model was trained on English-language movie reviews.² You can play with it yourself by following the code available in this friendly tutorial.
If you feed the model the sentence, “I love it”, it will tell you that it’s “positive” with a score of 0.99988.
(The scores for either the positive or negative label range from 0.5 to 1. Scores close to 1 mean that the model is really confident in its choice. Scores close to 0.5 mean it’s unsure. If the score were to tip below 0.5, the sentence would be assigned the opposite label.)
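To see why the reported score never drops below 0.5, here is a minimal sketch of a two-class softmax in plain Python. It assumes the model produces one raw number per label and reports the larger of the two resulting probabilities; the raw numbers below are invented for illustration and do not come from the real model.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def label_and_score(neg_logit, pos_logit):
    """Return the winning label together with its probability."""
    neg_p, pos_p = softmax([neg_logit, pos_logit])
    if pos_p >= neg_p:
        return "positive", pos_p
    return "negative", neg_p

# Hypothetical raw outputs, made up for this example (not real model values).
print(label_and_score(-4.0, 4.5))  # large gap: positive, score near 1
print(label_and_score(0.1, 0.3))   # nearly tied: positive, score just above 0.5
print(label_and_score(2.0, -1.0))  # negative has the larger raw output
```

Because the two probabilities always add up to 1, the larger one can never be smaller than 0.5, which is why a score that "tips below 0.5" simply shows up as the other label.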
If you feed it, “I hate it”, it returns “negative” with a score of 0.99964.
Now, the model doesn’t “like” to be uncertain.³ But it can be. For the input, “I feel 50–50 about that”, it returned the label “positive”, but with a score of merely 0.56475.
Now the critical question is: Is “ambivalent” truly ambivalent? According to this model, trained on English-language movie reviews (drum roll): No. With a score of 0.96681, “ambivalent” is unambiguously negative.⁴
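If you want to try this at home, the sketch below shows roughly what such an experiment could look like with the `pipeline` helper from `transformers`. A few assumptions to flag: it uses whatever default sentiment model the library picks, which may not be the exact checkpoint behind the numbers above, so your scores can differ. The function names `classify`, `describe` and `run_demo` are made up for this example, and the exact wording of the "ambivalent" input is a guess.

```python
def classify(sentences):
    """Run the library's default sentiment-analysis pipeline on a list of sentences."""
    # Imported here so that describe() below works even without the library installed.
    from transformers import pipeline

    classifier = pipeline("sentiment-analysis")  # downloads a default model on first use
    return classifier(sentences)  # one {"label": ..., "score": ...} dict per sentence

def describe(sentence, result):
    """Format one result dict as a readable line."""
    return f"{sentence!r}: {result['label'].lower()} ({result['score']:.5f})"

def run_demo():
    """Classify the sentences from the story and print one line per sentence."""
    sentences = [
        "I love it",
        "I hate it",
        "I feel 50–50 about that",
        "I am deeply ambivalent",  # assumed phrasing of the test input
    ]
    for sentence, result in zip(sentences, classify(sentences)):
        print(describe(sentence, result))
```

Calling `run_demo()` needs the `transformers` package plus a backend such as PyTorch, and an internet connection the first time so that the model can be downloaded.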
Moral of the story
If you, too, are a cross-cultural transplant, consider using a transformer model to not only translate the words, but also assess the valence, of the message you intend to convey.
Footnotes
These will make the story more truthful and less entertaining. Read at your own peril.
¹ I heard that the expression “I can’t fault it” has been used on MasterChef Australia, so this style of complimenting may not be as unique to the Närpes dialect as I thought.
² I am making a confident assumption that nearly all reviews were in English, and a less confident assumption that a majority of them were written by Americans, or at least people steeped in American culture and communication style.
³ For example, at the last layer of the network, the outputs are normalized to a 0–1 range, using a softmax function. This pushes large values close to 1, and small values close to 0. It makes large values, relatively speaking, even larger, and small values even smaller, so you end up with fewer values near the center (0.5). In addition, if I read some of the background articles for this network correctly, I believe that the authors deliberately selected polarized movie reviews (positive or negative) for the training phase. Hence, it may be that the network hasn’t seen a whole lot of truly ambivalent examples.
⁴ Notice that, because the softmax operation pushes large values closer to 1 and close together (see footnote 3 and this article for example), a score of 0.96681 is further away from 0.99964 than you might intuitively think if you view them as proportional confidence levels or probabilities. So the classification score is maybe not as “unambiguous” as I’m making it sound here. But “never let the truth get in the way of a good story”. :-p
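One way to make the point in footnote 4 concrete is to undo the softmax. With only two labels, a score p corresponds to a gap of log(p / (1 − p)) between the two raw outputs. The sketch below applies that formula to the scores quoted in the story; it assumes a plain two-class softmax with no extra scaling, and the helper name `log_odds` is made up for this example.

```python
import math

def log_odds(score):
    """Invert a two-class softmax: the gap between the winning and losing raw outputs."""
    return math.log(score / (1.0 - score))

# Scores quoted in the story, mapped back to the raw scale.
for score in (0.56475, 0.96681, 0.99964, 0.99988):
    print(f"score {score:.5f} -> raw gap of about {log_odds(score):.2f}")
```

On this raw scale, the gap behind 0.96681 is less than half the gap behind 0.99964, even though the two scores look close when read as percentages.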