Using Artificial Intelligence to Understand Brands

The Artificial Intelligence techniques behind automated translation and other NLP (Natural Language Processing) applications don’t just work at the level of words and phrases; they give us quantitative data about the meaning people assign to different concepts, including brands. We can then measure the relationships between different brands in a space, and understand what distinguishes how people talk about, or rather with, them.

As an example, we’ll use data from the GloVe project at Stanford (in particular their Twitter- and Wikipedia-derived sets) to look at the main smartphone brands: Samsung, Apple, Nokia, Sony, LG, HTC, Motorola, and Huawei. GloVe is a dictionary, but a very special one: instead of taking a term in English and mapping it to a Chinese one, it takes terms in English, Chinese, etc., and translates them to points in an abstract space, essentially lists of apparently meaningless numbers. But they are far from arbitrary; the software learns to map terms that are used in similar ways, regardless of how they are written, into points that are close in this abstract space. So bicycle, bicicleta, and 自行车 are translated to points that are close to each other, which is how a system like Google Translate knows how to go from the English term to the Spanish one: it just looks for the closest point that comes from a Spanish term (using neural networks to figure out this map is where the real difficulty lies, but that’s for a different post).
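To make the "closest point" idea concrete, here is a minimal sketch in Python. The three-number vectors are invented for illustration (real GloVe vectors have tens to hundreds of dimensions), the helper names cosine_similarity and nearest are hypothetical, and using cosine similarity as the measure of closeness is an assumption: it is a common choice, but not the only one.

```python
import numpy as np

# Toy vectors, invented for illustration only (not real GloVe values).
toy_vectors = {
    "bicycle":   np.array([0.9, 0.1, 0.0]),
    "bicicleta": np.array([0.8, 0.2, 0.1]),
    "banana":    np.array([0.0, 0.9, 0.4]),
    "plátano":   np.array([0.1, 0.8, 0.5]),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest(word, candidates, vectors):
    """Return the candidate whose vector is most similar to the word's vector."""
    return max(candidates,
               key=lambda c: cosine_similarity(vectors[word], vectors[c]))

# "Translating" means looking for the closest point among the Spanish terms.
spanish_terms = ["bicicleta", "plátano"]
for english in ["bicycle", "banana"]:
    print(english, "->", nearest(english, spanish_terms, toy_vectors))
```

With these made-up numbers, each English term lands on its Spanish counterpart simply because their points are close, not because anything in the code knows what a bicycle is.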

We can leverage this into an intuitive but data-driven way of looking at the relationship between multiple brands. Just as words that have similar meanings are closer to each other than those that have different ones, brands that people think of as similar will be talked about in the same way, and so, because they are similar as words, will end up having close points in the abstract space. It sounds a bit… abstract, but here’s how it looks for the main smartphone brands, using the mapping based on GloVe’s crawling of about two billion tweets:

What this map is telling us is that, based on the way people actually use the brands as words when tweeting (in some sense, the “real” content of the brand), most smartphone brands are pretty much identical, with Apple (as expected), Huawei, and LG the odd ones out. An algorithm that told us that Samsung and Motorola are identical as brands wouldn’t be a very perceptive one, and in fact that’s not what’s happening here. If we zoom into that cluster of brands, we see they are quite separate from each other:

What we’re seeing is that, simply put, compared with Apple (and Huawei and LG) Samsung and Motorola are identical as brands; it’s just when you zoom into the area of “major Android brands” that you can see that Samsung and Nokia are quite similar — compared to Motorola. It’s all relative, but not in an arbitrary way.
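The "it's all relative" point can be sketched in a few lines of Python. Everything specific here is an assumption: the numbers are invented stand-ins rather than real GloVe values, parse_vectors is a hypothetical helper that reads the plain-text "word followed by its numbers" layout that GloVe's downloadable files use, and Euclidean distance is just one reasonable way to measure how far apart two points are.

```python
import io
import numpy as np

def parse_vectors(lines, wanted):
    """Read 'word v1 v2 ...' lines and keep only the words we care about."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if parts and parts[0] in wanted:
            vectors[parts[0]] = np.array([float(x) for x in parts[1:]])
    return vectors

# Invented two-dimensional stand-in for a vectors file. With real data you
# would pass an open file handle here instead of this StringIO.
fake_file = io.StringIO(
    "apple 5.0 0.0\n"
    "samsung 0.0 0.0\n"
    "nokia 0.1 0.2\n"
    "motorola 0.6 -0.3\n"
)
brands = ["apple", "samsung", "nokia", "motorola"]
vectors = parse_vectors(fake_file, set(brands))

def distance(a, b):
    """Euclidean distance between the points for two brands."""
    return float(np.linalg.norm(vectors[a] - vectors[b]))

# Seen from Samsung: the other Android brands are close, Apple is far away.
for other in ["nokia", "motorola", "apple"]:
    print(f"samsung to {other}: {distance('samsung', other):.2f}")
```

On the scale set by the distance to Apple, the gaps between Samsung, Nokia, and Motorola look negligible; compare only those three numbers with each other and the differences inside the cluster become visible again.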

So we’ve used AI to put in perspective the success of Samsung’s branding efforts, or, in a more positive way, to highlight how Apple and a couple of other brands are in a class of their own, each of them separated as words from the pretty much homogeneous (unless you forget the competitors and zoom into them) central core of Android brands. But can we say something about the semantics of that difference?

Surprisingly, we can! Sort of. The biggest surprise of this algorithm — and it caused quite a stir in the AI community when it was first published — is that there’s meaning not just in the distance between points, but also in their specific geometric relationships. The best way to explain it is by showing it:

Each of the words king, queen, man, woman, son, daughter gets its own point, as expected. The fascinating thing is that the arrow between king and queen is almost the same in length and direction as the arrow between man and woman, and they are also almost identical to the arrow between son and daughter! Somehow, the algorithm doesn’t just learn a way to represent words as points, but also the abstract relationship female version of, which is “translated” into an arrow of specific angle and length; to know what the female version of son is, you just start from that point, do the same jump as you’d do to go from man to woman… and reach the point for daughter.
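Here is a minimal sketch of that "same jump" in Python. The vectors are invented so that the three arrows come out almost, but not exactly, equal; the function name analogy is hypothetical, and picking the answer by Euclidean distance is an assumption (many implementations use cosine similarity instead).

```python
import numpy as np

# Toy vectors, invented for illustration only (not real GloVe values).
toy = {
    "man":      np.array([0.10, 0.10, 0.60]),
    "woman":    np.array([0.10, 0.90, 0.60]),
    "king":     np.array([0.90, 0.10, 0.70]),
    "queen":    np.array([0.90, 0.95, 0.70]),
    "son":      np.array([0.10, 0.10, 0.10]),
    "daughter": np.array([0.15, 0.90, 0.10]),
}

def analogy(a, b, c, vectors=toy):
    """Answer 'a is to b as c is to ?' by making the a-to-b jump from c."""
    target = vectors[c] + (vectors[b] - vectors[a])
    # The three words in the question are excluded from the possible answers.
    candidates = [w for w in vectors if w not in (a, b, c)]
    return min(candidates,
               key=lambda w: np.linalg.norm(vectors[w] - target))

# The three "female version of" arrows are nearly the same vector.
for male, female in [("man", "woman"), ("king", "queen"), ("son", "daughter")]:
    print(male, "->", female, toy[female] - toy[male])

print(analogy("man", "woman", "son"))
print(analogy("man", "woman", "king"))
```

The jump never lands exactly on another word's point, which is why the sketch looks for the nearest remaining word rather than an exact match.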

This is an incredibly powerful capability, because now we can ask not just which brands are comparatively similar or different, but in what way. We took a list of fifty common adjectives in English as our vocabulary, and asked the data:

if king is to queen as man is to woman then
the generic Android brand is to Apple as… ? is to ?

The closest metaphors we got?

the generic Android brand is to Apple as black is to white

the generic Android brand is to Apple as international is to national

the generic Android brand is to Apple as special is to great

This doesn’t mean people use the word national a lot when talking about Apple. It’s subtler and more powerful: the “national-ness” that is the difference between the words international and national is similar to the difference between the way people use the words for the major Android brands, e.g. Samsung, and Apple. The white cases and Apple being a US company (and, arguably, their being great) are part of the difference in meaning between Apple and Samsung.
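One way such a query could be run is sketched below in Python. Several things here are assumptions rather than a description of the analysis above: the "generic Android brand" is taken to be the average of the Android brands' points, pairs of adjectives are ranked by the cosine similarity between their arrow and the Android-to-Apple arrow, the function name closest_metaphors is hypothetical, and the vectors are invented so that the toy example echoes the first result. They prove nothing about real data.

```python
from itertools import permutations
import numpy as np

# Toy vectors, invented for illustration only (not real GloVe values).
toy = {
    "samsung":  np.array([0.1, 0.0, 0.0]),
    "nokia":    np.array([-0.1, 0.1, 0.0]),
    "motorola": np.array([0.0, -0.1, 0.0]),
    "apple":    np.array([1.0, 0.0, 0.1]),
    "black":    np.array([0.0, 1.0, 0.0]),
    "white":    np.array([1.0, 1.0, 0.1]),
    "big":      np.array([0.0, 0.0, 1.0]),
    "small":    np.array([0.0, 0.0, -1.0]),
}

def cosine(a, b):
    """Cosine of the angle between two arrows: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def closest_metaphors(source, target, vocabulary, vectors, top=3):
    """Rank ordered pairs (x, y) by how well y - x matches target - source."""
    arrow = target - source
    scored = [(cosine(arrow, vectors[y] - vectors[x]), x, y)
              for x, y in permutations(vocabulary, 2)]
    scored.sort(reverse=True)
    return scored[:top]

# Assumption: the "generic Android brand" is the centroid of these brands.
generic_android = np.mean(
    [toy[b] for b in ("samsung", "nokia", "motorola")], axis=0)

adjectives = ["black", "white", "big", "small"]
results = closest_metaphors(generic_android, toy["apple"], adjectives, toy)
for score, x, y in results:
    print(f"generic Android is to Apple as {x} is to {y} ({score:.2f})")
```

Swapping in a different vocabulary list changes the question being asked without changing anything else in the sketch.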

The most important thing about the above is how unsurprising it is if you pay attention to the brands; it’s part of the discourse, yes, but the algorithm automatically teased those differences out and simplified them to the starkest meaningful metaphor. Apple is the white smartphone: you push billions of tweets in at one side, process them carefully enough, and out comes a conceptual observation.

Artificial intelligence: it’s not just about numbers anymore (and it never was).

Applying the same analysis to compare LG and Huawei against the rest of the Android brands consistently gives us the terms good (for LG) and strong (for Huawei), perhaps an indication of solid if not brilliant conceptual branding (remember, we’re quantifying metaphors, not counting mentions: it’s not the words in the advertising copy, but the way people use the brand in their own tweets).

But the important aspect of this technique is that we could just as easily have used a completely different vocabulary to query the relationship between brands — feelings instead of adjectives, or terms related to prices, or whatever vocabulary helps answer the specific question we wanted to ask. Brands, like every other word, are deeply multidimensional, and so are their relationships; rather than attempting to oversimplify them through the narrow lens of a specific survey, pulling vast amounts of actual usage data allows us to look at concrete answers to specific questions about the relationship between brands in all the messy complexity of the real world, yet distill that complexity into conceptually usable semantic relationships.

Forget counting retweets and classifying mentions: we now have the tools to look into the living reality of brands as part of our continuously shifting languages, and to apply quantitative methods to elucidate, and eventually shape, their conceptual and emotional overtones. As happened with metrics-driven, quantitatively optimized advertising, leveraging these tools will require expanding the conceptual and strategic toolsets of organizations in ways that to many will feel too alien to attempt, but which will eventually become part of the basic practices of the industry. Marketing, I believe, will become the richer for that, not just through increased transparency and effectiveness, but also by making possible the development of completely new means to achieve the oldest ends.

After all, what has marketing always been, if not the engineering of hidden metaphors?