'Bubble tea' isn't completely wrong (but it's still wrong)
An unnecessarily deep look at the bubble-boba divide
When I moved to Massachusetts for college, the first thing that struck me about my new life was how sticky the summers were. As an Asian American teen coming from LA, I knew there were some things I’d have to get used to living with in my new home — things like humidity, different cultural vibes, New England old-money elitism, etc.
But I wasn’t expecting to hear so many people sacrilegiously butcher the name of my favorite Taiwanese treat. What I knew as boba tea, the milk tea dessert topped with tapioca balls, was known as “bubble tea” here. I’d never witnessed so many people refer to the drink’s tapioca add-on as “bubbles,” and I’d even get corrected at tea shops for ordering boba. It felt like an affront to the very fabric of my reality. My parents back home run a dumpling business where they sell boba. Going out for boba was a cornerstone of my social life in high school. My non-Mandarin-speaking ass can barely order three things for myself when I’m in Taiwan, and one of them is boba.
I did some Googling and was shocked to find out “bubble tea” is a pretty common term for the drink. The drink’s Wikipedia entry is titled “bubble tea,” and the consensus is the name came about around the time the drink arrived in the US. So while it’s not necessarily wrong to prefer the blasphemous “bubble” to boba, I messed around with some Google Trends data to learn more about the bubble vs. boba divide.
1. Boba is a relatively recent phenomenon in the US.
If you’re young and/or Asian on the internet, then you’ve definitely encountered the rise of boba-as-a-personality-trait-ism. Take a look at the wildly popular Subtle Asian Traits Facebook group and you’ll see almost every other meme is boba-related. The very piece you’re reading now is admittedly born from this phenomenon. On the whole, boba is a distinct part of the Asian American experience. But while it seems like it’s everywhere, boba wasn’t always as widespread as it is today.
Google searches for the terms “boba” and “bubble tea” have been on the rise since the early 2000s, and searches for “boba” have consistently outnumbered searches for “bubble tea.” I’ll go into the caveats and methodology more at the end of this, but for now, “hits” is the unitless measurement of search frequency Google Trends spits out when you compare search terms: 100 just means max search popularity for the selected time frame, and everything else is scaled accordingly.
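If the “hits” scale feels abstract, here is a rough sketch of the idea in Python. It is not Google’s actual code, and the numbers are invented for illustration (they are not real Trends data). The function name `scale_to_hits` is my own. The only thing it shows is that the biggest value across the compared terms becomes 100 and everything else is scaled relative to it.

```python
def scale_to_hits(series_by_term):
    """Rescale raw search volumes so the single largest value becomes 100.

    All terms share one peak, which is why comparing terms in the same
    request puts them on a common 0-100 scale.
    """
    peak = max(max(values) for values in series_by_term.values())
    if peak == 0:
        # Nothing was searched at all, so every score is 0.
        return {term: [0] * len(values) for term, values in series_by_term.items()}
    return {
        term: [round(100 * value / peak) for value in values]
        for term, values in series_by_term.items()
    }


# Invented volumes for four time periods (NOT real Google Trends data).
example_volumes = {
    "boba": [10, 20, 40, 50],
    "bubble tea": [5, 10, 20, 25],
}
hits = scale_to_hits(example_volumes)
print(hits)
```

In this toy example “boba” peaks at 100 in the last period, and “bubble tea” tops out at 50 because its largest volume is half of the overall peak.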
2. The bubble-boba divide is regional.
The map below compares the frequency of Google searches for “bubble tea” and “boba.”
States like California and Nevada have more searches for “boba” than “bubble tea,” while New York and Massachusetts are the opposite. States like Texas and Florida have an almost equal amount of “boba” and “bubble tea” searches.
For this analysis, I’m approximating preference as the difference between searches for “boba” and searches for “bubble tea”: preference = searches for “boba” minus searches for “bubble tea.” The greater the difference, the greater the preference. Here’s another (less pretty but more numeric) chart to illustrate that. States that fall on the diagonal line have approximately as many searches for “bubble tea” as “boba”; states that land above the line skew towards “bubble,” and states that fall below skew towards “boba.”
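The preference calculation is simple enough to sketch in a few lines of Python. The state names and scores below are placeholders I made up, not values from the charts, and the `tolerance` cutoff for calling a state “roughly even” is my own assumption rather than part of the analysis.

```python
def preference(boba_hits, bubble_hits):
    """Preference = searches for "boba" minus searches for "bubble tea"."""
    return boba_hits - bubble_hits


def skew(boba_hits, bubble_hits, tolerance=5):
    """Label which term a place leans towards.

    `tolerance` is a hypothetical cutoff: differences smaller than this
    are treated as sitting on the diagonal (no real preference).
    """
    diff = preference(boba_hits, bubble_hits)
    if diff > tolerance:
        return "boba"
    if diff < -tolerance:
        return "bubble tea"
    return "roughly even"


# Placeholder scores as (boba, bubble tea) pairs. NOT real Trends data.
example_states = {
    "State A": (80, 30),
    "State B": (25, 70),
    "State C": (50, 48),
}

for name, (boba, bubble) in example_states.items():
    print(name, preference(boba, bubble), skew(boba, bubble))
```

A positive preference means a place leans “boba,” a negative one means it leans “bubble tea,” and anything near zero sits close to the diagonal line.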
3. Modeling and predicting the bubble-boba divide.
Since there are only 50 states (plus DC) Google can collect trends data on, zooming in to the city-level search preferences gives us a bigger sample to work with (209 cities) when analyzing the regional distribution of bubble-boba preference. Here are 10 of the biggest US cities and where they fall on the spectrum:
The regional divide is still there at the city level. In the beeswarm plot below, individual dots representing cities are grouped into census regions: West, South, Midwest, or Northeast. The horizontal axis again illustrates preference by subtracting searches for “bubble tea” from searches for “boba”: dots appearing to the far left represent cities where “bubble tea” searches outnumber “boba” searches and vice versa.
As the first entry to Small Data, I want to set the tone of what I’m going for with this project: as I learn R and Python and other data-sciencey things, my goal is to create data essays that make you think, “I didn’t particularly need to know that, but cool.”
So I built a linear regression model that predicts a city’s search preference based on its region. The following charts are based on the formula:
The ridge plot below illustrates the possible outcomes of search preference for a city in each region. The horizontal axis represents a possible outcome for search preference, and the height of each histogram is the relative probability of that outcome. Preference is still defined as “boba” searches minus “bubble tea” searches, so -100 is a total preference for “bubble tea,” 100 is a total preference for “boba”, and 0 is no preference.
The probability distributions for Midwestern and Southern cities are centered around 0 with significant overlap. This means that if we wanted to predict the search preference of a Midwestern or Southern city not included in the initial sample, there would most likely be a weak preference or no preference at all. In short, the difference between being in the Midwest or South doesn’t mean much for a city’s preference for “bubble tea” or “boba.”
But distributions for Northeastern and Western cities are more distinct. Here’s the same data but filtered to include only predictions where a city’s region is either Northeast or West:
Here, the predicted preferences show a clear separation. The majority of outcomes for a Northeastern city fall below 0 (preference for “bubble tea”), and the majority of outcomes for a Western city fall above 0 (preference for “boba”). Although there’s still some overlap between the distributions, if we wanted to predict the search preference of a Northeastern or Western city outside of the initial sample, there’s a much greater chance region would be associated with a difference in search preference.
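For anyone curious what “predicting preference from region” looks like mechanically, here is a minimal Python sketch. To be clear about the assumptions: this is not the model behind the charts above, the city rows are invented, and the function names and the choice of the Midwest as the baseline region are mine. It relies on one fact about ordinary least squares: when the only predictor is a single category like region, the fitted value for each region is just that region’s average preference. It also returns only a point prediction per region, not the spread of possible outcomes the ridge plots show.

```python
from collections import defaultdict
from statistics import mean


def fit_region_model(rows, baseline="Midwest"):
    """Fit preference ~ region by least squares.

    With one categorical predictor, the least-squares fit is the mean of
    each group, so the intercept is the baseline region's mean and each
    coefficient is another region's mean minus that intercept.
    """
    by_region = defaultdict(list)
    for region, pref in rows:
        by_region[region].append(pref)
    means = {region: mean(prefs) for region, prefs in by_region.items()}
    intercept = means[baseline]
    coefs = {r: m - intercept for r, m in means.items() if r != baseline}
    return {"baseline": baseline, "intercept": intercept, "coefs": coefs}


def predict(model, region):
    """Point prediction of preference for a city in the given region."""
    if region == model["baseline"]:
        return model["intercept"]
    return model["intercept"] + model["coefs"][region]


# Invented (region, preference) rows. NOT the 209 cities from the analysis.
example_rows = [
    ("West", 40), ("West", 20),
    ("Northeast", -30), ("Northeast", -10),
    ("Midwest", 5), ("Midwest", -5),
    ("South", 0), ("South", 10),
]

model = fit_region_model(example_rows)
for region in ("West", "Northeast", "Midwest", "South"):
    print(region, predict(model, region))
```

In this toy data the West comes out positive (leaning “boba”) and the Northeast negative (leaning “bubble tea”), with the Midwest and South near zero, which mirrors the pattern described above without reproducing the real numbers.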
4. In conclusion, it’s not that deep.
When you look at the Google Trends data for “bubble tea” and “boba,” you can clearly see it’s a regional thing. “Boba” is the more popular way to say it on the whole, but without trying to make a historical or linguistic argument, I can’t say it’s the definitively correct way to say it. I dug really deep into census and demographic data to try to figure out who exactly is on either side of the bubble-boba divide, but there weren’t any clear trends.
My first guess was it had something to do with non-Asian people discovering the drink and butchering some translation along the way. Or worse, applying a Western lens to an “exotic” find. The New York Times issued an apology in 2017 for something similar. But Wikipedia says that’s not the case, and I compared every demographic metric I could think of across “boba” and “bubble” cities — immigration numbers from China and Taiwan, the share of ethnically Chinese and Taiwanese residents in a given city, the number of Chinese speakers in a city, etc. — and still found no patterns.
Part of that could be due to the limits of Google Trends data. The data from Trends is scaled by population and contextualized within the parameters of your request. The search popularity of “boba” is a score Trends assigns based on what share of all Google searches are for “boba” in a particular state or city. Another fun thing I found is there are huge spikes in searches for “boba” at the same time there are huge spikes for “boba fett” (and there was a big influx of “boba” searches when the Star Wars character made his first cameo in The Mandalorian). In the end, I’d say the data is too far removed from individuals and their actual opinions to give a more satisfying explanation for the bubble-boba divide.
After living in the Northeast for most of the last four years, I can say I’ve gotten used to most parts of life there. But “bubble tea” is definitely not one of them. I still cringe a little when I hear the words uttered aloud, and I still ask for “boba” regardless of what the teashop calls it.
When I first moved in for college, I used to be surprised how frequently classmates or acquaintances could guess I’m from LA — maybe it’s “boba” that gives it away.
Notice anything I missed? Got ideas for something I should look into? Let me know.