When Cambridge Analytica refreshed its data, meaning that it updated the locally held database with new data points, we struck a range of agreements with clients and vendors. Depending on those agreements, the data sets could cost millions of dollars or nothing at all, as Cambridge sometimes struck data-sharing agreements by which we shared our proprietary data with other companies in exchange for theirs. No money had to change hands. An example of this comes from the company Infogroup, which has a data-sharing “co-op” that nonprofits use to identify donors. When one nonprofit shares with Infogroup its list of donors, and how much each gave, it receives in return the same data on other donors, their habits, fiscal donation brackets, and core philanthropic preferences.
From the massive database that Cambridge had compiled from all these different sources, it then went on to do something else that differentiated it from its competitors. It began to mix the batter of the figurative “cake” Alexander had talked about. While the data sets we possessed were the critical foundation, it was what we did with them, our use of what we called “psychographics,” that made Cambridge’s work precise and effective.
The term psychographics was created to describe the process by which we took in-house personality scoring and applied it to our massive database. Using analytic tools to understand individuals’ complex personalities, the psychologists then determined what motivated those individuals to act. Then the creative team tailored specific messages to those personality types in a process called “behavioral microtargeting.”
With behavioral microtargeting, a term Cambridge trademarked, we could zoom in on individuals who shared common personality traits and concerns and message them again and again, fine-tuning and tweaking those messages until we got precisely the results we wanted. In the case of elections, we wanted people to donate money; learn about our candidate and the issues involved in the race; actually get out to the polling booths; and vote for our candidate. Likewise, and most disturbing, some campaigns also aimed to “deter” some people from going to the polls at all.
As Tayler detailed the process, Cambridge took the Facebook user data he had gathered from entertaining personality surveys such as the Sex Compass and the Musical Walrus, which he had created through third-party app developers, and matched it with data from outside vendors such as Experian. We then gave millions of individuals “OCEAN” scores, determined from the thousands of data points about them.
OCEAN scoring grew out of academic behavioral and social psychology. Cambridge used OCEAN scoring to determine the construction of people’s personalities. By testing personalities and matching data points, CA found it was possible to determine the degree to which an individual was “open” (O), “conscientious” (C), “extroverted” (E), “agreeable” (A), or “neurotic” (N). Once CA had models of these various personality types, they could go ahead and match an individual in question to individuals whose data was already in the proprietary database, and thus group people accordingly. So that was how CA could determine who among the millions upon millions of people whose data points CA had were O, C, E, A, N, or even a combination of several of those traits.
It was OCEAN that allowed for Cambridge’s five-step approach.
First, CA could segment all the people whose info they had into even more sophisticated and nuanced groups than any other communications firm. (Yes, other companies were also able to segment groups of people beyond their basic demographics such as gender and race, but those companies, when determining advanced characteristics such as party affinity or issue preference, often used crude polling to determine where people generally stood on issues.) OCEAN scoring was nuanced and complex, allowing Cambridge to understand people on a continuum in each category. Some people were predominantly “open” and “agreeable.” Others were “neurotic” and “extroverted.” Still others were “conscientious” and “open.” There were thirty-two main groupings in all. A person’s “openness” score indicated whether he or she enjoyed new experiences or was more inclined to rely on and appreciate tradition. The “conscientiousness” score indicated whether a person preferred planning over spontaneity. The “extroversion” score revealed the degree to which one liked to engage with others and be part of a community. “Agreeableness” indicated whether the person put others’ needs before their own. And “neuroticism” indicated how likely the person was to be driven by fear when making decisions.
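The passage does not say how the thirty-two groupings were defined. One possible reading, offered here only as an assumption, is that each of the five traits was treated as either high or low, which gives 2 × 2 × 2 × 2 × 2 = 32 combinations. The short Python sketch below illustrates that arithmetic. The trait names come from the passage; the 0-to-1 score scale, the 0.5 cutoff, and the function name `grouping` are hypothetical and are not drawn from Cambridge's actual models.

```python
from itertools import product

# The five OCEAN traits named in the passage.
TRAITS = ["openness", "conscientiousness", "extroversion",
          "agreeableness", "neuroticism"]


def grouping(scores, threshold=0.5):
    """Label each trait 'high' or 'low' for one person.

    `scores` maps each trait to a number between 0 and 1. Both that
    scale and the 0.5 cutoff are illustrative assumptions.
    """
    return tuple("high" if scores[t] >= threshold else "low" for t in TRAITS)


# Every high/low combination across five traits: 2 ** 5 groupings.
all_groupings = list(product(["low", "high"], repeat=len(TRAITS)))
print(len(all_groupings))  # 32

# A made-up person who is mostly "open" and "agreeable".
example = {"openness": 0.8, "conscientiousness": 0.3, "extroversion": 0.4,
           "agreeableness": 0.9, "neuroticism": 0.2}
print(grouping(example))
```

Under this reading, a person's place on the continuum for each trait still matters for finer segmentation; the high/low split only names the broad grouping.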
Depending on the varied subcategories in which people were sorted, CA then added in the issues about which they had already shown an interest (say, from their Facebook “likes”) and segmented each group with even more refinement. For example, it was too simplistic to see two women who were thirty-four years old and white and who shopped at Macy’s as the same person. Rather, by doing the psychographic profiling and then adding to it everything ranging from the women’s lifestyle data to their voting records to their Facebook “likes” and credit scores, CA’s data scientists could begin to see each woman as profoundly different from the other. People who looked alike weren’t necessarily alike at all. They therefore shouldn’t be messaged together. While this seems obvious—it was a concept supposedly already permeating the advertising industry at the time Cambridge Analytica came along—most political consultants had no idea how to do this or that it was even possible. It would be for them a revelation and a means to victory.
Second, CA provided clients, political and commercial, with a benefit that set the company apart: the accuracy of its predictive algorithms. Dr. Alex Tayler, Dr. Jack Gillett, and CA’s other data scientists constantly ran new algorithms, producing much more than mere psychographic scores. They produced scores for every person in America, predicting on a scale of 0 to 100 percent how likely, for example, each was to vote; how likely each was to belong to a particular political party; or what toothpaste each was likely to prefer. CA knew whether you were more likely to want to donate to a cause when clicking a red button or a blue, and how likely you were to wish to hear about environmental policy versus gun rights. After breaking people up into groups using their predictive scores, CA’s digital strategists and data scientists spent much of their time testing and retesting these “models,” or user groupings called “audiences,” and refining them to a high degree of accuracy, with up to 95 percent confidence in those scores.
Third, CA took what they had learned from these algorithms and used platforms such as Twitter, Facebook, Pandora (music streaming), and YouTube to find out where the people they wished to target spent the most interactive time. Where was the best place to reach each person? It might be through something as physical and basic as direct paper “snail” mail sent to an actual mailbox. It might be in the form of a television ad or in whatever popped up at the top of that person’s Google search results. By purchasing lists of key words from Google, CA was able to reach users when they typed those words into their browsers or search engines. Each time they did, they would be met with materials (ads, articles, etc.) that CA had designed especially for them.
The fourth step in the process was another ingredient in the “cake recipe,” and the one that put CA head and shoulders above the competition, above every political consulting firm in the world: they found ways to reach targeted audiences, and to test the effectiveness of that reach, through client-facing tools such as the one CA designed especially for its own use. Called Ripon, this canvassing software program for door-to-door campaigners and phone bankers allowed its users direct access to your data as they approached your house or called you on the phone. Data-visualization tools also helped them determine their strategy before you’d even opened your door or picked up your phone.
Then campaigns would be designed based on content our in-house team had composed. The final, fifth step, the microtargeting strategy, allowed everything from video to audio to print ads to reach the identified targets. Using an automated system that refined that content again and again, we were able to understand what made individual users finally engage with that content in a meaningful way. We might learn that it took as many as twenty or thirty variations of the same ad sent to the same person thirty different times and placed on different parts of their social media feed before they clicked on it to