David Carroll Q & A: Data and Democracy

In our first interview with David Carroll, the Courier explored the Easton native’s journey to reclaim his data from Cambridge Analytica, the now-defunct British political consulting firm that was involved in an illegal data collection scandal prior to the 2016 U.S. presidential election.

Contributing editor Neil Grasso sat down (virtually) with David Carroll once again to learn more about the subject, as well as to find out how Americans can protect themselves from harmful data harvesting leading up to Election Day.

This time of year is typically filled with spooky Halloween costumes and sweet candy. However, with the way things have gone so far in 2020, the American people may have had all of the scares they can handle.

Aside from the ongoing COVID-19 crisis that initially spiked in early March, killing over 225,000 Americans and devastating the U.S. economy, there is another threat that may have a major impact on the lives of Americans as the presidential election nears: data collection.

I spoke with David Carroll, associate professor at the Parsons School of Design and the subject of the Netflix original documentary The Great Hack, to get a better understanding of how individuals can protect themselves from invasive data collection and how such tactics could play a role in one of the most divisive presidential elections in this country’s history.

NG: Is there such a thing as useless data? Would someone like my Mom or Dad possess data that is remotely as valuable as the data of a corporate executive or politician?

DC: An individual’s data is not very valuable on its own. Data is valuable in the aggregate. It’s assembled into massive data sets that then contribute to creating algorithms, products, and services. It’s important to try to balance the individual versus the collective here – yes, there are individual harms that can occur, but generally the true impact is the harm to a larger community, and it occurs because data is collected from all of us and then used in ways that we can’t understand as a whole.

We can still protect ourselves and our families from harm, so it makes it worthwhile to protect our data. But it’s not because the data itself is worth money. It’s worthwhile because of the knowledge, information, and capability that is embedded (in the data), especially when it’s a part of a larger group and larger system.

Senior citizens, for example, are routinely targeted for fraud and scams, so senior citizens as a class of users are inherently vulnerable. Data that individuals radiate identifying them as senior citizens therefore increases their odds of being targeted for those scams. Similar issues exist for young women who are thinking about becoming pregnant. They are vulnerable because their data signals to advertisers their interest in becoming a mother, and they will be bombarded with advertising accordingly.

There are very personal details of our lives that we are broadcasting unknowingly. It’s one thing for a person to say “I’m pregnant!” It’s another thing for a company to predict that you are pregnant based on your search histories, interests, and behavior signatures. That’s very invasive, but you can see why marketers would want it. Examples like these answer the question of why individuals can’t simply sell their own data – it’s just not worth that much on its own. It’s only worth something in the aggregate.

NG: What are some ways you currently protect your own data? Do you use software, VPNs, encryption, or avoid certain websites?

DC: There are defensive postures that you can take, but it takes work to be defensive. The default position is to be naked, basically. People who do nothing and then proceed to sign up for things in an unthoughtful or uncareful manner are naked, and the analogy I use is that being defensive is putting on some clothes. There are things you can do to travel the Internet with clothes on.

For certain services, I go deep into the settings to disable and opt-out of all of the personalization and targeting features that are on by default. You can do this on platforms such as Facebook and Twitter. Most of these services have this capability, but it takes work to find it. And, when you turn it off, they will beg you not to do so! But it’s worth doing. You can see why you are so valuable to them.

If you have an iPhone or iPad, there is a setting to limit ad tracking. It is off by default, but if you turn it on, it will prevent a lot of ad-tracking for the apps on your phone and other kinds of services that are much more integrated. Many of the advertisers will respect that. There is another setting called “Reset Advertising Identifier.” This is a unique name that your phone gives you so that advertisers can track you. It’s not the name that your parents gave you, but it is certainly a name that the advertising industry has given you, and it is one that no one else has. You can reset that – some people even recommend resetting it every month. That way, you are a whole new profile every month in the eyes of advertisers, and you are not associated with the data from the month before.

Apple gives some controls in this regard, but the language can be confusing. A great example is how “limit ad tracking” is set to “off” initially. They do this on purpose to confuse us, because they don’t want us to press the button. You are the product, not the customer. The customers are the advertisers. There are similar features in Android phones as well.

If you take these steps, you may find your ads become less creepy and less accurate at predicting your interests.

A third layer of defense that I use is a paid service called disconnect.me. It was founded by veterans of the ad tech industry who left to create tools to protect people from these companies. It is the most aggressive software that you can install on your phone or computer to block trackers. It blocks them at the deepest possible level – at the source. There won’t be any trackers in the mobile games you play, in the apps on your phone, or on the websites you visit. There is a free product available for phones, and a paid product for both phones and computers. I have blocked over 160,000 trackers since I began using it. Until you block them, you don’t even know they are there. If you want to get aggressive, you have to start using industrial-grade tools.

VPNs (Virtual Private Networks) need to be evaluated carefully, as some are made by disreputable companies that conduct data harvesting themselves. When you choose a VPN, make sure you choose a reputable company. And beware of free VPNs – if the product is free, you are the product.

A VPN is not necessary all the time. The cases you would want to use them might be when you are using public wifi, or when you are trying to shield information from your ISP (Internet Service Provider).

Again, it takes a lot of work and maybe even a little bit of money to protect yourself. Unfortunately, this is what the industry forces us to do. It’s an opt-out regime that forces us to do all the work, unlike Europe, where tracking is opt-in.

NG: When is it okay for a political campaign, a company, a research firm like Cambridge Analytica, or a law enforcement agency to utilize this technology? When do they cross the line?

DC: I would argue that none of it is legitimate without legal protections in place for individuals. In terms of elections, U.S. citizens have no special rights or privileges with regard to political and voter data. I would be much more comfortable with companies doing this stuff if it had rights and protections underlying it, as in Europe.

We need a lot more to create a safe, equitable, transparent, and symmetrical environment for this. It is too asymmetrical now: companies get all of the privacy. We can’t know what companies do, but they know everything about us.

If there were a more equitable arrangement, then I would be much more comfortable with the whole economy, and this would apply to the government too. If there were adequate safeguards, transparency, and accountability requirements around the way the government uses surveillance to fight crime, then I would be more comfortable with it. Right now, it seems very unaccountable, highly secretive, and quite difficult for us to achieve accountability when things go wrong. Yes, people can argue that the data is valuable. It can do good things, it can save money, it can achieve business outcomes, etc. But that is because of the asymmetry. That side of the equation has all of the knowledge and power. On the other side, people have no rights, no knowledge, no power, no influence, and no accountability. The asymmetry is the problem.

NG: Does data eventually accumulate to the point where it creates politically fueled, in-person events?

DC: The algorithm has no morality beyond making money and capturing attention. We become caught in a vicious loop of feedback mechanisms where incendiary content performs better than non-volatile content. The research shows that emotion drives engagement in this way, and the algorithm rewards it while penalizing other kinds of content. We see this in YouTube recommendations, and in how we get likes on Instagram, Facebook, and Twitter. We are motivated by getting this attention, and we are motivated by getting subscribers and views. It is the fuel of the algorithm, and because we don’t know how it works, we are at its mercy without understanding it. It’s quite powerful.

A Wall Street Journal report on an internal Facebook study found that 60% of users would join a more radical group when it was suggested in the group recommendations tab. We are all being manipulated by these machines, and it seems to be pouring gasoline on the tensions of society. The report was fascinating and terrifying: it showed that Facebook knows its platform radicalizes people, and that it has struggled to undo that quality. YouTube has been accused of this as well, with its conspiracy-theory rabbit holes.

Increasingly, our lives are becoming determined by invisible algorithms that are manipulating our behavior. Our inability to see how they work, have them regulated, and have them be accountable to the social damage that they are causing – that is the problem.

NG: Are “persuadables” (the Americans who were targeted by Cambridge Analytica in 2016) still the most valuable demographic for an election campaign to target? Are they similar to the ones depicted in The Great Hack leading up to the 2016 U.S. election?

DC: The country is so profoundly different than it was four years ago that I wouldn’t be able to make any predictions about what the data looks like right now. Conventional wisdom was blown out in 2016, and I would say conventional wisdom is again going out the window in 2020. The only thing that is predictable is that it is unpredictable.

Any voter analytics company working for a campaign is going to try to exploit the electoral college to win. That is what they are hired to do. Within the Cambridge Analytica case study, you can find the origins of some very effective slogans: “Build the Wall,” “Deep State,” and “Crooked Hillary.” Cambridge Analytica discovered these slogans very early on in their initial focus groups. They would do very traditional qualitative marketing: get focus groups together, get target voters talking, get their words, and identify the terms that work.

In 2016, Steve Bannon would receive these words and then whisper them into Trump’s ear before he went on stage. Trump would say them, and the audience would react. They were using very basic marketing techniques, but they were using them to foment division and animosity, and to exploit those divisions to achieve the electoral outcome. “Effective” is a Machiavellian term here. It’s win at all costs, which means divide the country, foment division, and make sure that the winner loses the popular vote.

U.S. citizens will likely be targeted with strategic advertising campaigns more intensely than ever before in the days leading up to the election. But, as Carroll notes, it is worth making the effort to protect personal data at all times, as the data harvesting trend is almost certain to continue long after Election Day.