The Department of Homeland Security (DHS) is rapidly expanding its collection of social media information and using it to evaluate the security risks posed by foreign and American travelers. This year marks a major expansion. The visa applications vetted by DHS will include social media handles that the State Department is set to collect from some 15
million travelers per year.1 Social media can provide a vast trove of information about individuals, including their personal preferences, political and religious views, physical and mental health, and the identity of their friends and family. But it is susceptible
to misinterpretation, and wholesale monitoring of social media creates serious risks to privacy and free speech. Moreover, despite the rush to implement these programs, there is scant evidence that they actually meet the goals for which they are deployed.
While officials regularly testify before Congress to highlight some of the ways in which DHS is using social media, they rarely give a full picture or discuss either the effectiveness of such programs or their risks. The extent to which DHS exploits social media information is buried in jargon-filled notices about changes to document storage systems that impart only the vaguest outlines of the underlying activities.
To fill this gap, this report seeks to map out the department’s collection, use, and sharing of social media information by piecing together press reports, information obtained through Freedom of Information Act requests, Privacy Impact Assessments,2 System of Records Notices (SORNs),3 departmental handbooks, government contracts, and other publicly available documents.
In light of DHS’s expanding use of social media monitoring programs, understanding the ways in which the department exploits social media is critical. Personal information gleaned from social media posts has been used to target dissent and subject religious and ethnic minorities to enhanced vetting and surveillance. Some DHS programs are targeted at travelers, both Americans and those from other countries. And while the department’s immigration vetting programs ostensibly target foreigners, they also sweep up information about American friends, family members, and business associates, either deliberately or as a consequence of their broad scope.
Muslims are particularly vulnerable to targeting. According to a 2011 Pew survey (followed by a similar survey in 2017), more than a third of Muslim Americans who traveled by air reported that they had been singled out by airport security because of their faith — a practice premised on a connection between Muslim religiosity and terrorism that has long been debunked.4 A legal challenge to this practice is pending.5 According to government documents, one of the plaintiffs, Hassan Shibly, executive director of the Florida chapter of the Council on American-Islamic Relations, was pulled aside for secondary screening at the
border at least 20 times from 2004 to 2011.6 He says he was asked questions like “Are you part of any Islamic tribes?” and “Do you attend a particular mosque?”7 Shibly’s story is hardly unique.8
Concerns about such screenings are even more urgent under the Trump administration, which has made excluding Muslims a centerpiece of its immigration agenda through policies such as the Muslim ban and implementation of “extreme vetting” for refugee and visa applicants, primarily those from the Muslim world.9 A leaked DHS draft report from 2018 suggests that the administration is considering tagging young Muslim men as “at-risk persons” who should be subjected to intensive screening and ongoing monitoring.10 If implemented, such a policy would affect hundreds of thousands of people.11 DHS’s social media monitoring pilot programs seem to have focused in large part on Muslims: at least two targeted Syrian refugees, one targeted both Syrian and Iraqi refugees, and the analytical tool used in at least two pilots was tailored to Arabic speakers.12
More generally, social media monitoring — like other forms of surveillance — will impact what people say online, leading to self-censorship of people applying for visas as well as their family members and friends. The deleterious effect of surveillance on free speech has been well documented in empirical research; one recent study found that awareness or fear of government surveillance of the internet had a substantial chilling effect among both U.S. Muslims and broader U.S. samples of internet users.13 Even people who said they had nothing to hide were highly likely to self-censor online when they knew the government was watching.14 As Justice Sonia Sotomayor warned in a 2012 Supreme Court case challenging the warrantless use of GPS tracking technology, “[a]wareness that the Government may be watching chills associational and expressive freedoms. And the Government’s unrestrained power to assemble data that reveals private aspects of identity is susceptible to abuse.”15
DHS’s pilot programs for monitoring social media have been notably unsuccessful in identifying threats to national security.16 In 2016, DHS piloted several social media monitoring programs, one run by ICE and five by United States Citizenship and Immigration Services (USCIS).17 A February 2017 DHS inspector general audit of these pilot programs found that the department had not measured their effectiveness, rendering them an inadequate basis on which to build broader initiatives.18
Even more damning are USCIS’s own evaluations of the programs, which showed them to be largely ineffective. According to a brief prepared by DHS for the incoming administration at the end of 2016, for three out of the four programs used to vet refugees, “the information in the accounts did not yield clear, articulable links to national security concerns, even for those applicants who were found to pose a potential national security threat based on other security screening results.”19 The brief does show that USCIS complied with its own rules, which prohibit denying benefits solely on the basis of public-source information — such as that derived from social media — due to “its inherent lack of data integrity.”20 The department reviewed 1,500 immigration benefits cases and found that none were denied “solely or primarily because of information uncovered through social media vetting.”21 But this information provided scant insights in any event: out of the 12,000 refugee applicants and 1,500 immigration benefit applicants screened, USCIS found social media information helpful only in “a small number of cases,” where it “had a limited impact on the processing of those cases — specifically in developing additional lines of inquiry.”22
In fact, a key takeaway from the pilot programs was that they were unable to reliably match social media accounts to the individual being vetted, and even where the correct accounts were found, it was hard to determine “with any level of certainty” the “authenticity, veracity, [or] social context” of the data, as well as whether there were “indicators of fraud, public safety, or national security concern.”23 The brief explicitly questioned the overall value of the programs, noting that dedicating personnel “to mass social media screening diverts them away from conducting the more targeted enhanced vetting they are well trained and equipped to do.”24
The difficulties faced by DHS personnel are hardly surprising; attempts to make judgments based on social media are inevitably plagued by problems of interpretation.25 In 2012, for example, a British national was denied entry at a Los Angeles airport when DHS agents misinterpreted his posting on Twitter that he was going to “destroy America” — slang for partying — and “dig up Marilyn Monroe’s grave” — a joking reference to a television show.26 As the USCIS pilot programs demonstrate, interpretation is even harder when the language used is not English and the cultural context is unfamiliar. If the State Department’s current plans to undertake social media screening for 15 million travelers are implemented, government agencies will have to be able to understand the languages (more than 7,000) and cultural norms of 193 countries.27
Nonverbal communications on social media pose yet another set of challenges. As the Brennan Center and 34 other civil rights and civil liberties organizations pointed out in a May 2017 letter to the State Department:
If a Facebook user posts an article about the FBI persuading young, isolated Muslims to make statements in support of ISIS, and another user “loves” the article, is he sending appreciation that the article was posted, signaling support for the FBI’s practices, or sending love to a friend whose family has been affected?28
All of these difficulties, already substantial, are compounded when the process of reviewing posts is automated. Simple keyword searches are useless for identifying threats because they return an overwhelming number of results, many of them irrelevant. One American police department learned this lesson the hard way when efforts to unearth bomb threats online instead turned up references to “bomb” (i.e., excellent) pizza.29 Natural language processing, the technology used to judge the meaning of text, is not nearly accurate enough to do the job either. Studies show that the highest accuracy rate achieved by these tools is around 80 percent, with top-rated tools generally achieving 70–75 percent accuracy.30 This means that 20–30 percent of posts analyzed through natural language processing would be misinterpreted.
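The keyword problem described above is easy to demonstrate. The following minimal sketch uses a hypothetical watch list and invented posts (not any actual DHS criteria) to show why bare keyword matching floods reviewers with irrelevant hits:

```python
# Minimal sketch of naive keyword filtering. The watch words and
# posts below are hypothetical illustrations, not real DHS inputs.

WATCH_WORDS = {"bomb", "attack", "destroy"}

def flag_post(post: str) -> bool:
    """Flag a post if it contains any watch word, ignoring all context."""
    words = {w.strip(".,!?").lower() for w in post.split()}
    return bool(words & WATCH_WORDS)

posts = [
    "That pizza place is the bomb!",           # slang for "excellent"
    "Going to destroy America tonight lol",    # slang for partying
    "My team will attack the problem Monday",  # ordinary business usage
]

flagged = [p for p in posts if flag_post(p)]
# All three innocuous posts are flagged: keyword matching cannot
# distinguish slang or metaphor from a genuine threat.
```

Every post in this toy corpus trips the filter even though none describes a threat, which is exactly the failure mode the “bomb pizza” episode illustrates.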
Algorithmic tone and sentiment analysis, which senior DHS officials have suggested is being used to analyze social media, is even less accurate.31 A recent study concluded that it could make accurate predictions of political ideology based on users’ Twitter posts only 27 percent of the time, observing that the predictive exercise was “harder and more nuanced than previously reported.”32 Accuracy plummets even further when the speech being analyzed is not standard English.33 Indeed, even English speakers using nonstandard dialects or lingo may be misidentified by automated tools as speaking in a different language. One tool flagged posts in English by black and Hispanic users — like “Bored af den my phone finna die!!!!” (which can be loosely translated as “I’m bored as f*** and then my phone is going to die”) — as Danish with 99.9 percent confidence.34
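These error rates matter because of the scale at which screening would operate. The back-of-the-envelope arithmetic below combines two figures cited above — roughly 70–80 percent accuracy and some 15 million screened travelers per year — with the simplifying (and purely illustrative) assumption of one automated judgment per traveler:

```python
# Illustrative arithmetic only: the accuracy range and traveler count
# come from the report; "one automated judgment per traveler" is a
# simplifying assumption, not a description of any actual system.

travelers_per_year = 15_000_000
accuracy = 0.75            # midpoint of the cited 70-80% range
error_rate = 1 - accuracy  # ~25% of automated judgments would be wrong

misread_judgments = int(travelers_per_year * error_rate)
print(misread_judgments)   # 3750000 erroneous judgments per year
```

Even at the optimistic end of the cited range, an error rate of 20 percent applied to 15 million people yields millions of misinterpretations annually.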
Crucially — as the USCIS pilot programs discussed above demonstrated — algorithms are generally incapable of making the types of subjective evaluations that are required in many DHS immigration programs, such as whether someone poses a threat to public safety or national security or whether certain information is “derogatory.” Moreover, because these types of threats are difficult to define and measure, makers of algorithms will turn to “proxies” that are more easily observed. But there is a risk that the proxies will bear little or no relationship to the task and that they will instead reflect stereotypes and assumptions. The questioning of Muslim travelers about their religious practice as a means of judging the threat they pose shows that unfounded and biased assumptions are already entrenched at DHS. It would be easy enough to embed them in an algorithm.
Despite these serious shortcomings in terms of effectiveness and critics’ well-founded concerns about the potential for targeting certain political views and faiths, DHS is proceeding with programs for monitoring social media.35 The department’s attitude is perhaps best summed up by an ICE official who acknowledged that while they had not yet found anything on social media, “you never know, the day may come when social media will actually find someone that wasn’t in the government systems we check.”36
The consequences of allowing these types of programs to continue unchecked are too grave to ignore. In addition to responding to particular cases of abuse, Congress needs to fully address the risks of social media monitoring in immigration decisions. This requires understanding the overall system by which DHS collects this type of information, how it is used, how it is shared with other agencies, and how it is retained — often for decades — in government databases. Accordingly, this paper maps social media exploitation by the four parts of DHS that are most central to immigration: Customs and Border Protection (CBP), the Transportation Security Administration (TSA), Immigration and Customs Enforcement (ICE), and United States Citizenship and Immigration Services (USCIS). It also examines DHS’s cooperation with the Department of State, which plays a key role in immigration vetting.