7 Problems with predictive policing

For those who either fear or welcome the world of Philip K. Dick's Minority Report, we're getting there, and it's time to take stock. We aren't talking about actual clairvoyance of crimes and criminals, or about preventive detention based on algorithms. But the theory that crime happens not randomly but in "patterned ways," combined with confidence that big data can predict all kinds of social behavior and phenomena, has taken hold in cities looking to spend their federal policing grants on shiny things. This is true even though crime is decreasing overall (and, as we see below, although violent crime periodically spikes back up, predictive policing is least effective against it).

And while there are legal limits on law enforcement’s direct use of some data appending products, we’re finding that agencies may use aggregators to get around even the most rigorous civil rights protections.

Not everyone is excited. Here are the most important reasons why:

  1. Policing algorithms reinforce systemic racism 

The simplest iteration of this argument is: most of the data folded into predictive policing systems comes from police, and much of the rest comes from community members. Racism undeniably exists across these populations, as "AI algorithms are only able to analyze the data we give them . . . if human police officers have a racial bias and unintentionally feed skewed reports to a predictive policing system, it may see a threat where there isn’t one." In fact, Anna Johnson, writing for VentureBeat about the failure of predictive policing in New Orleans, says that city's experience basically proved that biased input creates biased results.

  2. Predictive crime analytics produce huge numbers of false positives

Kaiser Fung, founder of Principal Analytics Prep, has a plainspoken and often bitingly funny blog where last month he devoted two posts to "the absurdity of predictive policing."

One thing Fung points out is that certain crimes are "statistically rare" (even if they seem to happen a lot). Because of that low base rate, a predictive model has to generate many more red flags (targets to be investigated) than there are actual instances of the crime in order to be "accurate."

"Let's say the model points the finger at 1 percent of the list," he writes. "That would mean 1,000 potential burglars. Since there should be only 770 burglars, the first thing we know is that at least 230 innocent people would be flagged by this model." That's a lot of suspects. How many of them will be pressured into confessing to something they didn't do, or at a minimum, have their lives painfully disrupted.

  3. Attributing crime prevention to predictive systems is meaningless: you can't identify things that didn't happen

This is a particularly devastating observation from Fung's posts about predictive policing. If you flag an area or individual as "at risk" and then police that area or individual, you may or may not have prevented anything. You can't prove that the prediction was accurate in the first place, and Fung finds it absurd that sales reps of these systems basically say "Look, it flagged 1,000 people, and subsequently, none of these people committed burglary! Amazing! Genius! Wow!" They can get away with claiming virtually 100% accuracy through this embarrassing rhetorical sleight of hand. Call it statistical or technological illiteracy. It's also deeply cynical on the part of those promoting the systems.

  4. Predictive analytics falls apart when trying to predict violent crimes or terrorism

One area where predictive policing seems to at least . . . predict the risk of crime is property crime. When it comes to literally anything more dreadful than burglary, though, the technology doesn't have much to say in its favor. Timme Bisgaard Munk of the University of Copenhagen's school of information science wrote a scathing review in 2017 entitled "100,000 false positives for every real terrorist: Why anti-terror algorithms don't work," and the title does justice to the article. In particular, Munk points out that predictive analytics of terrorist attack risks borrows from prediction work around credit card fraud. But terrorism is "categorically less predictable" than credit card fraud. In general, violent crime is the least predictable kind of crime.

  5. Predictive policing is mostly hype to make a frightened public trust the police

After reviewing many studies and analyses, Munk concluded that European agencies' choices of predictive policing programs are based more on pacifying the public, particularly a European public frightened of terrorism. "The purchase and application of these programs," Munk wrote in the 2017 article, "is based on a political acceptance of the ideology that algorithms can solve social problems by preventing a possible future." This is striking because there is no evidence, certainly no scientific evidence, that predictive counter-terrorism is a thing. And in a more general sense, there's no consensus that any predictive policing technology works.

  6. There's no such thing as neutral tech

We read a powerful post by Rick Jones, an attorney at Neighborhood Defender Service of Harlem and president of the National Association of Criminal Defense Lawyers. The post is obviously written from the point of view of a public defender, and written to highlight the public suspicion of policing technology. But a sound argument is a sound argument. Jones reminds us "that seemingly innocuous or objective technologies are not, and are instead subject to the same biases and disparities that exist throughout the rest of our justice system." Jones may be assuming a "garbage in/garbage out" metaphor that doesn't precisely describe what happens when algorithms and data sets synthesize new knowledge "greater than the sum of its parts." De-colonizing that data, "removing" bias from its inputs and practitioners, needs to be proactive at a minimum, and even then it may not be adequate.

  7. Guess what data these programs rely on? Data from previously over-policed neighborhoods

Attorney Jones specifically talks about a system called "PredPol," which uses data on the location, time, and nature of crimes to mark "high-risk" areas for future crime. It calls those areas "hot spots," a stunning display of unoriginality. And speaking of unoriginal, PredPol literally uses the very data that policing, and specifically over-policing, has generated. It's basically incestuous data collection that demonstrates the very thing it needs to prove in order to justify more over-policing. It's a "feedback loop" that "enables police to advance discriminatory practices behind the presumed objectivity of technology."
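To make that feedback loop concrete, here is a toy simulation in Python. It is not PredPol's actual model; the crime numbers, patrol shares, and the 2 percent weekly shift toward the "hot spot" are invented for illustration. The point is just that when you record crime where you patrol, and patrol where you've recorded crime, an initial skew confirms and amplifies itself even when the two areas are identical underneath.

    # A toy feedback-loop simulation (not PredPol's actual algorithm).
    # Two areas have identical underlying crime, but one starts out
    # slightly over-policed; the record then "justifies" sending it more.

    true_crime = [100, 100]        # identical underlying crime in areas 0 and 1
    patrol_share = [0.55, 0.45]    # area 0 starts out slightly over-policed
    recorded = [0, 0]              # cumulative recorded incidents

    for week in range(20):
        for i in range(2):
            recorded[i] += true_crime[i] * patrol_share[i]   # you record what you patrol
        hot_spot = 0 if recorded[0] >= recorded[1] else 1
        # Shift 2% of patrol coverage toward the designated "hot spot" each week.
        patrol_share[hot_spot] = min(patrol_share[hot_spot] + 0.02, 0.95)
        patrol_share[1 - hot_spot] = 1 - patrol_share[hot_spot]

    print([round(p, 2) for p in patrol_share])   # roughly [0.95, 0.05]
    print([round(r) for r in recorded])          # the data now "proves" area 0 is the problem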


Sci-Fi Shows Us Benevolence and Vulnerability in AI Characters

Benevolent and vulnerable superintelligent robots are notable because they are atypical. In both the real world and in many science fiction stories, there's something rather grey and mundane about AI. In particular, the stereotype seems to be that AI is either malevolent or neutral-and-waiting-to-be-malevolent. When characters break that stereotype through benevolence or inquisitiveness, they become iconic in their transcendence. This is certainly true of Brent Spiner's Data (and Data's "brother" Lore). But there are a few other noteworthy android AI types that exhibit similarly unusual traits.

With AI about to power "the next generation" of real robots, with tech companies creating "reinforcement learning software for robots" that, in one instance, gets these creations to "pick up objects they've never encountered before," we are seeing the ongoing "anthropomorphizing" of them as well. Sophia was made an honorary Saudi citizen, but the video of interactions with her leaves one hesitant to declare her "revolutionary" in her approach and immediacy to the world. She's pretty stiff, and many of her answers to questions come off as predictable "go-to" subroutines. She's good-looking, though, and not just in the sense that she's an attractive talking mannequin; she also comes off as just the slightest bit curious, wondering what she's doing there, and cleverly self-effacing.

What are some leading AI fictional characters and what are their distinguishing traits? To bring up Lieutenant Commander Data again, one would have to say "his" distinguishing trait is vulnerability. From being discovered and rescued as the sole "survivor" of an attack on his colony, to his endless struggles with identity formation on the Enterprise, Data is vulnerably honest, vulnerably curious, conscious of his power over, and simultaneous dependence on, the material provisions and benevolence of Starfleet. 

Data has (and loses) an emotion chip. According to the Memory Beta fandom site (which is not canon but in this instance simply cites the series), Lore killed the androids' "father" Dr. Soong and stole the chip, used it to manipulate Data in the TNG episode "Brothers," and Data eventually removed it upon neutralizing Lore. Starfleet later ordered its removal from Data but allowed it to be upgraded when Commander Data was "reborn." Outerplaces reports that:

"In the hunt to create more helpful, responsive autonomous machines, many robotics companies are working hard to build computers that can empathize with humans and tailor their actions so as to anticipate their owners' needs. One such company is Emoshape, which is building software for robots that will help machines to learn more about humans' moods based on their facial expressions. The company takes a novel approach to this, as engineers work to create an "emotion chip" for machines so that they can approach emotional learning with some degree of understanding as to what it feels to be happy, or sad, or otherwise frustrated."

Data has many existential vulnerabilities: computer viruses, energy discharges, ship malfunctions, and someone reaching his "off switch." But he is also vulnerable to having his feelings hurt, whatever those are. 

Polish sci-fi writer and satirist Stanislaw Lem, who wrote Solaris and who has been called science fiction's Kafka, developed an AI character called Golem, whose main attribute could be called "change," or evolution. Golem begins as a military AI computer but develops self-consciousness, then engineers its own intelligence supplements. Lem's book includes "lectures" written by Golem on the nature of humanity and reads like Olaf Stapledon (whose work is an early, metaphorical foreshadowing of big data—a superintelligent meta-history of humanity and the universe). Golem becomes concerned with understanding and critiquing humanity from a scholarly perspective. The idea of a robot scholar is pretty original. Check out this short film based on the story.

Then there's Ray Bradbury's Grandma, a character whose main trait is certainly benevolence. Grandma appears in I Sing the Body Electric as an "electric grandmother" product in this innovative story. A father buys her for his children after their mother passes away, and she quickly becomes an indispensable member of the family, although it takes a while for every last member of the family to learn to love her.

Grandma has unusual traits like being able to run water from a tap in her finger, but she also has the characteristic of being 100% committed to the children, in a way that is clearly compatible with Asimov's robot ethics. At one point, she risks her life to save one of the children. This, of course, reminds us of "DroneSense," a drone software platform that purports to be "used for public safety," although without such drastic scenes as an android racing to save a child from being hit by a truck. One can obviously ask, "But does it want to be benevolent?"

The deeper question in the industry, though, is not whether AI will "want" to be benevolent, but whether certain traits in the actual construction of AI will tend toward good or evil. In an article published four years ago, Olivia Solon argues that it is much more likely that artificially intelligent robots will hurt us by accident than by intentionally "rising up against us" or turning against any individual humans deliberately. She points out Elon Musk's speculative fear that "an artificially intelligent hedge fund designed to maximize the value of its portfolio could be incentivized to short consumer stocks, buy long on defence stocks and then start a war." Making the wrong decision in traffic scenarios is always high on the fear list too. The "bumbling fool" AI is less terrifying than the malevolent robot, even if it may end up being the more dangerous scenario.

It is worth noting that while these are always the top-of-mind concerns, the vast majority of AI will be intentionally limited by design, deployed in a neutral way to help feed the new and most important asset in the world: data. It is these AIs that are greatly changing the marketing world and how companies like our client Accurate Append, an email, phone, and data vendor, operate within it.


Election 2020: boots on the ground & bits in the cloud

I'm getting excited about the election. I feel my pulse get a tiny bit faster watching political ads, getting text messages, seeing people volunteer. It feels "American" even though I know we don't always live up to our ideals. Not all the Founders of the United States even wanted a popular vote. "If elections were open to all classes of people, the property of landed proprietors would be insecure," James Madison feared during a secret debate in 1787. But he didn't prevail over his colleagues who subscribed to the view of Thomas Paine, the conscience of the Founders if not fully counted among them. Paine wrote: "The right of voting for representatives is the primary right by which other rights are protected. To take away this right is to reduce a man to slavery, for slavery consists in being subject to the will of another, and he that has not a vote in the election of representatives is in this case." 

Today, candidates have to have both a ground game and a digital game. You can judge a ground game by the number of (and location of) campaign offices. Buttigieg and Warren lead in this metric. Or you can count the number of volunteers a candidate has. Bernie has 25,000 in Iowa alone, an impressive number working on those all-important Caucuses. And last February the campaign announced that "more than one million people have now volunteered to support the senator's 2020 bid." Or more precisely, to do volunteer work to support the campaign. The campaign has well over 100,000 active supporters in Pennsylvania, calling across the state and organizing in cities like Pittsburgh.

Speaking of Iowa, and as impressive as Bernie's campaign is doing there with volunteers, Joshua Barr at 538 recently posted a great analysis comparing Barack Obama's fieldwork in Iowa to that of all the current Democratic contenders and finding that none of them matches Obama's 2008 Iowa ground game. Obama's campaign had field offices in the smallest of towns and rural counties. One wonders how important the candidates feel Iowa is in 2020, although the top tier seem very invested in it.

There's no doubt that Bernie will have boatloads of volunteers, and one could easily see the scenario where he has more than any other candidate. But "a million" sign-ups might mean only a fraction of actual volunteers showing up—a calculation that all campaign volunteer coordinators have baked into their analysis of what can be done. Volunteers can be fickle and unreliable. But many hands make light work, and operations that make volunteers feel important and appreciated will keep enough of them coming back that a lot of campaign work can be done. 

The Sanders campaign is on to something, as a recent Huff Post piece describes: they have a vision and a method. They empower people to host house parties and deliver stories, they use a lot of texting, and the campaign has created "an infrastructure to facilitate the work of its most dedicated supporters." More and more campaigns investing in this outreach, especially via SMS messages, are using vendors like our client Accurate Append, an email and phone contact data quality vendor, to acquire those mobile numbers.

Far more money is being spent on digital advertising than in past cycles. It's not just for the weird world of mass microtargeting either. Digital ads can also test campaign messages, which can then be transposed into television advertising, which still dominates election media, particularly in the two months before election day. But despite that TV focus, by "September, presidential hopefuls had cumulatively spent $60.9 million on Facebook and Google ads compared to $11.4 million on television ads, according to an analysis by the Wesleyan Media Project." Voters also give feedback online. Data, and lately big data, have played a role in turning that social media engagement into strategy.

All of this emerged from Barack Obama’s use of Facebook ads in 2008—what people in the field call a "turning point." One expert "predicts that $6 billion will be spent on paid advertising during the 2020 election" with most ultimately going to broadcast and cable television, but at least $1.2 billion on digital ads. 

It's when the two types of campaigning are combined in force that you know a candidate is serious. The Republicans are often written off as lacking ground games, but that accusation would be laughable in 2020. Whatever Trump's approval numbers, and whatever support he may have shed from those who did not know what to expect from him, there's no doubt that his supporters will make every effort to be organized and proactive; now they have a president to defend. And a portion of the billionaire class has the money to spend.

In Michigan, "Republican President Donald Trump’s re-election campaign is training volunteers for what his national press secretary described as the most advanced ground game in modern political history." If Trump wins Michigan, well, it's a ballgame at that point. The Michigan Republican Party is facilitating national training sessions, and the campaign is distributing outreach tools to the states. But the RNC also has a digital game that it's poured $300 million into since 2014. We know the Trump campaign will pay trolls and botmakers and all kinds of craftsmen for social media engagement. 

So these are the thoughts that keep me alert to news of both grassroots campaigning and digital work, including digital shenanigans that make me cringe. The game is afoot, and there will be unprecedented human and monetary capital invested in its outcome. I'm not the only person who feels oddly, perhaps ironically, patriotic about it. In an essay called "Democratic Vistas," Walt Whitman wrote: "I know nothing grander, better exercise, better digestion, more positive proof of the past, the triumphant result of faith in humankind, than a well-contested American national election." Now, I can think of a few grander exercises or more fun ones at least, but there is certainly some humanist pride in the whole enterprise, as corrupted and malleable as it sometimes seems.


What does 2020 Hold for Big Data, AI and Tech?

Forbes predicts "AI, Disinformation, and Human Augmentation" in 2020 and I can't say I disagree, but let's take a deeper dive. I'm especially interested in the way that new technology, and new conversations, are building upon existing ones. 2019 gave us lots of discussion about AI, quantum computing, cryptocurrencies, and unethical political advertising via microtargeting. Yes, Forbes says, these discussions will continue. But here's what I'm looking at. 

Big data does IoT: The most promising technological evolution to continue into the new year is the merging of data analytics with the Internet of Things (IoT). The heralding of IoT a few years ago has not proven unwarranted. The promise of an integrated material and informational life, with more efficient and appropriate exchange and delivery of everything, is taking shape. The integration of more and better data analytics will take this even further. This is first on Marcel Deer's list of important predictions for big data in 2020: "This time next year, we can expect to have 20 billion IoT devices collecting data for analysis . . . This means we will likely acknowledge more analytical solutions for IoT devices, helping to provide transparency and more relevant data." The business implications of this trove of data will be interesting to see develop as well. Those in the data appending industry, like our client Accurate Append, an email and phone contact data quality vendor, might see new ways to help businesses better connect with and understand their customers.

Shortage of data science pros: Regarding IoT analytics and the AI sector in general, Deer also says "around 75 percent of companies might suffer while accomplishing matured benefits of IoT due to a lack of data science professionals." There are a lot of late-in-the-year stories floating around about this now, such as Rainmakrr's coverage of recruiter agency demand in the UK and Upside's prediction that demand will grow in 2020. The Trump administration's extension of its immigration caps on H-1B visas won't help matters, and that's likely to be a political showdown as the administration tries to step up its anti-immigrant red meat efforts to solidify votes in the 2020 election; Stuart Anderson at Forbes says that those increased restrictions will be a story next year.

In-Memory Computing: I'm putting quantum computing aside in this post even though it was one of the biggest stories of 2019 and will probably continue to be discussed (but see this post saying it all might come to nothing). Something almost as mind-blowing is happening with in-memory computing, where you can store data in RAM across many computers and implement parallel processing that's 5,000 times faster than processing in individual computers. Deer points out that the "decreasing cost of memory technology" will popularize in-memory computing, augmenting real-time sentiment analysis, machine learning, and a host of AI aspirations. Just to pique your interest further, one system achieved a billion financial transactions per second using 10 commodity servers and equipment that cost less than $25,000.
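If "in-memory" sounds abstract, here's a minimal Python sketch of the idea under stated assumptions: the data set, the eight-way split, and the toy "sentiment" scoring function are all made up for illustration. The point is simply that the working data lives in RAM and the computation fans out across processes instead of waiting on disk.

    # A toy illustration of in-memory, parallel processing: the data set is held
    # in RAM and the work is split across worker processes. The data and the
    # "sentiment" scoring function are invented for this example.

    from multiprocessing import Pool
    import random

    def score_chunk(chunk):
        # Pretend sentiment score: the fraction of "positive" readings in the chunk.
        return sum(1 for x in chunk if x > 0) / len(chunk)

    if __name__ == "__main__":
        data = [random.gauss(0, 1) for _ in range(1_000_000)]   # lives entirely in memory
        chunks = [data[i::8] for i in range(8)]                 # split across 8 workers
        with Pool(processes=8) as pool:
            partials = pool.map(score_chunk, chunks)            # parallel, RAM-resident work
        print(sum(partials) / len(partials))                    # combined score, roughly 0.5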

Tech, mental health, and cybernetics: I also wonder about the ongoing discussion on technology and mental health. Two years ago, the Healthy Living Blog cited a Duke University study that aligned with the conventional wisdom of the time—that adolescent use of social media technology was associated with high ADHD symptoms. I've always been a little troubled by the ableism in these kinds of reports, but I found it hard to articulate my suspicions. Something about where you draw the line on the technological enhancement of communication, the fact that people treated telegraphs the way we treat social media now, and some other sentiments.

But that older Healthy Living Blog post also cited studies from the University of Michigan (decreased happiness), University of Gothenburg (depressive symptoms), and still more studies finding psychological withdrawal and "poor mental health" in general.  

Look for new voices to push ahead in the conversation in 2020, raising different concerns, including the ways in which social media can improve mental health. As a foreshadowing of this, in December, Jenna Tsui wrote of the mounds of narrative data rolling in, written by people with mental illness, lauding some platforms for making them "feel less alone by acting as a peer support mechanism." A Dartmouth study she points to analyzed 3,000 comments and found clusters of content on feeling less alone and coping with the fear of mental illness.

I don't think we need to limit that discussion to those with explicit, self-identified or diagnosed mental illnesses either, although those are important. I think these platforms offer peer support, validation, and connectivity in general, and as with any medium, it's important to weigh where they deliver those things and where they don't. The Dartmouth research is qualitative, so it's different from the more data-driven findings that raise concerns over adolescent tech use, but it opens up the door to a larger conversation about our cybernetic identity and evolution. I hope and expect this to be a deeper topic of discussion in 2020—maybe even combined with talk about the need to democratize and increase the transparency of platforms that are currently implicated in spreading false news through microtargeting.

Watching the watchers: Finally, 2020 should see continuing concern over surveillance technology like facial recognition and large-scale DNA database access. Facial recognition tech is still yielding confirmed results showing racial bias as of December of this year. Concern over police powers is not letting up. Even though the United States Supreme Court is becoming more conservative, two years ago in Carpenter v. United States the Court took the notable step of finding that there were Fourth Amendment issues in public surveillance, something it hadn't acknowledged before, having always bought the police's argument that there's no expectation of privacy in public. And lower courts have weighed in: last summer the Ninth Circuit held that "the development of a face template using facial-recognition technology without consent (as alleged here) invades an individual’s private affairs and concrete interests." So I'll be interested to watch the dynamic between leaps in the technology due to big data, and the legal debates that emerge.

It looks like 2020 will be a race between the good news and the bad news on the computing technology front. Happy New Year and may you live in non-interesting (or at least benevolent) times as much as possible!


Per-Vote Municipal Election Spending and Climate Change: Seattle Questions

Crises, divisions, and battle stakes are all accelerating. That's why it's increasingly important for political candidates to have good information on voters, using vendors like our client Accurate Append, an email and phone contact data quality vendor, to have accurate data for outreach. Despite the undeniable fact that you have to spend money to win elections, the dynamics and optics of that spending are also important.

You'll find a shorter post I wrote here from a couple of weeks ago, written while results were still being tabbed for the King County (Seattle) elections. There, I pontificated on the folly of Amazon and other corporations spending so much money on these local races—and losing most of them. But now that the King County results are finally tabulated, we can also springboard into a deeper and weirder discussion: what are these corporate stakeholders, and other donors, spending their money on? 

2019 was unprecedented: votes for "Egan Orion and Heidi Wills, two losing candidates who were backed heavily by big businesses like Amazon," cost nearly $59 and more than $50 per vote, respectively. That's much higher than the average for the 14 city council campaigns overall, at nearly $29 per vote, though even that average is a lot of money. We usually associate big election spending with national races, particularly presidential elections. But in so many ways, the local is more politically real than the national anyway. And as we'll see a bit later, municipal policies are going to make or break communities as the effects of climate change begin taking their toll, particularly in coastal states like Washington.

Not only do we associate spending with federal elections; we also tend to think more, talk more, and participate more in those national races, and such priorities don't actually serve us. Last year, Lee Drutman wrote an article for Vox lamenting that "America has local political institutions, but nationalized politics. This is a problem." It's a problem, Drutman says, because data indicates people consume far more national than local news, and behave accordingly, despite the fact that only 537 federal elected offices exist, compared to around half a million state and local electeds. That huge abstraction of political energy into a realm where individual votes matter far less than they do in mayoral or county commissioner races means that highly ideological and spectacle-oriented national political parties control public discourse—making it more about drama than actual policy. 

Drutman discusses Daniel Hopkins' recent book on the nationalization of political behavior. Hopkins' argument that the United States prioritizes "place-based voting" is even more provocative now that much of the world is shifting towards a more migratory existence. 

"Climate migrants" (who are not legally considered refugees, although this could change in the future of international legal activity) are those of us who have moved, are moving, or will move in response to weather events, food availability, resource conflicts, and other crises, and the numbers of them are going to grow exponentially in the coming decades. We have no idea how many people will be moving around the world, but we have good reasons to think it will be more than we end up estimating. The movement will take place both from country to country (or, alarmingly, from country to permanent nomadism), and within countries. The number of people completely abandoning their part of the world is likely to be in the hundreds of millions over the next century at the very least. 

In June of this year, the Center for Climate Integrity released a report showing that Washington would bear the highest cost of all West Coast states in protecting and rendering sustainable those communities most likely to suffer from the climate crisis. "Beyond laying out broad cost estimates, the report also questions who will foot the bill for climate adaptation." This debate generally consists of folks on the left saying that fossil fuel companies ought to bear those financial costs, and those on the right continuing to argue against redistributive regulation. 

Seattle's city council recently passed a resolution committing the city to one of the strongest localized Green New Deals in the country, requiring drastic emissions reductions "while increasing affordability for low-income families." Under this vision, the city will be carbon emissions-free by 2030, will invest in neighborhoods that have been historically marginalized and unfairly hit with the worst environmental harms in the past, and will confer with indigenous people and tribal nations on climate policy. We can expect future city council decisions, at least for the foreseeable future, to do more of this boundary-pushing.  

But a curious part of this, one which I don't think has been examined politically or philosophically, is the tension between the U.S. being home to a "politics of place," to paraphrase Hopkins, and the likelihood that people may not be staying long, or much longer at least, in those municipalities and surrounding greater city areas if staying there is financially or physically hazardous because of climate change. Here is where we approach a very weird convergence of local politics, a national anti-corporate zeitgeist, and deeper philosophical questions of the cost and long-term consequences of digging our heels in for or against public spending. Consider just the Seattle race. 

First, consider that it was won by the left in a come-from-behind victory, at least perceptually. A few days before the election began, and even in analyses of the initial (and ultimately misleading) results, critics of Seattle's "progressive-socialist" coalition government were predicting that the scare tactics and promises of "responsibility" by the "moderate Democrats, neighborhood groups, and public-safety unions," along with Amazon & Co.'s big bags of money, would pay off. But just before the election began, Amazon threw another $1 million to the Chamber of Commerce's political action committee. Then the race turned into a referendum on corporations buying elections, and that, combined with a base much more loyal than the skeptics had supposed, resulted in the progressives and socialists pretty much running the table.

Second, imagine if Amazon's candidates of choice had won in November 2019. Those candidates generally had more negative views of taxes, and more neoliberal views on which entities (if any) ought to be providing services to the public. A slate of candidates backed by a big business might disfavor the tax aspirations of the current Seattle city government. Consider that this outcome might shift the course of Seattle's long-term population trends. What if Amazon hadn't made that strategic error and had instead genuinely changed the composition of the city government? It's possible a more "business-friendly" municipal government would reverse course on some of the public-oriented climate response policies. And, over the next 50-ish years, that relative diminishing of public climate mitigation actions, now replaced by either wishy-washy private initiatives or nothing at all, might further drive emigration from King County. Then the per-vote-spending becomes even more surreal. Or, those in favor of strong corporate influence might have been proven right, the private sector offering better climate adaptation solutions to the city. 

Third, consider what would happen if voter turnout in local races were to increase. Right now about 1 in 5 people vote in exclusively local elections—a figure much lower than voter turnout in national elections. The collection of voter data, a project taken on by institutes such as Pew Research, can actually increase voter turnout if the research, and contact with research subjects, begin before election day. And there's also some misreporting by respondents who want to appear more engaged than they actually are, and who will report they'd voted when they hadn't.

So we dip into a double irony when we think about how absurd it is for municipal candidates to rely on spending that hits $50 or more per vote gained, but also how such an investment is truly an investment in votes rather than in the residents themselves, who, depending on the policy orientation of local leaders, may end up being literal climate refugees or another category of municipal expat.


Why Educating Policymakers Is Not Enough

There's an interesting new article up at Texas National Security Review on policymaking, and specifically on policy "competence." Titled "To Regain Policy Competence: The Software of American Public Problem-Solving," the article laments the decline of policy education at the highest levels of elite and civil service-based training in governance, and its author offers a comprehensive set of reforms in university training of policymakers. 

It's understandable that this has become a concern: the so-called populism that we are told is driving the current political moment brings with it (not in every instance, but in more than a few) a thumbing of the nose at policy expertise. The author of the TNSR article, Philip Zelikow, says there's been a decline in policymaking skills going back much further than the past three years; I'll leave it to others to weigh that data.

Whatever the case, the author is optimistic that "[t]he skills needed to tackle public problem-solving are specific and cultural — and they are teachable." Zelikow uses the metaphors of "hardware" and "software" to describe the tools required for nations and governments to implement good policies. Hardware is the structure of government (and, I would imagine, the material things needed to carry out policies). Software is the culture of education and decisionmaking that goes into training, acclimating, and facilitating policies and policymakers. 

Good software, Zelikow says, can compensate for bad hardware. "For instance," he writes, "amid all the public controversies about law in America, the United States still does reasonably well upholding the rule of law and the administration of justice. Why? One reason is because the American legal profession has established very strong norms about what constitutes appropriate legal reasoning and quality legal research and writing. This is software." It's true that in spite of crisis after crisis at national and local levels, we still haven't seen the total breakdown of law and order in society. But, as I will suggest below, the reason policy regimes succeed isn't just the respect that legal practitioners have for the standards of their profession: it's also that constituents still trust most of the fundamentals of the rule of law. If they didn't, all the legal training in the world wouldn't be enough to compel the non-legally trained people to obey the law. 

But the bottom line in Zelikow's article remains lack of training, or improper training, as the cause of current policy incompetence. He has a good point, particularly historically, that such training matters. It was training, he writes, not only in the philosophy of civic virtue but in the incorporation of the right amount of technical knowledge into policymaking, that helped the Allies win the Second World War: "The Allied successes included extremely high-quality policy work on grand strategy, logistics, and problem-solving of every kind. The German and Japanese high commands were comparably deficient in all these respects." 

So I also agree with Zelikow that the evolution of this broad-based approach into mere "economics, statistics, and quantitative analysis" in the latter half of the 20th century was an unfortunate descent. But I would offer up that the comprehensive curricular reform suggested in that article will itself simply devolve into hyperspecialization eventually without the check of citizen deliberation. 

Constituents and stakeholders are also hardware, and building a culture of interaction and participatory democracy is software. One important piece of the deliberative model is its distinction between constituents and consumers. As Tony Greenham writes, "Consumption implies a passive acceptance of what is on offer. Although we have a choice, it is within a narrow envelope of options within a fixed system. In contrast, citizenship brings a sense of ownership and agency over the system itself." The bottom line is that deliberation creates a better policy because those who are affected by the policies get to have a say in their creation and implementation. 

Deliberation in the form of proactive constituent engagement can also serve as a check on groupthink. A great case study about the dangers of groupthink is the tragic explosion of the Space Shuttle Challenger in 1986. And that particular manifestation of groupthink, inspired as it was by the imperative of the "political victory" of NASA looking good and the Reagan administration being able to celebrate a technological victory, suggests that an exclusive focus on better policy education on the part of leaders is not enough.

Engineers from Thiokol had teleconferenced with NASA management about 12 hours before the launch of the Challenger. They expressed concern that low temperatures at the launch site, cold enough to form icicles, threatened the integrity of the O-rings they had already been worried about. NASA rejected those concerns and judged the risks acceptable.

Of course, groupthink results in thousands of mini-disasters, and occasionally bigger ones that mirror the magnitude of a deadly spaceflight disaster, in the worlds of national, state, and local policymaking. Even teaching leaders to question themselves, or organizational structures to question their own conclusions, may not be enough. What is needed is the perspective of those "outside" the decisionmaking body who aren't really "outside." This means constituents, stakeholders. Creating two-way, multi-way, collaborative platforms for constituent communication hardwires deliberative disruption. Disruption is the nutrient of democracy. 

So while I think it's great that we're talking about better leadership training, more robust civil service education, and creating holistic education for those who draft and implement policies, we also have to keep talking about the points of engagement between them and those they serve. That's why software companies build government CRM solutions to manage these constituent communications at scale.


Doing Data & Education Well

Done correctly, big data can make education better. Pritom Das just posted three significant changes in education policy and practice driven by data innovation. They include personalization (tracking student behaviors and preferences to develop customization technology), increased participation, and better measurement of performance. I think the most promising of those is participation. Like personalization, the idea is to develop platforms that customize with (not just for) the student. 

The most concerning and complicated of these is surely performance assessment. It has the most potential for unreflective, mostly quantitative assessment of student performance. And if the collection of data is cold and impersonal in certain contexts, it doesn't matter whether the data is being accumulated slowly or quickly.

The motive of many education policymakers is undoubtedly to use performance data to improve learning systems, but each factor, including temporal questions like how long students take to answer test questions, how many times they read a text or watch a video, or whether they go over learning materials many or few times, obscures why different students would choose to do these things. I may go back over the material multiple times for reasons other than a deficit in immediate understanding; I may gain a deeper, more synthesized understanding precisely because I go over something more than once intentionally.  

It's really a matter of what questions are asked and who is studying the answers, of course, but too much emphasis on how long it takes a student to learn something (honestly, do we care about temporal efficiency that much? are we that Fordist?) risks less attention being paid to questions like social class, where mountains of research (enhanceable by good metadata, of course) demonstrate that "children’s social class is one of the most significant predictors—if not the single most significant predictor—of their educational success." The research also shows that these class differences affect kids at the earliest stages of education and then "take root" and "fail to narrow" over time⁠—creating large gaps in cognitive and noncognitive development that translate into hundreds of thousands of dollars and countless quality-of-life factors. It's really a terrifying indictment of the American education system.

The concern is that because of the emphasis on speed of learning or the desire to complete steps, lots of otherwise good educational advocates stop asking tough questions about data-driven education. And lots of seemingly good outcomes may also be suppressing individuality, papering over class, race, or gender inequality, and failing to meet kids' and families' needs.

These concerns, and a few more, were expressed in a post last summer by Rachel Buchanan and Amy McPherson at the Australian site EduResearch Matters, provocatively titled "Education shaped by big data and Silicon Valley. Is this what we want for Australia?" The tying of potentially beneficial technology to a particular business interest (whatever that interest may be) evokes a frustration that we have lost a compass for tech serving the common good. The authors point out that the "products" being introduced measure "not just students' learning choices but keystrokes, progression, and motivation. Additionally, students are increasingly being surveilled and tracked." The post quotes a poem by education blogger Michael Rosen:

"First they said they needed data

about the children

to find out what they’re learning.

Then they said they needed data

about the children

to make sure they are learning.

Then the children only learnt

what could be turned into data.

Then the children became data."

I think many folks I know who think about big data a lot would like to see a world where we used it to improve education in the right ways and not think so much about students en masse in these kinds of cause-and-effect relationships. One solution, given that the data genie isn't going back into the bottle (and has the potential to help fight inequality while also building individuality) is to teach students precisely what's happening to them, to pull back the curtain, to show them the gears and scaffolding of education policy itself⁠—as well as its quantitative assessments. I mean things like teaching middle school students, for example, how AI works ideologically, not just technically. This is the focus of a suggested curriculum outlined by a professor and two graduate students at MIT, "Empowering Children through Algorithmic Justice Education." 

The proposal calls AI education "an issue of social justice - it is not enough to teach AI or machine learning as a technical topic." It cites the findings that "neutral" data can actually be biased, requiring the teaching of ethics "in conjunction with AI as a technical topic." The question to be asked: who are we building this technology for? Ongoing efforts to examine and develop industry ethics in quantitative data are also important and encouraging.

Whether you feel like data should be idealized to be objective and neutral, or seen as reflective of human biases rather than overcoming them, watching videos like this will make you think about what kind of world we want to build with the literal quantum leaps we're making in the field. 

And as you might imagine, such ethical and social questions also haunt the use of data in political and issue campaigns, including the ways we append that data with additional information using vendors like our client Accurate Append, an email and phone contact data quality vendor. We should always be self-reflective⁠—and other-reflective⁠—in the way we ask even seemingly neutral demographic or profile questions.


Big (and Fast) Quantum Data

"Of all the theories proposed in this century," physicist Michio Kaku wrote in Hyperspace, "the silliest is quantum theory. In fact, some say that the only thing that quantum theory has going for it is that it is unquestionably correct." 

Last month, our client Accurate Append, an email and phone contact data quality vendor,  blogged about a big space data conference, and described promising developments in the use of data analytics to measure greenhouse gases based on satellite imagery, to identify organic molecules and methane on other planets and moons (critical to the search for the origins of life) and more.

But how about a deeper dive into something even more complex? The particles (like photons and electrons) that make up the substance of the universe behave in really strange ways if we look at them closely. They have a "reality" very different from classical descriptions of matter as stable and consistent. Understanding that strange behavior—and then even harnessing it, or flowing along with it—is the challenge of applying quantum theory, and this has world-shattering implications for big data and artificial intelligence, to say the least.

It really depends on who you ask, of course, whether this is a good thing. Shouldn't we be able to break codes used by criminals or terrorists? We may be heading into a brave new world where security and insecurity co-exist along with the on, off, and on-off of quantum states. An expert in "Ethical Hacking" told Vice back in 2014 that "the speed of quantum computers will allow them to quickly get through current encryption techniques such as public-key cryptography (PK) and even break the Data Encryption Standard (DES) put forward by the US Government."

In the most oversimplified of nutshells, quantum computing goes beyond the binary on/off states computer bits normally operate under, adding the additional state of on and off. The main consequence of this third state is that quantum computers can work "thousands of times faster" than non-quantum computers—beyond anything that could be otherwise imagined. That speed also adds to the security of quantum data. Experts call it un-hackable, which is pretty audacious.

Some of the basic everyday miracles of quantum physics also make their way into quantum computing, like "entangled" particles being changeable together even if they are far apart. This provides a way of essentially "teleporting" data—transferring it without fear of being intercepted. Since there is no signal, there is nothing to intercept or steal. To put it in even simpler terms, the data being shared is, in a sense, the same data. It's like existing at two distinct points in the universe simultaneously but only as one unit. More precisely, you've created a pair "of photons that mirror one another." Chinese scientists seem to have taken the lead on the un-hackable quantum data plan.

This indeterminacy leads to the possibility that many of the "laws of science" we take for granted are just arrangements of things in relation to other things. Gravity itself, and many of the behaviors of space and time, might actually be "correcting codes" at the quantum level.
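For readers who want the "on and off at once" idea in concrete terms, here's a minimal sketch using plain NumPy rather than any quantum SDK. The states shown are standard textbook examples, not anything specific to the systems discussed above.

    # A qubit's state is a pair of complex amplitudes; measurement returns 0 or 1
    # with probabilities given by the squared magnitudes of those amplitudes.

    import numpy as np

    # An equal superposition of |0> and |1>: "on and off" until measured.
    qubit = np.array([1, 1]) / np.sqrt(2)
    print(np.abs(qubit) ** 2)        # [0.5 0.5]

    # A two-qubit Bell pair, with amplitudes for |00>, |01>, |10>, |11>.
    # Only 00 and 11 have any probability: measuring one qubit fixes the other,
    # which is the "entanglement" described above.
    bell = np.array([1, 0, 0, 1]) / np.sqrt(2)
    print(np.abs(bell) ** 2)         # [0.5 0.  0.  0.5]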

Qubits, which are these nonbinary computer bits we're talking about, can be made by superconductors that maintain a quantum state. This requires extremely cold temperatures—close to absolute zero, colder than the vacuum of space. Underlying these miraculous evolutionary steps is quantum theory's embrace of "imprecision" in a computing world that has mostly relied on precision and predictability. This makes quantum theory natural kin to artificial intelligence, since AI aspires to teach computers how to "handle" and process imprecision.

In some ways, embracing imprecision in computing technology is similar to the implications of philosophers rejecting binarism in the 19th and 20th centuries. Georg Wilhelm Friedrich Hegel, for example, in the early 19th century, developed the dialectic to do justice, as many of his interpreters have put it, to the reality of the half-truth, to the idea that things may be in a state of development where they are neither and both one thing and/or another. In a very different way, the Danish theologian Soren Kierkegaard sought the rejection of absolutes and the embrace of absurdity, a kind of simultaneous being-and-non-being. Werner Heisenberg, one of the founders of quantum theory, seemed more like a philosopher than a scientist when he wrote "[T]he atoms or elementary particles themselves are not real; they form a world of potentialities or possibilities rather than one of things or facts."

The implications for big data are immeasurable because quantum computing is to nonquantum computing what the speed of light is to the speed of sound. "Quantum computers," says Science Alert, "perform calculations based on the probability of an object's state before it is measured - instead of just 1s or 0s - which means they have the potential to process exponentially more data compared to classical computers." All of this culminates in Lesa Moné's post on quantum computing and big data analytics. With quantum computers, Moné writes, the complex calculations called for by such analytics will be performed in seconds—and we are talking calculations that currently take years to solve (and are sometimes measured in the thousands of years). Quantum calculations will change the very nature of how we view the interaction of time and information processing. It's something on par with the discovery of radio waves, but given that we'll be crunching years into seconds, the social impact may be much, much larger.


Data, Election Hacking, and Paper Ballots

Thomas Paine wrote that "the right of voting for representatives is the primary right by which other rights are protected." Taking away that right reduces us "to slavery" and makes us "subject to the will of another." Regardless of whether you're on the left or right, Americans value that kind of autonomy⁠—that we choose the rule makers and enforcers, that we periodically get to choose new ones and whether to retain incumbents. How to protect the integrity of that process, so that its outcomes actually reflect our conscious preferences, seems to be as important a question as any that law and policy makers could ever ask in a democracy.

Data and its processing are commodities and conduits of power, and because of this, there will always be attempts to steal and manipulate them. Our SEO client Accurate Append, a phone, address, and email contact data vendor, recently wrote about the danger of fake data, and hacked elections are a manifestation of the same overall aim: to distort the will of voters and undermine people's participation in civil society.

For people who work on improving the electoral process, and those of us offering services for candidates and leaders to better reach voters, this is also a question of professional importance. It's personally frustrating for those who offer data, analytics, and other informational services that help campaigns learn more about their constituents. Hacking undermines the effort to construct electoral strategies commensurate with the needs and perspectives of the real people who are voting.

News media is buzzing that Senate Majority Leader Mitch McConnell is blocking legislation to address election hacking at the federal level. And states are not moving either. The Verge reports that although progress has been made on moving back to paper ballots (only 8 states remain paperless now compared to 14 in 2016), "most states won’t require an audit of those paper records, in which officials review randomly selected ballots⁠—another step experts recommend. Today, only 22 states and the District of Columbia have voter-verifiable paper records and require an audit of those ballots before an election is certified." As we'll shortly explain, that extra verification step is necessary because even paper ballots are vulnerable to manipulation.

Much of what we know about the ability to hack into things like elections is due to the efforts of organized hacking conferences like DefCon, which gathers experienced hackers to discuss the ways that security may be open to breach.

The work they do, which was featured at a recent annual conference, is fascinating. In addition to election hacking, which we're discussing here, hackers and scholars of hacking research things like whether AI and robots are subject to sabotage via electromagnetic pulse (EMP). It all feels very James Bond. But the most immediately relevant stuff is voting machines, systems, and databases⁠—all set up as the "Voting Village" at the conference, with the aim of promoting "a more secure democracy." These "good guy" hackers find ways to remotely control local voting machines, "the innards of democracy," so that the public can be aware of potential threats and constantly demand solutions. As one hacker put it, "these systems crash at your Walmart scanning your groceries. And we're using those systems here to protect our democracy, which is a little bit unsettling. I wouldn't use this . . . to control my toaster!"

This work is important even if all states were to switch back permanently to paper ballots, because some kind of technological facilitation, intervention, and processing is inevitable, and as long as such activity contains data shared between machines, it's subject to outside sabotage or manipulation. Freedom to Tinker has an outstanding list of the ways this could happen in a paper system. Hackers could hack the software used in the auditing and recount processes, and avoiding the use of computers during that process is impractical in contemporary society. "For example, we may have to print a spreadsheet listing the “manifest” of ballot-batches, how many ballots are in each batch; we may use a spreadsheet to record and sum the tallies of our audit or recount. How much of a nontrivial “business method” such as an audit, can we really run entirely without computers?"
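To see how quickly software creeps back in, here's a minimal sketch of the kind of batch-reconciliation step the quote describes. The batch names and counts are invented for illustration; the point is that even this "spreadsheet" chore is usually code somewhere.

    # Reconciling hand-counted batch tallies against a ballot manifest.
    # All names and numbers here are made up for illustration.

    manifest = {"precinct-1-box-A": 412, "precinct-1-box-B": 388, "precinct-2-box-A": 530}
    hand_tallies = {
        "precinct-1-box-A": {"Candidate X": 230, "Candidate Y": 182},
        "precinct-1-box-B": {"Candidate X": 190, "Candidate Y": 198},
        "precinct-2-box-A": {"Candidate X": 261, "Candidate Y": 269},
    }

    totals = {}
    for batch, counts in hand_tallies.items():
        # Every batch's tally has to add up to the manifest's ballot count.
        assert sum(counts.values()) == manifest[batch], f"{batch} doesn't reconcile"
        for candidate, n in counts.items():
            totals[candidate] = totals.get(candidate, 0) + n

    print(totals)   # {'Candidate X': 681, 'Candidate Y': 649}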

Of course, as the article adds, one could simply manually manipulate the recount process with a "bit of pencil lead under your fingernail," but at least at that point there would be people on hand locally to catch such things. The call for "careful and transparent chain-of-custody procedures, and some basic degree of honesty and civic trust" is easier to enforce in person than across bytes, air, and cables. Paper ballots aren't perfect, in other words, but they are the cornerstone of addressing current threats to election integrity.


Healthcare Staffing Is the New Data Commodity

If you have nursing credentials, and are willing to travel to meet the ever-shifting (and ever-growing) demands of healthcare providers, chances are that your contact information will be part of varying bundles of data bought, sold, or traded by nurse recruiting websites. This isn't necessarily a bad thing, and it's a subject I think about a lot since my SEO client Accurate Append is in the business of providing the most accurate email, cell phone, and landline contact data.

Healthcare professionals are the gold standard of the contemporary tight professional labor market. Healthcare CEOs list their biggest rising expense as the money they spend competing for talent. Hiring rates are incredibly high now and are expected to either stay the same or even grow in 2019. According to the "Modern Healthcare CEO Power Panel survey," about 75 percent of CEOs responded that "front-line caregivers are most in need." No wonder the unemployment rate for practitioners is only 1.4%, while the rate for assistants and aides, higher at 3.4%, is still well below the typical 4% unemployment rate. If you want to be in demand, be a healthcare professional.

And if you want to be in even higher demand, be a Registered Nurse (RN) who can travel. These stats are unbelievable. RN employment overall will grow 15 percent over the next 8 years, higher than pretty much any other profession. Travel nurses are hired on contract to fill temporary gaps in nursing, and the industry is benefiting from increasing participation by states in the Enhanced Nurse Licensure Compact (eNLC).

The recruiting game for these (often highly specialized) travel nurses is at full steam. Kyle Schmidt of the Travel Nursing Blog has a fascinating piece on the emerging market for the personal contact information of nurses who make themselves available for travel assignments. This demand is met by gathering data from prominent services like travelnursing.org, travelnursesource.com, rnvip.com, and others, but it's also met by more traditional healthcare staffing companies which, although they don't sell contact information to their partners, simply generate their own leads to fulfill their clients' needs.

When this data is collected via the web, it's done through website visits where visitors are encouraged to provide their contact information. Those sites are using old-fashioned, but reliable, methods of getting people there: Kyle cites stats from Conductor, a digital marketer, showing that "47% of all website traffic is driven by natural search while 6% is driven by paid search and only 2% is driven by social media sites." Paid advertisements are virtually ignored in comparison to naturally clicked links, while "75% of search users never scroll past the first page of search results."

How are nurses convinced to provide their contact info? The answer is through the advertisement of "broadcast services,"  which promise to provide candidates' information and availability to agencies. The business model works even if the broadcast service doesn't get a lot of money for providing the data, because the web sites are very basic, often managed through content-management systems, and don't need to be heavily maintained.

This steady (and often high-speed) increase in recruiting needs is part of maybe the longest-term employment trend in the U.S. today. Hospitals have been using contract labor to fill in for massive nursing labor shortages at least since the late 1990s. Over a million and a half jobs were added from 2004 to 2014, and as we know, these vacancies kept growing in the last five years. A June 2019 market research report sheds some interesting light on the healthcare recruiting industry, including some facts that might explain why the recruiting game for nurses seems so data-driven (and dependent on contact info-fishing at such a volume-driven level). While we know the labor market is competitive, what we do not know is whether particular states and regions will have consistent demand for nurses. This is because, while we know government spending on Medicare and Medicaid is expected to increase in 2019, the ongoing political volatility around healthcare spending means that there may be bumps in the road, unexpected windfalls in unexpected places, and unexpected losses of funding as well.

All of this leads me to ask whether contact info data collection and distribution for the nursing industry might be streamlined and made more efficient over time. We know that hospitals' human resources departments "use analytics for recruiting, hiring and managing employees," but there's a divide between how big and small facilities collect and manage this data. According to the Nurse.com blog, big hospitals "have sophisticated human resource management systems for employee records and talent management, while small facilities and clinics often rely on free analytics tools such as job sites, email clients and search engines," analytics tools provided by sites similar to the data collection and distribution sites mentioned earlier.