With “big data” come big risks

Cartoon showing people considering crossing the valley of big data

Prebabble: Sound research is backed by the scientific method; it’s measurable, repeatable and reasonably consistent with theory-based hypotheses. Data analysis is a component of scientific research but is not scientific by itself. This article provides examples of how research or summary conclusions can be misunderstood through the fault of either the reviewer or the researcher, especially when big data are involved. It is not specific to psychological research, nor is it a comprehensive review of faulty analysis or big data.

When I was a grad student (and dinosaurs trod the earth), four terminals connected to a mainframe computer were the only computational resources available to about 20 psychology grad students. “Terminal time” (beyond the sentence that was graduate school) was as precious and competitively sought after as a shaded parking spot in the summer. (I do write from the “Sunshine State” of Florida.)

Even more coveted than time at one of the terminals, data from non-academic sources were incredibly desirable and much harder to come by. Gaining access to good organizational data was the “holy grail” of industrial-organizational psychology dissertations. Whenever data were made available, one was not about to look this gift horse in the mouth without making every effort to find meaningful research within those data. Desperate but crafty grad students could wrench amazing research from rusty data.

But some data are rusted beyond repair.

One of my cell-, I mean, classmates came into possession of a very large organizational database. Such a database would ordinarily be the envy of those of us without data, but not in this case. It was well known that this individual’s data, though big, were hollow; a whole lot of “zeroes.” To my surprise and concern, this individual seemed to be merrily “making a go of it” with their impotent data. Once convinced that they were absolutely going to follow through with a degree-eligible study (that no one “understood”), sarcasm got the best of me: “Gee, Jeff (identity disguised), you’ve been at it with those data for some time. Are any hypotheses beginning to shake out of your analyses?”

“Working over” data in the hope of finding a reasonable hypothesis is a breach of proper research practice and clearly unethical, whether one knows it or not. But it happens – more today than ever before.

"Big data" has become the Sirens’ song, luring unwitting (like my grad school colleague) or unscrupulous prospectors in search of something, anything, statistically significant. But that’s not the way science works. That’s not how knowledge is advanced. That’s just “rack-n-hack” pool where nobody “calls their shots.”

It isn’t prediction if it’s already happened.

The statistical significance (or probability) of any “prediction” made in relation to a given (already known) outcome is always perfect (hence, a “foregone conclusion”). This is also the source of many a superstition. Suppose you win the lottery by betting on your boyfriend’s prison number. To credit your boyfriend’s prison number for your winnings would be a mistake (and not just because he may claim the booty). Neither his number nor your choice of it had any influence in determining the outcome, even though you did win. But if we didn’t care about “calling our shots,” we’d argue for the impossibly small odds of your winning ticket as determined by your clever means of choosing it.

This error of backward reasoning is also known by the Latin phrase post hoc, ergo propter hoc, or “after this, therefore because of this.” It isn’t valid to infer a cause from its effect. Unfortunately, the logic may be obvious, but the practice isn’t.

Sophisticated statistical methods can confuse even well-intended researchers who must decide which end of the line to put an arrow on. In addition, the temptation to “rewind the analysis” by running a confirmatory statistical model (i.e., a “calling my shot” analysis) AFTER a convenient exploratory finding (i.e., “rack-n-hack” luck) can be irresistible when one’s career is at stake, as is frequently the case in the brutal academic world of “publish or perish.” But doing this is more than unprofessional; it’s cheating and blatantly unethical. (Don’t do this.)
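The lure of “rack-n-hack” prospecting is easy to demonstrate. The short Python sketch below is purely illustrative (every variable in it is fabricated random noise): it correlates 100 columns of noise with an equally random “outcome.” At a .05 significance threshold, roughly five of them will look “significant” by chance alone, which is exactly the kind of finding a data-dredger would triumphantly report.

```python
import math
import random

random.seed(42)  # fixed seed so the demonstration is repeatable
n, k = 50, 100   # 50 "subjects", 100 candidate predictors

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# The "outcome" and all "predictors" are pure noise: no real effects exist.
outcome = [random.gauss(0, 1) for _ in range(n)]
noise_vars = [[random.gauss(0, 1) for _ in range(n)] for _ in range(k)]

# Approximate two-tailed .05 critical value for r at this sample size
critical = 1.96 / math.sqrt(n)
hits = sum(1 for v in noise_vars if abs(pearson_r(v, outcome)) > critical)
print(f"{hits} of {k} pure-noise variables look 'statistically significant'")
```

Run enough uncalled shots and some will always drop; the only honest fix is to state the hypothesis before touching the data.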

Never before has the possibility of bad research making news been so great. Massive datasets are flung about like socks in a locker room. Sophisticated analyses that once required an actual understanding of the math in order to do the programming can now be done as easily as talking to a wish-granting hockey puck named “Alexa.” (“What statistical assumptions?”) Finally, publishing shoddy “research” results to millions of readers is as easy as snapping a picture of your cat.

All of the aforementioned faux pas (or worse) concern data “on the table.” The greatest risk when drawing conclusions from statistical analyses – no matter how ‘big’ the data are – is posed by the data that AREN’T on the table.

A study may legitimately find a statistically significant effect on children’s grades based on time spent watching TV vs. playing outdoors. The study may conclude, “When it comes to academic performance, children who play outside significantly outperform those who watch TV.” While this is a true summary of the data, the causality of the finding is uncertain; some unmeasured third factor (say, parental involvement) could be driving both.
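The TV-vs-outdoors scenario can be simulated in a few lines. In this sketch, everything is fabricated for illustration: a hidden “parental involvement” factor drives both outdoor play and grades, and the two end up strongly correlated even though neither causes the other.

```python
import math
import random

random.seed(0)
n = 1000  # simulated children

# Hypothetical hidden factor (e.g., parental involvement) that drives BOTH
# variables below; outdoor play and grades never influence each other here.
confounder = [random.gauss(0, 1) for _ in range(n)]
outdoor_play = [c + random.gauss(0, 1) for c in confounder]
grades = [c + random.gauss(0, 1) for c in confounder]

def pearson_r(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson_r(outdoor_play, grades)
print(f"correlation between outdoor play and grades: {r:.2f}")
# The correlation is substantial, yet by construction no causal link exists.
```

A researcher seeing only the play and grades columns, with the confounder off the table, could easily mistake this association for causation.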

To further complicate things, cognitive biases work their way into the hornet’s nest of correlation vs. causation. In an effort to ease the burden on our overworked brains, correlation and causation tend to get thrown together in our “cognitive laundry bin.” Put bluntly, in our minds, correlation becomes causation.

Although it’s easy to mentally “jump track” from correlation to causation, the opposite move, from causation to correlation, carries no such risk: where there is causation, there is always association.

Cigarette makers were “Kool” (can I get in trouble for this?) with labeling that claimed an ‘association’ between smoking and a litany of health problems. They were not so “Kool” with terminology using the word “causes.”

Causal statements trigger a more substantial and lasting mental impression than statements of association. “A causes B” is declarative and signals “finality,” whereas “A is associated with B” is descriptive and signals “probability.” Depending on how a statement of association is positioned, it can very easily evoke an interpretation of causation.

Sometimes obfuscation is the author’s goal; other times it’s an accident. Either way, the result is misleading (at best) when our eyes for big data are bigger than our stomachs for solid research.

Why Personality Inventories Don’t Tell the Whole Story

Self Check - Personality Inventories

The vast majority of personality inventories rely on “self report” for their input. Quite simply, individuals assess themselves on what I’ll call the “first level.” Since I refer to a “first level,” there obviously must be at least one more level. There is, and it’s a level of assessment that individuals can NOT provide by themselves, no matter how good the inventory or how “truthfully” the individuals respond to it. Therefore, personality assessments don’t tell the whole story.

You don't know yourself as well as you think you do. How can we assume that even the best personality inventory completed by oneself would know you any better?

This doesn’t mean personality assessments aren’t useful (or ‘valid’ in scientific terms); it simply means that there’s more to a person’s story than they can reveal via any series of questions in a personality inventory. This goes for ALL personality inventories, some more than others, but none can overcome the limitations of self-assessment. In short, you don’t know yourself as well as you think you do. How can we assume that even the best personality inventory completed by oneself would know you any better? This is where an expert in psychology comes in handy. To get the best understanding of an individual, an expert in psychology and psychological assessment can help to ‘fill in the gaps’ that we ALL leave in our own account of our personality.

Although many psychologists would agree with this point, and could offer varying degrees of scientific support for it, Sigmund Freud developed a theory of personality that serves it especially well. Freud’s theory is grounded in the way he described the structure of the human psyche. This structure includes three components: the Id, Ego and Superego. Setting aside the details of each of these components, Freud also developed the concepts of Consciousness and Unconsciousness (although he wasn’t the first to describe them). Almost everyone has some familiarity with these terms, even if not exactly in the way that Freud defined them.

Consciousness has to do with one’s direct awareness of one’s thoughts, feelings and behaviors. We can fairly accurately describe things that we experience while in a conscious state. Unconsciousness is the other ‘side’ of ourselves; the side to which we do not have direct access and therefore do not readily understand or recognize. As such, we are unable to describe things that exist in our unconscious mind, even though it is constantly at work.

I could stop here and have a pretty good case for why self-report assessments don’t tell the whole story. They don’t include our unconscious self and our unconscious self has a big impact on who we are.

But there’s more.

Freud also described how the conscious and unconscious aspects of our personality work together. I’m not going to go into great detail here except to say that the unconscious mind significantly influences our thinking, feeling and behavior. And it's far more influential than most think.

Here’s a simple example of how unconscious behavior reveals itself in our daily lives: tying your shoes. This is an activity we perform virtually every day, but odds are you can’t tell me how you do it. We’ve done it so much that it’s become “automatic”; we do it without thinking. And the shoes are just one example: behaviors that we repeat often tend to become “automatic” in this way. Automatic behaviors require very little (if any) thought, and, true to unconscious behavior, we have a hard time recalling or describing them. (A nice benefit of automatic behavior is that it uses almost no mental resources, which leaves plenty of resources to attend to other matters – aka multitasking.)

Automatic behavior is just one way in which unconsciousness affects who we are. Unconsciousness also affects our thinking and feeling. In short, we are very significantly influenced by psychological processes that we aren’t even aware of. Others may note these influences (or their outcomes in our behavior) but we don’t. Things we say may be very apparent to others, but pass completely unnoticed by ourselves. For example, some individuals have a habit of repeating various phrases (usually “filler” words) without any awareness. You may know someone who repeatedly says “at the end of the day”, “you know what I mean?”, “um”, “actually”, or any of a cast of phrases that are “thrown in” to the conversation but add no value. Even if they are partially aware that they say these things, they have no idea how frequently they do it – unless you record them and play it back to them. In addition, people are very poor judges of how much they talk (vs. listen). You can test this with a friend, but I must warn you that this is almost never appreciated. Test at your own risk.

These are some simple ways in which our unconscious mind affects our behavior without our awareness. But that’s not all. There are even more “active” ways that our unconscious mind affects us, ways that can confuse, or even mislead, an accurate assessment of ourselves (as actors) AND of others (as observers).

Freud also developed the concept of “defense mechanisms.” In short, these are ways of thinking and behaving that counteract a thought or memory that is bothering us at an unconscious level. One such example is called “reaction formation.”

Reaction formation is the term Freud used to describe the unconscious, and extreme, change of thought and behavior resulting from one’s unconscious need to (over)compensate for previous behavior that the individual now considers offensive. By way of reaction formation, the individual unconsciously undergoes a radical transformation wherein the behavior or attitude they once held suddenly becomes hyper-offensive and disgustingly deplorable, in others! Smoking is often given as an example. Former smokers sometimes become the loudest and most assertive critics of those who smoke. Freud theorized that by engaging in overcompensating behavior, one clears up or avoids the unconscious tension one experiences by virtue of having been a former transgressor.

Other forms of defense mechanisms include denial (unwilling or unable to accept the truth because of the psychological harm it causes), projection (attributing one’s own intolerable thoughts or problems to another so as to ‘shift blame’), repression (a less extreme variant of denial that involves pushing one’s hurtful thoughts or feelings into the unconscious self so as not to deal with them directly). And there are others.

Scores on personality assessments may be radically different from what an objective assessment would reveal.

The point is, not only are we largely unaware of our most frequent behaviors (automatic behavior), but our psyche is constantly at work trying to protect us from threatening thoughts, feelings or behaviors (defense mechanisms). As a result, scores on personality assessments may be radically different from what an objective assessment would reveal. And this isn’t because the respondent is lying; they really believe that they are accurately describing themselves. There are many other factors that distort our valid understanding of ourselves; these are just two of the most common.

An expert in psychology and psychological assessment can identify these and other unconscious influences on behavior, and consequently on scores on a self-report personality inventory. Sometimes this can be done merely by noting unusual or telling patterns in the individual’s responses to a reputable personality assessment, but frequently it requires the collection of data beyond the single assessment. Psychological interviews are among the best ways to spot potentially misleading information taken straight from the personality inventory. The content of these interviews can be designed specifically to test questions raised by the instrument.

It’s very important to stress that these types of advanced interpretation of any psychometric assessment are complex. They need to be left to experts who have a thorough understanding of psychology, as well as of the tests and measures used as tools to predict behavior.

In sum: Solid psychological assessments offer great value over less scientifically constructed measures (e.g., typical unstructured interviews). But, as with any other tool, it’s important to know the true strengths and limits of what they offer in the complex task of psychological assessment. As anyone who’s made a regrettable hire can attest, what you see in the interview isn’t always what you get on the job.

Psychology at work: It really makes a difference.

Psychology by Machine? Not for a While.

Psychology button on computer where "Enter" key should be

Technology can fly planes and drive cars; heck, it can virtually perform remote surgery (pun not intended). Some believe that literally all jobs will eventually be performed by technology. For them, if a “machine” isn’t already doing it, just wait. (Note: this is an extreme view.)

Technology is changing the world faster than ever. If Moore’s law continues to hold, that impact will only grow faster over time.

Will technology take my job?

Probably so. But don’t quit yet! If you’ve been around a few years, like I have, it’s likely that technology has already “taken” all or much of the job you had 10 years ago. You’ve simply changed to stay in front of the technological evolution.

What does science say?

A recent study looked at the rise of technology in relation to the probability of it overtaking more than 700 jobs catalogued in O*NET, a public database of jobs and the various knowledge, skills and abilities required to perform them. The researchers (Frey and Osborne, 2013) reasoned that the probability of technology overtaking a given job is closely related to the time it will take for this to occur. As such, they created a list rank-ordering the probability that these 700 jobs will be overtaken by technology within 20 years.

The study is now a few years old but seems to have already made some accurate predictions. For example, you’ve probably received a “robocall,” a task that was once performed by a person.

The crux of the study is in the researchers’ identification of three key job characteristics they refer to as “bottlenecks to computerization.” The degree to which a job encompasses one or more of these “bottlenecks” predicts the probability (and time) required for technology to be able to perform that job. These three bottlenecks include: 1) Fine Perception and Manipulation, 2) Creative Intelligence and 3) Social Intelligence.

The three bottlenecks were further broken down into seven more discrete tasks. Of these seven tasks, four, a majority, fall under Social Intelligence.

The practical implication is that if your job requires you to “read” people or influence them, particularly in emotional ways, you’re likely safe from seeing a robot at your desk one morning anytime soon.

Specifically, the study predicts that social workers, therapists and teachers should have relatively long careers as far as “automation threat” is concerned. “Psychologist” is also in the top 20 of the 700 jobs ranked according to difficulty of automation.

Although this research is new, the issue isn’t. Psychological assessment has long been a topic of technological debate: Can a personality assessment alone more accurately predict behavior than an expert in psychological assessment?


When Psychology Talks, Money Listens

Today, global HR and risk management consulting firm Towers Watson announced the purchase of Saville Consulting (a psychology-based assessment firm) for £42 million. This is clear evidence that psychology makes money, and not just at wholesale.

In the wake of similar acquisitions, firms delivering good psychometric assessment at work have now been just about totally gobbled up by the much larger HR conglomerates.


It didn’t use to be this way.