How to Avoid Data Pitfalls by Self Spark Chief Science Officer


In the corporate world, where data is king and huge volumes of data are generated every day, it is a product manager's essential duty to filter out the right data in order to make accurate predictions about the lifespan of a product. It is crucial not to fall for the fallacies that surface in high volumes of data experimentation and analysis. Orin Davis, an experienced human capital consultant, rightly informs us of the data pitfalls lurking in our day-to-day encounters.

A Spark of Genius – Orin Davis


Orin Davis is the principal investigator of the Quality of Life Laboratory and the Chief Science Officer of Self Spark. A passionate engineer who holds a degree in positive psychology, he works as a human capital and creativity consultant, helping budding startups with pitches, propositions, culture, and human capital.

Avoiding Data Pitfalls

In the ocean of information we delve into every day, it is important not to draw hasty conclusions from a data set. Mr. Davis warns against confounding ‘what?’ with ‘why?’: often we find the ‘what?’ of the data easily and then assume the ‘why?’. For instance, from the last Presidential election in the USA, we can get data about the people who voted for Trump and Clinton. We know the ‘what?’ of the data, and we end up assuming the ‘why?’ along with it.

This is a blunder, because what ultimately matters is that the data and the conclusions drawn from it are meaningful.


As a senior mentor and capital advisor, he says he is often asked, ‘How do we hire candidates in a company?’ He addresses this concern with a clarification: one should never hire a person based on their personality or strengths, as any combination of personality or strengths can do any given job.

Mr. Davis adds that he has turned many firms upside down in meetings, because their hiring criteria were basic surveys that were never checked for convergent validity. He calls this situation GIGO – Garbage In, Garbage Out.

The GIGO rule states that your conclusions are only as valid as the surveys that produce them.

Labels Lead to Validity

This theory questions the truthfulness of data: the degree to which the findings are “true”. While measuring the truthfulness of data, it is important to consider construct validity…

Construct Validity – Did we measure what we said we would measure?


This approach to assessing the truthfulness of data is important because operationalizing variables has limits: many variables are abstract, and we cannot cover every possible angle of a given variable.

There are many variants of Construct Validity. They are:

  1. Face Validity – Do creative people “think outside the box”? Generalizing that all creative people think in unconventional ways
  2. Content Validity – Confirmation bias
  3. Predictive Validity – Do predictive tests validate the measure? E.g., a personality test for hiring
  4. Concurrent Validity – Measuring against an established test at the same time to draw conclusions
  5. Convergent Validity – The measure correlates with related constructs, e.g., openness to experience correlates with creativity
  6. Discriminant Validity – The measure is not related to things from which it should be independent
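As an illustrative sketch (not from Mr. Davis's talk), convergent and discriminant validity are often checked with simple correlations: a new creativity scale should correlate with a related, established trait such as openness to experience, and should not correlate with something it ought to be independent of. The scales and scores below are hypothetical.

```python
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical scores from the same eight respondents
new_creativity_scale = [4, 7, 6, 9, 5, 8, 3, 7]
openness_scale       = [5, 8, 6, 9, 4, 7, 3, 8]    # related construct
shoe_size            = [9, 7, 10, 8, 11, 9, 8, 10]  # unrelated construct

r_convergent = pearson(new_creativity_scale, openness_scale)
r_discriminant = pearson(new_creativity_scale, shoe_size)

print(f"convergent r = {r_convergent:.2f}")      # should be high
print(f"discriminant r = {r_discriminant:.2f}")  # should be near zero
```

A high convergent correlation together with a near-zero discriminant correlation is evidence that the survey measures what it claims to measure; a survey that skips this check is exactly the GIGO situation described above.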

Method Problems

Some of the factors that affect the validity of the information gathered are:

  1. History – some external event that affects the result
  2. Fatigue – the survey gets too long and people stop thinking about the questions
  3. Instrumentation – changes in the instrument due to use and age
  4. Selection Bias – groups are not chosen randomly
  5. Dropout – people may leave the survey without completing the questionnaire
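Selection bias in particular is easy to demonstrate. A minimal sketch, using made-up satisfaction scores: a random sample recovers the population mean, while a self-selected sample (say, only satisfied people bother to respond) badly overstates it.

```python
import random

random.seed(0)

# Hypothetical population: 10,000 satisfaction scores from 1 to 10
population = [random.randint(1, 10) for _ in range(10_000)]
pop_mean = sum(population) / len(population)

# A randomly chosen sample estimates the population mean fairly
random_sample = random.sample(population, 500)
rand_mean = sum(random_sample) / len(random_sample)

# A self-selected sample: only the highly satisfied respond
biased_sample = [s for s in population if s >= 7][:500]
biased_mean = sum(biased_sample) / len(biased_sample)

print(f"population mean:    {pop_mean:.2f}")
print(f"random sample mean: {rand_mean:.2f}")
print(f"biased sample mean: {biased_mean:.2f}")
```

The biased sample looks like a glowing result, yet it says nothing about the population; the bias was baked in before a single answer was analyzed.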

When gathering information or data, validity and reliability depend on various factors, and reliability underpins validity. There are potential threats to obtaining data that is valid and reliable. Some of them are:

  • Extreme/moderate response patterns
  • Experimenter expectations
  • Mood of the participants
  • Social desirability
  • Language difficulty

Key Takeaways – Best Practices

  • Track the failures too. We often track only the survivors, which gives us the pattern, the ‘what?’, but not the ‘why?’
  • Avoid jumping to conclusions. Concentrate on generalizability, meaningful results, and valid statistics.
  • Get clarity between ‘what?’ and ‘why?’
  • A survey is not an experiment.
  • Be careful with your words.
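The first takeaway, survivorship bias, can be made concrete with a tiny, entirely hypothetical cohort: looking only at survivors makes a strategy seem decisive, while the full cohort tells a different story.

```python
# Hypothetical cohort of startups: (name, survived, used_strategy_x)
cohort = [
    ("A", True, True), ("B", True, True), ("C", False, True),
    ("D", False, True), ("E", False, True), ("F", True, False),
    ("G", False, False), ("H", False, False),
]

# The 'what?': most survivors used strategy X
survivors = [c for c in cohort if c[1]]
share_x_among_survivors = sum(c[2] for c in survivors) / len(survivors)

# The missing 'why?': most startups that used X still failed
used_x = [c for c in cohort if c[2]]
survival_rate_with_x = sum(c[1] for c in used_x) / len(used_x)

print(f"{share_x_among_survivors:.0%} of survivors used strategy X")
print(f"but only {survival_rate_with_x:.0%} of X users survived")
```

Counting only the survivors answers ‘what do winners have in common?’; counting the whole cohort, failures included, is what lets you approach ‘why?’.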