Bias (statistics)

2023-08-19 19:09:03

Statistical bias is a feature of a statistical method, or of its results, whereby the expected value of the results differs from the true underlying quantitative parameter being estimated.

A statistic is biased if it is calculated in a way that makes it systematically different from the population parameter being estimated. The types of bias listed below may overlap.

Selection bias arises when some individuals are more likely than others to be selected for a study, which biases the sample. It is also termed Berksonian bias. [1]
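The effect can be illustrated with a small simulation. This is only a sketch under assumed numbers: the trait distribution, the inclusion rule, and the function name `simulate_selection_bias` are hypothetical and are not taken from any cited study.

```python
import random

def simulate_selection_bias(n_population=100_000, seed=0):
    """Compare the population mean with the mean of a non-randomly selected sample."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Hypothetical trait: normally distributed with mean 50 and sd 10.
    population = [rng.gauss(50, 10) for _ in range(n_population)]

    # Hypothetical selection rule: the chance of entering the study
    # rises with the trait value, clipped to the interval [0, 1].
    def inclusion_probability(value):
        return min(1.0, max(0.0, value / 100))

    sample = [v for v in population if rng.random() < inclusion_probability(v)]

    population_mean = sum(population) / len(population)
    sample_mean = sum(sample) / len(sample)
    return population_mean, sample_mean

population_mean, sample_mean = simulate_selection_bias()
print(f"population mean:      {population_mean:.2f}")
print(f"selected-sample mean: {sample_mean:.2f}")
```

Because individuals with higher values are more likely to be included, the sample mean sits above the population mean however large the sample is; collecting more data under the same rule does not remove the gap.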

Spectrum bias arises from evaluating diagnostic tests on biased patient samples, which leads to an overestimate of the test's sensitivity and specificity.
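A short arithmetic sketch shows one way this can happen for sensitivity. It assumes that the test detects severe disease more readily than mild disease and that the study sample over-represents severe cases; all numbers and names are hypothetical.

```python
def pooled_sensitivity(case_mix, sensitivity_by_severity):
    """Overall sensitivity as the case-mix-weighted average of per-stratum sensitivities."""
    return sum(case_mix[s] * sensitivity_by_severity[s] for s in case_mix)

# Hypothetical per-stratum sensitivities: severe disease is easier to detect.
sensitivity_by_severity = {"mild": 0.60, "severe": 0.95}

# Hypothetical case mixes (fractions of diseased patients in each stratum).
clinical_population = {"mild": 0.70, "severe": 0.30}
study_sample = {"mild": 0.20, "severe": 0.80}

true_sensitivity = pooled_sensitivity(clinical_population, sensitivity_by_severity)
study_sensitivity = pooled_sensitivity(study_sample, sensitivity_by_severity)
print(f"sensitivity in the clinical population: {true_sensitivity:.3f}")
print(f"sensitivity in the study sample:        {study_sensitivity:.3f}")
```

With these assumed figures the study reports a sensitivity of about 0.88, while the figure for the population the test will actually be used on is about 0.705. The same weighting argument applies to specificity when the non-diseased group in the study is unrepresentative.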

The bias of an estimator is the difference between the estimator's expected value and the true value of the parameter being estimated.
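As a worked example of this definition, the sketch below compares two common variance estimators, one dividing by n and one dividing by n - 1. The population (a fair six-sided die) and the sample size are assumptions chosen so that every possible sample can be enumerated and the expected values are exact rather than simulated.

```python
from fractions import Fraction
from itertools import product

def variance(values, ddof):
    """Variance of `values` with divisor len(values) - ddof, in exact arithmetic."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / (n - ddof)

# Hypothetical population: the six faces of a fair die.
faces = [1, 2, 3, 4, 5, 6]
true_variance = variance(faces, ddof=0)  # population variance, 35/12

# Enumerate every equally likely sample of size n drawn with replacement,
# so each expected value is an exact average over all possible samples.
n = 2
samples = list(product(faces, repeat=n))

expected_divide_by_n = sum(variance(s, ddof=0) for s in samples) / len(samples)
expected_divide_by_n_minus_1 = sum(variance(s, ddof=1) for s in samples) / len(samples)

# Bias = expected value of the estimator minus the true parameter value.
bias_divide_by_n = expected_divide_by_n - true_variance
bias_divide_by_n_minus_1 = expected_divide_by_n_minus_1 - true_variance
print(bias_divide_by_n, bias_divide_by_n_minus_1)
```

In this setting the divide-by-n estimator has bias -35/24 (it underestimates the variance on average), while the divide-by-(n - 1) estimator has bias 0.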

Omitted-variable bias is the bias that appears in the parameter estimates of a regression analysis when the assumed specification omits an independent variable that should be in the model.
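The following simulation is a minimal sketch of this effect, not a general-purpose regression routine. The data-generating model, the coefficient values, and the variable names are all assumptions made for illustration: y depends on both x and z, and z is correlated with x.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences (divisor n)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / len(a)

rng = random.Random(0)  # fixed seed for reproducibility
n = 20_000
beta_x, beta_z, delta = 1.0, 2.0, 0.5  # hypothetical true coefficients

x = [rng.gauss(0, 1) for _ in range(n)]
z = [delta * xi + rng.gauss(0, 1) for xi in x]  # z is correlated with x
y = [beta_x * xi + beta_z * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

# Short regression: y on x alone, with z omitted from the model.
short_slope = cov(x, y) / cov(x, x)

# Coefficient on x in the full regression, via the Frisch-Waugh-Lovell theorem:
# remove from x the part explained by z, then regress y on what is left.
gamma = cov(x, z) / cov(z, z)
x_resid = [xi - gamma * zi for xi, zi in zip(x, z)]
long_slope = cov(x_resid, y) / cov(x_resid, x_resid)

print(f"slope with z omitted:  {short_slope:.2f}")
print(f"slope with z included: {long_slope:.2f}")
```

Under these assumptions the short regression converges to beta_x + beta_z * delta = 2.0 rather than the true 1.0, because x absorbs the effect of the omitted z; including z recovers a slope close to 1.0.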

In statistical hypothesis testing, a test is said to be unbiased if, for some alpha level (between 0 and 1), the probability of rejecting the null hypothesis is less than or equal to alpha over the entire parameter space defined by the null hypothesis, and greater than or equal to alpha over the entire parameter space defined by the alternative hypothesis. [2]
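The same condition can be written compactly in symbols. The notation here is an assumption introduced for illustration and does not come from the cited source: \Theta_0 and \Theta_1 denote the parameter sets under the null and alternative hypotheses, and \beta(\theta) denotes the probability that the test rejects the null hypothesis when the true parameter is \theta.

```latex
% \Theta_0: parameter values under the null hypothesis
% \Theta_1: parameter values under the alternative hypothesis
% \beta(\theta): probability of rejecting the null when the true parameter is \theta
\[
  \beta(\theta) \le \alpha \quad \text{for all } \theta \in \Theta_0,
  \qquad
  \beta(\theta) \ge \alpha \quad \text{for all } \theta \in \Theta_1 .
\]
```

In words, an unbiased test is never less likely to reject the null hypothesis when it is false than when it is true.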

Detection bias occurs when a phenomenon is more likely to be observed in a particular set of study subjects. For example, because obesity and diabetes are known to occur together, doctors may be more likely to look for diabetes in obese patients than in thinner patients, and this skewed detection effort inflates the recorded rate of diabetes among obese patients.
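A deliberately simplified calculation shows how unequal detection effort alone can produce an apparent difference between groups. It assumes, purely for illustration, that the true prevalence is the same in both groups, that cases are recorded only through screening, and that the screening test is perfect; none of the numbers are real data.

```python
def recorded_prevalence(true_prevalence, screening_rate):
    """Fraction of a group with a recorded diagnosis, assuming cases are
    found only through screening and the screening test is perfect."""
    return true_prevalence * screening_rate

true_prevalence = 0.10  # hypothetical, identical in both groups
screening_rate = {"obese": 0.80, "thinner": 0.30}  # hypothetical screening effort

recorded = {
    group: recorded_prevalence(true_prevalence, rate)
    for group, rate in screening_rate.items()
}
apparent_ratio = recorded["obese"] / recorded["thinner"]

for group, value in recorded.items():
    print(f"{group}: recorded prevalence {value:.3f}")
print(f"apparent prevalence ratio: {apparent_ratio:.2f}")
```

Although the two groups are equally affected by construction, the records suggest that the more heavily screened group has well over twice the prevalence.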

In educational measurement, bias is defined as "systematic errors in test content, test administration, and/or scoring procedures that can cause some test takers to get either lower or higher scores than their true ability would merit. The source of the bias is irrelevant to the trait the test is intended to measure." [3]

Funding bias may lead to the selection of outcomes, test samples, or test procedures that favor the financial sponsor of a study.

Reporting bias involves a skew in the availability of data, such that observations of a certain kind are more likely to be reported than others.
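One way to picture this is a simulation in which many small studies are run but only the favorable-looking ones are written up. The sketch below assumes a true effect of zero and a hypothetical reporting threshold; the study sizes and the rule itself are illustrative choices, not a model of any real literature.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed for reproducibility
true_effect = 0.0       # hypothetical: the intervention does nothing
n_studies, n_per_study = 2_000, 25

def run_study():
    """Return the mean outcome of one small study with unit-variance noise."""
    return statistics.fmean(rng.gauss(true_effect, 1) for _ in range(n_per_study))

estimates = [run_study() for _ in range(n_studies)]

# Hypothetical reporting rule: only studies whose estimate exceeds about two
# standard errors (2 / sqrt(25) = 0.4) are reported.
reported = [e for e in estimates if e > 0.4]

mean_all = statistics.fmean(estimates)
mean_reported = statistics.fmean(reported)
print(f"studies run: {len(estimates)}, studies reported: {len(reported)}")
print(f"mean estimate over all studies:      {mean_all:.3f}")
print(f"mean estimate over reported studies: {mean_reported:.3f}")
```

Averaged over every study, the estimates centre on zero, but a reader who sees only the reported studies would conclude that the effect is clearly positive.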

Recall bias arises from differences in the accuracy or completeness of participants' recollections of past events. For example, patients may not remember how many cigarettes they smoked last week, leading to overestimation or underestimation.

Observer bias arises when a researcher unconsciously influences an experiment through cognitive bias. In such cases, the researcher's judgment may alter how the experiment is carried out or how the results are recorded.

Whether it is cognitive, social, statistical, or of some other kind, a bias is an inaccuracy that runs systematically in the same direction: a systematic error. As a simple example, consider a recruiter who consistently underestimates women's abilities. Such a recruiter is not only unfair but also hurts their own success at hiring outstanding candidates. If the recruiter is an algorithm, the bias reduces its accuracy when it is tested. Bias is not just a moral problem; it directly affects the success of the people and algorithms that carry it.

David McLenny presents a sad but true fact of life: our minds are constantly deceiving us. For aspiring data scientists, this book is important because it catalogs many common types of statistical bias. It points out classic mistakes such as the self-serving bias, the availability heuristic, and confirmation bias, and explains why people are so easily fooled, for example by misleading news or fraud. Recognizing these biases should be a fundamental skill; even practicing data professionals sometimes fall for them.

On the second question, cultural bias, the presentation in The Bell Curve, like that of Arthur Jensen and other hereditarians, confuses a technical (and proper) meaning of "bias" (I call it "S-bias", for "statistical") with an entirely different vernacular concept (I call it "V-bias") that drives the public debate. All of these authors insist that the tests are unbiased in the statistician's definition (and I fully agree). The absence of S-bias means that the same score predicts the same thing when it is achieved by members of different groups; that is, a Black person and a white person with the same score have the same probability of doing anything that IQ is supposed to predict.