An analysis is only as good as the data on which it is based. If the data is biased, any analysis built on it will be flawed to some degree.
Bias in datasets has affected:
(1) clinical trials
(2) machine learning
(3) outcomes research
Patient selection bias (data is collected only on the patients who are included):
(1) inequalities in access to care
(1a) based on race
(1b) based on ability to pay or type of insurance
(1c) based on physician or specialist availability
(2) underrepresentation of some groups as study subjects
(3) ability to undergo follow-up
(4) disease severity
(5) selection based on ability to pay for tests or medications
(6) ability to understand instructions
(7) geographic distribution
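
One rough way to look for selection bias of the kinds listed above is to compare each group's share of the sample with its share of a reference population. The sketch below is illustrative only: the function name, the group labels, and all of the numbers are made up, and a real check would need a carefully chosen reference population.

```python
from collections import Counter


def representation_ratios(sample_groups, reference_shares):
    """Return sample share divided by reference share for each group.

    A ratio well below 1 suggests the group is underrepresented in the
    sample; a ratio well above 1 suggests it is overrepresented.
    """
    counts = Counter(sample_groups)
    total = sum(counts.values())
    ratios = {}
    for group, ref_share in reference_shares.items():
        sample_share = counts.get(group, 0) / total if total else 0.0
        ratios[group] = sample_share / ref_share
    return ratios


# Hypothetical data: 80 urban and 20 rural patients in the sample,
# against an assumed reference population that is half urban, half rural.
sample = ["urban"] * 80 + ["rural"] * 20
reference = {"urban": 0.5, "rural": 0.5}

ratios = representation_ratios(sample, reference)
for group, ratio in ratios.items():
    print(f"{group}: {ratio:.2f}")
```

A ratio far from 1 does not prove bias by itself, but it flags a group whose results may not generalize.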
Bias arising from the data itself, especially when datasets are combined:
(1) incomplete data
(2) incorrect data
(3) miscoding
(4) choice of which data is recorded
(5) sample size errors
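
Some of these problems, such as incomplete data and miscoding, can be screened for before analysis. The following is a minimal sketch, assuming records are simple dictionaries and that each coded field has a known set of allowed codes; the function name, field names, and records are hypothetical.

```python
def audit_records(records, required_fields, allowed_codes):
    """Count missing values per field and collect unrecognized codes."""
    missing = {field: 0 for field in required_fields}
    unknown_codes = {field: set() for field in allowed_codes}
    for record in records:
        # Incomplete data: a required field is absent or empty.
        for field in required_fields:
            if record.get(field) in (None, ""):
                missing[field] += 1
        # Miscoding: a value is present but not in the agreed code set.
        for field, allowed in allowed_codes.items():
            value = record.get(field)
            if value not in (None, "") and value not in allowed:
                unknown_codes[field].add(value)
    return missing, unknown_codes


# Two made-up sources that code the same field differently.
site_a = [{"id": 1, "sex": "F", "age": 54}, {"id": 2, "sex": "M", "age": None}]
site_b = [{"id": 3, "sex": "female", "age": 61}, {"id": 4, "sex": "", "age": 47}]
combined = site_a + site_b

missing, unknown = audit_records(combined, ["sex", "age"], {"sex": {"F", "M"}})
print("missing:", missing)
print("unknown codes:", unknown)
```

A screen like this only finds values that are absent or outside the code set; a value that is plausible but wrong still requires checking against the original source.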