Simpson's paradox

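Simpson's paradox is a phenomenon in which a trend that appears in each of several subgroups of data disappears or reverses when the subgroups are combined. It arises frequently in social-science and medical statistics.^[1]^[2]^[3] It is also used to illustrate the misleading conclusions that statistical pitfalls can produce.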

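Edward Simpson is generally credited with first describing the phenomenon in 1951,^[4] but Karl Pearson in 1899^[5] and Udny Yule in 1903^[6] had already described similar effects. The name "Simpson's paradox" was introduced by Colin R. Blyth in 1972.^[7]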

Examples

Kidney stone treatment

One example of Simpson's paradox comes from a medical study of kidney stone treatments.^[8]^[9] The table below shows the success rates of two treatments, applied to small and to large kidney stones.

Stone size	Treatment A	Treatment B
Small stones	Group 1: 93% (81/87)	Group 2: 87% (234/270)
Large stones	Group 3: 73% (192/263)	Group 4: 69% (55/80)
Both	78% (273/350)	83% (289/350)

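Treatment A has the higher success rate for both small and large stones, yet treatment B has the higher success rate once the two stone sizes are combined. Here stone size acts as a lurking variable, or confounding variable, that affects the success rate of each treatment. Two things combine to produce the paradox: the success rate itself depends on stone size, and the choice of treatment depends on patient characteristics such as stone size.^[9] In the table, treatment A was used mostly on large stones (263 of 350 cases) and treatment B mostly on small stones (270 of 350 cases), so the pooled rate for A is weighted toward the harder cases.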
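The reversal can be checked directly from the counts in the table. The following is a minimal sketch, not part of the cited studies: the dictionary layout and the helper names `success_rate` and `pooled_rate` are assumptions chosen for illustration. It uses exact fractions so that the comparisons do not depend on the rounded percentages.

```python
from fractions import Fraction

# (successes, patients) per treatment and stone size, copied from the table
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def success_rate(successes, patients):
    """Exact success rate of one group as a fraction."""
    return Fraction(successes, patients)


def pooled_rate(groups):
    """Success rate after adding up successes and patients over all groups."""
    total_successes = sum(s for s, _ in groups.values())
    total_patients = sum(n for _, n in groups.values())
    return Fraction(total_successes, total_patients)


# Within each stone size, treatment A does better than treatment B.
for size in ("small", "large"):
    rate_a = success_rate(*results["A"][size])
    rate_b = success_rate(*results["B"][size])
    print(f"{size}: A = {float(rate_a):.1%}, B = {float(rate_b):.1%}")

# Pooled over both sizes, the order flips and treatment B does better.
pooled_a = pooled_rate(results["A"])
pooled_b = pooled_rate(results["B"])
print(f"pooled: A = {float(pooled_a):.1%}, B = {float(pooled_b):.1%}")
```

The pooled rate is a weighted average of the per-size rates, with weights given by how many patients of each stone size received the treatment. Because those weights differ sharply between A and B, the pooled comparison can point the opposite way from both per-size comparisons.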

Vector interpretation

[Figure: vector representation of Simpson's paradox]

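Simpson's paradox can be illustrated by comparing the slopes of two-dimensional vectors.^[10] A success rate of p/q is drawn as the vector (q, p), whose slope is p/q, and pooling two groups corresponds to adding their vectors. Even when B1 is steeper than L1 and B2 is steeper than L2, the sum B1+B2 can have a shallower slope than L1+L2.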
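As a numerical illustration of the slope comparison, the sketch below reuses the kidney stone counts. The mapping is an assumption made only for this example: B1 and B2 are taken to be the two treatment A groups and L1 and L2 the two treatment B groups, each written as (patients, successes). The source figure does not specify its vectors, and the helper names `slope` and `add` are hypothetical.

```python
def slope(vector):
    """Slope p/q of a vector (q, p): q patients on the x-axis, p successes on the y-axis."""
    q, p = vector
    return p / q


def add(u, v):
    """Component-wise vector sum, which corresponds to pooling two groups."""
    return (u[0] + v[0], u[1] + v[1])


# Assumed mapping (not from the source figure):
# B1, B2 = treatment A on small and large stones; L1, L2 = treatment B on the same.
B1, B2 = (87, 81), (263, 192)
L1, L2 = (270, 234), (80, 55)

print("B1 steeper than L1:", slope(B1) > slope(L1))
print("B2 steeper than L2:", slope(B2) > slope(L2))
print("B1+B2 shallower than L1+L2:", slope(add(B1, B2)) < slope(add(L1, L2)))
```

With these numbers all three comparisons hold. The long vector B2 has a relatively shallow slope and dominates the sum B1+B2, while the long vector L1 has a relatively steep slope and dominates L1+L2.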

References

[1] Wagner, Clifford H. (February 1982). "Simpson's Paradox in Real Life". The American Statistician. 36 (1): 46–48. doi:10.2307/2684093. JSTOR 2684093.
[2] Holt, G. B. (2016). "Potential Simpson's paradox in multicenter study of intraperitoneal chemotherapy for ovarian cancer". Journal of Clinical Oncology. 34 (9): 1016.
[3] Franks, Alexander; Airoldi, Edoardo; Slavov, Nikolai (2017). "Post-transcriptional regulation across human tissues". PLOS Computational Biology. 13 (5): e1005535. arXiv:1506.00219. doi:10.1371/journal.pcbi.1005535. ISSN 1553-7358. PMC 5440056. PMID 28481885.
[4] Simpson, Edward H. (1951). "The Interpretation of Interaction in Contingency Tables". Journal of the Royal Statistical Society, Series B. 13: 238–241.
[5] Pearson, Karl; Lee, Alice; Bramley-Moore, Lesley (1899). "Genetic (reproductive) selection: Inheritance of fertility in man, and of fecundity in thoroughbred racehorses". Philosophical Transactions of the Royal Society A. 192: 257–330. doi:10.1098/rsta.1899.0006.
[6] Yule, G. U. (1903). "Notes on the Theory of Association of Attributes in Statistics". Biometrika. 2 (2): 121–134. doi:10.1093/biomet/2.2.121.
[7] Blyth, Colin R. (June 1972). "On Simpson's Paradox and the Sure-Thing Principle". Journal of the American Statistical Association. 67 (338): 364–366. doi:10.2307/2284382. JSTOR 2284382.
[8] Charig, C. R.; Webb, D. R.; Payne, S. R.; Wickham, J. E. (29 March 1986). "Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy". British Medical Journal. 292 (6524): 879–882. doi:10.1136/bmj.292.6524.879. PMC 1339981. PMID 3083922.
[9] Julious, Steven A.; Mullee, Mark A. (3 December 1994). "Confounding and Simpson's paradox". BMJ. 309 (6967): 1480–1481. doi:10.1136/bmj.309.6967.1480. PMC 2541623. PMID 7804052.
[10] Kocik, Jerzy (2001). "Proofs without Words: Simpson's Paradox" (PDF). Mathematics Magazine. 74 (5): 399. doi:10.2307/2691038. JSTOR 2691038.

External links

Simpson's Paradox at the Stanford Encyclopedia of Philosophy, by Jan Sprenger and Naftali Weinberger.
How statistics can be misleading – Mark Liddell – TED-Ed video lesson.
Judea Pearl, "Understanding Simpson's Paradox" (PDF)
Simpson's Paradox, a short article by Alexander Bogomolny on the vector interpretation of Simpson's paradox
The Wall Street Journal column "The Numbers Guy" for December 2, 2009, dealt with recent instances of Simpson's paradox in the news, notably in the comparison of unemployment rates in the 2009 recession with those in the 1983 recession.
At the Plate, a Statistical Puzzler: Understanding Simpson's Paradox by Arthur Smith, August 20, 2010
Simpson's Paradox, a video by Henry Reich of MinutePhysics
