Multi-task learning

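Multi-task learning (MTL) is a subfield of machine learning in which multiple learning tasks are solved at the same time while exploiting the commonalities and differences across tasks. Compared with training the models separately, this can improve the learning efficiency and prediction accuracy of the task-specific models.^[1]^[2]^[3] Early versions of MTL were called "hints".^[4]^[5]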

In a widely cited 1997 paper, Rich Caruana gave the following characterization:

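Multitask Learning is an approach to inductive transfer that improves generalization by using the domain information contained in the training signals of related tasks as an inductive bias. It does this by learning tasks in parallel while using a shared representation; what is learned for each task can help other tasks be learned better.^[3]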
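One way to picture "learning tasks in parallel while using a shared representation" is a small network whose hidden layer is shared by all tasks and whose output layer has one unit per task. The sketch below is only an illustration under assumptions of our own: the toy data, the layer sizes, the learning rate, and the names `W_shared` and `W_heads` are hypothetical and do not come from Caruana's paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data (assumed for illustration): two regression tasks on the same inputs.
X = rng.normal(size=(64, 4))
Y = np.column_stack([X[:, 0] + X[:, 1], X[:, 0] - X[:, 1]])  # one column per task

n_hidden = 8
W_shared = rng.normal(scale=0.1, size=(4, n_hidden))  # shared representation
W_heads = rng.normal(scale=0.1, size=(n_hidden, 2))   # one output column per task

def joint_loss(W_shared, W_heads):
    """Mean squared error averaged over all samples and both tasks."""
    H = np.tanh(X @ W_shared)
    return np.mean((H @ W_heads - Y) ** 2)

initial_loss = joint_loss(W_shared, W_heads)

lr = 0.05
for _ in range(500):
    H = np.tanh(X @ W_shared)        # shared features, used by every task
    err = H @ W_heads - Y            # shape (n_samples, n_tasks)
    grad_out = 2.0 * err / err.size  # gradient of the mean squared error w.r.t. predictions
    grad_heads = H.T @ grad_out
    grad_H = grad_out @ W_heads.T    # sums the error signals coming from both tasks
    grad_shared = X.T @ (grad_H * (1.0 - H ** 2))  # backpropagate through tanh
    W_heads -= lr * grad_heads
    W_shared -= lr * grad_shared

final_loss = joint_loss(W_shared, W_heads)
print(f"joint loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

Because `grad_H` adds up the error signals of both tasks, every update to `W_shared` is shaped by all tasks at once, which is the sense in which the training signal of one task acts as an inductive bias for the other.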

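In the classification context, MTL aims to improve the performance of multiple classification tasks by learning them jointly. One example is a spam filter, which can be treated as a set of distinct but related classification tasks, one per user. To make this concrete, consider that the distribution of features that separates spam from legitimate email differs from person to person. For example, an English speaker may find that every email written in Russian is spam, whereas this is not the case for a Russian speaker. Yet the classification task has clear commonalities across users; one shared feature might be text related to money transfers. Solving each user's spam classification problem jointly via MTL lets the solutions inform one another and can improve performance. Further examples of MTL settings include multiclass classification and multi-label classification.^[6]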
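The spam example can be sketched in code. The following is a minimal illustration, not a method described in the article: it assumes each user's logistic-regression weights are the sum of a shared vector and a more strongly penalized user-specific vector, and it uses a tiny hand-made dataset. The function name `train_shared_plus_specific`, the feature columns, and all hyperparameters are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_shared_plus_specific(tasks, n_features, lam_shared=0.01,
                               lam_specific=0.1, lr=0.5, n_steps=2000):
    """Gradient descent on per-user logistic losses with weights w_shared + v[user].

    The user-specific parts v[user] are penalized more strongly than w_shared,
    so evidence common to all users tends to accumulate in w_shared.
    """
    w_shared = np.zeros(n_features)
    v = {name: np.zeros(n_features) for name in tasks}
    for _ in range(n_steps):
        grad_shared = lam_shared * w_shared
        for name, (X, y) in tasks.items():
            p = sigmoid(X @ (w_shared + v[name]))
            g = X.T @ (p - y) / len(y)  # gradient of this user's mean logistic loss
            grad_shared += g            # every user contributes to the shared part
            v[name] -= lr * (g + lam_specific * v[name])
        w_shared -= lr * grad_shared
    return w_shared, v

# Columns: [bias, mentions_money_transfer, written_in_russian]; label 1 = spam.
tasks = {
    "english_speaker": (
        np.array([[1, 1, 0], [1, 0, 1], [1, 1, 1],
                  [1, 0, 0], [1, 0, 0], [1, 0, 1]], dtype=float),
        np.array([1, 1, 1, 0, 0, 1], dtype=float),
    ),
    "russian_speaker": (
        np.array([[1, 1, 0], [1, 0, 1], [1, 1, 1],
                  [1, 0, 0], [1, 0, 1], [1, 0, 1]], dtype=float),
        np.array([1, 0, 1, 0, 0, 0], dtype=float),
    ),
}

w_shared, v = train_shared_plus_specific(tasks, n_features=3)

# Score an email that is written in Russian and does not mention money.
russian_only_email = np.array([1.0, 0.0, 1.0])
for name in tasks:
    p_spam = sigmoid(russian_only_email @ (w_shared + v[name]))
    print(f"{name}: P(spam | Russian-only email) = {p_spam:.2f}")
```

In this toy data every money-transfer email is spam for both users, so the shared weight on that feature ends up positive, while the weight on the Russian-language feature is left to differ between users through `v`.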

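Multi-task learning works because the regularization induced by requiring an algorithm to perform well on a related task can be superior to regularization that prevents overfitting by penalizing all complexity uniformly. One situation in which MTL can be particularly helpful is when the tasks share significant commonalities and are generally slightly undersampled. However, MTL has also been shown to be beneficial for learning unrelated tasks.^[7]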
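The contrast between the two kinds of regularization can be written out under assumptions that are not in the source: $T$ tasks with linear weight vectors $w_t$, $n_t$ training pairs $(x_{ti}, y_{ti})$ per task, a loss $\ell$, and a regularization strength $\lambda$. A uniform penalty shrinks each $w_t$ toward zero independently, whereas a mean penalty (one possible multi-task regularizer, chosen here only for illustration) shrinks each $w_t$ toward the average over tasks:

```latex
\begin{align*}
\text{independent, uniform penalty:}\quad
  & \min_{w_1,\dots,w_T} \sum_{t=1}^{T} \left[ \frac{1}{n_t}\sum_{i=1}^{n_t}
    \ell\bigl(y_{ti}, \langle w_t, x_{ti}\rangle\bigr)
    + \lambda \lVert w_t \rVert_2^2 \right] \\
\text{multi-task, mean penalty:}\quad
  & \min_{w_1,\dots,w_T} \sum_{t=1}^{T} \left[ \frac{1}{n_t}\sum_{i=1}^{n_t}
    \ell\bigl(y_{ti}, \langle w_t, x_{ti}\rangle\bigr)
    + \lambda \lVert w_t - \bar{w} \rVert_2^2 \right],
  \qquad \bar{w} = \frac{1}{T}\sum_{s=1}^{T} w_s
\end{align*}
```

In the second objective a task with few examples is pulled toward what the other tasks have learned rather than toward zero, which is one way to read the claim that related tasks supply a better-targeted form of regularization.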

References

[1] Baxter, J. (2000). "A model of inductive bias learning". Journal of Artificial Intelligence Research 12: 149–198.
[2] Thrun, S. (1996). "Is learning the n-th thing any easier than learning the first?". In Advances in Neural Information Processing Systems 8, pp. 640–646. MIT Press.
[3] Caruana, R. (1997). "Multi-task learning" (PDF). Machine Learning 28: 41–75. doi:10.1023/A:1007379606734.
[4] Suddarth, S., Kergosien, Y. (1990). "Rule-injection hints as a means of improving network performance and learning time". EURASIP Workshop. Neural Networks, pp. 120–129. Lecture Notes in Computer Science. Springer.
[5] Abu-Mostafa, Y. S. (1990). "Learning from hints in neural networks". Journal of Complexity 6 (2): 192–198. doi:10.1016/0885-064x(90)90006-y.
[6] Ciliberto, C. (2015). "Convex Learning of Multiple Tasks and their Structure". arXiv:1504.03101 [cs.LG].
[7] Romera-Paredes, B., Argyriou, A., Bianchi-Berthouze, N., & Pontil, M. (2012). "Exploiting Unrelated Tasks in Multi-Task Learning". http://jmlr.csail.mit.edu/proceedings/papers/v22/romera12/romera12.pdf

External links

Software

The Multi-Task Learning via Structural Regularization Package
Online Multi-Task Learning Toolkit (OMT): a general-purpose online multi-task learning toolkit based on conditional random field models and stochastic gradient descent training (C#, .NET)
