Cross-validation

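Cross-validation, sometimes called rotation estimation^[1]^[2]^[3], is a practical statistical method that partitions a data sample into smaller subsets. An analysis is first performed on one subset, and the other subsets are then used to confirm and validate that analysis. The initial subset is called the training set; the other subsets are called validation sets or test sets. The goal of cross-validation is to test a model's performance on new data that were not used to train it, in order to reduce problems such as overfitting and selection bias, and to indicate how well the model will generalize to an independent data set (that is, an unknown data set, such as the data encountered in a real problem).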

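The theory of cross-validation was initiated by Seymour Geisser. It is important in guarding against testing hypotheses suggested by the data, especially where collecting further samples is hazardous, too costly, or scientifically infeasible.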

Use of cross-validation

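Suppose a model has one or more unknown parameters, and there is a data set (the training set) to which the model can be fitted. The fitting process adjusts the model's parameters so that the model matches the training data as closely as possible. If an independent sample from the same source as the training data is then used as a validation set, any overfitting caused by a training set that is too small or by poorly chosen parameters shows up as a worse fit on the validation set than on the training set. Cross-validation is a way to estimate how well a fitted model will predict.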
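The gap between training error and validation error can be illustrated with a small sketch. Everything here is an assumption made for illustration: the synthetic data (a noisy line), the choice of a one-nearest-neighbour predictor as a model that memorizes its training set, and the function names.

```python
import random

def one_nn_predict(train, x):
    """Predict y for x by copying the y of the nearest training point."""
    nearest = min(train, key=lambda point: abs(point[0] - x))
    return nearest[1]

def mean_squared_error(train, evaluation):
    """Average squared prediction error of the 1-NN model on `evaluation`."""
    errors = [(one_nn_predict(train, x) - y) ** 2 for x, y in evaluation]
    return sum(errors) / len(errors)

# Synthetic data (an assumption for this sketch): y = 2x plus Gaussian noise.
rng = random.Random(0)
data = [(x, 2.0 * x + rng.gauss(0.0, 1.0))
        for x in (rng.uniform(0.0, 10.0) for _ in range(60))]

train, valid = data[:40], data[40:]

train_error = mean_squared_error(train, train)  # model scored on data it memorized
valid_error = mean_squared_error(train, valid)  # model scored on unseen data

print("training error:  ", train_error)
print("validation error:", valid_error)
```

Because the model reproduces every training point exactly, its training error is zero and says nothing about predictive quality; only the validation error reflects how the model behaves on new data.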

Common types of cross-validation

Holdout validation

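Strictly speaking, holdout validation is not cross-validation, because the data are never crossed over. A portion of the original sample is chosen at random to form the validation data, and the remainder is used as training data. Typically, less than a third of the original sample is used as validation data. ^[4]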
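A holdout split can be sketched in a few lines. The function name `holdout_split`, its parameters, and the default fraction of 0.3 (chosen to stay under the one-third figure mentioned above) are assumptions for illustration, not part of any particular library.

```python
import random

def holdout_split(samples, validation_fraction=0.3, seed=0):
    """Randomly split `samples` once into (training, validation) lists."""
    if not 0.0 < validation_fraction < 1.0:
        raise ValueError("validation_fraction must be between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_valid = round(len(shuffled) * validation_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]

train, valid = holdout_split(range(10), validation_fraction=0.3)
print(len(train), len(valid))  # 7 3
```

Each observation ends up in exactly one of the two sets, and the split is made only once, which is why the data are not "crossed over".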

k-fold cross-validation

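In k-fold cross-validation, the original sample is partitioned into k subsamples. A single subsample is retained as the validation data for testing the model, and the remaining k − 1 subsamples are used as training data. The process is repeated k times, with each of the k subsamples used exactly once as validation data. The k results are then averaged, or combined in some other way, to produce a single estimate. The advantage of this method over repeated random subsampling is that every observation is used for both training and validation, and each observation is used for validation exactly once. 10-fold cross-validation is the most commonly used.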
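The k-fold procedure can be sketched as follows. The helper names (`k_fold_indices`, `cross_validate`), the callback-style `fit` and `score` arguments, and the toy "predict the mean" model are all assumptions made for this illustration.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the indices 0..n_samples-1 and deal them into k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Striding by k gives folds whose sizes differ by at most one.
    return [indices[i::k] for i in range(k)]

def cross_validate(samples, k, fit, score, seed=0):
    """Average the validation score over k train/validate rounds."""
    scores = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held_out_set = set(held_out)
        train = [s for i, s in enumerate(samples) if i not in held_out_set]
        valid = [samples[i] for i in held_out]
        model = fit(train)                  # train on the other k - 1 folds
        scores.append(score(model, valid))  # validate on the held-out fold
    return sum(scores) / k

# Toy model (an assumption): always predict the mean of the training values.
def fit_mean(train):
    return sum(train) / len(train)

def squared_error(model, valid):
    return sum((v - model) ** 2 for v in valid) / len(valid)

values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
cv_error = cross_validate(values, k=3, fit=fit_mean, score=squared_error)
print(cv_error)
```

Averaging is only one way of combining the k scores; the sketch uses it because it is the simplest choice.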

Leave-one-out cross-validation

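As the name suggests, leave-one-out cross-validation (LOOCV) uses a single observation from the original sample as the validation data and the remaining observations as the training data. This is repeated until every observation has been used once as validation data. It is equivalent to k-fold cross-validation with k equal to the number of observations in the original sample.^[5] In some cases efficient algorithms exist, for example for kernel regression and Tikhonov regularization.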
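The idea of an efficient LOOCV algorithm can be illustrated on a simpler model than the ones named above. The sketch below uses an ordinary least-squares line fit (a swapped-in example, not kernel regression or Tikhonov regularization), for which the leave-one-out residual equals the ordinary residual divided by one minus the leverage of the point. The function names and the sample data are assumptions for illustration.

```python
def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    x_mean = sum(x for x, _ in points) / n
    y_mean = sum(y for _, y in points) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    slope = sum((x - x_mean) * (y - y_mean) for x, y in points) / sxx
    return slope, y_mean - slope * x_mean

def loocv_mse_brute_force(points):
    """Refit the line n times, each time leaving one point out."""
    total = 0.0
    for i, (x, y) in enumerate(points):
        slope, intercept = fit_line(points[:i] + points[i + 1:])
        total += (y - (slope * x + intercept)) ** 2
    return total / len(points)

def loocv_mse_shortcut(points):
    """Single fit: leave-one-out residual = residual / (1 - leverage)."""
    n = len(points)
    slope, intercept = fit_line(points)
    x_mean = sum(x for x, _ in points) / n
    sxx = sum((x - x_mean) ** 2 for x, _ in points)
    total = 0.0
    for x, y in points:
        leverage = 1.0 / n + (x - x_mean) ** 2 / sxx
        residual = y - (slope * x + intercept)
        total += (residual / (1.0 - leverage)) ** 2
    return total / n

points = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2), (3.0, 6.8), (4.0, 9.1), (5.0, 10.9)]
brute = loocv_mse_brute_force(points)
shortcut = loocv_mse_shortcut(points)
print(brute, shortcut)  # the two values agree up to rounding error
```

The brute-force version fits the model n times, while the shortcut fits it once, which matters when n is large or a single fit is expensive.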

Error estimation

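The estimation error can be computed from the validation results. Common measures of error are the mean squared error and the root mean squared error, which correspond respectively to the variance and the standard deviation of the cross-validation errors.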
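As a minimal sketch, both measures can be computed from the held-out values and the corresponding predictions. The function names and the sample numbers are assumptions for illustration.

```python
import math

def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def root_mean_squared_error(actual, predicted):
    """Square root of the mean squared error, in the same units as the data."""
    return math.sqrt(mean_squared_error(actual, predicted))

actual = [1.0, 2.0, 3.0, 4.0]     # held-out values
predicted = [3.0, 0.0, 5.0, 2.0]  # model predictions for those values

mse = mean_squared_error(actual, predicted)
rmse = root_mean_squared_error(actual, predicted)
print(mse, rmse)  # 4.0 2.0
```

The root mean squared error is often easier to interpret because it is expressed in the same units as the quantity being predicted.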

References

[1] Kohavi, Ron. A study of cross-validation and bootstrap for accuracy estimation and model selection. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence. 1995, 2 (12): 1137–1143 [2008-07-14]. (Morgan Kaufmann, San Mateo)
[2] Chang, J., Luo, Y., and Su, K. 1992. GPSM: a Generalized Probabilistic Semantic Model for ambiguity resolution. In Proceedings of the 30th Annual Meeting on Association For Computational Linguistics (Newark, Delaware, June 28 - July 02, 1992). Annual Meeting of the ACL. Association for Computational Linguistics, Morristown, NJ, 177-184
[3] Devijver, P. A., and J. Kittler, Pattern Recognition: A Statistical Approach, Prentice-Hall, London, 1982
[4] Tutorial 12. Decision Trees Interactive Tutorial and Resources. [2006-06-21].
[5] Elements of Statistical Learning: data mining, inference, and prediction. 2nd Edition. web.stanford.edu. [2019-04-04].

External links

Naive Bayes implementation with cross-validation in Visual Basic (includes executable and source code)
A generic k-fold cross-validation implementation (free open source; includes a distributed version that can utilize multiple computers and in principle can speed up the running time by several orders of magnitude.)
