"NA/NaN/Inf in foreign function call" error when trying to implement KNN with the class package


I am trying to implement kNN with the class package, using the following code:



KNN_build <- knn(train = who_training, test = who_validation, 
                 cl = who_training$Status, k = 5  )

The warnings and error I get are:

Warning in knn(train = who_training, test = who_validation, cl = who_training$Status,  :
  NAs introduced by coercion
Warning in knn(train = who_training, test = who_validation, cl = who_training$Status,  :
  NAs introduced by coercion
Error in knn(train = who_training, test = who_validation, cl = who_training$Status,  : 
  NA/NaN/Inf in foreign function call (arg 6)

This is the structure of the dataset I am working with, as shown by str(who_training):

 $ Country                        : chr  "Kiribati" "Afghanistan" "El Salvador" "Guyana" ...
 $ Year                           : num [1:2938, 1] -1.196 -0.546 1.405 -0.979 -0.979 ...
 $ Status                         : Factor w/ 2 levels "Developed","Developing": 2 2 2 2 2 2 2 2 2 2 ...
 $ Life.expectancy                : num [1:2938, 1] -0.492 -1.262 0.427 -0.418 -1.104 ...
 $ Adult.Mortality                : num [1:2938, 1] 0.526 1.188 0.204 0.705 1.654 ...
 $ infant.deaths                  : num [1:2938, 1] -0.439 1.498 -0.393 -0.416 -0.279 ...
 $ Alcohol                        : num [1:2938, 1] -1.036 -1.157 -0.516 0.871 -1.019 ...
 $ percentage.expenditure         : num [1:2938, 1] -0.368 -0.435 0.208 -0.421 -0.43 ...
 $ Hepatitis.B                    : num [1:2938, 1] -0.206 -0.793 0.426 -3.366 0.336 ...
 $ Measles                        : num [1:2938, 1] -0.3043 0.0316 -0.3043 -0.3043 -0.2068 ...
 $ BMI                            : num [1:2938, 1] 1.571 -1.213 0.8537 -0.0292 -1.2581 ...
 $ under.five.deaths              : num [1:2938, 1] -0.447 1.493 -0.414 -0.43 -0.282 ...
 $ Polio                          : num [1:2938, 1] 0.597 -2.11 0.383 0.241 0.526 ...
 $ Total.expenditure              : num [1:2938, 1] 1.4574 1.2784 0.4148 0.0569 -1.0483 ...
 $ Diphtheria                     : num [1:2938, 1] -0.703 -1.994 0.452 0.385 0.385 ...
 $ HIV.AIDS                       : num [1:2938, 1] -0.406 -0.406 -0.364 0.442 0.357 ...
 $ GDP                            : num [1:2938, 1] -0.465 -0.552 -0.12 -0.448 -0.53 ...
 $ Population                     : num [1:2938, 1] -0.3631 -0.3547 -0.0603 -0.3306 -0.1846 ...
 $ thinness..1.19.years           : num [1:2938, 1] -1.153 -0.319 -0.777 0.353 1.401 ...
 $ thinness.5.9.years             : num [1:2938, 1] -1.148 -0.318 -0.773 0.298 1.396 ...
 $ Income.composition.of.resources: num [1:2938, 1] -3.0728 -1.1425 0.2225 -0.0944 -3.0728 ...
 $ Schooling                      : num [1:2938, 1] -0.14 -1.388 0.352 -0.337 -2.439 ...
 $ disease                        : num  252 1478 280 111 657 ...

And the other dataset:

str(who_validation)
'data.frame':   439 obs. of  23 variables:
 $ Country                        : chr  "Niger" "Estonia" "Saudi Arabia" "Saudi Arabia" ...
 $ Year                           : num [1:439, 1] -0.546 0.104 1.621 0.971 -0.112 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Status                         : Factor w/ 2 levels "Developed","Developing": 2 2 2 2 2 2 2 2 2 2 ...
 $ Life.expectancy                : num [1:439, 1] -1.642 0.522 0.553 0.511 0.933 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Adult.Mortality                : num [1:439, 1] 1.0719 0.0785 -0.6285 -1.3355 -1.2908 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ infant.deaths                  : num [1:439, 1] 0.815 -0.439 -0.279 -0.256 -0.416 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Alcohol                        : num [1:439, 1] -1.134 -0.199 -0.2 -1.139 -0.103 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ percentage.expenditure         : num [1:439, 1] -0.433 -0.219 -0.437 -0.246 0.29 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Hepatitis.B                    : num [1:439, 1] 0.381 0.471 0.652 0.652 0.426 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Measles                        : num [1:439, 1] 0.262 -0.304 -0.248 -0.228 -0.304 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ BMI                            : num [1:439, 1] -1.153 0.919 1.496 1.365 0.854 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ under.five.deaths              : num [1:439, 1] 1.427 -0.447 -0.315 -0.299 -0.43 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Polio                          : num [1:439, 1] -2.965 0.526 0.668 0.739 0.811 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Total.expenditure              : num [1:439, 1] 0.5849 0.3388 -0.0404 -0.8872 -1.979 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Diphtheria                     : num [1:439, 1] -2.878 0.52 0.724 0.724 0.385 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ HIV.AIDS                       : num [1:439, 1] 0.23 -0.406 -0.406 -0.406 -0.406 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ GDP                            : num [1:439, 1] -0.5524 -0.3486 -0.2572 -0.2789 0.0101 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Population                     : num [1:439, 1] 0.298 -0.367 -0.3 -0.3 -0.3 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ thinness..1.19.years           : num [1:439, 1] 1.993 -0.669 0.89 0.81 -0.293 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ thinness.5.9.years             : num [1:439, 1] 1.959 -0.639 0.834 0.78 -0.344 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Income.composition.of.resources: num [1:439, 1] -1.718 0.998 1.046 0.915 0.603 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ Schooling                      : num [1:439, 1] -2.833 1.305 1.305 0.779 1.272 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ disease                        : num  297 284 512 588 285 ...

If needed, I can provide more detail, for example the dput of my training data frame:

dput(head(who_training))
structure(list(Country = structure(c(89L, 1L, 54L, 72L, 56L, 
193L), .Label = c("Afghanistan", "Albania", "Algeria", "Angola", 
"Antigua and Barbuda", "Argentina", "Armenia", "Australia", "Austria", 
"Azerbaijan", "Bahamas", "Bahrain", "Bangladesh", "Barbados", 
"Belarus", "Belgium", "Belize", "Benin", "Bhutan", "Bolivia (Plurinational State of)", 
"Bosnia and Herzegovina", "Botswana", "Brazil", "Brunei Darussalam", 
"Bulgaria", "Burkina Faso", "Burundi", "Cabo Verde", "Cambodia", 
"Cameroon", "Canada", "Central African Republic", "Chad", "Chile", 
"China", "Colombia", "Comoros", "Congo", "Cook Islands", "Costa Rica", 
"Côte d'Ivoire", "Croatia", "Cuba", "Cyprus", "Czechia", "Democratic People's Republic of Korea", 
"Democratic Republic of the Congo", "Denmark", "Djibouti", "Dominica", 
"Dominican Republic", "Ecuador", "Egypt", "El Salvador", "Equatorial Guinea", 
"Eritrea", "Estonia", "Ethiopia", "Fiji", "Finland", "France", 
"Gabon", "Gambia", "Georgia", "Germany", "Ghana", "Greece", "Grenada", 
"Guatemala", "Guinea", "Guinea-Bissau", "Guyana", "Haiti", "Honduras", 
"Hungary", "Iceland", "India", "Indonesia", "Iran (Islamic Republic of)", 
"Iraq", "Ireland", "Israel", "Italy", "Jamaica", "Japan", "Jordan", 
"Kazakhstan", "Kenya", "Kiribati", "Kuwait", "Kyrgyzstan", "Lao People's Democratic Republic", 
"Latvia", "Lebanon", "Lesotho", "Liberia", "Libya", "Lithuania", 
"Luxembourg", "Madagascar", "Malawi", "Malaysia", "Maldives", 
"Mali", "Malta", "Marshall Islands", "Mauritania", "Mauritius", 
"Mexico", "Micronesia (Federated States of)", "Monaco", "Mongolia", 
"Montenegro", "Morocco", "Mozambique", "Myanmar", "Namibia", 
"Nauru", "Nepal", "Netherlands", "New Zealand", "Nicaragua", 
"Niger", "Nigeria", "Niue", "Norway", "Oman", "Pakistan", "Palau", 
"Panama", "Papua New Guinea", "Paraguay", "Peru", "Philippines", 
"Poland", "Portugal", "Qatar", "Republic of Korea", "Republic of Moldova", 
"Romania", "Russian Federation", "Rwanda", "Saint Kitts and Nevis", 
"Saint Lucia", "Saint Vincent and the Grenadines", "Samoa", "San Marino", 
"Sao Tome and Principe", "Saudi Arabia", "Senegal", "Serbia", 
"Seychelles", "Sierra Leone", "Singapore", "Slovakia", "Slovenia", 
"Solomon Islands", "Somalia", "South Africa", "South Sudan", 
"Spain", "Sri Lanka", "Sudan", "Suriname", "Swaziland", "Sweden", 
"Switzerland", "Syrian Arab Republic", "Tajikistan", "Thailand", 
"The former Yugoslav republic of Macedonia", "Timor-Leste", "Togo", 
"Tonga", "Trinidad and Tobago", "Tunisia", "Turkey", "Turkmenistan", 
"Tuvalu", "Uganda", "Ukraine", "United Arab Emirates", "United Kingdom of Great Britain and Northern Ireland", 
"United Republic of Tanzania", "United States of America", "Uruguay", 
"Uzbekistan", "Vanuatu", "Venezuela (Bolivarian Republic of)", 
"Viet Nam", "Yemen", "Zambia", "Zimbabwe"), class = "factor"), 
    Year = structure(c(-1.19612277260829, -0.545905298957761, 
    1.40474712199383, -0.97938361472478, -0.97938361472478, 0.104312174692768
    ), .Dim = c(6L, 1L)), Status = structure(c(2L, 2L, 2L, 2L, 
    2L, 2L), .Label = c("Developed", "Developing"), class = "factor"), 
    Life.expectancy = structure(c(-0.491703688946568, -1.26227192428072, 
    0.426644755903723, -0.417813584188498, -1.10393598551343, 
    -2.22284328613562), .Dim = c(6L, 1L)), Adult.Mortality = structure(c(0.525991300297776, 
    1.18826519708731, 0.203803999156919, 0.704984245376029, 1.65364685429077, 
    -0.12733294923785), .Dim = c(6L, 1L)), infant.deaths = structure(c(-0.438727256213326, 
    1.49835585218564, -0.39314883013335, -0.415938043173338, 
    -0.279202764933411, 0.244949134986309), .Dim = c(6L, 1L)), 
    Alcohol = structure(c(-1.03648468049828, -1.15698267513806, 
    -0.516035895139232, 0.870972936778229, -1.01853817065831, 
    -0.24940203465972), .Dim = c(6L, 1L)), percentage.expenditure = structure(c(-0.368092282523513, 
    -0.435250731904734, 0.207934359477078, -0.420640377004664, 
    -0.429901306980893, -0.416415384868822), .Dim = c(6L, 1L)), 
    Hepatitis.B = structure(c(-0.20594618442662, -0.792822896206842, 
    0.426074889798235, -3.3660515555509, 0.33578616490897, -0.38652363420515
    ), .Dim = c(6L, 1L)), Measles = structure(c(-0.304292225859768, 
    0.0316262810969904, -0.304292225859768, -0.304292225859768, 
    -0.206834387421696, -0.304292225859768), .Dim = c(6L, 1L)), 
    BMI = structure(c(1.57101747462593, -1.21297832599103, 0.853699637710209, 
    -0.029153084647603, -1.25812420383887, -0.49064428042555), .Dim = c(6L, 
    1L)), under.five.deaths = structure(c(-0.446748080417218, 
    1.49318542883105, -0.413867851446909, -0.430307965932063, 
    -0.28234693556567, 0.309497185899902), .Dim = c(6L, 1L)), 
    Polio = structure(c(0.596897326350385, -2.10995531198027, 
    0.383198433850596, 0.240732505517404, 0.525664362183789, 
    -0.898994921148134), .Dim = c(6L, 1L)), Total.expenditure = structure(c(1.45740571947119, 
    1.27842521470575, 0.414844279212532, 0.0568832696816624, 
    -1.0483213472449, -0.39504250485106), .Dim = c(6L, 1L)), 
    Diphtheria = structure(c(-0.702868620889384, -1.9941482705615, 
    0.452486855133033, 0.384524768308185, 0.384524768308185, 
    -0.83879279453908), .Dim = c(6L, 1L)), HIV.AIDS = structure(c(-0.405940295110872, 
    -0.405940295110872, -0.363540899930249, 0.442047608501586, 
    0.35724881814034, -0.405940295110872), .Dim = c(6L, 1L)), 
    GDP = structure(c(-0.465047724667165, -0.552420239231645, 
    -0.12024411417243, -0.447578456170917, -0.529794471763561, 
    -0.519666432315853), .Dim = c(6L, 1L)), Population = structure(c(-0.363065262988643, 
    -0.354733142686027, -0.0602998996500711, -0.330582727476206, 
    -0.184602200482603, 0.295425496705686), .Dim = c(6L, 1L)), 
    thinness..1.19.years = structure(c(-1.15296800498538, -0.319469152262016, 
    -0.776549168271602, 0.352707341869728, 1.40130267271525, 
    0.890448537175124), .Dim = c(6L, 1L)), thinness.5.9.years = structure(c(-1.14791740179065, 
    -0.317661129245177, -0.772962956124954, 0.298335460062757, 
    1.39641633665516, 0.887549588965998), .Dim = c(6L, 1L)), 
    Income.composition.of.resources = structure(c(-3.07284300635863, 
    -1.14245031797641, 0.222473805122128, -0.0943835805971753, 
    -3.07284300635863, -1.02058209269975), .Dim = c(6L, 1L)), 
    Schooling = structure(c(-0.140241143049229, -1.38811570864962, 
    0.352340922319346, -0.337273969196659, -2.43895744810258, 
    -0.797017230207329), .Dim = c(6L, 1L)), disease = c(252.1, 
    1478.1, 280.2, 111.1, 656.9, 245.5)), row.names = c(1392L, 
11L, 820L, 1119L, 863L, 2930L), class = "data.frame")

And the dput of my validation data frame:

dput(head(who_validation))
structure(list(Country = structure(c(110L, 52L, 132L, 132L, 40L, 
60L), .Label = c("Afghanistan", "Albania", "Algeria", "Angola", 
"Antigua and Barbuda", "Argentina", "Armenia", "Australia", "Austria", 
"Azerbaijan", "Bahrain", "Bangladesh", "Barbados", "Belarus", 
"Belgium", "Belize", "Benin", "Bhutan", "Bolivia (Plurinational State of)", 
"Botswana", "Brazil", "Brunei Darussalam", "Bulgaria", "Burkina Faso", 
"Burundi", "Cabo Verde", "Cambodia", "Cameroon", "Canada", "Central African Republic", 
"Chad", "Chile", "China", "Colombia", "Comoros", "Congo", "Costa Rica", 
"Côte d'Ivoire", "Croatia", "Cuba", "Cyprus", "Czechia", "Democratic People's Republic of Korea", 
"Democratic Republic of the Congo", "Denmark", "Dominican Republic", 
"Ecuador", "Egypt", "El Salvador", "Equatorial Guinea", "Eritrea", 
"Estonia", "Ethiopia", "Finland", "Gabon", "Gambia", "Georgia", 
"Germany", "Ghana", "Greece", "Grenada", "Guatemala", "Guinea", 
"Guinea-Bissau", "Haiti", "Honduras", "Hungary", "Iceland", "India", 
"Indonesia", "Iran (Islamic Republic of)", "Iraq", "Ireland", 
"Israel", "Italy", "Jamaica", "Japan", "Jordan", "Kazakhstan", 
"Kenya", "Kiribati", "Kuwait", "Kyrgyzstan", "Lao People's Democratic Republic", 
"Latvia", "Lesotho", "Liberia", "Libya", "Lithuania", "Luxembourg", 
"Malawi", "Malaysia", "Maldives", "Mali", "Malta", "Mauritania", 
"Mauritius", "Mexico", "Micronesia (Federated States of)", "Mongolia", 
"Montenegro", "Morocco", "Mozambique", "Myanmar", "Namibia", 
"Nepal", "Netherlands", "New Zealand", "Nicaragua", "Niger", 
"Nigeria", "Norway", "Oman", "Pakistan", "Palau", "Panama", "Papua New Guinea", 
"Paraguay", "Peru", "Philippines", "Poland", "Portugal", "Qatar", 
"Republic of Moldova", "Romania", "Russian Federation", "Rwanda", 
"Saint Lucia", "Saint Vincent and the Grenadines", "Samoa", "Sao Tome and Principe", 
"Saudi Arabia", "Senegal", "Serbia", "Seychelles", "Sierra Leone", 
"Singapore", "Slovakia", "Slovenia", "Solomon Islands", "Somalia", 
"South Africa", "South Sudan", "Spain", "Sri Lanka", "Sudan", 
"Suriname", "Swaziland", "Sweden", "Switzerland", "Syrian Arab Republic", 
"Tajikistan", "Thailand", "The former Yugoslav republic of Macedonia", 
"Timor-Leste", "Tonga", "Trinidad and Tobago", "Tunisia", "Turkey", 
"Turkmenistan", "Uganda", "Ukraine", "United Arab Emirates", 
"United Kingdom of Great Britain and Northern Ireland", "United Republic of Tanzania", 
"United States of America", "Uruguay", "Uzbekistan", "Vanuatu", 
"Venezuela (Bolivarian Republic of)", "Yemen", "Zambia", "Zimbabwe"
), class = "factor"), Year = structure(c(-0.545905298957761, 
0.104312174692768, 1.62148627987734, 0.971268806226806, -0.112426983190742, 
-0.329166141074251), .Dim = c(6L, 1L), .Dimnames = list(NULL, 
    NULL)), Status = structure(c(2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Developed", 
"Developing"), class = "factor"), Life.expectancy = structure(c(-1.64227817732222, 
0.521646319164099, 0.553313506917557, 0.511090589912945, 0.933319759959055, 
1.1022114279775), .Dim = c(6L, 1L), .Dimnames = list(NULL, NULL)), 
    Adult.Mortality = structure(c(1.07191978278645, 0.0785089376021415, 
    -0.62851319545696, -1.33553532851606, -1.2907870922465, -0.72695931525
    ), .Dim = c(6L, 1L), .Dimnames = list(NULL, NULL)), infant.deaths = structure(c(0.814679460986005, 
    -0.438727256213326, -0.279202764933411, -0.256413551893423, 
    -0.415938043173338, -0.438727256213326), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), Alcohol = structure(c(-1.1339085910581, 
    -0.199408185819812, -0.200049132599811, -1.13903616529809, 
    -0.103266168819988, 1.25297721765753), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), percentage.expenditure = structure(c(-0.432962903502602, 
    -0.218689034593455, -0.43659516554427, -0.246467036861867, 
    0.290217084455785, -0.128158007506263), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), Hepatitis.B = structure(c(0.380930527353603, 
    0.471219252242868, 0.651796702021398, 0.651796702021398, 
    0.426074889798235, 0.471219252242868), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), Measles = structure(c(0.261533469114463, 
    -0.304292225859768, -0.247528218897167, -0.228088490485318, 
    -0.304292225859768, -0.304292225859768), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), BMI = structure(c(-1.15278382219391, 0.918910350157092, 
    1.49577434487953, 1.36535291998576, 0.853699637710209, 1.14463973939631
    ), .Dim = c(6L, 1L), .Dimnames = list(NULL, NULL)), under.five.deaths = structure(c(1.42742497089043, 
    -0.446748080417218, -0.31522716453598, -0.298787050050825, 
    -0.430307965932063, -0.430307965932063), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), Polio = structure(c(-2.96475088197942, 
    0.525664362183789, 0.668130290516981, 0.739363254683577, 
    0.810596218850173, 0.739363254683577), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), Total.expenditure = structure(c(0.584875758739695, 
    0.338777564687222, -0.0404373797845428, -0.887238892956006, 
    -1.97901997202516, 1.56479402233045), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), Diphtheria = structure(c(-2.87765539928452, 
    0.520448941957881, 0.724335202432426, 0.724335202432426, 
    0.384524768308185, 0.724335202432426), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), HIV.AIDS = structure(c(0.230050632598472, 
    -0.405940295110872, -0.405940295110872, -0.405940295110872, 
    -0.405940295110872, -0.405940295110872), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), GDP = structure(c(-0.552410469161336, -0.348597444406105, 
    -0.257188085587563, -0.278877759218448, 0.0101370026695019, 
    -0.284633785448075), .Dim = c(6L, 1L), .Dimnames = list(NULL, 
        NULL)), Population = structure(c(0.298357417598302, -0.366680744764259, 
    -0.299596727877204, -0.299596727877204, -0.299596727877204, 
    -0.361842293185169), .Dim = c(6L, 1L), .Dimnames = list(NULL, 
        NULL)), thinness..1.19.years = structure(c(1.99281798755118, 
    -0.669000929210523, 0.890448537175124, 0.809787357879315, 
    -0.292582092496746, -0.99164564639376), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), thinness.5.9.years = structure(c(1.95884800515371, 
    -0.63905065410149, 0.833984668156613, 0.780419747347227, 
    -0.34444358964987, -1.01400509976719), .Dim = c(6L, 1L), .Dimnames = list(
        NULL, NULL)), Income.composition.of.resources = structure(c(-1.71766834128222, 
    0.997555717881655, 1.04630300799232, 0.91468532469353, 0.602702667985292, 
    1.07067665304765), .Dim = c(6L, 1L), .Dimnames = list(NULL, 
        NULL)), Schooling = structure(c(-2.83302310039744, 1.30466624869859, 
    1.30466624869859, 0.779245378972111, 1.27182744434069, 1.3703438574144
    ), .Dim = c(6L, 1L), .Dimnames = list(NULL, NULL)), disease = c(297.1, 
    284.1, 512.1, 588.1, 285.1, 290.1)), row.names = c(1888L, 
874L, 2234L, 2237L, 666L, 1036L), class = "data.frame")

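I want to train the KNN algorithm on my dataset so that I can predict a specific set of data points.

Before working with the data, I imputed the missing values with the following code: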

who_validation$disease <- ifelse(is.na(who_validation$disease), median(who_validation$disease, na.rm = TRUE), who_validation$disease) 

who_training$disease <- ifelse(is.na(who_training$disease), median(who_training$disease, na.rm = TRUE), who_training$disease) 

I tried converting the relevant variables to factors with the following code, in case they were interfering with the function:

who_training$Status <- as.factor(who_training$Status)
who_validation$Status <- as.factor(who_validation$Status)

who_training$Country <- as.factor(who_training$Country)
who_validation$Country <- as.factor(who_validation$Country)

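This does not seem to work. I am at my wits' end trying to figure out which part of the data frames is causing this error. Does anyone have an idea how to solve it?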

r statistics knn
1 Answer

The error complains about NA, NaN, and Inf values, not only NA. Have you checked for them with is.nan() and is.infinite()?
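One way to run that check is sketched below. Note that is.nan() has no data frame method, so the check goes column by column. The data frame `df` and its values are made up for illustration; in practice you would substitute who_training or who_validation.

```r
# Hypothetical toy data standing in for who_training (values are invented).
df <- data.frame(
  Country = c("A", "B", "C", "D"),
  x1 = c(0.5, NaN, 1.2, -0.3),
  x2 = c(1, 2, Inf, 4),
  x3 = c(NA, 0.1, 0.2, 0.3),
  stringsAsFactors = FALSE
)

# Identify the numeric columns; the checks below only make sense for those.
num_cols <- vapply(df, is.numeric, logical(1))

# Count plain NA, NaN, and +/-Inf separately for each numeric column.
bad_counts <- data.frame(
  n_na  = vapply(df[num_cols], function(x) sum(is.na(x) & !is.nan(x)), integer(1)),
  n_nan = vapply(df[num_cols], function(x) sum(is.nan(x)), integer(1)),
  n_inf = vapply(df[num_cols], function(x) sum(is.infinite(x)), integer(1)),
  row.names = names(df)[num_cols]
)
print(bad_counts)

# Non-numeric columns deserve a look too: they cannot be used in a
# distance computation without being converted to numbers first.
non_numeric <- names(df)[!num_cols]
print(non_numeric)
```

If all counts are zero and the error persists, the non-numeric columns listed at the end are the next thing to inspect.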

This answer also suggests removing the labels from the test and training data: Error: NA/NaN/Inf in foreign function call (arg 6) in knn function written in R.
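A minimal sketch of that suggestion follows, assuming the likely culprit is that train and test still contain the Status label and the character Country column (the str() output in the question shows Country as chr, and the "NAs introduced by coercion" warnings are consistent with that). The toy data frames and the helper `make_toy` are hypothetical stand-ins for who_training and who_validation.

```r
library(class)

set.seed(1)

# Hypothetical helper that builds a small data frame with the same mix of
# column types as in the question: a character column, a factor label,
# and numeric predictors.
make_toy <- function(n) {
  data.frame(
    Country = sample(c("A", "B", "C"), n, replace = TRUE),
    Status  = factor(sample(c("Developed", "Developing"), n, replace = TRUE),
                     levels = c("Developed", "Developing")),
    x1 = rnorm(n),
    x2 = rnorm(n),
    stringsAsFactors = FALSE
  )
}
toy_training   <- make_toy(40)
toy_validation <- make_toy(10)

# Keep numeric predictors only; this drops both the label (Status)
# and the character column (Country).
predictors <- names(toy_training)[vapply(toy_training, is.numeric, logical(1))]

train_x <- as.matrix(toy_training[predictors])
test_x  <- as.matrix(toy_validation[predictors])

# The label is passed separately through cl, not inside train/test.
pred <- knn(train = train_x, test = test_x, cl = toy_training$Status, k = 5)
print(table(predicted = pred, actual = toy_validation$Status))
```

Selecting columns by is.numeric is only one option; listing the predictor names explicitly would work just as well and makes the choice of features more visible.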

If that does not help, please provide a minimal example so that we can test it.
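For reference, a minimal example of the kind meant here could look like the sketch below. It uses invented data, not the WHO dataset, and is only an assumption about what triggers the problem: a character column left in train and test. tryCatch() captures the error message so the script runs to completion.

```r
library(class)

# Invented data: one character column and one numeric predictor.
train <- data.frame(
  Country = c("A", "B", "C", "D", "E", "F"),
  x = c(0.1, 0.2, 0.3, 1.1, 1.2, 1.3),
  stringsAsFactors = FALSE
)
labels <- factor(c("lo", "lo", "lo", "hi", "hi", "hi"))
test <- data.frame(Country = "G", x = 0.25, stringsAsFactors = FALSE)

# knn() turns the data frames into character matrices because of Country,
# and converting those to numbers yields NA. Warnings are suppressed here
# so that only the error message is captured.
result <- tryCatch(
  suppressWarnings(knn(train = train, test = test, cl = labels, k = 3)),
  error = function(e) conditionMessage(e)
)
print(result)

# The same call on the numeric column alone runs without error.
ok <- knn(train = train["x"], test = test["x"], cl = labels, k = 3)
print(ok)
```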
