kernlab - Invalid probability model for large support vector machines using ksvm in R
I train support vector machines using the ksvm function from the kernlab package in R, on large numbers of observations (300k) but not many features (1-8). I want to use the resulting probability model, but for large data sets the resulting probability model has an unexpected format.
This is what should happen:
    library(kernlab)
    n <- 1000
    df <- data.frame(label = c(rep("x", n), rep("y", n)),
                     value = c(runif(n), runif(n) + 2))
    m <- ksvm(label ~ value, df, prob.model = TRUE)
    > prob.model(m)
    [[1]]
    [[1]]$A
    [1] -6.836228

    [[1]]$B
    [1] 0.003163229
However, for large values of n (e.g. 100k; beware of high memory usage and long execution times), the value of prob.model(m)[[1]] is a numeric vector of length 2n, seemingly the likelihood for each observation in df. What could cause this?
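A quick way to tell the two shapes apart in a script is a small structural check. The sketch below is illustrative only: is_valid_prob_model is a hypothetical helper (it is not part of kernlab), and it assumes the expected entry is a list with single numeric components named A and B, as in the output shown for n = 1000. The two test objects are hand-made stand-ins, not results from a real fit.

```r
# Hypothetical helper (not part of kernlab): TRUE if `pm` looks like the
# expected sigmoid fit, i.e. a list holding single numeric values A and B.
is_valid_prob_model <- function(pm) {
  is.list(pm) &&
    all(c("A", "B") %in% names(pm)) &&
    is.numeric(pm$A) && length(pm$A) == 1 &&
    is.numeric(pm$B) && length(pm$B) == 1
}

# Stand-ins for the two cases described in the question.
expected  <- list(A = -6.836228, B = 0.003163229)
malformed <- runif(20)  # plain numeric vector, like the one seen for large n

is_valid_prob_model(expected)
is_valid_prob_model(malformed)
```

With a fitted model, the same check would be applied as is_valid_prob_model(prob.model(m)[[1]]), so that downstream code can stop early instead of treating a long numeric vector as the fitted parameters.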
Session info:
    R version 2.15.2 (2012-10-26)
    Platform: x86_64-unknown-linux-gnu (64-bit)

    locale:
     [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=C LC_NAME=C LC_ADDRESS=C
    [10] LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

    attached base packages:
    [1] graphics  grDevices datasets  utils     stats     methods   base

    other attached packages:
    [1] kernlab_0.9-16   e1071_1.6-1      class_7.3-5      data.table_1.8.8

    loaded via a namespace (and not attached):
    [1] tools_2.15.2
Edit: The task I'm talking about is classification, and df has the following form:
    label  value
    "x"     0.21
    ...
    "x"    -1.20
    "y"     2.42
    ...
The origin of the problem is indicated by the following error message:
    line search fails
A more specific question, including the original data frame used, is here: line search fails in training ksvm prob.model.