How to recover data points after applying the which.min function in R
I have two matrices, matrix A and matrix B.

Matrix A:

          [,1] [,2]
     [1,]    1    1
     [2,]    1    2
     [3,]    2    1
     [4,]    2    2
     [5,]   10    1
     [6,]   10    2
     [7,]   11    1
     [8,]   11    2
     [9,]    5    5
    [10,]    5    6

Matrix B:

         [,1] [,2] [,3] [,4] [,5] [,6]
    [1,]    2    1    5    5   10    1
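For each row of matrix A, I compute the Euclidean distance to each consecutive pair of columns in matrix B, treating each pair as an (x, y) point.

For example, to get the following entry in the result matrix:

         [,1]
    [1,] 1.00

the calculation is:

    A(1,1) - From Matrix A
    B(2,1) - From Matrix B
    = sqrt((xA - xB)^2 + (yA - yB)^2)
    = sqrt((1-2)^2 + (1-1)^2)
    = 1.00

    xA and yA from Matrix A
    xB and yB from Matrix B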
To get the following entry in the result matrix:

         [,2]
    [1,] 5.66

the calculation is:

    A(1,1) - From Matrix A
    B(5,5) - From Matrix B
    = sqrt((xA - xB)^2 + (yA - yB)^2)
    = sqrt((1-5)^2 + (1-5)^2)
    = 5.66
To get the following entry in the result matrix:

         [,3]
    [1,] 9.00

the calculation is:

    A(1,1) - From Matrix A
    B(10,1) - From Matrix B
    = sqrt((xA - xB)^2 + (yA - yB)^2)
    = sqrt((1-10)^2 + (1-1)^2)
    = 9.00
After obtaining all the distances, I store them in a distance matrix laid out like this:

    Distance matrix (the answer for the euclidean distance):

         [,1] [,2] [,3]
    [1,] 1.00 5.66 9.00
    [2,] 1.00 1.41
    [3,]
    [4,]
    [5,]
    [7,]
    [8,]
    [9,]
    [10]
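For reference, the full distance matrix described above can be computed in one step. The following is only a minimal sketch, not the asker's code: it assumes matrix B is reshaped so that each row holds one (x, y) point, and the name `B_points` is introduced here purely for illustration.

```r
# Matrix A: one (x, y) point per row
A <- matrix(c( 1, 1,   1, 2,   2, 1,   2, 2,  10, 1,
              10, 2,  11, 1,  11, 2,   5, 5,   5, 6),
            ncol = 2, byrow = TRUE)

# Assumption: B reshaped so that each row is one (x, y) reference point
# (the question stores B as a single row of six values)
B_points <- matrix(c(2, 1, 5, 5, 10, 1), ncol = 2, byrow = TRUE)

# outer() builds every pairwise coordinate difference at once:
# rows follow A, columns follow B_points
dx <- outer(A[, 1], B_points[, 1], "-")
dy <- outer(A[, 2], B_points[, 2], "-")
distanceMatrix <- sqrt(dx^2 + dy^2)

round(distanceMatrix[1, ], 2)  # 1.00 5.66 9.00
```

The result has 10 rows (one per row of A) and 3 columns (one per point in B), so row i of `distanceMatrix` always corresponds to row i of A.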
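I then group the rows by their minimum distance, to determine whether each row belongs to group 1, 2, or 3. There are 3 groups in total. For example, if I get the following groups, how do I map them back to the data points from matrix A?

    > groupings <- apply(distanceMatrix, 1, which.min)
    > [1] 1 1 1 1 3 2 3 2 1 1

For example, the first four rows belong to group 1, the fifth point belongs to group 3, and so on. However, if I reorder the result so that the members of groups 1, 2, and 3 are collected together, the positions no longer line up with the rows of matrix A. So how can I correctly retrieve the points from matrix A?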
I'm not sure what your expected output is, but one of the following two options might help:

    # sample data
    A = as.matrix(read.table(text="1 1
    1 2
    2 1
    2 2
    10 1
    10 2
    11 1
    11 2
    5 5
    5 6", header = FALSE))

    # reshape B so that each row is one (x, y) point
    B = c(2, 1, 5, 5, 10, 1)
    B = matrix(B, 3, byrow = TRUE)

    # distance from every row of A (rows) to every row of B (columns)
    distancematrix = t(apply(A, 1, function(y) {apply(B, 1, function(x) {dist(rbind(x, y))})}))

    # option 1: attach the group to each point, then sort by group
    A_df = as.data.frame(A)
    A_df$group = apply(distancematrix, 1, which.min)
    A_df[order(A_df$group), ]

    # option 2: split the points into one data frame per group
    split(as.data.frame(A), apply(distancematrix, 1, which.min))
Output of option 1:

       V1 V2 group
    1   1  1     1
    2   1  2     1
    3   2  1     1
    4   2  2     1
    9   5  5     2
    10  5  6     2
    5  10  1     3
    6  10  2     3
    7  11  1     3
    8  11  2     3
Output of option 2:

    $`1`
      V1 V2
    1  1  1
    2  1  2
    3  2  1
    4  2  2

    $`2`
       V1 V2
    9   5  5
    10  5  6

    $`3`
      V1 V2
    5 10  1
    6 10  2
    7 11  1
    8 11  2
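In both outputs the row names (1, 2, ..., 10) are the original row positions in A, which is what lets you trace each point back. If you would rather keep A as a matrix, a further possibility is to index it directly with the group vector. This is only a sketch: the `groupings` values are the illustrative ones printed in the question, and the names `rows_in_group_3`, `points_in_group_3`, and `rows_by_group` are made up here.

```r
# Matrix A: one (x, y) point per row
A <- matrix(c( 1, 1,   1, 2,   2, 1,   2, 2,  10, 1,
              10, 2,  11, 1,  11, 2,   5, 5,   5, 6),
            ncol = 2, byrow = TRUE)

# Group labels as printed in the question (illustrative values)
groupings <- c(1, 1, 1, 1, 3, 2, 3, 2, 1, 1)

# which() returns the original row numbers of A that fall in a group
rows_in_group_3 <- which(groupings == 3)               # 5 7

# drop = FALSE keeps the result a matrix even if only one row matches
points_in_group_3 <- A[rows_in_group_3, , drop = FALSE]

# Original row numbers for every group at once, as a named list
rows_by_group <- split(seq_len(nrow(A)), groupings)
```

Because `groupings` has one entry per row of A in the original order, no reordering is needed: the positions where `groupings == g` are exactly the rows of A that belong to group g.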