Scatter plot of variables in a column of a tibble using dplyr & ggplot2
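I have the following tibble and would like to use it to draw a scatter plot (with ggplot2) of the logcpm values of AA_Colon against those of BA_Colon, matched by gene.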
       gene            sample       logcpm
       <chr>           <chr>         <dbl>
     1 ENSG00000169903 AA_Colon 0.31536340
     2 ENSG00000145321 AA_Colon 0.19735593
     3 ENSG00000171560 AA_Colon 0.00000000
     4 ENSG00000171557 AA_Colon 0.19735593
     5 ENSG00000106327 AA_Colon 0.06882901
     6 ENSG00000228278 AA_Colon 0.13452328
     7 ENSG00000138115 AA_Colon 0.31536340
     8 ENSG00000148702 AA_Colon 0.00000000
     9 ENSG00000140107 AA_Colon 0.00000000
    10 ENSG00000197723 AA_Colon 0.00000000
    11 ENSG00000169903 BA_Colon 1.14724849
    12 ENSG00000145321 BA_Colon 0.08113901
    13 ENSG00000171560 BA_Colon 0.36654820
    14 ENSG00000171557 BA_Colon 0.23088996
    15 ENSG00000106327 BA_Colon 0.08113901
    16 ENSG00000228278 BA_Colon 0.08113901
    17 ENSG00000138115 BA_Colon 0.42987550
    18 ENSG00000148702 BA_Colon 0.00000000
    19 ENSG00000140107 BA_Colon 0.00000000
    20 ENSG00000197723 BA_Colon 0.08113901
Currently I am doing the following, which works:
    tibble %>%
      spread(key = sample, value = logcpm) %>%
      ggplot(aes(x = AA_Colon, y = BA_Colon)) +
      geom_point()
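For reference, in tidyr 1.0.0 and later `spread()` is superseded by `pivot_wider()`. The following is a minimal sketch of the same reshaping step with the newer function. The data frame name `expr` and the three-gene subset are assumptions made for illustration only; the values are copied from the table above.

```r
library(tidyr)
library(ggplot2)

# Hypothetical three-gene subset of the data shown above (name `expr` is assumed)
expr <- data.frame(
  gene   = rep(c("ENSG00000169903", "ENSG00000145321", "ENSG00000171560"), times = 2),
  sample = rep(c("AA_Colon", "BA_Colon"), each = 3),
  logcpm = c(0.31536340, 0.19735593, 0.00000000,
             1.14724849, 0.08113901, 0.36654820)
)

# Reshape to one row per gene and one column per sample
wide <- pivot_wider(expr, names_from = sample, values_from = logcpm)

# Each gene becomes a single point: x from AA_Colon, y from BA_Colon
p <- ggplot(wide, aes(x = AA_Colon, y = BA_Colon)) +
  geom_point()
p
```

The reshaping is the same as with `spread()`; only the argument names differ (`names_from` and `values_from` instead of `key` and `value`).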
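However, I would like to know whether there is a more elegant way to work with the tidy format directly and extract the two vectors for plotting, rather than spreading the data into two columns.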
When the data are in tidy format, ggplot gives a scatter plot with only two columns of points, one for AA_Colon and one for BA_Colon.
    ggplot(tibble, aes(x = sample, y = logcpm)) +
      geom_point()
Perhaps with:
    ggplot(tibble, aes(x = sample, y = logcpm)) +
      geom_boxplot() +
      geom_jitter(width = 0.3)
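On the question of getting a gene-matched scatter plot without calling `spread()`: a scatter plot needs one x value and one y value per point, so the two samples have to be paired by gene somewhere. One possible alternative is to split the long table by sample and join the halves on `gene`. This is only a sketch under stated assumptions: the data frame name `expr`, the three-gene subset, and the `_AA`/`_BA` suffixes are all chosen here for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical three-gene subset of the data shown above (name `expr` is assumed)
expr <- data.frame(
  gene   = rep(c("ENSG00000169903", "ENSG00000145321", "ENSG00000171560"), times = 2),
  sample = rep(c("AA_Colon", "BA_Colon"), each = 3),
  logcpm = c(0.31536340, 0.19735593, 0.00000000,
             1.14724849, 0.08113901, 0.36654820)
)

# Split the long table into one piece per sample
aa <- filter(expr, sample == "AA_Colon")
ba <- filter(expr, sample == "BA_Colon")

# Pair the two samples by gene; the non-key columns receive the given suffixes,
# yielding logcpm_AA and logcpm_BA (suffix names are an arbitrary choice)
paired <- inner_join(aa, ba, by = "gene", suffix = c("_AA", "_BA"))

# One point per gene
p <- ggplot(paired, aes(x = logcpm_AA, y = logcpm_BA)) +
  geom_point()
p
```

Unlike the reshaping approach, `inner_join()` silently drops genes that are present in only one of the two samples, which may or may not be what is wanted.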