How to format a number as percentage in R?
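One of the things that puzzled me about R is how to format a number as a percentage for printing.
For example, both of the following turn proportions into percentage strings: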
```r
set.seed(1)
m <- runif(5)

paste(round(100*m, 2), "%", sep="")
# [1] "26.55%" "37.21%" "57.29%" "90.82%" "20.17%"

sprintf("%1.2f%%", 100*m)
# [1] "26.55%" "37.21%" "57.29%" "90.82%" "20.17%"
```
Question: Is there a base R function that does this? Alternatively, is there a widely used package that provides a convenient wrapper?
Although in …
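Even later:
As @DzimitryM pointed out, the scales package now offers label_percent(). It returns a function, hence the second pair of parentheses: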
```r
library(scales)
x <- c(-1, 0, 0.1, 0.555555, 1, 100)
label_percent()(x)
## [1] "-100%"   "0%"      "10%"     "56%"     "100%"    "10 000%"
```
Customize this by adding arguments inside the first set of parentheses.
```r
label_percent(big.mark = ",", suffix = " percent")(x)
## [1] "-100 percent"   "0 percent"      "10 percent"
## [4] "56 percent"     "100 percent"    "10,000 percent"
```
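The two pairs of parentheses can look odd at first. The first call fixes the options and returns a formatter; the second call applies that formatter to the data. Here is a minimal base R sketch of the same pattern. Note that make_percent_labeller is a hypothetical name of mine, not part of scales, and this is not how scales implements it:

```r
# Hypothetical function factory, for illustration only (not the scales API)
make_percent_labeller <- function(digits = 0, suffix = "%") {
  force(digits)
  force(suffix)
  # The returned closure remembers digits and suffix
  function(x) paste0(formatC(100 * x, format = "f", digits = digits), suffix)
}

make_percent_labeller()(c(0.1, 0.555555))
## [1] "10%" "56%"
make_percent_labeller(digits = 1, suffix = " percent")(0.555555)
## [1] "55.6 percent"
```

Storing the result of the first call (say, fmt <- make_percent_labeller(digits = 1)) lets you reuse the same formatter in several places.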
An update, several years later:
These days, …
Try something like
```r
percent <- function(x, digits = 2, format = "f", ...) {
  paste0(formatC(100 * x, format = format, digits = digits, ...), "%")
}
```
With usage, e.g.,
```r
x <- c(-1, 0, 0.1, 0.555555, 1, 100)
percent(x)
```
(If you prefer, you can change the format from "f" to another formatC format code.)
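For reference, here is what the hand-rolled function returns for that input, plus one call that passes an extra formatC argument through the dots. The definition is repeated only so the snippet runs on its own; big.mark is a standard formatC argument, but using it here is my addition, not part of the answer:

```r
# Same definition as in the answer, repeated so this snippet is self-contained
percent <- function(x, digits = 2, format = "f", ...) {
  paste0(formatC(100 * x, format = format, digits = digits, ...), "%")
}

x <- c(-1, 0, 0.1, 0.555555, 1, 100)
percent(x)
## [1] "-100.00%"  "0.00%"     "10.00%"    "55.56%"    "100.00%"   "10000.00%"

# Extra arguments are forwarded to formatC(), e.g. a thousands separator
percent(x, digits = 0, big.mark = ",")
## [1] "-100%"   "0%"      "10%"     "56%"     "100%"    "10,000%"
```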
Check out the scales package:
```r
library('scales')
percent((1:10) / 100)
#  [1] "1%"  "2%"  "3%"  "4%"  "5%"  "6%"  "7%"  "8%"  "9%"  "10%"
```
The built-in logic for detecting the precision should work well enough for most cases.
```r
percent((1:10) / 1000)
#  [1] "0.1%" "0.2%" "0.3%" "0.4%" "0.5%" "0.6%" "0.7%" "0.8%" "0.9%" "1.0%"
percent((1:10) / 100000)
#  [1] "0.001%" "0.002%" "0.003%" "0.004%" "0.005%" "0.006%" "0.007%" "0.008%"
#  [9] "0.009%" "0.010%"
percent(sqrt(seq(0, 1, by=0.1)))
#  [1] "0%"   "32%"  "45%"  "55%"  "63%"  "71%"  "77%"  "84%"  "89%"  "95%"
# [11] "100%"
percent(seq(0, 0.1, by=0.01) ** 2)
#  [1] "0.00%" "0.01%" "0.04%" "0.09%" "0.16%" "0.25%" "0.36%" "0.49%" "0.64%"
# [10] "0.81%" "1.00%"
```
The formattable package also has a percent function:
```r
library(formattable)
x <- c(0.23, 0.95, 0.3)
percent(x)
# [1] 23.00% 95.00% 30.00%
```
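I did some speed benchmarking on these answers and was surprised to find that scales::percent() was by far the slowest of the approaches tested.
Here are the results of formatting a vector of 100,000 values in (0,1) as percentages with 2 digits: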
```r
library(microbenchmark)
x = runif(1e5)
microbenchmark(times = 100L, andrie1(), andrie2(), richie(), krlmlr())
# Unit: milliseconds
#   expr            min        lq      mean    median        uq       max
# 1 andrie1()  91.08811  95.51952  99.54368  97.39548 102.75665 126.54918  #paste(round())
# 2 andrie2()  43.75678  45.56284  49.20919  47.42042  51.23483  69.10444  #sprintf()
# 3 richie()   79.35606  82.30379  87.29905  84.47743  90.38425 112.22889  #paste(formatC())
# 4 krlmlr()  243.19699 267.74435 304.16202 280.28878 311.41978 534.55904  #scales::percent()
```
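So when we want to add a percent sign, sprintf() is the clear winner. In the second set of timings, labeled with round(), sprintf() and formatC() alone, round() is by far the fastest: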
```r
# Unit: milliseconds
#   expr           min        lq      mean    median        uq       max
# 1 andrie1()  4.43576  4.514349  4.583014  4.547911  4.640199  4.939159  # round()
# 2 andrie2() 42.26545 42.462963 43.229595 42.960719 43.642912 47.344517  # sprintf()
# 3 richie()  64.99420 65.872592 67.480730 66.731730 67.950658 96.722691  # formatC()
```
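The wrapper functions being timed are not shown in this answer. Below is a plausible reconstruction, assuming each one simply wraps the expression named in the trailing comment of the first table. The bodies are my guess, and I use a tiny fixed input instead of the 100,000 random values so the outputs are easy to compare:

```r
# Hypothetical reconstructions; the bodies are inferred from the table comments
x <- c(0.1234, 0.5, 0.98761)

andrie1 <- function() paste(round(100 * x, 2), "%", sep = "")   # paste(round())
andrie2 <- function() sprintf("%1.2f%%", 100 * x)               # sprintf()
richie  <- function() {                                         # paste(formatC())
  paste0(formatC(100 * x, format = "f", digits = 2), "%")
}

andrie1()
## [1] "12.34%" "50%"    "98.76%"
andrie2()
## [1] "12.34%" "50.00%" "98.76%"
richie()
## [1] "12.34%" "50.00%" "98.76%"
```

The outputs are not strictly equivalent: round() drops trailing zeros, so the paste(round()) variant prints "50%" where the other two print "50.00%". That may matter more than the timing difference if you need a fixed number of decimals.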
You can use the scales package just for this operation (without loading it with require or library):
```r
scales::percent(m)
```
Here is a solution that defines a new function (mostly so I can play around with Curry and Compose :-)):
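```r
# Curry() and Compose() here come from the old roxygen package;
# the functional package exports helpers with the same names
library(roxygen)
printpct <- Compose(function(x) x*100, Curry(sprintf,fmt="%1.2f%%"))
```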
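If you do not have a package providing Curry and Compose at hand, the same idea can be sketched in base R. The helpers compose2 and curry below are hypothetical stand-ins I wrote for illustration; they only mimic the behaviour this example needs (apply the first function, then the second; pre-fill some arguments):

```r
# Hypothetical minimal stand-ins for Compose() and Curry()
compose2 <- function(f, g) function(...) g(f(...))   # apply f first, then g
curry <- function(f, ...) {
  fixed <- list(...)                                  # arguments fixed up front
  function(...) do.call(f, c(fixed, list(...)))
}

printpct <- compose2(function(x) x * 100, curry(sprintf, fmt = "%1.2f%%"))
printpct(c(0.2655087, 0.5))
## [1] "26.55%" "50.00%"
```

Of course, a plain function(x) sprintf("%1.2f%%", 100 * x) does the same job; the composition is just for fun, as in the answer.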
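Seeing the benchmark above, I compared formattable::percent() with scales::percent() and the paste0(round()) approach: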
```r
library(microbenchmark)
library(scales)
library(formattable)

x<-runif(1e5)

lilip <- function() formattable::percent(x,2)
krlmlr <- function() scales::percent(x)
andrie1 <- function() paste0(round(x,4) * 100, '%')

microbenchmark(times=100L,lilip(), krlmlr(), andrie1())
```
These are the results I got:
```
Unit: microseconds
      expr        min          lq        mean      median          uq        max neval
   lilip()    194.562    373.7335    772.5663    889.7045    950.4035   1611.537   100
  krlmlr() 226270.845 237985.6560 260194.9269 251581.0235 280704.2320 373022.180   100
 andrie1()  87916.021  90437.4820  92791.8923  92636.8420  94448.7040 102543.252   100
```
However, I don't know why my …
```r
> library(tidyverse)
> set.seed(1)
> m <- runif(5)
> dt <- as.data.frame(m)
> dt %>% mutate(perc=scales::percent(m,accuracy=0.001))
          m    perc
1 0.2655087 26.551%
2 0.3721239 37.212%
3 0.5728534 57.285%
4 0.9082078 90.821%
5 0.2016819 20.168%
```
Looks as tidy as ever.
This function converts data to percentages by column:
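For comparison, the same column can be added without any packages. This is only a sketch of a base R counterpart, assuming three fixed decimals are what you want; sprintf() stands in for scales::percent(accuracy = 0.001) here and is not claimed to match it in every edge case:

```r
set.seed(1)
m <- runif(5)
dt <- data.frame(m)

# Three decimals, then a literal percent sign (%% inside the format string)
dt$perc <- sprintf("%.3f%%", 100 * dt$m)
dt
##           m    perc
## 1 0.2655087 26.551%
## 2 0.3721239 37.212%
## 3 0.5728534 57.285%
## 4 0.9082078 90.821%
## 5 0.2016819 20.168%
```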
```r
percent.colmns = function(base, columnas = 1:ncol(base), filas = 1:nrow(base)){
    base2 = base
    for(j in columnas){
        suma.c = sum(base[,j])
        for(i in filas){
            base2[i,j] = base[i,j]*100/suma.c
        }
    }
    return(base2)
}
```
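The nested loops can be replaced by a vectorized version. Here is a minimal sketch, assuming all selected columns are numeric and that you want every row converted (it drops the filas argument); percent_columns is a name I made up for the illustration:

```r
# Hypothetical vectorized variant: each selected column becomes its share
# of the column total, expressed in percent
percent_columns <- function(base, columnas = seq_len(ncol(base))) {
  base[columnas] <- lapply(base[columnas], function(col) 100 * col / sum(col))
  base
}

df <- data.frame(a = c(1, 3), b = c(2, 2))
percent_columns(df)
##    a  b
## 1 25 50
## 2 75 50
```

The result is still numeric, so it can be passed on to any of the formatting functions above if you want the "%" sign appended.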
Try this:

```r
data_format <- function(data,digit=2,type='%'){
  # 'd' is shorthand for a fixed-point number with no decimals
  if(type=='d') {
    type = 'f';
    digit = 0;
  }
  switch(type,
         '%' = {format <- paste("%.", digit,"f%", type, sep='');num <- 100},
         'f' = {format <- paste("%.", digit, type, sep='');num <- 1},
         # fail early; otherwise sprintf() below would run without format/num set
         stop(type, " is not a recognized type")
  )
  sprintf(format, num * data)
}
```