Faster and precise way to count lines other than wc -l
I usually use wc -l to count the lines in a file.
Are there any other options?
Anyone who is serious about line-counting speed can write their own implementation:
    #include <stdio.h>
    #include <string.h>
    #include <fcntl.h>
    #include <unistd.h>   /* read, close */

    #define BUFFER_SIZE (1024 * 16)
    char BUFFER[BUFFER_SIZE];

    int main(int argc, char** argv)
    {
        unsigned int lines = 0;
        int fd;
        ssize_t r;

        if (argc > 1) {
            char* file = argv[1];
            if ((fd = open(file, O_RDONLY)) == -1) {
                fprintf(stderr, "Unable to open file \"%s\".\n", file);
                return 1;
            }
        } else {
            fd = fileno(stdin);
        }

        /* Read fixed-size chunks and count the newline bytes in each one. */
        while ((r = read(fd, BUFFER, BUFFER_SIZE)) > 0) {
            char* p = BUFFER;
            while ((p = memchr(p, '\n', (BUFFER + r) - p))) {
                ++p;
                ++lines;
            }
        }

        close(fd);

        if (r == -1) {
            fprintf(stderr, "Read error.\n");
            return 1;
        }

        printf("%u\n", lines);
        return 0;
    }
Usage:
    a < input
    ... | a
    a file
Example:
    # time ./wc temp.txt
    10000000

    real 0m0.115s
    user 0m0.102s
    sys 0m0.014s

    # time wc -l temp.txt
    10000000 temp.txt

    real 0m0.120s
    user 0m0.103s
    sys 0m0.016s
You can try sed:
    sed -n '$=' file
Or here is a way in Perl. Save it as wc.pl and make it executable (chmod +x wc.pl):
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $filename = shift @ARGV;
    die "Usage: $0 FILE\n" unless defined $filename;
    my $lines = 0;
    my $buffer;

    open(my $fh, '<', $filename) or die "ERROR: Can not open file: $!";
    # Read 64 KiB at a time; tr/\n// returns how many newlines the chunk holds.
    while (sysread $fh, $buffer, 65536) {
        $lines += ($buffer =~ tr/\n//);
    }
    close $fh;

    print "$lines\n";
Run it like this:
    ./wc.pl yourfile
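Basically, it reads the file in 64 kB chunks and then takes advantage of the fact that tr returns the number of characters it matched, which here is the number of newlines in each chunk.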
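The same idea, counting newline bytes instead of parsing lines, can be tried with standard tools before writing any code. This is only a minimal sketch: count_newlines is a hypothetical helper name, and whether it beats wc -l on a given system is not something the answers here measured.

```shell
# Hypothetical helper: count the newline bytes arriving on stdin.
count_newlines() {
    # tr -cd '\n' deletes every byte except newline; wc -c counts what is left.
    # The final tr strips the padding some wc implementations add.
    tr -cd '\n' | wc -c | tr -d ' '
}

printf 'one\ntwo\nthree\n' | count_newlines   # prints 3
```

To count a file, redirect it in: count_newlines < yourfile.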
Try nl and see what happens...
You can use awk:
    awk 'END {print NR}' names.txt
(or) use a bash while read loop:
    CNT=0; while read -r LINE; do (( CNT++ )); done < names.txt; echo $CNT
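One caveat when precision matters: these approaches disagree when the last line has no trailing newline. The sketch below is an illustration, not part of the original answers; count_wc, count_awk and count_read are hypothetical helper names, and each reads stdin so the same input can be fed to all three.

```shell
# Hypothetical helpers: each reads stdin and prints a line count.
count_wc()   { wc -l | tr -d ' '; }          # counts newline bytes
count_awk()  { awk 'END {print NR}'; }       # counts records
count_read() (
    # Subshell body keeps n local; read fails on a final unterminated line,
    # so that line is not counted.
    n=0
    while read -r line; do n=$((n + 1)); done
    echo "$n"
)

# Three lines, the last one without a trailing newline.
printf 'a\nb\nc' | count_wc     # prints 2
printf 'a\nb\nc' | count_awk    # prints 3
printf 'a\nb\nc' | count_read   # prints 2
```

If the file always ends with a newline, all three agree; otherwise awk counts the final partial line and the other two do not.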
It depends on how you open the file, but reading it from stdin instead will probably fix it:
    wc -l < file