unix - cut command (adding own delimiter)

Given a file containing data like this (i.e. the stores.dat file):

id               storeNo     type
2ttfgdhdfgh      1gfdkl-28   kgdl
9dhfdhfdfh       2t-33gdm    dgjkfndkgf

Desired output: the same data with a "|" delimiter added between each of these 3 cut ranges:

cut -c1-18,19-30,31-40 stores.dat

What is the syntax for inserting a delimiter between each cut?

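BONUS pts (if you can provide an option to trim the values, like this):

id|storeNo|type
2ttfgdhdfgh|1gfdkl-28|kgdl
9dhfdhfdfh|2t-33gdm|dgjkfndkgf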

Update (thanks to Mat's answer): I ended up succeeding with this solution (it's a bit messy, but SunOS and my bash version don't seem to support the more elegant arithmetic):

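#!/bin/bash
# Build a perl unpack template (e.g. A17A12A10) from ranges such as 1-17 18-29 30-39.
unpack=""
filename="$1"
while [ $# -gt 0 ] ; do
    arg="$1"
    # Skip the first argument (the input file); the rest are start-end ranges.
    if [ "$arg" != "$filename" ]
    then
        firstcharpos=`echo "$arg" | awk -F"-" '{print $1}'`
        secondcharpos=`echo "$arg" | awk -F"-" '{print $2}'`
        # width = end - start + 1, done with expr instead of $(( ))
        compute=`expr $firstcharpos - $secondcharpos`
        compute=`expr $compute \* -1 + 1`
        unpack=$unpack"A"$compute
    fi
    shift
done
perl -ne 'print join("|",unpack("'$unpack'", $_)),"\n";' "$filename"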

Usage: sh test.sh input_file 1-17 18-29 30-39

Related discussion

Since you used cut in your example,
assuming each field is separated by a tab:

If that is not the case, add the input delimiter switch -d.
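
For reference, GNU cut has an --output-delimiter option that sets the separator written between selected fields, and also between separate -c character ranges. The sketch below assumes GNU coreutils (the SunOS cut mentioned in the question may not have this option); the function names and sample rows are made up for illustration.

```shell
#!/bin/bash
# Assumes GNU cut; --output-delimiter is not part of POSIX cut.

# Tab-separated input: select fields 1-3 and join them with "|".
tabs_to_pipes() {
    cut -f1-3 --output-delimiter='|'
}

# Fixed-width input: the delimiter is written between the separate -c ranges.
ranges_to_pipes() {
    cut -c1-17,18-29,30-39 --output-delimiter='|'
}

# Hypothetical sample rows.
printf 'id\tstoreNo\ttype\n' | tabs_to_pipes
printf '%-17s%-12s%-10s\n' id storeNo type | ranges_to_pipes
```

The -c variant keeps the padding inside each column, so trimming would still need a separate step.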

Related discussion

I would use awk:

awk '{print $1 "|" $2 "|" $3}'

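Like some of the other suggestions, it assumes the columns are separated by whitespace and ignores character positions. If one of the fields contains a space, it will not work.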

Related discussion

A better awk solution, based on character positions rather than whitespace:

$ awk -v FIELDWIDTHS='17 12 10' -v OFS='|' '{ $1=$1""; print }' stores.dat | tr -d ' '

id|storeNo|type
2ttfgdhdfgh|1gfdkl-28|kgdl
9dhfdhfdfh|2t-33gdm|dgjkfndkgf
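
FIELDWIDTHS is a gawk extension, so the command above may not work with other awk implementations. A portable sketch, assuming the same 17/12/10 column widths, uses substr and trims only the trailing spaces of each field; the function name and the sample rows are made up for illustration.

```shell
#!/bin/bash
# Split fixed-width lines (widths 17, 12, 10) and join the trimmed fields with "|".
fixed_to_pipes() {
    awk '{
        a = substr($0, 1, 17)    # characters 1-17
        b = substr($0, 18, 12)   # characters 18-29
        c = substr($0, 30, 10)   # characters 30-39
        # Remove trailing padding only; spaces inside a value are kept.
        sub(/ +$/, "", a); sub(/ +$/, "", b); sub(/ +$/, "", c)
        print a "|" b "|" c
    }'
}

# printf reuses the format for every group of three arguments (two rows here).
printf '%-17s%-12s%-10s\n' id storeNo type 2ttfgdhdfgh 1gfdkl-28 kgdl | fixed_to_pipes
```

Unlike piping through tr -d ' ', this leaves spaces inside a value untouched.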

If you're not afraid of using perl, here's a one-liner:

$ perl -ne 'print join("|",unpack("A17A12A10", $_)),"\n";' input

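The unpack call extracts a 17-character string, then a 12-character string, then a 10-character string from the input line, and returns them as a list (with trailing spaces stripped). join adds the |s.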

If you want to give the input columns in x-y format without writing a "real" script, you can hack it like this (but it's ugly):

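#!/bin/bash
# Build a perl unpack template from x-y ranges; the last argument is the input file.
unpack=""

while [ $# -gt 1 ] ; do
    arg=$(($1))                        # "1-17" is evaluated as arithmetic: -16
    shift
    unpack=$unpack"A"$((-1*$arg+1))    # width = end - start + 1
done

perl -ne 'print join("|",unpack("'$unpack'", $_)),"\n";' "$1"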

Usage: t.sh 1-17 18-29 30-39 input_file
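
The width arithmetic used above (end - start + 1 for each x-y range) can also be written with shell parameter expansion rather than by evaluating the range as a subtraction. A sketch, assuming bash and well-formed start-end arguments; the function name is hypothetical.

```shell
#!/bin/bash
# Turn ranges such as 1-17 18-29 30-39 into a perl unpack template.
ranges_to_template() {
    local template="" range start end
    for range in "$@"; do
        start=${range%-*}    # text before the "-"
        end=${range#*-}      # text after the "-"
        template+="A$((end - start + 1))"
    done
    printf '%s\n' "$template"
}

ranges_to_template 1-17 18-29 30-39
```

For the ranges shown this prints A17A12A10, the same template used in the perl one-liner.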

Related discussion

You can simply use

cat stores.dat | tr -s ' ' '|'

How about using just the tr command:

tr -s " " "|" < stores.dat

From the man page:

-s Squeeze multiple occurrences of the characters listed in the last
operand (either string1 or string2) in the input into a single
instance of the character. This occurs after all deletion and
translation is completed.

You can easily redirect it to a new file like this:

[jaypal:~/Temp] tr -s " " "|" < stores.dat > new.stores.dat

Note: As Mat pointed out in the comments, this solution assumes that the columns are separated by one or more spaces, not laid out at fixed widths.
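
To make that assumption concrete, here is a small sketch of how tr -s behaves: each space is translated to "|" and runs of "|" are then squeezed to one, so a space inside a value is indistinguishable from column padding. The second sample row is hypothetical.

```shell
#!/bin/bash
# Translate spaces to "|" and squeeze repeated "|" into a single one.
spaces_to_pipes() {
    tr -s ' ' '|'
}

# Columns padded with spaces come out as expected.
printf 'id   storeNo  type\n' | spaces_to_pipes

# A value containing a space is split into two columns.
printf 'New York  12  shop\n' | spaces_to_pipes
```

The first line becomes id|storeNo|type, while the second becomes New|York|12|shop, with one column too many.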

As far as I know, you can't do this with cut, but you can do it easily with sed, as long as the values in each column never contain internal spaces:

sed -e 's/  */|/g'

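Edit: If the file really is in a fixed-column format and you don't want to use perl as Mat showed, this can be done with sed, but it isn't pretty: sed's basic regular expressions don't accept the .{17} form of numeric repetition, so here the right number of dots is typed out: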

sed -e 's/^\(.................\)\(............\)\(..........\)$/\1|\2|\3/; s/  *|/|/g'
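
POSIX basic regular expressions do provide interval expressions, written \{n\}, so the dots need not be typed out. The sketch below assumes the same 17/12/10 layout and a POSIX-conforming sed. It also lets the last column be shorter than 10 characters and trims trailing padding, both of which go beyond the command above. The function name is made up.

```shell
#!/bin/bash
# Split 17/12/10 fixed-width lines with BRE intervals, then trim the padding.
fixed_to_pipes_sed() {
    sed -e 's/^\(.\{17\}\)\(.\{12\}\)\(.\{1,10\}\)$/\1|\2|\3/' \
        -e 's/  *|/|/g' \
        -e 's/  *$//'
}

printf '%-17s%-12s%-10s\n' 2ttfgdhdfgh 1gfdkl-28 kgdl | fixed_to_pipes_sed
```

For the sample row this prints 2ttfgdhdfgh|1gfdkl-28|kgdl.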

Related discussion

Use 'sed' to search and replace parts of a file based on regular expressions.

Replace the whitespace in infile1 with '|':

sed -e 's/[ \t]/|/g' infile1 > outfile3