bash: Appending two files based on a common column

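I know this question may be a duplicate, but I am not sure what to search for. So here is my requirement: I want to match column 1 between the two files and append column 4 of file2.txt to file1.txt. If a line has no match, I want to append "0" as the last field of that line in file1.txt.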

I got both files directly from the NSE website.

The data in file1.txt looks like this:

20MICRONS,20170207,41.4,41.75,40.75,40.95,74624
3IINFOTECH,20170207,5.5,5.65,5.5,5.6,2679590
3MINDIA,20170207,11865.7,11919.95,11632.05,11892.25,425

and so on. This is my master file, so every line of file1.txt should be kept.

The data in file2.txt looks like this:

20MICRONS,EQ,57597,77.18
3IINFOTECH,EQ,1795693,67.01

and so on...

Note that the two files may not contain the same number of lines.

My output file should look like this:

20MICRONS,20170207,41.4,41.75,40.75,40.95,74624,77.18
3IINFOTECH,20170207,5.5,5.65,5.5,5.6,2679590,67.01
3MINDIA,20170207,11865.7,11919.95,11632.05,11892.25,425,0

I tried this:

awk -F, 'NR==FNR{a[$1]=$0; next} {print a[$1]","$4}' file1.txt file2.txt

but it does not output the entire line from file1.txt.


You can use join:

join -a 1 -e 0 -t ',' -j 1 -o '1.1 1.2 1.3 1.4 1.5 1.6 1.7 2.4' file1 file2

If the files are not already sorted on column 1, you have to sort them first:

join -a1 -e0 -t, -j1 -o '1.1 1.2 1.3 1.4 1.5 1.6 1.7 2.4' \
    <(sort -t, -k1,1 file1) \
    <(sort -t, -k1,1 file2)
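
To check the command against the sample rows from the question, here is a self-contained sketch. It assumes a plain POSIX sh without process substitution, so the sorted copies go to temporary files, and it sets `LC_ALL=C` so that sort and join agree on the ordering. The `tmp` directory and the `joined` variable are illustrative names, not part of the original answer.

```shell
#!/bin/sh
# Use the same collation for sort and join.
export LC_ALL=C

# Scratch directory holding the sample rows from the question (deliberately unsorted).
tmp=$(mktemp -d)
cat > "$tmp/file1" <<'EOF'
3MINDIA,20170207,11865.7,11919.95,11632.05,11892.25,425
20MICRONS,20170207,41.4,41.75,40.75,40.95,74624
3IINFOTECH,20170207,5.5,5.65,5.5,5.6,2679590
EOF
cat > "$tmp/file2" <<'EOF'
3IINFOTECH,EQ,1795693,67.01
20MICRONS,EQ,57597,77.18
EOF

# join needs both inputs sorted on the join field (column 1 only).
sort -t, -k1,1 "$tmp/file1" > "$tmp/file1.sorted"
sort -t, -k1,1 "$tmp/file2" > "$tmp/file2.sorted"

# -a 1 keeps unmatched lines of file1; -e 0 fills the missing 2.4 field with 0.
joined=$(join -t, -1 1 -2 1 -a 1 -e 0 \
    -o '1.1 1.2 1.3 1.4 1.5 1.6 1.7 2.4' \
    "$tmp/file1.sorted" "$tmp/file2.sorted")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

Note that the output comes out in sorted key order, not in the original order of file1.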

Something like this should work (assuming each entry in column 1 appears only once in each file):

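while IFS= read -r line
do
   # key: first comma-separated field of the file1.txt line
   col1=$(printf '%s\n' "$line" | awk -F, '{print $1}')
   # exact match on column 1 of file2.txt (a bare grep would also match substrings)
   col4=$(awk -F, -v key="$col1" '$1 == key {print $4; exit}' file2.txt)
   # append 0 when file2.txt has no matching line
   echo "$line,${col4:-0}"
done < file1.txt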

20MICRONS,20170207,41.4,41.75,40.75,40.95,74624,77.18
3IINFOTECH,20170207,5.5,5.65,5.5,5.6,2679590,67.01
3MINDIA,20170207,11865.7,11919.95,11632.05,11892.25,425,0
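
For comparison, the same lookup can be done in a single awk pass, which is close to the attempt in the question with the file order swapped. This is a minimal sketch, assuming column 1 is unique in file2.txt and neither file has a header row; the sample rows are written to a scratch directory, and `tmp`, `val` and `result` are names chosen only for this illustration.

```shell
#!/bin/sh
# Sample rows copied from the question, written to a scratch directory.
tmp=$(mktemp -d)
cat > "$tmp/file1.txt" <<'EOF'
20MICRONS,20170207,41.4,41.75,40.75,40.95,74624
3IINFOTECH,20170207,5.5,5.65,5.5,5.6,2679590
3MINDIA,20170207,11865.7,11919.95,11632.05,11892.25,425
EOF
cat > "$tmp/file2.txt" <<'EOF'
20MICRONS,EQ,57597,77.18
3IINFOTECH,EQ,1795693,67.01
EOF

# While reading the first file (file2.txt): remember column 4 keyed by column 1.
# While reading the second file (file1.txt): print the whole line plus the
# stored value, or 0 when the key was never seen.
result=$(awk -F, '
    NR == FNR { val[$1] = $4; next }
    { print $0 "," (($1 in val) ? val[$1] : 0) }
' "$tmp/file2.txt" "$tmp/file1.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

Unlike the loop, this reads each file once and keeps file1.txt in its original order without sorting. One caveat: the `NR == FNR` test misfires if file2.txt is empty, because the lines of file1.txt are then treated as the lookup table.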