csv: Delete specific rows and the corresponding files in Python

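I want to delete 90% of the rows whose "steering" value equals 0. Each row has three corresponding image files: center, left and right. I want to delete those too. The CSV file looks like this: [image]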

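I wrote the following code to at least get the files whose steering value is 0. All I need now is code that randomly picks 90% of those files and deletes them.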

import csv
import numpy as np

lines = []    # every row of the CSV, header included
index = []    # row numbers as read from the file
line_no = []  # steering value of each data row
count = 0     # number of rows whose steering value is 0

with open('data/driving_log.csv') as csvfile:
    reader = csv.reader(csvfile)
    for i, line in enumerate(reader):
        lines.append(line)
        index.append(i)

lines = np.delete(lines, 0, axis=0)  # drop the header row
for i, line in enumerate(lines):
    line_no.append(line[3].astype(np.float32))
    if line_no[i] == 0.0:
        # The first three columns hold the image paths.
        for j in range(3):
            source_path = line[j]
            filename = source_path.split('/')[-1]
            print(filename)
        count += 1


I think this will do what you need:

import csv
from random import randint
from os import remove

# Read the whole CSV into a 2D list (row 0 is the header)
lines = []
with open('data/driving_log.csv', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for line in reader:
        lines.append(line)

# Number of zero-steering rows to keep: 10% of them.
# The header row never matches '0', so it is not counted.
numToKeep = round(sum(1 for row in lines if row[3] == '0') * 0.1)

# Randomly pick that many zero-steering rows to keep
toKeep = []
for _ in range(numToKeep):
    while True:
        index = randint(1, len(lines) - 1)
        # Make sure we haven't already selected the same line
        if lines[index] not in toKeep and lines[index][3] == '0':
            toKeep.append(lines[index])
            break

# Delete the image files of the remaining 90% of zero-steering rows
for i, line in enumerate(lines):
    if i == 0:  # Omit the header row
        continue
    if line[3] != '0':  # Keep rows that don't have a steering value of 0
        toKeep.append(line)
    if line not in toKeep:
        print("Deleting: {}".format(line))
        for j in range(3):  # columns 0 to 2 hold the image paths
            remove(line[j])

# Rewrite the CSV: header first, then the kept rows
with open('data/driving_log.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(lines[0])  # Put the header back in
    writer.writerows(toKeep)

I realize this isn't the most elegant solution. I'm not familiar with numpy and don't have time to learn it right now, but this should work.
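
A more compact variant of the same idea, offered as a minimal sketch rather than a drop-in replacement: it picks the rows to drop with a single random.sample call instead of retrying randint until enough unique rows turn up, and it keeps the surviving rows in their original order. The function name prune_zero_steering, its parameters, and the decision to parse the steering column with float() (so that values such as '0.0' or ' 0' also count as zero) are assumptions, not part of the original post. It also assumes the first three columns hold image paths that are valid relative to the working directory.

```python
import csv
import os
import random


def prune_zero_steering(csv_path, drop_fraction=0.9, steering_col=3, seed=None):
    """Drop a random fraction of zero-steering rows and delete their images.

    Hypothetical helper: the name and parameters are illustrative assumptions.
    Returns the number of rows dropped.
    """
    with open(csv_path, newline='') as f:
        header, *rows = list(csv.reader(f))

    # Indices of data rows whose steering value parses to exactly 0
    zero_idx = [i for i, row in enumerate(rows)
                if float(row[steering_col]) == 0.0]

    # Choose the rows to drop, without replacement, in one call
    rng = random.Random(seed)
    to_drop = set(rng.sample(zero_idx, round(len(zero_idx) * drop_fraction)))

    for i in to_drop:
        for path in rows[i][:3]:  # the three image paths of this row
            path = path.strip()
            if os.path.exists(path):  # skip files that are already gone
                os.remove(path)

    # Rewrite the CSV, preserving the original row order
    with open(csv_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(row for i, row in enumerate(rows) if i not in to_drop)

    return len(to_drop)
```

Calling prune_zero_steering('data/driving_log.csv') would apply it to the file from the question. Since the image deletion cannot be undone, trying it on a copy of the data first is probably wise.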