Creating a dictionary from a csv file?
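I am trying to create a dictionary from a csv file. The first column of the csv file contains unique keys and the second column contains values. Each row of the csv file represents a unique key, value pair in the dictionary. I tried to use the following code: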
    import csv

    with open('coors.csv', mode='r') as infile:
        reader = csv.reader(infile)
        with open('coors_new.csv', mode='w') as outfile:
            writer = csv.writer(outfile)
            for rows in reader:
                k = rows[0]
                v = rows[1]
                mydict = {k:v for k, v in rows}
            print(mydict)
When I run the code above, I get an error.
I believe the syntax you were looking for is as follows:
    import csv

    with open('coors.csv', mode='r') as infile:
        reader = csv.reader(infile)
        with open('coors_new.csv', mode='w') as outfile:
            writer = csv.writer(outfile)
            # Build the whole dictionary in one pass over the reader
            mydict = {rows[0]: rows[1] for rows in reader}
Alternatively, for Python versions without dict comprehensions (before 2.7), you will want:
    mydict = dict((rows[0], rows[1]) for rows in reader)
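To make the answer above easy to try without any files on disk, here is a minimal sketch of the same dict comprehension. The in-memory sample data is made up for illustration and only stands in for coors.csv; it assumes every row has at least two columns.

```python
import csv
import io

# Hypothetical stand-in for coors.csv: first column is the key, second the value.
sample = io.StringIO("a,1\nb,2\nc,3\n")

reader = csv.reader(sample)
# Each row is a list of strings, so row[0] is the key and row[1] the value.
mydict = {row[0]: row[1] for row in reader}

print(mydict)  # {'a': '1', 'b': '2', 'c': '3'}
```

Note that the values stay strings; convert them yourself if you need numbers.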
Open the file by calling open and then csv.DictReader:
    import csv
    input_file = csv.DictReader(open("coors.csv"))
You can iterate over the rows of the csv file's DictReader object by iterating over input_file.
    for row in input_file:
        print(row)
Or, to access only the first row:
    dictobj = next(csv.DictReader(open('coors.csv')))
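It may help to see what DictReader actually yields. The sketch below assumes the file has a header row (here the made-up column names key and value), because DictReader takes the first row as field names. If your file has no header, that first data row would be consumed as the header instead.

```python
import csv
import io

# Hypothetical sample with a header row; the column names are assumptions.
sample = io.StringIO("key,value\na,1\nb,2\n")

rows = list(csv.DictReader(sample))

# Every row is a mapping from column name to cell value.
first = rows[0]

# Collapse the per-row mappings into a single key -> value dictionary.
lookup = {row["key"]: row["value"] for row in rows}

print(first)
print(lookup)
```

So DictReader gives one dictionary per row, and a second step is still needed to get one dictionary for the whole file.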
    import csv
    reader = csv.reader(open('filename.csv', 'r'))
    d = {}
    for row in reader:
        k, v = row
        d[k] = v
This isn't very elegant, but here is a one-line solution using pandas.
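    import pandas as pd
    # squeeze("columns") turns the single remaining column into a Series;
    # the older squeeze=True keyword of read_csv was removed in pandas 2.0
    pd.read_csv('coors.csv', header=None, index_col=0).squeeze("columns").to_dict()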
If you want to specify a dtype for the index (because of a bug, it cannot be specified in read_csv when the index_col argument is used):
    import pandas as pd
    pd.read_csv('coors.csv', header=None, dtype={0: str}).set_index(0).squeeze().to_dict()
You just need to convert the csv.reader to a dict:
    ~ >> cat > 1.csv
    key1, value1
    key2, value2
    key2, value22
    key3, value3

    ~ >> cat > d.py
    import csv
    with open('1.csv') as f:
        d = dict(filter(None, csv.reader(f)))

    print(d)

    ~ >> python d.py
    {'key3': ' value3', 'key2': ' value22', 'key1': ' value1'}
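Two things are visible in that output: the values keep the space that follows each comma, and the repeated key2 ends up with the last value seen. A small sketch of one way to deal with the first point, using the skipinitialspace option of csv.reader on the same sample data held in memory:

```python
import csv
import io

# Same content as 1.csv above, kept in memory for the example.
sample = io.StringIO(
    "key1, value1\n"
    "key2, value2\n"
    "key2, value22\n"
    "key3, value3\n"
)

# skipinitialspace=True drops the spaces that directly follow each delimiter.
d = dict(csv.reader(sample, skipinitialspace=True))

print(d)  # {'key1': 'value1', 'key2': 'value22', 'key3': 'value3'}
```

The duplicate key is still overwritten, since a dictionary can hold only one value per key.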
You can also use numpy for this.
    from numpy import loadtxt
    # loadtxt parses every field as a float by default, so this only works for numeric data
    key_value = loadtxt("filename.csv", delimiter=",")
    mydict = {k: v for k, v in key_value}
I would suggest adding if row, so that empty rows are skipped:
    import csv

    with open('coors.csv', mode='r') as infile:
        reader = csv.reader(infile)
        with open('coors_new.csv', mode='w') as outfile:
            writer = csv.writer(outfile)
            mydict = dict(row[:2] for row in reader if row)
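To show why the guard matters, here is a rough sketch with made-up data containing a blank line and a row with an extra column. csv.reader returns an empty list for a blank line, which dict() cannot use as a key, value pair.

```python
import csv
import io

# Hypothetical data: a blank line in the middle and one row with three columns.
text = "a,1\n\nb,2,extra\n"

rows = list(csv.reader(io.StringIO(text)))
# rows is [['a', '1'], [], ['b', '2', 'extra']]

# Without the guard, the empty row makes dict() raise ValueError.
try:
    dict(row[:2] for row in rows)
    failed = False
except ValueError:
    failed = True

# With the guard, empty rows are skipped and row[:2] ignores extra columns.
guarded = dict(row[:2] for row in rows if row)

print(failed, guarded)  # True {'a': '1', 'b': '2'}
```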
If you are OK with using the numpy package, you can do something like the following:
    import numpy as np

    lines = np.genfromtxt("coors.csv", delimiter=",", dtype=None)
    my_dict = dict()
    for i in range(len(lines)):
        my_dict[lines[i][0]] = lines[i][1]
A one-liner solution:
    import pandas as pd

    # Named mydict so the built-in dict is not shadowed; iloc selects by position
    mydict = {row.iloc[0]: row.iloc[1] for _, row in pd.read_csv("file.csv").iterrows()}
You can use this, it is pretty cool:
    import dataconverters.commas as commas

    filename = 'test.csv'
    with open(filename) as f:
        records, metadata = commas.parse(f)
        for row in records:
            # str() is needed because row is not a string
            print('this is row in dictionary: ' + str(row))
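With pandas it is much easier, for example.
Assume you have the following data as CSV, in a file called text.txt as in the code further down: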
    a,b,c,d
    1,2,3,4
    5,6,7,8
Now, using pandas:
    import pandas as pd
    df = pd.read_csv("./text.txt")
    df_to_dict = df.to_dict()
To get one dictionary per row instead, use:
    df.to_dict(orient='records')
And that's it.
For simple csv files, such as the following:
    id,col1,col2,col3
    row1,r1c1,r1c2,r1c3
    row2,r2c1,r2c2,r2c3
    row3,r3c1,r3c2,r3c3
    row4,r4c1,r4c2,r4c3
you can convert it to a Python dictionary using only built-ins:
    # csv_file is the path to the csv file shown above
    with open(csv_file) as f:
        csv_list = [[val.strip() for val in r.split(",")] for r in f.readlines()]

    # Split off the header row and drop its first cell (the id column name)
    (_, *header), *data = csv_list
    csv_dict = {}
    for row in data:
        key, *values = row
        csv_dict[key] = {key: value for key, value in zip(header, values)}
This should yield the following dictionary:
    {'row1': {'col1': 'r1c1', 'col2': 'r1c2', 'col3': 'r1c3'},
     'row2': {'col1': 'r2c1', 'col2': 'r2c2', 'col3': 'r2c3'},
     'row3': {'col1': 'r3c1', 'col2': 'r3c2', 'col3': 'r3c3'},
     'row4': {'col1': 'r4c1', 'col2': 'r4c2', 'col3': 'r4c3'}}
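Splitting each line on "," by hand breaks as soon as a field contains a quoted comma. As a hedged alternative (not the approach used above), here is a sketch that builds the same kind of nested dictionary with csv.DictReader. The sample data is made up and assumes the first column is named id.

```python
import csv
import io

# Hypothetical sample; the second data field of row1 contains a quoted comma.
sample = io.StringIO(
    'id,col1,col2\n'
    'row1,r1c1,"r1c2, with comma"\n'
    'row2,r2c1,r2c2\n'
)

csv_dict = {}
for row in csv.DictReader(sample):
    # Remove the id cell from the row and use it as the outer key.
    key = row.pop("id")
    csv_dict[key] = dict(row)

print(csv_dict)
```

The csv module handles the quoting, so the comma inside the quoted field stays part of the value.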
Note: Python dictionaries have unique keys, so if your csv file has duplicate ids, you should append each row to a list:
    for row in data:
        key, *values = row

        if key not in csv_dict:
            csv_dict[key] = []

        csv_dict[key].append({key: value for key, value in zip(header, values)})
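The same grouping can be written a little more compactly with collections.defaultdict, which creates the empty list on first access. This is only a sketch; the header and data lists below are made-up stand-ins for the values parsed from the file.

```python
from collections import defaultdict

# Hypothetical parsed input: header without the id column, rows with a repeated id.
header = ["col1", "col2"]
data = [
    ["row1", "a", "b"],
    ["row1", "c", "d"],
    ["row2", "e", "f"],
]

csv_dict = defaultdict(list)
for key, *values in data:
    # Rows sharing an id are collected in the same list, in file order.
    csv_dict[key].append(dict(zip(header, values)))

print(dict(csv_dict))
```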
Many solutions have already been posted; I would like to contribute mine, which works for CSV files with any number of columns.
It creates a dictionary with one key per column, where the value for each key is a list of the elements in that column.
    import csv

    # path_to_csv_file is the path to your csv file
    input_file = csv.DictReader(open(path_to_csv_file))
    csv_dict = {elem: [] for elem in input_file.fieldnames}
    for row in input_file:
        for key in csv_dict.keys():
            csv_dict[key].append(row[key])
Try using a defaultdict together with csv.DictReader:
    import csv
    from collections import defaultdict

    my_dict = defaultdict(list)

    with open('filename.csv', 'r') as csv_file:
        csv_reader = csv.DictReader(csv_file)
        for line in csv_reader:
            for key, value in line.items():
                my_dict[key].append(value)
It returns:
    {'key1': [value_1, value_2, value_3], 'key2': [value_a, value_b, value_c], 'Key3': [value_x, Value_y, Value_z]}
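For comparison, a column-wise dictionary like this can also be sketched by transposing the rows with zip. The sample data and column names are made up, and the sketch assumes every row has the same number of fields as the header (zip silently truncates otherwise).

```python
import csv
import io

# Hypothetical sample with a header row and three data rows.
sample = io.StringIO("key1,key2\n1,a\n2,b\n3,c\n")

reader = csv.reader(sample)
header = next(reader)

# zip(*reader) transposes the remaining rows into one tuple per column.
columns = zip(*reader)
my_dict = {name: list(col) for name, col in zip(header, columns)}

print(my_dict)  # {'key1': ['1', '2', '3'], 'key2': ['a', 'b', 'c']}
```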