Python: NumPy array is not JSON serializable

After creating a NumPy array and saving it as a Django context variable, I get the following error when loading the web page:

array([ 0, 239, 479, 717, 952, 1192, 1432, 1667], dtype=int64) is not JSON serializable

What does this mean?
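
In short, the json module only encodes a handful of built-in Python types (such as dict, list, str, int, float, bool and None), and a NumPy array is not one of them. A minimal reproduction, assuming only numpy and the standard library (the variable names are just for illustration):

```python
import json
import numpy as np

arr = np.array([0, 239, 479, 717, 952, 1192, 1432, 1667], dtype=np.int64)

error_message = None
try:
    json.dumps(arr)  # an ndarray is not a type the json module knows about
except TypeError as exc:
    error_message = str(exc)  # the exact wording depends on the Python version

# Converting to a plain nested list of Python ints first makes it serializable
as_json = json.dumps(arr.tolist())
print(error_message)
print(as_json)
```

The same thing happens behind the scenes when a framework tries to serialize a context or response object that still contains an array.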


I regularly "jsonify" np.arrays. Try calling the ".tolist()" method on the array first, like this:

import numpy as np
import json

a = np.arange(10).reshape(2, 5)  # a 2 by 5 array
b = a.tolist()  # nested lists with the same data and indices
file_path = "/path.json"  # your path variable
# this saves the array in .json format; the with-block closes the file afterwards
with open(file_path, 'w', encoding='utf-8') as f:
    json.dump(b, f, separators=(',', ':'), sort_keys=True, indent=4)

To "unjsonify" the array, use:

with open(file_path, 'r', encoding='utf-8') as f:
    obj_text = f.read()
b_new = json.loads(obj_text)
a_new = np.array(b_new)
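
One caveat with this round trip: the JSON text only holds the values, so the dtype is re-inferred when the array is rebuilt. A small in-memory sketch (no file involved; the int16 dtype and the names text and a_same are only chosen for illustration):

```python
import json
import numpy as np

a = np.arange(10, dtype=np.int16).reshape(2, 5)
text = json.dumps(a.tolist())

# dtype is inferred from the Python ints, so it is usually not int16 any more
a_new = np.array(json.loads(text))

# passing the dtype by hand restores it
a_same = np.array(json.loads(text), dtype=a.dtype)

print(a_new.dtype, a_same.dtype)
```

If the dtype matters, it has to be stored somewhere alongside the data.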


Store a numpy.ndarray, or any nested-list composition, as JSON.

import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    def default(self, obj):
        # convert arrays to nested lists; defer everything else to the base class
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

a = np.array([[1, 2, 3], [4, 5, 6]])
print(a.shape)
json_dump = json.dumps({'a': a, 'aa': [2, (2, 3, 4), a], 'bb': [2]}, cls=NumpyEncoder)
print(json_dump)

This will output:

(2, 3)
{"a": [[1, 2, 3], [4, 5, 6]], "aa": [2, [2, 3, 4], [[1, 2, 3], [4, 5, 6]]], "bb": [2]}

To restore from JSON:

json_load = json.loads(json_dump)
a_restored = np.asarray(json_load["a"])
print(a_restored)
print(a_restored.shape)

This will output:

[[1 2 3]
 [4 5 6]]
(2, 3)


You can use pandas:

import pandas as pd
pd.Series(your_array).to_json(orient='values')


I found the best solution if you have nested numpy arrays in a dictionary:

import json
import numpy as np

class NumpyEncoder(json.JSONEncoder):
    """Special json encoder for numpy types"""
    # np.float_ was removed in NumPy 2.0; np.float64 covers it
    def default(self, obj):
        if isinstance(obj, (np.int_, np.intc, np.intp, np.int8,
                            np.int16, np.int32, np.int64, np.uint8,
                            np.uint16, np.uint32, np.uint64)):
            return int(obj)
        elif isinstance(obj, (np.float16, np.float32, np.float64)):
            return float(obj)
        elif isinstance(obj, (np.ndarray,)):  # this is the fix
            return obj.tolist()
        return json.JSONEncoder.default(self, obj)

# data and path are placeholders for your own object and file name
dumped = json.dumps(data, cls=NumpyEncoder)

with open(path, 'w') as f:
    f.write(dumped)  # dumped is already JSON text; json.dump here would encode it twice

Thanks to this guy.
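
The scalar branches matter because values pulled out of an array (a maximum, a mean, a comparison result) are NumPy scalars rather than Python numbers. A compact sketch of the same idea, assuming it is acceptable to test against the abstract base classes np.integer and np.floating instead of listing each type; the class name CompactNumpyEncoder and the sample data are made up for this example:

```python
import json
import numpy as np

class CompactNumpyEncoder(json.JSONEncoder):
    """Illustrative encoder: NumPy scalars and arrays become built-in types."""
    def default(self, obj):
        if isinstance(obj, np.integer):    # int8 ... int64, uint8 ... uint64
            return int(obj)
        if isinstance(obj, np.floating):   # float16, float32, float64
            return float(obj)
        if isinstance(obj, np.bool_):      # results of comparisons / any() / all()
            return bool(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super().default(obj)        # raises TypeError for anything else

scores = np.array([3, 1, 2], dtype=np.int32)
data = {
    'scores': scores,                          # ndarray
    'best': scores.max(),                      # NumPy integer scalar
    'mean': scores.mean(dtype=np.float32),     # NumPy float scalar
    'any_big': (scores > 2).any(),             # NumPy bool scalar
}
dumped = json.dumps(data, cls=CompactNumpyEncoder)
print(dumped)
```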


Some of the other numpy encoders seem a bit too verbose.

Use the default kwarg of json.dumps:

default should be a function that gets called for objects that can't otherwise be serialized.

In the default function, check whether the object comes from the numpy module; if so, use ndarray.tolist for an ndarray, or .item for any other numpy-specific type.

import json
import numpy as np

def default(obj):
    if type(obj).__module__ == np.__name__:
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        else:
            return obj.item()
    raise TypeError('Unknown type:', type(obj))

# data is a placeholder for the object you want to serialize
dumped = json.dumps(data, default=default)
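
To see when the hook actually fires, here is a small sketch that records every object handed to it. The names logging_default and seen, and the sample dictionary, are only for illustration:

```python
import json
import numpy as np

seen = []  # type names of the objects json could not handle on its own

def logging_default(obj):
    seen.append(type(obj).__name__)
    if type(obj).__module__ == np.__name__:
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return obj.item()  # NumPy scalar -> matching Python scalar
    raise TypeError('Unknown type: %r' % type(obj))

data = {
    'plain': 1,                 # built-in int: the hook is not called
    'n': np.int64(3),
    'x': np.float32(0.5),
    'v': np.arange(3),
}
dumped = json.dumps(data, default=logging_default)
print(seen)
print(dumped)
```

Built-in values never reach the hook, so the extra cost is only paid for the NumPy objects.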

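This is not supported by default, but you can make it work quite easily! There are several things you'll want to encode if you want the exact same data back:

The data itself, which you can get with obj.tolist() as @travelingbones mentioned. Sometimes this may be good enough.
The data type. I feel this is important in quite a few cases.
The dimension (not necessarily 2D), which could be derived from the above if you assume the input is indeed always a "rectangular" grid.
The memory order (row- or column-major). This doesn't often matter, but sometimes it does (e.g. performance), so why not save everything?

Furthermore, your numpy array could be part of your data structure, e.g. you have a list with some matrices inside. For that you could use a custom encoder which basically does the above.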
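
The four items above can be sketched roughly as follows. This is only an illustration under my own assumptions: the helper names encode_array and decode_array and the dictionary keys are made up for the example and are not taken from any library.

```python
import json
import numpy as np

def encode_array(arr):
    """Pack data, dtype, shape and memory order into a JSON-friendly dict."""
    is_fortran = arr.flags['F_CONTIGUOUS'] and not arr.flags['C_CONTIGUOUS']
    return {
        '__ndarray__': arr.tolist(),      # the data itself
        'dtype': str(arr.dtype),          # e.g. 'float32'
        'shape': list(arr.shape),         # kept explicitly, helps with empty arrays
        'order': 'F' if is_fortran else 'C',
    }

def decode_array(dct):
    """Rebuild the array from the dict produced by encode_array."""
    arr = np.array(dct['__ndarray__'], dtype=dct['dtype']).reshape(dct['shape'])
    if dct['order'] == 'F':
        return np.asfortranarray(arr)
    return np.ascontiguousarray(arr)

original = np.asfortranarray(np.arange(6, dtype=np.float32).reshape(2, 3))
text = json.dumps(encode_array(original))
restored = decode_array(json.loads(text))
print(restored.dtype, restored.shape, restored.flags['F_CONTIGUOUS'])
```

For arrays nested inside larger structures, encode_array could be called from a custom encoder's default method and decode_array from an object_hook when loading.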

This should be enough to implement a solution. Or you could use json-tricks, which does just this (and supports various other types) (disclaimer: I made it).

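pip install json-tricks

Then

from datetime import datetime
from decimal import Decimal
from fractions import Fraction

from json_tricks import dumps
from numpy import arange

# MyTestCls is a user-defined example class that is not shown in this post
data = [
    arange(0, 10, 1, dtype=int).reshape((2, 5)),
    datetime(year=2017, month=1, day=19, hour=23, minute=00, second=00),
    1 + 2j,
    Decimal(42),
    Fraction(1, 3),
    MyTestCls(s='ub', dct={'7': 7}),  # see later
    set(range(7)),
]
# Encode with metadata to preserve types when decoding
print(dumps(data))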

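You can also use the default argument, for example:

import json
import numpy as np

def myconverter(o):
    if isinstance(o, np.float32):
        return float(o)
    # anything else is still unsupported, so say so instead of returning None
    raise TypeError('Unknown type: %s' % type(o))

# json.dumps returns a string; json.dump would also need a file object
json.dumps(data, default=myconverter)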

I had a similar problem with a nested dictionary that contained some numpy.ndarrays.

def jsonify(data):
    json_data = dict()
    for key, value in data.items():  # .iteritems() on Python 2
        if isinstance(value, list):  # for lists
            value = [jsonify(item) if isinstance(item, dict) else item for item in value]
        if isinstance(value, dict):  # for nested dicts
            value = jsonify(value)
        if isinstance(key, int):  # if key is integer: > to string
            key = str(key)
        if type(value).__module__ == 'numpy':  # if value is numpy.*: > to python list
            value = value.tolist()
        json_data[key] = value
    return json_data
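
A slightly more general variant, sketched under my own assumptions: it also converts arrays and NumPy scalars that sit inside lists or tuples, and it turns every key into a string, which is broader than the int-only key conversion above. The name to_builtin is hypothetical.

```python
import json
import numpy as np

def to_builtin(value):
    """Recursively replace NumPy objects with built-in Python equivalents."""
    if isinstance(value, dict):
        return {str(key): to_builtin(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_builtin(item) for item in value]
    if isinstance(value, np.ndarray):
        return value.tolist()
    if isinstance(value, np.generic):   # any NumPy scalar type
        return value.item()
    return value

nested = {
    1: np.arange(3),
    'inner': {'w': [np.float32(0.5), np.array([1, 2])]},
    'n': np.int64(7),
}
converted = to_builtin(nested)
print(json.dumps(converted))
```

Unlike the in-place loops elsewhere in this thread, this returns a new structure and leaves the original untouched.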

TypeError: array([[0.46872085, 0.67374235, 1.0218339, 0.13210179, 0.5440686, 0.9140083, 0.58720225, 0.2199381]], dtype=float32) is not JSON serializable

The above error was thrown when I tried to pass a list of data to model.predict() while expecting the response in JSON format.

> 1 json_file = open('model.json','r')
> 2 loaded_model_json = json_file.read()
> 3 json_file.close()
> 4 loaded_model = model_from_json(loaded_model_json)
> 5 #load weights into new model
> 6 loaded_model.load_weights("model.h5")
> 7 loaded_model.compile(optimizer='adam', loss='mean_squared_error')
> 8 X = [[874,12450,678,0.922500,0.113569]]
> 9 d = pd.DataFrame(X)
> 10 prediction = loaded_model.predict(d)
> 11 return jsonify(prediction)

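But luckily I found a hint to resolve the error that was thrown.
Serialization of objects is only applicable for the following conversions.
The mapping should be as follows:
object - dict
array - list
string - string
integer - integer

If you scroll up to line number 10,
prediction = loaded_model.predict(d), this is the line of code that generates the output,
which is of the array data type; when you try to convert an array to JSON format, it is not possible.

Finally I found the solution: convert the obtained output to a list
with the following lines of code: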

prediction = loaded_model.predict(d)
listtype = prediction.tolist()
return jsonify(listtype)

Bhoom! Finally got the expected output.
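
To make the mapping above concrete, here is a small sketch. It assumes a stand-in NumPy array in place of the real loaded_model.predict(d) result and uses the standard json module rather than the web framework's jsonify, so the names decoded, kinds and payload are only illustrative:

```python
import json
import numpy as np

# JSON object -> dict, array -> list, string -> str, integer -> int
decoded = json.loads('{"obj": {"k": 1}, "arr": [1, 2], "s": "text", "n": 3}')
kinds = {key: type(value).__name__ for key, value in decoded.items()}
print(kinds)

# Stand-in for the model output: a 2-D float32 array like predict() returns
prediction = np.array([[0.5, 0.25]], dtype=np.float32)

# An ndarray has no JSON counterpart, a nested list does
payload = json.dumps({'prediction': prediction.tolist()})
print(payload)
```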

You can also do a simple for loop that checks types:

with open("jsondontdoit.json", 'w') as fp:
    for key in bests.keys():
        if isinstance(bests[key], np.ndarray):
            bests[key] = bests[key].tolist()
            continue
        for idx in bests[key]:
            if isinstance(bests[key][idx], np.ndarray):
                bests[key][idx] = bests[key][idx].tolist()
    json.dump(bests, fp)
    # no explicit fp.close() needed: the with-block closes the file

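This is a different answer, but it might help people who are trying to save data and then read it again.
There is hickle, which is faster than pickle and easier.
I tried to save and read the data with pickle dump, but there were a lot of problems while reading; I wasted an hour and still did not find a solution, even though I was working on my own data to create a chatbot.

vec_x and vec_y are numpy arrays:

import hickle as hkl

data = [vec_x, vec_y]
hkl.dump(data, 'new_data_file.hkl')

Then you just read it and perform the operations:

data2 = hkl.load('new_data_file.hkl')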

Here is an implementation that works for me and removes all NaNs (assuming these are simple objects, list or dict):

from numpy import isnan

def remove_nans(my_obj, val=None):
    if isinstance(my_obj, list):
        for i, item in enumerate(my_obj):
            if isinstance(item, list) or isinstance(item, dict):
                my_obj[i] = remove_nans(my_obj[i], val=val)

            else:
                try:
                    if isnan(item):
                        my_obj[i] = val
                except Exception:
                    pass

    elif isinstance(my_obj, dict):
        for key, item in my_obj.items():  # .iteritems() on Python 2
            if isinstance(item, list) or isinstance(item, dict):
                my_obj[key] = remove_nans(my_obj[key], val=val)

            else:
                try:
                    if isnan(item):
                        my_obj[key] = val
                except Exception:
                    pass

    return my_obj
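
Why bother removing NaNs at all? By default Python's json module writes them as the bare token NaN, which is not valid JSON and is rejected by strict parsers. A small sketch of that behaviour, using a non-mutating helper whose name nans_to_none is made up for this example:

```python
import json
import math

def nans_to_none(obj):
    """Return a copy of obj with float NaN values replaced by None."""
    if isinstance(obj, dict):
        return {key: nans_to_none(item) for key, item in obj.items()}
    if isinstance(obj, list):
        return [nans_to_none(item) for item in obj]
    if isinstance(obj, float) and math.isnan(obj):
        return None
    return obj

record = {'a': [1.0, float('nan')], 'b': {'c': float('nan')}}

lenient = json.dumps(record)  # accepted by Python, but contains the token NaN

try:
    json.dumps(record, allow_nan=False)  # strict mode refuses NaN
    strict_failed = False
except ValueError:
    strict_failed = True

clean = json.dumps(nans_to_none(record))  # NaN becomes null
print(lenient)
print(clean)
```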

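Also, some very interesting information on lists vs. arrays in Python: "Python List vs. Array - when to use?"

It could be noted that once I convert my arrays into lists before saving them in a JSON file, in my deployment right now anyway, I can keep using them in list form after reading that JSON file back later (as opposed to converting them back to arrays).

And it actually looks nicer (in my opinion) on the screen as a list (comma separated) than as an array (not comma separated).

Using @travelingbones's .tolist() method above, I've been using it as follows (also catching a few errors I ran into):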

Save dictionary

import json

def writeDict(values, name):
    # DIR is assumed to be a directory prefix defined elsewhere
    writeName = DIR + name + '.json'
    with open(writeName, "w") as outfile:
        json.dump(values, outfile)

Read dictionary

def readDict(name):
    readName = DIR + name + '.json'
    try:
        with open(readName, "r") as infile:
            dictValues = json.load(infile)
            return dictValues
    # note: on failure this returns the string 'None', not the None object
    except IOError as e:
        print(e)
        return 'None'
    except ValueError as e:
        print(e)
        return 'None'

Hope this helps!