Python: Convert a String representation of a Dictionary to a dictionary?

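How can I convert the str representation of a dict, such as the following string, into a dict?

s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"

I'd prefer not to use eval. What else can I use?

The main reason is that one of my coworkers wrote classes that convert all input into strings. I'm not in the mood to go and modify his classes to deal with this issue.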


Starting in Python 2.6 you can use the built-in ast.literal_eval:

>>> import ast
>>> ast.literal_eval("{'muffin' : 'lolz', 'foo' : 'kitty'}")
{'muffin': 'lolz', 'foo': 'kitty'}

This is safer than using eval. As its own docs say:

>>> help(ast.literal_eval)
Help on function literal_eval in module ast:

literal_eval(node_or_string)
    Safely evaluate an expression node or a string containing a Python
    expression.  The string or node provided may only consist of the following
    Python literal structures: strings, numbers, tuples, lists, dicts, booleans,
    and None.

For example:

>>> eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<string>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 208, in rmtree
    onerror(os.listdir, path, sys.exc_info())
  File "/opt/Python-2.6.1/lib/python2.6/shutil.py", line 206, in rmtree
    names = os.listdir(path)
OSError: [Errno 2] No such file or directory: 'mongo'
>>> ast.literal_eval("shutil.rmtree('mongo')")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 68, in literal_eval
    return _convert(node_or_string)
  File "/opt/Python-2.6.1/lib/python2.6/ast.py", line 67, in _convert
    raise ValueError('malformed string')
ValueError: malformed string
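
A minimal sketch of wrapping ast.literal_eval so that every kind of bad input surfaces as one exception type. The helper name parse_dict_literal is hypothetical, not part of the standard library. On Python 3, malformed input can raise SyntaxError as well as ValueError, so the sketch catches both.

```python
import ast


def parse_dict_literal(text):
    """Parse a string holding a Python dict literal; raise ValueError otherwise."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        # literal_eval rejects anything that is not a plain literal,
        # for example a function call such as shutil.rmtree('mongo').
        raise ValueError("not a valid Python literal: %r" % (text,)) from exc
    if not isinstance(value, dict):
        raise ValueError("expected a dict, got %s" % type(value).__name__)
    return value


if __name__ == "__main__":
    print(parse_dict_literal("{'muffin' : 'lolz', 'foo' : 'kitty'}"))
```

The isinstance check is there because literal_eval will just as happily return a list, a number or a string.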


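http://docs.python.org/2/library/json.html

JSON can solve this problem, though its decoder wants double quotes around keys and values. If you don't mind a replace hack…

import json
s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
json_acceptable_string = s.replace("'", "\"")
d = json.loads(json_acceptable_string)
# d = {u'muffin': u'lolz', u'foo': u'kitty'}

Note that if you have single quotes as part of your keys or values, this will fail due to improper character replacement. This solution is only recommended if you have a strong aversion to the eval solution.

More about JSON single quotes: jQuery single quote in JSON response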
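
To make the caveat concrete, here is a small sketch with an apostrophe inside one value. The sample string is made up for illustration. The replace hack produces invalid JSON, while ast.literal_eval parses the original string unchanged.

```python
import ast
import json

# Hypothetical input: one value contains an apostrophe.
s = "{'muffin' : \"it's lolz\", 'foo' : 'kitty'}"

# The quote replacement turns the apostrophe into a stray double quote,
# so the result is no longer valid JSON.
try:
    json.loads(s.replace("'", '"'))
    hack_worked = True
except json.JSONDecodeError:
    hack_worked = False

# ast.literal_eval follows Python quoting rules and parses the original string.
d = ast.literal_eval(s)
print(hack_worked, d)
```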


Using json.loads:

>>> import json
>>> h = '{"foo":"bar","foo2":"bar2"}'
>>> type(h)
<type 'str'>
>>> d = json.loads(h)
>>> d
{u'foo': u'bar', u'foo2': u'bar2'}
>>> type(d)
<type 'dict'>


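To use the OP's example:

s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"

We can use yaml to deal with this kind of non-standard JSON in a string. yaml.safe_load is used here because it only builds plain data types:

>>> import yaml
>>> s = "{'muffin' : 'lolz', 'foo' : 'kitty'}"
>>> s
"{'muffin' : 'lolz', 'foo' : 'kitty'}"
>>> yaml.safe_load(s)
{'muffin': 'lolz', 'foo': 'kitty'}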


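If the string can always be trusted, you could use eval (or use literal_eval as suggested; it's safe no matter what the string is). Otherwise you need a parser. A JSON parser (such as simplejson) would work if he only ever stores content that fits the JSON scheme.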
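
One way to combine the two suggestions, as a rough sketch: try the strict JSON parser first and fall back to ast.literal_eval for input that is a Python literal rather than JSON. The function name loads_dict is an assumption made for illustration.

```python
import ast
import json


def loads_dict(text):
    """Try strict JSON first, then fall back to a Python literal."""
    try:
        value = json.loads(text)
    except ValueError:
        # json.JSONDecodeError subclasses ValueError; single-quoted input
        # such as the question's string ends up here.
        value = ast.literal_eval(text)
    if not isinstance(value, dict):
        raise ValueError("expected a dict")
    return value


if __name__ == "__main__":
    print(loads_dict('{"foo": "bar"}'))
    print(loads_dict("{'muffin' : 'lolz', 'foo' : 'kitty'}"))
```

The two syntaxes differ in more than quoting: JSON spells its constants true, false and null, while Python literals use True, False and None, so each parser rejects the other's spelling.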


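Use json. The ast library consumes a lot of memory and is slower. I have a process that needs to read a 156 MB text file. With ast, the conversion to a dictionary took 5 minutes; with json it took 1 minute and used 60% less memory!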
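
Numbers like these depend heavily on the data and the Python version, so it is worth measuring on your own input. Below is a rough sketch using time.perf_counter and tracemalloc. The measure helper and the synthetic test data are assumptions for illustration, not the setup from this answer, and tracemalloc itself slows execution, so treat the timings as relative only.

```python
import ast
import json
import time
import tracemalloc


def measure(func, text):
    """Return (result, seconds, peak_bytes) for one call of func(text)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(text)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed, peak


if __name__ == "__main__":
    # Synthetic stand-in for a large file; valid as both JSON and a Python literal.
    text = json.dumps({str(i): i for i in range(50000)})
    for func in (json.loads, ast.literal_eval):
        _, seconds, peak = measure(func, text)
        print("%s: %.3f s, peak %.1f MiB" % (func.__name__, seconds, peak / 2**20))
```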


You can try this.

>>> import ast
>>> data = "{'user': 'bob', 'age': 10, 'grades': ['A', 'F', 'C']}"
>>> ast.literal_eval(data)
{'age': 10, 'grades': ['A', 'F', 'C'], 'user': 'bob'}
>>> user = ast.literal_eval(data)
>>> user['age']
10
>>> user['grades']
['A', 'F', 'C']
>>> user['user']
'bob'


string = "{'server1':'value','server2':'value'}"

# Remove the surrounding { and }
s = string.replace("{", "")
finalstring = s.replace("}", "")

# Split on "," to get the key-value pairs
pairs = finalstring.split(",")

dictionary = {}
for pair in pairs:
    # Split each pair into its key and its value
    keyvalue = pair.split(":")

    # Strip the quotes from the key and the value
    m = keyvalue[0].strip('\'')
    m = m.replace('"', '')
    dictionary[m] = keyvalue[1].strip('"\'')

print(dictionary)


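If you can't use Python 2.6, you can use a simple safeeval implementation like http://code.activestate.com/recipes/364469/

It piggybacks on the Python compiler, so you don't have to do all the heavy lifting yourself.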


To summarize:

# Run this in IPython or Jupyter: %timeit below is an IPython magic, not plain Python.
import ast, yaml, json, timeit

descs=['short string','long string']
strings=['{"809001":2,"848545":2,"565828":1}','{"2979":1,"30581":1,"7296":1,"127256":1,"18803":2,"41619":1,"41312":1,"16837":1,"7253":1,"70075":1,"3453":1,"4126":1,"23599":1,"11465":3,"19172":1,"4019":1,"4775":1,"64225":1,"3235":2,"15593":1,"7528":1,"176840":1,"40022":1,"152854":1,"9878":1,"16156":1,"6512":1,"4138":1,"11090":1,"12259":1,"4934":1,"65581":1,"9747":2,"18290":1,"107981":1,"459762":1,"23177":1,"23246":1,"3591":1,"3671":1,"5767":1,"3930":1,"89507":2,"19293":1,"92797":1,"32444":2,"70089":1,"46549":1,"30988":1,"4613":1,"14042":1,"26298":1,"222972":1,"2982":1,"3932":1,"11134":1,"3084":1,"6516":1,"486617":1,"14475":2,"2127":1,"51359":1,"2662":1,"4121":1,"53848":2,"552967":1,"204081":1,"5675":2,"32433":1,"92448":1}']
# Note: calling yaml.load without a Loader argument requires PyYAML older than 6.0.
funcs=[json.loads,eval,ast.literal_eval,yaml.load]

for desc, string in zip(descs, strings):
    print('***', desc, '***')
    print('')
    for func in funcs:
        print(func.__module__ + ' ' + func.__name__ + ':')
        %timeit func(string)
    print('')

Results:

*** short string ***

json loads:
4.47 μs ± 33.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
builtins eval:
24.1 μs ± 163 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
ast literal_eval:
30.4 μs ± 299 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
yaml load:
504 μs ± 1.29 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

*** long string ***

json loads:
29.6 μs ± 230 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
builtins eval:
219 μs ± 3.92 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
ast literal_eval:
331 μs ± 1.89 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
yaml load:
9.02 ms ± 92.2 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Conclusion: prefer json.loads
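
For readers without IPython, a comparable measurement can be sketched with the standard timeit module. The bench helper is a hypothetical name, and yaml is left out so the sketch needs only the standard library. Absolute numbers will differ from those reported here.

```python
import ast
import json
import timeit

# Short sample string from the benchmark; valid as JSON and as a Python literal.
sample = '{"809001":2,"848545":2,"565828":1}'


def bench(func, text, number=1000):
    """Return the average seconds per call of func(text) over `number` calls."""
    return timeit.timeit(lambda: func(text), number=number) / number


if __name__ == "__main__":
    for func in (json.loads, eval, ast.literal_eval):
        print("%s.%s: %.2f us per call"
              % (func.__module__, func.__name__, bench(func, sample) * 1e6))
```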


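Without using any libs:

dict_format_string = "{'1':'one', '2' : 'two'}"
d = {}
# Split on single quotes and keep only the alphanumeric pieces (the keys and values)
elems = [part for part in dict_format_string.split("'") if part.isalnum()]
values = elems[1::2]
keys = elems[0::2]
d.update(zip(keys, values))

Note: because split("'") is hard-coded, this only works for strings whose data is single-quoted.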