Python gzip: is there a way to decompress from a string?
I've read this SO post about the problem, but without success.
I'm trying to decompress a .gz file fetched from a URL.

    url_file_handle = StringIO(gz_data)
    gzip_file_handle = gzip.open(url_file_handle, "r")
    decompressed_data = gzip_file_handle.read()
    gzip_file_handle.close()

...but I get TypeError: coercing to Unicode: need string or buffer, cStringIO.StringI found
What's going on?

    Traceback (most recent call last):
      File "/opt/google/google_appengine-1.2.5/google/appengine/tools/dev_appserver.py", line 2974, in _HandleRequest
        base_env_dict=env_dict)
      File "/opt/google/google_appengine-1.2.5/google/appengine/tools/dev_appserver.py", line 411, in Dispatch
        base_env_dict=base_env_dict)
      File "/opt/google/google_appengine-1.2.5/google/appengine/tools/dev_appserver.py", line 2243, in Dispatch
        self._module_dict)
      File "/opt/google/google_appengine-1.2.5/google/appengine/tools/dev_appserver.py", line 2161, in ExecuteCGI
        reset_modules = exec_script(handler_path, cgi_path, hook)
      File "/opt/google/google_appengine-1.2.5/google/appengine/tools/dev_appserver.py", line 2057, in ExecuteOrImportScript
        exec module_code in script_module.__dict__
      File "/home/jldupont/workspace/jldupont/trunk/site/app/server/tasks/debian/repo_fetcher.py", line 36, in <module>
        main()
      File "/home/jldupont/workspace/jldupont/trunk/site/app/server/tasks/debian/repo_fetcher.py", line 30, in main
        gziph=gzip.open(fh,'r')
      File "/usr/lib/python2.5/gzip.py", line 49, in open
        return GzipFile(filename, mode, compresslevel)
      File "/usr/lib/python2.5/gzip.py", line 95, in __init__
        fileobj = self.myfileobj = __builtin__.open(filename, mode or 'rb')
    TypeError: coercing to Unicode: need string or buffer, cStringIO.StringI found

If your data is already in a string, try zlib, which claims to be fully gzip-compatible:

    import zlib

    decompressed_data = zlib.decompress(gz_data, 16+zlib.MAX_WBITS)

Read more: http://docs.python.org/library/zlib.html
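
As a rough illustration of the zlib approach above, here is a minimal round-trip sketch written for Python 3 (an assumption; the thread itself targets Python 2). The `original` payload and the use of `gzip.compress` to build `gz_data` are stand-ins for data you would normally have downloaded.

```python
import gzip
import zlib

# Build some gzip-framed bytes to stand in for gz_data (illustrative only).
original = b"hello, gzip" * 3
gz_data = gzip.compress(original)

# wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer
# instead of a bare zlib stream.
decompressed_data = zlib.decompress(gz_data, 16 + zlib.MAX_WBITS)
```

If the input might be either a zlib or a gzip stream, passing `32 + zlib.MAX_WBITS` instead asks zlib to detect the header type automatically.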
gzip.open expects a filename, not a file object, as the traceback shows. Compare

    open(filename, mode='rb', compresslevel=9)
    # Shorthand for GzipFile(filename, mode, compresslevel).

vs.

    class GzipFile
        __init__(self, filename=None, mode=None, compresslevel=9, fileobj=None)
        # At least one of fileobj and filename must be given a non-trivial value.

So this should work for you:

    gzip_file_handle = gzip.GzipFile(fileobj=url_file_handle)

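
To make the `fileobj` route concrete, the following is a small sketch for Python 3 (an assumption; on Python 2 you would wrap the string in `StringIO` as in the question). It uses `io.BytesIO` as the in-memory file object, and `original` is a made-up payload compressed on the spot so the example is self-contained.

```python
import gzip
import io

# Made-up payload, compressed here so the example needs no network access.
original = b"some payload"
gz_data = gzip.compress(original)

# Wrap the in-memory bytes in a file-like object and hand it to GzipFile
# through the fileobj keyword instead of a filename.
url_file_handle = io.BytesIO(gz_data)
with gzip.GzipFile(fileobj=url_file_handle) as gzip_file_handle:
    decompressed_data = gzip_file_handle.read()
```

On Python 3.2 and later, `gzip.decompress(gz_data)` should give the same result in a single call when the whole payload is already in memory.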
If you don't like passing obscure arguments to ...
When handling a response that may be gzip-encoded:

    import gzip
    from StringIO import StringIO

    # response = urllib2.urlopen(...

    content_raw = response.read()
    # Default to '' so a missing Content-Encoding header does not raise TypeError.
    if 'gzip' in response.info().getheader('Content-Encoding', ''):
        content = gzip.GzipFile(fileobj=StringIO(content_raw)).read()
    else:
        content = content_raw

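When handling a file that may store gzip-compressed or uncompressed data:

    import gzip

    # some_file = open(...

    try:
        content = gzip.GzipFile(fileobj=some_file).read()
    except IOError:
        # Not gzip data: rewind and read the raw contents instead.
        some_file.seek(0)
        content = some_file.read()

The examples above are in Python 2.7.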
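
For readers on Python 3, the same two patterns might be sketched as below. This is not part of the original answer: `decode_body` and `read_maybe_gzipped` are hypothetical helper names, `io.BytesIO` replaces `StringIO`, and `OSError` is caught because `IOError` is an alias of it on Python 3.

```python
import gzip
import io


def decode_body(content_raw, content_encoding):
    """Gunzip content_raw when the Content-Encoding value mentions gzip."""
    # content_encoding may be None when the header is absent.
    if content_encoding and "gzip" in content_encoding.lower():
        return gzip.GzipFile(fileobj=io.BytesIO(content_raw)).read()
    return content_raw


def read_maybe_gzipped(some_file):
    """Read a seekable binary file object that may or may not hold gzip data."""
    start = some_file.tell()
    try:
        return gzip.GzipFile(fileobj=some_file).read()
    except OSError:
        # Not gzip data: rewind and return the raw bytes instead.
        some_file.seek(start)
        return some_file.read()
```

With `urllib.request`, the header value would typically come from something like `response.headers.get("Content-Encoding")`, and `read_maybe_gzipped` expects a file opened in binary mode.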