python: Import module name as variable (dynamically load module from passed argument)

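Python 3.4.2... I have been trying to dynamically load a custom module named by a command-line argument. I want to load custom code for scraping a specific HTML file. Example: scrape.py -m name_of_module_to_load file_to_scrape.html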

I have tried many solutions, including the one from "Importing a module when the module name is in a variable".

The module loads fine when I use the actual module name instead of the variable args.module.

Code:

$ cat scrape.py
#!/usr/bin/env python3
from urllib.request import urlopen
from urllib.error import HTTPError
from bs4 import BeautifulSoup
import argparse
import os, sys
import importlib

parser = argparse.ArgumentParser(description='HTML web scraper')
parser.add_argument('filename', help='File to act on')
parser.add_argument('-m', '--module', metavar='MODULE_NAME', help='File with code specific to the site--must be a defined class named Scrape')
args = parser.parse_args()

if args.module:
#    from get_div_content import Scrape #THIS WORKS#
    sys.path.append(os.getcwd())
    #EDIT--change this:
    #wrong# module_name = importlib.import_module(args.module, package='Scrape')
    #to this:
    module = importlib.import_module(args.module) # correct

try:
    html = open(args.filename, 'r')
except OSError:
    # Not a readable local file, so try the argument as a URL instead.
    try:
        html = urlopen(args.filename)
    except HTTPError as e:
        print(e)
try:
    soup = BeautifulSoup(html.read())
except Exception:
    print("Error... Sorry... not sure what happened")

#EDIT--change this
#wrong#scraper = Scrape(soup)
#to this:
scraper = module.Scrape(soup) # correct

Module:

$ cat get_div_content.py
class Scrape:
    def __init__(self, soup):
        content = soup.find('div', {'id':'content'})
        print(content)

Command run and error:

$ ./scrape.py -m get_div_content.py file.html
Traceback (most recent call last):
  File "./scrape.py", line 16, in <module>
    module_name = importlib.import_module(args.module, package='Scrape')
  File "/usr/lib/python3.4/importlib/__init__.py", line 109, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 2249, in _gcd_import
  File "<frozen importlib._bootstrap>", line 2199, in _sanity_check
SystemError: Parent module 'Scrape' not loaded, cannot perform relative import

Working command (no errors):

$ ./scrape.py -m get_div_content file.html

...

You don't need the package argument. Use only the module name:

module = importlib.import_module(args.module)

You then have a module namespace that contains everything defined in the module:

scraper = module.Scrape(soup)
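
As a minimal sketch of the two steps above, the following imports a module by name and then looks up a class on the resulting namespace. It uses a standard-library module as a stand-in so it runs without the scraper files; load_class is a hypothetical helper name, not part of the original script.

```python
import importlib


def load_class(module_name, class_name):
    """Import module_name and return the attribute class_name from it.

    load_class is a hypothetical helper used for illustration only.
    """
    # Absolute import by name: no package argument is needed.
    module = importlib.import_module(module_name)
    try:
        return getattr(module, class_name)
    except AttributeError:
        raise ImportError(
            "module {!r} does not define {!r}".format(module_name, class_name)
        )


# Stand-in for load_class(args.module, 'Scrape'), using the standard library.
OrderedDict = load_class('collections', 'OrderedDict')
print(OrderedDict.__name__)
```

Looking the class up with getattr, rather than writing module.Scrape directly, is only a convenience here: it allows a clearer message when the loaded module does not define the expected class.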

When calling the script, remember to pass the module name, not the file name:

./scrape.py -m get_div_content file.html
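
If users are likely to pass the file name anyway, one possible approach (not part of the original answer) is to normalize the argument before importing it. The helper module_name_from_arg below is hypothetical, and it assumes the module is importable from the current directory or elsewhere on sys.path.

```python
import os


def module_name_from_arg(arg):
    """Turn 'get_div_content.py' or 'get_div_content' into 'get_div_content'.

    Hypothetical helper: any directory part is dropped and only a
    trailing '.py' is stripped, so the module must still be importable
    from sys.path.
    """
    base = os.path.basename(arg)
    root, ext = os.path.splitext(base)
    return root if ext == '.py' else base


print(module_name_from_arg('get_div_content.py'))  # get_div_content
print(module_name_from_arg('get_div_content'))     # get_div_content
```

The result could then be passed to importlib.import_module in place of args.module.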