Python: Import module name as variable (dynamically load module from passed argument)
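Python 3.4.2… I have been trying to dynamically load a custom module named in a command-line argument. I want to load custom code for scraping particular HTML files. Example:
I have tried many solutions, including: "Import module when module name is in a variable".
The code works when I use the actual module name instead of the variable.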
Code:

    $ cat scrape.py
    #!/usr/bin/env python3
    from urllib.request import urlopen
    from urllib.error import HTTPError  # needed by the except clause below
    from bs4 import BeautifulSoup
    import argparse
    import os, sys
    import importlib

    parser = argparse.ArgumentParser(description='HTML web scraper')
    parser.add_argument('filename', help='File to act on')
    parser.add_argument('-m', '--module', metavar='MODULE_NAME', help='File with code specific to the site--must be a defined class named Scrape')
    args = parser.parse_args()

    if args.module:
        # from get_div_content import Scrape #THIS WORKS#
        sys.path.append(os.getcwd())
        #EDIT--change this:
        #wrong# module_name = importlib.import_module(args.module, package='Scrape')
        #to this:
        module = importlib.import_module(args.module) # correct

    try:
        html = open(args.filename, 'r')
    except:
        try:
            html = urlopen(args.filename)
        except HTTPError as e:
            print(e)

    try:
        soup = BeautifulSoup(html.read())
    except:
        print("Error... Sorry... not sure what happened")

    #EDIT--change this
    #wrong#scraper = Scrape(soup)
    #to this:
    scraper = module.Scrape(soup) # correct

Module:

    $ cat get_div_content.py
    class Scrape:
        def __init__(self, soup):
            content = soup.find('div', {'id':'content'})
            print(content)

Command run and error:

    $ ./scrape.py -m get_div_content.py file.html
    Traceback (most recent call last):
      File "./scrape.py", line 16, in <module>
        module_name = importlib.import_module(args.module, package='Scrape')
      File "/usr/lib/python3.4/importlib/__init__.py", line 109, in import_module
        return _bootstrap._gcd_import(name[level:], package, level)
      File "<frozen importlib._bootstrap>", line 2249, in _gcd_import
      File "<frozen importlib._bootstrap>", line 2199, in _sanity_check
    SystemError: Parent module 'Scrape' not loaded, cannot perform relative import

Working command (no error):

    $ ./scrape.py -m get_div_content file.html
    ...

You don't need the `package` argument. Use just the module name:

    module = importlib.import_module(args.module)

You then have a module object, so reference the class as an attribute on it:

    scraper = module.Scrape(soup)

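To see the two steps together without the scraper around them, here is a minimal self-contained sketch. It writes a throwaway module to a temporary directory, imports it by a name held in a variable, and looks up the class on the returned module object. The module name `demo_scraper`, the `load_class` helper and the stand-in `Scrape` class are assumptions made up for illustration and are not from the question. The `package` argument is left out because it is only needed for relative names (those starting with a dot).

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_class(module_name, class_name="Scrape"):
    """Import module_name (a name, not a file path) and return one of its classes."""
    # Absolute import: no package argument is needed.
    module = importlib.import_module(module_name)
    # Equivalent to module.Scrape when class_name is "Scrape".
    return getattr(module, class_name)


# Hypothetical stand-in for get_div_content.py, written to a temp directory.
tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, "demo_scraper.py").write_text(
    "class Scrape:\n"
    "    def __init__(self, soup):\n"
    "        self.soup = soup\n"
)

# Make the directory importable, as sys.path.append(os.getcwd()) does in scrape.py.
sys.path.insert(0, tmp_dir)
# The file was created after the interpreter started, so refresh the finder caches.
importlib.invalidate_caches()

module_name = "demo_scraper"  # in scrape.py this would come from args.module
Scrape = load_class(module_name)
scraper = Scrape("<html></html>")
```

Using `getattr` rather than `module.Scrape` is a design choice for the sketch only: it lets the class name vary too, while behaving the same as the attribute access in the answer.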
When invoking the script, remember to pass the module name, not the file name:

    ./scrape.py -m get_div_content file.html
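
The file name fails because `import_module` treats dots as package separators, so `get_div_content.py` is read as a submodule `py` inside a package `get_div_content`, and the import raises `ImportError`. If you would like the script to tolerate either spelling, one option is to strip the suffix before importing. This is only a sketch: `module_name_from_arg` is a hypothetical helper, not part of the question or of `importlib`, and it assumes the module lives in a directory that is already on `sys.path`.

```python
from pathlib import Path


def module_name_from_arg(arg):
    """Turn 'get_div_content.py' or 'get_div_content' into 'get_div_content'."""
    path = Path(arg)
    # Only drop a trailing .py; leave dotted package paths such as 'pkg.mod' untouched.
    return path.stem if path.suffix == ".py" else arg


print(module_name_from_arg("get_div_content.py"))  # prints get_div_content
print(module_name_from_arg("get_div_content"))     # prints get_div_content
```

In `scrape.py` this would wrap the argument, for example `importlib.import_module(module_name_from_arg(args.module))`.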