How do you walk through directories using Python?
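I have a folder called "notes". Naturally the notes are sorted into folders, and within those folders there are also subfolders for subcategories. My problem is that I have a function that walks through three levels of subdirectories: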

    import os

    def obtainFiles(path):
        list_of_files = {}
        for element in os.listdir(path):
            # if the element is an html file then..
            if element[-5:] == ".html":
                list_of_files[element] = path + "/" + element
            else:  # element is a folder therefore a category
                category = os.path.join(path, element)
                # go through the category dir
                for element_2 in os.listdir(category):
                    dir_level_2 = os.path.join(path, element + "/" + element_2)
                    if element_2[-5:] == ".html":
                        print("- found file:" + element_2)
                        # add the file to the list of files
                        list_of_files[element_2] = dir_level_2
                    elif os.path.isdir(element_2):
                        subcategory = dir_level_2
                        # go through the subcategory dir
                        for element_3 in os.listdir(subcategory):
                            subcategory_path = subcategory + "/" + element_3
                            if subcategory_path[-5:] == ".html":
                                print("- found file:" + element_3)
                                list_of_files[element_3] = subcategory_path
                            else:
                                for element_4 in os.listdir(subcategory_path):
                                    print("- found file:" + element_4)

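Please note that this is still very much a work in progress, and it looks very ugly to me. What I am trying to achieve is to go down through all the folders and subfolders and put every file name in a dictionary called "list_of_files", with the file name as the key and the full path as the value. The function doesn't quite work yet, but I was wondering how one would use the os.walk function to do something similar?
Thanks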
Based on your short description, something like this should work:

    import os

    list_of_files = {}
    # os.walk visits path and every directory below it, at any depth
    for (dirpath, dirnames, filenames) in os.walk(path):
        for filename in filenames:
            if filename.endswith('.html'):
                list_of_files[filename] = os.sep.join([dirpath, filename])

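To make the loop above easier to try out, here is a minimal self-contained sketch that wraps it in a function. The function name `collect_html_files` and the `extension` parameter are my own and not from the question, and I use `os.path.join` where the answer uses `os.sep.join`, which normally builds the same path. One thing to be aware of: because the dictionary is keyed by the bare file name, two files with the same name in different folders will overwrite each other.

```python
import os


def collect_html_files(root, extension=".html"):
    """Map each matching file name to its full path under root."""
    list_of_files = {}
    # os.walk yields one (dirpath, dirnames, filenames) tuple per directory
    # and descends to any depth, so no per-level nested loops are needed.
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            if filename.endswith(extension):
                list_of_files[filename] = os.path.join(dirpath, filename)
    return list_of_files
```

Calling `collect_html_files("notes")` would then return the dictionary described in the question, assuming the notes folder is in the current working directory.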
Another option is to use a generator, building on @ig0774's code:

    import os

    def walk_through_files(path, file_extension='.html'):
        for (dirpath, dirnames, filenames) in os.walk(path):
            for filename in filenames:
                if filename.endswith(file_extension):
                    yield os.path.join(dirpath, filename)

And then:

    # path is the root directory to search; the generator requires it
    for fname in walk_through_files(path):
        print(fname)

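If you still want the dictionary from the question, the generator can feed a dict comprehension. This is only a sketch: it repeats the generator so that it runs on its own, and the helper name `build_file_index` is hypothetical.

```python
import os


def walk_through_files(path, file_extension=".html"):
    # Same generator as in the answer: lazily yield the full path of each match.
    for dirpath, dirnames, filenames in os.walk(path):
        for filename in filenames:
            if filename.endswith(file_extension):
                yield os.path.join(dirpath, filename)


def build_file_index(path, file_extension=".html"):
    # Hypothetical helper: file name as key, full path as value.
    # Later duplicates of a file name replace earlier ones.
    return {
        os.path.basename(full_path): full_path
        for full_path in walk_through_files(path, file_extension)
    }
```

The generator itself never builds a list, so it suits large trees where you only want to loop once; the dictionary is built only when you actually need lookups by name.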
You could do this:

    list_of_files = dict([
        (file, os.sep.join((dir, file)))
        for (dir, dirs, files) in os.walk(path)
        for file in files
        if file[-5:] == '.html'
    ])

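For comparison, a similar one-expression version can be sketched with `pathlib` from the standard library instead of `os.walk`. Note that this swaps in a different technique from the answer above: `Path.rglob` does the recursive search. The function name `html_files_by_name` is made up for illustration.

```python
from pathlib import Path


def html_files_by_name(root):
    # rglob("*.html") recursively matches entries named *.html under root;
    # is_file() filters out any directory that happens to match the pattern.
    return {p.name: str(p) for p in Path(root).rglob("*.html") if p.is_file()}
```

As with the other versions, files that share a name in different folders collapse to a single entry, since the file name is the key.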
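I've run into this question multiple times and none of the answers satisfied me, so I created a script for it. Python is very cumbersome to use when it comes to walking through directories.
Here's how it can be used: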

    import file_walker

    for f in file_walker.walk("/a/path"):
        print(f.name, f.full_path)  # Name is without extension
        if f.isDirectory:  # Check if object is directory
            for sub_f in f.walk():  # Easily walk on new levels
                if sub_f.isFile:  # Check if object is file (= !isDirectory)
                    print(sub_f.extension)  # Print file extension
                    with sub_f.open("r") as open_f:  # Easily open file
                        print(open_f.read())