python os.walk to certain level
I want to build a program that uses some basic code to read through a folder and tell me how many files are in it. Here is how I do that currently:
```python
import os

folders = ['Y:\\path1', 'Y:\\path2', 'Y:\\path3']

for stuff in folders:
    for root, dirs, files in os.walk(stuff, topdown=True):
        print("there are", len(files), "files in", root)
```
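This works great until there are multiple folders inside the "main" folder, because then it returns a long, junky list of files caused by poor folder/file management. So I would like to go down to the second level at most. Example: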
```
Main Folder
---file_i_want
---file_i_want
---Sub_Folder
------file_i_want <--*
------file_i_want <--*
------Sub_Folder_2
---------file_i_dont_want
---------file_i_dont_want
```
I only know how to stop at the first level, using a `break` or `del dirs[:]`, which I took from two other posts:
```python
import os
import pandas as pd

folders = ['Y:\\path1', 'Y:\\path2', 'Y:\\path3']

for stuff in folders:
    for root, dirs, files in os.walk(stuff, topdown=True):
        print("there are", len(files), "files in", root)
        del dirs[:]  # or a break here. does the same thing.
```
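But no matter how much I search, I can't figure out how to go two levels deep. Maybe I'm just not understanding the other posts on this? I was thinking of something like
You could do it like this: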
```python
for root, dirs, files in os.walk(stuff):
    if root[len(stuff)+1:].count(os.sep) < 2:
        for f in files:
            print(os.path.join(root, f))
```
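The key is the condition `root[len(stuff)+1:].count(os.sep) < 2`.
It strips `stuff` plus one separator from `root`, so the remainder is relative to `stuff`. Then it counts the path separators in that remainder and only enters the branch when there are 0 or 1 of them. Note that with `< 2` this still includes files in sub-subfolders (such as `Sub_Folder_2` in the example); use `< 1` to stop at the first level of subfolders. It also assumes `stuff` has no trailing separator.
Of course, it still scans the full file structure, but unless the tree is very deep, that will work.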
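To make the separator counting concrete, here is a small sketch that applies the same expression to a few made-up paths without touching the filesystem. The helper name `relative_sep_count` and the folder name `main` are only illustrative, not part of the original code.

```python
import os

def relative_sep_count(root, top):
    """Count path separators in `root` after stripping `top` and one separator."""
    return root[len(top) + 1:].count(os.sep)

top = "main"  # hypothetical top-level folder, no trailing separator
examples = [
    top,                                                        # the top folder itself
    os.path.join(top, "Sub_Folder"),                            # direct child
    os.path.join(top, "Sub_Folder", "Sub_Folder_2"),            # grandchild
    os.path.join(top, "Sub_Folder", "Sub_Folder_2", "deeper"),  # one level further
]

counts = [relative_sep_count(r, top) for r in examples]
print(counts)  # → [0, 0, 1, 2]
```

The top folder and its direct children both give 0, which is why `< 2` lets the grandchild level through.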
Another solution is to use `os.listdir` recursively (with a directory check) and a maximum recursion level:
```python
import os

def scanrec(root):
    rval = []

    def do_scan(start_dir, output, depth=0):
        for f in os.listdir(start_dir):
            ff = os.path.join(start_dir, f)
            if os.path.isdir(ff):
                # only descend while we are less than 2 levels below root
                if depth < 2:
                    do_scan(ff, output, depth + 1)
            else:
                output.append(ff)

    do_scan(root, rval, 0)
    return rval

print(scanrec(stuff))  # prints the list of files not below 2 deep
```
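As a variation on the same idea, the recursion could probably also be written with `os.scandir`, whose entries already know whether they are directories. This is only a sketch: the function name `scan_limited` and the `max_depth` parameter are my own, and it keeps the same depth convention as `scanrec` (with `max_depth=2`, files up to two folders below `root` are collected).

```python
import os

def scan_limited(root, max_depth=2):
    """Return paths of files located at most `max_depth` folders below `root`."""
    found = []

    def visit(directory, depth):
        # os.scandir yields DirEntry objects; the context manager closes the iterator
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_dir():
                    # descend only while the depth limit has not been reached
                    if depth < max_depth:
                        visit(entry.path, depth + 1)
                else:
                    found.append(entry.path)

    visit(root, 0)
    return found
```

Calling `scan_limited(stuff, max_depth=1)` would then keep only the files in `stuff` and in its direct subfolders.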
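Note: you can also count the separators and, once you are two levels deep, delete the contents of `dirs` so that `os.walk` does not recurse any deeper: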
```python
import os

MAX_DEPTH = 2
folders = ['Y:\\path1', 'Y:\\path2', 'Y:\\path3']
for stuff in folders:
    for root, dirs, files in os.walk(stuff, topdown=True):
        print("there are", len(files), "files in", root)
        if root.count(os.sep) - stuff.count(os.sep) == MAX_DEPTH - 1:
            del dirs[:]
```
The Python documentation describes this behaviour as follows:
When topdown is True, the caller can modify the dirnames list in-place (perhaps using del or slice assignment), and walk() will only recurse into the subdirectories whose names remain in dirnames; this can be used to prune the search, impose a specific order of visiting, or even to inform walk() about directories the caller creates or renames before it resumes walk() again.
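To illustrate the "in-place" part of that quote, here is a small sketch that prunes and orders the walk with slice assignment. The helper name `walk_visible` and the rule of skipping dot-directories are just assumptions for the example.

```python
import os

def walk_visible(top):
    """Yield (root, files) for `top`, skipping dot-directories, in sorted order."""
    for root, dirs, files in os.walk(top, topdown=True):
        # Slice assignment changes the very list that os.walk holds on to.
        # A plain `dirs = [...]` would only rebind the local name and prune nothing.
        dirs[:] = sorted(d for d in dirs if not d.startswith("."))
        yield root, files
```

The same mechanism is what makes `del dirs[:]` stop the recursion entirely.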
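Note that you need to take into account the separators already present in the entries of `folders`, which is why `stuff.count(os.sep)` is subtracted from `root.count(os.sep)`.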
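If you would rather not do the separator arithmetic by hand, one possible way to wrap it up is a small generator that measures depth with `os.path.relpath`, which should also cope with a trailing separator on the starting folder. The name `walk_max_depth` and its `max_depth` parameter are hypothetical; the depth convention follows `MAX_DEPTH` above (1 means only the top folder, 2 means the top folder plus its direct subfolders).

```python
import os

def walk_max_depth(top, max_depth):
    """Like os.walk(top), but visit at most `max_depth` levels (top counts as 1)."""
    for root, dirs, files in os.walk(top, topdown=True):
        rel = os.path.relpath(root, top)
        # "." means root is top itself; otherwise each separator adds one level
        depth = 1 if rel == os.curdir else rel.count(os.sep) + 2
        yield root, dirs, files
        if depth >= max_depth:
            # prune in place so os.walk does not descend any further
            del dirs[:]
```

Usage would look like the loop above:

```python
for root, dirs, files in walk_max_depth('Y:\\path1', 2):
    print("there are", len(files), "files in", root)
```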