Python: Handling backslash escapes in filenames read from a file

TL;DR: A text file contains strings with backslash escapes that represent filenames; how do I use them as input to os.stat()?

Say this is my input file, input.txt:

./with\backspace
./with\nnewline

Processing them with a simple loop does not work:

>>> import os
>>> with open('input.txt') as f:
...     for line in f:
...         os.stat(line.strip())
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
FileNotFoundError: [Errno 2] No such file or directory: './with\\backspace'

Using .decode("unicode_escape"), as suggested in another question, fails only for the first file, not for the second.

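The filenames in the input file have the ./ prefix. Side note: I know I could just use os.listdir('.') and iterate over the files until I find the right one. That is not my goal. The goal is to process filenames, read from a file, that contain backslash escapes.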

Additional tests:

>>> import os
>>> with open('./input.txt') as f:
...     for l in f:
...         os.stat(l.strip().decode('unicode_escape'))
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
AttributeError: 'str' object has no attribute 'decode'
>>> with open('./input.txt') as f:
...     for l in f:
...         try:
...             os.stat(l.strip().encode('utf-8').decode('unicode_escape'))
...             print(l.strip())
...         except:
...             pass
...
os.stat_result(st_mode=33188, st_ino=1053469, st_dev=2049, st_nlink=1, st_uid=1000, st_gid=1000, st_size=0, st_atime=1536468565, st_mtime=1536468565, st_ctime=1536468565)
./with\nnewline

What works: os.fsencode() with the string written out explicitly:

>>> os.stat(os.fsencode('with\x08ackspace'))
os.stat_result(st_mode=33188, st_ino=1053465, st_dev=2049, st_nlink=1, st_uid=1000, st_gid=1000, st_size=0, st_atime=1536468565, st_mtime=1536468565, st_ctime=1536468565)

No matter how many variations of the command I try, I still cannot read the string from the file in such a way that os.stat() accepts it.

>>> with open('./input.txt') as f:
...      for l in f:
...          os.stat(os.fsdecode( bytes(l.strip(),'utf-8').decode('unicode_escape').encode('latin1') ) )
...
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
FileNotFoundError: [Errno 2] No such file or directory: './with\x08ackslash'


This works on macOS:

touch $'with\backspace'
touch $'with\newline'

echo $'./with\\backspace\n./with\\newline' > input.txt
python
>>> import os
>>> with open('./input.txt') as f:
...     for l in f:
...         os.stat(l.strip().decode('unicode_escape'))
posix.stat_result(st_mode=33188, st_ino=8604304962, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=0, st_atime=1536469815, st_mtime=1536469815, st_ctime=1536469815)
posix.stat_result(st_mode=33188, st_ino=8604305024, st_dev=16777220, st_nlink=1, st_uid=501, st_gid=20, st_size=0, st_atime=1536470112, st_mtime=1536470112, st_ctime=1536470112)

This is Python 2.7.14 on Darwin kernel version 17.7.0.


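After about 2 hours of research, I found that the input file contained ./with\backslash, whereas the actual file had been created with touch with$'\b'ackspace. So Raftery's answer works, but only in Python 2. In Python 3 you get AttributeError: 'str' object has no attribute 'decode', because strings in Python 3 are already Unicode strings.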
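
To illustrate the Python 3 side of this, here is a minimal sketch of one way to interpret the escapes when the line is already a str. The helper name unescape is made up for this example, and the round trip through Latin-1 with backslashreplace is just one common idiom, not necessarily what either answer intended.

```python
def unescape(text):
    # Characters outside Latin-1 are first rewritten as \uXXXX escapes, so the
    # unicode_escape codec can restore them together with \b, \n, and so on.
    return text.encode("latin-1", "backslashreplace").decode("unicode_escape")

line = r"./with\backspace"   # a line as read from the text file
name = unescape(line)

print(repr(line))   # backslash and "b" are two separate characters
print(repr(name))   # the two characters have become one backspace character
```

The resulting str can be passed to os.stat() directly, since Python 3 encodes str paths with the filesystem encoding on its own.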

In the process, I may have found a better way, via os.fsencode(). See JFS's answer.

import os

with open('./input.txt') as f:
    for l in f:
        # Interpret backslash escapes such as \b or \n in the line.
        # Alternatively, one can use bytes(l.strip(), sys.getdefaultencoding()).
        filename = bytes(l.strip(), 'utf-8').decode('unicode_escape')
        # filename is already a str here, so os.fsdecode() returns it unchanged.
        f_stat = os.stat(os.fsdecode(filename))
        print(l.strip(), f_stat)

Since I mainly use Python 3, this is what I was looking for. However, Raftery's answer is nonetheless valid, hence +1'ed.
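
For anyone who wants to reproduce the whole thing without touching their working directory, here is a hedged end-to-end sketch for Python 3. It assumes a POSIX filesystem (control characters are not allowed in Windows filenames), and the function name stat_escaped_names is invented for this example. It creates the two files in a temporary directory, writes the escaped names to input.txt, and then stats each one.

```python
import os
import tempfile

def stat_escaped_names(list_path, base_dir):
    """Stat each file whose escaped name appears on a line of list_path."""
    results = {}
    with open(list_path, encoding="utf-8") as f:
        for line in f:
            escaped = line.strip()
            if not escaped:
                continue
            # Turn the two characters backslash + b into a real backspace, etc.
            name = escaped.encode("latin-1", "backslashreplace").decode("unicode_escape")
            results[escaped] = os.stat(os.path.join(base_dir, name))
    return results

with tempfile.TemporaryDirectory() as tmp:
    # Real filenames containing a backspace and a newline character.
    for real_name in ("with\x08ackspace", "with\nnewline"):
        open(os.path.join(tmp, real_name), "w").close()

    # The list file holds the escaped spellings, one per line.
    list_path = os.path.join(tmp, "input.txt")
    with open(list_path, "w", encoding="utf-8") as f:
        f.write("./with\\backspace\n./with\\nnewline\n")

    stats = stat_escaped_names(list_path, tmp)
    found = sorted(stats)
    print(found)
```

If a line in the list file does not match the real name exactly (for example backslash instead of backspace), os.stat() raises FileNotFoundError, which is the mismatch described above.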