Python readline with custom delimiter

end-of-line python readline

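Novice here. I am trying to read lines from a file, but one line in the .txt file has a '\n' somewhere in the middle, and when I read that line with .readline(), Python cuts it at that point and outputs it as two lines.

When I copy the line and paste it into this window, it shows up as two lines, so I uploaded the file here: https://ufile.io/npt3n
I have also added a screenshot of the file as it appears in the .txt file.
In case you are wondering, this is a group chat history exported from WhatsApp.
Please help me read the line in full, as it is shown in the .txt file.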

f = open("f.txt", mode='r', encoding='utf8')

for i in range(4):
    lineText = f.readline()
    print(lineText)

f.close()

[Screenshot of the text file]

Related discussion

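Python 3 lets you define the newline sequence for a specific file. This is rarely used, because the default universal newlines mode is very tolerant: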

When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller.

So here you should state explicitly that only '\r\n' is an end of line:

f = open("f.txt", mode='r', encoding='utf8', newline='\r\n')

# use enumerate to show that second line is read as a whole
for i, line in enumerate(f):
    print(i, line)

f.close()
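To check the difference between the two modes without the original file, here is a minimal sketch. The sample bytes and the helper name `read_lines` are made up for illustration: the sketch assumes records end in '\r\n' and that one record contains a bare '\n', which is what the question seems to describe.

```python
import os
import tempfile

# Hypothetical sample: two records terminated by "\r\n";
# the second record has a bare "\n" in the middle.
SAMPLE = b"first record\r\nsecond record, part one\nsecond record, part two\r\n"


def read_lines(path, newline=None):
    """Return the lines of `path` as a list, using the given newline mode."""
    with open(path, mode="r", encoding="utf8", newline=newline) as f:
        return list(f)


# Write in binary mode so the line endings are stored exactly as given.
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as tmp:
    tmp.write(SAMPLE)
    sample_path = tmp.name

try:
    universal = read_lines(sample_path)                # default: universal newlines
    strict = read_lines(sample_path, newline="\r\n")   # only "\r\n" ends a line
finally:
    os.remove(sample_path)

print(len(universal), universal)
print(len(strict), strict)
```

With newline='\r\n' the line ending is returned untranslated, so each line still ends in '\r\n'; something like line.rstrip("\r\n") removes it if it is not wanted.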

Related discussion

Instead of using the readline function, you can read the whole content and split it into lines with a regex:

import re

with open("txt", "r") as f:
    content = f.read()
    # remove end-of-line characters
    content = content.replace("\n", "")
    # split on the "[...]" prefix that starts each entry
    lines = re.compile(r"(\[[0-9//, :\]]+)").split(content)
    # drop empty "" elements
    lines = [x for x in lines if x != ""]
    # join by pairs (prefix + text)
    lines = [i + j for i, j in zip(lines[::2], lines[1::2])]

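If every entry starts the same way, with [...], you can split on that prefix and then clean up by dropping all the "" elements. You can then use the zip function (https://stackoverflow.com/a/5851033/1038301) to join each pair of parts.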
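The split-and-zip approach can be tried on an in-memory string. This is only a sketch: the sample text, the names in it, and the "[date, time]" prefix format are assumptions, not taken from the uploaded file. It also deviates slightly from the code above by replacing each newline with a space rather than an empty string, so that the words on either side of a break do not run together.

```python
import re

# Hypothetical export text; the "[date, time]" prefix format is an assumption.
content = (
    "[01/02/2018, 10:15:00] Alice: hello\n"
    "[01/02/2018, 10:16:30] Bob: a message that\n"
    "continues on a second line\n"
    "[01/02/2018, 10:17:05] Alice: bye\n"
)

# Same pattern as in the answer: "[" followed by digits, "/", ",", " ", ":" or "]".
STAMP = re.compile(r"(\[[0-9//, :\]]+)")

# Deviation from the answer: newlines become spaces instead of being dropped.
flat = content.replace("\n", " ")

# The capturing group keeps each prefix in the result; empty strings are dropped.
parts = [p for p in STAMP.split(flat) if p != ""]

# Pair each prefix with the text that follows it.
messages = [(stamp + text).strip() for stamp, text in zip(parts[::2], parts[1::2])]

for message in messages:
    print(message)
```

The pairing relies on every prefix being followed by non-empty text; an entry with an empty body would shift the pairs, so real data may need a check for that.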