Selenium: Python Loop Overwriting Last HTML Write

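This script loops at "while True:". It is meant to scrape data from multiple pages by clicking the "Next" button at the bottom of each page, but I can't figure out how to structure the code so that it keeps adding to the HTML output as it paginates. Instead, each page overwrites the HTML written for the previous one. Any help is appreciated. Thanks!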

while True:
    time.sleep(10)

    golds = driver.find_elements_by_css_selector(".widgetContainer #widgetContent > div.singleCell")
    print("found %d golds" % len(golds))

    template = """\
<tr class="border">
<td class="image"><img src="{0}"></td>\
<td class="title">{2}</td>\
<td class="price">{3}</td>
</tr>"""

    lines = []

    for gold in golds:
        goldInfo = {}

        goldInfo['title'] = gold.find_element_by_css_selector('#dealTitle > span').text
        goldInfo['link'] = gold.find_element_by_css_selector('#dealTitle').get_attribute('href')
        goldInfo['image'] = gold.find_element_by_css_selector('#dealImage img').get_attribute('src')

        try:
            goldInfo['price'] = gold.find_element_by_css_selector('.priceBlock > span').text
        except NoSuchElementException:
            goldInfo['price'] = 'No price display'

        line = template.format(goldInfo['image'], goldInfo['link'], goldInfo['title'], goldInfo['price'])
        lines.append(line)

    try:
        # clicks next button
        driver.find_element_by_link_text("Next→").click()
    except NoSuchElementException:
        break

    time.sleep(10)

    html = """\
<html>
<body>
<table>
<tr class='headers'>
<td class='image'></td>
<td class='title'>Product</td>
<td class='price'>Price / Deal</td>
</tr>
</table>
<table class='data'>
{0}
</table>
</body>
</html>\
"""

    f = open('./result.html', 'w')
    f.write(html.format('\n'.join(lines)))
    f.close()

Discussion

Look at the different modes available when you open the file at the end of the script: https://docs.python.org/2/library/functions.html#open

The most commonly-used values of mode are 'r' for reading, 'w' for writing (truncating the file if it already exists), and 'a' for appending

And there is more:

Modes 'r+', 'w+' and 'a+' open the file for updating (reading and writing); note that 'w+' truncates the file. Append 'b' to the mode to open the file in binary mode, on systems that differentiate between binary and text files; on systems that don’t have this distinction, adding the 'b' has no effect.

So you have a few options. You will probably want to use 'a', since you want to append data to the file.
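To make the difference between the two modes concrete, here is a minimal sketch that does not use Selenium. The helper write_pages and the page strings are made up for illustration. It reopens a temporary file once per "page", as the question's loop appears to do, first with 'w' and then with 'a'.

```python
import os
import tempfile


def write_pages(mode, pages):
    """Write each page with open(path, mode) and return the final file content."""
    fd, path = tempfile.mkstemp(suffix=".html")
    os.close(fd)
    try:
        for page in pages:
            # The file is reopened on every iteration, once per page.
            with open(path, mode) as f:
                f.write(page + "\n")
        with open(path) as f:
            return f.read()
    finally:
        os.remove(path)


# Hypothetical stand-ins for the rows scraped from three pages.
pages = ["page 1 rows", "page 2 rows", "page 3 rows"]

overwritten = write_pages("w", pages)  # 'w' truncates on every open
appended = write_pages("a", pages)     # 'a' keeps what is already there

print(repr(overwritten))
print(repr(appended))
```

With 'w' only the last page survives; with 'a' all three pages end up in the file.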

Alternatively, you can move the call that opens the file outside the loop, so that the file is not reopened on every iteration.

f = open('./result.html', 'w')
while True:
    # do stuff
    f.write(...)
f.close()
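The outline above can be fleshed out roughly as follows. This is only a sketch under stated assumptions: the Selenium scraping is replaced by a hypothetical list of pages (fake_pages), and write_report, HEADER, FOOTER and ROW are illustrative names, not part of the original script. The point is that the file is opened once, the page wrapper is written once, and every page only contributes its rows.

```python
import os
import tempfile

HEADER = "<html>\n<body>\n<table class='data'>\n"
FOOTER = "</table>\n</body>\n</html>\n"
ROW = '<tr class="border"><td class="title">{0}</td><td class="price">{1}</td></tr>\n'


def write_report(path, pages):
    """Open the file once, write one row per item on every page, then close it."""
    rows_written = 0
    with open(path, "w") as f:      # opened (and truncated) exactly once
        f.write(HEADER)
        for page in pages:          # stands in for the "while True:" paging loop
            for title, price in page:
                f.write(ROW.format(title, price))
                rows_written += 1
        f.write(FOOTER)             # written after the last page
    return rows_written


# Hypothetical scraped data: two pages of (title, price) pairs.
fake_pages = [
    [("Widget A", "$5"), ("Widget B", "No price display")],
    [("Widget C", "$7")],
]

report_path = os.path.join(tempfile.mkdtemp(), "result.html")
count = write_report(report_path, fake_pages)

with open(report_path) as f:
    report = f.read()

print(count, report.count("<tr"))
```

Using a with block instead of a bare open()/close() pair also closes the file if the scraping raises partway through.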

You should open the file in append mode:

f = open('./result.html', 'a')
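Two caveats are worth sketching. First, append mode also keeps whatever an earlier run of the script left in the file. Second, assuming the write in the question sits inside the loop, switching only the mode would append a complete html wrapper for every page. One possible arrangement, with hypothetical helper names (start_report, append_rows, finish_report), is to truncate the file once at the start of a run and append only rows per page:

```python
import os
import tempfile


def start_report(path):
    """Create or empty the file once, before the paging loop starts."""
    with open(path, "w") as f:
        f.write("<table class='data'>\n")


def append_rows(path, rows):
    """Append one page's rows; earlier content is left untouched."""
    with open(path, "a") as f:
        for row in rows:
            f.write(row + "\n")


def finish_report(path):
    """Append the closing tag after the last page."""
    with open(path, "a") as f:
        f.write("</table>\n")


out_path = os.path.join(tempfile.mkdtemp(), "result.html")

for run in range(2):  # simulate running the script twice on the same file
    start_report(out_path)
    for page_rows in (["<tr>page 1</tr>"], ["<tr>page 2</tr>"]):
        append_rows(out_path, page_rows)
    finish_report(out_path)

with open(out_path) as f:
    result = f.read()

print(result)
```

Because start_report truncates the file, the second run does not pile its rows on top of the first run's output.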