Python requests.get() returns broken source code instead of expected source code?

I'm making a request to the Wikipedia page below. Specifically, I need to scrape the "Results" matrix from https://en.wikipedia.org/wiki/2017%E2%80%9318_La_Liga#Results

selectedSeasonPage = requests.get('https://en.wikipedia.org/wiki/2017–18_La_Liga', features='html5lib')

Running pprint.pprint(selectedSeasonPage.text) and jumping to the source of the matrix shows that it is incomplete.

HTML snippet returned by requests.get():

<table class="wikitable plainrowheaders" style="text-align:center;font-size:100%;">
.
.
<th scope="row" style="text-align:right;">Alavés</th>
<td style="font-weight: normal;background-color:transparent;">— </td>
<td style="white-space:nowrap;font-weight: normal;background-color:transparent;"></td>
<td style="white-space:nowrap;font-weight: normal;background-color:transparent;"></td>
<td style="white-space:nowrap;font-weight: normal;background-color:transparent;"></td>
<td style="white-space:nowrap;font-weight: normal;background-color:transparent;"></td>
<td style="white-space:nowrap;font-weight: normal;background-color:transparent;"></td>
<td style="white-space:nowrap;font-weight: normal;background-color:#BBF3FF;">2–1</td>

Viewing the HTML returned by requests.get() in a browser shows the same incomplete table.
See this image for reference.

Snippet from view-source, which is the desired output:

<table class="wikitable plainrowheaders" style="text-align:center;font-size:100%;">
.
.
Alavés</th>
<td style="font-weight: normal;background-color:transparent;">—</td>
<td style="white-space:nowrap;font-weight: normal;background-color:#BBF3FF;">3–1</td>
<td style="white-space:nowrap;font-weight: normal;background-color:#FFBBBB;">0–1</td>
<td style="white-space:nowrap;font-weight: normal;background-color:#FFBBBB;">0–2</td>
<td style="white-space:nowrap;font-weight: normal;background-color:#BBF3FF;">2–1</td>
<td style="white-space:nowrap;font-weight: normal;background-color:#BBF3FF;">1–0</td>
<td style="white-space:nowrap;font-weight: normal;background-color:#FFBBBB;">1–2</td>

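I've posted sample HTML for reference, since I can't post the entire output. I can post more specific parts if needed.

My question is: how do I get the full source of the matrix without losing any values?

From my understanding of previous questions, requests cannot return the expected output if parts of the page are rendered by JavaScript. But this page appears to be plain HTML and CSS (at least the part I need). I can't use Selenium because I need to scrape multiple pages. A solution using requests or an equivalent would be much appreciated.

The requests version is 2.19.1. The Python version is 3.7.0.

What am I missing? I'm new to this, so any help is appreciated.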

Related discussion

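Almost your exact code, but without the "features" parameter in the get call (requests.get() does not accept it; features is an argument to the BeautifulSoup constructor):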

import requests
selectedSeasonPage = requests.get('https://en.wikipedia.org/wiki/2017–18_La_Liga')
print(selectedSeasonPage.text)

gives me:

<th scope="row" style="text-align:right;">Alavés
</th>
<td style="font-weight:normal;background:transparent;">—</td>
<td style="white-space:nowrap;font-weight:normal;background:#BBF3FF;">3–1</td>
<td style="white-space:nowrap;font-weight:normal;background:#FBB;">0–1</td>
<td style="white-space:nowrap;font-weight:normal;background:#FBB;">0–2</td>
<td style="white-space:nowrap;font-weight:normal;background:#BBF3FF;">2–1</td>
<td style="white-space:nowrap;font-weight:normal;background:#BBF3FF;">1–0</td>
<td style="white-space:nowrap;font-weight:normal;background:#FBB;">1–2</td>
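
If it helps, here is a minimal sketch of pulling the cell values out of rows like the ones above once the full HTML is in hand. It uses only the standard library's html.parser so it runs without extra packages; the names TableRowParser, parse_rows and fetch_rows are hypothetical, and this is one possible approach rather than the only way to do it. It does not handle nested tables.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of each <th>/<td> cell, grouped by <tr>."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row being read, None outside <tr>
        self._cell = None  # text fragments of the open cell, None outside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
            self._cell = None
        elif tag in ("th", "td") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside an open cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("th", "td"):
            if self._row is not None and self._cell is not None:
                self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(html):
    """Return every table row in the HTML as a list of cell strings."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


def fetch_rows(url):
    """Download a page with requests and parse its table rows."""
    # Imported here so parse_rows() works even without requests installed.
    import requests

    response = requests.get(url)  # note: no features= argument
    response.raise_for_status()   # fail loudly on 4xx/5xx responses
    return parse_rows(response.text)
```

Calling fetch_rows('https://en.wikipedia.org/wiki/2017–18_La_Liga') would return the rows of every table on the page, so you would still need to pick out the matrix yourself, for example by keeping rows whose first cell is a team name such as "Alavés".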