When is absoluteURI used, according to the HTTP request specs?

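I have been looking into the HttpServletRequest API (Java), which has the getRequestURI and getRequestURL methods. That led me to https://tools.ietf.org/html/rfc7230, section 5.3. As far as I understand, getRequestURI returns the value from the first line of the HTTP request, which most of the time is the relative path to the resource, unless the origin server is behind an inbound proxy, in which case it must be an absolute URI. I assume most origin servers of popular websites fall into that category, which would mean the URI in the raw HTTP request should be an absolute URI (according to the HTTP spec), but I have not found such an example anywhere. Can a browser actually know whether it is sending the request to an inbound proxy or directly to the origin server? Does the absoluteURI concept in the HTTP spec have any practical value, given that the Host header field is always sent in HTTP/1.1 requests? Did this part of the spec have practical value in the days of HTTP/1.0, when there was no Host header field?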


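I think you may be confused about the type of proxy being discussed. It looks like the RFC is referring to a forward proxy, where the client makes requests to different servers through the proxy (the client tells the proxy where to forward the traffic).

With a reverse proxy, you are right: the client has no idea that the request is being proxied to another server.

Difference between proxy server and reverse proxy server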


Daniel Scott has identified the source of my original confusion. I will make a note of some points that weren't so clear to me and prevented me from understanding the specs correctly:

  • Forward proxies are often referred to simply as "proxies".
  • Reverse proxies are often referred to as "gateways".
  • For some reason I considered forward proxies to be synonymous with outbound proxies and reverse proxies to be synonymous with inbound proxies. I think I saw this in some article about proxies somewhere; I don't know if these terms are widely used.
  • At the TCP/IP level, when we are behind a forward proxy, all web traffic is sent to that proxy. The browser never communicates with the origin server directly and has to somehow send the address (IP or domain name) to the forward proxy so that the proxy can communicate with the origin server on the client's behalf. This happens at the HTTP protocol level, in the request-line. When we are not behind a forward proxy, we can communicate with the origin server directly over TCP/IP, and the absolute URL in the request-line is not needed.
  • Absolute URLs in the request-line date from the time of HTTP/1.0 and were designed to handle the problem of communicating from behind a forward proxy. The Host header field was made mandatory by the HTTP/1.1 specification, which thereby introduced support for virtual hosting. I guess HTTP/1.1 could have simply made the absolute URL mandatory and killed two birds with one stone, but for some reason it decided that the Host header solution was better.
  • I realize that aside from forward and reverse proxies there are also "transparent" proxies. These are CDNs or proxies used by ISPs and whatnot. They are invisible to both communicating parties and need no configuration on either side. They have nothing to do with this question, but they were something that used to confuse me.

I also want to say that I did an experiment which confirmed what is stated in the HTTP specs.

I googled "free proxy ip and port", went to "https://www.hide-my-ip.com/proxylist.shtml" and configured Windows to use a forward proxy (Control Panel -> Internet Options -> Connections -> LAN settings -> Use a proxy server). Then I made a request to www.bbc.com and examined the raw HTTP request in the Chrome console's Network tab: the address in the request-line was absolute. I then removed the proxy and made the same request. The address in the request-line was now just the path.

I'm not sure about the whole reconstruction of the URL by a proxy that Alexius Diakogianos mentions. It seems very logical that most forward proxies offer this as a fallback when the client does not send the absolute URL, but from what I can see, Chrome at least sends the absolute URL to the proxy correctly when it knows that it is behind one. Of course, I have never managed or run a forward proxy myself, so I wouldn't know.
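
To make the difference observed in the experiment concrete, here is a minimal Java sketch of how a client might choose the request-line form. The class name `RequestLineSketch`, the method `requestLine`, and the `viaForwardProxy` flag are hypothetical names introduced for illustration; the sketch assumes a plain `http://` target and does not describe how any particular browser is implemented.

```java
import java.net.URI;

final class RequestLineSketch {

    // Builds an HTTP/1.1 request line for the given target.
    // viaForwardProxy is an assumed flag: true when the client has been
    // configured to send its traffic through a forward proxy.
    static String requestLine(String method, URI target, boolean viaForwardProxy) {
        String requestTarget;
        if (viaForwardProxy) {
            // Absolute form: the proxy needs the scheme and host to know
            // where to forward the request. (Simplification: a fragment,
            // if present, is not stripped here.)
            requestTarget = target.toString();
        } else {
            // Path-only form: the host is conveyed by the TCP connection
            // and the Host header, so only path and query are sent.
            String path = target.getRawPath();
            if (path == null || path.isEmpty()) {
                path = "/"; // the path in the request line cannot be empty
            }
            String query = target.getRawQuery();
            requestTarget = (query == null) ? path : path + "?" + query;
        }
        return method + " " + requestTarget + " HTTP/1.1";
    }

    public static void main(String[] args) {
        URI target = URI.create("http://www.bbc.com/news");
        System.out.println(requestLine("GET", target, true));
        System.out.println(requestLine("GET", target, false));
    }
}
```

The only input that changes the output is whether a proxy is configured, which matches the observation that the client knows about a forward proxy because it was told about it in its settings.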


    From the HTTP/1.0 protocol specification:

    The absoluteURI form is only allowed when the request is being made to a proxy. The proxy is requested to forward the request and return the response. If the request is GET or HEAD and a prior response is cached, the proxy may use the cached message if it passes any restrictions in the Expires header field. Note that the proxy may forward the request on to another proxy or directly to the server specified by the absoluteURI. In order to avoid request loops, a proxy must be able to recognize all of its server names, including any aliases, local variations, and the numeric IP address. An example Request-Line would be:

        GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.0

    The most common form of Request-URI is that used to identify a
    resource on an origin server or gateway. In this case, only the
    absolute path of the URI is transmitted (see Section 3.2.1, abs_path).
    For example, a client wishing to retrieve the resource above directly
    from the origin server would create a TCP connection to port 80 of the
    host "www.w3.org" and send the line:

        GET /pub/WWW/TheProject.html HTTP/1.0

    followed by the remainder of the Full-Request. Note that the absolute path cannot be empty; if
    none is present in the original URI, it must be given as "/" (the
    server root).
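
The two forms quoted above can be told apart mechanically on the receiving side. The following is a rough Java sketch, not taken from the specification or from any server: the class name `RequestTargetForm` and the returned labels are assumptions made for illustration, and anything that is neither form (for example `*`) is simply reported as "other".

```java
final class RequestTargetForm {

    // Classifies the Request-URI of a request line as "absoluteURI"
    // (the form sent to a proxy) or "abs_path" (the form sent to an
    // origin server or gateway). Anything else is reported as "other".
    static String classify(String requestLine) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected: METHOD TARGET VERSION");
        }
        String target = parts[1];
        if (target.startsWith("/")) {
            return "abs_path";
        }
        // A scheme followed by "://", e.g. "http://www.w3.org/...".
        if (target.matches("[A-Za-z][A-Za-z0-9+.-]*://.*")) {
            return "absoluteURI";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classify("GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.0"));
        System.out.println(classify("GET /pub/WWW/TheProject.html HTTP/1.0"));
    }
}
```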

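    So yes, all of this has practical meaning, but only if you know that you are actually sending the request to a proxy. A browser cannot really know that it is talking to a proxy, but since this is the most common case, that is why the host and the URI are always transmitted as separate properties rather than as one explicit path. Modern proxies (and not so modern ones) reconstruct the URL from the host, protocol, port, and URI.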

    Take the following example:

    GET /standards/ HTTP/1.1
    Host: www.w3.org
    User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:49.0) Gecko/20100101 Firefox/49.0
    Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    Accept-Language: en-US,en;q=0.5
    Accept-Encoding: gzip, deflate, br
    Referer: https://www.w3.org/
    Connection: keep-alive
    Upgrade-Insecure-Requests: 1

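    The proxy will reconstruct the URL that the client used to make the request. The resulting URL contains the protocol, server name, port number, and server path.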
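
The reconstruction described above can be sketched in a few lines of Java. This is an illustration under stated assumptions, not how any specific proxy does it: the class `UrlReconstructionSketch` and its `reconstruct` method are hypothetical, the scheme is passed in because it is not part of the request itself, and header lookup is case-sensitive here for brevity (real header names are case-insensitive).

```java
import java.util.Map;

final class UrlReconstructionSketch {

    // Rebuilds the full URL from the request line and the Host header.
    // scheme is an assumed input: the receiver knows it from the kind of
    // connection (plain or TLS), not from the request text.
    static String reconstruct(String scheme, String requestLine, Map<String, String> headers) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected: METHOD TARGET VERSION");
        }
        String target = parts[1];
        if (target.contains("://")) {
            // The client already sent an absolute URI; nothing to rebuild.
            return target;
        }
        // The Host header carries the server name and, if non-default, the port.
        String host = headers.get("Host");
        if (host == null || host.isEmpty()) {
            throw new IllegalArgumentException("Host header is required to rebuild the URL");
        }
        return scheme + "://" + host + target;
    }
}
```

For the example request above, calling `reconstruct("https", "GET /standards/ HTTP/1.1", headers)` with a `Host` entry of `www.w3.org` yields `https://www.w3.org/standards/`.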

    Something similar is done in Java. If you look at the Servlet API specification, you will see the same behavior.

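    So, as a rule of thumb, the absolute URI form is only allowed when the request is made to a proxy. The request does not necessarily come from a browser, but if the proxy does not receive an absolute URI, it constructs the URL from the rest of the data in the headers, similar to Java's getRequestURL.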