urllib2:
Date: XXX
Server: Apache
Last-Modified: XXX
Accept-Ranges: bytes
Content-Length: 12345678
Vary: Accept-Encoding
Connection: close
Content-Type: text/plain
requests:
Content-Encoding: gzip
Accept-Ranges: bytes
Vary: Accept-Encoding
Keep-alive: timeout=5, max=128
Last-Modified: XXX
Connection: Keep-Alive
ETag: xxxxxxxxx
Content-Type: text/plain
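Why is Content-Length missing from the requests response? All the other request settings are exactly the same. What requests returns matches what Chrome DevTools shows, but I need the Content-Length value here (for resumable downloads).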
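import urllib2
import requests

# Placeholder URL and credentials; substitute the real ones.
url = 'http://example.com'
headers = {
    "Authorization": "Basic xxxx",
    "Range": "bytes=0-"
}

# Fetch with urllib2 and print the response headers.
req = urllib2.Request(url, headers=headers)
resp = urllib2.urlopen(req)
print resp.info()

# Fetch the same URL with requests, passing the same headers dict.
r = requests.get(url, headers=headers)
print r.headers

# Both responses describe the same resource.
assert resp.info()['ETag'] == r.headers['ETag']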
Date: Sat, 14 Jan 2017 09:39:50 GMT
Server: Apache
Last-Modified: Sat, 14 Jan 2017 09:39:49 GMT
ETag: "e91103-10e04f7-5460abb4743a3"
Accept-Ranges: bytes
Content-Length: 17695991
Vary: Accept-Encoding
Content-Range: bytes 0-17695990/17695991
Connection: close
Content-Type: text/plain
{'Content-Encoding': 'gzip', 'Transfer-Encoding': 'chunked', 'Accept-Ranges': 'bytes', 'Vary': 'Accept-Encoding', 'Keep-Alive': 'timeout=5, max=128', 'Server': 'Apache', 'Last-Modified': 'Sat, 14 Jan 2017 09:39:49 GMT', 'Connection': 'Keep-Alive', 'ETag': '"e91103-10e04f7-5460abb4743a3"', 'Date': 'Sat, 14 Jan 2017 09:39:50 GMT', 'Content-Type': 'text/plain'}
I also knew that the two requests must have been sending different headers. It is finally solved now.
The responses differ because requests indicates that it supports gzip-encoded bodies by sending an Accept-Encoding: gzip, deflate header field. urllib2 does not. If you add that header to your urllib2 request, you will get the same behaviour as with requests.
Clearly, in this case, the server is gzipping the responses dynamically. This means it doesn't know in advance how long the response will be, so it sends the body using chunked transfer encoding, which carries no Content-Length header.
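To illustrate why a server that compresses on the fly cannot announce a length up front, here is a minimal sketch in Python 3. It is not how Apache implements this; `gzip_stream` and `chunked_frames` are hypothetical helper names that only mimic the idea: compressed pieces are produced as the input arrives, and each piece is framed with its own size in the HTTP/1.1 chunked format.

```python
import zlib


def gzip_stream(chunks):
    """Yield gzip-compressed pieces as the input chunks arrive.

    The total compressed size is only known after the final flush,
    which is why no Content-Length can be sent before the body.
    """
    # wbits=31 (16 + 15) selects the gzip container format.
    compressor = zlib.compressobj(wbits=31)
    for chunk in chunks:
        piece = compressor.compress(chunk)
        if piece:  # the compressor may buffer and return nothing yet
            yield piece
    yield compressor.flush()


def chunked_frames(pieces):
    """Frame each piece in HTTP/1.1 chunked transfer encoding.

    Every chunk is "<hex size>\\r\\n<data>\\r\\n"; a zero-size chunk ends the body.
    """
    for piece in pieces:
        if piece:
            yield b"%x\r\n%s\r\n" % (len(piece), piece)
    yield b"0\r\n\r\n"
```

Chaining the two, as in `b"".join(chunked_frames(gzip_stream(parts)))`, gives a body like the one requests received: gzip-encoded, chunked, and without a Content-Length.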
If you really must get the Content-Length header, then you should add the following header to your requests call: {'Accept-Encoding': 'identity'}.
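For the resumable-download use case, that advice might be sketched as follows in Python 3. The helper names `resume_headers` and `total_size` are hypothetical, and the sketch assumes the server answers a Range request with a Content-Range header, as in the urllib2 output above. Note that on a partial response Content-Length is the size of the returned part, so the total is read from Content-Range when it is present.

```python
import re

# Matches e.g. "bytes 0-17695990/17695991"; the total may be "*" if unknown.
_CONTENT_RANGE = re.compile(r"bytes\s+(\d+)-(\d+)/(\d+|\*)$")


def resume_headers(offset=0, authorization=None):
    """Build request headers for resuming a download at byte `offset`.

    Accept-Encoding: identity asks the server not to compress the body,
    so it can report a length instead of using chunked encoding.
    """
    headers = {
        "Range": "bytes=%d-" % offset,
        "Accept-Encoding": "identity",
    }
    if authorization is not None:
        headers["Authorization"] = authorization
    return headers


def total_size(headers):
    """Return the full size of the resource in bytes, or None if unknown."""
    content_range = headers.get("Content-Range")
    if content_range is not None:
        match = _CONTENT_RANGE.match(content_range.strip())
        if match and match.group(3) != "*":
            return int(match.group(3))
        return None
    # No Content-Range: a full response, so Content-Length is the total.
    content_length = headers.get("Content-Length")
    return int(content_length) if content_length is not None else None
```

With requests this would be used as `r = requests.get(url, headers=resume_headers(0, "Basic xxxx"), stream=True)` followed by `total_size(r.headers)`. Headers passed this way take priority over the library's default Accept-Encoding value.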
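|  |      1 redhatping      2017-01-14 17:18:47 +08:00 Read the official documentation. | 
|  |      2 binux      2017-01-14 17:20:43 +08:00 via Android Content-Length should not be set manually. | 
|  |      3 dofine OP @binux I did not describe it clearly. The results above are the response headers; what I mean is that I need to know the current Content-Length value, but it is missing from what requests returns, while urllib2 has it. @redhatping I have already read the documentation many times. Could it be a server-side problem? | 
|  |      4 hahastudio      2017-01-14 17:25:47 +08:00 What you posted is the result. The cause must still be that the requests you send are different. | 
|  |      5 Lonely      2017-01-14 17:36:14 +08:00 via iPhone Post your code too... | 
|  |      6 dofine OP {'Range': 'bytes=0-', 'Authorization': 'Basic XXX'} I added this header manually, and the ETag returned by urllib2 and requests is the same. So why would the requests being sent differ? @hahastudio | 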
|  |      7 dofine OP ``` import os import urllib2 import requests url = 'http://example.com' headers = { "Authorization": "Basic xxxx", "Range": "bytes=0-" } req = urllib2.Request(url, headers=headers) resp = urllib2.urlopen(req) print resp.info() r = requests.get(url, headers=headers) print r.headers assert resp.info()['ETag'] == r.headers['ETag'] ``` | 
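|  |      9 lhbc      2017-01-14 17:45:03 +08:00  1 Clearly your request headers are different. | 
|  |      10 hahastudio      2017-01-14 17:45:36 +08:00  1 Does requests let you set auth separately? http://docs.python-requests.org/en/master/user/authentication/ | 
|  |      11 dofine OP @hahastudio I used the method from the documentation from the start, and the result is the same as setting auth manually. | 
|  |      12 lbp0200      2017-01-14 17:55:21 +08:00  1 Try a random User-Agent. | 
|  |      13 dofine OP Thanks, everyone. | 
|  |      14 dsg001      2017-01-14 19:10:06 +08:00 Capture the traffic and check whether the requests being sent actually differ. | 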
|  |      15qgy18      2017-01-14 23:11:08 +08:00 via iPhone  1 |