
Well, Baiduspider has managed to give me a 502

    aladd · 2016-05-21 13:40:43 +08:00 · 3364 views

    220.181.108.143 - - [21/May/2016:07:12:02 +0800] "GET /?t=nkvka HTTP/1.1" 200 13592 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    220.181.108.145 - - [21/May/2016:07:12:02 +0800] "GET /?t=mdjki HTTP/1.1" 200 13583 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    220.181.108.91 - - [21/May/2016:07:12:02 +0800] "GET /?t=uwk79 HTTP/1.1" 200 13581 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    220.181.108.80 - - [21/May/2016:07:12:12 +0800] "GET /?t=mcn9f HTTP/1.1" 200 13617 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    220.181.108.160 - - [21/May/2016:07:12:14 +0800] "GET /?t=ucwmk HTTP/1.1" 200 13615 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    220.181.108.82 - - [21/May/2016:07:12:18 +0800] "GET /?t=jshz3 HTTP/1.1" 200 13605 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    220.181.108.75 - - [21/May/2016:07:12:18 +0800] "GET /?t=8vzbr HTTP/1.1" 200 13583 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    123.125.71.12 - - [21/May/2016:07:12:18 +0800] "GET /?t=h4ipq HTTP/1.1" 200 13601 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    123.125.71.49 - - [21/May/2016:07:12:18 +0800] "GET /?t=cbpyc HTTP/1.1" 200 13540 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    220.181.108.161 - - [21/May/2016:07:12:18 +0800] "GET /?t=vfz6f HTTP/1.1" 200 13516 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    123.125.71.21 - - [21/May/2016:07:12:18 +0800] "GET /?t=eroe3 HTTP/1.1" 200 13608 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    220.181.108.92 - - [21/May/2016:07:12:19 +0800] "GET /?t=qwjtk HTTP/1.1" 200 13602 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    123.125.71.19 - - [21/May/2016:07:12:20 +0800] "GET /?t=1lqcp HTTP/1.1" 200 13589 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    123.125.71.49 - - [21/May/2016:07:12:22 +0800] "GET /?t=qh3nr HTTP/1.1" 200 13595 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
    123.125.71.81 - - [21/May/2016:07:12:26 +0800] "GET /?t=dtlt3 HTTP/1.1" 200 13567 "-" "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)" -
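
To put a number on "how often is the spider hitting me", the log entries above can be bucketed per second. This is only a rough sketch: the regular expression is my guess at the format from the lines shown (combined log format plus one trailing field), and `requests_per_second` and `sample` are names made up for illustration.

```python
import re
from collections import Counter

# Assumed layout, inferred from the log lines in the post:
# ip - - [time] "METHOD path PROTO" status size "referer" "user-agent" -
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+) "[^"]*" "(?P<agent>[^"]*)"'
)

def requests_per_second(lines, agent_token="Baiduspider"):
    """Count requests per timestamp whose User-Agent contains agent_token."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match and agent_token in match.group("agent"):
            counts[match.group("time")] += 1
    return counts

UA = "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
# Three entries copied from the post.
sample = [
    f'220.181.108.143 - - [21/May/2016:07:12:02 +0800] "GET /?t=nkvka HTTP/1.1" 200 13592 "-" "{UA}" -',
    f'220.181.108.145 - - [21/May/2016:07:12:02 +0800] "GET /?t=mdjki HTTP/1.1" 200 13583 "-" "{UA}" -',
    f'220.181.108.80 - - [21/May/2016:07:12:12 +0800] "GET /?t=mcn9f HTTP/1.1" 200 13617 "-" "{UA}" -',
]

for timestamp, hits in sorted(requests_per_second(sample).items()):
    print(timestamp, hits)
```

In practice you would feed it the lines of the real access log file instead of `sample`.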

    I'm giving it a perfect-score award for 5.21..

    13 replies    2016-05-21 22:21:35 +08:00
    #1 · aladd (OP) · 2016-05-21 13:41:14 +08:00
    [21/May/2016:07:12:18 +0800] "GET /?t=cbpyc HTTP/1.1"
    [21/May/2016:07:12:18 +0800] "GET /?t=cbpyc HTTP/1.1"
    [21/May/2016:07:12:18 +0800] "GET /?t=cbpyc HTTP/1.1"
    [21/May/2016:07:12:18 +0800] "GET /?t=cbpyc HTTP/1.1"
    [21/May/2016:07:12:18 +0800] "GET /?t=cbpyc HTTP/1.1"
    [21/May/2016:07:12:18 +0800] "GET /?t=cbpyc HTTP/1.1"
    [21/May/2016:07:12:18 +0800] "GET /?t=cbpyc HTTP/1.1"
    #2 · aladd (OP) · 2016-05-21 13:41:38 +08:00
    Ugh, damn, the formatting looks ugly = =
    #3 · lhbc · 2016-05-21 14:25:06 +08:00
    Are you sure you're not being hit by a CC attack?
    #4 · nine · 2016-05-21 14:25:24 +08:00
    Your ad revenue is about to take off. Congrats, OP
    #5 · aladd (OP) · 2016-05-21 14:56:33 +08:00
    @lhbc The thing is, these IPs really are Baiduspider; nslookup checks out for all of them. 0 0
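
The nslookup check mentioned here can be scripted. A minimal sketch, assuming (as I understand Baidu's webmaster guidance) that genuine Baiduspider addresses reverse-resolve to hostnames under baidu.com or baidu.jp; `is_baiduspider` is a made-up helper name, and the `reverse` parameter exists only so the lookup can be swapped out when testing.

```python
import socket

# Assumption: real Baiduspider hosts reverse-resolve to *.baidu.com or *.baidu.jp.
TRUSTED_SUFFIXES = (".baidu.com", ".baidu.jp")

def is_baiduspider(ip, reverse=socket.gethostbyaddr):
    """Return True if the PTR record for ip ends with a trusted Baidu suffix."""
    try:
        hostname = reverse(ip)[0]
    except OSError:
        # No PTR record, or the lookup failed: treat as not verified.
        return False
    return hostname.rstrip(".").lower().endswith(TRUSTED_SUFFIXES)

# Usage (performs a real DNS lookup, so the result depends on your resolver):
# is_baiduspider("220.181.108.143")
```

A reverse lookup alone is a fairly weak check, since whoever controls the address block controls the PTR record, but it is the same test nslookup gives you.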
    #6 · lhbc · 2016-05-21 16:50:02 +08:00
    /?t=uwk79
    #7 · lhbc · 2016-05-21 16:50:34 +08:00
    Oops, posted that by accident.
    URLs like this are not friendly at all.
    #8 · shiny · 2016-05-21 16:52:11 +08:00
    I did a reverse lookup on the IPs, and they are Baiduspider. You can adjust the crawl frequency at zhanzhang.baidu.com.
    #9 · Hello1995 · 2016-05-21 18:52:19 +08:00 via Android
    You could try 301-redirecting Baidu's spider back to the Baidu home page.
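    #10 · aladd (OP) · 2016-05-21 19:48:41 +08:00
    @lhbc Yeah. It requests the home page dynamically, so the cache is useless, and then the load goes through the roof. My fix for now is to block Baiduspider...
    I'll try again tomorrow.

    @shiny Adjusting the frequency~ The main problem is that it keeps requesting the page dynamically, so unless it stops coming altogether this will stay awkward. Today is abnormal: it is requesting practically every second, and the spider is normally not this frequent.

    @Hello1995 I've blocked its visits for now; everything gets a 403. Damn it.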
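
The thread doesn't say how the 403 block was actually set up (most likely in the web server config). Purely as an illustration of the idea, here is a small WSGI middleware that answers 403 to any request whose User-Agent contains "baiduspider"; `block_baiduspider` and `demo_app` are hypothetical names, not anything from the OP's site.

```python
def block_baiduspider(app, token="baiduspider"):
    """Wrap a WSGI app and answer 403 when the User-Agent contains token."""
    def middleware(environ, start_response):
        agent = environ.get("HTTP_USER_AGENT", "").lower()
        if token in agent:
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everyone else reaches the wrapped application unchanged.
        return app(environ, start_response)
    return middleware

def demo_app(environ, start_response):
    """Stand-in for the real (expensive, dynamic) home page."""
    body = b"home page\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]

application = block_baiduspider(demo_app)
```

Matching on the User-Agent string also blocks anything that merely claims to be Baiduspider, which is fine for shedding load but says nothing about who is really behind the request.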
    #11 · zaishanfeng · 2016-05-21 20:03:13 +08:00
    User-agent: Baiduspider
    User-agent: baiduspider
    User-agent: Baiduspider+
    User-agent: Baiduspider-video
    User-agent: Baiduspider-image
    Disallow: /
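
If you want to sanity-check rules like these before deploying them, Python's standard `urllib.robotparser` can evaluate them offline. A small sketch; `allowed` is a helper name invented here, and the rules are copied from the reply above.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Baiduspider
User-agent: baiduspider
User-agent: Baiduspider+
User-agent: Baiduspider-video
User-agent: Baiduspider-image
Disallow: /
"""

def allowed(agent, url, robots_txt=ROBOTS_TXT):
    """Return True if these robots.txt rules let agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # Pass the crawler's product token (e.g. "Baiduspider"),
    # not the full "Mozilla/5.0 (compatible; ...)" header value.
    return parser.can_fetch(agent, url)

print(allowed("Baiduspider", "/?t=uwk79"))
print(allowed("Googlebot", "/?t=uwk79"))
```

Keep in mind that robots.txt is advisory: it only helps if the crawler reads and honors it, and it does nothing for the requests already in flight.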
    #12 · e1eph4nt · 2016-05-21 21:51:55 +08:00
    You got a 502 and you blame the visitor? Impressive logic.
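    #13 · aladd (OP) · 2016-05-21 22:21:35 +08:00
    @e1eph4nt This is a dynamic page, and the spider's per-second /?t=(random characters) requests are putting the machine under high load. Am I supposed to blame myself?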
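
One possible way around the "random query string defeats the cache" problem is to drop meaningless parameters before building the cache key, so every /?t=... request maps to the same cached page. A sketch only: it assumes the `t` parameter does not change what the page renders, which the thread never confirms, and `cache_key` and `IGNORED_PARAMS` are hypothetical names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Assumption: these query parameters have no effect on the rendered page.
IGNORED_PARAMS = {"t"}

def cache_key(url, ignored=IGNORED_PARAMS):
    """Return the path plus the sorted query string minus ignored parameters."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in ignored
    )
    query = urlencode(kept)
    return parts.path + ("?" + query if query else "")

print(cache_key("/?t=nkvka"))
print(cache_key("/?t=mdjki"))
```

Both calls produce the same key, so the second request could be served from cache instead of regenerating the page.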