chaneyccy's recent timeline updates
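V2EX member No. 461189, joined 2019-12-25 11:22:11 +08:00
chaneyccy's recent replies
@Vegetable Thanks a lot. I had changed the file mode from 'a' to 'w' earlier and didn't think of that as the cause. After changing it back, everything works.
@JCZ2MkKb5S8ZX9pq OK. I'm not in the habit of writing posts in Markdown, so I'll go look into it.
The formatting was a bit messy, so here is an updated version.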

import os

import requests
from bs4 import BeautifulSoup


def download(href_urls):
    for url in href_urls:
        mod_titles = []
        ses = requests.session()
        # header() is not shown in this post
        html = ses.get(url, headers=header(), verify=False)
        soup = BeautifulSoup(html.content, 'html.parser')
        # Links inside the 'g-ctnBar' element; entries 2-5 become folder and file names
        title_list = soup.find(class_='g-ctnBar').find_all('a')
        title1 = title_list[2].get_text()
        title2 = title_list[3].get_text()
        title3 = title_list[4].get_text()
        title4 = title_list[5].get_text()
        # All matching divs except the last three
        list_ = soup.find_all('div', class_='detail-mod J_floor')[:-3]
        for txt in list_:
            txts = txt.get_text()
            download_run(title1, title2, title3, title4, txts)


def download_run(title1, title2, title3, title4, txts):
    path = 'C:/Users/Desktop/run/%s/%s/%s' % (title1, title2, title3)
    if not os.path.exists(path):
        os.makedirs(path)
    # 'w' truncates the file on every call, so only the last text written survives
    with open('C:/Users/Desktop/run/%s/%s/%s/%s.txt' % (title1, title2, title3, title4), 'w') as f:
        f.write(txts)
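
For anyone hitting the same thing as in the reply to @Vegetable above, here is a minimal sketch of the difference between 'w' and 'a' when the same file is written once per loop iteration. The names `save_text` and `demo`, the temporary directory, and the sample strings are all made up for illustration and are not part of the original script.

```python
import os
import tempfile


def save_text(base_dir, title1, title2, title3, title4, text, mode="a"):
    """Write text to base_dir/title1/title2/title3/title4.txt using the given mode."""
    # Hypothetical helper: same folder layout as the post, but rooted at base_dir
    folder = os.path.join(base_dir, title1, title2, title3)
    os.makedirs(folder, exist_ok=True)
    file_path = os.path.join(folder, title4 + ".txt")
    with open(file_path, mode, encoding="utf-8") as f:
        f.write(text)
    return file_path


def demo(mode):
    """Write two pieces of text to the same file and return what ends up on disk."""
    with tempfile.TemporaryDirectory() as base_dir:
        for text in ("floor 1\n", "floor 2\n"):
            file_path = save_text(base_dir, "a", "b", "c", "d", text, mode)
        with open(file_path, encoding="utf-8") as f:
            return f.read()


print(repr(demo("a")))  # both pieces are kept
print(repr(demo("w")))  # only the last piece is kept
```

One thing to keep in mind with 'a': running the whole script a second time appends the same text again, so it may be worth deleting the old file once before the loop starts if duplicates are a problem.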