设计思路:requests发送请求返回HTML→正则匹配电影名称+图片地址→电影名称+图片地址下载保存至本地
from loguru import logger
import requests
import redef douban_top250():# return html 页面def html_page():url = 'https://movie.douban.com/top250?start=0&filter='headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'}res = requests.get(url, headers=headers)res.encoding = "utf-8"html = res.textreturn htmli = 1number = 0paeg = int(re.findall(r'&filter=" >(.*)</a>', html_page())[-2]) + 1 # 获取页面的返回值while i < paeg:url = f'https://movie.douban.com/top250?start={number}&filter='headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36'}res = requests.get(url, headers=headers)res.encoding = "utf-8"html = res.textnumber += 25i += 1v = 0while v < 25:jpg_name = re.findall(' alt="(.*)" src="', html)[v] # douban图片名称jpg_url = re.findall('" src="(.*)" class="">', html)[v] # douban图片地址file_path = fr'D:\jpg\{jpg_name}.png' # 本地图片地址v += 1try:myfile = requests.get(jpg_url)file_path = fr'D:\jpg\{jpg_name}.png'open(file_path, 'wb').write(myfile.content)except:logger.error(f'图片保存失败,请检查路径{file_path}')logger.info(f"{jpg_name} {jpg_url}")douban_top250()
版权声明:本站所有资料均为网友推荐收集整理而来,仅供学习和研究交流使用。
工作时间:8:00-18:00
客服电话
电子邮件
admin@qq.com
扫码二维码
获取最新动态