Scraping Images with a Python Crawler

admin 2024-01-08 06:17


In Python, the requests and BeautifulSoup libraries can be used to scrape images from a web page. Here is a simple example:

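import os
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

# URL of the target page
url = 'http://example.com'

# Send an HTTP request to fetch the page and fail early on an error status
response = requests.get(url, timeout=10)
response.raise_for_status()

# Parse the page content with BeautifulSoup
soup = BeautifulSoup(response.text, 'html.parser')

# Find all image tags
img_tags = soup.find_all('img')

# Create a folder to hold the downloaded images
folder_name = 'images'
os.makedirs(folder_name, exist_ok=True)

# Loop over the image tags, download each image, and save it to the folder
for img in img_tags:
    src = img.get('src')
    if not src or src.startswith('data:'):
        continue  # skip tags with no src and inline data: URIs
    img_url = urljoin(url, src)  # resolve relative paths against the page URL
    img_name = os.path.basename(urlparse(img_url).path)  # file name without the query string
    if not img_name:
        continue  # the URL path does not end in a file name
    img_path = os.path.join(folder_name, img_name)  # build the save path
    img_response = requests.get(img_url, timeout=10)  # fetch the image content
    if not img_response.ok:
        continue  # skip images that could not be downloaded
    with open(img_path, 'wb') as f:  # write the image bytes to the file
        f.write(img_response.content)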

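This code downloads every image referenced by an `<img>` tag in the HTML of the given page and saves it to a local folder. Replace the `url` variable with the URL of the page whose images you want to scrape.

The code first sends an HTTP request to fetch the page content, then parses it with BeautifulSoup and finds all the image tags. It then loops over those tags, downloading each image and saving it to the folder. To download an image, it calls `requests.get` to fetch the image content and then uses `open` in binary mode (`'wb'`) to write that content to a file. `os.path.join` builds the save path so that each image lands in the right folder.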
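
Because `src` values are often relative paths or carry query strings, turning them into download URLs and file names deserves some care. Below is a minimal sketch of one way to do that with the standard library alone. The helper names `resolve_image_url` and `filename_from_url`, the fallback name `image`, and the sample URLs are assumptions made for illustration, not part of the original code.

```python
import os
from urllib.parse import unquote, urljoin, urlparse


def resolve_image_url(page_url, src):
    """Return an absolute http(s) URL for an <img> src, or None if unusable."""
    if not src:
        return None  # the tag has no src attribute (or it is empty)
    absolute = urljoin(page_url, src.strip())
    if urlparse(absolute).scheme not in ('http', 'https'):
        return None  # e.g. inline data: URIs cannot be fetched over HTTP
    return absolute


def filename_from_url(img_url, default='image'):
    """Derive a local file name from the URL path, ignoring any query string."""
    path = unquote(urlparse(img_url).path)
    name = os.path.basename(path)
    if name in ('', '.', '..'):
        return default  # hypothetical fallback when the path has no file name
    return name


page = 'http://example.com/gallery/index.html'
print(resolve_image_url(page, '../static/cat.png?v=2'))
# http://example.com/static/cat.png?v=2
print(filename_from_url('http://example.com/static/cat.png?v=2'))
# cat.png
```

Inside the download loop, `resolve_image_url(url, img.get('src'))` could supply the address to fetch and `filename_from_url` the name to save under, with tags that yield `None` simply skipped.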