Complete Python Web Scraper Code

admin, March 1, 2024, 18:16

Below is a simple Python web scraper example that extracts the title and links from a web page:

import requests
from bs4 import BeautifulSoup

# Define the URL of the target page
url = "http://example.com"

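# Send an HTTP request to fetch the page content
# (the timeout keeps the script from hanging on an unresponsive server)
response = requests.get(url, timeout=10)
response.raise_for_status()  # stop early on 4xx/5xx responses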

# Parse the page content with BeautifulSoup
soup = BeautifulSoup(response.text, "html.parser")

# Find every <title> element and every <a> tag that has an href attribute
titles = soup.find_all("title")
links = soup.find_all("a", href=True)

# Print the title text and each link's href value
for title in titles:
    print(title.get_text())
for link in links:
    print(link.get("href"))

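This code uses the `requests` library to send an HTTP request and fetch the page, then uses `BeautifulSoup` to parse the HTML. It finds every `<title>` element and every `<a>` tag that has an `href` attribute, then prints the title text and each link's `href` value. You can modify the code as needed to crawl other pages or extract other information.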
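One common modification is cleaning up the extracted links. The `href` values printed by the script are raw attribute values, so they are often relative paths (`/about`), in-page anchors (`#top`), or non-web schemes (`mailto:`). The sketch below shows one possible way to turn them into a deduplicated list of absolute URLs using only the standard library. The function name `normalize_links` and the sample values are hypothetical and are not part of the original script.

```python
from urllib.parse import urljoin, urldefrag, urlparse


def normalize_links(base_url, hrefs):
    """Resolve raw href values against base_url and drop duplicates.

    Hypothetical helper for illustration; not part of the original script.
    """
    seen = set()
    result = []
    for href in hrefs:
        # Resolve relative paths such as "/about" against the page URL
        absolute = urljoin(base_url, href.strip())
        # Drop the "#fragment" part so anchors on the same page collapse
        absolute, _fragment = urldefrag(absolute)
        # Keep only http(s) links; this skips mailto:, javascript:, etc.
        if urlparse(absolute).scheme not in ("http", "https"):
            continue
        # Preserve first-seen order while removing duplicates
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


if __name__ == "__main__":
    # Sample href values (made up for this example)
    sample = ["/about", "news/1.html", "#top", "mailto:a@example.com", "/about#team"]
    for link in normalize_links("http://example.com/index.html", sample):
        print(link)
```

With the scraper above, this could be called as `normalize_links(url, [link.get("href") for link in links])`. Whether to strip fragments or filter schemes depends on what you plan to do with the links, so treat those two steps as optional.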