Crawler code example: scraping the PTT NBA board and saving the results as JSON

January 17, 2024, 20:47:37, by RS


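import json

import requests
from bs4 import BeautifulSoup

# Index page of the PTT NBA board
url = 'https://www.ptt.cc/bbs/nba/index.html'
# Send a browser-like User-Agent with the request
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36'}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # stop early on an HTTP error status
soup = BeautifulSoup(response.text, 'html.parser')
# Each post listed on the index page sits in a <div class="r-ent">
articles = soup.find_all('div', class_='r-ent')
data_list = []
for a in articles:
    data = {}
    # Title: entries without a link fall back to a placeholder
    title = a.find('div', class_='title')
    if title and title.a:
        title = title.a.text
    else:
        title = 'No title'
    data['title'] = title

    # Popularity: the <span> is missing when no count is shown
    popular = a.find('div', class_='nrec')
    if popular and popular.span:
        popular = popular.span.text
    else:
        popular = 'N/A'
    data['popularity'] = popular

    # Date exactly as it appears on the page
    date = a.find('div', class_='date')
    if date:
        date = date.text
    else:
        date = 'N/A'
    data['date'] = date
    data_list.append(data)

# ensure_ascii=False keeps non-ASCII titles readable in the output file
with open('ptt_nba_crawler.json', 'w', encoding='utf-8') as file:
    json.dump(data_list, file, ensure_ascii=False, indent=4)
print('Data saved to ptt_nba_crawler.json')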
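The script writes its results with `ensure_ascii=False` and `encoding='utf-8'`. The sketch below is only an illustration of why that pairing matters for non-ASCII titles: it compares the default escaped output with the readable one, then checks that a saved file loads back unchanged. The sample record, the `save_and_reload` helper, and the `demo.json` file name are all made up for this example and are not part of the original script.

```python
import json
import os
import tempfile

# Hypothetical record: the values are invented and only serve to include non-ASCII text.
records = [{"title": "[討論] 範例", "popularity": "12", "date": "1/17"}]

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(records)
# With ensure_ascii=False the characters stay readable in the output.
readable = json.dumps(records, ensure_ascii=False)


def save_and_reload(data, path):
    """Write data as UTF-8 JSON, then read it back from the same file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=4)
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Use a temporary directory so the demo leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    reloaded = save_and_reload(records, os.path.join(tmp, "demo.json"))

print(escaped)
print(readable)
print(reloaded == records)
```

Both forms decode to the same data; the difference is only in how readable the file is when opened in a text editor. Passing `encoding='utf-8'` explicitly avoids depending on the platform's default encoding when the file contains such characters.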

Posted by RS on January 17, 2024, 20:47:37
