
Scraping table data with Python

Python

寶慕林4294392 2021-11-09 20:02:09

I want to use web scraping to get data from a website, but I get an error at the to_html call.

import requests
import pandas as pd

url = 'https://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm'
html = requests.get(url).content
df_list = pd.read_html(html)
df = df_list.to_html(html)
print (df)
df.to_csv('my data.csv')

Error:

AttributeError                            Traceback (most recent call last)
<ipython-input-35-61d14e08ca97> in <module>()
      5 html = requests.get(url).content
      6 df_list = pd.read_html(html)
----> 7 df = df_list.to_html(html)
      8 print (df)
      9 df.to_csv('my data.csv')

AttributeError: 'list' object has no attribute 'to_html'


3 Answers

呼喚遠(yuǎn)方

Contributed 1856 experience points, received over 11 likes

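Try the following:

pip install lxml
pip install html5lib
pip install BeautifulSoup4

With these installed, you no longer need to import requests, because pd.read_html can fetch the URL itself.

import pandas as pd
import html5lib

# read_html returns a list of DataFrames, one per <table> it finds
table = pd.read_html('https://www.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm')

Also, if you plan to scrape stock data from the National Stock Exchange, you can use NSEpy, a simple API for fetching stock data for Indian companies.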

2021-11-09

慕斯王

Contributed 1864 experience points, received over 2 likes

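You get the AttributeError because pd.read_html() returns a list of DataFrames, and a list has no attribute 'to_html'. Pick one DataFrame out of the list (for example df_list[0]) before calling DataFrame methods such as to_html or to_csv.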
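The point about the list can be sketched as follows. This is a minimal illustration, not the asker's actual data: SAMPLE_HTML, the symbols AAA and BBB, and the LTP column are made-up stand-ins for a static page, and it assumes a parser such as lxml is installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical static HTML standing in for a downloaded page (an assumption,
# not the real NSE page, which is rendered by JavaScript).
SAMPLE_HTML = """
<table>
  <thead>
    <tr><th>Symbol</th><th>LTP</th></tr>
  </thead>
  <tbody>
    <tr><td>AAA</td><td>101.5</td></tr>
    <tr><td>BBB</td><td>202.0</td></tr>
  </tbody>
</table>
"""

# read_html returns a list with one DataFrame per <table> in the input.
df_list = pd.read_html(StringIO(SAMPLE_HTML))

# Index into the list to get a DataFrame; the list itself has no to_html/to_csv.
df = df_list[0]

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
html_text = df.to_html(index=False)
print(df)
```

To write a file as in the question, df.to_csv('my data.csv') can be called on the selected DataFrame instead.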

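As for the solution: the page you mention is rendered with JavaScript. BeautifulSoup only parses the HTML it is given, so it cannot scrape data that the page fills in with JavaScript after loading.

To access a JavaScript-rendered page, you need a full rendering engine. You can use selenium or PhantomJS to get the JavaScript-generated data.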

2021-11-09

飲歌長嘯

Contributed 1951 experience points, received over 3 likes

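Try the following:

# !pip install selenium webdriver-manager pandas lxml beautifulsoup4

from io import StringIO

import pandas as pd
from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

url = 'https://www1.nseindia.com/live_market/dynaContent/live_watch/equities_stock_watch.htm'

# Default options open a visible (non-headless) browser window
options = Options()

# webdriver-manager downloads a matching chromedriver and returns its path (Selenium 4 style)
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()), options=options)
driver.set_page_load_timeout(5)

try:
    driver.get(url)
except TimeoutException:
    # Keep whatever has loaded after 5 seconds
    pass

src = driver.page_source
driver.quit()

soup = bs(src, 'lxml')
tables = soup.find_all('table')

# Parse the second <table> on the page and index it by the 'Symbol' column
table = pd.read_html(StringIO(str(tables[1])), header=0)[0].set_index('Symbol')

table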

2021-11-09
