3 Answers

Contributed 1874 experience points, received 12+ likes
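When you create a Pandas DataFrame from a flat array, the array has to be converted to some two-dimensional form, because a Pandas DataFrame is almost always two-dimensional.
The problem arises because you have one row and three columns, so the data array should have shape (1, 3). The pd.DataFrame constructor appears to add a dimension at the end of the array, treating each item along the first dimension as one row of the DataFrame.
A simple fix is to reshape the data array to (number of rows, number of columns):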
price = np.array([10, 8, 12]).reshape(1, -1)
The -1 in the .reshape call above tells the function to infer the length of that axis.
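As a minimal sketch of the fix above, the reshaped array can be passed straight to pd.DataFrame. The index and column labels are the ones from the failing call in the question; the values 10, 8, 12 are simply the example numbers used in this answer.

```python
import numpy as np
import pandas as pd

# Flat array of shape (3,), reshaped to one row and three columns.
price = np.array([10, 8, 12]).reshape(1, -1)

# With a (1, 3) array, one index label and three column labels match the data.
df = pd.DataFrame(
    price,
    index=["Price"],
    columns=["Almond Butter", "Peanut Butter", "Cashew Butter"],
)

print(price.shape)  # (1, 3)
print(df)
```

Passing the rows as a nested list, such as [[10, 8, 12]], should give the same result, since a list of one three-element list is also read as one row.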

Contributed 1825 experience points, received 4+ likes
"My question is: what does the comma after x in (x,) mean here?"
This syntax is general Python and is not specific to NumPy. The comma is what we add when we want to create a tuple in this situation. You are probably familiar with tuples such as (3, 4). But what if we want a tuple with only one element? You might try (3), but Python then interprets the parentheses as the grouping operator of a mathematical expression, just as in (3 + 4) * 5. That means (3) is simply the integer 3, not a tuple. So we add a comma, (3,), to create a tuple with exactly one element.
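The difference is easy to check in an interpreter. The short sketch below uses illustrative variable names (a, b, c, shape) that are not from the question, and it also shows that the shape of a one-dimensional NumPy array is such a one-element tuple.

```python
import numpy as np

a = (3)    # parentheses only group here, so this is the int 3
b = (3,)   # the trailing comma makes a one-element tuple
c = 3,     # the comma, not the parentheses, is what creates the tuple

print(type(a).__name__)  # int
print(type(b).__name__)  # tuple
print(b == c)            # True

# The shape of a 1-D array is reported as a one-element tuple.
shape = np.arange(1, 4).shape
print(shape)             # (3,)
```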

Contributed 1859 experience points, received 6+ likes
The full traceback of the error shows that DataFrame has already done a fair amount of processing on your input.
In [336]: pd.DataFrame(np.arange(1,4),
...: index=(["Price"]),
...: columns=(["Almond Butter","Peanut Butter", "Cashew Butter"]))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in create_block_manager_from_blocks(blocks, axes)
1653 blocks = [
-> 1654 make_block(values=blocks[0], placement=slice(0, len(axes[0])))
1655 ]
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/blocks.py in make_block(values, placement, klass, ndim, dtype)
3052
-> 3053 return klass(values, ndim=ndim, placement=placement)
3054
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/blocks.py in __init__(self, values, placement, ndim)
124 raise ValueError(
--> 125 f"Wrong number of items passed {len(self.values)}, "
126 f"placement implies {len(self.mgr_locs)}"
ValueError: Wrong number of items passed 1, placement implies 3
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
<ipython-input-336-43d59803fb0f> in <module>
1 pd.DataFrame(np.arange(1,4),
2 index=(["Price"]),
----> 3 columns=(["Almond Butter","Peanut Butter", "Cashew Butter"]))
/usr/local/lib/python3.6/dist-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
462 mgr = init_dict({data.name: data}, index, columns, dtype=dtype)
463 else:
--> 464 mgr = init_ndarray(data, index, columns, dtype=dtype, copy=copy)
465
466 # For data is list-like, or Iterable (will consume into list)
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/construction.py in init_ndarray(values, index, columns, dtype, copy)
208 block_values = [values]
209
--> 210 return create_block_manager_from_blocks(block_values, [columns, index])
211
212
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in create_block_manager_from_blocks(blocks, axes)
1662 blocks = [getattr(b, "values", b) for b in blocks]
1663 tot_items = sum(b.shape[0] for b in blocks)
-> 1664 construction_error(tot_items, blocks[0].shape[1:], axes, e)
1665
1666
/usr/local/lib/python3.6/dist-packages/pandas/core/internals/managers.py in construction_error(tot_items, block_shape, axes, e)
1692 if block_shape[0] == 0:
1693 raise ValueError("Empty data passed with indices specified.")
-> 1694 raise ValueError(f"Shape of passed values is {passed}, indices imply {implied}")
1695
1696
ValueError: Shape of passed values is (3, 1), indices imply (1, 3)
If we don't specify the index, it produces a one-column frame:
In [337]: pd.DataFrame(np.arange(1,4)) # (3,) input
Out[337]:
   0
0  1
1  2
2  3
The same happens with a (3,1) input:
In [339]: pd.DataFrame(np.arange(1,4)[:,None]) # (3,1) input
Out[339]:
   0
0  1
1  2
2  3
But you want a (1,3):
In [340]: pd.DataFrame(np.arange(1,4)[None,:]) # (1,3) input
Out[340]:
   0  1  2
0  1  2  3
numpy broadcasting can expand a (3,) array to (1,3), but that is not what DataFrame is doing.
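The contrast can be sketched as follows. The variable names are illustrative, and np.broadcast_to is used only to make the broadcasting rule explicit; it is not something DataFrame calls.

```python
import numpy as np
import pandas as pd

flat = np.arange(1, 4)                     # shape (3,)

# NumPy broadcasting adds the new axis at the front: (3,) -> (1, 3).
broadcast = np.broadcast_to(flat, (1, 3))

# DataFrame instead treats each element of a 1-D array as a row: (3,) -> (3, 1).
as_column = pd.DataFrame(flat)

# To get one row, the leading axis has to be added explicitly.
as_row = pd.DataFrame(flat[None, :])

print(broadcast.shape, as_column.shape, as_row.shape)
```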
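Depending on how you look at it, a pandas dataframe can seem like the transpose of a 2d numpy array. A Series is 1d, but displays vertically. Dataframe indexing gives priority to columns. The same transposed impression can come up when exploring the underlying data storage and values/to_numpy(). The details are complicated. Note that the traceback talks about a "block_manager" and so on.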
In [342]: pd.Series(np.arange(1,4))
Out[342]:
0 1
1 2
2 3
dtype: int64