
pandas apply() raises UnboundLocalError


翻閱古今 2023-11-09 21:06:10

I have a DataFrame (df_cluster) with 2 columns [customer ID, cluster]. There are about 13 clusters, and I am trying to assign a name to each cluster using apply() in Python. I have used the same function before and it worked fine, but now I get an "UnboundLocalError". Please let me know if I am doing something wrong. My understanding of apply() is that it passes the function along an axis (in this case, the function cluster_name is called for each row).

Here is the code:

    def cluster_name(df):
        if df['cluster'] == 1:
            value = 'A'
        elif df['cluster'] == 2:
            value = 'B'
        elif df['cluster'] == 3:
            value = 'C'
        elif df['cluster'] == 4:
            value = 'D'
        elif df['cluster'] == 5:
            value = 'E'
        elif df['cluster'] == 6:
            value = 'F'
        elif df['cluster'] == 7:
            value = 'G'
        return value

    df_cluster['cluster_name'] = df_cluster.apply(cluster_name, axis = 1)

The error:

    UnboundLocalError                         Traceback (most recent call last)
    <ipython-input-16-b64f3fdc1260> in <module>
         16     return value
         17
    ---> 18 df_cluster['cluster_name'] = df_cluster.apply(cluster_name, axis = 1)
         19 df_cluster['cluster_name'].value_counts()

    /opt/cloudera/parcels/Anaconda/envs/py36/lib/python3.6/site-packages/pandas/core/frame.py in apply(self, func, axis, broadcast, raw, reduce, result_type, args, **kwds)
       6926                 kwds=kwds,
       6927             )
    -> 6928         return op.get_result()
       6929
       6930     def applymap(self, func):

    /opt/cloudera/parcels/Anaconda/envs/py36/lib/python3.6/site-packages/pandas/core/apply.py in get_result(self)
        184             return self.apply_raw()
        185
    --> 186         return self.apply_standard()
        187
        188     def apply_empty_result(self):


3 Answers

MYYA

Contributed 1868 experience points, received more than 4 likes

Your function is missing an else:

    def cluster_name(df):
        if df['cluster'] == 1:
            value = 'A'
        elif df['cluster'] == 2:
            value = 'B'
        elif df['cluster'] == 3:
            value = 'C'
        elif df['cluster'] == 4:
            value = 'D'
        elif df['cluster'] == 5:
            value = 'E'
        elif df['cluster'] == 6:
            value = 'F'
        elif df['cluster'] == 7:
            value = 'G'
        else:
            value = ...
        return value

Otherwise, if df['cluster'] is not one of the values {1, 2, ..., 7}, value is never set and an exception is raised.
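The failure described above can be reproduced with a minimal sketch. The frame df_demo, the two function names, and the "unknown" fallback label are hypothetical and chosen only for illustration; they do not come from the question.

```python
import pandas as pd

# Hypothetical data: cluster 8 has no matching branch in the functions below.
df_demo = pd.DataFrame({"customer_id": [101, 102, 103], "cluster": [1, 2, 8]})

def name_without_else(row):
    if row["cluster"] == 1:
        value = "A"
    elif row["cluster"] == 2:
        value = "B"
    return value  # `value` is unassigned when no branch matched

def name_with_else(row):
    if row["cluster"] == 1:
        value = "A"
    elif row["cluster"] == 2:
        value = "B"
    else:
        value = "unknown"  # assumed fallback label, for illustration only
    return value

# The version without `else` fails on the row whose cluster is 8.
try:
    df_demo.apply(name_without_else, axis=1)
    error_raised = False
except UnboundLocalError:
    error_raised = True

# The version with `else` returns a value for every row.
names = df_demo.apply(name_with_else, axis=1).tolist()
print(error_raised, names)
```

Since the question mentions about 13 clusters while the function only handles 1 through 7, any row with a cluster of 8 or higher would hit the same path as the third row here.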

Posted 2023-11-09

catspeake

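Contributed 1111 experience points, received more than 0 likes

Writing an if-else function by hand is overrated, and it can easily miss a condition.
Since you are assigning letters as the 'cluster_name', use string.ascii_uppercase to get all the uppercase letters, and zip them with the unique values in 'cluster'.
- Create a dict from the zipped values and use .map to create the 'cluster_name' column.
This implementation builds the mapping from the unique values in the column, so "local variable 'value' referenced before assignment" cannot occur.
- In your case, the error occurs because return value is executed when the column contains a value that matches none of your if-else conditions, which means value was never assigned in the function.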

    import pandas as pd
    import string

    # test dataframe
    df = pd.DataFrame({'cluster': range(1, 11)})

    # unique values from the cluster column
    clusters = sorted(df.cluster.unique())

    # create a dict to map
    cluster_map = dict(zip(clusters, string.ascii_uppercase))

    # create the cluster_name column
    df['cluster_name'] = df.cluster.map(cluster_map)

    # df
       cluster cluster_name
    0        1            A
    1        2            B
    2        3            C
    3        4            D
    4        5            E
    5        6            F
    6        7            G
    7        8            H
    8        9            I
    9       10            J
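One caveat of the zip approach, sketched under the assumption of a hypothetical frame df_many with 30 clusters: zip stops at the shorter input, so only the first 26 unique values receive a letter, and .map leaves NaN for the rest. This does not affect the roughly 13 clusters in the question.

```python
import string

import pandas as pd

# Hypothetical frame with more clusters (30) than there are letters (26).
df_many = pd.DataFrame({"cluster": range(1, 31)})

clusters = sorted(df_many["cluster"].unique())

# zip truncates to the shorter input, so only 26 clusters get a letter.
cluster_map = dict(zip(clusters, string.ascii_uppercase))

df_many["cluster_name"] = df_many["cluster"].map(cluster_map)

# Clusters without an entry in the dict are mapped to NaN.
unmapped = int(df_many["cluster_name"].isna().sum())
print(len(cluster_map), unmapped)
```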

Posted 2023-11-09

白衣染霜花

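Contributed 1796 experience points, received more than 10 likes

It seems your question has already been answered in the comments, so I will propose a more pandas-oriented way to solve your problem. Using apply(axis=1) on a DataFrame is very slow and almost never necessary (it is the same as iterating over the rows of the DataFrame), so a better approach is to use vectorized methods. The simplest is to define the cluster -> cluster_name mapping in a dictionary and use the map method: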

    df = pd.DataFrame(
        {"cluster": [1,2,3,4,5,6,7]}
    )

    # repeat this dataframe 10000 times
    df = pd.concat([df] * 10000)

The apply approach:

    def mapping_func(row):
        if row['cluster'] == 1:
            value = 'A'
        elif row['cluster'] == 2:
            value = 'B'
        elif row['cluster'] == 3:
            value = 'C'
        elif row['cluster'] == 4:
            value = 'D'
        elif row['cluster'] == 5:
            value = 'E'
        elif row['cluster'] == 6:
            value = 'F'
        elif row['cluster'] == 7:
            value = 'G'
        else:
            # This is a "catch-all" in case none of the values in the column are 1-7
            value = "Z"
        return value

    %timeit df.apply(mapping_func, axis=1)
    # 1.32 s ± 91.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

The .map approach:

    mapping_dict = {
        1: "A",
        2: "B",
        3: "C",
        4: "D",
        5: "E",
        6: "F",
        7: "G"
    }

    # the `fillna` is our "catch-all" statement.
    # essentially if `map` encounters a value not in the dictionary
    # it will place a NaN there. So I fill those NaNs with "Z" to
    # be consistent with the above example
    %timeit df["cluster"].map(mapping_dict).fillna("Z")
    # 4.87 ms ± 195 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)

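We can see that map with a dictionary is far faster than apply (about 4.87 ms versus 1.32 s in the timings above), and it also avoids the long chain of if/elif statements.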
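As a quick sanity check that the two approaches agree, including on the catch-all, here is a small sketch. The frame df_check and its out-of-range value 9 are hypothetical, and the lambda is a compact stand-in for the if/elif chain rather than the function from the answer.

```python
import pandas as pd

mapping_dict = {1: "A", 2: "B", 3: "C", 4: "D", 5: "E", 6: "F", 7: "G"}

# Hypothetical frame; 9 is deliberately outside the mapped range 1-7.
df_check = pd.DataFrame({"cluster": [1, 4, 7, 9]})

# Row-wise lookup with "Z" as the fallback, standing in for the if/elif chain.
via_apply = df_check.apply(
    lambda row: mapping_dict.get(row["cluster"], "Z"), axis=1
).tolist()

# Vectorized lookup: unmapped values become NaN, then are filled with "Z".
via_map = df_check["cluster"].map(mapping_dict).fillna("Z").tolist()

print(via_apply == via_map, via_map)
```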

Posted 2023-11-09
