
How to extract data from lists stored as strings in pandas, and select data by value?

Python

阿波羅的戰(zhàn)車 2023-09-05 20:24:00

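I have a dataframe like this:

col1             col2
[abc, bcd, dog]  [[.4], [.5], [.9]]
[cat, bcd, def]  [[.9], [.5], [.4]]

The numbers in the col2 lists describe the elements of col1, matched by list index position. So ".4" in col2 describes "abc" in col1.

I want to create 2 new columns: one that pulls only the elements of col1 whose col2 value is >= .9, and another that holds the matching number from col2, so ".9" for both rows.

Result:

col3   col4
[dog]  .9
[cat]  .9

I think removing the nested lists from col2 would be enough, but that is harder than it sounds. I have been trying for an hour to remove those brackets.

Attempts:

spec_chars3 = ["[","]"]
for char in spec_chars3:  # didn't work, turned everything to nan
    df1['avg_jaro_company_word_scores'] = df1['avg_jaro_company_word_scores'].str.replace(char, '')

df.col2.str.strip('[]')  # didn't work b/c the nested list is still in a list, not a string

I haven't even figured out how to pull out the list index number and filter col1 on it yet.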


2 Answers

開心每一天1111

Contributed 1836 experience points, received over 13 likes

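Based on the explanation at the end of the question, both columns appear to be of str type and need to be converted to list type.
- Use .applymap together with ast.literal_eval.
- If only one column is of str type, use df[col] = df[col].apply(literal_eval)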
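Before converting anything, it can help to check what the cells actually contain. The following is a minimal sketch, assuming a hypothetical frame in which 'c1' holds list-like strings and 'c2' already holds real lists; it converts only the columns whose cells are all strings.

```python
from ast import literal_eval

import pandas as pd

# Hypothetical sample (an assumption, not the asker's real data):
# 'c1' holds strings that look like lists, 'c2' holds real lists.
df = pd.DataFrame({
    "c1": ["['abc', 'bcd', 'dog']", "['cat', 'bcd', 'def']"],
    "c2": [[[0.4], [0.5], [0.9]], [[0.9], [0.5], [0.4]]],
})

# Show which Python types occur in each column.
for col in df.columns:
    print(col, df[col].map(type).value_counts().to_dict())

# Parse only the columns whose cells are all strings; leave real lists alone.
for col in df.columns:
    if df[col].map(lambda v: isinstance(v, str)).all():
        df[col] = df[col].apply(literal_eval)
```

Checking first avoids calling literal_eval on values that are already lists, which would raise an error.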
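The lists in each column must be expanded with pandas.DataFrame.explode.
- The outer explode converts the values from one-item lists to scalars (i.e. [0.4] becomes 0.4).
Once the values are on separate rows, Boolean indexing can be used to select the data in the desired range.
If you want to combine df with df_new, use df.join(df_new, rsuffix='_extracted')
Tested with python 3.10, pandas 1.4.3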

import pandas as pd
from ast import literal_eval

# setup the test data: this data is lists
# data = {'c1': [['abc', 'bcd', 'dog'], ['cat', 'bcd', 'def']], 'c2': [[[.4], [.5], [.9]], [[.9], [.5], [.4]]]}

# setup the test data: this data is strings
data = {'c1': ["['abc', 'bcd', 'dog', 'cat']", "['cat', 'bcd', 'def']"], 'c2': ["[[.4], [.5], [.9], [1.0]]", "[[.9], [.5], [.4]]"]}

# create the dataframe
df = pd.DataFrame(data)

# the description leads me to think the data is columns of strings, not lists
# convert the columns from string type to list type
# the following line is only required if the columns are strings
# (pandas 2.1 and later deprecate applymap in favor of DataFrame.map)
df = df.applymap(literal_eval)

# explode the lists in each column, and then explode the remaining lists in 'c2'
df_new = df.explode(['c1', 'c2'], ignore_index=True).explode('c2')

# use Boolean Indexing to select the desired data
df_new = df_new[df_new['c2'] >= 0.9]

# display(df_new)

    c1   c2
2  dog  0.9
3  cat  1.0
4  cat  0.9
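The question asks for one result per original row rather than one row per element. The sketch below shows one way to get there, assuming the columns already hold real lists and are named col1 and col2 as in the question. It keeps the original row index while exploding (no ignore_index), so the filtered values can be grouped back and joined onto df. The names long_df, kept, extracted and result are illustrative only.

```python
import pandas as pd

# Hypothetical sample with real lists (not strings), mirroring the question.
df = pd.DataFrame({
    "col1": [["abc", "bcd", "dog"], ["cat", "bcd", "def"]],
    "col2": [[[0.4], [0.5], [0.9]], [[0.9], [0.5], [0.4]]],
})

# Explode both columns together, keeping the original row index,
# then explode 'col2' again to unwrap the one-item lists.
long_df = df.explode(["col1", "col2"]).explode("col2")

# Keep only the elements whose score is at or above the threshold.
kept = long_df[long_df["col2"] >= 0.9]

# Collect the surviving values back into lists, one per original row.
extracted = kept.groupby(level=0).agg(col3=("col1", list), col4=("col2", list))

# Attach the new columns by index; rows with no match would get NaN.
result = df.join(extracted)
print(result)
```

Because the new column names do not clash with the existing ones, no rsuffix is needed in this variant.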

2023-09-05

慕村9548890

Contributed 1884 experience points, received over 4 likes

You can use list comprehensions to populate the new columns according to your condition.

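# assumes df['col1'] and df['col2'] hold real lists, not strings

# col3: the col1 values whose matching score is >= 0.9
df['col3'] = [
    [value for value, score in zip(c1, c2) if score[0] >= 0.9]
    for c1, c2 in zip(df['col1'], df['col2'])
]

# col4: the scores themselves, unwrapped from their one-item lists
df['col4'] = [
    [score[0] for score in c2 if score[0] >= 0.9]
    for c2 in df['col2']
]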

Output

              col1                   col2   col3   col4
0  [abc, bcd, dog]  [[0.4], [0.5], [0.9]]  [dog]  [0.9]
1  [cat, bcd, def]  [[0.9], [0.5], [0.4]]  [cat]  [0.9]
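For a version that runs on its own, the two comprehensions can be folded into a single pass per row. This is only a sketch under the assumption that the columns hold real lists; select_by_score and THRESHOLD are hypothetical names introduced here for illustration.

```python
import pandas as pd

# Hypothetical sample with real lists, mirroring the question.
df = pd.DataFrame({
    "col1": [["abc", "bcd", "dog"], ["cat", "bcd", "def"]],
    "col2": [[[0.4], [0.5], [0.9]], [[0.9], [0.5], [0.4]]],
})

THRESHOLD = 0.9  # illustrative cutoff taken from the question


def select_by_score(values, scores, threshold=THRESHOLD):
    """Return (kept_values, kept_scores) for the scores at or above threshold."""
    # Each score is a one-item list such as [0.9], so unwrap it with s[0].
    pairs = [(v, s[0]) for v, s in zip(values, scores) if s[0] >= threshold]
    kept_values = [v for v, _ in pairs]
    kept_scores = [s for _, s in pairs]
    return kept_values, kept_scores


# Walk each row once, then split the result into the two new columns.
selected = [select_by_score(c1, c2) for c1, c2 in zip(df["col1"], df["col2"])]
df["col3"] = [vals for vals, _ in selected]
df["col4"] = [scores for _, scores in selected]
print(df)
```

Pairing the value and score in one pass keeps col3 and col4 aligned even when several elements in a row pass the threshold.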

2023-09-05
