
Is there a function to filter data like this in R or Python?

Python

胡子哥哥 2023-01-04 13:33:51

I am new to R and cannot figure out how to filter my data the way I need. The data set has 326 rows and 6 columns. Here is a small example:

Author, Commenid, Parentid, Submissionid, Score, Stance
User1 , 333c , 222b , 111b , 10 , Positive
User2 , 444c , 333c , 5hdc , 15 , Neutral
User3 , 222b , 555d , 23er , 20 , Negative
User4 , 555d , 666f , 111b , 11 , Positive

Here User1 has replied to User2, User3 has replied to User1, and User4 has replied to User3.

I want to filter for pairs of users where one user's commentid equals the other's parentid. For the example above, the filtered data would be:

Author  Score  Stance    Reply  Score  Stance
User2   15     Neutral   User1  10     Positive
User1   10     Positive  User3  20     Negative
User3   20     Negative  User4  11     Positive

I have tried a lot but cannot figure it out. Can anyone help me do exactly this, in either R or Python?


2 answers

慕慕森


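Here is a base R answer.

First, match the Commenid column against Parentid. Then build a data set with the Author column and a Reply column holding the author of the matched row. Keep only the rows without NA values, and merge the result with the original data to get the other columns.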

# For each row, the position of the first row whose Parentid equals this Commenid
i <- with(df1, match(Commenid, Parentid))
res <- data.frame(Author = df1$Author, Reply = df1$Author[i])
# Drop authors whose comment has no matching Parentid
res <- res[complete.cases(res), ]
merge(res, df1)
#  Author Reply Commenid Parentid Submissionid
#1  User1 User2     333c     222b         111b
#2  User3 User1     222b     555d         23er
#3  User4 User3     555d     666f         111b
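Since the question also allows Python, the matching step can be sketched with the standard library alone. This is a minimal sketch, not the answer author's code: the `match` helper below is a hypothetical stand-in for R's `match()`, the rows are typed in by hand from the question's example, and positions are 0-based rather than 1-based as in R.

```python
# Rows from the question's small example (whitespace already stripped).
rows = [
    {"Author": "User1", "Commenid": "333c", "Parentid": "222b", "Submissionid": "111b"},
    {"Author": "User2", "Commenid": "444c", "Parentid": "333c", "Submissionid": "5hdc"},
    {"Author": "User3", "Commenid": "222b", "Parentid": "555d", "Submissionid": "23er"},
    {"Author": "User4", "Commenid": "555d", "Parentid": "666f", "Submissionid": "111b"},
]


def match(values, table):
    """Hypothetical helper imitating R's match(): 0-based position of the
    first occurrence of each value in `table`, or None if it is absent."""
    first_pos = {}
    for pos, item in enumerate(table):
        first_pos.setdefault(item, pos)  # keep the first occurrence only
    return [first_pos.get(v) for v in values]


idx = match([r["Commenid"] for r in rows], [r["Parentid"] for r in rows])

# Keep only the rows that found a match and attach the matched author.
pairs = [
    {"Author": r["Author"], "Reply": rows[i]["Author"]}
    for r, i in zip(rows, idx)
    if i is not None
]

for p in pairs:
    print(p["Author"], p["Reply"])
```

On the example rows this yields the same three Author/Reply pairs as the R output above.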

A dplyr solution could be:

library(dplyr)

df1 %>%
  mutate(i = match(Commenid, Parentid),
         Reply = Author[i]) %>%
  filter(!is.na(i)) %>%
  # Author and Reply first, then the remaining columns, without the helper index i
  select(Author, Reply, everything(), -i)

Data

df1 <- read.csv(text = "
Author,Commenid,Parentid,Submissionid
User1 , 333c , 222b , 111b
User2 , 444c , 333c , 5hdc
User3 , 222b , 555d , 23er
User4 , 555d , 666f , 111b
")
df1[] <- lapply(df1, trimws)

Edit

For the new data and the problem described in the comments, here is a dplyr solution. After doing basically the same as above, it joins the result with the original data set and reorders the columns.

library(dplyr)

df2 %>%
  # Same matching step as above
  mutate(i = match(Commenid, Parentid),
         Reply = Author[i]) %>%
  filter(!is.na(i)) %>%
  select(-i) %>%
  select(Author, Score, Stance, Reply, everything()) %>%
  # Attach the Score and Stance of the Reply author
  left_join(df2 %>% select(Author, Score, Stance), by = c("Reply" = "Author")) %>%
  # Move the id columns to the end
  select(-matches("id$"), everything(), matches("id$"))

New data

df2 <- read.csv(text = "
Author,Commenid,Parentid,Submissionid, Score, Stance
User1 , 333c , 222b , 111b , 10 , Positive
User2 , 444c , 333c , 5hdc , 15 , Neutral
User3 , 222b , 555d , 23er , 20 , Negative
User4 , 555d , 666f , 111b , 11 , Positive
")
# Clean df2 (not df1): strip padding from the column names and the values
names(df2) <- trimws(names(df2))
df2[] <- lapply(df2, trimws)
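For readers working in Python, the same six-column result can be sketched with plain dictionaries. This is an illustration under assumptions, not a port of the dplyr pipeline: the rows are typed in by hand from the example, `by_parent` and `result` are names made up here, and the Reply side's Score and Stance are taken from the matched row itself rather than from a join on the author name.

```python
# Example rows from the question, with whitespace already stripped.
rows = [
    {"Author": "User1", "Commenid": "333c", "Parentid": "222b", "Score": 10, "Stance": "Positive"},
    {"Author": "User2", "Commenid": "444c", "Parentid": "333c", "Score": 15, "Stance": "Neutral"},
    {"Author": "User3", "Commenid": "222b", "Parentid": "555d", "Score": 20, "Stance": "Negative"},
    {"Author": "User4", "Commenid": "555d", "Parentid": "666f", "Score": 11, "Stance": "Positive"},
]

# First row seen for each Parentid (mirrors match(), which returns the first hit).
by_parent = {}
for r in rows:
    by_parent.setdefault(r["Parentid"], r)

result = []
for r in rows:
    reply = by_parent.get(r["Commenid"])
    if reply is None:
        continue  # no row has this comment as its parent
    result.append((r["Author"], r["Score"], r["Stance"],
                   reply["Author"], reply["Score"], reply["Stance"]))

print("Author Score Stance Reply Score Stance")
for line in result:
    print(*line)
```

Taking the values from the matched row avoids duplicated output rows when the same author appears several times, which a join on the author name alone would produce. Whether that is the wanted behavior for the full 326-row data set is an assumption to check.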

2023-01-04

慕俠2389804


You can compare each user with every other user and print the pair whenever one row's parentid equals the other row's commentid. Here is how you could do that in Python:

# dataset is assumed to be a list of dict-like rows keyed by the column names
for u1 in dataset:
    for u2 in dataset:
        if u1['Parentid'] == u2['Commenid']:
            print(u1['Author'], 'has a parent comment by', u2['Author'])
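The loop above compares every pair of rows and leaves open how `dataset` is built. A possible self-contained variant is sketched below, assuming the data arrives as CSV text padded with spaces as in the question and that each Commenid is unique. The names `raw`, `author_of` and `replies` are illustrative only. A dictionary lookup replaces the inner loop.

```python
import csv
import io

# CSV text as in the question, including the padding around each field.
raw = """Author,Commenid,Parentid,Submissionid
User1 , 333c , 222b , 111b
User2 , 444c , 333c , 5hdc
User3 , 222b , 555d , 23er
User4 , 555d , 666f , 111b
"""

# Strip the padding around every key and value, like trimws() in the R answer.
reader = csv.DictReader(io.StringIO(raw))
dataset = [{k.strip(): v.strip() for k, v in row.items()} for row in reader]

# Map each comment id to its author (assumes Commenid values are unique).
author_of = {row["Commenid"]: row["Author"] for row in dataset}

# Pair each row with the author of its parent comment, when that comment is present.
replies = [
    (row["Author"], author_of[row["Parentid"]])
    for row in dataset
    if row["Parentid"] in author_of
]

for child, parent in replies:
    print(child, "has a parent comment by", parent)
```

For a few hundred rows either version is fast enough. The dictionary mainly keeps the code short and scales linearly with the number of rows.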

2023-01-04
