Filtering a Pandas DataFrame with a function

Python

侃侃無極 2023-05-09 14:57:57

This question is related to the one I posted yesterday, which can be found here. I went ahead and applied the solution Jan provided to the whole dataset. The solution is as follows:

    import re

    def is_probably_english(row, threshold=0.90):
        regular_expression = re.compile(r'[-a-zA-Z0-9_ ]')
        ascii = [character for character in row['App'] if regular_expression.search(character)]
        quotient = len(ascii) / len(row['App'])
        passed = True if quotient >= threshold else False
        return passed

    google_play_store_is_probably_english = google_play_store_no_duplicates.apply(is_probably_english, axis=1)
    google_play_store_english = google_play_store_no_duplicates[google_play_store_is_probably_english]

So, as I understand it, we are filtering the google_play_store_no_duplicates DataFrame with the is_probably_english function and storing the result (booleans) in another DataFrame (google_play_store_is_probably_english). google_play_store_is_probably_english is then used to filter the non-English apps out of the google_play_store_no_duplicates DataFrame, and the final result is stored in a new DataFrame. Does this make sense, and does it look like a sound way to solve the problem? Is there a better way?
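One point on the reading above: `DataFrame.apply(..., axis=1)` with a function that returns a scalar produces a boolean Series (one value per row), not a DataFrame, and indexing the original frame with that Series keeps only the rows marked True. As a possible alternative, here is a minimal sketch that computes the same ratio with the vectorized `.str` accessor instead of a row-wise `apply`. The helper name `english_mask` and the sample rows are made up for illustration. Treating empty or missing names as non-English is an assumption of this sketch, not something the original code does (the original would raise a ZeroDivisionError on an empty string).

```python
import pandas as pd


def english_mask(names: pd.Series, threshold: float = 0.90) -> pd.Series:
    """Return a boolean Series: True where a name is probably English.

    Hypothetical helper; uses the same character class as the question.
    """
    # Number of characters in each name that match the allowed set.
    allowed = names.str.count(r"[-a-zA-Z0-9_ ]")
    # Total number of characters in each name.
    length = names.str.len()
    # Empty or missing names give NaN here; assumption: treat them as non-English.
    ratio = (allowed / length).fillna(0.0)
    return ratio >= threshold


# Made-up sample data standing in for google_play_store_no_duplicates.
google_play_store_no_duplicates = pd.DataFrame(
    {
        "App": [
            "Instagram",
            "爱奇艺PPS -《欢乐颂2》电视剧热播",
            "Docs To Go™ Free Office Suite",
            "",
        ]
    }
)

mask = english_mask(google_play_store_no_duplicates["App"])
english_apps = google_play_store_no_duplicates[mask]
print(english_apps)
```

Whether this is faster in practice depends on the size of the dataset; the main difference is that the filter is expressed on the `App` column directly rather than on whole rows.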


No answers yet.