I am trying to get the 15 most relevant items for each user, but every function I have tried takes a very long time. (I shut it down after more than 6 hours...)

I have 418 unique users and 3718 unique items. The U2tfifd dict also has 418 entries, and tfidf_feature_names contains 32645 words. The shape of my interactions_full_df is (40733, 3).

I tried:

    def index_tfidf_users(user_id):
        return [users for users in U2tfifd[user_id].flatten().tolist()]

    def get_relevant_items(user_id):
        return sorted(zip(tfidf_feature_names, index_tfidf_users(user_id)), key=lambda x: -x[1])[:15]

    def get_tfidf_token(user_id):
        return [words for words, values in get_relevant_items(user_id)]

then

    interactions_full_df["tags"] = interactions_full_df["user_id"].apply(lambda x: get_tfidf_token(x))

or

    def get_tfidf_token(user_id):
        tags = []
        v = sorted(zip(tfidf_feature_names, U2tfifd[user_id].flatten().tolist()), key=lambda x: -x[1])[:15]
        for words, values in v:
            tags.append(words)
        return tags

or

    def get_tfidf_token(user_id):
        v = sorted(zip(tfidf_feature_names, U2tfifd[user_id].flatten().tolist()), key=lambda x: -x[1])[:15]
        # unpack each (word, value) pair so that only the word is kept
        tags = [words for words, values in v]
        return tags

U2tfifd is a dict with key = user_id and value = array.
Speeding up a function based on list comprehension
慕碼人8056858
2021-05-10 13:15:59
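
One thing that stands out in the question: `apply` calls `get_tfidf_token` once per row, so the 32645-word sort runs 40733 times even though there are only 418 distinct users. A minimal sketch of computing the tags once per user and then looking them up per row is below. It is not the asker's code: `top_tokens`, `tags_per_user`, and the toy data are hypothetical names for illustration, and the scores are assumed to be already flattened to plain lists (as `.flatten().tolist()` does in the question). `heapq.nlargest` is used instead of a full sort because only the top 15 entries are needed.

```python
import heapq

def top_tokens(scores, feature_names, k=15):
    """Return the k feature names with the highest scores, best first."""
    # nlargest avoids sorting the whole vocabulary when only k entries are needed
    best = heapq.nlargest(k, zip(scores, feature_names), key=lambda pair: pair[0])
    return [name for _, name in best]

def tags_per_user(user_scores, feature_names, k=15):
    """Compute the top-k tokens once per user instead of once per row."""
    return {user_id: top_tokens(scores, feature_names, k)
            for user_id, scores in user_scores.items()}

# Hypothetical toy data standing in for U2tfifd and tfidf_feature_names
feature_names = ["apple", "banana", "cherry", "date"]
user_scores = {1: [0.1, 0.9, 0.5, 0.0], 2: [0.3, 0.2, 0.8, 0.7]}
row_user_ids = [1, 2, 1]  # stands in for interactions_full_df["user_id"]

tags_by_user = tags_per_user(user_scores, feature_names, k=2)
# Per-row step is now a cheap dictionary lookup
row_tags = [tags_by_user[user_id] for user_id in row_user_ids]
```

With the real data, the per-user scores would presumably come from `U2tfifd[user_id].flatten().tolist()`, and the final lookup could likely be written as `interactions_full_df["user_id"].map(tags_by_user)`, since pandas `Series.map` accepts a dict.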