1 回答

TA貢獻(xiàn)1850條經(jīng)驗(yàn) 獲得超11個(gè)贊
來(lái)自MovieLens的MovieLens 25M 數(shù)據(jù)集的數(shù)據(jù)
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style("whitegrid")
# data
df = pd.read_csv('ml-25m/movies.csv')
print(df.head())
movieId title genres
0 1 Toy Story (1995) Adventure|Animation|Children|Comedy|Fantasy
1 2 Jumanji (1995) Adventure|Children|Fantasy
2 3 Grumpier Old Men (1995) Comedy|Romance
3 4 Waiting to Exhale (1995) Comedy|Drama|Romance
4 5 Father of the Bride Part II (1995) Comedy
# clean genres
df['genres'] = df['genres'].str.split('|')
df = df.explode('genres').reset_index(drop=True)
print(df.head())
movieId title genres
0 1 Toy Story (1995) Adventure
1 1 Toy Story (1995) Animation
2 1 Toy Story (1995) Children
3 1 Toy Story (1995) Comedy
4 1 Toy Story (1995) Fantasy
流派很重要
gc = df.genres.value_counts().to_frame()
print(genre_count)
genres
Drama 25606
Comedy 16870
Thriller 8654
Romance 7719
Action 7348
Horror 5989
Documentary 5605
Crime 5319
(no genres listed) 5062
Adventure 4145
Sci-Fi 3595
Children 2935
Animation 2929
Mystery 2925
Fantasy 2731
War 1874
Western 1399
Musical 1054
Film-Noir 353
IMAX 195
陰謀:sns.barplot
和ax
fig, ax = plt.subplots(figsize=(12, 6))
sns.barplot(x=gc.index, y=gc.genres, palette=sns.color_palette("BuGn_r", n_colors=len(genre_count) + 4), ax=ax)
ax.set_xticklabels(ax.get_xticklabels(), rotation=45, horizontalalignment='right')
plt.show()
沒(méi)有ax
plt.figure(figsize=(12, 6))
chart = sns.barplot(x=gc.index, y=gc.genres, palette=sns.color_palette("BuGn_r", n_colors=len(genre_count)))
chart.set_xticklabels(chart.get_xticklabels(), rotation=45, horizontalalignment='right')
plt.show()
陰謀:sns.countplot
如果情節(jié)順序無(wú)關(guān)緊要,請(qǐng)使用
sns.countplot
跳過(guò)使用。.value_counts()
要訂購(gòu)
countplot
,order=df.genres.value_counts().index
必須使用,所以如果需要降序,countplot
并不能真正使您免于需要 。.value_counts()
fig, ax = plt.subplots(figsize=(12, 6))
sns.countplot(data=df, x='genres', ax=ax)
ax.set_xticklabels(ax.get_xticklabels(), rotation=45, horizontalalignment='right')
plt.show()
添加回答
舉報(bào)