1 Answer

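From the documentation:
The difference between specifying the column selector as 'column' (as a simple string) and ['column'] (as a list with one element) is the shape of the array that is passed to the transformer. In the first case, a one-dimensional array is passed, while in the second case it is a two-dimensional array with one column, i.e. a column vector.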
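The shape difference can be checked with plain pandas indexing. This is only a minimal sketch: it assumes that string versus list selection on a DataFrame mirrors what the mapper hands to the transformer, and the tiny frame below is made up for illustration. scikit-learn transformers such as PolynomialFeatures expect two-dimensional input, which is why the list form matters here.

```python
import pandas as pd

# Small made-up frame; the values are only illustrative.
df = pd.DataFrame({"median_income": [8.3252, 8.3014, 7.2574]})

# Selecting with a plain string yields a Series, whose values are 1-D.
as_string = df["median_income"].to_numpy()

# Selecting with a one-element list yields a DataFrame, whose values are 2-D.
as_list = df[["median_income"]].to_numpy()

print(as_string.shape)  # (3,)
print(as_list.shape)    # (3, 1)
```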
All columns must be passed using the same type of column selector.
In this case that is a list, because a list is needed to keep some of the columns untransformed.
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures
from sklearn_pandas import DataFrameMapper
# load data
df = pd.read_csv('https://raw.githubusercontent.com/ageron/handson-ml2/master/datasets/housing/housing.csv')
# create houseAge_income
df['houseAge_income'] = df.housing_median_age.mul(df.median_income)
# configure mapper with all columns passed as lists
mapper = DataFrameMapper([(['houseAge_income'], PolynomialFeatures(2)),
                          (['median_income'], PolynomialFeatures(2)),
                          (['latitude', 'housing_median_age', 'total_rooms', 'population', 'median_house_value', 'ocean_proximity'], None)])
# fit
poly_feature = mapper.fit_transform(df)
# display(pd.DataFrame(poly_feature).head())
   0       1           2  3       4       5      6   7     8     9         10        11
0  1  341.33  1.1651e+05  1  8.3252  69.309  37.88  41   880   322  4.526e+05  NEAR BAY
1  1  174.33       30391  1  8.3014  68.913  37.86  21  7099  2401  3.585e+05  NEAR BAY
2  1  377.38  1.4242e+05  1  7.2574   52.67  37.85  52  1467   496  3.521e+05  NEAR BAY
3  1  293.44       86108  1  5.6431  31.845  37.85  52  1274   558  3.413e+05  NEAR BAY
4  1     200       40001  1  3.8462  14.793  37.85  52  1627   565  3.422e+05  NEAR BAY