1. Definition
Standardization transforms the raw data so that each feature (column) has a mean of 0 and a standard deviation of 1.
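2. Formula
x' = (x - mean) / σ, applied to each column, where mean is the column's average and σ is the column's standard deviation.
Example: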
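As a small illustration of the definition, the sketch below standardizes a single column by hand using only the Python standard library. The helper name `standardize` and the sample values are hypothetical, chosen only so the arithmetic is easy to follow. It assumes the population standard deviation (dividing by n), which is also what scikit-learn's StandardScaler uses.

```python
from statistics import mean, pstdev


def standardize(column):
    """Return the z-scores of a list of numbers.

    Uses the population standard deviation (divides by n).
    A constant column has sigma == 0 and would raise ZeroDivisionError here.
    """
    mu = mean(column)
    sigma = pstdev(column)
    return [(x - mu) / sigma for x in column]


# Hypothetical sample values: mean is 5.0, population standard deviation is 2.0
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(values)

print(z)          # the standardized values
print(mean(z))    # mean of the standardized column
print(pstdev(z))  # standard deviation of the standardized column
```

After the transformation, the column's mean is 0 and its standard deviation is 1, while the relative ordering of the values is unchanged.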
3. API
sklearn.preprocessing.StandardScaler()
- StandardScaler.fit_transform(X)
  - **X: data in numpy array format, shape [n_samples, n_features]**
  - Return value: the transformed array, with the same shape as X
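A minimal sketch of how StandardScaler is commonly used is shown here, assuming scikit-learn and NumPy are installed. The arrays `X_train` and `X_new` are hypothetical toy data. The sketch also uses `transform`, `mean_` and `scale_`, which are not covered above: the per-column statistics are learned once during fitting and can then be reused on new data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 4 samples, 2 features
X_train = np.array([[1.0, 10.0],
                    [2.0, 20.0],
                    [3.0, 30.0],
                    [4.0, 40.0]])
X_new = np.array([[2.5, 25.0]])

scaler = StandardScaler()

# fit_transform learns the per-column mean and standard deviation,
# then returns the standardized array (same shape as the input)
X_train_scaled = scaler.fit_transform(X_train)

print(scaler.mean_)    # per-feature means learned from X_train
print(scaler.scale_)   # per-feature standard deviations learned from X_train
print(X_train_scaled.shape)

# transform reuses the statistics learned from X_train on unseen data
X_new_scaled = scaler.transform(X_new)
print(X_new_scaled)
```

Reusing the fitted scaler on new data keeps the training data and any later data on the same scale.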
4. Code example
from sklearn.preprocessing import StandardScaler
import pandas as pd


def stand_demo():
    # Load the dataset and keep only the first three columns (the features)
    data = pd.read_csv("dating.txt")
    data = data.iloc[:, :3]
    print("data:\n", data)

    # Standardize each column to mean 0 and standard deviation 1
    transfer = StandardScaler()
    data_new = transfer.fit_transform(data)
    print("data_new:\n", data_new)

    return None


if __name__ == "__main__":
    stand_demo()
5. Output
data:
     milage     Liters  Consumtime
0     40920   8.326976    0.953952
1     14488   7.153469    1.673904
2     26052   1.441871    0.895124
3     75136  13.147394    0.428964
4     38344   1.669788    0.134296
5     72993  18.141748    1.932955
6     35948   6.838792    1.213192
7     42666  13.276369    0.543888
8     67497   8.631577    0.749278
9     35483  12.273169    1.508953
10    50242   3.723498    0.831917
11    63275   8.385879    1.669485
12     5569   4.875435    0.728658
13    51052   4.688098    0.625224
14    77372  15.299570    0.331351
15    43673   1.889461    0.191283
16    61364   7.516754    1.269164
17    69673  14.239195    0.261333
18    15669   0.000000    1.259185
data_new:
[[ 0.12304713  0.24281169  0.08961701]
 [-1.09340804  0.00255574  1.51545904]
 [-0.5612089  -1.16679851 -0.02688998]
 [ 1.69773803  1.22971173 -0.95010503]
 [ 0.00449429 -1.12013631 -1.53368563]
 [ 1.59911275  2.25222225  2.02850131]
 [-0.10577457 -0.06186912  0.60303358]
 [ 0.20340165  1.2561172  -0.7225017 ]
 [ 1.34617549  0.30517366 -0.31573334]
 [-0.12717483  1.05072877  1.18877883]
 [ 0.5520648  -0.6996735  -0.15206943]
 [ 1.15187034  0.2548711   1.50670735]
 [-1.50387882 -0.46383365 -0.3565706 ]
 [ 0.58934267 -0.50218777 -0.56141834]
 [ 1.80064336  1.6703338  -1.14342447]
 [ 0.24974587 -1.07516193 -1.42082469]
 [ 1.06392218  0.07693228  0.71388434]
 [ 1.4463195   1.45323974 -1.28209289]
 [-1.03905598 -1.4619975   0.69412125]]
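6. Summary
When there are enough samples, standardization is fairly stable, because a small number of outliers has little effect on the mean and standard deviation. This makes it suitable for modern, noisy big-data scenarios.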
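The stability claim can be checked with a rough experiment, sketched here under stated assumptions: the data are synthetic draws from a normal distribution, the outlier value of 500.0 is arbitrary, and the helper `shift_from_outlier` is a hypothetical name introduced only for this illustration. It measures how far one outlier moves the mean and standard deviation of a small sample versus a large one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic samples from the same distribution, differing only in size
small = rng.normal(loc=50.0, scale=5.0, size=20)
large = rng.normal(loc=50.0, scale=5.0, size=20_000)
outlier = 500.0  # arbitrary extreme value


def shift_from_outlier(sample, outlier):
    """Return how much the mean and std change when one outlier is appended."""
    with_outlier = np.append(sample, outlier)
    mean_shift = abs(with_outlier.mean() - sample.mean())
    std_shift = abs(with_outlier.std() - sample.std())
    return mean_shift, std_shift


small_mean_shift, small_std_shift = shift_from_outlier(small, outlier)
large_mean_shift, large_std_shift = shift_from_outlier(large, outlier)

print("small sample:", small_mean_shift, small_std_shift)
print("large sample:", large_mean_shift, large_std_shift)
```

In the large sample the single outlier changes both statistics far less, so the standardized values of the remaining points stay almost the same.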