Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.
$ git clone https://github.com/eriklindernoren/ML-From-Scratch
$ cd ML-From-Scratch
$ python setup.py install
Why is the predicted value from LinearRegression exactly the same as the true value?
Asked 2020-Sep-24 at 23:10
I'm fitting a regression with LinearRegression and get a mean squared error of 0. I think there should be some deviation (at least a small one). Could you please explain this phenomenon?
## Import packages
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
import urllib.request

## Import dataset
urllib.request.urlretrieve('https://raw.githubusercontent.com/Data-Science-FMI/ml-from-scratch-2019/master/data/house_prices_train.csv', 'house_prices_train.csv')
df_train = pd.read_csv('house_prices_train.csv')
x = df_train['GrLivArea'].values.reshape(1, -1)
y = df_train['SalePrice'].values.reshape(1, -1)
print('The explanatory variable is', x)
print('The variable to be predicted is', y)

## Regression
reg = LinearRegression().fit(x, y)
mean_squared_error(y, reg.predict(x))
print('The MSE is', mean_squared_error(y, reg.predict(x)))
print('Predicted value is', reg.predict(x))
print('True value is', y)
The result is
The explanatory variable is [[1710 1262 1786 ... 2340 1078 1256]]
The variable to be predicted is [[208500 181500 223500 ... 266500 142125 147500]]
The MSE is 0.0
Predicted value is [[208500. 181500. 223500. ... 266500. 142125. 147500.]]
True value is [[208500 181500 223500 ... 266500 142125 147500]]
ANSWER
Answered 2020-Sep-24 at 23:10
While the comments are certainly correct that a model's score on its own training set will be inflated, a perfect fit is unlikely with linear regression, especially with just one feature.
Your problem is that you've reshaped the data incorrectly: reshape(1, -1) makes an array of shape (1, n), so your model thinks it has n features and n outputs with only a single sample, and so it is a multiple linear regression with a perfect fit. Try instead with reshape(-1, 1) for x and no reshaping for y.
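The shape issue described in this answer can be checked with a small sketch. It uses synthetic data as a stand-in for the GrLivArea and SalePrice columns; the sample size, slope, intercept, and noise level are assumptions chosen for illustration and are not taken from the real dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Synthetic stand-in for GrLivArea / SalePrice (assumed values, not the real data).
rng = np.random.default_rng(0)
area = rng.uniform(500, 4000, size=200)
price = 100.0 * area + 20000.0 + rng.normal(0.0, 25000.0, size=200)

# Shapes from the question: a single sample with 200 features and 200 outputs.
x_wrong = area.reshape(1, -1)
y_wrong = price.reshape(1, -1)
reg_wrong = LinearRegression().fit(x_wrong, y_wrong)
mse_wrong = mean_squared_error(y_wrong, reg_wrong.predict(x_wrong))

# Corrected shapes: 200 samples with one feature; y is left one-dimensional.
x_right = area.reshape(-1, 1)
y_right = price
reg_right = LinearRegression().fit(x_right, y_right)
mse_right = mean_squared_error(y_right, reg_right.predict(x_right))

print('Wrong shapes:', x_wrong.shape, y_wrong.shape, 'MSE =', mse_wrong)
print('Right shapes:', x_right.shape, y_right.shape, 'MSE =', mse_right)
```

With the (1, n) shapes the single sample is reproduced almost exactly, so the error is essentially zero. With the (n, 1) shape the fit is an ordinary one-feature regression and the error reflects the noise in the data.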