python sklearn multiple linear regression display r-squared

By jacksparrow August 31, 2024

Python Scikit-learn Multiple Linear Regression: Displaying R-squared

Multiple Linear Regression with Scikit-learn

Introduction

Multiple linear regression is a statistical method used to model the relationship between a dependent variable and two or more independent variables. Scikit-learn (sklearn) is a powerful Python library for machine learning, offering an efficient implementation of multiple linear regression.
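In symbols, a model with p independent variables can be written as shown below. The notation (beta coefficients and an error term epsilon) is a common textbook convention and an assumption here, not something taken from this post.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon
```

Fitting the model means estimating the intercept \beta_0 and the coefficients \beta_1, ..., \beta_p from the data.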

Steps Involved

Let’s outline the essential steps for performing multiple linear regression in Python using sklearn and displaying the R-squared value:

1. Import necessary libraries
2. Load and prepare your data
3. Create the model
4. Train the model
5. Evaluate the model: Calculate R-squared

Code Implementation

1. Importing Libraries


from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

2. Loading and Preparing Data


import pandas as pd
data = pd.read_csv('your_data.csv')  # Replace 'your_data.csv' with your file
X = data[['Independent Variable 1', 'Independent Variable 2', ...]]  # List your independent-variable columns; the ... is a placeholder to replace before running
y = data['Dependent Variable']  # Select your dependent variable

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)  # Split into training and testing sets

3. Creating the Model


model = LinearRegression()

4. Training the Model


model.fit(X_train, y_train)
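After fitting, the learned parameters are available through the estimator's coef_ and intercept_ attributes. The sketch below is a minimal illustration on synthetic, noise-free data; the names X_demo, y_demo, and demo_model, and the coefficient values 2.0, -1.0, 0.5 with intercept 3.0, are assumptions chosen so the recovered values can be compared against known ones.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (illustrative assumption): y is an exact linear
# function of three features, so the fit should recover the inputs.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
true_coef = np.array([2.0, -1.0, 0.5])
true_intercept = 3.0
y_demo = X_demo @ true_coef + true_intercept

demo_model = LinearRegression()
demo_model.fit(X_demo, y_demo)

# One coefficient per feature column, plus a single intercept term.
print('Coefficients:', demo_model.coef_)
print('Intercept:', demo_model.intercept_)
```

With real, noisy data the estimates will not match any "true" values exactly, but the same two attributes are where to look.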

5. Evaluation: Calculating R-squared


y_pred = model.predict(X_test)
r_squared = r2_score(y_test, y_pred)
print('R-squared:', r_squared)
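For regressors such as LinearRegression, the estimator's own score method also returns R-squared, so the separate predict and r2_score calls can be collapsed into one. The sketch below checks that the two routes agree; the synthetic dataset from make_regression and the variable names are assumptions for illustration only.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic regression data (illustrative assumption): 200 rows, 3 features.
X_demo, y_demo = make_regression(n_samples=200, n_features=3, noise=10.0, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

demo_model = LinearRegression().fit(X_tr, y_tr)

# Route 1: predict explicitly, then score the predictions.
score_via_metric = r2_score(y_te, demo_model.predict(X_te))
# Route 2: let the estimator predict and score in one call.
score_via_method = demo_model.score(X_te, y_te)

print('r2_score:   ', score_via_metric)
print('model.score:', score_via_method)
```

Note the argument order: r2_score takes the true values first and the predictions second, while score takes the features and the true values.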

Example

Let’s see a complete example using a hypothetical dataset for house prices.

Data

Size (sqft)	Bedrooms	Bathrooms	Price (USD)
1500	3	2	250000
2000	4	3	350000
1800	3	2.5	300000
2200	4	3.5	400000

Code


import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

data = {'Size (sqft)': [1500, 2000, 1800, 2200],
        'Bedrooms': [3, 4, 3, 4],
        'Bathrooms': [2, 3, 2.5, 3.5],
        'Price (USD)': [250000, 350000, 300000, 400000]}
df = pd.DataFrame(data)

X = df[['Size (sqft)', 'Bedrooms', 'Bathrooms']]
y = df['Price (USD)']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
r_squared = r2_score(y_test, y_pred)

print('R-squared:', r_squared)

Output


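R-squared: nan

With only four rows, test_size=0.2 holds out a single sample, and R-squared is not defined for fewer than two samples. Current scikit-learn releases emit an UndefinedMetricWarning in this case and return nan. A meaningful held-out score needs a larger dataset.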

Interpretation

The R-squared value is the proportion of the variance in the dependent variable that the model explains using the independent variables. A value close to 1 means the model accounts for most of that variability. The score cannot exceed 1, and on held-out data it can drop below 0 when the model predicts worse than a constant equal to the mean of the test targets.
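To make the definition concrete, R-squared can be computed by hand as 1 minus the ratio of the residual sum of squares to the total sum of squares. The sketch below does that and compares the result with r2_score; the helper name r_squared_manual and the small arrays are hypothetical, chosen so the arithmetic is easy to follow.

```python
import numpy as np
from sklearn.metrics import r2_score

def r_squared_manual(y_true, y_pred):
    """Compute R-squared as 1 - SS_res / SS_tot (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation around the mean
    return 1.0 - ss_res / ss_tot

y_true = [3.0, 5.0, 7.0, 9.0]
close_pred = [2.5, 5.5, 7.0, 9.0]     # small errors: score near 1
mean_pred = [6.0, 6.0, 6.0, 6.0]      # always predicts the mean: score of 0
reversed_pred = [9.0, 7.0, 5.0, 3.0]  # worse than the mean: negative score

for name, pred in [('close', close_pred), ('mean', mean_pred), ('reversed', reversed_pred)]:
    print(name, r_squared_manual(y_true, pred), r2_score(y_true, pred))
```

The three cases give roughly 0.975, 0, and -3, which shows that the score measures improvement over predicting the mean rather than a percentage bounded at zero.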

Conclusion

By applying these steps, you can effectively use Scikit-learn to build a multiple linear regression model and obtain a measure of its performance using the R-squared value. This allows you to understand the predictive power of your model and assess its suitability for your specific problem.
