Splitting The Data Into A Training And A Testing Set In Sklearn
Why Do We Need Data Splitting?
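One way to see the motivation is to compare a model's score on the data it was trained on with its score on data it has never seen. The sketch below is only an illustration under stated assumptions: the noisy synthetic data and the choice of `DecisionTreeRegressor` are not from this article, and were picked because an unrestricted tree can memorize its training set.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Assumption: synthetic data, a linear trend plus Gaussian noise
rng = np.random.RandomState(0)
X = np.linspace(0, 10, 200).reshape(-1, 1)
y = 3.0 * X.ravel() + rng.normal(scale=5.0, size=200)

# Hold out 25% of the samples for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Assumption: an unrestricted tree, chosen because it can memorize the training data
model = DecisionTreeRegressor(random_state=0)
model.fit(X_train, y_train)

# R^2 on data the model has seen versus data it has not seen
train_r2 = model.score(X_train, y_train)
test_r2 = model.score(X_test, y_test)
print("R^2 on training data:", train_r2)
print("R^2 on test data:    ", test_r2)
```

The training score looks close to perfect because the tree has memorized the noise, while the held-out score is lower. Only the test score says anything about how the model behaves on new data.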
Example
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
# Your data
X = np.array([150, 160, 170]).reshape(-1, 1)
y = np.array([50, 56, 63])
# Split into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)
# Train the model
model = LinearRegression()
model.fit(X_train, y_train)
# Predict on test data
y_pred = model.predict(X_test)
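# Evaluate: compare the predictions with the held-out targets
print("Predicted:", y_pred)
print("Actual:   ", y_test)
print("MSE:", mean_squared_error(y_test, y_pred))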
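The example above uses `test_size=0.33` and `random_state=42`. As a rough sketch of what those two arguments do, the snippet below checks the split sizes and confirms that the same seed reproduces the same split. The 20-sample arrays are made up for illustration; the three-sample case reuses the data from the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Assumption: a made-up dataset of 20 samples with one feature
X = np.arange(20).reshape(-1, 1)
y = 2 * np.arange(20)

# test_size=0.25 puts 25% of the 20 samples in the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)
print("Train shape:", X_train.shape)
print("Test shape: ", X_test.shape)

# Same random_state, same split
X_train_again, X_test_again, _, _ = train_test_split(
    X, y, test_size=0.25, random_state=42
)
same_split = np.array_equal(X_test, X_test_again)
print("Reproducible:", same_split)

# The three-sample data from the example: one sample is held out for testing
X_small = np.array([150, 160, 170]).reshape(-1, 1)
y_small = np.array([50, 56, 63])
Xs_train, Xs_test, ys_train, ys_test = train_test_split(
    X_small, y_small, test_size=0.33, random_state=42
)
print("Small split sizes:", len(Xs_train), "train,", len(Xs_test), "test")
```

With only three samples the test set holds a single point, so any metric computed on it is a very rough estimate. A larger dataset gives a more trustworthy evaluation.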
X_train, X_test, y_train, And y_test
| Variable Name | Dependent/Independent | Purpose |
|---|---|---|
| X_train | Input features (independent variable) | Portion of the inputs used to train the model. |
| X_test | Input features (independent variable) | Portion of the inputs held out to test the model. |
| y_train | Output (dependent variable) | Target values paired with X_train, used during training. |
| y_test | Output (dependent variable) | Target values paired with X_test, compared against the model's predictions. |
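To make the roles of the four variables concrete, here is a small sketch that checks two properties of a split: each row of X stays paired with its own target, and the training and test portions do not overlap. The eight-sample data with `y = 10 * x` is an assumption chosen so the pairing is easy to verify.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Assumption: toy data where each target is exactly 10 times its feature
X = np.arange(8).reshape(-1, 1)
y = 10 * np.arange(8)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Rows are shuffled together, so every target still matches its feature
train_aligned = np.array_equal(y_train, 10 * X_train.ravel())
test_aligned = np.array_equal(y_test, 10 * X_test.ravel())

# No sample appears in both portions, and together they cover all samples
train_values = set(X_train.ravel().tolist())
test_values = set(X_test.ravel().tolist())
disjoint = train_values.isdisjoint(test_values)
complete = (train_values | test_values) == set(range(8))

print("Train pairs aligned:", train_aligned)
print("Test pairs aligned: ", test_aligned)
print("No overlap:         ", disjoint)
print("All samples used:   ", complete)
```

Because the test rows never reach `fit`, the predictions made on X_test can be compared against y_test as a fair check of the model.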
Output
| Posted By - | Karan Gupta |
| Posted On - | Tuesday, May 27, 2025 |