
An introduction to Multinomial Logistic Regression
In this blog, you are going to learn:
- What is Logistic Regression?
- What are different types of logistic regression models?
- How to implement a Binary Logistic Regression model?
- How to implement a Multinomial Logistic Regression model?
- How to implement an Ordinal Logistic Regression model?
Logistic Regression
Despite having "regression" in its name, Logistic Regression is a classification algorithm. It estimates the probability of each outcome by passing a linear combination of the input features through the sigmoid function, sigmoid(z) = 1 / (1 + e^(-z)), which maps any real number z to a value between 0 and 1. Logistic regression can be regularized with an L1 or L2 penalty to reduce overfitting.
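To make the sigmoid function concrete, here is a minimal sketch in NumPy. The helper name sigmoid, the example scores, and the 0.5 decision threshold are our own assumptions for illustration; they are not taken from any library.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to probabilities in the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical linear scores for five samples.
scores = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
probabilities = sigmoid(scores)

# Assumed decision rule: predict class 1 when the probability is at least 0.5.
labels = (probabilities >= 0.5).astype(int)
print(probabilities)
print(labels)
```

A score of 0 maps to a probability of exactly 0.5, so the sign of the score decides the predicted class.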
Types of Logistic Regression Models
- Binary: The target variable or the dependent variable has only two possible outcomes (0 or 1). Examples include the following:
- A person has cancer or not.
- The email is spam or not.
- Multinomial: In this category, the target variable or the dependent variable has three or more unordered possible outcomes. The examples would include the following problems.
- The weather would be sunny, rainy, or cloudy.
- Classification of handwritten digits (MNIST dataset)
- Ordinal: In this category, the target variable or the dependent variable has three or more ordered possible outcomes. The examples would include the following problems.
- Predicting the quality of a wine (for example, a score from 0 to 10).
- Classifying reviews by rating (for example, 1 to 5 stars).
Now we will discuss each category in detail.
Binary Logistic Regression
In this problem, we are going to work on the Cervical Cancer Behavior Risk Data Set. The dataset contains 19 attributes describing behavioral risk factors for cervical cancer (ca cervix). The class label, ca_cervix, takes the value 1 for respondents with cervical cancer and 0 for respondents without it.
1) behavior_eating
2) behavior_personalHygine
3) intention_aggregation
4) intention_commitment
5) attitude_consistency
6) attitude_spontaneity
7) norm_significantPerson
8) norm_fulfillment
9) perception_vulnerability
10) perception_severity
11) motivation_strength
12) motivation_willingness
13) socialSupport_emotionality
14) socialSupport_appreciation
15) socialSupport_instrumental
16) empowerment_knowledge
17) empowerment_abilities
18) empowerment_desires
19) ca_cervix (this is the class attribute; 1 = has cervical cancer, 0 = no cervical cancer)
Importing all the libraries
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.metrics import f1_score
Importing the Data
We are loading the dataset directly from its URL using the pandas read_csv() function.
data_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/00537/sobar-72.csv'
df = pd.read_csv(data_url)
df.head(5)
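In this dataset, "ca_cervix" is our target column. Let us separate the target variable from the feature columns. The pop() method removes the column from df and returns it, so df is left with the features only.
y = df.pop("ca_cervix")
y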
Splitting Dataset
Once we have separated our target variable, we can split the data into training and testing sets. For this, we are going to use the train_test_split() function, holding out 15% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(df, y, test_size=0.15, random_state=42)
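To see what a 15% hold-out means in practice, here is a small sketch on stand-in data. We assume 72 rows (as the file name sobar-72 suggests) and invent the features and the class balance purely for illustration. The stratify argument is an optional extra that is not used in this blog; it keeps the class proportions similar in both splits.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 72 rows, 5 random features, made-up binary labels.
rng = np.random.default_rng(0)
features = rng.normal(size=(72, 5))
labels = np.array([0] * 51 + [1] * 21)

X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.15, random_state=42, stratify=labels
)
print(X_tr.shape, X_te.shape)

# The same random_state reproduces exactly the same split.
X_tr2, X_te2, _, _ = train_test_split(
    features, labels, test_size=0.15, random_state=42, stratify=labels
)
print(np.array_equal(X_te, X_te2))
```

With so few test rows, a single misclassified sample moves the accuracy by several percentage points, which is worth keeping in mind when reading the scores later on.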
Model Implementation
Logistic Regression model parameters:
- penalty : The type of regularization penalty to use, one of {'l1', 'l2', 'elasticnet'} or no penalty. The default value is 'l2'.
- dual (bool) : Whether to use the dual or the primal formulation. The dual formulation is implemented only for the L2 penalty with the 'liblinear' solver.
- tol (float) : The tolerance for the stopping criteria.
- C (float) : The inverse of the regularization strength. Smaller values mean stronger regularization.
- fit_intercept : Whether a constant (the bias or intercept) is added to the decision function.
- intercept_scaling : Used only with the 'liblinear' solver when fit_intercept is True. It adds a synthetic feature with this constant value to each sample.
- class_weight (dict or 'balanced') : To adjust the weight given to every class.
- random_state : The seed used to shuffle the data when the solver is 'sag', 'saga', or 'liblinear'.
- solver : The optimization algorithm to be used in the model. The options include {'newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga'}.
- max_iter : The maximum number of iterations the solver may take to converge.
- multi_class : How multi-class targets are handled, one of {'auto', 'ovr', 'multinomial'}. Recent scikit-learn releases deprecate this parameter.
- verbose : Controls the verbosity of the solver output.
- warm_start : When set to True, the solution of the previous call to fit is reused as the starting point instead of training from scratch.
- n_jobs : The number of CPU cores to be used in parallel.
- l1_ratio : The Elastic-Net mixing parameter, between 0 and 1. It is used only when penalty is 'elasticnet'.
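To illustrate how a few of these parameters interact, here is a hedged sketch that compares an L2 and an L1 penalty on synthetic data. The dataset, the value C=0.1, and all variable names are our own choices for the demonstration. Note that the L1 penalty needs a solver that supports it, such as 'liblinear' or 'saga', and that newer scikit-learn releases may print a deprecation warning for the penalty argument.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic data: 20 features, of which only 3 are informative (assumed setup).
X_demo, y_demo = make_classification(
    n_samples=300, n_features=20, n_informative=3, random_state=0
)

# A small C means strong regularization.
l2_model = LogisticRegression(penalty="l2", C=0.1, solver="lbfgs", max_iter=1000)
l1_model = LogisticRegression(penalty="l1", C=0.1, solver="liblinear", max_iter=1000)
l2_model.fit(X_demo, y_demo)
l1_model.fit(X_demo, y_demo)

# Count the coefficients that were driven to exactly zero.
n_zero_l2 = int((l2_model.coef_ == 0).sum())
n_zero_l1 = int((l1_model.coef_ == 0).sum())
print("zero coefficients with L2:", n_zero_l2)
print("zero coefficients with L1:", n_zero_l1)
```

The L1 penalty tends to set the weights of uninformative features to exactly zero, while the L2 penalty only shrinks them, which is why L1 is sometimes used as a rough form of feature selection.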
We are going to implement a Binary Logistic Regression model.
binary_logistic_model = LogisticRegression(max_iter=500).fit(X_train, y_train)
binary_logistic_model
We have set the parameter "max_iter" to 500, which is enough for the solver to converge on our dataset. If you see a convergence warning on the dataset you are working with, try increasing this value. Our model has now been trained on the training dataset. Let's predict the labels of the test data so that we can check how well the model does.
prediction = binary_logistic_model.predict(X_test)
prediction
Model Validation
Once we have our predictions for the testing dataset, we can check how our trained model has performed by using the classification report.
The classification report gives the precision, recall, F1 score, and accuracy of the model. Note that classification_report() expects the true labels first and the predictions second.
print(classification_report(y_test, prediction))
Since we have a very small dataset, we are achieving 100% precision, recall, and accuracy. This rarely happens with real-world data. If you achieve a perfect score on a real-world problem, you need to re-check your steps. One of the major causes is leaving the target variable among the training features (data leakage), which lets the model reach 100% accuracy without learning anything useful.
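The leakage problem described above can be reproduced on purpose. The following sketch uses synthetic, deliberately noisy data (not the cervical cancer dataset) and compares a model trained on clean features with one that also sees the target as an extra column. The helper name holdout_accuracy and all settings are assumptions made for this illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Noisy synthetic problem, so that an honest model cannot be perfect.
X_syn, y_syn = make_classification(
    n_samples=500, n_features=10, n_informative=3,
    flip_y=0.2, class_sep=0.5, random_state=0,
)

# Simulate the mistake: the target is appended as one more feature column.
X_leaky = np.column_stack([X_syn, y_syn])

def holdout_accuracy(features, target):
    """Train on 85% of the rows and return the accuracy on the remaining 15%."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        features, target, test_size=0.15, random_state=42
    )
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return model.score(X_te, y_te)

clean_accuracy = holdout_accuracy(X_syn, y_syn)
leaky_accuracy = holdout_accuracy(X_leaky, y_syn)
print("without leakage:", clean_accuracy)
print("with leakage:", leaky_accuracy)
```

If a score jumps to (almost) perfect after adding or keeping a column, that column deserves a second look before you trust the model.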
Multinomial Logistic Regression
For Multinomial Logistic Regression, we are going to use the Digit Dataset. In this dataset, each datapoint is an 8×8 image of a digit. We have 10 classes to predict. Let’s import the dataset and start working.
Importing Dataset
Here we are importing the digits dataset from scikit-learn's datasets module.
from sklearn.datasets import load_digits
import matplotlib.pyplot as plt
digits = load_digits()
Once we have loaded the dataset, we can assign the feature matrix and the target variable, respectively.
X = digits['data']
y = digits['target']
X[0]
Let’s have a look at the dataset by plotting the images.
plt.figure()
plt.imshow(digits['images'][0], cmap='gray_r')
plt.show()
We have plotted the digit 0.
Splitting Dataset
Once we have separated our target variable, we can split the data into training and testing sets. For this, we are going to use the train_test_split() function.
X_train, X_test, y_train, y_test = train_test_split(X, y.tolist(), test_size=0.15, random_state=42)
X_train.shape, X_test.shape
Model Implementation
The implementation of the Multinomial Logistic Regression model would be similar to the Binary Logistic Model. We are changing the value of the “max_iter” parameter. We are increasing its value to 5000.
multinomial_logistic_model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
multinomial_logistic_model
Once our model has been trained, we can start predicting the values from the test set.
prediction = multinomial_logistic_model.predict(X_test)
prediction
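Besides hard class labels, a multi-class model can also report one probability per class. The sketch below repeats the same training steps on the digits data in a self-contained form (the variable names are our own) and uses predict_proba() to show that each row holds 10 probabilities that sum to 1, with the predicted label being the most probable class.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

digits_demo = load_digits()
X_tr, X_te, y_tr, y_te = train_test_split(
    digits_demo["data"], digits_demo["target"], test_size=0.15, random_state=42
)
model = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)

# One row per sample, one column per digit class (0 to 9).
probabilities = model.predict_proba(X_te[:3])
most_probable = model.classes_[np.argmax(probabilities, axis=1)]

print(probabilities.round(3))
print(most_probable, model.predict(X_te[:3]))
```

Looking at the probabilities instead of only the labels helps to spot samples where the model is unsure between two digits.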
Model Validation
Once we have our predictions for the testing dataset, we can check how our trained model has performed by using the classification report.
The classification report gives the precision, recall, F1 score, and accuracy of the model.
Precision (P) is defined as the number of true positives (TP) divided by the number of true positives (TP) plus the number of false positives (FP):
P = TP / (TP + FP)
Recall (R) is defined as the number of true positives (TP) divided by the number of true positives (TP) plus the number of false negatives (FN):
R = TP / (TP + FN)
print(classification_report(y_test, prediction))
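To connect the two formulas with the library, here is a small worked example on made-up binary labels. We count TP, FP, and FN by hand and compare the result with scikit-learn's precision_score() and recall_score(). The label lists are invented for illustration.

```python
from sklearn.metrics import precision_score, recall_score

# Made-up true labels and predictions for eight samples.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 1, 1, 0]

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))  # predicted 1, truly 1
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))  # predicted 1, truly 0
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))  # predicted 0, truly 1

precision = tp / (tp + fp)  # 3 / (3 + 2) = 0.6
recall = tp / (tp + fn)     # 3 / (3 + 1) = 0.75

print(precision, precision_score(y_true, y_pred))
print(recall, recall_score(y_true, y_pred))
```

In the multi-class case, the classification report computes these two numbers once per class, treating that class as the positive one and all other classes as negative.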
Ordinal Logistic Regression
For this model, we are going to work on the Wine Quality Data Set. The dataset is related to red Vinho Verde wine samples, from the north of Portugal. The goal is to model wine quality based on physicochemical tests. The target variable for this dataset is “quality”.
Attribute Information:
Input variables (based on physicochemical tests):
1 – fixed acidity
2 – volatile acidity
3 – citric acid
4 – residual sugar
5 – chlorides
6 – free sulfur dioxide
7 – total sulfur dioxide
8 – density
9 – pH
10 – sulphates
11 – alcohol
Output variable:
12 – quality (score between 0 and 10)
Importing Dataset
We are going to import the data using the pandas read_csv() function. The file uses semicolons as column separators, so we pass sep=";".
data_url = 'https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv'
df = pd.read_csv(data_url, sep=";")
df.head(5)
Let us have a look at the different classes we have to predict.
df['quality'].unique()
We have 6 distinct classes to predict. Now we are going to separate our target variable from the features using the pop() method.
y = df.pop("quality")
Data Splitting
X_train, X_test, y_train, y_test = train_test_split(df, y, test_size=0.15, random_state=42)
Model Implementation
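The implementation here is similar to the Binary Logistic Model. We are only changing the value of the "max_iter" parameter, increasing it to 10000. Note that scikit-learn's LogisticRegression does not model the ordering of the classes: it treats the quality scores as unordered labels, so this model is an approximation of a true ordinal logistic regression.
ordinal_logistic_model = LogisticRegression(max_iter=10000).fit(X_train, y_train)
ordinal_logistic_model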
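If you want the model to actually use the order of the classes, one common workaround (often attributed to Frank and Hall) is to train one binary logistic model per threshold, each answering "is the target greater than this class?", and then combine their probabilities. The sketch below is our own minimal take on that idea using only scikit-learn and synthetic data. The helper names fit_ordinal and predict_ordinal are hypothetical, and this is not the method used in the rest of this blog.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_ordinal(X, y, max_iter=1000):
    """Fit one binary model per threshold: is y greater than this class?"""
    classes = np.sort(np.unique(y))
    models = [
        LogisticRegression(max_iter=max_iter).fit(X, (y > threshold).astype(int))
        for threshold in classes[:-1]
    ]
    return classes, models

def predict_ordinal(classes, models, X):
    """Combine the P(y > k) estimates into one score per class and pick the best."""
    greater = np.column_stack([m.predict_proba(X)[:, 1] for m in models])
    ones = np.ones((X.shape[0], 1))
    zeros = np.zeros((X.shape[0], 1))
    # P(y = k) = P(y > k - 1) - P(y > k); the outer columns are fixed at 1 and 0.
    class_scores = np.hstack([ones, greater]) - np.hstack([greater, zeros])
    return classes[np.argmax(class_scores, axis=1)]

# Synthetic ordered target (hypothetical data, not the wine dataset).
rng = np.random.default_rng(0)
X_ord = rng.normal(size=(600, 2))
latent = 2.0 * X_ord[:, 0] + X_ord[:, 1] + rng.normal(scale=0.5, size=600)
y_ord = np.digitize(latent, bins=[-1.5, 0.0, 1.5])  # ordered classes 0 < 1 < 2 < 3

classes, models = fit_ordinal(X_ord[:500], y_ord[:500])
ordinal_pred = predict_ordinal(classes, models, X_ord[500:])
ordinal_accuracy = float(np.mean(ordinal_pred == y_ord[500:]))
print(ordinal_accuracy)
```

Dedicated ordinal models exist in other libraries as well, so treat this as a way to understand the idea rather than as a finished implementation.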
Model Prediction
prediction = ordinal_logistic_model.predict(X_test)
prediction
Model Validation
Once we have our predictions for the testing dataset, we can check how our trained model has performed by using the F1 score.
The F1 score is defined as the harmonic mean of precision and recall.
F1 = 2 * (P * R) / (P + R)
print(f1_score(y_test, prediction, average='macro'))
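With average='macro', the F1 score is computed separately for every class and then averaged with equal weight per class. Here is a small worked example on made-up labels for three classes that rebuilds the macro F1 from the per-class precision and recall and compares it with f1_score(). The label lists are invented for illustration.

```python
from sklearn.metrics import f1_score, precision_score, recall_score

# Made-up true labels and predictions for three classes.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

# average=None returns one value per class instead of a single number.
per_class_p = precision_score(y_true, y_pred, average=None)
per_class_r = recall_score(y_true, y_pred, average=None)

# Harmonic mean of precision and recall, class by class.
per_class_f1 = 2 * per_class_p * per_class_r / (per_class_p + per_class_r)
macro_f1 = per_class_f1.mean()

print(per_class_f1)
print(macro_f1, f1_score(y_true, y_pred, average="macro"))
```

Because every class counts equally, a rare class that the model predicts badly pulls the macro F1 down much more than it would pull down the plain accuracy.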
An F1 score of 64.43% can be improved by using techniques such as the following:
- Data normalization, so that all features are on a comparable scale
- The SMOTE technique, which adds synthetic data points for under-represented classes in an imbalanced dataset
- Hyperparameter tuning, for example searching over the value of C
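The following sketch shows how two of these ideas could be combined in scikit-learn: normalization with StandardScaler and a small grid search over C. To keep the example runnable offline, it uses scikit-learn's built-in load_wine() dataset as a stand-in, which is a different dataset from the Wine Quality data above. Instead of SMOTE (which lives in a separate library), it swaps in class_weight='balanced' as a simpler way to handle class imbalance. The grid values are arbitrary choices.

```python
from sklearn.datasets import load_wine
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in dataset (3 wine cultivars), not the Wine Quality data set.
X_wine, y_wine = load_wine(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_wine, y_wine, test_size=0.15, random_state=42, stratify=y_wine
)

# The scaler is fitted inside each cross-validation fold, so no test data leaks in.
pipeline = make_pipeline(
    StandardScaler(),
    LogisticRegression(max_iter=10000, class_weight="balanced"),
)

# make_pipeline names each step after its lowercased class name.
search = GridSearchCV(
    pipeline,
    param_grid={"logisticregression__C": [0.01, 0.1, 1, 10]},
    scoring="f1_macro",
    cv=5,
)
search.fit(X_tr, y_tr)

best_c = search.best_params_["logisticregression__C"]
test_f1 = search.score(X_te, y_te)
print(best_c, test_f1)
```

The same pipeline could be pointed at the Wine Quality features and target, although the scores you get there will of course differ.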
Summary
- Binary Logistic Regression model: The target variable or the dependent variable has only two possible outcomes (0 or 1).
- Multinomial Logistic Regression model: In this category, the target variable or the dependent variable has three or more unordered possible outcomes.
- Ordinal Logistic Regression model: In this category, the target variable or the dependent variable has three or more ordered possible outcomes.