8

So I made a model with tensorflow keras and it seems to work ok. However, my supervisor said it would be useful to calculate the Matthews correlation coefficient, as well as the accuracy and loss it already calculates.

my model is very similar to the code in the tutorial here (https://www.tensorflow.org/tutorials/keras/basic_classification) except with a much smaller dataset.

is there a prebuilt function or would I have to get the prediction for each test and calculate it by hand?

5 Answers 5

10

There is nothing out of the box but we can calculate it from the formula in a custom metric.

The basic classification link you supplied is for a multi-class categorisation problem whereas the Matthews Correlation Coefficient is specifically for binary classification problems.

Assuming your model is structured in the "normal" way for such problems (i.e. y_pred is a number between 0 and 1 for each record representing predicted probability of a "True" and labels are each exactly a 0 or 1 representing ground truth "False" and "True" respectively) then we can add in an MCC metric as follows:

# if y_pred > threshold we predict true. 
# Sometimes we set this to something different to 0.5 if we have unbalanced categories

threshold = 0.5  

def mcc_metric(y_true, y_pred):
  predicted = tf.cast(tf.greater(y_pred, threshold), tf.float32)
  true_pos = tf.math.count_nonzero(predicted * y_true)
  true_neg = tf.math.count_nonzero((predicted - 1) * (y_true - 1))
  false_pos = tf.math.count_nonzero(predicted * (y_true - 1))
  false_neg = tf.math.count_nonzero((predicted - 1) * y_true)
  x = tf.cast((true_pos + false_pos) * (true_pos + false_neg) 
      * (true_neg + false_pos) * (true_neg + false_neg), tf.float32)
  return tf.cast((true_pos * true_neg) - (false_pos * false_neg), tf.float32) / tf.sqrt(x)

which we can include in our model.compile call:

model.compile(optimizer='adam',
              loss=tf.keras.losses.binary_crossentropy,
              metrics=['accuracy', mcc_metric])

Example

Here is a complete worked example where we categorise mnist digits depending on whether they are greater than 4:

mnist = tf.keras.datasets.mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
y_train, y_test = 0 + (y_train > 4), 0 + (y_test > 4)

def mcc_metric(y_true, y_pred):
  predicted = tf.cast(tf.greater(y_pred, 0.5), tf.float32)
  true_pos = tf.math.count_nonzero(predicted * y_true)
  true_neg = tf.math.count_nonzero((predicted - 1) * (y_true - 1))
  false_pos = tf.math.count_nonzero(predicted * (y_true - 1))
  false_neg = tf.math.count_nonzero((predicted - 1) * y_true)
  x = tf.cast((true_pos + false_pos) * (true_pos + false_neg) 
      * (true_neg + false_pos) * (true_neg + false_neg), tf.float32)
  return tf.cast((true_pos * true_neg) - (false_pos * false_neg), tf.float32) / tf.sqrt(x)

model = tf.keras.models.Sequential([
  tf.keras.layers.Flatten(input_shape=(28, 28)),
  tf.keras.layers.Dense(128, activation='relu'),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='relu'),
  tf.keras.layers.Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.binary_crossentropy,
              metrics=['accuracy', mcc_metric])

model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

output:

Epoch 1/5
60000/60000 [==============================] - 7s 113us/sample - loss: 0.1391 - acc: 0.9483 - mcc_metric: 0.8972
Epoch 2/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.0722 - acc: 0.9747 - mcc_metric: 0.9495
Epoch 3/5
60000/60000 [==============================] - 6s 97us/sample - loss: 0.0576 - acc: 0.9797 - mcc_metric: 0.9594
Epoch 4/5
60000/60000 [==============================] - 6s 96us/sample - loss: 0.0479 - acc: 0.9837 - mcc_metric: 0.9674
Epoch 5/5
60000/60000 [==============================] - 6s 95us/sample - loss: 0.0423 - acc: 0.9852 - mcc_metric: 0.9704
10000/10000 [==============================] - 1s 58us/sample - loss: 0.0582 - acc: 0.9818 - mcc_metric: 0.9639
[0.05817381642502733, 0.9818, 0.9638971]
7

Prebuilt function to calculate matthews correlation coefficient

sklearn.metrics.matthews_corrcoef(y_true, y_pred, sample_weight=None )

example :

> from sklearn.metrics import matthews_corrcoef
> y_true = [+1, +1, +1, -1]
> y_pred = [+1, -1, +1, +1]
> matthews_corrcoef(y_true, y_pred) 

See the documentation.

2
7

Since the asker accepted a Python version from sklearn, here is Stewart_Rs answer in pure Python:

from math import sqrt
def mcc(tp, fp, tn, fn):

    # https://stackoverflow.com/a/56875660/992687
    x = (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    return ((tp * tn) - (fp * fn)) / sqrt(x)

It has the advanatage of being general, not just for evaluating binary classifications.

1

The other answers to the question (i.e., Keras/Tensorflow implementation of MCC) are limited in the sense that it requires binary classification and a single output column. If your setup was not the case, the function would give you a wrong MCC value without any error.

Of course, MCC can be computed for multi-class and multi-column output. A general custom metric function for Keras (Tensorflow) MCC is the following.

import tensorflow as tf
from keras import backend as K

def keras_calculate_mcc_from_conf(confusion_m):
    """tensor version of MCC calculation from confusion matrix"""
    # as in Gorodkin (2004)
    N = K.sum(confusion_m)
    up = N * tf.linalg.trace(confusion_m) - K.sum(tf.matmul(confusion_m, confusion_m))
    down_left = K.sqrt(N ** 2 - K.sum(tf.matmul(confusion_m, K.transpose(confusion_m))))
    down_right = K.sqrt(N ** 2 - K.sum(tf.matmul(K.transpose(confusion_m), confusion_m)))
    mcc_val = up / (down_left * down_right + K.epsilon())
    return mcc_val


def keras_better_to_categorical(y_pred_in):
    """tensor version of to_categorical"""
    nclass = K.shape(y_pred_in)[1]
    y_pred_argmax = K.argmax(y_pred_in, axis=1)
    y_pred = tf.one_hot(tf.cast(y_pred_argmax, tf.int32), depth=nclass)
    y_pred = tf.cast(y_pred, tf.float32)
    return y_pred


def mcc(y_true, y_pred):
    """To calculate Matthew's correlation coefficient for multi-class classification"""
    # this is necessary to make y_pred values of 0 or 1 because
    # y_pred may contain other value (e.g., 0.6890)
    y_pred = keras_better_to_categorical(y_pred)

    # now it's straightforward to calculate confusion matrix and MCC
    confusion_m = tf.matmul(K.transpose(y_true), y_pred)
    return keras_calculate_mcc_from_conf(confusion_m)


# test mcc
actuals = tf.constant([[1.0, 0], [1.0, 0], [0, 1.0], [1.0, 0]], dtype=tf.float32)
preds = tf.constant([[1.0, 0], [0, 1.0], [0, 1.0], [1.0, 0]], dtype=tf.float32)
mcc_val = mcc(actuals, preds)
print(K.eval(mcc_val))

Also please note that MCC values printed from Keras during iterations will be incorrect because of the metric calculation per batch size. You can only trust MCC value from calling "evaluate" or "score" after fitting. This is because MCC for the whole sample is not the sum/average of the parts, unlike the other metrics. For example, if your batch size is one, MCC printed will be zero during iterations.

0

Here's another method that compute the Mathews Correlation Coefficient for the multi-class case. See https://en.wikipedia.org/wiki/Phi_coefficient#Multiclass_case for more details.

import tensorflow as tf

def mcc_metric(y_true, y_pred, num_classes=3):
    ''' Custom Mathew Correlation Coefficient for multiclass 
        For more details see: 
            "https://en.wikipedia.org/wiki/Phi_coefficient"
        Inputs: 
            y_true (tensor)
            y_pred (tensor)
            num_classes - number of classes
        Outputs:
            mcc - Mathews Correlation Coefficient
        '''
    # obtain predictions here, we can add in a threshold if we would like to
    y_pred = tf.argmax(y_pred, axis=-1)

    # cast to int64
    y_true = tf.squeeze(tf.cast(y_true, tf.int64), axis=-1)
    y_pred = tf.cast(y_pred, tf.int64)

    # total number of samples
    s = tf.size(y_true, out_type=tf.int64)

    # total number of correctly predicted labels
    c = s - tf.math.count_nonzero(y_true - y_pred)

    # number of times each class truely occured
    t = []

    # number of times each class was predicted
    p = []

    for k in range(num_classes):
        k = tf.cast(k, tf.int64)
    
        # number of times that the class truely occured
        t.append(tf.reduce_sum(tf.cast(tf.equal(k, y_true), tf.int32)))

        # number of times that the class was predicted
        p.append(tf.reduce_sum(tf.cast(tf.equal(k, y_pred), tf.int32)))


    t = tf.expand_dims(tf.stack(t), 0)
    p = tf.expand_dims(tf.stack(p), 0)

    s = tf.cast(s, tf.int32)
    c = tf.cast(c, tf.int32)

    num = tf.cast(c*s - tf.matmul(t, tf.transpose(p)), tf.float32)
    dem = tf.math.sqrt(tf.cast(s**2 - tf.matmul(p, tf.transpose(p)), tf.float32)) \
          * tf.math.sqrt(tf.cast(s**2 - tf.matmul(t, tf.transpose(t)), tf.float32))


    mcc = tf.divide(num, dem + 1e-6)

    return mcc

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Not the answer you're looking for? Browse other questions tagged or ask your own question.