DM2017PY: opinion in Q5 (c)
I am Chin Chee Hoong from the data mining course, January 2018.
For Q5 (c) (ii), f2() is actually the Euclidean distance, which is the Minkowski distance with r = 2, rather than the general Minkowski distance (Minkowski distance becomes Manhattan distance if r = 1).
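To make the relationship concrete, here is a small sketch of the Minkowski distance. The function name `minkowski` and the sample points are my own choices for illustration, not from the exam paper.

```python
def minkowski(p, q, r):
    """Minkowski distance of order r between two points of equal dimension."""
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1.0 / r)

# Illustrative points (not from the exam question)
p, q = (0, 0), (3, 4)

d_manhattan = minkowski(p, q, 1)  # r = 1: |3| + |4|
d_euclidean = minkowski(p, q, 2)  # r = 2: sqrt(3^2 + 4^2)

print(d_manhattan, d_euclidean)
```

So the same formula gives Manhattan distance with r = 1 and Euclidean distance with r = 2.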
Then, the disadvantage of Manhattan distance compared with Euclidean distance is that it is harder to tell the "actual" (straight-line) distance between the points.
For example, let's use P1 (4,5), P2 (1,8) and P3 (0,0). The Euclidean distance between P1 and P3 is sqrt(41), and between P2 and P3 it is sqrt(65). However, their Manhattan distances are both the same, which is 9!
Manhattan distance is like counting how many moves you need to make to reach the end point from the start point. For example, if you want to reach point B (5,0) from point A (0,0), you just need to move right 5 times, each time incrementing the x-axis value by 1. That count can be thought of as the difference (Manhattan distance) between the two points.
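The numbers in the P1, P2, P3 example can be checked with a few lines of Python. This is only a quick sketch; the helper name `manhattan` is my own, and `math.dist` needs Python 3.8 or newer.

```python
import math

def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

P1, P2, P3 = (4, 5), (1, 8), (0, 0)

# Euclidean distances differ: sqrt(41) vs sqrt(65)
print(math.dist(P1, P3), math.dist(P2, P3))

# Manhattan distances are identical
print(manhattan(P1, P3), manhattan(P2, P3))
```

Euclidean tells the two points apart, while Manhattan treats them as equally far from P3.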
Q5 (c) (iii)
Euclidean distance measures the direct (straight-line) distance from point A to point B, so it is never longer than the Manhattan distance, and it is shorter whenever the points differ in more than one dimension.
Manhattan can be better than Euclidean when the data contains noise or outliers, because Euclidean squares each coordinate difference, so one large difference easily dominates the result.
When working in a continuous space where all dimensions are properly scaled and relevant, Euclidean is going to be the better choice of distance function.
If you have a space made up exclusively of countable (discrete) dimensions, then Manhattan distance makes more sense.
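A rough way to see the outlier effect is to compare how much of each distance comes from the single largest coordinate difference. The values in `diffs` are made up for illustration only.

```python
# Absolute coordinate differences between two points (made-up values);
# the last one plays the role of a noisy/outlier dimension.
diffs = [1, 1, 1, 10]

# Share of the Manhattan distance contributed by the largest difference
manhattan_share = max(diffs) / sum(diffs)

# Share of the squared Euclidean distance contributed by the largest difference
euclidean_share = max(diffs) ** 2 / sum(d * d for d in diffs)

print(f"Manhattan: {manhattan_share:.0%}, Euclidean (squared): {euclidean_share:.0%}")
```

The outlier dimension takes a bigger share under Euclidean because its difference is squared before summing.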
