Range-type variables encoding

by MATTEO GIOELE COLLU
Number of replies: 0

Think about how range-type variables might be represented. Is it reasonable to represent them like any other numeric variable, or is it better to use an ad-hoc encoding?


Range-type variables, if considered as variables that can take a value inside a certain interval [a, b], are characterized by their position inside the interval rather than by the value itself.
In fact, a specific numeric value (say 10) has a totally different meaning in the interval [1, 10] than in [10, 100]: in the first case it is the upper border, in the second the lower one. So the main information to extract about the variable is its position with respect to the borders of the interval. One way to do this is to normalize the interval as x_scaled = (x - a) / (b - a) for any x in [a, b], with a < b. This way, the left border of the interval gets the value 0 and the right border gets the value 1. All values inside the interval fall between 0 and 1, representing their relative position within it.
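
As a quick illustration, here is a minimal Python sketch of this scaling applied to the example with the value 10. The function name scale_to_unit_interval and the range checks are my own assumptions for the sake of the example, not part of any particular library.

```python
def scale_to_unit_interval(x: float, a: float, b: float) -> float:
    """Map x in [a, b] to its relative position in [0, 1]."""
    # Assumption: the interval is non-degenerate, otherwise b - a would be 0.
    if not a < b:
        raise ValueError("expected a < b")
    # Assumption: values outside the interval are treated as errors.
    if not a <= x <= b:
        raise ValueError("x must lie in [a, b]")
    return (x - a) / (b - a)


# The same raw value 10 sits at opposite borders of the two intervals.
print(scale_to_unit_interval(10, 1, 10))    # 1.0 (right border)
print(scale_to_unit_interval(10, 10, 100))  # 0.0 (left border)
```

After scaling, the two occurrences of 10 are no longer confused with each other, which is the point of encoding the position instead of the raw value.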