EE559 Final Project: Wireless Indoor Localization Dataset

Summary description: An indoor WiFi system has 7 WiFi routers at various locations. The goal is to use the WiFi signal strength received from the 7 routers to predict the location of a user, who is in one of 4 rooms. Thus, there are 7 input variables (features) and 4 classes (user locations).

Data description [1]: The data has 8 columns. Columns 1 to 7 hold the measured signal strength (dB, integer-valued) for the 7 wireless sensors (routers), labeled WS1-WS7 below; column 8 is the user location (class). There are 500 data points for each location (class).

Full dataset [1] has N_Total=2000, C=4, No. of features=7.  All features are integer-valued; there are no missing values.  
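As a quick sanity check on the layout described above, the following minimal sketch parses rows of 7 integer features followed by a class label and rejects any row with the wrong number of columns. It assumes the file is whitespace-separated text, which this document does not state; the function name `parse_rows` and the two demo rows are made up for illustration and are not taken from the dataset.

```python
from collections import Counter


def parse_rows(text, n_features=7):
    """Parse whitespace-separated rows into (features, labels).

    Assumption: each non-blank line holds n_features integers
    followed by one integer class label.
    """
    features, labels = [], []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        values = [int(token) for token in line.split()]
        if len(values) != n_features + 1:
            raise ValueError(
                f"line {line_no}: expected {n_features + 1} columns, "
                f"got {len(values)}"
            )
        features.append(values[:n_features])
        labels.append(values[n_features])
    return features, labels


# Two made-up rows, for illustration only (not real measurements).
demo_text = (
    "-60 -55 -58 -62 -70 -80 -82 1\n"
    "-40 -57 -52 -41 -65 -75 -78 2\n"
)
X, y = parse_rows(demo_text)
class_counts = Counter(y)  # on the full dataset, expect 500 per class
print(len(X), len(X[0]), dict(class_counts))
```

On the full dataset, the same checks would be expected to report 2000 rows, 7 features per row, and 500 points in each of the 4 classes.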

Sample data for user localization using wireless signal strength [2]:

 

Required training and test sets: Separate training and test sets are being extracted and will be posted on D2L; they are D_Train (with N_Train = 1600) and D_Test (with N_Test = 400). For the class project, you are required to use these sets as partitioned, so that everyone's test-set accuracy can be compared fairly. (You are free to further divide D_Train as you wish for cross-validation, etc.)
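One possible way to divide D_Train further is stratified k-fold cross-validation, which keeps the class proportions roughly equal in every fold. The sketch below uses only the standard library. The function name `stratified_folds`, the choice of 5 folds, and the placeholder labels (which assume D_Train is balanced at 400 points per class, something this document does not state) are all assumptions for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, n_folds=5, seed=0):
    """Return n_folds lists of indices with roughly equal class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds = [[] for _ in range(n_folds)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal the shuffled indices of this class round-robin into the folds.
        for pos, idx in enumerate(indices):
            folds[pos % n_folds].append(idx)
    return folds


# Placeholder labels: 4 classes x 400 points = 1600 (balance is assumed).
labels = [c for c in (1, 2, 3, 4) for _ in range(400)]
folds = stratified_folds(labels, n_folds=5)

for k, val_idx in enumerate(folds):
    val_set = set(val_idx)
    train_idx = [i for i in range(len(labels)) if i not in val_set]
    print(f"fold {k}: {len(train_idx)} training, {len(val_idx)} validation")
```

D_Test plays no part in this procedure; it stays untouched until the final accuracy is reported.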

Comment: Theoretically, the dataset should be linearly separable if the locations are not very close to each other and the wireless routers and the room geometry are not all symmetric. However, results obtained by the paper referenced on the UCI website imply that the dataset is probably not linearly separable.
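One rough way to probe linear separability between two classes is to run a perceptron and see whether it reaches zero training errors. The sketch below is only an illustration under stated assumptions: the function name `perceptron_converges`, the epoch limit, and the toy 2-D points are made up and are not from the dataset. Convergence shows that the two sets are linearly separable; failure to converge within the epoch limit is suggestive but does not prove the opposite.

```python
def perceptron_converges(points, targets, max_epochs=1000):
    """Return True if a perceptron makes a full error-free pass in max_epochs.

    points:  list of feature tuples
    targets: list of +1 / -1 labels (two classes only)
    """
    weights = [0.0] * (len(points[0]) + 1)  # last entry is the bias weight
    for _ in range(max_epochs):
        errors = 0
        for x, t in zip(points, targets):
            augmented = list(x) + [1.0]  # append constant 1 for the bias
            activation = sum(w * xi for w, xi in zip(weights, augmented))
            if t * activation <= 0:  # misclassified or on the boundary
                weights = [w + t * xi for w, xi in zip(weights, augmented)]
                errors += 1
        if errors == 0:
            return True  # the current weights separate all points
    return False


# Toy data, for illustration only.
separable_points = [(0, 0), (0, 1), (3, 3), (4, 3)]
separable_targets = [-1, -1, 1, 1]

xor_points = [(0, 0), (1, 1), (0, 1), (1, 0)]
xor_targets = [-1, -1, 1, 1]

print(perceptron_converges(separable_points, separable_targets))  # True
print(perceptron_converges(xor_points, xor_targets))              # False
```

For the 4-class problem, the same check could be repeated for each pair of classes, or for each class against the rest.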

Tip: This dataset is pretty straightforward to use for classification. If you want to extend the problem, here are a few suggestions.

(i) Put more effort into feature-space dimensionality, such as introducing new features as nonlinear functions of the given features (nonlinear mapping), and/or reducing the dimensionality of the feature space (e.g., with linear transformations or other feature selection).

(ii) Perform additional analysis. If you have some knowledge of the problem domain, what could be done to improve the features, or how might one improve the localization in a future system?

(iii) Add a confidence measure to the classification outputs, and see how quickly the accuracy improves when a threshold is placed on the confidence measure (i.e., data points that fall below some minimal confidence are put into a "reject" class). We will provide more information on confidence measures (probably in Discussion 14).

(iv) Follow ideas of your own that use this data, are relevant to the problem, and use pattern recognition techniques. It is advisable to check with a TA or professor to see if they think it is a reasonable idea. In particular, Zihang has some background in this topic area and could be helpful if your questions or ideas relate specifically to the domain of the problem.
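Suggestion (iii) can be sketched as follows, taking the largest estimated class probability as the confidence measure. This is only one possible choice and not necessarily the one the course will present. The function names, the 0-based class indices, and the probability values are made up for illustration.

```python
def classify_with_reject(probabilities, threshold):
    """Return the predicted class index per point, or None for "reject".

    probabilities: one list of per-class probability estimates per point
    threshold:     minimum top probability needed to accept a decision
    """
    decisions = []
    for probs in probabilities:
        best = max(range(len(probs)), key=lambda k: probs[k])
        decisions.append(best if probs[best] >= threshold else None)
    return decisions


def accuracy_and_coverage(decisions, true_labels):
    """Accuracy over accepted points, and the fraction of points accepted."""
    accepted = [(d, t) for d, t in zip(decisions, true_labels) if d is not None]
    coverage = len(accepted) / len(true_labels)
    if not accepted:
        return float("nan"), coverage
    accuracy = sum(d == t for d, t in accepted) / len(accepted)
    return accuracy, coverage


# Made-up probability estimates for 4 points and 4 classes (indices 0-3).
probs = [
    [0.90, 0.05, 0.03, 0.02],
    [0.40, 0.35, 0.15, 0.10],
    [0.10, 0.70, 0.10, 0.10],
    [0.30, 0.30, 0.25, 0.15],
]
true_labels = [0, 1, 1, 2]

for threshold in (0.0, 0.6):
    decisions = classify_with_reject(probs, threshold)
    acc, cov = accuracy_and_coverage(decisions, true_labels)
    print(f"threshold={threshold}: accuracy={acc}, coverage={cov}")
```

Sweeping the threshold and plotting accuracy against coverage shows how much accuracy is gained for each additional fraction of points rejected.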
