Q1: Self-organizing map analysis of thermal conductivity dataset
The script SOM-HexagonalTopology.py applies SOM unsupervised clustering to our thermal conductivity dataset, displays the cluster map, and maps each data sample to its cluster, colored by one of 10 grades corresponding to the sample's percentile.
(1) Add your code starting at line 99 so that the SOM map shows a different color for each thermal conductivity grade (0 to 9, corresponding to the sample's percentile).
Hint: scipy provides a function that calculates the percentile of a value within a list of numbers:
from scipy import stats
print(stats.percentileofscore(target, 500))
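One way to turn the hint into grades, as a minimal sketch rather than the expected solution: compute each sample's percentile within the whole target list and divide it into ten equal bands. The function name conductivity_grades and the choice of kind="strict" are assumptions made here; they are not part of SOM-HexagonalTopology.py.

```python
import numpy as np
from scipy import stats


def conductivity_grades(target, n_grades=10):
    """Map each value in `target` to an integer grade 0..n_grades-1 by percentile."""
    target = np.asarray(target, dtype=float)
    # kind="strict" counts only values strictly below the score, so the
    # percentile stays below 100 and the largest sample lands in the top grade.
    pct = np.array(
        [stats.percentileofscore(target, v, kind="strict") for v in target]
    )
    band_width = 100.0 / n_grades
    grades = np.floor(pct / band_width).astype(int)
    # Guard against any value reaching the upper edge.
    return np.clip(grades, 0, n_grades - 1)
```

The resulting array can be used to index a list of ten colors when plotting each sample on the SOM map. With the default kind="rank", the largest sample gets a percentile of 100, which is why the clip is kept as a safeguard.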
You need to install minisom: pip3 install minisom. More info: https://github.com/JustGlowing/minisom
(2) If possible, fix the legend bar so that the colors show the range of thermal conductivity values. (Optional, 10 bonus points.)
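A possible approach to the legend, sketched under assumptions: take the decile edges of the target values and build a discrete colorbar whose ticks are those edges, so each color band is labeled with its conductivity range. The helper names decile_edges and add_value_colorbar, the "viridis" colormap, and the axis label are illustrative choices, not taken from the provided script.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for on-screen plots
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import BoundaryNorm


def decile_edges(target, n_grades=10):
    """Return the n_grades + 1 values that bound each percentile band."""
    target = np.asarray(target, dtype=float)
    return np.percentile(target, np.linspace(0, 100, n_grades + 1))


def add_value_colorbar(fig, ax, target, cmap_name="viridis", n_grades=10):
    """Attach a discrete colorbar whose ticks are conductivity values."""
    edges = decile_edges(target, n_grades)
    cmap = plt.get_cmap(cmap_name, n_grades)  # one color per grade
    # BoundaryNorm needs strictly increasing edges; heavily tied data
    # would require merging duplicate edges first.
    norm = BoundaryNorm(edges, cmap.N)
    mappable = ScalarMappable(norm=norm, cmap=cmap)
    mappable.set_array([])
    cbar = fig.colorbar(mappable, ax=ax, ticks=edges, spacing="uniform")
    cbar.set_label("thermal conductivity")
    return cbar, edges
```

For the colors to agree with the map, the samples would need to be drawn with the same colormap, indexed by grade.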
Q2: Genetic programming for symbolic regression
Study the fastsr symbolic regression package:
https://github.com/cfusting/fast-symbolic-regression
Read the thermal_dataset.csv file. Use all numeric columns except y-exp and y-theory as X_train, and use y-exp as y_train. Then:
(1) Train a symbolic regression model on this dataset.
(2) Print the final regression score.
(3) Print the formula of the best individual.
(4) Plot the final regression scatter plot.
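The data preparation step can be sketched with pandas as follows. This covers only the column selection described above; the function name split_features_target is an assumption, and the fastsr training call is deliberately left out because its exact API should be taken from the package's own documentation.

```python
import pandas as pd


def split_features_target(df, target_col="y-exp", excluded=("y-exp", "y-theory")):
    """Return (X_train, y_train) from a DataFrame shaped like thermal_dataset.csv."""
    # Keep numeric columns only, so text columns such as labels are dropped.
    numeric = df.select_dtypes(include="number")
    # Remove the target and the theoretical prediction from the features.
    feature_cols = [c for c in numeric.columns if c not in excluded]
    X_train = numeric[feature_cols]
    y_train = df[target_col]
    return X_train, y_train
```

With the assignment file, this would be called as split_features_target(pd.read_csv("thermal_dataset.csv")); the returned X_train.values and y_train.values are plain NumPy arrays if the regression package expects those.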