I have written two Python files: one simulates data sets and the other runs the gradient boosting libraries XGBoost, LightGBM and CatBoost with Bayesian hyperparameter tuning on the generated data. The code runs and yields results, but I would like someone with solid experience in both Python and parameter tuning of gradient boosting libraries to proofread it. The results are somewhat contradictory; for example, XGBoost has much lower overall accuracy than the other two libraries, whereas I would expect all three to be roughly similar, at least when there are no categorical features. It is also very important that the categorical features are entered one-hot encoded into XGBoost, while for the other two libraries they must be entered so that the libraries understand that these are categorical variables. I suspect the latter is handled correctly, but I feel more uncertain about the former.
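To illustrate what I mean by the three encodings, here is a minimal sketch using a hypothetical toy frame (column names and values are made up; my actual data comes from the simulation script). It only shows the data preparation step, not the model fitting:

```python
import pandas as pd

# Hypothetical stand-in for one simulated data set
df = pd.DataFrame({
    "num1": [0.1, 0.5, 0.9, 0.3],
    "cat1": ["a", "b", "a", "c"],
    "cat2": ["x", "x", "y", "y"],
})
cat_cols = ["cat1", "cat2"]

# XGBoost: one-hot encode the categoricals before training
X_xgb = pd.get_dummies(df, columns=cat_cols)

# LightGBM: keep the columns but cast them to pandas 'category' dtype;
# lgb.Dataset picks these up automatically (or pass
# categorical_feature=cat_cols explicitly)
X_lgb = df.copy()
for c in cat_cols:
    X_lgb[c] = X_lgb[c].astype("category")

# CatBoost: pass the raw frame and supply the categorical column
# indices (or names) via the cat_features argument of Pool/fit
cat_feature_idx = [df.columns.get_loc(c) for c in cat_cols]
```

This is the behaviour I want verified in my code: one-hot columns only for XGBoost, and explicit categorical markers for LightGBM and CatBoost.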
If your time budget allows, I would be happy if you could fix the errors you encounter; otherwise, write me a list of what has to be done. If time allows, I would also like average hyperparameter values to be returned for all models, in addition to the AUC and training time (which are already implemented).
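By "average hyperparameter values" I mean something along these lines (a minimal sketch, assuming each iteration's best tuned parameters are collected as a dict; the parameter names and values below are illustrative, not my actual results):

```python
from statistics import mean

# Hypothetical best-parameter dicts, one per experiment iteration,
# as the Bayesian tuning loop might collect them
results = [
    {"learning_rate": 0.05, "max_depth": 6, "n_estimators": 300},
    {"learning_rate": 0.10, "max_depth": 4, "n_estimators": 500},
    {"learning_rate": 0.07, "max_depth": 5, "n_estimators": 400},
]

# Average each hyperparameter over the iterations, to be reported
# alongside the mean AUC and training time
avg_params = {k: mean(r[k] for r in results) for k in results[0]}
print(avg_params)
```

One such averaged dict per model (XGBoost, LightGBM, CatBoost) would be enough.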
I understand that it might be difficult to guarantee there are no typos in this amount of code, so missing an obvious typo would be acceptable, but other kinds of errors must be detected.
For a more detailed description, please read the attached document. Note that the code should be written so that it runs several iterations of the experiment (currently 10).
If you are interested, I will send you the code so that you can have a look before accepting the offer.