ATM Source Code Analysis

example/example.py

from atm import ATM

atm = ATM()

results = atm.run(train_path="/home/tqc/PycharmProjects/automl/ATM/demos/pollution_1.csv")
results.describe()

atm.worker.Worker#select_hyperpartition

The debug output matches the description in the paper: a hyperpartition is one path from root to leaf in the conditional parameter tree (CPT).

>>> pprint(hyperpartitions)
[<dt: [('criterion', 'entropy')]>,
 <dt: [('criterion', 'gini')]>,
 <knn: [('weights', 'uniform'), ('algorithm', 'ball_tree'), ('metric', 'minkowski')]>,
 <knn: [('weights', 'uniform'), ('algorithm', 'ball_tree'), ('metric', 'euclidean')]>,...]
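To make the "one path from root to leaf" idea concrete, here is a minimal sketch that enumerates hyperpartitions from a toy parameter tree. The `CPT` dictionary and the `enumerate_hyperpartitions` helper are hypothetical and are not ATM's actual data structures. The toy version also ignores conditional branches (a categorical value that enables further parameters), so it only shows the simplest case, where every combination of categorical values is one path.

```python
from itertools import product

# Hypothetical, simplified parameter tree: for each method, the categorical
# hyperparameters and their allowed values. This is not ATM's real config.
CPT = {
    "dt": {"criterion": ["entropy", "gini"]},
    "knn": {
        "weights": ["uniform", "distance"],
        "algorithm": ["ball_tree", "kd_tree"],
    },
}


def enumerate_hyperpartitions(cpt):
    """Return one (method, categoricals) pair per root-to-leaf path."""
    partitions = []
    for method, categoricals in cpt.items():
        names = list(categoricals)
        # Each combination of categorical values is one path through the tree.
        for values in product(*(categoricals[name] for name in names)):
            partitions.append((method, list(zip(names, values))))
    return partitions


hyperpartitions = enumerate_hyperpartitions(CPT)
for method, categoricals in hyperpartitions:
    print(f"<{method}: {categoricals}>")
```

With this toy tree, dt contributes two paths and knn contributes four.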

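Inspecting a single hyperpartition shows that it holds the fixed categorical choices (categoricals) and the hyperparameters that remain tunable (tunables):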

>>> hyperpartitions[0].categoricals
[('criterion', 'entropy')]
>>> pprint(hyperpartitions[0].tunables)
[('max_features',
  <btb.hyper_parameter.FloatHyperParameter object at 0x7fd946ae83c8>),
 ('max_depth',
  <btb.hyper_parameter.IntHyperParameter object at 0x7fd946ae82e8>),
 ('min_samples_split',
  <btb.hyper_parameter.IntHyperParameter object at 0x7fd946ae8e80>),
 ('min_samples_leaf',
  <btb.hyper_parameter.IntHyperParameter object at 0x7fd946ae8f28>)]

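The purpose of a hyperpartition is to carve a continuous N-dimensional space out of a fragmented, structured search space, so that a Gaussian process (GP) can operate within it.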
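As a rough illustration of what "a continuous N-dimensional space" means here, the sketch below maps the tunables of one hyperpartition onto the unit hypercube, which is the kind of numeric input a GP can model. The names mirror the tunables printed above, but the bounds, `to_vector` and `from_vector` are assumptions made up for this example, not BTB or ATM code.

```python
# Hypothetical tunable ranges for one decision-tree hyperpartition.
# The names mirror the printed tunables; the bounds are made up.
TUNABLES = [
    ("max_features", 0.1, 1.0),
    ("max_depth", 2, 10),
    ("min_samples_split", 2, 4),
    ("min_samples_leaf", 1, 3),
]


def to_vector(params, tunables=TUNABLES):
    """Map a dict of tunable values to a point in the unit hypercube [0, 1]^N."""
    return [(params[name] - low) / (high - low) for name, low, high in tunables]


def from_vector(vector, tunables=TUNABLES):
    """Inverse of to_vector; integer-valued ranges are rounded back to ints."""
    params = {}
    for (name, low, high), x in zip(tunables, vector):
        value = low + x * (high - low)
        params[name] = round(value) if isinstance(low, int) else value
    return params


point = to_vector({"max_features": 0.55, "max_depth": 6,
                   "min_samples_split": 3, "min_samples_leaf": 2})
print(point)
print(from_vector(point))
```

Because the categorical choices are already fixed by the hyperpartition, every point in this cube is a valid configuration, which is what makes the space usable for a GP.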

btb.selection.uniform.Uniform#select
atm.worker.Worker#select_hyperpartition
atm.worker.Worker#run_classifier

hyperpartition = self.select_hyperpartition()

This picks a hyperpartition at random. It appears to be a multi-armed bandit (MAB) selection step.
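The difference between uniform selection and a score-aware bandit can be sketched as follows. This is not BTB's implementation: `select_uniform`, `select_ucb1` and the `history` data are hypothetical, and UCB1 is used only as one well-known bandit rule for comparison.

```python
import math
import random


def select_uniform(choice_scores, rng=random):
    """Pick any hyperpartition id with equal probability, ignoring scores."""
    return rng.choice(list(choice_scores))


def select_ucb1(choice_scores):
    """Pick the id that maximizes mean score plus a UCB1 exploration bonus."""
    total = sum(len(scores) for scores in choice_scores.values())
    best_id, best_value = None, -math.inf
    for choice_id, scores in choice_scores.items():
        if not scores:
            return choice_id  # try every arm once before comparing
        mean = sum(scores) / len(scores)
        bonus = math.sqrt(2 * math.log(total) / len(scores))
        if mean + bonus > best_value:
            best_id, best_value = choice_id, mean + bonus
    return best_id


# Hypothetical history: hyperpartition id -> scores of classifiers tried so far.
history = {1: [0.70, 0.72], 2: [0.90, 0.91], 3: [0.50]}
print(select_uniform(history, random.Random(0)))
print(select_ucb1(history))
```

With this history UCB1 picks id 3, the least-explored arm, because its exploration bonus outweighs its low mean score. A uniform selector never looks at the scores at all.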

>>> pprint(params)
{'_scale': True,
 'algorithm': 'kd_tree',
 'leaf_size': 38,
 'metric': 'chebyshev',
 'n_neighbors': 13,
 'weights': 'uniform'}

atm.database.Database#start_classifier
~~Instantiates the hyperparameters as a classifier object~~

        classifier = self.Classifier(hyperpartition_id=hyperpartition_id,
                                     datarun_id=datarun_id,
                                     host=host,
                                     hyperparameter_values=hyperparameter_values,
                                     start_time=datetime.now(),
                                     status=ClassifierStatus.RUNNING)

Yet another baffling piece of code:
atm/database.py:382

At a glance, this is using an ORM to work with the database.
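If the ORM reading is right, creating the Classifier object and saving it amounts to inserting one row that records the chosen hyperparameters and marks the run as in progress. The sketch below shows that idea with the standard-library sqlite3 module instead of an ORM. The table layout, the "running" status string and this `start_classifier` function are assumptions for illustration, not ATM's schema or code.

```python
import json
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
# Hypothetical table; the column names follow the keyword arguments above.
conn.execute(
    """CREATE TABLE classifiers (
           id INTEGER PRIMARY KEY,
           hyperpartition_id INTEGER,
           datarun_id INTEGER,
           host TEXT,
           hyperparameter_values TEXT,
           start_time TEXT,
           status TEXT)"""
)


def start_classifier(conn, hyperpartition_id, datarun_id, host, hyperparameter_values):
    """Insert a row marked as running and return its id."""
    cursor = conn.execute(
        "INSERT INTO classifiers (hyperpartition_id, datarun_id, host,"
        " hyperparameter_values, start_time, status) VALUES (?, ?, ?, ?, ?, ?)",
        (hyperpartition_id, datarun_id, host, json.dumps(hyperparameter_values),
         datetime.now().isoformat(), "running"),  # status value is assumed
    )
    conn.commit()
    return cursor.lastrowid


row_id = start_classifier(conn, 3, 1, "localhost", {"n_neighbors": 13})
print(conn.execute("SELECT id, status FROM classifiers").fetchall())
```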

model, metrics = self.test_classifier(hyperpartition.method, params)

>>> model.pipeline
Pipeline(memory=None,
         steps=[('standard_scale',
                 StandardScaler(copy=True, with_mean=True, with_std=True)),
                ('knn',
                 KNeighborsClassifier(algorithm='ball_tree', leaf_size=20,
                                      metric='euclidean', metric_params=None,
                                      n_jobs=None, n_neighbors=16, p=2,
                                      weights='distance'))],
         verbose=False)
>>> metrics
{'cv': [{'accuracy': 1.0, 'cohen_kappa': 1.0, 'f1': 1.0, 'mcc': 1.0, 'roc_auc': 1.0, 'ap': 1.0}, ...

That seems to be roughly the whole flow.

Both the selector and the tuner default to uniform, that is, uniform random search:

            selector (str):
                Type of selector to use. Optional. Defaults to ``'uniform'``.
            tuner (str):
                Type of tuner to use. Optional. Defaults to ``'uniform'``.
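For the tuner side, uniform search simply means drawing each tunable independently from its range. A minimal sketch, assuming made-up ranges and a hypothetical `propose_uniform` helper (this is not the BTB tuner API):

```python
import random

# Hypothetical search ranges: (name, type, low, high). Not taken from ATM.
TUNABLES = [
    ("n_neighbors", int, 1, 20),
    ("leaf_size", int, 1, 50),
]


def propose_uniform(tunables, rng=random):
    """Draw each tunable independently and uniformly from its range."""
    params = {}
    for name, kind, low, high in tunables:
        if kind is int:
            params[name] = rng.randint(low, high)  # inclusive on both ends
        else:
            params[name] = rng.uniform(low, high)
    return params


rng = random.Random(42)
for _ in range(3):
    print(propose_uniform(TUNABLES, rng))
```

Past scores play no role in these proposals, so with both defaults left at uniform the search is plain random search over hyperpartitions and their tunables.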
