Automated Machine Learning: A Survey of Recent Progress


This article was written by Xin He of Hong Kong Baptist University and is published by Leiphone's AI Technology Review with the author's authorization.

Title | AutoML: A Survey of the State-of-the-Art

Authors | Xin He, Kaiyong Zhao, Xiaowen Chu

Affiliation | Hong Kong Baptist University

Paper link | https://arxiv.org/abs/1908.00709

Deep learning has been applied in many fields and brings great convenience to people's lives. However, building a high-quality deep learning system for a specific task not only consumes a great deal of time and resources, but also relies heavily on specialized domain knowledge.

Therefore, to make deep learning techniques easier to apply to more fields, automated machine learning (AutoML) has gradually become a focus of attention.

This article first summarizes, from an end-to-end system perspective, research progress on AutoML at each stage of the pipeline (as shown in the figure below), then focuses on the recently and widely studied topic of neural architecture search (NAS), and finally discusses some future research directions.

I. Data Preparation

As is well known, data is crucial for deep learning tasks, so a good AutoML system should be able to automatically improve both data quality and quantity. We divide data preparation into two parts: data collection and data cleaning.

1. Data Collection

New public datasets keep appearing, such as MNIST, CIFAR-10, and ImageNet, and various datasets can also be obtained from public platforms such as Kaggle, Google Dataset Search, and Elsevier Data Search. However, for some special tasks, especially medical tasks or tasks involving personal privacy, data is hard to obtain, so it is often difficult to find a suitable dataset, or the available dataset is very small. There are two main ideas for addressing this problem: data generation and data searching.

1) Data generation

Images:

  • Cubuk, Ekin D., et al. "AutoAugment: Learning augmentation policies from data." arXiv preprint arXiv:1805.09501 (2018).
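To make the idea concrete, here is a minimal sketch of applying a learned augmentation policy to images. It assumes a recent torchvision release that ships an AutoAugment transform; the dataset, download path, and policy choice are illustrative and not taken from the paper above.

```python
# Minimal sketch: applying a learned augmentation policy with torchvision.
# Assumes torchvision >= 0.11, which provides transforms.AutoAugment;
# the CIFAR-10 policy and dataset here are illustrative choices.
import torchvision
from torchvision import transforms

augment = transforms.Compose([
    transforms.AutoAugment(transforms.AutoAugmentPolicy.CIFAR10),  # learned policy
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

train_set = torchvision.datasets.CIFAR10(root="./data", train=True,
                                         download=True, transform=augment)
```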

Speech:

  • Park, Daniel S., et al. "SpecAugment: A simple data augmentation method for automatic speech recognition." arXiv preprint arXiv:1904.08779 (2019).

Text:

  • Xie, Ziang, et al. "Data noising as smoothing in neural network language models." arXiv preprint arXiv:1703.02573 (2017).
  • Yu, Adams Wei, et al. "QANet: Combining local convolution with global self-attention for reading comprehension." arXiv preprint arXiv:1804.09541 (2018).

GANs:

  • Karras, Tero, Samuli Laine, and Timo Aila. "A style-based generator architecture for generative adversarial networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.

Simulators:

  • Brockman, Greg, et al. "OpenAI Gym." arXiv preprint arXiv:1606.01540 (2016).
2) Data searching:

  • Roh, Yuji, Geon Heo, and Steven Euijong Whang. "A survey on data collection for machine learning: A big data-AI integration perspective." arXiv preprint arXiv:1811.03402 (2018).
  • Yarowsky, David. "Unsupervised word sense disambiguation rivaling supervised methods." 33rd Annual Meeting of the Association for Computational Linguistics. 1995.
  • Zhou, Yan, and Sally Goldman. "Democratic co-learning." 16th IEEE International Conference on Tools with Artificial Intelligence. IEEE, 2004.
2. Data Cleaning

  • Krishnan, Sanjay, and Eugene Wu. "AlphaClean: Automatic generation of data cleaning pipelines." arXiv preprint arXiv:1904.11827 (2019).
  • Chu, Xu, et al. "Katara: A data cleaning system powered by knowledge bases and crowdsourcing." Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data. ACM, 2015.
  • Krishnan, Sanjay, et al. "ActiveClean: An interactive data cleaning framework for modern machine learning." Proceedings of the 2016 International Conference on Management of Data. ACM, 2016.
  • Krishnan, Sanjay, et al. "SampleClean: Fast and Reliable Analytics on Dirty Data." IEEE Data Eng. Bull. 38.3 (2015): 59-75.

II. Feature Engineering

Feature engineering can be divided into three parts: feature selection, feature construction, and feature extraction (a minimal scikit-learn sketch follows the references below). Representative works on feature construction and extraction include:

  • H. Vafaie and K. De Jong, “Evolutionary feature space transformation,” in Feature Extraction, Construction and Selection. Springer, 1998, pp. 307–323.
  • J. Gama, “Functional trees,” Machine Learning, vol. 55, no. 3, pp. 219–250, 2004.
  • D. Roth and K. Small, “Interactive feature space construction using semantic information,” in Proceedings of the Thirteenth Conference on Computational Natural Language Learning. Association for Computational Linguistics, 2009, pp. 66–74.
  • Q. Meng, D. Catchpoole, D. Skillicorn, and P. J. Kennedy, “Relational autoencoder for feature extraction,” in 2017 International Joint Conference on Neural Networks (IJCNN). IEEE, 2017, pp. 364–371.
  • O. Irsoy and E. Alpaydin, “Unsupervised feature extraction with autoencoder trees,” Neurocomputing, vol. 258, pp. 63–73, 2017.
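As a small illustration of two of these parts, the sketch below performs automatic feature selection and feature extraction with scikit-learn; the dataset, k, and n_components are illustrative assumptions, not settings from the survey.

```python
# Minimal sketch of feature selection and feature extraction with scikit-learn.
# The dataset, k, and n_components are illustrative choices.
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = load_breast_cancer(return_X_y=True)

# Feature selection: keep the 10 features most associated with the label.
X_selected = SelectKBest(score_func=f_classif, k=10).fit_transform(X, y)

# Feature extraction: project the original features into a lower-dimensional space.
X_extracted = PCA(n_components=5).fit_transform(X)

print(X_selected.shape, X_extracted.shape)  # (569, 10) (569, 5)
```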

III. Model Generation

There are two main approaches to model generation. One is to generate models based on traditional machine learning methods such as SVMs and decision trees; open-source libraries of this kind include Auto-sklearn and TPOT. The other is neural architecture search (NAS). We summarize NAS from two aspects: the network structures it searches over, and the search strategy.
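For the traditional-ML branch, typical usage of one of these open-source libraries looks roughly like the TPOT sketch below. The argument names follow TPOT's documented interface, but exact defaults vary across versions, and the dataset and budget are illustrative.

```python
# Minimal sketch of model generation with TPOT (pip install tpot).
# generations / population_size are small illustrative values.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from tpot import TPOTClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

tpot = TPOTClassifier(generations=5, population_size=20, verbosity=2, random_state=42)
tpot.fit(X_train, y_train)             # evolutionary search over sklearn pipelines
print(tpot.score(X_test, y_test))      # accuracy of the best pipeline found
tpot.export("best_pipeline.py")        # export the winning pipeline as Python code
```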

1. Network Structure

1) Entire structure:

Methods of this type generate a complete network structure directly. They have obvious drawbacks: the architecture search space is very large, and the generated structures lack transferability and flexibility (a toy sketch of such a layer-by-layer search space follows the references below).

  • B. Zoph and Q. V. Le, “Neural architecture search with reinforcement learning.” [Online]. Available: http://arxiv.org/abs/1611.01578
  • H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, “Efficient neural architecture search via parameter sharing,” in ICML. [Online]. Available: http://arxiv.org/abs/1802.03268
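The toy sketch below illustrates what "searching over entire structures" means: each candidate is a full chain of per-layer choices, so the number of candidates grows exponentially with depth. The operation set and depth are illustrative, not taken from any of the papers above.

```python
# Toy sketch of an "entire structure" search space: every candidate is a full
# chain of per-layer operation choices, so the space grows exponentially with depth.
import random

OPS = ["conv3x3", "conv5x5", "sep_conv3x3", "maxpool3x3", "identity"]
DEPTH = 20

def sample_entire_architecture():
    """Sample one complete network description, layer by layer."""
    return [random.choice(OPS) for _ in range(DEPTH)]

print(sample_entire_architecture())
print(f"candidates in the space: {len(OPS) ** DEPTH:.3e}")
```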

2) Cell-based structure:

Cell-based design was proposed to address the problems of searching for entire structures. As shown in the figure below, once a cell has been found, the final network is obtained by stacking several copies of it. The search space is thus reduced from the whole network to a much smaller cell, and the network can be changed simply by adjusting the number of cells. However, this approach also has an obvious problem: the number of cells and the way they are connected are not determined by the search, and most current methods set them based on human experience.
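The PyTorch sketch below shows the stacking idea: a single searched cell is defined once and repeated N times, so depth is controlled only by the number of cells. The operations inside the cell are placeholders, not a cell found by any particular NAS method.

```python
# Minimal PyTorch sketch of a cell-based architecture: one cell, stacked N times.
# The operations inside SearchedCell stand in for whatever the search would find.
import torch
import torch.nn as nn

class SearchedCell(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.op1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.op2 = nn.Conv2d(channels, channels, 5, padding=2)

    def forward(self, x):
        return torch.relu(self.op1(x) + self.op2(x))   # a tiny two-branch cell

class CellStackedNet(nn.Module):
    def __init__(self, channels=32, num_cells=8, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.cells = nn.Sequential(*[SearchedCell(channels) for _ in range(num_cells)])
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        x = self.cells(self.stem(x))
        x = x.mean(dim=(2, 3))                          # global average pooling
        return self.head(x)

net = CellStackedNet(num_cells=8)                       # depth set by num_cells alone
print(net(torch.randn(2, 3, 32, 32)).shape)             # torch.Size([2, 10])
```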

  • H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, “Efficient neural architecture search via parameter sharing,” in ICML. [Online]. Available: http://arxiv.org/abs/1802.03268
  • B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, “Learning transferable architectures for scalable image recognition.” [Online]. Available: http://arxiv.org/abs/1707.07012
  • Z. Zhong, J. Yan, W. Wu, J. Shao, and C.-L. Liu, “Practical block-wise neural network architecture generation.” [Online]. Available: http://arxiv.org/abs/1708.05552
  • B. Baker, O. Gupta, N. Naik, and R. Raskar, “Designing neural network architectures using reinforcement learning,” in ICLR. [Online]. Available: http://arxiv.org/abs/1611.02167
  • E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. Le, and A. Kurakin, “Large-scale evolution of image classifiers.” [Online]. Available: http://arxiv.org/abs/1703.01041
  • E. Real, A. Aggarwal, Y. Huang, and Q. V. Le, “Regularized evolution for image classifier architecture search.” [Online]. Available: http://arxiv.org/abs/1802.01548

3) Hierarchical structure:

Unlike the above approaches, which connect cells in a chain, the hierarchical approach uses the cells generated in the previous step as the basic building blocks of the next step, obtaining the final network structure iteratively. As shown in the figure below, in (a) the three items on the left are the most basic operations, and the item on the right is a cell built from these basic operations; in (b) the left side shows several cells generated in the previous step, which are combined according to some strategy into a higher-level cell.
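A very small sketch of this recursive assembly is given below. For brevity, motifs are represented as chains of functions rather than the small graphs used in the paper, and the primitive operations are purely illustrative.

```python
# Toy sketch of hierarchical assembly: a higher-level motif is built by composing
# lower-level motifs. Primitives and the random compositions are illustrative only.
import random

PRIMITIVES = [lambda x: x, lambda x: 2 * x, lambda x: x + 1]   # level-1 motifs

def compose(motifs):
    """Build a level-(l+1) motif by applying the chosen level-l motifs in order."""
    def motif(x):
        for m in motifs:
            x = m(x)
        return x
    return motif

level2 = [compose(random.choices(PRIMITIVES, k=3)) for _ in range(4)]  # level-2 motifs
network = compose(random.choices(level2, k=2))                         # the final network

print(network(1))
```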

  • H. Liu, K. Simonyan, O. Vinyals, C. Fernando, and K. Kavukcuoglu, “Hierarchical representations for efficient architecture search,” in ICLR, p. 13.

4) Network morphism-based structure:

The usual design procedure is to design a network, train it, and check its performance on the validation set; if the performance is poor, a new network is designed from scratch. Clearly this wastes a great deal of work and therefore a great deal of time. Network morphism-based methods instead modify an existing network, so they largely preserve the strengths of the original network, and their function-preserving transformations guarantee that the new network can reproduce the original one, which means its performance is at least no worse than that of the original network.
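To see what "function-preserving" means, here is a tiny Net2Net-style widening sketch in NumPy: one hidden unit is duplicated and its outgoing weights are split between the original and the copy, so the widened network computes exactly the same function. The layer sizes are illustrative, biases are omitted, and this is a simplification of the transformations in the papers below.

```python
# Minimal NumPy sketch of a function-preserving "Net2Wider" step:
# duplicate one hidden unit and halve the outgoing weights of the original and
# the copy, so the widened network computes exactly the same function.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))          # a batch of inputs
W1 = rng.normal(size=(8, 16))        # input -> hidden
W2 = rng.normal(size=(16, 10))       # hidden -> output

def forward(W1, W2):
    h = np.maximum(x @ W1, 0)        # ReLU hidden layer
    return h @ W2

j = 3                                                      # unit to replicate
W1_wide = np.concatenate([W1, W1[:, j:j + 1]], axis=1)     # copy unit j
W2_wide = np.concatenate([W2, W2[j:j + 1, :]], axis=0)     # copy its outgoing row
W2_wide[j, :] /= 2                   # split the contribution between the
W2_wide[-1, :] /= 2                  # original unit and its copy

print(np.allclose(forward(W1, W2), forward(W1_wide, W2_wide)))  # True
```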

  • T. Chen, I. Goodfellow, and J. Shlens, “Net2Net: Accelerating learning via knowledge transfer,” arXiv preprint arXiv:1511.05641, 2015.
  • T. Elsken, J. H. Metzen, and F. Hutter, “Efficient multi-objective neural architecture search via Lamarckian evolution.” [Online]. Available: http://arxiv.org/abs/1804.09081
  • H. Cai, T. Chen, W. Zhang, Y. Yu, and J. Wang, “Efficient architecture search by network transformation,” in Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
2. Search Strategy

The second aspect of NAS is the search strategy. The works below are grouped by the main families of strategies: grid and random search, reinforcement learning, evolutionary algorithms, Bayesian optimization, and gradient descent (a minimal random-search sketch follows the last group of references below).

1) Grid and random search:

  • H. H. Hoos, Automated Algorithm Configuration and Parameter Tuning, 2011.
  • I. Czogiel, K. Luebke, and C. Weihs, Response surface methodology for optimizing hyper parameters. Universitatsbibliothek Dortmund, 2006.
  • C.-W. Hsu, C.-C. Chang, C.-J. Lin et al., “A practical guide to support vector classification,” 2003.
  • J. Y. Hesterman, L. Caucci, M. A. Kupinski, H. H. Barrett, and L. R. Furenlid, “Maximum-likelihood estimation with a contracting-grid search algorithm,” IEEE Transactions on Nuclear Science, vol. 57, no. 3, pp. 1077–1084, 2010.
  • J. Bergstra and Y. Bengio, “Random search for hyper-parameter optimization,” p. 25.
  • H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio, “An empirical evaluation of deep architectures on problems with many factors of variation,” in Proceedings of the 24th International Conference on Machine Learning. ACM, 2007, pp. 473–480.
  • L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, “Hyperband: A novel bandit-based approach to hyperparameter optimization.” [Online]. Available: http://arxiv.org/abs/1603.06560
2) Reinforcement learning:

  • B. Zoph and Q. V. Le, “Neural architecture search with reinforcement learning.” [Online]. Available: http://arxiv.org/abs/1611.01578
  • B. Baker, O. Gupta, N. Naik, and R. Raskar, “Designing neural network architectures using reinforcement learning,” in ICLR. [Online]. Available: http://arxiv.org/abs/1611.02167
  • H. Pham, M. Y. Guan, B. Zoph, Q. V. Le, and J. Dean, “Efficient neural architecture search via parameter sharing,” in ICML. [Online]. Available: http://arxiv.org/abs/1802.03268
  • B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, “Learning transferable architectures for scalable image recognition.” [Online]. Available: http://arxiv.org/abs/1707.07012
  • Z. Zhong, J. Yan, W. Wu, J. Shao, and C.-L. Liu, “Practical block-wise neural network architecture generation.” [Online]. Available: http://arxiv.org/abs/1708.05552
3) Evolutionary algorithms:

  • L. Xie and A. Yuille, “Genetic CNN,” in ICCV. [Online]. Available: http://arxiv.org/abs/1703.01513
  • M. Suganuma, S. Shirakawa, and T. Nagao, “A genetic programming approach to designing convolutional neural network architectures.” [Online]. Available: http://arxiv.org/abs/1704.00764
  • E. Real, S. Moore, A. Selle, S. Saxena, Y. L. Suematsu, J. Tan, Q. Le, and A. Kurakin, “Large-scale evolution of image classifiers.” [Online]. Available: http://arxiv.org/abs/1703.01041
  • K. O. Stanley and R. Miikkulainen, “Evolving neural networks through augmenting topologies,” vol. 10, no. 2, pp. 99–127. [Online]. Available: http://www.mitpressjournals.org/doi/10.1162/106365602320169811
  • T. Elsken, J. H. Metzen, and F. Hutter, “Efficient multi-objective neural architecture search via Lamarckian evolution.” [Online]. Available: http://arxiv.org/abs/1804.09081
4) Bayesian optimization:

  • J. Gonzalez, “GPyOpt: A Bayesian optimization framework in Python,” http://github.com/SheffieldML/GPyOpt, 2016.
  • J. Snoek, H. Larochelle, and R. P. Adams, “Practical Bayesian optimization of machine learning algorithms,” in Advances in Neural Information Processing Systems, 2012, pp. 2951–2959.
  • S. Falkner, A. Klein, and F. Hutter, “BOHB: Robust and efficient hyperparameter optimization at scale,” p. 10.
  • F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Sequential model-based optimization for general algorithm configuration,” in Learning and Intelligent Optimization, C. A. C. Coello, Ed. Springer Berlin Heidelberg, vol. 6683, pp. 507–523. [Online]. Available: http://link.springer.com/10.1007/978-3-642-25566-3_40
  • J. Bergstra, D. Yamins, and D. D. Cox, “Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures,” p. 9.
  • A. Klein, S. Falkner, S. Bartels, P. Hennig, and F. Hutter, “Fast Bayesian optimization of machine learning hyperparameters on large datasets.” [Online]. Available: http://arxiv.org/abs/1605.07079
5) Gradient descent:

  • H. Liu, K. Simonyan, and Y. Yang, “DARTS: Differentiable architecture search.” [Online]. Available: http://arxiv.org/abs/1806.09055
  • S. Saxena and J. Verbeek, “Convolutional neural fabrics,” in Advances in Neural Information Processing Systems, 2016, pp. 4053–4061.
  • K. Ahmed and L. Torresani, “Connectivity learning in multi-branch networks,” arXiv preprint arXiv:1709.09582, 2017.
  • R. Shin, C. Packer, and D. Song, “Differentiable neural network architecture search,” 2018.
  • D. Maclaurin, D. Duvenaud, and R. Adams, “Gradient-based hyperparameter optimization through reversible learning,” in International Conference on Machine Learning, 2015, pp. 2113–2122.
  • F. Pedregosa, “Hyperparameter optimization with approximate gradient,” arXiv preprint arXiv:1602.02355, 2016.
  • H. Cai, L. Zhu, and S. Han, “ProxylessNAS: Direct neural architecture search on target task and hardware,” 2019.
  • A. Hundt, V. Jain, and G. D. Hager, “sharpDARTS: Faster and more accurate differentiable architecture search,” Tech. Rep. [Online]. Available: https://arxiv.org/pdf/1903.09900.pdf
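As a concrete example of the cheapest strategy listed above, here is a deliberately minimal random-search sketch over a small hyperparameter space; the search space, model, and trial budget are illustrative assumptions rather than a setup from any of the papers.

```python
# Minimal sketch of random search over hyperparameters with scikit-learn.
# The search space, model, and trial budget are illustrative.
import random

from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

def sample_config():
    """Draw one hyperparameter configuration at random (log-uniform)."""
    return {"C": 10 ** random.uniform(-2, 2), "gamma": 10 ** random.uniform(-4, 0)}

best_score, best_cfg = -1.0, None
for _ in range(20):                                   # fixed trial budget
    cfg = sample_config()
    score = cross_val_score(SVC(**cfg), X, y, cv=3).mean()
    if score > best_score:
        best_score, best_cfg = score, cfg

print(best_cfg, best_score)
```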

IV. Model Evaluation

Once a model structure has been designed, it needs to be evaluated. The simplest approach is to train the model to convergence and then judge it by its results on the validation set, but this requires a great deal of time and computational resources. Many algorithms have therefore been proposed to speed up model evaluation, summarized as follows (a minimal low-fidelity sketch is given after the first group of references):

1. Low-fidelity evaluation

  • A. Klein, S. Falkner, S. Bartels, P. Hennig, and F. Hutter, “Fast Bayesian optimization of machine learning hyperparameters on large datasets.” [Online]. Available: http://arxiv.org/abs/1605.07079
  • B. Zoph, V. Vasudevan, J. Shlens, and Q. V. Le, “Learning transferable architectures for scalable image recognition.” [Online]. Available: http://arxiv.org/abs/1707.07012
  • E. Real, A. Aggarwal, Y. Huang, and Q. V. Le, “Regularized evolution for image classifier architecture search.” [Online]. Available: http://arxiv.org/abs/1802.01548
  • A. Zela, A. Klein, S. Falkner, and F. Hutter, “Towards automated deep learning: Efficient joint neural architecture and hyperparameter search.” [Online]. Available: http://arxiv.org/abs/1807.06906
  • Y.-q. Hu, Y. Yu, W.-w. Tu, Q. Yang, Y. Chen, and W. Dai, “Multi-Fidelity Automatic Hyper-Parameter Tuning via Transfer Series Expansion,” p. 8, 2019.
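The sketch below shows the low-fidelity idea in its simplest form: rank candidate models by training them briefly on a small subset of the data instead of training to convergence. The dataset, subset size, and iteration budget are illustrative assumptions.

```python
# Minimal sketch of low-fidelity evaluation: rank candidate models by training
# briefly on a small data subset instead of training to convergence.
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

def low_fidelity_score(model, n_samples=300, n_iter=5):
    """Cheap proxy: few samples and few passes over the data."""
    model.set_params(max_iter=n_iter)
    model.fit(X_train[:n_samples], y_train[:n_samples])
    return model.score(X_val, y_val)

candidates = {"strong_reg": SGDClassifier(alpha=1e-2, random_state=0),
              "weak_reg": SGDClassifier(alpha=1e-6, random_state=0)}
for name, model in candidates.items():
    print(name, low_fidelity_score(model))
```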
2. Transfer learning

  • C. Wong, N. Houlsby, Y. Lu, and A. Gesmundo, “Transfer learning with neural AutoML,” in Advances in Neural Information Processing Systems, 2018, pp. 8356–8365.
  • T. Wei, C. Wang, Y. Rui, and C. W. Chen, “Network morphism,” in International Conference on Machine Learning, 2016, pp. 564–572.
  • T. Chen, I. Goodfellow, and J. Shlens, “Net2Net: Accelerating learning via knowledge transfer,” arXiv preprint arXiv:1511.05641, 2015.
3. Surrogate models

  • K. Eggensperger, F. Hutter, H. H. Hoos, and K. Leyton-Brown, “Surrogate benchmarks for hyperparameter optimization,” in MetaSel@ECAI, 2014, pp. 24–31.
  • C. Wang, Q. Duan, W. Gong, A. Ye, Z. Di, and C. Miao, “An evaluation of adaptive surrogate modeling based optimization with two benchmark problems,” Environmental Modelling & Software, vol. 60, pp. 167–179, 2014.
  • K. Eggensperger, F. Hutter, H. Hoos, and K. Leyton-Brown, “Efficient benchmarking of hyperparameter optimizers via surrogates,” in Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.
  • K. K. Vu, C. D’Ambrosio, Y. Hamadi, and L. Liberti, “Surrogate-based methods for black-box optimization,” International Transactions in Operational Research, vol. 24, no. 3, pp. 393–424, 2017.
  • C. Liu, B. Zoph, M. Neumann, J. Shlens, W. Hua, L.-J. Li, L. Fei-Fei, A. Yuille, J. Huang, and K. Murphy, “Progressive neural architecture search.” [Online]. Available: http://arxiv.org/abs/1712.00559
4. Early stopping (a minimal sketch follows these references)

  • A. Klein, S. Falkner, J. T. Springenberg, and F. Hutter, “Learning curve prediction with Bayesian neural networks,” 2016.
  • B. Deng, J. Yan, and D. Lin, “Peephole: Predicting network performance before training,” arXiv preprint arXiv:1712.03351, 2017.
  • T. Domhan, J. T. Springenberg, and F. Hutter, “Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves,” in Twenty-Fourth International Joint Conference on Artificial Intelligence, 2015.
  • M. Mahsereci, L. Balles, C. Lassner, and P. Hennig, “Early stopping without a validation set,” arXiv preprint arXiv:1703.09580, 2017.
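The learning-curve methods above try to predict final performance from the first few epochs; the sketch below shows the simpler, patience-based variant of the same idea. The training and validation callbacks are placeholders for a real training loop.

```python
# Minimal sketch of early stopping: abandon a candidate whose validation accuracy
# has not improved for `patience` epochs. train_one_epoch / validate are placeholders.
def evaluate_with_early_stopping(model, train_one_epoch, validate,
                                 max_epochs=100, patience=5):
    best_acc, epochs_without_improvement = 0.0, 0
    for epoch in range(max_epochs):
        train_one_epoch(model)
        acc = validate(model)
        if acc > best_acc:
            best_acc, epochs_without_improvement = acc, 0
        else:
            epochs_without_improvement += 1
        if epochs_without_improvement >= patience:
            break                     # stop early: this candidate looks unpromising
    return best_acc
```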

The figure below summarizes the search time and accuracy of networks found by different NAS algorithms on CIFAR-10. Compared with methods based on reinforcement learning and evolutionary algorithms, methods based on gradient descent and random search can find well-performing models in much less time.

V. Summary

From this summary of recent progress in AutoML, we find the following problems still worth considering and solving:

1. A complete pipeline system

Existing open-source AutoML libraries such as TPOT and Auto-sklearn cover only one or a few steps of the pipeline; none of them truly automates the whole process. How to integrate all of the above stages into one fully automated system is a direction that requires continued research.

2. Interpretability

One weakness of deep networks is their poor interpretability, and AutoML has the same problem when searching for networks. There is still no rigorous scientific explanation of why certain operations perform better. For cell-based networks, for example, it is hard to explain why simply stacking cells yields a well-performing architecture; likewise, why the weight sharing proposed in ENAS works is also worth investigating.

3. Reproducibility

Most AutoML studies only report their results and rarely open-source their complete code; some provide only the final discovered architecture without the code for the search process. In addition, many proposed methods are hard to reproduce, partly because the search uses many tricks that are not described in detail in the paper, and partly because architecture search is inherently stochastic. How to ensure the reproducibility of AutoML techniques is therefore another direction for future work.

4. Flexible encoding schemes

Summarizing existing NAS methods, we find that the search space of every method is designed on the basis of human experience, so the resulting architectures never escape human-designed structures. For example, today's NAS cannot invent a new basic operation analogous to convolution out of nothing, nor can it generate a structure as complex as the Transformer. How to define a more general and more flexible encoding of network architectures is therefore another problem worth studying.

5. Lifelong learning

Most AutoML systems need to design a network for a specific dataset and task and generalize poorly to new data. Humans, by contrast, can recognize previously unseen cats and dogs after learning from only some cat and dog photos. A robust AutoML system should therefore be capable of lifelong learning: retaining what it has learned from old data while continuing to learn from new data.