Artificial Intelligence Preprint | 2019-05-27



Metatrace Actor-Critic: Online Step-size Tuning by Meta-gradient Descent for Reinforcement Learning Control (1805.04514v2)

Kenny Young, Baoxiang Wang, Matthew E. Taylor

2018-05-10

Reinforcement learning (RL) has had many successes in both "deep" and "shallow" settings. In both cases, significant hyperparameter tuning is often required to achieve good performance. Furthermore, when nonlinear function approximation is used, non-stationarity in the state representation can lead to learning instability. A variety of techniques exist to combat this, most notably large experience replay buffers or the use of multiple parallel actors. These techniques come at the cost of moving away from the online RL problem as it is traditionally formulated (i.e., a single agent learning online without maintaining a large database of training examples). Meta-learning can potentially help with both of these issues by tuning hyperparameters online and allowing the algorithm to adjust more robustly to non-stationarity in a problem. This paper applies meta-gradient descent to derive a set of step-size tuning algorithms specifically for online RL control with eligibility traces. Our novel technique, Metatrace, makes use of an eligibility trace analogous to methods like TD(λ). We explore tuning both a single scalar step-size and a separate step-size for each learned parameter. We evaluate Metatrace first for control with linear function approximation in the classic mountain car problem and then in a noisy, non-stationary version. Finally, we apply Metatrace for control with nonlinear function approximation in 5 games in the Arcade Learning Environment, where we explore how it impacts learning speed and robustness to the initial step-size choice. Results show that the meta-step-size parameter of Metatrace is easy to set, that Metatrace can speed learning, and that Metatrace can allow an RL algorithm to deal with non-stationarity in the learning task.
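The abstract leaves the update rule to the paper. For intuition, here is a minimal NumPy sketch of IDBD-style meta-gradient step-size adaptation for a linear predictor, the supervised ancestor of this family of methods; Metatrace itself extends the idea to actor-critic control with eligibility traces, and the names and defaults below are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def idbd_update(w, beta, h, x, y, mu=0.01):
    """One IDBD step (Sutton, 1992): per-weight step-sizes
    alpha_i = exp(beta_i) are themselves adapted by meta-gradient
    descent on the squared prediction error."""
    delta = y - w @ x                  # prediction error
    beta += mu * delta * x * h         # meta-gradient step on log step-sizes
    alpha = np.exp(beta)
    w += alpha * delta * x             # LMS step with the adapted step-sizes
    # h traces the cumulative effect of past step-size changes on w.
    h = h * np.clip(1.0 - alpha * x * x, 0.0, None) + alpha * delta * x
    return w, beta, h
```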

Learning to Reason: Leveraging Neural Networks for Approximate DNF Counting (1904.02688v2)

Ralph Abboud, Ismail Ilkan Ceylan, Thomas Lukasiewicz

2019-04-04

Weighted model counting (WMC) has emerged as a prevalent approach for probabilistic inference. In its most general form, WMC is #P-hard and, as a result, solving real-world WMC instances is intractable. Weighted DNF counting (weighted #DNF) is a special case where approximations with probabilistic guarantees can be obtained tractably, but this requires O(mn) time, where m denotes the number of variables and n the number of clauses of the input DNF. In this paper, we propose a novel approach for weighted #DNF that combines approximate model counting with deep learning and accurately approximates model counts in just O(m + n) time. We conduct experiments to validate our method, and show that our model learns and generalizes very well to large-scale #DNF instances.
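For context, the O(mn)-time approximation with probabilistic guarantees descends from the classic Karp-Luby-Madras Monte Carlo estimator. A minimal, unweighted sketch of that baseline (the paper's neural model replaces this sampling loop entirely):

```python
import random

def klm_dnf_count(clauses, n_vars, samples=10000):
    """Karp-Luby-Madras estimator for the number of satisfying
    assignments of a DNF.  Each clause is a list of signed ints:
    +v means variable v is true, -v means it is false."""
    weights = [2 ** (n_vars - len(c)) for c in clauses]  # |sat(C_j)|
    total = sum(weights)
    sat = lambda c, a: all(a[abs(l)] == (l > 0) for l in c)
    hits = 0
    for _ in range(samples):
        # Sample clause j proportionally to |sat(C_j)|, then a uniform
        # satisfying assignment of that clause.
        j = random.choices(range(len(clauses)), weights=weights)[0]
        a = {abs(l): (l > 0) for l in clauses[j]}
        for v in range(1, n_vars + 1):
            a.setdefault(v, random.random() < 0.5)
        # Count the sample only if j is the first clause it satisfies,
        # so each satisfying assignment is counted exactly once.
        if all(not sat(clauses[k], a) for k in range(j)):
            hits += 1
    return total * hits / samples
```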

Large Batch Optimization for Deep Learning: Training BERT in 76 minutes (1904.00962v3)

Yang You, Jing Li, Sashank Reddi, Jonathan Hseu, Sanjiv Kumar, Srinadh Bhojanapalli, Xiaodan Song, James Demmel, Cho-Jui Hsieh

2019-04-01

Training large deep neural networks on massive datasets is very challenging. One promising approach to tackle this issue is through the use of large-batch stochastic optimization. However, our understanding of this approach in the context of deep learning is still very limited. Furthermore, the current approaches in this direction are heavily hand-tuned. To this end, we first study a general adaptation strategy to accelerate training of deep neural networks using large minibatches. Using this strategy, we develop a new layer-wise adaptive large-batch optimization technique called LAMB. We also provide a formal convergence analysis of LAMB, as well as of the previously published layer-wise optimizer LARS, showing convergence to a stationary point in general nonconvex settings. Our empirical results demonstrate the superior performance of LAMB for BERT and ResNet-50 training. In particular, for BERT training, our optimization technique enables the use of very large batch sizes of 32868, thereby requiring just 8599 iterations to train (as opposed to 1 million iterations in the original paper). By increasing the batch size to the memory limit of a TPUv3 pod, BERT training time can be reduced from 3 days to 76 minutes. Finally, we also demonstrate that LAMB outperforms previous large-batch training algorithms for ResNet-50 on ImageNet, obtaining state-of-the-art performance in just a few minutes.
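The layer-wise trust-ratio idea at LAMB's core fits in a few lines; a minimal NumPy sketch of one update for a single layer's weights, with hyperparameter defaults that are illustrative rather than the paper's tuned values:

```python
import numpy as np

def lamb_step(w, g, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-6, wd=0.01):
    """One LAMB update: Adam-style moments, then the step is rescaled
    by the layer-wise trust ratio ||w|| / ||update||."""
    m[:] = b1 * m + (1 - b1) * g
    v[:] = b2 * v + (1 - b2) * g * g
    m_hat = m / (1 - b1 ** t)                      # bias correction
    v_hat = v / (1 - b2 ** t)
    update = m_hat / (np.sqrt(v_hat) + eps) + wd * w
    w_norm, u_norm = np.linalg.norm(w), np.linalg.norm(update)
    trust = w_norm / u_norm if w_norm > 0 and u_norm > 0 else 1.0
    w -= lr * trust * update
    return w
```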

Continuous DR-submodular Maximization: Structure and Algorithms (1711.02515v4)

An Bian, Kfir Y. Levy, Andreas Krause, Joachim M. Buhmann

2017-11-04

DR-submodular continuous functions are important objectives with wide real-world applications, spanning MAP inference in determinantal point processes (DPPs) and mean-field inference for probabilistic submodular models, amongst others. DR-submodularity captures a subclass of non-convex functions that enables both exact minimization and approximate maximization in polynomial time. In this work we study the problem of maximizing non-monotone DR-submodular continuous functions under general down-closed convex constraints. We start by investigating geometric properties that underlie such objectives; e.g., we prove a strong relation between (approximately) stationary points and the global optimum. These properties are then used to devise two optimization algorithms with provable guarantees. Concretely, we first devise a "two-phase" algorithm with an approximation guarantee. This algorithm allows the use of existing methods for finding (approximately) stationary points as a subroutine, thus harnessing recent progress in non-convex optimization. Then we present a non-monotone Frank-Wolfe variant with an approximation guarantee and a sublinear convergence rate. Finally, we extend our approach to a broader class of generalized DR-submodular continuous functions, which captures a wider spectrum of applications. Our theoretical findings are validated on synthetic and real-world problem instances.
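The paper's Frank-Wolfe variant adds constraint shrinking and particular step-size schedules; the skeleton it builds on is the standard conditional-gradient loop, sketched here assuming user-supplied gradient and linear-maximization oracles (both names are placeholders):

```python
def frank_wolfe_max(grad_f, lmo, x0, iters=100):
    """Generic Frank-Wolfe loop for maximization over a convex set P.
    grad_f(x) returns the gradient; lmo(g) returns argmax_{v in P} <v, g>."""
    x = x0.copy()
    gamma = 1.0 / iters             # fixed step size, as in
    for _ in range(iters):          # continuous-greedy-style schemes
        v = lmo(grad_f(x))          # best feasible ascent vertex
        x = x + gamma * (v - x)     # convex combination stays inside P
    return x
```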

Teaching AI, Ethics, Law and Policy (1904.12470v2)

Asher Wilk

2019-04-29

Cyberspace and the development of intelligent systems using Artificial Intelligence (AI) have created new challenges for computer professionals, data scientists, regulators, and policy makers. For example, self-driving cars raise new technical, ethical, legal, and policy issues. This paper proposes a course, Computers, Ethics, Law, and Public Policy, and suggests a curriculum for it. It presents ethical, legal, and public policy issues relevant to building and using software and artificial intelligence, and describes ethical principles and values relevant to AI systems.

Devil in the Detail: Attack Scenarios in Industrial Applications (1905.10292v1)

Simon D. Duque Anton, Alexander Hafner, Hans Dieter Schotten

2019-05-24

In the past years, industrial networks have become increasingly interconnected and opened to private or public networks. This leads to an increase in efficiency and manageability, but also increases the attack surface. Industrial networks often consist of legacy systems that have not been designed with security in mind. In the last decade, an increase in attacks on cyber-physical systems was observed, with drastic consequences in the physical world. In this work, attack vectors on industrial networks are categorised. A real-world process is simulated, and attacks are then introduced into it. Finally, two machine-learning-based methods for time series anomaly detection are employed to detect the attacks. Matrix Profiles prove more successful at detecting the attacks than a predictive Long Short-Term Memory (LSTM) network.
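The abstract does not fix an implementation, but Matrix Profile anomaly detection is available off the shelf; a minimal sketch using the stumpy library, where the window length, the synthetic series, and the number of reported discords are assumptions:

```python
import numpy as np
import stumpy

m = 64                                  # sliding-window length (problem-specific)
ts = np.random.randn(10000)             # stand-in for one sensor reading series
mp = stumpy.stump(ts, m)                # matrix profile distances in column 0
profile = mp[:, 0].astype(float)
# Discords: windows whose nearest neighbour is farthest away.
discords = np.argsort(profile)[-5:]
print("candidate anomaly windows start at:", sorted(discords))
```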

Privacy-Preserving Obfuscation of Critical Infrastructure Networks (1905.09778v2)

Ferdinando Fioretto, Terrence W. K. Mak, Pascal Van Hentenryck

2019-05-23

The paper studies how to release data about a critical infrastructure network (e.g., the power network or a transportation network) without disclosing sensitive information that can be exploited by malevolent agents, while preserving the realism of the network. It proposes a novel obfuscation mechanism that combines several privacy-preserving building blocks with a bi-level optimization model to significantly improve accuracy. The obfuscation is evaluated for both realism and privacy properties on real energy and transportation networks. Experimental results show the obfuscation mechanism substantially reduces the potential damage of an attack exploiting the released data to harm the real network.
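The paper's bi-level mechanism is considerably more involved, but one of the standard privacy building blocks such mechanisms compose is Laplace perturbation of sensitive numeric attributes; a sketch, with the attribute names and scales as illustrative assumptions:

```python
import numpy as np

def laplace_obfuscate(values, sensitivity, epsilon, rng=None):
    """Add Laplace noise of scale sensitivity/epsilon, the standard
    epsilon-differential-privacy building block."""
    rng = rng or np.random.default_rng()
    return values + rng.laplace(0.0, sensitivity / epsilon, size=values.shape)

line_capacities = np.array([100.0, 250.0, 80.0])   # toy sensitive data
print(laplace_obfuscate(line_capacities, sensitivity=10.0, epsilon=1.0))
```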

From Here to There: Video Inbetweening Using Direct 3D Convolutions (1905.10240v1)

Yunpeng Li, Dominik Roblek, Marco Tagliasacchi

2019-05-24

We consider the problem of generating plausible and diverse video sequences when we are given only a start and an end frame. This task is also known as inbetweening, and it belongs to the broader area of stochastic video generation, which is generally approached by means of recurrent neural networks (RNNs). In this paper, we instead propose a fully convolutional model to generate video sequences directly in the pixel domain. We first obtain a latent video representation using a stochastic fusion mechanism that learns how to incorporate information from the start and end frames. Our model learns to produce this latent representation by progressively increasing the temporal resolution, and then decodes it in the spatiotemporal domain using 3D convolutions. The model is trained end-to-end by minimizing an adversarial loss. Experiments on several widely used benchmark datasets show that it is able to generate meaningful and diverse in-between video sequences, according to both quantitative and qualitative evaluations.
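The decoding stage described above is easy to sketch; a toy PyTorch decoder that doubles the temporal resolution and then the spatial resolution with transposed 3D convolutions (channel counts and depths here are illustrative assumptions, not the paper's architecture):

```python
import torch
import torch.nn as nn

decoder = nn.Sequential(
    nn.ConvTranspose3d(128, 64, kernel_size=(4, 3, 3),
                       stride=(2, 1, 1), padding=1),   # double T only
    nn.ReLU(inplace=True),
    nn.ConvTranspose3d(64, 32, kernel_size=4,
                       stride=2, padding=1),           # double T, H, W
    nn.ReLU(inplace=True),
    nn.Conv3d(32, 3, kernel_size=3, padding=1),        # map to RGB
)

latent = torch.randn(1, 128, 4, 16, 16)   # (batch, C, T, H, W)
print(decoder(latent).shape)              # torch.Size([1, 3, 16, 32, 32])
```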

Clustering with feature selection using alternating minimization, Application to computational biology (1711.02974v4)

Cyprien Gilet, Marie Deprez, Jean-Baptiste Caillau, Michel Barlaud

2017-11-08

This paper deals with unsupervised clustering with feature selection. The problem is to estimate both the labels and a sparse projection matrix of weights. To address this combinatorial non-convex problem while maintaining strict control on the sparsity of the matrix of weights, we propose an alternating minimization of the Frobenius norm criterion. We provide a new efficient algorithm named K-sparse which alternates k-means with projection-gradient minimization. The projection-gradient step is a method of splitting type, with exact projection on the ball to promote sparsity. The convergence of the gradient-projection step is addressed, and a preliminary analysis of the alternating minimization is made. The Frobenius norm criterion converges as the number of iterates in Algorithm K-sparse goes to infinity. Experiments on single-cell RNA sequencing datasets show that our method significantly improves on the results of PCA k-means, spectral clustering, SIMLR, and Sparcl, and achieves a relevant selection of genes. The complexity of K-sparse is linear in the number of samples (cells), so the method scales up to large datasets.
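A compact way to see the alternation: cluster the reweighted data, then take a projected-gradient step on the feature weights. This sketch uses a hard top-k threshold as a stand-in for the paper's exact ball projection, and the names and defaults are assumptions:

```python
import numpy as np
from sklearn.cluster import KMeans

def ksparse_cluster(X, k, n_feat, iters=10, lr=1e-3):
    """Alternate k-means on the weighted data with a projected-gradient
    step on feature weights w for the criterion ||X diag(w) - C||_F^2."""
    n, d = X.shape
    w = np.ones(d)
    for _ in range(iters):
        km = KMeans(n_clusters=k, n_init=10).fit(X * w)
        C = km.cluster_centers_[km.labels_]           # per-sample centers
        grad = 2.0 * np.sum((X * w - C) * X, axis=0)  # gradient in w
        w -= lr * grad
        keep = np.argsort(np.abs(w))[-n_feat:]        # sparsity projection:
        mask = np.zeros(d)                            # keep the n_feat largest
        mask[keep] = 1.0                              # weights, zero the rest
        w *= mask
    return km.labels_, w
```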

Diffusion and Auction on Graphs (1905.09604v2)

Bin Li, Dong Hao, Dengji Zhao, Makoto Yokoo

2019-05-23

Auctions are a common paradigm for resource allocation, a fundamental problem in human society. Existing research indicates that the two primary objectives, the seller's revenue and the allocation efficiency, are generally conflicting in auction design. For the first time, we expand the domain of the classic auction to a social graph and formally identify a new class of auction mechanisms on graphs. All mechanisms in this class are incentive-compatible and also encourage all buyers to diffuse the auction information to others, whereby both the seller's revenue and the allocation efficiency are significantly improved compared with the Vickrey auction. It is found that the recently proposed information diffusion mechanism is an extreme case of this new class with the lowest revenue. Our work could potentially inspire a new perspective on efficient and optimal auction design and could be applied to the prevalent online social and economic networks.
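For reference, the Vickrey baseline this class is compared against is tiny; the diffusion mechanisms extend it by also rewarding buyers who relay the auction across the social graph. A sketch of the baseline only:

```python
def vickrey(bids):
    """Second-price auction: the highest bidder wins and pays the
    second-highest bid (truthful bidding is a dominant strategy)."""
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1] if len(ranked) > 1 else 0.0
    return winner, price

print(vickrey({"alice": 5.0, "bob": 8.0, "carol": 3.0}))  # ('bob', 5.0)
```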


