Machine Learning Latest Submitted Preprints | 2019-03-05


Machine Learning


Optimistic Adaptive Acceleration for Optimization (1903.01435v1)

Jun-Kun Wang, Xiaoyun Li, Ping Li

2019-03-04

We consider a new variant of AMSGrad [RKK18], a popular adaptive gradient-based optimization algorithm widely used in training deep neural networks. Our new variant assumes that mini-batch gradients in consecutive iterations have some underlying structure, which makes the gradients sequentially predictable. By exploiting this predictability and ideas from the field of optimistic online learning, the new algorithm can accelerate convergence and enjoys a tighter regret bound. We conduct experiments on training various neural networks on several datasets to show that the proposed method speeds up convergence in practice.
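For intuition, here is a minimal NumPy sketch of the optimistic online-learning template the abstract alludes to, in the two-step form where the learner plays using a hint (here, the previous gradient) and then corrects with the true gradient. This is an illustrative sketch under those assumptions, not the paper's Optimistic-AMSGrad, which additionally applies AMSGrad's adaptive preconditioning:

```python
import numpy as np

def optimistic_ogd_step(z, hint, grad_fn, lr=0.1):
    """One round of optimistic online gradient descent.

    z       -- base iterate carried between rounds
    hint    -- a guess of the upcoming gradient (e.g. the last gradient)
    grad_fn -- callable returning the true gradient at the played point
    """
    x = z - lr * hint   # play optimistically, trusting the hint
    grad = grad_fn(x)   # observe the actual gradient
    z = z - lr * grad   # correct the base iterate with the truth
    return x, z, grad

# Toy quadratic loss: gradients change smoothly, so the hint is informative.
target = np.array([1.0, -2.0, 0.5])
z, hint = np.zeros(3), np.zeros(3)
for t in range(200):
    x, z, grad = optimistic_ogd_step(z, hint, lambda w: 2 * (w - target))
    hint = grad         # predict the next gradient with the last one
print(x)                # approaches `target`
```

When consecutive gradients are close, the hint is accurate and the optimistic step is nearly a free look-ahead, which is where the acceleration comes from.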

VideoFlow: A Flow-Based Generative Model for Video (1903.01434v1)

Manoj Kumar, Mohammad Babaeizadeh, Dumitru Erhan, Chelsea Finn, Sergey Levine, Laurent Dinh, Durk Kingma

2019-03-04

Generative models that can model and predict sequences of future events can, in principle, learn to capture complex real-world phenomena, such as physical interactions. In particular, learning predictive models of videos offers an especially appealing mechanism to enable a rich understanding of the physical world: videos of real-world interactions are plentiful and readily available, and a model that can predict future video frames can not only capture useful representations of the world, but can be useful in its own right, for problems such as model-based robotic control. However, a central challenge in video prediction is that the future is highly uncertain: a sequence of past observations of events can imply many possible futures. Although a number of recent works have studied probabilistic models that can represent uncertain futures, such models are either extremely expensive computationally (as in the case of pixel-level autoregressive models), or do not directly optimize the likelihood of the data. In this work, we propose a model for video prediction based on normalizing flows, which allows for direct optimization of the data likelihood, and produces high-quality stochastic predictions. To our knowledge, our work is the first to propose multi-frame video prediction with normalizing flows. We describe an approach for modeling the latent space dynamics, and demonstrate that flow-based generative models offer a viable and competitive approach to generative modeling of video.
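To see why flows allow direct likelihood optimization, here is a minimal NumPy sketch (not the VideoFlow architecture) of a single affine coupling layer: because the map is invertible with a triangular Jacobian, log p(x) is exact via the change-of-variables formula. The conditioner maps `shift` and `log_scale` are toy stand-ins for the model's networks:

```python
import numpy as np

def affine_coupling_forward(x, shift, log_scale):
    """Invertible affine coupling: transform x2 conditioned on x1.
    Returns the latent z and the exact log|det Jacobian|."""
    x1, x2 = np.split(x, 2)
    s, t = log_scale(x1), shift(x1)
    z2 = x2 * np.exp(s) + t
    return np.concatenate([x1, z2]), np.sum(s)

def log_likelihood(x, shift, log_scale):
    """Exact log p(x) under a standard-normal base density:
    change of variables gives log p(x) = log p(z) + log|det J|."""
    z, log_det = affine_coupling_forward(x, shift, log_scale)
    log_pz = -0.5 * np.sum(z**2) - 0.5 * len(z) * np.log(2 * np.pi)
    return log_pz + log_det

# Toy conditioners (fixed linear maps stand in for neural networks).
W_s, W_t = np.full((1, 1), 0.3), np.full((1, 1), -0.1)
x = np.array([0.5, 1.2])
print(log_likelihood(x, lambda h: W_t @ h, lambda h: W_s @ h))
```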

Contingency-Aware Exploration in Reinforcement Learning (1811.01483v3)

Jongwook Choi, Yijie Guo, Marcin Moczulski, Junhyuk Oh, Neal Wu, Mohammad Norouzi, Honglak Lee

2018-11-05

This paper investigates whether learning contingency-awareness and controllable aspects of an environment can lead to better exploration in reinforcement learning. To investigate this question, we consider an instantiation of this hypothesis evaluated on the Arcade Learning Environment (ALE). In this study, we develop an attentive dynamics model (ADM) that discovers controllable elements of the observations, which are often associated with the location of the character in Atari games. The ADM is trained in a self-supervised fashion to predict the actions taken by the agent. The learned contingency information is used as part of the state representation for exploration purposes. We demonstrate that combining an actor-critic algorithm with count-based exploration using our representation achieves impressive results on a set of Atari games that are notoriously challenging due to sparse rewards. For example, we report a state-of-the-art score of >11,000 points on Montezuma's Revenge without using expert demonstrations, explicit high-level information (e.g., RAM states), or supervisory data. Our experiments confirm that contingency-awareness is an extremely powerful concept for tackling exploration problems in reinforcement learning, and one that opens up interesting research questions for further investigation.
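A toy sketch of the count-based half of the recipe: once a model such as the paper's ADM exposes the controllable element of an observation (e.g., the character's discretized location), visit counts over that abstraction yield an exploration bonus added to the environment reward. The ADM itself does not fit in a few lines; `locate` below is a hypothetical stand-in:

```python
from collections import defaultdict
import math

class ContingencyCountBonus:
    """Count-based exploration bonus over an abstracted state.

    `locate` stands in for a learned attentive dynamics model and should
    map an observation to its controllable element, e.g. a discretized
    (x, y) position of the agent.
    """
    def __init__(self, locate, scale=0.1):
        self.locate = locate
        self.scale = scale
        self.counts = defaultdict(int)

    def bonus(self, observation):
        key = self.locate(observation)
        self.counts[key] += 1
        # Rarely visited abstract states earn a larger bonus.
        return self.scale / math.sqrt(self.counts[key])

# Usage inside a training loop (reward shaping):
explorer = ContingencyCountBonus(locate=lambda obs: (obs[0] // 10, obs[1] // 10))
obs, env_reward = (42, 7), 0.0
shaped_reward = env_reward + explorer.bonus(obs)
```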

Semi-supervised Deep Kernel Learning: Regression with Unlabeled Data by Minimizing Predictive Variance (1805.10407v4)

Neal Jean, Sang Michael Xie, Stefano Ermon

2018-05-26

Large amounts of labeled data are typically required to train deep learning models. For many real-world problems, however, acquiring additional data can be expensive or even impossible. We present semi-supervised deep kernel learning (SSDKL), a semi-supervised regression model based on minimizing predictive variance in the posterior regularization framework. SSDKL combines the hierarchical representation learning of neural networks with the probabilistic modeling capabilities of Gaussian processes. By leveraging unlabeled data, we show improvements on a diverse set of real-world regression tasks over supervised deep kernel learning and over semi-supervised methods such as virtual adversarial training (VAT) and mean teacher adapted for regression.
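For a feel for the objective, here is a minimal NumPy sketch of the unsupervised term: the posterior predictive variance of a Gaussian process at unlabeled inputs, which requires no labels and can be added, weighted, to the supervised loss. The paper feeds neural-network embeddings into the kernel; this sketch uses raw features and a fixed RBF kernel:

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def predictive_variance(X_labeled, X_unlabeled, noise=0.1):
    """GP posterior variance at unlabeled inputs (no labels needed):
    k(x*, x*) - k_*^T (K + noise*I)^{-1} k_*."""
    K = rbf_kernel(X_labeled, X_labeled) + noise * np.eye(len(X_labeled))
    K_star = rbf_kernel(X_unlabeled, X_labeled)
    prior_var = np.ones(len(X_unlabeled))   # k(x*, x*) = 1 for this RBF
    reduction = np.einsum('ij,ij->i', K_star, np.linalg.solve(K, K_star.T).T)
    return prior_var - reduction

X_l = np.random.randn(20, 5)   # labeled inputs (or embeddings of them)
X_u = np.random.randn(50, 5)   # unlabeled inputs
# SSDKL-style objective: supervised loss + alpha * mean unlabeled variance.
alpha = 0.5
unsupervised_term = alpha * predictive_variance(X_l, X_u).mean()
```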

Data Amplification: Instance-Optimal Property Estimation (1903.01432v1)

Yi Hao, Alon Orlitsky

2019-03-04

The best-known and most commonly used distribution-property estimation technique uses a plug-in estimator, with empirical frequency replacing the underlying distribution. We present novel linear-time-computable estimators that significantly "amplify" the effective amount of data available. For a large variety of distribution properties, including four of the most popular ones, and for every underlying distribution, they achieve the accuracy that the empirical-frequency plug-in estimators would attain using a logarithmic factor more samples. Specifically, for Shannon entropy and a broad class of properties including ℓ1-distance, the new estimators use n samples to achieve the accuracy attained by the empirical estimators with n log n samples. For support size and coverage, the new estimators use n samples to achieve the performance of empirical frequency with sample size n times the logarithm of the property value. Significantly strengthening the traditional min-max formulation, these results hold not only for the worst distributions, but for each and every underlying distribution. Furthermore, the logarithmic amplification factors are optimal. Experiments on a wide variety of distributions show that the new estimators outperform the previous state-of-the-art estimators designed for each specific property.
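For reference, the baseline being "amplified" is simple to state: a plug-in estimator evaluates the property on the empirical distribution. A minimal sketch for Shannon entropy (the paper's amplified estimators are considerably more involved):

```python
import math
from collections import Counter

def empirical_entropy(samples):
    """Plug-in estimator: Shannon entropy (in nats) of the empirical
    distribution of the observed samples."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

# With n samples this attains some accuracy; the paper's estimators
# match, with n samples, what the plug-in attains with n log n samples.
print(empirical_entropy("abracadabra"))
```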

Towards Deep Conversational Recommendations (1812.07617v2)

Raymond Li, Samira Kahou, Hannes Schulz, Vincent Michalski, Laurent Charlin, Chris Pal

2018-12-18

There has been growing interest in using neural networks and deep learning techniques to create dialogue systems. Conversational recommendation is an interesting setting for the scientific exploration of dialogue with natural language, as the associated discourse involves goal-driven dialogue that often transforms naturally into more free-form chat. This paper provides two contributions. First, until now there has been no publicly available large-scale dataset consisting of real-world dialogues centered around recommendations. To address this issue and to facilitate our exploration here, we have collected ReDial, a dataset consisting of over 10,000 conversations centered around the theme of providing movie recommendations. We make this data available to the community for further research. Second, we use this dataset to explore multiple facets of conversational recommendations. In particular, we explore new neural architectures, mechanisms, and methods suitable for composing conversational recommendation systems. Our dataset allows us to systematically probe model sub-components addressing different parts of the overall problem domain, ranging from sentiment analysis and cold-start recommendation generation to detailed aspects of how natural language is used in this setting in the real world. We combine such sub-components into a full-blown dialogue system and examine its behavior.

Individual Fairness in Hindsight (1812.04069v3)

Swati Gupta, Vijay Kamble

2018-12-10

Since many critical decisions impacting human lives are increasingly being made by algorithms, it is important to ensure that the treatment of individuals under such algorithms is demonstrably fair under reasonable notions of fairness. One compelling notion proposed in the literature is that of individual fairness (IF), which advocates that similar individuals should be treated similarly (Dwork et al. 2012). Originally proposed for offline decisions, this notion does not, however, account for temporal considerations relevant for online decision-making. In this paper, we extend the notion of IF to account for the time at which a decision is made, in settings where there exists a notion of conduciveness of decisions as perceived by the affected individuals. We introduce two definitions: (i) fairness-across-time (FT) and (ii) fairness-in-hindsight (FH). FT is the simplest temporal extension of IF, where treatment of individuals is required to be individually fair relative to the past as well as the future, while FH requires a one-sided notion of individual fairness that is defined relative to only past decisions. We show that these two definitions can have drastically different implications in the setting where the principal needs to learn the utility model. Linear regret relative to optimal individually fair decisions is inevitable under FT for non-trivial examples. On the other hand, we design a new algorithm, Cautious Fair Exploration (CaFE), which satisfies FH and achieves sub-linear regret guarantees for a broad range of settings. We establish lower bounds showing that these guarantees are order-optimal in the worst case. FH can thus be embedded as a primary safeguard against unfair discrimination in algorithmic deployments, without hindering the ability to make good decisions in the long run.
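As a concrete illustration (not the paper's CaFE algorithm), the one-sided nature of FH makes it checkable online: each new decision need only be individually fair relative to past decisions, here under a Lipschitz-style similarity constraint with a hypothetical distance function and constant:

```python
def fair_in_hindsight(x_new, decision_new, history, dist, L=1.0):
    """Check the one-sided FH constraint: the new decision must be
    individually fair relative to every *past* (individual, decision)
    pair, i.e. |d_new - d_past| <= L * dist(x_new, x_past)."""
    return all(abs(decision_new - d_past) <= L * dist(x_new, x_past)
               for x_past, d_past in history)

# Toy usage with scalar individuals and absolute-difference similarity.
history = [(0.2, 0.5), (0.8, 0.9)]
print(fair_in_hindsight(0.25, 0.52, history, dist=lambda a, b: abs(a - b)))
```

Unlike FT, nothing here constrains future decisions, which is what leaves room for cautious exploration while the utility model is still being learned.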

Clustering Time Series with Nonlinear Dynamics: A Bayesian Non-Parametric and Particle-Based Approach (1810.09920v4)

Alexander Lin, Yingzhuo Zhang, Jeremy Heng, Stephen A. Allsop, Kay M. Tye, Pierre E. Jacob, Demba Ba

2018-10-23

We propose a general statistical framework for clustering multiple time series that exhibit nonlinear dynamics into an a-priori-unknown number of sub-groups. Our motivation comes from neuroscience, where an important problem is to identify, within a large assembly of neurons, subsets that respond similarly to a stimulus or contingency. Upon modeling the multiple time series as the output of a Dirichlet process mixture of nonlinear state-space models, we derive a Metropolis-within-Gibbs algorithm for full Bayesian inference that alternates between sampling cluster assignments and sampling parameter values that form the basis of the clustering. The Metropolis step employs recent innovations in particle-based methods. We apply the framework to clustering time series acquired from the prefrontal cortex of mice in an experiment designed to characterize the neural underpinnings of fear.
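The full Metropolis-within-Gibbs sampler is involved, but the Dirichlet-process ingredient that leaves the number of clusters open-ended is compact. A minimal sketch of a Chinese-restaurant-process draw (the prior over assignments only; the actual sampler would additionally weight each candidate cluster by the time series' likelihood under that cluster's state-space model, estimated with particle methods):

```python
import random

def crp_assignments(n, alpha=1.0, seed=0):
    """Sample cluster assignments from a Chinese restaurant process:
    each item joins an existing cluster with probability proportional
    to its size, or opens a new one with probability prop. to alpha."""
    rng = random.Random(seed)
    sizes, labels = [], []
    for _ in range(n):
        weights = sizes + [alpha]                 # last slot = new cluster
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(sizes):
            sizes.append(0)
        sizes[k] += 1
        labels.append(k)
    return labels

print(crp_assignments(20))  # number of distinct labels is not fixed a priori
```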

A General Approach to Adding Differential Privacy to Iterative Training Procedures (1812.06210v2)

H. Brendan McMahan, Galen Andrew, Ulfar Erlingsson, Steve Chien, Ilya Mironov, Nicolas Papernot, Peter Kairouz

2018-12-15

In this work we address the practical challenges of training machine learning models on privacy-sensitive datasets by introducing a modular approach that minimizes changes to training algorithms, provides a variety of configuration strategies for the privacy mechanism, and isolates and simplifies the critical logic that computes the final privacy guarantees. A key challenge is that training algorithms often require estimating many different quantities (vectors) from the same set of examples: for example, gradients of different layers in a deep learning architecture, as well as metrics and batch normalization parameters. Each of these may have different properties, such as dimensionality, magnitude, and tolerance to noise. By extending previous work on the Moments Accountant for the subsampled Gaussian mechanism, we can provide privacy for such heterogeneous sets of vectors, while also structuring the approach to minimize software engineering challenges.
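The mechanism underneath such accounting can be sketched compactly: each per-example vector is clipped to a norm bound and the aggregate is perturbed with Gaussian noise; different quantities (layer gradients, metrics, normalization statistics) can use different bounds and noise scales. A minimal NumPy illustration, not the authors' library API:

```python
import numpy as np

def dp_aggregate(vectors, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip each per-example vector to `clip_norm`, sum, add Gaussian noise.

    `noise_multiplier` is the noise stddev as a multiple of the clip norm;
    the actual privacy guarantee is derived from it by an accountant
    (e.g. a moments accountant), not by this function.
    """
    rng = rng or np.random.default_rng(0)
    clipped = []
    for v in vectors:
        norm = np.linalg.norm(v)
        clipped.append(v * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return total + noise

# Each quantity (e.g. each layer's gradients) can get its own config.
per_example_grads = [np.random.randn(10) for _ in range(32)]
noisy_sum = dp_aggregate(per_example_grads, clip_norm=0.5, noise_multiplier=1.2)
```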

Database Alignment with Gaussian Features (1903.01422v1)

Osman Emre Dai, Daniel Cullina, Negar Kiyavash

2019-03-04

We consider the problem of aligning a pair of databases with jointly Gaussian features. We study two algorithms: complete database alignment via MAP estimation among all possible database alignments, and partial alignment via thresholding of log-likelihood ratios. We derive conditions on the mutual information between feature pairs, identifying the regimes where the algorithms are guaranteed to perform reliably and those where they cannot be expected to succeed.
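As a toy illustration of the thresholding approach, assuming standard normal features with known correlation rho between truly matched rows (the paper's setting and guarantees are more general), the per-pair log-likelihood ratio has a closed form and can be thresholded directly:

```python
import numpy as np

def gaussian_llr(x, y, rho):
    """Log-likelihood ratio of 'x and y are a correlated pair' versus
    'x and y are independent', for standard normal features with
    correlation `rho` under the matched hypothesis."""
    c = 1.0 - rho**2
    return (-0.5 * np.log(c)
            - (rho**2 * (x**2 + y**2) - 2 * rho * x * y) / (2 * c))

def partial_alignment(db_a, db_b, rho, threshold):
    """Declare (i, j) matched when the summed per-feature LLR clears
    the threshold; pairs below it are left unaligned."""
    matches = []
    for i, a in enumerate(db_a):
        for j, b in enumerate(db_b):
            if gaussian_llr(a, b, rho).sum() > threshold:
                matches.append((i, j))
    return matches

# Toy databases: row i of B is a noisy copy of row i of A.
rng = np.random.default_rng(1)
A = rng.standard_normal((5, 40))
B = 0.9 * A + np.sqrt(1 - 0.9**2) * rng.standard_normal((5, 40))
print(partial_alignment(A, B, rho=0.9, threshold=10.0))
```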


