3LC: Lightweight and Effective Traffic Compression for Distributed Machine Learning.

The performance and efficiency of distributed machine learning (ML) depends significantly on how long it takes for nodes to exchange state changes. Overly-aggressive attempts to reduce communication often sacrifice final model accuracy and necessitate additional ML techniques to compensate for this...

Direct Learning to Rank and Rerank.

Learning-to-rank techniques have proven to be extremely useful for prioritization problems, where we rank items in order of their estimated probabilities, and dedicate our limited resources to the top-ranked items. This work exposes a serious problem with the state of learning-to-rank algorithms, which...

A Study into the similarity in generator and discriminator in GAN architecture.

One popular generative model that has high-quality results is the Generative Adversarial Networks (GAN). This type of architecture consists of two separate networks that play against each other. The generator creates an output from the input noise that is given to it. The discriminator has the task of...

Physics and Human-Based Information Fusion for Improved Resident Space Object Tracking.

Maintaining a catalog of Resident Space Objects (RSOs) can be cast in a typical Bayesian multi-object estimation problem, where the various sources of uncertainty in the problem - the orbital mechanics, the kinematic states of the identified objects, the data sources, etc. - are modeled as random...

Generalization in Machine Learning via Analytical Learning Theory.

This paper introduces a novel measure-theoretic learning theory to analyze generalization behaviors of practical interest. The proposed learning theory has the following abilities: 1) to utilize the qualities of each learned representation on the path from raw inputs to outputs in...

Nonparametric Bayesian Sparse Graph Linear Dynamical Systems.

A nonparametric Bayesian sparse graph linear dynamical system (SGLDS) is proposed to model sequentially observed multivariate data. SGLDS uses the Bernoulli-Poisson link together with a gamma process to generate an infinite dimensional sparse random graph to model state transitions. Depending on...

Learning to Play with Intrinsically-Motivated Self-Aware Agents.

Infants are experts at playing, with an amazing ability to generate novel structured behaviors in unstructured environments that lack clear extrinsic reward signals. We seek to mathematically formalize these abilities using a neural network that implements curiosity-driven intrinsic motivation. Using...

Scaling-up Split-Merge MCMC with Locality Sensitive Sampling.

Split-Merge MCMC (Monte Carlo Markov Chain) is one of the essential and popular variants of MCMC for problems when an MCMC state consists of an unknown number of components or clusters. It is well known that state-of-the-art methods for split-merge MCMC do not scale well. Strategies for rapid...

Emergence of Structured Behaviors from Curiosity-Based Intrinsic Motivation.

Infants are experts at playing, with an amazing ability to generate novel structured behaviors in unstructured environments that lack clear extrinsic reward signals. We seek to replicate some of these abilities with a neural network that implements curiosity-driven intrinsic motivation. Using a...

Dual Extrapolation for Faster Lasso Solvers.

Convex sparsity-inducing regularizations are ubiquitous in high-dimension machine learning, but their non-differentiability requires the use of iterative solvers. To accelerate such solvers, state-of-the-art approaches consist in reducing the size of the optimization problem at hand. In the context...

Spectrally approximating large graphs with smaller graphs.

How does coarsening affect the spectrum of a general graph? We provide conditions such that the principal eigenvalues and eigenspaces of a coarsened and original graph Laplacian matrices are close. The achieved approximation is shown to depend on standard graph-theoretic properties, such as the degree...

Adversarial classification: An adversarial risk analysis approach.

Classification problems in security settings are usually contemplated as confrontations in which one or more adversaries try to fool a classifier to obtain a benefit. Most approaches to such adversarial classification problems have focused on game theoretical ideas with strong underlying common...

Subspace-Induced Gaussian Processes.

We present a new Gaussian process (GP) regression model where the covariance kernel is indexed or parameterized by a sufficient dimension reduction subspace of a reproducing kernel Hilbert space. The covariance kernel will be low-rank while capturing the statistical dependency of the response to the...

A Generative Deep Recurrent Model for Exchangeable Data.

We present a novel model architecture which leverages deep learning tools to perform exact Bayesian inference on sets of high dimensional, complex observations. Our model is provably exchangeable, meaning that the joint distribution over observations is invariant under permutation: this property lies at...

The Many Faces of Exponential Weights in Online Learning.

A standard introduction to online learning might place Online Gradient Descent at its center and then proceed to develop generalizations and extensions like Online Mirror Descent and second-order methods. Here we explore the alternative approach of putting exponential weights (EW) first. We show that...

Clipped Action Policy Gradient.

Many continuous control tasks have bounded action spaces and clip out-of-bound actions before execution. Policy gradient methods often optimize policies as if actions were not clipped. We propose clipped action policy gradient (CAPG) as an alternative policy gradient estimator that exploits...

Continual Lifelong Learning with Neural Networks: A Review.

Humans and animals have the ability to continually acquire and fine-tune knowledge throughout their lifespan. This ability is mediated by a rich set of neurocognitive functions that together contribute to the early development and experience-driven specialization of our sensorimotor skills...

Information Theoretic Co-Training.

This paper introduces an information theoretic co-training objective for unsupervised learning. We consider the problem of predicting the future. Rather than predict future sensations (image pixels or sound waves) we predict "hypotheses" to be confirmed by future sensations. More formally, we assume...

Emulating dynamic non-linear simulators using Gaussian processes.

In this paper, we examine the emulation of non-linear deterministic computer codes where the output is a time series, possibly multivariate. Such computer models simulate the evolution of some real-world phenomena over time, for example models of the climate or the functioning of the human brain. The...

Universal Hypothesis Testing with Kernels: Asymptotically Optimal Tests for Goodness of Fit.

We characterize the asymptotic performance of nonparametric goodness of fit testing, otherwise known as the universal hypothesis testing that dates back to Hoeffding (1965). The exponential decay rate of the type-II error probability is used as the asymptotic performance metric, hence an optimal test...
