The performance and efficiency of distributed machine learning (ML) depends significantly on how long it takes for nodes to exchange state changes. Overly-aggressive attempts to reduce communication often sacrifice final model accuracy and necessitate additional ML techniques to compensate for this...
Learning-to-rank techniques have proven to be extremely useful for prioritization problems, where we rank items in order of their estimated probabilities, and dedicate our limited resources to the top-ranked items. This work exposes a serious problem with the state of learning-to-rank algorithms, which...
One popular generative model that has high-quality results is the Generative Adversarial Network (GAN). This type of architecture consists of two separate networks that play against each other. The generator creates an output from the input noise that is given to it. The discriminator has the task of...
Maintaining a catalog of Resident Space Objects (RSOs) can be cast in a typical Bayesian multi-object estimation problem, where the various sources of uncertainty in the problem - the orbital mechanics, the kinematic states of the identified objects, the data sources, etc. - are modeled as random...
This paper introduces a novel measure-theoretic learning theory to analyze generalization behaviors of practical interest. The proposed learning theory has the following abilities: 1) to utilize the qualities of each learned representation on the path from raw inputs to outputs in...
A nonparametric Bayesian sparse graph linear dynamical system (SGLDS) is proposed to model sequentially observed multivariate data. SGLDS uses the Bernoulli-Poisson link together with a gamma process to generate an infinite dimensional sparse random graph to model state transitions. Depending on...
Infants are experts at playing, with an amazing ability to generate novel structured behaviors in unstructured environments that lack clear extrinsic reward signals. We seek to mathematically formalize these abilities using a neural network that implements curiosity-driven intrinsic motivation. Using...
Split-Merge MCMC (Markov Chain Monte Carlo) is one of the essential and popular variants of MCMC for problems when an MCMC state consists of an unknown number of components or clusters. It is well known that state-of-the-art methods for split-merge MCMC do not scale well. Strategies for rapid...
Infants are experts at playing, with an amazing ability to generate novel structured behaviors in unstructured environments that lack clear extrinsic reward signals. We seek to replicate some of these abilities with a neural network that implements curiosity-driven intrinsic motivation. Using a...
Convex sparsity-inducing regularizations are ubiquitous in high-dimensional machine learning, but their non-differentiability requires the use of iterative solvers. To accelerate such solvers, state-of-the-art approaches consist in reducing the size of the optimization problem at hand. In the context...
How does coarsening affect the spectrum of a general graph? We provide conditions such that the principal eigenvalues and eigenspaces of a coarsened and original graph Laplacian matrices are close. The achieved approximation is shown to depend on standard graph-theoretic properties, such as the degree...
Classification problems in security settings are usually contemplated as confrontations in which one or more adversaries try to fool a classifier to obtain a benefit. Most approaches to such adversarial classification problems have focused on game theoretical ideas with strong underlying common...
We present a new Gaussian process (GP) regression model where the covariance kernel is indexed or parameterized by a sufficient dimension reduction subspace of a reproducing kernel Hilbert space. The covariance kernel will be low-rank while capturing the statistical dependency of the response to the...
We present a novel model architecture which leverages deep learning tools to perform exact Bayesian inference on sets of high dimensional, complex observations. Our model is provably exchangeable, meaning that the joint distribution over observations is invariant under permutation: this property lies at...
A standard introduction to online learning might place Online Gradient Descent at its center and then proceed to develop generalizations and extensions like Online Mirror Descent and second-order methods. Here we explore the alternative approach of putting exponential weights (EW) first. We show that...
Many continuous control tasks have bounded action spaces and clip out-of-bound actions before execution. Policy gradient methods often optimize policies as if actions were not clipped. We propose clipped action policy gradient (CAPG) as an alternative policy gradient estimator that exploits...
Humans and animals have the ability to continually acquire and fine-tune knowledge throughout their lifespan. This ability is mediated by a rich set of neurocognitive functions that together contribute to the early development and experience-driven specialization of our sensorimotor skills...
This paper introduces an information theoretic co-training objective for unsupervised learning. We consider the problem of predicting the future. Rather than predict future sensations (image pixels or sound waves) we predict "hypotheses" to be confirmed by future sensations. More formally, we assume...
In this paper, we examine the emulation of non-linear deterministic computer codes where the output is a time series, possibly multivariate. Such computer models simulate the evolution of some real-world phenomena over time, for example models of the climate or the functioning of the human brain. The...
We characterize the asymptotic performance of nonparametric goodness of fit testing, otherwise known as the universal hypothesis testing that dates back to Hoeffding (1965). The exponential decay rate of the type-II error probability is used as the asymptotic performance metric, hence an optimal test...