Benchmarks are central to the improvement of named entity recognition and entity linking solutions. However, recent works have shown that manually created benchmarks often contain mistakes. We hence investigate the automatic generation of benchmarks for named entity recognition and linking from...
In this note, we consider the problem of choosing which nodes of a linear dynamical system should be actuated so that the state transfer from the system's initial condition to a given final state is possible. Assuming a standard complexity hypothesis, we show that this problem cannot be...
In this paper we describe the Japanese-English Subtitle Corpus (JESC). JESC is a large Japanese-English parallel corpus covering the underrepresented domain of conversational dialogue. It consists of more than 3.2 million examples, making it the largest freely available dataset of its kind. The corpus...
This paper aims at developing a faster and more accurate solution to the amodal 3D object detection problem for indoor scenes. It is achieved through a novel neural network that takes a pair of RGB-D images as the input and delivers oriented 3D bounding boxes as the output. The network, named 3D-SSD...
Among the statistical tools for online information diffusion modeling, both epidemic models and Hawkes point processes are popular choices. The former originate from epidemiology, and consider information as a viral contagion which spreads into a population of online users. The latter have roots...
We exhibit an $O((\log k)^6)$-competitive randomized algorithm for the $k$-server problem on any metric space. It is shown that a potential-based algorithm for the fractional $k$-server problem on hierarchically separated trees (HSTs) with competitive ratio $f(k)$ can be used to obtain a...
On numerous occasions, Agile practitioners have warned about the negative aspects of adopting Agile tools and techniques without implementing the primary practices of Agile. They have coined this observation as "doing" Agile, but not "being" Agile. However, such warnings are opinion-based, as...
It is well known that sequential decision making may lead to information cascades. That is, when agents make decisions based on their private information, as well as observing the actions of those before them, then it might be rational to ignore their private signal and imitate the action of previous...
Examination of facial-analysis software shows error rate of 0.8 percent for light-skinned men, 34.7 percent for dark-skinned women.
As the prevalence and everyday use of machine learning algorithms, along with our reliance on these algorithms, grow dramatically, so do the efforts to attack and undermine these algorithms with malicious intent, resulting in a growing interest in adversarial machine learning. A number of approaches...
We establish connections between the problem of learning a two-layers neural network with good generalization error and tensor decomposition. We consider a model with input $\boldsymbol x \in \mathbb R^d$, $r$ hidden units with weights $\{\boldsymbol w_i\}_{1\le i \le r}$ and output $y\in \mathbb R$, i...
Analyzing large X-ray diffraction (XRD) datasets is a key step in high-throughput mapping of the compositional phase diagrams of combinatorial materials libraries. Optimizing and automating this task can help accelerate the process of discovery of materials with novel and desirable properties. Here, we...
We study the problem of detecting the presence of a single unknown spike in a rectangular data matrix, in a high-dimensional regime where the spike has fixed strength and the aspect ratio of the matrix converges to a finite limit. This situation comprises Johnstone's spiked covariance model. We analyze...
In industrial machine learning pipelines, data often arrive in parts. Particularly in the case of deep neural networks, it may be too expensive to train the model from scratch each time, so one would rather use a previously learned model and the new data to improve performance. However, deep...
A folded type model is developed for analyzing compositional data. The proposed model, which is based upon the $\alpha$-transformation for compositional data, provides a new and flexible class of distributions for modeling data defined on the simplex sample space. Despite its rather seemingly complex...
This paper describes a novel communication-spare cooperative localization algorithm for a team of mobile unmanned robotic vehicles. Exploiting an event-based estimation paradigm, robots only send measurements to neighbors when the expected innovation for state estimation is high. Since agents know the...
As the buzzword phenomenon, procrastination holds a continued need for a comprehensive examination of its nature and the associated factors. The presented study explores the potential relationship between music taste, lifestyle and the youngsters' procrastination through quantitative modelling...
Echo state networks are powerful recurrent neural networks. However, they are often unstable and shaky, making the process of finding a good ESN for a specific dataset quite hard. Obtaining a superb accuracy by using the Echo State Network is a challenging task. We create, develop and implement a...
The popular cubic regularization (CR) method converges with first- and second-order optimality guarantee for nonconvex optimization, but encounters a high sample complexity issue for solving large-scale problems. Various sub-sampling variants of CR have been proposed to improve the sample complexity. In...
Calcium imaging data promises to transform the field of neuroscience by making it possible to record from large populations of neurons simultaneously. However, determining the exact moment in time at which a neuron spikes, from a calcium imaging data set, amounts to a non-trivial deconvolution problem...