Affara, Lama Ahmed; Ghanem, Bernard; Wonka, Peter(arXiv, 2017-09-27)[Preprint]
Convolutional sparse coding (CSC) is an important building block of many computer vision applications ranging from image and video compression to deep learning. We present two contributions to the state of the art in CSC. First, we significantly speed up the computation by proposing a new optimization framework that tackles the problem in the dual domain. Second, we extend the original formulation to higher dimensions in order to process a wider range of inputs, such as color inputs, or HOG features. Our results show a significant speedup compared to the current state of the art in CSC.
Li, Lichun; Langbort, Cedric; Shamma, Jeff S.(arXiv, 2017-11-07)[Preprint]
This paper considers a zero-sum two-player asymmetric information stochastic game where only one player knows the system state, and the transition law is controlled by the informed player only. For the informed player, it has been shown that the security strategy only depends on the belief and the current stage. We provide LP formulations whose size is only linear in the size of the uninformed player's action set to compute both history based and belief based security strategies. For the uninformed player, we focus on the regret, the difference between 0 and the future payoff guaranteed by the uninformed player in every possible state. Regret is a real vector of the same size as the belief, and depends only on the action of the informed player and the strategy of the uninformed player. This paper shows that the uninformed player has a security strategy that only depends on the regret and the current stage. LP formulations are then given to compute the history based security strategy, the regret at every stage, and the regret based security strategy. The size of the LP formulations are again linear in the size of the uninformed player action set. Finally, an intrusion detection problem is studied to demonstrate the main results in this paper.
Visual Question Answering (VQA) models should have both high robustness and accuracy. Unfortunately, most of the current VQA research only focuses on accuracy because there is a lack of proper methods to measure the robustness of VQA models. There are two main modules in our algorithm. Given a natural language question about an image, the first module takes the question as input and then outputs the ranked basic questions, with similarity scores, of the main given question. The second module takes the main question, image and these basic questions as input and then outputs the text-based answer of the main question about the given image. We claim that a robust VQA model is one, whose performance is not changed much when related basic questions as also made available to it as input. We formulate the basic questions generation problem as a LASSO optimization, and also propose a large scale Basic Question Dataset (BQD) and Rscore (novel robustness measure), for analyzing the robustness of VQA models. We hope our BQD will be used as a benchmark for to evaluate the robustness of VQA models, so as to help the community build more robust and accurate VQA models.
Following the seminal work of Nesterov, accelerated optimization methods have been used to powerfully boost the performance of first-order, gradient-based parameter estimation in scenarios where second-order optimization strategies are either inapplicable or impractical. Not only does accelerated gradient descent converge considerably faster than traditional gradient descent, but it also performs a more robust local search of the parameter space by initially overshooting and then oscillating back as it settles into a final configuration, thereby selecting only local minimizers with a basis of attraction large enough to contain the initial overshoot. This behavior has made accelerated and stochastic gradient search methods particularly popular within the machine learning community. In their recent PNAS 2016 paper, Wibisono, Wilson, and Jordan demonstrate how a broad class of accelerated schemes can be cast in a variational framework formulated around the Bregman divergence, leading to continuum limit ODE's. We show how their formulation may be further extended to infinite dimension manifolds (starting here with the geometric space of curves and surfaces) by substituting the Bregman divergence with inner products on the tangent space and explicitly introducing a distributed mass model which evolves in conjunction with the object of interest during the optimization process. The co-evolving mass model, which is introduced purely for the sake of endowing the optimization with helpful dynamics, also links the resulting class of accelerated PDE based optimization schemes to fluid dynamical formulations of optimal mass transport.
This article carries out a large dimensional analysis of standard regularized discriminant analysis classifiers designed on the assumption that data arise from a Gaussian mixture model with different means and covariances. The analysis relies on fundamental results from random matrix theory (RMT) when both the number of features and the cardinality of the training data within each class grow large at the same pace. Under mild assumptions, we show that the asymptotic classification error approaches a deterministic quantity that depends only on the means and covariances associated with each class as well as the problem dimensions. Such a result permits a better understanding of the performance of regularized discriminant analsysis, in practical large but finite dimensions, and can be used to determine and pre-estimate the optimal regularization parameter that minimizes the misclassification error probability. Despite being theoretically valid only for Gaussian data, our findings are shown to yield a high accuracy in predicting the performances achieved with real data sets drawn from the popular USPS data base, thereby making an interesting connection between theory and practice.
Traditional approaches for action detection use trimmed data to learn sophisticated action detector models. Although these methods have achieved great success at detecting human actions, we argue that huge information is discarded when ignoring the process, through which this trimmed data is obtained. In this paper, we propose Action Search, a novel approach that mimics the way people annotate activities in video sequences. Using a Recurrent Neural Network, Action Search can efficiently explore a video and determine the time boundaries during which an action occurs. Experiments on the THUMOS14 dataset reveal that our model is not only able to explore the video efficiently but also accurately find human activities, outperforming state-of-the-art methods.
Li, Lichun; Shamma, Jeff S.(arXiv, 2017-03-06)[Preprint]
Zero-sum asymmetric games model decision making scenarios involving two competing players who have different information about the game being played. A particular case is that of nested information, where one (informed) player has superior information over the other (uninformed) player. This paper considers the case of nested information in repeated zero-sum games and studies the computation of strategies for both the informed and uninformed players for finite-horizon and discounted infinite-horizon nested information games. For finite-horizon settings, we exploit that for both players, the security strategy, and also the opponent's corresponding best response depend only on the informed player's history of actions. Using this property, we refine the sequence form, and formulate an LP computation of player strategies that is linear in the size of the uninformed player's action set. For the infinite-horizon discounted game, we construct LP formulations to compute the approximated security strategies for both players, and provide a bound on the performance difference between the approximated security strategies and the security strategies. Finally, we illustrate the results on a network interdiction game between an informed system administrator and uniformed intruder.
Jaleel, Hassan; Shamma, Jeff S.(arXiv, 2017-09-21)[Preprint]
Submodular set-functions are extensively used in large-scale combinatorial optimization problems arising in complex networks and machine learning. While there has been significant interest in distributed formulations of convex optimization, distributed minimization of submodular functions has not received significant attention. Thus, our main contribution is a framework for minimizing submodular functions in a distributed manner. The proposed framework is based on the ideas of Lovasz extension of submodular functions and distributed optimization of convex functions. The framework exploits a fundamental property of submodularity that the Lovasz extension of a submodular function is a convex function and can be computed efficiently. Moreover, a minimizer of a submodular function can be computed by computing the minimizer of its Lovasz extension. In the proposed framework, we employ a consensus based distributed optimization algorithm to minimize set-valued submodular functions as well as general submodular functions defined over set products. We also identify distributed motion coordination in multiagent systems as a new application domain for submodular function minimization. For demonstrating key ideas of the proposed framework, we select a complex setup of the capture the flag game, which offers a variety of challenges relevant to multiagent system. We formulate the problem as a submodular minimization problem and verify through extensive simulations that the proposed framework results in feasible policies for the agents.
Traditional methods for image compressive sensing (CS) reconstruction solve a well-defined inverse problem that is based on a predefined CS model, which defines the underlying structure of the problem and is generally solved by employing convergent iterative solvers. These optimization-based CS methods face the challenge of choosing optimal transforms and tuning parameters in their solvers, while also suffering from high computational complexity in most cases. Recently, some deep network based CS algorithms have been proposed to improve CS reconstruction performance, while dramatically reducing time complexity as compared to optimization-based methods. Despite their impressive results, the proposed networks (either with fully-connected or repetitive convolutional layers) lack any structural diversity and they are trained as a black box, void of any insights from the CS domain. In this paper, we combine the merits of both types of CS methods: the structure insights of optimization-based method and the performance/speed of network-based ones. We propose a novel structured deep network, dubbed ISTA-Net, which is inspired by the Iterative Shrinkage-Thresholding Algorithm (ISTA) for optimizing a general $l_1$ norm CS reconstruction model. ISTA-Net essentially implements a truncated form of ISTA, where all ISTA-Net parameters are learned end-to-end to minimize a reconstruction error in training. Borrowing more insights from the optimization realm, we propose an accelerated version of ISTA-Net, dubbed FISTA-Net, which is inspired by the fast iterative shrinkage-thresholding algorithm (FISTA). Interestingly, this acceleration naturally leads to skip connections in the underlying network design. Extensive CS experiments demonstrate that the proposed ISTA-Net and FISTA-Net outperform existing optimization-based and network-based CS methods by large margins, while maintaining a fast runtime.
The export option will allow you to export the current search results of the entered query to a file. Different
formats are available for download. To export the items, click on the button corresponding with the preferred download format.
By default, clicking on the export buttons will result in a download of the allowed maximum amount of items.
For anonymous users the allowed maximum amount is 50 search results.
To select a subset of the search results, click "Selective Export" button and make a selection of the items you want to export.
The amount of items that can be exported at once is similarly restricted as the full export.
After making a selection, click one of the export format buttons. The amount of items that will be exported is indicated in the bubble next to export format.