Hierarchically Robust Representation Learning
With the tremendous success of deep learning in visual tasks, the representations extracted from intermediate layers of learned models, that is, deep features, attract much attention of researchers. The previous analysis shows that those features include appropriate semantic information. By training the deep models on a large-scale benchmark data set (e.g., ImageNet), the features can work well on other tasks. In this work, we investigate this phenomenon and demonstrate that deep features can fail due to the fact that they are learned by minimizing empirical risk. When the distribution of data is different from that of the benchmark data set, the performance of deep features can degrade. Hence, we propose a hierarchically robust optimization to learn more generic features. Considering the example-level and concept-level robustness simultaneously, we formulate the problem as a distributionally robust optimization problem with Wasserstein ambiguity set constraints. An efficient algorithm with the conventional training pipeline is proposed. Experiments on benchmark data sets confirm our claim and demonstrate the effectiveness of the robust deep representations.
Open Access Status
OA Disciplinary Repository
Qian, Q., Hu, J., & Li, H. (2019). Hierarchically Robust Representation Learning. ArXiv:1911.04047 [Cs]. Retrieved from http://arxiv.org/abs/1911.04047