CVPR2018-papers

  1. Transductive Unbiased Embedding for Zero-Shot Learning
  2. Frustum PointNets for 3D Object Detection from RGB-D Data
  3. Enhancing the Spatial Resolution of Stereo Images using a Parallax Prior
  4. DiverseNet: When One Right Answer Is Not Enough
  5. SSNet: Scale Selection Network for Online 3D Action Prediction
  6. Very Large-Scale Global SfM by Distributed Motion Averaging
  7. PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing
  8. Dynamic Feature Learning for Partial Face Recognition
  9. Context-aware Deep Feature Compression for High-speed Visual Tracking
  10. Between-class Learning for Image Classification
  11. DVQA: Understanding Data Visualizations via Question Answering
  12. Human Appearance Transfer
  13. Learning to Segment Every Thing
  14. Globally Optimal Inlier Set Maximization for Atlanta Frame Estimation
  15. Re-weighted Adversarial Adaptation Network for Unsupervised Domain Adaptation
  16. Learning to Compare: Relation Network for Few-Shot Learning
  17. Arbitrary Style Transfer with Deep Feature Reshuffle
  18. Dynamic Scene Deblurring Using Spatially Variant Recurrent Neural Networks
  19. Robust Video Content Alignment and Compensation for Rain Removal in a CNN Framework
  20. Guided Proofreading of Automatic Segmentations for Connectomics
  21. Deep PhaseNet for Video Frame Interpolation
  22. Context-aware Synthesis for Video Frame Interpolation
  23. Lean Multiclass Crowdsourcing
  24. Unsupervised Deep Generative Adversarial Hashing Network
  25. R-FCN-3000 at 30fps: Decoupling Detection and Classification
  26. Tips and Tricks for Visual Question Answering: Learnings from the 2017 Challenge
  27. Gated Fusion Network for Single Image Dehazing
  28. Learning a Complete Image Indexing Pipeline
  29. Mask-guided Contrastive Attention Model for Person Re-Identification
  30. Learning Pose Specific Representations by Predicting Different Views
  31. Deep Mutual Learning
  32. Improving Occlusion and Hard Negative Handling for Single-Stage Object Detectors
  33. Defense against adversarial attacks using guided denoiser
  34. Learning Attentions: Residual Attentional Siamese Network for High Performance Online Visual Tracking
  35. Structure Inference Net: Object Detection Using Scene-Level Context and Instance-Level Relationships
  36. Decorrelated Batch Normalization
  37. On the Duality Between Retinex and Image Dehazing
  38. CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes
  39. The Perception-Distortion Tradeoff
  40. Image Blind Denoising With Generative Adversarial Network Based Noise Modeling
  41. Distort-and-Recover: Color Enhancement using Deep Reinforcement Learning
  42. A Low Power, High Throughput, Fully Event-Based Stereo System
  43. Regularizing RNNs for Caption Generation by Reconstructing The Past with The Present
  44. End-to-end Flow Correlation Tracking with Spatial-temporal Attention
  45. Exploiting Transitivity for Learning Person Re-identification Models on a Budget
  46. Imagination-IQA: No-reference Image Quality Assessment via Adversarial Learning
  47. Egocentric Activity Recognition on a Budget
  48. Person Transfer GAN to Bridge Domain Gap for Person Re-Identification
  49. Duplex Generative Adversarial Network for Unsupervised Domain Adaptation
  50. Fine-grained Video Captioning for Sports Narrative
  51. High Performance Visual Tracking with Siamese Region Proposal Network
  52. Adversarially Occluded Samples for Person Re-identification
  53. MatNet: Modular Attention Network for Referring Expression Comprehension
  54. Low-Latency Video Semantic Segmentation
  55. MapNet: An Allocentric Spatial Memory for Mapping Environments
  56. Fast End-to-End Trainable Guided Filter
  57. Partial Transfer Learning with Selective Adversarial Networks
  58. Reconstruction Network for Video Captioning
  59. Improving Landmark Localization with Semi-Supervised Learning
  60. Unsupervised Person Image Synthesis in Arbitrary Poses
  61. Efficient Large-scale Approximate Nearest Neighbor Search on OpenCL FPGA
  62. Deep End-to-End Time-of-Flight Imaging
  63. Augmenting Crowd-Sourced 3D Reconstructions using Semantic Detections
  64. DocUNet: Document Image Unwarping via A Stacked U-Net
  65. Geometry Aware Optimization for Deep Learning: The Good Practice
  66. Learning to Detect Features in Texture Images
  67. LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation
  68. Spatially-Adaptive Filter Units for Deep Neural Networks
  69. Revisiting Video Saliency: A Large-scale Benchmark and a New Model
  70. Real-World Repetition Estimation by Div, Grad and Curl
  71. Learning Visual Knowledge Memory Networks for Visual Question Answering
  72. Attention-aware Compositional Network for Person Re-Identification
  73. Sim2Real View Invariant Visual Servoing by Recurrent Control
  74. Time-resolved Light Transport Decomposition for Thermal Photometric Stereo
  75. Trapping Light for Time of Flight
  76. A Unifying Contrast Maximization Framework for Event Cameras, with Applications to Motion, Depth, and Optical Flow Estimation
  77. Global versus Localized Generative Adversarial Nets
  78. Shift: A Zero FLOP, Zero Parameter Alternative to Spatial Convolutions
  79. Learning a Toolchain for Image Restoration
  80. CNN based Learning using Reflection and Retinex Models for Intrinsic Image Decomposition
  81. Feature Quantization for Defending Against Distortion of Images
  82. A Minimalist Approach to Type-Agnostic Detection of Quadrics in Point Clouds
  83. Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation
  84. Aperture Supervision for Monocular Depth Estimation
  85. Divide and Conquer for Full-Resolution Light Field Deblurring
  86. Multi-shot Pedestrian Re-identification via Sequential Decision Making
  87. Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features
  88. Depth-Aware Stereo Video Retargeting
  89. Multistage Adversarial Losses for Pose-Based Human Image Synthesis
  90. Multi-Content GAN for Few-Shot Font Style Transfer
  91. Multi-Cue Correlation Filters for Robust Visual Tracking
  92. A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects
  93. Bottom-Up and Top-Down Attention for Image Captioning and Visual Question Answering
  94. Improving Color Reproduction Accuracy in the Camera Imaging Pipeline
  95. Net2Vec: Quantifying and Explaining how Concepts are Encoded by Filters in Deep Neural Networks
  96. Sketch-a-Classifier: Sketch-based Photo Classifier Generation
  97. Learning Time/Memory-Efficient Deep Architectures with Budgeted Super Networks
  98. TOM-Net: Learning Transparent Object Matting from a Single Image
  99. Estimation of Camera Locations in Highly Corrupted Scenarios: All About the Base, No Shape Trouble
  100. Direction-aware Spatial Context Features for Shadow Detection
  101. Neural Motifs: Scene Graph Parsing with Global Context
  102. Object Referring in Videos with Language and Human Gaze
  103. Learning Transferable Architectures for Scalable Image Recognition
  104. View Extrapolation of Human Body from a Single Image
  105. Probabilistic Plant Modeling via Multi-View Image-to-Image Translation
  106. Learning a Discriminative Prior for Blind Image Deblurring
  107. Optimal Structured Light a la Carte
  108. Revisiting Deep Intrinsic Image Decompositions
  109. GAGAN: Geometry Aware Generative Adversarial Networks
  110. Learning Multi-grid Generative ConvNets by Minimal Contrastive Divergence
  111. Towards Faster Training of Global Covariance Pooling Networks by Iterative Matrix Square Root Normalization
  112. Diversity Regularized Spatiotemporal Attention for Video-based Person Re-identification
  113. Variational Autoencoders for Deforming 3D Mesh Models
  114. Rotation Averaging and Strong Duality
  115. 3D Hand Pose Estimation: From Current Achievements to Future Goals
  116. Benchmarking 6DOF Outdoor Visual Localization in Changing Conditions
  117. A Robust Generative Framework for Generalized Zero-Shot Learning
  118. Two can play this Game: Visual Dialog with Discriminative Visual Question Generation and Visual Question Answering
  119. Rotation-sensitive Regression for Oriented Scene Text Detection
  120. Adversarial Feature Augmentation for Unsupervised Domain Adaptation
  121. Deep Regression Forests for Age Estimation
  122. FOTS: Fast Oriented Text Spotting with a Unified Network
  123. SoS-RSC: A Sum-of-Squares Polynomial Approach to Robustifying Subspace Clustering Algorithms
  124. Efficient Subpixel Refinement with Symbolic Linear Predictors
  125. Self-Supervised Feature Learning by Learning to Spot Artifacts
  126. PointFusion: Deep Sensor Fusion for 3D Bounding Box Estimation
  127. Scale-recurrent Network for Deep Image Deblurring
  128. Multi-Cell Classification by Convolutional Dictionary Learning with Class Proportion Priors
  129. Social GAN: Socially Acceptable Trajectories with Generative Adversarial Networks
  130. On the convergence of PatchMatch and its variants
  131. Clinical Skin Lesion Diagnosis using Representations Inspired by Dermatologist Criteria
  132. PoTion: Pose MoTion Representation for Action Recognition
  133. Zigzag Learning for Weakly Supervised Object Detection
  134. VITAL: VIsual Tracking via Adversarial Learning
  135. Crowd Counting with Deep Negative Correlation Learning
  136. Multi-Label Zero-Shot Learning with Structured Knowledge Graphs
  137. Learning a Discriminative Filter Bank within a CNN for Fine-grained Recognition
  138. A Closer Look at Spatiotemporal Convolutions for Action Recognition
  139. Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification
  140. End-to-End Deep Kronecker-Product Matching for Person Re-identification
  141. Consensus Maximization for Semantic Region Correspondences
  142. SBNet: Sparse Blocks Network for Fast Inference
  143. Action Sets: Weakly Supervised Action Segmentation without Ordering Constraints
  144. Group Consistent Similarity Learning via Deep CRFs for Person Re-Identification
  145. Now You Shake Me: Towards Automatic 4D Cinema
  146. Defocus Blur Detection via Multi-Stream Bottom-Top-Bottom Fully Convolutional Network
  147. Interpret Neural Networks by Identifying Critical Data Routing Paths
  148. Deep Reinforcement Learning of Region Proposal Networks for Object Detection
  149. Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
  150. Finding "It": Weakly-Supervised Reference-Aware Visual Grounding in Instructional Video
  151. Semantic Visual Localization
  152. DeblurGAN: Blind Motion Deblurring Using Conditional Adversarial Networks
  153. Composing Two Objects of Interest for Flying Camera Photography
  154. Kernelized Subspace Pooling for Deep Local Descriptors
  155. Learning to Generate Time-Lapse Videos Using Multi-Stage Dynamic Generative Adversarial Networks
  156. Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks
  157. Deep Lesion Graph in the Wild: Relationship Learning and Organization of Significant Radiology Image Findings in a Diverse Large-scale Lesion Database
  158. An Efficient and Provable Approach for Mixture Proportion Estimation Using Linear Independence Assumption
  159. Eliminating Background-bias for Robust Person Re-identification
  160. Geometry-Aware Network for Non-Rigid Shape Prediction from a Single View
  161. High-order tensor regularization with application to attribute ranking
  162. Taskonomy: Disentangling Task Transfer Learning
  163. BlockDrop: Dynamic Inference Paths in Residual Networks
  164. Attend and Interact: Higher-Order Object Interactions for Video Understanding
  165. Bilateral Ordinal Relevance Multi-instance Regression for Facial Action Unit Intensity Estimation
  166. CarFusion: Combining Point Tracking and Part Detection for Dynamic 3D Reconstruction of Vehicles
  167. Transferable Joint Attribute-Identity Deep Learning for Unsupervised Person Re-Identification
  168. Large Scale Fine-Grained Categorization and the Effectiveness of Domain-Specific Transfer Learning
  169. BPGrad: Towards Global Optimality in Deep Learning via Branch and Pruning
  170. Improved Human Pose Estimation through Adversarial Data Augmentation
  171. ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices
  172. SINT++: Robust Visual Tracking via Adversarial Hard Positive Generation
  173. Structured Uncertainty Prediction Networks
  174. Geometry-Guided CNN for Self-supervised Video Representation Learning
  175. Low-Shot Recognition with Imprinted Weights
  176. Self-supervised Learning of Geometrically Stable Features Through Probabilistic Introspection
  177. Disentangling Structure and Aesthetics for Content-aware Image Completion
  178. A Volumetric Descriptive Network for 3D Object Synthesis
  179. Interpretable Convolutional Neural Networks
  180. Single Image Dehazing via Conditional Generative Adversarial Network
  181. Neural Inverse Kinematics for Unsupervised Motion Retargetting
  182. Environment Upgrade Reinforcement Learning for Non-differentiable Multi-stage Pipelines
  183. Teaching Categories to Human Learners with Visual Explanations
  184. Facelet-Bank for Fast Portrait Manipulation
  185. Convolutional Sequence to Sequence Model for Human Dynamics
  186. Human Semantic Parsing for Person Re-identification
  187. Latent RANSAC
  188. LiDAR-Video Driving Dataset: Learning Driving Policies Effectively
  189. Actor and Observer: Joint Modeling of First and Third-Person Videos
  190. Controllable Video Generation with Sparse Trajectories
  191. What have we learned from deep representations for action recognition?
  192. Bidirectional Attentive Fusion with Context Gating for Dense Video Captioning
  193. Language-Based Image Editing with Recurrent Attentive Models
  194. Graph-Cut RANSAC
  195. Optimizing Filter Size in Convolutional Neural Networks for Facial Action Unit Recognition
  196. Memory Based Online Learning of Deep Representations from Video Streams
  197. Deep Layer Aggregation
  198. Learning Convolutional Networks for Content-weighted Image Compression
  199. Self-supervised Multi-level Face Model Learning for Monocular Reconstruction at over 250Hz
  200. Efficient, sparse representation of manifold distance matrices for classical scaling
  201. Visual to Sound: Generating Natural Sound for Videos in the Wild
  202. A Prior-Less Method for Multi-Face Tracking in Unconstrained Videos
  203. Real-Time Rotation-Invariant Face Detection with Progressive Calibration Networks
  204. Self-calibrating polarising radiometric calibration
  205. Pix3D: Dataset and Methods for 3D Object Modeling from a Single Image
  206. Learning to Promote Saliency Detectors
  207. Pose Transferrable Person Re-Identification
  208. Hashing as Tie-Aware Learning to Rank
  209. Baseline Desensitizing In Translation Averaging
  210. Conditional Image-to-Image Translation
  211. Blind Predicting Similar Quality Map for Image Quality Assessment
  212. Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?
  213. CNN Driven Sparse Multi-Level B-spline Image Registration
  214. Through-Wall Human Pose Estimation Using Radio Signals
  215. xUnit: Learning a Spatial Activation Function for Efficient Image Restoration
  216. CLIP-Q: Deep Network Compression Learning by In-Parallel Pruning-Quantization
  217. FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds
  218. Weakly Supervised Coupled Networks for Visual Sentiment Analysis
  219. Ring loss: Convex Feature Normalization for Face Recognition
  220. Fast Spectral Ranking for Similarity Search
  221. PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning
  222. AMNet: Memorability Estimation with Attention
  223. Webly Supervised Learning Meets Zero-shot Learning: A Hybrid Approach for Fine-grained Classification
  224. End-to-End Learning of Motion Representation for Video Understanding
  225. Smooth Neighbors on Teacher Graphs for Semi-supervised Learning
  226. SeedNet: Automatic Seed Generation with Deep Reinforcement Learning for Robust Interactive Segmentation
  227. Deep Spatio-Temporal Random Fields for Efficient Video Segmentation
  228. Perturbative Neural Networks: Rethinking Convolution in CNNs
  229. SYQ: Learning Symmetric Quantization For Efficient Deep Neural Networks
  230. Neural 3D Mesh Renderer
  231. Deep Parametric Continuous Convolutional Neural Networks
  232. Visual Question Reasoning on General Dependency Tree
  233. Non-local Neural Networks
  234. Light field intrinsics with a deep encoder-decoder network
  235. Feature Space Transfer for Data Augmentation
  236. Motion Segmentation by Exploiting Complementary Geometric Models
  237. Context Contrasted Feature and Gated Multi-scale Aggregation for Scene Segmentation
  238. Inverted Residuals and Linear Bottlenecks: Mobile Networks for Classification, Detection and Segmentation
  239. Towards a Mathematical Understanding of the Difficulty in Learning with Feedforward Neural Networks
  240. Few-Shot Image Recognition by Predicting Parameters from Activations
  241. Deep Video Super-Resolution Network Using Dynamic Upsampling Filters Without Explicit Motion Compensation
  242. CLEAR: Cumulative LEARning for One-Shot One-Class Image Recognition
  243. Pose-Robust Face Recognition via Deep Residual Equivariant Mapping
  244. Deep Cross-media Knowledge Transfer
  245. Large-scale Point Cloud Semantic Segmentation with Superpoint Graphs
  246. A Weighted Sparse Sampling and Smoothing Frame Transition Approach for Semantic Fast-Forward First-Person Videos
  247. Recurrent Slice Networks for 3D Segmentation on Point Clouds
  248. Dimensionality's Blessing: Detecting the distributions underlying images
  249. Augmented Skeleton Space Transfer for Depth-based Hand Pose Estimation
  250. Robust Classification with Convolutional Prototype Learning
  251. DecideNet: Counting Varying Density Crowds Through Attention Guided Detection and Density Estimation
  252. ICE-BA: Incremental, Consistent and Efficient Bundle Adjustment for Visual-Inertial SLAM
  253. Grounding Referring Expressions in Images by Variational Context
  254. Pseudo-Mask Augmented Object Detection
  255. Improvements to context based self-supervised learning
  256. Left-Right Comparative Recurrent Model for Stereo Matching
  257. Learning deep structured active contours end-to-end
  258. Efficient and Deep Person Re-Identification using Multi-Level Similarity
  259. Learning Intrinsic Image Decomposition from Watching the World
  260. Learning to Understand Image Blur
  261. Gaze Prediction in Dynamic 360° Immersive Videos
  262. Emotional Attention: A Study of Image Sentiment and Visual Attention
  263. Single View Stereo Matching
  264. Zero-shot Recognition via Semantic Embeddings and Knowledge Graphs
  265. Video Representation Learning Using Discriminative Pooling
  266. Probabilistic Joint Face-Skull Modelling for Facial Reconstruction
  267. Indoor RGB-D Compass from a Single Line and Plane
  268. pOSE: Pseudo Object Space Error for Initialization-Free Bundle Adjustment
  269. Generative Adversarial Learning Towards Fast Weakly Supervised Detection
  270. Seeing Temporal Modulation of Lights from Standard Cameras
  271. Shape from Shading through Shape Evolution
  272. Parallel Attention: A Unified Framework for Visual Object Discovery through Dialogs and Queries
  273. Neural Style Transfer via Meta Networks
  274. UV-GAN: Adversarial Facial UV Map Completion for Pose-invariant Face Recognition
  275. Cascaded Pyramid Network for Multi-Person Pose Estimation
  276. Detect-and-Track: Efficient Pose Estimation in Videos
  277. SobolevFusion: 3D Reconstruction of Scenes Undergoing Free Non-rigid Motion
  278. NAG: Network for Adversary Generation
  279. Inferring Co-Attention in Social Scene Videos
  280. Unsupervised Learning of Single View Depth Estimation and Visual Odometry with Deep Feature Reconstruction
  281. Egocentric Basketball Motion Planning from a Single First-Person Image
  282. Geometric robustness of deep networks: analysis and improvement
  283. Pose-Guided Photorealistic Face Rotation
  284. Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation
  285. Importance Weighted Adversarial Nets for Partial Domain Adaptation
  286. Towards High Performance Video Object Detection
  287. SurfConv: Bridging 3D and 2D Convolution for RGBD Images
  288. People, Penguins and Petri Dishes: Adapting Object Counting Models To New Visual Domains And Object Types Without Forgetting
  289. Fully Convolutional Adaptation Networks for Semantic Segmentation
  290. Towards Pose Invariant Face Recognition in the Wild
  291. Interactive Image Segmentation with Latent Diversity
  292. Label Denoising Adversarial Network (LDAN) for Inverse Lighting of Face Images
  293. Detecting and Recognizing Human-Object Interactions
  294. Deep Image Prior
  295. 2D/3D Pose Estimation and Action Recognition using Multitask Deep Learning
  296. Direct Shape Regression Networks for End-to-End Face Alignment
  297. Disentangling Features in 3D Face Shapes for Joint Face Reconstruction and Recognition
  298. Scale-Transferrable Object Detection
  299. Learning by Asking Questions
  300. 3D Pose Estimation and 3D Model Retrieval for Objects in the Wild
  301. Deep Progressive Reinforcement Learning for Skeleton-based Action Recognition
  302. Future Person Localization in First-Person Videos
  303. 3D-RCNN: Instance-level 3D Scene Understanding via Render-and-Compare
  304. Manifold Learning in Quotient Spaces
  305. Image Correction via Deep Reciprocating HDR Transformation
  306. Focus Manipulation Detection via Photometric Histogram Analysis
  307. Density Adaptive Point Set Registration
  308. Multi-view Harmonized Bilinear Network for 3D Object Recognition
  309. SeGAN: Segmenting and Generating the Invisible
  310. VizWiz Grand Challenge: Answering Visual Questions from Blind People
  311. Sparse, Smart Contours to Represent and Edit Images
  312. Generative Non-Rigid Shape Completion with Graph Convolutional Autoencoders
  313. The power of ensembles for active learning in image classification
  314. OLÉ: Orthogonal Low-rank Embedding, A Plug and Play Geometric Loss for Deep Learning
  315. Learning Compositional Visual Concepts with Mutual Consistency
  316. Adversarial Complementary Learning for Weakly Supervised Object Localization
  317. Analytical Modeling of Vanishing Points and Curves in Catadioptric Cameras
  318. Exploit the Unknown Gradually: One-Shot Video-Based Person Re-Identification by Stepwise Learning
  319. Learning to Sketch with Shortcut Cycle Consistency
  320. Domain Adaptive Faster R-CNN for Object Detection in the Wild
  321. Attentive Generative Adversarial Network for Raindrop Removal from A Single Image
  322. Independently Recurrent Neural Network (IndRNN): Building A Longer and Deeper RNN
  323. Making Convolutional Networks Recurrent for Visual Sequence Learning
  324. Multi-Task Adversarial Network for Disentangled Feature Learning
  325. Fight ill-posedness with ill-posedness: Single-shot variational depth super-resolution from shading
  326. Zero-Shot Sketch-Image Hashing
  327. Learning to Localize Sound Source in Visual Scenes
  328. Cross-Domain Weakly-Supervised Object Detection through Progressive Domain Adaptation
  329. Semi-parametric Image Synthesis
  330. Multi-scale Location-aware Kernel Representation for Object Detection
  331. W2F: A Weakly-Supervised to Fully-Supervised Framework for Object Detection
  332. Generative Modeling using the Sliced Wasserstein Distance
  333. MX-LSTM: mixing tracklets and vislets to jointly forecast trajectories and head poses
  334. Dynamic Video Segmentation Network
  335. Learning a Discriminative Feature Network for Semantic Segmentation
  336. Video Person Re-identification with Competitive Snippet-similarity Aggregation and Co-attentive Snippet Embedding
  337. Curve Reconstruction via the Global Statistics of Natural Curves
  338. Single-Shot Refinement Neural Network for Object Detection
  339. Density-aware Single Image De-raining using a Multi-stream Dense Network
  340. Learning Answer Embeddings for Visual Question Answering
  341. Attention Clusters: Purely Attention Based Local Feature Integration for Video Classification
  342. Translating and Segmenting Multimodal Medical Volumes with Cycle- and Shape-Consistency Generative Adversarial Network
  343. Learning from the Deep: A Revised Underwater Image Formation Model
  344. Mean-Variance Loss for Deep Age Estimation from a Face
  345. Disentangled Person Image Generation
  346. Deep Sparse Coding for Invariant Multimodal Halle Berry Neurons
  347. DeepMVS: Learning Multi-View Stereopsis
  348. Embodied Question Answering
  349. Deflecting Adversarial Attacks with Pixel Deflection
  350. Dynamic-Structured Semantic Propagation Network
  351. Integrated facial landmark localization and super-resolution of real-world very low resolution faces in arbitrary poses with GANs
  352. A Two-Step Disentanglement Method
  353. Towards Effective Low-bitwidth Convolutional Neural Networks
  354. Natural and Effective Obfuscation by Head Inpainting
  355. "Learning-Compression" algorithms for neural net pruning
  356. Salient Object Detection Driven by Fixation Prediction
  357. Scalable Dense Non-rigid Structure-from-Motion: A Grassmannian Perspective
  358. Uncalibrated Photometric Stereo under Natural Illumination
  359. Learning Monocular 3D Human Pose Estimation on Weakly-Supervised Multi-view Images
  360. An Unsupervised Learning Model for Deformable Medical Image Registration
  361. Learning Deep Correspondence through Prior and Posterior Feature Constancy
  362. Anticipating Traffic Accidents with Adaptive Loss and Large-scale Incident DB
  363. A2-RL: Aesthetics Aware Reinforcement Learning for Image Cropping
  364. Learned Shape-Tailored Descriptors for Segmentation
  365. One-shot Action Localization by Sequence Matching Network
  366. Robust Physical-World Attacks on Deep Learning Visual Classification
  367. What Makes a Video a Video: Analyzing Temporal Information in Video Understanding Models and Datasets
  368. Bidirectional Retrieval Made Simple
  369. Reward Learning by Instruction
  370. MegaDepth: Learning Single-View Depth Prediction from Internet Photos
  371. Cross-Dataset Adaptation for Visual Question Answering
  372. Interpretable Video Captioning via Trajectory Structured Localization
  373. MoCoGAN: Decomposing Motion and Content for Video Generation
  374. Left/Right Asymmetric Layer Skippable Networks
  375. Learning Pixel-level Semantic Affinity with Image-level Supervision for Weakly Supervised Semantic Segmentation
  376. Unsupervised Discovery of Object Landmarks as Structural Representations
  377. Learning Deep Descriptors with Scale-Aware Triplet Networks
  378. Robust Depth Estimation from Auto Bracketed Images
  379. Aligning Infinite-Dimensional Covariance Matrices in Reproducing Kernel Hilbert Spaces for Domain Adaptation
  380. Local and Global Optimization Techniques in Graph-based Clustering
  381. Learning from Millions of 3D Scans for Large-scale 3D Face Recognition
  382. CBMV: A Coalesced Bidirectional Matching Volume for Disparity Estimation
  383. Image Collection Pop-up: 3D Reconstruction and Clustering of Rigid and Non-Rigid Categories
  384. Ordinal Depth Supervision for 3D Human Pose Estimation
  385. Learning to Hash by Discrepancy Minimization
  386. MapNet: Geometry-Aware Learning of Maps for Camera Localization
  387. Im2Struct: Recovering 3D Shape Structure from a Single RGB Image
  388. A Pose-Sensitive Embedding for Person Re-Identification with Expanded Cross Neighborhood Re-Ranking
  389. Analytic Expressions for Probabilistic Moments of PL-DNN with Gaussian Input
  390. Cross-Domain Self-supervised Multi-task Feature Learning Using Synthetic Game Imagery
  391. Coding Kendall's Shape Trajectories for 3D Action Recognition
  392. Camera Pose Estimation with Unknown Principal Point
  393. Learning Spatial-Aware Regressions for Visual Tracking
  394. The Easy, The Medium and The Hard: Adapting Across Varied Domain Shifts
  395. Detach and Adapt: Learning Cross-Domain Disentangled Deep Representation
  396. A Hybrid L1-L0 Layer Decomposition Model for Tone Mapping
  397. LIME: Live Intrinsic Material Estimation
  398. Learning Representations for Single Cells in Microscopy Images
  399. Transparency by Design: Closing the Gap Between Performance and Interpretability in Visual Reasoning
  400. clcNet: Improving the Efficiency of Convolutional Neural Network using Channel Local Convolutions
  401. Spanning Patches: Deep Patch Selection for Fast Multi-View Stereo
  402. LAMV: Learning to align and match videos with kernelized temporal layers
  403. Single Image Reflection Separation with Perceptual Losses
  404. Structure from Recurrent Motion: From Rigidity to Recurrency
  405. Customized Image Narrative Generation via Interactive Visual Question Generation and Answering
  406. Relation Networks for Object Detection
  407. An End-to-End TextSpotter with Explicit Alignment and Attention
  408. Photometric Stereo in Participating Media Considering Shape-Dependent Forward Scatter
  409. Sliced Wasserstein Distance for Learning Gaussian Mixture Models
  410. Generative Adversarial Image Synthesis with Decision Tree Latent Controller
  411. Disentangling 3D Pose in A Dendritic CNN for Unconstrained 2D Face Alignment
  412. Learning Multi-Instance Enriched Image Representation via Non-Greedy Simultaneous L1-Norm Minimization and Maximization
  413. Separating Self-Expression and Visual Content in Hashtag Supervision
  414. Residual Dense Network for Image Super-Resolution
  415. Hand PointNet: 3D Hand Pose Estimation using Point Sets
  416. Human-centric Indoor Scene Synthesis Using Stochastic Grammar
  417. Learning Facial Action Units from Web Images with Scalable Weakly Supervised Clustering
  418. Occlusion Aware Unsupervised Learning of Optical Flow
  419. Domain Generalization with Adversarial Feature Learning
  420. A Hierarchical Generative Model for Eye Image Synthesis and Eye Gaze Estimation
  421. PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image
  422. Deep Learning under Privileged Information Using Heteroscedastic Dropout
  423. Frame-Recurrent Video Super-Resolution
  424. Nonlocal Low-Rank Tensor Factor Analysis for Image Restoration
  425. Content-Sensitive Supervoxels via Uniform Tessellations on Video Manifolds
  426. Planar Shape Detection at Structural Scales
  427. Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking
  428. Learning to Parse Wireframes in Images of Man-Made Environments
  429. Harmonious Attention Network for Person Re-Identification
  430. Wing Loss for Robust Facial Landmark Localisation with Convolutional Neural Networks
  431. Every Smile is Unique: Landmark-guided Diverse Smile Generation
  432. Multi-Scale Weighted Nuclear Norm Image Restoration
  433. FeaStNet: Feature-Steered Graph Convolutions for 3D Shape Analysis
  434. Lightweight Probabilistic Deep Networks
  435. Learning Depth from Monocular Videos using Direct Methods
  436. Thoracic Disease Identification and Localization with Limited Supervision
  437. SGPN: Similarity Group Proposal Network for 3D Point Cloud Instance Segmentation
  438. Memory Matching Networks for One-Shot Image Recognition
  439. Compressed Video Action Recognition
  440. FFNet: Video Fast-Forwarding via Reinforcement Learning
  441. Representing and Learning High Dimensional Data with the Optimal Transport Map from a Probabilistic Viewpoint
  442. ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans
  443. Fully Convolutional Attention Network for Multimodal Reasoning
  444. Lions and Tigers and Bears: Capturing Non-Rigid, 3D, Articulated Shape from Images
  445. Recurrent Pixel Embedding for Instance Grouping
  446. Name-removed-for-review: A Multi-camera HD Dataset for Dense Unscripted Pedestrian Detection
  447. SGAN: An Alternative Training of Generative Adversarial Networks
  448. Learning Markov Clustering Networks for Scene Text Detection
  449. Occlusion-Aware Rolling Shutter Rectification of 3D Scenes
  450. Beyond Gröbner Bases: Basis Selection for Minimal Solvers
  451. Improving Object Localization with Fitness NMS and Bounded IoU Loss
  452. Generative Adversarial Perturbations
  453. Deep Photo Enhancer: Unsupervised Learning of Image Enhancement from Photographs with GANs
  454. Eye In-Painting with Exemplar Generative Adversarial Networks
  455. Encoder-Decoder Alignment for Zero-Pair Image-to-Image Translation
  456. Learning Structure and Strength of CNN Filters for Small Sample Size Training
  457. Path Aggregation Network for Instance Segmentation
  458. Learning Superpixels with Segmentation-Aware Affinity Loss
  459. Data Distillation: Towards Omni-Supervised Learning
  460. Deep Diffeomorphic Transformer Networks
  461. CodeSLAM - Learning a Compact, Optimisable Representation for Dense Visual SLAM
  462. Glimpse Clouds: Human Activity Recognition from Unstructured Feature Points
  463. Learning Latent Super-Events to Detect Multiple Activities in Videos
  464. MegDet: A Large Mini-Batch Object Detector
  465. Lose The Views: Limited Angle CT Reconstruction via Implicit Sinogram Completion
  466. Unsupervised Domain Adaptation with Similarity-Based Classifier
  467. Visual Feature Attribution using Wasserstein GANs
  468. Tell Me Where To Look: Guided Attention Inference Network
  469. Towards Open-Set Identity Preserving Face Synthesis
  470. Unsupervised Feature Learning via Non-Parametric Instance-level Discrimination
  471. Multi-Evidence Fusion and Filtering for Weakly Supervised Object Recognition, Detection and Segmentation
  472. Deep Material-aware Cross-spectral Stereo Matching
  473. MakeupGAN: Makeup Transfer via Cycle-Consistent Adversarial Networks
  474. M3: Multimodal Memory Modelling for Video Captioning
  475. Fooling Vision and Language Models Despite Localization and Attention Mechanism
  476. Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies
  477. Jointly Localizing and Describing Events for Dense Video Captioning
  478. The Best of Both Worlds: Combining CNNs and Geometric Constraints for Hierarchical Motion Segmentation
  479. End-to-end learning of keypoint detector and descriptor for pose invariant 3D matching
  480. LDMNet: Low Dimensional Manifold Regularized Neural Networks
  481. 3D Human Pose Estimation in the Wild by Adversarial Learning
  482. Fast Video Object Segmentation by Reference-Guided Mask Propagation
  483. End-to-End Dense Video Captioning with Masked Transformer
  484. Towards dense object tracking in a 2D honeybee hive
  485. Appearance-and-Relation Networks for Video Classification
  486. StarGAN: Unified Generative Adversarial Networks for Controllable Multi-Domain Image-to-Image Translation
  487. Answer with Grounding Snippets: Focal Visual-Text Attention for Visual Question Answering
  488. GANerated Hands for Real-Time 3D Hand Tracking from Monocular RGB
  489. Weakly Supervised Human Body Part Parsing via Pose-Guided Knowledge Transfer
  490. ClusterNet: Detecting Small Objects in Large Scenes by Exploiting Spatio-Temporal Information
  491. Structured Set Matching Networks for One-Shot Part Labeling
  492. Real-Time Seamless Single Shot 6D Object Pose Prediction
  493. Triplet-Center Loss for Multi-View 3D Object Retrieval
  494. Pixels, voxels, and views: A study of shape representations for single view 3D object shape prediction
  495. Show Me a Story: Towards Coherent Neural Story Illustration
  496. DeLS-3D: Deep Localization and Segmentation with a 3D Semantic Map
  497. Missing Slice Recovery for Tensors Using a Low-rank Model in Embedded Space
  498. 3D Semantic Segmentation with Submanifold Sparse Convolutional Networks
  499. Learning Compact Recurrent Neural Networks with Block-Term Tensor Decomposition
  500. Link and code: Fast indexing with graphs and compact regression codes
  501. Two-Stream Convolutional Networks for Dynamic Texture Synthesis
  502. Weakly Supervised Action Localization by Sparse Temporal Pooling Network
  503. Viewpoint-aware Video Summarization
  504. 4D Human Body Correspondences from Panoramic Depth Maps
  505. DS*: Tighter Lifting-Free Convex Relaxations for Quadratic Matching Problems
  506. Discovering Point Lights with Intensity Distance Fields
  507. The Lovász-Softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks
  508. Geometry-aware Deep Network for Single-Image Novel View Synthesis
  509. Temporal Deformable Residual Networks for Action Segmentation in Videos
  510. Seeing Small Faces from Robust Anchor's Perspective
  511. Matryoshka Networks: Predicting 3D Geometry via Nested Shape Layers
  512. On the Importance of Label Quality for Semantic Segmentation
  513. AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions
  514. First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations
  515. Learning Deep Sketch Abstraction
  516. Non-Linear Temporal Subspace Representations for Activity Recognition
  517. A Biresolution Spectral framework for Product Quantization
  518. Unsupervised Cross-dataset Person Re-identification by Transfer Learning of Spatio-temporal Patterns
  519. Feature Super-Resolution: Make Machine See More Clearly
  520. Finding Tiny Faces in the Wild with Generative Adversarial Network
  521. DoubleFusion: Real-time Capture of Human Performance with Inner Body Shape from a Single Depth Sensor
  522. Deep Unsupervised Saliency Detection: A Multiple Noisy Labeling Perspective
  523. Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction
  524. Recognize Actions by Disentangling Components of Dynamics
  525. Who Let The Dogs Out? Modeling Dog Behavior From Visual Data
  526. Alive Caricature from 2D to 3D
  527. Learning Steerable Filters for Rotation Equivariant CNNs
  528. From source to target and back: Symmetric Bi-Directional Adaptive GAN
  529. Monocular Relative Depth Perception with Web Stereo Data Supervision
  530. Correlation Tracking via Joint Discrimination and Reliability Learning
  531. Boosting Domain Adaptation by Discovering Latent Domains
  532. HSA-RNN: Hierarchical Structure-Adaptive RNN for Video Summarization
  533. Learning from Noisy Web Data with Category-level Supervision
  534. Embodied Real-World Active Perception
  535. Boosting Self-Supervised Learning via Knowledge Transfer
  536. Video Captioning via Hierarchical Reinforcement Learning
  537. Weakly Supervised Phrase Localization with Multi-Scale Anchored Transformer Network
  538. Progressively Complementarity-aware Fusion Network for RGB-D Salient Object Detection
  539. Wide Compression: Tensor Ring Nets
  540. Demo2Vec: Reasoning Object Affordances from Online Videos
  541. A High-Quality Denoising Dataset for Smartphone Cameras
  542. Collaborative and Adversarial Network for Unsupervised Domain Adaptation
  543. End-to-end weakly-supervised semantic alignment
  544. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference
  545. Feature Selective Networks for Object Detection
  546. Unsupervised Learning of Depth and Egomotion from Monocular Video Using 3D Geometric Constraints
  547. A Common Framework for Interactive Texture Transfer
  548. Depth and Transient Imaging with Compressive SPAD Array Cameras
  549. PointGrid: A Deep Network for 3D Shape Understanding
  550. A Network Architecture for Point Cloud Classification via Automatic Depth Images Generation
  551. Optimizing Local Feature Descriptors for Nearest Neighbor Matching
  552. 4DFAB: A Large Scale 4D Database for Facial Expression Analysis and Biometric Applications
  553. Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains
  554. Photographic Text-to-Image Synthesis with a Hierarchically-nested Adversarial Network
  555. Stacked Conditional Generative Adversarial Networks for Jointly Learning Shadow Detection and Shadow Removal
  556. What do Deep Networks Like to See?
  557. On the Robustness of Semantic Segmentation Models to Adversarial Attacks
  558. SketchMate: Deep Hashing for Million-Scale Human Sketch Retrieval
  559. Progressive Attention Guided Recurrent Network for Salient Object Detection
  560. IQA: Visual Question Answering in Interactive Environments
  561. Boosting Adversarial Attacks with Momentum
  562. Conditional Probability Models for Deep Image Compression
  563. Cascade R-CNN: Delving into High Quality Object Detection
  564. Scalable and Effective Deep CCA via Soft Decorrelation
  565. Discriminability objective for training descriptive captions
  566. Going from Image to Video Saliency: Augmenting Image Salience with Dynamic Attentional Push
  567. Recurrent Scene Parsing with Perspective Understanding in the Loop
  568. Semantic Video Segmentation by Gated Recurrent Flow Propagation
  569. FlipDial: A Generative Model for Two-Way Visual Dialogue
  570. Context Encoding for Semantic Segmentation
  571. Deep Marching Cubes: Learning Explicit Surface Representations
  572. Rethinking Feature Distribution for Loss Functions in Image Classification
  573. Optical Flow Guided Feature: A Motion Representation for Video Action Recognition
  574. Multimodal Explanations: Justifying Decisions and Pointing to the Evidence
  575. HATS: Histograms of Averaged Time Surfaces for Robust Event-based Object Classification
  576. Imagine it for me: Generative Adversarial Approach for Zero-Shot Learning from Noisy Texts
  577. Co-Occurrence Template Matching
  578. Defense against Universal Adversarial Perturbations
  579. PPFNet: Global Context Aware Local Features for Robust 3D Point Matching
  580. Dynamic Zoom-in Network for Fast Object Detection in Large Images
  581. Objects as context for detecting their semantic parts
  582. Spline Error Weighting for Robust Visual-Inertial Fusion
  583. GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation
  584. Where and Why Are They Looking? Jointly Inferring Human Attention and Intentions in Complex Tasks
  585. Robust Facial Landmark Detection via a Fully-Convolutional Local-Global Context Network
  586. Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net
  587. CondenseNet: An Efficient DenseNet using Learned Group Convolutions
  588. Burst Denoising with Kernel Prediction Networks
  589. Leveraging Unlabeled Data for Crowd Counting by Learning to Rank
  590. Recurrent Saliency Transformation Network: Incorporating Multi-Stage Visual Cues for Small Organ Segmentation
  591. Classifier Learning with Prior Probabilities for Facial Action Unit Recognition
  592. Active Fixation Control to Predict Saccade Sequences
  593. Reflection Removal for Large-Scale 3D Point Clouds
  594. Mesoscopic Facial Geometry Inference Using Deep Neural Networks
  595. VITON: An Image-based Virtual Try-on Network
  596. Beyond the Pixel-Wise Loss for Topology-Aware Delineation
  597. HashGAN: Deep Learning to Hash with Pair Conditional Wasserstein GAN
  598. A Globally Optimal Solution to the Non-Minimal Relative Pose Problem
  599. Learning distributions of shape trajectories from longitudinal datasets: a hierarchical model on a manifold of diffeomorphisms
  600. Multispectral Image Intrinsic Decomposition via Low Rank Constraint
  601. Dynamic Graph Generation Network: Generating Relational Knowledge from Diagrams
  602. Alternating-Stereo VINS: Observability Analysis and Performance Evaluation
  603. Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View
  604. Style Aggregated Network for Facial Landmark Detection
  605. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection
  606. Supervision-by-Registration: An Unsupervised Approach to Improve the Precision of Facial Landmark Detectors
  607. Deep Adversarial Subspace Clustering
  608. Compassionately Conservative Balanced Cuts for Image Segmentation
  609. Deformable GANs for Pose-based Human Image Generation
  610. Avatar-Net: Multi-scale Zero-shot Style Transfer by Feature Decoration
  611. The iNaturalist Species Classification and Detection Dataset
  612. Categorizing Concepts with Basic Level for Vision-to-Language
  613. InverseFaceNet: Deep Monocular Inverse Face Rendering at over 250 Hz
  614. Textbook Question Answering under Teacher Guidance with Memory Networks
  615. Learning to Find Good Correspondences
  616. Hyperparameter Optimization for Tracking with Continuous Deep Q-Learning
  617. Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data
  618. Weakly Supervised Facial Action Unit Recognition through Adversarial Training
  619. Knowledge Aided Consistency for Weakly Supervised Phrase Grounding
  620. Neighbors Do Help: Deeply Exploiting Local Structures of Point Clouds
  621. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric
  622. Dense 3D Regression for Hand Pose Estimation
  623. Detail-Preserving Pooling in Deep Networks
  624. Dense Decoder Shortcut Connections for Single-Pass Semantic Segmentation
  625. Reinforcement Cutting-Agent Learning for Video Object Segmentation
  626. SketchyGAN: Towards Diverse and Realistic Sketch to Image Synthesis
  627. Wrapped Gaussian Process Regression on Riemannian Manifolds
  628. Document Enhancement using Visibility Detection
  629. Learning Discriminative Evaluation Metrics for Image Captioning
  630. GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning
  631. Learning Intelligent Dialogs for Bounding Box Annotation
  632. Efficient Diverse Ensemble for Discriminative Co-Tracking
  633. Recovering Realistic Texture in Image Super-resolution by Spatial Feature Modulation
  634. Mining on Manifolds: Metric Learning without Labels
  635. Revisiting knowledge transfer for training object class detectors
  636. GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose
  637. Differential Attention for Visual Question Answering
  638. A PID Controller Approach for Stochastic Optimization of Deep Networks
  639. Bootstrapping the Performance of Webly Supervised Semantic Segmentation
  640. Iterative Learning with Open-set Noisy Labels
  641. A Papier-Mâché Approach to Learning 3D Surface Generation
  642. Extreme 3D Face Reconstruction: Looking Past Occlusions
  643. High-speed Tracking with Multi-kernel Correlation Filters
  644. Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification
  645. Separating Style and Content for Generalized Style Transfer
  646. Learning Dual Convolutional Neural Networks for Low-Level Vision
  647. Wasserstein Introspective Neural Networks
  648. Deep Semantic Face Deblurring
  649. InLoc: Indoor Visual Localization with Dense Matching and View Synthesis
  650. Temporal Hallucinating for Action Recognition with Few Still Images
  651. Deep Texture Manifold for Ground Terrain Recognition
  652. Discriminative Learning of Latent Features for Zero-Shot Recognition
  653. Neural Sign Language Translation
  654. GroupCap: Group-based Image Captioning with Structured Relevance and Diversity Constraints
  655. Repulsion Loss: Detecting Pedestrians in a Crowd
  656. Pulling Actions out of Context: Explicit Separation for Effective Combination
  657. Deep Group-shuffling Random Walk for Person Re-identification
  658. DenseASPP: Densely Connected Networks for Semantic Segmentation
  659. A Variational U-Net for Conditional Appearance and Shape Generation
  660. Universal Denoising Networks: A Novel CNN-based Network Architecture for Image Denoising
  661. Automatic 3D Indoor Scene Modeling from Single Panorama
  662. Five-point Fundamental Matrix Estimation for Uncalibrated Cameras
  663. PU-Net: Point Cloud Upsampling Network
  664. Generative Image Inpainting with Contextual Attention
  665. Im2Flow: Motion Hallucination from Static Images for Action Recognition
  666. Tagging Like Humans: Diverse and Distinct Image Annotation
  667. TextureGAN: Controlling Deep Image Synthesis with Texture Patches
  668. ISTA-Net: Interpretable Optimization-Inspired Deep Network for Image Compressive Sensing
  669. Optimizing Video Object Detection via a Scale-Time Lattice
  670. Context Embedding Networks
  671. Motion-Guided Cascaded Refinement Network for Video Object Segmentation
  672. RotationNet: Joint Object Categorization and Pose Estimation Using Multiviews from Unsupervised Viewpoints
  673. Conditional Generative Adversarial Network for Structured Domain Adaptation
  674. Large-scale Distance Metric Learning with Uncertainty
  675. Hierarchical Novelty Detection for Visual Object Recognition
  676. Deeper Look at Power Normalizations
  677. Disentangling Factors of Variation by Mixing Them
  678. Beyond Holistic Object Recognition: Enriching Image Understanding with Part States
  679. LSTM Pose Machines
  680. End-to-end Recovery of Human Shape and Pose
  681. Geometric Multi-Model Fitting with a Convex Relaxation Algorithm
  682. Revisiting Salient Object Detection: Simultaneous Detection, Ranking, and Subitizing of Multiple Salient Objects
  683. Modulated Convolutional Networks
  684. High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs
  685. Learning Compressible 360° Video Isomers
  686. Easy Identification from Better Constraints: Multi-Shot Person Re-Identification from Reference Constraints
  687. TieNet: Text-Image Embedding Network for Common Thorax Disease Classification and Reporting in Chest X-rays
  688. Good View Hunting: Learning Photo Composition from 1 Million View Pairs
  689. Visual Relationship Learning with a Factorization-based Prior
  690. Min-Entropy Latent Model for Weakly Supervised Object Detection
  691. Boundary Flow: A Siamese Network that Predicts Boundary Motion without Training on Motion
  692. SfSNet: Learning Shape, Reflectance and Illuminance of Faces 'in the wild'
  693. Facial Expression Recognition by De-expression Residue Learning
  694. Empirical study of the topology and geometry of deep networks
  695. Learning Globally Optimized Object Detector via Policy Gradient
  696. Learning from Synthetic Data: Semantic Segmentation using Generative Adversarial Networks
  697. Recurrent Residual Module for Fast Inference in Videos
  698. Viewpoint-aware Attentive Multi-view Inference for Vehicle Re-identification
  699. Weakly-Supervised Semantic Segmentation Network with Deep Seeded Region Growing
  700. Deep Adversarial Metric Learning
  701. Learning Deep Models for Face Anti-Spoofing: Binary or Auxiliary Supervision
  702. Art of singular vectors and universal adversarial perturbations
  703. Free supervision from video games
  704. Unifying Identification and Context Learning for Person Recognition
  705. DensePose: Multi-Person Dense Human Pose Estimation In The Wild
  706. End-to-end Convolutional Semantic Embeddings
  707. Convolutional Image Captioning
  708. Inferring Semantic Layout for Hierarchical Text-to-Image Synthesis
  709. Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++
  710. Nonlinear 3D Face Morphable Model
  711. OATM: Occlusion Aware Template Matching by Consensus Set Maximization
  712. Multi-Image Semantic Matching by Mining Consistent Features
  713. Explicit Loss-Error-Aware Quantization for Deep Neural Networks
  714. Modeling Facial Geometry using Compositional VAEs
  715. Encoding Crowd Interaction with Deep Neural Network for Pedestrian Trajectory Prediction
  716. DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection under Partial Occlusion
  717. Attentional ShapeContextNet for Point Cloud Recognition
  718. Weakly Supervised Instance Segmentation using Class Peak Response
  719. Fast and Robust Estimation for Unit-Norm Constrained Linear Fitting Problems
  720. Maximum Classifier Discrepancy for Unsupervised Domain Adaptation
  721. Multi-Level Factorisation Net for Person Re-Identification
  722. Video Based Reconstruction of 3D People Models
  723. Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer
  724. Logo Synthesis and Manipulation with Clustered Generative Adversarial Networks
  725. Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering
  726. Image Super-resolution via Dual-state Recurrent Neural Networks
  727. Excitation Backprop for RNNs
  728. Image Generation from Scene Graphs
  729. Learning Spatial-Temporal Regularized Correlation Filters for Visual Tracking
  730. Image Restoration by Estimating Frequency Distribution of Local Patches
  731. Learning to Adapt Structured Output Space for Semantic Segmentation
  732. Deep Spatial Feature Reconstruction for Partial Person Re-identification
  733. Tight Nonconvex Relaxation of MAP Inference
  734. Multiple Granularity Group Interaction Prediction
  735. Accurate and Diverse Sampling of Sequences based on a "Best of Many" Sample Objective
  736. Learning Rich Features for Image Manipulation Detection
  737. DA-GAN: Instance-level Image Translation by Deep Attention Generative Adversarial Network
  738. A Benchmark for Articulated Human Pose Estimation and Tracking
  739. Preserving Semantic Relations for Zero-Shot Learning
  740. Geometry-Aware Scene Text Detection with Instance Transformation Network
  741. CleanNet: Transfer Learning for Scalable Image Classifier Training with Label Noise
  742. Joint Cuts and Matching of Partitions in One Graph
  743. Fast and Accurate Online Video Object Segmentation via Tracking Parts
  744. Learning Nested Structures in Deep Neural Networks
  745. Practical Block-wise Neural Network Architecture Generation
  746. AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation
  747. Modifying Non-Local Variations Across Multiple Views
  748. Connecting Pixels to Privacy and Utility: Automatic Redaction of Private Information in Images
  749. Divide and Grow: Capturing Huge Diversity in Crowd Images with Incrementally Growing CNN
  750. When will you do what? - Anticipating Temporal Occurrences of Activities
  751. Visual Question Answering with Memory-Augmented Networks
  752. Stochastic Variational Inference with Gradient Linearization
  753. Human Pose Estimation with Parsing Induced Learner
  754. 3D Registration of Curves and Surfaces using Local Differential Information
  755. Deformation Aware Image Compression
  756. PoseFlow: A Deep Motion Representation for Understanding Human Behaviors in Videos
  757. MovieGraphs: Towards Understanding Human-Centric Situations from Videos
  758. Hybrid Camera Pose Estimation
  759. Fast Monte-Carlo Localization on Aerial Vehicles using Approximate Continuous Belief Representations
  760. PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume
  761. Hierarchical Recurrent Attention Networks for Structured Online Maps
  762. Learning Less is More - 6D Camera Localization via 3D Surface Regression
  763. Visual Question Generation as Dual Task of Visual Question Answering
  764. 3D Object Detection with Latent Support Surfaces
  765. An Analysis of Scale Invariance in Object Detection - SNIP
  766. 3D Semantic Trajectory Reconstruction from 3D Pixel Continuum
  767. KIPPI: KInetic Polygonal Partitioning of Images
  768. COCO-Stuff: Thing and Stuff Classes in Context
  769. Joint Optimization Framework for Learning with Noisy Labels
  770. Improved Lossy Image Compression with Priming and Spatially Adaptive Bit Rates for Recurrent Networks
  771. Deep Cost-Sensitive and Order-Preserving Feature Learning for Cross-Population Age Estimation
  772. Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments
  773. Deep Back-Projection Networks For Super-Resolution
  774. Generating a Fusion Image: One's Identity and Another's Shape
  775. V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map
  776. Long-Term On-Board Prediction of People in Traffic Scenes under Uncertainty
  777. Cross-modal Deep Variational Hand Pose Estimation
  778. Learning to Estimate 3D Human Pose and Shape from a Single Color Image
  779. Video Rain Removal By Multiscale Convolutional Sparse Coding
  780. Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning
  781. Learning 3D Shape Completion from Point Clouds with Weak Supervision
  782. SplineCNN: Fast Geometric Deep Learning with Continuous B-Spline Kernels
  783. Salience Guided Depth Calibration for Perceptually Optimized Compressive Light Field 3D Display
  784. Weakly-supervised Deep Convolutional Neural Network Learning for Facial Action Unit Intensity Estimation
  785. Rolling Shutter and Radial Distortion are Features for High Frame Rate Multi-camera Tracking
  786. Robust Hough Transform Based 3D Reconstruction from Circular Light Fields
  787. Feedback-prop: Convolutional Neural Network Inference under Partial Evidence
  788. Learning Strict Identity Mappings in Deep Residual Networks
  789. Residual Parameter Transfer for Deep Domain Adaptation
  790. Exploring Disentangled Feature Representation Beyond Face Identification
  791. SPLATNet: Sparse Lattice Networks for Point Cloud Processing
  792. Unsupervised Training for 3D Morphable Model Regression
  793. A Bi-directional Message Passing Model for Salient Object Detection
  794. Learning to See in the Dark
  795. Erase or Fill? Deep Joint Recurrent Rain Removal and Reconstruction in Videos
  796. Finding beans in burgers: Deep semantic-visual embedding with localization
  797. Referring Relationships
  798. Adversarially Learned One-Class Classifier for Novelty Detection
  799. Surface Networks
  800. Efficient parametrization of multi-domain deep neural networks
  801. Recognizing Human Actions as Evolution of Pose Estimation Maps
  802. Soccer on Your Tabletop
  803. CVM-Net: Cross-View Matching Network for Image-Based Ground-to-Aerial Geo-Localization
  804. Gesture Recognition: Focus on the Hands
  805. Video Object Segmentation via Inference in A CNN-Based Higher-Order Spatio-Temporal MRF
  806. Real-world Anomaly Detection in Surveillance Videos
  807. Learning a Single Convolutional Super-Resolution Network for Multiple Degradations
  808. Iterative Visual Reasoning Beyond Convolutions
  809. Guide Me: Interacting with Deep Networks
  810. PiCANet: Learning Pixel-wise Contextual Attention for Saliency Detection
  811. Future Frame Prediction for Anomaly Detection - A New Baseline
  812. Structure Preserving Video Prediction
  813. Zero-Shot Visual Recognition using Semantics-Preserving Adversarial Embedding Networks
  814. Captioning Images with Style Transfer from Unaligned Text Corpora
  815. Anatomical Priors in Convolutional Networks for Unsupervised Biomedical Segmentation
  816. Illuminant Spectra-based Source Separation Using Flash Photography
  817. 3D Human Pose Reconstruction and Action Classification in Robot Assisted Therapy of Children with Autism
  818. Discrete-Continuous ADMM for Transductive Inference in Higher-Order MRFs
  819. Classification Driven Dynamic Image Enhancement
  820. Feature Generating Networks for Zero-Shot Learning
  821. Beyond Trade-off: Accelerate FCN-based Face Detection with Higher Accuracy
  822. MiCT: Mixed 3D/2D Convolutional Tube for Human Action Recognition
  823. Unsupervised Learning and Segmentation of Complex Activities from Video
  824. Sparse Photometric 3D Face Reconstruction Guided by Morphable Models
  825. LSTM stack-based Neural Multi-sequence Alignment TeCHnique (NeuMATCH)
  826. Inverse Composition Discriminative Optimization for Point Cloud Registration
  827. Inference in Higher Order MRF-MAP Problems with Small and Large Cliques
  828. Look at Boundary: A Boundary-Aware Face Alignment Algorithm
  829. LEGO: Learning Edge with Geometry all at Once by Watching Videos
  830. CosFace: Large Margin Cosine Loss for Deep Face Recognition
  831. Learning Semantic Concepts and Order for Image and Sentence Matching
  832. Learning to Look Around: Intelligently Exploring Unseen Environments for Unknown Tasks
  833. Low-shot learning with large-scale diffusion
  834. Multimodal Visual Concept Learning with Weakly Supervised Techniques
  835. Cross-View Image Synthesis using Conditional Generative Adversarial Nets
  836. Pixel-Wise Metric Learning for Blazingly Fast Video Object Segmentation
  837. PieAPP: Perceptual Image-Error Assessment through Pairwise Preference
  838. Cube Padding for Weakly-Supervised Saliency Prediction in 360° Videos
  839. CRRN: Multi-Scale Guided Concurrent Reflection Removal Network
  840. Stereoscopic Neural Style Transfer
  841. Low-shot Learning from Imaginary Data
  842. Fast, Simple, and Effective Resource-Constrained Structure Learning of Deep Networks
  843. Unsupervised Sparse Dirichlet-Net for Hyperspectral Image Super-Resolution
  844. Visual Grounding via Accumulated Attention
  845. Event-based Vision meets Deep Learning on Steering Prediction for Self-driving Cars
  846. Monocular 3D Pose and Shape Estimation of Multiple People in Natural Scenes
  847. Actor and Action Video Segmentation from a Sentence
  848. AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks
  849. CartoonGAN: Generative Adversarial Networks for Photo Cartoonization
  850. RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials
  851. Tracking Multiple Objects Outside the Line of Sight using Speckle Imaging
  852. Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation
  853. Densely Connected Pyramid Dehazing Network
  854. Matching Adversarial Networks
  855. Automatic Map Inference from Aerial Images
  856. Polarimetric Dense Monocular SLAM
  857. Learning Attribute Representations with Localization for Flexible Fashion Search
  858. Self-Supervised Adversarial Hashing Networks for Cross-Modal Retrieval
  859. Unsupervised CCA
  860. Analyzing Filters Toward Efficient ConvNet
  861. Good Appearance Features for Multi-Target Multi-Camera Tracking
  862. Are You Talking to Me? Reasoned Visual Dialog Generation through Adversarial Learning
  863. Efficient Optimization for Rank-based Loss Functions
  864. ST-GAN: Spatial Transformer Generative Adversarial Networks for Image Compositing
  865. A Perceptual Measure for Deep Single Image Camera Calibration
  866. Radially-Distorted Conjugate Translations
  867. Multi-task Learning by Maximizing Statistical Dependence
  868. Creating Capsule Wardrobes from Fashion Images
  869. Towards Human-Machine Cooperation: Evolving Active Learning with Self-supervised Process for Object Detection
  870. Synthesizing Images of Humans in Unseen Poses
  871. Learning to Act Properly: Predicting and Explaining Affordances from Images
  872. Pyramid Stereo Matching Network
  873. Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene
  874. A General Two-Step Quantization Approach for Low-bit Neural Networks with High Accuracy
  875. GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition
  876. Convolutional Neural Networks with Alternately Updated Clique
  877. Squeeze-and-Excitation Networks
  878. NISP: Pruning Networks using Neuron Importance Score Propagation
  879. Audio to Body Dynamics
  880. ID-GAN: Learning a Symmetry Three-Player GAN for Identity-Preserving Face Synthesis
  881. Deep Learning of Graph Matching
  882. Neural Baby Talk
  883. Efficient Video Object Segmentation via Network Modulation
  884. Regularizing Deep Networks by Modeling and Predicting Label Structure
  885. Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi-Supervised Semantic Segmentation
  886. Face Detector Adaptation without Negative Transfer or Catastrophic Forgetting
  887. Motion-Appearance Co-Memory Networks for Video Question Answering
  888. Compare and Contrast: Learning Prominent Visual Differences
  889. Tangent Convolutions for Dense Prediction in 3D
  890. Single-Shot Object Detection with Enriched Semantics
  891. Generating Synthetic X-ray Images of a Person from the Surface Geometry
  892. Don't Just Assume; Look and Answer: Overcoming Priors for Visual Question Answering
  893. Edit Probability for Scene Text Recognition
  894. MaskLab: Instance Segmentation by Refining Object Detection with Semantic and Direction Features
  895. Weakly-Supervised Action Segmentation with Iterative Soft Boundary Assignment
  896. Texture Mapping for 3D Reconstruction with RGB-D Sensor
  897. Multi-Agent Diverse Generative Adversarial Networks
  898. Towards Universal Representation for Unseen Action Recognition
  899. Zero-Shot Kernel Learning
  900. DOTA: A Large-scale Dataset for Object Detection in Aerial Images
  901. Multi-Frame Quality Enhancement for Compressed Video
  902. From Lifestyle VLOGs to Everyday Interactions
  903. Occluded Pedestrian Detection through Guided Attention in CNNs
  904. Decoupled Networks
  905. Deep Cocktail Networks: Multi-source Unsupervised Domain Adaptation with Category Shift
  906. Partially Shared Multi-Task Convolutional Neural Network with Local Constraint for Face Attribute Learning
  907. Joint Pose and Expression Modeling for Facial Expression Recognition
  908. Unsupervised Textual Grounding: Linking Words to Image Concepts
  909. Interleaved Structured Sparse Convolutional Neural Networks
  910. Look, Imagine and Match: Improving Textual-Visual Cross-Modal Retrieval with Generative Models
  911. ROAD: Reality Oriented Adaptation for Semantic Segmentation of Urban Scenes
  912. Image to Image Translation for Domain Adaptation
  913. A Face to Face Neural Conversation Model
  914. Image-Image Domain Adaptation with Preserved Self-Similarity and Domain-Dissimilarity for Person Re-identification
  915. FSRNet: End-to-End Learning Face Super-Resolution with Facial Priors
  916. SO-Net: Self-Organizing Network for Point Cloud Analysis
  917. MoNet: Moments Embedding Network
  918. Coupled End-to-end Transfer Learning with Generalized Fisher Information
  919. Inferring Light Fields from Shadows
  920. LayoutNet: Reconstructing the 3D Room Layout from a Single RGB Image
  921. Multi-Level Fusion based 3D Object Detection from Monocular Images
  922. Single-Image Depth Estimation Based on Fourier Domain Analysis
  923. Flow Guided Recurrent Neural Encoder for Video Salient Object Detection
  924. Super-Resolving Very Low-Resolution Face Images with Supplementary Attributes
  925. Seeing Voices and Hearing Faces: Cross-Modal Biometric Matching
  926. Feature Mapping for Learning Fast and Accurate 3D Pose Inference from Synthetic Images
  927. Fast and Accurate Single Image Super-Resolution via Information Distillation Network
  928. Learning and Using the Arrow of Time
  929. Rethinking the Faster R-CNN Architecture for Temporal Action Localization
  930. Deeply Learned Filter Response Functions for Hyperspectral Reconstruction
  931. Fusing Crowd Density Maps and Visual Object Trackers for People Tracking in Crowd Scenes
  932. Intrinsic Image Transformation via Scale Space Decomposition
  933. Deep Ordinal Regression Network for Monocular Depth Estimation
  934. Multi-Oriented Scene Text Detection via Corner Localization and Region Segmentation
  935. Functional Map of the World
  936. CSGNet: Neural Shape Parser for Constructive Solid Geometry
  937. Instance Embedding Transfer to Unsupervised Video Object Segmentation
  938. Statistical Tomography of Microscopic Life
  939. Point-wise Convolutional Neural Networks
  940. PIXOR: Real-time 3D Object Detection from Point Clouds
  941. HydraNets: Specialized Dynamic Architectures for Efficient Inference
  942. Deep Depth Completion of a Single RGB-D Image
  943. Learning to Extract a Video Sequence from a Single Motion-Blurred Image
  944. A Fast Resection-Intersection Method for the Known Rotation Problem
  945. iVQA: Inverse Visual Question Answering
  946. Crowd Counting via Adversarial Cross-Scale Consistency Pursuit
  947. Trust your Model: Light Field Depth Estimation with inline Occlusion Handling
  948. PointNetVLAD: Deep Point Cloud Based Retrieval for Large-Scale Place Recognition
  949. A Memory Network Approach for Story-based Temporal Summarization of 360° Videos
  950. Tags2Parts: Discovering Semantic Regions from Shape Tags
  951. Jerk-Aware Video Acceleration Magnification
  952. A Robust Method for Strong Rolling Shutter Effects Correction Using Lines with Automatic Feature Selection
  953. Mobile Video Object Detection with Temporally-Aware Feature Maps
  954. VirtualHome: Simulating Household Activities via Programs
  955. MoNet: Deep Motion Exploitation for Video Object Segmentation
  956. Detect Globally, Refine Locally: A Novel Approach to Saliency Detection
  957. EPINET: A Fully-Convolutional Neural Network for Light Field Depth Estimation by Using Epipolar Geometry
  958. Learning Face Age Progression: A Pyramid Architecture of GANs
  959. Normalized Cut Loss for Weakly Supervised CNN Segmentation
  960. Reconstructing Thin Structures of Manifold Surfaces by Integrating Spatial Curves
  961. Dynamic Few-Shot Visual Learning without Forgetting
  962. Camera Style Adaptation for Person Re-identification
  963. In-Place Activated BatchNorm for Memory-Optimized Training of DNNs
  964. NeuralNetwork-Viterbi: A Framework for Weakly Supervised Video Learning
  965. Resource Aware Person Re-identification across Multiple Resolutions
  966. Zero-Shot Super-Resolution using Deep Internal Learning
  967. Analysis of Hand Segmentation in the Wild
  968. Who's Better? Who's Best? Pairwise Deep Ranking for Skill Determination
  969. Face Aging with Identity-Preserved Conditional Generative Adversarial Networks
  970. Deep Extreme Cut: From Extreme Points to Object Segmentation
  971. Person Re-identification with Cascaded Pairwise Convolutions
  972. Distributable Consistent Multi-Graph Matching
  973. A Twofold Siamese Network for Real-Time Object Tracking
  974. AON: Towards Arbitrarily-Oriented Text Recognition
  975. Deep Cauchy Hashing for Hamming Space Retrieval
  976. Non-blind Deblurring: Handling Kernel Uncertainty with CNNs
  977. Referring Image Segmentation via Recurrent Refinement Networks
  978. Deep Density Clustering of Unconstrained Faces
  979. A Constrained Deep Neural Network for Ordinal Regression

cvpr2018-papers's People

Contributors

kaluo-zz
