

2019


Attacking Optical Flow

Ranjan, A., Janai, J., Geiger, A., Black, M. J.

In International Conference on Computer Vision, November 2019 (inproceedings)

Abstract
Deep neural nets achieve state-of-the-art performance on the problem of optical flow estimation. Since optical flow is used in several safety-critical applications like self-driving cars, it is important to gain insights into the robustness of those techniques. Recently, it has been shown that adversarial attacks easily fool deep neural networks to misclassify objects. The robustness of optical flow networks to adversarial attacks, however, has not been studied so far. In this paper, we extend adversarial patch attacks to optical flow networks and show that such attacks can compromise their performance. We show that corrupting a small patch of less than 1% of the image size can significantly affect optical flow estimates. Our attacks lead to noisy flow estimates that extend significantly beyond the region of the attack, in many cases even completely erasing the motion of objects in the scene. While networks using an encoder-decoder architecture are very sensitive to these attacks, we found that networks using a spatial pyramid architecture are less affected. We analyse the success and failure of attacking both architectures by visualizing their feature maps and comparing them to classical optical flow techniques which are robust to these attacks. We also demonstrate that such attacks are practical by placing a printed pattern into real scenes.
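
As a concrete illustration of the kind of patch attack studied here, the sketch below shows one gradient-ascent update of a small adversarial patch against a generic PyTorch flow network. The flow_net callable, the fixed mask defining the patch location, and the plain gradient step are illustrative assumptions rather than the authors' exact optimisation procedure.

    # Minimal sketch of one adversarial-patch update against an optical flow network.
    # Assumptions: flow_net(img1, img2) -> (B, 2, H, W) flow; patch is an image-sized
    # tensor whose content only matters where mask == 1; images lie in [0, 1].
    import torch

    def attack_step(flow_net, img1, img2, patch, mask, lr=1e-2):
        patch = patch.detach().requires_grad_(True)
        # Paste the patch into both frames at the masked location.
        adv1 = img1 * (1 - mask) + patch * mask
        adv2 = img2 * (1 - mask) + patch * mask
        flow_clean = flow_net(img1, img2).detach()
        flow_adv = flow_net(adv1, adv2)
        # Gradient ascent: maximise the end-point error w.r.t. the clean prediction.
        loss = -(flow_adv - flow_clean).norm(dim=1).mean()
        loss.backward()
        with torch.no_grad():
            patch = (patch - lr * patch.grad).clamp(0.0, 1.0)
        return patch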


Video Project Page Paper Supplementary Material link (url) [BibTex]



Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics

Niemeyer, M., Mescheder, L., Oechsle, M., Geiger, A.

International Conference on Computer Vision, October 2019 (conference)

Abstract
Deep learning based 3D reconstruction techniques have recently achieved impressive results. However, while state-of-the-art methods are able to output complex 3D geometry, it is not clear how to extend these results to time-varying topologies. Approaches treating each time step individually lack continuity and exhibit slow inference, while traditional 4D reconstruction methods often utilize a template model or discretize the 4D space at fixed resolution. In this work, we present Occupancy Flow, a novel spatio-temporal representation of time-varying 3D geometry with implicit correspondences. Towards this goal, we learn a temporally and spatially continuous vector field which assigns a motion vector to every point in space and time. In order to perform dense 4D reconstruction from images or sparse point clouds, we combine our method with a continuous 3D representation. Implicitly, our model yields correspondences over time, thus enabling fast inference while providing a sound physical description of the temporal dynamics. We show that our method can be used for interpolation and reconstruction tasks, and demonstrate the accuracy of the learned correspondences. We believe that Occupancy Flow is a promising new 4D representation which will be useful for a variety of spatio-temporal reconstruction tasks.
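
To make the representation concrete, the sketch below models the spatio-temporally continuous velocity field with a small MLP and advects points through it to obtain correspondences over time. The layer sizes, the missing conditioning on an input observation, and the forward-Euler integrator are simplifying assumptions for illustration only.

    # Minimal sketch of a continuous space-time velocity field and point advection.
    import torch
    import torch.nn as nn

    class VelocityField(nn.Module):
        def __init__(self, hidden=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(4, hidden), nn.ReLU(),   # input: (x, y, z, t)
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 3))              # output: motion vector

        def forward(self, points, t):
            t = t.expand(points.shape[0], 1)
            return self.net(torch.cat([points, t], dim=-1))

    def advect(field, points, n_steps=16, t0=0.0, t1=1.0):
        # Integrate points through the field to obtain correspondences over time.
        dt = (t1 - t0) / n_steps
        t = torch.tensor([t0])
        for _ in range(n_steps):
            points = points + dt * field(points, t)
            t = t + dt
        return points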


pdf poster suppmat code Project page video blog [BibTex]


Texture Fields: Learning Texture Representations in Function Space

Oechsle, M., Mescheder, L., Niemeyer, M., Strauss, T., Geiger, A.

International Conference on Computer Vision, October 2019 (conference)

Abstract
In recent years, substantial progress has been achieved in learning-based reconstruction of 3D objects. At the same time, generative models were proposed that can generate highly realistic images. However, despite this success in these closely related tasks, texture reconstruction of 3D objects has received little attention from the research community and state-of-the-art methods are either limited to comparably low resolution or constrained experimental setups. A major reason for these limitations is that common representations of texture are inefficient or hard to interface for modern deep learning techniques. In this paper, we propose Texture Fields, a novel texture representation which is based on regressing a continuous 3D function parameterized with a neural network. Our approach circumvents limiting factors like shape discretization and parameterization, as the proposed texture representation is independent of the shape representation of the 3D object. We show that Texture Fields are able to represent high frequency texture and naturally blend with modern deep learning techniques. Experimentally, we find that Texture Fields compare favorably to state-of-the-art methods for conditional texture reconstruction of 3D objects and enable learning of probabilistic generative models for texturing unseen 3D models. We believe that Texture Fields will become an important building block for the next generation of generative 3D models.
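
A minimal sketch of the idea, assuming a plain fully connected network conditioned on a single latent code (the image and shape encoders of the actual model are omitted): the texture is a continuous function mapping any 3D point, together with the condition code, to an RGB colour.

    # Sketch of a texture field: 3D point + latent code -> RGB.
    import torch
    import torch.nn as nn

    class TextureField(nn.Module):
        def __init__(self, code_dim=256, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3 + code_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 3), nn.Sigmoid())   # RGB in [0, 1]

        def forward(self, points, code):
            # points: (N, 3) surface samples, code: (code_dim,) condition vector
            code = code.unsqueeze(0).expand(points.shape[0], -1)
            return self.net(torch.cat([points, code], dim=-1))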


pdf suppmat video poster blog Project Page [BibTex]


Taking a Deeper Look at the Inverse Compositional Algorithm

Lv, Z., Dellaert, F., Rehg, J. M., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
In this paper, we provide a modern synthesis of the classic inverse compositional algorithm for dense image alignment. We first discuss the assumptions made by this well-established technique, and subsequently propose to relax these assumptions by incorporating data-driven priors into this model. More specifically, we unroll a robust version of the inverse compositional algorithm and replace multiple components of this algorithm using more expressive models whose parameters we train in an end-to-end fashion from data. Our experiments on several challenging 3D rigid motion estimation tasks demonstrate the advantages of combining optimization with learning-based techniques, outperforming the classic inverse compositional algorithm as well as data-driven image-to-pose regression approaches.
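
To make the starting point concrete, the sketch below performs one robust Gauss-Newton update of the kind that is unrolled in the paper; the residual vector, its 6-DoF Jacobian, and the per-pixel robust weights are assumed to be computed elsewhere, and in the learned variant several of these components are produced by trained networks rather than hand-crafted heuristics.

    # One damped, weighted Gauss-Newton step for dense image alignment (sketch).
    import torch

    def ic_step(residual, J, weights, damping=1e-6):
        # residual: (N,), J: (N, 6), weights: (N,); returns a pose increment (6,).
        W = weights.unsqueeze(-1)
        H = J.t() @ (W * J)                # weighted Gauss-Newton approximation
        g = J.t() @ (weights * residual)   # weighted gradient
        H = H + damping * torch.eye(6)
        delta = torch.linalg.solve(H, g)
        return -delta                      # composed inversely onto the current warp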


pdf suppmat Video Project Page Poster [BibTex]



MOTS: Multi-Object Tracking and Segmentation

Voigtlaender, P., Krause, M., Osep, A., Luiten, J., Sekar, B. B. G., Geiger, A., Leibe, B.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
This paper extends the popular task of multi-object tracking to multi-object tracking and segmentation (MOTS). Towards this goal, we create dense pixel-level annotations for two existing tracking datasets using a semi-automatic annotation procedure. Our new annotations comprise 65,213 pixel masks for 977 distinct objects (cars and pedestrians) in 10,870 video frames. For evaluation, we extend existing multi-object tracking metrics to this new task. Moreover, we propose a new baseline method which jointly addresses detection, tracking, and segmentation with a single convolutional network. We demonstrate the value of our datasets by achieving improvements in performance when training on MOTS annotations. We believe that our datasets, metrics and baseline will become a valuable resource towards developing multi-object tracking approaches that go beyond 2D bounding boxes.
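
As an illustration of how mask-based tracking metrics extend box-based ones, the snippet below computes a soft MOTSA-style score from the mask IoUs of matched tracks, the number of false positives, and the number of identity switches; the matching and counting are assumed to be done beforehand, and the paper's exact metric definitions should be used for any real evaluation.

    # Illustrative soft tracking-and-segmentation score (assumed formulation).
    def soft_motsa(matched_ious, num_false_positives, num_id_switches, num_gt_masks):
        soft_tp = sum(matched_ious)   # sum of IoUs of correctly matched masks
        return (soft_tp - num_false_positives - num_id_switches) / num_gt_masks

    # Example: three matches with IoUs 0.9/0.8/0.7, one FP, no ID switch, 4 GT masks.
    print(soft_motsa([0.9, 0.8, 0.7], 1, 0, 4))   # 0.35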


pdf suppmat Project Page Poster Video Project Page [BibTex]



PointFlowNet: Learning Representations for Rigid Motion Estimation from Point Clouds

Behl, A., Paschalidou, D., Donne, S., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
Despite significant progress in image-based 3D scene flow estimation, the performance of such approaches has not yet reached the fidelity required by many applications. Simultaneously, these applications are often not restricted to image-based estimation: laser scanners provide a popular alternative to traditional cameras, for example in the context of self-driving cars, as they directly yield a 3D point cloud. In this paper, we propose to estimate 3D motion from such unstructured point clouds using a deep neural network. In a single forward pass, our model jointly predicts 3D scene flow as well as the 3D bounding box and rigid body motion of objects in the scene. While the prospect of estimating 3D scene flow from unstructured point clouds is promising, it is also a challenging task. We show that the traditional global representation of rigid body motion prohibits inference by CNNs, and propose a translation equivariant representation to circumvent this problem. For training our deep network, a large dataset is required. Because of this, we augment real scans from KITTI with virtual objects, realistically modeling occlusions and simulating sensor noise. A thorough comparison with classic and learning-based techniques highlights the robustness of the proposed approach.


pdf suppmat Project Page Poster Video [BibTex]



Connecting the Dots: Learning Representations for Active Monocular Depth Estimation

Riegler, G., Liao, Y., Donne, S., Koltun, V., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
We propose a technique for depth estimation with a monocular structured-light camera, i.e., a calibrated stereo set-up with one camera and one laser projector. Instead of formulating the depth estimation via a correspondence search problem, we show that a simple convolutional architecture is sufficient for high-quality disparity estimates in this setting. As accurate ground-truth is hard to obtain, we train our model in a self-supervised fashion with a combination of photometric and geometric losses. Further, we demonstrate that the projected pattern of the structured light sensor can be reliably separated from the ambient information. This can then be used to improve depth boundaries in a weakly supervised fashion by modeling the joint statistics of image and depth edges. The model trained in this fashion compares favorably to the state-of-the-art on challenging synthetic and real-world datasets. In addition, we contribute a novel simulator, which allows benchmarking active depth prediction algorithms in controlled conditions.
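
The following sketch illustrates the flavour of such a self-supervised photometric term: the second view (or projector pattern) is warped into the camera view with the predicted disparity and compared to the observation. The bilinear warp via grid_sample, the plain L1 penalty, and the disparity sign convention are assumptions made for illustration.

    # Photometric loss with a horizontal disparity warp (sketch).
    import torch
    import torch.nn.functional as F

    def photometric_loss(img_cam, img_other, disparity):
        # img_*: (B, C, H, W); disparity: (B, 1, H, W) in pixels along x.
        b, _, h, w = img_cam.shape
        ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
        xs = xs.unsqueeze(0).float() - disparity[:, 0]   # shift columns by disparity
        ys = ys.unsqueeze(0).float().expand_as(xs)
        grid = torch.stack([2 * xs / (w - 1) - 1, 2 * ys / (h - 1) - 1], dim=-1)
        warped = F.grid_sample(img_other, grid, align_corners=True)
        return (img_cam - warped).abs().mean()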


pdf suppmat Poster Project Page [BibTex]



Learning Non-volumetric Depth Fusion using Successive Reprojections

Donne, S., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
Given a set of input views, multi-view stereopsis techniques estimate depth maps to represent the 3D reconstruction of the scene; these are fused into a single, consistent reconstruction -- most often a point cloud. In this work we propose to learn an auto-regressive depth refinement directly from data. While deep learning has improved the accuracy and speed of depth estimation significantly, learned MVS techniques remain limited to the plane-sweeping paradigm. We refine a set of input depth maps by successively reprojecting information from neighbouring views to leverage multi-view constraints. Compared to learning-based volumetric fusion techniques, an image-based representation allows significantly more detailed reconstructions; compared to traditional point-based techniques, our method learns noise suppression and surface completion in a data-driven fashion. Due to the limited availability of high-quality reconstruction datasets with ground truth, we introduce two novel synthetic datasets to (pre-)train our network. Our approach is able to improve both the output depth maps and the reconstructed point cloud, for both learned and traditional depth estimation front-ends, on both synthetic and real data.
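
A bare-bones sketch of the reprojection step that underlies the approach: a neighbouring view's depth map is back-projected to 3D and rendered into the reference view, so that a refinement network can reason about multi-view consistency. The pinhole model, nearest-pixel splatting, and the absence of z-buffering and occlusion handling are simplifications assumed here.

    # Reproject a neighbouring depth map into the reference view (sketch).
    import numpy as np

    def reproject_depth(depth_nbr, K, T_ref_nbr):
        # depth_nbr: (H, W); K: 3x3 intrinsics; T_ref_nbr: 4x4 pose of nbr in ref frame.
        h, w = depth_nbr.shape
        v, u = np.mgrid[0:h, 0:w]
        rays = np.linalg.inv(K) @ np.stack([u, v, np.ones_like(u)]).reshape(3, -1)
        pts = rays * depth_nbr.reshape(1, -1)                     # back-project
        pts_ref = T_ref_nbr[:3, :3] @ pts + T_ref_nbr[:3, 3:4]    # change of frame
        z = pts_ref[2]
        front = z > 1e-6
        uv = (K @ pts_ref[:, front])[:2] / z[front]
        u_r, v_r = np.round(uv[0]).astype(int), np.round(uv[1]).astype(int)
        ok = (u_r >= 0) & (u_r < w) & (v_r >= 0) & (v_r < h)
        out = np.zeros_like(depth_nbr, dtype=float)
        out[v_r[ok], u_r[ok]] = z[front][ok]                      # sparse ref-view depth
        return out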


pdf suppmat Project Page Video Poster blog [BibTex]



Superquadrics Revisited: Learning 3D Shape Parsing beyond Cuboids

Paschalidou, D., Ulusoy, A. O., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)

Abstract
Abstracting complex 3D shapes with parsimonious part-based representations has been a long-standing goal in computer vision. This paper presents a learning-based solution to this problem which goes beyond the traditional 3D cuboid representation by exploiting superquadrics as atomic elements. We demonstrate that superquadrics lead to more expressive 3D scene parses while being easier to learn than 3D cuboid representations. Moreover, we provide an analytical solution to the Chamfer loss which avoids the need for computationally expensive reinforcement learning or iterative prediction. Our model learns to parse 3D objects into consistent superquadric representations without supervision. Results on various ShapeNet categories as well as the SURREAL human body dataset demonstrate the flexibility of our model in capturing fine details and complex poses that could not have been modelled using cuboids.
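
For reference, the snippet below evaluates the standard superquadric inside-outside function for a single primitive expressed in its local frame; the size parameters and shape exponents follow the usual convention, while the network that predicts them, the pose handling, and the analytical Chamfer loss derived in the paper are omitted.

    # Standard superquadric inside-outside test (sketch).
    import torch

    def superquadric_inside(points, size, eps):
        # points: (N, 3) in the primitive frame; size: (3,) tensor (a1, a2, a3);
        # eps: (2,) tensor of shape exponents (eps1, eps2).
        x, y, z = (points / size).unbind(dim=-1)
        f = ((x.abs() ** (2 / eps[1]) + y.abs() ** (2 / eps[1])) ** (eps[1] / eps[0])
             + z.abs() ** (2 / eps[0]))
        return f <= 1.0   # True for points inside the superquadric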


Project Page Poster suppmat pdf Video blog handout [BibTex]



Real-Time Dense Mapping for Self-Driving Vehicles using Fisheye Cameras

Cui, Z., Heng, L., Yeo, Y. C., Geiger, A., Pollefeys, M., Sattler, T.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), May 2019 (inproceedings)

Abstract
We present a real-time dense geometric mapping algorithm for large-scale environments. Unlike existing methods which use pinhole cameras, our implementation is based on fisheye cameras, which have a larger field of view and benefit other tasks including visual-inertial odometry, localization, and object detection around vehicles. Our algorithm runs at approximately 15 Hz on in-vehicle PCs, enabling vision-only 3D scene perception for self-driving vehicles. For each synchronized set of images captured by multiple cameras, we first compute a depth map for a reference camera using plane-sweeping stereo. To maintain both accuracy and efficiency, while accounting for the fact that fisheye images have a rather low resolution, we recover the depths using multiple image resolutions. We adopt the fast object detection framework YOLOv3 to remove potentially dynamic objects. At the end of the pipeline, we fuse the fisheye depth images into a truncated signed distance function (TSDF) volume to obtain a 3D map. We evaluate our method on large-scale urban datasets, and results show that our method works well even in complex environments.
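
The final fusion stage can be sketched as a running-average update of a voxel grid from each depth image; the snippet below is a heavily simplified NumPy version that assumes a pinhole projection (the actual system works with fisheye geometry) and omits voxel hashing and weighting heuristics.

    # Simplified TSDF fusion of one depth image into a voxel grid (sketch).
    import numpy as np

    def fuse_depth(tsdf, weights, voxel_centers, depth, K, T_cam_world, trunc=0.2):
        # tsdf, weights: (N,) float arrays; voxel_centers: (N, 3) world coordinates;
        # depth: (H, W); K: 3x3 intrinsics; T_cam_world: 4x4 extrinsics.
        pts = (T_cam_world[:3, :3] @ voxel_centers.T + T_cam_world[:3, 3:4]).T
        z = pts[:, 2]
        front = z > 0
        uv = (K @ pts[front].T).T
        u = np.round(uv[:, 0] / uv[:, 2]).astype(int)
        v = np.round(uv[:, 1] / uv[:, 2]).astype(int)
        h, w = depth.shape
        ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
        idx = np.where(front)[0][ok]
        sdf = depth[v[ok], u[ok]] - z[idx]           # signed distance along the ray
        near = sdf > -trunc                          # skip voxels far behind the surface
        idx, d = idx[near], np.clip(sdf[near] / trunc, -1.0, 1.0)
        tsdf[idx] = (tsdf[idx] * weights[idx] + d) / (weights[idx] + 1.0)
        weights[idx] += 1.0
        return tsdf, weights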


pdf video poster Project Page [BibTex]



Project AutoVision: Localization and 3D Scene Perception for an Autonomous Vehicle with a Multi-Camera System

Heng, L., Choi, B., Cui, Z., Geppert, M., Hu, S., Kuan, B., Liu, P., Nguyen, R. M. H., Yeo, Y. C., Geiger, A., Lee, G. H., Pollefeys, M., Sattler, T.

In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), May 2019 (inproceedings)

Abstract
Project AutoVision aims to develop localization and 3D scene perception capabilities for a self-driving vehicle. Such capabilities will enable autonomous navigation in urban and rural environments, in day and night, and with cameras as the only exteroceptive sensors. The sensor suite employs many cameras for both 360-degree coverage and accurate multi-view stereo; the use of low-cost cameras keeps the cost of this sensor suite to a minimum. In addition, the project seeks to extend the operating envelope to include GNSS-less conditions which are typical for environments with tall buildings, foliage, and tunnels. Emphasis is placed on leveraging multi-view geometry and deep learning to enable the vehicle to localize and perceive in 3D space. This paper presents an overview of the project, and describes the sensor suite and current progress in the areas of calibration, localization, and perception.


pdf [BibTex]



Elastic modulus affects adhesive strength of gecko-inspired synthetics in variable temperature and humidity

Mitchell, C. T., Drotlef, D., Dayan, C. B., Sitti, M., Stark, A. Y.

In Integrative and Comparative Biology, pages: E372-E372, Oxford University Press, March 2019 (inproceedings)


[BibTex]



X-ray Optics Fabrication Using Unorthodox Approaches

Sanli, U., Baluktsian, M., Ceylan, H., Sitti, M., Weigand, M., Schütz, G., Keskinbora, K.

Bulletin of the American Physical Society, APS, 2019 (article)


[BibTex]



Microrobotics and Microorganisms: Biohybrid Autonomous Cellular Robots

Alapan, Y., Yasa, O., Yigit, B., Yasa, I. C., Erkoc, P., Sitti, M.

Annual Review of Control, Robotics, and Autonomous Systems, 2019 (article)


[BibTex]



Tailored Magnetic Springs for Shape-Memory Alloy Actuated Mechanisms in Miniature Robots

Woodward, M. A., Sitti, M.

IEEE Transactions on Robotics, 35, 2019 (article)

Abstract
Animals can incorporate large numbers of actuators because of the characteristics of muscles, whereas robots cannot, as typical motors tend to be large, heavy, and inefficient. However, shape-memory alloys (SMA), materials that contract during heating because of a change in their crystal structure, provide another option. SMA, though, is unidirectional and therefore requires an additional force to reset (extend) the actuator, which is typically provided by springs or antagonistic actuation. These strategies, however, tend to limit the actuator's work output and functionality, as their force-displacement relationships typically produce increasing resistive force with limited variability. In contrast, magnetic springs, composed of permanent magnets whose interaction force mimics a spring force, have much more variable force-displacement relationships and scale well with SMA. However, as of yet, no method for designing magnetic springs for SMA actuators has been demonstrated. Therefore, in this paper, we present a new methodology to tailor magnetic springs to the characteristics of these actuators, with experimental results both for the device and for robot-integrated SMA actuators. We found magnetic building blocks, based on sets of permanent magnets, which are well-suited to SMAs and have the potential to incorporate features such as holding force, state transitioning, friction minimization, auto-alignment, and self-mounting. We show magnetic springs that vary by more than 3 N over 750 µm, and two SMA-actuated devices that allow the MultiMo-Bat to reach heights of up to 4.5 m without, and 3.6 m with, integrated gliding airfoils. Our results demonstrate the potential of this methodology to add previously impossible functionality to smart material actuators. We anticipate this methodology will inspire broader consideration of the use of magnetic springs in miniature robots and further study of the potential of tailored magnetic springs throughout mechanical systems.


DOI [BibTex]


Magnetically Actuated Soft Capsule Endoscope for Fine-Needle Biopsy

Son, D., Gilbert, H., Sitti, M.

Soft Robotics, Mary Ann Liebert, Inc., 2019 (article)


[BibTex]



Thrust and Hydrodynamic Efficiency of the Bundled Flagella

Danis, U., Rasooli, R., Chen, C., Dur, O., Sitti, M., Pekkan, K.

Micromachines, 10, 2019 (article)


[BibTex]



The near and far of a pair of magnetic capillary disks

Koens, L., Wang, W., Sitti, M., Lauga, E.

Soft Matter, 2019 (article)


[BibTex]



Multifarious Transit Gates for Programmable Delivery of Bio‐functionalized Matters

Hu, X., Torati, S. R., Kim, H., Yoon, J., Lim, B., Kim, K., Sitti, M., Kim, C.

Small, Wiley Online Library, 2019 (article)


[BibTex]



Multi-functional soft-bodied jellyfish-like swimming

Ren, Z., Hu, W., Dong, X., Sitti, M.

Nature Communications, 10, 2019 (article)


[BibTex]


Welcome to Progress in Biomedical Engineering

Sitti, M.

Progress in Biomedical Engineering, 1, IOP Publishing, 2019 (article)


[BibTex]



Mechanics of a pressure-controlled adhesive membrane for soft robotic gripping on curved surfaces

Song, S., Drotlef, D., Paik, J., Majidi, C., Sitti, M.

Extreme Mechanics Letters, Elsevier, 2019 (article)


[BibTex]


Graphene oxide synergistically enhances antibiotic efficacy in vancomycin-resistant Staphylococcus aureus

Singh, V., Kumar, V., Kashyap, S., Singh, A. V., Kishore, V., Sitti, M., Saxena, P. S., Srivastava, A.

ACS Applied Bio Materials, ACS Publications, 2019 (article)


[BibTex]



Review of emerging concepts in nanotoxicology: opportunities and challenges for safer nanomaterial design

Singh, A. V., Laux, P., Luch, A., Sudrik, C., Wiehr, S., Wild, A., Santamauro, G., Bill, J., Sitti, M.

Toxicology Mechanisms and Methods, 2019 (article)


[BibTex]



Multifunctional and biodegradable self-propelled protein motors

Pena-Francesch, A., Giltinan, J., Sitti, M.

Nature Communications, 10, Nature Publishing Group, 2019 (article)


[BibTex]



Cohesive self-organization of mobile microrobotic swarms

Yigit, B., Alapan, Y., Sitti, M.

arXiv preprint arXiv:1907.05856, 2019 (article)


[BibTex]



Mobile microrobots for active therapeutic delivery

Erkoc, P., Yasa, I. C., Ceylan, H., Yasa, O., Alapan, Y., Sitti, M.

Advanced Therapeutics, Wiley Online Library, 2019 (article)


[BibTex]



Shape-encoded dynamic assembly of mobile micromachines

Alapan, Y., Yigit, B., Beker, O., Demirörs, A. F., Sitti, M.

Nature Materials, 18, 2019 (article)


[BibTex]



Microfluidics Integrated Lithography‐Free Nanophotonic Biosensor for the Detection of Small Molecules

Sreekanth, K. V., Sreejith, S., Alapan, Y., Sitti, M., Lim, C. T., Singh, R.

Advanced Optical Materials, 2019 (article)


[BibTex]



Gecko-inspired composite microfibers for reversible adhesion on smooth and rough surfaces

Drotlef, D., Dayan, C., Sitti, M.

In Integrative and Comparative Biology, pages: E58-E58, Oxford University Press, 2019 (inproceedings)


[BibTex]



Bio-inspired robotic collectives

Sitti, M.

Nature, 567, pages: 314-315, 2019 (article)


[BibTex]



Robust Humanoid Locomotion Using Trajectory Optimization and Sample-Efficient Learning

Yeganegi, M. H., Khadiv, M., Moosavian, S. A. A., Zhu, J., Prete, A. D., Righetti, L.

Proceedings of the IEEE-RAS International Conference on Humanoid Robots (Humanoids), IEEE, 2019 (conference)

Abstract
Trajectory optimization (TO) is one of the most powerful tools for generating feasible motions for humanoid robots. However, including uncertainties and stochasticity in the TO problem to generate robust motions can easily lead to intractable problems. Furthermore, since the models used in TO always have some level of abstraction, it can be hard to find a realistic set of uncertainties in the model space. In this paper we leverage a sample-efficient learning technique (Bayesian optimization) to robustify TO for humanoid locomotion. The main idea is to use data from full-body simulations to make the TO stage robust by tuning the cost weights. To this end, we split the TO problem into two phases. The first phase solves a convex optimization problem for generating center of mass (CoM) trajectories based on simplified linear dynamics. The second stage employs iterative Linear-Quadratic Gaussian (iLQG) as a whole-body controller to generate full-body control inputs. Then we use Bayesian optimization to find the cost weights for the first stage that yield robust performance in simulation/experiment in the presence of different disturbances and uncertainties. The results show that the proposed approach is able to generate robust motions for different sets of disturbances and uncertainties.
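
A minimal sketch of the outer loop described above, assuming two hypothetical stand-in functions for the paper's convex CoM trajectory optimization and the iLQG whole-body rollout, and using scikit-optimize as one possible Bayesian optimization backend:

    # Tune trajectory-optimization cost weights with Bayesian optimization (sketch).
    import numpy as np
    from skopt import gp_minimize

    def solve_trajectory_optimization(cost_weights):
        # Placeholder for the first stage (convex CoM trajectory optimization).
        return np.asarray(cost_weights)

    def rollout_with_disturbances(plan):
        # Placeholder for the second stage (iLQG whole-body rollout under disturbances);
        # returns the cost to be minimized (e.g. tracking error or failure penalty).
        return float(np.sum((plan - 1.0) ** 2))

    def objective(cost_weights):
        return rollout_with_disturbances(solve_trajectory_optimization(cost_weights))

    result = gp_minimize(objective, dimensions=[(0.1, 10.0)] * 3, n_calls=20)
    print("best cost weights:", result.x)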


https://arxiv.org/abs/1907.04616 [BibTex]



Peptide-Induced Biomineralization of Tin Oxide (SnO2) Nanoparticles for Antibacterial Applications

Singh, A. V., Jahnke, T., Xiao, Y., Wang, S., Yu, Y., David, H., Richter, G., Laux, P., Luch, A., Srivastava, A., Saxena, P. S., Bill, J., Sitti, M.

Journal of Nanoscience and Nanotechnology, 19, American Scientific Publishers, 2019 (article)


[BibTex]



Electromechanical actuation of dielectric liquid crystal elastomers for soft robotics

Davidson, Z., Shahsavan, H., Guo, Y., Hines, L., Xia, Y., Yang, S., Sitti, M.

Bulletin of the American Physical Society, APS, 2019 (article)


[BibTex]



NoVA: Learning to See in Novel Viewpoints and Domains

Coors, B., Condurache, A. P., Geiger, A.

In International Conference on 3D Vision (3DV), 2019 (inproceedings)

Abstract
Domain adaptation techniques enable the re-use and transfer of existing labeled datasets from a source to a target domain in which little or no labeled data exists. Recently, image-level domain adaptation approaches have demonstrated impressive results in adapting from synthetic to real-world environments by translating source images to the style of a target domain. However, the domain gap between source and target may not only be caused by a different style but also by a change in viewpoint. This case necessitates a semantically consistent translation of source images and labels to the style and viewpoint of the target domain. In this work, we propose the Novel Viewpoint Adaptation (NoVA) model, which enables unsupervised adaptation to a novel viewpoint in a target domain for which no labeled data is available. NoVA utilizes an explicit representation of the 3D scene geometry to translate source view images and labels to the target view. Experiments on adaptation to synthetic and real-world datasets show the benefit of NoVA compared to state-of-the-art domain adaptation approaches on the task of semantic segmentation.


pdf suppmat poster video [BibTex]



Learning to Navigate Endoscopic Capsule Robots

Turan, M., Almalioglu, Y., Gilbert, H. B., Mahmood, F., Durr, N. J., Araujo, H., Sarı, A. E., Ajay, A., Sitti, M.

IEEE Robotics and Automation Letters, 4, 2019 (article)


[BibTex]



Occupancy Networks: Learning 3D Reconstruction in Function Space

Mescheder, L., Oechsle, M., Niemeyer, M., Nowozin, S., Geiger, A.

In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019 (inproceedings)

Abstract
With the advent of deep neural networks, learning-based approaches for 3D reconstruction have gained popularity. However, unlike for images, in 3D there is no canonical representation which is both computationally and memory efficient yet allows for representing high-resolution geometry of arbitrary topology. Many of the state-of-the-art learning-based 3D reconstruction approaches can hence only represent very coarse 3D geometry or are limited to a restricted domain. In this paper, we propose Occupancy Networks, a new representation for learning-based 3D reconstruction methods. Occupancy networks implicitly represent the 3D surface as the continuous decision boundary of a deep neural network classifier. In contrast to existing approaches, our representation encodes a description of the 3D output at infinite resolution without excessive memory footprint. We validate that our representation can efficiently encode 3D structure and can be inferred from various kinds of input. Our experiments demonstrate competitive results, both qualitatively and quantitatively, for the challenging tasks of 3D reconstruction from single images, noisy point clouds and coarse discrete voxel grids. We believe that occupancy networks will become a useful tool in a wide variety of learning-based 3D tasks.
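
A minimal sketch of the representation, assuming a plain conditional MLP: the network maps a 3D point and a latent code describing the input observation to an occupancy probability, and the reconstructed surface is its 0.5 decision boundary. Layer sizes and the dense-grid evaluation below are illustrative; the paper uses a more elaborate conditional architecture and a multi-resolution isosurface extraction procedure.

    # Occupancy network sketch: point + latent code -> occupancy probability.
    import torch
    import torch.nn as nn

    class OccupancyNetwork(nn.Module):
        def __init__(self, code_dim=256, hidden=256):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(3 + code_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, hidden), nn.ReLU(),
                nn.Linear(hidden, 1))

        def forward(self, points, code):
            code = code.unsqueeze(0).expand(points.shape[0], -1)
            return torch.sigmoid(self.net(torch.cat([points, code], dim=-1))).squeeze(-1)

    def occupancy_grid(model, code, resolution=64):
        # Evaluate occupancy on a dense grid; the mesh is the 0.5 level set.
        axis = torch.linspace(-0.5, 0.5, resolution)
        grid = torch.stack(torch.meshgrid(axis, axis, axis, indexing="ij"), dim=-1)
        with torch.no_grad():
            occ = model(grid.reshape(-1, 3), code)
        return occ.reshape(resolution, resolution, resolution)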


Code Video pdf suppmat Project Page blog [BibTex]


2014


Series of Multilinked Caterpillar Track-type Climbing Robots

Lee, G., Kim, H., Seo, K., Kim, J., Sitti, M., Seo, T.

Journal of Field Robotics, November 2014 (article)

Abstract
Climbing robots have been widely applied in many industries involving hard-to-access, dangerous, or hazardous environments to replace human workers. Climbing speed, payload capacity, the ability to overcome obstacles, and wall-to-wall transitioning are significant characteristics of climbing robots. Here, multilinked track wheel-type climbing robots are proposed to enhance these characteristics. The robots have been developed over five years in collaboration with three universities: Seoul National University, Carnegie Mellon University, and Yeungnam University. Four types of robots are presented for different applications with different surface attachment methods and mechanisms: MultiTank for indoor sites, the Flexible Caterpillar Robot (FCR) and Combot for heavy industrial sites, and MultiTrack for high-rise buildings. The method of surface attachment is different for each robot and application, and the characteristics of the joints between links are designed as active or passive according to the requirements of a given robot. Conceptual design, practical design, and control issues of these climbing robot types are reported; a proper choice of attachment method and joint type is essential for a successful multilinked track wheel-type climbing robot, depending on the surface material, robot size, and computational cost.


DOI [BibTex]



Omnidirectional 3D Reconstruction in Augmented Manhattan Worlds

Schoenbein, M., Geiger, A.

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages: 716-723, IEEE, Chicago, IL, USA, October 2014 (conference)

Abstract
This paper proposes a method for high-quality omnidirectional 3D reconstruction of augmented Manhattan worlds from catadioptric stereo video sequences. In contrast to existing works we do not rely on constructing virtual perspective views, but instead propose to optimize depth jointly in a unified omnidirectional space. Furthermore, we show that plane-based prior models can be applied even though planes in 3D do not project to planes in the omnidirectional domain. Towards this goal, we propose an omnidirectional slanted-plane Markov random field model which relies on plane hypotheses extracted using a novel voting scheme for 3D planes in omnidirectional space. To quantitatively evaluate our method we introduce a dataset which we have captured using our autonomous driving platform AnnieWAY which we equipped with two horizontally aligned catadioptric cameras and a Velodyne HDL-64E laser scanner for precise ground truth depth measurements. As evidenced by our experiments, the proposed method clearly benefits from the unified view and significantly outperforms existing stereo matching techniques both quantitatively and qualitatively. Furthermore, our method is able to reduce noise and the obtained depth maps can be represented very compactly by a small number of image segments and plane parameters.


pdf DOI [BibTex]
