
Image and Video Processing Academic Digest [1.10]

MrGreen arXiv Daily Academic Digest, 2022-05-05



eess.IV Image and Video Processing: 19 papers in total


【1】 Deep Ultrasound Denoising Without Clean Data
链接:https://arxiv.org/abs/2201.02604

Authors: Sobhan Goudarzi, Hassan Rivaz
Abstract: On one hand, the transmitted ultrasound beam gets attenuated as it propagates through the tissue. On the other hand, the received Radio-Frequency (RF) data contains additive Gaussian noise introduced by the acquisition card and the sensor. These two factors lead to a decreasing Signal to Noise Ratio (SNR) in the RF data with depth, effectively rendering deep regions of B-Mode images highly unreliable. There are three common approaches to mitigate this problem. First, increasing the power of the transmitted beam, which is limited by safety thresholds. Averaging consecutive frames is the second option, which not only reduces the framerate but is also not applicable to moving targets. And third, reducing the transmission frequency, which deteriorates spatial resolution. Many deep denoising techniques have been developed, but they often require clean data for training the model, which is usually only available in simulated images. Herein, a deep noise reduction approach is proposed which does not need a clean training target. The model is trained on noisy input-output pairs, and the training process interestingly converges to the clean image, which is the average of the noisy pairs. Experimental results on real phantom as well as ex vivo data confirm the efficacy of the proposed method for noise cancellation.
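
A minimal sketch of the training idea described above: the denoiser is fit only on pairs of noisy realizations of the same scene (in the spirit of Noise2Noise), so that an L2-trained output converges toward their mean, i.e., an estimate of the clean image. The toy network, random stand-in data, and hyperparameters are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Toy denoiser: maps a noisy patch to a denoised estimate.
class SmallDenoiser(nn.Module):
    def __init__(self, ch=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

model = SmallDenoiser()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(100):
    # Stand-in data: two independent noisy realizations of the same (unknown)
    # scene. The clean tensor never enters the loss; with an L2 loss the
    # network output converges toward the expectation of the noisy targets.
    clean = torch.rand(8, 1, 64, 64)
    noisy_a = clean + 0.1 * torch.randn_like(clean)
    noisy_b = clean + 0.1 * torch.randn_like(clean)
    loss = loss_fn(model(noisy_a), noisy_b)
    opt.zero_grad()
    loss.backward()
    opt.step()
```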

【2】 An Incremental Learning Approach to Automatically Recognize Pulmonary Diseases from the Multi-vendor Chest Radiographs
链接:https://arxiv.org/abs/2201.02574

Authors: Mehreen Sirshar, Taimur Hassan, Muhammad Usman Akram, Shoab Ahmed Khan
Note: None
Abstract: Pulmonary diseases can cause severe respiratory problems, leading to sudden death if not treated timely. Many researchers have utilized deep learning systems to diagnose pulmonary disorders using chest X-rays (CXRs). However, such systems require exhaustive training efforts on large-scale data to effectively diagnose chest abnormalities. Furthermore, procuring such large-scale data is often infeasible and impractical, especially for rare diseases. With the recent advances in incremental learning, researchers have periodically tuned deep neural networks to learn different classification tasks with few training examples. Although such systems can resist catastrophic forgetting, they treat the knowledge representations independently of each other, and this limits their classification performance. Also, to the best of our knowledge, there is no incremental learning-driven image diagnostic framework that is specifically designed to screen pulmonary disorders from CXRs. To address this, we present a novel framework that can learn to screen different chest abnormalities incrementally. In addition, the proposed framework is penalized through an incremental learning loss function that applies Bayesian theory to recognize structural and semantic inter-dependencies between incrementally learned knowledge representations, so as to diagnose pulmonary diseases effectively regardless of the scanner specifications. We tested the proposed framework on five public CXR datasets containing different chest abnormalities, where it outperformed various state-of-the-art systems across multiple metrics.

【3】 Deep Domain Adversarial Adaptation for Photon-efficient Imaging Based on Spatiotemporal Inception Network
链接:https://arxiv.org/abs/2201.02475

Authors: Yiwei Chen, Gongxin Yao, Yong Liu, Yu Pan
Abstract: In single-photon LiDAR, photon-efficient imaging captures the 3D structure of a scene from only several detected signal photons per pixel. The existing deep learning models for this task are trained on simulated datasets, which poses the domain shift challenge when they are applied to realistic scenarios. In this paper, we propose a spatiotemporal inception network (STIN) for photon-efficient imaging, which is able to precisely predict the depth from a sparse and high-noise photon counting histogram by fully exploiting spatial and temporal information. Then the domain adversarial adaptation frameworks, including domain-adversarial neural network and adversarial discriminative domain adaptation, are applied to STIN to alleviate the domain shift problem for realistic applications. Comprehensive experiments on the simulated data generated from the NYU v2 and the Middlebury datasets demonstrate that STIN outperforms the state-of-the-art models at low signal-to-background ratios from 2:10 to 2:100. Moreover, experimental results on the real-world dataset captured by the single-photon imaging prototype show that STIN with domain adversarial training achieves better generalization performance compared with the state-of-the-art methods as well as the baseline STIN trained on simulated data.
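
The domain-adversarial part mentioned above (a DANN-style setup) is commonly implemented with a gradient reversal layer between the feature extractor and a domain classifier. A minimal sketch under that assumption, using a random stand-in for STIN features rather than the authors' network:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lambda in the backward pass."""

    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

def grad_reverse(x, lam=1.0):
    return GradReverse.apply(x, lam)

# Domain classifier on pooled features: trained to tell simulated from real
# samples, while the reversed gradient pushes the feature extractor to make
# the two domains indistinguishable.
domain_head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 2))

features = torch.randn(16, 128, requires_grad=True)  # stand-in for STIN features
domain_labels = torch.randint(0, 2, (16,))           # 0 = simulated, 1 = real
logits = domain_head(grad_reverse(features, lam=0.5))
adv_loss = nn.CrossEntropyLoss()(logits, domain_labels)
adv_loss.backward()
```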

【4】 Negative Evidence Matters in Interpretable Histology Image Classification
链接:https://arxiv.org/abs/2201.02445

Authors: Soufiane Belharbi, Marco Pedersoli, Ismail Ben Ayed, Luke McCaffrey, Eric Granger
Note: 10 figures, under review
Abstract: Using only global annotations such as image class labels, weakly-supervised learning methods allow CNN classifiers to jointly classify an image and yield the regions of interest associated with the predicted class. However, without any guidance at the pixel level, such methods may yield inaccurate regions. This problem is known to be more challenging with histology images than with natural ones, since objects are less salient, structures have more variations, and foreground and background regions have stronger similarities. Therefore, methods in the computer vision literature for visual interpretation of CNNs may not directly apply. In this work, we propose a simple yet efficient method based on a composite loss function that leverages information from the fully negative samples. Our new loss function contains two complementary terms: the first exploits positive evidence collected from the CNN classifier, while the second leverages the fully negative samples from the training dataset. In particular, we equip a pre-trained classifier with a decoder that allows refining the regions of interest. The same classifier is exploited to collect both the positive and negative evidence at the pixel level to train the decoder. This makes it possible to take advantage of the fully negative samples that occur naturally in the data, without any additional supervision signals and using only the image class as supervision. Compared to several recent related methods, on the public GlaS benchmark for colon cancer and a Camelyon16 patch-based benchmark for breast cancer, using three different backbones, we show the substantial improvements introduced by our method. Our results show the benefits of using both negative and positive evidence, i.e., the evidence obtained from a classifier and the evidence naturally available in datasets. We provide an ablation study of both terms. Our code is publicly available.
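
The composite loss is only described at a high level above; the sketch below shows one plausible reading under explicit assumptions: pseudo-labels derived from the classifier's positive evidence supervise the decoder on positive images, while fully negative images push every pixel toward the background class. All tensors and thresholds are illustrative, not the authors' formulation.

```python
import torch
import torch.nn.functional as F

def composite_loss(decoder_logits, pos_mask, is_negative):
    """
    decoder_logits: (B, 2, H, W) pixel-wise logits (0 = background, 1 = region of interest).
    pos_mask:       (B, H, W) {0,1} pseudo-labels derived from the classifier's positive
                    evidence (e.g., thresholded activation maps) -- an assumption here.
    is_negative:    (B,) bool, True for fully negative samples (class absent from the image).
    """
    zero = decoder_logits.sum() * 0.0  # keeps the graph when a term has no samples
    # Term 1: positive evidence supervises the decoder on non-negative images.
    pos = ~is_negative
    loss_pos = F.cross_entropy(decoder_logits[pos], pos_mask[pos].long()) if pos.any() else zero
    # Term 2: on fully negative samples, every pixel is pushed toward background.
    neg = is_negative
    if neg.any():
        bg = torch.zeros_like(pos_mask[neg], dtype=torch.long)
        loss_neg = F.cross_entropy(decoder_logits[neg], bg)
    else:
        loss_neg = zero
    return loss_pos + loss_neg

# toy usage
logits = torch.randn(4, 2, 32, 32, requires_grad=True)
pseudo = (torch.rand(4, 32, 32) > 0.7).float()
negatives = torch.tensor([False, False, True, True])
composite_loss(logits, pseudo, negatives).backward()
```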

【5】 Effect of Prior-based Losses on Segmentation Performance: A Benchmark
链接:https://arxiv.org/abs/2201.02428

Authors: Rosana El Jurdi, Caroline Petitjean, Veronika Cheplygina, Paul Honeine, Fahed Abdallah
Note: To be submitted to SPIE: Journal of Medical Imaging
Abstract: Today, deep convolutional neural networks (CNNs) have demonstrated state-of-the-art performance for medical image segmentation, on various imaging modalities and tasks. Despite early success, segmentation networks may still generate anatomically aberrant segmentations, with holes or inaccuracies near the object boundaries. To enforce anatomical plausibility, recent research studies have focused on incorporating prior knowledge, such as object shape or boundary, as constraints in the loss function. The integrated prior can be low-level, referring to reformulated representations extracted from the ground-truth segmentations, or high-level, representing external medical information such as the organ's shape or size. Over the past few years, prior-based losses have attracted rising interest in the research field, since they allow integration of expert knowledge while still being architecture-agnostic. However, given the diversity of prior-based losses across different medical imaging challenges and tasks, it has become hard to identify which loss works best for which dataset. In this paper, we establish a benchmark of recent prior-based losses for medical image segmentation. The main objective is to provide intuition on which losses to choose given a particular task or dataset. To this end, four low-level and high-level prior-based losses are selected. The considered losses are validated on 8 different datasets from a variety of medical image segmentation challenges, including the Decathlon, the ISLES and the WMH challenges. Results show that, whereas low-level prior-based losses can guarantee an increase in performance over the Dice loss baseline regardless of the dataset characteristics, high-level prior-based losses can increase anatomical plausibility depending on the data characteristics.
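
As a concrete illustration of what a prior-based loss looks like, the sketch below combines a soft Dice term with a simple high-level size prior that penalizes predicted foreground area outside an expected range. It is a generic example, not one of the four losses benchmarked in the paper, and the weights and size bounds are arbitrary.

```python
import torch

def soft_dice_loss(probs, target, eps=1e-6):
    # probs, target: (B, H, W) foreground probabilities / binary masks.
    inter = (probs * target).sum(dim=(1, 2))
    union = probs.sum(dim=(1, 2)) + target.sum(dim=(1, 2))
    return 1.0 - ((2 * inter + eps) / (union + eps)).mean()

def size_prior_loss(probs, min_size, max_size):
    # Penalize a predicted foreground area outside [min_size, max_size] pixels.
    size = probs.sum(dim=(1, 2))
    return (torch.relu(min_size - size) ** 2 + torch.relu(size - max_size) ** 2).mean()

def total_loss(probs, target, min_size=50.0, max_size=5000.0, weight=1e-4):
    return soft_dice_loss(probs, target) + weight * size_prior_loss(probs, min_size, max_size)

# toy usage
probs = torch.rand(2, 64, 64, requires_grad=True)
target = (torch.rand(2, 64, 64) > 0.5).float()
total_loss(probs, target).backward()
```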

【6】 Auto-Weighted Layer Representation Based View Synthesis Distortion Estimation for 3-D Video Coding
链接:https://arxiv.org/abs/2201.02420

Authors: Jian Jin, Xingxing Zhang, Lili Meng, Weisi Lin, Jie Liang, Huaxiang Zhang, Yao Zhao
Abstract: Recently, various view synthesis distortion estimation models have been studied to better serve 3-D video coding. However, they can hardly model the relationship quantitatively among different levels of depth changes, texture degeneration, and the view synthesis distortion (VSD), which is crucial for rate-distortion optimization and rate allocation. In this paper, an auto-weighted layer representation based view synthesis distortion estimation model is developed. Firstly, the sub-VSD (S-VSD) is defined according to the level of depth changes and their associated texture degeneration. After that, a set of theoretical derivations demonstrate that the VSD can be approximately decomposed into the S-VSDs multiplied by their associated weights. To obtain the S-VSDs, a layer-based representation of the S-VSD is developed, where all the pixels with the same level of depth change are represented with a layer to enable efficient S-VSD calculation at the layer level. Meanwhile, a nonlinear mapping function is learnt to accurately represent the relationship between the VSD and the S-VSDs, automatically providing weights for the S-VSDs during the VSD estimation. To learn such a function, a dataset of VSD and its associated S-VSDs is built. Experimental results show that the VSD can be accurately estimated with the weights learnt by the nonlinear mapping function once its associated S-VSDs are available. The proposed method outperforms the relevant state-of-the-art methods in both accuracy and efficiency. The dataset and source code of the proposed method will be available at https://github.com/jianjin008/.
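
The decomposition stated above can be written compactly as follows; the notation is chosen here for illustration, and the exact inputs of the learned mapping f_theta are an assumption:

```latex
D_{\mathrm{VSD}} \;\approx\; \sum_{l=1}^{L} w_l \, D^{(l)}_{\mathrm{S\text{-}VSD}},
\qquad (w_1,\dots,w_L) = f_{\theta}\!\left(D^{(1)}_{\mathrm{S\text{-}VSD}},\dots,D^{(L)}_{\mathrm{S\text{-}VSD}}\right)
```

where l indexes the layers grouping pixels with the same level of depth change and f_theta is the learned nonlinear mapping that supplies the weights.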

【7】 Amplitude SAR Imagery Splicing Localization
链接:https://arxiv.org/abs/2201.02409

Authors: Edoardo Daniele Cannas, Nicolò Bonettini, Sara Mandelli, Paolo Bestagini, Stefano Tubaro
Abstract: Synthetic Aperture Radar (SAR) images are a valuable asset for a wide variety of tasks. In the last few years, many websites have been offering them for free in the form of easy-to-manage products, favoring their widespread diffusion and research work in the SAR field. The drawback of these opportunities is that such images might be exposed to forgeries and manipulations by malicious users, raising new concerns about their integrity and trustworthiness. Up to now, the multimedia forensics literature has proposed various techniques to localize manipulations in natural photographs, but the integrity assessment of SAR images has never been investigated. This task poses new challenges, since SAR images are generated with a processing chain completely different from that of natural photographs. This implies that many forensics methods developed for natural images are not guaranteed to succeed. In this paper, we investigate the problem of amplitude SAR imagery splicing localization. Our goal is to localize regions of an amplitude SAR image that have been copied and pasted from another image, possibly undergoing some kind of editing in the process. To do so, we leverage a Convolutional Neural Network (CNN) to extract a fingerprint highlighting inconsistencies in the processing traces of the analyzed input. Then, we examine this fingerprint to produce a binary tampering mask indicating the pixel region under splicing attack. Results show that our proposed method, tailored to the nature of SAR signals, provides better performance than state-of-the-art forensic tools developed for natural images.
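
The last step described above (turning the CNN fingerprint into a binary tampering mask) can be approximated with simple smoothing plus a global threshold; the sketch below uses a median filter and Otsu's threshold as stand-ins for whatever post-processing the authors actually apply.

```python
import numpy as np
from scipy.ndimage import median_filter
from skimage.filters import threshold_otsu

def fingerprint_to_mask(fingerprint, smooth=5):
    """fingerprint: 2-D float array where larger values indicate likely splicing."""
    smoothed = median_filter(fingerprint, size=smooth)  # suppress isolated outliers
    t = threshold_otsu(smoothed)
    return smoothed > t  # boolean tampering mask

# toy usage: a synthetic fingerprint with a bright spliced square
fp = np.random.rand(128, 128) * 0.2
fp[40:80, 40:80] += 0.8
mask = fingerprint_to_mask(fp)
print(mask.sum(), "pixels flagged as spliced")
```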

【8】 Cross-Modality Deep Feature Learning for Brain Tumor Segmentation
链接:https://arxiv.org/abs/2201.02356

Authors: Dingwen Zhang, Guohai Huang, Qiang Zhang, Jungong Han, Junwei Han, Yizhou Yu
Note: published in Pattern Recognition 2021
Abstract: Recent advances in machine learning and the prevalence of digital medical images have opened up an opportunity to address the challenging brain tumor segmentation (BTS) task by using deep convolutional neural networks. However, different from the RGB image data that are very widespread, the medical image data used in brain tumor segmentation are relatively scarce in terms of data scale but contain richer information in terms of modality. To this end, this paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from multi-modality MRI data. The core idea is to mine rich patterns across the multi-modality data to make up for the insufficient data scale. The proposed cross-modality deep feature learning framework consists of two learning processes: the cross-modality feature transition (CMFT) process and the cross-modality feature fusion (CMFF) process, which aim at learning rich feature representations by transferring knowledge across different modality data and fusing knowledge from different modality data, respectively. Comprehensive experiments are conducted on the BraTS benchmarks, which show that the proposed cross-modality deep feature learning framework can effectively improve the brain tumor segmentation performance when compared with the baseline methods and state-of-the-art methods.

【9】 Multiresolution Fully Convolutional Networks to detect Clouds and Snow through Optical Satellite Images
链接:https://arxiv.org/abs/2201.02350

Authors: Debvrat Varshney, Claudio Persello, Prasun Kumar Gupta, Bhaskar Ramachandra Nikam
Abstract: Clouds and snow have similar spectral features in the visible and near-infrared (VNIR) range and are thus difficult to distinguish from each other in high-resolution VNIR images. We address this issue by introducing a shortwave-infrared (SWIR) band, where clouds are highly reflective and snow is absorptive. As SWIR is typically of a lower resolution compared to VNIR, this study proposes a multiresolution fully convolutional neural network (FCN) that can effectively detect clouds and snow in VNIR images. We fuse the multiresolution bands within a deep FCN and perform semantic segmentation at the higher, VNIR resolution. Such a fusion-based classifier, trained in an end-to-end manner, achieved 94.31% overall accuracy and an F1 score of 97.67% for clouds on Resourcesat-2 data captured over the state of Uttarakhand, India. These scores were found to be 30% higher than those of a Random Forest classifier, and 10% higher than those of a standalone single-resolution FCN. Apart from being useful for cloud detection purposes, the study also highlights the potential of convolutional neural networks for multi-sensor fusion problems.
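
A minimal sketch of the multiresolution fusion idea: the coarser SWIR band is upsampled to the VNIR grid inside the network and the two streams are concatenated before segmentation at the VNIR resolution. The channel counts, the 2x scale factor, and the layer choices are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FusionFCN(nn.Module):
    def __init__(self, n_classes=3, swir_scale=2):
        super().__init__()
        self.swir_scale = swir_scale
        self.vnir_conv = nn.Conv2d(3, 16, 3, padding=1)   # e.g., 3 VNIR bands
        self.swir_conv = nn.Conv2d(1, 16, 3, padding=1)   # 1 SWIR band
        self.head = nn.Sequential(
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, n_classes, 1),                  # e.g., cloud / snow / other
        )

    def forward(self, vnir, swir):
        # Upsample the coarse SWIR features to the VNIR resolution, then fuse.
        f_v = F.relu(self.vnir_conv(vnir))
        f_s = F.relu(self.swir_conv(swir))
        f_s = F.interpolate(f_s, scale_factor=self.swir_scale,
                            mode='bilinear', align_corners=False)
        return self.head(torch.cat([f_v, f_s], dim=1))

# toy usage: VNIR at twice the SWIR resolution
logits = FusionFCN()(torch.randn(1, 3, 128, 128), torch.randn(1, 1, 64, 64))
print(logits.shape)  # (1, 3, 128, 128)
```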

【10】 RestoreDet: Degradation Equivariant Representation for Object Detection in Low Resolution Images
链接:https://arxiv.org/abs/2201.02314

Authors: Ziteng Cui, Yingying Zhu, Lin Gu, Guo-Jun Qi, Xiaoxiao Li, Peng Gao, Zenghui Zhang, Tatsuya Harada
Note: 11 pages, 3 figures
Abstract: Image restoration algorithms such as super resolution (SR) are indispensable pre-processing modules for object detection in degraded images. However, most of these algorithms assume the degradation is fixed and known a priori. When the real degradation is unknown or differs from the assumption, both the pre-processing module and the consequent high-level task such as object detection will fail. Here, we propose a novel framework, RestoreDet, to detect objects in degraded low-resolution images. RestoreDet utilizes the downsampling degradation as a kind of transformation for self-supervised signals to explore the equivariant representation against various resolutions and other degradation conditions. Specifically, we learn this intrinsic visual structure by encoding and decoding the degradation transformation from a pair of original and randomly degraded images. The framework can further take advantage of advanced SR architectures with an arbitrary-resolution restoring decoder to reconstruct the original correspondence from the degraded input image. Both the representation learning and object detection are optimized jointly in an end-to-end training fashion. RestoreDet is a generic framework that can be implemented on any mainstream object detection architecture. Extensive experiments show that our framework based on CenterNet achieves superior performance compared with existing methods when facing variant degradation situations. Our code will be released soon.

【11】 A three-dimensional dual-domain deep network for high-pitch and sparse helical CT reconstruction
链接:https://arxiv.org/abs/2201.02309

Authors: Wei Wang, Xiang-Gen Xia, Chuanjiang He, Zemin Ren, Jian Lu
Note: 13 pages, 5 figures
Abstract: In this paper, we propose a new GPU implementation of the Katsevich algorithm for helical CT reconstruction. Our implementation divides the sinograms and reconstructs the CT images pitch by pitch. By utilizing the periodic properties of the parameters of the Katsevich algorithm, our method only needs to calculate these parameters once for all pitches, and so has a lower GPU-memory burden and is very suitable for deep learning. By embedding our implementation into the network, we propose an end-to-end deep network for high-pitch helical CT reconstruction with sparse detectors. Since our network utilizes the features extracted from both sinograms and CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of sinograms and preserve fine details in the CT images. Experiments show that our network outperforms the related methods in both subjective and objective evaluations.

【12】 Persistent Homology for Breast Tumor Classification using Mammogram Scans
链接:https://arxiv.org/abs/2201.02295

Authors: Aras Asaad, Dashti Ali, Taban Majeed, Rasber Rashid
Note: 10 pages
Abstract: An important tool in the field of topological data analysis is persistent homology (PH), which is used to encode an abstract representation of the homology of data at different resolutions in the form of a persistence diagram (PD). In this work we build more than one PD representation of a single image based on a landmark selection method, known as local binary patterns, that encodes different types of local texture from images. We employed different PD vectorizations using persistence landscapes, persistence images, persistence binning (Betti curves) and statistics. We tested the effectiveness of the proposed landmark-based PH on two publicly available breast abnormality detection datasets using mammogram scans. The sensitivity of the landmark-based PH is over 90% on both datasets for the detection of abnormal breast scans. Finally, the experimental results give new insights on using different types of PD vectorizations, which help in utilising PH in conjunction with machine learning classifiers.
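
One of the vectorizations mentioned above, the Betti curve, simply counts how many persistence intervals are alive at each filtration value; a small NumPy sketch with a toy diagram (not real mammogram data):

```python
import numpy as np

def betti_curve(diagram, grid):
    """
    diagram: (N, 2) array of (birth, death) pairs from a persistence diagram.
    grid:    1-D array of filtration values at which to sample the curve.
    Returns the number of intervals alive at each grid value.
    """
    births, deaths = diagram[:, 0], diagram[:, 1]
    # A feature is alive at t if birth <= t < death.
    return ((births[None, :] <= grid[:, None]) &
            (grid[:, None] < deaths[None, :])).sum(axis=1)

# toy persistence diagram with three intervals
pd = np.array([[0.0, 0.6], [0.1, 0.3], [0.4, 0.9]])
grid = np.linspace(0.0, 1.0, 11)
print(betti_curve(pd, grid))  # fixed-length vector usable as a classifier feature
```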

【13】 A Keypoint Detection and Description Network Based on the Vessel Structure for Multi-Modal Retinal Image Registration
链接:https://arxiv.org/abs/2201.02242

Authors: Aline Sindel, Bettina Hohberger, Sebastian Fassihi Dehcordi, Christian Mardin, Robert Lämmer, Andreas Maier, Vincent Christlein
Note: 6 pages, 4 figures, 1 table, accepted to BVM 2022
Abstract: Ophthalmological imaging utilizes different imaging systems, such as color fundus, infrared, fluorescein angiography, optical coherence tomography (OCT) or OCT angiography. Multiple images with different modalities or acquisition times are often analyzed for the diagnosis of retinal diseases. Automatically aligning the vessel structures in the images by means of multi-modal registration can support ophthalmologists in their work. Our method uses a convolutional neural network to extract features of the vessel structure in multi-modal retinal images. We jointly train a keypoint detection and description network on small patches using a classification and a cross-modal descriptor loss function, and apply the network to the full image size in the test phase. Our method demonstrates the best registration performance on our own and a public multi-modal dataset in comparison to competing methods.
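
After keypoints and descriptors have been extracted from both modalities, a standard next step in this kind of pipeline (not specific to this paper) is mutual-nearest-neighbor matching of the descriptors before estimating the registration transform, e.g., with RANSAC; a small NumPy sketch:

```python
import numpy as np

def mutual_nn_matches(desc_a, desc_b):
    """desc_a: (Na, D), desc_b: (Nb, D) L2-normalized descriptors."""
    sim = desc_a @ desc_b.T                 # cosine similarity matrix
    nn_ab = sim.argmax(axis=1)              # best match in B for each A
    nn_ba = sim.argmax(axis=0)              # best match in A for each B
    idx_a = np.arange(desc_a.shape[0])
    keep = nn_ba[nn_ab] == idx_a            # keep only mutual matches
    return np.stack([idx_a[keep], nn_ab[keep]], axis=1)

# toy usage with random unit descriptors
rng = np.random.default_rng(0)
a = rng.normal(size=(100, 128)); a /= np.linalg.norm(a, axis=1, keepdims=True)
b = rng.normal(size=(120, 128)); b /= np.linalg.norm(b, axis=1, keepdims=True)
print(mutual_nn_matches(a, b).shape)  # (num_matches, 2) index pairs
```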

【14】 Second-Order Ultrasound Elastography with L1-norm Spatial Regularization
链接:https://arxiv.org/abs/2201.02226

Authors: Md Ashikuzzaman, Hassan Rivaz
Abstract: Time delay estimation (TDE) between two radio-frequency (RF) frames is one of the major steps of quasi-static ultrasound elastography, which detects tissue pathology by estimating its mechanical properties. Regularized optimization-based techniques, a prominent class of TDE algorithms, optimize a non-linear energy functional consisting of data constancy and spatial continuity constraints to obtain the displacement and strain maps between the time-series frames under consideration. The existing optimization-based TDE methods often consider the L2-norm of displacement derivatives to construct the regularizer. However, such a formulation over-penalizes displacement irregularity and poses two major issues for the estimated strain field. First, the boundaries between different tissues are blurred. Second, the visual contrast between the target and the background is suboptimal. To resolve these issues, we propose a novel TDE algorithm where, instead of the L2-norm, the L1-norms of both first- and second-order displacement derivatives are taken into account to devise the continuity functional. We handle the non-differentiability of the L1-norm by smoothing the absolute value function's sharp corner and optimize the resulting cost function in an iterative manner. We call our technique Second-Order Ultrasound eLastography with L1-norm spatial regularization (L1-SOUL). In terms of both sharpness and visual contrast, L1-SOUL substantially outperforms GLUE, OVERWIND, and SOUL, three recently published TDE algorithms, in all validation experiments performed in this study. On simulated, phantom, and in vivo datasets, L1-SOUL achieves 67.8%, 46.81%, and 117.35% improvements in contrast-to-noise ratio (CNR) over SOUL, respectively. The L1-SOUL code can be downloaded from http://code.sonography.ai.
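
The key trick mentioned above, handling the non-differentiable L1-norm by rounding the corner of the absolute value, is commonly realized as |x| ≈ sqrt(x² + ε). A minimal NumPy sketch of such a regularizer on first- and second-order displacement differences; the weights, ε, and the toy displacement field are illustrative, not the L1-SOUL implementation.

```python
import numpy as np

def smooth_abs(x, eps=1e-6):
    # Differentiable surrogate for |x|: rounds the sharp corner at 0.
    return np.sqrt(x * x + eps)

def l1_regularizer(disp, w1=1.0, w2=1.0, eps=1e-6):
    """disp: 2-D axial displacement field (one component, for illustration)."""
    d1 = np.diff(disp, n=1, axis=0)   # first-order finite difference
    d2 = np.diff(disp, n=2, axis=0)   # second-order finite difference
    return w1 * smooth_abs(d1, eps).sum() + w2 * smooth_abs(d2, eps).sum()

# toy usage: a smooth gradient plus a sharp boundary, which an L1 penalty
# preserves better than an L2 penalty
disp = np.linspace(0, 1, 64)[:, None] * np.ones((64, 64))
disp[32:, :] += 0.5
print(l1_regularizer(disp))
```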

【15】 3D Intracranial Aneurysm Classification and Segmentation via Unsupervised Dual-branch Learning
链接:https://arxiv.org/abs/2201.02198

Authors: Di Shao, Xuequan Lu, Xiao Liu
Note: submitted for review (contact: xuequan.lu@deakin.edu.au)
Abstract: Intracranial aneurysms are common nowadays, and how to detect them intelligently is of great significance in digital health. While most existing deep learning research has focused on medical images in a supervised way, we introduce an unsupervised method for the detection of intracranial aneurysms based on 3D point cloud data. In particular, our method consists of two stages: unsupervised pre-training and downstream tasks. As for the former, the main idea is to pair each point cloud with its jittered counterpart and maximise their correspondence. Then we design a dual-branch contrastive network with an encoder for each branch and a subsequent common projection head. As for the latter, we design simple networks for supervised classification and segmentation training. Experiments on the public dataset (IntrA) show that our unsupervised method achieves comparable or even better performance than some state-of-the-art supervised techniques, and it is most prominent in the detection of aneurysmal vessels. Experiments on ModelNet40 also show that our method achieves an accuracy of 90.79%, which outperforms existing state-of-the-art unsupervised models.
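
A minimal sketch of the pre-training idea described above: each point cloud is paired with a jittered copy, both views are embedded, and a contrastive (InfoNCE-style) objective maximizes their correspondence. The tiny encoder and the single shared branch below are stand-ins for the authors' dual-branch network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def jitter(points, sigma=0.01, clip=0.05):
    # points: (B, N, 3). Add small clipped Gaussian noise to each point.
    noise = torch.clamp(sigma * torch.randn_like(points), -clip, clip)
    return points + noise

class TinyEncoder(nn.Module):
    """Stand-in encoder: per-point MLP, max pooling, and a projection head."""
    def __init__(self, dim=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, dim))
        self.proj = nn.Linear(dim, dim)

    def forward(self, pts):
        feat = self.mlp(pts).max(dim=1).values   # (B, dim) global feature
        return F.normalize(self.proj(feat), dim=1)

def info_nce(z1, z2, temperature=0.1):
    logits = z1 @ z2.t() / temperature           # (B, B) cross-view similarities
    labels = torch.arange(z1.size(0))            # matching pairs lie on the diagonal
    return 0.5 * (F.cross_entropy(logits, labels) + F.cross_entropy(logits.t(), labels))

encoder = TinyEncoder()
pts = torch.randn(8, 1024, 3)                    # stand-in point clouds
loss = info_nce(encoder(pts), encoder(jitter(pts)))
loss.backward()
```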

【16】 Elephant-Human Conflict Mitigation: An Autonomous UAV Approach
链接:https://arxiv.org/abs/2201.02584

Authors: Weiyun Jiang, Yukai Yang, Yogananda Isukapalli
Note: None
Abstract: Elephant-human conflict (EHC) is one of the major problems in most African and Asian countries. As humans overutilize natural resources for their development, elephants' living area continues to decrease; this leads elephants to invade human living areas and raid crops more frequently, costing millions of dollars annually. To mitigate EHC, in this paper, we propose an original solution that comprises three parts: a compact custom low-power GPS tag that is installed on the elephants, a receiver stationed in the human living area that detects the elephants' presence near a farm, and an autonomous unmanned aerial vehicle (UAV) system that tracks and herds the elephants away from the farms. By utilizing a proportional-integral-derivative (PID) controller and machine learning algorithms, we obtain accurate tracking trajectories at a real-time processing speed of 32 FPS. Our proposed autonomous system can save over 68% of the cost compared with human-controlled UAVs in mitigating EHC.
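
A minimal sketch of the PID control loop mentioned above, driving the offset between the tracked elephant and the image center toward zero; the gains, the error signal, and the fake measurements are illustrative assumptions, not the authors' controller.

```python
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error):
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# toy usage: drive the horizontal offset between the detected elephant's
# bounding-box center and the image center toward zero.
pid = PID(kp=0.8, ki=0.05, kd=0.1, dt=1.0 / 32.0)  # 32 FPS, as in the abstract
offset_px = 120.0
for _ in range(10):
    yaw_cmd = pid.step(offset_px)   # command sent to the UAV's yaw controller
    offset_px *= 0.6                # stand-in for the tracker's next measurement
```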

【17】 Deep Generative Framework for Interactive 3D Terrain Authoring and Manipulation
链接:https://arxiv.org/abs/2201.02369

Authors: Shanthika Naik, Aryamaan Jain, Avinash Sharma, KS Rajan
Abstract: Automated generation and (user) authoring of realistic virtual terrain is highly sought after in multimedia applications such as VR models and gaming. The most common representation adopted for terrain is the Digital Elevation Model (DEM). Existing terrain authoring and modeling techniques have addressed some of these needs and can be broadly categorized as procedural modeling, simulation methods, and example-based methods. In this paper, we propose a novel realistic terrain authoring framework powered by a combination of a VAE and a conditional GAN model. Our framework is an example-based method that attempts to overcome the limitations of existing methods by learning a latent space from a real-world terrain dataset. This latent space allows us to generate multiple variants of terrain from a single input, as well as interpolate between terrains, while keeping the generated terrains close to the real-world data distribution. We also developed an interactive tool that lets the user generate diverse terrains with minimal inputs. We perform a thorough qualitative and quantitative analysis and provide comparisons with other SOTA methods. We intend to release our code/tool to the academic community.

【18】 Uncertainty-Aware Cascaded Dilation Filtering for High-Efficiency Deraining
链接:https://arxiv.org/abs/2201.02366

Authors: Qing Guo, Jingyang Sun, Felix Juefei-Xu, Lei Ma, Di Lin, Wei Feng, Song Wang
Note: 14 pages, 10 figures, 10 tables. This is the extension of our conference version (this https URL)
Abstract: Deraining is a significant and fundamental computer vision task, aiming to remove the rain streaks and accumulations in an image or video captured on a rainy day. Existing deraining methods usually make heuristic assumptions about the rain model, which compels them to employ complex optimization or iterative refinement for high recovery quality. This, however, leads to time-consuming methods and affects their effectiveness in addressing rain patterns that deviate from the assumptions. In this paper, we propose a simple yet efficient deraining method by formulating deraining as a predictive filtering problem without complex rain model assumptions. Specifically, we identify spatially-variant predictive filtering (SPFilt), which adaptively predicts proper kernels via a deep network to filter different individual pixels. Since the filtering can be implemented via well-accelerated convolution, our method is highly efficient. We further propose EfDeRain+, which contains three main contributions to address residual rain traces, multi-scale, and diverse rain patterns without harming the efficiency. First, we propose uncertainty-aware cascaded predictive filtering (UC-PFilt), which can identify the difficulty of reconstructing clean pixels via predicted kernels and remove the residual rain traces effectively. Second, we design weight-sharing multi-scale dilated filtering (WS-MS-DFilt) to handle multi-scale rain streaks without harming the efficiency. Third, to close the gap across diverse rain patterns, we propose a novel data augmentation method (i.e., RainMix) to train our deep models. By combining all contributions with a sophisticated analysis of different variants, our final method outperforms baseline methods on four single-image deraining datasets and one video deraining dataset in terms of both recovery quality and speed.
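
A minimal sketch of spatially-variant predictive filtering: per-pixel kernels are applied through im2col-style unfolding, which is what lets the filtering run as well-accelerated (convolution-like) operations. The kernel-prediction network itself is omitted and replaced by random weights, and the softmax normalization is an assumption.

```python
import torch
import torch.nn.functional as F

def apply_pixelwise_kernels(img, kernels, k=3):
    """
    img:     (B, C, H, W) rainy image.
    kernels: (B, k*k, H, W) per-pixel filter weights predicted by a network
             (random stand-ins here), shared across color channels.
    """
    B, C, H, W = img.shape
    patches = F.unfold(img, kernel_size=k, padding=k // 2)        # (B, C*k*k, H*W)
    patches = patches.view(B, C, k * k, H * W)
    weights = F.softmax(kernels, dim=1).view(B, 1, k * k, H * W)  # normalize each kernel
    out = (patches * weights).sum(dim=2)                          # weighted sum per pixel
    return out.view(B, C, H, W)

# toy usage
rainy = torch.rand(1, 3, 64, 64)
pred_kernels = torch.randn(1, 9, 64, 64)   # would come from the filtering network
derained = apply_pixelwise_kernels(rainy, pred_kernels, k=3)
print(derained.shape)  # (1, 3, 64, 64)
```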

【19】 Predicting Trust Using Automated Assessment of Multivariate Interactional Synchrony
链接:https://arxiv.org/abs/2201.02223

Authors: Adrien Meynard, Gayan Seneviratna, Elliot Doyle, Joyanne Becker, Hau-Tieng Wu, Jana Schaich Borg
Abstract: Diverse disciplines are interested in how the coordination of interacting agents' movements, emotions, and physiology over time impacts social behavior. Here, we describe a new multivariate procedure for automating the investigation of this kind of behaviorally-relevant "interactional synchrony", and introduce a novel interactional synchrony measure based on features of dynamic time warping (DTW) paths. We demonstrate that our DTW path-based measure of interactional synchrony between the facial action units of two people interacting freely in a natural social interaction can be used to predict how much trust they will display in a subsequent Trust Game. We also show that our approach outperforms univariate head movement models, models that consider participants' facial action units independently, and models that use previously proposed synchrony or similarity measures. The insights of this work can be applied to any research question that aims to quantify the temporal coordination of multiple signals over time, and have immediate applications in psychology, medicine, and robotics.
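
The DTW paths that the synchrony features are built from come out of the classic dynamic-programming recurrence; a small NumPy sketch for two 1-D signals (e.g., a pair of facial action unit time series) that returns the alignment cost and the warping path:

```python
import numpy as np

def dtw(x, y):
    """Dynamic time warping between 1-D sequences x and y."""
    n, m = len(x), len(y)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack the optimal warping path; its shape (e.g., deviation from the
    # diagonal) is the kind of feature a synchrony measure can be built on.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return D[n, m], path[::-1]

cost, path = dtw(np.sin(np.linspace(0, 6, 50)), np.sin(np.linspace(0.5, 6.5, 60)))
print(cost, len(path))
```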

