
Computer Science Digest: Academic Express [1.10]

MrGreen, arXiv Daily Academic Express, 2022-05-05



cs (Computer Science) digest, 146 papers in total


【1】 Embodied Hands: Modeling and Capturing Hands and Bodies Together

Link: https://arxiv.org/abs/2201.02610
Authors: Javier Romero, Dimitrios Tzionas, Michael J. Black
Abstract: Humans move their hands and bodies together to communicate and solve tasks. Capturing and replicating such coordinated activity is critical for virtual characters that behave realistically. Surprisingly, most methods treat the 3D modeling and tracking of bodies and hands separately. Here we formulate a model of hands and bodies interacting together and fit it to full-body 4D sequences. When scanning or capturing the full body in 3D, hands are small and often partially occluded, making their shape and pose hard to recover. To cope with low resolution, occlusion, and noise, we develop a new model called MANO (hand Model with Articulated and Non-rigid defOrmations). MANO is learned from around 1000 high-resolution 3D scans of hands of 31 subjects in a wide variety of hand poses. The model is realistic, low-dimensional, captures non-rigid shape changes with pose, is compatible with standard graphics packages, and can fit any human hand. MANO provides a compact mapping from hand poses to pose blend shape corrections and a linear manifold of pose synergies. We attach MANO to a standard parameterized 3D body shape model (SMPL), resulting in a fully articulated body and hand model (SMPL+H). We illustrate SMPL+H by fitting complex, natural activities of subjects captured with a 4D scanner. The fitting is fully automatic and results in full body models that move naturally with detailed hand motions and a realism not seen before in full body performance capture. The models and data are freely available for research purposes on our website (http://mano.is.tue.mpg.de).

【2】 Generalized Category Discovery

Link: https://arxiv.org/abs/2201.02609
Authors: Sagar Vaze, Kai Han, Andrea Vedaldi, Andrew Zisserman
Comments: 13 pages, 6 figures
Abstract: In this paper, we consider a highly general image recognition setting wherein, given a labelled and unlabelled set of images, the task is to categorize all images in the unlabelled set. Here, the unlabelled images may come from labelled classes or from novel ones. Existing recognition methods are not able to deal with this setting, because they make several restrictive assumptions, such as the unlabelled instances only coming from known - or unknown - classes and the number of unknown classes being known a priori. We address the more unconstrained setting, naming it 'Generalized Category Discovery', and challenge all these assumptions. We first establish strong baselines by taking state-of-the-art algorithms from novel category discovery and adapting them for this task. Next, we propose the use of vision transformers with contrastive representation learning for this open world setting. We then introduce a simple yet effective semi-supervised $k$-means method to cluster the unlabelled data into seen and unseen classes automatically, substantially outperforming the baselines. Finally, we also propose a new approach to estimate the number of classes in the unlabelled data. We thoroughly evaluate our approach on public datasets for generic object classification including CIFAR10, CIFAR100 and ImageNet-100, and for fine-grained visual recognition including CUB, Stanford Cars and Herbarium19, benchmarking on this new setting to foster future research.
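The semi-supervised $k$-means idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function name, the farthest-point initialisation of the extra centroids, and the toy data are assumptions made for illustration. The only idea taken from the abstract is that labelled points constrain the clustering, while unlabelled points are free to join clusters of seen or unseen classes.

```python
import math

def semi_supervised_kmeans(labelled, unlabelled, k, iters=20):
    """Cluster unlabelled points into k groups while labelled points stay pinned.

    labelled:   list of (point, class_id), class ids are 0..n_known-1
    unlabelled: list of points (tuples of floats)
    k:          total number of clusters, k >= n_known
    """
    n_known = len({cls for _, cls in labelled})
    dim = len(unlabelled[0])

    def mean(points):
        return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))

    # Known-class centroids start at the mean of their labelled points.
    centroids = [mean([p for p, cls in labelled if cls == c]) for c in range(n_known)]
    # Extra centroids (candidate novel classes) start at the unlabelled point
    # farthest from all current centroids (an assumed heuristic).
    while len(centroids) < k:
        far = max(unlabelled, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(tuple(far))

    assign = []
    for _ in range(iters):
        # Unlabelled points go to the nearest centroid; labelled points never move.
        assign = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in unlabelled]
        updated = []
        for j in range(k):
            members = [p for p, a in zip(unlabelled, assign) if a == j]
            members += [p for p, cls in labelled if cls == j]
            updated.append(mean(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign

# Two labelled classes plus one unseen group near (10, 0).
labelled = [((0.0, 0.0), 0), ((0.2, 0.0), 0), ((5.0, 5.0), 1), ((5.2, 5.0), 1)]
unlabelled = [(0.1, 0.1), (5.1, 5.1), (10.0, 0.0), (10.2, 0.1), (0.0, 0.2)]
centroids, assign = semi_supervised_kmeans(labelled, unlabelled, k=3)
```

In this toy run the two points near (10, 0) end up in the third cluster, which has no labelled members and so plays the role of a novel class.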

【3】 Detecting Twenty-thousand Classes using Image-level Supervision

Link: https://arxiv.org/abs/2201.02605
Authors: Xingyi Zhou, Rohit Girdhar, Armand Joulin, Phillip Krähenbühl, Ishan Misra
Comments: Code is available at https://github.com/facebookresearch/Detic
Abstract: Current object detectors are limited in vocabulary size due to the small scale of detection datasets. Image classifiers, on the other hand, reason about much larger vocabularies, as their datasets are larger and easier to collect. We propose Detic, which simply trains the classifiers of a detector on image classification data and thus expands the vocabulary of detectors to tens of thousands of concepts. Unlike prior work, Detic does not assign image labels to boxes based on model predictions, making it much easier to implement and compatible with a range of detection architectures and backbones. Our results show that Detic yields excellent detectors even for classes without box annotations. It outperforms prior work on both open-vocabulary and long-tail detection benchmarks. Detic provides a gain of 2.4 mAP for all classes and 8.3 mAP for novel classes on the open-vocabulary LVIS benchmark. On the standard LVIS benchmark, Detic reaches 41.7 mAP for all classes and 41.7 mAP for rare classes. For the first time, we train a detector with all the twenty-one-thousand classes of the ImageNet dataset and show that it generalizes to new datasets without fine-tuning. Code is available at https://github.com/facebookresearch/Detic.

【4】 Wavenumber-explicit hp-FEM analysis for Maxwell's equations with impedance boundary conditions

Link: https://arxiv.org/abs/2201.02602
Authors: Jens M. Melenk, Stefan A. Sauter
Comments: 80 pages, 6 figures
Abstract: The time-harmonic Maxwell equations at high wavenumber k in domains with an analytic boundary and impedance boundary conditions are considered. A wavenumber-explicit stability and regularity theory is developed that decomposes the solution into a part with finite Sobolev regularity that is controlled uniformly in k and an analytic part. Using this regularity, quasi-optimality of the Galerkin discretization based on Nédélec elements of order p on a mesh with mesh size h is shown under the k-explicit scale resolution condition that a) kh/p is sufficiently small and b) p/\ln k is bounded from below.

【5】 Apples and Cars: a Comparison of Security

Link: https://arxiv.org/abs/2201.02601
Authors: Zhendong Ma
Comments: Extended Abstract, 5th ACM COMPUTER SCIENCE IN CARS SYMPOSIUM (CSCS 2021)
Abstract: Cybersecurity has gained importance for cars that increasingly rely on software and networks. "Smartphone on wheels" is often used as an analogy to highlight the need for security. As a high-value target of cyberattacks, modern smartphones implement layers of protection. Automotive embedded systems share many similarities with smartphones. We compare the security architecture of an iPhone and a car to identify gaps and discuss the potential for the cars of the future.

【6】 Equalized Focal Loss for Dense Long-Tailed Object Detection

Link: https://arxiv.org/abs/2201.02593
Authors: Bo Li, Yongqiang Yao, Jingru Tan, Gang Zhang, Fengwei Yu, Jianwei Lu, Ye Luo
Abstract: Despite the recent success of long-tailed object detection, almost all long-tailed object detectors are developed based on the two-stage paradigm. In practice, one-stage detectors are more prevalent in the industry because they have a simple and fast pipeline that is easy to deploy. However, in the long-tailed scenario, this line of work has not been explored so far. In this paper, we investigate whether one-stage detectors can perform well in this case. We discover that the primary obstacle preventing one-stage detectors from achieving excellent performance is that categories suffer from different degrees of positive-negative imbalance under the long-tailed data distribution. The conventional focal loss balances the training process with the same modulating factor for all categories, thus failing to handle the long-tailed problem. To address this issue, we propose the Equalized Focal Loss (EFL) that rebalances the loss contribution of positive and negative samples of different categories independently according to their imbalance degrees. Specifically, EFL adopts a category-relevant modulating factor which can be adjusted dynamically by the training status of different categories. Extensive experiments conducted on the challenging LVIS v1 benchmark demonstrate the effectiveness of our proposed method. With an end-to-end training pipeline, EFL achieves 29.2% in terms of overall AP and obtains significant performance improvements on rare categories, surpassing all existing state-of-the-art methods. The code is available at https://github.com/ModelTC/EOD.
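For reference, the conventional focal loss that EFL builds on uses one focusing parameter for every category. The sketch below shows that baseline and a variant in which each category has its own focusing parameter. The per-category `gammas` list is a hypothetical stand-in: the abstract does not give the form of EFL's category-relevant modulating factor or how it follows the training status, so this illustrates the idea only and is not the authors' loss.

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction p = P(y=1) with label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p              # probability of the true label
    alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def per_category_focal_loss(probs, targets, gammas, alpha=0.25):
    """Sum of focal losses where category j uses its own gamma (hypothetical values)."""
    return sum(focal_loss(p, y, g, alpha) for p, y, g in zip(probs, targets, gammas))

# One sample scored against three categories. The third category gets a larger
# gamma, so its easy negative (p = 0.1, y = 0) is down-weighted more strongly.
probs = [0.9, 0.2, 0.1]
targets = [1, 0, 0]
shared = per_category_focal_loss(probs, targets, gammas=[2.0, 2.0, 2.0])
per_cat = per_category_focal_loss(probs, targets, gammas=[2.0, 2.0, 4.0])
```

With `gamma=0` and `alpha=0.5` the function reduces to half the ordinary cross-entropy, which is a quick sanity check on the formula.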

【7】 Leveraging Scale-Invariance and Uncertainity with Self-Supervised Domain Adaptation for Semantic Segmentation of Foggy Scenes

Link: https://arxiv.org/abs/2201.02588
Authors: Javed Iqbal, Rehan Hafiz, Mohsen Ali
Comments: Under Review
Abstract: This paper presents FogAdapt, a novel approach for domain adaptation of semantic segmentation for dense foggy scenes. Although significant research has been directed to reduce the domain shift in semantic segmentation, adaptation to scenes with adverse weather conditions remains an open question. Large variations in the visibility of the scene due to weather conditions, such as fog, smog, and haze, exacerbate the domain shift, thus making unsupervised adaptation in such scenarios challenging. We propose a self-entropy and multi-scale information augmented self-supervised domain adaptation method (FogAdapt) to minimize the domain shift in foggy scene segmentation. Supported by the empirical evidence that an increase in fog density results in high self-entropy for segmentation probabilities, we introduce a self-entropy based loss function to guide the adaptation method. Furthermore, inferences obtained at different image scales are combined and weighted by the uncertainty to generate scale-invariant pseudo-labels for the target domain. These scale-invariant pseudo-labels are robust to visibility and scale variations. We evaluate the proposed model on real clear-weather scenes to real foggy scenes adaptation and synthetic non-foggy images to real foggy scenes adaptation scenarios. Our experiments demonstrate that FogAdapt significantly outperforms the current state-of-the-art in semantic segmentation of foggy images. Specifically, by considering the standard settings compared to state-of-the-art (SOTA) methods, FogAdapt gains 3.8% on Foggy Zurich, 6.0% on Foggy Driving-dense, and 3.6% on Foggy Driving in mIoU when adapted from Cityscapes to Foggy Zurich.
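The self-entropy the abstract refers to is, in its usual form, the Shannon entropy of each pixel's predicted class distribution. A minimal sketch, assuming the segmentation output is given as a list of per-pixel probability vectors; the function name and the averaging over pixels are illustrative choices and are not taken from the paper.

```python
import math

def self_entropy(prob_map):
    """Mean Shannon entropy (in nats) over a list of per-pixel class distributions."""
    total = 0.0
    for probs in prob_map:
        # Terms with p == 0 contribute nothing and are skipped to avoid log(0).
        total += -sum(p * math.log(p) for p in probs if p > 0.0)
    return total / len(prob_map)

# A confident prediction (as on a clear image) versus a diffuse one (as under fog).
confident = [[0.97, 0.02, 0.01]]
diffuse = [[0.40, 0.35, 0.25]]
h_confident = self_entropy(confident)
h_diffuse = self_entropy(diffuse)
```

Minimising a quantity of this kind on target-domain images pushes predictions toward confident class assignments, which is the general role an entropy-based loss plays in self-supervised adaptation.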

【8】 Elephant-Human Conflict Mitigation: An Autonomous UAV Approach

Link: https://arxiv.org/abs/2201.02584
Authors: Weiyun Jiang, Yukai Yang, Yogananda Isukapalli
Abstract: Elephant-human conflict (EHC) is one of the major problems in most African and Asian countries. As humans overutilize natural resources for their development, elephants' living area continues to decrease; this leads elephants to invade the human living area and raid crops more frequently, costing millions of dollars annually. To mitigate EHC, in this paper, we propose an original solution that comprises three parts: a compact custom low-power GPS tag that is installed on the elephants, a receiver stationed in the human living area that detects the elephants' presence near a farm, and an autonomous unmanned aerial vehicle (UAV) system that tracks and herds the elephants away from the farms. By utilizing a proportional-integral-derivative controller and machine learning algorithms, we obtain accurate tracking trajectories at a real-time processing speed of 32 FPS. Our proposed autonomous system can save over 68% of the cost compared with human-controlled UAVs in mitigating EHC.

【9】 Multi-Model Federated Learning

Link: https://arxiv.org/abs/2201.02582
Authors: Neelkamal Bhuyan, Sharayu Moharir
Abstract: Federated learning is a form of distributed learning with the key challenge being the non-identically distributed nature of the data in the participating clients. In this paper, we extend federated learning to the setting where multiple unrelated models are trained simultaneously. Specifically, every client is able to train any one of M models at a time and the server maintains a model for each of the M models which is typically a suitably averaged version of the model computed by the clients. We propose multiple policies for assigning learning tasks to clients over time. In the first policy, we extend the widely studied FedAvg to multi-model learning by allotting models to clients in an i.i.d. stochastic manner. In addition, we propose two new policies for client selection in a multi-model federated setting which make decisions based on current local losses for each client-model pair. We compare the performance of the policies on tasks involving synthetic and real-world data and characterize the performance of the proposed policies. The key take-away from our work is that the proposed multi-model policies perform better than, or at least as well as, single model training using FedAvg.
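The first policy (FedAvg with models allotted to clients at random) can be sketched in a few lines. This is an illustrative toy, not the authors' code: the helper names `fedavg`, `run_round` and `local_train`, the uniform assignment distribution, and the sample-count weighting are assumptions, and the two loss-based policies are not shown because the abstract does not describe them in detail.

```python
import random

def fedavg(updates):
    """Sample-weighted average of client weight vectors: [(weights, n_samples), ...]."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(w[d] * n for w, n in updates) / total for d in range(dim)]

def run_round(server_models, local_train, client_sizes, rng):
    """One round: each client is independently assigned one of the M models."""
    assignment = [rng.randrange(len(server_models)) for _ in client_sizes]
    new_models = [list(w) for w in server_models]
    for m, weights in enumerate(server_models):
        updates = [(local_train(c, m, weights), client_sizes[c])
                   for c, a in enumerate(assignment) if a == m]
        if updates:  # a model with no assigned clients keeps its old weights
            new_models[m] = fedavg(updates)
    return new_models, assignment

def local_train(client, model, weights):
    """Toy local step: move each weight halfway toward a client-specific target."""
    target = float(client + model)
    return [w + 0.5 * (target - w) for w in weights]

rng = random.Random(0)
models, assignment = run_round([[0.0], [0.0]], local_train,
                               client_sizes=[10, 20, 30, 40], rng=rng)
```

Each client trains only the model it was assigned in that round, so the per-client cost matches single-model FedAvg while the server makes progress on all M models over time.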

【10】 Prognosis: Closed-Box Analysis of Network Protocol Implementations

Link: https://arxiv.org/abs/2201.02577
Authors: Tiago Ferreira, Harrison Brewton, Loris D'Antoni, Alexandra Silva
Abstract: We present Prognosis, a framework offering automated closed-box learning and analysis of models of network protocol implementations. Prognosis can learn models that vary in abstraction level from simple deterministic automata to models containing data operations, such as register updates, and can be used to unlock a variety of analysis techniques -- model checking temporal properties, computing differences between models of two implementations of the same protocol, or improving testing via model-based test generation. Prognosis is modular and easily adaptable to different protocols (e.g., TCP and QUIC) and their implementations. We use Prognosis to learn models of (parts of) three QUIC implementations -- Quiche (Cloudflare), Google QUIC, and Facebook mvfst -- and use these models to analyze the differences between the various implementations. Our analysis provides insights into different design choices and uncovers potential bugs. Concretely, we have found critical bugs in multiple QUIC implementations, which have been acknowledged by the developers.

【11】 Charging Techniques for UAV-assisted Data Collection: Is Laser Power Beaming the Answer?

Link: https://arxiv.org/abs/2201.02573
Authors: Mohamed-Amine Lahmeri, Mustafa A. Kishk, Mohamed-Slim Alouini
Comments: 6 pages, 5 figures
Abstract: As COVID-19 has increased the need for connectivity around the world, researchers are targeting new technologies that could improve coverage and connect the unconnected in order to make progress toward the United Nations Sustainable Development Goals. In this context, drones are seen as one of the key features of 6G wireless networks that could extend the coverage of previous wireless network generations. That said, limited on-board energy seems to be the main drawback that hinders the use of drones for wireless coverage. Therefore, different wireless and wired charging techniques, such as laser beaming, charging stations, and tether stations, are proposed. In this paper, we analyze and compare these different charging techniques by performing extensive simulations for the scenario of drone-assisted data collection from ground-based Internet of Things (IoT) devices. We analyze the strengths and weaknesses of each charging technique, and finally show that laser-powered drones strongly compete with, and in some scenarios outperform, other charging techniques.

【12】 Neural Network Optimization for Reinforcement Learning Tasks Using Sparse Computations

Link: https://arxiv.org/abs/2201.02571
Authors: Dmitry Ivanov, Mikhail Kiselev, Denis Larionov
Abstract: This article proposes a sparse computation-based method for optimizing neural networks for reinforcement learning (RL) tasks. This method combines two ideas: neural network pruning and taking into account input data correlations; it makes it possible to update neuron states only when changes in them exceed a certain threshold. It significantly reduces the number of multiplications when running neural networks. We tested different RL tasks and achieved a 20-150x reduction in the number of multiplications. There were no substantial performance losses; sometimes the performance even improved.
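The thresholded-update idea can be shown on a single linear layer. The sketch below is a generic "delta" update under stated assumptions, not the authors' method: the function name, the in-place state lists, and the choice to threshold input changes are illustrative. It counts multiplications so the saving is visible.

```python
def delta_layer_step(weights, last_sent, pre_act, x, threshold):
    """Update pre-activations incrementally, only for inputs that changed enough.

    weights[i][j] connects input j to neuron i. last_sent and pre_act are
    mutated in place. Returns the number of multiplications performed.
    """
    mults = 0
    for j, xj in enumerate(x):
        delta = xj - last_sent[j]
        if abs(delta) > threshold:      # small changes are ignored entirely
            for i in range(len(weights)):
                pre_act[i] += weights[i][j] * delta
                mults += 1
            last_sent[j] = xj           # remember the value that was propagated
    return mults

weights = [[1.0, 2.0], [3.0, 4.0]]
last_sent = [0.0, 0.0]
pre_act = [0.0, 0.0]

# First observation: both inputs change, so the full 2x2 product is computed.
m1 = delta_layer_step(weights, last_sent, pre_act, [1.0, 1.0], threshold=0.1)
# Correlated next observation: only the second input moves by more than 0.1.
m2 = delta_layer_step(weights, last_sent, pre_act, [1.05, 2.0], threshold=0.1)
```

When consecutive observations are strongly correlated, as is common in RL environments, most inputs stay below the threshold and the multiplication count drops accordingly. Pruned (zero) weights could be skipped in the inner loop for a further reduction.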

【13】 Visual Attention Prediction Improves Performance of Autonomous Drone Racing Agents

Link: https://arxiv.org/abs/2201.02569
Authors: Christian Pfeiffer, Simon Wengeler, Antonio Loquercio, Davide Scaramuzza
Comments: 12 pages, 6 figures
Abstract: Humans race drones faster than neural networks trained for end-to-end autonomous flight. This may be related to the ability of human pilots to select task-relevant visual information effectively. This work investigates whether neural networks capable of imitating human eye gaze behavior and attention can improve neural network performance for the challenging task of vision-based autonomous drone racing. We hypothesize that gaze-based attention prediction can be an efficient mechanism for visual information selection and decision making in a simulator-based drone racing task. We test this hypothesis using eye gaze and flight trajectory data from 18 human drone pilots to train a visual attention prediction model. We then use this visual attention prediction model to train an end-to-end controller for vision-based autonomous drone racing using imitation learning. We compare the drone racing performance of the attention-prediction controller to those using raw image inputs and image-based abstractions (i.e., feature tracks). Our results show that attention-prediction based controllers outperform the baselines and are able to complete a challenging race track consistently with up to 88% success rate. Furthermore, visual attention-prediction and feature-track based models showed better generalization performance than image-based models when evaluated on hold-out reference trajectories. Our results demonstrate that human visual attention prediction improves the performance of autonomous vision-based drone racing agents and provides an essential step towards vision-based, fast, and agile autonomous flight that eventually can reach and even exceed human performance.

【14】 Security Considerations for Virtual Reality Systems

Link: https://arxiv.org/abs/2201.02563
Authors: Karthik Viswanathan
Abstract: There is a growing need for authentication methodology in virtual reality applications. Current systems assume that the immersive experience technology is a collection of peripheral devices connected to a personal computer or mobile device. Hence there is a complete reliance on the computing device with traditional authentication mechanisms to handle the authentication and authorization decisions. Using the virtual reality controllers and headset poses a different set of challenges, as it is subject to unauthorized observation that goes unnoticed by the user, because the headset completely covers the field of vision in order to provide an immersive experience. As the need for virtual reality experiences in the commercial world increases, there is a need to provide alternative mechanisms for secure authentication. In this paper, we analyze a few proposed authentication systems and conclude that a multidimensional approach to authentication is needed to address the granular nature of the authentication and authorization needs of commercial virtual reality applications.

【15】 A Novel Incremental Learning Driven Instance Segmentation Framework to Recognize Highly Cluttered Instances of the Contraband Items

Link: https://arxiv.org/abs/2201.02560
Authors: Taimur Hassan, Samet Akcay, Mohammed Bennamoun, Salman Khan, Naoufel Werghi
Comments: Accepted in IEEE T-SMC: Systems, Source code is available at this https URL
Abstract: Screening cluttered and occluded contraband items from baggage X-ray scans is a cumbersome task even for expert security staff. This paper presents a novel strategy that extends a conventional encoder-decoder architecture to perform instance-aware segmentation and extract merged instances of contraband items without using any additional sub-network or an object detector. The encoder-decoder network first performs conventional semantic segmentation and retrieves cluttered baggage items. The model then incrementally evolves during training to recognize individual instances using significantly reduced training batches. To avoid catastrophic forgetting, a novel objective function minimizes the network loss in each iteration by retaining the previously acquired knowledge while learning new class representations and resolving their complex structural inter-dependencies through Bayesian inference. A thorough evaluation of our framework on two publicly available X-ray datasets shows that it outperforms state-of-the-art methods, especially within the challenging cluttered scenarios, while achieving an optimal trade-off between detection accuracy and efficiency.

【16】 Project IRL: Playful Co-Located Interactions with Mobile Augmented Reality

Link: https://arxiv.org/abs/2201.02558
Authors: Ella Dagan, Ana Cárdenas Gasca, Ava Robinson, Anwar Noriega, Yu Jiang Tham, Rajan Vaish, Andrés Monroy-Hernández
Abstract: We present Project IRL (In Real Life), a suite of five mobile apps we created to explore novel ways of supporting in-person social interactions with augmented reality. In recent years, the tone of public discourse surrounding digital technology has become increasingly critical, and technology's influence on the way people relate to each other has been blamed for making people feel "alone together," diverting their attention from truly engaging with one another when they interact in person. Motivated by this challenge, we focus on an under-explored design space: playful co-located interactions. We evaluated the apps through a deployment study that involved interviews and participant observations with 101 people. We synthesized the results into a series of design guidelines that focus on four themes: (1) device arrangement (e.g., are people sharing one phone, or does each person have their own?), (2) enablers (e.g., should the activity focus on an object, body part, or pet?), (3) affordances of modifying reality (i.e., features of the technology that enhance its potential to encourage various aspects of social interaction), and (4) co-located play (i.e., using technology to make in-person play engaging and inviting). We conclude by presenting our design guidelines for future work on embodied social AR.

【17】 In Situ Data Summaries for Flexible Feature Analysis in Large-Scale Multiphase Flow Simulations

Link: https://arxiv.org/abs/2201.02557
Authors: Soumya Dutta, Terece Turton, David Rogers, Jordan Musser, James Ahrens, Ann Almgren
Abstract: The study of multiphase flow is essential for understanding the complex interactions of various materials. In particular, when designing chemical reactors such as fluidized bed reactors (FBR), a detailed understanding of the hydrodynamics is critical for optimizing reactor performance and stability. An FBR allows experts to conduct different types of chemical reactions involving multiphase materials, especially interaction between gas and solids. During such complex chemical processes, the formation of void regions in the reactor, generally termed bubbles, is an important phenomenon. The study of these bubbles has deep implications for predicting the reactor's overall efficiency. But the physical experiments needed to understand bubble dynamics are costly and non-trivial. Therefore, to study such chemical processes and bubble dynamics, a state-of-the-art massively parallel computational fluid dynamics discrete element model (CFD-DEM), MFIX-Exa, is being developed for simulating multiphase flows. Despite the proven accuracy of MFIX-Exa in modeling bubbling phenomena, the very large size of the output data prohibits the use of traditional post hoc analysis capabilities in both storage and I/O time. To address these issues and allow the application scientists to explore the bubble dynamics in an efficient and timely manner, we have developed an end-to-end visual analytics pipeline that enables in situ detection of bubbles using statistical techniques, followed by a flexible and interactive visual exploration of bubble dynamics in the post hoc analysis phase. Positive feedback from the experts has indicated the efficacy of the proposed approach for exploring bubble dynamics in very-large-scale multiphase flow simulations.

【18】 On robust risk-based active-learning algorithms for enhanced decision support

Link: https://arxiv.org/abs/2201.02555
Authors: Aidan J. Hughes, Lawrence A. Bull, Paul Gardner, Nikolaos Dervilis, Keith Worden
Comments: 48 pages, 39 figures, submitted to Mechanical Systems and Signal Processing
Abstract: Classification models are a fundamental component of physical-asset management technologies such as structural health monitoring (SHM) systems and digital twins. Previous work introduced risk-based active learning, an online approach for the development of statistical classifiers that takes into account the decision-support context in which they are applied. Decision-making is considered by preferentially querying data labels according to expected value of perfect information (EVPI). Although several benefits are gained by adopting a risk-based active learning approach, including improved decision-making performance, the algorithms suffer from issues relating to sampling bias as a result of the guided querying process. This sampling bias ultimately manifests as a decline in decision-making performance during the later stages of active learning, which in turn corresponds to lost resource/utility. The current paper proposes two novel approaches to counteract the effects of sampling bias: semi-supervised learning, and discriminative classification models. These approaches are first visualised using a synthetic dataset, then subsequently applied to an experimental case study, specifically, the Z24 Bridge dataset. The semi-supervised learning approach is shown to have variable performance, with robustness to sampling bias dependent on the suitability of the generative distributions selected for the model with respect to each dataset. In contrast, the discriminative classifiers are shown to have excellent robustness to the effects of sampling bias. Moreover, it was found that the number of inspections made during a monitoring campaign, and therefore resource expenditure, could be reduced with the careful selection of the statistical classifiers used within a decision-supporting monitoring system.

【19】 Code-Switching Text Augmentation for Multilingual Speech Processing

Link: https://arxiv.org/abs/2201.02550
Authors: Amir Hussein, Shammur Absar Chowdhury, Ahmed Abdelali, Najim Dehak, Ahmed Ali
Abstract: The pervasiveness of intra-utterance code-switching (CS) in spoken content has forced ASR systems to handle mixed input. Yet, designing a CS-ASR has many challenges, mainly due to data scarcity, grammatical structure complexity, and mismatch along with unbalanced language usage distribution. Recent ASR studies showed the predominance of E2E-ASR using multilingual data to handle CS phenomena with little CS data. However, the dependency on the CS data still remains. In this work, we propose a methodology to augment the monolingual data for artificially generating spoken CS text to improve different speech modules. We based our approach on Equivalence Constraint theory while exploiting aligned translation pairs, to generate grammatically valid CS content. Our empirical results show a relative gain of 29-34% in perplexity and around 2% in WER for two ecological and noisy CS test sets. Finally, the human evaluation suggests that 83.8% of the generated data is acceptable to humans.

【20】 Improving Surrogate Gradient Learning in Spiking Neural Networks via Regularization and Normalization

Link: https://arxiv.org/abs/2201.02538
Authors: Nandan Meda
Comments: Bachelor Thesis
Abstract: Spiking neural networks (SNNs) are different from the classical networks used in deep learning: the neurons communicate using electrical impulses called spikes, just like biological neurons. SNNs are appealing for AI technology, because they could be implemented on low power neuromorphic chips. However, SNNs generally remain less accurate than their analog counterparts. In this report, we examine various regularization and normalization techniques with the goal of improving surrogate gradient learning in SNNs.

【21】 MGAE: Masked Autoencoders for Self-Supervised Learning on Graphs

Link: https://arxiv.org/abs/2201.02534
Authors: Qiaoyu Tan, Ninghao Liu, Xiao Huang, Rui Chen, Soo-Hyun Choi, Xia Hu
Abstract: We introduce a novel masked graph autoencoder (MGAE) framework to perform effective learning on graph-structured data. Taking insights from self-supervised learning, we randomly mask a large proportion of edges and try to reconstruct these missing edges during training. MGAE has two core designs. First, we find that masking a high ratio of the input graph structure, e.g., 70%, yields a nontrivial and meaningful self-supervisory task that benefits downstream applications. Second, we employ a graph neural network (GNN) as an encoder to perform message propagation on the partially-masked graph. To reconstruct the large number of masked edges, a tailored cross-correlation decoder is proposed. It can capture the cross-correlation between the head and tail nodes of an anchor edge at multiple granularities. Coupling these two designs enables MGAE to be trained efficiently and effectively. Extensive experiments on multiple open datasets (Planetoid and OGB benchmarks) demonstrate that MGAE generally performs better than state-of-the-art unsupervised learning competitors on link prediction and node classification.
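The masking step itself is simple to sketch. The function below splits an edge list into a visible part (what an encoder would see) and a masked part (the reconstruction targets). The function name, the fixed seed, and uniform random sampling are assumptions for illustration; the GNN encoder and cross-correlation decoder from the abstract are not shown.

```python
import random

def mask_edges(edges, mask_ratio=0.7, seed=0):
    """Randomly split edges into (visible, masked) with the given mask ratio."""
    rng = random.Random(seed)
    shuffled = list(edges)
    rng.shuffle(shuffled)
    n_masked = int(round(mask_ratio * len(shuffled)))
    masked = shuffled[:n_masked]    # reconstruction targets during training
    visible = shuffled[n_masked:]   # edges used for message propagation
    return visible, masked

# A path graph with 10 edges: 7 are hidden and 3 remain visible.
edges = [(i, i + 1) for i in range(10)]
visible, masked = mask_edges(edges, mask_ratio=0.7)
```

With 70% of the edges hidden, the encoder cannot simply copy the adjacency structure, which is what makes the reconstruction task nontrivial.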

【22】 NeROIC: Neural Rendering of Objects from Online Image Collections

Link: https://arxiv.org/abs/2201.02533
Authors: Zhengfei Kuang, Kyle Olszewski, Menglei Chai, Zeng Huang, Panos Achlioptas, Sergey Tulyakov
Comments: Project page including code can be found at: this https URL
Abstract: We present a novel method to acquire object representations from online image collections, capturing high-quality geometry and material properties of arbitrary objects from photographs with varying cameras, illumination, and backgrounds. This enables various object-centric rendering applications such as novel-view synthesis, relighting, and harmonized background composition from challenging in-the-wild input. Using a multi-stage approach extending neural radiance fields, we first infer the surface geometry and refine the coarsely estimated initial camera parameters, while leveraging coarse foreground object masks to improve the training efficiency and geometry quality. We also introduce a robust normal estimation technique which eliminates the effect of geometric noise while retaining crucial details. Lastly, we extract surface material properties and ambient illumination, represented in spherical harmonics with extensions that handle transient elements, e.g. sharp shadows. The union of these components results in a highly modular and efficient object acquisition framework. Extensive evaluations and comparisons demonstrate the advantages of our approach in capturing high-quality geometry and appearance properties useful for rendering applications.

【23】 Learning Target-aware Representation for Visual Tracking via Informative Interactions

Link: https://arxiv.org/abs/2201.02526
Authors: Mingzhe Guo, Zhipeng Zhang, Heng Fan, Liping Jing, Yilin Lyu, Bing Li, Weiming Hu
Comments: 9 pages, 6 figures
Abstract: We introduce a novel backbone architecture to improve the target-perception ability of feature representation for tracking. Specifically, we observe that de facto frameworks perform feature matching simply using the outputs from the backbone for target localization, so there is no direct feedback from the matching module to the backbone network, especially the shallow layers. More concretely, only the matching module can directly access the target information (in the reference frame), while the representation learning of the candidate frame is blind to the reference target. As a consequence, the accumulation effect of target-irrelevant interference in the shallow stages may degrade the feature quality of deeper layers. In this paper, we approach the problem from a different angle by conducting multiple branch-wise interactions inside the Siamese-like backbone networks (InBN). At the core of InBN is a general interaction modeler (GIM) that injects the prior knowledge of the reference image into different stages of the backbone network, leading to better target-perception and robust distractor-resistance of candidate feature representation with negligible computation cost. The proposed GIM module and InBN mechanism are general and applicable to different backbone types including CNN and Transformer for improvements, as evidenced by our extensive experiments on multiple benchmarks. In particular, the CNN version (based on SiamCAR) improves the baseline with 3.2/6.9 absolute gains of SUC on LaSOT/TNL2K, respectively. The Transformer version obtains SUC scores of 65.7/52.0 on LaSOT/TNL2K, which are on par with recent state-of-the-art methods. Code and models will be released.

【24】 RxWhyQA: a clinical question-answering dataset with the challenge of multi-answer questions
标题:RxWhyQA:一个具有多答案问题挑战的临床问答数据集

链接:https://arxiv.org/abs/2201.02517
作者:Sungrim Moon,Huan He,Hongfang Liu,Jungwei W. Fan
备注:2 tables, 3 figures
摘要:Objectives: Create a dataset for the development and evaluation of clinical question-answering (QA) systems that can handle multi-answer questions. Materials and Methods: We leveraged the annotated relations from the 2018 National NLP Clinical Challenges (n2c2) corpus to generate a QA dataset. The 1-to-0 and 1-to-N drug-reason relations formed the unanswerable and multi-answer entries, which represent challenging scenarios lacking in the existing clinical QA datasets. Results: The resulting RxWhyQA dataset contains 91,440 QA entries, of which half are unanswerable, and 21% (n=19,269) of the answerable ones require multiple answers. The dataset conforms to the community-vetted Stanford Question Answering Dataset (SQuAD) format. Discussion: RxWhyQA is useful for comparing different systems that need to handle the zero- and multi-answer challenges, demanding dual mitigation of both false positive and false negative answers. Conclusion: We created and shared a clinical QA dataset with a focus on multi-answer questions to represent real-world scenarios.

【25】 The Efficiency of the ANS Entropy Encoding
标题:ANS熵编码的效率分析

链接:https://arxiv.org/abs/2201.02514
作者:Dmitry Kosolobov
备注:15 pages, 5 figures, 2 algorithms
摘要:Asymmetric Numeral Systems (ANS) is a class of entropy encoders by Duda that has had an immense impact on data compression, replacing arithmetic and Huffman coding. The optimality of ANS was studied by Duda et al., but the precise asymptotic behaviour of its redundancy (in comparison to the entropy) was not completely understood. In this paper we establish an optimal bound on the redundancy for the tabled ANS (tANS), the most popular ANS variant. Given a sequence $a_1,\ldots,a_n$ of letters from an alphabet $\{0,\ldots,\sigma-1\}$ such that each letter $a$ occurs in it $f_a$ times and $n=2^r$, the tANS encoder using Duda's "precise initialization" to fill tANS tables transforms this sequence into a bit string of length (frequencies are not included in the encoding size): $$ \sum\limits_{a\in [0..\sigma)}f_a\cdot\log\frac{n}{f_a}+O(\sigma+r), $$ where $O(\sigma + r)$ can be bounded by $\sigma\log e+r$. The $r$-bit term is an encoder artifact indispensable to ANS; the rest incurs a redundancy of $O(\frac{\sigma}{n})$ bits per letter. We complement this bound by a series of examples showing that an $\Omega(\sigma+r)$ redundancy is necessary when $\sigma > n/3$, where $\Omega(\sigma + r)$ is at least $\frac{\sigma-1}{4}+r-2$. We argue that similar examples exist for any method that distributes letters in tANS tables using only the knowledge about frequencies. Thus, we refute Duda's conjecture that the redundancy is $O(\frac{\sigma}{n^2})$ bits per letter. We also propose a new variant of range ANS (rANS), called rANS with fixed accuracy, that is parameterized by $k \ge 1$. In this variant the integer division, which is unavoidable in rANS, is performed only in cases when its result belongs to $[2^k..2^{k+1})$. Hence, the division can be computed by faster methods provided $k$ is small. We bound the redundancy for the rANS with fixed accuracy $k$ by $\frac{n}{2^k-1}\log e+r$.
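
To make the stated bound concrete, the following minimal sketch evaluates the entropy term $\sum_a f_a\log\frac{n}{f_a}$ and the additive overhead $\sigma\log e+r$ for a given sequence. It does not implement tANS itself. The function name `tans_length_bound` is hypothetical, logarithms are assumed to be base 2 (the bound is stated in bits), and $\sigma$ is taken here as the number of distinct letters that actually occur, which is an assumption of this sketch.

```python
import math
from collections import Counter


def tans_length_bound(seq):
    """Return (entropy_bits, overhead_bits) for the bound quoted in the abstract.

    Assumptions of this sketch: log is base 2, and sigma is the number of
    distinct letters occurring in seq. The sequence length must be n = 2**r.
    """
    n = len(seq)
    r = n.bit_length() - 1
    if n == 0 or 2 ** r != n:
        raise ValueError("sequence length must be a power of two (n = 2^r)")
    freqs = Counter(seq)
    sigma = len(freqs)
    # Entropy term: sum over letters of f_a * log2(n / f_a).
    entropy_bits = sum(f * math.log2(n / f) for f in freqs.values())
    # Additive term bounded by sigma * log2(e) + r.
    overhead_bits = sigma * math.log2(math.e) + r
    return entropy_bits, overhead_bits


if __name__ == "__main__":
    entropy_bits, overhead_bits = tans_length_bound([0, 0, 1, 1])
    print(f"entropy term: {entropy_bits:.3f} bits, overhead bound: {overhead_bits:.3f} bits")
```

For short sequences the overhead term dominates, which matches the abstract's point that the redundancy per letter only becomes small when $\sigma$ is small relative to $n$.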

【26】 Predicting Patient Readmission Risk from Medical Text via Knowledge Graph Enhanced Multiview Graph Convolution
标题:基于知识图增强多视图卷积的医学文本再入院风险预测

链接:https://arxiv.org/abs/2201.02510
作者:Qiuhao Lu,Thien Huu Nguyen,Dejing Dou
备注:SIGIR 2021
摘要:Unplanned intensive care unit (ICU) readmission rate is an important metric for evaluating the quality of hospital care. Efficient and accurate prediction of ICU readmission risk can not only help prevent patients from inappropriate discharge and potential dangers, but also reduce associated costs of healthcare. In this paper, we propose a new method that uses medical text of Electronic Health Records (EHRs) for prediction, which provides an alternative perspective to previous studies that heavily depend on numerical and time-series features of patients. More specifically, we extract discharge summaries of patients from their EHRs, and represent them with multiview graphs enhanced by an external knowledge graph. Graph convolutional networks are then used for representation learning. Experimental results prove the effectiveness of our method, yielding state-of-the-art performance for this task.

【27】 Evaluation of Cyber Attacks Targeting Internet Facing IoT : An Experimental Evaluation
标题:面向互联网面向物联网的网络攻击评估:一项实验评估

链接:https://arxiv.org/abs/2201.02506
作者:Navod Neranjan Thilakrathne,Rohan Samarasinghe,Madhuka Priyashan
摘要:The rapid growth of Information and Communication Technology (ICT) in the 21st century has resulted in the emergence of a novel technological paradigm known as the Internet of Things, or IoT. The IoT, which is at the heart of today's smart infrastructure, aids in the creation of a ubiquitous network of things by simplifying interconnection between smart digital devices and enabling Machine to Machine (M2M) communication. There are now numerous IoT use cases that help people make their lives easier and more convenient. Along with the latest advancements in IoT, a variety of cyber-attacks target these pervasive IoT environments, and these can even jeopardize the lives of the people involved. In general, the IoT can be considered to include every digital object that is connected to the Internet for intercommunication. To analyse cyber threats that arrive through the Internet, we conduct an experimental evaluation of the requests received in attempts to exploit the open Secure Shell (SSH) connection service of an IoT device, in our case a Raspberry Pi that was connected to the Internet for more than six consecutive days. With the SSH service opened, the Raspberry Pi acts as a honeypot device on which we can log and retrieve all login attempts received by the SSH service. After retrieving all the login requests made through the open SSH connection, we provide a comprehensive analysis, along with our observations about the origin of the requests and the focus areas of the intruders.
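
The log-and-analyse step described above can be sketched as follows. This is an illustration only, not the authors' tooling: the function name `summarize_attempts` is hypothetical, the sample lines are invented and use documentation-only IP ranges, and the regular expression assumes the common OpenSSH "Failed password for ... from ... port ..." message, whose exact form can vary with version and configuration.

```python
import re
from collections import Counter

# Matches the usual OpenSSH failed-login message (assumed log format).
FAILED_LOGIN = re.compile(
    r"Failed password for (?:invalid user )?(?P<user>\S+) "
    r"from (?P<ip>\S+) port \d+"
)


def summarize_attempts(log_lines):
    """Count failed SSH login attempts per source address and per user name."""
    by_ip, by_user = Counter(), Counter()
    for line in log_lines:
        match = FAILED_LOGIN.search(line)
        if match:
            by_ip[match.group("ip")] += 1
            by_user[match.group("user")] += 1
    return by_ip, by_user


# Invented sample lines for illustration (not data from the study).
SAMPLE_LOG = [
    "Jan 10 03:12:01 raspberrypi sshd[812]: Failed password for invalid user admin from 192.0.2.10 port 51234 ssh2",
    "Jan 10 03:12:05 raspberrypi sshd[812]: Failed password for root from 192.0.2.10 port 51240 ssh2",
    "Jan 10 03:15:44 raspberrypi sshd[901]: Failed password for pi from 203.0.113.7 port 40022 ssh2",
    "Jan 10 03:20:00 raspberrypi sshd[950]: Accepted password for pi from 198.51.100.3 port 50000 ssh2",
]

if __name__ == "__main__":
    by_ip, by_user = summarize_attempts(SAMPLE_LOG)
    print("attempts per source:", by_ip.most_common())
    print("attempts per user name:", by_user.most_common())
```

Grouping by source address supports the "origin of the requests" analysis, while grouping by user name gives a rough view of what intruders target.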

【28】 Repairing Adversarial Texts through Perturbation
标题:通过扰动修复敌意文本

链接:https://arxiv.org/abs/2201.02504
作者:Guoliang Dong,Jingyi Wang,Jun Sun,Sudipta Chattopadhyay,Xinyu Wang,Ting Dai,Jie Shi,Jin Song Dong
摘要:It is known that neural networks are subject to attacks through adversarial perturbations, i.e., inputs which are maliciously crafted through perturbations to induce wrong predictions. Furthermore, such attacks are impossible to eliminate, i.e., adversarial perturbation is still possible after applying mitigation methods such as adversarial training. Multiple approaches have been developed to detect and reject such adversarial inputs, mostly in the image domain. Rejecting suspicious inputs, however, may not always be feasible or ideal. First, normal inputs may be rejected due to false alarms generated by the detection algorithm. Second, denial-of-service attacks may be conducted by feeding such systems with adversarial inputs. To address the gap, in this work, we propose an approach to automatically repair adversarial texts at runtime. Given a text which is suspected to be adversarial, we apply multiple adversarial perturbation methods in a novel, positive way to identify a repair, i.e., a slightly mutated but semantically equivalent text that the neural network correctly classifies. Our approach has been evaluated with multiple models trained for natural language processing tasks, and the results show that it is effective, i.e., it successfully repairs about 80% of the adversarial texts. Furthermore, depending on the applied perturbation method, an adversarial text could be repaired in as little as one second on average.

【29】 A Review of Deep Learning Techniques for Markerless Human Motion on Synthetic Datasets
标题:基于合成数据集的无标记人体运动深度学习技术综述

链接:https://arxiv.org/abs/2201.02503
作者:Doan Duy Vo,Russell Butler
备注:11 pages, 5 figures, 2 tables
摘要:Markerless motion capture has become an active field of research in computer vision in recent years. Its extensive applications are known in a great variety of fields, including computer animation, human motion analysis, biomedical research, virtual reality, and sports science. Estimating human posture has recently gained increasing attention in the computer vision community, but due to the depth of uncertainty and the lack of synthetic datasets, it is a challenging task. Various approaches have recently been proposed to solve this problem, many of which are based on deep learning. They are primarily focused on improving performance on existing benchmarks, with significant advances, especially on 2D images. Based on powerful deep learning techniques and recently collected real-world datasets, we explored a model that can predict the skeleton of an animation based solely on 2D images. Frames are generated from different real-world datasets, with synthesized poses using different body shapes ranging from simple to complex. The implementation process uses DeepLabCut on its own dataset to perform many necessary steps, and then uses the input frames to train the model. The output is an animated skeleton for human movement. The composite dataset and other results are the "ground truth" of the deep model.

【30】 Sign Language Video Retrieval with Free-Form Textual Queries
标题:基于自由格式文本查询的手语视频检索

链接:https://arxiv.org/abs/2201.02495
作者:Amanda Duarte,Samuel Albanie,Xavier Giró-i-Nieto,Gül Varol
摘要:Systems that can efficiently search collections of sign language videos have been highlighted as a useful application of sign language technology. However, the problem of searching videos beyond individual keywords has received limited attention in the literature. To address this gap, in this work we introduce the task of sign language retrieval with free-form textual queries: given a written query (e.g., a sentence) and a large collection of sign language videos, the objective is to find the signing video in the collection that best matches the written query. We propose to tackle this task by learning cross-modal embeddings on the recently introduced large-scale How2Sign dataset of American Sign Language (ASL). We identify that a key bottleneck in the performance of the system is the quality of the sign video embedding which suffers from a scarcity of labeled training data. We, therefore, propose SPOT-ALIGN, a framework for interleaving iterative rounds of sign spotting and feature alignment to expand the scope and scale of available training data. We validate the effectiveness of SPOT-ALIGN for learning a robust sign video embedding through improvements in both sign recognition and the proposed video retrieval task.

【31】 Video Summarization Based on Video-text Representation
标题:基于图文表示的视频摘要

链接:https://arxiv.org/abs/2201.02494
作者:Li Haopeng,Ke Qiuhong,Gong Mingming,Zhang Rui
摘要:Modern video summarization methods are based on deep neural networks which require a large amount of annotated data for training. However, existing datasets for video summarization are small-scale, easily leading to over-fitting of the deep models. Considering that the annotation of large-scale datasets is time-consuming, we propose a multimodal self-supervised learning framework to obtain semantic representations of videos, which benefits the video summarization task. Specifically, we explore the semantic consistency between the visual information and text information of videos, for the self-supervised pretraining of a multimodal encoder on a newly-collected dataset of video-text pairs. Additionally, we introduce a progressive video summarization method, where the important content in a video is pinpointed progressively to generate better summaries. Finally, an objective evaluation framework is proposed to measure the quality of video summaries based on video classification. Extensive experiments have proved the effectiveness and superiority of our method in rank correlation coefficients, F-score, and the proposed objective evaluation compared to the state of the art.

【32】 Audio representations for deep learning in sound synthesis: A review
标题:声音合成中深度学习的音频表征:综述

链接:https://arxiv.org/abs/2201.02490
作者:Anastasia Natsiou,Sean O'Leary
摘要:The rise of deep learning algorithms has led many researchers to withdraw from using classic signal processing methods for sound generation. Deep learning models have achieved expressive voice synthesis, realistic sound textures, and musical notes from virtual instruments. However, the most suitable deep learning architecture is still under investigation. The choice of architecture is tightly coupled to the audio representations. A sound's original waveform can be too dense and rich for deep learning models to deal with efficiently - and complexity increases training time and computational cost. Also, it does not represent sound in the manner in which it is perceived. Therefore, in many cases, the raw audio has been transformed into a compressed and more meaningful form using upsampling, feature-extraction, or even by adopting a higher level illustration of the waveform. Furthermore, conditional on the form chosen, additional conditioning representations, different model architectures, and numerous metrics for evaluating the reconstructed sound have been investigated. This paper provides an overview of audio representations applied to sound synthesis using deep learning. Additionally, it presents the most significant methods for developing and evaluating a sound synthesis architecture using deep learning models, always depending on the audio representation.

【33】 Semantic-based Data Augmentation for Math Word Problems
标题:基于语义的数学应用题数据增强

链接:https://arxiv.org/abs/2201.02489
作者:Ailisi Li,Jiaqing Liang,Yanghua Xiao
摘要:It is hard for neural math word problem (MWP) solvers to deal with tiny local variances. In the MWP task, some local changes preserve the original semantics while others may completely change the underlying logic. Existing datasets for the MWP task contain only a limited number of the samples that are key for neural models to learn to disambiguate different kinds of local variances in questions and to solve the questions correctly. In this paper, we propose a set of novel data augmentation approaches that supplement existing datasets with data augmented with different kinds of local variances, and that help to improve the generalization ability of current neural models. New samples are generated by knowledge-guided entity replacement and logic-guided problem reorganization. The augmentation approaches ensure consistency between the new data and their labels. Experimental results show the necessity and the effectiveness of our methods.

【34】 Sparse PCA on fixed-rank matrices
标题:固定秩矩阵上的稀疏PCA

链接:https://arxiv.org/abs/2201.02487
作者:Alberto Del Pia
摘要:Sparse PCA is the optimization problem obtained from PCA by adding a sparsity constraint on the principal components. Sparse PCA is NP-hard and hard to approximate even in the single-component case. In this paper we settle the computational complexity of sparse PCA with respect to the rank of the covariance matrix. We show that, if the rank of the covariance matrix is a fixed value, then there is an algorithm that solves sparse PCA to global optimality, whose running time is polynomial in the number of features. We also prove a similar result for the version of sparse PCA which requires the principal components to have disjoint supports.

【35】 Automated Dissipation Control for Turbulence Simulation with Shell Models
标题:壳模型湍流模拟中的自动耗散控制

链接:https://arxiv.org/abs/2201.02485
作者:Ann-Kathrin Dombrowski,Klaus-Robert Müller,Wolf Christian Müller
摘要:The application of machine learning (ML) techniques, especially neural networks, has seen tremendous success at processing images and language. This is because we often lack formal models to understand visual and audio input, so here neural networks can unfold their abilities as they can model solely from data. In the field of physics we typically have models that describe natural processes reasonably well on a formal level. Nonetheless, in recent years, ML has also proven useful in these realms, be it by speeding up numerical simulations or by improving accuracy. One important and so far unsolved problem in classical physics is understanding turbulent fluid motion. In this work we construct a strongly simplified representation of turbulence by using the Gledzer-Ohkitani-Yamada (GOY) shell model. With this system we intend to investigate the potential of ML-supported and physics-constrained small-scale turbulence modelling. Instead of standard supervised learning we propose an approach that aims to reconstruct statistical properties of turbulence such as the self-similar inertial-range scaling, where we could achieve encouraging experimental results. Furthermore we discuss pitfalls when combining machine learning with differential equations.

【36】 A sinusoidal signal reconstruction method for the inversion of the mel-spectrogram
标题:一种用于Mel谱图反演的正弦信号重构方法

链接:https://arxiv.org/abs/2201.02483
作者:Anastasia Natsiou,Sean O'Leary
摘要:The synthesis of sound via deep learning methods has recently received much attention. Some problems for deep learning approaches to sound synthesis relate to the amount of data needed to specify an audio signal and the necessity of preserving both the long and short time coherence of the synthesised signal. Visual time-frequency representations such as the log-mel-spectrogram have gained in popularity. The log-mel-spectrogram is a perceptually informed representation of audio that greatly compresses the amount of information required for the description of the sound. However, because of this compression, this representation is not directly invertible. Both signal processing and machine learning techniques have previously been applied to the inversion of the log-mel-spectrogram but they both caused audible distortions in the synthesized sounds due to issues of temporal and spectral coherence. In this paper, we outline the application of a sinusoidal model to the inversion of the log-mel-spectrogram for pitched musical instrument sounds outperforming state-of-the-art deep learning methods. The approach could be later used as a general decoding step from spectral to time intervals in neural applications.

【37】 Bayesian Neural Networks for Reversible Steganography
标题:用于可逆隐写的贝叶斯神经网络

链接:https://arxiv.org/abs/2201.02478
作者:Ching-Chun Chang
摘要:Recent advances in deep learning have led to a paradigm shift in reversible steganography. A fundamental pillar of reversible steganography is predictive modelling which can be realised via deep neural networks. However, non-trivial errors exist in inferences about some out-of-distribution and noisy data. In view of this issue, we propose to consider uncertainty in predictive models based upon a theoretical framework of Bayesian deep learning. Bayesian neural networks can be regarded as self-aware machinery; that is, a machine that knows its own limitations. To quantify uncertainty, we approximate the posterior predictive distribution through Monte Carlo sampling with stochastic forward passes. We further show that predictive uncertainty can be disentangled into aleatoric and epistemic uncertainties and these quantities can be learnt in an unsupervised manner. Experimental results demonstrate an improvement delivered by Bayesian uncertainty analysis upon steganographic capacity-distortion performance.
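
As a rough illustration of how predictive uncertainty can be split into aleatoric and epistemic parts from stochastic forward passes, the sketch below uses a decomposition that is common in the Bayesian deep learning literature: the mean of the predicted variances (aleatoric) and the variance of the predicted means (epistemic). The abstract does not state which formulas the paper uses, so this decomposition, the function name `decompose_uncertainty`, and the assumption that each pass returns a (mean, variance) pair are all assumptions of this sketch.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(samples):
    """Split predictive uncertainty given T stochastic forward passes.

    samples: list of (predicted_mean, predicted_variance) pairs, one per pass.
    Returns (predictive_mean, aleatoric, epistemic).
    """
    means = [m for m, _ in samples]
    variances = [v for _, v in samples]
    predictive_mean = fmean(means)   # Monte Carlo estimate of the prediction
    aleatoric = fmean(variances)     # average data noise reported by the model
    epistemic = pvariance(means)     # disagreement between forward passes
    return predictive_mean, aleatoric, epistemic


if __name__ == "__main__":
    # Hypothetical outputs of four stochastic passes for one pixel.
    passes = [(120.0, 4.0), (122.0, 5.0), (119.0, 3.0), (123.0, 4.0)]
    print(decompose_uncertainty(passes))
```

In a reversible-steganography setting such per-pixel estimates could, for example, be used to prefer low-uncertainty predictions, although how the paper exploits them is not described in the abstract.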

【38】 Modeling International Mobility using Roaming Cell Phone Traces during COVID-19 Pandemic
标题:利用冠状病毒大流行期间漫游手机痕迹模拟国际流动性

链接:https://arxiv.org/abs/2201.02470
作者:Massimiliano Luca,Bruno Lepri,Enrique Frias-Martinez,Andra Lutu
摘要:Most of the studies related to human mobility are focused on intra-country mobility. However, there are many scenarios (e.g., spreading diseases, migration) in which timely data on international commuters are vital. Mobile phones represent a unique opportunity to monitor international mobility flows in a timely manner and with proper spatial aggregation. This work proposes using roaming data generated by mobile phones to model incoming and outgoing international mobility. We use the gravity and radiation models to capture mobility flows before and during the introduction of non-pharmaceutical interventions. However, traditional models have some limitations: for instance, mobility restrictions are not explicitly captured and may play a crucial role. To overcome such limitations, we propose the COVID Gravity Model (CGM), an extension of the traditional gravity model that is tailored for the pandemic scenario. In terms of accuracy, the proposed approach outperforms the traditional models by 126.9% for incoming mobility and by 63.9% for outgoing mobility flows.
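
For readers unfamiliar with the baseline, the traditional gravity model is commonly written as $T_{ij} = K\,m_i^{\alpha} m_j^{\beta} / d_{ij}^{\gamma}$, where $m_i$ and $m_j$ are the populations (or other masses) of origin and destination and $d_{ij}$ is their distance. The sketch below evaluates that textbook form only; the function name, parameter names, and default exponents are assumptions, and the CGM extension is not reproduced because the abstract does not specify it.

```python
def gravity_flow(mass_origin, mass_destination, distance,
                 k=1.0, alpha=1.0, beta=1.0, gamma=2.0):
    """Estimated flow between two locations under the textbook gravity model.

    k, alpha, beta, gamma are free parameters that would normally be fitted
    to observed flows; the defaults here are placeholders.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * (mass_origin ** alpha) * (mass_destination ** beta) / (distance ** gamma)


if __name__ == "__main__":
    # Hypothetical masses and distance, for illustration only.
    print(gravity_flow(100.0, 200.0, 10.0))
```

Because nothing in this form depends on policy, travel restrictions cannot change the predicted flow, which is the limitation the abstract points to.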

【39】 Similarities and Differences between Machine Learning and Traditional Advanced Statistical Modeling in Healthcare Analytics
标题:医疗分析中机器学习与传统高级统计建模的异同

链接:https://arxiv.org/abs/2201.02469
作者:Michele Bennett,Karin Hayes,Ewa J. Kleczyk,Rajesh Mehta
备注:16 pages, 2 figures
摘要:Data scientists and statisticians are often at odds when determining the best approach, machine learning or statistical modeling, to solve an analytics challenge. However, machine learning and statistical modeling are more cousins than adversaries on different sides of an analysis battleground. Choosing between the two approaches or in some cases using both is based on the problem to be solved and outcomes required as well as the data available for use and circumstances of the analysis. Machine learning and statistical modeling are complementary, based on similar mathematical principles, but simply using different tools in an overall analytics knowledge base. Determining the predominant approach should be based on the problem to be solved as well as empirical evidence, such as size and completeness of the data, number of variables, assumptions or lack thereof, and expected outcomes such as predictions or causality. Good analysts and data scientists should be well versed in both techniques and their proper application, thereby using the right tool for the right project to achieve the desired results.

【40】 On The Decoding Error Weight of One or Two Deletion Channels
标题:关于一个或两个删除信道的译码误码权重

链接:https://arxiv.org/abs/2201.02466
作者:Omer Sabary,Daniella Bar-Lev,Yotam Gershon,Alexander Yucovich,Eitan Yaakobi
备注:arXiv admin note: text overlap with arXiv:2001.05582
摘要:This paper tackles two problems that are relevant to coding for insertions and deletions. These problems are motivated by several applications, among them reconstructing strands in DNA-based storage systems. Under this paradigm, a word is transmitted over some fixed number of identical independent channels and the goal of the decoder is to output the transmitted word or some close approximation of it. The first part of this paper studies the deletion channel that deletes a symbol with some fixed probability $p$, while focusing on two instances of this channel. Since operating the maximum likelihood (ML) decoder in this case is computationally infeasible, we study a slightly degraded version of this decoder for two channels and its expected normalized distance. We identify the dominant error patterns and, based on these observations, derive that the expected normalized distance of the degraded ML decoder is roughly $\frac{3q-1}{q-1}p^2$ when the transmitted word is any $q$-ary sequence and $p$ is the channel's deletion probability. We also study the cases when the transmitted word belongs to the Varshamov-Tenengolts (VT) code or the shifted VT code. Additionally, the insertion channel is studied, as well as the case of two insertion channels. These theoretical results are verified by corresponding simulations. The second part of the paper studies optimal decoding for a special case of the deletion channel, the $k$-deletion channel, which deletes exactly $k$ symbols of the transmitted word uniformly at random. In this part, the goal is to understand how an optimal decoder operates in order to minimize the expected normalized distance. A full characterization of an efficient optimal decoder for this setup, referred to as the maximum likelihood* (ML*) decoder, is given for a channel that deletes one or two symbols.
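
The two channel models named in the abstract are easy to simulate, which may help when reading the results. The sketch below is an illustration under the abstract's definitions only: a deletion channel that drops each symbol independently with probability $p$, and a $k$-deletion channel that drops exactly $k$ positions chosen uniformly at random. Function names are hypothetical, and no decoder is implemented.

```python
import random


def deletion_channel(word, p, rng):
    """Delete each symbol independently with probability p."""
    return [s for s in word if rng.random() >= p]


def k_deletion_channel(word, k, rng):
    """Delete exactly k symbols, at positions chosen uniformly at random."""
    if not 0 <= k <= len(word):
        raise ValueError("k must be between 0 and len(word)")
    deleted = set(rng.sample(range(len(word)), k))
    return [s for i, s in enumerate(word) if i not in deleted]


if __name__ == "__main__":
    rng = random.Random(2022)
    strand = list("ACGTACGTAC")
    # Two independent noisy copies of the same word, as in the two-channel setting.
    print("".join(deletion_channel(strand, 0.1, rng)))
    print("".join(deletion_channel(strand, 0.1, rng)))
    print("".join(k_deletion_channel(strand, 2, rng)))
```

Passing an explicit `random.Random` instance keeps simulations reproducible across runs.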

【41】 Churn prediction in online gambling
标题:在线赌博中的流失预测

链接:https://arxiv.org/abs/2201.02463
作者:Florian Merchie,Damien Ernst
备注:14 pages, 3 figures Submitted to Expert Systems with Applications
摘要:In business retention, churn prevention has always been a major concern. This work contributes to this domain by formalizing the problem of churn prediction in the context of online gambling as a binary classification task. We also propose an algorithmic answer to this problem based on recurrent neural networks. This algorithm is tested with online gambling data in the form of time series, which can be efficiently processed by recurrent neural networks. To evaluate the performance of the trained models, standard machine learning metrics were used, such as accuracy, precision and recall. For this problem in particular, the experiments showed that the choice of a specific architecture depends on which metric is given the greatest importance. Architectures using nBRC favour precision, those using LSTM give better recall, while GRU-based architectures achieve higher accuracy and balance the other two metrics. Moreover, further experiments showed that using only the more recent time-series histories to train the networks decreases the quality of the results. We also study the performance of models learned at a specific instant $t$ when applied at other times $t^{\prime} > t$. The results show that the performance of the models learned at time $t$ remains good at the following instants $t^{\prime} > t$, suggesting that there is no need to refresh the models at a high rate. However, the performance of the models was subject to noticeable variance due to one-off events impacting the data.
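
Since the architecture comparison hinges on which metric matters most, it helps to recall how the three metrics differ. The sketch below computes accuracy, precision and recall for a binary churn label (1 = churner) from their standard definitions. The function name and the toy labels are illustrative and are not taken from the paper.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    accuracy = (tp + tn) / len(pairs)
    # Precision: of the predicted churners, how many really churned.
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of the real churners, how many were caught.
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall


if __name__ == "__main__":
    y_true = [1, 0, 1, 1, 0]
    y_pred = [1, 0, 0, 1, 1]
    print(binary_metrics(y_true, y_pred))
```

A retention team with a limited budget for incentives would typically care more about precision, while one that wants to miss as few churners as possible would favour recall.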

【42】 A SIMD algorithm for the detection of epistatic interactions of any order
标题:检测任意阶上位性相互作用的SIMD算法

链接:https://arxiv.org/abs/2201.02460
作者:Christian Ponte-Fernández,Jorge González-Domínguez,María J. Martín
备注:Submitted to Future Generation Computer Systems. Codes used are available at this https URL
摘要:Epistasis is a phenomenon in which a phenotype outcome is determined by the interaction of genetic variation at two or more loci and it cannot be attributed to the additive combination of effects corresponding to the individual loci. Although it has been more than 100 years since William Bateson introduced this concept, it still is a topic under active research. Locating epistatic interactions is a computationally expensive challenge that involves analyzing an exponentially growing number of combinations. Authors in this field have resorted to a multitude of hardware architectures in order to speed up the search, but little to no attention has been paid to the vector instructions that current CPUs include in their instruction sets. This work extends an existing third-order exhaustive algorithm to support the search of epistasis interactions of any order and discusses multiple SIMD implementations of the different functions that compose the search using Intel AVX Intrinsics. Results using the GCC and the Intel compiler show that the 512-bit explicit vector implementation proposed here performs the best out of all of the other implementations evaluated. The proposed 512-bit vectorization accelerates the original implementation of the algorithm by an average factor of 7 and 12, for GCC and the Intel Compiler, respectively, in the scenarios tested.
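
To give a sense of why exhaustive search is expensive, the number of candidate interactions of order $k$ among $n$ loci is the binomial coefficient $\binom{n}{k}$. The short sketch below tabulates it for a hypothetical dataset of 1,000 loci. The dataset size is an assumption chosen for illustration, not a figure from the paper.

```python
import math


def interaction_count(num_loci, order):
    """Number of distinct locus combinations of the given interaction order."""
    return math.comb(num_loci, order)


if __name__ == "__main__":
    for order in (2, 3, 4, 5):
        print(order, interaction_count(1000, order))
```

Each additional order multiplies the work by roughly $n/k$, which is why speeding up the per-combination kernel with vector instructions matters.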

【43】 Deep Learnable Strategy Templates for Multi-Issue Bilateral Negotiation
标题:多议题双边谈判的深度学习策略模板

链接:https://arxiv.org/abs/2201.02455
作者:Pallavi Bagga,Nicola Paoletti,Kostas Stathis
备注:arXiv admin note: text overlap with arXiv:2009.08302
摘要:We study how to exploit the notion of strategy templates to learn strategies for multi-issue bilateral negotiation. Each strategy template consists of a set of interpretable parameterized tactics that are used to decide an optimal action at any time. We use deep reinforcement learning throughout an actor-critic architecture to estimate the tactic parameter values for a threshold utility, when to accept an offer and how to generate a new bid. This contrasts with existing work that only estimates the threshold utility for those tactics. We pre-train the strategy by supervision from the dataset collected using "teacher strategies", thereby decreasing the exploration time required for learning during negotiation. As a result, we build automated agents for multi-issue negotiations that can adapt to different negotiation domains without the need to be pre-programmed. We empirically show that our work outperforms the state-of-the-art in terms of the individual as well as social efficiency.

【44】 Analytical calculation formulas for capacities of classical and classical-quantum channels
标题:经典信道和经典量子信道容量的解析计算公式

链接:https://arxiv.org/abs/2201.02450
作者:Masahito Hayashi
摘要:We derive an analytical calculation formula for the channel capacity of a classical channel that requires no iteration, whereas existing algorithms require iterations, and the number of iterations depends on the required precision level. Hence, our formula is the first analytical formula for this capacity without any iteration. We apply the obtained formula to examples and see how it works in these examples. Then, we extend it to the channel capacity of a classical-quantum (cq-) channel. Many existing studies proposed algorithms for a cq-channel, and all of them require iterations. Our extended analytical algorithm also requires no iteration and outputs the exact optimum values.
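
For contrast with an iteration-free formula, the sketch below shows one well-known iterative scheme for classical channel capacity, the Blahut-Arimoto algorithm. It is named here purely as an illustration: the abstract does not say which existing algorithms it compares against, and the paper's analytical formula is not reproduced. The function name, tolerance, and iteration cap are assumptions of this sketch.

```python
import math


def blahut_arimoto(W, tol=1e-9, max_iter=10_000):
    """Iteratively estimate channel capacity in bits.

    W[x][y] is the transition probability P(y | x).
    Returns (capacity_bits, iterations_used).
    """
    nx, ny = len(W), len(W[0])
    p = [1.0 / nx] * nx  # start from the uniform input distribution
    lower = 0.0
    for it in range(1, max_iter + 1):
        # Output distribution induced by the current input distribution.
        q = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
        # d[x] = KL divergence between row W[x] and q, in nats.
        d = [sum(W[x][y] * math.log(W[x][y] / q[y])
                 for y in range(ny) if W[x][y] > 0)
             for x in range(nx)]
        z = sum(p[x] * math.exp(d[x]) for x in range(nx))
        lower, upper = math.log(z), max(d)  # standard capacity bounds
        if upper - lower < tol:
            return lower / math.log(2), it
        p = [p[x] * math.exp(d[x]) / z for x in range(nx)]
    return lower / math.log(2), max_iter


if __name__ == "__main__":
    z_channel = [[1.0, 0.0], [0.5, 0.5]]
    capacity, iterations = blahut_arimoto(z_channel)
    print(f"capacity ~ {capacity:.6f} bits after {iterations} iterations")
```

The iteration count grows as the tolerance shrinks, which is the precision dependence the abstract refers to.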

【45】 Online 3-Axis Magnetometer Hard-Iron and Soft-Iron Bias and Angular Velocity Sensor Bias Estimation Using Angular Velocity Sensors for Improved Dynamic Heading Accuracy
标题:在线三轴磁强计硬铁和软铁偏差和角速度传感器偏差估计使用角速度传感器提高动态航向精度

链接:https://arxiv.org/abs/2201.02449
作者:Andrew R. Spielvogel,Abhimanyu S. Shah,Louis L. Whitcomb
备注:Preprint of an article accepted for publication in Field Robotics, this https URL, Special Issue in Unmanned Marine Systems. Submitted January 16, 2021; Revised May 28, 2021; Accepted August 2, 2021
摘要:This article addresses the problem of dynamic on-line estimation and compensation of hard-iron and soft-iron biases of 3-axis magnetometers under dynamic motion in field robotics, utilizing only biased measurements from a 3-axis magnetometer and a 3-axis angular rate sensor. The proposed magnetometer and angular velocity bias estimator (MAVBE) utilizes a 15-state process model encoding the nonlinear process dynamics for the magnetometer signal subject to angular velocity excursions, while simultaneously estimating 9 magnetometer bias parameters and 3 angular rate sensor bias parameters, within an extended Kalman filter framework. Bias parameter local observability is numerically evaluated. The bias-compensated signals, together with 3-axis accelerometer signals, are utilized to estimate bias compensated magnetic geodetic heading. Performance of the proposed MAVBE method is evaluated in comparison to the widely cited magnetometer-only TWOSTEP method in numerical simulations, laboratory experiments, and full-scale field trials of an instrumented autonomous underwater vehicle in the Chesapeake Bay, MD, USA. For the proposed MAVBE, (i) instrument attitude is not required to estimate biases, and the results show that (ii) the biases are locally observable, (iii) the bias estimates converge rapidly to true bias parameters, (iv) only modest instrument excitation is required for bias estimate convergence, and (v) compensation for magnetometer hard-iron and soft-iron biases dramatically improves dynamic heading estimation accuracy.

【46】 k-Center Clustering with Outliers in Sliding Windows
标题:滑动窗口中带离群点的K-中心聚类

链接:https://arxiv.org/abs/2201.02448
作者:Paolo Pellizzoni,Andrea Pietracaprina,Geppino Pucci
摘要:Metric $k$-center clustering is a fundamental unsupervised learning primitive. Although widely used, this primitive is heavily affected by noise in the data, so a more sensible variant seeks the best solution that disregards a given number $z$ of points of the dataset, called outliers. We provide efficient algorithms for this important variant in the streaming model under the sliding window setting, where, at each time step, the dataset to be clustered is the window $W$ of the most recent data items. Our algorithms achieve $O(1)$ approximation and, remarkably, require a working memory linear in $k+z$ and only logarithmic in $|W|$. As a by-product, we show how to estimate the effective diameter of the window $W$, which is a measure of the spread of the window points, disregarding a given fraction of noisy distances. We also provide experimental evidence of the practical viability of our theoretical results.
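The objective described in this abstract can be illustrated with a minimal sketch. It only evaluates the $k$-center cost with $z$ outliers for a fixed set of centers; it is not the paper's sliding-window algorithm, and the function name, the sample window, and the centers are assumptions made for illustration.

```python
import math


def kcenter_outlier_radius(points, centers, z):
    """Radius needed to cover every point except the z farthest ones.

    Hypothetical helper for illustration: it evaluates the objective for
    given centers and does not choose the centers itself.
    """
    # Distance from each point to its nearest center, largest first.
    dists = sorted(
        (min(math.dist(p, c) for c in centers) for p in points),
        reverse=True,
    )
    # Discard the z largest distances (the outliers); the next one is the radius.
    return dists[z] if z < len(dists) else 0.0


# Illustrative window W with one obvious noisy point at (100, 100).
window = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (100, 100)]
centers = [(0, 0), (10, 10)]

print(kcenter_outlier_radius(window, centers, z=0))  # dominated by the noisy point
print(kcenter_outlier_radius(window, centers, z=1))  # noisy point disregarded
```

With `z=0` the single noisy point dictates the radius, while allowing one outlier brings the radius down to the spread of the two real clusters, which is the sensitivity to noise that motivates the variant.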

【47】 Bregman divergence based em algorithm and its application to classical and quantum rate distortion theory
标题:基于Bregman散度的em算法及其在经典和量子率失真理论中的应用

链接:https://arxiv.org/abs/2201.02447
作者:Masahito Hayashi
摘要:We formulate the em algorithm in the framework of Bregman divergence, which is a general problem setting of information geometry. That is, we address the minimization problem of the Bregman divergence between an exponential subfamily and a mixture subfamily in a Bregman divergence system. Then, we show the convergence and its speed under several conditions. We apply this algorithm to rate distortion and its variants including the quantum setting, and show the usefulness of our general algorithm.

【48】 Continuous-time Radar-inertial Odometry for Automotive Radars
标题:汽车雷达的连续时间雷达惯性里程计

链接:https://arxiv.org/abs/2201.02437
作者:Yin Zhi Ng,Benjamin Choi,Robby Tan,Lionel Heng
备注:In Proceedings of the 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)
摘要:We present an approach for radar-inertial odometry which uses a continuous-time framework to fuse measurements from multiple automotive radars and an inertial measurement unit (IMU). Adverse weather conditions do not have a significant impact on the operating performance of radar sensors unlike that of camera and LiDAR sensors. Radar's robustness in such conditions and the increasing prevalence of radars on passenger vehicles motivate us to look at the use of radar for ego-motion estimation. A continuous-time trajectory representation is applied not only as a framework to enable heterogeneous and asynchronous multi-sensor fusion, but also, to facilitate efficient optimization by being able to compute poses and their derivatives in closed-form and at any given time along the trajectory. We compare our continuous-time estimates to those from a discrete-time radar-inertial odometry approach and show that our continuous-time method outperforms the discrete-time method. To the best of our knowledge, this is the first time a continuous-time framework has been applied to radar-inertial odometry.

【49】 Spatial-Temporal Sequential Hypergraph Network for Crime Prediction
标题:时空序列超图网络在犯罪预测中的应用

链接:https://arxiv.org/abs/2201.02435
作者:Lianghao Xia,Chao Huang,Yong Xu,Peng Dai,Liefeng Bo,Xiyue Zhang,Tianyi Chen
备注:IJCAI 2021 Research Paper
摘要:Crime prediction is crucial for public safety and resource optimization, yet is very challenging due to two aspects: i) the dynamics of criminal patterns across time and space, as crime events are distributed unevenly over both the spatial and temporal domains; ii) time-evolving dependencies between different types of crimes (e.g., Theft, Robbery, Assault, Damage), which reveal fine-grained semantics of crimes. To tackle these challenges, we propose Spatial-Temporal Sequential Hypergraph Network (ST-SHN) to collectively encode complex crime spatial-temporal patterns as well as the underlying category-wise crime semantic relationships. Specifically, to handle spatial-temporal dynamics under the long-range and global context, we design a graph-structured message passing architecture with the integration of the hypergraph learning paradigm. To capture category-wise crime heterogeneous relations in a dynamic environment, we introduce a multi-channel routing mechanism to learn the time-evolving structural dependency across crime types. We conduct extensive experiments on two real-world datasets, showing that our proposed ST-SHN framework can significantly improve the prediction performance as compared to various state-of-the-art baselines. The source code is available at: https://github.com/akaxlh/ST-SHN.

【50】 Forecasting emissions through Kaya identity using Neural Ordinary Differential Equations
标题:基于神经元常微分方程的Kaya恒等式排放量预测

链接:https://arxiv.org/abs/2201.02433
作者:Pierre Browne,Aranildo Lima,Rossella Arcucci,César Quilodrán-Casas
备注:5 pages, 2 figures, Tackling Climate Change with Machine Learning workshop at ICML 2021
摘要:Starting from the Kaya identity, we used a Neural ODE model to predict the evolution of several indicators related to carbon emissions at the country level: population, GDP per capita, energy intensity of GDP, and carbon intensity of energy. We compared the model with a baseline statistical model (VAR) and obtained good performance. We conclude that this machine-learning approach can be used to produce a wide range of results and give relevant insight to policymakers.
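For readers unfamiliar with the Kaya identity, the sketch below shows how the four indicators listed in the abstract multiply into total emissions. It is a plain arithmetic illustration, not the paper's Neural ODE model; the function name, the units, and all input numbers are assumptions chosen for the example.

```python
def kaya_emissions(population, gdp_per_capita, energy_intensity, carbon_intensity):
    """Kaya identity: emissions = P * (GDP/P) * (E/GDP) * (CO2/E).

    Assumed units for this illustration:
      population        persons
      gdp_per_capita    USD per person
      energy_intensity  MJ per USD of GDP
      carbon_intensity  kg CO2 per MJ
    The result is then in kg CO2.
    """
    return population * gdp_per_capita * energy_intensity * carbon_intensity


# Made-up inputs for a hypothetical country (not data from the paper).
baseline = kaya_emissions(1_000_000, 10_000.0, 5.0, 0.05)

# Halving the carbon intensity of energy halves emissions, all else equal.
decarbonised = kaya_emissions(1_000_000, 10_000.0, 5.0, 0.025)

print(baseline, decarbonised)
```

Because the identity is a product, a forecast of the four factors immediately yields a forecast of emissions, which is why the indicators can be predicted individually and then recombined.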

【51】 Automatic Speech Recognition Datasets in Cantonese Language: A Survey and a New Dataset
标题:粤语自动语音识别数据集:综述和一个新的数据集

链接:https://arxiv.org/abs/2201.02419
作者:Tiezheng Yu,Rita Frieske,Peng Xu,Samuel Cahyawijaya,Cheuk Tung Shadow Yiu,Holy Lovenia,Wenliang Dai,Elham J. Barezi,Qifeng Chen,Xiaojuan Ma,Bertram E. Shi,Pascale Fung
摘要:Automatic speech recognition (ASR) for low-resource languages improves the access of linguistic minorities to the technological advantages provided by Artificial Intelligence (AI). In this paper, we address the problem of data scarcity for the Hong Kong Cantonese language by creating a new Cantonese dataset. Our dataset, Multi-Domain Cantonese Corpus (MDCC), consists of 73.6 hours of clean read speech paired with transcripts, collected from Cantonese audiobooks from Hong Kong. It combines philosophy, politics, education, culture, lifestyle and family domains, covering a wide range of topics. We also review all existing Cantonese datasets and perform experiments on the two biggest datasets (MDCC and Common Voice zh-HK). We analyze the existing datasets according to their speech type, data source, total size and availability. The results of experiments conducted with Fairseq S2T Transformer, a state-of-the-art ASR model, show the effectiveness of our dataset. In addition, we create a powerful and robust Cantonese ASR model by applying multi-dataset learning on MDCC and Common Voice zh-HK.

【52】 Developing Assistive Technology to Support Reminiscence Therapy: A User-Centered Study to Identify Caregivers' Needs
标题:开发辅助技术支持记忆治疗:一项以用户为中心的研究,以确定照顾者的需求

链接:https://arxiv.org/abs/2201.02418
作者:Soraia M. Alarcão,André Santana,Carolina Maruta,Manuel J. Fonseca
备注:27 pages, 2 figures, Manuscript submitted to the Special Issue on Advances in Human-Centred Dementia Technology of the International Journal of Human-Computer Studies
摘要:Reminiscence therapy is an inexpensive non-pharmacological therapy commonly used due to its therapeutic value for people with dementia (PwD), as it can be used to promote independence, positive moods and behavior, and improve their quality of life. Caregivers are one of the main pillars in the adoption of digital technologies for reminiscence therapy, as they are responsible for its administration. Despite their comprehensive understanding of the needs and difficulties associated with the therapy, their perspective has not been fully taken into account in the development of existing technological solutions. To inform the design of technological solutions within dementia care, we followed a user-centered design approach through worldwide surveys, follow-up semi-structured interviews, and focus groups. Seven hundred and seven informal and 52 formal caregivers participated in our study. Our findings show that technological solutions must provide mechanisms to carry out the therapy in a simple way, reducing the amount of work for caregivers when preparing and conducting therapy sessions. They should also diversify and personalize the current session (and following ones) based on both the biographical information of the PwD and their emotional reactions. This is particularly important since the PwD often become agitated, aggressive or angry, and caregivers might not know how to properly deal with this situation (in particular, the informal ones). Additionally, formal caregivers need an easy way to manage information of the different PwD they take care of, and consult the history of sessions performed (in particular, to identify images that triggered negative emotional reactions, and consult any notes taken about them). As a result, we present a list of validated functional requirements gathered for the PwD and both formal and informal caregivers, as well as the corresponding expected primary and secondary outcomes.

【53】 Auction-Based Ex-Post-Payment Incentive Mechanism Design for Horizontal Federated Learning with Reputation and Contribution Measurement
标题:基于拍卖的带声誉和贡献度的横向联合学习支付后激励机制设计

链接:https://arxiv.org/abs/2201.02410
作者:Jingwen Zhang,Yuezhou Wu,Rong Pan
摘要:Federated learning trains models across devices with distributed data, while protecting privacy and obtaining a model similar to that of centralized ML. A large number of workers with data and computing power are the foundation of federated learning. However, the inevitable costs prevent self-interested workers from serving for free. Moreover, due to data isolation, task publishers lack effective methods to select, evaluate and pay reliable workers with high-quality data. Therefore, we design an auction-based incentive mechanism for horizontal federated learning with reputation and contribution measurement. By designing a reasonable method of measuring contribution, we establish the reputation of workers, which is easy to decline and difficult to improve. Through reverse auctions, workers bid for tasks, and the task publisher selects workers combining reputation and bid price. With the budget constraint, winning workers are paid based on performance. We proved that our mechanism satisfies the individual rationality of the honest worker, budget feasibility, truthfulness, and computational efficiency.

【54】 Tight Fine-Grained Bounds for Direct Access on Join Queries
标题:连接查询直接访问的紧致细粒度界限

链接:https://arxiv.org/abs/2201.02401
作者:Karl Bringmann,Nofar Carmeli,Stefan Mengel
摘要:We consider the task of lexicographic direct access to query answers. That is, we want to simulate an array containing the answers of a join query sorted in a lexicographic order chosen by the user. A recent dichotomy showed for which queries and orders this task can be done in polylogarithmic access time after quasilinear preprocessing, but this dichotomy does not tell us how much time is required in the cases classified as hard. We determine the preprocessing time needed to achieve polylogarithmic access time for all self-join free queries and all lexicographical orders. To this end, we propose a decomposition-based general algorithm for direct access on join queries. We then explore its optimality by proving lower bounds for the preprocessing time based on the hardness of a certain online Set-Disjointness problem, which shows that our algorithm's bounds are tight for all lexicographic orders on self-join free queries. Then, we prove the hardness of Set-Disjointness based on the Zero-Clique Conjecture which is an established conjecture from fine-grained complexity theory. We also show that similar techniques can be used to prove that, for enumerating answers to Loomis-Whitney joins, it is not possible to significantly improve upon trivially computing all answers at preprocessing. This, in turn, gives further evidence (based on the Zero-Clique Conjecture) to the enumeration hardness of self-join free cyclic joins with respect to linear preprocessing and constant delay.

【55】 Neural calibration of hidden inhomogeneous Markov chains -- Information decompression in life insurance
标题:隐含非齐次马氏链的神经标定--人寿保险中的信息解压缩

链接:https://arxiv.org/abs/2201.02397
作者:Mark Kiermayer,Christian Weiß
摘要:Markov chains play a key role in a vast number of areas, including life insurance mathematics. Standard actuarial quantities such as the premium value can be interpreted as compressed, lossy information about the underlying Markov process. We introduce a method to reconstruct the underlying Markov chain given collective information of a portfolio of contracts. Our neural architecture explainably characterizes the process by explicitly providing one-step transition probabilities. Further, we provide an intrinsic, economic model validation to inspect the quality of the information decompression. Lastly, our methodology is successfully tested for a realistic data set of German term life insurance contracts.

【56】 Detecting Human-to-Human-or-Object (H2O) Interactions with DIABOLO
标题:使用空竹检测人与人或物(H2O)的交互

链接:https://arxiv.org/abs/2201.02396
作者:Astrid Orcesi,Romaric Audigier,Fritz Poka Toukam,Bertrand Luvison
备注:ACCEPTED in IEEE International Conference on Automatic Face and Gesture Recognition (FG 2021)
摘要:Detecting human interactions is crucial for human behavior analysis. Many methods have been proposed to deal with Human-to-Object Interaction (HOI) detection, i.e., detecting in an image which person and object interact together and classifying the type of interaction. However, Human-to-Human Interactions, such as social and violent interactions, are generally not considered in available HOI training datasets. As we think these types of interactions cannot be ignored and decorrelated from HOI when analyzing human behavior, we propose a new interaction dataset to deal with both types of human interactions: Human-to-Human-or-Object (H2O). In addition, we introduce a novel taxonomy of verbs, intended to be closer to a description of human body attitude in relation to the surrounding targets of interaction, and more independent of the environment. Unlike some existing datasets, we strive to avoid defining synonymous verbs when their use highly depends on the target type or requires a high level of semantic interpretation. As H2O dataset includes V-COCO images annotated with this new taxonomy, images obviously contain more interactions. This can be an issue for HOI detection methods whose complexity depends on the number of people, targets or interactions. Thus, we propose DIABOLO (Detecting InterActions By Only Looking Once), an efficient subject-centric single-shot method to detect all interactions in one forward pass, with constant inference time independent of image content. In addition, this multi-task network simultaneously detects all people and objects. We show how sharing a network for these tasks does not only save computation resource but also improves performance collaboratively. Finally, DIABOLO is a strong baseline for the new proposed challenge of H2O Interaction detection, as it outperforms all state-of-the-art methods when trained and evaluated on HOI dataset V-COCO.

【57】 InRS: implementing the indicator function of NURBS-shaped planar domains

链接:https://arxiv.org/abs/2201.02393
作者:Alvise Sommariva,Marco Vianello
摘要:We provide an algorithm that implements the indicator function of NURBS-shaped planar domains, tailored to the fast computation on huge point clouds, together with the corresponding Matlab code.

【58】 Unwinding Rotations Improves User Comfort with Immersive Telepresence Robots
标题:使用身临其境的网真机器人,展开旋转可提高用户舒适度

链接:https://arxiv.org/abs/2201.02392
作者:Markku Suomalainen,Basak Sakcak,Adhi Widagdo,Juho Kalliokoski,Katherine J. Mimnaugh,Alexis P. Chambers,Timo Ojala,Steven M. LaValle
备注:Accepted for publication in HRI (Int. Conf. on Human-Robot Interaction) 2022
摘要:We propose unwinding the rotations experienced by the user of an immersive telepresence robot to improve comfort and reduce VR sickness of the user. By immersive telepresence we refer to a situation where a 360° camera on top of a mobile robot is streaming video and audio into a head-mounted display worn by a remote user possibly far away. Thus, it enables the user to be present at the robot's location, look around by turning the head and communicate with people near the robot. By unwinding the rotations of the camera frame, the user's viewpoint is not changed when the robot rotates. The user can change her viewpoint only by physically rotating in her local setting; as visual rotation without the corresponding vestibular stimulation is a major source of VR sickness, physical rotation by the user is expected to reduce VR sickness. We implemented unwinding the rotations for a simulated robot traversing a virtual environment and ran a user study (N=34) comparing unwinding rotations to the user's viewpoint turning when the robot turns. Our results show that the users found unwound rotations more preferable and comfortable and that it reduced their level of VR sickness. We also present further results about the users' path integration capabilities, viewing directions, and subjective observations of the robot's speed and distances to simulated people and objects.

【59】 Methods for Increasing the Resistance of Cryptographic Designs against Horizontal DPA Attacks
标题:提高密码设计抵抗水平DPA攻击的方法

链接:https://arxiv.org/abs/2201.02391
作者:Ievgen Kabin,Zoya Dyka,Dan Kreiser,Peter Langendoerfer
备注:Author's version accepted for ICICS-2017; the final publication is available at Springer
摘要:Side-channel analysis attacks, especially horizontal DPA and DEMA attacks, are significant threats for cryptographic designs. In this paper we investigate to what extent different multiplication formulae and randomization of the field multiplier increase the resistance of an ECC design against horizontal attacks. We implemented a randomized sequence of the calculation of partial products for the field multiplication in order to increase the security features of the field multiplier. Additionally, we use the partial polynomial multiplier itself as a kind of countermeasure against DPA attacks. We demonstrate that the implemented classical multiplication formula can increase the inherent resistance of the whole ECC design. We also investigate the impact of the combination of these two approaches. For the evaluation we synthesized all these designs for a 250 nm gate library technology, and analysed the simulated power traces. All investigated protection means help to decrease the success rate of attacks significantly: the correctness of the revealed key was decreased from 99% to 69%.

【60】 The Defeat of the Winograd Schema Challenge
标题:Winograd Schema挑战赛的失败

链接:https://arxiv.org/abs/2201.02387
作者:Vid Kocijan,Ernest Davis,Thomas Lukasiewicz,Gary Marcus,Leora Morgenstern
摘要:The Winograd Schema Challenge -- a set of twin sentences involving pronoun reference disambiguation that seem to require the use of commonsense knowledge -- was proposed by Hector Levesque in 2011. By 2019, a number of AI systems, based on large pre-trained transformer-based language models and fine-tuned on these kinds of problems, achieved better than 90% accuracy. In this paper, we review the history of the Winograd Schema Challenge and assess its significance.

【61】 Offline Reinforcement Learning for Road Traffic Control
标题:用于道路交通控制的离线强化学习

链接:https://arxiv.org/abs/2201.02381
作者:Mayuresh Kunjir,Sanjay Chawla
备注:8 pages
摘要:Traffic signal control is an important problem in urban mobility with a significant potential for economic and environmental impact. While there is a growing interest in Reinforcement Learning (RL) for traffic control, the work so far has focussed on learning through interactions which, in practice, is costly. Instead, real experience data on traffic is available and could be exploited at minimal costs. Recent progress in offline or batch RL has enabled just that. Model-based offline RL methods, in particular, have been shown to generalize to the experience data much better than others. We build a model-based learning framework, A-DAC, which infers a Markov Decision Process (MDP) from a dataset with pessimistic costs built in to deal with data uncertainties. The costs are modeled through an adaptive shaping of rewards in the MDP which provides better regularization of data compared to the prior related work. A-DAC is evaluated on a complex signalized roundabout using multiple datasets varying in size and in batch collection policy. The evaluation results show that it is possible to build high performance control policies in a data efficient manner using simplistic batch collection policies.

【62】 As-Continuous-As-Possible Ceramics Printing for Shell Models
标题:贝壳模型的尽可能连续的陶瓷印刷

链接:https://arxiv.org/abs/2201.02374
作者:Fanchao Zhong,Yonglai Xu,Haisen Zhao,Lin Lu
备注:15 pages, 21 figures
摘要:We propose a novel computational framework for fabricating thin shell models on an extrusion-based Cartesian 3D printer with the clay material. Extrusion-based ceramics printing involves several inevitable challenges to achieve acceptable print quality, including continuous toolpath with the minimal number of transfer moves, separation of non-model and model structures, etc. Inertia of the extruded material may damage the surface quality during transfer moves. The viscosity also makes support material hard to remove. These challenges even increase for thin shell surfaces, as both sides are of visual significance, making it impossible to hide any intermediate structures in the interiors. To conquer these challenges, we adopt a curved layer scheme for ceramics printing. Then we introduce an original criterion "one-path patch" (OPP), for representing a shell surface patch that can be traversed in one path in the context of curved layer printing considering fabrication constraints. We propose a bottom-up OPP merging procedure for decomposing the given shell surface into a minimal number of OPPs and generating the "as-continuous-as-possible" (ACAP) toolpath. Furthermore, we customize the path planning algorithm with a decoupled orientation and support structures computation method. Results demonstrate that our ACAP algorithm prints shell models with both efficiency and surface quality.

【63】 Mirror Learning: A Unifying Framework of Policy Optimisation
标题:镜像学习:政策优化的统一框架

链接:https://arxiv.org/abs/2201.02373
作者:Jakub Grudzien Kuba,Christian Schroeder de Witt,Jakob Foerster
摘要:General policy improvement (GPI) and trust-region learning (TRL) are the predominant frameworks within contemporary reinforcement learning (RL), which serve as the core models for solving Markov decision processes (MDPs). Unfortunately, in their mathematical form, they are sensitive to modifications, and thus, the practical instantiations that implement them do not automatically inherit their improvement guarantees. As a result, the spectrum of available rigorous MDP-solvers is narrow. Indeed, many state-of-the-art (SOTA) algorithms, such as TRPO and PPO, are not proven to converge. In this paper, we propose mirror learning -- a general solution to the RL problem. We reveal GPI and TRL to be but small points within this far greater space of algorithms which boasts the monotonic improvement property and converges to the optimal policy. We show that virtually all SOTA algorithms for RL are instances of mirror learning, and thus suggest that their empirical performance is a consequence of their theoretical properties, rather than of approximate analogies. Excitingly, we show that mirror learning opens up a whole new space of policy learning methods with convergence guarantees.

【64】 Deep Generative Framework for Interactive 3D Terrain Authoring and Manipulation
标题:交互式三维地形创作和操纵的深度生成框架

链接:https://arxiv.org/abs/2201.02369
作者:Shanthika Naik,Aryamaan Jain,Avinash Sharma,KS Rajan
摘要:Automated generation and (user) authoring of realistic virtual terrain is highly sought after by multimedia applications such as VR models and gaming. The most common representation adopted for terrain is the Digital Elevation Model (DEM). Existing terrain authoring and modeling techniques have addressed some of these needs and can be broadly categorized as: procedural modeling, simulation methods, and example-based methods. In this paper, we propose a novel realistic terrain authoring framework powered by a combination of VAE and generative conditional GAN model. Our framework is an example-based method that attempts to overcome the limitations of existing methods by learning a latent space from a real-world terrain dataset. This latent space allows us to generate multiple variants of terrain from a single input as well as interpolate between terrains while keeping the generated terrains close to real-world data distribution. We also developed an interactive tool that lets the user generate diverse terrains with minimalist inputs. We perform thorough qualitative and quantitative analysis and provide comparisons with other SOTA methods. We intend to release our code/tool to the academic community.

【65】 Uncertainty-Aware Cascaded Dilation Filtering for High-Efficiency Deraining
标题:基于不确定性感知的级联膨胀滤波高效去噪

链接:https://arxiv.org/abs/2201.02366
作者:Qing Guo,Jingyang Sun,Felix Juefei-Xu,Lei Ma,Di Lin,Wei Feng,Song Wang
备注:14 pages, 10 figures, 10 tables. This is the extension of our conference version
摘要:Deraining is a significant and fundamental computer vision task, aiming to remove the rain streaks and accumulations in an image or video captured on a rainy day. Existing deraining methods usually make heuristic assumptions of the rain model, which compels them to employ complex optimization or iterative refinement for high recovery quality. This, however, leads to time-consuming methods and affects the effectiveness for addressing rain patterns that deviate from the assumptions. In this paper, we propose a simple yet efficient deraining method by formulating deraining as a predictive filtering problem without complex rain model assumptions. Specifically, we identify spatially-variant predictive filtering (SPFilt) that adaptively predicts proper kernels via a deep network to filter different individual pixels. Since the filtering can be implemented via well-accelerated convolution, our method can be significantly efficient. We further propose the EfDeRain+ that contains three main contributions to address residual rain traces, multi-scale, and diverse rain patterns without harming the efficiency. First, we propose the uncertainty-aware cascaded predictive filtering (UC-PFilt) that can identify the difficulties of reconstructing clean pixels via predicted kernels and remove the residual rain traces effectively. Second, we design the weight-sharing multi-scale dilated filtering (WS-MS-DFilt) to handle multi-scale rain streaks without harming the efficiency. Third, to eliminate the gap across diverse rain patterns, we propose a novel data augmentation method (i.e., RainMix) to train our deep models. By combining all contributions with sophisticated analysis on different variants, our final method outperforms baseline methods on four single-image deraining datasets and one video deraining dataset in terms of both recovery quality and speed.

【66】 Motion Prediction via Joint Dependency Modeling in Phase Space
标题:基于相空间联合依赖建模的运动预测

链接:https://arxiv.org/abs/2201.02365
作者:Pengxiang Su,Zhenguang Liu,Shuang Wu,Lei Zhu,Yifang Yin,Xuanjing Shen
摘要:Motion prediction is a classic problem in computer vision, which aims at forecasting future motion given the observed pose sequence. Various deep learning models have been proposed, achieving state-of-the-art performance on motion prediction. However, existing methods typically focus on modeling temporal dynamics in the pose space. Unfortunately, the complex, high-dimensional nature of human motion brings inherent challenges for dynamic context capturing. Therefore, we move away from the conventional pose-based representation and present a novel approach employing a phase space trajectory representation of individual joints. Moreover, current methods tend to only consider the dependencies between physically connected joints. In this paper, we introduce a novel convolutional neural model to effectively leverage explicit prior knowledge of motion anatomy, and simultaneously capture both spatial and temporal information of joint trajectory dynamics. We then propose a global optimization module that learns the implicit relationships between individual joint features. Empirically, our method is evaluated on large-scale 3D human motion benchmark datasets (i.e., Human3.6M, CMU MoCap). These results demonstrate that our method sets the new state-of-the-art on the benchmark datasets. Our code will be available at https://github.com/Pose-Group/TEID.

【67】 Towards Trustworthy DeFi Oracles: Past, Present and Future
标题:走向值得信赖的德菲甲骨文:过去、现在和未来

链接:https://arxiv.org/abs/2201.02358
作者:Yinjie Zhao,Xin Kang,Tieyan Li,Cheng-Kang Chu,Haiguang Wang
备注:Under review
摘要:With the rapid development of blockchain technology in recent years, all kinds of blockchain-based applications have emerged. Among them, decentralized finance (DeFi) is one of the most successful applications, and is regarded as the future of finance. The great success of DeFi relies on real-world data which is not directly available on the blockchain. Besides, due to the deterministic nature of blockchain, the blockchain cannot directly obtain non-deterministic data from the outside world (off-chain). Thus, oracles have appeared as a viable solution to feed off-chain data to blockchain applications. In this paper, we carry out a comprehensive study on oracles, especially on DeFi oracles. We first briefly introduce the application scenarios of DeFi oracles, and then we talk about the past of DeFi oracles by categorizing them into several types based on their design features. After that, we introduce five popular DeFi oracles currently in use (such as Chainlink and Band Protocol), with the focus on their system architecture, data validation process, and their incentive mechanisms. We compare these present DeFi oracles in terms of their data trustworthiness, data source trustworthiness and their overall trust models. Finally, we propose a set of metrics for designing trustworthy DeFi oracles, and propose a potential trust architecture and a few promising techniques for building trustworthy oracles.

【68】 GenLabel: Mixup Relabeling using Generative Models
标题:GenLabel:使用产生式模型的混合重标记

链接:https://arxiv.org/abs/2201.02354
作者:Jy-yong Sohn,Liang Shang,Hongxu Chen,Jaekyun Moon,Dimitris Papailiopoulos,Kangwook Lee
摘要:Mixup is a data augmentation method that generates new data points by mixing a pair of input data. While mixup generally improves the prediction performance, it sometimes degrades the performance. In this paper, we first identify the main causes of this phenomenon by theoretically and empirically analyzing the mixup algorithm. To resolve this, we propose GenLabel, a simple yet effective relabeling algorithm designed for mixup. In particular, GenLabel helps the mixup algorithm correctly label mixup samples by learning the class-conditional data distribution using generative models. Via extensive theoretical and empirical analysis, we show that mixup, when used together with GenLabel, can effectively resolve the aforementioned phenomenon, improving the generalization performance and the adversarial robustness.
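As background for the abstract above, here is a minimal sketch of the standard mixup step that GenLabel is designed to relabel. It shows only the conventional convex combination of inputs and labels with a Beta-distributed mixing weight; it does not implement GenLabel itself, and the function name and the `alpha` default are assumptions for illustration.

```python
import random


def mixup_pair(x1, y1, x2, y2, alpha=1.0, rng=random):
    """Standard mixup of two examples (not the GenLabel relabeling).

    x1, x2: feature vectors as lists of floats.
    y1, y2: one-hot (or soft) label vectors as lists of floats.
    The mixing weight lam is drawn from Beta(alpha, alpha).
    """
    lam = rng.betavariate(alpha, alpha)
    # Convex combination of the inputs.
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    # Conventional mixup applies the same weights to the labels.
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


rng = random.Random(0)  # seeded so the sketch is reproducible
x, y, lam = mixup_pair([1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], rng=rng)
print(lam, x, y)
```

The label line is the part a relabeling scheme would replace: instead of reusing `lam` directly, the label of the mixed point would come from a learned class-conditional model, as the abstract describes.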

【69】 Degrees of Freedom Analysis of Mechanisms using the New Zebra Crossing Method
标题:用新斑马线交叉法进行机构自由度分析

链接:https://arxiv.org/abs/2201.02352
作者:Rajashekhar V S,Debasish Ghose
备注:31 pages and 17 figures
摘要:Mobility, which is a basic property for a mechanism has to be analyzed to find the degrees of freedom. A quick method for calculation of degrees of freedom in a mechanism is proposed in this work. The mechanism is represented in a way that resembles a zebra crossing. An algorithm is proposed which is used to determine the mobility from the zebra crossing diagram. This algorithm takes into account the number of patches between the black patches, the number of joints attached to the fixed link and the number of loops in the mechanism. A number of cases have been discussed which fail to give the desired results using the widely used classical Kutzbach-Grubler formula.
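As a point of reference, the classical planar Kutzbach-Grübler formula mentioned at the end of the abstract can be sketched as follows. This is the textbook baseline, M = 3(n - 1) - 2*j1 - j2, not the zebra crossing algorithm proposed in the paper; the function and parameter names are illustrative.

```python
def kutzbach_mobility(links, full_joints, half_joints=0):
    """Planar Kutzbach-Gruebler mobility.

    links       -- number of links, including the fixed link (n)
    full_joints -- number of one-degree-of-freedom joints (j1)
    half_joints -- number of two-degree-of-freedom joints (j2)
    """
    return 3 * (links - 1) - 2 * full_joints - half_joints


# Four-bar linkage: 4 links, 4 revolute joints.
four_bar = kutzbach_mobility(4, 4)
# Five-bar linkage: 5 links, 5 revolute joints.
five_bar = kutzbach_mobility(5, 5)
# Double parallelogram: 5 links, 6 revolute joints. The formula reports 0,
# although the special geometry lets the linkage move with one degree of freedom.
double_parallelogram = kutzbach_mobility(5, 6)
```

The last case is a well-known example of the kind of mechanism for which the classical count is misleading, which is the gap the abstract says the new method addresses.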

【70】 Asymptotic Security using Bayesian Defense Mechanisms with Application to Cyber Deception
标题:基于贝叶斯防御机制的渐近安全性及其在网络欺骗中的应用

链接:https://arxiv.org/abs/2201.02351
作者:Hampei Sasahara,Henrik Sandberg
备注:16 pages
摘要:This study addresses the question of whether model knowledge can prevent a defender from being deceived in cyber security. As a specific model-based defense scheme, this study considers a Bayesian defense mechanism, which monitors the system's behavior, forms a belief on the existence of the attacker, and chooses appropriate reactions. A sophisticated attacker aims at achieving her objective while avoiding detection by deceiving the defender. In this paper, their dynamic decision making is formulated as a stochastic signaling game. Based on martingale analysis, it is revealed that the belief on the true scenario has a limit in a stochastic sense at an equilibrium. This fact implies that there are only two possible cases: the defender asymptotically detects the attack with a firm belief, or the attacker takes actions such that the system's behavior becomes nominal after a certain finite time step. Consequently, if the dynamics admits no stealthy attacks, the system is guaranteed to be secure in an asymptotic manner provided that effective countermeasures are implemented. The result shows that model knowledge can prevent deception in an asymptotic sense. As an application of the finding, a defensive deception utilizing asymmetric recognition of vulnerabilities exploited by the attacker is analyzed. It is shown that the attacker possibly stops the attack even if the defender is unaware of the vulnerabilities, as long as the defender's unawareness is concealed by the defensive deception. These results indicate the powerful defense capability achieved by model knowledge.

【71】 The Study of Peer Assessment Impact on Group Learning Activities
标题:同伴评价对小组学习活动的影响研究

链接:https://arxiv.org/abs/2201.02344
作者:Zhiyuan Chen,Soon Boon Lee,Shazia Paras Shaikh,Mirza Rayana Sanzana
备注:Regular Research Paper Accepted by FECS'21 (The 17th Int'l Conf on Frontiers in Education: Computer Science and Computer Engineering)
摘要:Compared with lecturer-marked assessment, peer assessment is a more comprehensive learning process, and it brings a number of associated problems. In this research work, we study the impact of peer assessment on group learning activities in order to provide a complete and systematic review and to improve the practice and quality of the peer assessment process. Pilot studies were conducted and took the form of surveys, focus group interviews, and questionnaires. Preliminary surveys were conducted with 582 students and 276 responses were received, giving a response rate of 47.4%. The results show that 37% of students would choose individual work over group work if given the choice. In the case study, 82.1% of the 28 students enjoyed working in a group using Facebook as a communication tool. 89.3% of the students could demonstrate their skills through group work and, most importantly, 82.1% of them agree that peer assessment is an impartial method of assessment with the help of Facebook as proof of self-contribution. Our suggestions for making group work a pleasant experience are identifying and taking action against freeloaders, giving credit to deserving students, educating students on how to give constructive feedback, and making the assessment process transparent to all.

【72】 SaL-Lightning Dataset: Search and Eye Gaze Behavior, Resource Interactions and Knowledge Gain during Web Search
标题:SaL-Lightning数据集:网络搜索期间的搜索和眼睛注视行为、资源交互和知识获取

链接:https://arxiv.org/abs/2201.02339
作者:Christian Otto,Markus Rokicki,Georg Pardi,Wolfgang Gritz,Daniel Hienert,Ran Yu,Johannes von Hoyer,Anett Hoppe,Stefan Dietze,Peter Holtz,Yvonne Kammerer,Ralph Ewerth
备注:To be published at the 2022 ACM SIGIR Conference on Human Information Interaction and Retrieval (CHIIR '22)
摘要:The emerging research field Search as Learning (SAL) investigates how the Web facilitates learning through modern information retrieval systems. SAL research requires significant amounts of data that capture both the search behavior of users and their acquired knowledge in order to obtain conclusive insights or train supervised machine learning models. However, the creation of such datasets is costly and requires interdisciplinary efforts in order to design studies and capture a wide range of features. In this paper, we address this issue and introduce an extensive dataset based on a user study, in which 114 participants were asked to learn about the formation of lightning and thunder. Participants' knowledge states were measured before and after Web search through multiple-choice questionnaires and essay-based free recall tasks. To enable future research in SAL-related tasks, we recorded a plethora of features and person-related attributes. Besides the screen recordings, visited Web pages, and detailed browsing histories, a large number of behavioral features and resource features were monitored. We underline the usefulness of the dataset by describing three already published use cases.

【73】 Decision problem of some bundled FOML fragments
标题:若干捆绑FOML片段的判定问题

链接:https://arxiv.org/abs/2201.02336
作者:Mo Liu
摘要:Over increasing domain interpretations, the $\exists\Box$ and $\forall\Box$ bundled fragments are decidable, and over constant domain interpretations, the $\exists\Box$ bundled fragment is decidable while the $\forall\Box$ bundled fragment is undecidable. Based on these existing results, we show that over increasing domain interpretations, the $\Box\exists$ and $\Box\forall$ bundled fragments are decidable as well. On the other hand, over constant domain interpretations, the $\Box\forall$ bundled fragment is undecidable, and the $\Box\exists^2$ bundled fragment, an extension of the $\Box\exists$ bundled fragment, is undecidable as well.

【74】 iDECODe: In-distribution Equivariance for Conformal Out-of-distribution Detection
标题:iDECODe:用于共形分布外检测的分布内等变性

链接:https://arxiv.org/abs/2201.02331
作者:Ramneet Kaur,Susmit Jha,Anirban Roy,Sangdon Park,Edgar Dobriban,Oleg Sokolsky,Insup Lee
备注:Association for the Advancement of Artificial Intelligence (AAAI), 2022
摘要:Machine learning methods such as deep neural networks (DNNs), despite their success across different domains, are known to often generate incorrect predictions with high confidence on inputs outside their training distribution. The deployment of DNNs in safety-critical domains requires detection of out-of-distribution (OOD) data so that DNNs can abstain from making predictions on those. A number of methods have been recently developed for OOD detection, but there is still room for improvement. We propose the new method iDECODe, leveraging in-distribution equivariance for conformal OOD detection. It relies on a novel base non-conformity measure and a new aggregation method, used in the inductive conformal anomaly detection framework, thereby guaranteeing a bounded false detection rate. We demonstrate the efficacy of iDECODe by experiments on image and audio datasets, obtaining state-of-the-art results. We also show that iDECODe can detect adversarial examples.
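The bounded false detection rate mentioned above comes from the inductive conformal anomaly detection framework. A minimal sketch of that framework's p-value test is given below, assuming a generic scalar non-conformity score where larger means more anomalous; iDECODe's specific non-conformity measure and aggregation method are not reproduced, and the names `conformal_p_value` and `is_ood` are illustrative.

```python
def conformal_p_value(calibration_scores, test_score):
    """Fraction of calibration scores at least as non-conforming as the test score.

    The +1 terms count the test point itself, which is what yields the
    guarantee under exchangeability of calibration and test data.
    """
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def is_ood(calibration_scores, test_score, epsilon=0.05):
    """Flag the input as out-of-distribution when its p-value falls below epsilon."""
    return conformal_p_value(calibration_scores, test_score) < epsilon
```

With this rule, an in-distribution input is flagged with probability at most epsilon, provided the calibration set is held out from training and drawn from the same distribution.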

【75】 On the Effectiveness of Sampled Softmax Loss for Item Recommendation
标题:抽样软最大损失在项目推荐中的有效性研究

链接:https://arxiv.org/abs/2201.02327
作者:Jiancan Wu,Xiang Wang,Xingyu Gao,Jiawei Chen,Hongcheng Fu,Tianyu Qiu,Xiangnan He
备注:10 Pages, 1 figure, 5 tables
摘要:Learning objectives of recommender models remain largely unexplored. Most methods routinely adopt either pointwise or pairwise loss to train the model parameters, while rarely paying attention to softmax loss due to its high computational cost. Sampled softmax loss emerges as an efficient substitute for softmax loss. Its special case, InfoNCE loss, has been widely used in self-supervised learning and has exhibited remarkable performance for contrastive learning. Nonetheless, few studies use sampled softmax loss as the learning objective to train the recommender. Worse still, to the best of our knowledge, none of them explores its properties or answers "Is sampled softmax loss suitable for item recommendation?" and "What are the conceptual advantages of sampled softmax loss, as compared with the prevalent losses?". In this work, we aim to better understand sampled softmax loss for item recommendation. Specifically, we first theoretically reveal three model-agnostic advantages: (1) mitigating popularity bias, which is beneficial to long-tail recommendation; (2) mining hard negative samples, which offers informative gradients to optimize model parameters; and (3) maximizing the ranking metric, which facilitates top-K performance. Moreover, we probe the model-specific characteristics on top of various recommenders. Experimental results suggest that sampled softmax loss is more friendly to history- and graph-based recommenders (e.g., SVD++ and LightGCN), but performs poorly for ID-based models (e.g., MF). We ascribe this to its shortcoming in learning representation magnitude: combined with models that are also incapable of adjusting representation magnitude, it learns poor representations. In contrast, the history- and graph-based models, which naturally adjust representation magnitude according to node degree, are able to compensate for the shortcoming of sampled softmax loss.
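To make the objective concrete, the following is a minimal sketch of a sampled softmax loss for one user-item pair, written in the uncorrected, InfoNCE-style form: the positive item is scored against a handful of sampled negatives instead of the full catalogue. The temperature parameter and the omission of any sampling-probability correction are simplifying assumptions; the paper's exact formulation may differ.

```python
import math


def sampled_softmax_loss(pos_score, neg_scores, temperature=1.0):
    """Negative log-probability of the positive item among positive + sampled negatives."""
    logits = [pos_score / temperature] + [s / temperature for s in neg_scores]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[0]
```

When the positive and all negatives receive the same score, the loss equals the log of the number of candidates; it shrinks as the positive score rises above the negatives.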

【76】 Distributed Nash Equilibrium Seeking over Time-Varying Directed Communication Networks
标题:时变有向通信网络上的分布式纳什均衡求解

链接:https://arxiv.org/abs/2201.02323
作者:Duong Thuy Anh Nguyen,Duong Tung Nguyen,Angelia Nedić
摘要:We study distributed algorithms for finding a Nash equilibrium (NE) in a class of non-cooperative convex games under partial information. Specifically, each agent has access only to its own smooth local cost function and can receive information from its neighbors in a time-varying directed communication network. To this end, we propose a distributed gradient play algorithm to compute an NE by utilizing local information exchange among the players. In this algorithm, every agent performs a gradient step to minimize its own cost function while sharing and retrieving information locally among its neighbors. The existing methods impose strong assumptions such as balancedness of the mixing matrices and global knowledge of the network communication structure, including the Perron-Frobenius eigenvector of the adjacency matrix and other graph connectivity constants. In contrast, our approach relies only on a reasonable and widely used assumption of row-stochasticity of the mixing matrices. We analyze the algorithm for time-varying directed graphs and prove its convergence to the NE when the agents' cost functions are strongly convex and have Lipschitz continuous gradients. Numerical simulations are performed for a Nash-Cournot game to illustrate the efficacy of the proposed algorithm.

【77】 An Unsupervised Masking Objective for Abstractive Multi-Document News Summarization
标题:一种面向抽象多文档新闻摘要的无监督掩蔽目标

链接:https://arxiv.org/abs/2201.02321
作者:Nikolai Vogler,Songlin Li,Yujie Xu,Yujian Mi,Taylor Berg-Kirkpatrick
摘要:We show that a simple unsupervised masking objective can approach near supervised performance on abstractive multi-document news summarization. Our method trains a state-of-the-art neural summarization model to predict the masked out source document with highest lexical centrality relative to the multi-document group. In experiments on the Multi-News dataset, our masked training objective yields a system that outperforms past unsupervised methods and, in human evaluation, surpasses the best supervised method without requiring access to any ground-truth summaries. Further, we evaluate how different measures of lexical centrality, inspired by past work on extractive summarization, affect final performance.

【78】 A Transfer Learning Pipeline for Educational Resource Discovery with Application in Leading Paragraph Generation
标题:教育资源发现的迁移学习流水线及其在前导段落生成中的应用

链接:https://arxiv.org/abs/2201.02312
作者:Irene Li,Thomas George,Alexander Fabbri,Tammy Liao,Benjamin Chen,Rina Kawamura,Richard Zhou,Vanessa Yan,Swapnil Hingmire,Dragomir Radev
摘要:Effective human learning depends on a wide selection of educational materials that align with the learner's current understanding of the topic. While the Internet has revolutionized human learning or education, a substantial resource accessibility barrier still exists. Namely, the excess of online information can make it challenging to navigate and discover high-quality learning materials. In this paper, we propose the educational resource discovery (ERD) pipeline that automates web resource discovery for novel domains. The pipeline consists of three main steps: data collection, feature extraction, and resource classification. We start with a known source domain and conduct resource discovery on two unseen target domains via transfer learning. We first collect frequent queries from a set of seed documents and search on the web to obtain candidate resources, such as lecture slides and introductory blog posts. Then we introduce a novel pretrained information retrieval deep neural network model, query-document masked language modeling (QD-MLM), to extract deep features of these candidate resources. We apply a tree-based classifier to decide whether the candidate is a positive learning resource. The pipeline achieves F1 scores of 0.94 and 0.82 when evaluated on two similar but novel target domains. Finally, we demonstrate how this pipeline can benefit an application: leading paragraph generation for surveys. This is the first study that considers various web resources for survey generation, to the best of our knowledge. We also release a corpus of 39,728 manually labeled web resources and 659 queries from NLP, Computer Vision (CV), and Statistics (STATS).

【79】 Multi-Behavior Enhanced Recommendation with Cross-Interaction Collaborative Relation Modeling
标题:基于交叉交互协同关系建模的多行为增强推荐

链接:https://arxiv.org/abs/2201.02307
作者:Lianghao Xia,Chao Huang,Yong Xu,Peng Dai,Mengyin Lu,Liefeng Bo
备注:Published on ICDE 2021
摘要:Many previous studies aim to augment collaborative filtering with deep neural network techniques, so as to achieve better recommendation performance. However, most existing deep learning-based recommender systems are designed to model a single type of user-item interaction behavior, and can hardly distill the heterogeneous relations between users and items. In practical recommendation scenarios, there exist multiple types of user behavior, such as browsing and purchasing. Because they overlook users' multi-behavioral patterns over different items, existing recommendation methods are insufficient to capture heterogeneous collaborative signals from user multi-behavior data. Inspired by the strength of graph neural networks for structured data modeling, this work proposes a Graph Neural Multi-Behavior Enhanced Recommendation (GNMR) framework which explicitly models the dependencies between different types of user-item interactions under a graph-based message passing architecture. GNMR devises a relation aggregation network to model interaction heterogeneity, and recursively performs embedding propagation between neighboring nodes over the user-item interaction graph. Experiments on real-world recommendation datasets show that our GNMR consistently outperforms state-of-the-art methods. The source code is available at https://github.com/akaxlh/GNMR.

【80】 Learning Multi-Tasks with Inconsistent Labels by using Auxiliary Big Task
标题:利用辅助大任务学习标签不一致的多任务

链接:https://arxiv.org/abs/2201.02305
作者:Quan Feng,Songcan Chen
摘要:Multi-task learning (MTL) aims to improve the performance of the model by transferring and exploiting common knowledge among tasks. Existing MTL works mainly focus on the scenario where the label sets of the multiple tasks (MTs) are usually the same, so they can be utilized for learning across the tasks. However, few works explore the scenario where each task has only a small number of training samples and the label sets overlap only partially or not at all. Learning such MTs is more challenging because less correlation information is available among these tasks. For this, we propose a framework to learn these tasks by jointly leveraging both abundant information from a learnt auxiliary big task, with sufficiently many classes to cover those of all these tasks, and the information shared among the partially overlapping tasks. In our implementation, which uses the same neural network architecture as the learnt auxiliary task to learn the individual tasks, the key idea is to utilize the available label information to adaptively prune the hidden-layer neurons of the auxiliary network to construct a corresponding network for each task, accompanied by joint learning across the individual tasks. Our experimental results demonstrate its effectiveness in comparison with state-of-the-art approaches.

【81】 Budget-aware Few-shot Learning via Graph Convolutional Network
标题:基于图卷积网络的预算感知小样本学习

链接:https://arxiv.org/abs/2201.02304
作者:Shipeng Yan,Songyang Zhang,Xuming He
摘要:This paper tackles the problem of few-shot learning, which aims to learn new visual concepts from a few examples. A common problem setting in few-shot classification assumes random sampling strategy in acquiring data labels, which is inefficient in practical applications. In this work, we introduce a new budget-aware few-shot learning problem that not only aims to learn novel object categories, but also needs to select informative examples to annotate in order to achieve data efficiency. We develop a meta-learning strategy for our budget-aware few-shot learning task, which jointly learns a novel data selection policy based on a Graph Convolutional Network (GCN) and an example-based few-shot classifier. Our selection policy computes a context-sensitive representation for each unlabeled data by graph message passing, which is then used to predict an informativeness score for sequential selection. We validate our method by extensive experiments on the mini-ImageNet, tiered-ImageNet and Omniglot datasets. The results show our few-shot learning strategy outperforms baselines by a sizable margin, which demonstrates the efficacy of our method.

【82】 From Textual Experiments to Experimental Texts: Expressive Repetition in "Artificial Intelligence Literature"

链接:https://arxiv.org/abs/2201.02303
作者:Tianhua Zhu
备注:12 pages; to appear on SASS Studies, 2021 Winter. This is an English version; please consider citing the original paper in Chinese
摘要:Since the birth of artificial intelligence 70 years ago, attempts at literary "creation" with computers are present in the course of technological development, creating what one might call "artificial intelligence literature" (AI literature). Evolving from "textual experiments" conducted by technologists to "experimental texts" that explore the possibilities of conceptions of literature, AI literature integrates primitive problems including machine thinking, text generation, and machine creativity, which exhibits the two-way interaction between social ideas and technology. In the early stage, the mutual support between technological path and artistic ideas turned out to be a failure, while AI-driven expressive repetitions are made probable in the contemporary technological context, paving the way for the transformation of AI literature from proof for technical possibilities to self-verification of literary value.

【83】 Extending One-Stage Detection with Open-World Proposals
标题:利用开放世界方案扩展一阶段检测

链接:https://arxiv.org/abs/2201.02302
作者:Sachin Konan,Kevin J Liang,Li Yin
摘要:In many applications, such as autonomous driving, hand manipulation, or robot navigation, object detection methods must be able to detect objects unseen in the training set. Open World Detection (OWD) seeks to tackle this problem by generalizing detection performance to seen and unseen class categories. Recent works have seen success in the generation of class-agnostic proposals, which we call Open-World Proposals (OWP), but this comes at the cost of a big drop on the classification task when both tasks are considered in the detection model. These works have investigated two-stage Region Proposal Networks (RPN) by taking advantage of objectness scoring cues; however, for their simplicity, run-time, and decoupling of localization and classification, we investigate OWP through the lens of fully convolutional one-stage detection networks, such as FCOS. We show that our architectural and sampling optimizations on FCOS can increase OWP performance by as much as 6% in recall on novel classes, marking the first proposal-free one-stage detection network to achieve comparable performance to RPN-based two-stage networks. Furthermore, we show that the inherent, decoupled architecture of FCOS helps retain classification performance. While two-stage methods worsen by 6% in recall on novel classes, we show that FCOS only drops 2% when jointly optimizing for OWP and classification.

【84】 Time Series Forecasting Using Fuzzy Cognitive Maps: A Survey
标题:基于模糊认知图的时间序列预测研究综述

链接:https://arxiv.org/abs/2201.02297
作者:Omid Orang,Petrônio Cândido de Lima e Silva,Frederico Guimarães Gadelha
摘要:Among various soft computing approaches for time series forecasting, Fuzzy Cognitive Maps (FCMs) have shown remarkable results as a tool to model and analyze the dynamics of complex systems. FCMs have similarities to recurrent neural networks and can be classified as a neuro-fuzzy method. In other words, FCMs are a mixture of fuzzy logic, neural network, and expert system aspects, which act as a powerful tool for simulating and studying the dynamic behavior of complex systems. Their most interesting features are knowledge interpretability, dynamic characteristics and learning capability. The goal of this survey paper is mainly to present an overview of the most relevant and recent FCM-based time series forecasting models proposed in the literature. In addition, this article provides an introduction to the fundamentals of the FCM model and its learning methodologies. This survey also provides some ideas for future research to enhance the capabilities of FCMs in order to address challenges in real-world experiments such as handling non-stationary data and scalability issues. Moreover, equipping FCMs with fast learning algorithms is one of the major concerns in this area.
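The recurrent-network flavour of an FCM can be illustrated with a minimal sketch of one common inference rule: each concept's next activation is a squashed, weighted sum of the current activations. The sigmoid squashing, the absence of a self-memory term, and the indexing convention `weights[j][i]` (influence of concept j on concept i) are assumptions; the surveyed models use several variants.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def fcm_step(state, weights):
    """One FCM update: A_i(t+1) = sigmoid(sum_j weights[j][i] * A_j(t))."""
    n = len(state)
    return [sigmoid(sum(weights[j][i] * state[j] for j in range(n)))
            for i in range(n)]


def fcm_simulate(state, weights, steps):
    """Iterate the map and return the trajectory, starting with the initial state."""
    trajectory = [list(state)]
    for _ in range(steps):
        state = fcm_step(state, weights)
        trajectory.append(state)
    return trajectory
```

In a forecasting setting, the activations after one step would play the role of the one-step-ahead prediction, with the weight matrix learned from historical data.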

【85】 Delay Alignment Modulation: Enabling Equalization-Free Single-Carrier Communication
标题:延迟对齐调制:实现无均衡单载波通信

链接:https://arxiv.org/abs/2201.02291
作者:Haiquan Lu,Yong Zeng
备注:5 pages, 6 figures
摘要:This paper proposes a novel broadband transmission technology, termed delay alignment modulation (DAM), which enables the low-complexity equalization-free single-carrier communication, yet without suffering from inter-symbol interference (ISI). The key idea of DAM is to deliberately introduce appropriate delays for information-bearing symbols at the transmitter side, so that after propagating over the time-dispersive channel, all multi-path signal components will arrive at the receiver simultaneously and constructively. We first show that by applying DAM for the basic multiple-input single-output (MISO) communication system, an ISI-free additive white Gaussian noise (AWGN) system can be obtained with the simple zero-forcing (ZF) beamforming. Furthermore, the more general DAM scheme is studied with the ISI-maximal-ratio transmission (MRT) and the ISI-minimum mean-square error (MMSE) beamforming. Simulation results are provided to show that when the channel is sparse and/or the antenna dimension is large, DAM not only resolves the notorious practical issues suffered by orthogonal frequency-division multiplexing (OFDM) such as high peak-to-average-power ratio (PAPR), severe out-of-band (OOB) emission, and vulnerability to carrier frequency offset (CFO), with low complexity, but also achieves higher spectral efficiency due to the saving of guard interval overhead.

【86】 Voltage-Based State of Charge Correction at Charge-End

链接:https://arxiv.org/abs/2201.02282
作者:Ali Abdollahi,Jianwei Li,Xiaojun Li,Trevor Jones,Asif Habeebullah
摘要:A voltage-based method is proposed to correct battery pack state of charge (SOC) estimation at the charge-end. Two main characteristics make the charge-end time span a good opportunity to correct SOC estimation: first, it is easy to detect when the battery is at the last stage of charging, because the charging profile is known to the BMS designer and, during the charge-end time span, the current is low and the terminal voltages of the battery cells are high; second, as the battery reaches the charge-end stage, we know that the true SOC is approaching 100%. This paper presents a method to utilize these important features to correct the SOC estimation error. Using a voltage threshold method, the algorithm detects when the battery is close to the charge-end to activate the charge-end SOC correction strategy. Once activated, the strategy corrects the SOC using the maximum cell voltage to guarantee that SOC is 100% when charging is complete. The amount of correction is a function of maximum cell voltage and the charge current C-rate.

【87】 Repurposing Existing Deep Networks for Caption and Aesthetic-Guided Image Cropping
标题:重新利用现有的深层网络进行字幕和美学引导的图像裁剪

链接:https://arxiv.org/abs/2201.02280
作者:Nora Horanyi,Kedi Xia,Kwang Moo Yi,Abhishake Kumar Bojja,Ales Leonardis,Hyung Jin Chang
摘要:We propose a novel optimization framework that crops a given image based on user description and aesthetics. Unlike existing image cropping methods, where one typically trains a deep network to regress to crop parameters or cropping actions, we propose to directly optimize for the cropping parameters by repurposing pre-trained networks on image captioning and aesthetic tasks, without any fine-tuning, thereby avoiding training a separate network. Specifically, we search for the best crop parameters that minimize a combined loss of the initial objectives of these networks. To make the optimization stable, we propose three strategies: (i) multi-scale bilinear sampling, (ii) annealing the scale of the crop region, thereby effectively reducing the parameter space, and (iii) aggregation of multiple optimization results. Through various quantitative and qualitative evaluations, we show that our framework can produce crops that are well-aligned to intended user descriptions and aesthetically pleasing.

【88】 De-rendering 3D Objects in the Wild
标题:真实场景下的3D物体逆渲染

链接:https://arxiv.org/abs/2201.02279
作者:Felix Wimbauer,Shangzhe Wu,Christian Rupprecht
摘要:With increasing focus on augmented and virtual reality applications (XR) comes the demand for algorithms that can lift objects from images and videos into representations that are suitable for a wide variety of related 3D tasks. Large-scale deployment of XR devices and applications means that we cannot solely rely on supervised learning, as collecting and annotating data for the unlimited variety of objects in the real world is infeasible. We present a weakly supervised method that is able to decompose a single image of an object into shape (depth and normals), material (albedo, reflectivity and shininess) and global lighting parameters. For training, the method only relies on a rough initial shape estimate of the training objects to bootstrap the learning process. This shape supervision can come for example from a pretrained depth network or - more generically - from a traditional structure-from-motion pipeline. In our experiments, we show that the method can successfully de-render 2D images into a decomposed 3D representation and generalizes to unseen object categories. Since in-the-wild evaluation is difficult due to the lack of ground truth data, we also introduce a photo-realistic synthetic test set that allows for quantitative evaluation.

【89】 Investigating Expectation Violations in Mobile Apps
标题:调查移动应用中的预期违规行为

链接:https://arxiv.org/abs/2201.02269
作者:Sherlock A. Licorish,Helen E. Owen,Bastin Tony Roy Savarimuthu,Priyanka Patel
备注:32 pages, 4 figures, 8 tables
摘要:Information technology and software services are pervasive, occupying the centre of most aspects of contemporary societies. This has given rise to commonly expected norms and expectations around how such systems should work, appropriate penalties for violating these expectations, and more importantly, indicators of how to reduce the consequences of violations and sanctions. Evidence for expectation violations and ensuing sanctions exists in a range of portals used by individuals and groups to start new friendships, explore new ideas, and provide feedback for products and services. Therein lie insights that could lead to functional socio-technical systems, and general awareness and anticipation of human actions (and interactions) when using information technology and software services. However, limited previous work has examined such artifacts to provide these understandings. To contribute to such understandings and theoretical advancement, we study expectation violations in mobile apps, considered among the most engaging socio-technical systems. We used content analysis, expectancy violation theory (EVT), and expectation confirmation theory (ECT) to explore the evidence and nature of sanctions in app reviews for a specific domain of apps. Our outcomes show that users respond to expectation violation with sanctions when their app does not work as anticipated, developers seem to target specific market niches when providing services in an app domain, and users within an app domain respond with similar sanctions. We contribute to the advancement of expectation violation theories, and we provide practical insights for the mobile app community.

【90】 Learning to be adversarially robust and differentially private
标题:学习对抗鲁棒性与差分隐私

链接:https://arxiv.org/abs/2201.02265
作者:Jamie Hayes,Borja Balle,M. Pawan Kumar
备注:Preliminary work appeared at PPML 2021
摘要:We study the difficulties in learning that arise from robust and differentially private optimization. We first study convergence of gradient descent based adversarial training with differential privacy, taking a simple binary classification task on linearly separable data as an illustrative example. We compare the gap between adversarial and nominal risk in both private and non-private settings, showing that the data dimensionality dependent term introduced by private optimization compounds the difficulties of learning a robust model. After this, we discuss what parts of adversarial training and differential privacy hurt optimization, identifying that the size of adversarial perturbation and clipping norm in differential privacy both increase the curvature of the loss landscape, implying poorer generalization performance.
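The clipping norm that the abstract refers to enters through the usual differentially private gradient step: each per-example gradient is clipped to a fixed L2 norm, the clipped gradients are summed, and Gaussian noise scaled to that norm is added. The sketch below shows only this private aggregation step in the common DP-SGD style; it is an assumption about the general recipe rather than the paper's exact procedure, the adversarial perturbation step is omitted, and the names `clip_gradient`, `dp_average_gradient`, and `noise_multiplier` are illustrative.

```python
import math
import random


def clip_gradient(grad, clip_norm):
    """Scale grad down so that its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dp_average_gradient(per_example_grads, clip_norm, noise_multiplier, rng=random):
    """Clip each per-example gradient, sum, add Gaussian noise, then average."""
    clipped = [clip_gradient(g, clip_norm) for g in per_example_grads]
    dim = len(per_example_grads[0])
    total = [sum(g[k] for g in clipped) for k in range(dim)]
    # Noise standard deviation is proportional to the clipping norm.
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [v / len(per_example_grads) for v in noisy]
```

Because the noise is added independently to every coordinate, its total magnitude grows with the number of coordinates, which is one way to read the dimension-dependent term the abstract mentions.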

【91】 ITSA: An Information-Theoretic Approach to Automatic Shortcut Avoidance and Domain Generalization in Stereo Matching Networks
标题:ITSA:立体匹配网络中自动回避捷径和区域泛化的信息论方法

链接:https://arxiv.org/abs/2201.02263
作者:WeiQin Chuah,Ruwan Tennakoon,Reza Hoseinnezhad,Alireza Bab-Hadiashar,David Suter
备注:11 pages, 4 figures
摘要:State-of-the-art stereo matching networks trained only on synthetic data often fail to generalize to more challenging real data domains. In this paper, we attempt to unfold an important factor that hinders the networks from generalizing across domains: through the lens of shortcut learning. We demonstrate that the learning of feature representations in stereo matching networks is heavily influenced by synthetic data artefacts (shortcut attributes). To mitigate this issue, we propose an Information-Theoretic Shortcut Avoidance (ITSA) approach to automatically restrict shortcut-related information from being encoded into the feature representations. As a result, our proposed method learns robust and shortcut-invariant features by minimizing the sensitivity of latent features to input variations. To avoid the prohibitive computational cost of direct input sensitivity optimization, we propose an effective yet feasible algorithm to achieve robustness. We show that using this method, state-of-the-art stereo matching networks that are trained purely on synthetic data can effectively generalize to challenging and previously unseen real data scenarios. Importantly, the proposed method enhances the robustness of the synthetic trained networks to the point that they outperform their fine-tuned counterparts (on real data) for challenging out-of-domain stereo datasets.

【92】 A unified software/hardware scalable architecture for brain-inspired computing based on self-organizing neural models
标题:基于自组织神经模型的脑启发计算软硬件统一可扩展体系结构

链接:https://arxiv.org/abs/2201.02262
作者:Artem R. Muliukov,Laurent Rodriguez,Benoit Miramond,Lyes Khacef,Joachim Schmidt,Quentin Berthet,Andres Upegui
摘要:The field of artificial intelligence has significantly advanced over the past decades, inspired by discoveries from the fields of biology and neuroscience. The idea of this work is inspired by the process of self-organization of cortical areas in the human brain from both afferent and lateral/internal connections. In this work, we develop an original brain-inspired neural model associating Self-Organizing Maps (SOM) and Hebbian learning in the Reentrant SOM (ReSOM) model. The framework is applied to multimodal classification problems. Compared to existing methods based on unsupervised learning with post-labeling, the model enhances the state-of-the-art results. This work also demonstrates the distributed and scalable nature of the model through both simulation results and hardware execution on a dedicated FPGA-based platform named SCALP (Self-configurable 3D Cellular Adaptive Platform). SCALP boards can be interconnected in a modular way to support the structure of the neural model. Such a unified software and hardware approach enables the processing to be scaled and allows information from several modalities to be merged dynamically. The deployment on hardware boards provides performance results of parallel execution on several devices, with the communication between each board through dedicated serial links. The proposed unified architecture, composed of the ReSOM model and the SCALP hardware platform, demonstrates a significant increase in accuracy thanks to multimodal association, and a good trade-off between latency and power consumption compared to a centralized GPU implementation.

【93】 CitySurfaces: City-Scale Semantic Segmentation of Sidewalk Materials
标题:CitySurfaces:人行道材质的城市尺度语义分割

链接:https://arxiv.org/abs/2201.02260
作者:Maryam Hosseini,Fabio Miranda,Jianzhe Lin,Claudio Silva
备注:Sustainable Cities and Society journal (accepted); Model: this https URL
摘要:While designing sustainable and resilient urban built environment is increasingly promoted around the world, significant data gaps have made research on pressing sustainability issues challenging to carry out. Pavements are known to have strong economic and environmental impacts; however, most cities lack a spatial catalog of their surfaces due to the cost-prohibitive and time-consuming nature of data collection. Recent advancements in computer vision, together with the availability of street-level images, provide new opportunities for cities to extract large-scale built environment data with lower implementation costs and higher accuracy. In this paper, we propose CitySurfaces, an active learning-based framework that leverages computer vision techniques for classifying sidewalk materials using widely available street-level images. We trained the framework on images from New York City and Boston and the evaluation results show a 90.5% mIoU score. Furthermore, we evaluated the framework using images from six different cities, demonstrating that it can be applied to regions with distinct urban fabrics, even outside the domain of the training data. CitySurfaces can provide researchers and city agencies with a low-cost, accurate, and extensible method to collect sidewalk material data which plays a critical role in addressing major sustainability issues, including climate change and surface water management.
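
The mIoU (mean intersection over union) score quoted in the abstract above is a standard segmentation metric. As a minimal sketch, assuming flat lists of integer class labels (the function name `mean_iou` and the toy labels are illustrative and not taken from the paper), it can be computed as follows:

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection-over-union across classes, from flat label lists."""
    ious = []
    for c in range(num_classes):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # correctly labeled c
        fp = sum(1 for t, p in pairs if t != c and p == c)  # wrongly predicted c
        fn = sum(1 for t, p in pairs if t == c and p != c)  # missed c
        denom = tp + fp + fn
        if denom:  # skip classes absent from both labels and predictions
            ious.append(tp / denom)
    if not ious:
        raise ValueError("no class occurs in labels or predictions")
    return sum(ious) / len(ious)


# Toy example with two classes (hypothetical labels, one per pixel).
score = mean_iou([0, 1, 1, 1], [0, 0, 1, 1], num_classes=2)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12. Whether the paper averages per image or over the whole test set is not stated in the abstract.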

【94】 Applying Word Embeddings to Measure Valence in Information Operations Targeting Journalists in Brazil
标题:在巴西以记者为目标的信息操作中应用词嵌入来衡量价位

链接:https://arxiv.org/abs/2201.02257
作者:David A. Broniatowski
摘要:Among the goals of information operations is to change the overall information environment vis-à-vis specific actors. For example, "trolling campaigns" seek to undermine the credibility of specific public figures, leading others to distrust them and intimidating these figures into silence. To accomplish these aims, information operations frequently make use of "trolls" -- malicious online actors who target verbal abuse at these figures. In Brazil, in particular, allies of Brazil's current president have been accused of operating a "hate cabinet" -- a trolling operation that targets journalists who have alleged corruption by this politician and other members of his regime. Leading approaches to detecting harmful speech, such as Google's Perspective API, seek to identify specific messages with harmful content. While this approach is helpful in identifying content to downrank, flag, or remove, it is known to be brittle, and may miss attempts to introduce more subtle biases into the discourse. Here, we aim to develop a measure that might be used to assess how targeted information operations seek to change the overall valence, or appraisal, of specific actors. Preliminary results suggest known campaigns target female journalists more so than male journalists, and that these campaigns may leave detectable traces in overall Twitter discourse.

【95】 Data-Efficient Learning of High-Quality Controls for Kinodynamic Planning used in Vehicular Navigation
标题:用于车辆导航的高质量运动规划控制的数据高效学习

链接:https://arxiv.org/abs/2201.02254
作者:Seth Karten,Aravind Sivaramakrishnan,Edgar Granados,Troy McMahon,Kostas E. Bekris
摘要:This paper aims to improve the path quality and computational efficiency of kinodynamic planners used for vehicular systems. It proposes a learning framework for identifying promising controls during the expansion process of sampling-based motion planners for systems with dynamics. Offline, the learning process is trained to return the highest-quality control that reaches a local goal state (i.e., a waypoint) in the absence of obstacles from an input difference vector between its current state and a local goal state. The data generation scheme provides bounds on the target dispersion and uses state space pruning to ensure high-quality controls. By focusing on the system's dynamics, this process is data efficient and takes place once for a dynamical system, so that it can be used for different environments with modular expansion functions. This work integrates the proposed learning process with a) an exploratory expansion function that generates waypoints with biased coverage over the reachable space, and b) an exploitative expansion function for mobile robots, which generates waypoints using medial axis information. This paper evaluates the learning process and the corresponding planners for first- and second-order differential drive systems. The results show that the proposed integration of learning and planning can produce better quality paths than kinodynamic planning with random controls in fewer iterations and less computation time.

【96】 A Taxonomy of Social VR Design
标题:社会虚拟现实设计的一种分类学

链接:https://arxiv.org/abs/2201.02253
作者:Douglas Zytko,Ryan Handley,Bert Guerra,Rukkmini Goli
摘要:Social VR has experienced tremendous growth in the commercial space recently as an emerging technology for rich interactions themed around leisure, work, and relationship building. As a result, the state of social VR application design has become rapidly obfuscated, which complicates identification of design trends and uncommon features that could inform future design, and hinders inclusion of new voices in this design space. To help address this problem, we present a taxonomy of social VR application design choices as informed by 44 commercial and prototypical applications. Our taxonomy was informed by multiple discovery strategies including literature review, search of VR-themed subreddits, and autobiographical landscape research. The taxonomy elucidates various features across three design areas: the self, interaction, and the environment.

【97】 Efficient Algebraic Two-Level Schwarz Preconditioner For Sparse Matrices

链接:https://arxiv.org/abs/2201.02250
作者:Hussam Al Daas,Pierre Jolivet,Tyrone Rees
摘要:Domain decomposition methods are among the most efficient for solving sparse linear systems of equations. Their effectiveness relies on a judiciously chosen coarse space. Originally introduced and theoretically proved to be efficient for self-adjoint operators, spectral coarse spaces have been proposed in the past few years for indefinite and non-self-adjoint operators. This paper presents a new spectral coarse space that can be constructed in a fully algebraic way, unlike most existing spectral coarse spaces. We present a theoretical convergence result for Hermitian positive definite diagonally dominant matrices. Numerical experiments and comparisons against state-of-the-art preconditioners in the multigrid community show that the resulting two-level Schwarz preconditioner is efficient, especially for non-self-adjoint operators. Furthermore, in this case, our proposed preconditioner outperforms state-of-the-art preconditioners.

【98】 Fixation Maximization in the Positional Moran Process
标题:位置性Moran过程中的注视最大化

链接:https://arxiv.org/abs/2201.02248
作者:Joachim Brendborg,Panagiotis Karras,Andreas Pavlogiannis,Asger Ullersted Rasmussen,Josef Tkadlec
备注:11 pages, 6 figures, to appear at AAAI 2022
摘要:The Moran process is a classic stochastic process that models invasion dynamics on graphs. A single "mutant" (e.g., a new opinion, strain, social trait etc.) invades a population of residents spread over the nodes of a graph. The mutant fitness advantage $\delta\geq 0$ determines how aggressively mutants propagate to their neighbors. The quantity of interest is the fixation probability, i.e., the probability that the initial mutant eventually takes over the whole population. However, in realistic settings, the invading mutant has an advantage only in certain locations. E.g., a bacterial mutation allowing for lactose metabolism only confers an advantage on places where dairy products are present. In this paper we introduce the positional Moran process, a natural generalization in which the mutant fitness advantage is only realized on specific nodes called active nodes. The associated optimization problem is fixation maximization: given a budget $k$, choose a set of $k$ active nodes that maximize the fixation probability of the invading mutant. We show that the problem is NP-hard, while the optimization function is not submodular, thus indicating strong computational hardness. Then we focus on two natural limits. In the limit of $\delta\to\infty$ (strong selection), although the problem remains NP-hard, the optimization function becomes submodular and thus admits a constant-factor approximation using a simple greedy algorithm. In the limit of $\delta\to 0$ (weak selection), we show that in $O(m^\omega)$ time we can obtain a tight approximation, where $m$ is the number of edges and $\omega$ is the matrix-multiplication exponent. Finally, we present an experimental evaluation of the new algorithms together with some proposed heuristics.
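
The positional Moran process described above can be explored with a small Monte Carlo simulation. The sketch below is only an illustration under stated assumptions: it uses the common birth-death update (a node is picked for reproduction with probability proportional to fitness and its offspring replaces a uniformly random neighbor), gives fitness $1+\delta$ to mutants on active nodes and fitness 1 to everyone else, and places the initial mutant uniformly at random. The function names and the adjacency-list format are hypothetical and not taken from the paper.

```python
import random


def simulate_fixation(adj, active, delta, rng):
    """Run one process to absorption; return True if the mutants fixate."""
    n = len(adj)
    mutants = {rng.randrange(n)}  # assumption: uniform initial placement
    while 0 < len(mutants) < n:
        # A mutant gains the advantage delta only while sitting on an active node.
        weights = [
            1.0 + delta if (v in mutants and v in active) else 1.0
            for v in range(n)
        ]
        parent = rng.choices(range(n), weights=weights, k=1)[0]
        child = rng.choice(adj[parent])  # offspring replaces a random neighbor
        if parent in mutants:
            mutants.add(child)
        else:
            mutants.discard(child)
    return len(mutants) == n


def estimate_fixation(adj, active, delta, trials=2000, seed=0):
    """Monte Carlo estimate of the fixation probability."""
    rng = random.Random(seed)
    wins = sum(simulate_fixation(adj, active, delta, rng) for _ in range(trials))
    return wins / trials


# Complete graph on 4 nodes, given as adjacency lists.
k4 = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
p_neutral = estimate_fixation(k4, active=set(), delta=5.0)
p_active = estimate_fixation(k4, active={0, 1, 2, 3}, delta=5.0)
```

With no active nodes the process is neutral and the estimate should sit near 1/4 on this graph; making every node active should raise it noticeably. Such an estimator could serve as the objective inside a greedy selection of $k$ active nodes, although the abstract does not say how the authors evaluate fixation probabilities.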

【99】 Source Code Anti-Plagiarism: a C# Implementation using the Routing Approach
标题:源代码反剽窃:使用路由方法的C#实现

链接:https://arxiv.org/abs/2201.02241
作者:Fabrizio d'Amore,Lorenzo Zarfati
摘要:Despite the approaches proposed so far, software plagiarism is still a problem which has not been solved entirely yet. The approach introduced throughout this paper is about a source code anti-plagiarism technique which aims at rendering the source code incomprehensible to a possible plagiarist and at the same time preventing source code modifications. The proposal is based on the concept of Router and makes use of both symmetric encryption and cryptographic hashing functions to provide such guarantees.

【100】 An Input-to-State Safety Approach to Anomaly-Resilient Parabolic PDEs: Application to Cyber-Physical Battery Modules

链接:https://arxiv.org/abs/2201.02239
作者:Tanushree Roy,Ashley Knichel,Satadru Dey
摘要:Distributed Parameter Cyber-Physical Systems (DPCPSs), modelled by Partial Differential Equations (PDEs), are increasingly vulnerable to anomalies such as physical faults as well as cyber-attacks. This motivates the need for strategies towards anomaly-resilient control of these systems. Although anomaly detection and diagnostics in PDE systems have received considerable attention in existing literature, fault-tolerant or anomaly-resilient control for PDEs remains relatively under-explored. However, given the vulnerabilities of these systems against anomalies, it is essential that the control systems possess resilience against these disruptions. In this context, we explore a Practical Input-to-State Safety (pISSf) based control design approach for a class of DPCPSs modelled by linear Parabolic PDEs. Specifically, we develop a design framework for anomaly-resilient control for this class of systems with both safety and stability guarantees based on control Lyapunov functional and control barrier functional. To illustrate our methodology, we apply our strategy to design a thermal-anomaly resilient boundary coolant control system for a cyber-physical battery module. Several simulation studies are done to show the efficacy of our method under anomalies such as mechanical battery degradation and cyber-attack mediated overdischarge.

【101】 Multi-modal data fusion of Voice and EMG data for Robotic Control
标题:机器人控制中语音和肌电数据的多模态数据融合

链接:https://arxiv.org/abs/2201.02237
作者:Tauheed Khan Mohd,Jackson Carvalho,Ahmad Y Javaid
摘要:Wearable electronic equipment is constantly evolving and is increasing the integration of humans with technology. Available in various forms, these flexible and bendable devices can sense and measure physiological and muscular changes in the human body and may use those signals for machine control. The MYO gesture band, one such device, captures Electromyography (EMG) data using myoelectric signals and translates them into input signals through some predefined gestures. Use of this device in a multi-modal environment will not only increase the possible types of work that can be accomplished with the help of such a device, but it will also help in improving the accuracy of the tasks performed. This paper addresses the fusion of input modalities such as speech and myoelectric signals captured through a microphone and MYO band, respectively, to control a robotic arm. Experimental results obtained as well as their accuracies for performance analysis are also presented.

【102】 Detecting Anomalies using Overlapping Electrical Measurements in Smart Power Grids
标题:利用重叠电测量检测智能电网中的异常

链接:https://arxiv.org/abs/2201.02236
作者:Sina Sontowski,Nigel Lawrence,Deepjyoti Deka,Maanak Gupta
摘要:As cyber-attacks against critical infrastructure become more frequent, it is increasingly important to be able to rapidly identify and respond to these threats. This work investigates two independent systems with overlapping electrical measurements with the goal of identifying anomalies more rapidly. The independent systems include HIST, a SCADA historian, and ION, an automatic meter reading system (AMR). While prior research has explored the benefits of fusing measurements, the possibility of overlapping measurements from an existing electrical system has not been investigated. To that end, we explore the potential benefits of combining overlapping measurements both to improve the speed/accuracy of anomaly detection and to provide additional validation of the collected measurements. In this paper, we show that merging overlapping measurements provides a more holistic picture of the observed systems. By applying Dynamic Time Warping, more anomalies were found -- specifically, an average of 349 times more anomalies, when considering anomalies from both overlapping measurements. When merging the overlapping measurements, a percent change of anomalies of up to 785% can be achieved compared to a non-merge of the data as reflected by experimental results.

【103】 Consistent Style Transfer
标题:一致的风格传递

链接:https://arxiv.org/abs/2201.02233
作者:Xuan Luo,Zhen Han,Lingkang Yang,Lingling Zhang
备注:10 pages, 11 figures
摘要:Recently, attentional arbitrary style transfer methods have been proposed to achieve fine-grained results; these methods manipulate the point-wise similarity between content and style features for stylization. However, the attention mechanism based on feature points ignores the feature multi-manifold distribution, where each feature manifold corresponds to a semantic region in the image. Consequently, a uniform content semantic region is rendered by highly different patterns from various style semantic regions, producing inconsistent stylization results with visual artifacts. We propose the progressive attentional manifold alignment (PAMA) to alleviate this problem, which repeatedly applies attention operations and space-aware interpolations. The attention operation rearranges style features dynamically according to the spatial distribution of content features. This makes the content and style manifolds correspond on the feature map. Then the space-aware interpolation adaptively interpolates between the corresponding content and style manifolds to increase their similarity. By gradually aligning the content manifolds to style manifolds, the proposed PAMA achieves state-of-the-art performance while avoiding the inconsistency of semantic regions. Codes are available at https://github.com/computer-vision2022/PAMA.

【104】 Large-scale protein-protein post-translational modification extraction with distant supervision and confidence calibrated BioBERT
标题:具有远程监控和置信度校准的大规模蛋白质翻译后修饰提取

链接:https://arxiv.org/abs/2201.02229
作者:Aparna Elangovan,Yuan Li,Douglas E. V. Pires,Melissa J. Davis,Karin Verspoor
摘要:Protein-protein interactions (PPIs) are critical to normal cellular function and are related to many disease pathways. However, only 4% of PPIs are annotated with PTMs in biological knowledge databases such as IntAct, mainly performed through manual curation, which is neither time- nor cost-effective. We use the IntAct PPI database to create a distant supervised dataset annotated with interacting protein pairs, their corresponding PTM type, and associated abstracts from the PubMed database. We train an ensemble of BioBERT models, dubbed PPI-BioBERT-x10, to improve confidence calibration. We extend the use of ensemble average confidence approach with confidence variation to counteract the effects of class imbalance to extract high confidence predictions. The PPI-BioBERT-x10 model evaluated on the test set resulted in a modest F1-micro 41.3 (P = 58.1, R = 32.1). However, by combining high confidence and low variation to identify high quality predictions, tuning the predictions for precision, we retained 19% of the test predictions with 100% precision. We evaluated PPI-BioBERT-x10 on 18 million PubMed abstracts and extracted 1.6 million (546507 unique PTM-PPI triplets) PTM-PPI predictions, and filtered ~5700 (4584 unique) high confidence predictions. Of the 5700, human evaluation on a small randomly sampled subset shows that the precision drops to 33.7% despite confidence calibration and highlights the challenges of generalisability beyond the test set even with confidence calibration. We circumvent the problem by only including predictions associated with multiple papers, improving the precision to 58.8%. In this work, we highlight the benefits and challenges of deep learning-based text mining in practice, and the need for increased emphasis on confidence calibration to facilitate human curation efforts.
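
The idea of keeping only predictions with high ensemble-average confidence and low variation across ensemble members can be sketched in a few lines. This is a hedged illustration, not the authors' implementation: the function name `high_confidence`, the input format, and the thresholds `min_mean` and `max_std` are all assumptions chosen for the example.

```python
from statistics import mean, pstdev


def high_confidence(predictions, min_mean=0.9, max_std=0.05):
    """Keep items whose ensemble confidences are high on average and stable.

    predictions maps an item id to the list of confidences that each ensemble
    member assigned to the predicted class (hypothetical input format).
    """
    kept = []
    for item, confs in predictions.items():
        if mean(confs) >= min_mean and pstdev(confs) <= max_std:
            kept.append(item)
    return kept


# Toy confidences from a four-model ensemble (made-up numbers).
toy = {
    "a": [0.95, 0.97, 0.96, 0.96],  # high and stable: kept
    "b": [1.00, 0.80, 1.00, 1.00],  # high on average but unstable: dropped
    "c": [0.50, 0.50, 0.50, 0.50],  # stable but low: dropped
}
selected = high_confidence(toy)
```

Tightening either threshold trades recall for precision, which mirrors the trade-off the abstract reports when it retains only a fraction of the test predictions.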

【105】 PIEEG: Turn a Raspberry Pi into a Brain-Computer-Interface to measure biosignals
标题:PIEEG:把树莓PI变成脑机接口来测量生物信号

链接:https://arxiv.org/abs/2201.02228
作者:Ildar Rakhmatulin,Sebastian Volkl
摘要:This paper presents an inexpensive, high-precision, yet easy-to-maintain PIEEG board that converts a RaspberryPI into a brain-computer interface. This shield allows measuring and processing eight real-time EEG (Electroencephalography) signals. We used the most popular programming languages (C, C++ and Python) to read the signals recorded by the device. The process of reading EEG signals was demonstrated as completely and clearly as possible. This device can be easily used by machine learning enthusiasts to create projects for controlling robots and mechanical limbs using the power of thought. We will post use cases on GitHub (https://github.com/Ildaron/EEGwithRaspberryPI) for controlling a robotic machine, unmanned aerial vehicle, and more just using the power of thought.

【106】 Predicting Trust Using Automated Assessment of Multivariate Interactional Synchrony
标题:基于多变量交互同步性自动评估的信任预测

链接:https://arxiv.org/abs/2201.02223
作者:Adrien Meynard,Gayan Seneviratna,Elliot Doyle,Joyanne Becker,Hau-Tieng Wu,Jana Schaich Borg
摘要:Diverse disciplines are interested in how the coordination of interacting agents' movements, emotions, and physiology over time impacts social behavior. Here, we describe a new multivariate procedure for automating the investigation of this kind of behaviorally-relevant "interactional synchrony", and introduce a novel interactional synchrony measure based on features of dynamic time warping (DTW) paths. We demonstrate that our DTW path-based measure of interactional synchrony between facial action units of two people interacting freely in a natural social interaction can be used to predict how much trust they will display in a subsequent Trust Game. We also show that our approach outperforms univariate head movement models, models that consider participants' facial action units independently, and models that use previously proposed synchrony or similarity measures. The insights of this work can be applied to any research question that aims to quantify the temporal coordination of multiple signals over time, but has immediate applications in psychology, medicine, and robotics.
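
Dynamic time warping, on which the synchrony measure above is built, aligns two sequences by dynamic programming and yields a warping path. The following minimal sketch uses the textbook recurrence with absolute difference as the local cost for one-dimensional signals. It is not the authors' multivariate procedure, and the function name `dtw` is illustrative.

```python
def dtw(a, b):
    """Return (total alignment cost, warping path) for two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor cell.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


distance, path = dtw([1, 2, 3], [1, 2, 2, 3])
```

Here the second sequence repeats the middle value, so the alignment cost is zero and the path maps index 1 of the first sequence to both indices 1 and 2 of the second. Features of such a path, for example how far it strays from the diagonal, are the kind of quantity a path-based synchrony measure could use.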

【107】 Nonlocal Kernel Network (NKN): a Stable and Resolution-Independent Deep Neural Network
标题:非局部核网络(NKN):一种稳定的与分辨率无关的深度神经网络

链接:https://arxiv.org/abs/2201.02217
作者:Huaiqian You,Yue Yu,Marta D'Elia,Tian Gao,Stewart Silling
摘要:Neural operators have recently become popular tools for designing solution maps between function spaces in the form of neural networks. Differently from classical scientific machine learning approaches that learn parameters of a known partial differential equation (PDE) for a single instance of the input parameters at a fixed resolution, neural operators approximate the solution map of a family of PDEs. Despite their success, the uses of neural operators are so far restricted to relatively shallow neural networks and confined to learning hidden governing laws. In this work, we propose a novel nonlocal neural operator, which we refer to as nonlocal kernel network (NKN), that is resolution independent, characterized by deep neural networks, and capable of handling a variety of tasks such as learning governing equations and classifying images. Our NKN stems from the interpretation of the neural network as a discrete nonlocal diffusion reaction equation that, in the limit of infinite layers, is equivalent to a parabolic nonlocal equation, whose stability is analyzed via nonlocal vector calculus. The resemblance with integral forms of neural operators allows NKNs to capture long-range dependencies in the feature space, while the continuous treatment of node-to-node interactions makes NKNs resolution independent. The resemblance with neural ODEs, reinterpreted in a nonlocal sense, and the stable network dynamics between layers allow for generalization of NKN's optimal parameters from shallow to deep networks. This fact enables the use of shallow-to-deep initialization techniques. Our tests show that NKNs outperform baseline methods in both learning governing equations and image classification tasks and generalize well to different resolutions and depths.

【108】 On the Prevalence, Impact, and Evolution of SQL Code Smells in Data-Intensive Systems
标题:SQL代码嗅觉在数据密集型系统中的流行、影响和演变

链接:https://arxiv.org/abs/2201.02215
作者:Biruk Asmare Muse,Mohammad Masudur Rahman,Csaba Nagy,Anthony Cleve,Foutse Khomh,Giuliano Antoniol
摘要:Code smells indicate software design problems that harm software quality. Data-intensive systems that frequently access databases often suffer from SQL code smells besides the traditional smells. While there have been extensive studies on traditional code smells, recently, there has been a growing interest in SQL code smells. In this paper, we conduct an empirical study to investigate the prevalence and evolution of SQL code smells in open-source, data-intensive systems. We collected 150 projects and examined both traditional and SQL code smells in these projects. Our investigation delivers several important findings. First, SQL code smells are indeed prevalent in data-intensive software systems. Second, SQL code smells have a weak co-occurrence with traditional code smells. Third, SQL code smells have a weaker association with bugs than that of traditional code smells. Fourth, SQL code smells are more likely to be introduced at the beginning of the project lifetime and likely to be left in the code without a fix, compared to traditional code smells. Overall, our results show that SQL code smells are indeed prevalent and persistent in the studied data-intensive software systems. Developers should be aware of these smells and consider detecting and refactoring SQL code smells and traditional code smells separately, using dedicated tools.

【109】 Towards Industry 5.0: Intelligent Reflecting Surface (IRS) in Smart Manufacturing
标题:走向行业5.0:智能制造中的智能反射面(IRS)

链接:https://arxiv.org/abs/2201.02214
作者:Md. Noor-A-Rahim,Fadhil Firyaguna,Jobish John,M. Omar Khyam,Dirk Pesch,Eddie Armstrong,Holger Claussen,H. Vincent Poor
摘要:The Intelligent Reflecting Surface (IRS) is expected to become a key enabling technology for 6G wireless communication networks, as it can significantly improve the wireless network's performance, creating a controllable radio environment in preferred directions. The vision for Industry 5.0 is for close cooperation between humans and machines, requiring ultra-reliable and low-latency communications (URLLC). IRS is expected to play a crucial role in realizing wireless URLLC for Industry 5.0. In this paper, we first provide an overview of IRS technology and then conceptualize the potential for IRS implementation in a smart manufacturing environment to support the emergence of Industry 5.0 with a series of applications. Finally, to stimulate future research in this area, we discuss the strengths, open challenges, maturity, and enhancing areas of the IRS technology in modern smart manufacturing.

【110】 Explainable deep learning for insights in El Nino and river flows
标题:可解释的深度学习,以洞察厄尔尼诺现象和河流流动

链接:https://arxiv.org/abs/2201.02596
作者:Yumin Liu,Kate Duffy,Jennifer G. Dy,Auroop R. Ganguly
摘要:The El Nino Southern Oscillation (ENSO) is a semi-periodic fluctuation in sea surface temperature (SST) over the tropical central and eastern Pacific Ocean that influences interannual variability in regional hydrology across the world through long-range dependence or teleconnections. Recent research has demonstrated the value of Deep Learning (DL) methods for improving ENSO prediction as well as Complex Networks (CN) for understanding teleconnections. However, gaps in predictive understanding of ENSO-driven river flows include the black box nature of DL, the use of simple ENSO indices to describe a complex phenomenon and translating DL-based ENSO predictions to river flow predictions. Here we show that eXplainable DL (XDL) methods, based on saliency maps, can extract interpretable predictive information contained in global SST and discover novel SST information regions and dependence structures relevant for river flows which, in tandem with climate network constructions, enable improved predictive understanding. Our results reveal additional information content in global SST beyond ENSO indices, develop new understanding of how SSTs influence river flows, and generate improved river flow predictions with uncertainties. Observations, reanalysis data, and earth system model simulations are used to demonstrate the value of the XDL-CN based methods for future interannual and decadal scale climate projections.

【111】 The E-Intelligence System
标题:电子情报系统

链接:https://arxiv.org/abs/2201.02590
作者:Vibhor Gautam,Vikalp Shishodia
摘要:Electronic Intelligence (ELINT), often known as E-Intelligence, is intelligence obtained through electronic sensors. ELINT is usually obtained from signals other than personal communications. The goal is usually to determine a target's capabilities, such as radar placement. Active or passive sensors can be employed to collect data. A provided signal is analyzed and contrasted to collected data for recognized signal types. The information may be stored if the signal type is detected; it can be classed as new if no match is found. ELINT collects and categorizes data. In a military setting (and others that have adopted the usage, such as a business), intelligence helps an organization make decisions that can provide it with a strategic advantage over the competition. The term is frequently shortened to "intel". The two main subfields of signals intelligence (SIGINT) are ELINT and Communications Intelligence (COMINT). The US Department of Defense specifies the terminologies, and intelligence communities use the categories of data reviewed worldwide.

【112】 An Incremental Learning Approach to Automatically Recognize Pulmonary Diseases from the Multi-vendor Chest Radiographs
标题:一种增量学习方法自动识别多厂商胸片中的肺部疾病

链接:https://arxiv.org/abs/2201.02574
作者:Mehreen Sirshar,Taimur Hassan,Muhammad Usman Akram,Shoab Ahmed Khan
摘要:Pulmonary diseases can cause severe respiratory problems, leading to sudden death if not treated in a timely manner. Many researchers have utilized deep learning systems to diagnose pulmonary disorders using chest X-rays (CXRs). However, such systems require exhaustive training efforts on large-scale data to effectively diagnose chest abnormalities. Furthermore, procuring such large-scale data is often infeasible and impractical, especially for rare diseases. With the recent advances in incremental learning, researchers have periodically tuned deep neural networks to learn different classification tasks with few training examples. Although such systems can resist catastrophic forgetting, they treat the knowledge representations independently of each other, and this limits their classification performance. Also, to the best of our knowledge, there is no incremental learning-driven image diagnostic framework that is specifically designed to screen pulmonary disorders from the CXRs. To address this, we present a novel framework that can learn to screen different chest abnormalities incrementally. In addition to this, the proposed framework is penalized through an incremental learning loss function that infers Bayesian theory to recognize structural and semantic inter-dependencies between incrementally learned knowledge representations to diagnose the pulmonary diseases effectively, regardless of the scanner specifications. We tested the proposed framework on five public CXR datasets containing different chest abnormalities, where it outperformed various state-of-the-art systems through various metrics.

【113】 AugmentedPCA: A Python Package of Supervised and Adversarial Linear Factor Models
标题:增强的PCA:一个Python包,包含监督和对抗的线性因素模型

链接:https://arxiv.org/abs/2201.02547
作者:William E. Carson IV,Austin Talbot,David Carlson
备注:NeurIPS 2021 (Learning Meaningful Representations of Life Workshop)
摘要:Deep autoencoders are often extended with a supervised or adversarial loss to learn latent representations with desirable properties, such as greater predictivity of labels and outcomes or fairness with respect to a sensitive variable. Despite the ubiquity of supervised and adversarial deep latent factor models, these methods should demonstrate improvement over simpler linear approaches to be preferred in practice. This necessitates a reproducible linear analog that still adheres to an augmenting supervised or adversarial objective. We address this methodological gap by presenting methods that augment the principal component analysis (PCA) objective with either a supervised or an adversarial objective and provide analytic and reproducible solutions. We implement these methods in an open-source Python package, AugmentedPCA, that can produce excellent real-world baselines. We demonstrate the utility of these factor models on an open-source, RNA-seq cancer gene expression dataset, showing that augmenting with a supervised objective results in improved downstream classification performance, produces principal components with greater class fidelity, and facilitates identification of genes aligned with the principal axes of data variance with implications for the development of specific types of cancer.

【114】 Machine-learning-based arc selection for constrained shortest path problems in column generation
标题:基于机器学习的列生成约束最短路径问题的圆弧选择

链接:https://arxiv.org/abs/2201.02535
作者:Mouad Morabit,Guy Desaulniers,Andrea Lodi
摘要:Column generation is an iterative method used to solve a variety of optimization problems. It decomposes the problem into two parts: a master problem, and one or more pricing problems (PP). The total computing time taken by the method is divided between these two parts. In routing or scheduling applications, the problems are mostly defined on a network, and the PP is usually an NP-hard shortest path problem with resource constraints. In this work, we propose a new heuristic pricing algorithm based on machine learning. By taking advantage of the data collected during previous executions, the objective is to reduce the size of the network and accelerate the PP, keeping only the arcs that have a high chance of being part of the linear relaxation solution. The method has been applied to two specific problems: the vehicle and crew scheduling problem in public transit and the vehicle routing problem with time windows. Reductions in computational time of up to 40% can be obtained.

【115】 TOWER-Complete Problems in Contraction-Free Substructural Logics
标题:无收缩子结构逻辑中的塔式完备性问题

链接:https://arxiv.org/abs/2201.02496
作者:Hiromi Tanaka
备注:Draft
摘要:We investigate the computational complexity of a family of substructural logics with exchange and weakening but without contraction. With the aid of the techniques provided by Lazić and Schmitz (2015), we show that the deducibility problem for full Lambek calculus with exchange and weakening ($\mathbf{FL}_{\mathbf{ew}}$) is TOWER-complete, where TOWER is one of the non-elementary complexity classes introduced by Schmitz (2016). The same complexity result holds even for deducibility in BCK-logic, i.e., the implicational fragment of $\mathbf{FL}_{\mathbf{ew}}$. We furthermore show the TOWER-completeness of the provability problem for elementary affine logic, which was proved to be decidable by Dal Lago and Martini (2004).

【116】 Deep Domain Adversarial Adaptation for Photon-efficient Imaging Based on Spatiotemporal Inception Network
标题:基于时空初始网络的光子有效成像的深域对抗性自适应

链接:https://arxiv.org/abs/2201.02475
作者:Yiwei Chen,Gongxin Yao,Yong Liu,Yu Pan
摘要:In single-photon LiDAR, photon-efficient imaging captures the 3D structure of a scene using only a few detected signal photons per pixel. The existing deep learning models for this task are trained on simulated datasets, which poses a domain shift challenge when they are applied to realistic scenarios. In this paper, we propose a spatiotemporal inception network (STIN) for photon-efficient imaging, which is able to precisely predict the depth from a sparse and high-noise photon counting histogram by fully exploiting spatial and temporal information. Then domain adversarial adaptation frameworks, including the domain-adversarial neural network and adversarial discriminative domain adaptation, are applied to STIN to alleviate the domain shift problem for realistic applications. Comprehensive experiments on the simulated data generated from the NYU v2 and the Middlebury datasets demonstrate that STIN outperforms the state-of-the-art models at low signal-to-background ratios from 2:10 to 2:100. Moreover, experimental results on the real-world dataset captured by the single-photon imaging prototype show that STIN with domain adversarial training achieves better generalization performance than both the state-of-the-art methods and the baseline STIN trained on simulated data.

【117】 Negative Evidence Matters in Interpretable Histology Image Classification
标题:负证据在可解释组织学图像分类中的作用

链接:https://arxiv.org/abs/2201.02445
作者:Soufiane Belharbi,Marco Pedersoli,Ismail Ben Ayed,Luke McCaffrey,Eric Granger
备注:10 figures, under review
摘要:Using only global annotations such as the image class labels, weakly-supervised learning methods allow CNN classifiers to jointly classify an image and yield the regions of interest associated with the predicted class. However, without any guidance at the pixel level, such methods may yield inaccurate regions. This problem is known to be more challenging with histology images than with natural ones, since objects are less salient, structures have more variations, and foreground and background regions have stronger similarities. Therefore, methods in the computer vision literature for visual interpretation of CNNs may not directly apply. In this work, we propose a simple yet efficient method based on a composite loss function that leverages information from the fully negative samples. Our new loss function contains two complementary terms: the first exploits positive evidence collected from the CNN classifier, while the second leverages the fully negative samples from the training dataset. In particular, we equip a pre-trained classifier with a decoder that allows refining the regions of interest. The same classifier is exploited to collect both the positive and negative evidence at the pixel level to train the decoder. This makes it possible to take advantage of the fully negative samples that occur naturally in the data, without any additional supervision signals and using only the image class as supervision. Compared to several recent related methods, over the public benchmark GlaS for colon cancer and a Camelyon16 patch-based benchmark for breast cancer using three different backbones, we show the substantial improvements introduced by our method. Our results show the benefits of using both negative and positive evidence, i.e., the one obtained from a classifier and the one naturally available in datasets. We provide an ablation study of both terms. Our code is publicly available.

【118】 Applications of Signature Methods to Market Anomaly Detection
标题:签名方法在市场异常检测中的应用

链接:https://arxiv.org/abs/2201.02441
作者:Erdinc Akyildirim,Matteo Gambara,Josef Teichmann,Syang Zhou
摘要:Anomaly detection is the process of identifying abnormal instances or events in data sets which deviate significantly from the norm. In this study, we propose a signature-based machine learning algorithm to detect rare or unexpected items in a given data set of time series type. We present applications of signature or randomized signature as feature extractors for anomaly detection algorithms; additionally, we provide an easy, representation theoretic justification for the construction of randomized signatures. Our first application is based on synthetic data and aims at distinguishing between real and fake trajectories of stock prices, which are indistinguishable by visual inspection. We also show a real life application by using transaction data from the cryptocurrency market. In this case, we are able to identify pump and dump attempts organized on social networks with F1 scores up to 88% by means of our unsupervised learning algorithm, thus achieving results that are close to the state-of-the-art in the field based on supervised learning.

【119】 Optimality in Noisy Importance Sampling
标题:噪声重要抽样中的最优性

链接:https://arxiv.org/abs/2201.02432
作者:Fernando Llorente,Luca Martino,Jesse Read,David Delgado-Gómez
摘要:In this work, we analyze noisy importance sampling (IS), i.e., IS working with noisy evaluations of the target density. We present the general framework and derive optimal proposal densities for noisy IS estimators. The optimal proposals incorporate the information of the variance of the noisy realizations, proposing points in regions where the noise power is higher. We also compare the use of the optimal proposals with previous optimality approaches considered in a noisy IS framework.
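As a rough illustration of the setting (not the paper's optimal proposals), the sketch below runs self-normalized IS where each evaluation of an unnormalized target is corrupted by noise. The toy target, the mean-one multiplicative noise model, the Gaussian proposal, and all function names are assumptions chosen only for this example.

```python
import math
import random


def noisy_self_normalized_is(f, noisy_target, sample_q, pdf_q, n, rng):
    """Self-normalized IS estimate of E_pi[f(X)] from noisy target evaluations."""
    num = 0.0
    den = 0.0
    for _ in range(n):
        x = sample_q(rng)
        # Importance weight built from a noisy evaluation of the target.
        w = noisy_target(x, rng) / pdf_q(x)
        num += w * f(x)
        den += w
    return num / den


# Hypothetical toy problem: unnormalized standard Gaussian target.
def target(x):
    return math.exp(-0.5 * x * x)


def noisy_target(x, rng):
    # Assumed noise model: multiplicative, mean one, so evaluations are unbiased.
    return target(x) * rng.uniform(0.5, 1.5)


def sample_q(rng):
    # Assumed proposal: zero-mean Gaussian with standard deviation 2.
    return rng.gauss(0.0, 2.0)


def pdf_q(x):
    return math.exp(-0.5 * (x / 2.0) ** 2) / (2.0 * math.sqrt(2.0 * math.pi))


rng = random.Random(0)
estimate = noisy_self_normalized_is(
    lambda x: x * x, noisy_target, sample_q, pdf_q, 100_000, rng
)
```

Here `estimate` approximates the second moment of a standard Gaussian. The proposal is fixed by hand; choosing it as a function of the noise variance is the question the abstract addresses.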

【120】 Effect of Prior-based Losses on Segmentation Performance: A Benchmark
标题:基于先前损失对分割性能的影响:一个基准

链接:https://arxiv.org/abs/2201.02428
作者:Rosana EL JURDI,Caroline Petitjean,Veronika Cheplygina,Paul Honeine,Fahed Abdallah
备注:To be submitted to SPIE: Journal of Medical Imaging
摘要:Today, deep convolutional neural networks (CNNs) have demonstrated state-of-the-art performance for medical image segmentation on various imaging modalities and tasks. Despite early success, segmentation networks may still generate anatomically aberrant segmentations, with holes or inaccuracies near the object boundaries. To enforce anatomical plausibility, recent research studies have focused on incorporating prior knowledge, such as object shape or boundary, as constraints in the loss function. The integrated prior can be low-level, referring to reformulated representations extracted from the ground-truth segmentations, or high-level, representing external medical information such as the organ's shape or size. Over the past few years, prior-based losses have attracted rising interest in the research field since they allow integration of expert knowledge while still being architecture-agnostic. However, given the diversity of prior-based losses on different medical imaging challenges and tasks, it has become hard to identify which loss works best for which dataset. In this paper, we establish a benchmark of recent prior-based losses for medical image segmentation. The main objective is to provide intuition into which losses to choose given a particular task or dataset. To this end, four low-level and high-level prior-based losses are selected. The considered losses are validated on 8 different datasets from a variety of medical image segmentation challenges including the Decathlon, the ISLES and the WMH challenge. Results show that whereas low-level prior-based losses can guarantee an increase in performance over the Dice loss baseline regardless of the dataset characteristics, high-level prior-based losses can increase anatomical plausibility as per data characteristics.

【121】 Auto-Weighted Layer Representation Based View Synthesis Distortion Estimation for 3-D Video Coding
标题:基于自动加权分层表示的三维视频编码视图合成失真估计

链接:https://arxiv.org/abs/2201.02420
作者:Jian Jin,Xingxing Zhang,Lili Meng,Weisi Lin,Jie Liang,Huaxiang Zhang,Yao Zhao
摘要:Recently, various view synthesis distortion estimation models have been studied to better serve 3-D video coding. However, they can hardly model the relationship quantitatively among different levels of depth changes, texture degeneration, and the view synthesis distortion (VSD), which is crucial for rate-distortion optimization and rate allocation. In this paper, an auto-weighted layer representation based view synthesis distortion estimation model is developed. Firstly, the sub-VSD (S-VSD) is defined according to the level of depth changes and their associated texture degeneration. After that, a set of theoretical derivations demonstrates that the VSD can be approximately decomposed into the S-VSDs multiplied by their associated weights. To obtain the S-VSDs, a layer-based representation of S-VSD is developed, where all the pixels with the same level of depth changes are represented with a layer to enable efficient S-VSD calculation at the layer level. Meanwhile, a nonlinear mapping function is learnt to accurately represent the relationship between the VSD and S-VSDs, automatically providing weights for S-VSDs during the VSD estimation. To learn such a function, a dataset of VSD and its associated S-VSDs is built. Experimental results show that the VSD can be accurately estimated with the weights learnt by the nonlinear mapping function once its associated S-VSDs are available. The proposed method outperforms the relevant state-of-the-art methods in both accuracy and efficiency. The dataset and source code of the proposed method will be available at https://github.com/jianjin008/.

【122】 Amplitude SAR Imagery Splicing Localization
标题:幅度SAR图像拼接定位

链接:https://arxiv.org/abs/2201.02409
作者:Edoardo Daniele Cannas,Nicolò Bonettini,Sara Mandelli,Paolo Bestagini,Stefano Tubaro
摘要:Synthetic Aperture Radar (SAR) images are a valuable asset for a wide variety of tasks. In the last few years, many websites have been offering them for free in the form of easy to manage products, favoring their widespread diffusion and research work in the SAR field. The drawback of these opportunities is that such images might be exposed to forgeries and manipulations by malicious users, raising new concerns about their integrity and trustworthiness. Up to now, the multimedia forensics literature has proposed various techniques to localize manipulations in natural photographs, but the integrity assessment of SAR images was never investigated. This task poses new challenges, since SAR images are generated with a processing chain completely different from that of natural photographs. This implies that many forensics methods developed for natural images are not guaranteed to succeed. In this paper, we investigate the problem of amplitude SAR imagery splicing localization. Our goal is to localize regions of an amplitude SAR image that have been copied and pasted from another image, possibly undergoing some kind of editing in the process. To do so, we leverage a Convolutional Neural Network (CNN) to extract a fingerprint highlighting inconsistencies in the processing traces of the analyzed input. Then, we examine this fingerprint to produce a binary tampering mask indicating the pixel region under splicing attack. Results show that our proposed method, tailored to the nature of SAR signals, provides better performances than state-of-the-art forensic tools developed for natural images.

【123】 Model-Free Nonlinear Feedback Optimization

链接:https://arxiv.org/abs/2201.02395
作者:Zhiyu He,Saverio Bolognani,Jianping He,Florian Dörfler,Xinping Guan
摘要:Feedback optimization is a control paradigm that enables physical systems to autonomously reach efficient operating points. Its central idea is to interconnect optimization iterations in closed-loop with the physical plant. Since iterative gradient-based methods are extensively used to achieve optimality, feedback optimization controllers typically require the knowledge of the steady-state sensitivity of the plant, which may not be easily accessible in some applications. In contrast, in this paper we develop a model-free feedback controller for efficient steady-state operation of general dynamical systems. The proposed design consists in updating control inputs via gradient estimates constructed from evaluations of the nonconvex objective at the current input and at the measured output. We study the dynamic interconnection of the proposed iterative controller with a stable nonlinear discrete-time plant. For this setup, we characterize the optimality and the stability of the closed-loop behavior as functions of the problem dimension, the number of iterations, and the rate of convergence of the physical plant. To handle general constraints that affect multiple inputs, we enhance the controller with Frank-Wolfe type updates.

【124】 Investigation of the Relationship Between Localization Accuracy and Sensor Array
标题:定位精度与传感器阵列关系的研究

链接:https://arxiv.org/abs/2201.02372
作者:Y Li
摘要:The magnetic localization method, which is mainly based on the accurate mapping of the magnetic field generated by magnetic sources, has been widely studied. Many factors affect localization accuracy in experiments. Therefore, this paper studies the relationship between localization accuracy and sensor array layout through different experiments. The system uses a small magnet as the magnetic source, and the mathematical model of the magnetic positioning system is established based on the magnetic dipole model to estimate the magnetic field. The Levenberg-Marquardt algorithm was used to construct a magnetic positioning objective function for comparison experiments. Experimental results show that when the sensors are evenly distributed around the magnet, the positioning accuracy is higher than with other sensor array layouts: the average localization error is 0.47 mm and the average orientation error is 0.92 degrees.
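For orientation, the standard point-dipole field and a generic least-squares objective of the kind that Levenberg-Marquardt minimizes can be written as below. The notation ($\mathbf{p}$ for the magnet position, $\mathbf{m}$ for its moment, $\mathbf{s}_i$ for the position of sensor $i$, $\mathbf{B}_i^{\mathrm{meas}}$ for its reading) is assumed here for illustration and may differ from the paper's exact formulation.

```latex
% Field of a point dipole with moment m, observed at displacement r from the dipole
\mathbf{B}(\mathbf{r};\mathbf{m})
  = \frac{\mu_0}{4\pi}
    \left(
      \frac{3\,(\mathbf{m}\cdot\mathbf{r})\,\mathbf{r}}{\|\mathbf{r}\|^{5}}
      - \frac{\mathbf{m}}{\|\mathbf{r}\|^{3}}
    \right)

% Generic nonlinear least-squares localization objective over N sensors
\min_{\mathbf{p},\,\mathbf{m}}
  \sum_{i=1}^{N}
  \left\|
    \mathbf{B}_i^{\mathrm{meas}} - \mathbf{B}(\mathbf{s}_i - \mathbf{p};\mathbf{m})
  \right\|^{2}
```

Because the field decays as $\|\mathbf{r}\|^{-3}$, the residuals are sensitive to where the sensors sit relative to the magnet, which is consistent with the abstract's focus on array layout.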

【125】 The Green's function of the Lax-Wendroff and Beam-Warming schemes

链接:https://arxiv.org/abs/2201.02371
作者:Jean-François Coulombel
摘要:We prove a sharp uniform generalized Gaussian bound for the Green's function of the Lax-Wendroff and Beam-Warming schemes. Our bound highlights the spatial region that leads to the well-known (rather weak) instability of these schemes in the maximum norm. We also recover uniform bounds in the maximum norm when these schemes are applied to initial data of bounded variation.
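To make the object of study concrete, the sketch below computes the discrete Green's function of the Lax-Wendroff scheme for the transport equation $u_t + a u_x = 0$ by applying the scheme repeatedly to a unit mass at $j = 0$. The function name and the dictionary representation are illustrative choices, and `lam` stands for the CFL number $a\,\Delta t/\Delta x$.

```python
def lax_wendroff_green(lam, n_steps):
    """Return {j: G_j^n}, the Lax-Wendroff scheme applied n_steps times to a discrete Dirac mass."""
    # One step reads: u_j^{n+1} = c_left * u_{j-1}^n + c_mid * u_j^n + c_right * u_{j+1}^n
    c_left = 0.5 * lam * (lam + 1.0)
    c_mid = 1.0 - lam * lam
    c_right = 0.5 * lam * (lam - 1.0)

    green = {0: 1.0}
    for _ in range(n_steps):
        nxt = {}
        for j, value in green.items():
            # u_j^n is the left neighbour of cell j+1 and the right neighbour of cell j-1.
            nxt[j + 1] = nxt.get(j + 1, 0.0) + c_left * value
            nxt[j] = nxt.get(j, 0.0) + c_mid * value
            nxt[j - 1] = nxt.get(j - 1, 0.0) + c_right * value
        green = nxt
    return green


green = lax_wendroff_green(0.5, 50)
total_mass = sum(green.values())
first_moment = sum(j * g for j, g in green.items())
```

The three coefficients sum to one, so the total mass is conserved, and the first moment advances by `lam` per step. Some entries of `green` are negative for `0 < lam < 1`, which is the oscillatory behavior behind the maximum-norm question the abstract refers to.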

【126】 Cross-Modality Deep Feature Learning for Brain Tumor Segmentation
标题:跨模态深度特征学习在脑肿瘤分割中的应用

链接:https://arxiv.org/abs/2201.02356
作者:Dingwen Zhang,Guohai Huang,Qiang Zhang,Jungong Han,Junwei Han,Yizhou Yu
备注:published on Pattern Recognition 2021
摘要:Recent advances in machine learning and the prevalence of digital medical images have opened up an opportunity to address the challenging brain tumor segmentation (BTS) task by using deep convolutional neural networks. However, unlike RGB image data, which are very widespread, the medical image data used in brain tumor segmentation are relatively scarce in terms of data scale but contain richer information in terms of the modality property. To this end, this paper proposes a novel cross-modality deep feature learning framework to segment brain tumors from multi-modality MRI data. The core idea is to mine rich patterns across the multi-modality data to make up for the insufficient data scale. The proposed cross-modality deep feature learning framework consists of two learning processes: the cross-modality feature transition (CMFT) process and the cross-modality feature fusion (CMFF) process, which aim at learning rich feature representations by transiting knowledge across different modality data and fusing knowledge from different modality data, respectively. Comprehensive experiments are conducted on the BraTS benchmarks, which show that the proposed cross-modality deep feature learning framework can effectively improve the brain tumor segmentation performance when compared with the baseline methods and state-of-the-art methods.

【127】 Projective Embedding of Dynamical Systems: uniform mean field equations
标题:动力系统的射影嵌入:一致平均场方程

链接:https://arxiv.org/abs/2201.02355
作者:Francesco Caravelli,Fabio L. Traversa,Michele Bonnin,Fabrizio Bonani
备注:45 pages; one column; 10 figures;
摘要:We study embeddings of continuous dynamical systems in larger dimensions via projector operators. We call this technique PEDS, projective embedding of dynamical systems, as the stable fixed points of the dynamics are recovered via projection from the higher dimensional space. In this paper we provide a general definition and prove that for a particular type of rank-1 projector operator, the uniform mean field projector, the equations of motion become a mean field approximation of the dynamical system. While in general the embedding depends on a specified variable ordering, the same is not true for the uniform mean field projector. In addition, we prove that the original stable fixed points remain stable fixed points of the dynamics, saddle points remain saddle points, but unstable fixed points become saddles.

【128】 Multiresolution Fully Convolutional Networks to detect Clouds and Snow through Optical Satellite Images
标题:利用光学卫星图像探测云雪的多分辨率全卷积网络

链接:https://arxiv.org/abs/2201.02350
作者:Debvrat Varshney,Claudio Persello,Prasun Kumar Gupta,Bhaskar Ramachandra Nikam
摘要:Clouds and snow have similar spectral features in the visible and near-infrared (VNIR) range and are thus difficult to distinguish from each other in high resolution VNIR images. We address this issue by introducing a shortwave-infrared (SWIR) band where clouds are highly reflective, and snow is absorptive. As SWIR is typically of a lower resolution compared to VNIR, this study proposes a multiresolution fully convolutional neural network (FCN) that can effectively detect clouds and snow in VNIR images. We fuse the multiresolution bands within a deep FCN and perform semantic segmentation at the higher, VNIR resolution. Such a fusion-based classifier, trained in an end-to-end manner, achieved 94.31% overall accuracy and an F1 score of 97.67% for clouds on Resourcesat-2 data captured over the state of Uttarakhand, India. These scores were found to be 30% higher than a Random Forest classifier, and 10% higher than a standalone single-resolution FCN. Apart from being useful for cloud detection purposes, the study also highlights the potential of convolutional neural networks for multi-sensor fusion problems.

【129】 Bayesian Online Change Point Detection for Baseline Shifts
标题:基线偏移的贝叶斯在线变化点检测

链接:https://arxiv.org/abs/2201.02325
作者:Ginga Yoshizawa
摘要:In time series data analysis, detecting change points on a real-time basis (online) is of great interest in many areas, such as finance, environmental monitoring, and medicine. One promising means to achieve this is the Bayesian online change point detection (BOCPD) algorithm, which has been successfully adopted in particular cases in which the time series of interest has a fixed baseline. However, we have found that the algorithm struggles when the baseline irreversibly shifts from its initial state. This is because with the original BOCPD algorithm, the sensitivity with which a change point can be detected is degraded if the data points are fluctuating at locations relatively far from the original baseline. In this paper, we not only extend the original BOCPD algorithm to be applicable to a time series whose baseline is constantly shifting toward unknown values but also visualize why the proposed extension works. To demonstrate the efficacy of the proposed algorithm compared to the original one, we examine these algorithms on two real-world data sets and six synthetic data sets.

【130】 RestoreDet: Degradation Equivariant Representation for Object Detection in Low Resolution Images
标题:RestoreDet:低分辨率图像目标检测的退化等变表示

链接:https://arxiv.org/abs/2201.02314
作者:Ziteng Cui,Yingying Zhu,Lin Gu,Guo-Jun Qi,Xiaoxiao Li,Peng Gao,Zenghui Zhang,Tatsuya Harada
备注:11 pages, 3 figures
摘要:Image restoration algorithms such as super resolution (SR) are indispensable pre-processing modules for object detection in degraded images. However, most of these algorithms assume the degradation is fixed and known a priori. When the real degradation is unknown or differs from the assumption, both the pre-processing module and the subsequent high-level task, such as object detection, would fail. Here, we propose a novel framework, RestoreDet, to detect objects in degraded low resolution images. RestoreDet utilizes the downsampling degradation as a kind of transformation for self-supervised signals to explore the equivariant representation against various resolutions and other degradation conditions. Specifically, we learn this intrinsic visual structure by encoding and decoding the degradation transformation from a pair of original and randomly degraded images. The framework could further take advantage of advanced SR architectures with an arbitrary resolution restoring decoder to reconstruct the original correspondence from the degraded input image. Both the representation learning and object detection are optimized jointly in an end-to-end training fashion. RestoreDet is a generic framework that could be implemented on any mainstream object detection architecture. Extensive experiments show that our framework based on CenterNet achieves superior performance compared with existing methods when facing varying degradation situations. Our code will be released soon.

【131】 Stochastic Saddle Point Problems with Decision-Dependent Distributions
标题:决策相关分布的随机鞍点问题

链接:https://arxiv.org/abs/2201.02313
作者:Killian Wood,Emiliano Dall'Anese
摘要:This paper focuses on stochastic saddle point problems with decision-dependent distributions in both the static and time-varying settings. These are problems whose objective is the expected value of a stochastic payoff function, where random variables are drawn from a distribution induced by a distributional map. For general distributional maps, the problem of finding saddle points is in general computationally burdensome, even if the distribution is known. To enable a tractable solution approach, we introduce the notion of equilibrium points -- which are saddle points for the stationary stochastic minimax problem that they induce -- and provide conditions for their existence and uniqueness. We demonstrate that the distance between the two classes of solutions is bounded provided that the objective has a strongly-convex-strongly-concave payoff and Lipschitz continuous distributional map. We develop deterministic and stochastic primal-dual algorithms and demonstrate their convergence to the equilibrium point. In particular, by modeling errors emerging from a stochastic gradient estimator as sub-Weibull random variables, we provide error bounds in expectation and in high probability that hold for each iteration; moreover, we show convergence to a neighborhood in expectation and almost surely. Finally, we investigate a condition on the distributional map -- which we call opposing mixture dominance -- that ensures the objective is strongly-convex-strongly-concave. Under this assumption, we show that primal-dual algorithms converge to the saddle points in a similar fashion.

【132】 Electric Vehicle Routing Problem with Spatio-temporal Varying Electricity Price and Incentive-aware Customers

链接:https://arxiv.org/abs/2201.02311
作者:Canqi Yao,Shibo Chen,Mauro Salazar,Zaiyue Yang
备注:Submitted to IEEE TSG. arXiv admin note: substantial text overlap with arXiv:2110.06441
摘要:This paper investigates the optimization problem of a fleet of electric vehicles (EVs) serving a set of time-specified customers, where the operator needs to jointly optimize the routing and charging problems for each EV. In particular, with regard to the spatio-temporally varying electricity price, we consider incentive-aware customers and propose that the operator offer monetary incentives in exchange for the customers' time flexibility. In this manner, a win-win situation is achievable since time flexibility enables the fleet operator to obtain a routing and charging schedule with lower cost, whilst the customers receive monetary compensation. Specifically, we first devise a bi-level model whereby the fleet operator optimizes the routing and charging schedule jointly with a monetary incentive to reimburse the delivery time flexibility experienced by the customers. At the same time, the customers choose the optimal time flexibility by minimizing their own cost. Second, we tackle the complexity resulting from the bi-level and nonlinear problem with an equivalent transformation method. Eventually, we reformulate the problem as a single-level optimization problem, which is then solved by the proposed Benders dual decomposition method, which has a faster convergence rate than the generalized Benders decomposition method. To evaluate the effectiveness of our framework and the proposed Benders dual decomposition algorithm, we carry out extensive numerical experiments using VRP-REP data from Belgium.

【133】 Generalized quantum similarity learning
标题:广义量子相似学习

链接:https://arxiv.org/abs/2201.02310
作者:Santosh Kumar Radha,Casey Jao
摘要:The similarity between objects is significant in a broad range of areas. While similarity can be measured using off-the-shelf distance functions, they may fail to capture the inherent meaning of similarity, which tends to depend on the underlying data and task. Moreover, conventional distance functions limit the space of similarity measures to be symmetric and do not directly allow comparing objects from different spaces. We propose using quantum networks (GQSim) for learning task-dependent (a)symmetric similarity between data that need not have the same dimensionality. We analyze the properties of such a similarity function analytically (for a simple case) and numerically (for a complex case) and show that these similarity measures can extract salient features of the data. We also demonstrate that the similarity measure derived using this technique is $(\epsilon,\gamma,\tau)$-good, resulting in theoretically guaranteed performance. Finally, we conclude by applying this technique to three relevant applications: classification, graph completion, and generative modeling.

【134】 A three-dimensional dual-domain deep network for high-pitch and sparse helical CT reconstruction
标题:用于高螺距稀疏螺旋CT重建的三维双域深度网络

链接:https://arxiv.org/abs/2201.02309
作者:Wei Wang,Xiang-Gen Xia,Chuanjiang He,Zemin Ren,Jian Lu
备注:13 pages, 5 figures
摘要:In this paper, we propose a new GPU implementation of the Katsevich algorithm for helical CT reconstruction. Our implementation divides the sinograms and reconstructs the CT images pitch by pitch. By utilizing the periodic properties of the parameters of the Katsevich algorithm, our method only needs to calculate these parameters once for all the pitches and so has lower GPU-memory burdens and is very suitable for deep learning. By embedding our implementation into the network, we propose an end-to-end deep network for the high pitch helical CT reconstruction with sparse detectors. Since our network utilizes the features extracted from both sinograms and CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of sinograms and preserve fine details in the CT images. Experiments show that our network outperforms the related methods both in subjective and objective evaluations.

【135】 A Theoretical Framework of Almost Hyperparameter-free Hyperparameter Selection Methods for Offline Policy Evaluation
标题:用于离线政策评估的几乎无超参数超参数选择方法的理论框架

链接:https://arxiv.org/abs/2201.02300
作者:Kohei Miyaguchi
备注:AAAI22-AI4DO (workshop)
摘要:We are concerned with the problem of hyperparameter selection for offline policy evaluation (OPE). OPE is a key component of offline reinforcement learning, which is a core technology for data-driven decision optimization without environment simulators. However, the current state-of-the-art OPE methods are not hyperparameter-free, which undermines their utility in real-life applications. We address this issue by introducing a new approximate hyperparameter selection (AHS) framework for OPE, which defines a notion of optimality (called selection criteria) in a quantitative and interpretable manner without hyperparameters. We then derive four AHS methods, each of which has different characteristics such as convergence rate and time complexity. Finally, we verify the effectiveness and limitations of these methods with a preliminary experiment.

【136】 Local and Global Convergence of General Burer-Monteiro Tensor Optimizations
标题:广义布里-蒙泰罗张量优化问题的局部收敛性和全局收敛性

链接:https://arxiv.org/abs/2201.02298
作者:Shuang Li,Qiuwei Li
摘要:Tensor optimization is crucial to massive machine learning and signal processing tasks. In this paper, we consider tensor optimization with a convex and well-conditioned objective function and reformulate it into a nonconvex optimization using the Burer-Monteiro type parameterization. We analyze the local convergence of applying vanilla gradient descent to the factored formulation and establish a local regularity condition under mild assumptions. We also provide a linear convergence analysis of the gradient descent algorithm started in a neighborhood of the true tensor factors. Complementary to the local analysis, this work also characterizes the global geometry of the best rank-one tensor approximation problem and demonstrates that for orthogonally decomposable tensors the problem has no spurious local minima and all saddle points are strict except for the one at zero which is a third-order saddle point.

【137】 Persistent Homology for Breast Tumor Classification using Mammogram Scans
标题:使用乳腺X线扫描实现乳腺肿瘤分类的持久同源性

链接:https://arxiv.org/abs/2201.02295
作者:Aras Asaad,Dashti Ali,Taban Majeed,Rasber Rashid
备注:10 pages
摘要:An important tool in the field of topological data analysis is persistent homology (PH), which is used to encode an abstract representation of the homology of data at different resolutions in the form of a persistence diagram (PD). In this work we build more than one PD representation of a single image based on a landmark selection method, known as local binary patterns, that encodes different types of local textures from images. We employed different PD vectorizations using persistence landscapes, persistence images, persistence binning (Betti Curve) and statistics. We tested the effectiveness of the proposed landmark-based PH on two publicly available breast abnormality detection datasets using mammogram scans. The sensitivity of landmark-based PH is over 90% in both datasets for the detection of abnormal breast scans. Finally, experimental results give new insights on using different types of PD vectorizations, which help in utilising PH in conjunction with machine learning classifiers.

【138】 Strategic Storage Investment in Electricity Markets

Link: https://arxiv.org/abs/2201.02290
Authors: Dongwei Zhao, Mehdi Jafari, Audun Botterud, Apurba Sakti
Abstract: Arbitrage is an important revenue source for energy storage in electricity markets. However, a large amount of storage in the market will impact the energy price and reduce potential revenues. This can lead to strategic behavior by profit-seeking storage investors. To study the investors' strategic storage investments, we formulate a non-cooperative game between competing investors. Each investor decides the storage investment over a long investment horizon, and operates the storage for arbitrage revenues in the daily electricity market. Different investors can deploy storage with different characteristics. Their decisions are coupled due to the market price that is determined by all the investors' decisions. We use market data from California ISO to characterize the storage impact on the market price, based on which we establish a centralized optimization problem to compute the market equilibrium. We show that an increasing number of investors will increase the market competition, which reduces investors' profits but increases the total invested storage capacity. Furthermore, we find that a slight increase in the storage efficiency (e.g., increased charge and discharge efficiency) can significantly improve an investor's profit share in the market.

【139】 GCWSNet: Generalized Consistent Weighted Sampling for Scalable and Accurate Training of Neural Networks
Chinese title: GCWSNet:可扩展精确训练神经网络的广义一致加权抽样

Link: https://arxiv.org/abs/2201.02283
Authors: Ping Li, Weijie Zhao
Abstract: We develop the "generalized consistent weighted sampling" (GCWS) for hashing the "powered-GMM" (pGMM) kernel (with a tuning parameter $p$). It turns out that GCWS provides a numerically stable scheme for applying power transformation on the original data, regardless of the magnitude of $p$ and the data. The power transformation is often effective for boosting performance, in many cases considerably so. We feed the hashed data to neural networks on a variety of public classification datasets and name our method "GCWSNet". Our extensive experiments show that GCWSNet often improves the classification accuracy. Furthermore, it is evident from the experiments that GCWSNet converges substantially faster. In fact, GCWS often reaches a reasonable accuracy with merely (less than) one epoch of the training process. This property is much desired because many applications, such as advertisement click-through rate (CTR) prediction models, or data streams (i.e., data seen only once), often train for just one epoch. Another beneficial side effect is that the computations of the first layer of the neural networks become additions instead of multiplications because the input data become binary (and highly sparse). Empirical comparisons with (normalized) random Fourier features (NRFF) are provided. We also propose to reduce the model size of GCWSNet by count-sketch and develop the theory for analyzing the impact of using count-sketch on the accuracy of GCWS. Our analysis shows that an "8-bit" strategy should work well in that we can always apply an 8-bit count-sketch hashing on the output of GCWS hashing without hurting the accuracy much. There are many other ways to take advantage of GCWS when training deep neural networks. For example, one can apply GCWS on the outputs of the last layer to boost the accuracy of trained deep neural networks.
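
The abstract does not spell out the pGMM kernel that GCWS hashes, so the sketch below relies on the definition commonly given in the GCWS literature, taken here as an assumption: each real vector is first split into its positive and negative parts, and the kernel is the ratio of the summed p-th powers of coordinate-wise minima to the summed p-th powers of coordinate-wise maxima. It computes the exact kernel only and does not implement the hashing scheme; the helper names are hypothetical.

```python
def split_signs(u):
    """Map a real vector to a nonnegative vector of twice the length.

    Each entry x becomes the pair (x, 0) if x > 0, and (0, |x|) otherwise.
    """
    out = []
    for x in u:
        out.extend((x, 0.0) if x > 0 else (0.0, abs(x)))
    return out


def pgmm_kernel(u, v, p=1.0):
    """Powered generalized min-max kernel with tuning parameter p (assumed form)."""
    ut, vt = split_signs(u), split_signs(v)
    numerator = sum(min(a, b) ** p for a, b in zip(ut, vt))
    denominator = sum(max(a, b) ** p for a, b in zip(ut, vt))
    return numerator / denominator if denominator > 0 else 0.0


u = [1.0, -2.0]
v = [2.0, -1.0]
print(pgmm_kernel(u, v, p=1.0))  # 0.5
print(pgmm_kernel(u, v, p=2.0))  # 0.25
```

Under this definition, a larger p penalizes coordinate-wise disagreement more strongly, which is one way to read the role of the tuning parameter.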

【140】 Well-Conditioned Linear Minimum Mean Square Error Estimation
Chinese title: 良态线性最小均方误差估计

Link: https://arxiv.org/abs/2201.02275
Authors: Edwin K. P. Chong
Abstract: Computing linear minimum mean square error (LMMSE) filters is often ill conditioned, suggesting that unconstrained minimization of the mean square error is an inadequate principle for filter design. To address this, we first develop a unifying framework for studying constrained LMMSE estimation problems. Using this framework, we expose an important structural property of all constrained LMMSE filters and show that they all involve an inherent preconditioning step. This parameterizes all such filters only by their preconditioners. Moreover, each filter is invariant to invertible linear transformations of its preconditioner. We then clarify that merely constraining the rank of the filters, leading to the well-known low-rank Wiener filter, does not suitably address the problem of ill conditioning. Instead, we use a constraint that explicitly requires solutions to be well conditioned in a certain specific sense. We introduce two well-conditioned estimators and evaluate their mean-squared-error performance. We show that these two estimators converge to the standard LMMSE filter as their truncated-power ratio converges to zero, but more slowly than the low-rank Wiener filter in terms of scaling law. This exposes the price for being well conditioned. We also show quantitative results with historical VIX data to illustrate the performance of our two well-conditioned estimators.
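
To make the ill-conditioning concrete, here is a minimal sketch of the standard unconstrained LMMSE filter W = C_xy C_yy^{-1} for zero-mean variables. This is the textbook formula, not the paper's constrained estimators. The toy setup is an assumption: a unit-variance scalar signal observed twice with independent noise of variance 0.01, so the two observations are nearly collinear.

```python
import math


def sym2_eigenvalues(a, b, d):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, d]], largest first."""
    mean = 0.5 * (a + d)
    radius = math.hypot(0.5 * (a - d), b)
    return mean + radius, mean - radius


def lmmse_weights(c_xy, c_yy):
    """Row vector W = c_xy @ inv(c_yy) for a scalar signal and two observations."""
    (a, b), (c, d) = c_yy
    det = a * d - b * c
    inv = [[d / det, -b / det], [-c / det, a / det]]
    return [c_xy[0] * inv[0][j] + c_xy[1] * inv[1][j] for j in range(2)]


noise_var = 0.01                       # hypothetical noise level
c_yy = [[1.0 + noise_var, 1.0],        # covariance of the two observations
        [1.0, 1.0 + noise_var]]
c_xy = [1.0, 1.0]                      # cross-covariance of signal and observations

largest, smallest = sym2_eigenvalues(c_yy[0][0], c_yy[0][1], c_yy[1][1])
condition_number = largest / smallest
weights = lmmse_weights(c_xy, c_yy)
print(condition_number, weights)
```

In this toy case the condition number of the observation covariance is (2 + noise_var) / noise_var, so it grows without bound as the noise variance shrinks, even though the filter weights themselves stay near 0.5.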

【141】 PWM2Vec: An Efficient Embedding Approach for Viral Host Specification from Coronavirus Spike Sequences
Chinese title: PWM2Vec:一种有效的从冠状病毒棘突序列中嵌入病毒宿主的方法

Link: https://arxiv.org/abs/2201.02273
Authors: Sarwan Ali, Babatunde Bello, Prakash Chourasia, Ria Thazhe Punathil, Yijing Zhou, Murray Patterson
Abstract: The origin of the COVID-19 pandemic is still unknown and is an important open question. There are speculations that bats are a possible origin. Likewise, there are many closely related (corona-) viruses, such as SARS, which was found to be transmitted through civets. The study of the different hosts which can be potential carriers and transmitters of deadly viruses to humans is crucial to understanding, mitigating and preventing current and future pandemics. In coronaviruses, the surface (S) protein, or spike protein, is an important part of determining host specificity since it is the point of contact between the virus and the host cell membrane. In this paper, we classify the hosts of over five thousand coronaviruses from their spike protein sequences, segregating them into clusters of distinct hosts among avians, bats, camels, swine, humans and weasels, to name a few. We propose a feature embedding based on the well-known position-weight matrix (PWM), which we call PWM2Vec, and use it to generate feature vectors from the spike protein sequences of these coronaviruses. While our embedding is inspired by the success of PWMs in biological applications such as determining protein function, or identifying transcription factor binding sites, we are the first (to the best of our knowledge) to use PWMs in the context of host classification from viral sequences to generate a fixed-length feature vector representation. The results on the real-world data show that, using PWM2Vec, we are able to perform comparably to baseline models. We also measure the importance of different amino acids using information gain to show the amino acids which are important for predicting the host of a given coronavirus.

【142】 Surveying 5G Techno-Economic Research to Inform the Evaluation of 6G Wireless Technologies
Chinese title: 调查5G技术-经济研究为6G无线技术评估提供信息

Link: https://arxiv.org/abs/2201.02272
Authors: Edward J. Oughton, William Lehr
Abstract: Techno-economic assessment is a fundamental technique engineers use for evaluating new communications technologies. However, despite the techno-economics of the fifth cellular generation (5G) being an active research area, it is surprising that there are few comprehensive evaluations of this growing literature. With mobile network operators deploying 5G across their networks, it is therefore an opportune time to appraise current accomplishments and review the state of the art. Such insight can inform the flurry of 6G research papers currently underway and help engineers in their mission to provide affordable high-capacity, low-latency broadband connectivity, globally. The survey discusses emerging trends from the 5G techno-economic literature and makes six key recommendations for the design and standardization of Next Generation 6G wireless technologies.

【143】 A Keypoint Detection and Description Network Based on the Vessel Structure for Multi-Modal Retinal Image Registration
Chinese title: 一种基于血管结构的多模态视网膜图像配准关键点检测与描述网络

Link: https://arxiv.org/abs/2201.02242
Authors: Aline Sindel, Bettina Hohberger, Sebastian Fassihi Dehcordi, Christian Mardin, Robert Lämmer, Andreas Maier, Vincent Christlein
Comments: 6 pages, 4 figures, 1 table, accepted to BVM 2022
Abstract: Ophthalmological imaging utilizes different imaging systems, such as color fundus, infrared, fluorescein angiography, optical coherence tomography (OCT), or OCT angiography. Multiple images with different modalities or acquisition times are often analyzed for the diagnosis of retinal diseases. Automatically aligning the vessel structures in the images by means of multi-modal registration can support ophthalmologists in their work. Our method uses a convolutional neural network to extract features of the vessel structure in multi-modal retinal images. We jointly train a keypoint detection and description network on small patches using a classification and a cross-modal descriptor loss function and apply the network to the full image size in the test phase. Our method demonstrates the best registration performance on our own dataset and a public multi-modal dataset in comparison to competing methods.

【144】 Comprehensive RF Dataset Collection and Release: A Deep Learning-Based Device Fingerprinting Use Case
Chinese title: 全面的射频数据集收集和发布:基于深度学习的设备指纹识别使用案例

Link: https://arxiv.org/abs/2201.02213
Authors: Abdurrahman Elmaghbub, Bechir Hamdaoui
Comments: This paper was presented at IEEE GLOBECOM Workshop 2021
Abstract: Deep learning-based RF fingerprinting has recently been recognized as a potential solution for enabling newly emerging wireless network applications, such as spectrum access policy enforcement, automated network device authentication, and unauthorized network access monitoring and control. Real, comprehensive RF datasets are now needed more than ever to enable the study, assessment, and validation of newly developed RF fingerprinting approaches. In this paper, we present and release a large-scale RF fingerprinting dataset, collected from 25 different LoRa-enabled IoT transmitting devices using USRP B210 receivers. Our dataset consists of a large number of SigMF-compliant binary files representing the I/Q time-domain samples and their corresponding FFT-based files of LoRa transmissions. This dataset provides a comprehensive set of essential experimental scenarios, considering both indoor and outdoor environments and various network deployments and configurations, such as the distance between the transmitters and the receiver, the configuration of the considered LoRa modulation, the physical location of the conducted experiment, and the receiver hardware used for training and testing the neural network models.

【145】 3D Intracranial Aneurysm Classification and Segmentation via Unsupervised Dual-branch Learning
Chinese title: 基于无监督双分支学习的三维颅内动脉瘤分类与分割

Link: https://arxiv.org/abs/2201.02198
Authors: Di Shao, Xuequan Lu, Xiao Liu
Comments: submitted for review (contact: xuequan.lu@deakin.edu.au)
Abstract: Intracranial aneurysms are common nowadays, and how to detect them intelligently is of great significance in digital health. While most existing deep learning research has focused on medical images in a supervised way, we introduce an unsupervised method for the detection of intracranial aneurysms based on 3D point cloud data. In particular, our method consists of two stages: unsupervised pre-training and downstream tasks. As for the former, the main idea is to pair each point cloud with its jittered counterpart and maximise their correspondence. Then we design a dual-branch contrastive network with an encoder for each branch and a subsequent common projection head. As for the latter, we design simple networks for supervised classification and segmentation training. Experiments on the public dataset (IntrA) show that our unsupervised method achieves comparable or even better performance than some state-of-the-art supervised techniques, and it is most prominent in the detection of aneurysmal vessels. Experiments on ModelNet40 also show that our method achieves an accuracy of 90.79%, which outperforms existing state-of-the-art unsupervised models.

【146】 Inferring Turbulent Parameters via Machine Learning
Chinese title: 基于机器学习的湍流参数推断

Link: https://arxiv.org/abs/2201.00732
Authors: Michele Buzzicotti, Fabio Bonaccorso, Luca Biferale
Abstract: We design a machine learning technique to solve the general problem of inferring physical parameters from the observation of turbulent flows, a relevant exercise in many theoretical and applied fields, from engineering to earth observation and astrophysics. Our approach is to train the machine learning system to regress the rotation frequency of the flow's reference frame from the observation of the flow's velocity amplitude on a 2D plane extracted from the 3D domain. The machine learning approach consists of a Deep Convolutional Neural Network (DCNN) of the same kind developed in computer vision. The training and validation datasets are produced by means of fully resolved direct numerical simulations. This study shows interesting results from two different points of view. First, from the machine learning point of view, it shows the potential of DCNN, reaching good results on such a particularly complex problem that goes well outside the limits of human vision. Second, from the physics point of view, it provides an example of how machine learning can be exploited in data analysis to infer information that would be inaccessible otherwise. Indeed, by comparing DCNN with the other possible Bayesian approaches, we find that DCNN yields a much higher inference accuracy in all the examined cases.

Machine translation; for reference only.
