Check out our Software
A holistic learning framework for Novel Class Discovery (NCD), which adopts contrastive learning to learn discriminative features from both the labeled and unlabeled data. The Neighborhood Contrastive Learning (NCL) framework effectively leverages the local neighborhood in the embedding space, enabling the model to draw knowledge from more positive samples and thus improve the clustering accuracy. In addition, we also introduce Hard Negative Generation (HNG), which leverages the labeled samples to produce informative hard negative samples and brings a further advantage to NCL.
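A minimal sketch of the neighborhood-positives idea (not the released implementation): for each anchor, its k nearest neighbours in the embedding space are treated as extra positives in an InfoNCE-style loss. The function name, k, and the temperature value are illustrative.

```python
import torch
import torch.nn.functional as F

def neighborhood_contrastive_loss(z, z_aug, k=5, temperature=0.1):
    """z, z_aug: L2-normalised embeddings of two views, shape (N, D)."""
    n = z.size(0)
    cross = z @ z_aug.t() / temperature          # cross-view similarities
    intra = z @ z.t() / temperature              # similarities within the first view
    intra.fill_diagonal_(float('-inf'))          # exclude the anchor itself
    nn_idx = intra.topk(k, dim=1).indices        # k nearest neighbours act as pseudo-positives

    logits = torch.cat([cross, intra], dim=1)    # candidates: other view + same view
    log_prob = F.log_softmax(logits, dim=1)

    pos = torch.zeros_like(logits, dtype=torch.bool)
    pos[torch.arange(n), torch.arange(n)] = True                         # the augmented view
    pos.scatter_(1, nn_idx + n, torch.ones_like(nn_idx, dtype=torch.bool))  # the k neighbours
    return -log_prob[pos].view(n, k + 1).mean()
```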
Keywords
PyTorch implementation of a Geometry-Contrastive Transformer for Generalized 3D Pose Transfer. The novel GC-Transformer can freely conduct robust pose transfer on large meshes at no extra cost, which could be a boost for Transformers in 3D fields.
Keywords
A novel unsupervised domain adaptation approach for action recognition from videos, inspired by recent literature on contrastive learning. It comprises a novel two-headed deep architecture that simultaneously adopts cross-entropy and contrastive losses from different network branches to robustly learn a target classifier.
Keywords
PyTorch implementation of AniFormer, a novel Transformer-based architecture, that generates animated 3D sequences by directly taking the raw driving sequences and arbitrary same-type target meshes as inputs. The Transformer architecture is customised for 3D animation that generates mesh sequences by integrating styles from target meshes and motions from the driving meshes. Besides, instead of the conventional single regression head in the vanilla Transformer, AniFormer generates multiple frames as outputs to preserve the sequential consistency of the generated meshes. This is achieved by a pair of regression constraints, i.e., motion and appearance constraints, that can provide strong regularization on the generated mesh sequences.
Keywords
PyTorch implementation of Intrinsic-Extrinsic Preserved Generative Adversarial Network (IEP-GAN) for both intrinsic (i.e., shape) and extrinsic (i.e., pose) information preservation. Extrinsically, a co-occurrence discriminator is used to capture the structural/pose invariance from distinct Laplacians of the mesh. Intrinsically, a local intrinsic-preserved loss is introduced to preserve the geodesic priors while avoiding heavy computations. IEP-GAN can be used to manipulate 3D human meshes in various ways, including pose transfer, identity swapping and pose interpolation with latent code vector arithmetic. The extensive experiments on various 3D datasets of humans, animals and hands demonstrate the generality of this approach.
Keywords
Code for Word-Class Embeddings (WCEs), a form of supervised embeddings especially suited for multiclass text classification. WCEs are meant to be used as extensions (i.e., by concatenation) to pre-trained embeddings (e.g., GloVe or word2vec) in order to improve the performance of neural classifiers.
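A hedged sketch of the underlying idea, simplified with respect to the released code: each word receives a vector of class-conditional statistics estimated from the labeled training set, which is then concatenated to its pre-trained vector. The matrices X, Y and the normalisation used here are illustrative.

```python
import numpy as np

def word_class_embeddings(X, Y):
    """X: (n_docs, n_words) binary term-document matrix.
       Y: (n_docs, n_classes) binary label matrix.
       Returns a (n_words, n_classes) matrix of class-conditional word statistics."""
    counts = X.T @ Y                                                   # word/class co-occurrences
    wce = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)    # row-normalise
    return wce

# Usage: extend pre-trained vectors (e.g., GloVe) with the supervised dimensions.
# `pretrained` is a (n_words, d) matrix aligned with the same vocabulary.
# extended = np.concatenate([pretrained, word_class_embeddings(X, Y)], axis=1)
```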
Keywords
Graph Neural Networks (GNNs) have seen a dramatic increase in popularity thanks to their ability to understand relations between graph nodes. This library aims to provide GNN capabilities to native Java applications, for example, to perform machine learning on Android. It does so by avoiding C-based machine learning libraries, such as TensorFlow Lite, which are often designed with pure performance in mind, frequently require specific hardware such as GPUs, and drastically increase the size of deployed applications.
Keywords
A package for implementing and simulating decentralized Graph Neural Network algorithms for the classification of peer-to-peer nodes.
Keywords
A framework for easy experimentation with Graph Neural Network (GNN) architectures by separating them from predictive components.
Keywords
Python implementation of mini-batch trimming, a novel strategy for improving the generalization capability of a trained network model. It is easy to implement, can be added to an existing training pipeline, and is independent of the employed model and optimizer.
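A minimal sketch of how such a strategy can be dropped into a standard training step, assuming the variant that restricts the gradient update to the hardest fraction of each mini-batch; keep_ratio and the loss function are illustrative.

```python
import torch
import torch.nn.functional as F

def trimmed_step(model, optimizer, x, y, keep_ratio=0.75):
    """One training step that backpropagates only the hardest samples of the batch."""
    optimizer.zero_grad()
    losses = F.cross_entropy(model(x), y, reduction='none')   # per-sample losses
    k = max(1, int(keep_ratio * losses.numel()))
    hardest = losses.topk(k).values                            # keep the highest-loss samples
    hardest.mean().backward()
    optimizer.step()
```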
Keywords
A wrapper exposing several state-of-the-art adaptive-gradient optimizers (Adam/AdamW/EAdam/AdaBelief/AdaMomentum/AdaFamily), including our novel 'AdaFamily' optimizer, via one API.
Keywords
The ability of artificial agents to increment their capabilities when confronted with new data is an open challenge in artificial intelligence. The main challenge faced in such cases is catastrophic forgetting, i.e., the tendency of neural networks to underfit past data when new data are ingested. The repository includes implementations of several incremental learning techniques, including among others LUCIR, iCaRL, BiC, LwF, REMIND, Deep-SLDA, ScaIL, IL2M, DeeSIL, FT, and SIW.
Keywords
CNN-based algorithm for traffic density estimation and counting that can generalize to new data sources for which there are no annotations available. This generalization is achieved by exploiting an Unsupervised Domain Adaptation strategy, whereby a discriminator attached to the output forces similar density distribution in the target and source domains.
Keywords
QuaPy is an open-source framework for quantification (a.k.a. supervised prevalence estimation, or learning to quantify) written in Python. QuaPy provides implementations of the most important aspects of the quantification workflow, such as (baseline and advanced) quantification methods, quantification-oriented model selection mechanisms, evaluation measures, and evaluation protocols used for evaluating quantification methods. QuaPy also makes available commonly used datasets, and offers visualization tools for facilitating the analysis and interpretation of the experimental results. QuaPy is accompanied by rich API documentation and a wiki guide. The software is open-source, and distributed under the BSD-3 license; it is available on GitHub and can be installed via pip.
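A typical workflow, sketched from our reading of the QuaPy documentation (the dataset name and quantification method are only examples; consult the API documentation for exact signatures):

```python
import quapy as qp
from quapy.method.aggregative import ACC
from sklearn.linear_model import LogisticRegression

# load a built-in sentiment dataset with tf-idf features
dataset = qp.datasets.fetch_reviews('kindle', tfidf=True, min_df=5)

# Adjusted Classify & Count quantifier wrapping a standard classifier
model = ACC(LogisticRegression())
model.fit(dataset.training)

estim_prevalence = model.quantify(dataset.test.instances)
true_prevalence = dataset.test.prevalence()
print(qp.error.mae(true_prevalence, estim_prevalence))   # mean absolute error
```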
Keywords
ql4facct is software for replicating experiments concerning the evaluation of estimators of classifier "fairness". This repository makes available baseline systems used in the literature, along with our proposed framework based on quantification. The experiments implemented in this software show, through four different experimental protocols and with the aid of visualization tools, that estimating classifier fairness via quantification yields a clear advantage with respect to the previous state of the art.
Keywords
Novel fixed classifier for incremental learning in which a number of pre-allocated output nodes are subject to the classification loss right from the beginning of the learning phase. Contrary to the standard expanding classifier, this allows: (a) the output nodes of future, as-yet-unseen classes to be exposed to negative samples from the beginning of learning, together with the positive samples that arrive incrementally; (b) features to be learned that do not change their geometric configuration as novel classes are incorporated into the learning model.
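An illustrative sketch (not the released code) of a classifier with pre-allocated output nodes: the final layer is sized for all classes that will ever be seen, so nodes of future classes receive negative evidence from the start. Names and sizes are placeholders.

```python
import torch.nn as nn

class PreAllocatedClassifier(nn.Module):
    def __init__(self, backbone, feat_dim, total_classes):
        super().__init__()
        self.backbone = backbone                      # any feature extractor
        self.fc = nn.Linear(feat_dim, total_classes)  # pre-allocated output nodes

    def forward(self, x):
        # A plain cross-entropy over all pre-allocated nodes already exposes the
        # unused (future-class) nodes to negative samples from the very beginning.
        return self.fc(self.backbone(x))
```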
Keywords
Discovery of non-linear interpretable paths in GAN latent space in an unsupervised and model-agnostic manner. Non-linear paths are modeled using RBF-based warping functions, optimized in order to be distinguishable from each other. This leads to paths that correspond to an interpretable generation where only a small number of generative factors are affected for each path. A quantitative evaluation protocol for the case of face-generating GANs is also implemented, which can be used to automatically associate the discovered paths with interpretable attributes such as smiling and rotation.
Keywords
A method that offers an intuitive way to find different types of interpretable transformations in a pre-trained GAN. We achieve this by decomposing the generator’s activations in a multilinear manner and regressing back to the latent space.
Keywords
PyTorch implementation of Multi-target Graph Domain Adaptation framework. The framework is pivoted around two key concepts: graph feature aggregation and curriculum learning.
Keywords
PyTorch implementation of the Memory-based Multi-Source MetaLearning (M^3L) framework for multi-source domain generalization (DG) in person ReID. The proposed meta-learning strategy enables the model to simulate the train-test process of DG during training, which can efficiently improve the generalization ability of the model on unseen domains. A memory-based module and MetaBN are also introduced to take full advantage of meta-learning and obtain further improvement.
Keywords
Python code for Generalised Funnelling. Funnelling is a new ensemble method for heterogeneous transfer learning that can be applied to cross-lingual text classification. Funnelling consists of generating a two-tier classification system where all documents, irrespective of language, are classified by the same (second-tier) classifier. For this classifier, all documents are represented in a common, language-independent feature space consisting of the posterior probabilities generated by first-tier, language-dependent classifiers. This allows the classification of all test documents, of any language, to benefit from the information present in all training documents, of any language.
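A schematic sketch of the two-tier idea, simplified with respect to the released code (which, for instance, obtains first-tier posteriors via cross-validation); the scikit-learn components used here are only an example.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.calibration import CalibratedClassifierCV

def train_funnelling(X_by_lang, y_by_lang):
    """X_by_lang: dict lang -> feature matrix; y_by_lang: dict lang -> labels."""
    first_tier, Z, y_all = {}, [], []
    for lang, X in X_by_lang.items():
        clf = CalibratedClassifierCV(SVC())        # first tier: calibrated posteriors
        clf.fit(X, y_by_lang[lang])
        first_tier[lang] = clf
        Z.append(clf.predict_proba(X))             # language-independent representation
        y_all.append(y_by_lang[lang])
    second_tier = SVC().fit(np.vstack(Z), np.concatenate(y_all))
    return first_tier, second_tier

def predict(first_tier, second_tier, lang, X):
    # any language is mapped into the shared posterior space before classification
    return second_tier.predict(first_tier[lang].predict_proba(X))
```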
Keywords
A library of self-supervised methods for unsupervised visual representation learning powered by PyTorch Lightning. It aims at providing state-of-the-art self-supervised methods in a comparable environment while also implementing training tricks. While the library is self-contained, it is possible to use the models outside of solo-learn.
Keywords
Adversarial Robustness Toolbox (ART) is a Python library for Machine Learning Security. ART provides tools that enable developers and researchers to defend and evaluate Machine Learning models and applications against the adversarial threats of Evasion, Poisoning, Extraction, and Inference. ART supports all popular machine learning frameworks (TensorFlow, Keras, PyTorch, MXNet, scikit-learn, XGBoost, LightGBM, CatBoost, GPy, etc.), all data types (images, tables, audio, video, etc.) and machine learning tasks (classification, object detection, speech recognition, generation, certification, etc.).
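A short example of using ART to craft an evasion attack against a toy PyTorch classifier; the model and data below are placeholders for a real trained model and test set.

```python
import numpy as np
import torch
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod

# toy model standing in for a real trained classifier
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))

classifier = PyTorchClassifier(
    model=model,
    loss=torch.nn.CrossEntropyLoss(),
    optimizer=torch.optim.Adam(model.parameters()),
    input_shape=(3, 32, 32),
    nb_classes=10,
)

x_test = np.random.rand(8, 3, 32, 32).astype(np.float32)   # placeholder test images
attack = FastGradientMethod(estimator=classifier, eps=0.03)
x_adv = attack.generate(x=x_test)                           # adversarial counterparts of x_test
```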
Keywords
Novel training-time attacks resulting in corrupted Deep Generative Models (DGMs) that synthesize regular data under normal operations and designated target outputs for inputs sampled from a trigger distribution. Depending on the control that the adversary has over the random number generation, this imposes various degrees of risk that harmful data may enter the machine learning development pipelines, potentially causing material or reputational damage to the victim organization. The attacks are based on adversarial loss functions that combine the dual objectives of attack stealth and fidelity. Their effectiveness is shown for a variety of DGM architectures such as StyleGANs and WaveGANs.
Keywords
Repository with the main tools for computing Regression Concept Vectors.
Keywords
Python implementation of ObjectGraphs, a new approach for video event recognition that exploits the relations among objects within each frame. More specifically, a graph, constructed using the appearance features of the objects, is exploited by the model to recognize the video event. Moreover, using the weighted in-degrees of the graph’s adjacency matrix, the model is able to provide insightful explanations for its decisions.
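A conceptual sketch of the explanation mechanism described above (not the released model): the frame graph is built from pairwise similarities of object features, and objects are ranked by their weighted in-degree. The softmax-based adjacency used here is an assumption.

```python
import torch

def object_importance(obj_feats):
    """obj_feats: (n_objects, d) appearance features of the detected objects.
       Returns object indices sorted from most to least influential."""
    adjacency = torch.softmax(obj_feats @ obj_feats.t(), dim=1)  # row-normalised similarities
    in_degree = adjacency.sum(dim=0)                             # weighted in-degrees
    return in_degree.argsort(descending=True)
```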
Keywords
Multi-task and Adversarial CNN Training: Learning Interpretable Pathology Features Improves CNN Generalization
Keywords
Privacy-preserving, architecture-agnostic GNN learning algorithm with formal privacy guarantees based on Local Differential Privacy (LDP). This includes a multidimensional ε-LDP algorithm that allows the server to privately collect node features and estimate the first-layer graph convolution of the GNN using the noisy features. Then, to further decrease the estimation error, we introduce KProp, a simple graph convolution layer that aggregates features from higher-order neighbors, which is prepended to the backbone GNN.
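A minimal sketch of the KProp idea under simplifying assumptions (mean aggregation, no self-loops): the noisy node features are aggregated over k hops so that the LDP noise averages out before the backbone GNN sees them.

```python
import torch

def kprop(x, edge_index, k=4):
    """x: (N, d) noisy node features; edge_index: (2, E) COO edge list."""
    n = x.size(0)
    row, col = edge_index
    deg = torch.zeros(n, dtype=x.dtype, device=x.device)
    deg.index_add_(0, row, torch.ones_like(row, dtype=x.dtype))   # node degrees
    deg = deg.clamp(min=1).unsqueeze(1)
    for _ in range(k):
        agg = torch.zeros_like(x).index_add_(0, row, x[col])      # sum over neighbours
        x = agg / deg                                             # mean aggregation per hop
    return x
```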
Keywords
Diffprivlib is a general-purpose library developed by IBM for experimenting with, investigating, and developing applications in, differential privacy:
- Experiment with differential privacy
- Explore the impact of differential privacy on machine learning accuracy using classification and clustering models
- Build your own differential privacy applications, using an extensive collection of mechanisms
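A quick usage example in the spirit of the diffprivlib documentation: a differentially private Gaussian naive Bayes used as a drop-in replacement for its scikit-learn counterpart (the epsilon value is arbitrary).

```python
from diffprivlib.models import GaussianNB
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(*load_iris(return_X_y=True))

clf = GaussianNB(epsilon=1.0)      # privacy budget epsilon
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))   # accuracy under differential privacy
```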
Keywords
Prototype of the AI4Media Evaluation as a Service platform. This platform is derived from the open-source Codalab EaaS platform and contains specific functions adapted for the AI4Media project, as well as an appropriate use-case scenario. This is a prototype version of the platform and will be updated as the project continues.
Keywords
Python implementation of the novel Cycle In Cycle Generative Adversarial Network (C2GAN) for the task of keypoint-guided image generation. C2GAN is a cross-modal framework exploring the joint exploitation of keypoint and image data in an interactive manner. C2GAN contains two different types of generators, i.e., a keypoint-oriented generator and an image-oriented generator. Both are mutually connected in an end-to-end learnable fashion and explicitly form three cycled sub-networks, i.e., one image generation cycle and two keypoint generation cycles. Each cycle not only aims at reconstructing the input domain, but also produces useful output involved in the generation of another cycle. In so doing, the cycles constrain each other implicitly, which provides complementary information from the two different modalities and brings extra supervision across cycles, thus facilitating more robust optimization of the whole network.
Keywords
Source code for the DVMS model and training procedure, as well as pre-trained network weights for reproducibility. This deep learning model allows for multiple trajectory predictions of head movements while experiencing 360° videos with a VR headset. The necessary libraries are bundled in a Docker image but can also be installed separately.
Keywords
Implementation of Fast SR-Net for fast video visual quality and resolution improvement. It comprises a GAN-based training procedure for obtaining a fast neural network that enables better bitrate performances with respect to the H.265 codec for the same quality, or better quality at the same bitrate.
Keywords
Novel framework for Playable Video Generation that is trained in a self-supervised manner on a large dataset of unlabelled videos. We employ an encoder-decoder architecture where the predicted action labels act as bottlenecks. The network is constrained to learn a rich action space using, as the main driving loss, a reconstruction loss on the generated video.
Keywords
This is a fork of Few-shot Object Detection (FsDet) (https://github.com/ucbdrive/few-shot-object-detection), adding an easy-to-use tool for training on custom datasets. We have extended the FsDet framework with a tool that dynamically generates datasets from annotation files and drives the training process. The tool has the following features:
- Determine the base and novel classes from the provided annotations (for the novel classes only a subset may be used for training).
- Determine how many instances are available, and set up the k-shot n-way problem accordingly.
- Prepare model structures for the novel-only and combined base+novel fine-tuning by adjusting the layer sizes to match the number of classes in the different sets.
- If the number of samples varies strongly, set up multiple training problems to make the best use of the data, and run multiple fine-tuning steps.
Keywords
VISIONE is a content-based retrieval system that supports various search functionalities (text search, object/color-based search, semantic and visual similarity search, temporal search). It uses a full-text search engine as a search backend.
Keywords
Novel Deep Micro-Dictionary Learning and Coding Network (DDLCN). DDLCN has most of the standard deep learning layers (pooling, fully connected, input/output, etc.), but the main difference is that the fundamental convolutional layers are replaced by novel compound dictionary learning and coding layers. The dictionary learning layer learns an over-complete dictionary for the input training data. At the deep coding layer, a locality constraint is added to guarantee that the activated dictionary bases are close to each other. Next, the activated dictionary atoms are assembled together and passed to the next compound dictionary learning and coding layers. In this way, the activated atoms in the first layer can be represented by the deeper atoms in the second dictionary. Intuitively, the second dictionary is designed to learn the fine-grained components which are shared among the input dictionary atoms. In this way, a more informative and discriminative low-level representation of the dictionary atoms can be obtained.
Keywords
A new loss function for self-supervised representation learning (SSL), based on whitening of the latent-space features. The whitening operation has a "scattering" effect on the batch samples, avoiding degenerate solutions where all the sample representations collapse to a single point. Our solution does not require asymmetric networks and is conceptually simple. Moreover, since negatives are not needed, we can extract multiple positive pairs from the same image instance.
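A conceptual sketch of the loss (a simplified variant, not the exact released implementation): the features of each view are whitened to zero mean and identity covariance, and positive pairs are then pulled together with a mean squared error.

```python
import torch
import torch.nn.functional as F

def whiten(z, eps=1e-4):
    """Whiten a batch of features: zero mean, (approximately) identity covariance."""
    z = z - z.mean(dim=0)
    cov = (z.t() @ z) / (z.size(0) - 1) + eps * torch.eye(z.size(1), device=z.device)
    w = torch.linalg.cholesky(torch.linalg.inv(cov))   # whitening matrix: w @ w.T = cov^-1
    return z @ w

def whitening_mse_loss(z1, z2):
    """z1, z2: (N, D) embeddings of two positive views of the same images."""
    v1, v2 = F.normalize(whiten(z1)), F.normalize(whiten(z2))
    return (v1 - v2).pow(2).sum(dim=1).mean()
```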
Keywords
A tool that allows Visual Transformers (VTs) to learn spatial relations within an image, making VT training much more robust when training data is scarce. The tool can be used jointly with standard (supervised) training and does not depend on specific architectural choices, so it can be easily plugged into existing VTs. Our method can improve (sometimes dramatically) the final accuracy of the VTs.
Keywords
As the backward algorithm of SVD is prone to numerical instability, we implement, in this repository, a variety of end-to-end SVD methods by manipulating their backward algorithms. They include:
- SVD-Padé: use Padé approximants to closely approximate the gradient.
- SVD-Taylor: use the Taylor polynomial to approximate the smooth gradient.
- SVD-PI: use Power Iteration (PI) to approximate the gradients.
- SVD-Newton: use the gradient of the Newton-Schulz iteration.
- SVD-Trunc: set an upper limit of the gradient and apply truncation.
- SVD-TopN: select the Top-N eigenvalues and abandon the rest.
- SVD-Original: ordinary SVD with gradient overflow check.
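As a didactic illustration (up to sign conventions, and not the repository's code): the instability stems from coefficients of the form 1/(s_i^2 - s_j^2) in the SVD backward pass, which explode when two singular values are close; SVD-Trunc style truncation simply caps them, as sketched below with an illustrative cap value.

```python
import torch

def truncated_coefficients(s, cap=1e3):
    """s: (n,) singular values; returns the off-diagonal coefficient matrix with entries capped."""
    s2 = s.pow(2)
    diff = s2.unsqueeze(1) - s2.unsqueeze(0)                 # s_i^2 - s_j^2
    safe = diff.abs() > 1.0 / cap                            # entries whose reciprocal stays bounded
    k = torch.where(safe, 1.0 / diff, cap * torch.sign(diff))
    k.fill_diagonal_(0.0)                                    # the diagonal is not used
    return k
```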
Keywords
Novel two-stage framework with a new Cascaded Cross MLP-Mixer (CrossMLP) sub-network in the first stage and one refined pixel-level loss in the second stage. In the first stage, the CrossMLP sub-network learns the latent transformation cues between image code and semantic map code via our novel CrossMLP blocks. Then, the coarse results are generated progressively under the guidance of those cues. Moreover, in the second stage, we use a refined pixel-level loss that eases the noisy semantic label problem with more reasonable regularization in a more compact fashion for better optimization.
Keywords
DeepFusion source code. This code implements a DNN-based late fusion approach that uses a custom number of inducers as inputs and outputs a new result according to late fusion schemes.
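An illustrative late-fusion head in this spirit (layer sizes and names are placeholders, not the DeepFusion architecture): the per-inducer score vectors are concatenated and mapped by a small DNN to the fused prediction.

```python
import torch.nn as nn

class LateFusionHead(nn.Module):
    def __init__(self, n_inducers, n_classes, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_inducers * n_classes, hidden), nn.ReLU(),
            nn.Linear(hidden, n_classes),
        )

    def forward(self, inducer_scores):
        # inducer_scores: (batch, n_inducers, n_classes) outputs of the individual inducers
        return self.net(inducer_scores.flatten(start_dim=1))
```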
Keywords
Social networks give free access to their services in exchange for the right to exploit their users' data. Data sharing is done in an initial context which is chosen by the users. However, data are used by social networks and third parties in different contexts which are often not transparent. In order to unveil such usages, we propose an approach that focuses on the effects of data sharing in impactful real-life situations. Focus is put on visual content because of its strong influence in shaping online user profiles. The approach relies on three components: (1) a set of visual objects with associated situation impact ratings obtained by crowdsourcing, (2) a corresponding set of object detectors for mining users' photos and (3) a ground truth dataset made of 500 visual user profiles which are manually rated per situation. These components are combined in LERVUP, a method which learns to rate visual user profiles in each situation. LERVUP exploits a new image descriptor which aggregates object ratings and object detections at the user level, and an attention mechanism which boosts highly-rated objects to prevent them from being overwhelmed by low-rated ones. Performance is evaluated per situation by measuring the correlation between the automatic ranking of profile ratings and a manual ground truth.
Keywords