[R] HyperTransformer: Model Generation for Supervised and Semi-Supervised Few-Shot Learning arxiv.org/abs/2201.04182
👍 6
👤 u/hardmaru
📅 Jan 14 2022
Differences between supervised, semi-supervised, unsupervised, and self-supervised learning

I wrote a summary explaining the difference in datasets for the four types of learning and the goals they are trying to achieve. Hope you enjoy!

https://taying-cheng.medium.com/supervised-semi-supervised-unsupervised-and-self-supervised-learning-7fa79aa9247c

👍 15
👤 u/Ok-Peanut-2681
📅 Nov 25 2021
What is Contrastive Learning? (Contrastive Learning/Semi-supervised lear... youtube.com/watch?v=cH8I8…
👍 15
👤 u/Ziinxx
📅 Dec 04 2021
What is Contrastive Learning? (Contrastive Learning/Semi-supervised Lear... youtube.com/watch?v=cH8I8…
👍 7
📅 Dec 04 2021
What is the point of pseudo-labeling for a semi-supervised learning task?

Semi-supervised learning is a mix between supervised and unsupervised learning; some of the data are labeled, while usually most of it is unlabeled. Or, the researcher has a budget with which to assign labels to a previously wholly-unlabeled dataset.

The pseudo-labeling training procedure, as I understand it, works as follows:

  1. Train your model on the labeled portion of the dataset.
  2. Predict the labels for the unlabeled portion of the dataset.
  3. Define some confidence cutoff, e.g. collect the predicted instances for which the Pr(predicted label) > 0.99.
  4. Take the labeled data, and the "pseudo-labeled" (highly-confidently-predicted) unlabeled data, and put them together in another training run.
  5. This is your official trained model.
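
The five steps above can be sketched in a few lines. This is only a toy illustration, not a reference implementation: the "model" is a hypothetical nearest-centroid classifier on 1-D points, the softmax over negative squared distances stands in for a real network's predicted probabilities, and the 0.99 cutoff is the one from the example in step 3.

```python
import math
import random

def fit_centroids(xs, ys):
    """'Train' the toy model: one centroid (mean) per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

def predict_proba(centroids, x):
    """Softmax over negative squared distances to each centroid."""
    scores = {label: -(x - c) ** 2 for label, c in centroids.items()}
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def pseudo_label(xs_lab, ys_lab, xs_unlab, threshold=0.99):
    centroids = fit_centroids(xs_lab, ys_lab)              # step 1
    kept_x, kept_y = [], []
    for x in xs_unlab:
        probs = predict_proba(centroids, x)                # step 2
        label, confidence = max(probs.items(), key=lambda kv: kv[1])
        if confidence > threshold:                         # step 3
            kept_x.append(x)
            kept_y.append(label)
    # step 4: retrain on labeled plus confidently pseudo-labeled points
    final = fit_centroids(xs_lab + kept_x, ys_lab + kept_y)
    return final, kept_x, kept_y                           # step 5

rng = random.Random(0)
xs_lab = [-2.0, -1.5, 1.5, 2.0]
ys_lab = [0, 0, 1, 1]
xs_unlab = ([rng.gauss(-2, 0.5) for _ in range(50)]
            + [rng.gauss(2, 0.5) for _ in range(50)])
model, kept_x, kept_y = pseudo_label(xs_lab, ys_lab, xs_unlab)
print(len(kept_x), "of", len(xs_unlab), "unlabeled points were pseudo-labeled")
```

Note that points near the class boundary never pass the cutoff, so in this sketch the retrained model only ever sees unlabeled points the first model was already sure about.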

A few things that confuse me:

  • In every description I see, it sounds like the loss function for step (4) does not distinguish between ground-truth and pseudo-labeled data. Doesn't this mean the model is just going to reinforce whatever it learned in step (1)? If performance isn't excellent, isn't this just reinforcing bad habits?
  • Why do we care about the validation loss on pseudo-labeled instances? Shouldn't the loss function in step (4) only be calculated for ground-truth labels?

Sorry if I'm misunderstanding, and thanks for any help you can provide!

👍 13
👤 u/eadala
📅 Oct 27 2021
Learning with Not Enough Data: Semi-Supervised Learning lilianweng.github.io/lil-…
👍 3
👤 u/qznc_bot2
📅 Dec 08 2021
"Learning with not Enough Data Part 1: Semi-Supervised Learning", Weng 2021 lilianweng.github.io/lil-…
👍 2
👤 u/gwern
📅 Dec 08 2021
What is Contrastive Learning? (Contrastive Learning/Semi-supervised Lear... youtube.com/watch?v=cH8I8…
👍 3
👤 u/Ziinxx
📅 Dec 04 2021
Bounding Box-Free Instance Segmentation Using Semi-Supervised Learning for Generating a City-Scale Vehicle Dataset by Osmar Luiz Ferreira de Carvalho et al. deepai.org/publication/bo…
👍 2
👤 u/deep_ai
📅 Nov 30 2021
[D] What does "x% supervision" really mean on semi-supervised learning?

I am not sure what the terms *1%, 5%, and 10% supervision* mean in semi-supervised learning. My intuition is that only that percentage of the data is labeled and the remainder is unlabeled. Can someone confirm whether that is correct?

Also, where does the test set come in? Say I have a fully labeled training dataset of 100K examples and want to test 10% supervision. Does the breakdown of the dataset look like the following?

- 10K (10% supervision)
- 80K (unlabeled)
- 10K test set

or what should be the right division of the dataset?
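
For comparison, here is a sketch of one common convention, which may not match every paper: hold out the test set first, then apply the supervision percentage to the remaining training pool only. The function name `split_for_supervision` and its parameters are made up for illustration. Under this convention the 100K example gives 9K labeled / 81K unlabeled / 10K test rather than 10K / 80K / 10K.

```python
import random

def split_for_supervision(n_total, test_fraction, supervision, seed=0):
    """Return (labeled, unlabeled, test) index lists.

    The test set is carved out first; `supervision` is then the fraction
    of the remaining training pool whose labels are kept.
    """
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)
    n_test = round(n_total * test_fraction)
    test, train = indices[:n_test], indices[n_test:]
    n_labeled = round(len(train) * supervision)
    labeled, unlabeled = train[:n_labeled], train[n_labeled:]
    return labeled, unlabeled, test

labeled, unlabeled, test = split_for_supervision(100_000, 0.10, 0.10)
print(len(labeled), len(unlabeled), len(test))
```

Either way, the test set stays fully labeled and is never used for training, labeled or unlabeled.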

👍 5
👤 u/sarmientoj24
📅 Oct 17 2021
For topic classification, what's the difference between supervised & semi-supervised learning?

In particular, say I'm using some transformer architecture like BERT, and am fine-tuning it for a downstream task like multiclass classification of documents into 1 of 5 topics. I have 10,000 labeled instances in my data, and 40,000 unlabeled instances. If I wanted to handle this in a supervised fashion, I would maybe:

  • Train the model on 8,000 of the labeled instances, and validate on 1,000 of them, with the remaining 1,000 set aside as test set.
  • Tinker with the model until performance on the validation set is as good as I can get it.
  • Cross-validate by re-constructing the 8,000-1,000 split (never touching the 1,000 test set).
  • Repeat the tinkering process until cross-validated average performance is as good as I can get it.
  • Test on the 1,000 test set.
  • If the performance is adequate, I "trust" the model and predict the topics for the remaining 40,000 instances.
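
The supervised workflow in these bullets can be sketched as a 9-fold cross-validation over the 9,000 non-test instances, which reproduces the 8,000/1,000 train/validation split on every fold. The helper `kfold_splits` is a hypothetical name and the shuffle seed is arbitrary; model training itself is left out.

```python
import random

def kfold_splits(indices, k):
    """Yield (train, validation) index lists, one pair per fold."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

indices = list(range(10_000))                 # the 10,000 labeled instances
random.Random(0).shuffle(indices)
test_idx, dev_idx = indices[:1_000], indices[1_000:]   # test set held out once
splits = list(kfold_splits(dev_idx, k=9))     # 9 folds of 1,000 each

for train_idx, val_idx in splits:
    # train on train_idx, validate on val_idx, then average the scores
    pass
```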

I am a bit confused how semi-supervised learning would differ in this sense. As I understand it, I'm still going through a training process. Would it be something like this?

  • Train on 8,000, validate on 1,000. Cross-validate & repeat tinkering as above.
  • Predict remaining 40,000 instances.
  • Retrain the model on these 49,000 instances split in some way for training & validation.
  • Assign extra penalty (greater loss) for mis-classifying the instances for which ground-truth labels exist.
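
The "extra penalty" idea in the last bullet is equivalent to down-weighting the pseudo-labeled rows relative to the ground-truth ones. A minimal sketch, assuming the model already outputs class probabilities; the function name and the 0.3 weight are arbitrary placeholders, not values from any particular paper:

```python
import math

def weighted_cross_entropy(probs, labels, is_ground_truth, pseudo_weight=0.3):
    """Weighted mean cross-entropy.

    Ground-truth rows get weight 1.0, pseudo-labeled rows get
    `pseudo_weight`, so mistakes on real labels cost more.
    """
    total, weight_sum = 0.0, 0.0
    for p, y, gt in zip(probs, labels, is_ground_truth):
        w = 1.0 if gt else pseudo_weight
        total += -w * math.log(p[y])
        weight_sum += w
    return total / weight_sum

probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]   # predicted class probabilities
labels = [0, 1, 0]                              # last label is a pseudo-label
is_ground_truth = [True, True, False]
print(weighted_cross_entropy(probs, labels, is_ground_truth))
```

Setting `pseudo_weight=0` recovers plain supervised training, and `pseudo_weight=1` treats pseudo-labels exactly like real ones.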

Sorry if that's completely wrong; just thinking out loud. Is semi-supervised learning something that would be appropriate in this setting? What would it look like?

👍 3
👤 u/eadala
📅 Oct 25 2021
Huawei releases the industry's largest 2D autonomous driving data set, focusing on semi/self-supervised learning min.news/en/tech/df8eefe3…
👍 5
👤 u/bladerskb
📅 Nov 02 2021
[2109.13226] BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition arxiv.org/abs/2109.13226
👍 3
👤 u/nshmyrev
📅 Sep 28 2021
[D] Questions on Semi-Supervised Learning on Object Detection using Video Footages

While there have been advances in semi-supervised learning (SSL), only a few of them address object detection. One example is Unbiased Teacher.

I would like to ask for some perspectives on SSL for object detection.

Semi-Supervised Learning in a nutshell

From what I know, SSL uses both labeled and unlabeled data to get better results than the labeled data alone would give. Of course, that is an oversimplification. Usually the unlabeled data far outnumber the labeled data, and most of the time there is plenty of unlabeled data around.

SSL using Video Footage

One of the most tedious parts of building a dataset for object detection (OD) is labelling. But if I have video footage that runs at, say, 30 FPS, I could label just a few frames, say a couple, in a one-second clip of about 30 frames that show the object with slight variations.

In theory, for a one-second clip, I can annotate n frames and then use the remaining 30 - n frames as unlabelled data, right? Could I? And should I?
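
Mechanically, the n versus 30 - n split is easy to set up. Here is a rough sketch, assuming evenly spaced annotation frames are acceptable; `split_clip` is a hypothetical helper, not part of any library.

```python
def split_clip(num_frames, num_annotated):
    """Pick evenly spaced frames to annotate; the rest stay unlabeled."""
    if not 0 < num_annotated <= num_frames:
        raise ValueError("num_annotated must be between 1 and num_frames")
    step = num_frames / num_annotated
    # take the middle frame of each of the num_annotated equal segments
    annotated = sorted({int(step * i + step / 2) for i in range(num_annotated)})
    unlabeled = [f for f in range(num_frames) if f not in annotated]
    return annotated, unlabeled

annotated, unlabeled = split_clip(num_frames=30, num_annotated=2)
print(annotated, len(unlabeled))
```

One caveat worth keeping in mind: frames from the same second are strongly correlated, so it is probably safer to keep all frames of a clip on the same side of any train/test split.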

SSL on Variations of Objects in the Video/Image

Our dataset will be collected with a moving video camera; it is road-survey footage. Hence consecutive frames can differ slightly to moderately because of occlusion, motion blur, shadows, changes in perspective, etc.

Questions:

  1. Have there been papers dealing with a similar problem? I know that SSL just assumes unlabeled data are available. But here the unlabeled frames would be much closer to their annotated counterparts at a given point in the video.
  2. Any advice or perspectives on approaching the problem? We could manually pick frames at some time t, annotate one or two of them, and take the frames from t-n to t+n as the unlabelled data.
  3. Which SSL algorithms would be preferable? Consistency regularization, proxy methods, etc.?
👍 5
👤 u/sarmientoj24
📅 Jul 08 2021
SemiBin: Incorporating information from reference genomes with semi-supervised deep learning leads to better metagenomic assembled genomes (MAGs) biorxiv.org/content/10.11…
👍 2
👤 u/sburgess86
📅 Aug 17 2021
Computer vision inches toward ‘common sense’ with Facebook’s latest research - DINO (Distilled knowledge with NO labels) semi-supervised learning. No labels--that's big... techcrunch.com/2021/04/30…
👍 35
👤 u/izumi3682
📅 Apr 30 2021
Memory-Efficient Semi-Supervised Continual Learning (IJCNN2021 oral) arxiv.org/abs/2101.09536
👍 5
👤 u/TheOverGrad
📅 May 01 2021
AdaMatch Explained - Bridging Semi-Supervised Learning and Domain Adaptation!

https://youtu.be/ORufPOY8H14

👍 2
👤 u/HenryAILabs
📅 Jun 17 2021
What's the current "go-to" for semi-supervised learning?

I'm working on a project at the moment with a relatively small labelled dataset (~10k labelled images), and a large pool of unlabelled images (1 million+). I was just hoping to get an idea about what the current "go-to" semi-supervised learning approaches are. The primary focus is classification (just using a CNN). Any insight is appreciated!

👍 3
👤 u/Yerren
📅 May 12 2021
How to do training for semi-supervised learning for classification task?

Hi,

I am struggling to understand the training part of this paper by Thomas Kipf [https://arxiv.org/pdf/1609.02907.pdf ]. The github repo is here [ https://github.com/tkipf/pygcn/blob/master/pygcn/train.py ].

What I do not understand is what is happening with the masking.

I input the whole dataset but use only a small labeled portion of it to train. Should I mask the rest of the data here?

What will be my test set then?
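
For what it's worth, in the paper's setting the forward pass covers every node in the graph, but the loss is only summed over the labeled training nodes. A rough pure-Python sketch of that idea follows; the function `masked_nll`, the boolean masks, and the toy numbers are all made up for illustration and are not taken from the linked repo. The test set is then just another subset of labeled nodes that never contributes to the training loss.

```python
import math

def masked_nll(log_probs, labels, mask):
    """Average negative log-likelihood over the nodes where mask is True."""
    picked = [-log_probs[i][labels[i]] for i in range(len(labels)) if mask[i]]
    return sum(picked) / len(picked)

# Toy "graph" with 6 nodes and 2 classes. The model produces an output
# for every node, whether or not its label is used for training.
log_probs = [[math.log(0.8), math.log(0.2)]] * 6
labels = [0, 1, 0, 0, 1, 1]

train_mask = [True, True, False, False, False, False]
val_mask = [False, False, True, False, False, False]
test_mask = [False, False, False, True, True, True]

train_loss = masked_nll(log_probs, labels, train_mask)  # backpropagate this one
test_loss = masked_nll(log_probs, labels, test_mask)    # only for reporting
print(train_loss, test_loss)
```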

Can someone who has worked on this before please guide me through?

👍 3
👤 u/popkept09
📅 May 25 2021
[D] How advanced is the current practice of Semi-Supervised Learning?

I'm currently working with a whack-ton of unlabeled data, and a small amount of labeled data. So I'd like to use semi-supervised learning, or at least unsupervised pre-training, to try and actually make use of the oodles of unlabeled data that I have. But I can't seem to find any SSL survey literature that doesn't seem... weirdly naive? I mean, compared to some of the crazy constructs I've seen in generative modeling for computer vision, most of what I've seen for SSL involves either the use of classical models, or just assuming that a model is right and using its own predictions as further training.

Am I just completely wrong about this? Does anybody have something more advanced, that might be more readily applicable to large scale computer vision tasks? I have some thoughts on first stabs, like training VAEs and GANs on the unlabeled data, and then breaking them apart and using the convolutional portions of the models as blocks in a ResNet, to try and "seed" the ResNet with good saliency estimators and domain understanding, but obviously I'd like to get up to speed with what's actually out there.

👍 3
👤 u/MrAcurite
📅 Dec 28 2020
"Large-Scale Self- and Semi-Supervised Learning for Speech Translation", Wang et al 2021 (wav2vec) arxiv.org/abs/2104.06678#…
👍 9
👤 u/gwern
📅 Apr 15 2021
VoxPopuli: A Large-Scale Multilingual Speech Corpus for Representation Learning, Semi-Supervised Learning and Interpretation. arxiv.org/abs/2101.00390
👍 8
👤 u/nshmyrev
📅 Jan 16 2021
[R] Google Brain Sets New Semi-Supervised Learning SOTA in Speech Recognition

A team of researchers from Google Brain has improved the SOTA on the LibriSpeech automatic speech recognition task, with their score of 1.4 percent/2.6 percent word-error-rates bettering the previous 1.7 percent/3.3 percent. The team’s novel approach leverages a combination of recent advancements in semi-supervised learning, using noisy student training with adaptive SpecAugment as the iterative self-training pipeline and giant Conformer models pretrained using the wav2vec 2.0 pretraining method.

Here is a quick read: Google Brain Sets New Semi-Supervised Learning SOTA in Speech Recognition

The paper Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition is on arXiv.

👍 58
👤 u/Yuqing7
📅 Oct 23 2020
[R] Parametric UMAP: learning embeddings with deep neural networks for representation and semi-supervised learning

https://preview.redd.it/5jgyz9q094q51.png?width=917&format=png&auto=webp&s=cd8b939db8df424081f9339bd07a7fe56f3b1038

https://preview.redd.it/13r1oh7394q51.png?width=919&format=png&auto=webp&s=08a6dd35c4774f23916019a352a51ba6fb25ece4

Abstract: We propose Parametric UMAP, a parametric variation of the UMAP (Uniform Manifold Approximation and Projection) algorithm. UMAP is a non-parametric graph-based dimensionality reduction algorithm using applied Riemannian geometry and algebraic topology to find low-dimensional embeddings of structured data. The UMAP algorithm consists of two steps: (1) Compute a graphical representation of a dataset (fuzzy simplicial complex), and (2) Through stochastic gradient descent, optimize a low-dimensional embedding of the graph. Here, we replace the second step of UMAP with a deep neural network that learns a parametric relationship between data and embedding. We demonstrate that our method performs similarly to its non-parametric counterpart while conferring the benefit of a learned parametric mapping (e.g. fast online embeddings for new data). We then show that UMAP loss can be extended to arbitrary deep learning applications, for example constraining the latent distribution of autoencoders, and improving classifier accuracy for semi-supervised learning by capturing structure in unlabeled data.

Paper: https://arxiv.org/abs/2009.12981

Code: https://github.com/timsainb/ParametricUMAP_paper/ & https://github.com/lmcinnes/umap/tree/0.5dev

👍 19
👤 u/timburg
📅 Sep 29 2020
[R] Learning imbalanced dataset? Semi-supervised & self-supervised learning helps!

Check out our recent work on tackling class imbalance. We show theoretically and empirically that both semi-supervised learning (using unlabeled data) and self-supervised pre-training (first pre-training the model with self-supervision) can substantially improve performance on imbalanced (long-tailed) datasets, regardless of the degree of imbalance in the labeled/unlabeled data and of the base training technique.

Using unlabeled data helps to shape clearer class boundaries and results in better class separation, especially for the tail classes.

Self-supervised pre-training (SSP) helps mitigate the tail classes leakage during testing, which results in better learned boundaries and representations.

👍 27
👤 u/yuzheyang
📅 Oct 03 2020
[R] ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning

Check out this new paper from the Winter Conference on Applications of Computer Vision (WACV 2021) that looks into a new data augmentation mechanism called ClassMix. [5-Minute Paper Video] [arXiv Link]

Abstract: The state of the art in semantic segmentation is steadily increasing in performance, resulting in more precise and reliable segmentations in many different applications. However, progress is limited by the cost of generating labels for training, which sometimes requires hours of manual labor for a single image. Because of this, semi-supervised methods have been applied to this task, with varying degrees of success. A key challenge is that common augmentations used in semi-supervised classification are less effective for semantic segmentation. We propose a novel data augmentation mechanism called ClassMix, which generates augmentations by mixing unlabelled samples, by leveraging on the network's predictions for respecting object boundaries. We evaluate this augmentation technique on two common semi-supervised semantic segmentation benchmarks, showing that it attains state-of-the-art results. Lastly, we also provide extensive ablation studies comparing different design decisions and training regimes.

Example of how ClassMix works

Authors: Viktor Olsson, Wilhelm Tranheden, Juliano Pinto, Lennart Svensson (Chalmers University of Technology)

👍 10
👤 u/m1900kang2
📅 Jan 27 2021
"FERM: A Framework for Efficient Robotic Manipulation", Zhan et al 2021 {BAIR} (contrastive semi-supervised learning + data augmentation for sample-efficiency) arxiv.org/abs/2012.07975
👍 7
👤 u/gwern
📅 Jan 20 2021
If semi-supervised learning works well, why not just use that as your model?

If you're trying to use semi-supervised models to predict an output to unlabelled data, why wouldn't you use that as your main model? What value does it add strictly as an intermediate step before using supervised learning?

Unless I’m mistaken and it IS NOT an intermediate step but is actually used as the final model?

👍 6
👤 u/anthologyxxviii
📅 Aug 08 2020
Conditional VAE and semi-supervised learning

Hi everybody, I'm trying to create a CVAE (actually, it is a code implementation of the paper "Deep Generative model").

I need to condition my encoder and decoder on the labels of the dataset (only a fraction of it is labeled), and I'm not sure how to design the network.

I plan on working with linear layers and not convolutional layers.

As I understand it, I can simply concatenate the one-hot encoded label to the image tensor, but is there a better design for my VAE?

My idea was to add a dimension, for example in MNIST: size=(batch_size, num_classes, image_vector) = (64,10,784)

The linear layer in torch can take this input without problem, but I'm not sure if I'm overcomplicating the implementation
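
For reference, the plain concatenation option looks roughly like this in terms of shapes. It is a pure-Python sketch without torch; the helper names and the latent size of 20 are arbitrary assumptions for illustration. With this layout the first encoder layer takes 784 + 10 = 794 inputs, and the first decoder layer takes latent size + 10.

```python
def one_hot(label, num_classes):
    """Return a one-hot list such as [0, 0, 0, 1, 0, ...] for the label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def condition(vector, label, num_classes=10):
    """Append the one-hot label to a flat vector."""
    return list(vector) + one_hot(label, num_classes)

image = [0.0] * 784      # one flattened 28x28 MNIST image
z = [0.0] * 20           # latent code; size 20 is an arbitrary assumption

encoder_input = condition(image, label=3)   # what the encoder would see
decoder_input = condition(z, label=3)       # what the decoder would see
print(len(encoder_input), len(decoder_input))
```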

Any help (pros and cons of the two options) or different design choices are welcome.

👍 6
👤 u/Nsccfq
📅 Dec 02 2020
"FERM: A Framework for Efficient Robotic Manipulation", Zhan et al 2021 {BAIR} (contrastive semi-supervised learning + data augmentation for sample-efficiency) arxiv.org/abs/2012.07975
👍 3
👤 u/research_mlbot
📅 Jan 21 2021
Supervised, Semi-Supervised, Unsupervised, and Self-Supervised Learning

When I first began learning machine learning, I had difficulty understanding what exactly supervised and unsupervised learning are. I wrote an article describing my understanding of them, with the addition of semi-supervised and self-supervised learning. Hope you will like it!

https://taying-cheng.medium.com/supervised-semi-supervised-unsupervised-and-self-supervised-learning-7fa79aa9247c

👍 12
👤 u/Ok-Peanut-2681
📅 Nov 25 2021
Supervised, Unsupervised, Semi-Supervised, and Self-Supervised Learning

Understanding the dataset and how you can approach a problem is an important thing to consider when you first start learning. I decided to write a short article discussing the different kinds of data used for supervised, unsupervised, semi-supervised, and self-supervised learning.

https://taying-cheng.medium.com/supervised-semi-supervised-unsupervised-and-self-supervised-learning-7fa79aa9247c

👍 4
👤 u/Ok-Peanut-2681
📅 Nov 25 2021