Hey everyone!
I'm trying to quantify how much the Urban Tree Canopy (UTC) has changed within parks for a local municipality. Using ArcGIS Pro and orthoimagery, I've run a supervised classification both through the Imagery Wizard and with the Classification tool. In both cases my Water and Forest classes are being confused, along with some slight mixing of the other classes.
I was curious if anyone had suggestions for how to minimize this? I've created 14 classes within my schema and tried to create 20-30 training samples for each class present in the imagery. Open to any suggestions for producing a less "muddled" land use classification (one band-index idea is sketched below). Thanks!
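One cheap pre-check that sometimes helps with water/forest confusion is a water index. This assumes the orthoimagery has a near-infrared band, which may not be the case; a minimal numpy sketch with placeholder arrays, not ArcGIS-specific code:

```python
import numpy as np

def ndwi(green, nir):
    # McFeeters NDWI: open water tends to be strongly positive,
    # vegetation strongly negative, so the two separate cleanly.
    return (green - nir) / (green + nir + 1e-10)

# Placeholder rasters; in practice these would be the green and NIR
# bands of the orthoimagery, read as float arrays.
green = np.random.rand(512, 512)
nir = np.random.rand(512, 512)

water_mask = ndwi(green, nir) > 0.2  # threshold is scene-dependent
```

Masking out confident water pixels before (or after) the supervised run is one way to keep them from bleeding into the Forest class; adding more and cleaner training samples for the confused pair is the other usual fix.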
Suppose there's a set of (multiclass) labeled documents, say 10 classes, and the team found intercoder reliability to be about 90%. I could account for this directly in PyTorch's nn.CrossEntropyLoss by setting label_smoothing=0.1 to give the model soft rather than hard targets.
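For reference, a minimal sketch of what that looks like (this assumes PyTorch 1.10+, where nn.CrossEntropyLoss gained the label_smoothing argument):

```python
import torch
import torch.nn as nn

# With label_smoothing=0.1, each target keeps 0.9 of the probability
# mass and the remaining 0.1 is spread uniformly over all 10 classes,
# i.e. soft rather than hard targets.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

logits = torch.randn(8, 10)           # batch of 8 documents, 10 classes
labels = torch.randint(0, 10, (8,))   # hard integer labels
loss = criterion(logits, labels)
```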
Beyond this though, what kind of performance should I interpret as "good enough"? Right now, with that stated intercoder reliability, my model is at approximately 89.8% accuracy / 90% F1 score. There are some duplicate text entries in the training data that are coded differently; conditioned on two documents having the same input text, they receive different labels roughly 5.3% of the time.
Should I just expect that 90% is about as good as it gets, since the labels themselves are slightly weaker than they would be in "pure" supervised learning? That label noise seems to put a ceiling on achievable accuracy.
In particular, say I'm using some transformer architecture like BERT, and am fine-tuning it for a downstream task like multiclass classification of documents into 1 of 5 topics. I have 10,000 labeled instances in my data and 40,000 unlabeled instances. If I wanted to handle this in a purely supervised fashion, I would presumably just fine-tune BERT on the 10,000 labeled instances and ignore the unlabeled ones.
I am a bit confused how semi-supervised learning would differ in this sense. As I understand it, I'm still going through a training process. Would it be something like the sketch below?
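Roughly, I picture a self-training / pseudo-labelling loop; a sketch of my mental model, where every name is a placeholder and the confidence threshold is arbitrary:

```python
import torch
import torch.nn.functional as F

def training_step(model, optimizer, labeled_batch, unlabeled_batch,
                  threshold=0.9):
    x_l, y_l = labeled_batch   # from the 10,000 labeled instances
    x_u = unlabeled_batch      # from the 40,000 unlabeled instances

    # Ordinary supervised loss on the labeled documents.
    loss = F.cross_entropy(model(x_l), y_l)

    # Pseudo-labels: take the model's own confident predictions on
    # unlabeled documents and train on them as if they were labels.
    with torch.no_grad():
        probs = F.softmax(model(x_u), dim=-1)
        confidence, pseudo_labels = probs.max(dim=-1)
    mask = confidence > threshold
    if mask.any():
        loss = loss + F.cross_entropy(model(x_u[mask]), pseudo_labels[mask])

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```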
Sorry if that's completely wrong; just thinking out loud. Is semi-supervised learning something that would be appropriate in this setting? What would it look like?
I have studied and implemented modern ConvNets for image classification tasks. Now I want to classify images using self-supervised learning methods, but I don't know anything about self-supervised learning. Where should I start, and what track should I follow to master self-supervised learning methods? Please recommend books, papers, and blogs that I can follow.
Every time I try to use the Signature Editor in ERDAS IMAGINE 16.6.0 (Build 2100), it stops working. Is anyone else having this issue?
I'm writing a paper on machine learning and I'd like to list a few examples of business applications for classification. So far I have sentiment analysis, document classification, and recommendations. Are there any more that would be good to include, or is that pretty much it?
Thanks in advance.
Hi,
I am going through the graph convolutional network paper on semi-supervised classification (https://arxiv.org/pdf/1609.02907.pdf).
The GitHub repository is here: https://github.com/tkipf/pygcn/blob/master/pygcn/train.py
What I do not understand is how they do the semi-supervised training. I see that they input all the data but train on only a few examples; at some point they mask the labels, and that is the part I am not getting.
Should I mask all labels other than the training examples? Then what am I testing against? Can someone who has done this before please clarify? My current reading is sketched below.
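As far as I can tell, the "masking" in pygcn is just index selection, not hiding data from the model; a condensed sketch of the relevant lines of train.py (variable names are the repo's):

```python
import torch.nn.functional as F

# The full graph goes through the model every time: `features` and
# `adj` cover all nodes, and `output` holds predictions for all nodes.
output = model(features, adj)

# "Masking" = computing the loss only on the training node indices.
# Labels of other nodes exist in `labels` but never touch the loss.
loss_train = F.nll_loss(output[idx_train], labels[idx_train])

# idx_val / idx_test are disjoint index sets over the same graph,
# used only for evaluation; they are the held-out labels.
acc_test = accuracy(output[idx_test], labels[idx_test])
```

So nothing is overwritten; the non-training labels are simply never indexed during training, and idx_test is what you test against.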
I just want to ask the professionals how to perform supervised classification with an autoencoder. My data has 500 sensor variables and a class label that marks the system as normal or abnormal. If someone could share code to implement an autoencoder in Keras, it would be highly appreciated.
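For concreteness, here is a minimal sketch of one common recipe (pretrain an autoencoder on the sensor data, then attach a classifier to the encoder); all shapes, layer sizes, and names are assumptions, not a reference implementation:

```python
from tensorflow import keras
from tensorflow.keras import layers

# 500 sensor variables in, binary normal/abnormal label out.
inputs = keras.Input(shape=(500,))
encoded = layers.Dense(128, activation="relu")(inputs)
encoded = layers.Dense(32, activation="relu")(encoded)
decoded = layers.Dense(128, activation="relu")(encoded)
decoded = layers.Dense(500, activation="linear")(decoded)

# Step 1: train the autoencoder to reconstruct its own input.
autoencoder = keras.Model(inputs, decoded)
autoencoder.compile(optimizer="adam", loss="mse")
# autoencoder.fit(X_train, X_train, epochs=50, batch_size=64)

# Step 2: attach a classification head to the 32-dim bottleneck.
clf_out = layers.Dense(1, activation="sigmoid")(encoded)
classifier = keras.Model(inputs, clf_out)
classifier.compile(optimizer="adam", loss="binary_crossentropy",
                   metrics=["accuracy"])
# classifier.fit(X_train, y_train, epochs=20, batch_size=64)
```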
Hi,
I am currently trying to determine which classes to use in my supervised classifications. I have pre- and post-fire Sentinel-2 images (https://imgur.com/a/MP5utAT) of an area of Victoria affected by the 2019-2020 wildfires. I have completed my pre-processing and set the band combination to R: Band 12, G: Band 8, B: Band 4, which gives good visibility of vegetation classes and highlights the burnt areas. I currently have a list of classes which includes:
I currently have a few issues with these classes, as scrub (dark purple), bare soil (light pink), and burn scar (purple/pink) all seem to have similar spectral reflectance, which could make distinguishing between them difficult both when creating my training samples and for the classifier itself. I'm wondering if there are any spectral band combinations that would make it easier to differentiate between them (one band-ratio idea is sketched below)? I also have the same issue with sand, cloud, and impervious surfaces, which all appear white.
I'm also wondering if I've missed any obvious classes that I should include?
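One standard combination for exactly this problem is the Normalized Burn Ratio (NBR), built from Band 8 (NIR) and Band 12 (SWIR), and especially its pre/post difference (dNBR); a quick numpy sketch with placeholder arrays rather than real rasters:

```python
import numpy as np

def nbr(b8, b12):
    # Normalized Burn Ratio: healthy vegetation is strongly positive,
    # fresh burn scars strongly negative.
    return (b8 - b12) / (b8 + b12 + 1e-10)

# Placeholder reflectance arrays; in practice, read the co-registered
# Band 8 and Band 12 rasters from the pre- and post-fire scenes.
b8_pre, b12_pre = np.random.rand(256, 256), np.random.rand(256, 256)
b8_post, b12_post = np.random.rand(256, 256), np.random.rand(256, 256)

# dNBR: burn scars show a large drop in NBR between dates, while scrub
# and bare soil change little, which helps pull those classes apart.
dnbr = nbr(b8_pre, b12_pre) - nbr(b8_post, b12_post)
```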
Thanks
TL;DR: we need a new name; why do we even call it unsupervised if we use the entire labelled dataset?
If I misunderstood anything here, please do correct me.
From what I have understood, in unsupervised linear classification we take the feature extractor from self-supervised learning and place a linear classifier on top of it. We keep the feature extractor's parameters frozen, and only allow the linear classifier to update its parameters on the entire labelled training set (sketch below).
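What that protocol looks like in code, as I understand it (a toy PyTorch sketch; the two-layer "encoder" is a stand-in for whatever pretrained backbone is actually used):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for a backbone pretrained with a self-supervised method.
encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU())

# Freeze the feature extractor: its parameters never update.
for p in encoder.parameters():
    p.requires_grad = False
encoder.eval()

# Only this linear classifier trains, on the entire labelled set.
head = nn.Linear(128, 10)
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)

x = torch.randn(32, 784)             # dummy labelled batch
y = torch.randint(0, 10, (32,))
with torch.no_grad():
    feats = encoder(x)
loss = F.cross_entropy(head(feats), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

So the labels only ever touch the linear head; "linear evaluation" or "linear probe" is the less confusing name that papers often use for this.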
The word unsupervised should not be used here; we need a new word, in my opinion.
What do you guys think?
https://arxiv.org/abs/2006.11325
Summary: the "ProtoTransfer" method combines metric self-supervised pre-training (ProtoCLR) with a matching few-shot transfer-learning approach (ProtoTune). On mini-ImageNet, ProtoTransfer outperforms all state-of-the-art un-/self-supervised few-shot approaches (by 4% to 8%). It is competitive with fully supervised performance (0% to 4% gap) on 4 cross-domain datasets, at a fraction of the label cost (<1%).
Generalization gap observation: Negligible generalization gap from training classes to test classes (from the same class distribution, e.g. mini-ImageNet). Other supervised & self-supervised few-shot approaches, such as ProtoNet (Snell et al., 2017) and UMTRA (Khodadadeh et al., 2019), respectively, show non-negligible generalization gaps.
e.g. 5-way 5-shot:
| Method | Train accuracy (%) | Test accuracy (%) | Generalization gap (%) |
|---|---|---|---|
| ProtoNet | 79.09 ± 0.69 | 66.33 ± 0.68 | 12.76 |
| UMTRA(-ProtoNet) | 56.43 ± 0.78 | 53.37 ± 0.68 | 3.06 |
| ProtoCLR-ProtoNet (this paper) | 63.47 ± 0.58 | 63.35 ± 0.54 | 0.12 |
Hi,
I am struggling to understand the training part of this paper by Thomas Kipf (https://arxiv.org/pdf/1609.02907.pdf). The GitHub repo is here: https://github.com/tkipf/pygcn/blob/master/pygcn/train.py
What I do not understand is what is happening with the masking.
I input the whole data, but use only a small portion of labeled data to train. Should I mask the rest of the data?
What will be my test set then?
Can someone who has worked on this before please guide me through?
A new work from Google Brain (authors of SimCLR) and Google Health shows self-supervised pretraining on unlabeled medical images is much more effective than supervised pretraining on ImageNet.
They also propose a new method called Multi-Instance Contrastive Learning (MICLe), which uses multiple images of the same underlying pathology per patient case, when available, to construct more informative positive pairs for self-supervised learning.