[P] Adapting Class Activation Maps for Object Detection and Semantic Segmentation

Hi r/MachineLearning,

https://github.com/jacobgil/pytorch-grad-cam is a project that has a comprehensive collection of Pixel Attribution Methods for PyTorch (like the package name grad-cam that was the original algorithm implemented).

Class Activation Maps can help diagnose properties about the model predictions, like "where does the model see a cat in the image".

After many requests I added support for Object Detection and Semantic Segmentation, and wanted to share this with you.

Here you can find detailed notebook tutorials about this:

Computing the CAM for object detection

Computing the CAM for semantic segmentation

The problem

Class Activation Maps are usually researched and applied for classification models.

A repeating request in this repository, and also in some object detection projects, was to add support for grad-cam for object detection.

One challenge with this, is that object detection frameworks typically don't output tensors you can back-propagate through to compute gradients.

They typically output dictionaries with bounding boxes, labels, etc, after a lot of processing, and don't expose any way to compute gradients with respect to those detections.

If you want to compute CAMs for them, you typically have to dive into the code of these object detection packages and create solutions that work only with them.

There was no "generic" tool that just works and can be adapted to new object detection models.

The solution - gradient free methods

Some Class Activation Map methods don't depend on computing the gradients. Examples of these:

  • EigenCAM, computes PCA on the activations and returns the first principle component.
    It's very fast since it requires a single forward pass, but it doesn't have good enou
... keep reading on reddit ➑

πŸ‘︎ 46
πŸ’¬︎
πŸ‘€︎ u/jacobgil
πŸ“…︎ Dec 30 2021
🚨︎ report
How to apply transfer learning and add a new class to a pre trained mobilenet SSD model (object detection)? ELi5?

So sorry because I’m a total newbie at this! I’ve been having a really hard time understanding how to do this.

I found a pre trained caffemodel online which can detect classes like a tv monitor, water bottle and a person. For my school project, I need the model to detect new classes like matchsticks, lighters, etc.

I already have the folder containing about 300 images of each class I need for the training/fine tuning but I am so lost at how to go about this.

I would greatly appreciate any help! Thanks so much guys!!

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/lyra_in
πŸ“…︎ Dec 23 2021
🚨︎ report
Is using object detection model to do a multi-class binary classification good approach?

I need to do a multi-class binary classification using object detection. For example, I need a binary classification for mask or no mask, helmet or no helmet, hot-dog or no hot-dog. Is it better to train each binary classifier and run all of the model at the same time or is it better to use object detection and train 6 classes object detection?

Sorry English is not my first language.

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/GTHell
πŸ“…︎ Oct 01 2021
🚨︎ report
"We’ve never seen anything like it" University of Sydney researchers detect strange radio waves from the heart of the Milky Way which fit no currently understood pattern of variable radio source & could suggest a new class of stellar object. sydney.edu.au/news-opinio…
πŸ‘︎ 39k
πŸ’¬︎
πŸ‘€︎ u/marcom06
πŸ“…︎ Oct 12 2021
🚨︎ report
How do you add a class to coco classes in YOLO (CNN) object detection?

In this git repo (https://github.com/qqwweee/keras-yolo3), I want to add my own class to coco_classes.txt. I have tried another git repo (https://github.com/AntonMu/TrainYourOwnYOLO) for custom training, but its not as accurate as the pretrained coco dataset. I have created a CSV for my custom classes. How do I merge my custom classes with the coco dataset, so that I can make use of it without affecting the existing 80 classes?

As in, I want my model to be able to detect these 80 classes plus my custom classes.

Sorry if this is a stupid question, I am a beginner to Machine Learning.

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/fegelman
πŸ“…︎ Mar 24 2021
🚨︎ report
YOLO Object Detection | ROS Developers Live Class #119 youtube.com/watch?v=dB0Si…
πŸ‘︎ 31
πŸ’¬︎
πŸ‘€︎ u/Rollb0t
πŸ“…︎ Mar 30 2021
🚨︎ report
YOLO Object Detection | ROS Developers Live Class #119 youtube.com/watch?v=dB0Si…
πŸ‘︎ 20
πŸ’¬︎
πŸ‘€︎ u/Rollb0t
πŸ“…︎ Mar 30 2021
🚨︎ report
Extreme Class Imbalance: Transfer learning on a custom object detection dataset

We have created a dataset of 10,000 images and 17 classes. There are 177,106 objects in total. Their percentages (number of occurrence of each class / total number of objects in dataset * 100%) are as follows:

29.72 %, 24.41 %, 15.90 %, 11.18 %, 4.86 %, 4.19 %, 2.99 %, 2.86 %, 1.01 %, 1.01 %, 0.66 %, 0.55 %, 0.5 %, 0.09 %, 0.04 %, 0.02 %

We are training pre-trained CNNs (EfficientDet, YOLOv2 and YOLOv4/5) on this dataset.

As one might expect, we are having trouble detecting objects that occur less than 1% of the time in the dataset. Any idea how we can tackle this problem?

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/uncle-iroh-11
πŸ“…︎ Feb 25 2021
🚨︎ report
Adaptive Class Suppression Loss for Long-Tail Object Detection by Tong Wang et al. deepai.org/publication/ad…
πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/deep_ai
πŸ“…︎ Apr 06 2021
🚨︎ report
YOLO Object Detection | ROS Developers Live Class #119 youtube.com/watch?v=dB0Si…
πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/Rollb0t
πŸ“…︎ Mar 30 2021
🚨︎ report
Keras Custom Multi-Class Object Detection CNN with Custom Dataset

Hey guys!

I’m very new to ML, and I’m working a college project to detect allow entry to places with automatic doors (I.E. supermarkets, hospitals) only if the person is wearing a mask using a Raspberry Pi 4. I was able to train a model based on a pre-trained MobileNetV2 with TensorFlow 1.13.1, but the code is mostly written on the tensorflow/models GitHub repo & it’s a very ugly and un-intuitive code to rewrite by myself & it is based off another pre-trained model - all of which are things I’m trying to avoid.

I started to learn how to do something similar with TensorFlow 2 and Keras. I’ve watched the I/O 2019 ML Zero to Hero & the 4 part mini-series to try and get the basic idea of convolutional networks and machine learning in general; However, I can’t find a single example of using Keras for creating a convolutional network for object detection with multi-class classification in a single model - I DID find a version where the code used a model to predict bounding boxes for faces first, extract them, and use image classification on it similar to the Rock/Paper/Scissors model in ML Zero To Hero - but that’s kind of a compromise instead of using a single model that detects masks directly, and can also slow things down considering it will be running on a Raspberry Pi, that also does some more calculations based on the results of the model (temperature check, and audio instructions to the person requesting access if they are wearing the mask incorrectly).

Here’s my current implementation (dataset is available with tfrecord, xml, and csv annotations):

Google Colab Notebook

It currently relies on the MobileNetV2 model, and for some reason While training the loss and accuracy are staying relatively the same - they’re not improving over time as they should. If there is a better way or a more efficient way of implementing what I have been trying to do, I’d be happy to learn about it.

To summarize, What I need help with (in relation to the notebook linked above):

β€’ Need to figure out if the way I implemented it is the ideal / code efficient way to do such task (probably not?)

β€’ re-implement it as a completely custom network (as in the ML Zero to Hero, but object detection instead of image classification), without relying on MobileNetV2 or any other pre-trained model.

β€’ Figure out why the model does not improve it’s accuracy when training.

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/LGariv
πŸ“…︎ Dec 18 2020
🚨︎ report
In object detection model, why does the model detect a class that is not trained?

I'm building an detection model by YOLOv4 darknet, and my configuration contains 7 different classes. However, I only train model with a single class, which I provided the same label for all training images.

After training, the detect image shows all 7 classes, although other classes has low probability, but that confused me. Why would a model detect class that is not trained?

If it's because I list all 7 classes in the configuration, then do I have to set only one class in case I only want to detect one class?

I'm new to object detection, thank you!

πŸ‘︎ 4
πŸ’¬︎
πŸ‘€︎ u/Laurence-Lin
πŸ“…︎ Dec 25 2020
🚨︎ report
Object Detection Training: do unlabeled instances reduce class accuracy?

I'm training a model to detect both vehicles and components of vehicles (e.g. wheels, doors.) I have a large, open dataset where only vehicles are labelled, and a smaller bespoke one with the vehicle and components labelled.

Obviously, a picture from the first dataset may have unlabeled instances of these components in it. Do these unlabelled instances reduce the accuracy of the component classes by essentially being used as false negative examples, or are positive examples more important for a multiclass detector? If they do detract, how can I use the larger dataset to improve vehicle detection accuracy?

If it's implementation specific, I'm finetuning the SSDLite MobileDet model using the TF Object Detection API with TF 1.15.

Thanks!

πŸ‘︎ 7
πŸ’¬︎
πŸ‘€︎ u/Futurekevin
πŸ“…︎ Dec 08 2020
🚨︎ report
Suitable Object Detection Algorithm for Single Class?

Pretty much the title. I have about 1250 images with Bounding Boxes coordinates.

Which architecture will be suitable? Any code I can follow for the same?

πŸ‘︎ 2
πŸ’¬︎
πŸ“…︎ Feb 03 2021
🚨︎ report
Object Detection For One Class Of Image

Hey all.

So this is my first time posting in this Subreddit.

I have this task of detecting the white circles in my link. It's basically LED light reflected onto the iris from a camera. It's for a positioning system that uses a 3-axis robot.

I tried to use open CV initially but due to vast variation in the lighting condition it wasn't able to detect the object in all frames.

Then I tried using YOLO V2. Specifically Tiny YOLO. So the link is basically the result of using YOLO. The tracking is fine.

Now what I have to do is to implement this on a Raspberry Pi 4 Model B. So when I tried this I got 1FPS when I was using real time video. I understand that there are hardware constraints. I tried using SSD mobileNET as well. It gave me around 2FPS.

So I want detect these objects in real time with a frame rate of around 7-10 FPS. Due to budget restrictions I cannot use a hardware accelerator.

I just wanted to know how I can do the object detection in real time with a good frame rate on the Raspberry pi 4.

Also I'm new to this and I'm trying to learn on the go.

Image

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/DarkMessiah1826
πŸ“…︎ Mar 09 2020
🚨︎ report
Made an object detection program on my systems integration class to try and tell what the objects are on this infamous picture
πŸ‘︎ 45
πŸ’¬︎
πŸ‘€︎ u/willymartin99
πŸ“…︎ Dec 10 2019
🚨︎ report
[D] Concepts and Papers to Read for Multi-class Object Detection/Segmentation for Automated Surface Inspection on Uniformly-Textured Backgrounds such as concretes

I would like to ask for advice on what specific topic/architectures should I be looking for if I have this problem.

Broad Topic: Multi-class Object Detection or Semantic Segmentation (which includes multiple classes for classification) of defects

Problems:

  1. Defects are on a uniform-textured background such as concretes or pavements.
  2. Some of the classes have large variations of structural appearances hence some classes have a small ratio of object-to-bounding box ratio while some have a high ratio. For example, a bounding box could contain a crack (which is relatively thin causing the bounding box to enclose a very small ratio of the object pixels inside the bounding box). Or the bounding box could enclose a whole "disintegration" of the concrete (the whole bounding box is composed of the whole "object pixels")

I have started looking into salient object detection but I think it is the wrong route. Those are my main problem. I am having an intuition of improving the feature maps extraction or the whole process by sort of introducing the "background" without any object (or in this case, defects) to the architecture (since concretes are kind of uniformly textured and the defects that I am detecting and segmenting is just the deformity in that uniformly textured concrete".

Any advice would be much appreciated.

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/sarmientoj24
πŸ“…︎ Sep 08 2020
🚨︎ report
Multi-class Object Detection on TF API

I'm using TensorFlow's 1.x Object Detection API to train a model for detection on satellite imagery tagged with 15 classes. Essentially trying to recreate Xia et all. https://vision.cornell.edu/se3/wp-content/uploads/2018/03/2666.pdf

I have only ever done this with a single class detector before, so I am unsure what the usual training time looks like. I am transfer learning a SSD trained on COCO model. After about two or so hours it's sitting at a loss of ~4, is this usual or should I start looking into adjustments?

As a side note: I have not done object detection since 2017 when this API just came out and was all the rage. Are there better methods out there nowadays?

Thanks!

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/Sudden-Mango
πŸ“…︎ Jun 12 2020
🚨︎ report
Underwater object detection using Invert Multi-Class Adaboost with deep learning by Long Chen et al. deepai.org/publication/un…
πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/deep_ai
πŸ“…︎ Jun 06 2020
🚨︎ report
One-class YOLO vs 2-class YOLO object detection.

I am working on a problem where we are interested in only 1-class at the end. Simple example detecting only person. I have an annotated data for only person. Now I need to decide whether to train a 1-class YOLO or put more annotation in background and mark as "other" class then train a 2-class YOLO.? Which one would give better accuracy or recall?

Additional query is: a) How the YOLO will behave when there is only 1-class. I mean does the network internally learns person vs non-person? How ot will do a contrast when there only 1-class. b) If a) does not work well, I feel creating a separate class should actually help. BUT here "other" class could contain wide variety of objects fitted under 1 annotation. Is network will get even more confused? Will it affect "person" class as well?

πŸ‘︎ 8
πŸ’¬︎
πŸ‘€︎ u/srbh66
πŸ“…︎ Feb 17 2019
🚨︎ report
Best mAP model for object detection on fairly small dataset with class imbalance

I was wondering what model people recommend (trying to maximize mAP where speed is not a concern) for object detection with roughly 20 classes on a small dataset of roughly 12 thousand images with fairly high class imbalance. I’ve tried Retinanet50 with mAP of 33 and Retinanet101 with mAP of 36. Models I’m considering trying in the future are as follows (I’d love to test them all but as these models take a long time to train I’m trying to not waste too much training time): YOLOv3 SNIPER NASnet Faster R-CNN Mask-R-CNN

What models (or strategies) do people recommend for this scenario to try to maximize mAP?

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/ski233
πŸ“…︎ Jul 16 2019
🚨︎ report
multi label/class image classification vs object detection

let's say my objective is to identify 3 dogs, 2 cats, 4 elephants in a picture, putting aside object localisation (drawing bounding box) and object segmentation (mask rcnn) in object detection, can we do multi label/class image classification instead? and how to do that?

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/wiama
πŸ“…︎ Jun 12 2019
🚨︎ report
Can I know what kind of dataset annotation for object detection is this?

Hello I have an existing yolo format dataset of vehicle plate and I want to convert it to this format , Does anyone know what kind of format is this?

https://preview.redd.it/ecwnlq2rg8d81.png?width=389&format=png&auto=webp&s=c18f0a92723d5a6fae4dbd215cb844c34122d21b

πŸ‘︎ 10
πŸ’¬︎
πŸ‘€︎ u/leonardocrook
πŸ“…︎ Jan 22 2022
🚨︎ report
[P] Multi-class Object Detection on Nvidia Jetson TK1

Hi, I'm starting this project for my final year Msc thesis and, in the last 6-7 days, I've been searching and reading papers, blogs, forums to evaluate the possible solutions and whether they might be viable or not.

The proposed project is for a first prototype of a Smart Adapter which connects to an IP camera and does object detection through a configurable Neural Network, outputting relevant data through binary or parsable format. As a matter of fact they didn't mention tracking but that's a problem I think I'll have to address too, if I'll want to include that kind of data too. An example they showed me was from traffic aerial static road cameras, where all the detection was done on a server so that might be a use case on which I might work.

The supervisor proposed the use of an Nvidia Jetson TK1 (which they have in the laboratory), since it has the advantage of being low powered and low-cost. My idea was to start and try and use YOLOv3 pretrained models (normal or tiny) over darknet and then try to train it on a dataset like UA-DETRAC, for the roads use case. I thought about tensorflow with other algorithms but my guess is that I would have less fps performance.

I know from start that I won't get real time detection but what about reaching 7-8 fps? Maybe with the custom trained tiny model do you think I could have acceptable detecting and fps performance?

MOST IMPORTANT QUESTION, PLEASE, RELATED TO OFFLINE TRAINING: I couldn't find any relevant info on YOLOv3 normal/tiny models training time with darknet, I would like to do it on the UA-DETRAC dataset with ~80.000 images with multiple tagged boxes. Could I do it with a gtx 970 in a reasonable time or would it take days, if not weeks? In the latter case, are there online services which offer linux workstations with CUDA and cuDNN gpus for darknet training?

Thanks for any answer you might give :)

πŸ‘︎ 5
πŸ’¬︎
πŸ“…︎ May 15 2018
🚨︎ report
YOLOv4/5 - Object Detection for Autonomous Driving - Datasets

Hi everyone,

I am currently working on my bachelor thesis in the field of object detection. I have chosen the Yolov5 model from "https://github.com/ultralytics/yolov5". Looking for a dataset for autonomous driving I found NuScenes and Waymo, but in Waymo I have problems converting the TFRecords files to .yaml files. Does anyone know of an approach?

Does anyone knows of any other good datasets in the area of Autonomous Driving? They should also be optimally convertible to .yaml files.

Greetings

GT_King0895

πŸ‘︎ 23
πŸ’¬︎
πŸ‘€︎ u/GT_King0895
πŸ“…︎ Jan 05 2022
🚨︎ report
Some objects can be pushed just by moving towards them. This can open up new paths, or help you avoid detection. [Diurnal Eidolon] v.redd.it/91yc04vvgd081
πŸ‘︎ 415
πŸ’¬︎
πŸ‘€︎ u/Quixoma
πŸ“…︎ Nov 18 2021
🚨︎ report
Is there any difference between AP and mAP for single class object detection?

Perhaps a stupid question, but as I understand it mAP is the average of AP scores for all classes. If I only have a single class would mAP = AP?

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/bigum
πŸ“…︎ May 31 2019
🚨︎ report
"We’ve never seen anything like it" University of Sydney researchers detect strange radio waves from the heart of the Milky Way which fit no currently understood pattern of variable radio source & could suggest a new class of stellar object. sydney.edu.au/news-opinio…
πŸ‘︎ 20
πŸ’¬︎
πŸ‘€︎ u/lovesaints
πŸ“…︎ Oct 12 2021
🚨︎ report
[R] ByteTrack: Multi-Object Tracking by Associating Every Detection Box v.redd.it/sf125fyg0bv71
πŸ‘︎ 1k
πŸ’¬︎
πŸ“…︎ Oct 24 2021
🚨︎ report
How to train object detection system for 2 classes having two seperate datasets for each class?

I have dataset of class A and a dataset of class B. Instances of class B exist in dataset A but are not annotated in dataset A and vice-versa. Is there a way to somehow train object detection system like SSD to detect these two classes? Intuition tells me it depends on the penalty(loss) for false positive samples.

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/dlenthusiast123
πŸ“…︎ Nov 06 2018
🚨︎ report
Tutorial/script for generating synthetic datasets for object detection

Preparing dataset for training object detection model is a time-consuming task.

Generating synthetic dataset is much faster and easier, so it can save a lot of time for data scientists and ML engineers.

I've created a detailed tutorial on how to do that with Python and published it here:

https://medium.com/@alexppppp/how-to-create-synthetic-dataset-for-computer-vision-object-detection-fd8ab2fa5249

Or, you can skip the tutorial and use the script from here:

https://github.com/alexppppp/synthetic-dataset-object-detection

Take a look at an example of synthetic scene here:

https://preview.redd.it/lzr920vntlc81.png?width=2053&format=png&auto=webp&s=a39442e63eb02eeb2897530323d0a92e81041122

I hope the script will be useful for you! Also I will appreciate any feedback and ideas how to improve it)

πŸ‘︎ 18
πŸ’¬︎
πŸ‘€︎ u/alexppppp42
πŸ“…︎ Jan 19 2022
🚨︎ report
Camera and GPU board suggestions for object detection

Hello, I am a newbie in drones and UAVs. I have been training deep learning models to recognize objects in the air. In order to actually deploy it, I was wondering if I could get some suggestions about possible cameras that I can stream and a GPU unit that would allow me to deploy the trained model. Some ideas that I came across:

  1. Nvidia Jetson Tx1/Tx2 and nano
  2. USB 3.0 Cameras
  3. Go pros.

Any suggestions would be great!

πŸ‘︎ 5
πŸ’¬︎
πŸ‘€︎ u/PinPitiful
πŸ“…︎ Jan 14 2022
🚨︎ report
How do I train an object detection model with classes and subclasses?

Hi everyone!

So I have a project related to object detection, specifically real time animal and person detection. Would it be possible to train an object detection model with generalized classes and subclasses?

What I mean is, if an animal enters the frame is it possible for a model to classify it as an animal and when there is more information (ex. the animal comes closer to the camera) to classify it as the type of animal (eg. a deer). Similarly, if it detects a person, it will also be able to tell the gender next to the person classification.

I have heard that it is possible to train with labels and sublabels (or attributes), but I haven't found anything to proceed as of now. Does anyone have any idea on how to proceed? Any lead is welcome!

Cheers! :)

πŸ‘︎ 10
πŸ’¬︎
πŸ‘€︎ u/lumeneques
πŸ“…︎ Jul 04 2021
🚨︎ report
Keras Custom Multi-Class Object Detection CNN with Custom Dataset

Hey guys!

I’m very new to ML, and I’m working a college project to detect allow entry to places with automatic doors (I.E. supermarkets, hospitals) only if the person is wearing a mask using a Raspberry Pi 4. I was able to train a model based on a pre-trained MobileNetV2 with TensorFlow 1.13.1, but the code is mostly written on the tensorflow/models GitHub repo & it’s a very ugly and un-intuitive code to rewrite by myself & it is based off another pre-trained model - all of which are things I’m trying to avoid.

I started to learn how to do something similar with TensorFlow 2 and Keras. I’ve watched the I/O 2019 ML Zero to Hero & the 4 part mini-series to try and get the basic idea of convolutional networks and machine learning in general; However, I can’t find a single example of using Keras for creating a convolutional network for object detection with multi-class classification in a single model - I DID find a version where the code used a model to predict bounding boxes for faces first, extract them, and use image classification on it similar to the Rock/Paper/Scissors model in ML Zero To Hero - but that’s kind of a compromise instead of using a single model that detects masks directly, and can also slow things down considering it will be running on a Raspberry Pi, that also does some more calculations based on the results of the model (temperature check, and audio instructions to the person requesting access if they are wearing the mask incorrectly).

Here’s my current implementation (dataset is available with tfrecord, xml, and csv annotations):

Google Colab Notebook

It currently relies on the MobileNetV2 model, and for some reason While training the loss and accuracy are staying relatively the same - they’re not improving over time as they should. If there is a better way or a more efficient way of implementing what I have been trying to do, I’d be happy to learn about it.

To summarize, What I need help with (in relation to the notebook linked above):

β€’ Need to figure out if the way I implemented it is the ideal / code efficient way to do such task (probably not?)

β€’ re-implement it as a completely custom network (as in the ML Zero to Hero, but object detection instead of image classification), without relying on MobileNetV2 or any other pre-trained model.

β€’ Figure out why the model does not improve it’s accuracy when training.

πŸ‘︎ 3
πŸ’¬︎
πŸ‘€︎ u/LGariv
πŸ“…︎ Dec 18 2020
🚨︎ report
How to finetune mobilenet ssd model to add more object classes for object detection?

I found a trained object detection model online which can detect object classes such as a water bottle and a person. I would like to add another class. How can this be achieved?

πŸ‘︎ 2
πŸ’¬︎
πŸ‘€︎ u/lyra_in
πŸ“…︎ Sep 14 2021
🚨︎ report

Please note that this site uses cookies to personalise content and adverts, to provide social media features, and to analyse web traffic. Click here for more information.