Training YOLOv9 on custom dataset

  • Context: Comp Sci 
  • Thread starter Thread starter falyusuf
  • Start date Start date
Click For Summary

Discussion Overview

The discussion centers around training the YOLOv9 model on a custom dataset, specifically addressing an error encountered during the training process. Participants explore potential causes and solutions related to data handling and indexing issues within the training script.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Exploratory

Main Points Raised

  • One participant reports an error during training, specifically an IndexError related to data indexing in the DataLoader process.
  • Another participant suggests that the IndexError indicates that the index is out of range for one or more lists involved in the data loading process.
  • A participant notes that they followed the official instructions for uploading the dataset from Roboflow and installed the necessary requirements for YOLOv9.
  • Another participant prompts the original poster to check the indices and their ranges, implying that this might be a source of the error.

Areas of Agreement / Disagreement

Participants do not reach a consensus on the specific cause of the error, and multiple viewpoints regarding potential issues with data indexing remain. The discussion is ongoing and unresolved.

Contextual Notes

There is a lack of clarity regarding the structure and contents of the dataset being used, which may contribute to the indexing issue. The exact nature of the data being processed and its compatibility with the YOLOv9 training script is not fully detailed.

falyusuf
Messages
35
Reaction score
3
Homework Statement
I want to use the Python language in Kaggle to train YOLOv9 on a custom dataset containing three folders: train, test, and valid. Each folder has two subfolders: images and labels. The dataset is installed from Roboflow, using the YOLOv9 model. However, I encountered an error during the training process.

Below, I have attached my code and the error message.
Relevant Equations
-
[CODE lang="python" title="training YOLOv9"]!git clone https://github.com/SkalskiP/yolov9.git
%cd yolov9
!pip3 install -r requirements.txt -q

import os
Home = os.getcwd()
print(Home)

!mkdir -p /kaggle/working/yolov9/weights
!wget -P /kaggle/working/yolov9/weights -q https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-c.pt
!wget -P /kaggle/working/yolov9/weights -q https://github.com/WongKinYiu/yolov9/releases/download/v0.1/yolov9-e.pt
!wget -P /kaggle/working/yolov9/weights -q https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-c.pt
!wget -P /kaggle/working/yolov9/weights -q https://github.com/WongKinYiu/yolov9/releases/download/v0.1/gelan-e.pt

%cd /kaggle/working/yolov9

!pip install roboflow

from roboflow import Roboflow
rf = Roboflow(api_key="#my_api")
project = rf.workspace("egat-43h2x").project("parking_lot-1lqk2")
version = project.version(1)
dataset = version.download("yolov9")

[/CODE]
The dataset has been uploaded.

[CODE lang="python" title="training YOLOv9"]%cd /kaggle/working/yolov9

!python train.py \
--batch 16 --epochs 50 --img 640 --min-items 0 --close-mosaic 15 \
--data /kaggle/input/parkingspacesyolov9/data.yaml\
--weights /kaggle/working/yolov9/weights/gelan-c.pt \
--cfg /kaggle/working/yolov9/models/detect/gelan-c.yaml \
--hyp hyp.scratch-high.yaml
[/CODE]

I am getting this error:
Logging results to runs/train/exp
Starting training for 50 epochs...

Epoch GPU_mem box_loss cls_loss dfl_loss Instances Size
0%| | 0/84 00:00
Traceback (most recent call last):
File "/kaggle/working/yolov9/train.py", line 634, in <module>
main(opt)
File "/kaggle/working/yolov9/train.py", line 528, in main
train(opt.hyp, opt, device, callbacks)
File "/kaggle/working/yolov9/train.py", line 277, in train
for i, (imgs, targets, paths, _) in pbar: # batch -------------------------------------------------------------
File "/opt/conda/lib/python3.10/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/opt/conda/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
data = self._next_data()
File "/opt/conda/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
return self._process_data(data)
File "/opt/conda/lib/python3.10/site-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
data.reraise()
File "/opt/conda/lib/python3.10/site-packages/torch/_utils.py", line 694, in reraise
raise exception
IndexError: Caught IndexError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
data = fetcher.fetch(index)
File "/opt/conda/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/opt/conda/lib/python3.10/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
File "/kaggle/working/yolov9/utils/dataloaders.py", line 656, in __getitem__
img, labels = self.load_mosaic(index)
File "/kaggle/working/yolov9/utils/dataloaders.py", line 791, in load_mosaic
img4, labels4, segments4 = copy_paste(img4, labels4, segments4, p=self.hyp['copy_paste'])
File "/kaggle/working/yolov9/utils/augmentations.py", line 248, in copy_paste
l, box, s = labels[j], boxes[j], segments[j]
IndexError: list index out of range

How to fix?
 
Physics news on Phys.org
falyusuf said:
Traceback (most recent call last):
<snip>
File "/kaggle/working/yolov9/utils/augmentations.py", line 248, in copy_paste
l, box, s = labels[j], boxes[j], segments[j]
IndexError: list index out of range
The above is the last line you showed, so represents the most recent call. It looks to me like the index j is out of range for one or more of the lists.
 
Mark44 said:
The above is the last line you showed, so represents the most recent call. It looks to me like the index j is out of range for one or more of the lists.
I have copied the code snippet for uploading the dataset directly from Roboflow for the YOLOv9 model. I installed YOLOv9 and its requirements as mentioned on the official website. Where might the problem be?
 
falyusuf said:
I have copied the code snippet for uploading the dataset directly from Roboflow for the YOLOv9 model. I installed YOLOv9 and its requirements as mentioned on the official website. Where might the problem be?
Mark just told you. Have you checked the indices and their ranges yet?
 

Similar threads

  • · Replies 3 ·
Replies
3
Views
2K
Replies
3
Views
2K
  • · Replies 6 ·
Replies
6
Views
4K