Category: pytorch-dataloader

I am working with multiple CSV files, each containing multiple rows of 1D data. I have about 9000 such files, and the combined data is about 40 GB. I have written a dataloader like this: class data_gen(torch.utils.data.Dataset): def __init__(self, files): self.files = files my_data = np.genfromtxt('/data/'+files, delimiter=',') self.dim = my_data.shape[1] self.data = [] def __getitem__(self, i): file1 ..

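For reference, a minimal sketch of the lazy per-file pattern this question is circling (with ~40 GB total, reading everything in __init__ is likely to exhaust memory; the class name, root argument, and return type here are illustrative, not the asker's final code):

```python
import numpy as np
import torch


class CSVFileDataset(torch.utils.data.Dataset):
    """Loads one CSV file per index instead of holding all 40 GB in RAM."""

    def __init__(self, files, root='/data/'):
        self.files = files
        self.root = root

    def __len__(self):
        return len(self.files)

    def __getitem__(self, i):
        # Read the i-th file only when it is actually requested.
        my_data = np.genfromtxt(self.root + self.files[i], delimiter=',')
        return torch.from_numpy(my_data).float()
```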

I have multiple CSV files containing 1D data, and I want to use each row as a sample. Each file contains a different number of rows. So I have written a dataloader like this: class data_gen(torch.utils.data.Dataset): def __init__(self, files): self.files = files print("FILES: ", type(self.files)) def __getitem__(self, i): print("GETite,") file1 = self.files[i] print("FILE1: ", file1) my_data = np.genfromtxt('/data/'+file1, ..

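A hedged sketch of one way to index individual rows across files with unequal row counts (the cumulative-count bookkeeping and names are assumptions, not the asker's code; re-reading a whole file per row is simple but slow, so caching the last-loaded file would be a natural next step):

```python
import bisect

import numpy as np
import torch


class RowDataset(torch.utils.data.Dataset):
    """Treats every row across all CSV files as one sample."""

    def __init__(self, files, root='/data/'):
        self.files = files
        self.root = root
        counts = []
        for f in files:
            # One cheap pass per file to count rows up front.
            with open(root + f) as fh:
                counts.append(sum(1 for _ in fh))
        self.cumulative = list(np.cumsum(counts))

    def __len__(self):
        return int(self.cumulative[-1])

    def __getitem__(self, i):
        # Binary-search the cumulative counts to map the global index i
        # to a (file, row-within-file) pair.
        file_idx = bisect.bisect_right(self.cumulative, i)
        row = i if file_idx == 0 else i - self.cumulative[file_idx - 1]
        data = np.atleast_2d(
            np.genfromtxt(self.root + self.files[file_idx], delimiter=','))
        return torch.from_numpy(data[row]).float()
```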

When I processed a video and its audio, I encountered an error:

Original Traceback (most recent call last):
  File "/home/yzx/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/_utils/worker.py", line 202, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/yzx/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/yzx/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/yzx/lunwen/PseudoBinaural_CVPR2021-master/data/Augment_dataset.py", line 66, in ..

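Tracebacks like this only show the worker plumbing; the real failure is whatever Augment_dataset.py raises at line 66. A common first debugging step, assuming `dataset` stands in for the asker's Augment_dataset instance, is to disable workers so the original exception surfaces in the main process:

```python
import torch

# num_workers=0 runs __getitem__ in the main process, so the exception
# from Augment_dataset.py line 66 is raised directly with full context.
loader = torch.utils.data.DataLoader(dataset, batch_size=4, num_workers=0)
batch = next(iter(loader))
```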

I am currently following the ASRfromScratch tutorial, but I am trying to make it work with the Fluent Speech Commands dataset https://fluent.ai/fluent-speech-commands-a-dataset-for-spoken-language-understanding-research/. I was able to go through the Tokenizer section and the Language Model section with no problem, but I am struggling with the SpeechRecognizer section. I modified the dataio_prepare function as such, but I ..

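For orientation, the usual shape of a SpeechBrain dataio_prepare is sketched below; the manifest path hparams["train_csv"] and the column keys "wav"/"sig" are assumptions about how the Fluent Speech Commands manifest was prepared, not the tutorial's exact code:

```python
import speechbrain as sb


def dataio_prepare(hparams):
    # Build a dataset from a CSV manifest (one row per utterance).
    train_data = sb.dataio.dataset.DynamicItemDataset.from_csv(
        csv_path=hparams["train_csv"]
    )

    # Dynamic item: turn the "wav" path column into a loaded signal.
    @sb.utils.data_pipeline.takes("wav")
    @sb.utils.data_pipeline.provides("sig")
    def audio_pipeline(wav):
        return sb.dataio.dataio.read_audio(wav)

    sb.dataio.dataset.add_dynamic_item([train_data], audio_pipeline)
    sb.dataio.dataset.set_output_keys([train_data], ["id", "sig"])
    return train_data
```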

While using PyTorch's DataLoader utility, what is the purpose of RandomIdentitySampler as the sampler? RandomIdentitySampler also takes an argument instances. Does instances depend on the number of workers? If there are 4 workers, should there also be 4 instances? The following is the chunk of code: c_dataloaders = DataLoader(Preprocessor(cluster_dataset.train_set, root=cluster_dataset.images_dir, transform=train_transformer), batch_size=args.batch_size_stage2, num_workers=args.workers, ..

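For context: in re-ID codebases, instances typically means samples per identity within a batch and is independent of num_workers, which only controls loader parallelism. A sketch of the usual pattern (assuming an open-reid-style data_source of (img, pid, camid) tuples; not the asker's exact implementation):

```python
import random
from collections import defaultdict

from torch.utils.data.sampler import Sampler


class RandomIdentitySampler(Sampler):
    """Yields num_instances indices per identity each epoch."""

    def __init__(self, data_source, num_instances=4):
        # Group dataset indices by person identity (pid).
        self.index_dict = defaultdict(list)
        for index, (_, pid, _) in enumerate(data_source):
            self.index_dict[pid].append(index)
        self.pids = list(self.index_dict.keys())
        self.num_instances = num_instances

    def __len__(self):
        return len(self.pids) * self.num_instances

    def __iter__(self):
        indices = []
        # Visit identities in random order, drawing num_instances each.
        for pid in random.sample(self.pids, len(self.pids)):
            idxs = self.index_dict[pid]
            if len(idxs) >= self.num_instances:
                indices.extend(random.sample(idxs, self.num_instances))
            else:
                # Too few images for this identity: sample with replacement.
                indices.extend(random.choices(idxs, k=self.num_instances))
        return iter(indices)
```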