VoxCeleb1Identification

class torchaudio.datasets.VoxCeleb1Identification(root: Union[str, Path], subset: str = 'train', meta_url: str = 'https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/iden_split.txt', download: bool = False)

The VoxCeleb1 [Nagrani et al., 2017] dataset for speaker identification tasks.

Each sample contains a waveform, a sample rate, a speaker ID, and a file ID.

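Parameters:

root (str or Path) – Path to the directory where the dataset is located, or where it will be downloaded.
subset (str, optional) – Subset of the dataset to use. Options: ["train", "dev", "test"]. (Default: "train")
meta_url (str, optional) – URL of the metadata file that lists the subset label and file path of every utterance. Each row has the format "subset file_path", for example ``1 id10006/nLEBBc9oIFs/00003.wav``. The labels ``1``, ``2``, and ``3`` denote the train, dev, and test subsets, respectively. (Default: "https://www.robots.ox.ac.uk/~vgg/data/voxceleb/meta/iden_split.txt")
download (bool, optional) – Whether to download the dataset if it is not found under the root path. (Default: False)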
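
The row format described for meta_url can be illustrated with a small parser. This is only a sketch of how such a row might be split, not the library's own parsing code; the names SUBSET_NAMES and parse_meta_line are hypothetical and introduced here for illustration.

```python
from typing import Tuple

# Mapping stated in the meta_url description: 1, 2, 3 -> train, dev, test.
# SUBSET_NAMES is a hypothetical helper name, not part of torchaudio.
SUBSET_NAMES = {"1": "train", "2": "dev", "3": "test"}


def parse_meta_line(line: str) -> Tuple[str, str]:
    """Split one 'subset file_path' row into (subset_name, file_path)."""
    label, file_path = line.strip().split(maxsplit=1)
    return SUBSET_NAMES[label], file_path


# The example row quoted in the parameter description.
print(parse_meta_line("1 id10006/nLEBBc9oIFs/00003.wav"))
```

A row with a label outside 1 to 3 raises KeyError in this sketch, which makes a malformed metadata file visible early.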

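Note

The file structure of the VoxCeleb1Identification dataset is as follows:

└─ root/
    └─ wav/
        └─ speaker_id folders

Users who have already downloaded the "vox1_dev_wav.zip" and "vox1_test_wav.zip" files need to move the extracted files into the same root directory.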
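
Before constructing the dataset from manually extracted archives, it can help to confirm that the directory matches the layout above. The following is a minimal sketch using only the standard library; list_speaker_dirs is a hypothetical helper, and the speaker folder names in the demo are made up.

```python
import tempfile
from pathlib import Path
from typing import List


def list_speaker_dirs(root: str) -> List[str]:
    """Return the sorted speaker folder names found under root/wav."""
    wav_dir = Path(root) / "wav"
    if not wav_dir.is_dir():
        raise FileNotFoundError(f"Expected a 'wav' directory inside {root}")
    return sorted(p.name for p in wav_dir.iterdir() if p.is_dir())


# Demo on a throwaway directory containing two made-up speaker folders.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("id10002", "id10001"):
        (Path(tmp) / "wav" / name).mkdir(parents=True)
    speakers = list_speaker_dirs(tmp)

print(speakers)
```

If the dev and test archives were extracted into different directories, a check like this run on the intended root would show only part of the speaker folders or raise FileNotFoundError.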

__getitem__

VoxCeleb1Identification.__getitem__(n: int) → Tuple[Tensor, int, int, str]

Load the n-th sample from the dataset.

Parameters:

n (int) – The index of the sample to load

Returns:

Tuple of the following items:

Tensor: Waveform
int: Sample rate
int: Speaker ID
str: File ID
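
The four-element tuple can be unpacked directly when iterating over samples. The sketch below counts utterances per speaker; it runs on stand-in data so that it works without the audio files. The helper count_utterances_per_speaker, the list fake_dataset, and all values in it are hypothetical, and plain lists stand in for waveform tensors.

```python
from collections import Counter
from typing import Any, Sequence, Tuple

# (waveform, sample_rate, speaker_id, file_id), in the order listed above.
Sample = Tuple[Any, int, int, str]


def count_utterances_per_speaker(dataset: Sequence[Sample]) -> Counter:
    """Count how many samples belong to each speaker ID."""
    counts: Counter = Counter()
    for i in range(len(dataset)):
        _waveform, _sample_rate, speaker_id, _file_id = dataset[i]
        counts[speaker_id] += 1
    return counts


# Stand-in samples with made-up values; lists replace the waveform tensors.
fake_dataset = [
    ([0.0, 0.1], 16000, 10001, "utt-a"),
    ([0.2, 0.3], 16000, 10001, "utt-b"),
    ([0.4, 0.5], 16000, 10002, "utt-c"),
]
counts = count_utterances_per_speaker(fake_dataset)
print(counts)
```

With the files in place, the same function should accept an instance created with torchaudio.datasets.VoxCeleb1Identification(root, subset="train"), assuming the dataset supports len() as map-style datasets generally do. Note that indexing decodes every audio file, so a full pass can be slow.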

get_metadata

VoxCeleb1Identification.get_metadata(n: int) → Tuple[str, int, int, str]

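Get the metadata of the n-th sample in the dataset. Returns the file path instead of the waveform; all other fields are the same as those returned by __getitem__().

Parameters:

n (int) – The index of the sample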

Returns:

Tuple of the following items:

str: Path to the audio file
int: Sample rate
int: Speaker ID
str: File ID
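
Because get_metadata returns a path rather than a decoded waveform, it is a natural fit for building an index of the dataset without reading any audio. The following sketch groups file paths by speaker. FakeDataset is a hypothetical stand-in that mimics only the get_metadata signature shown above, and its rows are made up; paths_by_speaker is likewise an illustrative name.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Metadata = Tuple[str, int, int, str]  # (file_path, sample_rate, speaker_id, file_id)


class FakeDataset:
    """Hypothetical stand-in exposing get_metadata(n) and len()."""

    def __init__(self, rows: List[Metadata]) -> None:
        self._rows = rows

    def __len__(self) -> int:
        return len(self._rows)

    def get_metadata(self, n: int) -> Metadata:
        return self._rows[n]


def paths_by_speaker(dataset) -> Dict[int, List[str]]:
    """Group audio file paths by speaker ID without loading waveforms."""
    index: Dict[int, List[str]] = defaultdict(list)
    for i in range(len(dataset)):
        file_path, _sample_rate, speaker_id, _file_id = dataset.get_metadata(i)
        index[speaker_id].append(file_path)
    return dict(index)


# Made-up rows for demonstration only.
demo = FakeDataset([
    ("wav/spk_a/clip1.wav", 16000, 1, "file-1"),
    ("wav/spk_b/clip1.wav", 16000, 2, "file-2"),
    ("wav/spk_a/clip2.wav", 16000, 1, "file-3"),
])
index = paths_by_speaker(demo)
print(index)
```

The same function is expected to work on a real VoxCeleb1Identification instance, under the assumption that it supports len() in addition to the documented get_metadata method.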
