2017:Automatic Lyrics-to-Audio Alignment - Revision history

Georgi Dzhambazov: /* Mauch's Dataset */

2017-12-08T17:11:34Z

‎Mauch's Dataset

Georgi Dzhambazov: /* Hansen's Dataset */

2017-12-08T17:11:17Z

‎Hansen's Dataset

Georgi Dzhambazov: /* Evaluation Datasets */

2017-12-08T17:10:56Z

‎Evaluation Datasets

Georgi Dzhambazov: /* Mauch's Dataset */

2017-08-26T13:24:37Z

‎Mauch's Dataset

Georgi Dzhambazov: /* Hansen's Dataset */

2017-08-26T13:19:08Z

‎Hansen's Dataset

Georgi Dzhambazov: /* Potential Participants */

2017-08-26T13:08:16Z

‎Potential Participants

Georgi Dzhambazov: /* Potential Participants */

2017-08-26T13:08:07Z

‎Potential Participants

Georgi Dzhambazov: /* Phonetization */

2017-08-26T13:02:51Z

‎Phonetization

Georgi Dzhambazov: /* Evaluation Datasets */

2017-08-23T19:13:32Z

‎Evaluation Datasets

Georgi Dzhambazov: /* Description */

2017-08-23T19:06:03Z

‎Description

@@ Line 28: / Line 28: @@
 The audio has instrumental accompaniment. An example song can be seen [https://www.dropbox.com/sh/8pp4u2xg93z36d4/AAAsCE2eYW68gxRhKiPH_VvFa?dl=0 here] "_" are used instead of "'" in the annotation.
-[https://www.dropbox.com/sh/y6kwqdgq8ous12e/AABWrMXOmLOZoNFO06STLQkAa?dl=0 Half of the dataset) is being released after the competition!
+[https://www.dropbox.com/sh/y6kwqdgq8ous12e/AABWrMXOmLOZoNFO06STLQkAa?dl=0 Half of the dataset] is being released after the competition!
 You can read in detail about how the dataset was used for the first time here: [https://pdfs.semanticscholar.org/547d/7a5d105380562ca3543bf05b4d5f7a8bee66.pdf Integrating Additional Chord Information Into HMM-Based Lyrics-to-Audio Alignment]. The dataset has been kindly provided by Sungkyun Chang.

@@ Line 17: / Line 17: @@
 The audio has two versions: the original with instrumental accompaniment and a cappella singing voice only one. An example song can be seen [https://www.dropbox.com/sh/wm6k4dqrww0fket/AAC1o1uRFxBPg9iAeSAd1Wxta?dl=0 here]
-[https://www.dropbox.com/sh/evg395yz1ciyy2r/AABwUHXnVlXK_YrN1Rov7iU6a?dl=0 Half of the dataset) is being released after the competition!
+[https://www.dropbox.com/sh/evg395yz1ciyy2r/AABwUHXnVlXK_YrN1Rov7iU6a?dl=0 Half of the dataset] is being released after the competition!
 You can read in detail about how the dataset was made here: [http://smcnetwork.org/system/files/smc2012-198.pdf Recognition of Phonemes in A-cappella Recordings using Temporal Patterns and Mel Frequency Cepstral Coefficients]. The dataset has been kindly provided by Jens Kofod Hansen.

@@ Line 14: / Line 14: @@
 ==== Hansen's Dataset ====
-The dataset contains 9 popular music songs in English with annotations of both beginning- and ending-timestamps of each word. The ending timestamps are for convenience (copies of next word's beginning timestamp) and are not used in the evaluation. Non-vocal segments are assigned a special word BREATH*. Sentence-level annotations are also provided.
+The dataset contains 9 popular music songs in English with annotations of both beginnings- and ending-timestamps of each word. The ending timestamps are for convenience (copies of next word's beginning timestamp) and are not used in the evaluation. Non-vocal segments are assigned a special word BREATH*. Sentence-level annotations are also provided.
 The audio has two versions: the original with instrumental accompaniment and a cappella singing voice only one. An example song can be seen [https://www.dropbox.com/sh/wm6k4dqrww0fket/AAC1o1uRFxBPg9iAeSAd1Wxta?dl=0 here]
 You can read in detail about how the dataset was made here: [http://smcnetwork.org/system/files/smc2012-198.pdf Recognition of Phonemes in A-cappella Recordings using Temporal Patterns and Mel Frequency Cepstral Coefficients]. The dataset has been kindly provided by Jens Kofod Hansen.
@@ Line 25: / Line 27: @@
 The dataset contains 20 popular music songs in English with annotations of beginning-timestamps of each word. Non-vocal sections are not explicitly annotated (but remain included in the last preceding word). We prefer to leave it this way, in order to enable comparison to previous work, evaluated on this dataset.
 The audio has instrumental accompaniment. An example song can be seen [https://www.dropbox.com/sh/8pp4u2xg93z36d4/AAAsCE2eYW68gxRhKiPH_VvFa?dl=0 here] "_" are used instead of "'" in the annotation.
 You can read in detail about how the dataset was used for the first time here: [https://pdfs.semanticscholar.org/547d/7a5d105380562ca3543bf05b4d5f7a8bee66.pdf Integrating Additional Chord Information Into HMM-Based Lyrics-to-Audio Alignment]. The dataset has been kindly provided by Sungkyun Chang.

@@ Line 24: / Line 24: @@
 ==== Mauch's Dataset ====
 The dataset contains 20 popular music songs in English with annotations of beginning-timestamps of each word. Non-vocal sections are not explicitly annotated (but remain included in the last preceding word). We prefer to leave it this way, in order to enable comparison to previous work, evaluated on this dataset.
-The audio has instrumental accompaniment. An example song can be seen [https://www.dropbox.com/sh/8pp4u2xg93z36d4/AAAsCE2eYW68gxRhKiPH_VvFa?dl=0 here]
+The audio has instrumental accompaniment. An example song can be seen [https://www.dropbox.com/sh/8pp4u2xg93z36d4/AAAsCE2eYW68gxRhKiPH_VvFa?dl=0 here] "_" are used instead of "'" in the annotation.
 You can read in detail about how the dataset was used for the first time here: [https://pdfs.semanticscholar.org/547d/7a5d105380562ca3543bf05b4d5f7a8bee66.pdf Integrating Additional Chord Information Into HMM-Based Lyrics-to-Audio Alignment]. The dataset has been kindly provided by Sungkyun Chang.
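
Note on the "_" convention added in this revision: a minimal sketch of how a participant might map annotated words back to their usual spelling before a dictionary lookup. The function name is hypothetical, and the sketch assumes that "_" appears in the annotation only as a stand-in for "'", which the page does not state explicitly.

```python
def restore_apostrophes(annotated_word: str) -> str:
    """Map an annotated word such as "don_t" back to "don't".

    Assumption: every "_" in the annotation replaces an apostrophe.
    """
    return annotated_word.replace("_", "'")


if __name__ == "__main__":
    for word in ["don_t", "I_m", "love"]:
        print(word, "->", restore_apostrophes(word))
```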

@@ Line 14: / Line 14: @@
 ==== Hansen's Dataset ====
-The dataset contains 9 popular music songs in English with annotations of both beginning- and ending-timestamps of each word. The ending timestamps are for convenience (copies of next word's beginning timestamp) and are not used in the evaluation. Non-vocal segments are assigned a special word BREATH. Sentence-level annotations are also provided.
+The dataset contains 9 popular music songs in English with annotations of both beginning- and ending-timestamps of each word. The ending timestamps are for convenience (copies of next word's beginning timestamp) and are not used in the evaluation. Non-vocal segments are assigned a special word BREATH*. Sentence-level annotations are also provided.
 The audio has two versions: the original with instrumental accompaniment and a cappella singing voice only one. An example song can be seen [https://www.dropbox.com/sh/wm6k4dqrww0fket/AAC1o1uRFxBPg9iAeSAd1Wxta?dl=0 here]
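
Note on the annotation layout described in this hunk: the sketch below illustrates, under assumptions, how ending timestamps that are copies of the next word's beginning timestamp could be derived, and how BREATH* entries could be separated from sung words. The in-memory (word, begin) tuples, the function names, and the `total_duration` parameter used for the last word are all hypothetical; the actual file layout of the dataset is not specified here.

```python
from typing import List, Tuple

BREATH = "BREATH*"  # special word for non-vocal segments, as named on the page


def add_end_times(
    onsets: List[Tuple[str, float]], total_duration: float
) -> List[Tuple[str, float, float]]:
    """Attach an end time to each (word, begin) pair.

    Each end time is the next word's begin time; the last word ends at
    total_duration (an assumption made for this sketch).
    """
    ends = [begin for _, begin in onsets[1:]] + [total_duration]
    return [(word, begin, end) for (word, begin), end in zip(onsets, ends)]


def sung_words(words: List[Tuple[str, float, float]]) -> List[Tuple[str, float, float]]:
    """Keep only entries that are not the non-vocal marker."""
    return [item for item in words if item[0] != BREATH]


if __name__ == "__main__":
    annotated = add_end_times([("hello", 0.5), (BREATH, 1.0), ("world", 2.0)], 3.0)
    print(sung_words(annotated))
```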

@@ Line 125: / Line 125: @@
 == Potential Participants ==
 Nikolaos Tsipas nitsipas [at] auth [dot] gr
+Anna Kruspe kpe [at] idmt [dot] fraunhofer [dot] de

@@ Line 32: / Line 32: @@
 ==== Phonetization ====
-A popular choice for phonetization of the words is the [http://www.speech.cs.cmu.edu/cgi-bin/cmudict CMU pronunciation dictionary]. One can phonetize them with the [http://www.speech.cs.cmu.edu/tools/lextool.html online tool]. A list of all rare + specific names of both datasets is given here.
+A popular choice for phonetization of the words is the [http://www.speech.cs.cmu.edu/cgi-bin/cmudict CMU pronunciation dictionary]. One can phonetize them with the [http://www.speech.cs.cmu.edu/tools/lextool.html online tool]. A list of all words of both datasets, which are outside of the [https://github.com/georgid/AlignmentDuration/blob/noteOnsets/src/for_english/cmudict.0.6d.syll list of CMU words] is given [https://www.dropbox.com/s/flu4cpqff916bas/words_not_in_dict?dl=0 here].
 ==== Audio Format ====
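
Note on the Phonetization hunk above: a list of dataset words that fall outside the CMU word list could be produced roughly as follows. This is only a sketch. It assumes the common plain-text CMUdict layout (comment lines starting with ";;;", the word as the first token, pronunciation variants marked with a suffix such as "(2)"); the linked `.syll` file may differ. The function names are hypothetical, and this is not necessarily how the published list was generated.

```python
import re
from typing import Iterable, List, Set


def load_vocabulary(dict_lines: Iterable[str]) -> Set[str]:
    """Collect the head words from CMUdict-style lines (assumed layout)."""
    vocab = set()
    for line in dict_lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue  # skip blank lines and comments
        head = line.split()[0]
        # Drop a variant marker such as "(2)" from "HELLO(2)".
        vocab.add(re.sub(r"\(\d+\)$", "", head).upper())
    return vocab


def words_not_in_dict(words: Iterable[str], vocab: Set[str]) -> List[str]:
    """Return out-of-vocabulary words, upper-cased, in first-seen order."""
    missing: List[str] = []
    for word in words:
        key = word.upper()
        if key not in vocab and key not in missing:
            missing.append(key)
    return missing


if __name__ == "__main__":
    sample_dict = [
        ";;; comment line",
        "HELLO  HH AH0 L OW1",
        "HELLO(2)  HH EH0 L OW1",
        "WORLD  W ER1 L D",
    ]
    vocab = load_vocabulary(sample_dict)
    print(words_not_in_dict(["hello", "Kruspe", "world", "kruspe"], vocab))
```

Words reported as missing would still need a pronunciation from another source, for example the online tool linked in the hunk.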

@@ Line 1: / Line 1: @@
 ==Description==
-The task of automatic lyrics-to-audio alignment has as an end goal the synchronization between an audio recording of singing and its corresponding written lyrics.  The beginning  timestamps of lyrics units can be estimated on different granularity: phonemes, words, lyrics lines, phrases.  For this task word-level alignment is required.
+The task of automatic lyrics-to-audio alignment has as an end goal the synchronization between an audio recording of singing and its corresponding written lyrics.  The beginning timestamps of lyrics units can be estimated on different granularity: phonemes, words, lyrics lines, phrases.  For this task word-level alignment is required.
 ==Data==