2015:Set List Identification
Latest revision as of 01:40, 12 August 2015
Description
This task is new for 2015!
This task requires that an algorithm identify the set list of a live concert (see Set list). A set list is the sequence of songs performed in a live concert, in the order in which they are played.
Recently, more and more full-length live concert videos have become available on websites (e.g. YouTube). Most of them lack sufficient descriptive information, such as the set list and the start/end time of each song. In this task, we collect the audio of live concerts and studio songs and apply music information retrieval techniques to answer two questions: which songs were performed in the concert, and when does each song start and end?
For the first step of this task, we assume that the artist is known and that, in the live concert, the performers play only their own studio songs. The ultimate goal, however, is that given a full-length live concert audio and a studio song database, we can find the set list and the start/end time of each song.
There are two sub tasks in this task:
Sub task 1: Song sequence identification
- To identify the order of the songs performed in a live concert.
In this sub task, the participants know the artist and the artist's studio song collection. Given the audio of a live concert and the studio song collection of a specific artist, where all songs in the live concert are included in the studio song collection, the goal is to identify the order of the songs in the live concert.
Sub task 2: Time boundary identification
- To identify the start/end time of each song in the song sequence.
In this sub task, the participants know the artist, the artist's studio song collection and the song sequence. Given the audio of a live concert, the song sequence and the studio song collection of a specific artist, where all songs in the live concert are included in the studio song collection, the goal is to identify the start time and end time of each song in the live concert.
Data
To match our assumption (see the Description), we pre-process all audio: any "out of artist" song is removed from the live concert audio.
We provide two sets for this task. Participating algorithms will have to read audio in the following format:
- Sample rate: 22050 Hz
- Sample size: 16 bit
- Number of channels: 1 (mono)
- Encoding: WAV
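The format requirements above can be checked before any feature extraction. The following is a minimal sketch using Python's standard `wave` module; the function names and the in-memory test file are illustrative assumptions, not part of the task's tooling.

```python
import io
import wave

# Audio format stated in the task description: 22050 Hz, 16 bit, mono.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 22050}


def wav_format(source):
    """Read the header of a WAV file (path or file-like object)."""
    with wave.open(source, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),  # 2 bytes = 16 bit
            "sample_rate_hz": wav.getframerate(),
        }


def matches_task_format(source):
    """Return True if the file is 22050 Hz, 16 bit, mono."""
    return wav_format(source) == EXPECTED


def make_silent_wav(sample_rate_hz, n_frames=100):
    """Build a short silent mono 16-bit WAV in memory (for demonstration only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00\x00" * n_frames)
    buffer.seek(0)
    return buffer
```

In practice `matches_task_format` would be called with the path of each file named in the list files.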
Developing set
This set contains 3 artists and 7 live concerts. The following information will be released (Dropbox: https://www.dropbox.com/sh/t83ogdrxi0f050n/AABb11MCcQUokqSjOsqhArOFa?dl=0):
- artist
- live concert name and links
- studio collection list
- start/end time tags
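For the convenience of participants, we also provide extracted features (Dropbox: https://www.dropbox.com/s/bote36k8pkmt2f8/MIREX_2015_Setlist_ID_Developing_set_chroma_fea.rar?dl=0). The link below points to the tool we used.
- chroma (CRP features, Chroma Toolbox: http://resources.mpi-inf.mpg.de/MIR/chromatoolbox/)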
Collection statistics:
- 3 artists
- 7 live concerts
- 279 tracks
Testing set
This set contains 7 artists and 13 live concerts. No information will be released.
Collection statistics:
- 7 artists
- 13 live concerts
- 873 tracks
Evaluation
The two sub tasks use different evaluation metrics.
Sub task 1
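- Edit distance (see http://en.wikipedia.org/wiki/Edit_distance)
We evaluate the two sequences (the ground truth and your result) by edit distance, which counts three types of error:
- insertion error <math>I</math>
- substitution error <math>S</math>
- deletion error <math>D</math>
With <math>N</math> denoting the number of songs in the ground-truth set list:
Edit Distance: <math>ED = I+S+D</math>
Percent Correct: <math>Corr = \frac{N-D-S}{N}</math>
Percent Accuracy: <math>Acc = \frac{N-D-S-I}{N}</math>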
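As an illustration of how these three metrics relate, here is a minimal sketch that aligns two song ID sequences with the standard Levenshtein dynamic program and counts the insertions, substitutions and deletions along one optimal alignment. It is not the official evaluation script: the function names are hypothetical, and the tie-breaking order between equally good alignments (match/substitution first, then deletion, then insertion) is an assumption.

```python
def edit_errors(truth, result):
    """Return (insertions, substitutions, deletions) for one optimal alignment."""
    n, m = len(truth), len(result)
    # cost[i][j] = edit distance between truth[:i] and result[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i  # delete every ground-truth song
    for j in range(1, m + 1):
        cost[0][j] = j  # insert every identified song
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            mismatch = int(truth[i - 1] != result[j - 1])
            cost[i][j] = min(
                cost[i - 1][j - 1] + mismatch,  # match or substitution
                cost[i - 1][j] + 1,             # deletion
                cost[i][j - 1] + 1,             # insertion
            )

    # Walk back from the end to count each error type.
    insertions = substitutions = deletions = 0
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            mismatch = int(truth[i - 1] != result[j - 1])
            if cost[i][j] == cost[i - 1][j - 1] + mismatch:
                substitutions += mismatch
                i, j = i - 1, j - 1
                continue
        if i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            deletions += 1
            i -= 1
        else:
            insertions += 1
            j -= 1
    return insertions, substitutions, deletions


def score(truth, result):
    """Compute ED, Corr and Acc as defined above (N = len(truth))."""
    ins, sub, dele = edit_errors(truth, result)
    n = len(truth)
    return {
        "ED": ins + sub + dele,
        "Corr": (n - dele - sub) / n,
        "Acc": (n - dele - sub - ins) / n,
    }
```

For a ground truth of `[3, 17, 59, 4]` and a result of `[3, 59, 4, 8]`, the alignment has one deletion (17) and one insertion (8), so ED is 2, Corr is 0.75 and Acc is 0.5. Note that Acc can become negative when the result contains many insertions.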
Sub task 2
- average time boundary
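We will evaluate two time boundaries as follows: the average start time boundary and the average end time boundary. The evaluation function is described below:
- The set list contains <math>N</math> songs
Ground truth:
- Start time of song <math>i</math>: <math>sBD_{GT_i}</math>
- End time of song <math>i</math>: <math>eBD_{GT_i}</math>
Identification result:
- Start time of song <math>i</math>: <math>sBD_{ID_i}</math>
- End time of song <math>i</math>: <math>eBD_{ID_i}</math>
<math>AVGsBD = \frac{\sum_{i=1}^N |sBD_{GT_i} - sBD_{ID_i}|}{N}</math>,
<math>AVGeBD = \frac{\sum_{i=1}^N |eBD_{GT_i} - eBD_{ID_i}|}{N}</math>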
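The two averages can be computed directly from paired boundary lists. The sketch below assumes that boundaries are given as `(start, end)` pairs in seconds and that both lists follow the same song order; the function name and the choice of unit are illustrative assumptions.

```python
def average_boundary_deviation(ground_truth, identified):
    """Return (AVGsBD, AVGeBD) for two lists of (start, end) pairs in seconds."""
    if len(ground_truth) != len(identified):
        raise ValueError("both lists must describe the same N songs")
    n = len(ground_truth)
    if n == 0:
        raise ValueError("the set list must contain at least one song")
    # Mean absolute difference of the start times.
    avg_start = sum(abs(gt[0] - res[0])
                    for gt, res in zip(ground_truth, identified)) / n
    # Mean absolute difference of the end times.
    avg_end = sum(abs(gt[1] - res[1])
                  for gt, res in zip(ground_truth, identified)) / n
    return avg_start, avg_end
```

For example, a ground truth of `[(0.0, 100.0), (110.0, 200.0)]` and a result of `[(2.0, 99.0), (108.0, 204.0)]` give an average start deviation of 2.0 seconds and an average end deviation of 2.5 seconds.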
Runtime performance
In addition, computation times for feature extraction and training/classification will be measured.
Submission Format
Submissions to this task must conform to the format detailed below.
- \n is end of line
Implementation details
We recommend structuring your submission folder as follows:
/root_folder/... all the code you submitted
/root_folder/extract_feature/... all features you extracted
/root_folder/output/... the folder to save results
Sub task 1
Two inputs: live file list and studio song file list
One output: song ID sequence
Input file
The studio songs list file will be of the form:
/path/to/artist_1/studio/song/001.wav\n 1st
/path/to/artist_1/studio/song/002.wav\n 2nd
/path/to/artist_1/studio/song/003.wav\n 3rd
...
The live concert list file will be of the form:
/path/to/artist_1/live/concert/001.wav\n
Output file
The output is a list file (song ID sequence). A song ID is the position of the song in the input studio songs list file, not the file name of the *.wav file.
3\n <-- 003.wav is the first song of the set list in your identification result
17\n
59\n
...
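The mapping between song IDs and list-file lines can be sketched as follows. The code assumes that IDs are 1-based line positions, as the "1st", "2nd", "3rd" annotations suggest; all function names are hypothetical.

```python
def parse_list_file(lines):
    """Return the non-empty paths from the lines of a list file."""
    return [line.strip() for line in lines if line.strip()]


def path_for_song_id(studio_paths, song_id):
    """Map a 1-based song ID to its path in the studio songs list."""
    if not 1 <= song_id <= len(studio_paths):
        raise ValueError(f"song ID {song_id} is outside the studio list")
    return studio_paths[song_id - 1]


def format_song_sequence(song_ids):
    """Format a song ID sequence as output text, one ID per line."""
    return "".join(f"{song_id}\n" for song_id in song_ids)
```

A list file would typically be read with `parse_list_file(open(path))`, and the string returned by `format_song_sequence` written to a file in the output folder.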
Sub task 2
Three inputs: song ID sequence list, live file list and studio song file list
One output: time label of song list
Input file
The input is a list of song IDs (the song ID sequence). A song ID is the position of the song in the studio songs list file.
Your system should read the *.wav files in that order and find the time boundaries of each song.
3\n
17\n
59\n
...
The live concert list file will be of the form:
/path/to/artist_1/live/concert/001.wav\n
The studio songs list file will be of the form:
/path/to/artist_1/studio/song/001.wav\n 1st
/path/to/artist_1/studio/song/002.wav\n 2nd
/path/to/artist_1/studio/song/003.wav\n 3rd
...
Output file
The time boundary list file will be of the form:
- please round the time boundaries to the millisecond
- \t is tab space
Start time end time
hours.minutes.seconds.milliseconds \t hours.minutes.seconds.milliseconds\n (for input song ID 3)
hours.minutes.seconds.milliseconds \t hours.minutes.seconds.milliseconds\n (for input song ID 17)
hours.minutes.seconds.milliseconds \t hours.minutes.seconds.milliseconds\n (for input song ID 59)
...
Examples:
0.7.23.521	0.13.24.512
0.14.3.021	0.19.53.38
0.20.9.893	0.27.15.987
...	...
0.56.22.433	1.1.46.593
1.3.51.146	1.9.21.138
...
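The time format can be produced and parsed with a few lines of integer arithmetic. This sketch works in whole milliseconds to avoid floating-point rounding. It assumes that the last field is a millisecond count zero-padded to three digits on output (as in `0.14.3.021`) and that hours, minutes and seconds are not padded; the function names are hypothetical.

```python
def format_time_ms(total_ms):
    """Format a millisecond count as hours.minutes.seconds.milliseconds."""
    if total_ms < 0:
        raise ValueError("time must be non-negative")
    seconds, ms = divmod(total_ms, 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours}.{minutes}.{seconds}.{ms:03d}"


def parse_time_ms(text):
    """Parse hours.minutes.seconds.milliseconds into a millisecond count.

    Assumption: the last field is read as an integer number of milliseconds.
    """
    hours, minutes, seconds, ms = (int(part) for part in text.strip().split("."))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + ms


def boundary_line(start_ms, end_ms):
    """Build one output line: start time, tab, end time, newline."""
    return f"{format_time_ms(start_ms)}\t{format_time_ms(end_ms)}\n"
```

A boundary computed in seconds as a float can be converted with `round(seconds * 1000)` before formatting, which also applies the requested rounding to the millisecond.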
Packaging submissions
All submissions should be statically linked to all libraries (the presence of dynamically linked libraries cannot be guaranteed).
All submissions should include a README file with the following information:
- Which task you want to participate in (sub task 1, sub task 2 or both)
- Command line calling format for all executables and an example formatted set of commands
- Number of threads/cores used or whether this should be specified on the command line
- Expected memory footprint
- Expected runtime
- Any required environments (and versions), e.g. python, java, bash, matlab.
Time and hardware limits
Due to the potentially high number of participants in this and other audio tasks, hard limits on the runtime of submissions are specified.
A hard limit of 72 hours will be imposed on runs (total feature extraction and querying times). Submissions that exceed this runtime may not receive a result.
Potential Participants
name / email