Grand Challenge 2014: User Experience (2014:GC14UX)

From MIREX Wiki

Latest revision as of 14:25, 9 April 2015

Purpose

Holistic, user-centered evaluation of the user experience in interacting with complete, user-facing music information retrieval (MIR) systems.

Goals

  1. To inspire the development of complete MIR systems.
  2. To promote the notion of user experience as a first-class research objective in the MIR community.

Dataset

A set of 10,000 music audio tracks is provided for the GC14UX. It is a subset of tracks drawn from the Jamendo collection's CC-BY licensed works (http://www.jamendo.com/en/welcome).

The Jamendo collection contains music in a variety of genres and moods, but is largely unknown to most listeners. This mitigates the possible user experience bias induced by the differential presence (or absence) of popular or known music within the participating systems.

As of May 20, 2014, the Jamendo collection contains 14,742 tracks with the CC-BY license (http://creativecommons.org/licenses/by/3.0/). The CC-BY license allows others to distribute, modify, optimize and use a work as a basis for their own, even commercially, as long as they give credit for the original creation. This is one of the most permissive licenses possible.

The 10,000 tracks in GC14UX are sampled (with a view to maximizing musical variety) from the Jamendo collection with CC-BY license and made available for participants (system developers) to download to build their systems. It represents a randomly chosen subset of the content available at Jamendo that is published under the terms of the Creative Commons Attribution-Non-Commercial-ShareAlike (by-nc-sa) license, where user-supplied data has tagged a track with 1 or more genre categories. For more details about usage of this dataset, see the LICENSE.txt file contained in the downloaded files.

The dataset contains the MP3 tracks and the metadata the Jamendo site publishes on the respective items (represented in JSON format), retrieved using the site's API (6th Aug 2014). The dataset is available both as a zip file and as a tarball (you only need one of these). However, at 60+ GB it is a non-trivial file to download over the web, so we suggest you install a download manager extension in your browser, if you do not already have one, and use that. In a test using the DownThemAll! extension to Firefox, downloading the dataset between University of Illinois at Urbana-Champaign and Waikato University in New Zealand took a little under 2 hours.

You need to register to download the main dataset.

https://www.music-ir.org/mirex/gc14ux/


Metadata Extracted from JSON Files

The JSON files retrieved from the Jamendo site contain the following metadata fields:

  1. album_id
  2. album_image
  3. album_name
  4. artist_id
  5. artist_idstr
  6. artist_name
  7. audio
  8. audiodownload
  9. duration
  10. id
  11. license_ccurl
  12. musicinfo_lang
  13. musicinfo_speed
  14. musicinfo_acousticelectric
  15. musicinfo_vocalinstrumental
  16. musicinfo_gender
  17. musicinfo_tags_vartags
  18. musicinfo_tags_genres
  19. musicinfo_tags_instruments
  20. name
  21. position
  22. releasedate
  23. shareurl
  24. shorturl

2014:GC14UX:JSON Metadata presents statistics and plots of selected fields.
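
The underscore-separated field names above (for example musicinfo_tags_genres) look like flattened paths into nested JSON. The following is a minimal sketch of such a flattening, assuming each track's metadata is a nested JSON object; the sample record and its values are hypothetical and are not taken from the dataset, whose files may already be flat.

```python
import json


def flatten(record, prefix=""):
    """Flatten nested dicts into a single dict with underscore-joined keys."""
    flat = {}
    for key, value in record.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested objects, extending the key prefix.
            flat.update(flatten(value, prefix=name + "_"))
        else:
            flat[name] = value
    return flat


# Hypothetical sample record; the real files may be laid out differently.
sample_json = """
{
  "id": "12345",
  "name": "Example Track",
  "duration": 215,
  "artist_name": "Example Artist",
  "musicinfo": {
    "vocalinstrumental": "instrumental",
    "speed": "medium",
    "tags": {
      "genres": ["rock"],
      "instruments": ["guitar"],
      "vartags": ["happy"]
    }
  }
}
"""

flat = flatten(json.loads(sample_json))
for field in sorted(flat):
    print(field, "=", flat[field])
```

Lists such as the genre tags are kept as values rather than expanded, so a track tagged with several genres still yields a single musicinfo_tags_genres field.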

Participating Systems

Unlike conventional MIREX tasks, participants are not asked to submit their systems. Instead, the systems will be hosted by their developers. All participating systems need to be constructed as websites accessible to users through normal web browsers. Participating teams will submit the URLs to their systems to the GC14UX team.

To ensure a consistent experience, evaluators will see participating systems in a fixed-size window of 1024x768 pixels. Please test your system at this screen size.

See the Evaluation Webforms section below for a better understanding of our E6K-inspired evaluation system design.

Potential Participants

Please put your names and email contacts in the following table. You are encouraged to give your team a cool name!

| (Cool) Team Name | Name(s) | Email(s) |
|---|---|---|
| The MIR UX Master | Dr. MIR | mir@domain.com |
| Moody | Kai Xing, Pei-I Chen, Terry Lei, Jen-Yu Liu, Billy Vong, Yi-Hsuan Yang, Xiao Hu | xkaics@gmail.com, gooa1121@gmail.com, skyhelpme@gmail.com, ciauaishere@gmail.com, billyvg@gmail.com, affige@gmail.com, xiaoxhu@hku.hk |
| Tonic | D. Bountouridis, J.V. Balen, M. Rodriguez, S. Manoli, A. Aljanaki, F. Wiering, R.C. Veltkamp | d.bountouridis@uu.nl, J.M.H.VanBalen@uu.nl, ME.RodriguezLopez@uu.nl, stavma0891@gmail.com, a.aljanaki@uu.nl, F.Wiering@uu.nl, R.C.Veltkamp@uu.nl |

Evaluation

As the name of the Grand Challenge indicates, the evaluation will be user-centered. All systems will be used by a number of human evaluators and rated by them on several of the most important criteria for evaluating user experience.

Criteria

Note that the evaluation criteria or their descriptions may change slightly in the months leading up to the submission deadline, as we test and work to improve them.

Given that the GC14UX is all about how users perceive their experiences of the systems, we intend to capture the user perceptions in a minimally intrusive manner and not to burden the users/evaluators with too many questions or required data inputs. The following criteria are grounded in the literature of Human-Computer Interaction (HCI) and User Experience (UX), with careful consideration given to striking a balance between being comprehensive and minimizing evaluators' cognitive load.

Evaluators will rate systems on the following criteria:

  • Overall satisfaction: Overall, how pleasurable do you find the experience of using this system?
    Very unsatisfactory / Unsatisfactory / Slightly unsatisfactory / Neutral / Slightly satisfactory / Satisfactory / Very satisfactory

  • Learnability: How easy was it to figure out how to use the system?
    Very difficult / Difficult / Slightly difficult / Neutral / Slightly easy / Easy / Very easy

  • Robustness: How good is the system’s ability to warn you when you’re about to make a mistake and allow you to recover?
    Very Poor / Poor / Slightly Poor / Neutral / Slightly Good / Good / Excellent

  • Affordances: How well does the system allow you to perform what you want to do?
    Very Poor / Poor / Slightly Poor / Neutral / Slightly Good / Good / Excellent

  • Feedback: How well does the system communicate what's going on?
    Very Poor / Poor / Slightly Poor / Neutral / Slightly Good / Good / Excellent

  • Open Text Feedback: An open-ended question is provided for evaluators to give feedback if they wish to do so.

Evaluators

Evaluators will be users aged 18 and above. For this round, evaluators will be drawn primarily from the MIR community through solicitations via the ISMIR-community mailing list. The Evaluation Webforms developed by the GC14UX team will ensure that all participating systems get an equal number of evaluators.
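
One way to keep the number of evaluators per system balanced is to always hand out the system that has been evaluated least so far. The sketch below illustrates that idea only; it is an assumption about how such balancing could work, not a description of the GC14UX webforms, and the function and system names are hypothetical.

```python
from collections import Counter


def next_assignment(systems, counts, already_done):
    """Pick the least-evaluated system this evaluator has not yet rated."""
    candidates = [s for s in systems if s not in already_done]
    if not candidates:
        return None  # The evaluator has already rated every system.
    # Ties are broken by the order in which systems are listed.
    return min(candidates, key=lambda s: (counts[s], systems.index(s)))


systems = ["System A", "System B", "System C"]
counts = Counter()

# Six evaluators each request a single assignment.
for _ in range(6):
    chosen = next_assignment(systems, counts, already_done=set())
    counts[chosen] += 1

print(dict(counts))
```

Because evaluators may take several assignments, the already_done set keeps one person from rating the same system twice.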

Task for evaluators

To motivate the evaluators, they are given a defined yet open task:

You are creating a short video about a memorable occasion that happened to you recently, and you need to find some (copyright-free) songs to use as background music.

The task is to ensure that evaluators have a (more or less) consistent goal when they interact with the systems. The goal is flexible and authentic to the evaluators' lives ("a recent, memorable occasion"). As the task is not too specific, evaluators can potentially look for a wide range of music in terms of genre, mood and other aspects. This allows great flexibility and virtually unlimited possibility in system design.

Another important consideration in designing the task is the music collection available for this GC14UX: the Jamendo collection. Jamendo music is not well known to most users/evaluators, whereas many more commonly seen music information tasks are more or less influenced by users' familiarity with the songs and by song popularity. Through this task of "finding (copyright-free) background music for a self-made video", we strive to minimize the need to look for familiar or popular music.

Evaluation results

Summary statistics of the scores given by all evaluators will be reported: the mean and the average deviation. Meaningful text comments from the evaluators will also be reported.
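
As an illustration of the two reported statistics, the sketch below computes the mean and the average deviation for one criterion. It assumes "average deviation" means the mean absolute deviation from the mean and that ratings are coded as numbers; the sample scores are made up.

```python
def mean(scores):
    """Arithmetic mean of a non-empty list of numeric scores."""
    return sum(scores) / len(scores)


def average_deviation(scores):
    """Mean absolute deviation of the scores from their mean."""
    centre = mean(scores)
    return sum(abs(value - centre) for value in scores) / len(scores)


# Made-up ratings from five evaluators for one system and one criterion.
ratings = [5, 6, 7, 4, 3]

print("mean:", mean(ratings))
print("average deviation:", average_deviation(ratings))
```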

Evaluation Webforms

Graders can take as many assignments as they wish on the My Assignments page. They can return to the evaluation page at any time by clicking the thumbnail of the submission.

[Figure: GCUX wireframe of the My Assignments page]

To assist the evaluators and minimize their burden, the GC14UX team will provide a set of evaluation forms which wrap around the participating systems. As shown in the following image, the evaluation webforms are for scoring the participating systems, with their client interfaces embedded within an iframe on the left side of the webform.

[Figure: GCUX wireframe of the evaluation webform]
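
To make the wrapping idea concrete, the sketch below generates a bare-bones HTML page that embeds a participating system in a 1024x768 iframe next to a list of criteria. It is an illustration under stated assumptions, not the GC14UX team's webform: the function name, the page layout and the example URL are all hypothetical.

```python
from html import escape

# Fixed window size stated for participating systems.
FRAME_WIDTH, FRAME_HEIGHT = 1024, 768


def build_evaluation_page(system_url, criteria):
    """Return HTML embedding system_url in an iframe beside the criteria."""
    questions = "\n".join(f"      <li>{escape(name)}</li>" for name in criteria)
    return f"""<!DOCTYPE html>
<html>
  <body>
    <iframe src="{escape(system_url, quote=True)}" width="{FRAME_WIDTH}" height="{FRAME_HEIGHT}"></iframe>
    <ol>
{questions}
    </ol>
  </body>
</html>"""


page = build_evaluation_page(
    "https://example.org/my-system?lang=en&mode=demo",  # hypothetical URL
    ["Overall satisfaction", "Learnability", "Robustness",
     "Affordances", "Feedback"],
)
print(page)
```

Participants can use a page like this to check how their system behaves when embedded, for example whether it renders properly at the fixed size and whether it allows itself to be framed at all.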

Organization

Important Dates

  • July 1: announce the GC
  • Sep. 28th: deadline for system submission
  • Oct. 5th: start the evaluation
  • Oct. 21st: close the evaluation system
  • Oct. 29th: announce the results
  • Oct. 31st: MIREX and GC session in ISMIR2014

What to Submit

A URL to the participating system.

Contacts

The GC14UX team consists of:

J. Stephen Downie, University of Illinois (MIREX director)
Xiao Hu, University of Hong Kong (ISMIR2014 co-chair)
Jin Ha Lee, University of Washington (ISMIR2014 program co-chair)
Yi-Hsuan (Eric) Yang, Academia Sinica, Taiwan (ISMIR2014 program co-chair)
David Bainbridge, Waikato University, New Zealand
Kahyun Choi, University of Illinois
Peter Organisciak, University of Illinois

Inquiries, suggestions, questions, and comments are all highly welcome! Please contact Prof. Downie (jdownie@illinois.edu) or anyone on the team.