In recent years, the market for Massive Open Online Courses (MOOCs) has been expanding. Alongside an ever-growing target audience, learning on MOOCs has begun to extend beyond visual material: a few education sites now use audio as their sole teaching material. Interaction with visual content has been widely explored; however, interactive solutions for audio content have not progressed at the same pace. Consider a traditional analogy: how could a user fully control what they’re listening to on the radio without even looking at it? How could they take notes without pen and paper? Now replace the radio with the podcast player on the user’s smartphone, which neither Siri nor Google Assistant supports.
The goal of this project is to provide a seamless experience in which users fully control the educational audio they’re listening to using voice commands. In turn, the project is intended to increase learning engagement among people who need to learn while performing activities with a low cognitive load, such as repetitive tasks (for example, assembly-line workers and tractor drivers).
Project duration
8 weeks including design and MVP development
Team members + Roles
Terry (me) - UX research | UX design | project management
Matthias - research assist | development
Bruno - research assist
Adobe XD | Miro | Wit.ai | React Native
Voice Command Prototype

Qualitative + Quantitative
A mixed-method research strategy was employed to garner a broad yet detailed perspective on the user experience. This approach combined anonymized insights from a study conducted by the University of Helsinki and our team’s qualitative research.
Insights from the University of Helsinki study:
- A significant portion of users engaged with MOOC content while multitasking.
- Current audio learning tools in MOOCs are underutilized due to their limited interactive capabilities.
- There is a noticeable pattern of user preference for audio content that allows multitasking without visual dependency.
Outcomes of our qualitative research:
- Users expressed a desire for more intuitive and hands-free control over their learning content, akin to the convenience offered by modern voice-activated home assistants.
- There was a recurring theme of frustration over the need to visually interact with devices to control playback or take notes, which was especially pronounced among users in motion (e.g., driving, exercising).
- Participants indicated a high interest in features such as smart note-taking, contextual content suggestions, and navigational voice commands within audio materials.
Competitive audit
During the project's inception, the market lacked a comprehensive solution that provided users with complete and intuitive control over audio content. Consequently, this project demonstrates significant potential to deliver value to its target audience by filling this gap.
Our personas are fictional yet data-informed characters that represent our core user base. These archetypes, derived from research and user interviews, embody the goals, challenges, and behaviors of our intended audience.
Journey map
This map combines the actions, emotions, and thoughts that users encounter along their journey. It serves as a strategic tool that lays out the user's path in detail, highlighting moments of engagement, potential friction points, and opportunities for delight.
Scenario + User flow
The scenario depicts our Audio Widget in action. Here, we illustrate the practical application of the widget through narratives that detail how our personas interact with the product within their natural environments. The flow is designed to be intuitive and efficient, ensuring a smooth user experience.
To AI or not to AI?
The Audio Widget project set out to improve the educational experience by seamlessly integrating voice control with audio content. This initiative was born from recognizing a market gap: no existing solution offered users intuitive control over their audio learning materials.
- Initial Concept Exploration: Our early ideation phase explored diverse interaction methods, including gesture-based control. However, this concept was set aside due to its inherent complexity - the challenges in phone placement and the requirement for users to memorize gestures made it impractical for our diverse user base.
- Focusing on Voice Interaction: The team then deliberated between simple voice commands and a more sophisticated AI-assisted companion using Natural Language Processing (NLP). The AI option, with its potential for a conversational and intuitive user experience, aligned more closely with our vision.
- Language Compatibility and Technical Feasibility: A key technical consideration was the inclusion of Finnish language support, vital for testing the MVP with Finnish farmers, facilitated by our stakeholders. Our research led us to wit.ai, which met our technical specifications and language requirements, thereby cementing our decision to pursue the AI-driven approach.
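As a rough illustration of how an NLP-assisted companion differs from fixed voice commands, the sketch below maps a parsed utterance to a player action. It is not the project's actual implementation: the `ParsedUtterance` shape is only loosely modeled on the intent-and-confidence pairs that a service such as wit.ai returns, and the intent names, the `PlayerAction` type, and the 0.7 confidence threshold are all assumptions made for this example.

```typescript
// Hypothetical shape of an utterance after NLP parsing (assumed, simplified).
interface ParsedUtterance {
  text: string;
  intents: { name: string; confidence: number }[];
}

// Hypothetical actions the audio player could perform.
type PlayerAction =
  | { kind: "play" }
  | { kind: "pause" }
  | { kind: "rewind"; seconds: number }
  | { kind: "addNote"; body: string }
  | { kind: "clarify"; prompt: string };

// Assumed threshold below which the companion asks the user to repeat.
const MIN_CONFIDENCE = 0.7;

function resolveAction(utterance: ParsedUtterance): PlayerAction {
  // Pick the intent with the highest confidence without mutating the input.
  const best = [...utterance.intents].sort(
    (a, b) => b.confidence - a.confidence
  )[0];

  if (!best || best.confidence < MIN_CONFIDENCE) {
    return { kind: "clarify", prompt: "Sorry, could you repeat that?" };
  }

  // Intent names below are illustrative, not taken from the project.
  switch (best.name) {
    case "play_audio":
      return { kind: "play" };
    case "pause_audio":
      return { kind: "pause" };
    case "rewind_audio":
      return { kind: "rewind", seconds: 15 };
    case "take_note":
      return { kind: "addNote", body: utterance.text };
    default:
      return { kind: "clarify", prompt: "I can't do that yet." };
  }
}
```

Falling back to a clarifying prompt rather than guessing keeps the interaction conversational, which is the main benefit the AI-assisted option was chosen for.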
From paper to digital
Paper -> Digital wireframe -> Lo-fi prototype
Paper Wireframing began with sketching initial concepts on paper, allowing rapid exploration and iteration of basic layout and navigation ideas. We transitioned to digital wireframes using tools like Adobe XD for refined visualization of the user interface, focusing on elements like buttons and menus. Then, we developed Low-Fidelity prototypes to simulate user interactions and test functionality, providing early insights into usability and user feedback.
Early evaluation
In the early evaluation phase, we employed a chatbot exercise, not as a feature of the app itself, but as a tool to categorize and refine the pages and features of the Audio Widget application. This approach allowed us to simulate user interactions with the app and to understand how users navigate its functionality and content. The exercise was instrumental in identifying navigational challenges and in ensuring that the app's structure was intuitive and aligned with user needs.
Hi-fi prototype
The high-fidelity (hi-fi) prototyping phase of the Audio Widget marked a significant advancement in our design process. Using Adobe XD's often-underused voice command and feedback features, we were able to build and test voice interactions that we had initially planned to simulate with the Wizard of Oz method.
Advantages over the Wizard of Oz method:
- Realism and Interactivity: Adobe XD’s voice features provided a more authentic and interactive experience for testers, giving them a true sense of the widget's voice interaction capabilities.
- Efficiency: This method streamlined the prototyping process, allowing for rapid iterations based on user feedback without the logistical complexities of the Wizard of Oz setup.
- User Feedback Accuracy: The direct engagement with the prototype facilitated the collection of more accurate and relevant user feedback, crucial for refining the voice command functionality.
We also included an instructional note for the user evaluation that lists the voice commands the prototype can recognize.
Prototype evaluation
Guided Thinking-Aloud
We employed a guided thinking-aloud protocol for evaluating the prototype. Users were given specific instructions on which voice commands to use at different stages of their interaction with the Adobe XD prototype. Instructions were crafted to be clear and easily understandable to prevent user frustration and to allow for a focus on the interaction with the prototype. The tasks within the prototype were designed to mirror real-world scenarios, ensuring the evaluation results were relevant and actionable. 
Insights from prototype evaluation
- User Proficiency: Participants, representative of our target audience, were able to navigate the prototype with ease, indicating the design's effectiveness.
- Feature Relevance: The expressed intent to use the widget in conjunction with favored audio services suggests that the features offered are relevant and desirable to the target audience.
- Market Opportunity: The lackluster performance of existing AI assistants in meeting our users' needs points to a market opportunity for a specialized tool like the Audio Widget.
- Design Validation: The successful completion of tasks by users confirms the prototype's design is on the right track, with a clear value proposition.
Key takeaways
What's next
