Terry Minh's Portfolio - Hands-Free Learning: To AI or not to AI

Overview

In recent years, the market for Massive Open Online Courses (MOOCs) has been expanding. Interaction with visual content has been widely explored; however, interactive solutions for audio content have not progressed at the same pace.

"How do we listen and take notes without using a pen and paper?"

Goals

The goal of this project is to give users a seamless way to fully control the audio educational materials they're listening to using voice commands. In turn, the project will improve learning engagement among people who need to learn while performing low-cognitive, repetitive tasks (assembly-line factory workers, farm tractor drivers, etc.). We also aim to make the solution open-source.

Team members + Roles

Terry (me) - UX research | UX design | project management

Matthias - research assist | development

Bruno - research assist

Tools

Adobe XD | Miro | Wit.ai | React Native

Voice Command Prototype

I know you're eager to try out the final design, but read the instructions first: https://docs.google.com/document/d/1Eh8AbV00FkZFs-3HfSnzrqu_1oLY0TVT/edit?usp=sharing&ouid=111088220760311031773&rtpof=true&sd=true


Research

Qualitative + Quantitative

A mixed-method research strategy was employed to garner a broad yet detailed perspective on the user experience. This approach combined anonymized insights from a study conducted by the University of Helsinki and our team’s qualitative research.

Insights from the University of Helsinki study "Low cognitive activities":

- A significant portion of users engaged with MOOC content while multitasking.
- Current audio learning tools in MOOCs are underutilized due to their limited interactive capabilities.
- There is a noticeable pattern of user preference for audio content that allows multitasking without visual dependency.

Outcomes of our qualitative research:

- Users expressed a desire for more intuitive and hands-free control over their learning content, akin to the convenience offered by modern voice-activated home assistants.
- Users were frustrated by the need to visually interact with devices to control playback or take notes; this was especially pronounced among users in motion (e.g., driving, exercising).
- Participants indicated a high interest in features such as smart note-taking, contextual content suggestions, and navigational voice commands within audio materials.

Competitive audit

During the project's inception, the market lacked a comprehensive solution that provided users with complete and intuitive control over audio content. Consequently, this project demonstrates significant potential to deliver value to its target audience by filling this gap.

We found that the "save note" feature in Coursera, though mainly a visual interaction, was very useful and highly usable for learners. Therefore, we posed the question: "How can we turn this feature into an audio interaction?"

Persona

Journey map

Scenario + User flow

Ideation

To AI or not to AI?

- Initial Concept Exploration: Our early ideation phase explored diverse interaction methods, including gesture-based control. However, this concept was set aside due to its inherent complexity - the challenges in phone placement and the requirement for users to memorize gestures made it impractical for our diverse user base.
- Focusing on Voice Interaction: The team then deliberated between simple voice commands and a more sophisticated AI-assisted companion using Natural Language Processing (NLP). The AI option, with its potential for a conversational and intuitive user experience, aligned more closely with our vision. The team also considered background noise, which might interfere with both the audio and the users' voices. Our research later found that users wore noise-cancelling headsets during work, which supported the audio interaction approach.
- Language Compatibility and Technical Feasibility: A key technical consideration was the inclusion of Finnish language support, vital for testing the MVP with Finnish farmers, facilitated by our stakeholders. Our research led us to Wit.ai, which met our technical specifications and language requirements, thereby cementing our decision to pursue the AI-driven approach.
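To make the "simple voice commands" option concrete, here is a minimal sketch of what a keyword-based command parser could look like. It is illustrative only and not the project's implementation: the `PlayerAction` type, the `parseCommand` function, the rule table, and the 10-second default rewind are all assumptions, and it takes an already-transcribed English string as input rather than calling any speech or NLP service.

```typescript
// Hypothetical action set for an audio-learning widget; these names are
// assumptions for illustration and do not come from the project's code.
type PlayerAction =
  | { kind: "play" }
  | { kind: "pause" }
  | { kind: "rewind"; seconds: number }
  | { kind: "saveNote" }
  | { kind: "unknown"; transcript: string };

// Assumed default when the user says "rewind" without a duration.
const DEFAULT_REWIND_SECONDS = 10;

// Keyword rules, checked in order; the first matching rule wins.
const RULES: Array<{
  pattern: RegExp;
  toAction: (match: RegExpMatchArray) => PlayerAction;
}> = [
  {
    pattern: /\b(save|take)\s+(a\s+)?note\b/,
    toAction: () => ({ kind: "saveNote" }),
  },
  {
    pattern: /\b(go back|rewind)(?:\s+(\d+)\s+seconds?)?/,
    toAction: (match) => ({
      kind: "rewind",
      seconds: match[2] ? Number(match[2]) : DEFAULT_REWIND_SECONDS,
    }),
  },
  { pattern: /\b(pause|stop)\b/, toAction: () => ({ kind: "pause" }) },
  {
    pattern: /\b(play|resume|continue)\b/,
    toAction: () => ({ kind: "play" }),
  },
];

// Maps a transcribed utterance to a player action, or "unknown" if no rule matches.
function parseCommand(transcript: string): PlayerAction {
  const text = transcript.trim().toLowerCase();
  for (const rule of RULES) {
    const match = text.match(rule.pattern);
    if (match) {
      return rule.toAction(match);
    }
  }
  return { kind: "unknown", transcript };
}
```

A fixed rule table like this only recognizes the phrasings someone has written down in advance, which is the limitation that makes an NLP-based approach attractive for conversational use.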

From paper to digital

Paper -> Digital wireframe -> Lo-fi prototype

Early evaluation

In the early evaluation phase, a chatbot exercise was employed, not as a feature of the app itself, but as a tool to categorize and refine the pages and features within the Audio Widget application. The exercise was instrumental in identifying navigational challenges and in ensuring that the app's structure was intuitive and aligned with user needs.

Hi-fi prototype

Utilizing Adobe XD's often-underused voice command and feedback features, we achieved a level of interaction and user testing that we had initially planned to reach with the Wizard of Oz method.

Advantages over the Wizard of Oz method:

- Realism and Interactivity: Adobe XD’s voice features provided a more authentic and interactive experience for testers, giving them a true sense of the widget's voice interaction capabilities.
- Efficiency: This method streamlined the prototyping process, allowing for rapid iterations based on user feedback without the logistical complexities of the Wizard of Oz setup.
- User Feedback Accuracy: The direct engagement with the prototype facilitated the collection of more accurate and relevant user feedback, crucial for refining the voice command functionality.

We also included an instruction note for the user evaluation that lists the voice commands the prototype can recognize.

Prototype evaluation

Guided Thinking-Aloud

Users were given specific instructions on which voice commands to use at different stages of their interaction with the Adobe XD prototype.

Insights from prototype evaluation

- User Proficiency: Participants were able to navigate the prototype with ease.
- Feature Relevance: The expressed intent to use the widget in conjunction with favored audio services suggests that the features offered are relevant and desirable to the target audience.
- Market Opportunity: The lackluster performance of existing AI assistants in meeting our users' needs points to a market opportunity for a specialized tool.
- Design Validation: The successful completion of tasks by users confirms the prototype's design is on the right track, with a clear value proposition.

Key takeaways

- Accessibility: tested different options to provide the best experience for the user when it comes to interactive audio.

- Language inclusion: addressed the specific target group - Finnish speakers.

- AI implementation: explored the potential of NLP as well as the ethics and EU regulations regarding the use of AI.

- User feedback: collected insights continuously to address the users' needs.

What's next

- MVP for larger scale evaluation

- AI training and optimization

- Transitioning to open-source
