Hire a web Developer and Designer to upgrade and boost your online presence with cutting edge Technologies

Sunday, June 25, 2023

Using AI To Detect Sentiment In Audio Files

 Imagine being able to unlock the emotional essence of audio. Dive into an article where you will build an app that evaluates audio files for positive and negative sentiments. The idea is that you will create an interface for uploading an audio file, then transcribe the contents into text before analyzing the text and assigning it a positive or negative score for how the tone is perceived. There are a few moving pieces you need to cobble together, including machine learning, natural language processing, speech-to-text conversion, and a UI framework.

I don’t know if you’ve ever used Grammarly’s service for writing and editing content. But if you have, then you no doubt have seen the feature that detects the tone of your writing.

It’s an extremely helpful tool! It can be hard to know how something you write might be perceived by others, and this can help affirm or correct you. Sure, it’s some algorithm doing the work, and we know that not all AI-driven stuff is perfectly accurate. But as a gut check, it’s really useful.

Grammarly tone detector

Now imagine being able to do the same thing with audio files. How neat would it be to understand the underlying sentiments captured in audio recordings? Podcasters especially could stand to benefit from a tool like that, not to mention customer service teams and many other fields.

An audio sentiment analysis has the potential to transform the way we interact with data.

That’s what we are going to accomplish in this article.

A screenshot of the audio sentiment analyzer we are building together in this tutorial.

The idea is fairly straightforward:

  • Upload an audio file.
  • Convert the content from speech to text.
  • Generate a score that indicates the type of sentiment it communicates.

But how do we actually build an interface that does all that? I’m going to introduce you to three tools and show how they work together to create an audio sentiment analyzer.

But First: Why Audio Sentiment Analysis? #

By harnessing the capabilities of an audio sentiment analysis tool, developers and data professionals can uncover valuable insights from audio recordings, revolutionizing the way we interpret emotions and sentiments in the digital age. Customer service, for example, is crucial for businesses aiming to deliver personable experiences. We can surpass the limitations of text-based analysis to get a better idea of the feelings communicated by verbal exchanges in a variety of settings, including:

  • Call centers
    Call center agents can gain real-time insights into customer sentiment, enabling them to provide personalized and empathetic support.
  • Voice assistants
    Companies can improve their natural language processing algorithms to deliver more accurate responses to customer questions.
  • Surveys
    Organizations can gain valuable insights and understand customer satisfaction levels, identify areas of improvement, and make data-driven decisions to enhance overall customer experience.

And that is just the tip of the iceberg for one industry. Audio sentiment analysis offers valuable insights across various industries. Consider healthcare as another example. Audio analysis could enhance patient care and improve doctor-patient interactions. Healthcare providers can gain a deeper understanding of patient feedback, identify areas for improvement, and optimize the overall patient experience.

Market research is another area that could benefit from audio analysis. Researchers can leverage sentiments to gain valuable insights into a target audience’s reactions that could be used in everything from competitor analyses to brand refreshes with the use of audio speech data from interviews, focus groups, or even social media interactions where audio is used.

I can also see audio analysis being used in the design process. Like, instead of asking stakeholders to write responses, how about asking them to record their verbal reactions and running those through an audio analysis tool? The possibilities are endless!


The Technical Foundations Of Audio Sentiment Analysis #

Let’s explore the technical foundations that underpin audio sentiment analysis. We will delve into machine learning for natural language processing (NLP) tasks and look into Streamlit as a web application framework. These essential components lay the groundwork for the audio analyzer we’re making.

Natural Language Processing #

In our project, we leverage the Hugging Face Transformers library, a crucial component of our development toolkit. Developed by Hugging Face, the Transformers library equips developers with a vast collection of pre-trained models and advanced techniques, enabling them to extract valuable insights from audio data.

A screenshot of a pre-trained model from Hugging Face called Transformers

With Transformers, we can supply our audio analyzer with the ability to classify text, recognize named entities, answer questions, summarize text, translate, and generate text. Most notably, it also provides speech recognition and audio classification capabilities. Basically, we get an API that taps into pre-trained models so that our AI tool has a starting point rather than us having to train it ourselves.

UI Framework And Deployments #

Streamlit is a web framework that simplifies the process of building interactive data applications. What I like about it is that it provides a set of predefined components that works well in the command line with the rest of the tools we’re using for the audio analyzer, not to mention we can deploy directly to their service to preview our work. It’s not required, as there may be other frameworks you are more familiar with.

Building The App #

Now that we’ve established the two core components of our technical foundation, we will next explore implementation, such as

  1. Setting up the development environment,
  2. Performing sentiment analysis,
  3. Integrating speech recognition,
  4. Building the user interface, and
  5. Deploying the app.

Initial Setup #

We begin by importing the libraries we need:

import os
import traceback
import streamlit as st
import speech_recognition as sr
from transformers import pipeline

We import os for system operations, traceback for error handling, streamlit (st) as our UI framework and for deployments, speech_recognition (sr) for audio transcription, and pipeline from Transformers to perform sentiment analysis using pre-trained models.

The project folder can be a pretty simple single directory with the following files:

  • app.py: The main script file for the Streamlit application.
  • requirements.txt: File specifying project dependencies.
  • README.md: Documentation file providing an overview of the project.

Creating The User Interface #

Next, we set up the layout, courtesy of Streamlit’s framework. We can create a spacious UI by calling a wide layout:

st.set_page_config(layout="wide")

This ensures that the user interface provides ample space for displaying results and interacting with the tool.

Now let’s add some elements to the page using Streamlit’s functions. We can add a title and write some text:

# app.py
st.title("🎧 Audio Analysis 📝")
st.write("[Joas](https://huggingface.co/Pontonkid)")

I’d like to add a sidebar to the layout that can hold a description of the app as well as the form control for uploading an audio file. We’ll use the main area of the layout to display the audio transcription and sentiment score.

Here’s how we add a sidebar with Streamlit:

# app.py
st.sidebar.title("Audio Analysis")
st.sidebar.write("The Audio Analysis app is a powerful tool that allows you to analyze audio files and gain valuable insights from them. It combines speech recognition and sentiment analysis techniques to transcribe the audio and determine the sentiment expressed within it.")

And here’s how we add the form control for uploading an audio file:

# app.py
st.sidebar.header("Upload Audio")
audio_file = st.sidebar.file_uploader("Browse", type=["wav"])
upload_button = st.sidebar.button("Upload")

Notice that I’ve set up the file_uploader() so it only accepts WAV audio files. That’s just a preference, and you can specify the exact types of files you want to support. Also, notice how I added an Upload button to initiate the upload process.
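Because the uploader is limited to WAV, Python's built-in wave module can inspect an upload before any transcription happens, for example to skip very long clips. The sketch below assumes the uploaded object behaves like a binary file, which Streamlit's uploaded files do. The helper names wav_duration_seconds and make_silent_wav are hypothetical, and the silent clip exists only so the example runs on its own.

```python
import io
import wave


def wav_duration_seconds(file_obj):
    """Return the length of an uncompressed (PCM) WAV file in seconds."""
    with wave.open(file_obj, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
    # Rewind so the same object can be handed to the transcriber afterwards.
    file_obj.seek(0)
    return frames / float(rate)


def make_silent_wav(seconds=1, rate=16000):
    """Build a mono, 16-bit silent clip in memory (stand-in for an upload)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * rate * seconds)
    buffer.seek(0)
    return buffer


clip = make_silent_wav(seconds=2)
print(wav_duration_seconds(clip))
```

In the app, a check like this could sit between the upload button and the transcription step. The wave module only reads PCM data, so a compressed WAV would raise wave.Error and should be handled like any other failed upload.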

Analyzing Audio Files #

Here’s the fun part, where we get to extract text from an audio file, analyze it, and calculate a score that measures the sentiment level of what is said in the audio.

The plan is the following:

  1. Configure the tool to utilize a pre-trained NLP model fetched from the Hugging Face models hub.
  2. Integrate Transformers’ pipeline to perform sentiment analysis on the transcribed text.
  3. Print the transcribed text.
  4. Return a score based on the analysis of the text.

In the first step, we configure the tool to leverage a pre-trained model:

# app.py
def perform_sentiment_analysis(text):
  model_name = "distilbert-base-uncased-finetuned-sst-2-english"

This points to a model in the hub that is a DistilBERT model fine-tuned on the SST-2 sentiment dataset. I like it because it’s focused on text classification and is pretty lightweight compared to some other models, making it ideal for a tutorial like this. But there are plenty of other models on the Hugging Face hub to consider.

Now we integrate the pipeline() function that does the sentiment analysis:

# app.py
def perform_sentiment_analysis(text):
  model_name = "distilbert-base-uncased-finetuned-sst-2-english"
  sentiment_analysis = pipeline("sentiment-analysis", model=model_name)

We’ve set that up to perform a sentiment analysis based on the DistilBERT model we’re using.

Next up, define a variable for the results that we get back when the pipeline analyzes the text:

# app.py
def perform_sentiment_analysis(text):
  model_name = "distilbert-base-uncased-finetuned-sst-2-english"
  sentiment_analysis = pipeline("sentiment-analysis", model=model_name)
  results = sentiment_analysis(text)

From there, we’ll assign variables for the score label and the score itself before returning it for use:

# app.py
def perform_sentiment_analysis(text):
  model_name = "distilbert-base-uncased-finetuned-sst-2-english"
  sentiment_analysis = pipeline("sentiment-analysis", model=model_name)
  results = sentiment_analysis(text)
  sentiment_label = results[0]['label']
  sentiment_score = results[0]['score']
  return sentiment_label, sentiment_score

That’s our complete perform_sentiment_analysis() function!
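It may help to see why the function indexes results[0]. A text classification pipeline returns a list with one dictionary per input text, and each dictionary carries a label and a score. The sketch below uses a hand-written list in that shape rather than real model output, so the numbers are purely illustrative, and unpack_results is a hypothetical helper name.

```python
def unpack_results(results):
    """Turn pipeline-style output into a list of (label, score) tuples."""
    return [(item["label"], float(item["score"])) for item in results]


# Hand-written stand-in shaped like pipeline output for two input texts.
# These values are illustrative only, not produced by the model.
sample_results = [
    {"label": "POSITIVE", "score": 0.91},
    {"label": "NEGATIVE", "score": 0.67},
]

for label, score in unpack_results(sample_results):
    print(f"{label}: {score:.2f}")
```

Since our function passes a single string, the list has one entry, which is why results[0] is enough.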

Transcribing Audio Files #

Next, we’re going to transcribe the content in the audio file into plain text. We’ll do that by defining a transcribe_audio() function that uses the speech_recognition library to transcribe the uploaded audio file:

# app.py
def transcribe_audio(audio_file):
  r = sr.Recognizer()
  with sr.AudioFile(audio_file) as source:
    audio = r.record(source)
    transcribed_text = r.recognize_google(audio)
  return transcribed_text

We initialize a recognizer object (r) from the speech_recognition library and open the uploaded audio file using the AudioFile class. We then read the audio into memory using r.record(source). Finally, we use the Google Speech Recognition API through r.recognize_google(audio) to transcribe the audio and obtain the transcribed text.

In a main() function, we first check if an audio file is uploaded and the upload button is clicked. If both conditions are met, we proceed with audio transcription and sentiment analysis.

# app.py
def main():
  if audio_file and upload_button:
    try:
      transcribed_text = transcribe_audio(audio_file)
      sentiment_label, sentiment_score = perform_sentiment_analysis(transcribed_text)

Integrating Data With The UI #

We have everything we need to display a sentiment analysis for an audio file in our app’s interface. We have the file uploader, a pre-trained language model to analyze the text, a function for transcribing the audio into text, and a way to return a score. All we need to do now is hook it up to the app!

What I’m going to do is set up two headers and a text area from Streamlit, as well as variables for icons that represent the sentiment score results:

# app.py
st.header("Transcribed Text")
st.text_area("Transcribed Text", transcribed_text, height=200)
st.header("Sentiment Analysis")
negative_icon = "👎"
neutral_icon = "😐"
positive_icon = "👍"

Let’s use conditional statements to display the sentiment score based on which label corresponds to the returned result. If the label matches none of the three, we call st.empty() to leave the section blank. Note that the DistilBERT model we configured only returns POSITIVE or NEGATIVE, so the NEUTRAL branch only comes into play if you swap in a model that has a neutral label.

# app.py
if sentiment_label == "NEGATIVE":
  st.write(f"{negative_icon} Negative (Score: {sentiment_score})", unsafe_allow_html=True)
elif sentiment_label == "NEUTRAL":
  st.write(f"{neutral_icon} Neutral (Score: {sentiment_score})", unsafe_allow_html=True)
elif sentiment_label == "POSITIVE":
  st.write(f"{positive_icon} Positive (Score: {sentiment_score})", unsafe_allow_html=True)
else:
  st.empty()
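The branches above differ only in the icon and the wording, so a lookup table is one way to keep them in a single place. This is a sketch rather than part of the original app: SENTIMENT_DISPLAY and format_sentiment are hypothetical names, and rounding the score to two decimals is my own choice.

```python
# Maps each label to the icon and text shown next to the score.
SENTIMENT_DISPLAY = {
    "NEGATIVE": ("👎", "Negative"),
    "NEUTRAL": ("😐", "Neutral"),
    "POSITIVE": ("👍", "Positive"),
}


def format_sentiment(label, score):
    """Return the display string for a label, or None if it is unknown."""
    entry = SENTIMENT_DISPLAY.get(label)
    if entry is None:
        return None
    icon, text = entry
    return f"{icon} {text} (Score: {score:.2f})"


print(format_sentiment("POSITIVE", 0.9876))
```

In the app, the returned string could be passed to st.write(), with the None case mapped to st.empty().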

Streamlit has a handy st.info() element for displaying informational messages and statuses. Let’s tap into that to display an explanation of the sentiment score results:

# app.py
st.info(
  "The sentiment label says whether the model reads the text as positive or negative. "
  "The score is the model's confidence in that label, from 0 to 1, so a higher score means a more certain result."
)
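One detail worth keeping in mind is that the pipeline's score is the model's confidence in whichever label it chose, so a high score can accompany a NEGATIVE label. If you would rather show a single number where higher always means more positive, one option is to fold the label into the sign. The function name signed_sentiment is hypothetical, and treating any other label as 0.0 is an assumption of this sketch.

```python
def signed_sentiment(label, score):
    """Map (label, confidence) to a value between -1 and 1."""
    if label == "POSITIVE":
        return score
    if label == "NEGATIVE":
        return -score
    # Any other label (for example NEUTRAL) is treated as the midpoint.
    return 0.0


print(signed_sentiment("NEGATIVE", 0.97))
```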

We should account for error handling, right? If any exceptions occur during the audio transcription and sentiment analysis processes, they are caught in an except block. We display an error message using Streamlit’s st.error() function to inform users about the issue, and we also print the exception traceback using traceback.print_exc():

# app.py
except Exception as ex:
  st.error("Error occurred during audio transcription and sentiment analysis.")
  st.error(str(ex))
  traceback.print_exc()
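Keep in mind that traceback.print_exc() writes to the server's error stream, so only someone reading the logs sees it. If you also want the details available inside the app, traceback.format_exc() returns the same text as a string. Here is a minimal sketch of that pattern. The names run_safely and failing_step are hypothetical, and failing_step only simulates a failure.

```python
import traceback


def run_safely(step, *args):
    """Run one step and return (result, error_details).

    error_details is None on success, otherwise the traceback as text.
    """
    try:
        return step(*args), None
    except Exception:
        return None, traceback.format_exc()


def failing_step(_path):
    # Simulated failure standing in for a transcription error.
    raise ValueError("could not read audio")


result, details = run_safely(failing_step, "clip.wav")
print(result is None)
print(details.strip().splitlines()[-1])
```

The returned text could then be shown with something like st.code(details), next to the st.error() message.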

This code block ensures that the app’s main() function is executed when the script is run as the main program:

# app.py
if __name__ == "__main__":
  main()

It’s common practice to wrap the execution of the main logic within this condition to prevent it from being executed when the script is imported as a module.

Deployments And Hosting #

Now that we have successfully built our audio sentiment analysis tool, it’s time to deploy it and publish it live. For convenience, I am using the Streamlit Community Cloud for deployments since I’m already using Streamlit as a UI framework. That said, I do think it is a fantastic platform because it’s free and allows you to share your apps pretty easily.

But before we proceed, there are a few prerequisites:

  • GitHub account
    If you don’t already have one, create a GitHub account. GitHub will serve as our code repository that connects to the Streamlit Community Cloud. This is where Streamlit gets the app files to serve.
  • Streamlit Community Cloud account
    Sign up for a Streamlit Community Cloud account so you can deploy to the cloud.

Once you have your accounts set up, it’s time to dive into the deployment process:

  1. Create a GitHub repository.
    Create a new repository on GitHub. This repository will serve as a central hub for managing and collaborating on the codebase.
  2. Create the Streamlit application.
    Log into Streamlit Community Cloud and create a new application project, providing details like the name and pointing the app to the GitHub repository with the app files.
  3. Configure deployment settings.
    Customize the deployment environment by specifying a Python version and defining environment variables.

That’s it! From here, Streamlit will automatically build and deploy our application when new changes are pushed to the main branch of the GitHub repository. You can see a working example of the audio analyzer I created: Live Demo.

Conclusion #

There you have it! You have successfully built and deployed an app that recognizes speech in audio files, transcribes that speech into text, analyzes the text, and assigns a score that indicates whether the overall sentiment of the speech is positive or negative.

We used a tech stack that only consists of pre-trained language models (via Transformers), the speech_recognition library for transcription, and a UI framework (Streamlit) that has integrated deployment and hosting capabilities. That’s really all we needed to pull everything together!

So, what’s next? Imagine capturing sentiments in real time. That could open up new avenues for instant insights and dynamic applications. It’s an exciting opportunity to push the boundaries and take this audio sentiment analysis experiment to the next level.

Behind The Curtains Of Wikipedia Redesign

The Wikipedia team shipped a redesign of one of the most ubiquitous and most visited websites on the web. Alex Hollender and Jon Robson led the work and generously discussed the effort with us in a thorough, wide-ranging interview that covers the design, development, and processes that went into the project.

Wikipedia is more than a website — it’s perhaps a cornerstone of the World Wide Web. For decades, the site has provided a model for collaborating online, designing long-form content layouts, and supporting internationalization.

One of the more endearing qualities of Wikipedia is its design, which is known for its utilitarian aesthetics that have stuck around since its 2001 inception. The site has undergone redesigns before, but they are rare and often introduce subtle updates.

This year, 2023, marks the first Wikipedia redesign since 2014. Alex Hollender and Jon Robson led the effort and were kind enough to discuss it with us. The following is an interview that delves into what changed in this latest design, getting into the process as well as design and development details that we all can learn from.

Interview #

Photo of Jon Robson and Alex Hollender

Geoff Graham: When I think of Wikipedia as a website, I think about the design first and foremost. It’s classic for its focus on function over aesthetics, yet often considered a relic along the same lines as Craigslist. How was it decided that “now” is the right time for a redesign?

Alex Hollender: You know, it’s funny, I think people sometimes assume that organizations make these super-calculated, methodical decisions, and maybe some do. What I’ve experienced more often are opportunistic decisions resulting from some combination of intuition and relationships. Nirzar Pangakar, the design director back in 2019, knew what the organization was hoping to accomplish in the coming years and understood that media and content on the internet were changing rapidly. He saw that we needed to set ourselves up with a better foundation to iterate on top of going forward. He also imagined how the website looked to newcomers and thought that making it a bit more familiar to them would offer a more inclusive experience. And I think he also sensed that in terms of the culture of the Wikipedia community, if we let any more time pass before making some changes, the conservativism and ossification would grow more and more intense, and projects like this would only become more difficult down the road.

Comparing screenshots before and after the Wikipedia redesign.

So it’s not like something was severely broken, or data was pointing us towards a specific problem or opportunity. There were a few concrete things we knew could be improved, but the driving force was Nirzar’s intuition regarding some of these larger things. He had a great relationship with the Chief Product Officer, Toby Negrin, and our team’s Product Manager, Olga Vasileva, and found an opportunity to get the project started. And because it can be somewhat difficult to articulate these sorts of intuitions, Nirzar, Olga, and I made a little design sprint to help others envision and understand the types of changes we could start with and where they might lead us.

Geoff: Wikipedia is more than just a website, right? It’s more like 300 sites where each instance is a different language. How do you approach a design system for a large network of sites like that? Is there a single, centralized source of truth, or is it something looser, depending on the locale?

Alex: Right, so there’s Wikipedia in over 300 languages, then there’s also a bunch of sister projects, including WikiData, Commons, WikiQuote, WikiSource, and others — all of which use the same interface. I’d say the needs are maybe 80-ish percent the same across all of the experiences. Then you’ve got things where specific languages need special functionality, or the WikiData search bar needs something extra, or the WikiSource “article” page has different needs from the Wikipedia one.

There’s, unfortunately, no single source of truth — we don’t even have all of the customizations and variations documented. A big part of being a designer here is just building a catalog in your mind over time. Different people know about different little nooks and crannies and would remind us like, “Hey, if you want to put a button there, you’re going to have to figure out something for project X in language Y because they’ve got a custom feature living in that spot currently.” It’s this very organic, emergent kind of thing where it’s just grown to fit people’s needs in a very unstructured, decentralized way. Super cool but quite difficult when you want to tweak some of the more fundamental/foundational parts of the experience.

Jon Robson: Before I worked on Wikipedia, I’d never worked on multilingual sites. There’s such a fascinating depth to it, for example, how numbering systems differ in different languages, how quotation marks should be considered translated content, how certain projects have content in two scripts, and how some projects add their own cultural flavor to the design. If you look at the Navajo Wikipedia website, they use a Navajo rug pattern which they’ve had since at least 2005.

Navajo Wikipedia website.

It was fascinating how during this redesign, every release risked disrupting something small, as it was impossible to audit everything that was happening in all those projects. We had to make peace with the fact that we might not be able to retain them all and that things would break, and we’d iterate and find a happy medium. Often it’s unclear who to talk to about these things within the organization. Some projects just notice our changes and adapt, while other communities are more vocal. So we have to work together to reconcile these extremes. I’ve been impressed with how Alex has remained so stoic as a designer despite the curve balls the project has thrown at him.

Geoff: I imagine there’s a fine balance when working on a redesign for a site that’s as ubiquitous and that has as long a legacy as Wikipedia. How important was maintaining a sense of familiarity with the design for users? And how constraining was that for introducing new design elements?

Alex: Ultimately, we were focused on delivering the best reading and editing experience we could, somewhat regardless of familiarity for experienced users. For example, moving the table of contents from being inline below the lead section to being a sidebar, from a familiarity perspective, was a huge shift, and a lot of experienced users couldn’t get past that. For them, it violated the platonic form of a Wikipedia article or something, like if the table of contents wasn’t inline, then the article wasn’t a Wikipedia article. And while they tried to justify that preference from a functionality standpoint, their reasons weren’t strong, and I think it was mostly about them being uncomfortable with the unfamiliar. Meanwhile, all of the testing and the functional justifications we, and some community members, put forth made it super clear that the sidebar was the better approach. So, that’s how we made that particular decision.

Jon: The table of contents going from within the article to outside the article also uncovered a lot of interesting innovations our community had made for certain articles. For example, in some articles, they’d converted the standard table of contents to a horizontal layout using some inline styles or only listed the top-level headings using display: none in CSS to hide the rest. These customizations were broken when we implemented our redesign, which has opened up interesting discussions about whether customizations should be core parts of the software and how they should work in the new design.

Alex: I think the question of familiarity came into play more in terms of the rollout and how much we could change at once. We were sensitive to the risk of upsetting this very small part of the community that has an outsized influence on our decisions. Our fear was they would try to shut the project down, which has happened with other projects, big and small, in the past. So, for example, we didn’t include an increased font size in the first version of the new interface, even though we (and many community members) strongly believed it would be a significant improvement. We know from past projects that typography is a particularly hot-button topic.

Geoff: Who else was involved in the redesign? What roles did they play, and how did you manage all the work?

Alex: As far as our team goes, it’s about 5-6 Engineers, a Product Manager, a Community Specialist, and someone on Quality Assurance. Pretty much everyone was involved in a meaningful way in terms of exploring design challenges and weighing in on various options. Olga, the Product Manager, and several of the Engineers are better than I am when it comes to thinking about certain challenges. One clear example is accessibility.

There were several community members who were close collaborators and hundreds of others who were more casually involved. The majority of that collaboration happens on Phabricator, which is our task-tracking system. Of course, the timing gets tricky because community members might jump in with ideas or concerns as we’re finishing up a feature, maybe just because they weren’t aware that the conversation had started a few months back or whatever.

And then there’s the Wikimedia Foundation (WMF) design team. Each member of the design team has their own product team they belong to, so involvement, for the most part, happens via design reviews. There was a bunch of overlap, particularly between the work we were doing and the stuff the editing team worked on, so I got to collaborate closely with that designer. Also, each designer is assigned a design mentor. So, Rita, who is my design mentor — and who also happens to be an incredible designer and person — was behind the scenes all along, helping me figure everything out.

To me, the whole process felt pretty inclusive. A lot of the time, it felt like the process and the conversations were guiding things more than any one individual, which is both cool and a little scary.

Geoff: Wikipedia has been used to study online text legibility (PDF) because of its heavy focus on content. Yet, there have been so many advances in web fonts and typography since the last significant Wikipedia redesign in 2004, from variable font formats and fluid typography to even newer stuff in CSS from this past year, like the super new text-wrap: balance and a new line height (lh) unit. What design considerations went into the text in the latest redesign?

Alex: As far as I understand, there was a typography refresh back in 2014, which succeeded in some ways but was also super contentious. In terms of design ownership, there’s an unwritten understanding that the volunteer community owns the content, and WMF owns the interface. And while the typography is clearly a fundamental part of the overall user experience of the site, it’s definitely on the content side of the content-interface divide, which makes it more difficult for us to work on.

Prior to this project, a lot of great work had already been done by the Design Systems Team regarding the font stack (which is critical, given all of the different language editions of Wikipedia), how the type sizing is declared (which has a big impact on the experience if you manually change the font size), and other things like that.

For this project, from a sort of 80/20 perspective, I think 80% of the room for improvement was managing the line length by adding a max-width, and increasing the base font-size value (which is hopefully coming soon). We did spend a bunch of time looking into other refinements that are forthcoming.

Jon: I actually worked on that typography refresh early in my career at the Wikimedia Foundation. It was contentious for two reasons: we added a limited container width for the content, and we used Helvetica Neue for the font. The latter was a problem due to the “open source” nature of the project, which the community felt strongly about. We compromised by preferring an open font when available, which I think was Linux Libertine at the time.

That project was a lot shorter in terms of time, and we had more important problems to solve, such as having a functioning mobile site and a WYSIWYG editor. So, no compromise could be found on the limited width front. But I was glad we finally got that in with this redesign, even if it came eight years later. Free knowledge is more a marathon than a sprint.

Alex: I do think it’s ironic that Wikipedia, one of the most popular text-based websites on the internet, doesn’t necessarily have a super strong typography practice, at least from a design perspective. Maybe a lot of that has to do with how varied the content is, how many different templates we have, and all of the different languages we need to support. Maybe it would have to almost be a language-by-language endeavor if we were ever to pull it off. I’m not sure.

Editor’s Note: The main discussion and prototype for the project’s typography efforts are available to view.

Geoff: Speaking of the differences in web design since 2004, the term “responsive web design” was also coined in that span of time. Wikipedia has no doubt had a mobile presence for some time, but were there any new challenges to make the site more responsive, given how best practices have evolved?

Alex: We set a soft goal of delivering a great experience down to a 500px browser width. I think it’s fairly uncommon for people to be using desktop or laptop devices with browsers that narrow. But these days, it’s pretty easy to achieve a fully-responsive site with CSS alone, so there didn’t seem to be much of a tradeoff there. Plus, we heard from a few editors that they often tile two or three browser windows side-by-side, so it can get narrow in those cases. The updated interface does feature three menus that can be pinned open as sidebars or collapsed as dropdowns, which is a configuration mainly for logged-in users in order to give them more control over their workstations. And the state of those menus is managed by JavaScript, which presented a slight challenge. Jon wrote a great article a few years ago about why we still have separate mobile and desktop sites.

I think another aspect of making things work well down to 500px was that we wanted to push ourselves to see how close we might be able to get to having one site for all devices, though we’re not quite there yet.

Jon: If I remember correctly, Alex and I had a good back-and-forth about that 500px threshold. In theory, we could have supported a breakpoint below that, and Alex had the mockups ready, but I was concerned that it would slow down development. Plus, the use case was not there as most of our users were resizing browsers, and we could back that up with data.

In fact, during the redesign, vocal members of our community pushed us to introduce an explicit viewport size in our markup because they were annoyed that the table of contents component was collapsing inconsistently in browsers. If you view the source, you’ll now see <meta name="viewport" content="width=1000">.

Note: You can even read the entire discussion about the change.

Showing DevTools highlighting the updated meta viewport in markup.

Geoff: I know front-end nerds will want to know how CSS is written and managed in this latest design because, well, I’m one of them! What does the process look like to make an edit to the styles?

Jon: You have to remember that Wikipedia — and the MediaWiki software that provides it — is quite old and very large, and some of our technology stack reflects that.

MediaWiki is primarily a progressively enhanced web page written in PHP, so we tend to ship HTML with vanilla JavaScript and CSS that enhances it. Our front end is really unusual in that we have no build scripts for our JavaScript and CSS. We write ES6 code without transpiling it, and we use LESS compiled at runtime in PHP, with heavy caching, for our CSS. HTML is provided by Mustache templates.

We are very conservative about what libraries and technologies we use, particularly if they are likely to have an impact on others in the stack. We use TypeScript in the project to validate our code using JSDoc blocks but do not write our code in TypeScript as many of our volunteers do not know the language, and we don’t want to alienate them.

There was talk about replacing LESS with a different CSS preprocessor, but we decided to retain the status quo we’ve used since 2013 because we don’t want to fragment our codebase. We currently use Mustache templates because that’s what we’ve used since 2014, but we hope to eventually phase those out and replace them with Vue.js templates.

All our code is open-sourced, which is pretty unusual and cool! So, if you ever see some visual thing that looks off or could be improved, we’re always happy to take PRs with CSS that fix it.

Geoff: Another nerdy but key question for you: how important were performance considerations to the redesign? What specific things do you look for in Wikipedia’s performance, and what tools do you use to measure them?

Jon: Performance is really important to us, as Wikipedia is global, and we have many projects growing in areas with slower internet connections. We have a performance dashboard that we monitor where we collect global data from our users using the NavigationTiming API. And we run automated synthetic performance tests using Sitespeed.io. This is all public, and anyone can dig into the data!
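Note: As a rough illustration of the kind of numbers the NavigationTiming API exposes, here is a minimal TypeScript sketch. It is not Wikipedia’s instrumentation: `NavigationEntryLike` and `summarizeNavigation` are hypothetical names, and only the four timing fields are taken from the browser’s `PerformanceNavigationTiming` entries.

```typescript
// Subset of the fields found on a PerformanceNavigationTiming entry.
// The interface name is an assumption made for this sketch.
interface NavigationEntryLike {
  startTime: number;
  responseStart: number;
  domContentLoadedEventEnd: number;
  loadEventEnd: number;
}

// Derive a few common durations (in milliseconds) from one navigation entry.
function summarizeNavigation(entry: NavigationEntryLike) {
  return {
    timeToFirstByte: entry.responseStart - entry.startTime,
    domContentLoaded: entry.domContentLoadedEventEnd - entry.startTime,
    pageLoad: entry.loadEventEnd - entry.startTime,
  };
}

// In a browser, an entry could be read like this:
//   const [entry] = performance.getEntriesByType("navigation") as PerformanceNavigationTiming[];
//   if (entry) console.log(summarizeNavigation(entry));
```

The function takes a plain object so it can be exercised outside a browser. How the collected values are sent to a dashboard is left out on purpose.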

Wikipedia Performance Dashboard homepage.

One of the biggest concerns for this redesign project was how replacing the internal search feature might lose users if it became too slow or unresponsive. We added instrumentation specifically designed to monitor this, and there’s a detailed write-up on how we analyzed the findings with synthetic performance tests.

Besides thinking about performance for specific features, we monitor bundle sizes of our render-blocking CSS assets, and our CI pipeline blocks anything that goes over our performance budget. We also run spikes to see if there are additional ways to improve performance. For example, in a quiet period, we ran a spike, which made our mobile site 300ms faster.
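Note: A budget check of the kind Jon describes could be sketched as follows. This is an assumption-laden illustration, not the Wikimedia CI code: the function name `checkBudget`, the asset names, and the byte limits are all invented for the example.

```typescript
// Hypothetical shape: asset name -> size (or size limit) in bytes.
type AssetSizes = Record<string, number>;

interface BudgetViolation {
  asset: string;
  size: number;
  limit: number;
}

// Return every budgeted asset whose measured size exceeds its limit.
// Assets without a measured size are skipped.
function checkBudget(sizes: AssetSizes, budget: AssetSizes): BudgetViolation[] {
  const violations: BudgetViolation[] = [];
  Object.keys(budget).forEach((asset) => {
    const size = sizes[asset];
    const limit = budget[asset];
    if (size !== undefined && limit !== undefined && size > limit) {
      violations.push({ asset, size, limit });
    }
  });
  return violations;
}

// Example with made-up numbers: one stylesheet is over its limit.
const violations = checkBudget(
  { "skin.css": 9000, "print.css": 1000 },
  { "skin.css": 8000, "print.css": 2000 }
);
// A CI step could fail the build whenever violations.length > 0.
```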

Given that we have hundreds of volunteers and staff collaborating on the codebase, it’s a challenge to uphold our own high-performance standards. We’re currently working on implementing a performance budget across all our projects to formally enforce this and share the knowledge more widely for everyone to reference.

Geoff: Alex, you’ve noted that one of the goals you defined for the project was to “develop a more flexible interface with an eye towards future features.” What makes the new interface more flexible compared to how it was before, and what future features are you anticipating?

Alex: A small example of a new feature is the sticky header, which is currently only available when you are logged into the site. We built it knowing that for different types of pages, like article pages versus discussion pages versus help pages, et cetera, we would want to put different types of tools in the sticky header. That forethought can save a lot of time and complexity in terms of development.

Another aspect of flexibility, or maybe more specifically, extensibility, is information architecture. The previous interface had two different places for page tools: in the sidebar menu on the left and then above the article title. So, whenever we worked on a new page tools feature, we had to decide where it would go. Creating a clearer and more structured information architecture for the site means there’s one place for page tools, one for global navigation, and so on. I think this will make it easier for us to design new features in the future.

In terms of future features, we’re thinking about reading settings: dark mode, the ability to increase and decrease the font size and line height more easily, and maybe even themes like the Wikipedia apps have. We’re also thinking about ways to help people discover more knowledge related to what they are reading. Other things we might consider are reading features, like the ability to take notes and create collections of articles.

The concept for a dark mode design on a Wikipedia article page.

Geoff: Thanks so much to you both for spending some time to share your work with us! Is there anything especially interesting about the design or the work it took to make it that might not be immediately obvious but that you are proud of?

Alex: I think it’s cool to think about super small things that have a big impact. Links are a critical part of the reading experience, and following from that, knowing which links you’ve already visited is important. Previously, there was so little contrast between visited links and black text that this whole sort of navigational wayfinding benefit was missing from the experience. Changing the color of visited links was about as simple as a change can be from a technical perspective, with an outsized impact on the user experience.

Another thing I’m interested in and excited about is prototyping, specifically how additional fidelity in prototypes affects the design process. I reached a point where I was predominantly making prototypes with HTML, CSS, and JavaScript to work through design challenges rather than relying on mockups. It’s maybe impossible to know what impact that had in terms of the ability for us to have discussions about the designs, evaluate them, and include community members across many languages, among other things. There’s no way for us to know how the project would have turned out or how much longer it would have taken us to arrive at certain decisions if I hadn’t taken that approach, but my inclination is that it was super helpful.

Jon: The thing I’m most excited about is that the redesign project gave us the time to really pull apart a system that was 21 years old and build the foundation for something more sustainable. Fundamental things like introducing design tokens across the entire software stack are going to be powerful tools that we can use to support user customizations that allow people to change font size and enable a dark mode, the latter of which has been a popular request. So hopefully, we can finally deliver that.

Friday, June 23, 2023

Designing Sticky Menus: UX Guidelines

 Are sticky headers always a good idea? Best practices for designing sticky headers, with examples, UX guidelines and usability considerations.

We often rely on sticky headers to point users’ attention to critical features or calls to action. Think of sidebar navigation, CTAs, sticky headers and footers, “fixed” rows or columns in tables, and floating buttons. We’ve already looked into mobile navigation patterns in Smart Interface Design Patterns, but sticky menus deserve a closer look.

As users scroll, a sticky menu always stays in sight. And typically, it’s considered to be a good feature, especially if the menus are frequently used and especially if we want to speed up navigation.

Two examples of sticky menus with Sverigesradio on the left and TV Gids on the right
Multiple sticky menus in use: On Sverigesradio and TV Gids, with multiple chained sticky menus.

However, sticky menus also come with a few disadvantages. In his recent article, “Sticky Menus Are Problematic, And What To Do Instead,” Adam Silver discusses some common usability issues of sticky menus and how to solve them. Let’s take a closer look.

When Sticky Menus Are Useful

How do we decide if a menu should be sticky or not? This depends on the primary job of a page. If it’s designed to primarily convey information and we don’t expect a lot of navigation, then sticky menus aren’t very helpful.

A sticky bar on France TV
A helpful sticky bar for navigation through channels on France TV.

However, if we expect users to navigate between different views on a page a lot and stay on the page while doing so — as it often is on long landing pages, product pages, and filters — then having access to navigation, A-Z or tabs can be very helpful.

Also, when users compare features in a data table, sticky headers help them verify that they always look at the right piece of data. That’s where sticky headers or columns can help and aid understanding. That’s why sticky bars are so frequently used in eCommerce, and in my experience, they improve the discoverability of content and speed of interaction.

Keep Sticky Headers Small, But Large Enough To Avoid Rage Taps

The downside of sticky menus is that they typically make it more difficult for users to explore the page as they obscure content. Full-width bars on mobile and desktop are common, but they need to be compact, especially on narrow screens. And they need to accommodate accessible tap sizes to prevent rage taps and rage clicks.

A sticky bar navigation of a postal service
Postal Service from Iceland with four items in the sticky bar navigation (now changed). (Source: Posturinn)

Typically, that means we can’t have more than five items in the sticky bar navigation. The choice of the items displayed in the sticky menu should be informed by the most important tasks that users need to perform on the website. If you have more than five items, you might need to look into some sort of overflow menu, as displayed by Samsung.

Sticky overflow menu at Samsung.

Whenever users have to deal with forms on a page on mobile, consider replacing sticky menus with accordions. Virtual keyboards typically take up to 60% of the screen, and with a sticky bar in view, filling in a form quickly becomes nothing short of impossible.


Accessibility Issues of Sticky Menus

By their nature, sticky menus always live on top of the content and often cause accessibility issues. They break when users zoom in. They often block the content for keyboard users who tab through the content. They obscure links and other focusable elements. And there is often not enough contrast between the menu and the content area.

Example of a poor contrast between the sticky sub-menu-navigation and the content area
Poor contrast between the sticky sub-menu-navigation and the content area can cause accessibility issues. Discovered via NN/Group.

Whenever we implement a sticky menu, we need to make sure that focusable elements are still visible with a sticky menu in action. And this also goes for internal page anchors that need to account for the sticky bar with the scroll-padding property in CSS.
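To make the anchor-jump part concrete, here is a minimal TypeScript sketch that sets the scroll padding from the measured height of the sticky bar. The function name `applyScrollPadding`, the two small interfaces, and the 8px breathing room are assumptions made for illustration; in many cases a static `scroll-padding-top` value in the stylesheet would be enough.

```typescript
// Structural types so the function works with real DOM nodes or plain objects.
interface HasOffsetHeight {
  offsetHeight: number;
}
interface HasScrollPadding {
  style: { scrollPaddingTop: string };
}

// Reserve room at the top of the scroll container equal to the sticky bar's
// height plus a small gap, so in-page anchors are not hidden under the bar.
function applyScrollPadding(
  root: HasScrollPadding,
  header: HasOffsetHeight,
  gapPx: number = 8 // assumed breathing room, not a recommended value
): void {
  root.style.scrollPaddingTop = `${header.offsetHeight + gapPx}px`;
}

// In a browser this could be wired up as:
//   const header = document.querySelector<HTMLElement>("header");
//   if (header) applyScrollPadding(document.documentElement, header);
// and called again whenever the header's height changes.
```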

Avoid Multiple Scrollbars Of Long Sticky Menus

When sticky menus become lengthy, the last items on the list become difficult to access. We could make them visible with some sort of an overflow menu, but often they appear as scrollable panes, causing multiple scroll bars.

A large sticky sidebar navigation of the Australian Bureau of Statistics
Australian Bureau of Statistics with a large sticky sidebar navigation.

Not only does this behavior cause discoverability issues, but it’s also often a cause for mistakes and repetitive actions on a page. Ideally, we would prevent it by keeping the number of items short, but often it’s not possible or can’t be managed properly.

Example of an accordion menu on Smashing Magazine
Showing and hiding cart details when needed. On Smashing Magazine.

A way out is to show the menu as an accordion instead in situations when the space is limited, especially on mobile devices. That’s what we do at Smashing Magazine in the checkout, with a button that reveals and hides the contents of the cart when needed.

Partially Persistent Menus

Because sticky menus often take up too much space, we could reveal them when needed and hide them when a user is focused on the content. That’s the idea behind partially persistent headers: as a user starts scrolling down, the menu disappears, but then any scrolling up prompts the menu to appear again.

Partially persistent menu
Partially persistent menu on CB2, appearing when you need it, and disappearing when you don't need it.

The issue with this pattern is that sometimes users just want to jump back to a previous section of the page or double-check some details in a previous paragraph, and the menu often gets in the way. Page Laubheimer from NN/Group recommends using a slide-in animation that is roughly 300–400ms long and will preserve the natural feel without being distracting.
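The show-on-scroll-up logic can be sketched as a small TypeScript function. This is only one possible approach under stated assumptions: the name `nextHeaderState` and the 8px threshold (used to ignore tiny scroll movements) are hypothetical, and the slide-in itself would be handled separately, for example with a CSS transition in the 300–400ms range mentioned above.

```typescript
type HeaderState = "visible" | "hidden";

// Decide whether the header should be shown after a scroll event.
// Small movements below the threshold keep the current state, which
// reduces flicker when a user nudges the page slightly.
function nextHeaderState(
  previousY: number,
  currentY: number,
  current: HeaderState,
  thresholdPx: number = 8 // assumed value for illustration
): HeaderState {
  if (currentY <= 0) return "visible"; // always show at the top of the page
  const delta = currentY - previousY;
  if (Math.abs(delta) < thresholdPx) return current;
  return delta > 0 ? "hidden" : "visible";
}

// In a browser, a scroll listener could keep track of the last position:
//   let lastY = window.scrollY;
//   let state: HeaderState = "visible";
//   window.addEventListener("scroll", () => {
//     state = nextHeaderState(lastY, window.scrollY, state);
//     lastY = window.scrollY;
//     header.classList.toggle("is-hidden", state === "hidden");
//   }, { passive: true });
```

A larger threshold makes the menu less eager to reappear when users scroll up just a little to re-read a paragraph, at the cost of feeling less responsive.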

Alternatives To Sticky Menus

In some situations, we might not need a sticky menu after all. We can avoid their downsides with shorter pages, or lengthy pages which repeat relevant calls-to-actions or navigation within the page.

Tables of Contents displayed on UK Government and New Zealand Government websites.

We could display a table of contents at the top of the page and bring the user’s attention back to it with a back-to-top link at the bottom of the page.

Wrapping Up

Whenever the job of the page is to help users act, save, and compare, or we expect users to rely on navigation a lot, we might consider displaying sticky navigation. Sticky menus are most harmful when there isn’t enough space anyway, as is often the case with forms on mobile devices.

Sticky menus do come at a cost, as we need to account for usability and accessibility issues, especially for zooming, keyboard navigation, and anchor jumps. Add them if you need them, but be careful in plugging them in by default.

We need to prioritize what matters and remove what doesn’t. And too often, the focus should lie entirely on content and not navigation.

You can find more details on navigation UX in the video library on Smart Interface Design Patterns 🍣 — with a live UX training that’s coming up in September this year.

Further Resources

Of course, the techniques listed above barely scratch the surface. Here are wonderful articles around sticky headers, from design considerations to technical implementations:

How To Boost Your Design Workflow With Setapp

 Trying to keep up with everything but still constantly feeling overwhelmed by never-ending to-do lists? Spend some time now exploring efficient tools to save time in the future and speed up your workflow. Focus on what you do best, designing high-quality work, and let these tools handle the rest!

As someone who wears multiple hats, it is challenging to balance a full-time job, freelance projects, and all sorts of creative endeavors.

This is how I started off: By day, I’m a full-time product designer. By night, I juggle all sorts of freelance work and creative projects.

I am currently self-employed. However, there are challenges that come with being my own boss: Working with clients, sales and negotiation, invoicing, building a personal brand, crafting a content strategy, time tracking, project management… The list goes on.

Trying to keep up with everything used to be tough. No matter how hard I tried, my to-do list always seemed never-ending. I was constantly feeling overwhelmed.

I thought to myself, “There’s got to be a better way.”

After analyzing my workflow, I realized that many tasks could be simplified or automated so that I could save time, focus on high-value tasks, and work fewer hours.

After years of trial and error, I discovered a range of tools and strategies that helped me save time and stay organized to focus on what really matters.

The apps mentioned in this guide are available on Setapp. Whether you’re a Mac user or not, these hacks will help you get more done in less time and improve your quality of life. I hope you find value in this guide.

Streamline Your Workflow With the Best Apps

You can use Setapp to access 240+ apps on your Mac and iPhone under a single monthly subscription.

Personally, I use Setapp to do three things:

  1. Try out apps that could help save time. Some of these apps cost more than Setapp’s subscription, so it’s a relief that I do not need to pay for each one individually.
  2. For apps that I only need to use occasionally, I can quickly install and uninstall them as needed, with no extra cost. This saves me precious space on my Mac and ensures that I’m not cluttering up my system with unnecessary apps.
  3. Since Setapp’s library is updated regularly, I always get to try out new apps to further enhance my workflow.

Track Time & Eliminate Distractions

As a freelance designer, I need to track how much time I spend on each project to calculate my billable hours. I used to manually create events on my calendar and calculate the hours spent on each project. It was a waste of time and, sadly, inaccurate.

To solve this problem, you can use Timemator to track your time accurately and minimize distractions.

Timemator interface on different devices
Image credit: Timemator.

With Timemator, you can set up auto time-tracking rules for specific apps, files, or websites. For example, you can set rules so that the timer starts tracking when you work on a specific project on Figma or Adobe Photoshop.

The timer runs quietly in the background so that you can stay focused without any interruptions. You no longer need to manually start or pause the timer.

Pro tip: Use it to reduce distractions! Set up auto-tracking to track how much time you spend on meetings, talking to teammates or clients on Slack, or watching Netflix.

To help you identify where you’ve spent your time, Timemator gives detailed reports and analytics so you can reduce or eliminate time-wasting activities and get more done in less time.

Timemator reports
Image credit: Timemator.

The Only Font Manager You Need

As designers, we all know that font selection can make or break a creative project.

I was frustrated with Font Book (the default font manager on macOS). It wasn’t user-friendly. Searching and comparing fonts was a chore.

I found Typeface to be useful — especially when you need to quickly browse through your font collection, customize the preview text and size in real-time, and compare to see how different fonts look side-by-side.

Different fonts compared side-by-side

Over the years, I have saved up a huge font library. Typeface is able to load all my fonts quickly and remove duplicate fonts that bloat up my computer. It supports variable fonts and OpenType font features and has robust features for the busy designer.

Typeface’s feature to remove duplicate fonts

For fonts you don’t use often, you can choose to activate them only when necessary. This way, your computer stays clean and fast.

As a bonus, you can also easily organize fonts into custom collections or tags.

Fastest Way To Create Device Mockups

When designing, we often need to create high-quality, professional-looking phone, tablet, and computer mockups to showcase our designs.

I used to spend hours searching for device mockup templates and launching Adobe Photoshop in order to use those templates. The whole process was time-consuming, so I switched to a tool called Mockuuups Studio.

Examples of different mockups generated by Mockuuups Studio

All you need to do is drag and drop a screenshot of your website or app into it, pick a scene, and it will generate thousands of mockups. It’s pretty neat.

You can filter through scenes, models, and devices to find the perfect mockup for your digital product. Then, add hands, overlays, realistic shadows, or backgrounds to your device mockups. In the example above, I have filtered ‘iPhone’ mockups only.

Since it’s cloud-based, you can access it anywhere and collaborate with your teammates in real time too.

To further speed up your workflow, you can use their Figma, Sketch, or Adobe XD plugin. This is their Figma plugin:

Mockuuups Studio’s Figma plugin

Create Screenshots & Screen Recordings, Fast

When presenting designs (especially when working remotely), I take screenshots and screen recordings for my clients every day.

CleanShot X is a better solution than the default Mac screenshot tool. This is an essential tool for every Mac user.

To quickly take a screenshot, use this shortcut key on your Mac: Command + Shift + 4.

This tool gives you the convenience to record MP4 or GIF with your desktop icons hidden, capture scrollable content, and annotate, highlight, or blur screenshots to hide sensitive personal information.

An example of how I annotate my screenshots:

An example of an annotated screenshot

I’ve used this tool for years with zero complaints. This tool will make your life easier when sharing screenshots with clients or on social media.

A cool feature you’ll also love: You can capture and copy any text, so you’ll never have to manually retype it again!

Your workflow will become much more streamlined and efficient since you no longer get bogged down in the technical details.

It’s challenging to keep track of various meetings, their details, and attendees, especially when switching between Google Meet, Zoom, your email inbox, and calendars.

To solve this problem, you can use Meeter to schedule or join meetings with one click right from the menu bar on your Mac.

Meeter interface with scheduled Google Meet meetings

It supports Google Meet, Zoom, and Microsoft Teams. When you want to join a meeting, you no longer have to waste time searching for meeting links, then copy and paste the link into the browser. Instead, you can now focus on being present in every meeting.

The tool allows you to directly call your FaceTime contacts and phone numbers and jump into recurring calls from the menu bar too. Pretty simple!

Quick call list on Meeter

Save Time With Spotlight On Mac

When working with multiple files and apps on your Mac, you need to be able to quickly find and access them instead of navigating through different folders.

With Spotlight, you can do these things quickly. While this is not an app, it’s one of the most powerful features on Mac that can save you plenty of time.

To open Spotlight, simply hit Command + Spacebar on your keyboard and start typing.

Then, try these on Spotlight:

  • Perform quick calculations.
    No need to open a calculator app. Simply type in your calculation in Spotlight and hit enter. It’s that easy.
  • Search for apps.
    Quickly find any app on your Mac.
  • Search the internet.
    Type your search term, and it will launch your default browser with the search results. You’ve just saved a few clicks.
  • Find files or folders.
    Type in the name of the file or folder, and you have it.
Find files or folders feature in Spotlight
  • Check the weather.
    Type “weather” followed by your location, and it will give you up-to-date information on the current weather conditions and forecast.
Check the weather feature in Spotlight

Cool, right? Learning how to use Spotlight effectively is a game-changer. Give it a try, and see how much time you can save.

Design Accessible Interfaces

As a product designer who also builds websites for clients, it’s a challenge to find and create the perfect color palettes while working on multiple projects at once. In the past, I’ve had to rely on a combination of tools like swatch libraries and notes to keep track of my palettes.

If you’re a designer or a developer, you’ll love Sip — a powerful color picker that can help you design beautiful and accessible interfaces easily.

A color picker

With Sip, you can quickly grab colors right from the Mac menu bar and drop them into any design or development tool, including Adobe Photoshop, Figma, and Sketch. This makes it easy to create custom color palettes that match the client’s brand.

You can create and save custom color palettes, and the quick access menu that floats on the side of your desktop keeps them within reach.

A custom color palette

Currently, it supports 24 of the most popular color formats in the industry, like Android XML, CSS hex, RGB, and CMYK.

Now, my favorite feature is Sip’s Contrast Checker. In the example below, you can use the color picker to check the contrast between the gray text and white background, ensuring that it meets accessibility standards and is legible for all users.

Sip’s Contrast Checker

Tip: Always make sure the contrast between the text and background is greater than or equal to 4.5:1 for small text and 3:1 for large text. If the color contrast fails, click on the ‘FIX’ button to improve it!
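If you are curious how a contrast ratio like 4.5:1 is worked out, here is a short TypeScript sketch based on the WCAG 2.x relative luminance formula. It is not Sip’s implementation; the function names are my own, and it assumes colors are given as six-digit hex strings such as `#767676`.

```typescript
// Convert one 8-bit sRGB channel to its linear-light value (WCAG 2.x definition).
function linearize(channel: number): number {
  const c = channel / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a "#rrggbb" color, between 0 (black) and 1 (white).
function relativeLuminance(hex: string): number {
  const r = parseInt(hex.slice(1, 3), 16);
  const g = parseInt(hex.slice(3, 5), 16);
  const b = parseInt(hex.slice(5, 7), 16);
  return 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b);
}

// Contrast ratio between two colors; the order of the arguments does not matter.
function contrastRatio(foreground: string, background: string): number {
  const a = relativeLuminance(foreground);
  const b = relativeLuminance(background);
  return (Math.max(a, b) + 0.05) / (Math.min(a, b) + 0.05);
}

// Check against the thresholds from the tip: 4.5:1 for small text, 3:1 for large text.
function meetsContrast(foreground: string, background: string, largeText: boolean = false): boolean {
  return contrastRatio(foreground, background) >= (largeText ? 3 : 4.5);
}

// Gray text on a white background:
//   meetsContrast("#767676", "#ffffff")        // passes for small text
//   meetsContrast("#777777", "#ffffff")        // fails for small text
//   meetsContrast("#777777", "#ffffff", true)  // passes for large text
```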

Declutter Your Mac’s Menu Bar

If you have a bunch of apps running on your Mac, your menu bar may be cluttered with all sorts of icons and notifications.

Just like physical clutter, digital clutter takes up mental space and affects your focus, too! To solve this problem, you can use Bartender.

Bartender allows you to organize your menu bar icons into neat and tidy groups or hide them completely — as simple as that. You can collapse your menu bar icons into a customizable dropdown menu so it remains clutter-free.

An example of a customizable dropdown menu to hide bar icons

In the above example, most of my menu icons are hidden, except Figma and the battery level indicator.

After using it for over a month, I am able to focus better. It’s one of those subtle quality-of-life improvements that can have a big impact on your productivity and mindset.

Wrapping Up

I wish I had discovered these tools sooner!

The apps I’ve shared above are available on Setapp. With a single monthly subscription, you get access to 240+ Mac and iPhone apps. They offer a free 7-day trial, so you can try it out and decide if it’s right for you.

These tools have completely transformed my workflow and helped me become more productive and less stressed. I hope that these tools will do the same for you so you can make the most of your time. After all, time is a limited resource, and it’s up to us to use it wisely.

Thank you for reading. Have a productive day!