Microsoft Edge can now auto-generate alt text for images

Kevin Okemwa

Microsoft Edge

At the beginning of this year, Microsoft Edge introduced text prediction for Windows 11 and Windows 10 users, powered by AI and machine learning. Today, Microsoft announced that Edge can also provide auto-generated image labels using machine learning algorithms. The feature is an attempt to bridge the gap that images missing “alt text” create on the web for visually impaired users.

Previously, people who are blind or have low vision mostly relied on a screen reader to browse the web. The challenge is that when an image has no alternative text, the screen reader has nothing to read out, so it cannot give an accurate description of the image.
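To make the problem concrete, here is a minimal sketch of how one might find the images a screen reader cannot describe, that is, `img` elements with no `alt` attribute at all. It uses only Python's standard `html.parser`; the names `MissingAltFinder` and `find_images_missing_alt` are made up for this illustration and have nothing to do with how Edge itself detects unlabeled images.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collects the src of every <img> that has no alt attribute."""

    def __init__(self):
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr_map = dict(attrs)
        # An empty alt="" still counts as present: authors use it to mark
        # an image as decorative, so only a missing attribute is reported.
        if "alt" not in attr_map:
            self.missing.append(attr_map.get("src", ""))


def find_images_missing_alt(html):
    """Return the src values of images that lack an alt attribute."""
    finder = MissingAltFinder()
    finder.feed(html)
    finder.close()
    return finder.missing
```

For example, calling `find_images_missing_alt` on a page fragment with one unlabeled image, one labeled image, and one image with an empty `alt` reports only the first.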

To address this, Microsoft Edge now offers auto-generated alt text for images. Its main purpose is to improve the experience of screen reader users by helping them understand the meaning and significance of the images displayed.

To access this feature, all you need to do is turn on the “Get image descriptions from Microsoft for screen readers” setting. Once the setting is on, Microsoft Edge automatically sends unlabeled images to Azure Cognitive Services’ Computer Vision API for processing. However, the algorithms are not always accurate, and the quality of the descriptions may vary.

During processing, the Vision API analyzes the images and creates descriptions in 5 different languages. It can also recognize text inside images in over 120 different languages.
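As a rough illustration of what a description request to the Vision API can look like, the sketch below builds (but does not send) a call to the public Computer Vision v3.2 `describe` operation. Whether Edge uses this exact endpoint or API version is an assumption, as are the helper name `build_describe_request` and the placeholder endpoint and key; the article only says that unlabeled images are sent to the Computer Vision API.

```python
import json
import urllib.parse
import urllib.request


def build_describe_request(endpoint, subscription_key, image_url, language="en"):
    """Build a POST request asking the Vision API to describe an image.

    endpoint and subscription_key are placeholders for the caller's own
    Azure resource; nothing here reflects Edge's internal configuration.
    """
    query = urllib.parse.urlencode({"maxCandidates": 1, "language": language})
    url = f"{endpoint.rstrip('/')}/vision/v3.2/describe?{query}"
    body = json.dumps({"url": image_url}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the JSON response is then expected to carry the generated captions together with confidence scores.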

However, the system has some limitations: it does not generate descriptions for certain images, such as:

  • Images that are marked as “decorative” by the web site author. Decorative images don’t contribute to the content or meaning of the web site.
  • Images smaller than 50 x 50 pixels (icon size and smaller).
  • Excessively large images.
  • Images categorized by the Vision API as pornographic in nature, gory, or sexually suggestive.
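The exclusions above can be read as a simple eligibility check. The following is a minimal sketch under stated assumptions, not Edge's actual logic: `ImageInfo`, `skip_reason`, and the `MAX_SIDE_PX` limit are hypothetical (the article gives no figure for "excessively large"), "smaller than 50 x 50" is interpreted as either side being under 50 pixels, and the content flag is treated as an input even though in practice it would come back from the Vision API.

```python
from dataclasses import dataclass
from typing import Optional

MIN_SIDE_PX = 50        # from the article: images under 50 x 50 are skipped
MAX_SIDE_PX = 10_000    # hypothetical limit; the article gives no number


@dataclass
class ImageInfo:
    width: int
    height: int
    has_alt_text: bool = False
    marked_decorative: bool = False
    flagged_adult_or_gory: bool = False  # assumed to come from the Vision API


def skip_reason(image: ImageInfo, max_side_px: int = MAX_SIDE_PX) -> Optional[str]:
    """Return why an image gets no generated description, or None if eligible."""
    if image.has_alt_text:
        return "already labeled"
    if image.marked_decorative:
        return "decorative"
    # Assumption: an image is too small if either side is under the minimum.
    if image.width < MIN_SIDE_PX or image.height < MIN_SIDE_PX:
        return "too small"
    if image.width > max_side_px or image.height > max_side_px:
        return "too large"
    if image.flagged_adult_or_gory:
        return "flagged content"
    return None
```

Returning a reason string rather than a bare boolean makes it easy to see which of the listed limitations applied to a given image.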

The feature can also be controlled through an enterprise policy setting called AccessibilityImageLabelsEnabled, which can be used to disable it. It is currently available for Windows, Mac, and Linux users. There is no indication yet of when it will reach Android and iOS.