A model for making software global
For Microsoft’s Cloud + AI division, localization is by default a global, multidimensional proposition—encompassing translation, engineering at scale, deep and broad strategic partnerships, machine learning, and more. We manage a continuous technical and human conversation, with a central theme in customer experience.
How do we do this?
This is the story of the international team for Microsoft’s Cloud, including tips for organizations that may be currently global and customer-focused, or that may be on the journey there.
Our centralized international team is responsible for localization infrastructure and process for Microsoft Cloud’s products and services. This team aligns to support corresponding software and content development teams so that localization is considered from the start. Early integration ensures code is written with globalization in mind and can support customer expectations in locale-sensitive data—such as dates, times, and numbers—as well as geopolitical sensitivity.
Development teams connect to our localization infrastructure and process, exposing resources within code repositories for translation. An external localization supplier and their network of translators are also connected to our infrastructure and are an integral part of the process. These partnerships and related processes result in availability of our products in many languages in addition to English.
Yet effort in localization does not end with in-market availability of products. Effective localization requires feedback to make a round trip, from customer to product and back. After we translate all customer feedback, we triage, analyze, and route issues to improve customer experience in all languages we support.
A complement to the standard localization-supplier model is machine translation (MT). MT increases the availability of localized experiences to customers in overall coverage and speed of delivery. Ongoing reviews of translation quality help us evaluate both the technical accuracy and customer reception of MT. Our localization supplier collaborates with our team on this function.
We further augment this model with a growing community of developers and IT professionals who contribute directly to the localization of Microsoft’s open-source software user experience and documentation.
Finally, this model requires scalable infrastructure. Our tools and processes scan continuously for new translation needs across the portfolio to enable localization in over 100 languages, across hundreds of repositories, hundreds of thousands of files, at nearly 100 million words annually.
While our business is massive, any organization can use the frameworks described here, or aspects of them, to ensure products and services reach as many customers as possible, regardless of language or region.
Team lead Arthur Yasinski sums it up: “A unique aspect of our team is that we’re part of the complete customer journey. We handle localization of products, services and related content, documentation, and sites, allowing us to ensure consistency and quality across projects.”
Understand and build a framework for global readiness
Global-ready software is appropriate and compelling for customers in all localities. Respecting customers, communities, and experiential preferences is essential to its design. For example, how would expectations around display of data such as dates and currencies differ if you were based in the United States, Japan, France, or elsewhere? To ensure global readiness of products, Microsoft created and maintains an extensive compliance process—a framework—to help teams track and execute on critical software development requirements, including global readiness.
This framework tracks global readiness in three overall areas:
Geopolitical: Profanity, along with sensitive geographical and cultural expectations
Linguistic: Languages to cover, language laws, and language reforms
Technical: Internationalization capabilities
Development teams address most of these by using established tools, such as the international support built into operating systems, and need minimal support to do so. Things get interesting, however, when dealing with an evolving list of trickier “risk areas” that are particularly challenging for development teams.
Risk areas include language detection and selection, natural language processing, cross-service integration, integration with third-party services, aligning commerce functions with regional expectations, and more. Our international team assists software development teams in identifying these risks. From there, we help address the risks directly or consider mitigation options.
We get it: It’s complicated. Software development teams ask for clarity: “Give us a simple, concise overview of requirements for international success. Tell us what aspects are difficult to get right, point us in the right direction, and help us figure out how to handle it!”
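To make the earlier question about dates concrete, here is a minimal sketch in Python. The pattern table and the function name `format_short_date` are illustrative assumptions, not part of Microsoft's framework. The patterns are commonly seen numeric conventions; production code would normally take them from locale data such as the Unicode CLDR, or from the international support built into the platform, rather than from a hand-written table.

```python
from datetime import date

# Illustrative short-date patterns (assumptions for this sketch only).
# Real applications should take these from locale data, not a hard-coded table.
SHORT_DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",  # month first
    "ja-JP": "%Y/%m/%d",  # year first
    "fr-FR": "%d/%m/%Y",  # day first
}

def format_short_date(value: date, locale_tag: str) -> str:
    """Format a date with the short pattern assumed for the given locale."""
    # Unknown locales fall back to the unambiguous ISO 8601 form.
    pattern = SHORT_DATE_PATTERNS.get(locale_tag, "%Y-%m-%d")
    return value.strftime(pattern)

release = date(2021, 3, 4)
for tag in ("en-US", "ja-JP", "fr-FR"):
    print(tag, format_short_date(release, tag))
```

The same date renders as 03/04/2021, 2021/03/04, and 04/03/2021. A reader in the United States and a reader in France would interpret "03/04" as two different days, which is why locale-sensitive formatting is tracked as a readiness requirement rather than left to chance.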
Listen to customers—in all languages
Product feedback loop
Building great products and documentation requires understanding customer satisfaction and using feedback to self-correct. This is especially important in localization. For organizations operating at large scale, it’s critical.
How do we know if we delivered a useful experience for international customers?
A common sentiment may be to just accept that a company’s understanding of customers might always be limited. However, viable methods exist to collect, organize, and address international customer feedback.
Our approach to this challenge: gain understanding of customer perception by enabling submission of customer feedback text (“verbatims”) directly through software user interfaces (UI) and web sites. Our technical documentation sites alone get over one million customer feedback submissions per year.
This volume requires automation to aggregate, categorize, and analyze, so we can address problem areas. So, we created an analytics platform. The platform helps us draw insight and make data-driven decisions by bringing together feedback from multiple input points inside products and documentation sites. It then applies machine-learning models that categorize incoming feedback into a taxonomy of 22 predefined categories. Categories such as translation, missed localization, or broken links help target improvements.
Crucially, feedback can be submitted in whatever language the user chooses, then machine-translated to English when necessary.
International team members review this feedback in a web-based tool, triage it, and, if necessary, update the machine-applied taxonomy category. These actions ensure that feedback is routed correctly for implementation and improve the machine-learning classification model.
Bug databases store all actionable, categorized customer feedback. From there, localization suppliers implement revised translations. Finally, reviewers on the international team confirm with customers that action was taken, closing the loop. Where customer feedback is product feature-related—not “actionable” by localization-focused personnel—it is channeled to developer backlogs for their review.
This system enables feedback handling at scale by drawing continuously from all available sources, storing that feedback centrally, then classifying and routing it automatically before closing the loop with customers.
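The classify-then-route step described above can be sketched as follows. This is a simplified illustration, not the platform's implementation: keyword rules stand in for the machine-learning models, only three of the 22 categories appear (the ones named in the text), and the queue names, the `Feedback` type, and the "product feature" label are hypothetical.

```python
from dataclasses import dataclass

# Three category names come from the text; the real taxonomy has 22.
# Keyword rules are a stand-in for the machine-learning classifier.
KEYWORD_RULES = {
    "translation": ("mistranslat", "wrong word", "bad translation"),
    "missed localization": ("still in english", "not translated"),
    "broken links": ("404", "broken link", "dead link"),
}

# Assumption: these categories are actionable by localization staff;
# everything else is treated as product-feature feedback for developers.
LOCALIZATION_ACTIONABLE = {"translation", "missed localization", "broken links"}

@dataclass
class Feedback:
    text_en: str          # verbatim, already machine-translated to English
    source_language: str  # language the customer wrote in

def classify(item: Feedback) -> str:
    """Return the first category whose keywords appear in the verbatim."""
    lowered = item.text_en.lower()
    for category, keywords in KEYWORD_RULES.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "product feature"

def route(category: str) -> str:
    """Pick a destination queue (queue names are hypothetical)."""
    if category in LOCALIZATION_ACTIONABLE:
        return "localization-bug-database"
    return "developer-backlog"

item = Feedback("The settings page is still in English.", "de")
category = classify(item)
print(category, "->", route(category))
```

In a setup like this, a reviewer's correction of the assigned category would be stored alongside the verbatim and reused as labeled training data, which is how human triage can improve the classifier over time.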
Docs.microsoft.com feedback loop
Great documentation aids in product discoverability, deployment, and usage. And yet, the mere delivery of technical documentation to international customers is not enough: We must have a global-ready platform that supports the unique usage scenarios of these customers. Our solution was the development of customer-focused features on docs.microsoft.com to complement the documentation on the site.
Features include toggle controls that allow customers to switch between English and localized content. Customers can also sample localized content while in a “hover-over” state with mouse-over English source content. Built-in fallback defaults to English for content not yet localized.
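The English fallback can be pictured as a lookup chain. The source does not describe how the site resolves locales, so the sketch below is one plausible scheme under stated assumptions: pages are keyed by lowercase locale tag, and the names `fallback_chain` and `resolve_content` are hypothetical.

```python
from typing import Dict, List

def fallback_chain(locale_tag: str, default: str = "en-us") -> List[str]:
    """Build the lookup order: exact locale, bare language, then the default."""
    tag = locale_tag.lower()
    chain = [tag]
    language = tag.split("-")[0]
    if language != tag:
        chain.append(language)
    if default not in chain:
        chain.append(default)
    return chain

def resolve_content(page: Dict[str, str], locale_tag: str) -> str:
    """Return the first available version of a page along the fallback chain."""
    for candidate in fallback_chain(locale_tag):
        if candidate in page:
            return page[candidate]
    raise KeyError("no English source available for this page")

# Hypothetical page store keyed by lowercase locale tag.
page = {
    "en-us": "Create a resource group.",
    "fr-fr": "Créez un groupe de ressources.",
}
print(resolve_content(page, "fr-FR"))  # localized version exists
print(resolve_content(page, "ja-JP"))  # not yet localized, falls back to English
```

Keeping the fallback order in one function makes it easy to log which requests fell through to English, a useful signal when deciding which content to localize next.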
Customers can also provide feedback on published technical documentation, rating how helpful (or not) content is, or leaving descriptive comments. An “edit” button embedded in the site lets customers open pull requests (PRs) on GitHub to suggest changes. Customers use PRs to report unclear sentences, poor-quality machine translation, and similar issues.
GitHub itself includes another built-in feedback and tracking mechanism: GitHub issues. Customers open issues to give feedback or make suggestions about technical documentation. The feature is for feedback that isn’t document- or topic-specific: site features, overall translation quality, or requests to localize currently English content.
Our localization supplier reviews the input and accepts or rejects PRs. Feedback applicable to all languages gets escalated to English-language writing teams, so the documentation can be updated and re-localized.
Finally, internal teams at Microsoft, including employees at international subsidiaries, provide feedback via an internal ticketing system. This feedback is also processed by our localization supplier, assessed for priority and validity, and then implemented and automatically republished.
Supplement localization models with machine translation
Machine translation is a viable tool for expanding localized customer experiences and can be done either via localization suppliers or in house where such expertise and infrastructure are available.
When exploring MT solutions, organizations should investigate and leverage the latest capabilities of the technology, utilize localization supplier expertise strategically, and consider right-sizing for cost while maintaining the translation quality customers expect.
The initial MT effort for our team happened in two phases. Our first exploratory phase used the “technology domain” engine of Microsoft Translator, part of Azure Cognitive Services. Next we used Microsoft Translator’s Custom Translator feature in concert with data from our team’s current software projects to train an engine to our portfolio.
The effort paid off. Comparing Bilingual Evaluation Understudy (BLEU) and Translation Error Rate (TER) scores from both methods, the latter setup (Custom Translator with added training) yielded a 49 percent improvement in BLEU. TER scores across languages improved by an average of 23 percent.
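For readers unfamiliar with BLEU, the sketch below shows what the metric measures: clipped n-gram overlap between MT output and a reference translation, combined with a penalty for output that is too short. It is a simplified, unsmoothed, single-sentence version with whitespace tokenization. Reported BLEU scores are normally computed over a whole test corpus with an established tool, and the sentences here are invented examples, not data from our evaluation.

```python
import math
from collections import Counter
from typing import List

def ngrams(tokens: List[str], n: int) -> Counter:
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(candidate: str, reference: str, max_n: int = 4) -> float:
    """Unsmoothed BLEU for one candidate against one reference (0.0 to 1.0)."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU 0
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty: penalize candidates shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(log_precision_sum / max_n)

reference = "select the resource group in the portal"
baseline_output = "choose the resource group in portal"
custom_output = "select the resource group in the portal"
print("baseline:", round(sentence_bleu(baseline_output, reference), 3))
print("custom:  ", round(sentence_bleu(custom_output, reference), 3))
```

An output identical to the reference scores 1.0, and the baseline output, which swaps one word and drops another, scores lower. Because BLEU rewards only surface overlap with a reference, it works best as a relative signal between engines, which is one reason a human review followed.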
A single-blind study with human reviewers was the final test. Bilingual reviewers evaluated samples of original English text alongside either MT or human translation (HT). Participants were unaware of whether the translation they read was the result of MT or HT. Human reviewers submitted accuracy and fluency scores. Results indicated that for some languages, human reviewers found MT and HT quality to be comparable.
Results identified further languages that showed promising MT quality, leading us to enable MT in production on more than 40 of them. Doing so yielded 18 percent savings in translation costs.
However, while overall quality improved, a new issue arose. MT over-localized (translated) product and service names, where Microsoft’s standards indicate that they should remain in English. We solved this by connecting a dictionary of product, service, and business terms from our portfolio to the engine. This increased accuracy significantly for some languages. Where we didn’t find improvement, human post-editing of MT made the difference.
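Our fix connected a term dictionary to the engine itself, and that configuration is not shown here. For teams whose engine lacks such a feature, a different, engine-agnostic technique is to mask protected terms with placeholders before translation and restore them afterward. The sketch below illustrates that alternative. The term list, the function names, and `toy_translate` (a word-for-word stand-in, not Microsoft Translator) are all assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Example do-not-translate list; the product names appear in this article.
PROTECTED_TERMS = ["Microsoft Translator", "Custom Translator", "VS Code"]

def mask_terms(text: str, terms: List[str]) -> Tuple[str, Dict[str, str]]:
    """Replace each protected term with an opaque placeholder token."""
    mapping: Dict[str, str] = {}
    # Longest terms first so a longer name is masked before any name it contains.
    for index, term in enumerate(sorted(terms, key=len, reverse=True)):
        placeholder = f"__TERM{index}__"
        if term in text:
            text = text.replace(term, placeholder)
            mapping[placeholder] = term
    return text, mapping

def unmask_terms(text: str, mapping: Dict[str, str]) -> str:
    """Put the original English terms back after translation."""
    for placeholder, term in mapping.items():
        text = text.replace(placeholder, term)
    return text

def translate_protected(text: str, translate: Callable[[str], str]) -> str:
    """Translate text while keeping protected terms in English."""
    masked, mapping = mask_terms(text, PROTECTED_TERMS)
    return unmask_terms(translate(masked), mapping)

def toy_translate(text: str) -> str:
    """Toy word-for-word English-to-French stand-in for an MT engine."""
    glossary = {"Open": "Ouvrez", "Custom": "Personnalisé",
                "Translator": "Traducteur", "now": "maintenant"}
    return " ".join(glossary.get(word, word) for word in text.split())

print(toy_translate("Open Custom Translator now"))                       # over-localized
print(translate_protected("Open Custom Translator now", toy_translate))  # name preserved
```

A real engine may reorder or alter placeholder tokens, so this approach needs a check that every placeholder survived translation. That fragility is one reason an engine-level dictionary is preferable where available.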
We also experimented with neural MT (NMT), which utilizes artificial neural networks to help improve MT output accuracy. NMT employs different machine learning algorithms that work together to predict sequences of words, including the modeling of complete sentences. Our team experimented with customizing the Microsoft Research neural engine using technical content from our portfolio, and we saw BLEU scores improve.
In a complementary effort, we tested how a dictionary of terms from our portfolio, drawn from an internal term database for localization suppliers (“Term Studio”), affected quality. Results were promising for some languages, notably reducing over-localization of product and service names.
Our MT human evaluation framework and methodology is still evolving, and we are learning about it alongside the rest of the industry. We are currently comparing the raw MT quality (adequacy and fluency) of two different MT engines translating technical documentation, which will inform an eventual decision on overall MT engine strategy.
Leverage technical communities
Community translation
Community translation allows customers to contribute directly to localization of software and technical documentation from wherever they may find themselves online. Creating localized versions of open-source software (OSS) and fine-tuning technical content are two key ways communities can optimize experiences.
The goal of our effort is to nurture an ecosystem where international users feel part of a larger developer community, get recognized for contributions, and build reputation. Community contributions play a role in how we launch some Microsoft products and improve documentation. International users contribute as follows:
Software projects: Select Microsoft OSS technologies such as VS Code, SQL Tools for Linux, and Batch Explorer are open to community localization input, enabling customers to contribute in their native language. Community members provide feedback, suggest improvements to existing translations, vote on or rate other contributors’ suggestions, and help localize a product from scratch with MT support.
Technical documentation: International users edit the technical documentation they use on docs.microsoft.com, submitting native language suggestions via GitHub, ranging from the quality of translation to the accuracy of English source content. Moderators review suggestions, then approve and republish valid suggestions.
More than a year into this program, we’ve seen top contributors become influencers and help build a self-sustaining community.
Develop a pipeline for scale
Software: User interface
The heart of our infrastructure for software localization is a build-scanning tool, used internally at Microsoft and with our partners. This tool connects developers who work on Cloud + AI technologies at Microsoft—over 10,000 developers and growing—to a unified localization pipeline.
Built on Azure, the capabilities scale to accommodate scores of repositories. Code scans execute agnostically, identifying to-be-translated resources in proprietary infrastructure and OSS repositories alike.
Upon check-in of new UI text (strings) to a repository, a build kicks off. Efficiency is core. Before strings go to a translator, automations compare new strings against other products to determine if translation occurred before and whether the string can be digested, stemmed, and auto-translated in whole or in part.
From there, the string surfaces to a network of over 1,000 translators representing over 100 languages, managed by our localization supplier.
On return, the tool checks translations for sensitive language or geopolitical sensitivities, then checks files back into home repositories and integrates them into the next available build.
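The reuse check that runs before strings reach translators can be sketched as a small translation-memory lookup. This is an assumption-laden illustration, not the internal tool: the class and method names are hypothetical, "digesting" is modeled as hashing a whitespace-normalized string, partial reuse is approximated with `difflib` similarity from the Python standard library, and stemming is left out.

```python
import hashlib
from difflib import SequenceMatcher
from typing import Dict, Optional, Tuple

def digest(source: str) -> str:
    """Hash a whitespace-normalized string for exact-match lookup."""
    normalized = " ".join(source.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

class TranslationMemory:
    """Stores earlier translations for one target language."""

    def __init__(self) -> None:
        # digest -> (source string, translation)
        self._entries: Dict[str, Tuple[str, str]] = {}

    def add(self, source: str, translation: str) -> None:
        self._entries[digest(source)] = (source, translation)

    def lookup(self, source: str, fuzzy_threshold: float = 0.85) -> Tuple[str, Optional[str]]:
        """Return ("exact" | "fuzzy" | "new", translation or None)."""
        hit = self._entries.get(digest(source))
        if hit:
            return "exact", hit[1]
        best_score, best_translation = 0.0, None
        for known_source, translation in self._entries.values():
            score = SequenceMatcher(None, source, known_source).ratio()
            if score > best_score:
                best_score, best_translation = score, translation
        if best_score >= fuzzy_threshold:
            return "fuzzy", best_translation  # a translator should review this
        return "new", None

tm = TranslationMemory()
tm.add("Save changes", "Änderungen speichern")
for text in ("Save  changes", "Save change", "Delete resource group"):
    print(repr(text), tm.lookup(text))
```

Exact hits can be reused automatically, fuzzy hits give the translator a starting point, and only genuinely new strings need full translation. The linear scan over entries is fine for a sketch but would need an index at the scale described in this article.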
Technical documentation
Like the approach with software, our team opted for a long-term partnership with a localization supplier for technical documentation. We then created a pipeline to deliver to-be-localized files directly from source GitHub repositories to localization supplier infrastructure.
English-language content authors create technical documentation in Markdown format, storing files on GitHub, then publish to docs.microsoft.com. Localization automation scans continuously for new or updated files and submits them into the localization pipeline for translation. Based on preconfigured settings for cadence, the supplier receives files, along with metadata specifying desired languages and translation quality—whether translation memory (TM) with MT or HT.
The localization supplier processes files through a TM + MT workflow, wherein similar or identical translations from the past are leveraged and matched. From there, translations may proceed to human translators who edit the text to premium quality where indicated.
International team members analyze data to better apply the right translation method depending on scenario: Which content is most used and essential to customers? Which content needs improvement? Depending on the case, we use TM + MT for content of emerging or unknown interest level—or where speed may be a necessity. We use TM + MT + HT for content critically important to customers.
Translated files then return to the localization pipeline, automatically submitted to GitHub repositories for each target (non-English) language and triggering a publishing event when a change is detected. Localized content publishes live to sites within a few hours after the supplier finalizes the translation.
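The choice between TM + MT and TM + MT + HT described above amounts to a small decision rule. The sketch below is only an illustration: the field names, the page-view threshold, and the rule that urgency takes precedence over importance are assumptions, not our team's actual criteria.

```python
from dataclasses import dataclass

@dataclass
class ContentProfile:
    monthly_views: int       # usage signal for the English source
    business_critical: bool  # flagged as essential to customers
    urgent: bool             # speed matters more than polish right now

def choose_workflow(profile: ContentProfile, view_threshold: int = 10_000) -> str:
    """Return "TM+MT" or "TM+MT+HT" for a piece of content."""
    if profile.urgent:
        # Assumption: when speed is a necessity, publish machine output first.
        return "TM+MT"
    if profile.business_critical or profile.monthly_views >= view_threshold:
        # Heavily used or critical content gets human editing.
        return "TM+MT+HT"
    # Emerging or unknown interest level.
    return "TM+MT"

print(choose_workflow(ContentProfile(50_000, False, False)))
print(choose_workflow(ContentProfile(120, False, False)))
```

Expressing the rule as data plus a single function makes it straightforward to revisit thresholds as usage analytics and customer feedback accumulate.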
Final words
Great software localization programs result from intentional and strategic decisions in technical infrastructure, partnerships, customer engagement models, and more. What you’ve read here is the view from one Microsoft team: our journey, our related learnings, our story arc to date. Mileage will vary: No single approach or specific collection of approaches will suit all organizations. Available technical and process options evolve constantly.
Step one is to put yourself in the shoes of your customers and understand the options available. If you aim to be global for the long haul, make an organizational commitment to continuous learning about localization. Make a realistic assessment of the current state of your business as well as your aspirations in the near term and beyond. Align your technology, partner, and customer engagement strategies with headroom for growth. Course correct as needed. From there, you are on your way to delighting all your customers.