What are the most important factors to consider when building a global data science team? Ya Xu, Head of Data Science at LinkedIn, recently shared at TWIMLcon how she went from being an individual contributor to becoming a data science leader with over 350 global team members.

During our conversation, Ya encouraged us to remember that whether we work on the platforms team, on the application side, or as business users, we are all on the same team and should be focused on common goals. As Ya shared her insights into how to build and develop a global data science team, three points stood out to me:

  1. Everyone is accountable for solving real-world problems for the business
  2. Organizations and platforms must be built with scalability in mind
  3. Champions and subject matter experts should drive the roll-out of new platform capabilities and the identification of opportunities for continuous improvement


Everyone is accountable for solving real-world problems

To mitigate the risk of failure and successfully deploy a machine learning platform, the platforms, applications, and business teams must feel accountable for solving a real-world business problem from day one of the build phase. This can be achieved by developing a culture where everyone owns the problem and strives to be part of the solution. All sides should want to contribute to the solution rather than wait for others to solve it for them (or, even worse, blame others for not solving it for them).

This type of culture encourages technical folks to provide actionable insights to product and business teams. It also instills a mindset that strives to optimize infrastructure design, resource utilization, and horizontal capabilities like experimentation methodology and differential privacy. When everyone is accountable for the same goal, the teams involved will seek each other out when developing new solutions rather than building bespoke solutions within their own silos. This collaborative approach focuses the organization on the most important business problems to be solved.


Organizations and platforms must be built with scalability in mind

When developing your data team’s organizational structure and platform architecture, always think about how to build in a scalable way. While the optimal organizational structure will vary according to your company’s context, the trick is to strike a balance between the leverage you gain from a centralized team and the actionable results you’ll get from having data science team members embedded in your company’s business units. While scalability requires a global approach to model deployment methodologies and platforms, the organization must also mitigate the risk of becoming disconnected from the business problems. This can be achieved by always keeping one foot in the business area.

From a platform perspective, scalability requires you to think about how you design your platform to collect input from its users. Though some methodologies and tools will be standardized, you should develop a platform that allows others to build solutions that scale on your platform. It is also critical to develop a culture around your platform that encourages your platform’s users to request new features that can be built at scale rather than develop one-off solutions that do not benefit the company as a whole.

Champions and SMEs drive improvement

During the build phase, it is critical to build support among champions and SMEs by including them in the design process and initial testing. As your build shifts into the adoption phase, those early champions will be critical in getting their teams excited about the new features. As you roll out the platform and continue to work with those early adopters to address their needs, other teams at the company will start to take notice and begin to adopt the platform.

When your platform enters the mature phase, your champions and SMEs will continue to play a critical role in onboarding and coaching new users on the platform, and they will also be critical in communicating new feature requests. Their deep understanding of the platform will inform those requests and help ensure that each new feature delivers value to the business.

If you enjoyed these insights from my conversation with Ya Xu at TWIMLcon, we encourage you to check out TWIMLcon On Demand. Last month, we gathered over 500 machine learning and artificial intelligence practitioners and leaders to explore the real-world challenges of developing, operationalizing, and scaling ML & AI in the enterprise. The conference featured 50+ world-class presenters and panelists from teams leading the application of AI and Machine Learning at companies like Netflix, Shopify, LinkedIn, Spotify, Google, Walmart, iRobot, Adobe, Intuit, Yelp, Salesforce, Prosus Group, Palo Alto Networks, Microsoft, Qualcomm, and more. TWIMLcon’s 20+ hours of presentations, workshops, and discussions will provide you with a practical blueprint for delivering machine learning efficiently and at scale. To explore this great content and learn more about building smarter, innovating faster, and avoiding costly mistakes across end-to-end ML model production, visit twimlcon.com/ondemand.