#174 Building An MLOps Ecosystem with Ivan Liu, Director of Engineering at Rokt

Machine learning helps individuals and businesses deploy solutions that unlock previously untapped sources of revenue, save time, and reduce cost by creating more efficient workflows, leveraging data analytics for decision-making, and improving customer experience.

These goals are hard to accomplish without a solid framework to follow.

Getting machine learning models to production is notoriously difficult: it involves multiple teams (data scientists, data and machine learning engineers, operations, …); the model can be trained in one environment but then productionalised in a completely different environment; it is not just about the code, but also about the data (features) and the model itself.

When it comes to MLOps, no one tool can do it all. It takes an array of best-of-breed tools to truly automate the entire ML lifecycle.

This week, we are joined by Ivan Liu, Director of Engineering at Rokt. He specialises in software engineering, machine learning and cloud architecture.

Tune in for the full conversation, where Ivan talks about MLOps and ML engineering.

Enjoy the show!

Thanks to our sponsor Talent Insights Group!

Quotes:

It's also a learning journey for me to start from small things to more and more big and bigger things to me. And eventually, I found what motivates me is to actually land those disruptive and ARV solutions in the real world and to see how they can make a difference to the business or even to the life of the people. This while also co-founding a few startups during my spare time to do the AI-based solutions to change the existing behaviour of providing more value to the people.
So it's not really like a waste of analysis, it can be continuously used to help finish, to make the decision on a day to day basis.
When we talk about MLOps, it's not really, let's say, about toolings. It's all about the process, we want a best-practice process we want to implement into organizations when we talk about introducing a new process.
A lot of ML engineers or data scientists, they are coming from a research background, they are super good at modelling. And they can optimize the model towards a very high accuracy.
If your business is facing all those problems, and you don't really have enough people to build an in-house solution to solve those problems, then a feature store can be adding a lot of value.
There'll definitely come a time to bite the bullet and make it something that can be consistent, reusable, and bring in some of the software engineering practices of DevOps into creating this data set for machine learning consumption.
We don't really want to test every model in the production. That's the goal here.
To get the solid results locally run before going live that would be ideal. And so they can fail faster, they can stop faster, earlier and try new models faster.
If your teams have those skill sets together, and the team can learn from each other and grow from each other, I think that's enough to build an MLOps team.
There's a culture unlock, yes, you build it, you own it, if something is happening in the live system, you need to get up and respond to it.
Just to keep a close eye on the success story and be prepared to react to all those challenges you might be facing and are unique to.
Those unstructured data that cannot be handled easily by humans and which can be processed by AI provide innovative solutions, that's actually very much excite me.

What we discussed:

05:55 What is your aim for the performance of the models? Do you have a target that you expect the models to perform within a set time? Or how fast do the models need to be when serving predictions to the customers?
07:40 What do you see as the difference between data science and MLOps, or data science and ML engineering, and how are they related?
12:12 How valuable is it that it’s changing over time and being able to retrain the models in ways that are tracked?
12:54 What are some differences when productionising, getting the models to production and getting them to run at scale? What is your number one objective? What do you see that needs to change from the data science workflow environment that didn't have that push to production? What changes in terms of mindset or approach between building a model almost in isolation and pushing to get it into production?
20:48 How are you guys able to update the feature stores with incoming data at the speed that you need it to be done?
24:03 How are the models compared?
24:14 What does this testing look like and how do you tell that the new challenger is better or worse than the champion?
27:21 Is there some testing before the model gets any production traffic?
27:44 Is the creation of the models done by humans or machines?
29:05 For existing champion models, do you do any automated retraining?
30:51 We have the model versioning, the registry, the metadata, and also the deployment and monitoring. Is that all part of the model store or are they different?
32:03 What does your ML pipeline and orchestration tech look like?
34:13 What type of skill sets do you think ML engineers need?
36:55 What are your views on team structures in this space?
38:43 Is your team a product team? Does it have a cross-functional team from people across the organization? Or is it a data science and ML engineering team?
40:27 What are some pain points MLOps faces at the moment?
41:53 How do you recommend people to get started in the ML space?
46:57 What do you see as the evolution of the industry?
48:57 Do you have any advice for organizations that are looking to break the habit and jump into the newer space?
50:54 Have your passions or your motivations changed between the start of the journey and where you are now?
52:32 What applications of AI are you particularly excited about? Anything that you would like to either see come to reality or things that are happening that you're particularly excited about?
55:20 What do you think about companies' apprehension about investing in and trying MLOps?

UPCOMING EVENTS!

At Data Futurology, we are always working to bring you use cases, new approaches and everything related to the most relevant topics in data science to help you get the most value out of these technologies!

Join us at upcoming events https://www.datafuturology.com/events

Felipe Flores, 8 December 2021
responsible AI, AI, hackathon, AI ethics, data science, data for good, Data Science, Data Analytics, Machine Learning, Branding, Leadership, ai, Ethics & Privacy, Data Governance, Scaling AI, Season 4, Season 4.1