Thursday, November 14 • 4:20pm - 4:50pm
Deploy an end-to-end ML pipeline using Apache Spark Streaming and Kubernetes

Distributed stream processing engines such as Apache Spark(TM) Structured Streaming make it possible to run machine learning in real time and at large scale. A typical end-to-end streaming machine learning pipeline consists of:

1. Preprocessing the data as the application requires, e.g. normalising or cleaning it.
2. Hosting the model as a microservice on Kubernetes, using IBM MAX (IBM Model Asset Exchange).
3. Scaling the entire pipeline using Apache Spark and Kubernetes.

This talk may include a live demo of the above technique, predicting objects in an image with an object detection model. Since this is a streaming application, the predictions are made in real time.

Key takeaways:

1. Learn how to reuse ML models from the IBM Model Asset Exchange.
2. Learn how to scale an online ML application end to end, using Apache Spark Structured Streaming and Kubernetes.

The code and data source used for the demo are available here: https://github.com/ScrapCodes/SS-on-kube
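The pipeline stages described above can be sketched in a few lines of plain Python. This is only an illustration, not the code from the talk or the linked repository. The function names (`clean`, `parse_predictions`, `score_partition`) are hypothetical, and the response layout (a `status` field plus a `predictions` list with `label` and `probability` entries) is an assumption about what an object detection microservice might return. The call to the model service is passed in as a parameter so the sketch runs without a cluster or a network.

```python
import json
from typing import Callable, Dict, Iterable, Iterator, List


def clean(images: Iterable[bytes]) -> Iterator[bytes]:
    """Preprocessing step: drop empty payloads before they reach the model."""
    for image in images:
        if image:
            yield image


def parse_predictions(body: str, threshold: float = 0.5) -> List[Dict]:
    """Parse a JSON response and keep predictions at or above the threshold.

    Assumed (not confirmed by the source) response shape:
    {"status": "ok", "predictions": [{"label": ..., "probability": ...}]}
    """
    payload = json.loads(body)
    if payload.get("status") != "ok":
        return []
    return [
        p for p in payload.get("predictions", [])
        if p["probability"] >= threshold
    ]


def score_partition(
    images: Iterable[bytes],
    call_model: Callable[[bytes], str],
    threshold: float = 0.5,
) -> Iterator[List[Dict]]:
    """Score one partition (or micro-batch) of images.

    call_model is a hypothetical hook that sends one image to the model
    microservice and returns the raw JSON response body as a string.
    """
    for image in clean(images):
        yield parse_predictions(call_model(image), threshold)
```

In a Spark job, a function shaped like `score_partition` could be applied to each partition of a micro-batch, so that adding executors or model service replicas on Kubernetes increases throughput. How the real demo wires this together is best checked in the linked repository.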

Prashant Sharma

System Software Engineer, IBM
Open source contributor, part of the CODAIT (Center for Open Source Data and AI Technologies) group at IBM. Apache Spark committer and PMC member.

Nick Pentreath

Principal Engineer, IBM
Nick Pentreath is a principal engineer in IBM's Center for Open-Source Data & AI Technologies (CODAIT), where he works on machine learning. Previously, he cofounded Graphflow, a machine learning startup focused on recommendations. He has also worked at Goldman Sachs, Cognitive Match...

Thursday November 14, 2019 4:20pm - 4:50pm PST