Data Science in the Enterprise

Speaker: Josh Yeh / Software Engineer @ Cloudera
Location: Conference Hall 1101
“Machine learning is all the rage. ML offers great opportunities for enterprises that already capture vast amounts of data, and Cloudera’s customers use our platform to solve ML problems every day.

However, the reality is that getting data from an enterprise data hub is no trivial task for a data scientist. Data access must be secured through Kerberos authentication, and the tools and libraries that data scientists use often conflict with each other, which creates a management problem. A data scientist could download data to their own laptop for modeling, but doing so creates data silos and limits the work to small datasets, in addition to causing data governance problems for cluster administrators. Data scientists also want to shorten the time it takes to put a model into production, which is really hard in today’s environment.

Cloudera developed Cloudera Data Science Workbench (CDSW) to make data science easier at enterprise scale. In this talk, I will review a few problems that today’s data scientists face and explain how CDSW makes their lives easier. Finally, I will give a demo.”


