Data Lake: centralize in on-prem vs. decentralize on cloud

Speaker: Jeff Hung @ Trend Micro / SPN
Venue: Conference Room 1003 (10F)
Title: Data Lake: centralize in on-prem vs. decentralize on cloud


Trend Micro has been running big data workloads in an on-premises data center for many years. With Hadoop and its mature ecosystem, we have been able to build a centralized Data Lake that serves massive data processing loads while managing and encouraging new uses of data.

In recent years, we have been shifting our focus to AWS. Because of the decentralized nature of the cloud, the design and thinking behind building a Data Lake are different. We must identify what remains important whether on-prem or on the cloud, and what can be done differently to embrace the cloud model.

In this talk, we will elaborate on Trend Micro's considerations and best practices for building a Data Lake on-prem and on the cloud, and share our experience of managing petabyte-scale data over many years of evolution.


Manager of the SPN Infrastructure team at Trend Micro, which maintains its own Hadoop distribution and operates several Hadoop clusters with hundreds of machines in the data center and on the cloud. The team also provides other kinds of cloud infrastructure, such as a cross-datacenter message queue, a search engine service, and petabyte-scale metadata storage.
