
400-850-8622

Nationwide enrollment hotline, 8:30-21:00

Cloudera Apache Spark Programmer (Hangzhou)

Provider: Hangzhou Boxue International Education and Training Center (杭州博学国际教育培训中心)


Course price: contact customer service

Class location: contact customer service

Start date: rolling enrollment



Last updated: 2024-11-22
Course format: public class, in-house training
Duration: 3 days / 18 hours
Training dates: to be determined
Certification exam: none
Training location: Boxue International Education and Training Center
Room requirements: projector, whiteboard, flip-chart paper
Teaching method: example-based lectures, live demonstrations and exercises, prompt discussion
Materials: training textbook

Course: Cloudera Developer Training for Apache Spark

Course overview:
Build complete, unified big data applications with Apache Spark, combining batch processing, stream processing, and interactive analytics. Learn to write sophisticated parallel applications that support faster, better decisions and real-time action across a wide range of use cases, architectures, and industries.

Audience:
Developers and engineers who want to improve the speed, ease of use, and sophistication of their applications. Participants should have a background in Python or Scala; basic Linux knowledge is a plus.

Training objectives:
• Using the Spark shell for interactive data analysis
• The features of Spark's Resilient Distributed Datasets
• How Spark runs on a cluster
• How Spark parallelizes task execution
• Writing Spark applications
• Processing streaming data with Spark

Course content:

Introduction to Spark
• What is Spark?
• Review: From Hadoop MapReduce to Spark
• Review: HDFS
• Review: YARN
• Spark Overview

Spark Basics
• Using the Spark Shell
• RDDs (Resilient Distributed Datasets)
• Functional Programming in Spark

Working with RDDs in Spark
• Creating RDDs
• Other General RDD Operations

Aggregating Data with Pair RDDs
• Key-Value Pair RDDs
• Map-Reduce
• Other Pair RDD Operations

Writing and Deploying Spark Applications
• Spark Applications vs. Spark Shell
• Creating the SparkContext
• Building a Spark Application (Scala and Java)
• Running a Spark Application
• The Spark Application Web UI
• Hands-On Exercise: Write and Run a Spark Application
• Configuring Spark Properties
• Logging

Parallel Processing
• Review: Spark on a Cluster
• RDD Partitions
• Partitioning of File-based RDDs
• HDFS and Data Locality
• Executing Parallel Operations
• Stages and Tasks

Spark RDD Persistence
• RDD Lineage
• RDD Persistence Overview
• Distributed Persistence

Basic Spark Streaming
• Spark Streaming Overview
• Example: Streaming Request Count
• DStreams
• Developing Spark Streaming Applications

Advanced Spark Streaming
• Multi-Batch Operations
• State Operations
• Sliding Window Operations
• Advanced Data Sources

Common Patterns in Spark Data Processing
• Common Spark Use Cases
• Iterative Algorithms in Spark
• Graph Processing and Analysis
• Machine Learning
• Example: k-means

Improving Spark Performance
• Shared Variables: Broadcast Variables
• Shared Variables: Accumulators
• Common Performance Issues
• Diagnosing Performance Problems

Spark SQL and DataFrames
• Spark SQL and the SQL Context
• Creating DataFrames
• Transforming and Querying DataFrames
• Saving DataFrames
• DataFrames and RDDs
• Comparing Spark SQL, Impala and Hive-on-Spark