A Collection of Spark Resources
Published: 2019-06-24


http://databricks.com/spark-training-resources

Spark Workshops: https://www.sics.se/~amir/ (search the page for "spark")
http://databricks.gitbooks.io/databricks-spark-reference-applications/content/
https://github.com/apache/spark/tree/master/examples/src/main/python
A series of Spark articles by litaotao (Python + Spark)
http://litaotao.github.io/spark-dataframe-introduction
litaotao has also compiled a list of Spark learning resources
http://litaotao.github.io/spark-resouces-blogs-paper
The Way of Spark Mastery (Advanced): Spark from Beginner to Expert
http://blog.csdn.net/lovehuangjiaju/article/details/48580863
Spark programming guides (books)
http://endymecy.gitbooks.io/spark-programming-guide-zh-cn/content/deploying/running-spark-on-yarn.html
http://taoistwar.gitbooks.io/spark-operationand-maintenance-management/content/
Using WordCount as an example to explain the job scheduling mechanism in the Spark core
http://www.cnblogs.com/yoyaprogrammer/p/dive_into_wordcount_1.html
Spark (4): How Spark Works
http://blog.csdn.net/qq1010885678/article/details/45728173
====================
Deployment
====================
Installing and configuring Spark in cluster mode on a CDH5 cluster
http://blog.javachen.com/2014/07/01/spark-install-and-usage.html
Spinning up an Apache Spark Cluster: Step-by-Step
http://blog.insightdatalabs.com/spark-cluster-step-by-step/
Hadoop + Spark + HBase deployment and integration
http://blog.csdn.net/qq1010885678/article/details/46673079
Spark On Yarn & Spark as a Service & Spark On Tachyon
http://blog.csdn.net/qq1010885678/article/details/46242143
Installing Hadoop on Windows
http://www.cnblogs.com/kinglau/p/3270160.html
Introduction to Spark for .NET Developers
https://msdn.microsoft.com/en-us/magazine/mt595756.aspx
Spark deployment
https://docs.qingcloud.com/guide/spark.html
====================
Tuning
====================
Performance tuning of a real-world PySpark project
http://flykobe.com/index.php/2015/06/01/pyspark-spark-tuning/
Meituan-Dianping's Spark performance optimization guide
http://tech.meituan.com/spark-tuning-pro.html
====================
PySpark
====================
Learning the Spark Python API functions:
http://www.iteblog.com/archives/1395
A PySpark article: https://districtdatalabs.silvrback.com/getting-started-with-spark-in-python
Chinese translation: http://blog.jobbole.com/86232/
Sample code repository: https://github.com/DistrictDataLabs/spark-workshop/ (excellent!)
Using Jupyter on Apache Spark: Step-by-Step with a Terabyte of Reddit Data
http://blog.insightdatalabs.com/jupyter-on-apache-spark-step-by-step/
How To Write Spark Applications in Python
http://blog.appliedinformaticsinc.com/how-to-write-spark-applications-in-python/
Spark Streaming + Kafka, plus how to pass dependent Python packages and jars to spark-submit.
http://rustyrazorblade.com/2015/05/spark-streaming-with-python-and-kafka/
http://www.csdn.net/article/2014-01-28/2818282-Spark-Streaming-big-data
[Translation of the official Spark 1.3 docs] Submitting applications with spark-submit
http://blog.csdn.net/mycafe_/article/details/44923265
Getting started with Spark (Python and Scala versions)
http://my.oschina.net/leejun2005/blog/411605
Spark Programming Guide, Python edition
http://www.csdn.net/article/2015-04-24/2824552
How PySpark integrates with Spark
http://flykobe.com/index.php/2015/04/18/pyspark-and-spark/
In development mode, how to conveniently resolve jar dependencies, or add the jars directly to SPARK_CLASSPATH; see compute-classpath.sh
http://zhangyi.farbox.com/post/wen-ti-jie-jue/solve-spark-issue-of-all-masters-are-unresponsive
http://blog.csdn.net/qq1010885678/article/details/46052055
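
Several entries in this section deal with handing extra Python packages and jars to spark-submit. As a rough illustration only (not taken from any of the linked articles), the sketch below assembles such a command line in Python. The helper name build_spark_submit and all file names are hypothetical; the only Spark-specific parts are the `--master`, `--py-files` and `--jars` options, each of which takes a comma-separated list where a list applies.

```python
import shlex


def build_spark_submit(app, py_files=(), jars=(), master=None, app_args=()):
    """Return the argv list for a spark-submit call (hypothetical helper)."""
    cmd = ["spark-submit"]
    if master:
        cmd += ["--master", master]
    if py_files:
        # Extra .py/.zip/.egg files shipped with the job, comma-separated.
        cmd += ["--py-files", ",".join(py_files)]
    if jars:
        # Extra jars added to the driver and executor classpaths, comma-separated.
        cmd += ["--jars", ",".join(jars)]
    cmd.append(app)        # the application script comes after all options
    cmd.extend(app_args)   # anything after the script is passed to the application
    return cmd


if __name__ == "__main__":
    # All file names below are placeholders, not real artifacts.
    argv = build_spark_submit(
        "streaming_job.py",
        py_files=["deps.zip"],
        jars=["kafka-connector.jar"],
        master="local[2]",
    )
    # Print a shell-safe version of the command for inspection.
    print(" ".join(shlex.quote(a) for a in argv))
```

The returned list can be passed to subprocess.run unchanged, which avoids shell quoting problems with values such as local[2].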
====================
SQL
====================
Approachable and inclusive: an overview of Spark SQL 1.3.0
http://www.csdn.net/article/2015-04-03/2824407
Spark ETL Techniques (includes a comparison of the pros and cons of Python and Scala)
http://www.slideshare.net/DonDrake/presentations
Spark SQL in practice on Spark 1.3.1, part 01
http://blog.csdn.net/stark_summer/article/details/45825177
Testing Spark SQL 1.3
http://www.cnblogs.com/kxdblog/p/4488991.html
Spark SQL: Data Sources
http://www.cnblogs.com/BYRans/p/5005342.html
Several articles on Spark interacting with RDBMSs
http://www.sparkexpert.com/category/etl/
Integrating Spark 1.3.1 with Hive for query analysis
http://shiyanjun.cn/archives/1113.html
The cnblogs blog of 瞌睡中的葡萄虎 (luogankun), which contains many Spark SQL articles
http://www.cnblogs.com/luogankun/
Writing a Spark RDD to an RDBMS (MySQL), method 2
http://www.iteblog.com/archives/1290
How to read from MySQL with Spark
http://www.iteblog.com/archives/1275
Integrating Spark SQL with PostgreSQL
http://www.iteblog.com/archives/1369
Accessing PostgreSQL from Spark SQL
http://zhangyi.farbox.com/post/access-postgresql-based-on-spark-sql?utm_source=tuicool
====================
ML
====================
Several Machine Learning With Spark notebooks by Duan Shishi (段石石) of Yihaodian (一号店)
http://hacker.duanshishi.com/?p=1282
A few articles on machine learning with Spark
http://blog.selfup.cn/tag/spark
https://www.codementor.io/spark/tutorial/building-a-recommender-with-apache-spark-python-example-app-part1
End-to-end tutorial for a recommendation engine using PySpark
http://tech.marksblogg.com/recommendation-engine-spark-python.html
====================
Hadoop
====================
Hadoop study notes 20: website log analysis case study (1), project introduction
http://www.cnblogs.com/edisonchou/p/4449082.html
Hadoop study notes 2: fault-tolerant mass storage, an introduction to HDFS basics
http://www.cnblogs.com/edisonchou/p/3538524.html
Detailed Hadoop 2.0 configuration tutorial
http://www.cnblogs.com/scotoma/archive/2012/09/18/2689902.html
HDFS commands:
http://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/FileSystemShell.html
[Hadoop 2.6.0] Installation and running the examples
http://www.cnblogs.com/dplearning/p/4145209.html
