8 May 2017

Advanced Data Analytics Laboratory - Big Data and business intelligence.

As part of the work of Bilbomática's Big Data laboratory, today we share some of the areas we are currently working on:

Infrastructure: working with clusters and Ambari server configurations, NiFi integration, and load balancing with Nginx.

SYSLOG: from Syslog to NiFi with persistence in HBase; data collection and delivery with MiNiFi and load balancing as an alternative to Nginx; messaging with Kafka; consuming topics from Flink; and persisting data to HBase from Flink.

HBASE: working with HBase tables from the shell and from Java applications.

HIVE: Hive development on a cluster, persisting data in Hive, and developing Java applications that work against Hive tables through REST requests.

And not forgetting Apache Zeppelin, on which we leave you an interesting post from the Dataminded blog, "Apache Zeppelin: Big data prototyping and visualization in no-time":

"Apache Zeppelin: Big data prototyping and visualization in no-time

Lately the name Zeppelin crossed our minds several times. Keeping in mind the daily release of a new big data tool and the mostly disappointing impression you get when diving into those tools, we silently ignored Zeppelin for the time being. After the ongoing encouragement of several colleagues we finally decided to take a look at Apache's latest flying machine: Can it make us fly?



What is Apache Zeppelin?



So what is Apache Zeppelin? Users of IPython notebooks are already familiar with the concept of an interactive web-based computational environment. Apache Zeppelin provides a web-based notebook that enables interactive data analytics. Zeppelin's main focus is data ingestion, discovery, analytics, visualization and collaboration. Though IPython notebooks can also be used to provide data analytics with Spark, they do not provide the out-of-the-box data optimizations that are built into Zeppelin.

.....

Conclusions

Apache Zeppelin certainly convinced us as a prototyping tool for (big) data analysis. Besides the on-the-fly available Spark and SqlContext and the ability to mix and match between Scala and Python, the querying features with automatic visualizations are a great pro for instant data exploration. There are still some minor bugs, but we believe Zeppelin could become a de facto standard for big data analysis in the near future. We are certainly curious about your feedback regarding this blog and the tool. Have fun with Zeppelin!"




"We offer Business Intelligence and Big Data services that align our clients' strategies with the technologies that implement them."