We live in the age of the data revolution, where everything that surrounds us is related to a data source and everything in our lives is digitally captured. Few people in the world haven’t heard about Big Data. We read about it in...
Main Reasons Why Data Engineering Is Important for Companies Today
Spark Structured Streaming: customizing Kafka stream processing
According to Spark documentation:
Structured Streaming is a scalable and fault-tolerant stream processing engine built on the Spark SQL engine. … In short, Structured Streaming provides fast, scalable, fault-tolerant, end-to-end exactly-once stream...
Startup Reactive Systems
Emerging companies that build innovative products often call themselves startups. Creating an innovative product is a risky strategy. Almost all products in the current digital world are actually software products. At least...
Spark DataSkew Problem
Data Distribution in Big Data
The performance of Big Data systems is directly linked to how uniformly the data being processed is distributed across all of the workers. When you read rows from a database table for processing, those rows should be distributed uniformly among all the workers.
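To make the idea of uniform distribution concrete, here is a minimal sketch in plain Python (not Spark code). It assumes rows are routed to workers by hashing a key, which is one common scheme and not necessarily the one the article goes on to describe. The function names `partition_sizes` and `skew_ratio` and the sample keys are hypothetical and chosen only for illustration.

```python
import zlib
from collections import Counter


def partition_sizes(keys, num_workers):
    """Count how many rows each worker receives under hash partitioning."""
    sizes = Counter()
    for key in keys:
        # crc32 is used instead of hash() so the result is stable across runs.
        worker = zlib.crc32(key.encode("utf-8")) % num_workers
        sizes[worker] += 1
    return [sizes[w] for w in range(num_workers)]


def skew_ratio(sizes):
    """Largest partition divided by the mean size; 1.0 means perfectly even."""
    mean = sum(sizes) / len(sizes)
    return max(sizes) / mean


# 1000 distinct keys: rows should spread fairly evenly over 4 workers.
uniform_keys = [f"user-{i}" for i in range(1000)]

# One "hot" key owns 900 of 1000 rows, so a single worker gets most of the data.
skewed_keys = ["user-0"] * 900 + [f"user-{i}" for i in range(100)]

uniform_sizes = partition_sizes(uniform_keys, 4)
skewed_sizes = partition_sizes(skewed_keys, 4)
print(uniform_sizes, skew_ratio(uniform_sizes))
print(skewed_sizes, skew_ratio(skewed_sizes))
```

In the skewed case one worker holds at least 900 rows while the others share the rest, so the whole job waits on that single worker no matter how many workers are added.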
What is the main idea of monitoring?
Monitoring is a way to see the state of your system: for example, load average, active RAM, network load and many other useful metrics that you will certainly want to see if you have a complex, cohesive...
Networking in Berlin - Lightbend Partner Executive Summit 2018
As a Lightbend partner, DataEngi was invited to the “Lightbend Partners Summit” on 15 May 2018, which took place in Berlin right before the Scala Days conference. Our Scala experts Dmytro and Nazar had a chance to speak at the event with a presentation on how Lightbend products can be used in different business cases. So the three of us packed our bags and landed at Schönefeld Airport. Luckily, the trip lasted more than one day, so we had a chance to walk around the city and see its beautiful and historical places.
Abstracting From Future
The concept of a Future is an encapsulation of a computation in a form that is convenient for functional composition. A Future is an abstraction that eliminates direct work with threads. The focus of development moves from thread synchronization techniques (monitors, locks, etc.) to chaining data transformations, callback behaviors and the composability of asynchronous results. Engineers focus primarily on business logic rather than on the implementation details of how that logic is executed. It’s a shift from imperative to declarative development that hides the details of composing and synchronizing the asynchronous results of a computation.
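As a rough illustration of chaining a transformation onto an asynchronous result, here is a minimal sketch using Python's standard `concurrent.futures` module. The article itself may well use a different language or library. The helper `map_future`, the functions `fetch_price` and `total_price`, and the price data are all hypothetical and exist only for this example.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def map_future(source, transform):
    """Return a new Future that will hold transform(source.result())."""
    target = Future()

    def on_done(done):
        try:
            target.set_result(transform(done.result()))
        except Exception as exc:
            # Propagate a failure to the derived Future instead of losing it.
            target.set_exception(exc)

    # The callback runs when the source completes (or at once if already done).
    source.add_done_callback(on_done)
    return target


def fetch_price(item):
    # Stand-in for a slow remote call; prices are in cents (made-up data).
    return {"book": 1250, "pen": 150}[item]


def total_price(items):
    with ThreadPoolExecutor(max_workers=2) as pool:
        # submit() returns Futures; no threads or locks are handled directly.
        prices = [pool.submit(fetch_price, item) for item in items]
        # Chain a transformation (add 20% tax) onto each asynchronous result.
        with_tax = [map_future(p, lambda cents: cents * 120 // 100) for p in prices]
        return sum(f.result() for f in with_tax)


print(total_price(["book", "pen"]))
```

The calling code only describes what should happen to each result; when and on which thread the work runs is left to the executor.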
DataEngi public web content using GitLab Pages
So we have to build a company site with a blog. What should we care about?
- Common tools and learning curve for writers
- Content approval workflow
- Data consistency
- Site improvement and migration should be part of the architecture
Is it effective to use any CMS?