Imagine the following scenario. Alice needs to review a change submitted by Bob, a developer new to the project. It goes like this: Review 1: The formatting is inconsistent with the rest of the code. She explains the rules to Bob. Review 2: It’s better, but it’s still not right. She needs to point […]

This is the first article in the Big Data series.

Introduction

Sometimes, as data scientists, we need to analyze CSV files. In this article I want to compare the performance of different tools in a real-world use case. The tools described are R (using the data.table library), Python (using Pandas), and Spark 2.0 (using Spark SQL). We used Spark […]