Ashish Patel

206 Followers

Published in

Codebrace

·Pinned

Data Validation — What, Why, and How?

What? Data validation is a method for checking the accuracy and quality of your data. It ensures that your data is complete (no blank or null values), unique (no duplicated values), within the range of values you expect, and much more depending on your field…

Data Validation

5 min read
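
The three checks named in the teaser above (completeness, uniqueness, range) can be sketched in plain Python. This is only an illustration and not the article's own code: the records, the field names, and the 0 to 120 age range are all made-up assumptions.

```python
# Hypothetical sample records; field names and values are assumptions.
records = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": None, "age": 29},
    {"id": 2, "email": "c@example.com", "age": 150},
]


def validate(rows, key="id", required=("email",), ranges=None):
    """Return a list of (row_index, problem) tuples for the given rows."""
    ranges = ranges or {}
    problems = []
    seen = set()
    for i, row in enumerate(rows):
        # Completeness: required fields must not be blank or null.
        for field in required:
            if row.get(field) in (None, ""):
                problems.append((i, f"missing {field}"))
        # Uniqueness: the key must not repeat across rows.
        if row[key] in seen:
            problems.append((i, f"duplicate {key}"))
        seen.add(row[key])
        # Range: numeric fields must fall inside the expected bounds.
        for field, (low, high) in ranges.items():
            value = row.get(field)
            if value is not None and not low <= value <= high:
                problems.append((i, f"{field} out of range"))
    return problems


problems = validate(records, ranges={"age": (0, 120)})
for index, problem in problems:
    print(f"row {index}: {problem}")
```

Returning a list of problems, rather than raising on the first failure, makes it possible to report every issue in one pass.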


Published in

Codebrace

·Feb 26

Bitmasking and C++ bitset

What is Bitmasking? At the smallest scale in computers, data is stored as bits. A bit stores either 0 or 1. A binary number is a number expressed in the base-2 system, so each digit can be either 0 or 1. You can check the binary representation of a number in Python: >>> bin(21)…

Cpp

4 min read
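
As a small illustration of the idea in the teaser above, here is a Python sketch that inspects the bits of 21 and applies a few common mask operations. The helper names (`is_bit_set`, `set_bit`, `clear_bit`) are hypothetical and chosen for this example only; they are not taken from the article.

```python
n = 21
binary = bin(n)  # binary representation as a string with a '0b' prefix


def is_bit_set(value, position):
    """True if the bit at `position` (0 = least significant) is 1."""
    return bool((value >> position) & 1)


def set_bit(value, position):
    """Return `value` with the bit at `position` turned on."""
    return value | (1 << position)


def clear_bit(value, position):
    """Return `value` with the bit at `position` turned off."""
    return value & ~(1 << position)


print(binary)
print([pos for pos in range(5) if is_bit_set(n, pos)])
print(set_bit(n, 1), clear_bit(n, 0))
```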


Published in

Codebrace

·Apr 8, 2021

Object Versioning for Google Cloud Storage!

Use case: Suppose we have a lot of data in our Cloud Storage bucket and someone mistakenly runs gsutil rm gs://my_bucket/*. We will lose all our data and may not be able to recover it easily, or at all. How does Object Versioning help? By design, every storage object (file) in…

Google Cloud Platform

4 min read


Published in

Codebrace

·Mar 18, 2021

Understanding Docker Networks and resolving conflict with Docker Subnet IP Range!

By default, Docker automatically creates three networks: Bridge, Host, and None. Bridge Network: the private internal network created by default. Every container is attached to it by default and gets an IP in the range 172.17.*.*. Containers can also access each other using this IP if required. For…

Docker

3 min read
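
One way to spot the kind of subnet conflict this article's title refers to is to test whether another network overlaps the bridge range. The sketch below uses Python's standard `ipaddress` module and assumes the 172.17.*.* range from the teaser corresponds to the CIDR block 172.17.0.0/16. The function name and the sample networks are hypothetical.

```python
import ipaddress

# Assumption: the 172.17.*.* range mentioned above, written as a /16 block.
DOCKER_BRIDGE = ipaddress.ip_network("172.17.0.0/16")


def conflicts_with_bridge(cidr):
    """True if the given network shares any address with the bridge range."""
    return ipaddress.ip_network(cidr).overlaps(DOCKER_BRIDGE)


# Hypothetical networks to check against the bridge range.
for candidate in ("172.17.5.0/24", "172.16.0.0/12", "10.0.0.0/8"):
    print(candidate, conflicts_with_bridge(candidate))
```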


Published in

Codebrace

·Mar 11, 2021

Working on On-prem/External Airflow with Google Cloud Platform (GCP)

If you want to work with Airflow and are just starting your installation, Google Cloud Composer is the best solution, as it creates all the required services, manages the Kubernetes cluster via GKE, and everything connects like magic. But if you already have an on-prem Airflow or Airflow…

Airflow

4 min read


Published in

Codebrace

·Jan 10, 2021

Working with JSON (JSONL) & multiline JSON in Apache Spark

A few days back I was trying to work with multiline JSON (aka JSON) on Spark 2.1, and I faced a very peculiar issue when working with single-line JSON (aka JSONL or JSON Lines) vs multiline JSON files. JSON Lines vs JSON: Consider an example. Our JSON looks like below; here we can see…

Spark

3 min read
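
The difference between the two layouts mentioned in the teaser above can be shown without Spark. This is a plain-Python sketch using only the standard `json` module; the sample documents and function names are made up for illustration and do not reproduce the article's example.

```python
import json

# JSON Lines: one complete JSON object per line (hypothetical sample data).
json_lines_text = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'

# Multiline JSON: a single document spread over several lines.
multiline_text = """[
  {"id": 1, "name": "a"},
  {"id": 2, "name": "b"}
]"""


def parse_json_lines(text):
    """Parse each non-empty line as its own JSON document."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def parse_multiline(text):
    """Parse the whole text as one JSON document."""
    return json.loads(text)


records_a = parse_json_lines(json_lines_text)
records_b = parse_multiline(multiline_text)

# A line-by-line reader cannot handle the multiline layout, because no
# single line of it is a complete JSON document.
try:
    parse_json_lines(multiline_text)
    line_by_line_ok = True
except json.JSONDecodeError:
    line_by_line_ok = False

print(records_a == records_b, line_by_line_ok)
```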


Published in

Codebrace

·Apr 20, 2020

LeetCode Day #18 - Minimum Path Sum

Problem: https://leetcode.com/explore/challenge/card/30-day-leetcoding-challenge/530/week-3/3303/ Intuition: at every index (i,j) we have two choices, either go right or go down. The problem clearly has optimal substructure, so first try to implement it using recursion, then convert it to top-down dynamic programming. findSum(i,j) = grid[i][j] + min(findSum(i,j+1), findSum(i+1,j)); Solution

Leetcode

1 min read
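
The recurrence in the teaser above can be turned into a memoized (top-down) solution along these lines. This is a sketch in Python and not the article's own solution: it assumes a non-empty rectangular grid, adds boundary handling that the one-line recurrence leaves implicit, and uses `functools.lru_cache` for memoization.

```python
from functools import lru_cache


def min_path_sum(grid):
    """Minimum sum of a path from top-left to bottom-right, moving right or down."""
    rows, cols = len(grid), len(grid[0])

    @lru_cache(maxsize=None)
    def find_sum(i, j):
        # Base case: the bottom-right cell ends the path.
        if i == rows - 1 and j == cols - 1:
            return grid[i][j]
        best = float("inf")
        if j + 1 < cols:  # go right, if there is a column to the right
            best = min(best, find_sum(i, j + 1))
        if i + 1 < rows:  # go down, if there is a row below
            best = min(best, find_sum(i + 1, j))
        return grid[i][j] + best

    return find_sum(0, 0)


print(min_path_sum([[1, 3, 1], [1, 5, 1], [4, 2, 1]]))
```

Memoization means each cell is solved once, so the work grows with the number of cells rather than with the number of paths.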


Published in

Codebrace

·Nov 26, 2019

How Apache Spark runs our Application?

In order to understand how your application runs on a cluster, an important thing to know about Dataset/Dataframe transformations is that they fall into two types, narrow and wide, which we will discuss first, before explaining the execution model. Dataframe is nothing but a Dataset[Row], so going forward we will…

Big Data

5 min read


Published in

Codebrace

·Jul 23, 2019

Vim Tutorial in 59 Minutes, Part — 1 Basics

Vim is a hell of an editor: it has a very steep learning curve, but it is very efficient once you get the hang of it. Starting Editing in Vim. Editing from the command line: $ vi filename. To open a file from inside Vim, first enter Vim using the vi command, then use the :e command: $ vi, followed by :e filename. Saving file: run following…

Vim

2 min read


Published in

Codebrace

·Oct 26, 2018

Adding Class Path in git-bash

If you have been using git-bash for command-line operations and couldn’t find some class paths, this blog might help you add all class paths permanently so they can be used from git-bash. Problem Statement: You use git-bash a lot and want to access every application from there. (because …

Git

2 min read

Ashish Patel

Big Data Engineer at Skyscanner, loves competitive programming and Big Data.

Following
  • Skyscanner Engineering
  • Maxime Beauchemin
  • Jarek Potiuk
  • Souvik Biswas
  • Jyoti Dhiman

See all (60)
