Vedanth Kunchala
Data Integration Engineer
BigData:
Experience working with various Hadoop distributions such as Cloudera, Hortonworks, and MapR.
Expert in ingesting batch data for incremental loads from various RDBMS sources using Apache Sqoop.
Developed scalable applications for real-time ingestion into various databases using Apache Kafka.
Developed Pig Latin scripts and MapReduce jobs for large-scale data transformations and loads.
Experience using optimized data formats such as ORC, Parquet, and Avro.
Experience building optimized ETL data pipelines using Apache Hive and Spark.
Implemented various optimization techniques in Hive scripts for data crunching and transformations.
Experience building ETL scripts in Impala for faster access in the reporting layer.
Built Spark data pipelines with various optimization techniques using Python and Scala (see the sketch after this list).
Experience loading transactional and delta loads into NoSQL databases like HBase.
Developed various automation flows using Apache Oozie, Azkaban, and Airflow.
Experience working with NoSQL databases like HBase, Cassandra, and MongoDB.
Experience with integration tools like Talend and NiFi for ingesting batch and streaming data.
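The Hive/Spark pipelines above typically follow a read-transform-write pattern; below is a minimal PySpark sketch of such a batch ETL job writing date-partitioned Parquet. Table, column, and path names (staging.orders_raw, order_ts, /data/curated/orders) are illustrative placeholders, not taken from any actual engagement.

```python
# Minimal PySpark batch-ETL sketch: read a raw Hive table, clean and
# aggregate it, and write the result as date-partitioned Parquet.
# Table/column/path names are illustrative placeholders.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("orders-batch-etl")
    .enableHiveSupport()                              # read Hive metastore tables
    .config("spark.sql.shuffle.partitions", "200")    # tune shuffle parallelism
    .getOrCreate()
)

raw = spark.table("staging.orders_raw")

curated = (
    raw.dropDuplicates(["order_id"])                      # basic cleansing
       .withColumn("order_date", F.to_date("order_ts"))   # derive partition key
       .filter(F.col("amount").isNotNull())
)

(
    curated.repartition("order_date")          # group rows by partition value
           .write.mode("overwrite")
           .partitionBy("order_date")          # partitioned columnar layout
           .parquet("/data/curated/orders")    # ORC works the same way via .orc(...)
)
```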
Cloud:
Experience working with various cloud platforms such as AWS, Azure, and GCP.
Developed various ETL applications using Databricks Spark distributions and notebooks.
Implemented streaming applications to consume data from Event Hub and Pub/Sub (see the sketch after this list).
Developed various scalable big data applications in Azure HDInsight for ETL services.
Developed scalable applications using AWS tools like Redshift and DynamoDB.
Worked on building pipelines using Snowflake for extensive data aggregations.
Working knowledge of GCP tools like BigQuery, Pub/Sub, Cloud SQL, and Cloud Functions.
Experience visualizing reporting data using tools like Power BI and Google Analytics.
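As a rough illustration of the Event Hub consumption mentioned above, the sketch below reads an event hub through its Kafka-compatible endpoint with Spark Structured Streaming. The namespace, hub name, connection string, and output paths are placeholders, and the job assumes the spark-sql-kafka connector is on the classpath.

```python
# Sketch: consume an Azure Event Hub through its Kafka-compatible endpoint
# with Spark Structured Streaming. Namespace, event hub name, and the
# connection string are placeholders taken from the portal in practice.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("eventhub-stream").getOrCreate()

EH_NAMESPACE = "my-namespace"          # placeholder
EH_NAME = "telemetry"                  # placeholder (topic == event hub name)
EH_CONN_STR = "Endpoint=sb://...;..."  # placeholder connection string

jaas = (
    'org.apache.kafka.common.security.plain.PlainLoginModule required '
    'username="$ConnectionString" password="{}";'.format(EH_CONN_STR)
)

stream = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", f"{EH_NAMESPACE}.servicebus.windows.net:9093")
    .option("kafka.security.protocol", "SASL_SSL")
    .option("kafka.sasl.mechanism", "PLAIN")
    .option("kafka.sasl.jaas.config", jaas)
    .option("subscribe", EH_NAME)
    .option("startingOffsets", "latest")
    .load()
)

# Land the raw payloads as Parquet; downstream jobs parse and aggregate them.
query = (
    stream.selectExpr("CAST(value AS STRING) AS payload")
    .writeStream.format("parquet")
    .option("path", "/data/raw/telemetry")
    .option("checkpointLocation", "/checkpoints/telemetry")
    .start()
)
```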
DevOps:
Experience building continuous integration and deployment pipelines using Jenkins, Drone, and Travis CI.
Expert in building containerized apps using tools like Docker, Kubernetes, and Terraform.
Developed reusable application libraries using Docker containers.
Experience building metrics dashboards and alerts using Grafana and Kibana.
Expert in Java and Scala build tools such as Maven (POM) and SBT for application development.
Experience working with GitHub, GitLab, and SVN for code repositories.
Expert in writing various YAML scripts for automation purposes.
Technical Skills:
Big Data: Hadoop, Sqoop, Flume, Hive, Spark, Pig, Kafka, Talend, HBase, Impala
ETL Tools: Informatica, Talend, Microsoft SSIS, Confidential DataStage
Database: Oracle, MongoDB, SQL Server 2016, Teradata, Netezza, MS Access
Reporting: Microsoft Power BI, Tableau, QlikView, SSRS, Business Objects (Crystal)
Business Intelligence: MDM, Change Data Capture (CDC), Metadata, Data Cleansing, OLAP, OLTP,
SCD, SOA, REST, Web Services
Tools: Ambari, SQL Developer, TOAD, Erwin, Visio, Tortoise SVN
Operating Systems: Windows Server, UNIX/Linux (Red Hat, Solaris, AIX)
Web Technologies: J2EE, JMS, Web Services
Languages: UNIX shell scripting, Scala, SQL, PL/SQL, T-SQL
Scripting: HTML, JavaScript, CSS, XML, Shell script, Perl, and Ajax
Cloud: Azure, AWS, GCP
Version control: Git, SVN, CVS, GitLab
Tools: FileZilla, PuTTY, PL/SQL Developer, JUnit
IDE: Eclipse, Microsoft Visual Studio 2008/2012, Flex Builder
Experience:
Client: FedEx Services Jan 2020 – Present
Location: Memphis, TN
Sr. Data Engineer
Responsibilities
Developed code for importing data from RDBMS into HDFS and exporting it back using Sqoop.
Implemented partitions and buckets based on State to support bucket-based Hive joins in downstream processing (see the first sketch below).
Developed custom UDFs in Java and used them where necessary to simplify Hive queries.
Built an ETL framework in Spark with Python and Scala for data transformations.
Implemented various optimization techniques in Spark applications to improve performance.
Developed Spark Streaming jobs for real-time computations, processing JSON messages from Kafka.
Developed APIs for quick real-time lookups on top of HBase tables holding transactional data.
Built optimized dynamic-schema tables using Avro and columnar tables using Parquet.
Built various Oozie actions, workflows, and coordinators for automation purposes.
Developed various scripting functionality using Shell/Bash and Python for various operations.
Pushed application logs and data stream logs to a Grafana server for monitoring and alerting.
Developed Jenkins and Drone pipelines for continuous integration and deployment.
Worked on building various pipelines and integrations using NiFi for ingestion and exports.
Built custom endpoints and libraries in NiFi for ingesting data from traditional legacy systems.
Implemented integrations with cloud environments such as AWS, Azure, and GCP for external vendor file-exchange systems.
Implemented secure transfer routes for external clients using microservices to integrate with external storage locations such as AWS S3 and Google Cloud Storage (GCS) buckets.
Built SFTP integrations using various VMware solutions for external vendor onboarding.
Developed an automated file transfer mechanism in Python to move files from MFT/SFTP to HDFS (see the second sketch below).
Environment: Apache Hadoop 2.0, Cloudera, HDFS, MapReduce, Hive, Impala, HBase, Sqoop, Kafka, Spark,
Linux, MySQL, NiFi, Oozie, SFTP
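First sketch: a hedged PySpark example of the partition-and-bucket load described above, persisting a cleansed dataset as a Hive table partitioned by state and bucketed on the join key so downstream joins can avoid shuffles. Table and column names (staging.shipments_clean, customer_id) are assumptions for illustration.

```python
# Sketch: persist a cleansed dataset as a Hive table partitioned by state
# and bucketed on the join key, enabling bucket-based joins downstream.
# Table and column names are illustrative placeholders.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("bucketed-load")
    .enableHiveSupport()
    .getOrCreate()
)

shipments = spark.table("staging.shipments_clean")

(
    shipments.write.mode("overwrite")
    .partitionBy("state")                 # one directory per state value
    .bucketBy(32, "customer_id")          # hash-bucket the join key
    .sortBy("customer_id")                # sorted buckets help merge joins
    .format("orc")
    .saveAsTable("curated.shipments")     # bucketBy requires saveAsTable
)
```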
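Second sketch: a minimal Python version of the MFT/SFTP-to-HDFS transfer, assuming the paramiko package and the hdfs CLI are available on the edge node; hosts, credentials, and paths are placeholders.

```python
# Sketch of the automated SFTP-to-HDFS transfer: pull files from an SFTP/MFT
# drop zone with paramiko, then push them to HDFS with the hdfs CLI.
# Host, credentials, and paths below are placeholders.
import os
import subprocess
import paramiko

SFTP_HOST = "mft.example.com"       # placeholder host
SFTP_USER = "svc_transfer"          # placeholder service account
SFTP_KEY = "/home/svc/.ssh/id_rsa"  # placeholder key path
REMOTE_DIR = "/outbound/daily"
LOCAL_DIR = "/tmp/landing"
HDFS_DIR = "/data/landing/daily"

def transfer():
    os.makedirs(LOCAL_DIR, exist_ok=True)
    key = paramiko.RSAKey.from_private_key_file(SFTP_KEY)
    transport = paramiko.Transport((SFTP_HOST, 22))
    transport.connect(username=SFTP_USER, pkey=key)
    sftp = paramiko.SFTPClient.from_transport(transport)
    try:
        for name in sftp.listdir(REMOTE_DIR):
            local_path = os.path.join(LOCAL_DIR, name)
            sftp.get(f"{REMOTE_DIR}/{name}", local_path)          # download from SFTP
            subprocess.run(
                ["hdfs", "dfs", "-put", "-f", local_path, HDFS_DIR],
                check=True,                                        # fail loudly on errors
            )
            os.remove(local_path)                                  # clean up local copy
    finally:
        sftp.close()
        transport.close()

if __name__ == "__main__":
    transfer()
```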
Responsibilities
Developed Hive ETL logic for cleansing and transforming data coming from RDBMS sources.
Implemented complex data types in Hive and used multiple data formats such as ORC and Parquet.
Worked on different parts of data lake implementation and maintenance for ETL processing.
Developed Spark Streaming applications using Scala and Python for processing data from Kafka.
Implemented various optimization techniques in Spark Streaming applications written in Python.
Imported batch data using Sqoop to load data from MySQL to HDFS at regular intervals.
Extracted data from various APIs and performed data cleansing and processing using Java and Scala.
Converted Hive queries into Spark SQL integrated into the Spark environment for optimized runs.
Developed migration data pipelines from the on-prem HDFS cluster to Azure HDInsight.
Developed complex queries and ETL processes in Jupyter notebooks using Databricks Spark.
Developed different modules in microservices to collect application stats for visualization.
Worked on Docker and Kubernetes to containerize and deploy applications.
Implemented NiFi pipelines to export data from HDFS to cloud locations such as AWS and Azure.
Ingested data from Azure Event Hub for real-time data ingestion into various applications.
Experience designing solutions with Azure tools such as Azure Data Factory, Azure Data Lake, Azure SQL, Azure SQL Data Warehouse, and Azure Functions.
Implemented data lake migration from on-prem clusters to Azure for highly scalable solutions.
Worked on implementing various Airflow automations for building integrations between clusters (see the sketch below).
Environment: Hive, Sqoop, Linux, Cloudera CDH 5, Scala, Kafka, HBase, Avro, Spark, ZooKeeper, MySQL, Azure, Databricks, Python, Airflow.
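A minimal Airflow DAG sketch for the kind of cluster-to-cluster automation described above, chaining a Sqoop import and a Spark transform; the DAG id, connection string, commands, and schedule are illustrative placeholders rather than actual production DAGs.

```python
# Minimal Airflow DAG sketch: a daily Sqoop import followed by a Spark
# transform. All ids, hosts, and paths are illustrative placeholders.
from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator  # airflow.operators.bash_operator on Airflow 1.x

with DAG(
    dag_id="daily_orders_pipeline",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:

    sqoop_import = BashOperator(
        task_id="sqoop_import_orders",
        bash_command=(
            "sqoop import --connect jdbc:mysql://dbhost/sales "
            "--table orders --target-dir /data/raw/orders/{{ ds }} -m 4"
        ),
    )

    spark_transform = BashOperator(
        task_id="spark_transform_orders",
        bash_command="spark-submit /opt/jobs/orders_etl.py --run-date {{ ds }}",
    )

    sqoop_import >> spark_transform  # run the transform only after the import succeeds
```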
Responsibilities
Worked on building and developing ETL pipelines using Spark-based applications.
Worked on migrating RDBMS data into data lake applications.
Built optimized Hive and Spark jobs for data cleansing and transformations.
Developed Spark Scala applications optimized to complete on time.
Worked on various optimization techniques in Hive for data transformations and loading.
Expert in working with dynamic data schema evolution using formats like Avro.
Built an API on top of HBase data for external teams to use for quick lookups (see the sketch below).
Experience building Impala scripts for quick retrieval of data exposed through Tableau.
Experience developing various Oozie actions for automation purposes.
Developed a monitoring platform for our jobs in Kibana and Grafana.
Developed real-time log aggregations in Kibana for analyzing data.
Worked on developing NiFi pipelines for extracting data from external sources.
Developed Jenkins pipelines for data pipeline deployments.
Worked on building different modules in scalable Spring Boot applications.
Developed Docker containers to automate runtime environments for various applications.
Expert in building ingestion pipelines for reading real-time data from Kafka.
Worked on a PoC to set up Talend environments and custom libraries for different pipelines.
Developed various Python and shell scripts for various operations.
Worked in an Agile environment with various teams and projects in fast-paced settings.
Environment: Hadoop, Sqoop, Pig, HBase, Hive, Flume, Java 6, Eclipse, Apache Tomcat 7.0, Oracle, J2EE, Talend, NiFi, Scala, Python.
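A rough sketch of the HBase lookup API mentioned above, written as a small Flask service over happybase (the HBase Thrift client); the framework choice, Thrift host, table name, and column layout are assumptions, since the original does not name them.

```python
# Rough sketch of a lookup API over an HBase table: a small Flask service
# using happybase. Host, table name, and column family are placeholders.
from flask import Flask, jsonify, abort
import happybase

app = Flask(__name__)

HBASE_HOST = "hbase-thrift.example.com"   # placeholder Thrift server
TABLE = "transactions"                    # placeholder table name

@app.route("/transactions/<row_key>")
def lookup(row_key):
    # One connection per request keeps the sketch simple; use a pool in production.
    connection = happybase.Connection(HBASE_HOST)
    try:
        row = connection.table(TABLE).row(row_key.encode("utf-8"))
        if not row:
            abort(404)
        # HBase returns bytes keyed by b"family:qualifier"; decode for JSON output.
        return jsonify({k.decode(): v.decode() for k, v in row.items()})
    finally:
        connection.close()

if __name__ == "__main__":
    app.run(port=8080)
```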
Responsibilities
Environment: Hadoop, Sqoop, Pig, HBase, Hive, Flume, Java 6, Eclipse, Apache Tomcat 7.0, Oracle, J2EE.
Education
Master's at Eastern Illinois University, 2015
Bachelor of Technology at JNTUH, 2012