
Durgesh

Sr. Data Architect / Modeler / Big Data


Summary

 Highly effective Data Architect with over 8 years of experience specializing in big data, cloud, and data and analytics platforms.
 Excellent knowledge in Data Analysis, Data Validation, Data Cleansing, Data Verification and identifying data mismatches.
 Excellent experience with Teradata SQL queries, Teradata indexes, and utilities such as MultiLoad, TPump, FastLoad, and FastExport.
 Expert in writing SQL queries and optimizing the queries in Oracle, SQL Server 2008 and Teradata.
 Experience in Architecture, Design and Development of large Enterprise Data Warehouse (EDW) and Data-marts for target user-base
consumption.
 Performed data analysis and data profiling using complex SQL on various source systems, including Oracle and Teradata.
 Excellent Software Development Life Cycle (SDLC) experience, with good working knowledge of testing methodologies, disciplines, tasks, resources, and scheduling.
 Strong experience in using Excel and MS Access to dump data and analyze it based on business needs.
 Expertise in Data Modeling, database design, and implementation of Oracle and AWS Redshift databases, including administration and performance tuning.
 Experience in analyzing data using Hadoop Ecosystem including HDFS, Hive, Spark, Spark Streaming, Elastic Search, Kibana, Kafka, HBase,
Zookeeper, PIG, Sqoop, Flume.
 Experienced in working with Excel pivot tables and VBA macros for various business scenarios.
 Strong experience in Data Analysis, Data Migration, Data Cleansing, Transformation, Integration, Data Import, and Data Export.
 Performed data transformation using Pig scripts on AWS EMR and AWS RDS.
 Experience working with data modeling tools like Erwin, Power Designer and ER Studio.
 Experience in importing and exporting data using Sqoop from HDFS to Relational Database Systems (RDBMS) and from RDBMS to HDFS.
 Experience in data analysis using Hive, Pig Latin, and Impala.
 Well versed in Normalization / De-normalization techniques for optimum performance in relational and dimensional database
environments.
 Good understanding of AWS, big data concepts and Hadoop ecosystem.
 Experienced in various Teradata utilities like Fastload, Multiload, BTEQ, and Teradata SQL Assistant.
 Developed and managed SQL, Python, and R code bases for data cleansing and data analysis using Git version control.
 Exposure to Core Java.
 Extensive ETL testing experience using Informatica 8.6.1/8.1 (PowerCenter/PowerMart: Designer, Workflow Manager, Workflow Monitor, and Server Manager).
 Excellent at creating project artifacts, including specification documents, data mapping documents, and data analysis documents.
 An excellent team player and technically strong, with the ability to work with business users, project managers, team leads, architects, and peers, maintaining a healthy environment in the project.

Technical Skills

Analysis and Modeling Tools: IBM Infosphere, SQL Power Architect, Oracle Designer, Erwin 9.6/9.5, ER/Studio 9.7, Sybase Power Designer.
Database Tools: Oracle 12c/11g, MS Access, Microsoft SQL Server 2014/2012, Teradata 15/14, PostgreSQL, Netezza.
Big Data Technologies: Hadoop, HDFS 2, Hive, Pig, HBase, Sqoop, Flume.
Cloud Platform: AWS, EC2, S3, SQS, Azure.
OLAP Tools: Business Objects, Tableau, SAP BO, SSAS, Crystal Reports 9.
Operating System: Windows, Dos, Unix, Linux.
Reporting Tools: Business Objects, Crystal Reports. 
Tools & Software: TOAD, MS Office, BTEQ, Teradata SQL Assistant.
ETL Tools: SSIS, Pentaho, Informatica PowerCenter 9.6, SAP Business Objects XI R3.1/XI R2, Web Intelligence.
Other Tools: TOAD, SQL*Plus, SQL*Loader, MS Project, MS Visio, and MS Office; have also worked with C++, UNIX, PL/SQL, etc.

Capital Group, Irvine, CA Sep’18 – Present


Sr. Data Architect / Modeler
Responsibilities:

 Consulted on and supported Data Architect / Data Modeler initiatives in the development of an integrated data repository transformed from a legacy system to a new operational system and data warehouse.
 Provided solutions for ingesting data into the new Hadoop big data platform by designing data models for multiple features to help analyze the data on graph databases.
 Applied business rules in modeling data marts and used data profiling to model new data structures.
 Delivered scope, requirements, and design for transactional and data warehouse system which included Oracle DB, SQL server, and
Salesforce database. 
 Designed and developed architecture for data services ecosystem spanning Relational, NoSQL, and Big Data technologies.
 Designed ODS structures and data mart structures.
 Developed long term data warehouse roadmap and architectures, designs and builds the data warehouse framework per the roadmap.
 Designed the Logical Data Model using ERWIN 9.64 with the entities and attributes for each subject area.
 Extracted data from a transactional system into a staging area, then transformed and loaded it into a star schema.
 Worked on AWS, architecting a solution to load data, create data models, and run BI on it.
 Worked on logical and physical modeling and ETL design for manufacturing data warehouse applications.
 Involved in creating Hive tables and loading and analyzing data using Hive queries; developed Hive queries to process the data and generate data cubes for visualization.
 Implemented join optimizations in Pig using skewed and merge joins for large datasets.
 Designed and developed a Data Lake using Hadoop for processing raw and processed claims via Hive and Informatica.
 Developed and implemented different Pig UDFs to write ad-hoc and scheduled reports as required by the Business team.
 Involved in Normalization/De-normalization techniques for optimum performance in relational and dimensional database environments.
 Designed the Big Data platform technology architecture; the scope included data intake, data staging, data warehousing, and a high-performance analytics environment.
 Involved in loading data from the Linux file system to HDFS, importing and exporting data into HDFS and Hive using Sqoop, and implementing partitioning, dynamic partitions, and buckets in Hive (a brief sketch follows this list).
 Used SSRS to create standard, customized, on-demand, and ad-hoc reports, and was involved in analyzing multi-dimensional reports in SSRS.
 Created dimensional data models based on hierarchical source data and implemented them on Teradata, achieving high performance without special tuning.
 Focused on architecting NoSQL databases like MongoDB, Cassandra, and Caché.
 Performed routine management operations for MongoDB, including configuration and performance analysis, and diagnosed MongoDB performance issues.
 Involved in designing Logical and Physical data models for different database applications using Erwin.
 Performed data modeling and designed, implemented, and deployed high-performance custom applications at scale on Hadoop/Spark.
 Implemented Data Integrity and Data Quality checks in Hadoop using Hive and Linux scripts.
 Reverse engineered some of the databases using Erwin. 
 Proficient in SQL across a number of dialects, including MySQL, PostgreSQL, Redshift, SQL Server, and Oracle.
 Routinely worked with large internal and vendor data sets, performing performance tuning, query optimization, and production support for SAS and Oracle 12c.
 Worked on defining data architecture for data warehouses, data marts, and business applications.
 Specified the overall Data Architecture for all areas and domains of the enterprise, including Data Acquisition, ODS, MDM, Data Warehouse, Data Provisioning, ETL, and BI.
 Developed data mapping, data governance, transformation, and cleansing rules for the Master Data Management architecture involving OLTP and ODS.
 Performance tuning and stress-testing of NoSQL database environments in order to ensure acceptable database performance in production
mode.
 Implemented strong referential integrity and auditing by the use of triggers and SQL Scripts. 
 Designed and developed T-SQL stored procedures to extract, aggregate, transform, and insert data.
 Created and maintained SQL Server scheduled jobs, executing stored procedures for the purpose of extracting data from DB2 into SQL
Server.
 Experience with SQL Server Reporting Services (SSRS) to author, manage, and deliver both paper-based and interactive Web-based reports.
 Performed Hive programming for applications that were migrated to big data using Hadoop.
 Generated parameterized queries for generating tabular reports using global variables, expressions, functions, and stored procedures using
SSRS.
 Extensive knowledge in Data loading using PL/ SQL Scripts and SQL Server Integration Services (SSIS).
 Worked in a team using the ETL tool Informatica to populate the database, transforming data from the old database to the new database using Oracle and SQL Server.
 Ensured that data architecture tasks were executed within deadlines.
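
A minimal sketch of the Hive pattern referenced above (a partitioned, bucketed table with a dynamic-partition load). The table, column names, and bucket count are illustrative assumptions, not an actual project schema.

```sql
-- Illustrative only: a curated Hive table partitioned by load date and bucketed on claim_id.
CREATE TABLE claims_curated (
    claim_id      BIGINT,
    member_id     BIGINT,
    claim_amount  DECIMAL(12,2),
    claim_status  STRING
)
PARTITIONED BY (load_date STRING)
CLUSTERED BY (claim_id) INTO 32 BUCKETS
STORED AS ORC;

-- Dynamic-partition insert from a raw staging table (e.g. one landed in HDFS via Sqoop).
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

INSERT OVERWRITE TABLE claims_curated PARTITION (load_date)
SELECT claim_id, member_id, claim_amount, claim_status, load_date
FROM claims_raw;
```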

Environment: DB2, CA Erwin 9.6/r9.64, Oracle 12c, Salesforce, MS Office, SQL Architect, TOAD Benchmark Factory, SQL Loader, PL/SQL, SharePoint, Talend, SQL Server 2008/2012, Hive, Pig, Hadoop, Spark, AWS.

Center Light Health, Bronx, NY Apr’17 – Aug’18


Data Architect /Modeler
Responsibilities:

 Developed and maintained the data definitions, data models, data flow diagrams, metadata management, business semantics, and metadata
workflow management. 
 Integrated 40 data sources into one data repository utilizing modeling tools (ER Studio) and an ETL tool (PL/SQL).
 Involved in the data cleaning procedure, removing old, corrupted, or irrelevant data in consultation with the teams.
 Worked with Big Data Hadoop Ecosystem in ingestion, storage, querying, processing and analysis of big data and conventional RDBMS.
 Involved in Relational and Dimensional Data modeling for creating the Logical and Physical Design of the Database and ER Diagrams with all related entities and relationships, based on the rules provided by the business manager, using ER Studio.
 Worked on Normalization and De-normalization concepts and design methodologies such as the Ralph Kimball and Bill Inmon data warehouse methodologies.
 Used database design and database modeling concepts to ensure data accessibility and security.
 Designed both 3NF data models for ODS and OLTP systems and dimensional data models using Star and Snowflake schemas.
 Responsible for delivering and coordinating data-profiling, data-analysis, data-governance, data-models (conceptual, logical, physical),
data-mapping, data-lineage and reference data management.
 Worked on DataStage admin activities such as creating ODBC connections to various data sources, server startup and shutdown, creating environment variables, and creating DataStage projects.
 Participated in all phases of project including Requirement gathering, Architecture, Analysis, Design, Coding, Testing, Documentation and
warranty period.
 Worked with data governance, data quality, data lineage, and data architecture to design various models and processes.
 Worked on SQL Server concepts SSIS (SQL Server Integration Services), SSAS (Analysis Services) and SSRS (Reporting Services).
 Generated DDL (Data Definition Language) scripts using ER Studio and assisted the DBA in the physical implementation of data models.
 Extensively worked on creating the migration plan to Amazon web services (AWS).
 Extracted large data sets from AWS and the Elasticsearch engine using SQL queries to create reports.
 Completed enhancement for MDM (Master data management) and suggested the implementation for hybrid MDM (Master Data
Management).
 Exported data from the HDFS environment into RDBMS using Sqoop for report generation and visualization purposes.
 Generated comprehensive analytical reports by running SQL queries against current databases to conduct Data Analysis.
 Performed Data Analysis, Data Migration, and data profiling using complex SQL on various source systems, including Oracle and Teradata.
 Designed and documented Use Cases, Activity Diagrams, Sequence Diagrams, OOD (Object Oriented Design) using UML and Visio.
 Used forward engineering to generate DDL from the Physical Data Model and handed it to the DBA.
 Integrated Spotfire visualization into client's Salesforce environment.
 Involved in Normalization and De-normalization of existing tables for faster query retrieval, and designed both 3NF data models for ODS and OLTP systems and dimensional data models using Star and Snowflake schemas.
 Involved in planning, defining, and designing the database using ER Studio based on business requirements, and provided documentation.
 Worked with BTEQ to submit SQL statements, import and export data, and generate reports in Teradata.
 Developed the full life cycle of a Data Lake and Data Warehouse with big data technologies like Spark and Hadoop.
 Created data masking mappings to mask sensitive data between production and test environments (a brief sketch follows this list).
 Responsible for all metadata relating to the EDW's overall data architecture, descriptions of data objects, access methods and security
requirements.
 Involved in data profiling and data cleansing, ensuring the data is accurate and analyzed as it is transferred from OLTP to the data marts and Data Warehouse.
 Used an Agile methodology for Data Warehouse development using Kanbanize.
 Worked with DBA group to create Best-Fit Physical Data Model from the Logical Data Model using Forward Engineering.
 Worked with NoSQL databases like HBase in creating HBase tables to load large sets of semi-structured data coming from various sources.
 Developed DataStage design concepts, execution, testing, and deployment on the client server.
 Developed Linux shell scripts using nzsql/nzload utilities to load data from flat files into the Netezza database.
 Validated report data by writing SQL queries in PL/SQL Developer against the ODS.
 Involved in user training sessions and assisting in UAT (User Acceptance Testing).
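
A minimal sketch of the data-masking idea referenced above (copying production data to a test environment with sensitive columns obfuscated). The tables, columns, and masking rules are hypothetical.

```sql
-- Illustrative only: populate a test table from production with sensitive fields masked.
INSERT INTO member_test (member_id, first_name, last_name, ssn, email)
SELECT
    member_id,
    'FN_' || member_id                    AS first_name,  -- replace names with surrogates
    'LN_' || member_id                    AS last_name,
    'XXX-XX-' || SUBSTR(ssn, 8, 4)        AS ssn,         -- keep only the last four digits
    'user' || member_id || '@example.com' AS email        -- synthetic, non-routable address
FROM member_src;
```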

Environment: ER Studio, AWS, OLTP, Teradata r15, Sqoop 1.4, Cassandra 3.11, MongoDB 3.6, HDFS, Linux, Shell scripts, NoSQL, SSIS, SSAS, HBase 1.2, MDM.

Santander Bank, Dorchester, MA Oct’15 – Mar’17


Sr. Data Analyst /Modeler
Responsibilities:

 Worked on OLAP for data warehouse and data mart development using the Ralph Kimball methodology as well as OLTP models (3NF), interacting with all the involved stakeholders and SMEs to derive the solution.
 Conducted knowledge-sharing sessions with the Architect and SMEs and designed the Data Flow Diagram.
 Designed the ER diagrams, logical model (relationships, cardinality, attributes, and candidate keys), and physical database (capacity planning, object creation, and aggregation strategies) for Oracle and Teradata per business requirements using Erwin.
 Designed the third normal form (3NF) target data model and mapped it to the logical model.
 Involved in extensive data validation using SQL queries and back-end testing.
 Generated DDL statements for the creation of new ER/Studio objects such as tables, views, indexes, packages, and stored procedures.
 Designed MOLAP/ROLAP cubes on the Teradata database using SSAS.
 Used SQL for querying the database in a UNIX environment.
 Created BTEQ, FastExport, MultiLoad, TPump, and FastLoad scripts for extracting data from various production systems.
 Worked along with the ETL team on documentation of transformation rules for data migration from OLTP to the warehouse for reporting purposes.
 Created views and extracted data from Teradata base tables, and uploaded data to the Oracle staging server from Teradata tables using the FastExport concept.
 Worked with RDS, implementing models and data on RDS.
 Developed mapping spreadsheets for the ETL team with source-to-target data mapping, including physical naming standards, data types, volumetrics, domain definitions, and corporate metadata definitions.
 Designed Star and Snowflake schemas on dimension and fact tables (a brief sketch follows this list).
 Worked with the Data Vault methodology and developed normalized logical and physical database models.
 Transformed the Logical Data Model to a Physical Data Model, ensuring primary key and foreign key relationships in the PDM, consistency of data attribute definitions, and primary index considerations.
 Wrote and ran SQL, BI, and other reports, analyzing data and creating metrics, dashboards, pivots, etc.
 Gathered and analyzed business data requirements and modeled these needs, working closely with the users of the information, application developers, and architects to ensure the information models are capable of meeting their needs.
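
A minimal sketch of the star-schema pattern referenced above: a fact table keyed to conformed dimensions. Table names, columns, and grain are illustrative assumptions.

```sql
-- Illustrative only: date and account dimensions plus a transaction fact table.
CREATE TABLE dim_date (
    date_key      INTEGER     NOT NULL PRIMARY KEY,  -- e.g. 20160131
    calendar_date DATE        NOT NULL,
    fiscal_month  SMALLINT    NOT NULL
);

CREATE TABLE dim_account (
    account_key   INTEGER     NOT NULL PRIMARY KEY,  -- surrogate key
    account_no    VARCHAR(20) NOT NULL,              -- natural key from the source system
    account_type  VARCHAR(30)
);

CREATE TABLE fact_transaction (
    date_key      INTEGER       NOT NULL REFERENCES dim_date (date_key),
    account_key   INTEGER       NOT NULL REFERENCES dim_account (account_key),
    txn_amount    DECIMAL(15,2) NOT NULL,
    txn_count     INTEGER       NOT NULL
);
```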

Environment: SQL Server, Erwin 9.1, Oracle, Informatica, RDS, Big Data, JDBC, NoSQL, Star Schema, Snowflake Schema, Python, MySQL, PostgreSQL.

T-Mobile, Redmond, WA Mar’13 – Sep’15


Sr. Data Modeler /Analyst
Responsibilities:

 Involved in the projects from requirements analysis onward to better understand the requirements and support the development team with a clearer understanding of the data.
 Developed Data Mapping, Data Governance, Transformation and Cleansing rules for the Master Data Management Architecture
involving OLTP, ODS and OLAP.
 Involved in Data Architecture, Data profiling, Data analysis, data mapping and Data architecture artifacts design.
 Responsible for Relational data modeling (OLTP) using MS Visio (Logical, Physical and Conceptual).
 Analyzed the data and provided resolutions by writing analytical/complex SQL in case of data discrepancies.
 Involved in logical and physical database design and development, Normalization, and Data modeling using Erwin and SQL Server Enterprise Manager.
 Prepared ETL technical Mapping Documents along with test cases for each Mapping for future developments to maintain Software
Development Life Cycle (SDLC).
 Designed OLTP system environment and maintained documentation of Metadata.
 Worked on Amazon Redshift and AWS, architecting a solution to load data and create data models.
 Created the dimensional model for the reporting system by identifying required dimensions and facts.
 Used reverse engineering to connect to the existing database and create a graphical representation (E-R diagram).
 Used the Erwin modeling tool for publishing a data dictionary, reviewing the model and dictionary with subject matter experts, and generating data definition language.
 Coordinated with the DBA in implementing database changes and updating Data Models with changes implemented in development, QA, and Production.
 Created and executed test scripts, cases, and scenarios that determine optimal system performance according to specifications.
 Worked extensively with the DBA and Reporting teams on improving report performance with the use of appropriate indexes and partitioning.
 Extensive experience in PL/SQL programming: stored procedures, functions, packages, and triggers.
 Performed data modeling in Erwin; designed target data models for the enterprise data warehouse (Teradata).
 Designed and developed Oracle PL/SQL procedures and Linux and UNIX shell scripts for data import/export and data conversions.
 Experienced with BI Reporting in Design and Development of Queries, Reports, Workbooks, Business Explorer Analyzer, Query Builder,
Web Reporting.
 Generated various reports using SQL Server Reporting Services (SSRS) for business analysts and the management team (a brief query sketch follows this list).
 Automated and scheduled recurring reporting processes using UNIX shell scripting and Teradata utilities such as MultiLoad, BTEQ, and FastLoad.
 Participated in all phases including Analysis, Design, Coding, Testing and Documentation.
 Gathered and translated business requirements into detailed, production-level technical specifications, new features, and enhancements to
existing technical business functionality.
 Involved in Data flow analysis, Data modeling, Physical database design, forms design and development, data conversion, performance
analysis and tuning.
 Created and maintained data model standards, including master data management (MDM), and was involved in extracting data from various sources such as Oracle, SQL Server, Teradata, and XML.
 Worked with medical claim data in the Oracle database for Inpatient/Outpatient data validation, trend and comparative analysis.
 Used load utilities (FastLoad & MultiLoad) with the mainframe interface to load data into Teradata.
 Optimized and updated UML Models (Visio) and Relational Data Models for various applications.
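
A minimal sketch of a parameterized dataset query of the kind an SSRS tabular report might use; the table, columns, and report parameters (@StartDate, @EndDate, @Region) are assumptions for illustration.

```sql
-- Illustrative only: SSRS report parameters map to @StartDate, @EndDate, and @Region.
SELECT
    region,
    billing_month,
    COUNT(DISTINCT subscriber_id) AS subscribers,
    SUM(usage_minutes)            AS total_minutes
FROM dbo.usage_fact
WHERE billing_month BETWEEN @StartDate AND @EndDate
  AND (@Region = 'ALL' OR region = @Region)   -- 'ALL' returns every region
GROUP BY region, billing_month
ORDER BY region, billing_month;
```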

Environment: Erwin 9.0, Oracle 11g, SQL Server 2010, Teradata 14, XML, OLTP, PL/SQL, Linux, UNIX, MLoad, BTEQ, UNIX shell scripting

Trianz, Hyderabad, India Jun’11 – Feb’13


Data Analyst / Modeler
Responsibilities:

 Performed Data Analysis, Data Migration, and data profiling using complex SQL on various source systems, including Oracle and Teradata.
 Created logical and physical database models to design the OLTP system for applications using Erwin.
 Used forward engineering to create a physical data model with DDL that best suits the requirements from the logical data model, using Erwin for effective model management: sharing, dividing, and reusing model information.
 Worked with BTEQ to submit SQL statements, import and export data, and generate reports in Teradata.
 Translated business requirements into working logical and physical data models for Data Warehouse, Data marts and OLAP applications.
 Involved in using the ETL tool Informatica to populate the database and in data transformation from the old database to the new database using Oracle.
 Identified the entities and relationship between the entities to develop Conceptual Model using ERWIN.
 Involved in the creation, maintenance of Data Warehouse and repositories containing Metadata. 
 Wrote and executed unit, system, integration, and UAT scripts for Data Warehouse projects.
 Extensively used SQL, Transact SQL and PL/SQL to write stored procedures, functions, packages and triggers.
 Wrote and executed SQL queries to verify that data was moved from the transactional system to the DSS, Data Warehouse, and data mart reporting systems in accordance with requirements (a brief verification sketch follows this list).
 Excellent experience and knowledge of Data Warehouse concepts and dimensional data modeling using the Ralph Kimball methodology.
 Developed separate test cases for ETL process (Inbound & Outbound) and reporting.
 Designed Star and Snowflake data models for the Enterprise Data Warehouse using ERWIN.
 Created and maintained the Logical Data Model (LDM) for the project, including documentation of all entities, attributes, data relationships, primary and foreign key structures, allowed values, codes, business rules, glossary terms, etc.
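
A minimal sketch of the kind of verification query used to confirm that data landed in the warehouse as required, comparing row counts and totals between source and target for one load date. Table and column names are hypothetical.

```sql
-- Illustrative only: reconcile one load date between the transactional source and the warehouse.
SELECT
    s.row_cnt   AS source_rows,
    t.row_cnt   AS target_rows,
    s.total_amt AS source_amount,
    t.total_amt AS target_amount
FROM (SELECT COUNT(*) AS row_cnt, SUM(order_amount) AS total_amt
      FROM src_orders
      WHERE order_date = DATE '2012-06-30') s
CROSS JOIN
     (SELECT COUNT(*) AS row_cnt, SUM(order_amount) AS total_amt
      FROM dw_fact_orders
      WHERE order_date = DATE '2012-06-30') t;
```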

Environment: Oracle 9i, MS Visio, PL/SQL, Microsoft SQL Server 2000, Rational Rose, Data Warehouse, OLTP, OLAP, ERWIN, Informatica 9.x, Windows, SQL, SQL Server, Talend Data Quality, Talend Integration Suite 4.x, Flat Files, SVN.
