Data Analytics Internship Report (Santhosh)
Computer Science (Sreenidhi Institute of Science and Technology)
A Summer Industry Internship–II Report

on
Visualization and Analysis of India’s GDP using Amazon
Redshift
During
III Year II Semester Summer
Submitted to
The Department of Computer Science and Engineering
In partial fulfillment of the academic requirements of

Jawaharlal Nehru Technological University
For
The award of the degree of
Bachelor of Technology in
Computer Science and Engineering
By
SANTOSH LOLAM 20311A0599
Sreenidhi Institute of Science and Technology Yamnampet,

Ghatkesar, R.R. District, Hyderabad - 501301
Affiliated to
Jawaharlal Nehru Technological University
Hyderabad - 500085
Department of Computer Science and Engineering

Department of Computer Science and Engineering

Sreenidhi Institute of Science and Technology
CERTIFICATE
This is to certify that this Summer Industry Internship–II report on “Visualization and
Analysis of India’s GDP using AWS services”, submitted by Santhosh Lolam
(20311A0599) in the year 2023 in partial fulfillment of the academic requirements of
Jawaharlal Nehru Technological University for the award of the degree of Bachelor of
Technology in Computer Science and Engineering, is a bonafide record of the summer
industry internship carried out during the III B.Tech CSE II Semester under our guidance.
It will be evaluated in the IV B.Tech CSE I Semester.
This report has not been submitted to any other institute or university for the award of
any degree.
Project Coordinator Head of the Department
Mrs.B.Vasundhara Devi Dr .Aruna Varanasi

Assistant Professor Professor
Department of CSE Department of CSE
External Examiner
Date:-

DECLARATION
I, SANTOSH LOLAM (20311A0599), a student of SREENIDHI INSTITUTE OF
SCIENCE AND TECHNOLOGY, YAMNAMPET, GHATKESAR, studying in the IV year,
I semester of COMPUTER SCIENCE AND ENGINEERING, solemnly declare that the
Summer Industry Internship-II report titled “VISUALIZATION AND ANALYSIS OF
INDIA’S GDP USING AMAZON REDSHIFT” is submitted to SREENIDHI
INSTITUTE OF SCIENCE AND TECHNOLOGY in partial fulfillment of the requirements
for the award of the degree of Bachelor of Technology in COMPUTER SCIENCE AND
ENGINEERING.
I declare, to the best of my knowledge, that the work reported does not form part of any
dissertation submitted to any other University or Institute for the award of any degree.

ACKNOWLEDGEMENT
I would like to express my gratitude to all the people behind the scenes who helped me to
transform an idea into a real application.
I would like to thank my Project Coordinator, Mrs. B. Vasundhara Devi, for her technical
guidance, constant encouragement, and support in carrying out this project at college.
I profoundly thank Dr. ARUNA VARANASI, Head of the Department of Computer Science
& Engineering, who has been an excellent guide and a great source of inspiration for my
work.
I would like to express my heartfelt gratitude to my parents, without whom I would
not have been privileged to achieve and fulfill my dreams. I am grateful to our Principal,
Dr. T. Ch. Siva Reddy, who most ably runs the institution and has had a major hand in
enabling me to do this project.
The satisfaction and euphoria that accompany the successful completion of a task would
be incomplete without mention of the people who made it possible, and whose constant
guidance and encouragement crowned all the efforts with success. In this context, I
would like to thank all the other staff members, both teaching and non-teaching, who
extended their timely help and eased my task.

VISUALIZATION AND ANALYSIS OF INDIA’S GDP USING AMAZON REDSHIFT
Abstract
Storing data in Amazon Redshift along with Amazon S3 is of paramount importance in the
field of data analytics, serving as a foundational solution for secure, scalable, and reliable
data storage. Amazon S3's ability to handle diverse datasets, from raw to processed, makes
it an ideal choice for analytics workflows, ensuring seamless scalability as data volumes
grow. The durability and availability of Amazon S3 contribute to the reliability of analytics
processes, while robust security features such as access controls and encryption safeguard
sensitive data. The integration capabilities with various analytics tools streamline
workflows, allowing analysts to efficiently access and analyze data. Overall, Amazon S3
plays a central role in empowering organizations to derive meaningful insights from their
data while maintaining the integrity, security, and scalability required for
effective data analytics.

LIST OF FIGURES
S.NO   Figure No.   Title of Figure
1      3.1          Architectural Design
2      3.2          Use Case diagram
3      5.1          Create an IAM User account
4      5.2          Create a S3 Bucket
5      5.3          Load data into S3 Bucket
6      5.4          Query on data
7      5.5          Creating Redshift Cluster
8      5.6          Data Analysis Tools
9      5.7          Editing Storage Class
10     5.8          Load data into S3 Bucket
11     5.9          Query on data

INDEX
Abstract
List of Figures
1. INTRODUCTION
1.1 Scope
1.2 Existing System
1.3 Proposed System
2. SYSTEM ANALYSIS
2.1 Functional Requirement Specifications
2.2 Performance Requirements
2.3 Software Requirements
2.4 Hardware Requirements
3. SYSTEM DESIGN
3.1 Architecture Design
3.2 Modules
3.3 UML Diagrams
3.3.1 Use Case Diagrams
4. SYSTEM IMPLEMENTATION
5. OUTPUT SCREENS
6. INTERNSHIP FEEDBACK (Experience)
6.1 Challenges faced
7. CONCLUSIONS AND FUTURE SCOPE
BIBLIOGRAPHY
Appendix A: Abstract
Appendix B: Correlation between the Summer Industry Internship-II and the Program
Outcomes (POs), Program Specific Outcomes (PSOs)
Appendix C: Domain of Internship and Nature of internship

1. INTRODUCTION
Storing and managing data efficiently is a critical aspect of modern digital ecosystems, and
Amazon Simple Storage Service (Amazon S3) has emerged as a cornerstone in this endeavor.
As a highly scalable, durable, and secure object storage service, Amazon S3 offers
organizations a robust platform to store, retrieve, and manage vast amounts of data in the
cloud. This introduction
provides an overview of the key features and benefits of leveraging Amazon S3 for data
storage, highlighting its pivotal role in addressing the evolving needs of businesses in the
digital age. From its seamless scalability to advanced security measures and integration
capabilities, Amazon S3 has become a go-to solution for diverse applications, ranging from
data analytics and content distribution to backup and archiving. Understanding the
significance of Amazon S3 lays the foundation for harnessing the full potential of cloud-
based storage solutions in the pursuit of effective and streamlined data management.
1.1 Scope
The scope of Amazon S3 is vast, positioning it as a versatile solution in cloud-based
data management. Renowned for its exceptional scalability, S3 seamlessly accommodates
diverse storage needs, from modest datasets to expansive petabytes, ensuring adaptability to
evolving requirements. A pivotal application lies in data analytics, where S3 serves as a
central repository for structured and unstructured data, enhancing storage efficiency and
retrieval. Integration with various analytics tools augments the agility and efficacy of data-
driven decision-making. Beyond analytics, S3 plays a critical role in backup and recovery,
offering reliability and durability. Versioning, redundancy, and robust disaster recovery
strategies contribute to its utility. S3's versatility extends to content distribution, in
conjunction with Amazon CloudFront, enabling scalable, low-latency global delivery of
web content. For long-term data archiving, S3 provides cost-effective solutions, including
storage classes like Glacier. Its collaborative potential is evident in secure collaboration,
allowing teams to efficiently share and collaborate on documents and media files. In the IoT
era, S3 is a reliable storage solution for vast volumes of IoT-generated data. With robust
security features such as access controls and encryption, S3 is apt for storing sensitive and
regulated data, ensuring compliance. Supporting application hosting, S3 facilitates static
website hosting and serves as a storage backend for web applications. Seamless integration
with the AWS ecosystem enhances its functionality, presenting opportunities for
comprehensive and scalable cloud-based solutions.
1.2 Existing System
Before adopting Amazon Simple Storage Service (Amazon S3), organizations relied on
on-premises solutions or other cloud storage systems for data storage. On-premises
setups often involved physical servers and local infrastructure, posing
challenges in scalability and flexibility. Some organizations used other cloud storage
platforms and faced limitations in scalability and integration. The pre-Amazon
S3 era was characterized by a lack of seamless scalability and comprehensive features
in data storage systems.
The following are the drawbacks of the existing system:
• Limited Scalability
• Higher Upfront Costs
• Complex Maintenance
• Reduced Flexibility
• Limited Accessibility and Collaboration
1.3 Proposed System
The proposed system for data storage in Amazon Simple Storage Service (Amazon S3)
revolves around creating a streamlined and secure infrastructure. Utilizing S3 buckets as the
organizational framework, the system ensures the systematic categorization and storage of
diverse datasets. Access controls are meticulously configured to fortify security measures,
providing granular control over data access. Data transfer methods, including direct uploads
and seamless integration with AWS services, facilitate the efficient and secure flow of a
variety of data types into Amazon S3, ensuring adaptability to dynamic data requirements.
Strategic decisions regarding storage classes, such as Standard, Intelligent-Tiering, Glacier,
and Glacier Deep Archive, are made based on the specific characteristics of the data. This
approach optimizes storage by balancing considerations of durability, accessibility, and
cost-effectiveness. Enabling versioning enhances data integrity, offering protection against
accidental deletions or modifications. Automated backup strategies and lifecycle policies
are implemented, efficiently managing data retention periods and transitions between
storage classes. Moreover, the seamless integration of Amazon S3 with analytics tools
within the AWS ecosystem streamlines data analysis workflows. This integration empowers
organizations to extract valuable insights from stored data, fostering informed and
data-driven decision-making processes. In essence, the proposed system maximizes the
capabilities of Amazon S3, creating a scalable, secure, and efficient data storage solution
that addresses a spectrum of organizational needs within the dynamic landscape of cloud-
based storage services.
MERITS:
• Scalability and Flexibility
• Enhanced Security Measures
• Optimized Storage Cost
• Data Integrity and Disaster Recovery
• Seamless Integration for Analytics
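The storage-class and lifecycle decisions described in the proposed system can be sketched as a rule document. The following is a minimal sketch, not the configuration used in the internship: the rule ID, the "raw/" prefix, and the 30-day and 365-day thresholds are hypothetical values chosen for illustration. The dictionary follows the general shape of an S3 lifecycle configuration, and nothing is sent to AWS.

```python
import json


def build_lifecycle_rule(rule_id, prefix, transitions, expire_after_days=None):
    """Build one lifecycle rule as a plain dictionary.

    transitions is a list of (days, storage_class) tuples,
    for example (30, "GLACIER").
    """
    # Sort by age so the object moves to colder storage as it gets older.
    ordered = sorted(transitions)
    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in ordered
        ],
    }
    if expire_after_days is not None:
        rule["Expiration"] = {"Days": expire_after_days}
    return rule


# Hypothetical rule: archive objects under "raw/" after 30 days,
# then move them to deep archive after one year.
lifecycle_config = {
    "Rules": [
        build_lifecycle_rule(
            rule_id="archive-raw-data",
            prefix="raw/",
            transitions=[(365, "DEEP_ARCHIVE"), (30, "GLACIER")],
        )
    ]
}

print(json.dumps(lifecycle_config, indent=2))
```

Keeping the rule as data makes it easy to review the retention periods before they are applied to a bucket through the console or an SDK.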

2. SYSTEM ANALYSIS
System analysis for storing data in Amazon Simple Storage Service (Amazon S3) involves a
comprehensive examination of the requirements, functionalities, and constraints associated
with utilizing this cloud storage solution. The analysis encompasses several key aspects:
2.1 Functional Requirement Specification

Functional requirements for storing data in Amazon Simple Storage Service (Amazon S3)
encompass the essential features and capabilities necessary for an effective and seamless data
storage system. Here are key functional requirements:
Bucket Management:
• Creation: Users should be able to create new S3 buckets to logically organize and store data.
• Deletion: Authorized users should have the ability to delete buckets that are no longer
needed.
• Configuration: Users must be able to configure bucket properties, including access controls
and logging settings.
Data Upload and Retrieval:

• Direct Uploads: Users should be able to directly upload data to S3 buckets through the user
interface or APIs.
• Download and Retrieval: Authorized users should have the capability to retrieve and
download stored data from S3 buckets.
Access Controls:
• ACLs and Bucket Policies: Implement access control lists (ACLs) and bucket
policies to control who can access and perform operations on S3 buckets and objects.
Security Measures:
• Encryption: Implement encryption mechanisms for data in transit and at rest, ensuring
the security and confidentiality of stored information.
Usability:
• User Access Management: Implement user access management to control who can perform
specific actions within the system.

2.2 Performance Requirements
The performance requirements for storing data in Amazon Simple Storage Service (Amazon
S3) center on optimizing data transfer, retrieval, and system responsiveness. The system
must ensure high-speed data transfer between clients and S3 buckets, with clearly defined
minimum acceptable rates for uploads and downloads, accounting for network latency.
Minimizing latency in data access and retrieval operations is paramount, and the system
should support a specified number of concurrent requests without compromising
performance. Different storage classes, such as Standard and Glacier, should exhibit defined
performance characteristics, and the system must scale horizontally to handle increasing
data volumes while maintaining high availability and reliability. Data redundancy measures
should be in place to ensure availability in the event of hardware failures, and the system
should optimize data retrieval speed, especially for frequently accessed data. Throughput
requirements must be specified for data transfer operations, and seamless integration with
analytics tools and other AWS services should be ensured. Monitoring and reporting
mechanisms for performance metrics, including caching to optimize retrieval, should be
implemented to evaluate and maintain the system's efficiency, responsiveness, and
scalability over time.
2.3 Software Requirements:
➢ Operating System: Microsoft Windows XP
➢ Technology: Amazon S3
➢ Query language: SQL (used with Amazon S3 Select)
➢ Authentication mechanism: AWS Identity and Access Management
➢ Web-Browser: Google Chrome (Version 119.0.6045.200)
2.4 Hardware Requirements:
Processor : Intel P-IV based system

RAM : Min. 512 MB

3. SYSTEM DESIGN
Systems design is the process of defining the architecture, components, modules,
interfaces, and data for a system to satisfy specified requirements. One could see it as the
application of systems theory to product development. Here's an overview of the
system design:
3.1 Architectural Design
Fig 3.1 Architectural Design
All big data solutions begin with storing data. This is the first step in the big data pipeline.
You can store data with several different services from Amazon Web Services (AWS).
Amazon Simple Storage Service (Amazon S3) is one of the most commonly used services
for storing data. You will use the AWS Management Console to create an S3 bucket. You
will then add an AWS Identity and Access Management (IAM) user to a group that has full
access to Amazon S3. You will also upload files to Amazon S3, and run simple queries on
the data in Amazon S3. You must have permissions to access Amazon S3. IAM is a web
service for securely controlling access to AWS services. One best practice for managing
IAM permissions is to create groups of users with a set of permissions. These permissions
are controlled by IAM policies.

3.2 Modules
Section 1: Bucket Management
Bucket management in Amazon S3 includes creating and configuring storage containers
with fine-grained access controls, versioning, and lifecycle policies for efficient data
governance. Users can optimize storage costs and enhance security, tailoring configurations
to specific organizational needs through features like cross-region replication and
bucket policies.
Section 2: Data Upload and Retrieval

Data upload and retrieval in Amazon S3 let users transfer data in several ways: direct
uploads, multipart uploads for large files, and integration with data transfer tools.
Together, these methods provide a flexible and streamlined approach for managing data
within the storage service.
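To illustrate the multipart uploads mentioned above, the following minimal sketch plans how a large file could be split into parts. It only computes byte ranges and does not call AWS. The 8 MiB default part size and the 20 MiB example file are assumptions made for illustration; the 5 MiB minimum part size and the limit of 10,000 parts are the documented S3 multipart limits.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum for every part except the last one
MAX_PARTS = 10_000        # S3 maximum number of parts in one upload


def plan_parts(file_size, part_size=8 * MIB):
    """Return (part_number, start_offset, length) tuples covering the file."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the 5 MiB minimum")

    # Ceiling division: number of parts needed to cover the whole file.
    part_count = -(-file_size // part_size)
    if part_count > MAX_PARTS:
        raise ValueError("too many parts; choose a larger part_size")

    parts = []
    for index in range(part_count):
        offset = index * part_size
        length = min(part_size, file_size - offset)
        parts.append((index + 1, offset, length))  # part numbers start at 1
    return parts


# Hypothetical 20 MiB file split into 8 MiB parts.
parts = plan_parts(20 * MIB)
for number, start, length in parts:
    print(number, start, length)
```

Each tuple describes one part that could be uploaded independently, which is what allows large files to be transferred in parallel and retried part by part.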
Section 3: Access Controls

Access controls in Amazon S3 provide fine-grained permissions, managed through IAM
policies and ACLs, allowing users to regulate who can upload, download, or modify data
within a bucket. This security feature ensures controlled and secure access to S3 resources,
enhancing data protection and compliance.
Section 4: Security Measures

Security measures in Amazon S3 include encryption mechanisms for data in transit and at
rest, providing a secure environment. Monitoring and auditing capabilities, such as Amazon
CloudWatch, further enhance data protection by tracking access, modifications, and security
events within the storage service.
Section 5: Usability
Amazon S3's usability is reflected in its intuitive web interface, facilitating easy navigation,
bucket management, and access control configuration. User access management ensures a
secure and efficient experience, making it accessible for users to manage and retrieve
data seamlessly.

3.3 UML Diagram
UML Diagram for our application is below:
3.3.1 Use Case Diagram
Fig 3.2 Use case Diagram for Amazon S3 bucket
In UML, use-case diagrams model the behavior of a system and help to capture the
requirements of the system. Use-case diagrams describe the high-level functions and
scope of a system. These diagrams also identify the interactions between the system and
its actors.

4. SYSTEM IMPLEMENTATION
Implementing data storage in Amazon S3 involves configuring credentials, creating buckets,
and enabling secure data transfer and access controls. Security measures include encryption,
versioning, and automated lifecycle policies for cost-effective storage. Monitoring with
CloudWatch and integration with analytics tools facilitate performance tracking and
analysis. Thorough testing and documentation ensure effective deployment, complying with
legal requirements and offering scalability for growing data volumes in a streamlined and
secure storage solution.
4.1 Procedure
Task 1: Create an IAM user account
In this task, you will review the permissions for the awsusers IAM group and add the
awsuser user to that group.
Step 1: Review users and group permissions in the IAM console
In this step, you will review the permissions attached to the awsusers group.
• On the AWS Management Console, on the Services menu, choose Services.
• From the list of services, choose IAM.
• In the navigation pane, choose Groups.
• Choose the awsusers group.
• Choose the Permissions tab.
Notice that the AmazonS3FullAccess policy is attached to the group.
• Choose Show Policy.
The policy document is in JavaScript Object Notation (JSON) format. This policy states that
users in that group are allowed to take all actions for Amazon S3 on all resources.
• Choose Cancel.
• In the Inline Policies section, choose Show Policy.
The policy document is in JSON format. This policy states that users in the group are not
allowed to perform the following specified actions on S3 objects:
o ObjectLegalHold – A legal hold prevents an object version from being overwritten or
deleted.
o ObjectRetention – A retention period determines how long an object is retained.
o BucketObjectLock – When an object is locked, it cannot be deleted or overwritten.
• Choose Cancel.
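The interaction between the two policies reviewed in this step can be sketched as follows. This is a simplified model written for illustration, not the lab's actual policy documents or the full IAM evaluation logic: the three denied action names are assumptions chosen to match the items listed above, the Resource element is ignored, and action matching is case-sensitive here. It shows the key rule that an explicit Deny overrides an Allow.

```python
from fnmatch import fnmatchcase

# Managed policy on the group: allow every S3 action on every resource.
FULL_ACCESS = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}

# Inline policy on the group. The action names are assumptions that
# correspond to the legal hold, retention, and object lock items above.
DENY_OBJECT_LOCK = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Deny",
            "Action": [
                "s3:PutObjectLegalHold",
                "s3:PutObjectRetention",
                "s3:PutBucketObjectLockConfiguration",
            ],
            "Resource": "*",
        }
    ],
}


def as_list(value):
    """Policy fields may hold a single string or a list of strings."""
    return value if isinstance(value, list) else [value]


def is_allowed(action, policies):
    """Return True if some statement allows the action and none denies it."""
    allowed = False
    for policy in policies:
        for statement in policy["Statement"]:
            patterns = as_list(statement["Action"])
            if not any(fnmatchcase(action, pattern) for pattern in patterns):
                continue
            if statement["Effect"] == "Deny":
                return False  # an explicit deny always wins
            allowed = True
    return allowed  # no matching Allow means an implicit deny


group_policies = [FULL_ACCESS, DENY_OBJECT_LOCK]
for action in ("s3:GetObject", "s3:PutObjectLegalHold", "ec2:RunInstances"):
    print(action, is_allowed(action, group_policies))
```

Under this model, members of the group can read and write objects, but cannot change legal holds, retention periods, or the bucket's object lock configuration, and they have no access to services other than S3.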
Step 2: Add awsuser to the awsusers group
In this task, you will add the awsuser to the awsusers group. You will also log out of the console
and log back in to the console with the awsuser account and password.
• In the navigation pane, choose Groups.
• Select the awsusers group.
• From the Group Actions menu, choose Add Users to Group.
• Select the awsuser user.
• Choose Add Users.
• From the navigation header, open the list of account actions and copy the account ID.
• In the list of account actions, choose Sign Out.
• To sign back in with the awsuser credentials, choose Sign in to the Console.
• Select IAM user and then use the following information to sign in:
Note: Remove the dashes from the account number before you enter it.
o Account: The account ID that you previously copied
o IAM user name: awsuser
o Password: myP@ssW0rd
Task 2: Load data into Amazon S3
Step 1: Create an S3 bucket
In this task, you will create a new S3 bucket.
• On the AWS Management Console, on the Services menu, choose Services.

• From the list of services, choose S3.
• Choose Create bucket.
• Enter a bucket name of 3 to 63 characters, using only lowercase letters, numbers, hyphens, and periods. Uppercase characters are not allowed.
• Choose Create bucket.
Note: S3 bucket names must be unique across all buckets in Amazon S3. If you get a conflict with
another bucket, add a digit and try again.
Note: Write down the bucket name because it will be used in future steps.
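Before choosing a name in the console, the basic naming rules can be checked locally. The following minimal sketch covers only a subset of the S3 naming rules (length, allowed characters, first and last character, no adjacent periods, not formatted as an IP address). The example names are hypothetical, and the check cannot tell whether a name is already taken by another account.

```python
import re

# 3 to 63 characters: lowercase letters, digits, hyphens, and periods,
# starting and ending with a letter or a digit.
BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
IP_ADDRESS = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name):
    """Check a candidate bucket name against a subset of the S3 rules."""
    if not BUCKET_NAME.fullmatch(name):
        return False
    if ".." in name:
        return False  # two adjacent periods are not allowed
    if IP_ADDRESS.fullmatch(name):
        return False  # names must not look like an IP address
    return True


# Hypothetical candidate names.
for candidate in ("india-gdp-data-2023", "MyBucket", "ab", "192.168.1.1"):
    print(candidate, is_valid_bucket_name(candidate))
```

Only the first candidate passes: the second contains uppercase characters, the third is too short, and the fourth is formatted as an IP address.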
Step 2: Upload an object
In this task, you will upload an object to the S3 bucket that you created. First, you must get the file.
• Download the lab1.csv file to a local directory.
• Choose the bucket that you created in the previous task.
• In the Amazon S3 console, choose Upload.
• Choose Add files.
• Browse to the directory where you stored the lab1.csv file.
• Choose the lab1.csv file.
• Choose Upload.
Step 3: Query the object you uploaded
In this task, you will query the object that you uploaded to verify that it was uploaded successfully.
• In the Amazon S3 console, choose the lab1.csv file.
• Review the file properties for the file that you uploaded.
Note: You should get a message stating that versioning is not enabled for the bucket. This
behavior is expected.
• From the Object actions menu, choose Query with S3 Select.
• Scroll down the page and choose Run SQL query.
• You should see the first few records from the file.
• Choose Add SQL from templates.

• Choose SELECT COUNT(*) FROM s3object s.
• Choose Copy SQL.
• Replace the previous query by deleting it and then pasting the query you copied.
• Choose Run SQL query.
• In the Result pane, you should get the total number of records, which is 5.
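The record count returned by the query can be cross-checked locally before uploading a file. The following minimal sketch counts the records in CSV text with Python's csv module. The sample content is a hypothetical stand-in with placeholder values, because the contents of lab1.csv are not reproduced in this report, and treating the first line as a header is an assumption.

```python
import csv
import io

# Hypothetical stand-in for lab1.csv: one header line and five records.
SAMPLE_CSV = """id,value
1,placeholder-a
2,placeholder-b
3,placeholder-c
4,placeholder-d
5,placeholder-e
"""


def count_records(csv_text, has_header=True):
    """Count data records in CSV text, optionally skipping a header line."""
    reader = csv.reader(io.StringIO(csv_text))
    rows = [row for row in reader if row]  # ignore completely blank lines
    if has_header and rows:
        rows = rows[1:]
    return len(rows)


print(count_records(SAMPLE_CSV))
```

If the local count and the count reported in the Result pane differ by one, the usual cause is that one side treated the header line as a record.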
Step 4: Change the encryption properties and storage type
In this task, you will change the encryption setting and storage class for the lab1.csv file.
• In the Amazon S3 breadcrumbs, choose the bucket name for your bucket.
• In the Amazon S3 console, choose the lab1.csv file.
• From the Object actions menu, choose Edit server-side encryption.
• Choose Enable and Save changes.
• To return to the object overview page, choose Exit.
• From the Object actions menu, choose Edit storage class.
• Select Intelligent-Tiering and Save changes.
You receive a confirmation that you successfully edited the storage class.
Step 5: Upload a compressed file
In this task, you will upload a file that is compressed with gzip (a .gz file). First, you must get the file and
save it to a local directory.
• In the Amazon S3 console, choose your bucket from the breadcrumbs again.
• Choose Upload.
• Choose Add files, and choose the lab1.csv.gz file that you downloaded previously.
• Choose Upload.
• Select the lab1.csv.gz file.
• To close the Upload: status page, choose Exit.
• From the Object actions menu, choose Query with S3 Select.
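A compressed file such as lab1.csv.gz can be produced from the plain CSV file with Python's gzip module. The following minimal sketch compresses a small file and reads it back to confirm that nothing was lost. The file contents are placeholder values, and the temporary directory stands in for the local directory used in the lab.

```python
import gzip
import os
import shutil
import tempfile


def compress_file(source_path, target_path):
    """Write a gzip-compressed copy of source_path to target_path."""
    with open(source_path, "rb") as source, gzip.open(target_path, "wb") as target:
        shutil.copyfileobj(source, target)


def read_gzip_text(path):
    """Return the decompressed contents of a gzip file as text."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return handle.read()


ORIGINAL_TEXT = "id,value\n1,placeholder\n"  # placeholder CSV content

with tempfile.TemporaryDirectory() as workdir:
    csv_path = os.path.join(workdir, "lab1.csv")
    gz_path = csv_path + ".gz"
    with open(csv_path, "w", encoding="utf-8", newline="") as handle:
        handle.write(ORIGINAL_TEXT)
    compress_file(csv_path, gz_path)
    round_trip = read_gzip_text(gz_path)

print(round_trip == ORIGINAL_TEXT)
```

The compressed file holds the same records as the original, so a query on lab1.csv.gz should return the same results once the compression type is set to GZIP in the query settings.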

5. OUTPUT SCREENS
Output screens for the various functionalities in our application are shown here,
along with their descriptions.
Task 1: Create an IAM role with required permissions

You must have permissions to access Amazon S3. IAM is a web service for securely
controlling access to AWS services. One best practice for managing IAM permissions is to
create groups of users with a set of permissions. These permissions are controlled by IAM
policies. An IAM policy is an entity that you attach to identities or resources to
define permissions.
Fig 5.1 Create an IAM User account
Task 2: Load data into Amazon S3
Buckets and objects are the basic building blocks for Amazon S3. You create buckets and
add objects to the buckets. Objects in Amazon S3 can be up to 5 TB. You can set individual
object properties—such as encryption at rest and storage class type—in the Amazon S3
console. Amazon S3 supports two kinds of encryption: Advanced Encryption Standard
(AES)-256, and AWS Key Management Service (AWS KMS).
If you select server-side encryption, each object has a unique key. The keys are also
encrypted with a master key that AWS rotates regularly. If you choose to use AWS KMS,
your objects will also be encrypted with unique keys, but you will manage those keys
yourself.
When you uploaded the lab1.csv file, you accepted the default storage class, which is
Standard. Amazon S3 provides six different storage classes, each with different properties
and cost structures.
Fig 5.2 Create a S3 Bucket
Fig 5.3 Load data into S3 Bucket
Fig 5.4 Query on data
Fig 5.5 Creating Redshift Cluster
Fig 5.6 Data Analysis Tools
Fig 5.7 Editing Storage Class
Fig 5.8 Load data into S3 Bucket
Fig 5.9 Query on data
6. INTERNSHIP FEEDBACK
6.1 CHALLENGES FACED
Performing all the lab activities and referring to the PowerPoint presentations provided
was a good experience. It was also a new experience that helped me enhance my skills by
using all the applications provided in the internship. I gained hands-on experience with
the tools on the AWS platform by performing various lab activities. The guided labs
were the building blocks that had to be learnt in order to complete the challenge labs,
which were demanding and compact.
7. CONCLUSION AND FUTURE SCOPE
CONCLUSION
In conclusion, employing AWS data analytics with data stored in Amazon S3, coupled with
Identity and Access Management (IAM), establishes a robust and secure foundation for
scalable and efficient data processing. Amazon S3 serves as a highly durable and scalable
storage solution, accommodating diverse data types and volumes. IAM ensures secure
access controls, allowing fine-grained permissions to regulate who can interact with the
data. This integrated approach facilitates seamless data analytics workflows, from ingestion
to transformation and analysis. The combination of these AWS services enables
organizations to harness the power of their data, ensuring reliability, scalability, and
stringent security measures throughout the entire data lifecycle.
FUTURE SCOPE
The future scope of AWS data analytics in storing data using Amazon S3 and IAM (Identity
and Access Management) is poised for continued growth and innovation. As organizations
increasingly prioritize data-driven decision-making, the demand for scalable and secure data
storage solutions coupled with robust analytics capabilities is set to surge. AWS, with its
comprehensive suite of services, including Amazon S3 for durable and scalable storage, and
IAM for fine-grained access control, positions itself at the forefront of this evolution. Future
developments may see enhanced integration with machine learning and AI services,
enabling more sophisticated analytics. Additionally, advancements in real-time analytics,
data governance, and compliance features within the AWS ecosystem are likely, offering
organizations powerful tools to derive actionable insights from their data while ensuring
security and compliance standards are met. The collaborative nature of AWS services is
expected to foster an ecosystem where seamless interactions between storage, access
control, and analytics components drive continuous innovation in data analytics solutions.
BIBLIOGRAPHY
[1] https://awsacademy.instructure.com
[2] Grady Booch, James Rumbaugh, Ivar Jacobson. The Unified Modeling Language
User Guide. Addison-Wesley, Reading, Mass., 1999.
[3] https://docs.aws.amazon.com/s3/?id=docs_gateway#lang/en_us
[4] https://medium.com/aws-lambda-serverless-developer-guide-with-hands/amazon-s3-main-features-buckets-and-objects-use-cases-and-how-it-works-b2689024e1b6
[5] www.w3schools.com
[6] www.wikipedia.org
APPENDIX A: ABSTRACT
Sreenidhi Institute of Science and Technology

Summer Industry Internship-II
Batch No: B18
Roll No: 20311A0599
Name: SANTHOSH LOLAM
Title: VISUALIZATION AND ANALYSIS OF INDIA’S GDP USING AMAZON REDSHIFT
ABSTRACT
Storing data in Amazon Redshift is of paramount importance in the field of data analytics, serving as a
foundational solution for secure, scalable, and reliable data storage. Amazon S3's ability to handle diverse
datasets, from raw to processed, makes it an ideal choice for analytics workflows, ensuring seamless
scalability as data volumes grow. The durability and availability of Amazon S3 contribute to the
reliability of analytics processes, while robust security features such as access controls and encryption
safeguard sensitive data. The integration capabilities with various analytics tools streamline workflows,
allowing analysts to efficiently access and analyze data. Overall, Amazon S3 plays a central role in
empowering organizations to derive meaningful insights from their data while maintaining the integrity,
security, and scalability required for effective data analytics.
Student 1: SANTHOSH LOLAM
Project Coordinator: Mrs. B. Vasundhara Devi
HOD-CSE: Dr. Aruna Varanasi

Dept of CSE
APPENDIX B: CORRELATION BETWEEN THE SUMMER INDUSTRY
INTERNSHIP-II AND THE PROGRAM OUTCOMES (POs), PROGRAM
SPECIFIC OUTCOMES (PSOs)
Batch No:B18
Title
Roll No Name

VISUALIZATION AND ANALYSIS OF
INDIA’S GDP USING AMAZON REDSHIFT
Table 1: Project/Internship correlation with appropriate POs/PSOs
(Please specify level of correlation, H/M/L, against POs/PSOs)
H = High, M = Moderate, L = Low

SREENIDHI INSTITUTE OF SCIENCE AND TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Projects Correlation with POs/PSOs
PO1   PO2   PO3   PO4   PO5   PO6   PO7   PO8   PO9   PO10  PO11  PO12  PSO1  PSO2  PSO3
M     H     H     H     H     L     M     H     M     H     H     H     H     H     M
Student: SANTHOSH LOLAM    Project Coordinator    HOD-CSE

Dept of CSE
APPENDIX C: DOMAIN OF INTERNSHIP AND NATURE OF INTERNSHIP
Batch No:B18
Title
Roll No Name
VISUALIZATION AND ANALYSIS OF INDIA’S GDP USING AMAZON REDSHIFT
Table 2: Nature of the Project/Internship work (Please tick √ as appropriate for your project)
Nature of project:
• Product
• Application
• Research
• Others (Please specify)
Batch No.: B18
Title: VISUALIZATION AND ANALYSIS OF INDIA’S GDP USING AMAZON REDSHIFT
√

Dept of CSE
Table 3: Domain of the Project/Internship work (Please tick √ as appropriate for your project)
Domain of the project:
• Artificial Intelligence, Machine Learning, Deep Learning
• Computer Networks, Information Security, Cyber Security
• Data Warehousing, Data Mining, Big Data Analytics
• Cloud Computing, Internet of Things
• Software Engineering, Image Processing
Batch No.: B18
Title: VISUALIZATION AND ANALYSIS OF INDIA’S GDP USING AMAZON REDSHIFT
√

Dept of CSE
