Data Analysis Essay PDF
In recent years, data science has rapidly become a buzzword and a popular career path, but the
ideas that make up this field have been around for nearly three centuries. This essay aims to provide
my perspective on the development and the future of data science and my previous experience
With large volumes of data now available, companies in almost every industry are focused on using
data for competitive advantage. At the same time, the widespread availability of data has
contributed to growing interest in new approaches for extracting useful information and
knowledge from data. The modern discipline called data science has emerged as a new approach
to handling this massive collection of data. Today, most fields deal with data in some form,
notably security, health care, industry, agriculture, transport, education, forecasting, and
telecommunications. Each field would also gain a different amount of return on
During my master's degree, I took several data science courses, such as artificial intelligence,
big data, business intelligence, and data analysis. The last interested me the most, because I
have always loved how we can transform data into knowledge and insights and tell stories with it.
Later, my first real interaction with data came during my time as an intern at Orange, a French
multinational telecommunications corporation. I interned as a data analyst and worked on a project
related to customers' data usage. In this project I had the opportunity to work with real client
data: starting from some very basic text data, Excel sheets, and log files, I had to do some data
cleaning and data transformation and design a data warehouse, ending up with some nice, interactive
visuals.
First, recognizing the business problem was the initial step toward tackling the project.
It is very important for a data scientist to ask the right questions. A lot of questions were raised
to understand the real business challenge before the project was completed, covering the available
data sources, the final objectives of the project, and so on. Essentially, our objective, in addition
to creating charts for existing data, was to predict the areas of the city that need to be covered by 4G because
Second came data preprocessing, because real-world data is not as well structured as the datasets
found online on platforms like Kaggle. Data preprocessing (which other people might call
data munging or data cleaning) is therefore so crucial that I can’t stress its importance enough.
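To make the idea of data cleaning concrete, here is a minimal sketch using only the Python standard library. The sample records, the column names (`customer_id`, `city`, `data_mb`), and the helper `clean_rows` are all hypothetical and invented for illustration; they are not taken from the internship project. The sketch covers three typical steps: trimming and normalizing text, dropping records with missing values, and removing duplicates.

```python
import csv
import io

# Hypothetical raw export (made up for illustration): stray whitespace,
# inconsistent casing, a missing usage value, and a duplicated record.
RAW = """customer_id,city,data_mb
 101,Paris ,350
102,lyon,
101,Paris,350
103,LYON,1200
"""


def clean_rows(text):
    """Return de-duplicated rows with normalized text and numeric usage."""
    seen = set()
    cleaned = []
    for row in csv.DictReader(io.StringIO(text)):
        customer_id = row["customer_id"].strip()
        city = row["city"].strip().title()  # "lyon" / "LYON" -> "Lyon"
        usage = row["data_mb"].strip()
        # Drop records with no usage value rather than guessing one.
        if not usage:
            continue
        record = (customer_id, city, float(usage))
        # Skip exact duplicates of a record that was already kept.
        if record in seen:
            continue
        seen.add(record)
        cleaned.append(
            {"customer_id": customer_id, "city": city, "data_mb": float(usage)}
        )
    return cleaned


rows = clean_rows(RAW)
for row in rows:
    print(row)
```

Dropping incomplete records is only one possible policy. Depending on the business question, filling in a default or flagging the record for review may be more appropriate.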
Finally, visualization and building different models from scratch was a steep learning curve for
me as a person who was still learning from MOOCs and textbooks. Fortunately, scikit-learn and
Keras (with a TensorFlow backend) came to my rescue, as they are easy to learn and well suited to
fast model prototyping and implementation in Python. In addition, I learned how to optimize the
models and fine-tune the hyperparameters of each model using several techniques.
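The essay does not say which tuning techniques were used. One common technique is grid search with cross-validation, and the sketch below shows the idea in plain Python on a small k-nearest-neighbours classifier. The synthetic two-class data, the candidate values of `k`, and the helper names (`knn_predict`, `cross_val_accuracy`, `grid_search`) are assumptions made for illustration. In practice, scikit-learn's `GridSearchCV` packages the same idea.

```python
import random
from collections import Counter


def knn_predict(train, point, k):
    """Classify `point` by majority vote among its k nearest training samples."""
    nearest = sorted(
        train,
        key=lambda s: (s[0][0] - point[0]) ** 2 + (s[0][1] - point[1]) ** 2,
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(data, k, folds=5):
    """Mean validation accuracy of k-NN over `folds` train/validation splits."""
    scores = []
    for i in range(folds):
        # Every folds-th sample (offset i) is held out for validation.
        valid = data[i::folds]
        train = [s for j, s in enumerate(data) if j % folds != i]
        correct = sum(knn_predict(train, x, k) == y for x, y in valid)
        scores.append(correct / len(valid))
    return sum(scores) / folds


def grid_search(data, candidates, folds=5):
    """Score every candidate k and return the best one with all scores."""
    results = {k: cross_val_accuracy(data, k, folds) for k in candidates}
    best = max(results, key=results.get)
    return best, results


# Synthetic two-class data, made up for illustration (fixed seed).
rng = random.Random(0)
data = [((rng.gauss(0, 1), rng.gauss(0, 1)), 0) for _ in range(50)]
data += [((rng.gauss(3, 1), rng.gauss(3, 1)), 1) for _ in range(50)]
rng.shuffle(data)

best_k, scores = grid_search(data, candidates=[1, 3, 5, 7, 9])
print("best k:", best_k)
print("mean accuracy per k:", scores)
```

Scoring each candidate on held-out folds, rather than on the training data, is what keeps the chosen hyperparameter from simply rewarding overfitting.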
It is evident that data science is an emerging field that requires broad knowledge, mainly in
computational science, statistics, and mathematics. New technologies have emerged to deal with
massive amounts of data in any field, and there are many benefits, in areas ranging from health care
to telecommunications. At the same time, care should be taken to ensure that respondents'
information is not exploited. In the near future, data science will lead to many discoveries that