
An example of a beginner’s project walkthrough could look like this:

1. You feed ChatGPT the information about the rows and
columns of the data
2. You ask it to create boilerplate code to explore this data for
null values, outliers, and normality
3. You ask it what questions you should ask of this data
4. You ask it to clean the data and build the model for you to
make a prediction on the dependent variable
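
As a rough illustration of step 2, the boilerplate you get back might resemble the sketch below. It uses only the Python standard library so it runs without extra installs. The `ages` column, its values, and the `explore_column` helper are hypothetical, and the 1.5 × IQR rule and the skewness check are just one common way to look for outliers and departures from normality.

```python
import statistics


def explore_column(values):
    """Summarize one numeric column: null count, IQR outliers, skewness."""
    present = [v for v in values if v is not None]
    null_count = len(values) - len(present)

    # Tukey's rule: flag points more than 1.5 * IQR outside the quartiles.
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in present if v < low or v > high]

    # Skewness as a rough normality check: values near 0 suggest symmetry,
    # large positive or negative values suggest a long tail.
    mean = statistics.fmean(present)
    std = statistics.pstdev(present)
    if std:
        skew = sum(((v - mean) / std) ** 3 for v in present) / len(present)
    else:
        skew = 0.0

    return {"nulls": null_count, "outliers": outliers, "skewness": skew}


# Hypothetical column with one missing value and one extreme value.
ages = [23, 25, 31, 28, None, 27, 30, 26, 24, 95]
report = explore_column(ages)
print(report)
```

Getting something like this to run on your own columns, and deciding what to do with the flagged values, is still your job.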

While it may seem like it is doing all the work for you, you still have to
get this project to run in your environment. You are also prompting and
problem solving as you go along.

There is no guarantee that the generated code will work, as there is
when you’re copying someone else’s project, so I feel like this is a nice
middle ground for learning by doing.

An Advanced Practitioner’s Project Walkthrough

Now, let’s think about how a more advanced practitioner would use
this:

1. You could follow the same steps of generating boilerplate code, but
you should expand on it. You might want to experiment with more
hands-on exploration of the data and hypothesis testing. Maybe choose
one or two questions you want to answer with data and descriptive
statistics, and start analyzing them.

2. For someone who has done a few projects, I recommend generating
some of the code yourself. Let’s say you made a simple bar chart in
plotly. You could feed that in and ask ChatGPT to reformat it, to change
the color or the scale, etc.

By doing this, you can rapidly iterate on visualizations, and you can see
in real time how different tweaks to the code change the graph. This
immediate feedback is great for learning.

3. I also think it is important that you review these changes and see
how they were made. Also, if you don’t understand something, just ask
ChatGPT right there to explain what it did.

4. More advanced practitioners should also focus more heavily on the
data engineering and the pipelines for productionizing code. These are
things that you still need to be fairly hands-on with. I found that
ChatGPT was able to get me part of the way there, but I needed to do a
lot of debugging myself.

5. From there, you may want to go through and have the AI run some
algorithms and do parameter tuning. To be honest, I think this will be
the part of data science that will be automated the fastest. I think
parameter tuning will see diminishing returns for normal practitioners,
but maybe not for the highest level Kagglers.

6. You should focus your time on feature engineering and feature
creation. This is also something that the AI models can help with, but
not completely master. After you’ve got some decent models, see what
data you can add, what features you can create, or what transforms you
can do to increase your results.
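
To make the feature engineering point concrete, here is a minimal sketch of how one transform can change a model’s results. Everything in it is an assumption for illustration: the `side_length` and `price` data are made up, `fit_line` and `mse` are hypothetical helpers, and the error is measured on the training data rather than on a held-out set.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mse(xs, ys):
    """In-sample mean squared error of the fitted line."""
    slope, intercept = fit_line(xs, ys)
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data: the target grows with the square of the raw input.
side_length = [1, 2, 3, 4, 5, 6]
price = [3 * s ** 2 + 10 for s in side_length]

# Baseline: fit the model on the raw feature.
raw_error = mse(side_length, price)

# Engineered feature: area instead of side length.
area = [s ** 2 for s in side_length]
engineered_error = mse(area, price)

print(f"raw feature MSE:        {raw_error:.3f}")
print(f"engineered feature MSE: {engineered_error:.3f}")
```

The model is the same in both runs; only the feature changes. Real data is rarely this clean, so treat the gap between the two errors as a best case.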

In a world with these advanced AI tools, I think it is more important
than ever to do projects. You have to build things and share your work.
Fortunately, with these AI tools, it is also easier than ever to do
that. It’s easier to produce a web app. It’s easier to work with new
packages that you’ve never worked with before.

References

https://towardsdatascience.com/best-use-chatgpt-learn-data-science-easy-beginner-b10299c49c4c
