Welcome everyone!

Please work in pairs (you can change at the next Lab), use the whole of the
room, and leave alternate rows empty for TA access.

A Quick Introduction to Python for Computational Linguistics

Nigel Collier
Faculty of Modern and Medieval Languages and Linguistics

Li18 Computational Linguistics

Acknowledgements

Garth Wells                                          Inspiration (2018)
Costanza Conforti, Edoardo Ponti, Haim Dubossarsky   Concept creation (Jan 2019)
Marco Basaldella, Flora Liu, Yi Zhu                  Course design (2019/21)
Li18 class of 2019/20 and of 2020/21                 Course trial (2019/21)
You!                                                 Our first class (2021/22)

Teaching Assistants 2021/22

Hugo Caffaratti

Yixuan Su

Meiru Zhang

Yinhong Liu

Michaelmas: 3 self-study interactive notebooks, 1 in-person assessment

Week  Date              Event
1     08/10/2021        Lecture: Introduction
      12/10/2021        Programming in Python: 1.2 Introduction to Python
2     15/10/2021        Lecture: Regular expressions, normalization, edit distance
3     22/10/2021        Lecture: Finite state techniques
      26/10/2021        Programming in Python: 1.3 Basic NLP with Python
      27 or 28/10/2021  Supervision 1
4     29/10/2021        Lecture: N-gram models
5     05/11/2021        Lecture: Naïve Bayes and sentiment classification
      09/11/2021        Programming in Python: 1.4 Working with Corpus Data
      10 or 11/11/2021  Supervision 2
6     12/11/2021        Lecture: Sequence labelling for part of speech and named entities
7     19/11/2021        Lecture: Constituency grammars and treebanks
      23/11/2021        Programming in Python: Assessments
8     26/11/2021        Lecture: Constituency parsing
      1 or 2/12/2021    Supervision 3

Winter break

Python 3

We’ll be studying Python 3

• One of the most widely taught programming languages at universities

• Python is a great practical skill if you want to work with data to answer
linguistic questions (corpus linguistics) or build models that can analyse,
process and generate text (NLP)

• Supported by a powerful set of libraries (pre-built, reusable code such as spaCy)

• Good online support (e.g. Stack Overflow)

• Free to use and open source
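
As a small illustration of the kind of corpus question Python can help answer, the sketch below counts word frequencies in a short sample string. The function name `word_frequencies`, the sample sentence, and the very simple regular-expression tokeniser are assumptions made for this example and are not taken from the course notebooks; it uses only Python's standard library.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Lowercase the text, split it into word tokens, and count each token."""
    # A deliberately simple tokeniser: runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# Hypothetical sample text, standing in for a real corpus.
sample = "The cat sat on the mat. The mat was flat."
freqs = word_frequencies(sample)

# Show the two most frequent tokens with their counts.
print(freqs.most_common(2))  # [('the', 3), ('mat', 2)]
```

A real corpus study would normally use a proper tokeniser from an NLP library, but the same counting idea applies.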

Jupyter Notebooks

The course is made available in the form of Jupyter Notebooks – you can
think of this as something like a Web page that mixes teacher notes with live
programming code that you can run.

You have a few options for running the course notebooks. Our strongly
preferred option is Google Colab. This allows you to log in to Google using
your CRSid and to use a virtual computer supplied by Google (within certain
limits).

Self study and labs

Jupyter notebooks are provided for you to work with both in the lab and as
self-study material.

Work through at your own pace.

Lab sessions are here to support your learning but please feel free to leave
early – you don’t need to stay the full two hours if you have understood the
material.

There are exercises to complete at the end of each sub-module (see later).

If you are already experienced in Python, please do help out on the Slack
channel.

Getting Help

Please try the following in this order:

1. Double check the instructions in the Jupyter notebook
2. Try a Google search (e.g. "Python regular expressions") or a
   StackOverflow search (e.g. "Python regular expressions")
3. Do a search of our PyCompLing Slack channel to see if the topic is
   covered there. If not, post a question.
4. Ask the person next to you
5. Ask a TA (raise a hand)

Giving Help

Please do contribute to our community. Consider helping someone else by:

1. Replying to a request on the PyCompLing Slack channel: be empathetic, and
   use comments to seek clarifications before posting an answer. Feel free
   to post code examples, but do not post exercise solutions.
2. Working with the person next to you
3. Being patient when TAs are busy

Dynamic material

• Do expect the online course material to be dynamic

• We will update it with expanded explanations, examples, hints etc. based on
the feedback that you provide

Let’s get started

If you haven’t done so already, please

- Plug in your laptops/Chromebooks
- Log in to the University WiFi
- Open a Web browser (e.g. Firefox, Chrome)

Getting Started

1. Open your Web browser and enter the following in the address box

https://github.com/cambridgeltl/python4cl/

2. Scroll down the page until you see the Readme file and the Syllabus for
Module 1. It should look like this:

3. To continue with Google Colab, click the “Open in Colab” button for
“1.1 About the Course”.
You should now be looking at the first module of PyCompLing.

4. Sign in to Google Colab using your CRSid, e.g. nhc30@csah.cam.ac.uk.
When you press Return you will be redirected to the University’s Raven
authentication page. Click on “Raven for staff and students”, enter your CRSid
and Raven password and click “Login”.

5. You should now see a change of logo in the top right-hand corner of the
Google Colab page to show that you are logged in:

6. Read through and move on to Module 1.2

Working through Notebooks

Work through each Jupyter Notebook in order

Feel free to make changes to your copy of the Notebook to explore and enhance
your understanding

Complete exercises at the end of a Notebook ready to hand in and demonstrate
in Week 7

Feel free to go back and revisit Notebooks as you move through the course

Getting Started

A few quick notes:

A. When you run a cell containing Python code for the first time in Colab you might get a
message saying something like “This code was not authored in Colab – do you accept to
run it?” Please click “accept”.
B. Take note that Colab ‘hides’ some of the instruction and programming cells. If you click
on the headings you will see within the heading box a message such as “58 cells hidden”
with an arrow by the side. Click on the arrow to expand the hidden cells and you will see
all the instructions and Python code for learning about this topic.
C. Save your work from time to time so that you can continue next time (or show your
answers during the marking sessions!). Go to the File menu in Colab and click Save. You
will be prompted to make a copy of the Notebook in your cloud drive. Accept and make a
copy so that you can continue working on it the next time you log in.

Assessment exercises

You can help each other on the practice questions but …

Assessment exercises (at the end of each notebook) must be entirely your own
work.

You will be marked on how well your Python code works and how well you
demonstrate your understanding of the code you have submitted. TAs will ask a
selection of questions to test your understanding.

Test your code yourself, and add text boxes to explain your answers.

Hand in your answers to Modules 1.2, 1.3 and 1.4 before 17:00 on 19/11. Answers
submitted after this deadline will not be considered.

Check the rota for your marking slot. Be present at Mill Lane Room 9 at least 5
minutes before your allocated time. Have your Jupyter Notebooks open and ready
to demonstrate.
