DWM Exp10


Untitled4.ipynb - Colaboratory https://colab.research.google.com/drive/1mBgeBK93...

pip install beautifulsoup4

Requirement already satisfied: beautifulsoup4 in /usr/local/lib/python3.10/dist-pa


Requirement already satisfied: soupsieve>1.2 in /usr/local/lib/python3.10/dist-pac

pip install lxml

Requirement already satisfied: lxml in /usr/local/lib/python3.10/dist-packages (4.

import bs4 as bs
import urllib.request
import nltk
import re

# Stop-word list and sentence tokenizer models used in later cells
nltk.download('stopwords')
nltk.download('punkt')

# Fetch the raw HTML of the article
scraped_data = urllib.request.urlopen('https://en.wikipedia.org/wiki/Artificial_int
article = scraped_data.read()

# Parse the HTML with the lxml parser and collect every <p> element
parsed_article = bs.BeautifulSoup(article, 'lxml')
paragraphs = parsed_article.find_all('p')

# Concatenate the paragraph text into a single string
article_text = ""
for p in paragraphs:
    article_text += p.text

[nltk_data] Downloading package stopwords to /root/nltk_data...


[nltk_data] Unzipping corpora/stopwords.zip.
[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data] Package punkt is already up-to-date!

# Remove bracketed citation markers such as [12], then collapse whitespace
article_text = re.sub(r'\[[0-9]*\]', ' ', article_text)
article_text = re.sub(r'\s+', ' ', article_text)
print(article_text)

Artificial intelligence (AI) is the intelligence of machines or software, as oppo

# Keep letters only (drops digits and punctuation); this copy is used for word counts
formatted_article_text = re.sub('[^a-zA-Z]', ' ', article_text)
formatted_article_text = re.sub(r'\s+', ' ', formatted_article_text)
print(formatted_article_text)

Artificial intelligence AI is the intelligence of machines or software as opposed
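
To make the two cleaning passes above concrete, here is a minimal sketch on a made-up sample string. The `sample` text and the variable names `text` and `letters_only` are assumptions for illustration and are not taken from the notebook's data; the regular expressions are the same ones used in the cells above.

```python
import re

# Hypothetical sample text standing in for the scraped article
sample = "AI was founded in 1956.[3]  It uses search,\nlogic and statistics.[12][13]"

# Pass 1: drop bracketed citation markers, then collapse runs of whitespace
text = re.sub(r'\[[0-9]*\]', ' ', sample)
text = re.sub(r'\s+', ' ', text)

# Pass 2: keep letters only, then collapse whitespace again
letters_only = re.sub('[^a-zA-Z]', ' ', text)
letters_only = re.sub(r'\s+', ' ', letters_only)

print(text)
print(letters_only)
```

The second pass also removes digits such as the year, along with all sentence punctuation. That is presumably why the notebook keeps both strings: the punctuated one for sentence splitting and the letters-only one for word counts.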

sentence_list = nltk.sent_tokenize(article_text)
print(sentence_list)

['\nArtificial intelligence (AI) is the intelligence of machines or software, as o

1 of 3 04/10/23, 12:49

stopwords = nltk.corpus.stopwords.words('english')

# Count every token that is not a stop word. Tokens keep their original
# case here, so 'AI' and 'Artificial' are stored as capitalized keys.
word_frequencies = {}
for word in nltk.word_tokenize(formatted_article_text):
    if word not in stopwords:
        if word not in word_frequencies:
            word_frequencies[word] = 1
        else:
            word_frequencies[word] += 1

# Divide each count by the largest count so every weight falls in (0, 1]
maximum_frequency = max(word_frequencies.values())

for word in word_frequencies:
    word_frequencies[word] = word_frequencies[word] / maximum_frequency

print(word_frequencies)

{'Artificial': 0.058823529411764705, 'intelligence': 0.3697478991596639, 'AI': 1.0
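
The counting and normalization steps above can also be sketched with `collections.Counter`. This is a minimal, self-contained illustration rather than the notebook's code: the `tokens` list and the tiny `stop_words` set are made-up stand-ins for the NLTK tokenizer output and stop-word list, and lowercasing tokens before counting is an added assumption, chosen so that the keys would match lowercased sentence tokens in a later lookup.

```python
from collections import Counter

# Hypothetical stand-ins for nltk.word_tokenize output and the NLTK stop-word list
tokens = ["AI", "is", "the", "intelligence", "of", "machines", "AI", "research", "ai"]
stop_words = {"is", "the", "of"}

# Lowercase before filtering and counting (an assumption, see the note above)
counts = Counter(t.lower() for t in tokens if t.lower() not in stop_words)

# Divide by the largest count so the most frequent word gets weight 1.0
max_count = max(counts.values())
weights = {word: n / max_count for word, n in counts.items()}

print(weights)
```

With lowercased keys, "AI" and "ai" are counted as one word, whereas the output above shows case-sensitive keys such as 'Artificial' and 'AI'.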

# Score each sentence shorter than 30 words by summing the weights of its words.
# Tokens are lowercased here, so only lowercase keys in word_frequencies can match.
sentence_scores = {}
for sent in sentence_list:
    for word in nltk.word_tokenize(sent.lower()):
        if word in word_frequencies:
            if len(sent.split(' ')) < 30:
                if sent not in sentence_scores:
                    sentence_scores[sent] = word_frequencies[word]
                else:
                    sentence_scores[sent] += word_frequencies[word]

print(sentence_scores)

{'\nArtificial intelligence (AI) is the intelligence of machines or software, as o

import heapq

# Take the 7 highest-scoring sentences and join them into the summary
summary_sentences = heapq.nlargest(7, sentence_scores, key=sentence_scores.get)

summary = ' '.join(summary_sentences)
print(summary)

[64]
A machine with artificial general intelligence should be able to solve a wide vari
Deep learning has drastically improved the performance of programs in many importa
and others. [45] Deep learning uses artificial neural networks for all of these ty
Artificial intelligence (AI) is the intelligence of machines or software, as oppos
Many researchers began to doubt that the current practices would be able to imitat
Learning algorithms for neural networks use local search to choose the weights tha
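
As a rough illustration of the scoring and selection steps, the following sketch runs them on toy data. The `weights` dictionary and the `sentences` list are invented for the example, and `str.split` stands in for `nltk.word_tokenize`; only `heapq.nlargest` is called the same way as in the notebook.

```python
import heapq

# Hypothetical word weights and sentences
weights = {"ai": 1.0, "learning": 0.5, "networks": 0.25}
sentences = [
    "ai systems use learning",
    "neural networks are layered",
    "ai and learning and networks",
    "this sentence has no weighted words",
]

# Sum the weights of the words in each sentence; skip sentences of 30+ words
sentence_scores = {}
for sent in sentences:
    words = sent.lower().split()
    if len(words) >= 30:
        continue
    score = sum(weights.get(w, 0.0) for w in words)
    if score > 0:
        sentence_scores[sent] = score

# Pick the two highest-scoring sentences (the notebook takes seven)
top = heapq.nlargest(2, sentence_scores, key=sentence_scores.get)
summary = " ".join(top)
print(summary)
```

`heapq.nlargest` returns the sentences in descending score order, not in document order, so a summary built this way may not read in the sequence of the original article.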
