
5/9/2021 html - How can I scrape multiple pages/links at once using VBA? - Stack Overflow

How can I scrape multiple pages/links at once using VBA?


Asked 1 year ago Active 1 year ago Viewed 450 times

I'm currently trying to scrape info from this Reddit page. My goal is to make Excel open all the
posts in new tabs, and then I want to scrape information from each of those pages, since the
starting page doesn't have as much information.

I've been trying for the last few hours to figure this out, but I'm admittedly pretty confused about
how to do it, just overall unsure what to do next, so any pointers would be greatly appreciated!

Here is my current code, it works decently enough but as I said, I'm not sure what I should do
next to open the links it finds one by one and scrape each page for data. The links are scraped off
that first page and then added to my spreadsheet right now, but if possible I'd like to just skip that
step and scrape them all at once.

Thanks! :)

Sub GetData()

    ' Requires a reference to "Microsoft Internet Controls" for the InternetExplorer type
    Dim objIE As InternetExplorer
    Dim itemEle As Object
    Dim upvote As Integer, awards As Integer, animated As Integer
    Dim postdate As String, upvotepercent As String, oc As String, filetype As String, _
        linkurl As String, myhtmldata As String, visiComments As String, _
        totalComments As String, removedComments As String
    Dim y As Integer

    Set objIE = New InternetExplorer
    objIE.Visible = False

    ' Load the URL stored in the active cell and wait until the page has finished loading
    objIE.navigate (ActiveCell.Value)
    Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

    y = 1

    ' Write the first link's text and URL from each matched element to columns A and B
    For Each itemEle In objIE.document.getElementsByClassName("flat-list buttons")
        visiComments = itemEle.getElementsByTagName("a")(0).innerText
        linkurl = itemEle.getElementsByTagName("a")(0).href
        Sheets("Sheet1").Range("A" & y).Value = visiComments
        Sheets("Sheet1").Range("B" & y).Value = linkurl
        y = y + 1
    Next

End Sub

html excel vba internet-explorer web-scraping

https://stackoverflow.com/questions/61598820/how-can-i-scrape-multiple-pages-links-at-once-using-vba

Edited May 4 '20 at 21:32 by QHarr. Asked May 4 '20 at 18:10 by Bloggy.

@QHarr I'm basically trying to open each of the links (the hrefs) and then scrape a few html elements for
each of them and output those to my spreadsheet. So the data to scrape would be say, for example the # of
upvotes and the output would be a number. – Bloggy May 4 '20 at 19:16

The % Upvoted is the only additional info those pages have, yes, but it's pretty important for my project and
I'm just trying to automate as much as possible. – Bloggy May 4 '20 at 20:24

Yep! Because the percentage is what's got me stuck, really. – Bloggy May 4 '20 at 20:36

1 Answer

You should be able to gather the URLs, then visit each one in a loop, writing the results from
each page visited to an array, and then the array to the sheet. Add this after your existing line

Do While objIE.Busy = True Or objIE.readyState <> 4: DoEvents: Loop

Add:

    Dim nodeList As Object, i As Long, urls(), results()

    ' objIE is the InternetExplorer instance from the code in the question
    Set nodeList = objIE.document.querySelectorAll(".comments")

    ReDim urls(0 To nodeList.Length - 1)
    ReDim results(1 To nodeList.Length, 1 To 3)

    'Store all urls in an array to later loop
    For i = 0 To nodeList.Length - 1
        urls(i) = nodeList.item(i).href
    Next

    For i = LBound(urls) To UBound(urls)
        objIE.Navigate2 urls(i)
        While objIE.Busy Or objIE.readyState <> 4: DoEvents: Wend
        'may need a pause here
        results(i + 1, 1) = objIE.document.querySelector("a.title").innerText 'title
        results(i + 1, 2) = objIE.document.querySelector(".number").innerText 'upvotes
        results(i + 1, 3) = objIE.document.querySelector(".word").NextSibling.nodeValue '%
    Next

    'Write the whole results array to the sheet in one go
    ActiveSheet.Cells(1, 1).Resize(UBound(results, 1), UBound(results, 2)) = results

Note: Opening the posts in separate tabs would at best save time on the page loads, as VBA is
single-threaded. To do that you would need to store a reference to each tab, or open them all
first and then loop through the relevant open windows to do the scrape. My preference, to be
honest, would be to stay in the same tab.
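
The 'may need a pause here' comment in the loop can be handled in several ways. One option is a small wait helper with a timeout, so that a page that never finishes loading cannot hang the macro. This is only a sketch: `WaitForPage` and `SecondsSince` are hypothetical names that do not come from the answer, and the 30-second default is an arbitrary assumption.

```vba
' Hypothetical helper (not from the original answer).
' Returns the seconds elapsed since startTime, a value previously read from Timer.
' Timer resets to 0 at midnight, so a negative difference is wrapped by one day.
Public Function SecondsSince(ByVal startTime As Single) As Single
    Dim elapsed As Single
    elapsed = Timer - startTime
    If elapsed < 0 Then elapsed = elapsed + 86400
    SecondsSince = elapsed
End Function

' Hypothetical helper (not from the original answer).
' Waits until the browser reports it is no longer busy and readyState is 4 (complete).
' Returns True if the page finished loading, False if the timeout expired first.
Public Function WaitForPage(ByVal browser As Object, _
                            Optional ByVal timeoutSeconds As Single = 30) As Boolean
    Dim startTime As Single
    startTime = Timer
    Do While browser.Busy Or browser.readyState <> 4
        DoEvents
        If SecondsSince(startTime) > timeoutSeconds Then Exit Function
    Loop
    WaitForPage = True
End Function
```

Inside the loop, the wait line could then become something like `If Not WaitForPage(objIE, 20) Then Exit For`. Note that this only waits for the load to complete. If the page fills in content afterwards, an extra pause may still be needed.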

Edited May 4 '20 at 23:29. Answered May 4 '20 at 21:31 by QHarr.

Does .NodeValue work similarly to how .next_sibling works in BeautifulSoup @QHarr? – SIM May 4 '20 at 21:53

Sorry if I took time to reply, I'm trying to understand and not just copy ^^ For some reason it's scraping the
title of the first post in the list just fine, along with the upvotes, but not the %. And then after the macro
finishes I end up with the first post (and its upvotes) repeating over 25 rows instead of all the different
posts. I can't figure out what's causing that. – Bloggy May 4 '20 at 22:25

I checked the HTML and there's another CSS class called "word" that's technically below the one I want,
that might be what's causing issues with the % though that's probably not why it's not scraping the other
posts. – Bloggy May 4 '20 at 22:36

that fixed the first problem, thanks! and yeah, it's writing out [object Text]. – Bloggy May 4 '20 at 22:55

Weirdly enough, it's telling me that the "object doesn't support this property or method". – Bloggy May 4 '20 at 23:15
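
The thread does not show how the percentage problem was finally resolved. One possible workaround, if reading the text node keeps failing, is to read the `innerText` of an element that contains the percentage and pull the number out with plain string functions. The sketch below assumes the text contains something like "93% upvoted". `ExtractPercent` is a hypothetical name, and the sample string in the usage note is made up for illustration.

```vba
' Hypothetical helper (not from the thread).
' Returns the run of digits immediately before the first "%" in sourceText,
' including the "%" sign, or an empty string if there is no such run.
Public Function ExtractPercent(ByVal sourceText As String) As String
    Dim percentPos As Long, startPos As Long

    percentPos = InStr(1, sourceText, "%")
    If percentPos = 0 Then Exit Function

    ' Walk left from the "%" while the preceding character is a digit
    startPos = percentPos
    Do While startPos > 1
        If Mid$(sourceText, startPos - 1, 1) Like "[0-9]" Then
            startPos = startPos - 1
        Else
            Exit Do
        End If
    Loop

    If startPos < percentPos Then
        ExtractPercent = Mid$(sourceText, startPos, percentPos - startPos + 1)
    End If
End Function
```

For example, `ExtractPercent("12 points (93% upvoted)")` returns `"93%"`, and a string without a percent sign returns an empty string.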
