How to Scrape Info from Google SERPs with Screaming Frog
This article was originally posted on January 16, 2020.
To do this, you need a basic understanding of XPath and Google search queries, as well as some familiarity with the Screaming Frog SEO tool. I first learned about doing this from Rory Truesdale’s blog post on Search Engine Journal.
Before proceeding, be aware that there is a small possibility that scraping Google’s SERPs could get your IP blacklisted. Use this method at your own risk. If you’re OK with the risk, keep reading.
Configuring the Screaming Frog Crawl
First, set Screaming Frog’s mode to List.
I followed Rory’s advice and unchecked all of the boxes in Configuration -> Spider -> Crawl.
If you want to see a rendered screenshot after Screaming Frog crawls Google, you can go to the Rendering tab and select JavaScript.
I found the results to be the same (without the screenshot) if I selected Text Only.
SEOs, have you ever had to redirect several URLs that contain similar text? Read this blog post on time-saving .htaccess redirects.
XPath Selectors for Page Titles, URLs, and Meta Descriptions on Google SERPs
Unfortunately, when I used the XPath selectors in Rory’s blog post, I didn’t get any results. His post was published in September 2018, and Google has probably changed its HTML and class names since then. By using Google’s Developer Tools and the Chrome Scraper extension, I was able to find XPath selectors that worked.
To get the page titles from Google SERPs, you can use this XPath selector:
//h3[@class="LC20lb MBeuO DKV0Md"]
Updated 12/4/23: If you want to grab the URLs that the Google SERP page titles point to, you can use this XPath selector:
//a[@jsname="UWckNb"]/@href
Updated 12/4/23: The below XPath selector doesn’t work anymore. I’ll come up with a new selector as soon as I can. If you want to go even further and grab the meta descriptions from Google SERPs, you can use this XPath selector:
//div[@class="IsZvec"]
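Before committing to a full crawl, you can sanity-check these selectors against a saved copy of a results page. Here is a minimal sketch using Python’s lxml library; the function name and the idea of testing against saved HTML are my own additions, not part of the Screaming Frog workflow. If it returns an empty list, Google has likely rotated its markup again:

```python
# Sketch: test the SERP XPath selectors from this post against a saved
# results page (e.g. View Source -> save as serp.html). The class and
# jsname values are the ones above and will break when Google changes them.
from lxml import html

TITLE_XPATH = '//h3[@class="LC20lb MBeuO DKV0Md"]'
URL_XPATH = '//a[@jsname="UWckNb"]/@href'

def extract_results(serp_html):
    """Return (title, url) pairs scraped from one SERP page's HTML."""
    tree = html.fromstring(serp_html)
    titles = [h3.text_content().strip() for h3 in tree.xpath(TITLE_XPATH)]
    urls = list(tree.xpath(URL_XPATH))
    return list(zip(titles, urls))
```

Usage would be something like `extract_results(open("serp.html").read())`.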
Where to Add XPath Selectors in Screaming Frog
Go to Configuration -> Custom -> Extraction to add these XPath selectors. A window will pop up. Type in a name for each type of data you want scraped; I used SEO Title and URL. Here’s a screenshot of where you would add the XPath selectors.
Select XPath for the first dropdown menu and Extract Text for the second dropdown menu.
In order for these XPath selectors to work, you will need to select the browser that you are using under Configuration -> User-Agent -> Preset User-Agents. If you leave the Preset User-Agents field as the Screaming Frog Spider, you won’t see the extracted values.
I’ve run queries using Chrome, Firefox, and Microsoft Edge as the user-agent with no problem.
In Configuration -> Speed, I set the Max Threads to 1, checked Limit URL/s, and set the Max URL/s to 0.8 just to be safe. If this value is too high, Google could block you with a CAPTCHA test.
Now you’re all set to add your Google queries to the spider.
Adding Google Search Query URLs
This is the format that Google query URLs take:
https://www.google.com/search?q=your+keyword+phrase
The + is used for spaces. Running this query in Screaming Frog will extract data from the first page of the Google search.
If you want to produce more than just the first page of results, you can add a num parameter like so:
https://www.google.com/search?q=your+keyword+phrase&num=50
Google will show the first 50 results with that query.
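If you are generating the query URLs programmatically rather than in a spreadsheet, a few lines of Python will do the encoding for you. This is just a sketch (the function name is mine); `quote_plus` reproduces the plus-for-spaces encoding described above:

```python
# Sketch: build Screaming Frog-ready Google query URLs from a keyword list.
from urllib.parse import quote_plus

def google_query_urls(keywords, num=None):
    urls = []
    for keyword in keywords:
        url = "https://www.google.com/search?q=" + quote_plus(keyword)
        if num:  # optionally request more than the first page of results
            url += "&num=" + str(num)
        urls.append(url)
    return urls

# Example: two hypothetical queries, 50 results each, ready for List mode.
for url in google_query_urls(["emergency plumber", "plumber near me"], num=50):
    print(url)
```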
You can also use the near parameter to see what the local pack results would look like in a particular city or zip code:

https://www.google.com/search?q=your+keyword+phrase&near=chicago,il
This only affects the local pack results. The organic results will be tailored to the city that you are in. If you want to see organic results from a different area other than your own, you’ll need to use a VPN and select a server in the city of your interest. In the case above, it would be Chicago, IL.
I manually created my search queries, but you could create an Excel spreadsheet which automatically creates the queries for you. Read more about that in Rory’s blog post.
Once you come up with your Google search queries, you can add them by clicking Upload -> Enter Manually.
Paste your queries in the window that pops up, click Next, then click OK. To see your custom extraction data, you will need to scroll horizontally to the right. Below is a screenshot of extracted data from 6 Google search queries.
Click the Export button (located right next to the Upload button) to save your crawl as an Excel file.
Organizing Google Search Query Data in Excel
When you open your Excel file, delete all of the columns that you don’t need; in this case, that’s everything except the Original Url, SEO Title, and URL columns. After doing this, my spreadsheet looked like this:
I wanted the SEO Title and URL columns to become rows, and the Original Url rows to become columns. To do this, I selected the cell range containing the SEO Title and URL data (including the headers), copied the cells, clicked where I wanted the copied data pasted, went to Edit -> Paste Special, checked the Transpose box, and clicked OK.
After I transposed the cells, my data looked like this:
After deleting the SEO Title and URL data from the top cells, I transposed the Google query URLs, placed them above the corresponding columns, and renamed Original Url to something more descriptive. This is what I ended up with:
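If you’d rather script this clean-up than use Paste Special, the same transpose is one line in pandas. A sketch, assuming the export has an Original Url column followed by the extractor columns (SEO Title 1, URL 1, and so on; the exact headers depend on the extractor names you typed in, so check your own export):

```python
# Sketch: flip the Screaming Frog export the way the Excel steps above do.
import pandas as pd

def tidy_serp_export(df):
    # Each crawl row is one Google query; the extracted titles and URLs
    # fan out to the right. Transposing puts one query per column with
    # its extracted values stacked underneath.
    return df.set_index("Original Url").transpose()
```

`tidy_serp_export(pd.read_excel("crawl.xlsx"))` would then give you the same query-per-column layout, with "crawl.xlsx" standing in for whatever you named your export.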
The data is a lot neater to analyze. If you look at the spreadsheet data, you’ll see that Google thinks that the search query had the intent of looking for a local provider.
Leveraging Your Time by Running Multiple Queries
You could scrape data from many more queries, although I don’t know what the limit is with Screaming Frog. Let’s say you’ve done your keyword research and want to see the top results for all of the keywords you’re tracking. You could run this crawl and have your data organized in a fraction of the time it would take to manually type the queries into Google.
You can also use Screaming Frog to scrape Google’s related search data for multiple queries with this XPath selector:
//a[@class="k8XOCe"]
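As with the earlier selectors, you can verify this one against a saved SERP page before running a full crawl. A hedged sketch with lxml; the class name is the one shown above and will stop matching whenever Google updates its markup:

```python
# Sketch: pull the "related searches" phrases out of saved SERP HTML.
from lxml import html

RELATED_XPATH = '//a[@class="k8XOCe"]'

def related_searches(serp_html):
    """Return the related-search phrases from one SERP page's HTML."""
    tree = html.fromstring(serp_html)
    return [a.text_content().strip() for a in tree.xpath(RELATED_XPATH)]
```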
This could give you more keyword ideas to optimize your website for.
Thank you for reading this post. Feel free to share it or comment below.
So impressive! Hats off! Love it.
Great article. Thanks.
Any suggestions on where to find a list of other XPath Selectors? I’m looking to scrape the Meta Description as well.
I just did some research by inspecting Google SERPs (CTRL + SHIFT + I when on the page and parse through the code). I found the following XPath selectors. It would be great to get a full list without having to do this trial and error approach.
Keep in mind that results only populate if the SERP actually contains them. i.e. not all search results have a Local 3 Pack or Top Stories or Videos, so for those keywords the resulting fields will be blank.
Meta Description //span[@class="st"]
Searches Related To //p[@class="nVcaUb"]
Local 3 Pack //div[@class="dbg0pd"]
Top Stories //div[@class="y9oXvf"]
Videos //div[@class="wCIBKb"]
Matthew, thanks for reading my blog post. Hopefully, you got some use out of it. I tried your Meta Description XPath selector, and it didn’t work. I was able to come up with an XPath selector that produces all of the meta descriptions on a Google SERP, and I put it in this section.
I had already listed the related searches XPath selector at the end of this blog post (before the comments).
I believe my blog post is the closest thing you’ll get to a full list of Google SERP XPath selectors that work.
Thank you for this step-by-step, it is incredibly helpful! But I’m only able to get the title of the search page itself, e.g. Title: [search query] Google Search, and not the titles of the individual results like you’ve gotten above. Not sure where I’ve gone wrong; any idea why?
Alexa, thanks for reading my post. If you followed my tutorial correctly, you should be able to see the extracted data if you continue to scroll to the right. I just updated the post, so take a look. Also, the current version (12.0) will display titles and URLs together even if you are only extracting the titles. You can download version 11.3 and install it in a different directory if you don’t want to see the results like this.
Hi Tom,
first of all: thanks for that great article, so helpful!
I want to scrape a high number of meta descriptions for one query (>6000 results).
I do only get 100 results – do you have any idea on how to overcome that?
Thank you!
I’ve never run a query for that many results. What would you need 6000 results for? Running a query for 6000+ results might trigger a red flag with Google.
Hi!
Great post, very helpful. The configuration went smoothly, but I encountered a problem with 302 redirects. Google blocks my IP after checking about 70 addresses. I tried the slowest possible crawl speed setting and that didn’t help either.
I wish I could crawl hundreds of addresses like this. Do you know how to solve this case?
I’ve never crawled that many addresses.
Very nice information. Love your creativity. I want to get more future updates. Keep it up.
Thanks, Reynald. I hope to come up with a new blog post this year.
Great read Tom! I came across your post after searching for SERPs info and I’m glad I did. Very insightful. Thanks Brad SEO Group
Brad, thanks for letting me know. That puts a smile on my face!
Hi! I have used this so many times, but now when I try it I can’t get it to work! Is it still the same XPath for the URL, or has it changed?
Erica, the XPath selector for the URL has changed. I just updated it. Give it a shot, and let me know if it worked for you. I haven’t figured out the XPath selector to select all meta descriptions yet, though.
Any idea how to scale this and avoid reCAPTCHA v3?
50-100 results per scrape are rookie numbers, aren’t they? 😀
I haven’t run a crawl for more than 50 results nor would I want to.