Using Google’s Custom Search to Search the Web

Google doesn’t openly allow crawling of its search results, but if you’re looking to extract data from Google’s search results in your development project, then Google Custom Search may be the workaround.

Google’s Custom Search is a customisable search engine that would typically be used to search a website or a specified list of websites. However, this can be configured more broadly to search the web.

This can be achieved by customising the search engine to search a specific list of URLs and then inputting all (or most of) the domain extensions, one per line, each preceded by a star (for example: *.co.uk). I’ve included a list of domain extensions at the end of this post.
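If you keep your domain extensions in a plain list, a small helper can turn them into the one-per-line star patterns described above, ready to paste into the search engine's list of sites. This is only a minimal sketch: the function name to_site_patterns is made up for illustration, and lowercasing and de-duplicating the entries is my own assumption about what is convenient, not something Google requires.

```python
def to_site_patterns(extensions):
    """Return one '*.extension' pattern per line for a list of domain extensions.

    Hypothetical helper: accepts entries with or without a leading '*.' or '.',
    lowercases them, and skips blanks and duplicates (keeping first-seen order).
    """
    patterns = []
    seen = set()
    for ext in extensions:
        cleaned = ext.strip().lower()
        # Drop an existing "*." prefix so it is not added twice.
        if cleaned.startswith("*."):
            cleaned = cleaned[2:]
        # Drop any remaining leading dots, e.g. ".com" -> "com".
        cleaned = cleaned.lstrip(".")
        if not cleaned or cleaned in seen:
            continue
        seen.add(cleaned)
        patterns.append("*." + cleaned)
    return "\n".join(patterns)


if __name__ == "__main__":
    print(to_site_patterns(["co.uk", ".com", "*.london"]))
```

The printed output can be copied straight into the custom search engine's site list.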

After much testing and tweaking of the API settings, it appears that the Custom Search API runs off a different search index from the main Google search. So even the vanilla search results (without Google personalising them) will still differ from those you get by searching Google directly. It’s also worth noting that the results may differ depending on whether you access them through the custom search engine page or through the API.

Here’s a link to further details on interacting with Google’s Custom Search via the API: https://developers.google.com/custom-search/json-api/v1/reference/cse/list#request

When using the API you’re able to specify the language to search in and the country to search from. A list of the possible options and further details can be found here: https://developers.google.com/custom-search/docs/xml_results_appendices#countryCodes
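As a rough illustration of setting the language and country, here is a sketch that builds a request URL for the Custom Search JSON API. The lr (language restrict) and gl (country) parameter names come from the API reference linked above; the function name build_search_url, the placeholder key and engine ID, and the example values lang_en and uk are assumptions for illustration, so check the values against the linked appendix before relying on them. The sketch only builds the URL and does not send the request.

```python
from urllib.parse import urlencode

# Endpoint of the Custom Search JSON API.
API_ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_search_url(api_key, engine_id, query, language=None, country=None):
    """Build a Custom Search JSON API request URL (hypothetical helper).

    api_key   -- your API key (placeholder in the example below)
    engine_id -- the ID of your custom search engine, sent as 'cx'
    language  -- optional 'lr' value, e.g. "lang_en"
    country   -- optional 'gl' value, e.g. "uk"
    """
    params = {"key": api_key, "cx": engine_id, "q": query}
    if language:
        params["lr"] = language  # restrict results to a language
    if country:
        params["gl"] = country   # country to search from
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # YOUR_API_KEY and YOUR_ENGINE_ID are placeholders, not real credentials.
    print(build_search_url("YOUR_API_KEY", "YOUR_ENGINE_ID",
                           "custom search", language="lang_en", country="uk"))
```

The resulting URL can be fetched with any HTTP client, and the response is returned as JSON.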

List of domain extensions for using Google’s Custom Search to Search the Web

  • *.booking
  • *.boston
  • *.bot
  • *.boutique
  • *.box
  • *.broadway
  • *.broker
  • *.brussels
  • *.budapest
  • *.build
  • *.builders
  • *.business
  • *.buy
  • *.buzz
  • *.bzh
  • *.cab
  • *.cafe
  • *.call
  • *.cam
  • *.camera
  • *.camp
  • *.capetown
  • *.capital
  • *.car
  • *.cards
  • *.care
  • *.career
  • *.careers
  • *.cars
  • *.casa
  • *.case
  • *.cash
  • *.cashbackbonus
  • *.casino
  • *.catering
  • *.catholic
  • *.center
  • *.ceo
  • *.cfd
  • *.charity
  • *.chat
  • *.cheap
  • *.church
  • *.city
  • *.cityeats
  • *.claims
  • *.cleaning
  • *.clinic
  • *.clothing
  • *.cloud
  • *.club
  • *.coach
  • *.codes
  • *.coffee
  • *.college
  • *.cologne
  • *.community
  • *.company
  • *.compare
  • *.computer
  • *.comsec
  • *.condos
  • *.construction
  • *.consulting
  • *.contact
  • *.contractors
  • *.cooking
  • *.cool
  • *.corsica
  • *.country
  • *.coupon
  • *.coupons
  • *.courses
  • *.cpa
  • *.credit
  • *.creditcard
  • *.cricket
  • *.cruise
  • *.cruises
  • *.cymru
  • *.dad
  • *.dance
  • *.data
  • *.date
  • *.dating
  • *.dds
  • *.deal
  • *.dealer
  • *.deals
  • *.degree
  • *.delivery
  • *.democrat
  • *.dental
  • *.dentist
  • *.desi
  • *.design
  • *.dev
  • *.diamonds
  • *.digital
  • *.direct
  • *.directory
  • *.discount
  • *.diy
  • *.docs
  • *.doctor
  • *.dog
  • *.doha
  • *.domains
  • *.dot
  • *.download
  • *.drive
  • *.dubai
  • *.durban
  • *.earth
  • *.eat
  • *.eco
  • *.ecom
  • *.education
  • *.email
  • *.energy
  • *.engineer
  • *.engineering
  • *.enterprises
  • *.equipment
  • *.esq
  • *.estate
  • *.eus
  • *.events
  • *.exchange
  • *.expert
  • *.exposed
  • *.express
  • *.fail
  • *.faith
  • *.family
  • *.fan
  • *.fans
  • *.farm
  • *.fashion
  • *.feedback
  • *.film
  • *.finance
  • *.financial
  • *.financialaid
  • *.fish
  • *.fishing
  • *.fit
  • *.fitness
  • *.flights
  • *.florist
  • *.food
  • *.football
  • *.forsale
  • *.forum
  • *.foundation
  • *.free
  • *.fun
  • *.fund
  • *.furniture
  • *.futbol
  • *.fyi
  • *.gallery
  • *.games
  • *.garden
  • *.gay
  • *.ged
  • *.gent
  • *.gifts
  • *.gives
  • *.giving
  • *.glass
  • *.gle
  • *.global
  • *.gmbh
  • *.gold
  • *.golf
  • *.graphics
  • *.gratis
  • *.green
  • *.gripe
  • *.grocery
  • *.group
  • *.guide
  • *.guru
  • *.hair
  • *.halal
  • *.hamburg
  • *.haus
  • *.health
  • *.healthcare
  • *.helsinki
  • *.here
  • *.hockey
  • *.holdings
  • *.holiday
  • *.horse
  • *.hospital
  • *.host
  • *.hoteis
  • *.hotel
  • *.hoteles
  • *.hotels
  • *.house
  • *.imamat
  • *.immo
  • *.immobilien
  • *.inc
  • *.indians
  • *.industries
  • *.ink
  • *.institute
  • *.insurance
  • *.insure
  • *.international
  • *.investments
  • *.ira
  • *.irish
  • *.islam
  • *.ismaili
  • *.ist
  • *.istanbul
  • *.jetzt
  • *.jewelry
  • *.joburg
  • *.kaufen
  • *.kid
  • *.kids
  • *.kim
  • *.kinder
  • *.kitchen
  • *.kiwi
  • *.koeln
  • *.kyoto
  • *.land
  • *.lat
  • *.latino
  • *.lawyer
  • *.lease
  • *.legal
  • *.lgbt
  • *.life
  • *.lifeinsurance
  • *.lifestyle
  • *.lighting
  • *.limited
  • *.limo
  • *.live
  • *.living
  • *.llc
  • *.llp
  • *.loan
  • *.loans
  • *.london
  • *.lotto
  • *.love
  • *.ltd
  • *.ltda
  • *.luxe
  • *.luxury
  • *.madrid
  • *.mail
  • *.maison
  • *.management
  • *.map
  • *.market
  • *.marketing
  • *.markets
  • *.mba
  • *.med
  • *.media
  • *.medical
  • *.melbourne
  • *.memorial
  • *.men
  • *.menu
  • *.miami
  • *.mls
  • *.mobile
  • *.moda
  • *.moe
  • *.money
  • *.mortgage
  • *.moscow
  • *.motorcycles
  • *.mov
  • *.movie
  • *.movistar
  • *.music
  • *.mutual
  • *.mutualfunds
  • *.nagoya
  • *.navy
  • *.network
  • *.new
  • *.news
  • *.ngo
  • *.ninja
  • *.nrw
  • *.nyc
  • *.okinawa
  • *.one
  • *.onl
  • *.online
  • *.organic
  • *.osaka
  • *.paris
  • *.partners
  • *.parts
  • *.party
  • *.pay
  • *.persiangulf
  • *.pet
  • *.pets
  • *.pharmacy
  • *.phd
  • *.phone
  • *.photography
  • *.photos
  • *.physio
  • *.pictures
  • *.pink
  • *.pizza
  • *.place
  • *.plumbing
  • *.plus
  • *.poker
  • *.porn
  • *.press
  • *.prime
  • *.pro
  • *.productions
  • *.prof
  • *.promo
  • *.properties
  • *.protection
  • *.pub
  • *.pw
  • *.qpon
  • *.quebec
  • *.racing
  • *.radio
  • *.realestate
  • *.realtor
  • *.recipes
  • *.red
  • *.rehab
  • *.reise
  • *.reisen
  • *.reit
  • *.ren
  • *.rent
  • *.rentals
  • *.repair
  • *.report
  • *.republican
  • *.rest
  • *.restaurant
  • *.retirement
  • *.review
  • *.reviews
  • *.rich
  • *.rio
  • *.rip
  • *.rocks
  • *.rodeo
  • *.roma
  • *.room
  • *.rsvp
  • *.rugby
  • *.run
  • *.ryukyu
  • *.saarland
  • *.sale
  • *.salon
  • *.sarl
  • *.save
  • *.scholarships
  • *.school
  • *.schule
  • *.science
  • *.scot
  • *.search
  • *.seat
  • *.secure
  • *.security
  • *.services
  • *.sex
  • *.shia
  • *.shiksha
  • *.shoes
  • *.shop
  • *.shopping
  • *.show
  • *.silk
  • *.singles
  • *.site
  • *.ski
  • *.soccer
  • *.social
  • *.software
  • *.solar
  • *.solutions
  • *.song
  • *.soy
  • *.spa
  • *.space
  • *.sport
  • *.sports
  • *.spreadbetting
  • *.srl
  • *.stockholm
  • *.storage
  • *.store
  • *.stream
  • *.studio
  • *.study
  • *.style
  • *.sucks
  • *.supplies
  • *.supply
  • *.support
  • *.surf
  • *.surgery
  • *.swiss
  • *.sydney
  • *.systems
  • *.taipei
  • *.tax
  • *.taxi
  • *.team
  • *.tech
  • *.technology
  • *.tel
  • *.tennis
  • *.thai
  • *.theater
  • *.theatre
  • *.tickets
  • *.tienda
  • *.tips
  • *.tires
  • *.tirol
  • *.today
  • *.tokyo
  • *.tools
  • *.top
  • *.tour
  • *.tours
  • *.town
  • *.toys
  • *.trade
  • *.trading
  • *.training
  • *.translations
  • *.trust
  • *.tube
  • *.tunes
  • *.university
  • *.uno
  • *.vacations
  • *.vegas
  • *.ventures
  • *.vet
  • *.viajes
  • *.video
  • *.villas
  • *.vin
  • *.vip
  • *.vision
  • *.vlaanderen
  • *.vodka
  • *.vote
  • *.voting
  • *.voto
  • *.voyage
  • *.wales
  • *.wang
  • *.wanggou
  • *.watch
  • *.watches
  • *.web
  • *.webcam
  • *.webs
  • *.website
  • *.wed
  • *.wedding
  • *.weibo
  • *.wien
  • *.wiki
  • *.win
  • *.wine
  • *.work
  • *.works
  • *.world
  • *.wtf
  • *.xin
  • *.xyz
  • *.yoga
  • *.yokohama
  • *.yun
  • *.zip
  • *.zone
  • *.zuerich
  • *.co.uk
  • *.com
  • *.vermögensberater
  • *.vermögensberatung
  • *.дети
  • *.католик
  • *.ком
  • *.москва
  • *.онлайн
  • *.орг
  • *.рус
  • *.сайт
  • *.קום
  • *.ابوظبي
  • *.اتصالات
  • *.بازار
  • *.بيتك
  • *.شبكة
  • *.عرب
  • *.كاثوليك
  • *.كوم
  • *.موبايلي
  • *.موقع
  • *.همراه
  • *.कॉम
  • *.नेट
  • *.संगठन
  • *.คอม
  • *.みんな
  • *.クラウド
  • *.コム
  • *.ストア
  • *.セール
  • *.ファッション
  • *.ポイント
  • *.一号店
  • *.世界
  • *.中文网
  • *.企业
  • *.佛山
  • *.信息
  • *.健康
  • *.八卦
  • *.公司
  • *.公益
  • *.商城
  • *.商店
  • *.商标
  • *.在线
  • *.大拿
  • *.天主教
  • *.娱乐
  • *.家電
  • *.工行
  • *.广东
  • *.广州
  • *.微博
  • *.慈善
  • *.我爱你
  • *.手机
  • *.手表
  • *.招聘
  • *.政务
  • *.政府
  • *.新闻
  • *.时尚
  • *.書籍
  • *.机构
  • *.机构体制
  • *.深圳
  • *.游戏
  • *.点看
  • *.移动
  • *.网址
  • *.网店
  • *.网站
  • *.网络
  • *.购物
  • *.通販
  • *.集团
  • *.電訊盈科
  • *.食品
  • *.餐厅
  • *.香格里拉
  • *.點看
  • *.닷넷
  • *.닷컴
