Swipe to Unlock: The Primer on Technology and Business Strategy (Mehta, Agashe, Detroja)
How does Google search work?

Whenever you search on Google, the search engine finds thousands, if not millions, of results for you in just half a second, on average. [125] And the results are sorted pretty well: 92% of searchers go to a link on the first page of results. [126] How on earth does Google do it?

Google doesn’t actually search the entire internet every time you ask it something. It actually has huge databases (tables of information, like in Excel) and algorithms that read over those databases to figure out what to show you. Let’s dive into how these work.

First, Google needs to build up a list of every webpage on the internet; it’s a tall order because there are over 30 trillion pages out there. [127] Google uses programs called spiders to “crawl” over webpages until it’s found all of them (or at least, what Google thinks is all of them). The spiders start on a few webpages and add those to Google’s list of pages, called the “index.” Then the spiders follow all the outgoing links on those pages and find a new set of pages, which they add to the index. Next, they follow all the links on those pages, and so on, until Google can’t find anything else. This process is always ongoing, and Google is always adding new pages to its index or updating pages when they change. And the index is huge, weighing in at over 100 million gigabytes. [128] If you tried to fit all that on one-terabyte external hard drives, you’d need 100,000, which, if you stacked them up, would be around a mile high. [129]

When you search Google, it grabs your query (the text you typed into the search bar) and looks through its index to find the webpages that are most relevant to your query. How might Google do this? The simplest way would be to just look for occurrences of a particular keyword, kind of like hitting Ctrl+F or Cmd+F to search a giant Word document.
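
The two steps described so far, crawling links to build an index and then scanning that index for a keyword, can be sketched in a few lines of Python. This is only a toy illustration, not how Google's systems are actually built: the five-page "web" in `TOY_WEB`, the `.example` addresses, and the function names `crawl` and `keyword_search` are all made up for the example.

```python
from collections import deque

# Hypothetical miniature "web": address -> (page text, outgoing links).
TOY_WEB = {
    "a.example": ("candy bars and chocolate", ["b.example", "c.example"]),
    "b.example": ("chocolate chocolate recipes", ["a.example"]),
    "c.example": ("hiking trails", ["d.example"]),
    "d.example": ("chocolate shops near hiking trails", []),
    "e.example": ("chocolate nobody links to", []),
}


def crawl(seeds):
    """Start from the seed pages and keep following outgoing links,
    adding every page that is reached to the index."""
    index = {}
    queue = deque(seeds)
    while queue:
        url = queue.popleft()
        if url in index or url not in TOY_WEB:
            continue  # already indexed, or the link leads nowhere
        text, links = TOY_WEB[url]
        index[url] = text
        queue.extend(links)  # visit the linked pages later
    return index


def keyword_search(index, keyword):
    """Ctrl+F-style search: count how often the keyword appears on each
    indexed page and list the pages with the most matches first."""
    keyword = keyword.lower()
    hits = []
    for url, text in index.items():
        count = text.lower().split().count(keyword)
        if count > 0:
            hits.append((url, count))
    hits.sort(key=lambda hit: (-hit[1], hit[0]))
    return hits


index = crawl(["a.example"])
results = keyword_search(index, "chocolate")
print(sorted(index))
print(results)
```

Note that `e.example` never makes it into the index: no page links to it, so a spider that only follows links has no way to find it.
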
Indeed, this is how search engines in the ’90s used to work: they’d search for your query in their index and show the pages that had the most matches, [130] an attribute called keyword density. [131] This turned out to be pretty easy to game. If you searched for the candy bar Snickers, you’d imagine that snickers.com would be the first hit. But if a search engine just counted the number of times the word “snickers” appeared, anyone could make a random webpage that just said “snickers snickers snickers snickers” (and so on) and jump to the top of the search results. Clearly, that’s not very useful.

PageRank

Instead of keyword density, Google’s core innovation is an algorithm called PageRank, which its founders Larry Page and Sergey Brin created as part of their PhD research in 1998. [132] Page and Brin noticed that you can estimate a webpage’s importance by looking at which other important pages link to it. [133] It’s like how, at a social event, you know someone is popular when they’re surrounded by other popular people. PageRank gives each webpage a score, which is based on the PageRank scores of every other page that links to that page. [134] (The scores of those pages depend on the pages that link to them, and so on… don’t worry, this doesn’t go on forever, thanks to some clever math. [135])

For instance, if we made a brand-new webpage about Abraham Lincoln, it would initially have very low PageRank. If some obscure blog added a link to our page, our page would get a small boost to its PageRank. PageRank cares more about the quality of incoming links than the quantity, [136] so even if dozens of obscure blogs linked to our page, we wouldn’t gain much. But if a New York Times article (which would probably have a high PageRank) linked to our page, our page would get a huge PageRank boost.

So once Google finds all the pages in its index that are relevant to your search query, it ranks them using a variety of criteria. [137] PageRank is the major one, but Google has several other criteria: Google considers how recently a webpage was updated, ignores websites that look spammy (like the “snickers snickers snickers snickers” site we mentioned earlier), considers your location (it could return the NFL if you search for “football” in the US, but the English Premier League if you search for “football” in England), and more. [138] One particularly interesting technique is called query rewriting, where Google looks for words with similar meanings to the word you searched for. For instance, if you searched for “sofa”, Google could also
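
To make the PageRank idea from earlier in this section a bit more concrete, here is a minimal sketch of one common way to compute scores that depend on each other: start every page with the same score, then repeatedly pass each page's score along its outgoing links until the numbers settle. The damping value of 0.85, the handling of pages with no outgoing links, and the made-up pages in `links` are assumptions chosen for illustration; nothing here describes Google's production system.

```python
def pagerank(links, damping=0.85, iterations=50):
    """links maps each page to the list of pages it links to.
    Every linked-to page must also appear as a key.
    Returns a dict mapping each page to its score."""
    pages = list(links)
    n = len(pages)
    scores = {page: 1.0 / n for page in pages}  # everyone starts equal
    for _ in range(iterations):
        # Every page keeps a small baseline score.
        new_scores = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # Split this page's score evenly among the pages it links to.
                share = damping * scores[page] / len(outgoing)
                for target in outgoing:
                    new_scores[target] += share
            else:
                # A page with no outgoing links spreads its score over all pages.
                share = damping * scores[page] / n
                for target in pages:
                    new_scores[target] += share
        scores = new_scores
    return scores


# Hypothetical toy web: "news_site" is popular (four pages link to it) and
# links to "lincoln_page"; three obscure blogs link to "other_page".
links = {
    "reader_1": ["news_site"],
    "reader_2": ["news_site"],
    "reader_3": ["news_site"],
    "reader_4": ["news_site"],
    "news_site": ["lincoln_page"],
    "blog_1": ["other_page"],
    "blog_2": ["other_page"],
    "blog_3": ["other_page"],
    "lincoln_page": [],
    "other_page": [],
}

scores = pagerank(links)
for page, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{page}: {score:.3f}")
```

In this toy web, `lincoln_page` has a single incoming link, but it comes from a well-linked page, so it ends up with a higher score than `other_page`, which has three incoming links from pages nobody links to.
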