How does Googlebot handle the robots.txt file?
Together these machines are known as Googlebot.
Maybe Googlebot hasn't swept by your site just yet.
Your web server is refusing connections from Googlebot.
These computers are collectively known as Googlebot.
nofollow: Prevents Googlebot from following links on this page.
This tool provides results only for Google user-agents (such as Googlebot).
How do crawlers and Googlebot crawl through your site and your web pages?
... robots.txt testing tool to verify that your mobile version is accessible to Googlebot.
If it "notices" that the server can't deal with page overload, Googlebot slows down or stops crawling.
This task is performed by software called a crawler or a spider (or Googlebot, in the case of Google).
Hidden links are intended to be crawled by Googlebot, but they are unreadable to humans because:
In general, you want Googlebot to access your site so that your web pages can be found by people searching on Google.
While Googlebot will not be able to crawl disallowed pages, they may be a significant part of your site's user experience.
So if you want to tell this spider what to do, a relatively simple User-agent: Googlebot line will do the trick.
Now, however, Google will use its smartphone Googlebot to crawl, index, and rank the mobile version of the site as well.
Googlebot can typically read Flash files and extract the text and links in them, but the structure and context are missing.
... robots.txt rules, used both by Googlebot and other major crawlers, as well as by about half a billion websites that rely on REP.
... indicates a session ID, you may want to exclude all URLs that contain it to ensure Googlebot doesn't crawl duplicate pages.
Googlebot (and most other crawlers) will obey only the rules under the more specific user-agent line and will ignore all others.
Language-dependent crawling: Googlebot begins crawling with requests that include an Accept-Language HTTP header.
Google's John Mueller discourages websites from linking to every page from the home page, saying it may prevent Googlebot from clearly understanding a site's architecture.
For Googlebot, we do not have any preference and recommend that webmasters consider their users when deciding on their redirection policy.
In the above case, you are disallowing the user agent called Googlebot from crawling /nogooglebot/ and all contents below this directory.
Looking at these Googlebot limitations, it seems unfair to assess performance without the ability to discern whether the website is fast or slow.
If you're worried about rogue bots using the Googlebot user-agent, we offer a way to verify whether a crawler is actually Googlebot.
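Several of the sentences above describe robots.txt behavior (a User-agent: Googlebot line, a Disallow rule for /nogooglebot/, and a crawler obeying only its own group), but the robots.txt file they refer to is not shown. The following is a minimal sketch of what such a file might look like and how its rules could be checked. The site www.example.com, the /private/ rule, the crawler name OtherBot, and the helper is_allowed are all assumptions made for illustration. Python's urllib.robotparser is used as a stand-in and is not Google's own parser; it applies the first group in the file that matches the user agent, so the Googlebot group is listed before the catch-all group.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt (not taken from the source): one group for
# Googlebot and one catch-all group for every other crawler.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /nogooglebot/

User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on the hypothetical site."""
    return parser.can_fetch(agent, "https://www.example.com" + path)


if __name__ == "__main__":
    paths = ("/nogooglebot/page.html", "/private/page.html", "/index.html")
    for agent in ("Googlebot", "OtherBot"):
        for path in paths:
            print(f"{agent:10} {path:25} allowed={is_allowed(agent, path)}")
```

In this sketch Googlebot is blocked from /nogooglebot/ but may still fetch /private/, because only its own group applies to it, while OtherBot falls through to the catch-all group.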