Essential Insights on How Google Indexes Images

Google has explained in a blog post how it indexes images. Its crawlers look at the textual content of the page on which an image appears, including the page’s title and body, to judge what the image shows. Google may also draw on the image’s filename, the anchor text pointing to it, its alt text and the caption provided in the Image Sitemap.

Google advises webmasters to make sure that each image’s filename relates to the image’s content. Google has also answered several other common questions:

Q: Why do I sometimes see Googlebot crawling my images, rather than Googlebot-Image?

A: Generally this happens when it’s not clear that a URL will lead to an image, so we crawl the URL with Googlebot first. If we find the URL leads to an image, we’ll usually revisit with Googlebot-Image. Because of this, it’s generally a good idea to allow crawling of your images and pages by both Googlebot and Googlebot-Image.
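To make that advice concrete, here is a minimal sketch that checks whether a robots.txt file lets both crawlers fetch a given image URL. It uses Python's standard `urllib.robotparser`, whose matching rules only approximate how Google itself interprets robots.txt. The robots.txt content, the example.com URLs and the helper name `crawlers_allowed` are all assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and rules are illustrative only.
ROBOTS_TXT = """\
User-agent: Googlebot-Image
Disallow: /private-images/

User-agent: *
Disallow: /admin/
"""


def crawlers_allowed(robots_txt, url, agents=("Googlebot", "Googlebot-Image")):
    """Return a dict mapping each user agent to whether it may fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


# Hypothetical image URLs on an example domain.
for image_url in (
    "https://www.example.com/images/red-bicycle.jpg",
    "https://www.example.com/private-images/draft.jpg",
):
    print(image_url, crawlers_allowed(ROBOTS_TXT, image_url))
```

An image that shows as blocked for either crawler may be worth a second look, since Google says it can visit the same URL with both.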

Q: Is it true that there’s a maximum file size for the images?

A: We’re happy to index images of any size; there’s no file size restriction.

Q: What happens to the EXIF, XMP and other metadata my images contain?

A: We may use any information we find to help our users find what they’re looking for more easily. Additionally, information like EXIF data may be displayed in the right-hand sidebar of the interstitial page that appears when you click on an image.

Q: Should I really submit an Image Sitemap? What are the benefits?

A: Yes! Image Sitemaps help us learn about your new images and may also help us learn what the images are about.

Q: I’m using a CDN to host my images; how can I still use an Image Sitemap?

A: Cross-domain restrictions apply only to the Sitemaps’ <loc> tag. In Image Sitemaps, the <image:loc> tag is allowed to point to a URL on another domain, so using a CDN for your images is fine. We also encourage you to verify the CDN’s domain name in Webmaster Tools so that we can inform you of any crawl errors that we might find.
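As a rough illustration of that setup, the sketch below builds a small Image Sitemap in which the page URL sits in the standard `<loc>` element on the site's own domain while each `<image:loc>` points to a CDN host. The two namespace URIs are the published sitemap and image-extension namespaces. The domains, the file paths and the function name `build_image_sitemap` are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"


def build_image_sitemap(pages):
    """Build sitemap XML from a dict of page URL -> list of image URLs."""
    # Register prefixes so the output uses <loc> and <image:loc>.
    ET.register_namespace("", SITEMAP_NS)
    ET.register_namespace("image", IMAGE_NS)

    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        # The page itself lives on the site's own domain.
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            # The image may be hosted elsewhere, for example on a CDN.
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical page and CDN-hosted image.
sitemap_xml = build_image_sitemap(
    {"https://www.example.com/bikes.html": ["https://cdn.example.net/photos/red-bicycle.jpg"]}
)
print(sitemap_xml)
```

The output is a single unindented line of XML. A real sitemap file would normally also begin with an XML declaration, which this sketch leaves out.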

Q: Is it a problem if my images can be found on multiple domains or subdomains I own — for example, CDNs or related sites?

A: Generally, the best practice is to have only one copy of any type of content. If you’re duplicating your images across multiple hostnames, our algorithms may pick one copy as the canonical copy of the image, which may not be your preferred version. This can also lead to slower crawling and indexing of your images.
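One way to audit a site for this is to group image files by a hash of their bytes and flag any file served from more than one hostname. The sketch below assumes the image bytes have already been downloaded into a dict. The URLs, the sample bytes and the helper name `find_cross_host_duplicates` are hypothetical, and byte-identical hashing will miss copies that were resized or re-encoded.

```python
import hashlib
from collections import defaultdict
from urllib.parse import urlsplit


def find_cross_host_duplicates(images):
    """Given a dict of URL -> image bytes, return groups of URLs that carry
    identical bytes and are served from more than one hostname."""
    by_digest = defaultdict(list)
    for url, content in images.items():
        by_digest[hashlib.sha256(content).hexdigest()].append(url)

    duplicates = []
    for urls in by_digest.values():
        hosts = {urlsplit(u).hostname for u in urls}
        if len(hosts) > 1:
            duplicates.append(sorted(urls))
    return duplicates


# Hypothetical, already-downloaded image bytes keyed by URL.
sample_images = {
    "https://www.example.com/img/red-bicycle.jpg": b"fake-jpeg-bytes-1",
    "https://cdn.example.net/img/red-bicycle.jpg": b"fake-jpeg-bytes-1",
    "https://www.example.com/img/blue-helmet.jpg": b"fake-jpeg-bytes-2",
}
print(find_cross_host_duplicates(sample_images))
```

Each group in the result is a candidate for consolidating onto a single preferred hostname.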

Q: We sometimes see the original source of an image ranked lower than other sources; why is this?

A: Keep in mind that we use the textual content of a page when determining the context of an image. For example, if the original source is a page from an image gallery that has very little text, it can happen that a page with more textual context is chosen to be shown in search. If you feel you’ve identified very bad search results for a particular query, feel free to use the feedback link below the search results or to share your example in our Webmaster Help Forum.

By Navneet Kaushal

Nav is the founder and CEO of Page Traffic, a premier search engine company known for its assured SEO service, web design and development, copywriting and full-time SEO professionals. Navneet has wide experience in natural search engine optimization, internet marketing and PPC campaigns. He is a prolific writer, and his articles can be found in the "Best Articles" section of many websites and article banks. As a search engine analyst, he has over 9 years of experience, and he applies that knowledge here.