Flash Indexing on Google, Revisited

NOTE: This article has been revised based on the helpful comments from Brian Ussery and Bobby van der Sluis. Thanks very much to both!

In my last posting on Flash indexing by Google, I said that we should watch and wait because thorough testing and standards would emerge over time. Well, the wait is over: Brian Ussery has published the first set of case studies I’ve found on how Google indexes Flash. The results confirm much of what we already suspected.

Ussery conducted four case studies, each examining a different aspect of how Googlebot interacts with Flash. Here’s the upshot:

  • CASE STUDY #1: Does Googlebot correctly associate text inside a Flash file with the parent URL (the page that the Flash is embedded in)?
    RESULTS: No. Usually Googlebot indexes the parent URL and the Flash separately.
  • CASE STUDY #2: Can Flash files accrue PageRank?
    RESULTS: Yes. Flash files accrue PageRank independently of their own parent URLs.
  • CASE STUDY #3: Does Googlebot follow URLs containing #anchors?
    RESULTS: No. Googlebot ignores #anchors and only extracts URLs preceding the #anchor.
  • CASE STUDY #4: Can Google translate text inside of a Flash file?
    RESULTS: No. Googlebot cannot translate text inside of a Flash file.
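
To make the result of Case Study #3 concrete, here is a minimal Python sketch of what "ignoring #anchors" means for deep links. It only models the behavior Ussery reports; it is not Googlebot's actual code. The function name `url_as_crawled` and the `#contact` link are made up for illustration, and the `mysite.com` URL is the example from Ussery's comment.

```python
from urllib.parse import urldefrag

def url_as_crawled(link: str) -> str:
    """Return the link with its #anchor dropped, modeling the reported behavior."""
    url, _fragment = urldefrag(link)
    return url

# Three links a Flash site might expose via deep linking (illustrative only).
links = [
    "http://www.mysite.com/index.html#aboutUs",
    "http://www.mysite.com/index.html#contact",  # hypothetical second "page"
    "http://www.mysite.com/index.html",
]

# Once the anchors are dropped, all three links collapse into a single URL.
distinct = {url_as_crawled(link) for link in links}
print(distinct)
```

In other words, if every "page" of a Flash site is distinguished only by its #anchor, a crawler that behaves this way sees one URL, not many.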

All these case studies seem to support the conclusion I came to earlier: Google’s support for indexing Flash is meager at best, and at worst it could hurt your placement in search results. So what should we, as Flash developers, do if we want a search-engine-optimized Flash site?

In the past, you could use SWFObject to embed your Flash: it presented the Flash file to visitors who could support it (those with the correct Flash plugin) and alternate (X)HTML to visitors who COULDN’T. That second group used to include Googlebot. So if you set up alternate (X)HTML, Googlebot would index that, and if you built it right, your Flash pages could rank nearly as well as regular HTML pages (because Google was seeing HTML). This let you build a Flash site normally, then build a set of alternate HTML pages that each embedded the same Flash file, wrote out that page’s content in HTML, and used deep linking to send the visitor to the correct “page” within Flash. Nobody was the wiser: Google indexed your content, and people clicking on Google’s links saw the correct Flash. You can click here to see my tutorial on how to build a site this way.
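
As a rough illustration of the alternate-page approach described above, the following Python sketch generates one such (X)HTML page. It is a sketch under assumptions, not the code from my tutorial: the file name `site.swf`, the `content` element id, the dimensions, the required Flash version, and the `section` flashvar are all hypothetical, and your Flash movie would need its own logic to read that flashvar and jump to the matching “page”. The only library call it relies on is SWFObject 2’s `swfobject.embedSWF`, which replaces the named element with the movie when the visitor has a suitable Flash plugin and leaves the HTML in place otherwise.

```python
import json
from html import escape

# One alternate page: the same SWF on every page, different HTML fallback content.
# All names below (site.swf, "content", the "section" flashvar) are hypothetical.
PAGE_TEMPLATE = """<!DOCTYPE html>
<html>
<head>
<title>{title}</title>
<script type="text/javascript" src="swfobject.js"></script>
<script type="text/javascript">
  // Replace the "content" div with the movie if a suitable Flash plugin exists.
  swfobject.embedSWF("site.swf", "content", "800", "600", "9.0.0", false, {flashvars});
</script>
</head>
<body>
<div id="content">
{alt_html}
</div>
</body>
</html>
"""

def build_page(title: str, section: str, alt_html: str) -> str:
    """Build one alternate (X)HTML page that deep-links into the Flash movie."""
    return PAGE_TEMPLATE.format(
        title=escape(title),
        flashvars=json.dumps({"section": section}),  # read by the movie at startup
        alt_html=alt_html,  # the same content the Flash "page" shows, as HTML
    )

page = build_page("About Us", "aboutUs", "<h1>About Us</h1><p>Who we are.</p>")
print(page)
```

You would generate one page like this per Flash “page”, each with its own title, fallback content, and `section` value.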

However, that’s no longer the case. In their testing, Brian Ussery and Bobby van der Sluis have found that Google now indexes BOTH the (X)HTML content AND the Flash content in SWFObject. You might say, “So what? Now Google is guaranteed to find my content. Great news!” The problem is that Google’s search results can link DIRECTLY to your Flash file, which may or may not open at the “page” that your (X)HTML describes. So your Google search results can now include a bunch of (X)HTML pages AS WELL AS a bunch of Flash files. You can’t dictate which result Google puts at the top of the list, so people may click through to Flash files that you never meant them to land on.

So what do we do? Well, for my two cents, I think we should still build sites using the original method for building SEO-friendly websites. While you may get multiple results, I think that the benefits outweigh the negatives. Here’s why:

  • Not all search engines can index Flash, so the other search engines will index your (X)HTML content and serve up the pages you intended.
  • Google still indexes the (X)HTML pages. Since most (X)HTML pages will be more SEO friendly than the Flash, chances are that the (X)HTML pages will rank higher in the results, and since people are more inclined to click higher results, chances are you’ll still get the intended outcome. I know that “chances are” is contrary to most practices in SEO, but it’s the best we’ve got at this juncture.
  • It’s the only option we’ve got. Googlebot is indexing Flash now, whether we like it or not. Given that Googlebot’s ability to index Flash files is not as good as its ability to index (X)HTML files, the only option I can think of to help people find the content they’re looking for is to make sure that we’ve got (X)HTML content that WILL be correctly indexed (alongside the Flash that WON’T).

I sincerely hope that Googlebot’s ability to index Flash sites will continue to improve so that we can simply build Flash sites as Flash sites. I also hope that the entire Flash community continues to do research and testing to help this process. I recognize that the approach I endorse carries some problems with it, and I welcome suggestions for better ways to optimize Flash for search engines. But from my current vantage point, it looks like the best way to do it.







4 Responses to “Flash Indexing on Google, Revisited”

Thanks for checking out my blog and the post! Here are a few things I wanted to be clear on:

1. Googlebot traverses SWFObject and sees content in Flash. Because Flash files are indexed, users without Flash may access Flash directly from search results.

2. Because Googlebot traverses SWFObject it doesn’t seem as though it sees underlying (X)HTML pages or content within.

3. Googlebot crawls text content in Flash.

4. Googlebot ignores #anchors like those provided via SWFAddress. As a result Googlebot “sees” a link to http://www.mysite.com/index.html#aboutUs as http://www.mysite.com/index.html.

5. Because Googlebot ignores #anchors, only the URL prior to the anchor will be indexed, meaning users may receive different content than desired.

Sorry for the not so good news but I hope this information helps…

-Brian

Brian Ussery added these pithy words on Oct 19 08 at 8:10 am

[...] with direct links to your SWFs, which can completely mess everything up — I wrote a post that explores the problems presented by Google’s ability to index Flash if you want to read more. In a nutshell, however, I personally have come to the conclusion that [...]

TheCosmonaut » Blog Archive » 5 Steps for Building a SEO-Friendly Flash Site Using SWFObject and SWFAddress added these pithy words on Nov 09 08 at 11:59 am

Flash files getting these days very popular, if except Translation everything bot can extract then I thing there not to be any problem of us to make flash site. Thanks for this point out.

seo of india added these pithy words on Nov 13 09 at 9:34 pm

HaI hadn’t yet to think about the ultimately simple ways Google operates. The thing is that even though it crawls your page abundant times, it takes a tonne of due effort on your part in order to get your website to become “relevent” to Google. This will add to my knowledge of search engine optimization.

indexed by google added these pithy words on Jan 02 10 at 8:50 pm
