    Even Internet Marketers Can Understand Latent Semantic Indexing

    Sunday, January 18, 2009 1:42:55 AM (Eastern Standard Time, UTC-05:00)

    If you think Education is expensive – you should try ignorance

     

    As I promised a couple of weeks ago, I’m going to attempt to pull back the shroud of mystery that is so often associated with highly complex mathematical concepts, to a point that even marketers and Canadians can understand them.

    I learned long ago that a simple concept the human mind can easily conceive sometimes turns into a VERY complicated program once you turn that concept into an actual working algorithm. If you are not a scientist involved with Information Retrieval (IR), then reading the patents and white papers can leave you feeling confused and intimidated.

    So, what I do is read the text of those white papers and patents (skipping all the math formulas and illustrations), read what some others whom I respect say about those papers, then devise my own testing process derived from what I THINK I understand and try it out. Once I prove to myself that I have the “gist” of the concept as it applies to what is important to me (usually organic, relevant traffic generation), I’m satisfied that I understand the concept well enough to capitalize and profit from it. It’s not as hard as some might have you believe.

    Today, the "Dumb It Down" Anti-seo SEO Guru will demystify:

    Latent Semantic Indexing

    LSI has been around for much longer than you’d think. In fact, LSA (latent semantic analysis) was patented in 1988. But I became aware of it from a man I was lucky enough to call a friend, a very smart man and one of those IR scientist guys I mentioned earlier. He was VERY involved in LSI research and application, and he was the one who helped me understand it at a high level and then apply it at a low level. In my experience, when you can learn it high and apply it low, it usually means $.

    His name is Edel Garcia. I haven’t seen him around in a while, so I’ve kind of lost touch with him, but he still has a great site covering this and related topics. While possibly a little dated, it is still very good foundational info.

    http://www.miislita.com/information-retrieval-tutorial/svd-lsi-tutorial-4-lsi-how-to-calculations.html  

    Ok, let’s get started.

    One of my favorite sites about LSI is http://www.knowledgesearch.org/lsi/lsa_definition.htm

    An excerpt from that page that I feel illustrates the concept pretty well is:

    let's say we use LSI to index our collection of mathematical articles. If the words n-dimensional, manifold and topology appear together in enough articles, the search algorithm will notice that the three terms are semantically close. A search for n-dimensional manifolds will therefore return a set of articles containing that phrase (the same result we would get with a regular search), but also articles that contain just the word topology. The search engine understands nothing about mathematics, but examining a sufficient number of documents teaches it that the three terms are related. It then uses that information to provide an expanded set of results with better recall than a plain keyword search.

    Ignorance is Bliss

    Before we discuss the theoretical underpinnings of LSI, it's worth citing a few actual searches from some sample document collections.

    • In an AP news wire database, a search for Saddam Hussein returns articles on the Gulf War, UN sanctions, the oil embargo, and documents on Iraq that do not contain the Iraqi president's name at all.
    • Looking for articles about Tiger Woods in the same database brings up many stories about the golfer, followed by articles about major golf tournaments that don't mention his name. Constraining the search to days when no articles were written about Tiger Woods still brings up stories about golf tournaments and well-known players.
    • In an image database that uses LSI indexing, a search on Normandy invasion shows images of the Bayeux tapestry - the famous tapestry depicting the Norman invasion of England in 1066, the town of Bayeux, followed by photographs of the English invasion of Normandy in 1944.

    In all these cases LSI is 'smart' enough to see that Saddam Hussein is somehow closely related to Iraq and the Gulf War, that Tiger Woods plays golf, and that Bayeux has close semantic ties to invasions and England. As we will see in our exposition, all of these apparently intelligent connections are artifacts of word use patterns that already exist in our document collection.
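
    For anyone who wants to see the idea in that excerpt actually move, here is a minimal sketch in Python. It is a toy under stated assumptions, not how any search engine is built: the five-document corpus is made up, the function names (term_document_matrix, top_eigenpairs, lsi_scores) are hypothetical, and the truncated SVD at the heart of LSI is approximated with plain power iteration so that nothing beyond the standard library is needed. Query and documents are both projected into the same small "latent" space and compared there by cosine similarity.

```python
import math

# Made-up toy corpus: three maths documents and two golf documents.
DOCS = [
    "n-dimensional manifold topology",
    "manifold topology",
    "topology proof",          # never mentions "manifold"
    "golf tournament player",
    "golf player",
]

def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[t][d] counts term t in document d."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    matrix = [[doc.split().count(term) for doc in docs] for term in vocab]
    return vocab, matrix

def top_eigenpairs(sym, k, iterations=500):
    """Top-k eigenpairs of a symmetric positive semi-definite matrix,
    found by power iteration with deflation (good enough for a toy)."""
    n = len(sym)
    work = [row[:] for row in sym]
    pairs = []
    for _ in range(k):
        vec = [1.0 + 0.1 * i for i in range(n)]  # fixed start vector
        for _ in range(iterations):
            nxt = [sum(work[i][j] * vec[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in nxt))
            vec = [x / norm for x in nxt]
        value = sum(vec[i] * work[i][j] * vec[j] for i in range(n) for j in range(n))
        pairs.append((value, vec))
        # Deflate: subtract the component just found so the next pass finds the next one.
        for i in range(n):
            for j in range(n):
                work[i][j] -= value * vec[i] * vec[j]
    return pairs

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norms = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norms if norms else 0.0

def lsi_scores(docs, query, k=2):
    """Cosine similarity between the query and each document in a k-dimensional latent space."""
    vocab, a = term_document_matrix(docs)
    n_terms, n_docs = len(vocab), len(docs)
    # Gram matrix A^T A; its eigenvectors are the right singular vectors of A,
    # and its eigenvalues are the squared singular values.
    gram = [[sum(a[t][i] * a[t][j] for t in range(n_terms)) for j in range(n_docs)]
            for i in range(n_docs)]
    pairs = top_eigenpairs(gram, k)
    q = [query.split().count(term) for term in vocab]
    # q^T A: plain keyword overlap between the query and each document.
    overlap = [sum(q[t] * a[t][d] for t in range(n_terms)) for d in range(n_docs)]
    # Query in latent space: (U_k^T q)[c] = (q^T A v_c) / sigma_c
    q_hat = [sum(overlap[d] * vec[d] for d in range(n_docs)) / math.sqrt(value)
             for value, vec in pairs]
    # Document d in latent space: (U_k^T a_d)[c] = sigma_c * v_c[d]
    doc_hats = [[math.sqrt(value) * vec[d] for value, vec in pairs]
                for d in range(n_docs)]
    return [cosine(q_hat, doc_hat) for doc_hat in doc_hats]

if __name__ == "__main__":
    for doc, score in zip(DOCS, lsi_scores(DOCS, "manifold")):
        print(f"{score:6.3f}  {doc}")
```

    Run as a script, it prints one score per document for the query "manifold". In this toy setup the "topology proof" document, which never contains the word manifold, should score about as high as the two documents that do, while the golf documents score near zero. With only two latent dimensions and two vocabularies that never overlap, that outcome is almost built in, so treat it as an illustration of the mechanism and nothing more.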

     

    Cool? Not too much of a stretch to grasp, huh? BUT what does that tell you about how to build your pages so that you start generating converting traffic from organic results?

    I’m going to give you several links at the end of this post that display a wide variety of explanations, definitions and opinions, and that give you as much depth and insight into the topic as you want to dig for. But when you start looking for what all those definitions mean as they apply to making your competitors suck SERP wind, the list gets pretty short. That’s what I’m trying to help with.

    I’ve read most of them, as painful as many are, and I can save the majority of readers a lot of time and teeth grinding. Boiling all the information down into workable solutions is the key, and to do that you don’t need to be a scientist or mathematician. You just have to use your common sense and realize that all these papers were written to satisfy some personal agenda of the author, and we can all be pretty sure that agenda had little to do with OUR success.

    So, bottom line, what does all the self-serving information tell us as online marketers?

    STAY ON MESSAGE!

    That’s right. Perhaps millions of words online describing a pretty basic process and it all comes down to 3 words. Stay on message.

    I understand how difficult it is to plan out an entire site for the long term. Things change, and experience is what we get when we were expecting something else. So even though I am a big supporter of planning and developing objectives and strategies, I’m making it even easier than that. You can do it on a page-by-page basis: simply spend five minutes thinking about what the new page is about, what its purpose is, and what words best serve that purpose, and then stay on message.

    It is really not that different than talking to a few people at the same time. If you had the attention of say half a dozen people for 10 minutes and your objective was to get them to buy a couple of raffle tickets, you may spend a little time talking about the charity you were pitching, maybe a little time talking about yourself to establish trust but the majority of the time you would spend talking about the benefits of them buying the tickets. Why? So they associate the benefits with the product. ASSOCIATE. That’s a big word when it comes to LSI.

    Suppose someone in the group raised their hand and asked a question about some other charity. Would you talk about THAT charity and their benefits? Why would you want to point out the benefits of a different charity? It would confuse the people you just convinced about the benefits of YOUR charity.

    Would you talk about the other charity, trying to make them look bad? Again, by talking about them instead of you, you risk someone disagreeing with you, and you lose affinity with them.

    So what is the best route? STAY ON MESSAGE.

    It’s the same with web pages and online content. The only difference is you are not using a vocal medium, you are using a textual and graphical medium.

    So you use the alt attributes (the so-called alt tags) on graphics to associate words that serve your purpose. You use heading tags, titles, and anchor text in links, both internal and external. You think about what words you want to be put together by the search engine and by the humans reading the words, and STAY ON MESSAGE. If you are talking about plastic surgery, don’t start talking about plastic car parts. Use medical terms, use anatomical terms, use common words and phrases that paint a mental picture of the target topic.

    The next page you build, you can talk about anything else you want, but on that page, again, talk only about one thing. Use as many words as you like (the fewer words you use to get the message clear in the reader’s mind, the better, BTW), but only talk about that one thing.
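
    One crude way to spot-check a page against its message is to pull out the text sitting in those slots (title, headings, link text, image alt text) and flag any slot that mentions none of your topic words. The sketch below does that with Python's standard html.parser module. Everything in it is an assumption made for illustration: the class and function names (OnPageTextCollector, off_message_slots), the sample page, and the list of topic terms are all made up. It is plain word matching, not LSI, and it says nothing about how any search engine actually scores a page.

```python
import re
from html.parser import HTMLParser

class OnPageTextCollector(HTMLParser):
    """Collect text from title, heading and link tags, plus image alt attributes."""
    TEXT_TAGS = {"title", "h1", "h2", "h3", "h4", "h5", "h6", "a"}

    def __init__(self):
        super().__init__()
        self.slots = {}    # slot name -> list of text fragments found there
        self._open = []    # stack of tracked tags that are currently open

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            alt = dict(attrs).get("alt")
            if alt:
                self.slots.setdefault("img alt", []).append(alt.strip())
        elif tag in self.TEXT_TAGS:
            self._open.append(tag)

    def handle_endtag(self, tag):
        if tag in self._open:
            # Close the most recently opened tag of this name.
            last = len(self._open) - 1 - self._open[::-1].index(tag)
            del self._open[last]

    def handle_data(self, data):
        text = data.strip()
        if text and self._open:
            self.slots.setdefault(self._open[-1], []).append(text)

def off_message_slots(html, topic_terms):
    """Return (slot, text) pairs whose text contains none of the topic terms."""
    collector = OnPageTextCollector()
    collector.feed(html)
    collector.close()
    terms = {term.lower() for term in topic_terms}
    misses = []
    for slot, texts in collector.slots.items():
        for text in texts:
            words = set(re.findall(r"[a-z]+", text.lower()))
            if not words & terms:
                misses.append((slot, text))
    return misses

# Hypothetical page about plastic surgery with one heading that wanders off to car parts.
SAMPLE_PAGE = """
<html><head><title>Plastic Surgery Recovery Guide</title></head>
<body>
<h1>Recovering from rhinoplasty surgery</h1>
<img src="a.jpg" alt="surgeon reviewing rhinoplasty results">
<h2>Plastic bumpers on sale</h2>
<a href="/faq">surgery questions</a>
</body></html>
"""

if __name__ == "__main__":
    print(off_message_slots(SAMPLE_PAGE, ["surgery", "rhinoplasty", "surgeon"]))
    # prints [('h2', 'Plastic bumpers on sale')]
```

    The topic list is deliberately short and hand-picked. In practice you would write down the handful of words you decided on in your five minutes of thinking and check each new page against them.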

    That’s it. I could elaborate but there is really no need. If you don’t understand staying on topic or if you suffer from ADD or some other affliction that hinders your concentration or focus, then you should seek professional help, but for the mast vajority of us wanting to make a buck online, just focus and stay on message.

     

    Peach Y’all

    The Anti-seo SEO Guru

    References

    en.wikipedia.org/wiki/Latent_semantic_analysis

    www.knowledgesearch.org/lsi/lsa_definition.htm

    lsa.colorado.edu/papers/dp1.LSAintro.pdf

    www.freshpatents.com/Scalable-probabilistic-latent-semantic-analysis-dt20071011ptan20070239431.php

     

     

     

    Don’t eat your lunch before you get to school. You won’t have anything to trade for something better and you’ll go hungry!