Wednesday, February 11, 2009

Idea(s) for Google Alerts

I've been using the alerts mechanism offered through Google Alerts for a couple of months now.

What kind of service does Google Alerts offer, you ask? In a nutshell, it is a service that lets you receive email- or RSS-feed-based alerts for the search terms you choose.

Now, Google is not the first to offer an alert engine like this one. Most major news services allow you to configure some sort of alert system; CNN's Breaking News alerts come to mind. Companies like www.cnet.com also let you configure and receive alerts on any entity you wish. I am sure there are tons of others out there.

The (Google) alerting structure is quite simple yet, by virtue of Google's algorithms (read: Googly spiders crawlin'), quite powerful. I mean, this is an alerting structure that harnesses the capabilities of one of the best search engines we have. It can potentially scour the entire Google index to get you a comprehensive listing of everything that matches your alert, then compile those alerts and send them out to you in a timely manner. Armed with a BlackBerry or a mobile device of choice, you have information at your fingertips as soon as it is made available.

After using this engine (Google Alerts :) for a couple of months I have brainstormed a couple of suggested improvements for this utility/application/service. So without further ado:

* idea # 1 *
Incorporate Google Alerts with the services rendered by Blogger (for trending and analysis):
Description for this idea:
At the moment all the alerts are sent using the following settings:
Delivery mechanism: email/rss
Frequency : daily, weekly, as it happens
Type : News, Blog, Web, Comprehensive, Video, etc.

Once you receive these alerts (email/RSS), there is very little you can do with the data set when it comes to trending and analysis. I consider email and RSS static entities: once the data is delivered to its "destination", it just sits there. If the alerts could somehow be posted onto the Blogger infrastructure (owned by Google), you would have a one-stop repository for all your aggregated alerts. You would then have a way to view the data set in its entirety from a trending and analysis perspective.
Pros :
* No more having to log-in to your email to look at any given alert
* Not having to scroll through the alert in your mobile device to go through each and every link. You can just go to your customized blog to get this information
* Trending and Analysis for you would be done automatically (Blogger automatically categorizes each entry by date/month)
* Ability to sort through the data set and remove undesirable alerts. Example: an alert for the term "Honda" might occasionally bring up items about the Street Fighter II character of the same name, whereas you set the alert up for the Japanese car manufacturer (sorry, can't think of a better example right now :)
Cons :
* Potential abuse of service (SPAM)
* Could lead to a bit of clutter in the blogosphere
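
As a rough illustration of the last two pros, here is a minimal Python sketch of what sorting through an alert data set might look like once it lives somewhere more flexible than an inbox. The `Alert` record, the sample entries and the exclusion keywords are all made up for this example; none of this is an actual Google Alerts or Blogger API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Alert:
    """Hypothetical alert record; not a real Google Alerts data structure."""
    term: str
    title: str
    received: date


def drop_unwanted(alerts, exclude_words):
    """Keep only alerts whose title contains none of the excluded words."""
    lowered = [word.lower() for word in exclude_words]
    return [a for a in alerts if not any(w in a.title.lower() for w in lowered)]


def group_by_month(alerts):
    """Bucket alerts by (year, month), the way a blog archive would."""
    buckets = defaultdict(list)
    for alert in alerts:
        buckets[(alert.received.year, alert.received.month)].append(alert)
    return dict(buckets)


# Made-up sample data for the "Honda" example.
alerts = [
    Alert("Honda", "Honda unveils new hybrid sedan", date(2009, 1, 14)),
    Alert("Honda", "E. Honda combo guide for Street Fighter II", date(2009, 1, 20)),
    Alert("Honda", "Honda reports quarterly results", date(2009, 2, 3)),
]

car_alerts = drop_unwanted(alerts, ["street fighter"])
monthly = group_by_month(car_alerts)
for (year, month), items in sorted(monthly.items()):
    print(f"{year}-{month:02d}: {len(items)} alert(s)")
```

The month buckets mirror the date-based archive a blog gives you for free, and the exclusion list is the crudest possible way to weed out the video game hits.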

* idea # 2 *
Incorporate Google Alerts with a)Blogger + b)Google Trends
Description for this idea: Same as above, only in this case a little Google Trends window (Java applet/PNG image) shows up beside each alert posting, giving you the number of instances of that term generated in a specified timeframe. Google could even charge customers by building an enhanced feature set into such an architecture.

Think about it: any enterprise's marketing team would love to know what kind of "impression" their marketing campaign is actually making on the web. I'd love to elaborate on this specific idea. If someone from Google is reading this, and I know you are (thanks to Google Analytics :), then please feel free to get in touch with me.





* Also *
At this point, and again, if someone from Google is reading this, I would really encourage you folks to think inside the box. Yes, inside. Not just in relation to this idea, but across all of Google's products in general.

I say this because Google has tons of great, useful products (apps) that make retrieving, sorting and storing information efficient. However, there is little to no integration between those applications; few of them work well with each other (if at all). There is so much useful data, and I'd hate for it to go to waste or be used inefficiently.


Note to users new to Google Alerts, or those interested in the product: if you regularly look up (verb: google) information about a particular company, event or any other "term" per se by hand, then you might want to check out Google Alerts. It will save you time :)

Thursday, November 27, 2008

Future of Search

I've always been of the mindset that the future of search is, in fact, image search, i.e. the ability to submit an image and have the search engine report back with all the matches it can find across the expanse (intranet/Internet). Video categorization and search would eventually be the next evolutionary step... but image search has to evolve and grow before that occurs.


I did post an idea back in 2006 on this very blog that touched upon searching by means of tagging entities and images. (Click here and refer to item "xvi) Google Big-picture:")


Anyhow, this is a quick little post about a startup I discovered the other day. It's a company called Idee Inc., which develops advanced image identification and visual search software. They are based out of Toronto. After reading about them, signing up for a beta account on their website and playing with the couple of products they have rolled out, let me just cut to the chase and state that I will be eagerly awaiting their IPO.


Check out some of their products listed on their main page.


I spent some time checking out the capabilities of one of the services they offer, found at http://tineye.com. You have to register on the website to get on the beta (registration is instantaneous). Basically, it is an image recognition service: you upload an image on the website and hit go. In a typical submit/get request, the Idee servers search through their image set and try to find an identical or near-identical match for the image you submitted. I ran some tests on the http://tineye.com website, and here are some of the results I received for the various requests (number of items returned):

1- tux (304 results returned)
2- stanley cup (7 hits)
3- arctic fox (2 hits)
4- image of george bush (486 hits)
5- particular image of stephen hawking (14 hits)
6- particular image of Hal 9000 (2001 space odyssey) (37 results)


I'd be really curious to know how Idee's algorithms actually work. The way I perceive it, they obviously have to involve some sort of pattern recognition. My presumption is that for each image Idee's servers crawl, they carve out some sort of outline and store it in their back-end database. The outline would be the first step; the algorithm would also assign each image a set of attributes that makes it unique, something along the lines of what facial features do for face recognition. This indexing and characterization is where the magic is occurring, and how it occurs is what I would like to know.

Otherwise, if the algorithms are actually going down to the pixel level, then my theory goes in the bin. However... I highly doubt that the indexing is happening at the pixel level, as the system/CPU overhead of running a job like this across the huge expanse of the internet would be too much for the servers to handle.
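
To make the guess above a little more concrete: one well-known way to compare images without matching them pixel by pixel is to reduce each image to a small fingerprint and compare the fingerprints instead. The Python sketch below uses a simple "average hash" on a tiny grayscale grid. It is only an illustration of the general idea, not a description of how Idee's software works, and the function names and sample pixel values are invented for the example.

```python
def average_hash(pixels):
    """Fingerprint a grayscale image given as a list of rows of 0-255 values.

    Each bit is 1 if the pixel is brighter than the image's mean, else 0.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return [1 if value > mean else 0 for value in flat]


def hamming_distance(hash_a, hash_b):
    """Count the positions where two equal-length fingerprints differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("fingerprints must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Made-up 4x4 "images": a bright square on a dark background ...
original = [
    [10, 10, 10, 10],
    [10, 200, 200, 10],
    [10, 200, 200, 10],
    [10, 10, 10, 10],
]
# ... the same picture, slightly brightened (as after re-saving or editing) ...
brightened = [[min(255, value + 20) for value in row] for row in original]
# ... and a different picture: a bright left half and a dark right half.
different = [[200, 200, 10, 10] for _ in range(4)]

near_match = hamming_distance(average_hash(original), average_hash(brightened))
mismatch = hamming_distance(average_hash(original), average_hash(different))
print(near_match, mismatch)
```

A small distance suggests a near-identical image, while a large one suggests a different picture. A real service would need something far more robust to cropping, rotation and scaling, but the appeal is the same: comparing short fingerprints is much cheaper than comparing every pixel of every image.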


Already Idee Inc has some big names as their customers, including www.digg.com and Associated Press.


Do keep in mind that http://tineye.com is not the only service they offer :)


If image search is of interest to you, then you'd definitely want to track the progress of this company.


Before I go, I have to mention that I actually got to know about these folks from reading the Toronto Star. Ironically, now that I do a search on the Toronto Star website (www.thestar.com), no results are returned for Idee, TinEye or the founding members of the company. I wasted 10 minutes trying to find the article I originally read in print.

recaptcha

  • Have you spent countless seconds (which seem like hours) trying to figure out a word in order to post a comment on a website you visit? These are images that must be translated into words (human/machine differentiation) and look something like the image below:




  • After repeated (failed) attempts... did you eventually get frustrated trying to decipher the given text, cursing to yourself, "All I'm trying to do is post one little comment"?

  • If this happened to you... then, like me, you probably doubted your typing capabilities for a brief period (A Brief History of Annoying Times: recaptcha), thinking perhaps your caps lock is on and/or you're getting fat-finger syndrome. Only to try one more time (say, the 14th) and eventually get through.



    If, like me, you have had all of these things happen to you, then my friend, please take solace in the fact that you are not alone. All of this happens to us from time to time. Today, after my 8th attempt at posting a comment on a blog I visit (www.michaelgeist.ca), I realized that the service is offered by a company called "recaptcha". From a quick perusal of their website, it seems to have started off as a project at Carnegie Mellon University. Anyhow... I was getting quite frustrated with the repeated attempts to figure out the words from the text- and sound-based information provided. I was eventually able to figure out what the words were, but decided to send the recaptcha folks a lil note with some tips and suggestions. Here is the excerpt:


    Subject : General comments about recaptcha


    Email Body:

    Hello,


    I think you've made it far too difficult for a human being to decipher what text to type with the options you provide through your service/application:
    1- Text based values:
    I think it would be far better if the letters had some sort of spacing between them, rather than having wavy lines going through the text and/or shrunken or blotted-out text. Do you folks collect any kind of metrics on how many attempts it takes a given user to get through your service and do what they are trying to do? I have a funny feeling recaptcha is resulting in humanity losing some precious hours (cumulatively).
    2- Sound :
    This option doesn't even work. Have you guys run any kind of QA testing using external candidates who have never used this service before? And if you did, may I ask what kind of success rate(s) you got? I've been sitting here for 10 minutes trying to make sense of the cacophony that is generated through my speakers when I try to "listen" to the actual word. It's almost as if I have to tune out everything else, pay the utmost attention and then perhaps run some type of pattern recognition/anti-crypto exercise in order to determine the actual word. There is the main voice of a person who takes 3 seconds to speak each letter! But then on top of that there is the cacophony of all the other people talking in the background. It reminds me of the telemarketer calls that I occasionally receive from different parts of the globe. Only in that (telemarketer's) case I am able to understand what the person on the phone is actually trying to say.

    Please make your product a bit better (a.k.a. user friendly). I am sure the bright minds who came up with this product will be able to figure out how to make it a bit more user friendly without letting machines decipher/crack the "code" and get through.

    Regards,





    I must say, it has been less than an hour since I sent a note to these folks and I've already got a response. Check it out:

    Hi Adeel,

    reCAPTCHA users have a 96% success rate on images. We're always improving, however we have to be careful to not get too easy and allow bots in.

    As for our audio, we highly recommend using headphones and listening to a CAPTCHA multiple times. We do realize the current design is hard, and are getting ready to release a new type of audio CAPTCHA in the near future.

    - Name Withheld


    It's an evolving technology and they are pushing for improvements. Let's cut them some slack.
    Saturday, October 04, 2008

    www.vudu.com

    A bit short on time right now. But I think this is a very promising start-up.

    Will come back and blog more about this. However, let's just say that if you:
  • Enjoy watching movies
  • Believe a trip to Blockbuster (pickup and dropoff) is all too time consuming
  • Find Netflix too much of a hassle
  • And think DVDs, or playing media from discs in general, are simply too old a technology.

    Then this is something you will appreciate. I haven't had a chance to check it out myself, although I'd love to get my hands on one. But this product is already getting rave reviews everywhere from Wired magazine to Rolling Stone magazine.

    Check it out at:
    http://www.vudu.com/
    Ideas for Blogger

    Blogger should provide the ability to move posts between two or more blogs.

    Taking it up a notch, some sort of a widget that enables you to port content from one blog to another. This can be related to just one user account, or multiple user accounts.

    Oh and, if anyone is listening : Typing letters from a gif image while "editing" a post doesn't make rational sense. For new posts...yes...as you want to limit the amount of spam feeds. But Edits...not really. Either that or there should be a timeout clause...where you have to start repeating the process after a min of edit mode.

    Monday, September 08, 2008

    Joost via Browser

    Noticed an article being dugg the other day which indicated that Joost will be available in the browser soon.

    Since I was on the original beta, I tried to get onto the new beta site, which can be found at http://new.joost.com/. However, it seems the new beta is by invitation only.

    I, along with hundreds (perhaps thousands) of others, had been forever asking the Joost development team (through their forums) to release a Linux version of the application. There wasn't much encouragement that a Linux-based client would ever be delivered. Accessing Joost through a browser, however, sounds like a more logical choice.

    However, it would be interesting to find out how this decision by Joost actually came about. I say this because a browser version of Joost is already available, just not from Joost themselves. Developer Paul Yunez developed and released a Flash-based Joost clone that renders in a browser. I just gave it a go myself and it works for the most part. However, after playing a couple of videos through it, I noticed that Paul has simply developed an interface that takes videos off a couple of websites and plays them in the browser through an interface that looks like Joost's. So in effect you are watching videos hosted on Google Video, and this is simply a clever cloak. Joost, by contrast, was hosting most of the content themselves, hence the higher quality videos (not quite HD).
    Paul Yunez's website (so called Joost interface) can be found on the following location. Click here

    In the interim, I would really like to get on the actual Joost beta. If anyone else got on this beta and if you happen to have any invite(s) left, then please send me a note.

    Blogger's Block

    WOW! It has been more than a year since I posted anything on the blog. I guess it was the cumulative effect of a self-dubbed "Blogger's Block" (wait... I just ran a search on Google and, as I expected, I am definitely not the first one to coin this term). But yes, the inactivity was most definitely due to Blogger's Block, which was invariably a long-term side effect of a couple of things.

    I guess the time has come to remove the picture of the scrumptious Masala Dosa from the main page of the blog and actually start blogging.

    So without further ado!