From http://huangry.spaces.live.com/blog/cns!11F27CD37F710403!2071.entry.
Thursday, November 26, 2009
61 Free Desktop Applications, Webapps, and Tools We're Most Thankful For
Wednesday, June 24, 2009
Another way of life
"Education gives you freedom," Jerry Buss says. Growing up poor in Kemmerer's coal mining region, Buss decided early on that life underground was not for him. "I realized that most of the kids who grew up in the mining camps stayed in those towns and worked in the mines. I didn't see myself doing that; for one thing, I didn't like the idea of being a couple of miles underground with all that stuff over my head. So, freedom became the most important thing in my life, and education became my way out."
As most people know, Jerry Buss is the owner of the Los Angeles Lakers and a past owner of the National Hockey League's Kings. However, fewer people may know that he earned his Ph.D. in chemistry at the University of Southern California at age 24. I remember that in an interview he was once asked how much difference his Ph.D. experience made to his later success. He said, well, it did not make too much difference to my career, nor did it mean that I am smarter, but it did teach me how to handle loneliness, and that is very important.
I guess this is so TRUE when you decide to pursue a Ph.D. You should know this is a tough journey where you need to move ahead on your own. You make your own choice at every crossing, taking your own direction step by step, leaving your own persevering footprints behind. And then, one day, you will find yourself a proud and strong person inside. This is the type of personality that I should achieve, and this is also the type of attitude anyone who wants success should possess.
Remember, sweet sugar will not lead you to success, but loneliness may.
Friday, July 18, 2008
What am I searching for?
I like this guided search a lot, because most of the time I find I do not know exactly how to describe what I want to search for. For example, if I want to find a cheap air ticket from New York to Shanghai, should I search for "cheap New York air tickets", or "cheap Shanghai air tickets", or more precisely, "cheap air ticket from New York to Shanghai"? Unfortunately, in my experience the last query will most likely FAIL, since it seems to contain too many keywords. See, sometimes more is not always better :-(
When you search for something, the search engine offers you pre-defined candidate questions to help you refine your query and find what you need. This seems likely to become a popular trend in the future. There should be a pre-processing step that clusters past queries and then classifies each incoming query into one of the resulting categories. These new queries can in turn help improve the clustering results before the next round. Mmm...sounds like "Active learning". This work is quite challenging, since it needs semantic-level natural language analysis to interpret what the words mean, instead of just doing simple string matching (it seemed to me that the German guy did only string matching using some distance computation).
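To make the idea concrete, here is a minimal sketch of the "simple string matching" baseline mentioned above, not of any real search engine's method. It assumes a handful of hand-made query clusters and uses Python's standard `difflib.SequenceMatcher` as the distance computation; the cluster labels, example queries, threshold, and function names are all hypothetical. Each new query is assigned to the closest cluster and then added to it, which is the feedback loop described above in its crudest form.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1]. Pure string matching, no semantics."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(query: str, clusters: Dict[str, List[str]],
             threshold: float = 0.5) -> Optional[str]:
    """Return the label of the cluster holding the most similar past query,
    or None if nothing reaches the (arbitrarily chosen) threshold."""
    best_label, best_score = None, 0.0
    for label, examples in clusters.items():
        score = max(similarity(query, example) for example in examples)
        if score > best_score:
            best_label, best_score = label, score
    return best_label if best_score >= threshold else None


def classify_and_update(query: str, clusters: Dict[str, List[str]],
                        threshold: float = 0.5) -> Optional[str]:
    """Classify the query, then add it to its cluster so that it can
    influence how later queries are classified."""
    label = classify(query, clusters, threshold)
    if label is not None:
        clusters[label].append(query)
    return label


# Hypothetical, hand-made clusters of past queries.
clusters = {
    "flights": ["cheap air tickets", "cheap flights to Shanghai"],
    "hotels": ["hotel deals", "book a hotel room"],
}

print(classify_and_update("cheap air ticket from New York to Shanghai", clusters))
```

Because the score only compares characters, it cannot tell that "air ticket" and "flight" mean the same thing, which is exactly why the semantic analysis mentioned above is the hard part.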
"Search, search, search~~" We are doing keyword search every day.
However, before we rush to the search bar, should we think twice about what exactly we are searching for? Or should we not?
Perhaps one day, others will know it better than ourselves.
(PS: Something from the Official Google Blog.)
Technologies behind Google ranking
7/16/2008 10:53:00 AM
Search in the last decade has moved from give me what I said to give me what I want. User expectations from search have rightly increased. We work hard to fulfill the expectations of each and every user, and to do that we need to better understand the pages, the queries, and our users. Over the last decade we have pushed the technologies for understanding these three components (of the search process) to completely new dimensions.
When we talk about queries at Google, we use square brackets [ ] to mark the beginning and end of queries (see "How to write queries" by Matt Cutts), a notation I will use throughout this post. (Pages and search results change frequently, so in time, some examples used here may not behave as explained.)
- Understanding pages: Over years we have invested heavily in our crawl and indexing system. As a result we have a very large and very fresh index. In addition to size and freshness, we have improved our index in other ways. One of the key technologies we have developed to understand pages is associating important concepts to a page even when they are not obvious on the page. We find the official homepage for Sprovieri Gallery in London for the Italian query [galleria sprovieri londra], even though the official page does not have either London or Londra on it. In the U.S., a user searching for [cool tech pc vancouver, wa] finds the homepage www.cooltechpc.com even though the page does not mention anywhere that they are in Vancouver, WA. Other technologies we have developed include distinctions between important and less important words in the page and the freshness of the information on the page.
- Understanding queries: It is critical that we understand what our users are looking for (beyond just the few words in their query). We have made several notable advances in this area including a best-in-class spelling suggestion system, an advanced synonyms system, and a very strong concept analysis system.
- Understanding users: Our work on interpreting user intent is aimed at returning results people really want, not just what they said in their query. This work starts with a world class localization system, and adds to it our advanced personalization technology, and several other great strides we have made in interpreting user intent, e.g. Universal Search.
Finally let me briefly mention the latest advance we have made in search: Cross Language Information Retrieval (CLIR). CLIR allows users to first discover information that is not in their language, and then using Google's translation technology, we make this information accessible. I call this advance: give me what I want in any language. A user looking for Tony Blair's biography in Russia who types the query in Russian [Тони Блэр биография] is prompted at the bottom of our results to search the English web with:
Similarly a user searching for Disney movie songs in Egypt with the query [أغاني أفلام ديزني] is prompted to search the English web. We are very excited about CLIR as it truly brings us closer to our mission to organize the world's information and make it universally accessible and useful.
I hope my two posts about Google ranking have made it clear that we live and breathe search, and we are more passionate than ever about it. Our fervor for serving all our users worldwide is unprecedented. We pride ourselves in running a very good ranking system, and are working incredibly hard every day to make it even better.
I could go on and on showing examples of state-of-the-art technology that we have developed to make our ranking system as good as it is, but the fact is that search is nowhere close to being a solved problem. Many queries still don't get satisfactory results from Google, and each such query is an opportunity to improve our ranking system. I am confident that with numerous techniques under development in our group, we will make large improvements to our ranking algorithms in the near future.
Posted by Amit Singhal, Google Fellow.
Sunday, June 8, 2008
The SIGMOD Jim Gray Doctoral Dissertation Award
I was reading my Google Reader RSS today and found this:
"SIGMOD has established the annual SIGMOD Jim Gray Doctoral Dissertation Award to recognize excellent research by doctoral candidates in the database field. Until 2008, this award was known as the "SIGMOD Doctoral Dissertation Award." In 2008, SIGMOD, with the unanimous approval of ACM Council, decided to rename the award to honor Dr. Jim Gray. SIGMOD Jim Gray Doctoral Dissertation Award winners and runners-up will be recognized at the SIGMOD conference, and their dissertations will be included in the SIGMOD DiSC and on the SIGMOD Online web site. The award winner will also receive a plaque and present his or her work together with the winners of the SIGMOD Innovations and Test of Time awards."
This reminded me of this respected person, Jim Gray, who is from Microsoft Research but has been missing at sea since early last year. People are still looking for him, but there is no good news yet. That is really sad. What made me sadder was that not until today did I realize that he also helped in the development of Virtual Earth, an advanced online geomapping service that helps us locate ourselves, and I am using it right now! Suddenly I feel that he was not a Turing Award winner far away in CA, but a person who was so close to me! I can't believe that something so great is part of my life, yet one of its inventors can no longer enjoy it with us.
I do not know if I have a chance to win this award, since it is not exactly my field. But I am very encouraged that it has been renamed after him. Because of his efforts, we will never get lost in the future.
Saturday, May 31, 2008
A taste of teaching
Last week we had our last class of the semester with Prof. Foster Provost. This is a PhD-level seminar discussing all kinds of topics related to data mining and machine learning. As the only three registered PhD students from Stern, Xiaohan, Mihaela and I were "pushed" to give (bi-)weekly presentations and lead the discussion of every paper on those topics.
Oh, god! That was hard! I couldn't understand this. When I sit in class and listen to the professors, they are all talking and smiling, making all kinds of jokes, writing gracefully and drawing nice pictures on the board. They teach as if they were doing something really really really easy. However, when I stood in front of the class, no matter how hard I had prepared, I felt nervous and awkward, and then suddenly forgot what I should say. My tone wandered and my voice froze. My confidence was quickly fading... In fact, I had been pretty confident in my presentation skills, because I already had some conference/workshop presentation experience. I always felt proud of keeping my cool in front of a group of people. But now the truth was that it did not work here! Teaching a class is totally different from giving a short 20-minute talk! For this, I really admire Foster! He is such a sharp person and a great professor. He can always spot the key point in our thoughts and help us sort it out right away. Oftentimes, his questions are actually helpful and informative "hints", which inspire us to think about what we have neglected and then better organize our thoughts.
Prof. Anindya Ghose once told me that when you talk to people, you should try to make your point as clear as you can the first time. Do not wait for people to find themselves confused and then ask you. I believe this is important, but it is not easy to achieve. Sometimes, when we explain something, we tend either to describe it so much that we become redundant, or to say so little that we leave ambiguity. (It seems that the distribution of the intensity of our explanatory words is "bimodal", either too high or too low.) I like Prof. Panos Ipeirotis's teaching, because his way is highly logical. You feel like you are led into a room, and then get to explore by yourself with encouragement from time to time. He does not show the whole picture at once, but leaves it to us to find it out ourselves. That is the coolest part. You never know how big the picture is! Just like an adventure game!
I sometimes imagine myself in the future: can I do this well when I become a real professor? Will my students enjoy my teaching too? Yea, I believe so! That is my goal, and I will just keep going :-)
Tuesday, May 20, 2008
Data Mining Blogs: The Big List (reposted from Sandro Saitta)
Sandro Saitta has a full list of data mining blogs. It is something very nice that is worth introducing here :)
: this blog gives news about data mining and AI very frequently (Alberto Roldan)
: the blog of the data mining laboratory at Brigham Young University, mainly about social communities and meta-learning (Data Mining Lab)
: discuss statistics and predictions among other interesting topics (John Aitchison)
: comprehensive posts on technology and news related to data mining and machine learning. Also a lot of very useful resources (Pete Skomoroch)
: although not updated recently, this blog has interesting posts about data visualization and statistics (Donald Farmer)
: a blog that focuses on data mining using Microsoft SQL Server (Jamie Mac)
: data mining with a point of view from statistics (Rachel Graham)
: a personal view on data mining with posts on different applications and news (Shane Butler)
: a machine learning blog from a PhD student at Cambridge (Jurgen Van Gael)
: more machine learning oriented but contains a lot of useful information (Pierre Dangauthier)
