Find out when your team reports! Not too long!
Cubs report on Valentine's Day!
Technorati Tags:
springtraining
Brian Dennis shows that you can download a snapshot of Wikipedia. It looks like a new snapshot comes out once a quarter or so, and it is an incredible amount of data to go through and use in different and fun ways.
Also, it could be beneficial for developing and testing various sorts of retrieval / indexing code.
Check out dbpedia which has added semantic information and allows you to query for data.
Technorati Tags:
wikipedia, dbpedia
Greg Linden points to a new paper out of Yahoo Research, talking about the challenges ahead for distributed web search [PDF]. If you are interested in the various pieces which need to scale as you build up any sort of search infrastructure, read the paper.
A couple of bits which caught my eye were the mentions of P2P search and of doing more things on the client side. I'm curious as to how P2P could help with the distribution of data, with machines being both client and server. I would think you wouldn't use outside, non-trusted machines for this and would instead create an internal, trusted P2P network. Could this help? It would be fun to find out.
On the client side, what about using clients as caches? I realize a client cache would be out of sync with the main index at times, but I would think it could be useful for some cases.
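To make the client-as-cache idea a bit more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `ClientResultCache` class, the `search` helper, and the `fetch_from_index` callback are hypothetical names and are not taken from the Yahoo Research paper. The time-to-live is one simple way to bound how far the client can drift from the main index.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class ClientResultCache:
    """Hypothetical client-side cache of search results with a time-to-live."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, List[str]]] = {}

    def get(self, query: str) -> Optional[List[str]]:
        """Return cached results, or None if missing or older than the TTL."""
        entry = self._entries.get(query)
        if entry is None:
            return None
        stored_at, results = entry
        if self.clock() - stored_at > self.ttl_seconds:
            # Too old: the main index may have changed, so drop the entry.
            del self._entries[query]
            return None
        return results

    def put(self, query: str, results: List[str]) -> None:
        """Store a copy of the results along with the current time."""
        self._entries[query] = (self.clock(), list(results))


def search(query: str, cache: ClientResultCache,
           fetch_from_index: Callable[[str], List[str]]) -> List[str]:
    """Answer from the local cache when possible, else ask the main index."""
    cached = cache.get(query)
    if cached is not None:
        return cached
    results = fetch_from_index(query)  # assumed to be the expensive remote call
    cache.put(query, results)
    return results
```

A short TTL keeps results close to the main index at the cost of more remote calls; a long one does the opposite. Which trade-off is right would depend on how quickly the index changes.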
At any rate, the next few years are going to see an increase in the necessity of scaling search as it moves to additional aspects of our lives. Companies which can make themselves scale are going to be out in front of others who can't.
As an aside, one way I like to gauge research papers is by the other papers they cite. If you were to read each paper in this one's bibliography, you would get a foundational introduction to all aspects of search as well as distributed systems.
Technorati Tags:
distributed, search, scability, greg+linden, distributed+search, p2p+search
The past few days, I've been heads-down, moving one of our infrastructure pieces from supporting a few of our Web applications to also supporting our portal, myEarthlink. This means going from some traffic to quite a bit of traffic. I'd be lying if I said everything went smoothly but (knocking on wood), things do look better and I think we are going to be fine.
I did learn a couple of things, not just technologically but also socially. I'm sure many of these are no-brainers but they were reinforced to me.
Don't rush. Sure, everything seems to be blowing up, but if you don't take your time you could easily miss a wrong configuration item or something else simple which is causing problems.
Give status as much as possible. When you are working with folks via instant messaging or IRC, it is imperative you let people know what is going on, especially if they are waiting on you. The longer you make them wait, the more the tension builds.
Take care of your own. It's amazing what providing food for system administrators and testers can do for the help being offered.
Know the customer impact. When things are seemingly blowing up, what does the customer see and, sometimes more importantly, what doesn't the customer see? This can help you decide when to roll back to previous systems and when to just keep going and fix what you can.
Kill your darlings. It's good advice when writing and it's also good advice when working on scaling systems. Many times you have something which was either new or not used much before now. If you hold on to every bit of cool technology at the expense of your users' experience, no one is going to be happy. Be able to admit that, get something else working and move on.
Last month, many people talked about the data eBay provided about how they scaled to support the traffic they currently have. In Frank Sommers' post, one thing which jumped out at me was eBay's willingness to change how they were doing things. They moved from a very primitive early system, through a J2EE architecture, to what they have now. You always need to evaluate your systems. Sure, moving from one architecture to another isn't easy, and sometimes people have their own agendas in trying to do it, but it is your responsibility as a developer to focus on how the system affects the user and what changes are needed to make things even better for them.
Technorati Tags:
scability, architecture
Well, I'm always thinking about it but Simon Willison is thinking about it more right now. His two most recent posts look at solving the OpenID phishing problem and creating a whitelist via OpenID.
Of the two, the phishing issue seems to be one that needs to be addressed fairly soon. The problem with having something URL-based is that with the redirect from one site to another, the man-in-the-middle attack is somewhat easy to create. If I create a site which lets people use their OpenID logins, I can easily create a secondary site which mimics the OpenID provider's site and thus gives me the user's login information. While this might be too much work right now, once more people start using OpenID, I believe you will see more of this.
Scott Kveton, CEO of JanRain, gives some additional perspective on the issue. I totally agree with Scott that it's very encouraging that so many people are looking into the issue now and hopefully can help direct solutions in an easier manner.
Technorati Tags:
openid, openid+phishing
Bear down, Chicago Bears, make every play clear the way to victory; Bear down, Chicago Bears, put up a fight with a might so fearlessly. We'll never forget the way you thrilled the nation with your T-formation. Bear down, Chicago Bears, and let them know why you're wearing the crown. You're the pride and joy of Illinois, Chicago Bears, bear down.
Technorati Tags: chicago+bears, super+bowl
I'm a sucker for book sites so it should be no surprise that I'm going to check out Shelfari. We'll see how easy it is to just get my book queue in there and that will help decide whether I put more books in.
Technorati Tags:
shelfari
During my blog reading yesterday, a couple of sites were linked to and definitely were noteworthy.
The first is Yahoo TagMaps. It's a project from the Research group and it integrates Flickr photos tagged with location information with Yahoo! Maps so you can move through locations and photos in a different way. Additional information can be found at SearchEngineLand and the O'Reilly Radar.
The second I found via BoingBoing and it's a Homeless Heat Map for downtown Los Angeles. The available data on the homeless was put onto a map, with heat maps used to show where folks were congregating. The map shows the progression every two weeks. It is interesting to see how everyone moved. blogdowntown gives a glimpse of how the maps were put together as well as some additional information.
What I find cool about these two sites is the use of maps for the user interface. With the explosion of mashups for the various mapping sites, I believe it is beneficial to look at how those maps can be used for other types of data. Using a map gives an added amount of perspective that just text and links don't. To see the migration of the homeless over a two-week period on a map is a much more effective way of communicating than just sending out an email with an Excel spreadsheet attached.
Technorati Tags:
mashup, maps, mapping, maps+as+interface
A few people are trying to get a Microformats LA meetup together for next week since Tantek will be down here in SoCal. If you have more options or want to attend, update the page with your info!
Technorati Tags:
microformats, microformats+los+angeles
I was saddened to read about Greg Linden's decision to stop development on Findory and let the money run its course for the rest of the year until it's gone. Of course, I haven't done a very good job using the site either and I have a feeling I'm not alone or this wouldn't be happening.
I can only hope that Greg continues down the personalization path and does something else. Also, I, selfishly, hope he continues to blog because I know each post shows me something new.
Good luck, Greg!
Technorati Tags:
search, findory, greg+linden, personalization
I was looking a bit at Zirr.us this evening and I've found it pretty interesting. Instead of the traditional lists and outlines for your TODOs, you use tags, with sizes that vary depending on the task's priority.
I'm not sure if this would work for a day-to-day system but it definitely could be useful for certain things.
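As a rough illustration of sizing tags by priority, here is a small Python sketch. I don't know how Zirr.us actually does it, so the `font_size_for_priority` function, the 1-to-5 priority scale, and the pixel range are all assumptions of mine. It simply scales linearly between a smallest and a largest font size.

```python
def font_size_for_priority(priority: int,
                           min_priority: int = 1,
                           max_priority: int = 5,
                           min_px: int = 12,
                           max_px: int = 36) -> int:
    """Map a task priority onto a font size in pixels (hypothetical scale)."""
    if max_priority == min_priority:
        return min_px
    # Clamp out-of-range priorities so the size stays within the bounds.
    clamped = max(min_priority, min(max_priority, priority))
    fraction = (clamped - min_priority) / (max_priority - min_priority)
    return round(min_px + fraction * (max_px - min_px))


# Hypothetical TODO tags with priorities; higher means more important.
todos = {"taxes": 5, "groceries": 3, "email": 1}
sizes = {tag: font_size_for_priority(p) for tag, p in todos.items()}
```

With these assumed defaults, `sizes` comes out as `{"taxes": 36, "groceries": 24, "email": 12}`, so the most urgent task is the biggest thing on the page.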
Technorati Tags:
tags, zirr.us, productivity, tagging
Firefox 3 as Information Broker
Microformats: Parts 0, 1, 2 and 3
A look ahead to 2007 from danah boyd
Continuous computing: all social, all the time
Ignite Seattle talks online
The Return of Command Line Interfaces
A study of del.icio.us tagging
How Plasticity of Identity doesn't hold up
@toread and Cool: Tagging for Time, Task and Emotion
Continuous Partial Postponement and more on it
Groovy 1.0 + Grails & Compass/Lucene searching!
OJAX - Ajax-powered metasearch
Bill de hÓra gives a great overview of Mercurial, a distributed version control system. It's pretty new but seems to be on the right track. I might have to try it out with some personal projects.
Technorati Tags:
mercurial, distributedversioncontrolsystems
Chris Messina posted something on Twitter yesterday which wouldn't get out of my head: WebKit + iPhone + Microformats + OpenID... mhmm.. Could the iPhone become the killer app for the Microformats movement?
Since Safari (or any WebKit app) can be used and you have the integration with all of the other applications like Mail and iCal plus the ability to utilize maps and reviews, it seems that it's very possible. Having the ability to call someone directly from their page based on the microformats embedded seems like a great thing.
Beyond that, being able to aggregate various microformats and offer search solutions could be perfect for companies. I know Technorati is doing this a bit but there's always room for more.
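To give a feel for the "call someone directly from their page" idea, here is a small Python sketch that pulls phone numbers out of hCard markup and turns them into `tel:` links. It is only an assumption about how a client might do this, not how Safari or the iPhone works. `TelExtractor` and `tel_uri` are hypothetical names, and the sketch only handles the simple case where the number is the plain text of an element with class `tel`; it does not separate the nested `type` and `value` parts that hCard also allows.

```python
from html.parser import HTMLParser
from typing import List

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TelExtractor(HTMLParser):
    """Collect the text of every element whose class list includes 'tel'."""

    def __init__(self) -> None:
        super().__init__()
        self.numbers: List[str] = []
        self._depth = 0          # nesting depth inside the current tel element
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth > 0:
            self._depth += 1     # nested element inside a tel element
            return
        classes = (dict(attrs).get("class") or "").split()
        if "tel" in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._depth == 0:
            return
        self._depth -= 1
        if self._depth == 0:     # the tel element itself just closed
            text = "".join(self._buffer).strip()
            if text:
                self.numbers.append(text)

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def tel_uri(number: str) -> str:
    """Build a tel: link, keeping only digits and a '+' sign."""
    return "tel:" + "".join(ch for ch in number if ch.isdigit() or ch == "+")


# A made-up contact page fragment using hCard class names.
page = ('<div class="vcard"><span class="fn">Example Person</span> '
        '<span class="tel">+1-555-0100</span></div>')
extractor = TelExtractor()
extractor.feed(page)
links = [tel_uri(n) for n in extractor.numbers]
```

For the made-up fragment above, `links` ends up as `["tel:+15550100"]`, which is the kind of link a phone could dial from.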
Technorati Tags:
microformats, chris+messina, iPhone
No, I'm not asking what you are doing. Instead, it's an events site which I hadn't heard of until I read about it on Fred Stutzman's blog.
What I find interesting is seeing the name of a former co-worker, Josh Lerner, as the CEO of Team Gigabyte, the team that built Busytonight. I worked with Josh at CollabNet and I've wondered what he's been up to since working on the Wesley Clark campaign back in '04.
Busytonight definitely seems to be crawling many sites to get that many events in their database. Very cool!
Technorati Tags:
events, busytonight, josh+lerner, fred+stutzman
I was tagged by Brian with the 5 Things meme. I should be honored since this is the first time it's happened.
So here are five things you might not know about me.
I'm a business / productivity book junkie. It doesn't matter how into fiction I get, there is always the lure of the Business section at any bookstore. I just finished The Paradox of Choice and I'll be moving to Wikinomics. Here's my current book queue.
Before writing code for a living, I worked at the following companies: Sav-On Drugs, Cole-Haan, The Gap, Nordstrom's and Starbucks. So I would say I know a thing or two about good brands.
My dad is a Baptist minister and I lived in the fishbowl that is a preacher's kid's life. Definitely not easy to do, especially when your beliefs no longer follow your parents'.
Between May and October of last year (2006), I lost 50 pounds. Since then, I've maintained and haven't gained any of it back. I'm pretty proud of both parts of that. It wasn't easy and it still isn't, but going to the gym 6 days a week and drinking a ton of water helps so much.
Though you could probably tell I like sports, I'm not sure if I've ever said I lettered in football, baseball, basketball and golf in high school. I played golf at the jr. college I attended and then basketball at Moody.
And with that, I will now tag Gregg, Dave and Josh.
Ugh, that was ugly. I should have taken Brian up on his offer.
It's a sad way to end the year but I still have faith for the future. I definitely believe Weis is the right man for the job but losing like that still sucks. And now I have to wait until September. Thank goodness the Bears are in the playoffs and then it will be almost time for pitchers and catchers to report.
Go Irish!