Here is a list of documents I found really useful when I first started working with Hadoop.

The two papers from Google on the MapReduce paradigm and the Google File System (GFS):

  1. MapReduce: http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/mapreduce-osdi04.pdf
  2. Google File System: http://static.googleusercontent.com/external_content/untrusted_dlcp/labs.google.com/en/us/papers/gfs-sosp2003.pdf

Tips for improving MapReduce performance:

  1. http://www.cloudera.com/blog/2009/12/7-tips-for-improving-mapreduce-performance/

General tutorials:

  1. http://www.javaworld.com/javaworld/jw-09-2008/jw-09-hadoop.html
  2. http://developer.yahoo.com/hadoop/tutorial
  3. Troubleshooting: http://www.cs.brandeis.edu/~cs147a/lab/hadoop-troubleshooting/