Monday, May 14, 2012

Hadoop pseudo-cluster installation

Install Java and the Cloudera yum repo
yum install java-1.6.0-openjdk.x86_64
curl -O http://archive.cloudera.com/redhat/cdh/cloudera-cdh3.repo
mv cloudera-cdh3.repo /etc/yum.repos.d/

Ensure that /etc/hosts has entries for both your hostname and localhost.
Comment out the IPv6 (::1) entry.
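
A quick way to confirm the hosts file is in the expected state is a small check like the one below. This is only a sketch: the function name check_hosts is made up for illustration, and it assumes that IPv6 lines are the ones whose address starts with hex digits followed by a colon.

```shell
# check_hosts FILE HOSTNAME
# Succeeds only if FILE has an active (uncommented, non-IPv6) line naming
# "localhost" and an active line naming HOSTNAME.
check_hosts() {
    file=$1
    host=$2
    # Drop comment lines, then drop IPv6 lines (address contains ':').
    active=$(grep -v '^[[:space:]]*#' "$file" | grep -v '^[[:space:]]*[0-9a-fA-F]*:')
    # Look for each name as a whole word in what is left.
    printf '%s\n' "$active" | grep -qwF -- 'localhost' || return 1
    printf '%s\n' "$active" | grep -qwF -- "$host" || return 1
}
```

For example: check_hosts /etc/hosts "$(hostname)" && echo "hosts file looks OK"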

Create the hadoop group and users manually
Create the "hdfs" and "mapred" users as members of the "hadoop" group
groupadd hadoop
useradd -G hadoop hdfs
useradd -G hadoop mapred
passwd hdfs 
passwd mapred 
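
Since useradd -G only adds "hadoop" as a supplementary group, it can be worth confirming the membership before going further. A minimal sketch, with a made-up helper name in_group:

```shell
# in_group USER GROUP: succeeds if USER is a member of GROUP
# (primary or supplementary), based on the output of "id -nG".
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}
```

For example: in_group hdfs hadoop && in_group mapred hadoop && echo "group membership OK"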

Install hadoop packages
yum install hadoop-0.20
yum install hadoop-0.20-conf-pseudo

As root, create the base directory for HDFS files and mapred temporary files:
mkdir -p /data/hadoop
chown -R hdfs:hadoop /data/hadoop

As hdfs:
chmod -R 755 /data/hadoop
mkdir -p /data/hadoop/cache
chmod 777 /data/hadoop/cache
chmod +t /data/hadoop/cache

mkdir -p /data/hadoop/tmp
chown hdfs:hadoop /data/hadoop/tmp
chmod 777 /data/hadoop/tmp

mkdir -p /data/hadoop/nn
chown hdfs:hadoop /data/hadoop/nn

mkdir -p /data/hadoop/dn
chown hdfs:hadoop /data/hadoop/dn

mkdir -p /data/hadoop/snn
chown hdfs:hadoop /data/hadoop/snn

As mapred:
mkdir /data/hadoop/cache/mapred-tmp
chown mapred:hadoop /data/hadoop/cache/mapred-tmp

mkdir /data/hadoop/cache/mapred-local
chown mapred:hadoop /data/hadoop/cache/mapred-local

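As root (mapred cannot create directories directly under /data/hadoop, which is owned by hdfs with mode 755):
mkdir -p /data/hadoop/mapred-system
chown -R mapred:hadoop /data/hadoop/mapred-system
chmod 777 /data/hadoop/mapred-system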
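
With this many chown and chmod calls it is easy to miss one, so a quick check of the result may help. The helpers below are a sketch with made-up names (mode_owner, expect_mode) and assume GNU stat, which is what RHEL/CentOS ships.

```shell
# mode_owner PATH: print "MODE OWNER:GROUP" for PATH.
mode_owner() {
    stat -c '%a %U:%G' "$1"
}

# expect_mode PATH MODE: succeed if PATH has exactly the octal MODE.
# A sticky directory with mode 777 is reported as 1777.
expect_mode() {
    [ "$(stat -c '%a' "$1")" = "$2" ]
}
```

For example: expect_mode /data/hadoop/cache 1777 || echo "cache dir has the wrong mode", and mode_owner /data/hadoop/nn should report hdfs:hadoop as the owner and group.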

Back up the default configuration to another directory as root.
mkdir -p /etc/hadoop/conf.pseudo.copy
cp /etc/hadoop/conf.pseudo/* /etc/hadoop/conf.pseudo.copy/
cd /etc/hadoop/conf.pseudo/

Edit the following configuration files, adding entries for the directories created above
hdfs-site.xml:
    <property>
        <name>dfs.replication</name>
        <value>1</value> 
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/data/hadoop/nn</value> 
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/data/hadoop/dn</value> 
    </property>
    <property>
        <name>dfs.permissions.supergroup</name>
        <value>hadoop</value> 
    </property>

core-site.xml:
    
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/data/hadoop/tmp/${user.name}</value> 
    </property>
    <property>
        <name>fs.checkpoint.dir</name>
        <value>/data/hadoop/snn</value> 
    </property>

mapred-site.xml:
   
    <property>
        <name>mapred.local.dir</name>
        <value>/data/hadoop/cache/mapred-local</value> 
    </property>
    <property>
        <name>mapred.temp.dir</name>
        <value>/data/hadoop/cache/mapred-tmp</value> 
    </property>
    <property>
        <name>mapred.system.dir</name>
        <value>/data/hadoop/mapred-system</value>
    </property>
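
The property blocks above are fragments: in each file they belong inside the top-level <configuration> element. As an illustration, the sketch below writes a complete hdfs-site.xml with the values from this post into a directory of your choice, and reads a value back. The function names write_hdfs_site and get_prop are made up for this example. The files shipped in conf.pseudo may already contain other properties, so point this at a scratch directory and merge by hand rather than overwriting the live file.

```shell
# write_hdfs_site DIR: write a complete hdfs-site.xml into DIR.
write_hdfs_site() {
    cat > "$1/hdfs-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.name.dir</name>
        <value>/data/hadoop/nn</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>/data/hadoop/dn</value>
    </property>
    <property>
        <name>dfs.permissions.supergroup</name>
        <value>hadoop</value>
    </property>
</configuration>
EOF
}

# get_prop FILE NAME: print the value of property NAME.
# Assumes <value> sits on the line right after <name>, as in the layout above.
get_prop() {
    grep -F -A1 "<name>$2</name>" "$1" | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}
```

For example: get_prop /etc/hadoop/conf.pseudo/hdfs-site.xml dfs.name.dir prints the configured namenode directory.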

You may sometimes have to set JAVA_HOME, although the scripts can usually work it out on their own.
Add the following line to /etc/profile and /usr/bin/hadoop:
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk.x86_64

You can also add it to /usr/lib/hadoop-0.20/bin/hadoop-config.sh if "sudo service hadoop-0.20-namenode start" complains about JAVA_HOME.
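
If you are not sure what path to use, one option is to derive it from the java binary on your PATH. A minimal sketch with a made-up helper name, java_home_from_bin, which simply strips the trailing /bin/java; on some layouts the result will end in /jre, which should still work for running Hadoop since the scripts only need $JAVA_HOME/bin/java.

```shell
# java_home_from_bin PATH: strip the trailing /bin/java from a resolved
# java binary path and print what is left.
java_home_from_bin() {
    printf '%s\n' "${1%/bin/java}"
}
```

For example: java_home_from_bin "$(readlink -f "$(command -v java)")" (readlink -f resolves the /usr/bin/java symlink chain on GNU systems).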

Format the Hadoop file system with this command:
sudo -u hdfs hadoop namenode -format

Now you can start services with the following commands:
/etc/init.d/hadoop-0.20-namenode start 
/etc/init.d/hadoop-0.20-secondarynamenode start
/etc/init.d/hadoop-0.20-datanode start
/etc/init.d/hadoop-0.20-jobtracker start
/etc/init.d/hadoop-0.20-tasktracker start
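
The five commands above can also be wrapped in a loop, which makes stopping or restarting everything less tedious. This is a sketch: the function name hadoop_services and the DRY_RUN switch are made up for this example, and the daemon order is the one used above.

```shell
# Daemons in the order used above: HDFS first, then MapReduce.
HADOOP_DAEMONS="namenode secondarynamenode datanode jobtracker tasktracker"

# hadoop_services ACTION: run ACTION (for example start or stop) on each
# init script. With DRY_RUN=1 the commands are printed instead of executed.
hadoop_services() {
    for daemon in $HADOOP_DAEMONS; do
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "/etc/init.d/hadoop-0.20-$daemon $1"
        else
            "/etc/init.d/hadoop-0.20-$daemon" "$1"
        fi
    done
}
```

For example, as root: DRY_RUN=1 hadoop_services start to preview the commands, then hadoop_services start to run them.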

Or you can configure the services to start at boot time:
sudo chkconfig hadoop-0.20-namenode on
sudo chkconfig hadoop-0.20-jobtracker on
sudo chkconfig hadoop-0.20-secondarynamenode on
sudo chkconfig hadoop-0.20-tasktracker on
sudo chkconfig hadoop-0.20-datanode on


That's it. Try testing your installation with a few simple hadoop commands.
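
One possible smoke test is sketched below: it creates a directory in HDFS, uploads a small file, lists it and cleans up. The function name hdfs_smoke_test, the HADOOP_CMD override and the /smoke-test-<pid> path are all made up for this example. It assumes you run it as the hdfs user (for example after "su - hdfs"), since on a freshly formatted file system other users may not be able to write to /.

```shell
# hdfs_smoke_test: mkdir, put, ls and rmr against HDFS; returns non-zero
# as soon as one step fails. HADOOP_CMD defaults to "hadoop".
hdfs_smoke_test() {
    cmd=${HADOOP_CMD:-hadoop}
    dir=/smoke-test-$$                  # hypothetical scratch path in HDFS
    src=$(mktemp) || return 1           # small local file to upload
    echo "hello hadoop" > "$src"
    "$cmd" fs -mkdir "$dir" &&
    "$cmd" fs -put "$src" "$dir/hello.txt" &&
    "$cmd" fs -ls "$dir" &&
    "$cmd" fs -rmr "$dir"
    status=$?
    rm -f "$src"
    return $status
}
```

For example: hdfs_smoke_test && echo "HDFS is working"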

73 comments:

  1. Hi,

    For an Apache hadoop-2.0.0-alpha installation on two Linux machines, what should the values of the fs.defaultFS, dfs.name.dir and dfs.data.dir properties be on both name nodes?

    One machine's hostname is rsi-nod-nsn1 and the other is rsi-nod-nsn2.

    I want to make both federated namenodes, and both should be used as datanodes too.

    I want to configure both federation and YARN.

    What configuration changes are needed for this? I cannot find the masters, mapred-site.xml, and hadoop-env.sh files in the hadoopHome/etc/hadoop folder. How do I make changes to these files?

    regards,
    rashmi
