Programming With Mahout: Target and Introduction
1.2.1 Check out from Subversion (by default the files will be stored in the home folder):
svn co http://svn.apache.org/repos/asf/mahout/trunk
1.2.2 Change the checked out project folder name to MAHOUT (or any name you like):
mv trunk MAHOUT
Figure 1
1.4 Mahout Display
Mahout allows us to run iterative MapReduce jobs during processing. This is especially useful for data mining problems such as KMeans. Here we use some built-in examples to show how Mahout displays clusters of random sample data.
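To make the "iterative" part concrete, here is a minimal in-memory sketch of the KMeans loop in plain Java. It does not use the Mahout API or MapReduce; the class name KMeansSketch, its method names, and the sample points are all made up for illustration. Each pass assigns every point to its nearest centroid and then recomputes the centroids, which is roughly the work that one job per iteration would carry out on a cluster.

```java
import java.util.Arrays;

// Hypothetical illustration only: plain Java, no Mahout or Hadoop classes.
public class KMeansSketch {

    /** Returns the index of the centroid closest to p (squared Euclidean distance). */
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double d = 0.0;
            for (int i = 0; i < p.length; i++) {
                double diff = p[i] - centroids[c][i];
                d += diff * diff;
            }
            if (d < bestDist) {
                bestDist = d;
                best = c;
            }
        }
        return best;
    }

    /** One iteration: assign each point to a centroid, then average each group. */
    static double[][] iterate(double[][] points, double[][] centroids) {
        int k = centroids.length;
        int dim = centroids[0].length;
        double[][] sums = new double[k][dim];
        int[] counts = new int[k];
        for (double[] p : points) {
            int c = nearest(p, centroids);
            counts[c]++;
            for (int i = 0; i < dim; i++) {
                sums[c][i] += p[i];
            }
        }
        double[][] next = new double[k][dim];
        for (int c = 0; c < k; c++) {
            for (int i = 0; i < dim; i++) {
                // An empty cluster keeps its previous centroid.
                next[c][i] = counts[c] == 0 ? centroids[c][i] : sums[c][i] / counts[c];
            }
        }
        return next;
    }

    /** Repeats iterations until the centroids stop changing or maxIterations is reached. */
    static double[][] cluster(double[][] points, double[][] initial, int maxIterations) {
        double[][] centroids = initial;
        for (int iter = 0; iter < maxIterations; iter++) {
            double[][] next = iterate(points, centroids);
            if (Arrays.deepEquals(next, centroids)) {
                break; // converged
            }
            centroids = next;
        }
        return centroids;
    }

    public static void main(String[] args) {
        double[][] points = {{1, 1}, {1, 2}, {9, 9}, {10, 9}};
        double[][] initial = {{0, 0}, {10, 10}};
        System.out.println(Arrays.deepToString(cluster(points, initial, 10)));
    }
}
```

The loop stops as soon as an iteration leaves the centroids unchanged, so the number of passes depends on the data rather than being fixed in advance.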
Now we have the Mahout project in our Eclipse workspace, normally at the latest version of Mahout. This makes it convenient to study and develop Mahout applications. Here we have the source code, which you can download from our website.

File Name             Description
ClustersFilter.java   Implements the PathFilter interface used by Mahout 0.6
Display.java          Shows the results
Graphic.java          Initializes all the methods needed in Display.java
ReadData.java         Reads the CSV file
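ReadData.java is described here only as reading a CSV file, and its source is not reproduced in this document. As a rough idea of what that step involves, the following sketch parses comma-separated lines into numeric points. The class name CsvPointsSketch, the method parsePoints, and the sample lines are assumptions for illustration and are not taken from the downloadable code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not the ReadData.java from the download.
public class CsvPointsSketch {

    /** Parses lines such as "1.0,2.0" into points, skipping blank lines. */
    static List<double[]> parsePoints(List<String> lines) {
        List<double[]> points = new ArrayList<>();
        for (String rawLine : lines) {
            String line = rawLine.trim();
            if (line.isEmpty()) {
                continue;
            }
            String[] fields = line.split(",");
            double[] point = new double[fields.length];
            for (int i = 0; i < fields.length; i++) {
                // Throws NumberFormatException if a field is not numeric.
                point[i] = Double.parseDouble(fields[i].trim());
            }
            points.add(point);
        }
        return points;
    }

    public static void main(String[] args) {
        // In-memory sample; a real program would load the lines from a file.
        List<String> lines = Arrays.asList("1.0,2.0", "", "3.5, 4.5");
        for (double[] p : parsePoints(lines)) {
            System.out.println(Arrays.toString(p));
        }
    }
}
```

To read an actual file, the lines could be loaded with java.nio.file.Files.readAllLines and passed to parsePoints. The sketch assumes a file with no header row and no quoted fields.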
To write your own code, you should add all the Mahout 0.6 libraries. You can find them under the folder
Reference: File Description
ClustersCanopy.java:

File Name                    Description
DisplayCanopy.java           https://cwiki.apache.org/confluence/display/MAHOUT/Canopy+Clustering
DisplayDirichlet.java        https://cwiki.apache.org/confluence/display/MAHOUT/Dirichlet+Process+Clustering
DisplayFuzzyKMeans.java      https://cwiki.apache.org/confluence/display/MAHOUT/Fuzzy+KMeans
DisplayKMeans.java           https://cwiki.apache.org/confluence/display/MAHOUT/K-Means+Clustering
DisplayMeanShift.java        https://cwiki.apache.org/confluence/display/MAHOUT/Mean+Shift+Clustering
DisplaySpectralKMeans.java   https://cwiki.apache.org/confluence/display/MAHOUT/Spectral+Clustering