Drawing dendrograms with R

The original article is at:

http://rstudio-pubs-static.s3.amazonaws.com/1876_df0bf890dd54461f98719b461d987c3d.html

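Since the original link may stop working, this is a simple translation and backup. For R packages related to clustering, see the cluster and ape packages.


The main text follows: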


The most basic dendrogram

Let's start with the most basic type of dendrogram. For that purpose we'll use the mtcars dataset and compute a hierarchical clustering with the hclust function (with its default options).

# prepare hierarchical cluster
hc = hclust(dist(mtcars))
# very simple dendrogram (default plot)
plot(hc)


We can also put the labels of the leaves all at the same level. (If this is unclear, consult the help page for the plot function.)
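
One way to do this, as a minimal sketch: the hang argument of plot.hclust controls how far the labels hang below the rest of the plot, and a negative value makes them all hang down from height 0. The value -1 here is just an illustrative choice.

```r
# rebuild the clustering so this snippet runs on its own
hc <- hclust(dist(mtcars))
# negative hang: all leaf labels are drawn from the same baseline
plot(hc, hang = -1)
```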

A less basic dendrogram

To add more formatting to a dendrogram, we need to tweak the right parameters. For instance, we could get the following graphic (just for illustration purposes!)
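
The code for that graphic is not reproduced on this page, so the following is only a sketch of the kind of tweaking meant here. The colors, line settings and axis positions are illustrative assumptions, not values taken from this text. It uses standard graphical parameters passed to plot(), plus axis() and mtext() to draw a custom height axis.

```r
hc <- hclust(dist(mtcars))
# save the current settings and set a background color (assumed value)
op <- par(bg = "#DDE3CA")
# dendrogram with custom colors and line style, no default axis
plot(hc, col = "#487AA1", col.main = "#45ADA8", col.lab = "#7C8071",
    lwd = 3, lty = 3, sub = "", hang = -1, axes = FALSE)
# draw a height axis by hand: tick marks only
axis(side = 2, at = seq(0, 400, 100), col = "#F38630",
    labels = FALSE, lwd = 2)
# put the tick labels in the margin, horizontally
mtext(seq(0, 400, 100), side = 2, at = seq(0, 400, 100),
    line = 1, col = "#A38630", las = 2)
```

The previous settings are kept in op, so they can be restored afterwards with par(op).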

par(op)

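## A note on par(): it sets and queries graphical parameters (background, margins, multi-panel layout and so on), and it is quite powerful. A blogger on Sina has written a particularly good summary:

http://blog.sina.com.cn/s/blog_8f5b2a2e0102v0tf.html

par() can apparently also be used to adjust things such as the plot axes, though I have not tried that myself.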

Alternative dendrograms

An alternative way to produce dendrograms is to explicitly convert hclust objects into dendrogram objects.

# using dendrogram objects
hcd = as.dendrogram(hc)
# alternative way to get a dendrogram
plot(hcd)

Having an object of class dendrogram, we can also plot the branches in a triangular form.
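
A minimal sketch of the triangular form, assuming the type argument of the plot method for dendrogram objects (the same argument appears in the customized example further down):

```r
hc <- hclust(dist(mtcars))
hcd <- as.dendrogram(hc)
# draw straight, slanted branches instead of right-angled ones
plot(hcd, type = "triangle")
```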

Zooming-in on dendrograms

Another very useful option is the ability to inspect selected parts of a given tree. For instance, if we wanted to examine the top partitions of the dendrogram, we could cut it at a height of 75.

# plot dendrogram with some cuts
op = par(mfrow = c(2, 1))
plot(cut(hcd, h = 75)$upper, main = "Upper tree of cut at h=75")
plot(cut(hcd, h = 75)$lower[[2]], main = "Second branch of lower tree with cut at h=75")

par(op)
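
To see what the cut produces before plotting it, the result can be inspected directly. This is a small sketch: cut() on a dendrogram returns a list with an upper tree and a list of lower branches. The names parts and branch_sizes are only illustrative.

```r
hc <- hclust(dist(mtcars))
hcd <- as.dendrogram(hc)
parts <- cut(hcd, h = 75)
# the upper part is a single dendrogram
class(parts$upper)
# the lower part is a list with one dendrogram per branch below the cut
length(parts$lower)
# number of leaves (cars) in each lower branch
branch_sizes <- sapply(parts$lower, function(d) attr(d, "members"))
branch_sizes
```

Indexing parts$lower with [[2]], as in the plot above, picks the second of these branches.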

Customized dendrograms

To get more customized graphics we need a little more code. A very useful resource is the function dendrapply, which applies a function to every node of a dendrogram. This comes in very handy if we want to add some color to the labels.


# vector of colors (alternative palette):
# labelColors = c('red', 'blue', 'darkgreen', 'darkgrey', 'purple')
labelColors = c("#CDB380", "#036564", "#EB6841", "#EDC951")
# cut dendrogram in 4 clusters
clusMember = cutree(hc, 4)
# function to get color labels
colLab <- function(n) {
    if (is.leaf(n)) {
        a <- attributes(n)
        labCol <- labelColors[clusMember[which(names(clusMember) == a$label)]]
        attr(n, "nodePar") <- c(a$nodePar, lab.col = labCol)
    }
    n
}
# using dendrapply
clusDendro = dendrapply(hcd, colLab)
# make plot
plot(clusDendro, main = "Cool Dendrogram", type = "triangle")
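
The color lookup inside colLab can be checked on its own. As a small sketch: cutree() returns an integer vector of cluster ids named by observation, and that id is used as an index into the palette. "Mazda RX4" is simply the first row name of mtcars, chosen for illustration.

```r
hc <- hclust(dist(mtcars))
labelColors <- c("#CDB380", "#036564", "#EB6841", "#EDC951")
clusMember <- cutree(hc, 4)
# cluster id for the first few cars
head(clusMember)
# number of cars in each of the 4 clusters
table(clusMember)
# color assigned to one label: its cluster id indexes the palette
one_color <- labelColors[clusMember["Mazda RX4"]]
one_color
```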

Phylogenetic trees

A very nice tool for displaying more appealing trees is provided by the R package ape. In this case, what we need is to convert the hclust object into a phylo object with the function as.phylo.

# load package ape; remember to install it: install.packages('ape')
library(ape)
# plot basic tree
plot(as.phylo(hc), cex = 0.9, label.offset = 1)


The plot.phylo function offers four more types for plotting a dendrogram. Here they are:

# cladogram
plot(as.phylo(hc), type = "cladogram", cex = 0.9, label.offset = 1)

# unrooted
plot(as.phylo(hc), type = "unrooted")

Next is my favorite, the circular tree:

# fan   
plot(as.phylo(hc), type = "fan")

# radial
plot(as.phylo(hc), type = "radial")
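
For a side-by-side comparison, the four types can be drawn in a single figure. This is only a sketch: the 2 x 2 layout, margins and label size are arbitrary choices, and phy and tree_type are illustrative names.

```r
library(ape)
hc <- hclust(dist(mtcars))
# convert once and reuse the phylo object
phy <- as.phylo(hc)
# 2 x 2 grid with small margins; keep the old settings in op
op <- par(mfrow = c(2, 2), mar = c(1, 1, 2, 1))
for (tree_type in c("cladogram", "unrooted", "fan", "radial")) {
    plot(phy, type = tree_type, cex = 0.6)
    title(main = tree_type)
}
par(op)
```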

Customizing phylogenetic trees

What I really like about the ape package is that it gives us more control over the appearance of the dendrograms, so we can customize them in different ways. For example:

# add colors randomly
plot(as.phylo(hc), type = "fan",
    tip.color = hsv(runif(15, 0.65, 0.95), 1, 1, 0.7),
    edge.color = hsv(runif(10, 0.65, 0.75), 1, 1, 0.7),
    edge.width = runif(20, 0.5, 3),
    use.edge.length = TRUE, col = "gray80")


Again, we can tweak some parameters according to our needs.

# vector of colors
mypal = c("#556270", "#4ECDC4", "#1B676B", "#FF6B6B", "#C44D58")
# cutting dendrogram in 5 clusters
clus5 = cutree(hc, 5)
# plot
op = par(bg = "#E8DDCB")
# Size reflects miles per gallon
plot(as.phylo(hc), type = "fan", tip.color = mypal[clus5], label.offset = 1, 
    cex = log(mtcars$mpg, 10), col = "red")

par(op)
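
The per-tip vectors in that call (tip.color and cex) are taken by position, so the plot relies on the tips being stored in the same order as the rows of mtcars. The sketch below checks that assumption and shows the range of label sizes. The names tips_match_rows and label_cex are illustrative.

```r
library(ape)
hc <- hclust(dist(mtcars))
phy <- as.phylo(hc)
# are the tips stored in the same order as the rows of mtcars?
tips_match_rows <- identical(phy$tip.label, rownames(mtcars))
tips_match_rows
# label sizes used above: log base 10 of miles per gallon
label_cex <- log(mtcars$mpg, 10)
range(label_cex)
```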

Color in leaves

The R package sparcl provides the ColorDendrogram function, which lets us add some color. For example, we can add color to the leaves:

# install.packages('sparcl')
library(sparcl)
# colors the leaves of a dendrogram
y = cutree(hc, 3)
ColorDendrogram(hc, y = y, labels = names(y), main = "My Simulated Data", 
    branchlength = 80)

ggdendro

For reasons that are unknown to me, the R package ggplot2 has no functions to plot dendrograms. However, the ad hoc package ggdendro offers a decent solution. You might expect more customization options, but so far they are rather limited. Anyway, for those of us who are ggplotters, this is another tool in our toolkit.


# install.packages('ggdendro')
library(ggdendro)
# ggplot2 supplies ggplot(), geom_segment() and geom_text() used further below
library(ggplot2)
# basic option
ggdendrogram(hc)

# another option
ggdendrogram(hc, rotate = TRUE, size = 4, theme_dendro = FALSE, color = "tomato")

# Triangular lines
ddata <- dendro_data(as.dendrogram(hc), type = "triangle")
ggplot(segment(ddata)) + geom_segment(aes(x = x, y = y, xend = xend,
    yend = yend)) + ylim(-10, 150) + geom_text(data = label(ddata), aes(x = x,
    y = y, label = label), angle = 90, lineheight = 0)
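
It can help to look at what dendro_data() extracts before building a custom plot. As a small sketch: segment() returns the line segments as a data frame and label() returns the leaf labels with their positions, which is what geom_segment() and geom_text() consume above. The "rectangle" type is used here only for illustration.

```r
library(ggdendro)
hc <- hclust(dist(mtcars))
ddata <- dendro_data(as.dendrogram(hc), type = "rectangle")
# one row per line segment, with start and end coordinates
head(segment(ddata))
# one row per leaf, with its position and label text
head(label(ddata))
```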


Colored dendrogram

Last but not least, there's one more resource, from Romain Francois's addicted to R graph gallery, which I find really interesting. The R code for generating colored dendrograms, which you can download and modify if you wish, is available at the following addresses:

http://gallery.r-enthusiasts.com/RGraphGallery.php?graph=79 (may require a proxy to access)

http://addictedtor.free.fr/packages/A2R/lastVersion/R/code.R

# load code of A2R function
source("http://addictedtor.free.fr/packages/A2R/lastVersion/R/code.R")
# colored dendrogram
op = par(bg = "#EFEFEF")
A2Rplot(hc, k = 3, boxes = FALSE, col.up = "gray50", col.down = c("#FF6B6B", 
    "#4ECDC4", "#556270"))

par(op)

# another colored dendrogram
op = par(bg = "gray15")
cols = hsv(c(0.2, 0.57, 0.95), 1, 1, 0.8)
A2Rplot(hc, k = 3, boxes = FALSE, col.up = "gray50", col.down = cols)


par(op)
