R Programming, Week 2

2.1 Control Structures

As shown in the figure below.

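Take if as an example. This form is the more common one:

if (x > 3) {
        y <- 10
} else {
        y <- 0
}

A more distinctive form treats if as an expression and assigns its value directly:

y <- if (x > 3) {
        10
} else {
        0
}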

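seq_along(x) takes a vector as input and returns an integer vector of the same length, namely the indices 1 to length(x). In the third example, letter is just the loop variable. The name is arbitrary: letter, h, a, b, or anything else you like. This shows that the loop variable does not have to be a numeric index; it can take the elements themselves.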
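The two looping styles described above can be sketched as follows. The vector x and the accumulator names are made up for illustration.

```r
x <- c("a", "b", "c", "d")

# seq_along(x) yields the index vector 1, 2, ..., length(x)
idx <- seq_along(x)

# Style 1: loop over the indices and subset
by_index <- character(0)
for (i in seq_along(x)) {
        by_index <- c(by_index, x[i])
}

# Style 2: loop over the elements themselves; the variable name is arbitrary
by_element <- character(0)
for (letter in x) {
        by_element <- c(by_element, letter)
}

print(idx)
print(identical(by_index, by_element))
```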

Below is a nested for loop.
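A minimal sketch of a nested for loop, assuming a small 2 x 3 matrix as input (the matrix and the visited vector are illustrative, not taken from the original figure). The outer loop walks the rows and the inner loop walks the columns.

```r
x <- matrix(1:6, nrow = 2, ncol = 3)

# Record the elements in the order the loops visit them
visited <- integer(0)
for (i in seq_len(nrow(x))) {
        for (j in seq_len(ncol(x))) {
                visited <- c(visited, x[i, j])
        }
}
print(visited)
```

Nesting beyond two or three levels tends to become hard to read, so deeper loops are usually better split into functions.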

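The coin variable acts like a coin flip: if it is 1, z increases by 1, otherwise z decreases by 1, and this continues until z no longer satisfies the while condition.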
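The coin-flip loop can be sketched like this. The starting value 5 and the bounds 3 and 10 are assumptions chosen for illustration, and steps is an extra counter added here to show how long the walk lasted.

```r
set.seed(1)   # fixed seed so the run is repeatable
z <- 5
steps <- 0

# Keep flipping while z stays inside [3, 10]
while (z >= 3 && z <= 10) {
        coin <- rbinom(1, 1, 0.5)   # one fair coin flip: 0 or 1
        if (coin == 1) {
                z <- z + 1
        } else {
                z <- z - 1
        }
        steps <- steps + 1
}
print(c(z = z, steps = steps))
```

The loop can only stop once z has stepped just outside the interval, so the final value is either 2 or 11.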

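The repeat construct shows up in many optimization routines. Because the only way out is break, it is wise to add a safeguard such as a limit on the number of iterations.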
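A small sketch of repeat with both a convergence test and an iteration cap. The problem (Newton's iteration for the square root of 2) and the names tol and max_iter are illustrative choices, not part of the original notes.

```r
x <- 10           # starting guess
tol <- 1e-8       # convergence tolerance
max_iter <- 100   # hard cap so the loop cannot run forever
iter <- 0

repeat {
        iter <- iter + 1
        x_new <- (x + 2 / x) / 2          # Newton step for sqrt(2)
        converged <- abs(x_new - x) < tol
        x <- x_new
        if (converged || iter >= max_iter) {
                break
        }
}
print(c(estimate = x, iterations = iter))
```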

So be careful with return: it ends the loop, exits the enclosing function, and hands back a value, all at once.
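To make the behaviour of return concrete, here is a hypothetical helper, first_negative, that leaves both the loop and the function as soon as it finds what it is looking for.

```r
# Hypothetical helper: position of the first negative element, or NA
first_negative <- function(v) {
        for (i in seq_along(v)) {
                if (v[i] < 0) {
                        return(i)   # exits the loop AND the function
                }
        }
        NA_integer_                 # reached only if no element was negative
}

print(first_negative(c(3, -1, -5)))
print(first_negative(c(1, 2)))
```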

2.2 Functions

This part is mainly about one thing: argument matching.

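In this figure, if the function were instead f <- function(a,b){b^2}, calling f(2) would give Error in f(2) : argument "b" is missing, with no default. The value 2 is matched to a by position, so b is left without a value; in essence this is still positional matching.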
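The matching rules can be tried out with a tiny example. The function f and its argument names first and second are hypothetical and exist only to show which value ends up where.

```r
# Hypothetical function used only to illustrate argument matching
f <- function(first, second) {
        first - second
}

r1 <- f(10, 3)                    # purely positional
r2 <- f(second = 3, first = 10)   # purely by name, order does not matter
r3 <- f(second = 3, 10)           # named argument is matched first, the rest by position
r4 <- f(sec = 3, 10)              # partial name matching also works
print(c(r1, r2, r3, r4))
```

All four calls return the same value because each one binds 10 to first and 3 to second.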

The interesting part is lazy evaluation.
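A short sketch of lazy evaluation: an argument is only evaluated when the function body actually uses it. The second function g is a renamed copy added here so both cases can sit side by side, and tryCatch is used only to capture the error message instead of stopping the script.

```r
# b is never used, so it is never evaluated and may be omitted
f <- function(a, b) {
        a^2
}
print(f(2))

# Here b is used, so omitting it fails at the moment b is needed
g <- function(a, b) {
        b^2
}
msg <- tryCatch(g(2), error = function(e) conditionMessage(e))
print(msg)
```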

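The ... argument has another use. For example, entering

args(paste)

gives function (..., sep = " ", collapse = NULL)

The ... is also needed when the number of arguments passed to a function cannot be known in advance. Here, paste cannot know ahead of time how many strings it will be asked to join, so its first argument is ....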
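One consequence worth seeing in code: arguments that come after ... must be named in full, because anything else is swallowed by the .... The wrapper shout below is a made-up example of forwarding ... to another function.

```r
# sep comes after ..., so it must be spelled out completely
full_name <- paste("a", "b", sep = ":")
# A partial name is NOT matched; se = ":" is treated as one more string to paste
partial_name <- paste("a", "b", se = ":")
print(full_name)
print(partial_name)

# Hypothetical wrapper that forwards an unknown number of arguments
shout <- function(...) {
        toupper(paste(...))
}
print(shout("hello", "world"))
```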

2.3 Scoping Rules

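As the figure above shows, base is generally last in the list, while the global environment (.GlobalEnv) comes first. search() simply returns this search list.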
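The search list can be inspected directly. This sketch assumes a plain R session; the entries in the middle depend on which packages happen to be attached.

```r
sp <- search()

# First entry: the global environment, searched first
print(sp[1])
# Last entry: the base package, searched last
print(sp[length(sp)])
```

Packages attached with library() are inserted at position 2, just after the global environment, which pushes everything else one step down the list.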

Terms used here: scoping rules, lexical scoping (also called static scoping), and dynamic scoping.

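In the figure above, x and y are formal arguments. z is neither a formal argument nor a local variable; it is a free variable.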
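A minimal sketch of a free variable, assuming a function of the shape described above. The body and the value given to z are illustrative.

```r
# x and y are formal arguments; z is a free variable
f <- function(x, y) {
        x^2 + y / z
}

# z is looked up in the environment where f was defined
z <- 2
result <- f(3, 4)   # 3^2 + 4 / 2
print(result)
```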

Lexical vs. Dynamic Scoping

When a function is defined in the global environment and is subsequently called from the global environment, then the defining environment and the calling environment are the same. This can sometimes give the appearance of dynamic scoping.

> g <- function(x) { 
+ a <- 3
+ x+a+y 
+ }
> g(2)
Error in g(2) : object "y" not found
> y <- 3
> g(2)
[1] 8
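The difference between the two rules shows up when a function is called from inside another function that has its own variable of the same name. The functions f and g below are a small illustrative pair; under R's lexical scoping, g finds y where g was defined, not where it was called.

```r
y <- 10

f <- function(x) {
        y <- 2          # local to f
        y^2 + g(x)      # uses the local y for y^2
}

g <- function(x) {
        x * y           # y is free here; looked up where g was defined
}

result <- f(3)          # 2^2 + 3 * 10
print(result)
```

Under dynamic scoping, g would instead pick up y = 2 from its caller f, and the call would give 2^2 + 3 * 2 = 10.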

Application: Optimization

Why is any of this information useful?

Optimization routines in R like optim, nlm, and optimize require you to pass a function whose argument is a vector of parameters (e.g. a log-likelihood).
However, an objective function might depend on a host of other things besides its parameters (like data).
When writing software which does optimization, it may be desirable to allow the user to hold certain parameters fixed.

Maximizing a Normal Likelihood

Write a “constructor” function

make.NegLogLik <- function(data, fixed=c(FALSE,FALSE)) {
        params <- fixed
        function(p) {
                params[!fixed] <- p
                mu <- params[1]
                sigma <- params[2]
                a <- -0.5*length(data)*log(2*pi*sigma^2)
                b <- -0.5*sum((data-mu)^2) / (sigma^2)
                -(a + b)
        } 
}

Note: Optimization functions in R minimize functions, so you need to use the negative log-likelihood.

Maximizing a Normal Likelihood

> set.seed(1); normals <- rnorm(100, 1, 2)
> nLL <- make.NegLogLik(normals)
> nLL
function(p) {
                params[!fixed] <- p
                mu <- params[1]
                sigma <- params[2]
                a <- -0.5*length(data)*log(2*pi*sigma^2)
                b <- -0.5*sum((data-mu)^2) / (sigma^2)
                -(a + b)
        }
<environment: 0x165b1a4>
> ls(environment(nLL))
[1] "data"   "fixed"  "params"

Estimating Parameters

> optim(c(mu = 0, sigma = 1), nLL)$par
      mu    sigma
1.218239 1.787343

Fixing σ = 2

> nLL <- make.NegLogLik(normals, c(FALSE, 2))
> optimize(nLL, c(-1, 3))$minimum
[1] 1.217775

Fixing μ = 1

> nLL <- make.NegLogLik(normals, c(1, FALSE))
> optimize(nLL, c(1e-6, 10))$minimum
[1] 1.800596
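As a sanity check, the two one-parameter fits can be compared with the closed-form maximum likelihood estimates for a normal model: with sigma fixed, the best mu is the sample mean, and with mu fixed at 1, the best sigma is the root mean squared deviation from 1. The sketch below repeats the constructor so that it runs on its own; the variable names mu_hat, sigma_hat, mu_closed and sigma_closed are introduced here.

```r
make.NegLogLik <- function(data, fixed = c(FALSE, FALSE)) {
        params <- fixed
        function(p) {
                params[!fixed] <- p
                mu <- params[1]
                sigma <- params[2]
                a <- -0.5 * length(data) * log(2 * pi * sigma^2)
                b <- -0.5 * sum((data - mu)^2) / (sigma^2)
                -(a + b)
        }
}

set.seed(1)
normals <- rnorm(100, 1, 2)

# Numerical estimates, one free parameter each
mu_hat <- optimize(make.NegLogLik(normals, c(FALSE, 2)), c(-1, 3))$minimum
sigma_hat <- optimize(make.NegLogLik(normals, c(1, FALSE)), c(1e-6, 10))$minimum

# Closed-form counterparts
mu_closed <- mean(normals)
sigma_closed <- sqrt(mean((normals - 1)^2))

print(c(mu_hat = mu_hat, mu_closed = mu_closed))
print(c(sigma_hat = sigma_hat, sigma_closed = sigma_closed))
```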

Plotting the Likelihood

nLL <- make.NegLogLik(normals, c(1, FALSE))
x <- seq(1.7, 1.9, len = 100)
y <- sapply(x, nLL)
plot(x, exp(-(y - min(y))), type = "l")

nLL <- make.NegLogLik(normals, c(FALSE, 2))
x <- seq(0.5, 1.5, len = 100)
y <- sapply(x, nLL)
plot(x, exp(-(y - min(y))), type = "l")

Lexical Scoping Summary

Objective functions can be “built” which contain all of the necessary data for evaluating the function.
No need to carry around long argument lists, which is useful for interactive and exploratory work.
Code can be simplified and cleaned up.
Reference: Robert Gentleman and Ross Ihaka (2000). “Lexical Scope and Statistical Computing,” JCGS, 9, 491–508.
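The same "build a function that carries its data" idea can be seen in a much smaller example. make.power is an illustrative constructor, not part of the optimization code above; each function it returns remembers the n it was built with.

```r
# Illustrative constructor: returns a function that raises its input to the power n
make.power <- function(n) {
        pow <- function(x) {
                x^n       # n is a free variable, found in the defining environment
        }
        pow
}

cube <- make.power(3)
square <- make.power(2)
print(cube(3))
print(square(3))

# The value captured by the closure can be inspected directly
print(get("n", environment(cube)))
```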

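Next, a few small points about vectorized operations in R.

For example, x <- c(1, 3, 5)

x == 3 is effectively the same as c(1, 3, 5) == c(3, 3, 3), because the shorter operand is recycled.

Matrix operations are another case, for example x <- matrix(1:4, 2, 2) and y <- matrix(rep(10, 4), 2, 2)

x * y multiplies the matrices element by element; x %*% y is true matrix multiplication.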
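These points can be checked directly with the values mentioned above. The names elementwise and matprod are introduced here for the two results.

```r
x <- c(1, 3, 5)
# The scalar 3 is recycled to the length of x
print(x == 3)
print(identical(x == 3, c(1, 3, 5) == c(3, 3, 3)))

x <- matrix(1:4, 2, 2)
y <- matrix(rep(10, 4), 2, 2)

elementwise <- x * y     # multiplies matching entries
matprod <- x %*% y       # rows of x times columns of y
print(elementwise)
print(matprod)
```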

2.4 Dates and Times

R has special representations for dates and times.

Dates are represented by the Date class.

Times are represented by the POSIXct or POSIXlt class.

Dates are stored internally as the number of days since 1970-01-01.

Times are stored internally as the number of seconds since 1970-01-01.

In practice, as.Date converts a character string to a date.

For example:
> x <- as.Date("1970-01-01")
> x
[1] "1970-01-01"
> class(x)
[1] "Date"
> unclass(x)
[1] 0
> unclass(as.Date("1970-01-01"))
[1] 0
> unclass(as.Date("1970-01-03"))
[1] 2

After unclass, what is left is simply the number of days.
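The same idea applies to times. The sketch below assumes the UTC time zone so that the numbers do not depend on the local machine; the particular date is chosen only for illustration.

```r
# POSIXct stores a single number: seconds since 1970-01-01 00:00:00 UTC
x <- as.POSIXct("1970-01-02 00:00:00", tz = "UTC")
print(as.numeric(x))

# POSIXlt stores the same instant as a list of components
p <- as.POSIXlt(x)
print(p$mday)   # day of the month
print(p$hour)   # hour of the day

# Dates, for comparison, count days
d <- as.Date("1970-01-03")
print(as.numeric(d))
```

POSIXct is usually the more convenient choice for columns in a data frame, while POSIXlt is handy when individual components such as the hour or the weekday are needed.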

Another example involving unclass:

> class(m)
[1] "data.frame"
> unclass(m)
$myFamilyAges
[1] 43 42 12 8 5

$myFamilyGenders
[1] Male Female Female Male Female
Levels: Female Male

attr(,"row.names")
[1] 1 2 3 4 5
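The notes do not show how m was created. The sketch below is a reconstruction based only on the printed output, so the construction itself is an assumption; it illustrates that a data frame is a list of columns with some extra attributes.

```r
# Assumed reconstruction of m from the output shown above
m <- data.frame(
        myFamilyAges = c(43, 42, 12, 8, 5),
        myFamilyGenders = factor(c("Male", "Female", "Female", "Male", "Female"))
)

u <- unclass(m)
print(class(m))
print(is.list(u))    # the underlying object is a plain list
print(names(u))      # one list element per column
```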