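2.1 Control Structures
See the figure below.
Take if, for example. This form is the more common one:
if (x > 3) {
  y <- 10
} else {
  y <- 0
}
The more distinctive form assigns the value of the whole if expression:
y <- if (x > 3) {
  10
} else {
  0
}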
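seq_along(x) takes a vector as input and returns an integer vector of the same length (the indices 1 to length(x)). In the third example, letter is really just the loop variable: you could call it letter, h, a, b, or anything else you like. This shows that a for loop does not have to iterate over numbers; it can walk over the elements themselves.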
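A minimal sketch of the two loop styles described above. The vector x and the accumulator names are illustrative assumptions, not taken from the figure.

```r
x <- c("a", "b", "c", "d")

# seq_along(x) yields the index vector 1, 2, 3, 4
idx <- seq_along(x)

# Style 1: loop over the indices and look each element up
by_index <- character(0)
for (i in seq_along(x)) {
  by_index <- c(by_index, x[i])
}

# Style 2: loop over the elements directly; the loop variable
# can have any name (letter, h, a, b, ...)
by_element <- character(0)
for (letter in x) {
  by_element <- c(by_element, letter)
}

print(idx)
print(identical(by_index, by_element))
```

Both loops visit the same elements in the same order; the second simply skips the indexing step.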
Below is a nested for loop.
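As a hedged illustration of nesting, the sketch below walks over every cell of a small matrix. The matrix m and the running total are assumptions made for this example.

```r
# A 2 x 3 matrix filled column by column with 1..6
m <- matrix(1:6, nrow = 2, ncol = 3)

total <- 0
for (i in seq_len(nrow(m))) {      # outer loop: rows
  for (j in seq_len(ncol(m))) {    # inner loop: columns
    total <- total + m[i, j]
  }
}

print(total)
```

In real code sum(m) does the same thing; the loops are only here to show the nesting.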
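The coin variable works like a coin toss: if it comes up 1, z goes up by 1, otherwise z goes down by 1. This continues until z no longer satisfies the condition in the while statement.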
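A sketch of the coin-toss loop described above. The starting value 5, the bounds 3 and 10, and the seed are assumptions chosen for illustration.

```r
set.seed(42)  # assumed seed, only to make the run repeatable

z <- 5
steps <- 0
while (z >= 3 && z <= 10) {
  coin <- rbinom(1, 1, 0.5)   # one fair coin toss: 0 or 1
  if (coin == 1) {
    z <- z + 1                # heads: step up
  } else {
    z <- z - 1                # tails: step down
  }
  steps <- steps + 1
}

print(z)
print(steps)
```

The loop can only stop once z has stepped just outside the range, so the final value is either 2 or 11.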
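OK, this one (repeat) shows up in a lot of optimization code that iterates until the result converges. Since repeat has no exit condition of its own, add a cap on the number of iterations (or a similar safeguard) when you use it.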
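A minimal sketch of repeat with both a convergence test and an iteration cap. The iteration itself (Newton's method for the square root of 2) and the names tol and max_iter are assumptions picked for illustration.

```r
x0 <- 1          # starting guess
tol <- 1e-8      # stop when successive estimates are this close
max_iter <- 100  # safety cap so the loop always terminates
iter <- 0

repeat {
  iter <- iter + 1
  x1 <- 0.5 * (x0 + 2 / x0)   # Newton update for x^2 = 2
  if (abs(x1 - x0) < tol || iter >= max_iter) {
    break                      # the only way out of repeat
  }
  x0 <- x1
}

print(x1)
print(iter)
```

Without the max_iter check, an iteration that never converges would loop forever.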
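So be careful with return: it exits the whole function, which both ends any loop that is running and hands back the return value.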
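To make the point about return concrete, here is a small sketch. The function first_negative is hypothetical and exists only for this example.

```r
# Hypothetical helper: position of the first negative element, or NA
first_negative <- function(v) {
  for (i in seq_along(v)) {
    if (v[i] < 0) {
      return(i)   # leaves the function at once; the loop does not continue
    }
  }
  NA_integer_     # reached only if the loop finished without returning
}

print(first_negative(c(3, 1, -2, 5, -7)))
print(first_negative(c(1, 2)))
```

The second negative value (-7) is never inspected, because return already ended both the loop and the function.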
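2.2 Functions
This part is mostly about one thing: argument matching.
In this figure, if the function were f <- function(a, b) { b^2 } instead, the result would be Error in f(2) : argument "b" is missing, with no default. At bottom this is still positional matching: the 2 is matched to a, and b gets nothing.
The interesting part is lazy evaluation: an argument is only evaluated when the function body actually uses it.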
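A hedged sketch of both behaviours. The names f_lazy and f_strict are made up for this example; tryCatch is used only so the error can be captured instead of stopping the script.

```r
# b is never used in the body, so it is never evaluated
f_lazy <- function(a, b) {
  a^2
}
print(f_lazy(2))   # works even though b was not supplied

# b is used, so calling with one positional argument fails
f_strict <- function(a, b) {
  b^2
}
msg <- tryCatch(f_strict(2), error = function(e) conditionMessage(e))
print(msg)

# Matching by name sends the value to b; a is missing but unused
print(f_strict(b = 3))
```

The single positional value always goes to a, so the only way to fill b alone is to name it.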
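There is another use for ... as well. For example,
entering args(paste)
gives function (..., sep = " ", collapse = NULL)
... is also needed when the number of arguments passed to a function cannot be known in advance. Here, the first argument of paste is unknown in that sense, so it is declared as ...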
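A short sketch of ... in practice. The wrapper label is hypothetical; paste and its sep argument are the real base R function discussed above.

```r
# Hypothetical wrapper: forwards any number of pieces on to paste()
label <- function(..., sep = "-") {
  paste(..., sep = sep)
}
print(label("a", "b", "c"))

# Arguments that come after ... must be named in full
print(paste("a", "b", sep = ":"))

# A partial name is not matched to sep; it is swallowed by ...
# and pasted as one more string
print(paste("a", "b", se = ":"))
```

The last call shows why partial matching does not work after ...: R cannot tell whether se was meant as sep or as one more thing to paste.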
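2.3 Scoping Rules
As the figure above shows, base usually sits at the end of the list and the global environment (.GlobalEnv) at the front. search() simply returns that search list.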
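The ordering can be checked directly. This sketch only inspects the two ends of the list, since the packages in between depend on what is loaded in a given session.

```r
sp <- search()
print(sp)

first <- sp[1]            # where R looks first
last  <- sp[length(sp)]   # where R looks last

print(first)
print(last)
```

Loading a package with library() inserts it right after the global environment, pushing everything else one position down.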
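Terms: scoping rules (作用域規則), lexical scoping (詞法作用域), also called static scoping (靜態作用域), and dynamic scoping (動態作用域).
In the figure above you can see that x and y are formal arguments, while z is not a local variable (it is a free variable).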
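A minimal sketch of a function with a free variable. The body of f and the value assigned to z are assumptions for illustration, since the figure is not reproduced here.

```r
z <- 2   # assumed value, defined outside the function

f <- function(x, y) {
  # x and y are formal arguments; z is free, so R looks it up
  # in the environment where f was defined
  x^2 + y / z
}

result <- f(3, 4)
print(result)
```

If z did not exist anywhere on the search path, the call would fail with an "object not found" error.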
Lexical vs. Dynamic Scoping
When a function is defined in the global environment and is subsequently called from the global environment, then the defining environment and the calling environment are the same. This can sometimes give the appearance of dynamic scoping.
> g <- function(x) {
+ a <- 3
+ x+a+y
+ }
> g(2)
Error in g(2) : object "y" not found
> y <- 3
> g(2)
[1] 8
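To see where lexical and dynamic scoping actually differ, the sketch below gives the free variable two candidate values. The names caller and callee are hypothetical.

```r
y <- 10   # value in the environment where callee is defined

callee <- function(x) {
  x * y   # y is free here
}

caller <- function(x) {
  y <- 2            # local y, visible only inside caller
  y^2 + callee(x)   # callee still sees y = 10, not y = 2
}

result <- caller(3)
print(result)
```

With lexical scoping the result is 2^2 + 3 * 10 = 34. Under dynamic scoping callee would pick up the caller's y = 2, and the result would be 4 + 6 = 10.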
Application: Optimization
Why is any of this information useful?
- Optimization routines in R like optim, nlm, and optimize require you to pass a function whose argument is a vector of parameters (e.g. a log-likelihood)
- However, an objective function might depend on a host of other things besides its parameters (like data)
- When writing software which does optimization, it may be desirable to allow the user to hold certain parameters fixed
Maximizing a Normal Likelihood
Write a “constructor” function
make.NegLogLik <- function(data, fixed=c(FALSE,FALSE)) {
    params <- fixed
    function(p) {
        params[!fixed] <- p
        mu <- params[1]
        sigma <- params[2]
        a <- -0.5*length(data)*log(2*pi*sigma^2)
        b <- -0.5*sum((data-mu)^2) / (sigma^2)
        -(a + b)
    }
}
Note: Optimization functions in R minimize functions, so you need to use the negative log-likelihood.
> set.seed(1); normals <- rnorm(100, 1, 2)
> nLL <- make.NegLogLik(normals)
> nLL
function(p) {
    params[!fixed] <- p
    mu <- params[1]
    sigma <- params[2]
    a <- -0.5*length(data)*log(2*pi*sigma^2)
    b <- -0.5*sum((data-mu)^2) / (sigma^2)
    -(a + b)
}
<environment: 0x165b1a4>
> ls(environment(nLL))
[1] "data" "fixed" "params"
Estimating Parameters
> optim(c(mu = 0, sigma = 1), nLL)$par
      mu    sigma
1.218239 1.787343
Fixing σ = 2
> nLL <- make.NegLogLik(normals, c(FALSE, 2))
> optimize(nLL, c(-1, 3))$minimum
[1] 1.217775
Fixing μ = 1
> nLL <- make.NegLogLik(normals, c(1, FALSE))
> optimize(nLL, c(1e-6, 10))$minimum
[1] 1.800596
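As a sanity check on the numerical estimates, the normal likelihood has closed-form maximizers: the sample mean for mu, and the square root of the mean squared deviation for sigma. The sketch below repeats the constructor so it runs on its own; mu_hat and sigma_hat are names introduced here.

```r
make.NegLogLik <- function(data, fixed = c(FALSE, FALSE)) {
  params <- fixed
  function(p) {
    params[!fixed] <- p
    mu <- params[1]
    sigma <- params[2]
    a <- -0.5 * length(data) * log(2 * pi * sigma^2)
    b <- -0.5 * sum((data - mu)^2) / (sigma^2)
    -(a + b)
  }
}

set.seed(1)
normals <- rnorm(100, 1, 2)

# Numerical estimate, both parameters free
nLL <- make.NegLogLik(normals)
fit <- optim(c(mu = 0, sigma = 1), nLL)$par

# Closed-form maximum likelihood estimates
mu_hat <- mean(normals)
sigma_hat <- sqrt(mean((normals - mu_hat)^2))

print(fit)
print(c(mu = mu_hat, sigma = sigma_hat))
```

The two sets of numbers should agree to a couple of decimal places; the small gap comes from the optimizer's stopping tolerance.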
Plotting the Likelihood
nLL <- make.NegLogLik(normals, c(1, FALSE))
x <- seq(1.7, 1.9, len = 100)
y <- sapply(x, nLL)
plot(x, exp(-(y - min(y))), type = "l")
nLL <- make.NegLogLik(normals, c(FALSE, 2))
x <- seq(0.5, 1.5, len = 100)
y <- sapply(x, nLL)
plot(x, exp(-(y - min(y))), type = "l")
Lexical Scoping Summary
- Objective functions can be “built” which contain all of the necessary data for evaluating the function
- No need to carry around long argument lists — useful for interactive and exploratory work.
- Code can be simplified and cleaned up
- Reference: Robert Gentleman and Ross Ihaka (2000). “Lexical Scope and Statistical Computing,” JCGS, 9, 491–508.
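OK, next come a few small points about vectorized operations in R.
For example, x <- c(1, 3, 5)
x == 3 is really the same as c(1, 3, 5) == c(3, 3, 3)
As for matrix operations, take x <- matrix(1:4, 2, 2) and y <- matrix(rep(10, 4), 2, 2)
x * y is the element-by-element product; x %*% y is true matrix multiplication.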
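The points above can be run as-is. The matrices are named a and b here (an assumption) so they do not overwrite the vector x.

```r
# Comparison is vectorized: the 3 is recycled to match the length of x
x <- c(1, 3, 5)
cmp <- x == 3
print(cmp)

a <- matrix(1:4, 2, 2)          # columns are (1, 2) and (3, 4)
b <- matrix(rep(10, 4), 2, 2)   # every entry is 10

elementwise <- a * b            # multiplies matching cells
matprod <- a %*% b              # rows of a times columns of b

print(elementwise)
print(matprod)
```

For example, the top-left cell of a %*% b is 1 * 10 + 3 * 10 = 40, while the top-left cell of a * b is just 1 * 10 = 10.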
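2.4 Dates and Times
R has special representations for dates and times:
Dates are represented by the Date class
Times are represented by the POSIXct or POSIXlt class
Dates are stored internally as the number of days since 1970-01-01
Times are stored internally as the number of seconds since 1970-01-01
Use as.Date to convert a character string to a date.
For example:
> x <- as.Date("1970-01-01")
> x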
[1] "1970-01-01"
> class(x)
[1] "Date"
> unclass(x)
[1] 0
> unclass(as.Date("1970-01-01"))
[1] 0
> unclass(as.Date("1970-01-03"))
[1] 2
After unclass, what is left is just the number of days.
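Because a Date is a day count underneath, ordinary arithmetic works on it. A small sketch, with the two dates chosen only for illustration:

```r
d0 <- as.Date("1970-01-01")
d1 <- as.Date("2014-07-10")

days_raw  <- unclass(d1)           # the stored day count
days_diff <- as.numeric(d1 - d0)   # same number, via subtraction

print(days_raw)
print(days_diff)

# Adding an integer moves the date forward by that many days
print(d0 + 2)
```

Subtracting two dates gives a difftime object, which is why as.numeric is used to get a plain number back.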
Another example of unclass:
> class(m)
[1] "data.frame"
> unclass(m)
$myFamilyAges
[1] 43 42 12 8 5
$myFamilyGenders
[1] Male Female Female Male Female
Levels: Female Male
attr(,"row.names")
[1] 1 2 3 4 5
An example follows:
> x <- Sys.time()
> x
[1] "2014-07-10 16:03:18 CST"
> p <-as.POSIXlt(x)
> p
[1] "2014-07-10 16:03:18 CST"
> names(unclass(p))
[1] "sec" "min" "hour" "mday" "mon" "year"
[7] "wday" "yday" "isdst" "zone" "gmtoff"
> p$sec
[1] 18.1878
> unclass(x)
[1] 1404979398
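Sure enough, POSIXlt is stored as a list. That is the difference from POSIXct, which is stored as a single number of seconds.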
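The contrast can be checked on a fixed timestamp. This sketch assumes the UTC time zone so the numbers do not depend on the machine's locale; the timestamp is the one from the transcript, read as UTC rather than CST.

```r
ct <- as.POSIXct("2014-07-10 16:03:18", tz = "UTC")
lt <- as.POSIXlt(ct)

# POSIXct: one number, seconds since 1970-01-01 00:00:00 UTC
secs <- as.numeric(ct)
print(secs)
print(is.numeric(unclass(ct)))

# POSIXlt: a list of components that can be pulled out by name
print(is.list(unclass(lt)))
print(lt$hour)
print(lt$min)
print(lt$mon)    # months are counted from 0, so July is 6
print(lt$year)   # years are counted from 1900, so 2014 is 114
```

POSIXct is the compact choice for storing times in a data frame; POSIXlt is handy when individual fields such as the hour or weekday are needed.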